43 Commits

Author SHA1 Message Date
Noel Georgi
1b22df48a4
chore: support debug shell for advanced development
Support dropping into a very minimal debug shell.

```bash
sudo -E --preserve-env=HOME _out/talosctl-linux-amd64 cluster create --provisioner=qemu $REGISTRY_MIRROR_FLAGS --controlplanes=1 --workers=0 --with-bootloader=false --with-debug-shell
```

Co-authored-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-19 16:56:24 +02:00
Noel Georgi
fc89dc2164
fix: support extra-disks when using iso
When using `iso` and `extra-disks` we're getting errors like below for
any nodes than the first node.

```text
qemu-system-aarch64: -cdrom _out/metal-arm64-secureboot.iso: drive with bus=0, unit=2 (index=2) exists
```

Fix by explicitly specifying the the media is cdrom, so qemu doesn't
index.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-16 21:41:58 +05:30
Andrey Smirnov
b453385bd9
feat: support volume configuration, provisioning, etc
This implements the first round of changes, replacing the volume backend
with the new implementation, while keeping most of the external
interfaces intact.

See #8367

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-30 18:32:34 +04:00
Noel Georgi
106c17d0b5
chore: aarch64 qemu local secureboot support
Support booting with SecureBoot on aarch64 with `talosctl cluster
create` with QEMU provisioner.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-28 18:47:45 +05:30
Noel Georgi
6f969e3645
chore: improve cluster create UX on aarch64
Improve cluster create UX on aarch64.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-16 15:55:26 +05:30
Andrey Smirnov
f07b79f4a8
feat: provide disk detection based on new blockdevices
Uses go-siderolabs/go-blockdevice/v2 for all the hard parts,
provides new resource `Disk` which describes all disks in the system.

Additional resource `SystemDisk` always point to the system disk (based
on the location of `META` partition).

The `Disks` API (and `talosctl disks`) provides a view now into the
`talosctl get disks` to keep backwards compatibility.

QEMU provisioner can now create extra disks of various types: IDE, AHCI,
SCSI, NVME, this allows to test detection properly.

The new resource will be the foundation for volume provisioning (to pick
up the disk to provision the volume on).

Example:

```
talosctl -n 172.20.0.5 get disks
NODE         NAMESPACE   TYPE   ID        VERSION   SIZE          READ ONLY   TRANSPORT   ROTATIONAL   WWID                                                               MODEL            SERIAL
172.20.0.5   runtime     Disk   loop0     1         65568768      true
172.20.0.5   runtime     Disk   nvme0n1   1         10485760000   false       nvme                     nvme.1b36-6465616462656566-51454d55204e564d65204374726c-00000001   QEMU NVMe Ctrl   deadbeef
172.20.0.5   runtime     Disk   sda       1         10485760000   false       virtio      true                                                                            QEMU HARDDISK
172.20.0.5   runtime     Disk   sdb       1         10485760000   false       sata        true         t10.ATA     QEMU HARDDISK                           QM00013        QEMU HARDDISK
172.20.0.5   runtime     Disk   sdc       1         10485760000   false       sata        true         t10.ATA     QEMU HARDDISK                           QM00001        QEMU HARDDISK
172.20.0.5   runtime     Disk   vda       1         12884901888   false       virtio      true
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-06-07 20:18:32 +04:00
Utku Ozdemir
0821b9c50b
feat: add --non-masquerade-cidrs flag to talosctl cluster create
Allow skipping NAT for the given destinations from a cluster network. This option makes it possible to form an etcd cluster from clusters in different networks created by running `talosctl cluster create` command multiple times using different CIDRs: they simply should have the CIDR of the other clusters passed with `--non-masquerade-cidrs`.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-04-23 13:30:13 +02:00
Dmitry Sharshakov
9456489147
feat: support hardware watchdog timers
Only enabled when activated by config, disabled on shutdown/reboot

Fixes #8284

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Dmitry Sharshakov <d3dx12.xx@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-25 18:19:39 +03:00
Andrey Smirnov
b2ad5dc5f8
fix: workaround a race in CNI setup (talosctl cluster create)
When provisioning VMs, each launch process sets up CNI network, and from
time to time CNI setup fails with something like:

```
error provisioning CNI network: plugin type="firewall" failed (add): running [/sbin/iptables -t filter -N CNI-ADMIN --wait]: exit status 4: iptables v1.8.10 (nf_tables)
```

This a race condition in the CNI plugins, and it looks like there is no
fix for it (see e.g. https://github.com/hashicorp/nomad/issues/8838).

As a workaround, take a mutex around CNI operation to serialize them.
CNI setup happens in different processes, so use a file-based mutex.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-27 15:28:12 +04:00
Dmitriy Matrenichev
fa3b933705
chore: replace fmt.Errorf with errors.New where possible
This time use `eg` from `x/tools` repo tool to do this.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-14 17:39:30 +03:00
fazledyn-or
d7d4154d5d
chore: remove channel blocking in qemu launch
The channel is never read from.

Signed-off-by: fazledyn-or <ataf@openrefactory.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-02 18:57:36 +04:00
Andrey Smirnov
270604bead
fix: support user disks via symlinks
The core blockdevice library already supported resolving symlinks, we
just need to get the raw block device name from it, and use it
afterwards.

In QEMU provisioner, leave the first (system) disk as virtio (for
performance), and mount user disks as 'ata', which allows `udevd` to
pick up the disk IDs (not available for `virtio`), and use the symlink
path in the tests.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-05 22:02:56 +04:00
Andrey Smirnov
a52d3cda3b
chore: update gen and COSI runtime
No actual changes, adapting to use new APIs.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-09-22 12:13:13 +04:00
Andrey Smirnov
9608ef56dc
chore: allow bridge traffic with DHCP broadcast traffic
This is required for https://github.com/siderolabs/sidero/pull/1070, as
we need to allow DHCP traffic from Sidero controller running in a VM
through the bridge to other VMs.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-08-23 18:37:37 +04:00
Noel Georgi
170a73e161
chore: support creating qemu guest socket
Support creating a qemu guest agent socket so we can test
`qemu-guest-agent` extension in CI.

Ref: https://github.com/siderolabs/extensions/pull/173#issuecomment-1611911106

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-21 22:46:13 +05:30
Andrey Smirnov
fe0f46980f
feat: implement secure boot from disk
This includes sd-boot handling, EFI variables, etc.

There are some TODOs which need to be addressed to make things smooth.

Install to disk, upgrades work.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-16 20:15:16 +05:30
Noel Georgi
3a865370f5
feat: qemu secureboot
Add qemu support for secureboot testing via `talosctl cluster create`.

Can be tested via:

```bash
sudo -E _out/talosctl-linux-amd64 cluster create --provisioner=qemu $REGISTRY_MIRROR_FLAGS --controlplanes=1 --workers=1 --iso-path=_out/talos-uki-amd64.iso --with-secureboot=true --with-tpm2=true --skip-injecting-config --with-apply-config
```

This currently only supports just booting Talos in SecureBoot mode.
Installation and Upgrade comes as extra PRs.

Fixes: #7324

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-06 19:20:07 +05:30
Andrey Smirnov
96aa9638f7
chore: rename talos-systems/talos to siderolabs/talos
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-03 16:50:32 +04:00
Andrey Smirnov
30bbf6463a
refactor: use siderolabs/net version with netip.Addr
Replace most of `net.IP` usage in Talos with `netip.Addr`, refactor code
accordingly.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-02 14:21:03 +04:00
Andrey Smirnov
343c55762e
chore: replace talos-systems Go modules with siderolabs
This the first step towards replacing all import paths to be based on
`siderolabs/` instead of `talos-systems/`.

All updates contain no functional changes, just refactorings to adapt to
the new path structure.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-01 12:55:40 +04:00
Dmitriy Matrenichev
fc48849d00
chore: move maps/slices/ordered to gen module
Use github.com/siderolabs/gen

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-21 20:22:43 +03:00
Noel Georgi
357b770cb5
fix: cryptsetup delete slot
Fix cryptsetup delete slot.

Fixes: #6298

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-21 16:37:54 +05:30
Dmitriy Matrenichev
4dbbf4ac50
chore: add generic methods and use them part #2
Use things from #5702.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-06-09 23:10:02 +08:00
Andrey Smirnov
ec621477bd
chore: tune QEMU disk provisioner options
As QEMU clusters are used for testing, use unsafe cache options to
reduce amount of fsyncs going to the host blockdevice.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-21 22:39:30 +03:00
Andrey Smirnov
99338e5ffd
feat: update Flannel to 0.15.1
https://github.com/flannel-io/flannel/releases/tag/v0.15.1

Also updates CNI plugins to 1.0.1.

See:

* https://github.com/talos-systems/pkgs/pull/363
* https://github.com/talos-systems/extras/pull/31

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-02 17:48:41 +03:00
Andrey Smirnov
9d803d75bf
chore: bump dependencies and drop firecracker support
Note: Talos can be still run under `Firecracker`, support for
Firecracker was only removed for `talosctl cluster create`.

Reason:

* code is untested/unmaintained, and probably doesn't work correctly
* firecracker Go SDK pulls lots of dependencies and it blocks CNI Go
module update

Bonus: `talosctl-linux-amd64` shrinks by 2 MiB.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-09-20 17:13:34 +03:00
Andrey Smirnov
33119d2b8e chore: add an option to launch cluster with bad RTC state
This is useful for time sync testing.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-23 13:08:20 -07:00
Alexey Palazhchenko
4fe6912143 test: better talosctl ls tests
Refs #3018.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-05-20 03:29:21 -07:00
Andrey Smirnov
82804414fc test: provide a way to force different boot order in provision library
There's no change to the default behavior. This change is going to be
used with Sidero/Sfyra.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-18 13:38:22 -07:00
Artem Chernyshev
22f375300c chore: update golanci-lint to 1.38.0
Fix all discovered issues.
Detected couple bugs, fixed them as well.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-12 06:50:02 -08:00
Alexey Palazhchenko
df52c13581 chore: fix //nolint directives
That's the recommended syntax:
https://golangci-lint.run/usage/false-positives/

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-05 05:58:33 -08:00
Andrey Smirnov
7f3dca8e4c test: add support for IPv6 in talosctl cluster create
Modify provision library to support multiple IPs, CIDRs, gateways, which
can be IPv4/IPv6. Based on IP types, enable services in the cluster to
run DHCPv4/DHCPv6 in the test environment.

There's outstanding bug left with routes not being properly set up in
the cluster so, IPs are not properly routable, but DHCPv6 works and IPs
are allocated (validates DHCPv6 client).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-09 13:28:53 -08:00
Andrey Smirnov
78eecc0574 chore: enable virtio-balloon and monitor in QEMU provisioner
Ballooning is not automatic, but it can be verified via QEMU monitor by
inflating/deflating the balloon inside the VM.

Monitor can be used like that:

```
$ sudo socat - unix-connect:/home/smira/.talos/clusters/talos-default/talos-default-master-1.monitor
QEMU 5.0.0 monitor - type 'help' for more information
(qemu) info status
info status
VM status: running
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-15 10:36:48 -08:00
Andrey Smirnov
c5ffe9f4f7 test: add support for mounting ISO in talosctl cluster create
If disk is empty and ISO path is given, QEMU provisioner mounts ISO on
the first boot.

To drop into maintenance mode:

```
talosctl cluster create --provisioner=qemu --iso-path=./_out/talos-amd64.iso --skip-injecting-config --wait=false
```

Then inject the config, bootstrap the node, wait for it to come up (via
`talosctl cluster health`).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-10 05:55:44 -08:00
Andrey Smirnov
1eac88e470 feat: add support for installing to SBCs
This introduces the notion of a "board" in Talos. A board is an interface that is capable
of modifying the installation in specific ways for a given SBC. This also adds support for the
libretech_all_h3_cc_h5.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-11-26 07:18:25 -08:00
Andrey Smirnov
7767a41d4a feat: set interface MTU in DHCP mode even if DHCP is not successful
Fixes #2789

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-19 10:59:21 -08:00
Andrey Smirnov
a2efa44663 chore: enable gci linter
Fixes were applied automatically.

Import ordering might be questionable, but it's strict:

* stdlib
* other packages
* same package imports

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-09 08:09:48 -08:00
Artem Chernyshev
061b296530 feat: allow specifying user-disks in talosctl cluster create
User-disks are supported by QEMU and Firecracker providers.
Can be defined by using the following parameters:
```
--user-disk /mount/path:1GB
```

Can get more than 1 user disk.
Same set of user disks will be created for all master and worker nodes.

Additionally enable user-disks in qemu e2e test.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-10-30 08:44:08 -07:00
Andrey Smirnov
569527e6ed test: potential fix for talosctl cluster destroy being stuck
Missing timeout in shutdown is the only reason I could find for Sfyra
tests being stuck on teardown.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-09 05:10:08 -07:00
Andrey Smirnov
018086d1fa refactor: extract blockdevice library
Library `blockdevice` was extracted as `talos-systems/go-blockdevice`,
this PR finalizes the move by removing Talos copy of it.

Some functions around `mkfs`/`growfs` were extracted as `makefs`
package, as they depend on `cmd` package.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-05 11:18:43 -07:00
Andrey Smirnov
7f5ffdacb8 test: implement API for QEMU VM provisioner
Fixes #2515

This implements simple HTTP API which should cover same methods as IPMI
methods in Sidero.

Examples:

```
$ curl http://172.20.0.1:34791/status
{"PoweredOn":false}
```

```
$ curl -X POST http://172.20.0.1:34791/poweroff
```

API listens on bridge address, each VM has unique port which can be
found in cluster state as `apiport: NNNN`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-09-15 15:05:55 -07:00
Andrey Smirnov
d60adf9e3b test: add support for PXE nodes in qemu provision library
This isn't supposed to be used ever in Talos directly, but rather only
in integration tests for Sidero.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-21 06:49:32 -07:00
Andrey Smirnov
9379cf9ee1 refactor: expose provision as public package
This change is only moving packages and updating import paths.

Goal: expose `internal/pkg/provision` as `pkg/provision` to enable other
projects to import Talos provisioning library.

As cluster checks are almost always required as part of provisioning
process, package `internal/pkg/cluster` was also made public as
`pkg/cluster`.

Other changes were direct dependencies discovered by `importvet` which
were updated.

Public packages (useful, general purpose packages with stable API):

* `internal/pkg/conditions` -> `pkg/conditions`
* `internal/pkg/tail` -> `pkg/tail`

Private packages (used only on provisioning library internally):

* `internal/pkg/inmemhttp` -> `pkg/provision/internal/inmemhttp`
* `internal/pkg/kernel/vmlinuz` -> `pkg/provision/internal/vmlinuz`
* `internal/pkg/cniutils` -> `pkg/provision/internal/cniutils`

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-12 05:12:05 -07:00