The new mode applies a configuration change for a limited period of time, which
makes it possible to try a configuration and have it rolled back automatically
if, for example, it doesn't work.
The mode can only be used with changes that can be applied without a
reboot.
In this mode the configuration is not written to disk; it is changed only
in memory.
The `--timeout` parameter can be used to customize the rollback delay.
The default timeout is 1 minute.
Any subsequent configuration change will abort try mode, and the last
applied configuration will be used.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Not sure how and when it got broken, but we're looking for mounts for
the blockdevice (like `/dev/vda`), while the actual mount info contains
the partition device (like `/dev/vda6`).
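A minimal sketch of the kind of matching this implies: treat a mount source as belonging to a disk if it is the disk itself or one of its partitions. The helper name `belongsToDisk` is hypothetical, and the partition naming rule (a `p` separator when the disk name ends in a digit) is the usual Linux convention, assumed here rather than taken from the Talos code.

```go
package main

import (
	"fmt"
	"strings"
)

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// belongsToDisk reports whether source (as seen in mount info) is the disk
// itself or a partition of it, e.g. /dev/vda6 for /dev/vda.
func belongsToDisk(disk, source string) bool {
	if source == disk {
		return true
	}
	if disk == "" || !strings.HasPrefix(source, disk) {
		return false
	}
	suffix := source[len(disk):]
	// Disks whose name ends in a digit (nvme0n1) use a "p" separator.
	if isDigit(disk[len(disk)-1]) {
		if !strings.HasPrefix(suffix, "p") {
			return false
		}
		suffix = suffix[1:]
	}
	if suffix == "" {
		return false
	}
	// The rest must be the partition number.
	for i := 0; i < len(suffix); i++ {
		if !isDigit(suffix[i]) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(belongsToDisk("/dev/vda", "/dev/vda6"))
	fmt.Println(belongsToDisk("/dev/vda", "/dev/vdb1"))
}
```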
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
With the update of the client library to 3.5.3, the etcd library started using
the logger, so passing `nil` isn't fine anymore.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Dry run prints out the config diff and the selected application mode without
changing the configuration.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
The containerd CRI plugin was merged into the main repo, but we were still using
the old import path, so our constants coming from the module were outdated.
This fixes the image version for the pause container.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Docker by default disables IPv6 completely in containers, which breaks
SideroLink on Docker-based clusters, as SideroLink uses IPv6
addresses for the WireGuard tunnel.
This change might break `talosctl cluster create` on host systems which
have IPv6 disabled completely, so provide a flag to revert this
behavior.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
In the case of a node being reset, using kexec greatly
speeds up the process. However, in the event the boot
partition is wiped, a full reboot is required.
Closes #4670
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
When interacting with the kubeconfig, a user can be
expected to have the KUBECONFIG
environment variable set, so we need to use
it when appropriate.
Closes #5091
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
Generate a separate file for each variable and assign them during go build using go:embed instead of using ldflags -X.
Resolves #5138
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
With system extensions, the size of the `initramfs` might increase
significantly. With a 1000 MiB `/boot`, as we store the `A` and `B` boot
directories, we have 500 MiB for each Talos boot (size of the kernel and
initramfs).
Fixes #5096
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
These issues were discovered as we tagged the 1.0.0 version:
* wrong deprecated version
* incompatibility in extension compatibility checks
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4947
It turns out there's something related to the boot process in BIOS mode
which leads to initramfs corruption on a later `kexec`.
Booting via GRUB is always successful.
Problem with kexec was confirmed with:
* direct boot via QEMU
* QEMU boot via iPXE (bundled with QEMU)
The root cause is not known, but the only visible difference is the
placement of RAMDISK with UEFI and BIOS boots:
```
[ 0.005508] RAMDISK: [mem 0x312dd000-0x34965fff]
```
or:
```
[ 0.003821] RAMDISK: [mem 0x711aa000-0x747a7fff]
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
I believe it serves no purpose in the GRUB config: GRUB pre-loads
`initramfs` into memory anyway, so the kernel doesn't need to know, nor has
any way to load it from anywhere.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4987
As machinery is supposed to be a widely used project, and gRPC lacks
proper support for easily overriding the default codec, it might come into
conflict with other projects.
Instead, move the codec to core Talos, and register it explicitly in the
server code (which covers machined, apid, trustd) and client code
(talosctl).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Pin talos default k8s version to `talosctl gen config`
Signed-off-by: Charlie Haley <charlie.haley@hotmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4816
This changes the way system extensions are packaged into the squashfs
images: `/lib/firmware` is now moved out of the future squashfs images
and becomes part of `initramfs` to make firmware available in the early
boot.
Talos will also bind-mount `/lib/firmware` into the rootfs, so it stays
available there too.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Cordon & drain a node when the Shutdown message is received.
Also adds a '--force' option to the shutdown command in case the control
plane is unresponsive.
Signed-off-by: Tim Jones <timniverse@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Include the file content if the value begins with @ (see curl for an example).
Allow multiple config patch options on the command line; they are applied in order.
Example:
```
talosctl-linux-amd64 gen config talos1 https://127.0.0.1:6443 --config-patch-control-plane @cidrs.json --config-patch-worker @sysctls-workers.json --config-patch @cluster-name.json
```
Load JSON patch from YAML.
This applies to all commands handling config patches.
Closes: https://github.com/talos-systems/talos/issues/4764
Signed-off-by: Sébastien Bernard <sbernard@nerim.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4815
This implements the following steps:
* machine configuration updates
* pulling and unpacking system extension images
* validating, listing system extensions
* re-packing system extensions
* preserving installed extensions in `/etc/extensions.yaml`
Once an extension is enabled, the raw information can be queried with:
```
$ talosctl -n 172.20.0.2 cat /etc/extensions.yaml
layers:
  - image: 000.ghcr.io-smira-gvisor-c927b54-dirty.sqsh
    metadata:
      name: gvisor
      version: 20220117.0-v1.0.0
      author: Andrew Rynhard
      description: |
        This system extension provides gVisor using containerd's runtime handler.
      compatibility:
        talos:
          version: '> v0.15.0-alpha.1'
```
This was tested with the `gvisor` system extension.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos shouldn't try to re-encode the machine config it was provided
with.
So add a `ReadonlyWrapper` around `*v1alpha1.Config` which makes sure
that the raw config object is not available anymore (it's a private field),
but config accessors are available for read-only access.
`ReadonlyWrapper` also preserves the
original `[]byte` encoding of the config, keeping it exactly as
it was loaded from file or read over the network.
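As a rough illustration of the idea (not the actual Talos implementation), a wrapper of this kind keeps the parsed config in a private field, exposes read-only accessors, and hands back the original bytes untouched. The `Config` struct and its `ClusterName` field below are stand-ins invented for the sketch.

```go
package main

import "fmt"

// Config is a stand-in for *v1alpha1.Config; its field is hypothetical.
type Config struct {
	ClusterName string
}

// ReadonlyWrapper hides the config object and keeps the original encoding.
type ReadonlyWrapper struct {
	cfg *Config // private: callers cannot reach the raw object
	raw []byte  // bytes exactly as they were loaded
}

// NewReadonlyWrapper copies both inputs so later changes by the caller
// do not leak into the wrapper.
func NewReadonlyWrapper(cfg *Config, raw []byte) *ReadonlyWrapper {
	c := *cfg
	return &ReadonlyWrapper{cfg: &c, raw: append([]byte(nil), raw...)}
}

// ClusterName is an example of a read-only accessor.
func (w *ReadonlyWrapper) ClusterName() string { return w.cfg.ClusterName }

// Bytes returns a copy of the original encoding, never a re-encoded form.
func (w *ReadonlyWrapper) Bytes() []byte { return append([]byte(nil), w.raw...) }

func main() {
	raw := []byte("cluster:\n  clusterName: demo   # comments survive\n")
	w := NewReadonlyWrapper(&Config{ClusterName: "demo"}, raw)
	fmt.Println(w.ClusterName())
	fmt.Print(string(w.Bytes()))
}
```

Returning copies from `Bytes` is a design choice of the sketch: it keeps the stored encoding safe from accidental modification by callers.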
Improved `talosctl edit mc` to preserve the config as it was submitted,
and preserve the edits on error from Talos (previously edits were lost).
`ReadonlyWrapper` is not used on the config generation path though: the config
there is represented by `*v1alpha1.Config` and can be freely modified.
The machine config is only almost read-only: some parts of Talos (platform code) patch the machine
configuration with new data. We need to fix platforms to provide
networking configuration in a different way, but this will come with
other PRs later.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4656
Now that changes to the kubelet configuration can be applied without a reboot,
`talosctl upgrade-k8s` can handle the kubelet upgrades as well.
The gist is simply modifying the machine config and waiting for the `Node`
version to be updated; the rest of the code is required for the reliability of
the process.
Also fixed a bug in the API while watching deleted items with
tombstones.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Makes `talosctl` autocomplete the most used dynamic positional parameters like resource definitions, IDs of resource definitions, and also values for arguments like `--nodes` and `--context`.
Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Related to #4448
The only remaining part is filtering out SideroLink addresses when Talos
looks for a node address.
See also https://github.com/talos-systems/siderolink/pull/2
The way to test it out:
```
$ talosctl cluster create ... --extra-boot-kernel-args siderolink.api=172.20.0.1:4000
```
(where 172.20.0.1 is the bridge IP)
Run `siderolink-agent` (test implementation):
```
$ sudo _out/siderolink-agent-linux-amd64
```
Now on the host, there should be a `siderolink` WireGuard userspace
tunnel:
```
$ sudo wg
interface: siderolink
  public key: 2aq/V91QyrHAoH24RK0bldukgo2rWk+wqE5Eg6TArCM=
  private key: (hidden)
  listening port: 51821

peer: Tyr6C/F3FFLWtnzqq7Dsm54B40bOPq6++PTiD/zqn2Y=
  endpoint: 172.20.0.1:47857
  allowed ips: fdae:41e4:649b:9303:b6db:d99c:215e:dfc4/128
  latest handshake: 2 minutes, 2 seconds ago
  transfer: 3.62 KiB received, 1012 B sent
...
```
Each Talos node is registered as a peer, and a tunnel is established.
You can now ping Talos nodes from the host over the tunnel:
```
$ ping fdae:41e4:649b:9303:b6db:d99c:215e:dfc4
PING fdae:41e4:649b:9303:b6db:d99c:215e:dfc4(fdae:41e4:649b:9303:b6db:d99c:215e:dfc4) 56 data bytes
64 bytes from fdae:41e4:649b:9303:b6db:d99c:215e:dfc4: icmp_seq=1 ttl=64 time=0.352 ms
64 bytes from fdae:41e4:649b:9303:b6db:d99c:215e:dfc4: icmp_seq=2 ttl=64 time=0.437 ms
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
`strconv.ParseInt` always returns an error for bitSize -1.
Found by the latest golangci-lint.
Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
This enables cluster discovery by default for Talos 0.14. KubeSpan is
not enabled by default.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>