Introduce `cluster.NodeInfo` to represent the basic info about a node which can be used in the health checks. This information, where possible, will be populated by the discovery service in following PRs. Part of siderolabs#5554.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Previously talosctl would accept multiple nodes for the bootstrap
command which is a strictly single-node operation. Talosctl will abort
the bootstrap command if more than one node is specified either as a
command-line flag or in talosconfig.
Fixes#5636
Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
It wasn't used when building an endpoint to the local API server, so
Talos couldn't talk to the local API server when port was changed from
the default one.
Fixes#5706
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This fixes an issue when invalid `--mode` option was treated as a
default mode.
Fixes#5712
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Run `xfs_repair` on XFS filesystems that needs repairing indicated by
the `unix.EUCLEAN` error when mounting
Fixes#5319Fixes#5437
Signed-off-by: Noel Georgi <git@frezbo.dev>
Version API is only available over SideroLink connection.
This is useful to find Talos version as it got booted (e.g. to generate
proper machine configuration).
There's a security concern that version API might return sensitive
information via public API. At the same time Talos version can be
guessed by looking at the output of other APIs, e.g. resource type list
(`talosctl get rd`), which changes with every minor version.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Instead of hardcoded `grub.cfg`, use common code to generate list of
kernel arguments and allow using `--extra-kernel-arg` as well.
Before the change:
```
linux /boot/vmlinuz init_on_alloc=1 slab_nomerge pti=on panic=0 consoleblank=0 printk.devkmsg=on earlyprintk=ttyS0 console=tty0 console=ttyS0 talos.platform=metal
```
New (default line):
```
linux /boot/vmlinuz talos.platform=metal earlyprintk=ttyS0 console=ttyS0 console=tty0 init_on_alloc=1 slab_nomerge pti=on consoleblank=0 nvme_core.io_timeout=4294967295 random.trust_cpu=on printk.devkmsg=on ima_template=ima-ng ima_appraise=fix ima_hash=sha512
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
`GenerateKey` generates random 32 bytes vs. the key suitable for
Wireguard endpoint key.
This is the only place in code with this bug, and it is only used in
test code (`talosctl cluster create` with fixed Wireguard
configuration).
SideroLink and Kubespan are not affected.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Add extra context to error message when unable to properly
open the talos config file when creating a cluster.
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
The user will get an error message and talosctl aborts if `talosctl cluster create` is called with gen options and the --input-dir flag.
Fixes#2275
Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
The new mode allows changing the config for a period of time, which
allows trying the configuration and automatically rolling it back in case
if it doesn't work for example.
The mode can only be used with changes that can be applied without a
reboot.
When changed it doesn't write the configuration to disk, only changes it
in memory.
`--timeout` parameter can be used to customize the rollback delay.
The default timeout is 1 minute.
Any consequent configuration change will abort try mode and the last
applied configuration will be used.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Not sure how and when it got broken, but we're looking for mounts for
the blockdevice (like `/dev/vda`), while the actual mount info contains
the partition device (like `/dev/vda6`).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
With update of the client library to 3.5.3, etcd library started using
the logger, so using `nil` isn't fine anymore.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Dry run prints out config diff, selected application mode without
changing the configuration.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Containerd CRI plugin was merged into the main repo, but we were using
old import path, so our constants coming from the module were outdated.
This fixes the image version for the pause container.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Docker by default disable IPv6 completely in the containers which breaks
SideroLink on Docker-based clusters, as SideroLink is using IPv6
addresses for the Wiregurard tunnel.
This change might break `talosctl cluster create` on host systems which
have IPv6 disabled completely, so provide a flag to revert this
behavior.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
In the case of a node being reset, using kexec greatly
speeds up the process. However, in the event the boot
partition is wiped, a full reboot is required.
Closes#4670
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
When interating with the kubeconfig it can be
expected that a user may have the KUBECONFIG
environment variable set, so we need to use
it when appropriate.
Closes#5091
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
Generate separate file for each variable and assign them during go build using go:embed instead of using ldflags -X.
Resolves#5138
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
With system extensions, size of the `initramfs` might increase
significantly. With 1000 MiB `/boot`, as we store `A` and `B` boot
directories, we have 500 MiB for each Talos boot (size of the kernel and
initramfs).
Fixes#5096
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
They were discovered as we tagged 1.0.0 version:
* wrong deprecated version
* incompatibility in extension compatibility checks
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4947
It turns out there's something related to boot process in BIOS mode
which leads to initramfs corruption on later `kexec`.
Booting via GRUB is always successful.
Problem with kexec was confirmed with:
* direct boot via QEMU
* QEMU boot via iPXE (bundled with QEMU)
The root cause is not known, but the only visible difference is the
placement of RAMDISK with UEFI and BIOS boots:
```
[ 0.005508] RAMDISK: [mem 0x312dd000-0x34965fff]
```
or:
```
[ 0.003821] RAMDISK: [mem 0x711aa000-0x747a7fff]
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
I believe it serves no purpose in GRUB config: GRUB pre-loads
`initramfs` into memory anyways, so kernel doesn't need to know, nor has
now way to load it from anywhere.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4987
As machinery is supposed to be widely used project, and gRPC lacks
proper support to override default codec easily, it might come into
conflict with other projects.
Instead, move codec to core talos, and register it explicitly in the
server code (which covers machined, apid, trustd) and client code
(talosctl).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Pin talos default k8s version to `talosctl gen config`
Signed-off-by: Charlie Haley <charlie.haley@hotmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4816
This changes the way system extensions are packaged into the squashfs
images: `/lib/firmware` is now moved out of the future squashfs images
and becomes part of `initramfs` to make firmware available in the early
boot.
Talos will bind-mount `/lib/firmware` into rootfs as well, so it will be
available in the rootfs as well.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Cordon & drain a node when the Shutdown message is received.
Also adds a '--force' option to the shutdown command in case the control
plane is unresponsive.
Signed-off-by: Tim Jones <timniverse@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Include filename content if value begins with @ (see curl for example).
Add multiple config-path option on cmdline to apply them in order.
ex:
```
talosctl-linux-amd64 gen config talos1 https://127.0.0.1:6443 --config-patch-control-plan @cidrs.json --config-patch-worker @sysctls-workders.json --config-path @cluster-name.json
```
Load JSON patch from YAML.
This applies to all commands handling config patches.
Closes: https://github.com/talos-systems/talos/issues/4764
Signed-off-by: Sébastien Bernard <sbernard@nerim.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4815
This implements the following steps:
* machine configuration updates
* pulling and unpacking system extension images
* validating, listing system extensions
* re-packing system extensions
* preserving installed extensions in `/etc/extensions.yaml`
Once extension is enabled, raw information can be queried with:
```
$ talosctl -n 172.20.0.2 cat /etc/extensions.yaml
layers:
- image: 000.ghcr.io-smira-gvisor-c927b54-dirty.sqsh
metadata:
name: gvisor
version: 20220117.0-v1.0.0
author: Andrew Rynhard
description: |
This system extension provides gVisor using containerd's runtime handler.
compatibility:
talos:
version: '> v0.15.0-alpha.1'
```
This was tested with the `gvisor` system extension.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos shouldn't try to re-encode the machine config it was provided
with.
So add a `ReadonlyWrapper` around `*v1alpha1.Config` which makes sure
that raw config object is not available anymore (it's a private field),
but config accessors are available for read-only access.
Another thing that `ReadonlyWrapper` does is that it preserves the
original `[]byte` encoding of the config keeping it exactly same way as
it was loaded from file or read over the network.
Improved `talosctl edit mc` to preserve the config as it was submitted,
and preserve the edits on error from Talos (previously edits were lost).
`ReadonlyWrapper` is not used on config generation path though - config
there is represented by `*v1alpha.Config` and can be freely modified.
Why almost? Some parts of Talos (platform code) patch the machine
configuration with new data. We need to fix platforms to provide
networking configuration in a different way, but this will come with
other PRs later.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4656
As now changes to kubelet configuration can be applied without a reboot,
`talosctl upgrade-k8s` can handle the kubelet upgrades as well.
The gist is simply modifying machine config and waiting for `Node`
version to be updated, rest of the code is required for reliability of
the process.
Also fixed a bug in the API while watching deleted items with
tombstones.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Makes `talosctl` autocomplete the most used dynamic positional parameters like resource definitions, IDs of resource definitions, and also values for arguments like `--nodes` and `--context`.
Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>