Use defer blocks and error joining to guarantee uncordon cleanup
runs regardless of reboot/upgrade success or failure. Prevents nodes
from staying cordoned when operations fail.
Also added gRPC keepalive params to prevent timeout issues during
long operations.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
When run in "normal" mode, `talosctl` takes into account proxy
configuration, such as the `https_proxy` and `no_proxy` environment
variables; but when invoked with `--insecure`, those would be ignored,
which results in `talosctl` being unable to interact with nodes in
maintenance mode if they're only reachable through a proxy.
This commit adds the `WithDefaultGRPCDialOptions()` option to the
client created by `WithClientMaintenance()`, same as `WithClient()`.
Signed-off-by: Benoît Knecht <benoit.knecht@proton.ch>
Add --drain and --drain-timeout flags to `talosctl reboot` (default off)
and `talosctl upgrade` (default on) that cordon and drain the Kubernetes
node before rebooting, then wait for Ready and uncordon after it comes
back. When --drain is enabled, --wait is forced to true.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
Add support for machine-wide image verification configuration.
Configuration is a set of rules applied top-down to the image reference,
each specifying a cosign-based identity or a static public key claim.
Talos provides a machined API to verify an image reference, resolving it
to the digest on the way as needed.
Talos itself hooks into the image verification process, while the
containerd CRI plugin accesses the same API via the machined socket.
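A rough sketch of top-down rule evaluation, for illustration only. The `Rule` type, its field names, prefix matching, and first-match-wins semantics are all assumptions of this sketch and are not taken from the Talos configuration schema.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical shape for one verification rule.
type Rule struct {
	ImagePrefix string // rule applies to references starting with this prefix
	Verifier    string // e.g. a cosign identity or a static public key
}

// matchRule walks the rules top-down and returns the first one that applies.
// First-match-wins is an assumption of this sketch.
func matchRule(rules []Rule, imageRef string) (Rule, bool) {
	for _, r := range rules {
		if strings.HasPrefix(imageRef, r.ImagePrefix) {
			return r, true
		}
	}

	return Rule{}, false
}

func main() {
	rules := []Rule{
		{ImagePrefix: "ghcr.io/siderolabs/", Verifier: "cosign-identity"},
		{ImagePrefix: "", Verifier: "static-public-key"}, // catch-all rule
	}

	r, ok := matchRule(rules, "ghcr.io/siderolabs/kubelet:v1.0.0")
	fmt.Println(r.Verifier, ok)
}
```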
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add new `talosctl install` command using the LifecycleService.Install
streaming API with support for insecure (maintenance) mode and progress
reporting. Refactor `talosctl upgrade` to use the new
LifecycleService.Upgrade streaming API with automatic fallback to the
legacy MachineService.Upgrade path for older Talos versions.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
Via tools/pkgs, also pulling in Clang-built Linux
Update go.mod dependencies
Fix linter errors with new golangci-lint, modernize, use new()
Signed-off-by: Dmitrii Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Re-generate, fix new linting issues.
Update containerd library to the latest 2.2.1 to address the new cgroups
package import (via tools update).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
These new APIs only support one2one proxying, so they don't need any
hacks, and look like regular gRPC APIs.
Old APIs are deprecated, but still supported.
Implement client-side multiplexing in `talosctl`, provide fallback to
old APIs for legacy Talos versions.
New APIs include removing an image, importing an image.
Extracted from #12392
Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The interactive installer has been deprecated since the v1.12 cycle and
is now removed completely, including the API method.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Self-signed certificates are missing Subject/Issuer info,
which is not present in the CA. This can sometimes cause issues,
as the resulting format is invalid.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
In certain situations, Talos's shutdown/reboot sequence hangs while
waiting for services/mounts to be gracefully stopped (see:
https://github.com/siderolabs/talos/issues/11775).
This patch adds a forceful mode to the reboot sequence (`talosctl reboot
--mode force`) that bypasses graceful userspace teardown and hard
reboots the machine.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Fixes #10963
Also hides/deprecates `.machine.network.interfaces`, as every piece of
it is now available as proper multi-doc.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Update COSI, and stop using a fork of `gopkg.in/yaml.v3`; we now use the
new supported fork of this library.
Drop `MarshalYAMLBytes` for the machine config, as we actually marshal
config as a string, and we don't need this at all.
Make `talosctl` stop doing hacks on machine config for newer Talos, keep
hacks for backwards compatibility.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add new `--airgapped` flag to talos cluster create (qemu)
to disable NAT for the VMs, effectively making them air-gapped.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
Add new command that takes Talos version (semver) and generates a list
of images that are used in Image Factory for building Talos.
Fixes #11927
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
Use key provider with fallback option for auth type SideroV1:
- Attempt to use $HOME/.talos/keys directory to read/remove existing PGP file if it exists or write new PGP file if directory is writable.
- Otherwise fall back to using the $XDG_DATA_HOME/talos/keys directory.
- Add new talosctl flag --siderov1-key-dir (also configurable via SIDEROV1_KEYS_DIR env var) to allow customizing the directory to use for PGP keys.
Update documentation to remove the reference to $XDG_CONFIG_HOME for storing talosctl configuration, as it's not used anymore.
Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
Add a user-facing cluster create docker command with the following changes:
* renamed flags for simplicity and uniformity
* removed the bulk of the unnecessary flags
Other changes:
* split internal logic such that it's separate from the qemu cluster create logic
* refactor internal code aiming for simplicity
* change drives flag behavior in anticipation of the user facing create-qemu command
* extract code into separate functions
* add some unit tests
* remove the docker support from the cluster create command (docker is only supported via the user-facing create-docker command)
Signed-off-by: Orzelius <33936483+Orzelius@users.noreply.github.com>
Brings in Linux 6.12.21, go 1.24.2.
Also updates Go dependencies, golangci-lint, etc.
The configuration was migrated, fix new linting errors.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Update tools, pkgs, extras.
Brings in Go 1.23.1, Linux 6.6.52, new xfsprogs, etc.
Fork docs.
Add new version contract, etc.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Update tools, pkgs, extras, Go dependencies, Go tools, etc.
Linux 6.6.47 and containerd 2.0.0-rc.4.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
- replace `interface{}` with `any` using `gofmt -r 'interface{} -> any' -w`
- replace `a = []T{}` with `var a []T` where possible.
- replace `a = []T{}` with `a = make([]T, 0, len(b))` where possible.
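A small illustration of the three replacements above. The function `double` and the variable names are made up for this example. One reason the change applies only "where possible": a nil slice and an empty slice behave the same for `append`, `len`, and `range`, but encode differently with `encoding/json` (`null` versus `[]`).

```go
package main

import "fmt"

// double preallocates the result because the final length is known up front.
func double(b []int) []int {
	a := make([]int, 0, len(b)) // instead of a := []int{}

	for _, v := range b {
		a = append(a, v*2)
	}

	return a
}

func main() {
	var x any = 42 // instead of var x interface{} = 42

	var a []int // instead of a := []int{}; append works on a nil slice
	a = append(a, 1)

	fmt.Println(x, a, double([]int{1, 2, 3}))
}
```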
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
If the `reboot/reset/shutdown/upgrade` action tracker cannot read the boot ID from the node under `/proc/sys/kernel/random/boot_id` due to insufficient permissions (e.g., when `talosctl reboot` is used over Omni), fall back to skipping boot ID check instead of hard-failing.
Closes siderolabs/talos#7197.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This is a small quality of life improvement that allows `logs` subcommand to suggest all available logs.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
To be used in the `go-talos-support` module without importing the whole
Talos repo.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Before we started a reboot/shutdown/reset/upgrade action with the action tracker (`--wait`), we were setting a flag to prevent cobra from printing the returned error from the command.
This was to prevent the error from being printed twice, as the reporter of the action tracker already prints any errors that occurred during the action execution.
But if the error happens too early - i.e. before we even started the status printer goroutine, then that error wouldn't be printed at all, as we have suppressed the errors.
This PR moves the suppression flag to be set after the status printer is started - so we still do not double-print the errors, but neither do we suppress any early-stage error from being printed.
Closes siderolabs/talos#7900.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Currently, we use `github.com/coreos/go-semver/semver` and `github.com/hashicorp/go-version`
for version parsing. As we use `github.com/blang/semver/v4` in our other projects, and it
has more features, it makes sense to use it across the projects. It also doesn't allocate
like crazy in `KubernetesVersion.SupportedWith`.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fixes #7425
The previously used method doesn't handle YAML multi-doc, incorrectly
stripping only the first document and throwing away everything else.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
`config.Container` implements a multi-doc container which implements
both `Container` interface (encoding, validation, etc.), and `Config`
interface (accessing parts of the config).
Refactor `generate` and `bundle` packages to support multi-doc, and
provide backwards compatibility.
Implement a first (mostly example) machine config document for
SideroLink API URL.
Many places don't properly support multi-doc yet (e.g. config patches).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See #7230
Refactor more config interfaces, move config accessor interfaces
to different package to break the dependency loop.
Make `.RawV1Alpha1()` method typed to avoid type assertions everywhere.
No functional changes.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See #7230
This is a step towards preparing for multi-doc config.
Split the `config.Provider` interface into parts which have different
implementations:
* `config.Config` accesses the config itself, it might be implemented by
`v1alpha1.Config` for example
* `config.Container` will be a set of config documents, which implement
validation, encoding, etc.
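The split can be pictured with two small interfaces. The method names below (`Hostname`, `Validate`, `Bytes`) and the `doc` type are purely illustrative and are not the real method sets in the Talos `config` package.

```go
package main

import (
	"errors"
	"fmt"
)

// Config gives read access to the configuration itself.
type Config interface {
	Hostname() string // hypothetical accessor
}

// Container wraps config documents and handles validation and encoding.
type Container interface {
	Validate() error
	Bytes() ([]byte, error)
}

// doc is a toy type that satisfies both interfaces.
type doc struct{ hostname string }

func (d doc) Hostname() string { return d.hostname }

func (d doc) Validate() error {
	if d.hostname == "" {
		return errors.New("hostname is required")
	}

	return nil
}

func (d doc) Bytes() ([]byte, error) {
	return []byte("hostname: " + d.hostname + "\n"), nil
}

// Compile-time checks that doc implements both halves.
var (
	_ Config    = doc{}
	_ Container = doc{}
)

func main() {
	d := doc{hostname: "node-1"}
	b, _ := d.Bytes()

	fmt.Print(d.Validate() == nil, " ", string(b))
}
```

Callers that only read settings can depend on `Config`, while code that loads or writes configuration depends on `Container`.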
`Version()` method dropped, as it makes little sense and it was almost
not used.
`Raw()` method renamed to `RawV1Alpha1()` to support legacy direct
access to `v1alpha1.Config`, next PR will refactor more to make it
return proper type.
There will be many more changes coming up.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Message metadata might be missing; the simplest case is contacting a
worker directly, using it as both an endpoint and a node.
Fixes #7108
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
In the tracking of the `reset --reboot`, `reboot` and `upgrade` lifecycle commands, verify that the node was actually rebooted in the post check by comparing the pre- and post-check boot IDs.
In the `reset --reboot` post-check, try both maintenance and normal mode, since the reset might be issued to only remove `EPHEMERAL` partition, which will not put the node into the maintenance mode.
Fixes siderolabs/talos#7009.
Additionally, if an action tracking fails, return the error instead of swallowing it. This way the command terminates with a non-zero exit code. Suppress re-printing of this error after the command has run.
Fixes siderolabs/talos#6966.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Adds a new sub-command to talosctl config. It takes the context to be
deleted as an argument and supports glob matching.
A local flag --noconfirm|-y can be passed to bypass the confirmation
prompt.
It also supports dry run by passing the --dry-run flag similar to
apply-config and edit commands.
Example:
$ talosctl config remove 'ctx-*'
Remove context ctx-a? (y/N): y
Remove context ctx-b? (y/N): y
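The glob matching shown in the example can be sketched with the standard library. Whether the command itself uses `path.Match` is an assumption; `matchContexts` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path"
)

// matchContexts returns the context names that match the glob pattern.
func matchContexts(names []string, pattern string) ([]string, error) {
	var matched []string

	for _, name := range names {
		ok, err := path.Match(pattern, name)
		if err != nil {
			return nil, err // malformed pattern
		}

		if ok {
			matched = append(matched, name)
		}
	}

	return matched, nil
}

func main() {
	matched, err := matchContexts([]string{"ctx-a", "ctx-b", "prod"}, "ctx-*")
	fmt.Println(matched, err)
}
```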
Signed-off-by: Murtaza Udaipurwala <murtaza@murtazau.xyz>
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>