Add a SupportsFactoryTalosctlDownload quirk to mark the minimum version that supports talosctl downloads from factory
Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>
(cherry picked from commit b43c3a124f6c6d1523c1feaddc9c4a23454eeb56)
Kill old-style "manual" tests, use `ctest` consistently now.
This should be a no-op refactoring.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit df0b9a8da1423842d830261e5ddc5dc8f5a234c1)
Allow authenticating to Image Factory (if Image Factory is configured
for auth). This applies to HTTP downloads (e.g. ISO), and registry auth
is injected into Talos as well.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit c2948cef232f6a175312636369b444124cb995db)
The final Kubernetes version for Talos v1.13.0.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit ecf2fa855b8eb19731b228990a3acbe1430ccad4)
While the OOM pressure is high, we might observe "extra kills" as there
are no other victims to kill anymore (as `stress-ng` is already gone).
Tolerate those kills, but log them in case we see this getting out of
hand.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 71aeb347f90969cb6057651666bfda205269d917)
Use defer blocks and error joining to guarantee uncordon cleanup
runs regardless of reboot/upgrade success or failure. Prevents nodes
from staying cordoned when operations fail.
Also added gRPC keepalive params to prevent timeout issues during
long operations.
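
A minimal sketch of the defer plus error-joining pattern described above.
The `node` type, `withCordon` helper and `errRebootFailed` value are
hypothetical names used only for illustration, not the actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a Kubernetes node handle.
type node struct {
	name     string
	cordoned bool
}

func (n *node) cordon() error   { n.cordoned = true; return nil }
func (n *node) uncordon() error { n.cordoned = false; return nil }

// errRebootFailed is a hypothetical failure used in the example below.
var errRebootFailed = errors.New("reboot failed")

// withCordon cordons the node, runs op, and always uncordons afterwards.
// Errors from op and from uncordon are joined so that neither is lost.
func withCordon(n *node, op func() error) (err error) {
	if err = n.cordon(); err != nil {
		return fmt.Errorf("cordon %s: %w", n.name, err)
	}

	defer func() {
		// Runs whether op succeeded or failed.
		err = errors.Join(err, n.uncordon())
	}()

	return op()
}

func main() {
	n := &node{name: "worker-1"}

	err := withCordon(n, func() error { return errRebootFailed })

	fmt.Println("error:", err)
	fmt.Println("still cordoned:", n.cordoned)
}
```

The named return value lets the deferred function attach the uncordon
error to whatever the operation returned.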
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
(cherry picked from commit 3db14309e058cacc2ab8664944fc18f80a3bb747)
A sample failure:
```
manifests.go:133:
Error Trace: /src/internal/integration/k8s/manifests.go:133
Error: []string{"/usr/local/bin/kube-proxy", "--cluster-cidr=10.244.0.0/16", "--conntrack-max-per-core=0", "--hostname-override=$(NODE_NAME)", "--kubeconfig=/etc/kubernetes/kubeconfig", "--proxy-mode=nftables"} does not contain "--nodeport-addresses=0.0.0.0/0"
Test: TestIntegration/k8s.ManifestsSuite/TestSync
manifests.go:137: disabling kube-proxy
```
My running theory is that `List()` picks up a stale pod, so try to
filter it out, and log it in full if we hit it.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 9b9542cc55ee6d08f3490d270c1b497c7b9d3049)
Fixes #13169
Also fixes a number of other issues with controller being stuck
"watching" over stale data.
The major part of the change is to watch contents of kubelet's
kubeconfig and restart the watch when it changes.
The internals of the watch process don't always bubble up errors
properly, or we don't watch for errors.
With this change, not only does the initial sync have a timeout and a
way to abort the sync process, but Talos can now also restart the sync
on kubeconfig change, making the process more transparent.
This might become irrelevant if we start managing kubeconfig via Talos
controlplane for workers, but for now this seems to be the way to fix
issues.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 149592fa59d20c5aa29e4c0af9a3760585f378ce)
See #13159, newer GPU operator v26.3.1 has better detection.
Signed-off-by: Noel Georgi <git@frezbo.dev>
(cherry picked from commit bba0b4aeefd7ec0daf7cc048e48c66d8b614f576)
At the end of every sequence that intentionally terminates the machine (reboot, shutdown, upgrade, etc.), a fatal event is published to signal expected termination. The machine status controller was unconditionally flipping the stage to "rebooting" on this event, which was correct for sequences that end in a reboot but incorrect for the shutdown sequence whose expected termination is a power-off.
The stage tracker now skips this transition when the current sequence is shutdown, so the machine stays in "shutting down" until it actually powers off.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
(cherry picked from commit c028db0b8d25e85a4b580e10252d964785320291)
Make sure we also run the check commands on the same node where we created the pool.
Fixes: #13014
Signed-off-by: Noel Georgi <git@frezbo.dev>
(cherry picked from commit 7fa4d39197e1a9e54ba8a259c111f2cb8047ef9c)
This check was in the maintenance Upgrade API for Talos <= 1.12,
so keep it in the "normal" API as well.
It always makes sense: the upgrade would fail if Talos is not
installed, but that failure in the legacy Upgrade API is async and is
not reported back properly.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 0d8362119e4415182caa9349e0ddfb27ea290d90)
During removal of an encryption key, we logged the slot of the current key instead of the slot of the removed key.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
(cherry picked from commit be58eafaba98bb7b1bcd20ac1ed8f8b03734c7e0)
There are no security issues fixed.
Drop username/password creds - they were not used.
Improve security of token interceptor.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 9fbb7c95df2b1dcd68fafa23865412bbd8300f4b)
They re-enabled support for absolute symlinks, but symlinks which target
paths with `../` are still dropped.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 212182e6f655f61e8917059868fc381728e4a959)
Remove the skip statements/rework the code to allow
FIPS builds to do Wireguard by wrapping Wireguard operations
into `fips140.WithoutEnforcement` blocks.
Using Wireguard (or not using it) is still a user's choice, but this
allows tests to run in strict mode.
There might be more fixes required for FIPS strict mode; right now this
is blocked by a Go issue with X25519, the fix for which is going to be
backported to Go 1.26.3.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 1ef8e630ab77b3c849e7da6d1ff83e7c6795f070)
Reset ARPIPTargets and NSIP6Targets at the start of BondMasterSpec.Decode.
Without this, repeated decode calls on the same struct can retain old target
entries after config removes them, which makes link status drift from
current bond configuration.
Add a regression test that decodes a payload with targets, then decodes a
payload without target attributes into the same struct and asserts both
slices are empty.
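
A minimal sketch of the reset-before-decode idea, assuming a simplified
struct. The `bondTargets` and `attr` types and the attribute kind strings
are hypothetical; the real BondMasterSpec and its wire format differ:

```go
package main

import "fmt"

// bondTargets is a hypothetical, simplified stand-in for the two target
// fields of BondMasterSpec.
type bondTargets struct {
	ARPIPTargets []string
	NSIP6Targets []string
}

// attr is a hypothetical decoded attribute: a kind plus a value.
type attr struct {
	kind  string
	value string
}

// Decode fills the struct from attrs. Resetting the slices first ensures
// that targets from an earlier call do not survive when attrs no longer
// carries them.
func (b *bondTargets) Decode(attrs []attr) {
	b.ARPIPTargets = nil
	b.NSIP6Targets = nil

	for _, a := range attrs {
		switch a.kind {
		case "arp_ip_target":
			b.ARPIPTargets = append(b.ARPIPTargets, a.value)
		case "ns_ip6_target":
			b.NSIP6Targets = append(b.NSIP6Targets, a.value)
		}
	}
}

func main() {
	var spec bondTargets

	// First payload carries targets, second one carries none.
	spec.Decode([]attr{
		{kind: "arp_ip_target", value: "192.0.2.1"},
		{kind: "ns_ip6_target", value: "2001:db8::1"},
	})
	spec.Decode(nil)

	fmt.Println(len(spec.ARPIPTargets), len(spec.NSIP6Targets))
}
```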
Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 0a47f40b3cdf304a079c6b3fa964e9f82e91ec63)
This is a fixup for #12475.
Split the protobuf id for extraArgs fields to use a new value, so that
we don't get unmarshal failures when using newer machinery with older
Talos or vice versa.
Also pull in a fix: https://github.com/siderolabs/go-talos-support/pull/15
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit f6e7346fa725a703ac4281854150d7a3be12c8d1)
NVIDIA extensions test with UKI boot.
Fixes: #11397
Signed-off-by: Noel Georgi <git@frezbo.dev>
(cherry picked from commit 3ba35c9b9fca9c54e596d5c6df61d515a4a39555)
Add an integration test and fix legacy upgrade API in maintenance mode.
There were several assumptions which do not hold true in maintenance as
we have no machine configuration.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit c464c7e88a3f058cb2bbc36af1910d69d903cd07)
Also fix one more place where version.Name wasn't used properly.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 4ba11156fd164a0d94538508f5c028f249deed50)
Getting ready for 1.36.0 final release.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit b7512d9125b623d2bb92e3a8b5839e85e1309a39)
Add NVIDIA arm64 test matrix.
Also ensure we have a known baseline for NVIDIA CDI files, so that
if upstream adds more files and we don't install them to the right
location, the test fails.
Signed-off-by: Noel Georgi <git@frezbo.dev>
(cherry picked from commit 6a3ab87c54f83f70869a2e298e6ed7722cf4afad)
When hostDNS.enabled is false but forwardKubeDNSToHost defaults to true
(via version contract >= 1.8), the controller still writes the host DNS
service address into HostDNSConfig. This causes CoreDNS pods to get a
resolv.conf pointing to 169.254.116.108 while nothing listens there,
leading to DNS query timeouts.
Add a config validation error when forwardKubeDNSToHost is true while
hostDNS.enabled is false.
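
A minimal sketch of such a validation rule. The `hostDNSConfig` type, the
`Validate` method and the error text are hypothetical and only illustrate
the check, not the actual config code:

```go
package main

import (
	"errors"
	"fmt"
)

// hostDNSConfig is a hypothetical, simplified view of the two settings,
// holding their effective values after defaults are applied.
type hostDNSConfig struct {
	Enabled              bool
	ForwardKubeDNSToHost bool
}

// errForwardWithoutHostDNS uses illustrative wording, not the real message.
var errForwardWithoutHostDNS = errors.New(
	"forwardKubeDNSToHost requires hostDNS.enabled to be true",
)

// Validate rejects the combination that would point CoreDNS at an address
// where nothing listens.
func (c hostDNSConfig) Validate() error {
	if c.ForwardKubeDNSToHost && !c.Enabled {
		return errForwardWithoutHostDNS
	}

	return nil
}

func main() {
	cfg := hostDNSConfig{Enabled: false, ForwardKubeDNSToHost: true}

	fmt.Println("validation error:", cfg.Validate())
}
```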
Fixes siderolabs/talos#13100
Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr>
(cherry picked from commit ca208e51492c4584f9a4cea4d0762c2199f703e7)
When run in "normal" mode, `talosctl` takes into account proxy
configuration, such as the `https_proxy` and `no_proxy` environment
variables; but when invoked with `--insecure`, those would be ignored,
which results in `talosctl` being unable to interact with nodes in
maintenance mode if they're only reachable through a proxy.
This commit adds the `WithDefaultGRPCDialOptions()` option to the
client created by `WithClientMaintenance()`, same as `WithClient()`.
Signed-off-by: Benoît Knecht <benoit.knecht@proton.ch>
(cherry picked from commit 21f459aab5d8ac2841aa69a9237ca3faa06da7df)
For IPv4, they should be attached to no interfaces.
Discovered while doing some manual testing for the documentation.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 0bfdf7f7035fefe804ec4b568709cd6a09195293)
Allow setting the build NAME at build time, and propagate it down to more consumers.
Expose name in `Version` resource, and use that in the dashboard
next to Talos version.
Fix some places where `Name` was hardcoded.
Propagate Name down to UKI build.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 968ec1e0ca26eb1f0de0836e0a55df09dea7dafe)
Update to resolve Dependabot alerts; none of these
are important, as they all come via tools.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 0cfa6e3024100e34692a0b10e9dacb762c16a626)
Allow both /etc/ld.so.conf and /etc/ld.so.cache files in /etc since tools expect these to be standard.
See: https://github.com/siderolabs/extensions/pull/1031
Replaces changes for Dockerfile from #12909
Signed-off-by: Noel Georgi <git@frezbo.dev>
(cherry picked from commit 414f78a298fc1a196fe310b17b89d3aadc15e1b4)