Make the setup phase of the test a bit more consistent: wait for the
machine to be ready and for `connection refused` errors to clear (after reboots).
This doesn't change anything in the tests themselves, but it should
hopefully reduce the number of flakes like: https://github.com/siderolabs/talos/actions/runs/15895820994/job/44827039818
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Zswap compresses pages in memory before they hit the actual swap
device.
Swap and zswap can be enabled together or independently of each other.
Fixes #10675
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes #10674
Provide a way to see current swap status, configure additional swap
devices (block) and de-configure them on the fly.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Instead of relying on the assumption that a CRI patch modifies the
final generated CRI config, rely on the checksum of the CRI
patch being included in the generated CRI config.
This also resolves Talos hanging on boot when a CRI patch is a no-op
(it doesn't change the generated config).
Fixes #11132
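A minimal sketch of the idea, not the actual Talos implementation: the readiness check looks for the patch checksum in the generated config instead of waiting for the config to differ. The helper names, the use of SHA-256, and the comment-style marker are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// patchChecksum returns the hex-encoded SHA-256 digest of the patch contents.
// The choice of hash is an assumption for this sketch.
func patchChecksum(patch []byte) string {
	sum := sha256.Sum256(patch)

	return hex.EncodeToString(sum[:])
}

// configIncludesPatch reports whether the generated config carries the
// checksum of the given patch. This holds even when the patch is a no-op.
func configIncludesPatch(generated string, patch []byte) bool {
	return strings.Contains(generated, patchChecksum(patch))
}

func main() {
	patch := []byte("# no-op patch\n")
	stale := "version = 2\n"
	// Hypothetical marker format; the real config layout may differ.
	fresh := stale + "# patch checksum: " + patchChecksum(patch) + "\n"

	fmt.Println(configIncludesPatch(stale, patch), configIncludesPatch(fresh, patch))
}
```

With a check of this shape, a no-op patch still produces an observable change (the checksum), so the waiter does not block forever.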
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The Linux kernel has the following policy:
* the initial bridge MAC is random
* if the bridge MAC is not set explicitly by userspace,
  the bridge MAC is the smallest MAC address of all ports
But systemd-udevd, which we use, started to assign "stable" MACs to bridge
interfaces (when they are created), which the Linux kernel treats as
explicitly set by userspace, so the bridge no longer automatically inherits
a MAC from its ports.
This is a breaking change, so we need to revert it.
Fixes #10884
Fixes #11011
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
In maintenance mode, we still accept any config.
Fixes #10897
As "normal" mode requires v1alpha1 config today, it should be an easy
fix to always require it as part of the applied config.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
It seems that the integration test introduced in
https://github.com/siderolabs/talos/pull/10792 is causing some
unintended side effects in kube-apiserver -> kubelet communication (most
likely around the TLS certificate).
Instead of assigning a dummy external IP, create a dummy link.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The bug was an incorrect condition: if `activeNetworkConfig` was ever
set to a non-nil value, it was stuck with this value forever, even when a new
network config became available via `networkConfig`.
In the `talosctl dashboard` case, the Talos `metal` platform always reports
initial data (before META is available) which doesn't have any network
config, but later on sends updates (if something updates META), so this
bug led to Talos being stuck with the initial empty network config.
Fixes #10787
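The class of bug described above can be illustrated with a small sketch. The type and function names here are hypothetical and simplified; this is not the actual Talos code.

```go
package main

import "fmt"

// NetworkConfig is a hypothetical stand-in for the platform network config.
type NetworkConfig struct {
	Hostname string
}

// pickBuggy keeps the active config once it is non-nil, so later updates
// are silently ignored.
func pickBuggy(active, incoming *NetworkConfig) *NetworkConfig {
	if active == nil {
		return incoming
	}

	return active
}

// pickFixed prefers the newest config whenever one is available.
func pickFixed(active, incoming *NetworkConfig) *NetworkConfig {
	if incoming != nil {
		return incoming
	}

	return active
}

func main() {
	initial := &NetworkConfig{}                   // empty config reported before META is available
	updated := &NetworkConfig{Hostname: "node-1"} // update sent later

	fmt.Printf("buggy: %q\n", pickBuggy(initial, updated).Hostname)
	fmt.Printf("fixed: %q\n", pickFixed(initial, updated).Hostname)
}
```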
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
User volumes are identified by a short name which serves both
as a `/var/mnt` mount point and a partition label.
User volumes can be added and removed on the fly, and they are
automatically propagated into the `kubelet` mount namespace.
Also deprecate `.machine.disks`.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Make sure a privileged pod cannot violate some of the important security rules enforced by SELinux.
Fixes #10615
Signed-off-by: Dmitrii Sharshakov <dmitry.sharshakov@siderolabs.com>
```text
❯ talosctl -n 10.5.0.3 get version
NODE       NAMESPACE   TYPE      ID        VERSION   VERSION
10.5.0.3   runtime     Version   version   1         v1.10.0-alpha.3-33-g84f69f043-dirty
```
Fixes: #10574
Signed-off-by: Noel Georgi <git@frezbo.dev>
Fix UKI boot detection.
Also fix a bug introduced by #10640, which imported the `unix` package and
broke non-Unix builds of talosctl.
Signed-off-by: Noel Georgi <git@frezbo.dev>
This complements the previous PRs to implement more volume features:
directory volumes control their permissions, SELinux labels, etc.
Overlay mounts support an additional parent relationship.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Use the new controller for user disk and STATE mounts, and drop the
old code in the sequencer.
Also support mounts with a parent (e.g. when `/var/lib` is mounted on top
of `/var`).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The issue is not easy to fix, as toggling the GRPC tunnel on/off requires
two different flows for the link (interface):
* no tunnel -> the Talos link controller should create an in-kernel `wireguard`
  link and no userspace components
* tunnel on -> the Talos link controller should never create the link, and
  only adjust WG settings via UAPI, while the actual link is created by
  the userspace implementation (it's a `tun` device)
The link controller cannot distinguish a transition between those two
flows: based on the information available, it doesn't know that it has to
drop the old link and skip creating a new one.
So, instead, use different names for the link in two states:
`siderolink` for the kernel flow, and `siderolinktun` for the userspace
flow. This fixes the issue of proper link cleanup/re-creation.
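The naming scheme can be sketched as a simple selector. The link names come from the description above; the constant and function names are hypothetical.

```go
package main

import "fmt"

const (
	// kernelLinkName is used when the in-kernel wireguard link is created.
	kernelLinkName = "siderolink"
	// userspaceLinkName is used when the userspace implementation creates a tun device.
	userspaceLinkName = "siderolinktun"
)

// linkName picks the link name for the current tunnel mode, so that a mode
// change shows up as one link disappearing and a different one appearing.
func linkName(tunnel bool) string {
	if tunnel {
		return userspaceLinkName
	}

	return kernelLinkName
}

func main() {
	fmt.Println(linkName(false), linkName(true))
}
```

Because the two flows never share a name, stale state from one flow cannot be mistaken for the other.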
Add integration tests.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The previous fix (#10354) was incomplete.
The problem lies in the fact that `kube-proxy` creates a rule like:
```
chain nat-prerouting {
    type nat hook prerouting priority dstnat; policy accept;
    jump services
}
```
This chain has a prerouting hook, which gets executed before Talos's
input hook, and rewrites (does DNAT for) NodePort service traffic before Talos
has a chance to block the packet. The rewritten packet then hits the input
chain with the DNAT address, or might be forwarded to another host and never
hit the firewall again.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Support showing current feature state, and changing features on the fly.
The output and interface should be similar to `ethtool`.
We don't support legacy feature names.
```
node: 172.20.0.5
metadata:
    namespace: network
    type: EthernetStatuses.net.talos.dev
    id: enp0s2
    version: 2
    owner: network.EthernetStatusController
    phase: running
    created: 2025-02-10T11:40:32Z
    updated: 2025-02-10T11:40:32Z
spec:
    linkState: true
    port: Other
    duplex: Unknown
    rings:
        rx-max: 256
        tx-max: 256
        rx: 256
        tx: 256
        tx-push: false
        rx-push: false
    features:
        tx-scatter-gather: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        highdma: on [fixed]
        tx-scatter-gather-fraglist: off [fixed]
        tx-vlan-hw-insert: off [fixed]
        rx-vlan-hw-parse: off [fixed]
        rx-vlan-filter: on [fixed]
        vlan-challenged: off [fixed]
        tx-generic-segmentation: on
        rx-gro: on
        rx-lro: off [fixed]
        tx-tcp-segmentation: on
        tx-gso-robust: on [fixed]
        tx-tcp-ecn-segmentation: on
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
        tx-fcoe-segmentation: off [fixed]
        tx-gre-segmentation: off [fixed]
        tx-gre-csum-segmentation: off [fixed]
        tx-ipxip4-segmentation: off [fixed]
        tx-ipxip6-segmentation: off [fixed]
        tx-udp_tnl-segmentation: off [fixed]
        tx-udp_tnl-csum-segmentation: off [fixed]
        tx-gso-partial: off [fixed]
        tx-tunnel-remcsum-segmentation: off [fixed]
        tx-sctp-segmentation: off [fixed]
        tx-esp-segmentation: off [fixed]
        tx-udp-segmentation: off
        tx-gso-list: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
        rx-ntuple-filter: off [fixed]
        rx-hashing: off [fixed]
        rx-checksum: on [fixed]
        tx-nocache-copy: off
        loopback: off [fixed]
        rx-fcs: off [fixed]
        rx-all: off [fixed]
        tx-vlan-stag-hw-insert: off [fixed]
        rx-vlan-stag-hw-parse: off [fixed]
        rx-vlan-stag-filter: off [fixed]
        l2-fwd-offload: off [fixed]
        hw-tc-offload: off [fixed]
        esp-hw-offload: off [fixed]
        esp-tx-csum-hw-offload: off [fixed]
        rx-udp_tunnel-port-offload: off [fixed]
        tls-hw-tx-offload: off [fixed]
        tls-hw-rx-offload: off [fixed]
        rx-gro-hw: on
        tls-hw-record: off [fixed]
        rx-gro-list: off
        macsec-hw-offload: off [fixed]
        rx-udp-gro-forwarding: off
        hsr-tag-ins-offload: off [fixed]
        hsr-tag-rm-offload: off [fixed]
        hsr-fwd-offload: off [fixed]
        hsr-dup-offload: off [fixed]
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
When using a VIP, recovery of the Kubernetes control plane takes more time
(especially given that the test rotates PKI twice).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This should improve watch reliability, as the watch was failing when the
channel was closed.
Fixes #10039
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes #9820
This only affects volumes with multiple key slots configured.
Make sync issues non-fatal: if some keys fail to sync, proceed
with the normal boot, but record an error in the `VolumeStatus` resource.
When opening, correctly try all key slots.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Implement SELinux labeling support in EtcFileController, label both squashfs and runtime-created files in /etc and /system/etc.
Add corresponding test cases.
Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
Label mounted filesystems such as ephemeral and overlay mounts, as well as data directories (which will become volumes later).
Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
Fixes #9731
The wipe doesn't require a reboot, but it requires that the block device
is not in use as a volume.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>