2308 Commits

Author SHA1 Message Date
Andrey Smirnov
da2e81120f
fix: add informer resync period for node status watcher
Also use a constant everywhere in informers.

Add some debug logs.

Might fix #9991

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-14 19:32:59 +04:00
Noel Georgi
9b957df646
chore: uki code restructure
UKI code re-structure, no-op.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-14 18:01:53 +05:30
Noel Georgi
e41a995253
fix: kube-apiserver authorizers order
Fixes handling of `kube-apiserver` authorization config authorizers order.

Fixes: #10110

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-14 16:49:25 +05:30
Andrey Smirnov
db4ca5668a
feat: add a kernel parameter to disable built-in auditd
Fixes #9907

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-14 14:24:50 +04:00
Andrey Smirnov
ed7e47d158
refactor: drop usage of objcopy to generate UKIs
This brings native implementation without external dependencies.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-13 13:43:36 +04:00
Andrey Smirnov
edf5c5e29b
fix: extfs repair and resize
Fixes #10103

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-13 13:04:33 +04:00
Noel Georgi
e6a4583ba8
feat: support generating unsigned UKIs
Support generating unsigned UKIs.

Also plumb in support to `talosctl cluster create` to boot off UKIs.
This doesn't work yet as the installer needs more work.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-10 18:39:57 +05:30
Andrey Smirnov
84fcc976f8
fix: yet another dashboard panic
Fixes #10088

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-09 15:58:01 +04:00
TomyLobo
499695e24e
fix: request previous IP address in discovery
This ensures that even in the event of a DHCP downtime that exceeds the
lease time, the current IP can be maintained.

Signed-off-by: TomyLobo <tomylobo@nurfuerspam.de>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-09 14:42:03 +04:00
Dmitry Sharshakov
ae6d065beb
fix: mount selinuxfs only when SELinux is enabled
Having selinuxfs mounted might confuse some software, as conventional Linux systems do not have selinuxfs mounted when SELinux is disabled and no policy is loaded.

Fixes #10083

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2025-01-05 19:17:34 +03:00
Noel Georgi
5ccbf4bcdb
feat: enable configfs
Enable `configfs`.

Ref: https://www.kernel.org/doc/Documentation/filesystems/configfs/configfs.txt

Part of: https://github.com/siderolabs/extensions/issues/562

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-03 20:06:17 +05:30
Noel Georgi
59582496d5
feat: bring in parity with sd-257
Bring in parity with systemd 257 by supporting more UKI sections.

The output of `sd-measure` and our measure code will be different until
https://github.com/systemd/systemd/pull/35765 is fixed upstream.

Fixes: #10075

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-03 17:34:17 +05:30
Noel Georgi
83d84a8318
chore(ci): better zfs checks
Part of: https://github.com/siderolabs/extensions/issues/572

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-02 21:12:31 +05:30
Andrey Smirnov
b7a7fdc4b8
refactor: generate /etc/os-release file static way
The file which is exported back to source via `make generate` is using
short tag (vX.Y.Z), while the one generated for the actual build comes
with full version tag.

Fixes #8898

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-27 13:26:10 +04:00
Andrey Smirnov
4189454441
fix: build of talosctl on non-Linux platforms
The code from `talosctl` imports transitively tpm package, so make it
build on non-Linux.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-26 18:53:51 +04:00
Andrey Smirnov
4761a9e6aa
chore: update dependencies
Go modules, tools, pkgs, etc.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-26 14:48:31 +04:00
Andrey Smirnov
f98efb333f
fix: ignore member not found error on leave cluster
Fixes #10040

Sometimes etcd actually removes the member after a 'server stopped' error,
so the next attempt returns member not found. Ignore that error, as our
goal was to remove the member.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-25 22:12:48 +04:00
Andrey Smirnov
b72bda0a42
fix: talosctl support and race tests
1. Don't set max cgroups limit if race mode is enabled (only in test
   mode). When e.g. apid/trustd are built with race detector on, they
   consume 10x the memory.
2. Fix a data race in `talosctl support` when showing UI progress.
3. Fix an issue pulling `kubeconfig` in `talosctl support` - pull from
   endpoints (controlplanes) without setting any nodes.

Fixes #10036

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-25 21:05:27 +04:00
Andrey Smirnov
27233cf0fc
test: use node informer instead of raw watch
This should improve watch reliability, as it was failing on channel
being closed.

Fixes #10039

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-25 18:52:07 +04:00
Andrey Smirnov
5f3acd0f26
fix: use correct default search domain
Search domain should be domain name of the hostname, not the FQDN.
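
As an illustration only, the distinction can be sketched as follows. The helper `searchDomain` is a hypothetical name and not the function used in Talos; it assumes the domain is everything after the first dot of the fully qualified name:

```go
package main

import (
	"fmt"
	"strings"
)

// searchDomain returns the domain part of a fully qualified hostname,
// that is, everything after the first dot. It returns an empty string
// when the name has no domain part.
func searchDomain(fqdn string) string {
	// Tolerate a trailing root dot such as "node1.example.com.".
	fqdn = strings.TrimSuffix(fqdn, ".")

	_, domain, found := strings.Cut(fqdn, ".")
	if !found {
		return ""
	}

	return domain
}

func main() {
	fmt.Println(searchDomain("node1.example.com"))
	fmt.Printf("%q\n", searchDomain("node1"))
}
```

With this sketch, `node1.example.com` yields the search domain `example.com` rather than the full name.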

Bug introduced in #9844

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-25 14:38:00 +04:00
Dmitry Sharshakov
78b3e7f4f1
fix: get next rule number for IPv6 in the appropriate chain
Does not fix anything, but looks more correct

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-12-24 13:45:07 +01:00
Andrey Smirnov
8212e4864d
refactor: use quirks in kernel args
Make default args depend on quirks, and also pass quirks down to
platform code.

Reduces amount of hacks, but it is functionally equivalent.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-23 18:52:06 +04:00
Andrey Smirnov
b4aa5189d4
release(v1.10.0-alpha.0): prepare release
This is the official v1.10.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-23 15:15:56 +04:00
Noel Georgi
a5660ed778
feat: pcirebind controller
Add a controller to support rebinding drivers for PCI devices.

Fixes: https://github.com/siderolabs/extensions/pull/488

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-12-20 17:35:37 +05:30
Andrey Smirnov
fb36753216
fix: dashboard crash on CPU data
This was introduced in 1.9 cycle.

Fixes #9998

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-20 14:10:11 +04:00
Dmitriy Matrenichev
dec0185c85
chore: reduce memory usage for secureboot functions
Mostly by using new version of `go-uefi` module and streaming instead of loading all at once.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-12-20 02:11:13 +03:00
Dmitry Sharshakov
cee6c60a0f
fix: make talosctl time work with PTP time sync
Fix query function used by CLI to match the time syncer behavior.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-12-19 15:52:35 +01:00
Andrey Smirnov
7d39b9ec2b
feat: remove cgroupsv1 in non-container mode
Following up on deprecation in Talos 1.9, remove it completely for Talos
1.10.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-18 18:48:11 +04:00
Andrey Smirnov
8003536c7c
fix: restore previous disk serial fetching
See https://github.com/siderolabs/go-blockdevice/pull/119

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-18 15:03:53 +04:00
Andrey Smirnov
03116ef9bd
chore: prepare for Talos 1.10
Fork docs, update tests, trim release notes, etc.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-17 19:54:11 +04:00
Andrey Smirnov
284ab11794
feat: support link altnames/aliases
At the moment, we don't use/support aliases, but we might in the future.

Altnames are filled out by `systemd-udevd`.

This PR has two parts:

* show aliases & altnames in `LinkStatus`
* match links by aliases/altnames when we configure
  addresses/routes/links

This should make a transition to `systemd-udevd` less painful if the
previous link name is in `altNames`.

Forked rtnetlink for https://github.com/jsimonetti/rtnetlink/pull/241

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-17 14:09:26 +04:00
Andrey Smirnov
ec2e24fd96
fix: match MAC addresses case-insensitive (nocloud)
Fixes #9965
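
A small sketch of case-insensitive matching, assuming both addresses are already in the same textual form (colon-separated hex). The helper name `sameMAC` is hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// sameMAC reports whether two MAC address strings are equal, ignoring
// the case of the hex digits. It does not normalize separators.
func sameMAC(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println(sameMAC("52:54:00:AB:CD:EF", "52:54:00:ab:cd:ef"))
	fmt.Println(sameMAC("52:54:00:ab:cd:ef", "52:54:00:ab:cd:00"))
}
```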

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-16 20:52:14 +04:00
Andrey Smirnov
b15917ecc6
chore: add more debugging logs for META and volumes
Doesn't fix anything, but the hope is that it would help
with #9852 and #9786.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-16 19:11:44 +04:00
Andrey Smirnov
9470e842fc
test: cleanup failed Kubernetes pods
See #9870

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-16 16:48:30 +04:00
Andrey Smirnov
c9c6851504
fix: node identity flip
The issue shows up in our tests as:

```
=== RUN   TestIntegration/api.DiscoverySuite/TestRegistries
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
```

It should be a minor issue for non-KubeSpan'ed clusters (as members get
correctly de-duplicated), but might cause connectivity issues for
KubeSpan'ed clusters.

The issue comes from the short mount in the sequencer around
`loadConfig` step: as the mount time is short, it triggers a race in the
node identity controller when it tries to read existing identity from
`/system/state`, but as the partition is unmounted by the time it tries
to read, it assumes there's no identity and establishes a new one.

Eventually, it will write new identity back to disk, but that new
identity is different from the previous one, so it creates another entry
for itself in the discovery service.

A proper solution is a volume mount controller, but a temporary band aid
is to avoid broadcasting mount notification for this short `STATE` mount
via resources, so that controller isn't triggered.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-16 16:19:56 +04:00
Noel Georgi
ab5bb68842
fix: generate and serve registries with port
Fix generating and serving registries that have a port in them.
This is needed to copy and serve imagecache from a vfat filesystem.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-12-14 09:33:39 +05:30
Andrey Smirnov
58236066dd
fix: support image cache on VFAT USB stick
Scenario: copy contents of the ISO to the USB VFAT stick.

Make sure VFAT filesystem has a label `TALOS_*`.

Fixes #9936

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-13 17:15:24 +04:00
Noel Georgi
e193a50714
fix: image cache integration test
Fix image cache cli integration test.

Also fix the extensions test by skipping cloudflared.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-12-13 16:56:00 +05:30
Andrey Smirnov
08ee400fdb
test: fix flaky test NodeAddressSort
There were two issues which showed up specifically under `race` tests:

1. As the address resources are added while the controller is running,
   and `default` address is immutable (by design), insert the future
   default address first, otherwise the controller might pick up another
   one it sees first randomly.

2. There was a bug in accumulative address handling when the sort only
   took into account addresses ignoring prefix lengths.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-13 14:43:31 +04:00
Noel Georgi
136b129121
chore: drop semicolon for supporting vfat filesystems
Drop semicolon in generated cache to support copying image cache to vfat
filesystems.

Fixes: #9935

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-12-12 23:40:06 +05:30
Noel Georgi
d54414add4
fix: authorization config gen
We were appending to existing slice, fix by using a variable.
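
The following sketch shows one common form of this class of bug in Go, not necessarily the exact code that was fixed. The helper `withExtra` and the authorizer names used as values are illustrative assumptions:

```go
package main

import "fmt"

// withExtra returns a new slice holding base followed by extra.
// It copies base first, so the result never shares a backing array
// with the input.
func withExtra(base []string, extra ...string) []string {
	out := make([]string, 0, len(base)+len(extra))
	out = append(out, base...)

	return append(out, extra...)
}

func main() {
	// base has spare capacity, so append can write into its backing array.
	base := make([]string, 2, 4)
	base[0], base[1] = "Node", "RBAC"

	// Buggy pattern: both results alias the same backing array, so the
	// second append overwrites the element written by the first.
	a := append(base, "Webhook")
	b := append(base, "AlwaysAllow")
	fmt.Println(a[2], b[2]) // AlwaysAllow AlwaysAllow

	// Copying into a fresh slice keeps the results independent.
	c := withExtra(base, "Webhook")
	d := withExtra(base, "AlwaysAllow")
	fmt.Println(c[2], d[2]) // Webhook AlwaysAllow
}
```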

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-12-12 19:40:32 +05:30
Dmitriy Matrenichev
81805103de
chore: enable proper parallel usage of TestDepth
Rework the inners of `RunCLI` to support this.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-12-12 02:26:59 +03:00
Andrey Smirnov
61b1489a0f
fix: order volume config by the requested size
This fixes an issue like this:

* the system disk is say 10GiB
* STATE is fixed 100 MiB always
* EPHEMERAL is configured to be min 6 GiB, max 100 GiB

As the EPHEMERAL/STATE provisioning order was not defined, EPHEMERAL
might be created first, occupying whole disk and leaving no space left
for STATE.
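
A rough sketch of a deterministic ordering, assuming volumes are sorted by their maximum requested size so that small fixed-size volumes are provisioned before volumes that may grow. The type `volumeRequest` and the function `orderBySize` are hypothetical, and the actual ordering rule in Talos may differ:

```go
package main

import (
	"fmt"
	"sort"
)

// volumeRequest is a hypothetical description of a volume's size bounds in bytes.
type volumeRequest struct {
	Name    string
	MinSize uint64
	MaxSize uint64
}

// orderBySize sorts requests in place by ascending maximum size, keeping
// the original order for requests with equal maximum sizes.
func orderBySize(reqs []volumeRequest) {
	sort.SliceStable(reqs, func(i, j int) bool {
		return reqs[i].MaxSize < reqs[j].MaxSize
	})
}

func main() {
	const (
		MiB = uint64(1) << 20
		GiB = uint64(1) << 30
	)

	reqs := []volumeRequest{
		{Name: "EPHEMERAL", MinSize: 6 * GiB, MaxSize: 100 * GiB},
		{Name: "STATE", MinSize: 100 * MiB, MaxSize: 100 * MiB},
	}

	orderBySize(reqs)

	for _, r := range reqs {
		fmt.Println(r.Name)
	}
}
```

In this sketch `STATE` is provisioned first, so `EPHEMERAL` can only grow into the space that remains.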

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-11 18:35:11 +04:00
Dmitriy Matrenichev
30016a0a8d
fix: avoid nil-pointer-panic in RegistriesConfigController
One billion dollar mistake strikes again. Increase code coverage.

Closes #9912

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-12-11 16:11:05 +03:00
Andrey Smirnov
707a77bf64
test: fix user namespace test, TPM2 fixes
Make sure the test runs on a specific node, wait for swtpm to be up.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-09 20:16:04 +04:00
Dmitriy Matrenichev
c4724fc975
chore: add integration tests for image-cache
Provide separate `integration/image-cache` tag.

Closes #9860

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-12-06 20:28:34 +03:00
Andrey Smirnov
852baf819d
feat: support vlan/bond in v1, vlan in v2 for nocloud
Fixes #9753

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-06 14:37:01 +04:00
Andrey Smirnov
dd61ad8610
fix: lock provisioning order of user disk partitions
Fixes #9877

As a side-effect, fix alignment of user disks for newer QEMU versions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-05 16:12:22 +04:00
Andrey Smirnov
7d6507189f
feat: implement new address sorting algorithm
Fixes #9725

See #9749

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-05 14:38:46 +04:00
Dmitry Sharshakov
9081506d6c
feat: add process scheduling options
This commit adds runner options for priority, IO priority, scheduling policy. It also cleans up previously developed code for capabilities.

This is useful to launch background tasks such as xfs_scrub to not reduce system performance. We set nice 10 for dashboard so that it gives priority to more important system services.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-12-04 20:10:57 +01:00