693 Commits

Author SHA1 Message Date
Andrey Smirnov
da2e81120f
fix: add informer resync period for node status watcher
Also use a constant everywhere in informers.

Add some debug logs.

Might fix #9991

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-14 19:32:59 +04:00
Noel Georgi
9b957df646
chore: uki code restructure
UKI code re-structure, no-op.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-14 18:01:53 +05:30
Andrey Smirnov
db4ca5668a
feat: add a kernel parameter to disable built-in auditd
Fixes #9907

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-14 14:24:50 +04:00
Andrey Smirnov
ed7e47d158
refactor: drop usage of objcopy to generate UKIs
This brings native implementation without external dependencies.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-13 13:43:36 +04:00
Andrey Smirnov
edf5c5e29b
fix: extfs repair and resize
Fixes #10103

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-13 13:04:33 +04:00
Noel Georgi
e6a4583ba8
feat: support generating unsigned UKIs
Support generating unsigned UKI's.

Also plumb in support to `talosctl cluster create` to boot off UKI's.
This doesn't work yet as installer needs more work.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-10 18:39:57 +05:30
Andrey Smirnov
84fcc976f8
fix: yet another dashboard panic
Fixes #10088

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-01-09 15:58:01 +04:00
Dmitry Sharshakov
ae6d065beb
fix: mount selinuxfs only when SELinux is enabled
Having selinuxfs mounted might confuse some software, as conventional Linux systems do not have selinuxfs mounted when SELinux is disabled and no policy is loaded.

Fixes #10083

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2025-01-05 19:17:34 +03:00
Noel Georgi
5ccbf4bcdb
feat: enable configfs
Enable `configfs`.

Ref: https://www.kernel.org/doc/Documentation/filesystems/configfs/configfs.txt

Part of: https://github.com/siderolabs/extensions/issues/562

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-03 20:06:17 +05:30
Noel Georgi
59582496d5
feat: bring in partity with sd-257
Bring in parity with systemd 257 by supporting more UKI sections.

The output of `sd-measure` and our measure code will be different until
https://github.com/systemd/systemd/pull/35765 is fixed upstream.

Fixes: #10075

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-01-03 17:34:17 +05:30
Andrey Smirnov
4189454441
fix: build of talosctl on non-Linux platforms
The code from `talosctl` imports transitively tpm package, so make it
build on non-Linux.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-26 18:53:51 +04:00
Andrey Smirnov
4761a9e6aa
chore: update dependencies
Go modules, tools, pkgs, etc.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-26 14:48:31 +04:00
Andrey Smirnov
f98efb333f
fix: ignore member not found error on leave cluster
Fixes #10040

Sometimes etcd after 'server stoppped' error actually removes a member,
so the next attempt returns member not found, ignore it, as our goal was
to remove a member.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-25 22:12:48 +04:00
Andrey Smirnov
8212e4864d
refactor: use quirks in kernel args
Make default args depend on quirks, and also pass quirks down to
platform code.

Reduces amount of hacks, but it is functionally equivalent.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-23 18:52:06 +04:00
Noel Georgi
a5660ed778
feat: pcirebind controller
Add a controller to support rebinding drivers for PCI devices.

Fixes: https://github.com/siderolabs/extensions/pull/488

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-12-20 17:35:37 +05:30
Andrey Smirnov
fb36753216
fix: dashboard crash on CPU data
This was introduced in 1.9 cycle.

Fixes #9998

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-20 14:10:11 +04:00
Dmitriy Matrenichev
dec0185c85
chore: reduce memory usage for secureboot functions
Mostly by using new version of `go-uefi` module and streaming instead of loading all at once.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-12-20 02:11:13 +03:00
Dmitry Sharshakov
cee6c60a0f
fix: make talosctl time work with PTP time sync
Fix query function used by CLI to match the time syncer behavior.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-12-19 15:52:35 +01:00
Andrey Smirnov
7d39b9ec2b
feat: remove cgroupsv1 in non-container mode
Following up on deprecation in Talos 1.9, remove it completely for Talos
1.10.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-18 18:48:11 +04:00
Andrey Smirnov
b15917ecc6
chore: add more debugging logs for META and volumes
Doesn't fix anything, but the hope is that it would help
with #9852 and #9786.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-16 19:11:44 +04:00
Andrey Smirnov
c9c6851504
fix: node identity flip
The issue shows up in our tests as:

```
=== RUN   TestIntegration/api.DiscoverySuite/TestRegistries
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
    discovery.go:210: waiting for cluster affiliates to be discovered: 4 expected, 6 found
```

It should be a minor issue for non-KubeSpan'ed clusters (as members get
correctly de-duplicated), but might cause connectivity issues for
KubeSpan'ed clusters.

The issue comes from the short mount in the sequencer around
`loadConfig` step: as the mount time is short, it triggers a race in the
node identity controller when it tries to read existing identity from
`/system/state`, but as the partition is unmounted by the time it tries
to read, it assumes there's no identity and establishes a new one.

Eventually, it will write new identity back to disk, but that new
identity is different from the previous one, so it creates another entry
for itself in the discovery service.

A proper solution is a volume mount controller, but a temporary band aid
is to avoid broadcasting mount notification for this short `STATE` mount
via resources, so that controller isn't triggered.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-12-16 16:19:56 +04:00
Dmitry Sharshakov
c254f261fd
fix: do not extract xattrs in unsquashfs
Fix building on SELinux systems. Extracting xattrs led to return code 2 as a non-critical error. This should not influence extension build.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-28 19:57:48 +01:00
Andrey Smirnov
fc3b31575c
fix: multiple issues with opening encrypted volumes
Fixes #9820

This only affects volumes with multiple key slots configured.

Make sync issues non-fatal, so that if some keys fail to sync, proceed
with normal boot, but record an error in the `VolumeStatus` resource.

When opening, correctly try all key slots.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-28 21:34:41 +04:00
Dmitry Sharshakov
145b02642e
chore: deprecate cgroupsv1 in non-container mode
Fixes #9729.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-28 18:08:47 +01:00
Dmitriy Matrenichev
ccc5a8d34c
chore: split config.Registry into the separate resource
Required for #9614

Closes #9766

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-11-27 19:58:08 +03:00
Andrey Smirnov
c735d14928
fix: wait for udevd before starting sync
Fixes #8074

One part of the fix is to wait for udevd to be ready, as anyways before
udevd is ready network interfaces are not ready, so sync is not
possible.

Second part is that now u-root's rtc package supports closing rtc
devices, so we can properly open/close it as part of the sync loop (vs.
previous workaround with sync.Once).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-27 20:33:03 +04:00
Dmitriy Matrenichev
177df62a0e
fix: small logrus fixes
Ensure correct logrus setup.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-11-26 21:32:52 +03:00
Noel Georgi
939c555f9a
fix: imager disk image-cache generator
Move things around so `talosctl` is not dependent on `go-blockdevice`.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-11-26 21:47:08 +05:30
Noel Georgi
1bac0b183a
feat: support generating disk images with image cache
Add support for generating disk images with image cache.

Fixes: #9616

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-11-26 18:52:25 +05:30
Dmitry Sharshakov
a13f82c594
feat: udev: label device nodes
Use udev rules to assign basic device file labels based on their subsystem

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-22 12:42:22 +01:00
Dmitry Sharshakov
e899fb37fd
feat: label created files in /etc
Implement SELinux labeling support in EtcFileController, label both squashfs and runtime-created files in /etc and /system/etc.

Add corresponding test cases.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-22 09:16:13 +01:00
Andrey Smirnov
5f68c17eda
feat: implement image cache configuration
Implement a feature flag, a resource which controls the flow.

This controls the volume configuration, mounting, etc.

Fixes #9767

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-21 21:24:39 +04:00
Dmitry Sharshakov
1a8cc5f8b2
feat: add SELinux labels to volumes
Label mounted filesystems like ephemeral, overlay mounts, as well as data directories (going to become volumes later).

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-21 14:23:43 +01:00
Dmitry Sharshakov
4caeae21e5
refactor: optimize flags and SetLabel
Do not do string lookups in repetitive calls. We do not support changing SELinux status during runtime, so once we read this we can assume status does not change.

Also avoid unneeded FS writes when appropriate label is already set on file.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-21 08:25:49 +01:00
Dmitry Sharshakov
d55a96e8cb
refactor: remove SELinux client_u and client_r
I added those in the early days of the current policy development, yet there was no use for them. This change simplifies the policy and handling of labels.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-18 16:17:03 +01:00
Andrey Smirnov
30f8b5a9f7
fix: registry mirror fallback handling
Fixes #9613

This has two changes:

* adjust Talos registry resolver to match containerd (CRI) resolver: use
  by default upstream as a fallback
* add a machine config option to skip upstream as a fallback, and adjust
  CRI configuration accordingly

See https://github.com/containerd/containerd/blob/main/docs/hosts.md#registry-configuration---examples
for details on CRI's `hosts.toml`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-14 20:38:36 +04:00
Dmitriy Matrenichev
e26d0043e0
chore: code cleanup
More usage of slices package, less usage of package sort.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-11-14 12:25:56 +03:00
Andrey Smirnov
5a0fd5b882
refactor: move early initialization functions to pre-initialize phase
Fixes #8900

Closes #9687

(contains splitting of late mounts)

The benefits:

* tasks run _before_ controllers are started
* tasks can register `defer` to undo actions

This decomposes sequencer tasks a bit.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-12 19:48:40 +04:00
Dmitriy Matrenichev
4fe6dc8a0a
chore: clean dns code
Split from #9596 (without IPv6 stuff). This PR does this things:
- Refactored `DNSResolveCacheController`. Most of the logic moved to `dns` package types. Simplify and streamline logic.
- Replace most of the goroutine orchestration with suture package.
- Support per-item reaction to the dns listeners/servers failing to start. This allows us to ignore IPv6 errors if it's disabled.
- Support per-item reaction to the dns listeners/servers failing to stop.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-11-08 21:54:28 +03:00
Nico Berlee
11380f933d
feat: display current CPU frequency on dashboard
Dashboard now shows the active frequency of each CPU core when cpufreq
is available on non-virtualized systems, enhancing real-time accuracy.

Solves the issue of displaying 0MHz on certain SBCs due to
/proc/cpuinfo limitations.

Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-08 12:05:48 +04:00
Noel Georgi
1800f81044
fix: selinux handling and apparmor tests
Conditionally mount selinuxfs only if it's present.

Fix AppArmor tests, `apparmor` and other minor LSM's and set
`apparmor=1`.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-11-07 07:50:00 +05:30
Dmitry Sharshakov
a867f85e4c
feat: label system socket and runtime files
Set SELinux labels so that services could gain access permissions.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-06 07:29:35 +01:00
Dmitriy Matrenichev
cedabeddf7
chore: cleanup code
- Replace unsafe resource interface calls with type-safe versions.
- Remove unused parameter names.
- Minor changes.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-11-04 21:14:00 +03:00
Dmitry Sharshakov
960a040491
feat: start enabling SELinux
Part of: #9127

Label executables and processes, build, load and manage SELinux policy, enable audit support.

Labeling filesystems, devices and runtime files will be done in further changes, see the full PR.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-04 16:56:53 +01:00
Andrey Smirnov
b379506259
fix: use more correct condition to skip generating hosts files
After a fix a while ago, the condition was hard to understand - but we
should skip this block as long as there's no TLS config, which might
mean either being nil or having default values.

I found this while debugging #9594, but it doesn't change anything.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-30 14:48:30 +04:00
Andrey Smirnov
62ec7ec336
refactor: replace the old v1 mount package with new one
Re-design some methods, simplify flows and allow more simple
interactions.

Learn from mistakes and design better methods.

Fixes #9471

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-30 14:21:28 +04:00
Dmitry Sharshakov
423b1e5fb2
fix: do not trim 0 from process SELinux label
Seems like a typo, zero is a valid character in SELinux labels

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-29 16:07:21 +01:00
Dmitriy Matrenichev
a13cf76a34
chore: simplify DNSUpstreamController and DNSUpstream resource
This PR does those things:
- Fixes race condition where controller could potentially modify upstream, while other controller is copying its internals to the slice.
- Simplifies `run` function in `DNSUpstreamController` by removing all `Idx` handling.
- Removes `Idx` field from `DNSUpstream`. Upstreams are now sorted by their id with №X prefix.
- `Proxy` Stop is now called from the finalizer. In combination with iterators, this ensures that we only stop upstream when it's fully unreachable.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-10-24 19:29:21 +03:00
Noel Georgi
62d185473e
fix: talosctl process null character
Fixes using grep on `talosctl ps` complaining about being a binary
output.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-24 19:49:22 +05:30
Joakim Nohlgård
ead46997c9
chore: rename tpm2.PCRExtent -> tpm2.PCRExtend
Fixes typo

Signed-off-by: Joakim Nohlgård <joakim@nohlgard.se>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-21 16:10:53 +04:00