Commit Graph

231 Commits

Author SHA1 Message Date
Dmitry Sharshakov
e899fb37fd
feat: label created files in /etc
Implement SELinux labeling support in EtcFileController, label both squashfs and runtime-created files in /etc and /system/etc.

Add corresponding test cases.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-22 09:16:13 +01:00
Noel Georgi
77cf84fb57
feat: support generating iso with imagecache
Support generating iso with imagecache.

Part-of: #9616

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-11-21 20:40:05 +05:30
Dmitry Sharshakov
1a8cc5f8b2
feat: add SELinux labels to volumes
Label mounted filesystems like ephemeral, overlay mounts, as well as data directories (going to become volumes later).

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-21 14:23:43 +01:00
Andrey Smirnov
cc768037f8
feat: implement block device wipe
Fixes #9731

The wipe doesn't require a reboot, but it requires the blockdevice not
to be used as a volume.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-20 15:46:37 +04:00
Dmitriy Matrenichev
e26d0043e0
chore: code cleanup
More usage of slices package, less usage of package sort.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-11-14 12:25:56 +03:00
Andrey Smirnov
43fe3807a8
feat: implement tracking of blockdevice secondaries
This is going to be used to detect disks that are safe to wipe.

For blockdevices, track secondaries as direct references, e.g. encrypted
`STATE` partition might have secondary `vda5`.

For disks, re-map secondaries to be whole devices names, e.g. `vda`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-13 22:43:27 +04:00
Andrey Smirnov
9916e2cd8a
chore: update pkgs/tools/extras for Go 1.23.3
Bump some dependencies as well.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-12 16:38:32 +04:00
Andrey Smirnov
9a02ecc49f
feat: rewrite install disk selector to use CEL expressions
Rewrite matcher to take out old go-blockdevice library out of the way,
implementing translation from go-blockdevice format to CEL.

Implement facilities to build CEL expressions programmatically.

Now we can add a machine config disk match expression (CEL) easily.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-11 17:23:15 +04:00
Noel Georgi
fb72e4b7b7
fix(ci): skip test if UserNamespacesSupport feature gate is not set
We should not just rely on the sysctl, also confirm that `UserNamespacesSupport=true`
feature gate is set for apiserver, so that the tests gets skipped if only sysctl is set.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-11-08 16:16:11 +05:30
Nico Berlee
11380f933d
feat: display current CPU frequency on dashboard
Dashboard now shows the active frequency of each CPU core when cpufreq
is available on non-virtualized systems, enhancing real-time accuracy.

Solves the issue of displaying 0MHz on certain SBCs due to
/proc/cpuinfo limitations.

Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-08 12:05:48 +04:00
Dmitry Sharshakov
a867f85e4c
feat: label system socket and runtime files
Set SELinux labels so that services could gain access permissions.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-06 07:29:35 +01:00
Dmitry Sharshakov
960a040491
feat: start enabling SELinux
Part of: #9127

Label executables and processes, build, load and manage SELinux policy, enable audit support.

Labeling filesystems, devices and runtime files will be done in further changes, see the full PR.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-11-04 16:56:53 +01:00
Noel Georgi
9abf16108e
feat: add auditd service
Adds a auditd service that gathers all audit logs from kernel.

Signed-off-by: Noel Georgi <git@frezbo.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-11-02 22:25:04 +05:30
Andrey Smirnov
c755b6d7e4
fix: update the CRI sandbox image reference
Fix the test, and update the reference.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-28 14:52:19 +04:00
Andrey Smirnov
534b0ce183
feat: update runc to 1.2.0 final
Via pks.

See https://github.com/opencontainers/runc/releases/tag/v1.2.0

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-22 16:47:24 +04:00
Dmitry Sharshakov
29780d35a0
test: add an integration test for verifying process parameters
Validate capabilities are dropped and cgroup, UID, environment and OOM adjustments are set

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-18 16:59:41 +02:00
Andrey Smirnov
e0434d77d7
feat: update dependencies
Bring in new tools, pkgs, update Go dependencies and others.

In preparation for Talos 1.9.0-alpha.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-17 22:12:50 +04:00
Andrey Smirnov
182325cb07
test: skip lvm test if not enough user disks available
E.g. in trusted-boot pipeline, we don't have extra disks.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-08 20:42:24 +04:00
Andrey Smirnov
0a2b4556c5
fix: volume encryption with failing keyslots
Fix the flow when a failing key slot leads to repeated attempts to open
the volume, while it's already open, but the failure was to sync other
keys.

Refactor the code to get rid of variable assignment in the outer block
from closures.

Fixes #9415

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-07 21:59:42 +04:00
Dmitry Sharshakov
b67bc73fd3
fix: fix mdadm system extension
Update pkgs to include a fixed version of systemd-udevd which searches for udev rules under /usr/etc/udev/rules.d as used by our system extensions.

Re-enable the affected test

Fixes #9423

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-04 19:51:25 +02:00
Andrey Smirnov
74861573a7
fix: multiple fixes for LVM activation
Two fixes were in pkgs/lvm2:

* https://github.com/siderolabs/pkgs/pull/1041
* https://github.com/siderolabs/pkgs/pull/1042

Other fixes in this PR:

* adjust the controller a bit for some interactions
* make Rook test use more complicated, encrypted setup which uses LVM
* adjust LVM test to handle a case when there's more than one worker

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-03 11:33:22 +04:00
Dmitry Sharshakov
74c12c20e0
feat: replace eudev with systemd-udevd
Eudev has seen less development effort recently with Gentoo and others moving towards using systemd-udevd which can now be built independently

Update pkgs, include more libraries, change udevd executable name

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-02 19:08:40 +02:00
Noel Georgi
dec653bfe1
chore: better lvm2 tests
Use LVM2 tests that relies on module loading by lvm.

Fixes: #9300

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-01 16:08:44 +04:00
Noel Georgi
9fa08e8437
chore: refactor tests
Refactor tests to avoid code duplication.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-09-20 16:22:01 +05:30
Noel Georgi
d8ab4981b6
feat: support lvm auto activation
Support lvm auto-activation as per
https://man7.org/linux/man-pages/man7/lvmautoactivation.7.html.

This changes from how Talos previously used to unconditionally tried to
activate all volume groups to based on udev events.

Fixes: #9300

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-09-20 14:42:56 +05:30
Andrey Smirnov
dd4185b144
feat: add KubeSpan extra endpoint configuration
Fixes #9174

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-06 14:50:12 +04:00
Andrey Smirnov
3038ccfa88
feat: add configuration for EPHEMERAL volume
Fixes #9261

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-06 14:11:35 +04:00
Andrey Smirnov
b453385bd9
feat: support volume configuration, provisioning, etc
This implements the first round of changes, replacing the volume backend
with the new implementation, while keeping most of the external
interfaces intact.

See #8367

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-30 18:32:34 +04:00
Andrey Smirnov
be2ebf6b4d
chore: bump dependencies
Update tools, pkgs, extras, Go dependencies, Go tools, etc.

Linux 6.6.47 and containerd 2.0.0-rc.4.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-29 20:44:37 +04:00
Andrey Smirnov
a9551b7caa
fix: host DNS access with firewall enabled
Explicitly enable access to host DNS from pod/service IPs.

Also fix the Kubernetes health checks to assert number of ready pods to
match expectation, otherwise the check might skip a pod (e.g.
`kube-proxy` one) which is not ready, allowing the test to proceed too
early.

Update DNS test to print more logs on error.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-27 15:44:14 +04:00
Noel Georgi
64914b086c
chore: add test for crun extension
Add a test to verify the `crun` runtimeclass container-runtime extension
works as expected.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-02 20:15:01 +05:30
Andrey Smirnov
7a1c62b8bc
feat: publish installed extensions as node labels/annotations
Extensions are posted the following way:

`extensions.talos.dev/<name>=<version>`

The name should be valid as a label (annotation) key.

If the value is valid as a label value, use labels, otherwise use
annotations.

Also implements node annotations in the machine config as a side-effect.

Fixes #9089

Fixes #8971

See #9070

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-01 17:32:09 +04:00
Noel Georgi
117628aa60
chore: add test for gvisor extension with platform kvm
Add test for Gvisor extensions when kvm platform is used.

The test is marked as skipped until pod termination issue is resolved.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-07-25 19:15:27 +05:30
Andrey Smirnov
b07338f547
feat: provide machine config document to update trusted CA roots
Fixes #8867

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-12 19:28:31 +04:00
Dmitriy Matrenichev
dad9c40c73
chore: simplify code
- replace `interface{}` with `any` using `gofmt -r 'interface{} -> any -w'`
- replace `a = []T{}` with `var a []T` where possible.
- replace `a = []T{}` with `a = make([]T, 0, len(b))` where possible.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-07-08 18:14:00 +03:00
Andrey Smirnov
2512ef435f
test: fix the integrtion tests for apply-config
They got broken after refactoring.

Also use this PR to test things before the release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-08 14:06:45 +04:00
Dmitriy Matrenichev
2d054ad355
chore: handle documents diff in apply-config dry run
Before this PR diff generator only diffed the v1alpha1 config and nothing else. With this PR it also takes
separate docs into the account.

```shell
~ > <editor> controlplane.yaml
~ > talosctl -n talos-default-controlplane-1  apply-config --file controlplane.yaml --dry-run
Dry run summary:
Applied configuration without a reboot (skipped in dry-run).
Config diff:
No changes.
Documents diff:
[]config.Document{
+	&runtime.KmsgLogV1Alpha1{
+		Meta:       meta.Meta{MetaAPIVersion: "v1alpha1", MetaKind: "KmsgLogConfig"},
+		MetaName:   "omni-kmsg",
+		KmsgLogURL: s"tcp://[fdae:41e4:649b:9303::1]:8092",
+	},
}
~ > talosctl -n talos-default-controlplane-1  apply-config --file controlplane.yaml
Applied configuration without a reboot
~ >
~ >
~ >
~ > <editor> controlplane.yaml
~ > talosctl -n talos-default-controlplane-1  apply-config --file controlplane.yaml --dry-run
Dry run summary:
Applied configuration without a reboot (skipped in dry-run).
Config diff:
No changes.
Documents diff:
[]config.Document{
	&runtime.KmsgLogV1Alpha1{Meta: {MetaAPIVersion: "v1alpha1", MetaKind: "KmsgLogConfig"}, MetaName: "omni-kmsg", KmsgLogURL: {URL: &{Scheme: "tcp", Host: "[fdae:41e4:649b:9303::1]:8092"}}},
+	&network.DefaultActionConfigV1Alpha1{
+		Meta:    meta.Meta{MetaAPIVersion: "v1alpha1", MetaKind: "NetworkDefaultActionConfig"},
+		Ingress: s"block",
+	},
}
```

Closes #8885

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-06-27 21:15:36 +03:00
Dmitriy Matrenichev
c603d2bf95
chore: output more info when ExecuteCommandInPod fails
This should make investigating things like [this](https://github.com/siderolabs/talos/actions/runs/9411253542/job/25924192027)
easier.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-06-24 20:15:45 +03:00
Noel Georgi
86a3222aee
chore: use new disks api for iscsi tests
The iscsi test broke when the new disks api was introduced making the
test pass always, now filter other only `iscsi` disk types using the new
disks API.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-06-18 18:38:21 +05:30
Andrey Smirnov
d1a0c1f983
test: fix the integration test for no META name
When META has never been written (e.g. booted from a disk image), it
won't be detected as `talosmeta`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-06-11 15:17:52 +04:00
Andrey Smirnov
7cbdce73f7
fix: detect CD devices, fix user disks wipe test
Detect CD devices, and set size to 0 for CD without media.

In user disk wipe tests, skip device mapper devices and CD-ROM.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-06-10 18:00:06 +04:00
Andrey Smirnov
f07b79f4a8
feat: provide disk detection based on new blockdevices
Uses go-siderolabs/go-blockdevice/v2 for all the hard parts,
provides new resource `Disk` which describes all disks in the system.

Additional resource `SystemDisk` always point to the system disk (based
on the location of `META` partition).

The `Disks` API (and `talosctl disks`) provides a view now into the
`talosctl get disks` to keep backwards compatibility.

QEMU provisioner can now create extra disks of various types: IDE, AHCI,
SCSI, NVME, this allows to test detection properly.

The new resource will be the foundation for volume provisioning (to pick
up the disk to provision the volume on).

Example:

```
talosctl -n 172.20.0.5 get disks
NODE         NAMESPACE   TYPE   ID        VERSION   SIZE          READ ONLY   TRANSPORT   ROTATIONAL   WWID                                                               MODEL            SERIAL
172.20.0.5   runtime     Disk   loop0     1         65568768      true
172.20.0.5   runtime     Disk   nvme0n1   1         10485760000   false       nvme                     nvme.1b36-6465616462656566-51454d55204e564d65204374726c-00000001   QEMU NVMe Ctrl   deadbeef
172.20.0.5   runtime     Disk   sda       1         10485760000   false       virtio      true                                                                            QEMU HARDDISK
172.20.0.5   runtime     Disk   sdb       1         10485760000   false       sata        true         t10.ATA     QEMU HARDDISK                           QM00013        QEMU HARDDISK
172.20.0.5   runtime     Disk   sdc       1         10485760000   false       sata        true         t10.ATA     QEMU HARDDISK                           QM00001        QEMU HARDDISK
172.20.0.5   runtime     Disk   vda       1         12884901888   false       virtio      true
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-06-07 20:18:32 +04:00
Andrey Smirnov
7c9a14383e
fix: volume discovery improvements
Use shared locks, discover more partitions, some other small changes.

Re-enable the flaky test.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-06-06 19:45:40 +04:00
Andrey Smirnov
30860210cc
test: fix hardware test not to require PCI devices
On e.g. Azure VMs there are non reported.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-06-03 17:20:42 +04:00
Andrey Smirnov
4dd0aa7120
feat: implement PCI device bus enumeration
Fixes #8826

From the QEMU VM:

```shell
$ talosctl -n 172.20.0.5 get pcidevice
NODE         NAMESPACE   TYPE        ID             VERSION   CLASS                       SUBCLASS                    VENDOR              PRODUCT
172.20.0.5   hardware    PCIDevice   0000:00:00.0   1         Bridge                      Host bridge                 Intel Corporation   82G33/G31/P35/P31 Express DRAM Controller
172.20.0.5   hardware    PCIDevice   0000:00:01.0   1         Display controller          VGA compatible controller
172.20.0.5   hardware    PCIDevice   0000:00:02.0   1         Network controller          Ethernet controller         Red Hat, Inc.       Virtio network device
172.20.0.5   hardware    PCIDevice   0000:00:03.0   1         Unclassified device                                     Red Hat, Inc.       Virtio RNG
172.20.0.5   hardware    PCIDevice   0000:00:04.0   1         Unclassified device                                     Red Hat, Inc.       Virtio memory balloon
172.20.0.5   hardware    PCIDevice   0000:00:05.0   1         Communication controller    Communication controller    Red Hat, Inc.       Virtio console
172.20.0.5   hardware    PCIDevice   0000:00:06.0   1         Generic system peripheral   System peripheral           Intel Corporation   6300ESB Watchdog Timer
172.20.0.5   hardware    PCIDevice   0000:00:07.0   1         Mass storage controller     SCSI storage controller     Red Hat, Inc.       Virtio block device
172.20.0.5   hardware    PCIDevice   0000:00:1f.0   1         Bridge                      ISA bridge                  Intel Corporation   82801IB (ICH9) LPC Interface Controller
172.20.0.5   hardware    PCIDevice   0000:00:1f.2   1         Mass storage controller     SATA controller             Intel Corporation   82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
172.20.0.5   hardware    PCIDevice   0000:00:1f.3   1         Serial bus controller       SMBus                       Intel Corporation   82801I (ICH9 Family) SMBus Controller
```

```yaml
node: 172.20.0.5
metadata:
    namespace: hardware
    type: PCIDevices.hardware.talos.dev
    id: 0000:00:1f.3
    version: 1
    owner: hardware.PCIDevicesController
    phase: running
    created: 2024-05-30T12:09:05Z
    updated: 2024-05-30T12:09:05Z
spec:
    class: Serial bus controller
    subclass: SMBus
    vendor: Intel Corporation
    product: 82801I (ICH9 Family) SMBus Controller
    class_id: "0x0c"
    subclass_id: "0x05"
    vendor_id: "0x8086"
    product_id: "0x2930"
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-05-31 20:56:16 +04:00
Dmitriy Matrenichev
893e64fcb1
fix: replace nslookup with dig in integration tests
This should be more reliable on `integration-aws-*` and others.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-05-30 01:37:01 +03:00
Dmitry Sharshakov
da8305ffb4
test: add a test for watchdog timers
Try to activate/deactivate watchdogs, change timeout, run only on QEMU.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-05-28 16:46:04 +04:00
Dmitriy Matrenichev
a9cf9b7892
fix: correctly handle dns messages in our dns implementation
- By default, github.com/miekg/dns uses `dns.MinMsgSize` for UDP messages, which is 512 bytes. This is too small for some
DNS request/responses, and can cause truncation and errors. This change sets the buffer size to `dns.DefaultMsgSize`
4096 bytes, which is the maximum size of a dns packet payload per RFC 6891.
- We also retry the request if the response is truncated or previous connection was closed.
- And finally we properly handle the case where the response is larger than the client buffer size,
and we return a truncated correct response.

Closes #8763

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-05-24 21:41:00 +03:00
Andrey Smirnov
c2b19dcb97
chore: move to containerd 2.0 API
Lots of module moves/renames.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-05-24 21:48:55 +04:00
Andrey Smirnov
2e64e9e4e0
fix: require accepted CAs on worker nodes
Note: this issue never happens with default Talos worker configuration
(generated by Omni, `talosctl gen config` or CABPT).

Before change https://github.com/siderolabs/talos/pull/4294 3 years ago,
worker nodes connected to trustd in "insecure" mode (without validating
the trustd server certificate). The change kept backwards compatibility,
so it still allowed insecure mode on upgrades.

Now it's time to break this compatibility promise, and require
accepted CAs to be always present. Adds validation for machine
configuration, so if upgrade is attempeted, it would not validate the
machine config without accepted CAs.

Now lack of accepted CAs would lead to failure to connect to trustd.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-05-23 17:48:16 +04:00