4945 Commits

Author SHA1 Message Date
Andrey Smirnov
c14b446229
feat: update Kubernetes to v1.32.0-alpha.1
Talos 1.9 is going to be shipped with Kubernetes v1.32 by default.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-18 20:28:14 +04:00
Dmitry Sharshakov
29780d35a0
test: add an integration test for verifying process parameters
Validate capabilities are dropped and cgroup, UID, environment and OOM adjustments are set

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-18 16:59:41 +02:00
Andrey Smirnov
3d342af447
fix: update incorrect alias for PCIDevice resource
Fixes #9519

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-18 18:19:21 +04:00
Andrey Smirnov
f7d35a5e0b
release(v1.9.0-alpha.0): prepare release
This is the official v1.9.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
v1.9.0-alpha.0 pkg/machinery/v1.9.0-alpha.0
2024-10-18 17:50:58 +04:00
Andrey Smirnov
e0434d77d7
feat: update dependencies
Bring in new tools, pkgs, update Go dependencies and others.

In preparation for Talos 1.9.0-alpha.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-17 22:12:50 +04:00
Andrey Smirnov
5c5a248861
feat: add Talos 1.9 compatibility guarantees
To be backported to Talos 1.8 machinery to provide upgrade
compatibility.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-17 16:58:19 +04:00
Andrey Smirnov
bc4c21f41a
test: add json logs test environment
Add an option to `talosctl cluster create` to start a JSON log receiver,
and enable it optionally.

Enable in `integration-qemu`.

See #9510

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-17 16:23:26 +04:00
Ryan Borstelmann
71faa32942
docs: nvidia proprietary/oss hardware requirement
Update NVIDIA docs on proprietary/OSS driver requirements.

Signed-off-by: Ryan Borstelmann <ryan@ryanb.tv>

Documentation didn't outline why one would use OSS vs Proprietary Nvidia drivers, so added details for each. Biggest issue is hardware support, which differs between the two.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-17 10:39:44 +05:30
Dmitriy Matrenichev
59a78da42c
chore: add proto-codec/codec
Unify usage of proto codec v2 across our projects.
Bump the grpc library to 1.67.1 and ensure that it still works with the HTTP/2 ALPN value changes.

For https://github.com/siderolabs/talos/issues/9404

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-10-17 00:12:42 +03:00
Dmitriy Matrenichev
7ff1cedfe3
chore: update siderolabs/crypto module and return proper ALPN
Fixes #9463

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-10-16 22:12:49 +03:00
Philipp Kleber
ccbd5aed39
feat: optionally decode hcloud userdata as base64
When fetching the machine configuration in the hcloud platform implementation,
try to decode the data returned from the 'userdata' endpoint as a base64 string.
If the data is not in base64 format, decoding does not succeed and the unmodified data is used.

Signed-off-by: Philipp Kleber <philipp.t.kleber@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-16 21:23:56 +04:00
Mike Beaumont
34f652ce82
feat: add well-known app.kubernetes.io labels to control-plane pods
This PR adds most of the recommended labels.

Signed-off-by: Mike Beaumont <mjboamail@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-16 20:48:44 +04:00
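For readers unfamiliar with the recommended labels mentioned in the commit above, the sketch below builds a label set from the well-known `app.kubernetes.io/*` keys. The commit does not say which keys or values were added, so the function name, the chosen subset, and all values here are assumptions.

```go
package main

import "fmt"

// wellKnownLabels is a hypothetical helper that fills in a subset of the
// Kubernetes recommended labels for a pod.
func wellKnownLabels(name, component, version string) map[string]string {
	return map[string]string{
		"app.kubernetes.io/name":      name,
		"app.kubernetes.io/component": component,
		"app.kubernetes.io/version":   version,
		// The manager value is made up for the illustration.
		"app.kubernetes.io/managed-by": "example-manager",
	}
}

func main() {
	labels := wellKnownLabels("kube-apiserver", "control-plane", "v1.32.0")
	fmt.Println(labels["app.kubernetes.io/name"])
}
```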
Noel Georgi
fc89dc2164
fix: support extra-disks when using iso
When using `iso` and `extra-disks`, we get errors like the one below for
any node other than the first node.

```text
qemu-system-aarch64: -cdrom _out/metal-arm64-secureboot.iso: drive with bus=0, unit=2 (index=2) exists
```

Fix by explicitly specifying that the media is cdrom, so qemu doesn't
index it.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-16 21:41:58 +05:30
Noel Georgi
f2bff814de
chore: add arm64 target for integration-test
Add arm64 target for integration-test, make developing on arm64 machines
easier.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-16 19:27:37 +05:30
Andrey Smirnov
5853bb0ea4
fix: json logging panic
Fixes #9466

There are two fixes:

* fix the actual panic via https://github.com/siderolabs/go-circular/pull/5
* prevent similar issues in the future by installing a panic handler

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-16 17:19:35 +04:00
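The second fix in the commit above, installing a panic handler, follows a common Go pattern: a deferred `recover` that turns a panic into an error. The sketch below shows that general pattern only. Where and how the handler is installed in Talos is not stated in the commit, and the name `safely` is an assumption.

```go
package main

import "fmt"

// safely runs fn and converts a panic into an error, so that a bug in fn
// does not crash the whole process.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()

	fn()

	return nil
}

func main() {
	err := safely(func() {
		var buf []byte

		_ = buf[3] // index out of range on an empty slice panics
	})

	fmt.Println(err)
}
```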
Noel Georgi
a859cff364
chore: use virtio driver for disks in arm64
ARM64 doesn't support `ide` as a disk driver for disks, use `virtio`
instead.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-16 17:43:13 +05:30
Noel Georgi
db248de88d
chore(ci): add config for lldpd extension
Add `ExtensionServiceConfig` for lldpd extension.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-16 17:08:33 +05:30
Andrey Smirnov
9f0de9f43d
test: update provision upgrade tests for Talos 1.9
Use Talos 1.7 & Talos 1.8.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-16 15:08:19 +04:00
Andrey Smirnov
39fe285e69
fix: skip ram disks
Fixes https://github.com/siderolabs/go-blockdevice/issues/113

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-16 13:51:52 +04:00
Andrey Smirnov
a9bff3a1d0
test: skip no error test in Cilium
This test often fails due to etcd leader changes.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-15 21:26:58 +04:00
Dmitriy Matrenichev
4d902021bb
fix: do not use pflag csv comma reader for config-patch
The pflag and cobra modules use csv.Reader in the `StringSliceVar` method. This doesn't work well with JSON, and we do not need it at all.
Drop it.

Fixes #9493

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-10-14 11:31:23 +03:00
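To see why CSV-style splitting and JSON do not mix, as the config-patch commit above describes, the sketch below feeds a JSON patch through the standard library `encoding/csv` reader. It is an approximation of the behavior, not pflag's own code; the helper name `splitLikeCSV` and the sample patch are assumptions.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// splitLikeCSV mimics comma-separated flag parsing by reading a single
// CSV record from the flag value.
func splitLikeCSV(val string) ([]string, error) {
	return csv.NewReader(strings.NewReader(val)).Read()
}

func main() {
	// A made-up JSON patch: the quotes inside unquoted fields are
	// rejected by the CSV reader.
	patch := `[{"op": "add", "path": "/machine/network", "value": {}}]`

	_, err := splitLikeCSV(patch)
	fmt.Println("JSON patch:", err)

	// A plain comma-separated list is split as expected.
	fields, err := splitLikeCSV("a.yaml,b.yaml")
	fmt.Println("file list:", fields, err)
}
```

A flag type that keeps each occurrence as one unsplit string avoids the problem, since the JSON value is then passed through untouched.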
naed3r
5371788ce1
fix: typo in documentation
`requests` -> `resources`

Signed-off-by: naed3r <40650681+nate-moo@users.noreply.github.com>
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-10-11 19:16:53 +03:00
Spencer Smith
8a228ba6bc
docs: add egress documentation
This PR adds a list of the domains I had to allow for a Talos cluster to pull all assets needed to install and bootstrap. I've added these docs back to 1.6 of Talos, as I'm not certain they would apply to anything earlier.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2024-10-09 08:07:39 -04:00
Andrey Smirnov
182325cb07
test: skip lvm test if not enough user disks available
E.g. in trusted-boot pipeline, we don't have extra disks.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-08 20:42:24 +04:00
Andrey Smirnov
519a48302e
fix: wipe system partitions correctly via kernel args
Use `DiscoveredVolumes` instead of `VolumeStatus`, force reboot to avoid
confusion in the volume controller.

Fixes #9448

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-07 23:16:01 +04:00
Andrey Smirnov
0a2b4556c5
fix: volume encryption with failing keyslots
Fix the flow where a failing key slot leads to repeated attempts to open
the volume even though it is already open and the actual failure was in
syncing other keys.

Refactor the code to get rid of variable assignment in the outer block
from closures.

Fixes #9415

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-07 21:59:42 +04:00
Andrey Smirnov
6affbd3182
fix: update grpc-go to the latest patch release
See https://github.com/grpc/grpc-go/releases/tag/v1.66.3

Specifically, it addresses stream failures; I wonder if those are what
makes the support script flaky.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-07 19:14:24 +04:00
Serge Logvinov
77a4a4adc7
fix: scaleway metadata
Support legacy-style region values.
Disable DHCPv4 for external interface when public IPv4 is disabled.

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-07 17:23:49 +04:00
Dmitry Sharshakov
7acadc0c8f
fix: do not stop udevd before unmounting volumes
As udevd is required by cryptsetup, which will time out if udevd is not working, do not stop it in StopServicesEphemeral, but let StopAllServices handle udev shutdown after cryptsetup close is called.

Ref: https://bbs.archlinux.org/viewtopic.php?id=162415

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-07 14:56:47 +02:00
Andrey Smirnov
6a081055b0
feat: update Flannel to v0.25.7
See https://github.com/flannel-io/flannel/releases/tag/v0.25.7

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-07 16:25:34 +04:00
Dmitry Sharshakov
2362f6d3ee
fix: improve container detection
Instead of relying on cmdline (which will not work in case it's TinK on Talos, for example), add a file to container rootfs to signal the platform to machined.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-07 12:50:57 +02:00
Dmitry Sharshakov
b67bc73fd3
fix: fix mdadm system extension
Update pkgs to include a fixed version of systemd-udevd which searches for udev rules under /usr/etc/udev/rules.d as used by our system extensions.

Re-enable the affected test

Fixes #9423

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-04 19:51:25 +02:00
Andrey Smirnov
f08669c7a9
feat: bring in lpfc kernel module driver
See https://github.com/siderolabs/pkgs/pull/1044

Fixes #9437

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-04 15:35:22 +04:00
Andrey Smirnov
6a014374be
feat: enable QEDF driver
See https://github.com/siderolabs/talos/discussions/9391

Also bump pkgs & tools, bring in Go 1.23.2, containerd v2.0.0-rc.5

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-04 11:46:41 +04:00
Andrey Smirnov
f711907e03
fix: make /var/run empty on reboots
For new installs, simply symlink to `/run` (which is `tmpfs`).

For old installs, simulate by cleaning up the contents.

Fixes #9432

Related to #9365

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-03 20:39:21 +04:00
Eddie Wang
7d02eb60f4
docs: fix typo in CloudStack docs
Variable name.

Signed-off-by: Eddie Wang <bonjour@eddiewang.me>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-03 14:56:48 +04:00
Andrey Smirnov
74861573a7
fix: multiple fixes for LVM activation
Two fixes were in pkgs/lvm2:

* https://github.com/siderolabs/pkgs/pull/1041
* https://github.com/siderolabs/pkgs/pull/1042

Other fixes in this PR:

* adjust the controller a bit for some interactions
* make the Rook test use a more complicated, encrypted setup which uses LVM
* adjust LVM test to handle a case when there's more than one worker

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-03 11:33:22 +04:00
Dmitry Sharshakov
74c12c20e0
feat: replace eudev with systemd-udevd
Eudev has seen less development effort recently, with Gentoo and others moving towards systemd-udevd, which can now be built independently.

Update pkgs, include more libraries, change udevd executable name

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-10-02 19:08:40 +02:00
Andrey Smirnov
0a4df4ef84
docs: fix nvidia CRI config example
Fixes #9416

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-02 14:38:18 +04:00
Robby Ciliberto
afc1e1a46a
docs: fix typo in extraMounts directory
Typo in extraMounts directory
/var/openebs/local -> /var/local/openebs

Signed-off-by: Robby Ciliberto <robert.ciliberto@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-02 13:43:26 +04:00
Andrey Smirnov
a341bdb064
fix: prevent file descriptors leaks to child processes
See #9412

I'll keep the issue open to track upstream PR status and remove replace
directives.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-10-01 19:11:45 +04:00
Noel Georgi
dec653bfe1
chore: better lvm2 tests
Use LVM2 tests that rely on module loading by lvm.

Fixes: #9300

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-10-01 16:08:44 +04:00
Andrey Smirnov
908fd8789c
feat: support cgroup deep analysis in talosctl
The new command `talosctl cgroups` fetches cgroups snapshot from the
machine, parses it fully, enhances with additional information (e.g.
resolves pod names), and presents a customizable view of cgroups
configuration (e.g. limits) and current consumption.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-30 18:57:12 +04:00
ekarlso
aa846cc186
feat: add support for CI Network config in nocloud
Fixes #9351

Signed-off-by: ekarlso <endre.karlson@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-30 18:18:09 +04:00
Andrey Smirnov
10f2539f23
chore: disable cloud-images cron workflow
Otherwise it uploads an AMI every night, and eventually we hit the
AMI limit.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-30 16:40:54 +04:00
Andrey Smirnov
b07a8b36b2
chore: ignore more plugins for system containerd
This is to suppress warnings on failure to load plugins, which were
harmless, but confusing.

Fixes #9393

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-30 14:30:59 +04:00
Andrey Smirnov
392c4798f0
feat: prepare for Talos 1.9
Update tools, pkgs, extras.

Brings in Go 1.23.1, Linux 6.6.52, new xfsprogs, etc.

Fork docs.

Add new version contract, etc.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-27 21:23:48 +04:00
adilTepe
ea7bf9fb43
docs: update storage.md
A small typo fix.

Signed-off-by: adilTepe <104206649+adilTepe@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-26 18:38:02 +04:00
Andrey Smirnov
4ab8dee69a
fix: build talosctl without tcell_minimal
We do it for Talos itself to minimize the memory footprint and binary
size for the `dashboard` when part of Talos, while for `talosctl` we
want to have better support of various terminals.

Fixes #9377

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-26 16:03:21 +04:00
Andrey Smirnov
2fa019bd97
docs: enable 'edit on GitHub' link
See attached screenshot.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-25 14:48:54 +04:00