talos

mirror of https://github.com/siderolabs/talos.git synced 2025-10-29 15:31:12 +01:00

Author	SHA1	Message	Date
Dmitriy Matrenichev	dad9c40c73	chore: simplify code - replace `interface{}` with `any` using `gofmt -r 'interface{} -> any' -w` - replace `a = []T{}` with `var a []T` where possible. - replace `a = []T{}` with `a = make([]T, 0, len(b))` where possible. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-07-08 18:14:00 +03:00
Andrey Smirnov	2512ef435f	test: fix the integration tests for apply-config They got broken after refactoring. Also use this PR to test things before the release. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-07-08 14:06:45 +04:00
Dmitriy Matrenichev	2d054ad355	chore: handle documents diff in `apply-config` dry run Before this PR diff generator only diffed the v1alpha1 config and nothing else. With this PR it also takes separate docs into the account. ```shell ~ > <editor> controlplane.yaml ~ > talosctl -n talos-default-controlplane-1 apply-config --file controlplane.yaml --dry-run Dry run summary: Applied configuration without a reboot (skipped in dry-run). Config diff: No changes. Documents diff: []config.Document{ + &runtime.KmsgLogV1Alpha1{ + Meta: meta.Meta{MetaAPIVersion: "v1alpha1", MetaKind: "KmsgLogConfig"}, + MetaName: "omni-kmsg", + KmsgLogURL: s"tcp://[fdae:41e4:649b:9303::1]:8092", + }, } ~ > talosctl -n talos-default-controlplane-1 apply-config --file controlplane.yaml Applied configuration without a reboot ~ > ~ > ~ > ~ > <editor> controlplane.yaml ~ > talosctl -n talos-default-controlplane-1 apply-config --file controlplane.yaml --dry-run Dry run summary: Applied configuration without a reboot (skipped in dry-run). Config diff: No changes. Documents diff: []config.Document{ &runtime.KmsgLogV1Alpha1{Meta: {MetaAPIVersion: "v1alpha1", MetaKind: "KmsgLogConfig"}, MetaName: "omni-kmsg", KmsgLogURL: {URL: &{Scheme: "tcp", Host: "[fdae:41e4:649b:9303::1]:8092"}}}, + &network.DefaultActionConfigV1Alpha1{ + Meta: meta.Meta{MetaAPIVersion: "v1alpha1", MetaKind: "NetworkDefaultActionConfig"}, + Ingress: s"block", + }, } ``` Closes #8885 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-06-27 21:15:36 +03:00
Dmitriy Matrenichev	c603d2bf95	chore: output more info when `ExecuteCommandInPod` fails This should make investigating things like [this](https://github.com/siderolabs/talos/actions/runs/9411253542/job/25924192027) easier. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-06-24 20:15:45 +03:00
Noel Georgi	86a3222aee	chore: use new disks api for iscsi tests The iscsi test broke when the new disks API was introduced, which made the test always pass; now filter only `iscsi` disk types using the new disks API. Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-06-18 18:38:21 +05:30
Andrey Smirnov	d1a0c1f983	test: fix the integration test for no META name When META has never been written (e.g. booted from a disk image), it won't be detected as `talosmeta`. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-06-11 15:17:52 +04:00
Andrey Smirnov	7cbdce73f7	fix: detect CD devices, fix user disks wipe test Detect CD devices, and set size to 0 for CD without media. In user disk wipe tests, skip device mapper devices and CD-ROM. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-06-10 18:00:06 +04:00
Andrey Smirnov	f07b79f4a8	feat: provide disk detection based on new blockdevices Uses go-siderolabs/go-blockdevice/v2 for all the hard parts, provides new resource `Disk` which describes all disks in the system. Additional resource `SystemDisk` always point to the system disk (based on the location of `META` partition). The `Disks` API (and `talosctl disks`) provides a view now into the `talosctl get disks` to keep backwards compatibility. QEMU provisioner can now create extra disks of various types: IDE, AHCI, SCSI, NVME, this allows to test detection properly. The new resource will be the foundation for volume provisioning (to pick up the disk to provision the volume on). Example: ``` talosctl -n 172.20.0.5 get disks NODE NAMESPACE TYPE ID VERSION SIZE READ ONLY TRANSPORT ROTATIONAL WWID MODEL SERIAL 172.20.0.5 runtime Disk loop0 1 65568768 true 172.20.0.5 runtime Disk nvme0n1 1 10485760000 false nvme nvme.1b36-6465616462656566-51454d55204e564d65204374726c-00000001 QEMU NVMe Ctrl deadbeef 172.20.0.5 runtime Disk sda 1 10485760000 false virtio true QEMU HARDDISK 172.20.0.5 runtime Disk sdb 1 10485760000 false sata true t10.ATA QEMU HARDDISK QM00013 QEMU HARDDISK 172.20.0.5 runtime Disk sdc 1 10485760000 false sata true t10.ATA QEMU HARDDISK QM00001 QEMU HARDDISK 172.20.0.5 runtime Disk vda 1 12884901888 false virtio true ``` Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-06-07 20:18:32 +04:00
Andrey Smirnov	7c9a14383e	fix: volume discovery improvements Use shared locks, discover more partitions, some other small changes. Re-enable the flaky test. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-06-06 19:45:40 +04:00
Andrey Smirnov	30860210cc	test: fix hardware test not to require PCI devices On e.g. Azure VMs there are none reported. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-06-03 17:20:42 +04:00
Andrey Smirnov	4dd0aa7120	feat: implement PCI device bus enumeration Fixes #8826 From the QEMU VM: ```shell $ talosctl -n 172.20.0.5 get pcidevice NODE NAMESPACE TYPE ID VERSION CLASS SUBCLASS VENDOR PRODUCT 172.20.0.5 hardware PCIDevice 0000:00:00.0 1 Bridge Host bridge Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller 172.20.0.5 hardware PCIDevice 0000:00:01.0 1 Display controller VGA compatible controller 172.20.0.5 hardware PCIDevice 0000:00:02.0 1 Network controller Ethernet controller Red Hat, Inc. Virtio network device 172.20.0.5 hardware PCIDevice 0000:00:03.0 1 Unclassified device Red Hat, Inc. Virtio RNG 172.20.0.5 hardware PCIDevice 0000:00:04.0 1 Unclassified device Red Hat, Inc. Virtio memory balloon 172.20.0.5 hardware PCIDevice 0000:00:05.0 1 Communication controller Communication controller Red Hat, Inc. Virtio console 172.20.0.5 hardware PCIDevice 0000:00:06.0 1 Generic system peripheral System peripheral Intel Corporation 6300ESB Watchdog Timer 172.20.0.5 hardware PCIDevice 0000:00:07.0 1 Mass storage controller SCSI storage controller Red Hat, Inc. Virtio block device 172.20.0.5 hardware PCIDevice 0000:00:1f.0 1 Bridge ISA bridge Intel Corporation 82801IB (ICH9) LPC Interface Controller 172.20.0.5 hardware PCIDevice 0000:00:1f.2 1 Mass storage controller SATA controller Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] 172.20.0.5 hardware PCIDevice 0000:00:1f.3 1 Serial bus controller SMBus Intel Corporation 82801I (ICH9 Family) SMBus Controller ``` ```yaml node: 172.20.0.5 metadata: namespace: hardware type: PCIDevices.hardware.talos.dev id: 0000:00:1f.3 version: 1 owner: hardware.PCIDevicesController phase: running created: 2024-05-30T12:09:05Z updated: 2024-05-30T12:09:05Z spec: class: Serial bus controller subclass: SMBus vendor: Intel Corporation product: 82801I (ICH9 Family) SMBus Controller class_id: "0x0c" subclass_id: "0x05" vendor_id: "0x8086" product_id: "0x2930" ``` Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-05-31 20:56:16 +04:00
Dmitriy Matrenichev	893e64fcb1	fix: replace `nslookup` with `dig` in integration tests This should be more reliable on `integration-aws-*` and others. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-05-30 01:37:01 +03:00
Dmitry Sharshakov	da8305ffb4	test: add a test for watchdog timers Try to activate/deactivate watchdogs, change timeout, run only on QEMU. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com> Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>	2024-05-28 16:46:04 +04:00
Dmitriy Matrenichev	a9cf9b7892	fix: correctly handle dns messages in our dns implementation - By default, github.com/miekg/dns uses `dns.MinMsgSize` for UDP messages, which is 512 bytes. This is too small for some DNS request/responses, and can cause truncation and errors. This change sets the buffer size to `dns.DefaultMsgSize` 4096 bytes, which is the maximum size of a dns packet payload per RFC 6891. - We also retry the request if the response is truncated or previous connection was closed. - And finally we properly handle the case where the response is larger than the client buffer size, and we return a truncated correct response. Closes #8763 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-05-24 21:41:00 +03:00
Andrey Smirnov	c2b19dcb97	chore: move to containerd 2.0 API Lots of module moves/renames. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-05-24 21:48:55 +04:00
Andrey Smirnov	2e64e9e4e0	fix: require accepted CAs on worker nodes Note: this issue never happens with default Talos worker configuration (generated by Omni, `talosctl gen config` or CABPT). Before change https://github.com/siderolabs/talos/pull/4294 3 years ago, worker nodes connected to trustd in "insecure" mode (without validating the trustd server certificate). The change kept backwards compatibility, so it still allowed insecure mode on upgrades. Now it's time to break this compatibility promise, and require accepted CAs to be always present. Adds validation for machine configuration, so if upgrade is attempted, it would not validate the machine config without accepted CAs. Now lack of accepted CAs would lead to failure to connect to trustd. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-05-23 17:48:16 +04:00
Andrey Smirnov	b7afe2669b	feat: update Linux 6.6.30 Update tools/pkgs to the latest version, brings in all updates. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-05-13 17:14:03 +04:00
Andrey Smirnov	b690ffeb89	test: improve DNS resolver test stability Run a health check before the test, as the test depends on CoreDNS being healthy, and previous tests might disturb the cluster. Also refactor by using watch instead of retries, make pods terminate fast. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-29 19:31:34 +04:00
Andrey Smirnov	05fd042bb3	test: improve the reset integration tests Provide a trace for each step of the reset sequence taken, so if one of those fails, integration test produces a meaningful message instead of proceeding and failing somewhere else. More cleanup/refactor, should be functionally equivalent. Fixes #8635 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-24 18:35:39 +04:00
Andrey Smirnov	3433fa13bf	feat: use container DNS when in container mode More specifically, pick up `/etc/resolv.conf` contents by default when in container mode, and use that as a base resolver for the host DNS. Fixes #8303 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-16 17:01:36 +04:00
Andrey Smirnov	c8f674bd3d	test: add a test for 'spin' container runtime See https://github.com/siderolabs/extensions/pull/355 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-10 20:42:16 +04:00
Andrey Smirnov	9aa1e1b79b	fix: present all accepted CAs to the kube-apiserver This fixes an issue with a single controlplane cluster. Properly present all accepted CAs to the apiserver, in the test let the cluster fully recover between two CA rotations performed. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-08 23:33:22 +04:00
Dmitry Sharshakov	653f838b09	feat: support multiple Docker clusters in talosctl cluster create Dynamically map Kubernetes and Talos API ports to an available port on the host, so every cluster gets its own unique set of ports. As part of the changes, refactor the provision library and interfaces, dropping old weird interfaces and replacing them with (hopefully) much more descriptive names. Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-04 21:21:39 +04:00
Andrey Smirnov	78b9bd9273	fix: report unsupported x86_64 microarchitecture level Fixes #8361 Talos requires v2 (circa 2008), but VMs are often configured to limit the exposed features to the baseline (v1). ``` [ 0.779218] [talos] [initramfs] booting Talos v1.7.0-alpha.1-35-gef5bbe728-dirty [ 0.779806] [talos] [initramfs] CPU: QEMU Virtual CPU version 2.5+, 4 core(s), 1 thread(s) per core [ 0.780529] [talos] [initramfs] x86_64 microarchitecture level: 1 [ 0.781018] [talos] [initramfs] it might be that the VM is configured with an older CPU model, please check the VM configuration [ 0.782346] [talos] [initramfs] x86_64 microarchitecture level 2 or higher is required, halting ``` Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-03 16:09:57 +04:00
Noel Georgi	f515741b52	chore: add equinix e2e-tests Add equinix e2e-tests. Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-04-02 17:16:59 +05:30
Andrey Smirnov	7a68504b6b	feat: support rotating Kubernetes CA Fixes #8440 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-01 22:08:02 +04:00
Dmitriy Matrenichev	8dc4910c48	chore: enable "WG over GRPC" testing in siderolink agent tests Fixes https://github.com/siderolabs/talos/issues/8514 For https://github.com/siderolabs/talos/issues/8392 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-04-01 18:24:57 +03:00
Andrey Smirnov	8eacc4ba80	feat: support rotation of Talos API CA This allows to roll all nodes to use a new CA, to refresh it, or e.g. when the `talosconfig` was exposed accidentally. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-22 12:16:47 +04:00
Dmitriy Matrenichev	19f15a840c	chore: bump golangci-lint to 1.57.0 Fix all discovered issues. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-03-21 01:06:53 +03:00
Andrey Smirnov	ead37abf09	test: disable volume tests They're flaky, disable until the root cause is known. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-19 16:40:42 +04:00
Andrey Smirnov	15beb14780	feat: implement blockdevice watch controller This controller combines kobject events, and scan of `/sys/block` to build a consistent list of available block devices, updating resources as the blockdevice changes. Based on these resources the next step can run probe on the blockdevices as they change to present a consistent view of filesystems/partitions. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-18 18:28:40 +04:00
Andrey Smirnov	9afa70baf3	fix: patch correctly config in `talosctl upgrade-k8s` The current code was stripping non-`v1alpha1.Config` documents. Provide a proper method in the config provider, and update places using it. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-15 20:42:44 +04:00
Andrey Smirnov	3130caf954	chore: re-enable DRBD extension See https://github.com/siderolabs/extensions/pull/343 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-15 15:55:18 +04:00
Andrey Smirnov	bbed07e03a	feat: update Linux to 6.6.18 ZFS extension got re-enabled for 1.7. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-29 20:08:59 +04:00
Fabiano Fidêncio	64e9703f86	chore: add tests for the Kata Containers extension Let's add a very basic test for the Kata Containers extension, mimicking what's already in place for gVisor. This depends on the work being done in: https://github.com/siderolabs/extensions/pull/279 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-20 18:49:47 +05:30
Dmitriy Matrenichev	fa3b933705	chore: replace fmt.Errorf with errors.New where possible This time use `eg` from `x/tools` repo tool to do this. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-14 17:39:30 +03:00
Dmitriy Matrenichev	5324d39167	chore: bump stuff Also fix .golangci.yml file. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-09 19:19:25 +03:00
Saiyam Pathak	4184e617ab	chore: add test for wasmedge runtime extension Add tests for WasmEdge container runtime system extension. Signed-off-by: Saiyam Pathak <saiyam911@gmail.com> Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-05 18:18:13 +05:30
Andrey Smirnov	b44551ccdb	feat: update Linux to 6.6.13 See https://github.com/siderolabs/pkgs/pull/873 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-29 16:50:33 +04:00
Noel Georgi	0b94550c42	chore: fix the gvisor test The gvisor test was not using the correct runtimeclass and would have always passed regardless. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-12-15 20:48:44 +05:30
Andrey Smirnov	10c59a6b90	fix: leave discovery service later in the reset sequence Fixes #8057 I went back and forth on the way to fix it exactly, and ended up with a pretty simple version of a fix. The problem was that discovery service was removing the member at the initial phase of reset, which actually still requires KubeSpan to be up: * leaving `etcd` (need to talk to other members) * stopping pods (might need to talk to Kubernetes API with some CNIs) Now leaving discovery service happens way later, when network interactions are no longer required. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-13 19:16:12 +04:00
Andrey Smirnov	36c8ddb5e1	feat: implement ingress firewall rules Fixes #4421 See documentation for details on how to use the feature. With `talosctl cluster create`, firewall can be easily tested with `--with-firewall=accept|block` (default mode). Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-30 22:58:16 +04:00
Noel Georgi	f041b26299	chore: add tests for mdadm extension Add tests for mdadm extension. See: https://github.com/siderolabs/extensions/pull/271 Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-11-27 23:18:35 +05:30
Dmitriy Matrenichev	dd45dd06cf	chore: add custom node taints This PR adds support for custom node taints. Refer to `nodeTaints` in the `configuration` for more information. Closes #7581 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-11-25 18:33:18 +03:00
Andrey Smirnov	06941b7e5c	fix: allow rootfs propagation configuration for extension services Fixes #7873 Some services which perform mounts inside the container which require mounts to propagate back to the host (e.g. `stargz-snapshotter`) require this configuration setting. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-13 21:58:22 +04:00
Andrey Smirnov	813442dd7a	fix: don't validate machine.install if installed As Talos doesn't consume `.machine.install` if already installed, there is no point in validating it once already installed. This fixes a problem users often run into: after a reboot/upgrade the system disk blockdevice name changes, due to the kernel upgrade, or just unpredictable behavior of device discovery. Talos fails to boot as it can't validate the machine config, while it's already installed, so actual blockdevice name doesn't matter. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-03 15:08:42 +04:00
Noel Georgi	69d8054c9e	chore: drop UpdateEndpointSuite drop `UpdateEndpointSuite` suite since KubePrism is enabled by default starting Talos 1.6 and the test never passes since K8s node is always ready since it can connect to api server over KubePrism. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-10-04 00:26:59 +05:30
Noel Georgi	9b5cfdd0bc	chore: add tests for iscsi Add tests for iscsi to make sure it works. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-10-02 22:12:42 +05:30
Andrey Smirnov	a52d3cda3b	chore: update gen and COSI runtime No actual changes, adapting to use new APIs. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-22 12:13:13 +04:00
Noel Georgi	9c2ba7c6fa	chore: add tests for chelsio drivers Add tests for Chelsio drivers and firmware. Ref: https://github.com/siderolabs/extensions/pull/232 Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-09-20 20:07:25 +05:30
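
Commits dad9c40c73 and fa3b933705 above describe mechanical Go cleanups: `interface{}` becomes `any`, empty slice literals become `var a []T` or a preallocated `make`, and `fmt.Errorf` without formatting verbs becomes `errors.New`. A minimal sketch of what those rewrites look like follows; the function and variable names (`describe`, `double`, `positives`, `errEmpty`) are hypothetical and do not come from the Talos code base.

```go
package main

import (
	"errors"
	"fmt"
)

// errEmpty has no formatting verbs, so errors.New is sufficient;
// fmt.Errorf is only needed to format a message or wrap another error.
var errEmpty = errors.New("no items given")

// describe accepts any value; `any` is an alias for `interface{}` since Go 1.18.
func describe(v any) string {
	return fmt.Sprintf("%T", v)
}

// double preallocates with make([]T, 0, len(in)) because the final
// length is known before the loop starts.
func double(in []int) []int {
	out := make([]int, 0, len(in))
	for _, v := range in {
		out = append(out, v*2)
	}

	return out
}

// positives starts from a nil slice (var out []int): append works on nil,
// and nothing is allocated when no element matches.
func positives(in []int) ([]int, error) {
	if len(in) == 0 {
		return nil, errEmpty
	}

	var out []int

	for _, v := range in {
		if v > 0 {
			out = append(out, v)
		}
	}

	return out, nil
}

func main() {
	fmt.Println(describe(42), double([]int{1, 2, 3}))

	res, err := positives([]int{-1, 5})
	fmt.Println(res, err)
}
```

One design note: a nil slice and an empty slice literal behave the same for `len`, `range` and `append`, but they can encode differently (for example `null` versus `[]` in JSON), which is presumably why the commit says "where possible".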
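
Commit 78b9bd9273 above makes Talos halt when the CPU does not reach x86_64 microarchitecture level 2. As an illustration only, the sketch below checks a Linux `/proc/cpuinfo`-style flags string for the features that level v2 adds over the baseline. It is not the Talos implementation; the helper `missingV2Flags` and the sample flags string are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// v2Flags lists the /proc/cpuinfo flag names of the features that
// x86-64-v2 requires on top of the baseline (v1): CMPXCHG16B, LAHF/SAHF,
// POPCNT, SSE3 (reported as "pni"), SSE4.1, SSE4.2 and SSSE3.
var v2Flags = []string{"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}

// missingV2Flags returns the v2 features absent from a space-separated
// flags string; an empty result means the v2 additions are all present.
func missingV2Flags(flags string) []string {
	have := make(map[string]struct{})
	for _, f := range strings.Fields(flags) {
		have[f] = struct{}{}
	}

	var missing []string

	for _, f := range v2Flags {
		if _, ok := have[f]; !ok {
			missing = append(missing, f)
		}
	}

	return missing
}

func main() {
	// Hypothetical flags of a VM whose CPU model is limited to the baseline.
	baseline := "fpu cx8 cmov mmx fxsr sse sse2 syscall lm"

	if missing := missingV2Flags(baseline); len(missing) > 0 {
		fmt.Println("x86_64 microarchitecture level 2 not met, missing:", missing)
	}
}
```

On a real Linux host the flags string would come from the `flags` line of `/proc/cpuinfo`; as the commit message notes, a failing check on a VM usually points at the configured CPU model rather than the physical hardware.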
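
Commit 653f838b09 above maps the Kubernetes and Talos API ports of each Docker cluster to an available host port. One common way to obtain such a port in Go is to bind to port 0 and let the kernel choose; the sketch below shows that technique with a hypothetical `freePort` helper. Whether the provision library does it this way or leaves the choice to Docker is not stated in the commit message, so treat this as an assumption.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the kernel for an unused TCP port on the loopback
// interface by listening on port 0, then releases it and returns the number.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()

	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("error:", err)

		return
	}

	fmt.Println("picked host port", port)
}
```

The port is released before the caller uses it, so another process could claim it in between; code that needs a guarantee should keep the listener open or let the container runtime allocate the port itself.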


197 Commits