talos

mirror of https://github.com/siderolabs/talos.git synced 2025-12-17 23:42:01 +01:00

Author	SHA1	Message	Date
Bastiaan Schaap	2ff6db749a	chore: add Nedap Security Atlas as adopter Add Nedap Security Atlas as a Talos adopter. Signed-off-by: Bastiaan Schaap <bastiaan.schaap@nedap.com> Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-05-05 16:31:08 +05:30
Noel Georgi	89cab200b8	chore: bump kubernetes to v1.24.0 Bump kubernetes to v1.24.0 Ref: https://github.com/siderolabs/kubelet/pull/45 Also update coredns [manifests](https://github.com/coredns/deployment/blob/master/kubernetes/coredns.yaml.sed) Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-05-05 00:34:35 +05:30
Dmitriy Matrenichev	09d16349f4	chore: refactor StaticPod and StaticPodStatus into typed.Resource These two required some additional attention and were split into a separate branch. Also fix data race in NodeAddressSpec.DeepCopy method. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-05-04 19:53:16 +04:00
Dmitriy Matrenichev	d2935f98c4	chore: refactor LinkRefresh and LinkStatus into typed.Resource Following Andrey's comments on #5472, this commit changes LinkRefresh and LinkStatus into typed.Resource by moving Bump and Physical methods to *Spec types. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-05-04 19:31:14 +04:00
Philipp Sauter	b52e0b9b9e	fix: talosctl throws error if gen option and --input-dir flags are combined The user will get an error message and talosctl aborts if `talosctl cluster create` is called with gen options and the --input-dir flag. Fixes #2275 Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>	2022-05-04 16:18:03 +02:00
Tim Jones	0e15de3a8a	docs: add adopters file Adds an ADOPTERS markdown file to the repo to allow users to show they have adopted Talos Linux in their organization. Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-05-04 11:06:30 +02:00
Noel Georgi	bb932c2970	chore: bump containerd to v1.6.4 Bump containerd to v1.6.4 Ref: https://github.com/siderolabs/pkgs/pull/466 Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-05-04 00:41:30 +05:30
Noel Georgi	4eaaa2d597	chore: bump kernel to 5.15.37 Bump kernel to 5.15.37 Ref: https://github.com/siderolabs/pkgs/pull/463 Also bump [pkgs](https://github.com/siderolabs/pkgs/pull/465) and [tools](https://github.com/siderolabs/tools/pull/193) Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-05-03 21:36:59 +05:30
Dmitriy Matrenichev	89dde8f2c4	chore: refactor remaining resources into typed.Resource Refactor remaining resources into typed.Resource. Exceptions are: - MachineConfig - MachineType - LinkRefresh - LinkStatus all of which contain additional methods, and cannot be simply reworked into new resource framework. StaticPod and StaticPodStatus are also absent from this PR, because they result in e2e errors which are going to be resolved in the next PR. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-05-03 18:40:37 +04:00
Andrey Smirnov	bd089e702d	chore: bump dependencies dependabot + go-mod-outdated Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-05-03 16:30:59 +03:00
Tames McTigue	3136334b93	docs: fix links in VMware documentation The links to the patch and script files were changed and not reflected here. There was also a missing curl command in the first example of downloading the patch. Signed-off-by: Tames McTigue <tames@northwestern.edu> Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-05-03 16:07:31 +03:00
Andrey Smirnov	403df0e180	docs: provide example on using config generation package There were many discussions on creating native Talos providers for TF, Pulumi, etc., but there's no documented idiomatic way to use our machinery package to generate the config. This PR tries to fill this gap. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-05-03 15:44:51 +03:00
Dmitriy Matrenichev	6351928611	chore: redo pointer with github.com/siderolabs/go-pointer module With the advent of generics, redo pointer functionality and remove github.com/AlekSi/pointer dependency. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-05-02 02:17:13 +04:00
Andrey Smirnov	a269f740ce	docs: copy knowledge base to v1.0 docs As Talos v1.0.4 now supports kubelet with graceful shutdown disabled, update the docs. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-29 12:22:52 +03:00
Dmitriy Matrenichev	4832010263	fix: return an error if there is no byte slice in ReadonlyProvider Current code contains a data race, since access to r.bytes in Bytes() is unguarded and can be called from several goroutines. There is no need for it anyway, since WrapReadonly always gets a full slice. Refactor code to reflect that. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-04-28 17:10:49 +04:00
Andrey Smirnov	6e7486f099	fix: allow graceful node shutdown to be overridden The problem is that these values need to be set to zero if the kubelet feature gate is disabled, so we can't assume that we can override a zero value with the proper config, and we have to do an extra check on the supplied configuration. Also creates a KB article on disabling this feature gate. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-28 14:33:58 +03:00
Dmitriy Matrenichev	867d38f28f	feat: add bond slaves ordering Before this change, we didn't preserve bonded interfaces ordering, which caused problems in some scenarios. Fix this by remembering their position in the original config. Fixes #5207. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-04-28 01:15:11 +04:00
Andrey Smirnov	03ef62ad8b	fix: include Go primitive types into unstructured deepcopy This code was written from JSON point of view, but when YAML is unmarshaled, we get more primitive Go types as values, so why not include all of them? This was showing as an error when applying a machine config e.g. for kubelet extraArgs like: ``` shutdownGracePeriod: 0 ``` Changing this to string fixes the problem, but it's not the best UX. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-27 23:16:45 +03:00
Noel Georgi	f06e6acf2f	chore: bump kernel to 5.15.36 Bump kernel to 5.15.36 LTS Ref: - https://github.com/siderolabs/pkgs/pull/458 - https://github.com/siderolabs/pkgs/pull/460 Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-04-28 01:09:54 +05:30
Andrey Smirnov	c0d386abb6	fix: don't mount D-Bus socket via mount under recursive bind mount `/var/run` was mounted from `/run`, and D-Bus socket to `/var/run/dbus/` path, so when the container is stopped, container mounts are removed, but on the host side mount propagates back, so D-Bus socket gets propagated back to the host `/run`, and on the next kubelet restart process continues adding even more mount levels exponentially. Eventually on kubelet restart kernel resources are exhausted and the node freezes. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-27 21:09:59 +03:00
Andrey Smirnov	9a8ff76df2	refactor: rewrite perf resource to use typed.Resource No functional changes. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-27 17:57:49 +03:00
Andrey Smirnov	71d04c4d5c	refactor: rewrite runtime resources to use typed.Resource No functional changes. Also bumped cosi-runtime with the fix for the UnmarshalProto. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-27 16:47:50 +03:00
Andrey Smirnov	7568d51fc8	fix: trigger CRI config merge on correct resource update When registry CRI config gets updated, contents of the file are written to the `EtcFileSpec` resource, which gets rendered to disk, and resource `EtcFileStatus` is updated when the config is ready. CRI config parts are merged from contents of `*.part` files which come from system extensions and dynamic registry config which is written via `EtcFileSpec` resource. As the controller was incorrectly triggered on `EtcFileSpec` resource while reading files from disk, it might have read stale contents of a CRI config part (one which hasn't been fully rendered to disk) and so missed the latest content of the CRI config. With the fix, the controller is triggered on `EtcFileStatus` update, that is, once the file is rendered to disk. The symptom of the bug is an empty CRI registry config like: ```shell talosctl read /etc/cri/conf.d/cri.toml ## /etc/cri/conf.d/00-base.part version = 2 [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc] runtime_type = "io.containerd.runc.v2" discard_unpacked_layers = true ## /etc/cri/conf.d/01-registries.part ``` Notice that the `01-registries.part` is empty. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-26 23:25:59 +03:00
Tim Jones	c456dbcb93	docs: remove references to init nodes Init nodes were deprecated in v1.0 so it makes sense to remove the documentation about them and consign them to the past! Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-04-26 21:57:21 +02:00
Andrey Smirnov	1973095d14	feat: update containerd to 1.6.3 This includes a fix for image pull slowness from https://github.com/containerd/containerd/pull/6702. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-26 21:43:28 +03:00
Tim Jones	b51292d884	docs: reformat config reference Update the configuration reference documentation to show field information in a tabular format. Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-04-26 18:06:55 +02:00
Dmitriy Matrenichev	c0709d9707	feat: increase aio-max-nr and inotify.max_user_instances Increase values: - fs.aio-max-nr to 1048576 (for Ceph|Veritas|other storages) - fs.inotify.max_user_instances to 8192 (since the usual 512 is too small for today's needs) There is no need to adjust fs.inotify.max_user_watches since it's set dynamically during startup by kernel. Closes #5175 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-04-26 18:29:29 +04:00
Andrey Smirnov	85b328e997	refactor: convert secrets resources to use typed.Resource No functional changes. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-26 14:51:56 +03:00
Andrey Smirnov	e91350acd7	refactor: convert time & v1alpha1 resources to use typed.Resource No functional changes, just pure refactoring. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-25 22:41:52 +03:00
Andrey Smirnov	45464412e0	chore: bump dependencies dependabot + go-mod-outdated Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-25 16:26:41 +03:00
Andrey Smirnov	0af6b35a66	feat: update etcd to 3.5.4 See https://github.com/etcd-io/etcd/releases/tag/v3.5.4 Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-25 15:49:02 +03:00
Tim Jones	7ad27751cb	docs: fix analytics and sitemap Fixes the Google Analytics tracking ID and restores the production sitemap. Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-04-23 23:00:16 +02:00
Andrey Smirnov	55ff876dc6	chore: bump K8s Go modules to 1.24.0-rc.0 This was skipped due to https://github.com/kubernetes/kubernetes/issues/109565 Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-22 20:32:42 +03:00
Andrey Smirnov	f1f43131f8	fix: strip 'v' prefix from versions on Kubernetes upgrade This fixes an issue when `talosctl upgrade-k8s` fails with unhelpful message if the version is specified as `v1.23.5` vs. `1.23.5`. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-22 14:59:12 +03:00
Andrey Smirnov	ec621477bd	chore: tune QEMU disk provisioner options As QEMU clusters are used for testing, use unsafe cache options to reduce amount of fsyncs going to the host blockdevice. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-21 22:39:30 +03:00
Andrey Smirnov	b085343dcb	feat: use discovery information for etcd join (and other etcd calls) Talos historically relied on `kubernetes` `Endpoints` resource (which specifies `kube-apiserver` endpoints) to find other controlplane members of the cluster to connect to the `etcd` nodes for the cluster (when node local etcd instance is not up, for example). This method works great, but it relies on Kubernetes endpoint being up. If the Kubernetes API is down for whatever reason, or if the loadbalancer malfunctions, endpoints are not available and join/leave operations don't work. This PR replaces the endpoints lookup to use the `Endpoints` COSI resource which is filled in using two methods: * from the discovery data (if discovery is enabled, default to enabled) * from the Kubernetes `Endpoints` resource If the discovery is disabled (or not available), this change does almost nothing: still Kubernetes is used to discover control plane endpoints, but as the data persists in memory, even if the Kubernetes control plane endpoint went down, cached copy will be used to connect to the endpoint. If the discovery is enabled, Talos can join the etcd cluster immediately on boot without waiting for Kubernetes to be up on the bootstrap node which means that Talos cluster initial bootstrap runs in parallel on all control plane nodes, while previously nodes were waiting for the first node to finish bootstrap enough to fill in the endpoints data. As the `etcd` communication is anyways protected with mutual TLS, there's no risk even if the discovery data is stale or poisoned, as etcd operations would fail on TLS mismatch. Most of the changes in this PR actually enable populating Talos `Endpoints` resource based on the `Kubernetes` `endpoints` resource using the watch API. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-21 22:00:27 +03:00
Artem Chernyshev	2b03057b91	feat: implement a new mode `try` in the config manipulation commands The new mode allows changing the config for a period of time, which makes it possible to try a configuration and have it rolled back automatically if it doesn't work, for example. The mode can only be used with changes that can be applied without a reboot. When changed it doesn't write the configuration to disk, only changes it in memory. `--timeout` parameter can be used to customize the rollback delay. The default timeout is 1 minute. Any subsequent configuration change will abort try mode and the last applied configuration will be used. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2022-04-21 20:31:45 +03:00
Noel Georgi	51a68c31ff	chore: allow mounting files from the host Allow mounting files from host into extension services as per the [OCI spec](https://github.com/opencontainers/runtime-spec/blob/main/config.md#mounts) Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-04-21 21:00:31 +05:30
Noel Georgi	f3e330a0aa	docs: fix network dependency Fix network dependency Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-04-21 19:04:33 +05:30
Steve Francis	7ba39bd600	docs: clarify discovery service Clarify discovery service Signed-off-by: Steve Francis <steve.francis@talos-systems.com> Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-04-21 18:14:29 +05:30
Andrey Smirnov	8057d076ad	release(v1.1.0-alpha.1): prepare release This is the official v1.1.0-alpha.1 release. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com> v1.1.0-alpha.1 pkg/machinery/v1.1.0-alpha.1	2022-04-20 20:56:48 +03:00
Noel Georgi	1d5c08e74f	chore: bump kernel to 5.15.35 Bump kernel to 5.15.35 LTS Ref: https://github.com/siderolabs/pkgs/pull/454 Signed-off-by: Noel Georgi <git@frezbo.dev>	2022-04-20 20:33:10 +05:30
Andrey Smirnov	9bf23e5162	feat: update Kubernetes to 1.24.0-rc.0 See https://github.com/kubernetes/kubernetes/releases/tag/v1.24.0-rc.0 Go modules are not updated due to missing tags: https://github.com/kubernetes/kubernetes/issues/109565 Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-20 16:53:51 +03:00
Andrey Smirnov	d78ed320b7	docs: fix the docs reference to star registry redirects Since Talos moved to new registry redirect CRI plugin format, star redirects are no longer supported in the CRI plugin (see https://github.com/containerd/containerd/blob/main/docs/hosts.md). Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-20 16:03:46 +03:00
Andrey Smirnov	257dfb8709	fix: run the 'post' stage of the service always For most Talos services the `post` stage does nothing, so this was never properly noticed. For extension services, pre/post stages perform mounting and unmounting of the overlayfs, so if the post stage doesn't run (if the runner can't be created), the next time the service is started, it won't start as the post stage never ran. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-20 15:35:04 +03:00
Andrey Smirnov	992e230234	fix: correctly handle stopping services with reverse dependencies This bug showed up with extension services: say we have a service `ext-foo` which depends on service `cri`. Service `ext-foo` will be started correctly only once `cri` is up. But we should also stop `ext-foo` before `cri` is stopped, as otherwise the dependency chain is broken. This PR fixes exactly that: once `cri` is stopped, anything which depends on it should be stopped. We should stop as well anything which depends on `ext-foo` (transitive dependency). In practical terms we use dependency on `cri` in extension service to correctly stop/start extension services with `/var` filesystem mount/unmount. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-04-20 15:14:08 +03:00
Tim Jones	bb7a50bd5b	docs: fix netlify redirects Fixes Netlify redirect commands by adding an extra path segment aligning the directory properly. Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-04-20 13:16:14 +02:00
Tim Jones	486f79bc77	docs: fix netlify deploy url Fixes the URL from Netlify given to Hugo to build absolute URLs with the proper base. Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-04-20 12:18:02 +02:00
Tim Jones	e8cbedb05b	docs: add canonical link ref Adds a canonical link tag to doc pages to help SEO find the current version of documentation. Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-04-20 10:41:27 +02:00
Tim Jones	0fe4a7832b	docs: improve latest-version banner Make the latest-version banner sticky and more noticeable, and ensure the link to the latest version links to the current document if possible. Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2022-04-19 22:37:14 +02:00

3932 Commits