talos

mirror of https://github.com/siderolabs/talos.git synced 2025-10-08 22:21:16 +02:00

Author	SHA1	Message	Date
Andrey Smirnov	950f122c95	chore: update versions in upgrade tests In preparation for 0.13, start testing upgrades to 0.12. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2021-08-25 18:02:47 +03:00
Andrey Smirnov	dadaa65d54	feat: print uid/gid for the files in `ls -l` This adds information about file ownership in the long listing which is crucial sometimes. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2021-08-13 00:10:49 +03:00
Alexey Palazhchenko	09d70b7eaf	feat: update Kubernetes to v1.22.0 Closes #3967. Closes #3997. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>	2021-08-06 09:06:32 -07:00
Alexey Palazhchenko	eea750de2c	chore: rename "join" type to "worker" Closes #3413. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-07-09 07:10:45 -07:00
Andrey Smirnov	b969e7720e	chore: update references to old protobuf package This simply uses new protobuf package instead of old one. Old protobuf package is still in use by Talos dependencies. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-07-08 05:34:12 -07:00
Andrey Smirnov	10c28758a4	fix: ignore DeadlineExceeded error correctly on bootstrap The problem was that gRPC method `status.Code(err)` doesn't unwrap errors, while Talos client returns errors wrapped with `multierror.Error` and `fmt.Errorf`, so `status.Code` doesn't return error code correctly. Fix that by introducing our own client method which correctly goes over the chain of wrapped errors. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-07-07 12:02:26 -07:00
Andrey Smirnov	84817f7334	chore: bump Talos version in upgrade tests Preparing for 0.11 to be stable release soon. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-06-29 07:24:48 -07:00
Alexey Palazhchenko	2fa54107b2	chore: fix tests for disabled RBAC This commit also introduces a hidden `--json` flag for `talosctl version` command that is not supported and should be re-worked at #907. Refs #3852. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-06-28 13:56:40 -07:00
Alexey Palazhchenko	bbf1c091d4	feat: add RBAC to `talosctl version` output Refs #3852. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-06-28 07:10:25 -07:00
Alexey Palazhchenko	ad047a7dee	chore: small RBAC improvements * `talosctl config new` now sets endpoints in the generated config. * Avoid duplication of roles in metadata. * Remove method name prefix handling. All methods should be set explicitly. * Add tests. Closes #3421. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-06-25 05:50:38 -07:00
Alexey Palazhchenko	3c1b32199d	chore: refactor CLI tests Use testing.T.TempDir. Add support for `talosctl --endpoints`. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-06-23 05:49:00 -07:00
Alexey Palazhchenko	42c16f67f4	chore: bump dependencies Update k8s to 1.21.2. See #3787 #3788 #3789 #3790 #3791 #3792 #3793 #3794 #3795 #3796 #3798. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-06-21 07:05:41 -07:00
Alexey Palazhchenko	f63ab9dd9b	feat: implement `talosctl config new` command Refs #3421. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-06-17 09:06:43 -07:00
Andrey Smirnov	62c702c4fd	fix: remove conflicting etcd member on rejoin with empty data directory This fixes a scenario when control plane node loses contents of `/var` without leaving etcd first: on reboot etcd data directory is empty, but member is already present in the etcd member list, so etcd won't be able to join because of raft log being empty. The fix is to remove a member with matching hostname if found in the etcd member list followed by new member add. The risk here is removing another member which has same hostname as the joining node, but having duplicate hostnames for control plane node is a problem anyways. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-06-03 15:11:44 -07:00
Andrew Rynhard	a71053fcd8	feat: default to bootstrap workflow Changes `gen config` to output `controlplane` and `join` machine config types only. Users can manually set the `type` to `init` if they need to. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2021-06-03 11:29:56 -07:00
Andrey Smirnov	5811f4dda1	feat: implement link (interface) controllers The structure of the controllers is really similar to addresses and routes: * `LinkSpec` resource describes desired link state * `LinkConfig` controller generates `LinkSpecs` based on machine configuration and kernel cmdline * `LinkMerge` controller merges multiple configuration sources into a single `LinkSpec` paying attention to the config layer priority * `LinkSpec` controller applies the specs to the kernel state Controller `LinkStatus` (which was implemented before) watches the kernel state and publishes current link status. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-06-01 09:36:25 -07:00
Andrey Smirnov	0acb04ad7a	feat: implement route network controllers Route handling is very similar to addresses: * `RouteStatus` describes kernel routing table state, `RouteStatusController` reflects kernel state into resources * `RouteSpec` defines routes to be configured * `RouteConfigController` creates `RouteSpec`s based on cmdline and machine configuration * `RouteMergeController` merges different configuration layers into the final representation * `RouteSpecController` applies the specs to the kernel routing table Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-05-25 11:09:21 -07:00
Alexey Palazhchenko	4fe6912143	test: better `talosctl ls` tests Refs #3018. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-05-20 03:29:21 -07:00
Andrey Smirnov	76e38b7b82	feat: update Kubernetes to 1.21.1 See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-05-13 08:05:08 -07:00
Andrey Smirnov	0f49722d0f	feat: add `--config-patch` flag by node type The problem is that some patches can't be applied to join config, as some nodes don't even exist in the config, for example `/cluster/apiServer` node, and applying such patches doesn't make any sense. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-04-27 11:55:03 -07:00
Andrey Smirnov	daf2208749	test: update upgrade tests to 0.10 release In preparation for going 0.10 beta, start testing upgrades to 0.10, drop 0.8 and self-hosted control plane handling in the tests. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-04-09 12:57:04 -07:00
Alexey Palazhchenko	1fcf38f9d6	feat: add support for "none" CNI type Closes #3411. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-04-09 12:53:00 -07:00
Alexey Palazhchenko	37a5edf04a	feat: update Kubernetes to 1.21.0 release See CHANGELOG: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md Closes #3329. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-04-09 20:08:20 +03:00
Alexey Palazhchenko	29da22d063	feat: add config validation warnings Closes #3412. Refs #3413. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-04-08 13:49:58 -07:00
Andrey Smirnov	e0650218a6	feat: support etcd recovery from snapshot on bootstrap When Talos `controlplane` node is waiting for a bootstrap, `etcd` contents can be recovered from a snapshot created with `talosctl etcd snapshot` on a healthy cluster. Bootstrap process goes same way as before, but the etcd data directory is recovered from the snapshot. This flow enables disaster recovery for the control plane: given that periodic backups are available, destroy control plane nodes, re-create them with the same config, and bootstrap one node with the saved snapshot to recover etcd state at the time of the snapshot. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-04-08 10:15:37 -07:00
Artem Chernyshev	39c6dbcc7a	feat: add --config-patch parameter to talosctl gen config Fixes: https://github.com/talos-systems/talos/issues/3410 Same as in `talosctl cluster create`. Will apply RFC6902 json patch during the config generation if specified. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2021-04-02 10:56:41 -07:00
Andrey Smirnov	e664362cec	feat: add API and command to save etcd snapshot (backup) This adds a simple API and `talosctl etcd snapshot` command to stream snapshot of etcd from one of the control plane nodes to the local file. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-04-02 09:20:16 -07:00
Andrey Smirnov	abc2e17ebb	test: update 0.9.x version in upgrade tests to 0.9.1 Version 0.9.1 contains a fix for concurrent map write on unmount which was frequently breaking our upgrade tests. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-04-02 03:59:36 -07:00
Andrey Smirnov	7d91258475	test: fix data race in apply config tests Variable `chanErr` was read before waiting for the goroutine to finish. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-31 10:46:50 -07:00
Andrey Smirnov	204caf8eb9	test: fix apply-config integration test, bump clusterctl version Tests for ApplyConfig API were relying on not really supported behavior of modifying config via the `Provider` interface (and it was "fixed" in another PR which cleans up such access to the configuration). Cluster version bumped to try to workaround strange CAPI bootstrap failures in e2e-capi. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-31 09:55:53 -07:00
Alexey Palazhchenko	a9451f5712	feat: update Kubernetes to 1.21.0-beta.1 See CHANGELOG: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md Refs #3329. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-03-30 03:07:03 -07:00
Andrey Smirnov	2ea20f598a	feat: replace timed with time sync controller This is a complete rewrite of time sync process. Now the time sync process starts early at boot time, and it adapts to configuration changes: * before config is available, `pool.ntp.org` is used * once config is available, configured time servers are used Controller updates same time sync resource as other controllers had dependency on, so they have a chance to wait for the time sync event. Talos services which depend on time now wait on same resource instead of waiting on timed health. New features: * time sync now sticks to the particular time server unless there's an error from that server, and server is changed in that case, this improves time sync accuracy * time sync acts on config changes immediately, so it's possible to reconfigure time sync at any time * there's a new 'epoch' field in time sync resources which allows time-dependent controllers to regenerate certs when there's a big enough jump in time Features to implement later: * apid shouldn't depend on timed, it should be started early and it should regenerate certs on time jump * trustd should be updated in same way Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-29 09:29:43 -07:00
Alexey Palazhchenko	d7e9f6d6a8	chore: build integration tests with -race Refs https://github.com/talos-systems/talos/issues/3378. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-03-26 10:08:12 -07:00
Alexey Palazhchenko	ed272e604e	feat: update Kubernetes to 1.21.0-beta.0 See CHANGELOG: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md Refs #3329. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-03-24 07:36:54 -07:00
Andrey Smirnov	b0209fd29d	refactor: move networkd, timed APIs to machined, remove routerd This moves implementation of the user-facing APIs to the machined, and as now all the APIs are implemented by machined, remove routerd and adjust apid to proxy to machined. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-24 00:00:28 -07:00
Artem Chernyshev	6ffabe5169	feat: add ability to find disk by disk properties Fixes: https://github.com/talos-systems/talos/issues/3323 Not exactly matching with udevd generated `by-<id>` symlinks, but should provide sufficient amount of property selectors to be able to pick specific disks for any kind of disk: sd card, hdd, ssd, nvme. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2021-03-23 14:23:02 -07:00
Andrey Smirnov	ac8764702f	refactor: move apid, routerd, timed and trustd to single executable This removes container images for the aforementioned services, they are now built into `machined` executable which launches one or another service based on `argv[0]`. Containers are started with rootfs directory which contains only a single executable file for the service. This creates rootfs on squashfs for each container in `/opt/<container>`. Service `networkd` is not touched as it's handled in #3350. This removes all the image imports, snapshots and other things which were associated with the existing way to run containers. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-23 09:48:11 -07:00
Andrey Smirnov	125b86f4ef	fix: upgrade-k8s bug with empty config values and provision script First, if the config for some component image (e.g. `apiServer`) is empty, Talos pushes default image which is unknown to the script, so verify that change is not no-op, as otherwise script will hang forever waiting for k8s control plane config update. Second, with bootkube bootstrap it was fine to omit explicit kubernetes version in upgrade test, but with Talos-managed that means that after Talos upgrade Kubernetes gets upgraded as well (as Talos config doesn't contain K8s version, and defaults are used). This is not what we want to test actually. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-19 12:05:31 -07:00
Andrey Smirnov	f0512dfce9	feat: update Kubernetes to 1.20.5 See CHANGELOG: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changelog-since-v1204 Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-19 03:14:46 -07:00
Andrey Smirnov	ca8a5596c7	chore: fix provision tests after changes to build-container CNI was removed from build-container which works fine for `talosctl cluster create` clusters as it installs its own CNI, but fails for upgrade tests as they were never updated for the CNI bundle. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-12 09:59:15 -08:00
Artem Chernyshev	22f375300c	chore: update golangci-lint to 1.38.0 Fix all discovered issues. Detected a couple of bugs, fixed them as well. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2021-03-12 06:50:02 -08:00
Alexey Palazhchenko	df52c13581	chore: fix //nolint directives That's the recommended syntax: https://golangci-lint.run/usage/false-positives/ Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-03-05 05:58:33 -08:00
Andrey Smirnov	7e8f13652c	chore: fix upgrade tests by bumping 0.9 to alpha.5 Resources/types were renamed after alpha.4, so we need Talos API to match expectations of the upgrade test built against master. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-03 13:53:06 -08:00
Andrey Smirnov	60aa011c7a	feat: rename namespaces, resources, types etc See https://github.com/talos-systems/os-runtime/pull/12 for new naming conventions. No functional changes. Additionally implements printing extra columns in `talosctl get xyz`. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-02 13:34:15 -08:00
Andrey Smirnov	1d8ed9b5cd	chore: update provision/upgrade tests to 0.9.0-alpha.3 This drops support for 0.7.x in upgrade tests, and bumps tests to use version 0.9.0-alpha.3 as the next stable (it will eventually graduate to 0.9.0). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-02 07:11:16 -08:00
Andrey Smirnov	31e56e63db	fix: update in-cluster kubeconfig validity to match other certs Talos generates in-cluster kubeconfig for the kube-scheduler and kube-controller-manager to authenticate to kube-apiserver. Bug was that validity of that kubeconfig was set to 24h by mistake. Fix that by bumping validity to default for other Kubernetes certs (1 year). Add a certificate refresh at 50% of the validity. Fix bugs with copying secret resources which was leading to updates not being propagated correctly. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-01 11:16:04 -08:00
Andrey Smirnov	c7ee239087	fix: show stopped/exited containers via CRI inspector This fixes output of `talosctl containers` to show failed/exited containers so that it's possible to see e.g. `kube-apiserver` container when it fails to start. This also enables using ID from the container list to see logs of failing containers, so it's easy to debug issues when control plane pods don't start because of wrong configuration. Also remove option to use either CRI or containerd inspector, default to containerd for system namespace and to CRI for kubernetes namespace. The only side effect is that we can't see `kubelet` container in the output of `talosctl containers -k`, but `kubelet` itself is available in `talosctl services` and `talosctl logs kubelet`. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-02-26 14:45:13 -08:00
Artem Chernyshev	041620c852	feat: implement talosctl edit and patch config commands Fixes: https://github.com/talos-systems/talos/issues/3209 Using parts of `kubectl` package to run the editor. Also using the same approach as in `kubectl edit` command: - add commented section to the top of the file with the description. - if the config has errors, display validation errors in the commented section at the top of the file. - retry apply config until it succeeds. - abort if no changes were detected or if the edited file is empty. Patch currently supports jsonpatch only and can read it either from the file or from the inline argument. https://asciinema.org/a/wPawpctjoCFbJZKo2z2ATDXeC Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2021-02-26 02:00:20 +03:00
Artem Chernyshev	7108bb3f5b	test: upgrade master to master tests Verify upgrade flow using the same version of the installer. Run that with disk encryption enabled. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2021-02-24 07:56:44 -08:00
Andrey Smirnov	e2f1fbcfdb	feat: support control plane upgrades with Talos managed control plane Upgrade is performed by updating node configuration (node by node, service by service), watching internal resource state to get new configuration version and verifying that pod with matching version successfully propagated to the API server state and pod is ready. Process is similar to the rolling update of the DaemonSet. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-02-20 11:57:32 -08:00


197 Commits