talos

mirror of https://github.com/siderolabs/talos.git synced 2025-10-21 20:41:11 +02:00

Author	SHA1	Message	Date
Andrey Smirnov	dc294db16c	chore: bump dependencies via dependabot PRs #3336 #3337 #3338 #3339 Also bump proto tools via talos-systems/tools#133 Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-22 13:58:08 -07:00
Alexey Palazhchenko	0dbaeb9e65	chore: update tools, use new generators To stay current. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-03-16 11:17:15 -07:00
Artem Chernyshev	376fdcf6cb	feat: implement etcd remove-member cli command Fixes: https://github.com/talos-systems/talos/issues/3219 We already have `etcd leave`, which makes the node exclude itself from etcd members. But if the node can't remove itself because it has no connection to etcd, we need this `etcd remove-member` CLI command, which removes that node's membership when run against a different node. No unit tests for that, as it would destroy the test cluster. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2021-03-01 07:55:08 -08:00
Andrey Smirnov	7751920dba	feat: add a tool and package to convert self-hosted CP to static pods This is required to upgrade from Talos 0.8.x to 0.9.x. After the cluster is fully upgraded, the control plane is still self-hosted (as it was bootstrapped with bootkube). Tool `talosctl convert-k8s` (and the library behind it) performs the conversion from the self-hosted control plane to static pods. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-02-17 23:26:57 -08:00
Andrey Smirnov	cc83b83808	feat: rename apply-config --no-reboot to --on-reboot This explains the intention better: config is applied on reboot. It also makes it easy to distinguish from `apply-config --immediate`, which applies config immediately without a reboot (that is coming in a different PR). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-02-17 12:49:47 -08:00
Andrey Smirnov	d99a016af2	fix: correct response structure for GenerateConfig API Also fix recovery grpc handler to print panic stacktrace to the log. Any API should follow the structure compatible with apid proxying injection of errors/nodes. Explicitly fail GenerateConfig API on worker nodes, as it panics on worker nodes (missing certificates in node config). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-02-11 06:34:10 -08:00
Andrey Smirnov	edf5777222	feat: add an option to force upgrade without checks Our upgrades are safe by default: we check etcd health, take locks, etc. But sometimes an upgrade might be a way to recover a broken (or semi-broken) cluster; in that case we need the upgrade to run even if the checks are not passing. This is not a safe way to do upgrades, but it might be a way to recover a cluster. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-02-09 10:20:03 -08:00
Andrey Smirnov	76a6794436	fix: kill all processes and umount all disk on reboot/shutdown There are several ways a Talos node might be restarted or shut down: (1) error in sequence (initiated from machined); (2) panic in main goroutine (machined recovers panics); (3) error in sequence (initiated via API, event caught by machined); (4) reboot/shutdown via Talos API. Before this change, paths (1) and (2) were handled in machined, and no disks were unmounted and no processes were killed, so technically all the processes were still running and potentially writing to the filesystems. Paths (3) and (4) tried to stop services (but not pods) and unmount explicitly mounted filesystems, followed by a reboot directly from the sequencer (bypassing the machined handler). There was a bug: user disks were never explicitly unmounted (but they might have been unmounted if mounted on top of `/var`). This refactors all the reboot/shutdown paths to flow through machined's main function: on paths (4) event is sent via event API from the sequencer back to the machined and machined initiates proper shutdown sequence. Refactoring in machined leads to all the paths (1)-(4) flowing through the same function `handle(error)`. Added two additional checks before flushing buffers: kill all non-system processes (this also kills all mount namespaces), and unmount any filesystem backed by `/dev/*`. This ensures all filesystems are unmounted before buffers are flushed. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-01-29 06:14:07 -08:00
Andrey Smirnov	0aaf8fa968	feat: replace bootkube with Talos-managed control plane Control plane components are running as static pods managed by the kubelets. Whole subsystem is managed via resources/controllers from os-runtime. Many supporting changes/refactoring to enable new code paths. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-01-26 14:22:35 -08:00
Alexey Palazhchenko	275ca76c5b	chore: update protobuf, grpc-go, prototool To stay current. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-01-11 08:52:58 -08:00
Alexey Palazhchenko	f3465b8e3e	feat: support type filter in list API and CLI Closes #2068. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2020-12-24 06:34:02 -08:00
Artem Chernyshev	a83e8758db	feat: add commands to manage/query etcd cluster Used already existing protobufs for that. Commands: `talosctl etcd members -n <node>` `talosctl etcd leave -n <node>` `talosctl etcd forfeit-leadership -n <node>` Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-12-22 11:49:10 -08:00
Andrey Smirnov	54ed80e244	feat: reset with system disk wipe spec The idea is to add an option to perform a "selective" reset: the default reset operation is to wipe all partitions (triggering reinstall), while the spec allows wiping only some of the partitions. Other operations are performed in exactly the same way for any reset flow. Possible use case: reset only the `EPHEMERAL` partition. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-12-10 11:31:07 -08:00
Andrey Smirnov	350280eb59	feat: implement "staged" (failsafe/backup) upgrades The regular upgrade path takes just one reboot, but it requires all the processes on the node to be stopped before the upgrade can proceed. Under some circumstances, and with potential Talos bugs, this might not work, rendering Talos upgrades almost impossible. Staged upgrades build upon the regular install flow to run the upgrade on the node reboot. Such upgrades require two reboots of the node and two pulls of the installer image, but they should be much less susceptible to failure. Once the upgrade is staged, the node can be rebooted in any possible way, including a hard reset, and the upgrade is performed on the next boot. A new ADV format was implemented as well to allow storing the install image ref/options across reboots. The new format allows for bigger values and takes 50% of the `META` partition. The old ADV is still kept for compatibility reasons. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-12-08 08:34:26 -08:00
Artem Chernyshev	5d48bd5f6a	feat: allow disabling NoSchedule taint on masters using TUI installer I think this should come in handy for setting up single-node SBC clusters. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-12-07 07:31:54 -08:00
Artem Chernyshev	63e0d02aa9	feat: add TUI for configuring network interfaces settings Allows configuring: - CIDR. - DHCP enable/disable. - MTU. - Ignore. - DHCP metric. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-12-03 11:05:55 -08:00
Artem Chernyshev	c7062e3f4d	feat: make GenerateConfiguration accept current time as a parameter If the node time is out of sync, it can generate incorrect configuration. And maintenance mode does not allow us to start NTP, because there is no containerd. By providing the current UTC time of the machine where the talosctl client is running, it is possible to force GenerateConfiguration to use the correct time. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-12-03 08:28:11 -08:00
Artem Chernyshev	f96cffd2b2	feat: add ability to choose CNI config Initial version which only allows setting CNI using preset, no custom CNI urls are supported at the moment. Still need to figure out what kind of UI can be used for that. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-11-26 06:49:54 -08:00
Andrey Smirnov	9a32e34cb1	feat: implement apply configuration without reboot This allows config to be written to disk without being applied immediately. Small refactoring to extract common code paths. At first, I tried to implement this via the sequencer, but looks like it's too hard to get it right, as sequencer lacks context and config to be written is not applied to the runtime. Fixes #2828 Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-11-23 12:42:44 -08:00
Artem Chernyshev	8513123d22	feat: return client config as the second value in GenerateConfiguration To be used in interactive installer to output the node client configuration to a file. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-11-17 07:20:05 -08:00
Artem Chernyshev	0f924b5122	feat: add generate config gRPC API Fixes: https://github.com/talos-systems/talos/issues/2766 This API is implemented in Maintenance and Machine services. Can be used to generate configuration on the node, instead of using talosctl to generate it locally. To be used in interactive installer and talosctl gen config. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-11-13 08:07:32 -08:00
Artem Chernyshev	93e30a1738	chore: remove maintenance service interface and use machine service Now maintenance service implements `MachineService` interface, stubbing all not implemented methods. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-11-11 12:33:44 -08:00
Andrew Rynhard	71321214a1	feat: add storage API This is the initial implementation of a storage API. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-11-11 10:12:25 -08:00
Andrew Rynhard	562f816526	refactor: use gRPC for interactive installation Instead of hosting a web service, we decided to implement a gRPC service that exposes APIs that can be used in a client-side interactive installer. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-11-03 08:36:44 -08:00
Artem Chernyshev	e7e99cf1b3	feat: support disk usage command in talosctl Usage example: ```bash talosctl du --nodes 10.5.0.2 /var -H -d 2 NODE NAME 10.5.0.2 8.4 kB etc 10.5.0.2 1.3 GB lib 10.5.0.2 16 MB log 10.5.0.2 25 kB run 10.5.0.2 4.1 kB tmp 10.5.0.2 1.3 GB . ``` Supported flags: - `-a` writes counts for all files, not just directories. - `-d` recursion depth. - `-H` humanize size outputs. - `-t` size threshold (skip files if < size or > size). Fixes: https://github.com/talos-systems/talos/issues/2504 Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-10-13 09:30:31 -07:00
Andrew Rynhard	4eeef28e90	feat: add etcd API This adds RPCs for basic etcd management tasks. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-10-06 11:30:04 -07:00
Seán C McCord	ff92d2a14b	feat: add ApplyConfiguration API Adds the ability to apply (replace) an existing node configuration with a new one via the Machine API. Fixes #2345 Signed-off-by: Seán C McCord <ulexus@gmail.com>	2020-09-29 14:44:06 -07:00
Andrey Smirnov	bddd4f1bf6	refactor: move external API packages into `machinery/` This moves `pkg/config`, `pkg/client` and `pkg/constants` under `pkg/machinery` umbrella. And `pkg/machinery` is published as Go module inside Talos repository. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-17 09:56:14 -07:00


78 Commits