talos

mirror of https://github.com/siderolabs/talos.git synced 2025-10-29 23:41:41 +01:00

Author	SHA1	Message	Date
Andrey Smirnov	76a6794436	fix: kill all processes and umount all disk on reboot/shutdown There are several ways a Talos node might be restarted or shut down: (1) error in sequence (initiated from machined); (2) panic in main goroutine (machined recovers panics); (3) error in sequence (initiated via API, event caught by machined); (4) reboot/shutdown via Talos API. Before this change, paths (1) and (2) were handled in machined, but no disks were unmounted and no processes were killed, so technically all the processes were still running and potentially writing to the filesystems. Paths (3) and (4) try to stop services (but not pods) and unmount explicitly mounted filesystems, followed by reboot directly from sequencer (bypassing machined handler). There was a bug that user disks were never explicitly unmounted (but they might have been unmounted if mounted on top of `/var`). This refactors all the reboot/shutdown paths to flow through machined's main function: on path (4), an event is sent via event API from the sequencer back to machined, and machined initiates the proper shutdown sequence. Refactoring in machined leads to all the paths (1)-(4) flowing through the same function `handle(error)`. Added two additional checks before flushing buffers: kill all non-system processes (this also kills all mount namespaces), and unmount any filesystem backed by `/dev/*`. This ensures all filesystems are unmounted before buffers are flushed. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-01-29 06:14:07 -08:00
Andrey Smirnov	cbb7ca8390	refactor: merge osd into machined This merges `osd` API into `machined`. API was copied from `osd` into `machined`, and `osd` API was deprecated. For backwards compatibility, `machined` still implements `osd` API, so older Talos API clients can still talk to the node without changes. Docs were updated. No functional changes. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-13 12:50:00 -07:00
Andrey Smirnov	d210d7f1a3	fix: implement Unload() for services to make sure bootkube runs always The problem was that the flow to re-run a service with different parameters was not consistent: it depended on whether the service was loaded before or not, but that is not reliable, as e.g. with bootstrap API `bootkube` is loaded for the bootstrap and stays until reboot, and is never loaded for any other boot. `Unload()` stops and removes the service completely so that a new instance of the service can be loaded and started. This fixes the edge case with recovery API not running bootkube properly before reboot after bootstrap. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-08 07:15:45 -07:00
Andrey Smirnov	d3d011c8d2	chore: replace `/* */` comments with `//` comments in license header This fixes issues with `// +build` directives not being recognized in source files. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2019-10-25 14:15:17 -07:00
Andrew Rynhard	80e3876df5	feat: remove proxyd We have decided that proxyd is not the best architecture for HA Kubernetes. Our recommendation to users will be to create a load balancer instead. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2019-10-14 08:11:00 -07:00
Andrey Smirnov	c2cb0f9778	chore: enable 'wsl' linter and fix all the issues I wish there were less of them :) Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2019-10-10 01:16:29 +03:00
Andrew Rynhard	5ee554128e	chore: move from gofumpt to gofumports gofumports does everything that gofumpt does, with the addition of formatting imports. This change proposes the use of the `-local` flag so that imports are grouped in the following order: standard library, third party, Talos specific. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2019-09-12 07:49:12 -07:00
Andrey Smirnov	9c63f4ed0a	feat(init): implement complete API for service lifecycle (start/stop) It is now possible to `start`/`stop`/`restart` any service via `osctl` commands. There are some changes in `ServiceRunner` to support re-use (re-entering running state). The `Services` singleton now tracks service running state to avoid calling `Start()` on an already running `ServiceRunner` instance. Method `Start()` was renamed to `LoadAndStart()` to separate service loading (adding to the list of services) from the actual service start. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2019-08-01 11:16:57 -07:00
Andrew Rynhard	8e8aae98dd	feat: add machined This commit splits our current init into init and machined. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2019-07-16 13:12:21 -07:00

9 Commits