talos

mirror of https://github.com/siderolabs/talos.git synced 2025-10-08 06:01:12 +02:00

Andrey Smirnov 76a6794436 fix: kill all processes and umount all disk on reboot/shutdown

There are several ways a Talos node might be restarted or shut down:

1. error in sequence (initiated from machined)
2. panic in main goroutine (machined recovers panics)
3. error in sequence (initiated via API, event caught by machined)
4. reboot/shutdown via Talos API

Before this change, paths (1) and (2) were handled in machined, and no
disks were unmounted and no processes were killed, so technically all the
processes were still running and potentially writing to the filesystems.
Paths (3) and (4) tried to stop services (but not pods) and unmount
explicitly mounted filesystems, followed by a reboot directly from the
sequencer (bypassing the machined handler).

There was a bug: user disks were never explicitly unmounted (though
they might have been unmounted if mounted on top of `/var`).

This refactors all the reboot/shutdown paths to flow through machined's
main function: on path (4), an event is sent via the event API from the
sequencer back to machined, and machined initiates the proper shutdown
sequence.

Refactoring in machined leads to all the paths (1)-(4) flowing through
the same function `handle(error)`.

Added two additional steps before flushing buffers:

* kill all non-system processes; this also kills all mount namespaces
* unmount any filesystem backed by `/dev/*`

This ensures all filesystems are unmounted before buffers are flushed.
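
The second step can be sketched as a scan over the mount table. The following Go example is an assumption-laden illustration, not the code from this commit: the function name `deviceBackedMounts` is hypothetical, the input is text in the `/proc/mounts` format (source, mount point, filesystem type, options, and two numeric fields), and returning the mount points in reverse order, so that nested mounts come before their parents, is a design choice made here. The actual unmount call is left out.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// deviceBackedMounts returns the mount points of all entries whose source
// starts with /dev/, in reverse order of appearance, so that mounts listed
// later (often nested deeper) are handled first.
func deviceBackedMounts(mounts string) []string {
	var targets []string

	scanner := bufio.NewScanner(strings.NewReader(mounts))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 {
			continue // skip blank or malformed lines
		}
		if strings.HasPrefix(fields[0], "/dev/") {
			targets = append(targets, fields[1])
		}
	}

	// Reverse in place.
	for i, j := 0, len(targets)-1; i < j; i, j = i+1, j-1 {
		targets[i], targets[j] = targets[j], targets[i]
	}

	return targets
}

func main() {
	// Made-up sample data in /proc/mounts format.
	const sample = `proc /proc proc rw 0 0
/dev/sda6 /var xfs rw 0 0
tmpfs /run tmpfs rw 0 0
/dev/sdb1 /var/mnt/data ext4 rw 0 0
`
	for _, target := range deviceBackedMounts(sample) {
		fmt.Println("would unmount:", target)
	}
}
```

On a real system the input would come from reading `/proc/mounts`, and each returned mount point would be passed to an unmount call.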

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

2021-01-29 06:14:07 -08:00

api.md

fix: kill all processes and umount all disk on reboot/shutdown

2021-01-29 06:14:07 -08:00

cli.md

feat: use multi-arch images for k8s and Flannel CNI

2021-01-28 08:26:02 -08:00

configuration.md

feat: use multi-arch images for k8s and Flannel CNI

2021-01-28 08:26:02 -08:00

platform.md

docs: add v0.9 docs

2021-01-13 15:42:25 +03:00