Andrey Smirnov 76a6794436 fix: kill all processes and umount all disks on reboot/shutdown
There are several ways a Talos node might be restarted or shut down:

1. error in sequence (initiated from machined)
2. panic in main goroutine (machined recovers panics)
3. error in sequence (initiated via API, event caught by machined)
4. reboot/shutdown via Talos API

Before this change, paths (1) and (2) were handled in machined, and no
disks were unmounted and no processes were killed, so technically all
the processes were still running and potentially writing to the
filesystems. Paths (3) and (4) tried to stop services (but not pods) and
unmount explicitly mounted filesystems, followed by a reboot directly
from the sequencer (bypassing the machined handler).

There was also a bug: user disks were never explicitly unmounted (but
they might have been unmounted if mounted on top of `/var`).

This change refactors all the reboot/shutdown paths to flow through
machined's main function: on path (4), an event is sent via the event
API from the sequencer back to machined, and machined initiates the
proper shutdown sequence.

After the refactoring in machined, all the paths (1)-(4) flow through
the same function, `handle(error)`.

Added two additional steps before flushing buffers:

* kill all non-system processes (this also kills all mount namespaces)
* unmount any filesystem backed by `/dev/*`

This ensures all filesystems are unmounted before buffers are flushed.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-29 06:14:07 -08:00


// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.

package runtime

import (
	"errors"
	"fmt"
)

var (
	// ErrLocked indicates that the sequencer is currently locked, and processing
	// another sequence.
	ErrLocked = errors.New("locked")

	// ErrInvalidSequenceData indicates that the sequencer got data of the wrong
	// type for a sequence.
	ErrInvalidSequenceData = errors.New("invalid sequence data")

	// ErrUndefinedRuntime indicates that the sequencer's runtime is not defined.
	ErrUndefinedRuntime = errors.New("undefined runtime")
)

// RebootError encapsulates unix.Reboot() cmd argument.
type RebootError struct {
	Cmd int
}

// Error implements the error interface.
func (e RebootError) Error() string {
	return fmt.Sprintf("unix.Reboot(%x)", e.Cmd)
}

// IsRebootError checks whether given error is RebootError.
func IsRebootError(err error) bool {
	var rebootErr RebootError

	return errors.As(err, &rebootErr)
}