18 Commits

Author SHA1 Message Date
Artem Chernyshev
d1f55f9012
fix: update blockdevice library to properly handle absent GPT
Next blockdevice library release reads MBR along with GPT and raises
an error if GPT is not set.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-11-24 00:33:00 +03:00
Andrey Smirnov
753a82188f
refactor: move pkg/resources to machinery
Fixes #4420

No functional changes, just moving packages around.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-15 19:50:35 +03:00
Andrey Smirnov
1d1e1df643
fix: handle skipped mounts correctly
If the mount is skipped, we shouldn't record it and create a matching
resource.

This fixes a problem discovered by cluster discovery tests when node
establishes more than a single identity on initial boot with first one
being lost, but still exists in the discovery service.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-25 17:23:01 +03:00
Andrey Smirnov
f8bebba2de
fix: ignore error on duplicate for MountStatus
We have multiple calls for `mountState` even when `STATE` is already
mounted, so we should handle it properly.

Example error:

```
[  152.736427] [talos] apply configuration failed: error running phase 2 in applyConfiguration sequence: task 1/1: failed, error creating mount status resource: resource MountStatuses.runtime.talos.dev(runtime/STATE@1) already exists
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-30 18:54:13 +03:00
Andrey Smirnov
85cda1b956
feat: provide MountStatus resource for system partition mounts
Fixes #4133

This is pretty limited resource, as it covers only system mounts, but
this is all we need for KubeSpan for now. More complete solution should
probably involve COSIfying whole mount subsystem.

Example:

```
$ talosctl -n 172.20.0.2 get mounts
NODE         NAMESPACE   TYPE          ID          VERSION   SOURCE      TARGET          FILESYSTEM TYPE
172.20.0.2   runtime     MountStatus   EPHEMERAL   1         /dev/vda6   /var            xfs
172.20.0.2   runtime     MountStatus   STATE       1         /dev/vda5   /system/state   xfs
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-25 20:56:49 +03:00
Artem Chernyshev
2dc27d9964 fix: do not format state partition in the initialize sequence
Initialize state should be only reading the config.
So now if it detects that the partition is not even formatted it will
skip it and will consider the state to be empty.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-06-17 07:17:42 -07:00
Artem Chernyshev
032851844f fix: get rid of data race in encoder and fix concurrent map access
Fixes: https://github.com/talos-systems/talos/issues/3377, https://github.com/talos-systems/talos/issues/3380

Fixed the data race in the encoder documentation examples by using `sync.Once`.
We only need to generate them once anyways and then it's not a big deal
that we are using the same pointers everywhere as they're pretty much
constant.

As of `system.go`, looks like we actually have concurrent operations for
partitions unmount so I just added a mutex there.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-29 01:00:46 -07:00
Alexey Palazhchenko
df52c13581 chore: fix //nolint directives
That's the recommended syntax:
https://golangci-lint.run/usage/false-positives/

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-05 05:58:33 -08:00
Artem Chernyshev
58ff2c9808 feat: implement ephemeral partition encryption
This PR introduces the first part of disk encryption support.
New config section `systemDiskEncryption` was added into MachineConfig.
For now it contains only Ephemeral partition encryption.

Encryption itself supports two kinds of keys for now:
- node id deterministic key.
- static key which is hardcoded in the config and mainly used for test
purposes.

Talosctl cluster create can now be told to encrypt ephemeral partition
by using `--encrypt-ephemeral` flag.

Additionally:
- updated pkgs library version.
- changed Dockefile to copy cryptsetup deps from pkgs.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-17 13:39:04 -08:00
Artem Chernyshev
02b3719df9 feat: skip filesystem for state and ephemeral partitions in the installer
Filesystem creation step is moved on the later stage: when Talos mounts
the partition for the first time.
Now it checks if the partition doesn't have any filesystem and formats
it right before mounting.

Additionally refactored mount options a bit:
- replaced separate options with a set of binary flags.
- implemented pre-mount and post-unmount hooks.

And fixed typos in couple of places and increased timeout for `apid ready`.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-17 09:37:21 -08:00
Artem Chernyshev
a83af03730 refactor: update go-blockdevice and restructure disk interaction code
This refactoring is required to simplify the work to be done to support
disk encryption.

Tried to minimize amount of queries done by `blockdevice` `probe`
methods.
Instead, where we have `runtime.Runtime` we get all required blockdevices
there from blockdevice cache stored in `State().Machine().Disk()`.
This opens a way to store encryption settings in the `Partition`
objects.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-01-28 17:42:09 +03:00
Andrey Smirnov
a2efa44663 chore: enable gci linter
Fixes were applied automatically.

Import ordering might be questionable, but it's strict:

* stdlib
* other packages
* same package imports

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-09 08:09:48 -08:00
Andrey Smirnov
8560fb9662 chore: enable nlreturn linter
Most of the fixes were automatically applied.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-09 06:48:07 -08:00
Andrey Smirnov
ff4d702f77 fix: implement preserving contents of partition on install
This fixes A/B upgrades and rollback API.

Installer manifest supports now an option to preserve partition contents
while disk is being re-partitioned and partitions are re-formatted.

Mount `/boot` partition as needed (to find current label before starting
the installation and in the rollback API).

Fix upgrade API for non-master nodes.

Contents of `/boot`, `/system/state` and META partitions are preserved
in memory while the disk is re-partitioned.

Remove `--save` flag from the installer as it's not being used.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-22 23:56:39 +03:00
Andrey Smirnov
018086d1fa refactor: extract blockdevice library
Library `blockdevice` was extracted as `talos-systems/go-blockdevice`,
this PR finalizes the move by removing Talos copy of it.

Some functions around `mkfs`/`growfs` were extracted as `makefs`
package, as they depend on `cmd` package.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-05 11:18:43 -07:00
Andrew Rynhard
1a4059a553 feat: add grub bootloader
This moves to using grub instead of syslinux.

BREAKING CHANGE: Single node upgrades will fail in this change. This
will also break the A/B fallback setup since this version introduces
an entirely new partition scheme, that any fallback will not know about.
We plan on addressing these issues in a follow up change.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-09-01 12:06:43 -07:00
Andrey Smirnov
bddd4f1bf6 refactor: move external API packages into machinery/
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.

And `pkg/machinery` is published as Go module inside Talos repository.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-17 09:56:14 -07:00
Andrew Rynhard
49307d554d refactor: improve machined
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.

A few highlights are:

- no more event bus
- functional approach to tasks (no more types defined for each task)
  - the task function definition now offers a lot more context, like
    access to raw API requests, the current sequence, a logger, the new
    state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
  install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
  install force option

Additionally, this pulls various packes in under machined to make the
code easier to navigate.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-28 08:20:55 -07:00