This has two big visible changes:
* `installer` image now contains assets for both `amd64` and `arm64`, so
it can be used to generate any Talos image (including RPi on amd64 host)
* Talos is using cross-compilation instead of emulation to build
non-native architectures: on amd64, Go amd64 compiler produces binaries
for both arm64 and amd64
(before this change: Go arm64 compiler via QEMU produces arm64 binaries on amd64)
CI implications: we no longer require arm64 nodes.
Changes walkthrough:
* `installer` container now keeps assets under `/usr/install/<arch>`
* Dockerfile build starts forcing toolchain/base image to use the build
host native architecture, not target architecture
* lots of duplication for amd64/arm64 as we want to combine assets for
both arches in a single image (e.g. we have multi-arch amd64/arm64
installer image, each arch has native installer binary, but both arches
contain full set of amd64/arm64 assets)
* fixed a small bug preventing arm64 on amd64 talosctl cluster create
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Filesystem creation step is moved on the later stage: when Talos mounts
the partition for the first time.
Now it checks if the partition doesn't have any filesystem and formats
it right before mounting.
Additionally refactored mount options a bit:
- replaced separate options with a set of binary flags.
- implemented pre-mount and post-unmount hooks.
And fixed typos in couple of places and increased timeout for `apid ready`.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
This refactoring is required to simplify the work to be done to support
disk encryption.
Tried to minimize amount of queries done by `blockdevice` `probe`
methods.
Instead, where we have `runtime.Runtime` we get all required blockdevices
there from blockdevice cache stored in `State().Machine().Disk()`.
This opens a way to store encryption settings in the `Partition`
objects.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Fixes#3011
See also https://github.com/talos-systems/go-procfs/pull/8
We don't want to allow all the kernel args to be overridden, as this
might compromise KSPP, but we would rather allow some args to be
overridden explicitly.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
That change should make Talos updates more straightforward in any
projects that depend on Talos.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
SBC should always overwrite default kernel params.
Otherwise we will always get duplicate values for some of them.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Regular upgrade path takes just one reboot, but it requires all the
processes to be stopped on the node before upgrade might proceed. Under
some circumstances and with potential Talos bugs it might not work
rendering Talos upgrades almost impossible.
Staged upgrades build upon regular install flow to run the upgrade on
the node reboot. Such upgrades require two reboots of the node, and it
requires two pulls of the installer image, but they should be much less
suspicious to the failure. Once the upgrade is staged, node can be
rebooted in any possible way, including hard reset and upgrade is
performed on the next boot.
New ADV format was implemented as well to allow to store install image
ref/options across reboots. New format allows for bigger values and
takes 50% of the `META` partition. Old ADV is still kept for
compatibility reasons.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This allows boards to provide kernel args at install time. We need this so that
we can set the console.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
This introduces the notion of a "board" in Talos. A board is an interface that is capable
of modifying the installation in specific ways for a given SBC. This also adds support for the
libretech_all_h3_cc_h5.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
For 0.6 -> 0.7 upgrade, in any case config.yaml is preserved and moved
from `/boot` to `/system/state`.
For single node upgrade, `EPHEMERAL` partition is not touched and other
partitions are re-created as needed.
Bump provision tests to 0.6/0.7 upgrades as we get closer to the new
release.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This fixes A/B upgrades and rollback API.
Installer manifest supports now an option to preserve partition contents
while disk is being re-partitioned and partitions are re-formatted.
Mount `/boot` partition as needed (to find current label before starting
the installation and in the rollback API).
Fix upgrade API for non-master nodes.
Contents of `/boot`, `/system/state` and META partitions are preserved
in memory while the disk is re-partitioned.
Remove `--save` flag from the installer as it's not being used.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This unifies more code paths under the control of `install.Manifest` vs.
being split across the installer and manifest code.
There should be no functional changes now.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Library `blockdevice` was extracted as `talos-systems/go-blockdevice`,
this PR finalizes the move by removing Talos copy of it.
Some functions around `mkfs`/`growfs` were extracted as `makefs`
package, as they depend on `cmd` package.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves to using grub instead of syslinux.
BREAKING CHANGE: Single node upgrades will fail in this change. This
will also break the A/B fallback setup since this version introduces
an entirely new partition scheme, that any fallback will not know about.
We plan on addressing these issues in a follow up change.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.
And `pkg/machinery` is published as Go module inside Talos repository.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes#2272
`gofumpt` is now included into `golangci-lint`, but not the
`gofumports`, so we keep it using it as separate binary, but we keep
versions in sync with `golangci-lint`.
This contains fixes from:
* `gofumpt` (automated, mostly around octal constants)
* `exhaustive` in `switch` statements
* `noctx` (adding context with default timeout to http requests)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This adds a sentinel error for a missing partition table. This error
is used to detect if a partition table already exists when setting
up user defined disks.
In addition to the fix, this removes a legacy parameter from the
`PartitionTable` method that indicated that the partition table
should be read. It is safer to just read it every time. Also, I
can't think of a case when the block device partition table is nil
and we want to read.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This change ensures that any symlinks under `/dev/disks` are created
by udevd early enough so that users can specify disks identified
by these symlinks in `machine.disks` entries. This change also moves
the udevd-trigger from a service to a post-startup task for udevd.
This is a one time command that doesn't need to be a service. Some
other minor changes were made around parsing the partition name that
makes use of the symlinks created by udevd. This also ensures that
target mount points in `machine.disks` entries are created, fixing a
preexisting bug.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
There is no need to really warn about this. We handle this sceanrio just
fine now. The warning just adds confusion.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.
A few highlights are:
- no more event bus
- functional approach to tasks (no more types defined for each task)
- the task function definition now offers a lot more context, like
access to raw API requests, the current sequence, a logger, the new
state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
install force option
Additionally, this pulls various packes in under machined to make the
code easier to navigate.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>