This PR adds a flag to imager that allows for tweaking the size of the created disk. Additionally, it sets the default value of that created disk to 10GB, as most images are cloud images that fail when uploaded b/c it only picks up a 1GB disk currently. Also adds some processing the makefile to make sure we set the default small value for metal images and SBCs.
Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
E.g. with the command:
```
make image-metal IMAGER_ARGS="--meta 0xc=abc --meta 0xd=abc"
```
This doesn't support ISO/PXE boot yet, it's going to come into the next
PR.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Bump golangci-lint and fixup new warnings. Ignore check that checks for
used function parameters, it's kind of noisy and makes it confusing to
read interface implementations.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Use a global instance, handle loading/saving META in global context.
Deprecate legacy syslinux ADV, provide an easier interface for
consumers.
Expose META as resources.
Fix the bootloader revert process (it was completely broken for quite a
while :sad:).
This is a first step which mostly does preparation work, real changes
will come in the next PRs:
* add APIs to write to META
* consume META keys for platform network config for `metal`
* custom key for URL `${code}`
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Run `depmod` during install/upgrades when extensions provide kernel
modules and `modules.dep` needs to be re-generated. This also allows
modules of same name from kernel to co-exist. Modules in `extras`
folder takes precedence over `in-built` ones.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Host Talos mounts machined socket for API access into the installer
container (for upgrades).
Installer runs any check it might need to verify compatibility.
At the moment following checks are implemented:
* Talos version (whether upgrade from version X to Y is supported)
* Kubernetes version (whether Kubernetes version X is supported with
Talos Y).
Fixes#6149
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Enabling BTF in the kernel brakes kexec from pre-BTF kernel (e.g. when
upgrading from 1.2.x to 1.3.x).
As there's no way to detect Talos version in the installer at the
moment, use another way to detect whether BTF is enabled in the Talos
version which is running right now.
Fixes#6443
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This the first step towards replacing all import paths to be based on
`siderolabs/` instead of `talos-systems/`.
All updates contain no functional changes, just refactorings to adapt to
the new path structure.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
I had to do several things:
- contextcheck now supports Go 1.18 generics, but I had to disable it because of this https://github.com/kkHAIKE/contextcheck/issues/9
- dupword produces to many false positives, so it's also disabled
- revive found all packages which didn't have a documentation comment before. And tehre is A LOT of them. I updated some of them, but gave up at some point and just added them to exclude rules for now.
- change lint-vulncheck to use `base` stage as base
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directory created using `ioutil.TempDir`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
defer func() {
if err := os.RemoveAll(dir); err != nil {
t.Fatal(err)
}
}
is also tedious, but `t.TempDir` handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
The example configuration generated by talosctl contains
```yaml
extraKernelArgs:
- talos.platform=metal
```
in the install section, which, if uncommented, causes the installer to append the
`talos.platform` option twice. Thus, if the platform is set/changed here, it will
not be respected. This change allows the existing value to be overridden.
Signed-off-by: Dennis Marttinen <twelho@welho.tech>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Run `xfs_repair` on XFS filesystems that needs repairing indicated by
the `unix.EUCLEAN` error when mounting
Fixes#5319Fixes#5437
Signed-off-by: Noel Georgi <git@frezbo.dev>
Not sure how and when it got broken, but we're looking for mounts for
the blockdevice (like `/dev/vda`), while the actual mount info contains
the partition device (like `/dev/vda6`).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
In the case of a node being reset, using kexec greatly
speeds up the process. However, in the event the boot
partition is wiped, a full reboot is required.
Closes#4670
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
I believe it serves no purpose in GRUB config: GRUB pre-loads
`initramfs` into memory anyways, so kernel doesn't need to know, nor has
now way to load it from anywhere.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4816
This changes the way system extensions are packaged into the squashfs
images: `/lib/firmware` is now moved out of the future squashfs images
and becomes part of `initramfs` to make firmware available in the early
boot.
Talos will bind-mount `/lib/firmware` into rootfs as well, so it will be
available in the rootfs as well.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4815
This implements the following steps:
* machine configuration updates
* pulling and unpacking system extension images
* validating, listing system extensions
* re-packing system extensions
* preserving installed extensions in `/etc/extensions.yaml`
Once extension is enabled, raw information can be queried with:
```
$ talosctl -n 172.20.0.2 cat /etc/extensions.yaml
layers:
- image: 000.ghcr.io-smira-gvisor-c927b54-dirty.sqsh
metadata:
name: gvisor
version: 20220117.0-v1.0.0
author: Andrew Rynhard
description: |
This system extension provides gVisor using containerd's runtime handler.
compatibility:
talos:
version: '> v0.15.0-alpha.1'
```
This was tested with the `gvisor` system extension.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
We need to be able to run an install with `docker run`. This checks if
we are running from docker and skips overlay mount checks if we are, as
docker creates a handful of overlay mounts by default that we can't
workaround (not easily at least).
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
The structure of the controllers is really similar to addresses and
routes:
* `LinkSpec` resource describes desired link state
* `LinkConfig` controller generates `LinkSpecs` based on machine
configuration and kernel cmdline
* `LinkMerge` controller merges multiple configuration sources into a
single `LinkSpec` paying attention to the config layer priority
* `LinkSpec` controller applies the specs to the kernel state
Controller `LinkStatus` (which was implemented before) watches the
kernel state and publishes current link status.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This has two big visible changes:
* `installer` image now contains assets for both `amd64` and `arm64`, so
it can be used to generate any Talos image (including RPi on amd64 host)
* Talos is using cross-compilation instead of emulation to build
non-native architectures: on amd64, Go amd64 compiler produces binaries
for both arm64 and amd64
(before this change: Go arm64 compiler via QEMU produces arm64 binaries on amd64)
CI implications: we no longer require arm64 nodes.
Changes walkthrough:
* `installer` container now keeps assets under `/usr/install/<arch>`
* Dockerfile build starts forcing toolchain/base image to use the build
host native architecture, not target architecture
* lots of duplication for amd64/arm64 as we want to combine assets for
both arches in a single image (e.g. we have multi-arch amd64/arm64
installer image, each arch has native installer binary, but both arches
contain full set of amd64/arm64 assets)
* fixed a small bug preventing arm64 on amd64 talosctl cluster create
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Overlay mount in `mountinfo` don't show up as mounts for any particular
block device, so the existing check doesn't catch them.
This was discovered as our current master can't upgrade because of
overlay mount for `/opt` and `apid` image in `/opt/apid` (which will be
fixed in a separate PR).
Without the check, installer fails on resetting partition table for the
disk effectively wiping the node (`device or resource busy` error).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This fixes a case of upgrade from 0.9.0-alpha.4 to 0.9.0-beta.0. With
introduced proper partition alignment and physical block size != 512,
partitions before ephemeral will be moved around a bit (due to the
alignment), and `STATE` partition size might change a bit.
If encryption is enabled, contents are preserved as raw bytes, so
partition size should be exactly same during restore.
Drop code (mostly tests) which handled 0.6 to 0.7 upgrades.
On upgrade with preserve don't touch any partitions, at least for 0.8 ->
0.9 layout hasn't changed.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR introduces the first part of disk encryption support.
New config section `systemDiskEncryption` was added into MachineConfig.
For now it contains only Ephemeral partition encryption.
Encryption itself supports two kinds of keys for now:
- node id deterministic key.
- static key which is hardcoded in the config and mainly used for test
purposes.
Talosctl cluster create can now be told to encrypt ephemeral partition
by using `--encrypt-ephemeral` flag.
Additionally:
- updated pkgs library version.
- changed Dockefile to copy cryptsetup deps from pkgs.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Filesystem creation step is moved on the later stage: when Talos mounts
the partition for the first time.
Now it checks if the partition doesn't have any filesystem and formats
it right before mounting.
Additionally refactored mount options a bit:
- replaced separate options with a set of binary flags.
- implemented pre-mount and post-unmount hooks.
And fixed typos in couple of places and increased timeout for `apid ready`.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
This fixes spurious race conditions when user disks are partitioned
and formatted in `mountUserDisks` task. While this task runs, `udevd` is
running to allow various `/dev/` symlinks to be used for user disks.
At the same time `udevd` might trigger syscall `BLKRRPART` at any time
concurrently with Talos which leads to a race on kernel side when Talos
tries to update kernel partition table while kernel does it on its own
as a result of `udevd` call.
As part of the fix, `RereadPartitionTable()` calls were removed (they
trigger `BLKRRPART` and they're not needed as Talos updates partition
table on its own).
Some cleanups to make sure blockdevice is open/closed just in matching
pairs (no lingering open blockdevice instances). This is import for
`WithExclusiveLock()` calls, as it would lead to a deadlock if previous
blockdevice instance is not closed.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This refactoring is required to simplify the work to be done to support
disk encryption.
Tried to minimize amount of queries done by `blockdevice` `probe`
methods.
Instead, where we have `runtime.Runtime` we get all required blockdevices
there from blockdevice cache stored in `State().Machine().Disk()`.
This opens a way to store encryption settings in the `Partition`
objects.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Fixes#3011
See also https://github.com/talos-systems/go-procfs/pull/8
We don't want to allow all the kernel args to be overridden, as this
might compromise KSPP, but we would rather allow some args to be
overridden explicitly.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
That change should make Talos updates more straightforward in any
projects that depend on Talos.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
SBC should always overwrite default kernel params.
Otherwise we will always get duplicate values for some of them.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Idea is to add an option to perform "selective" reset: default reset
operation is to wipe all partitions (triggering reinstall), while spec
allows only to wipe some of the operations.
Other operations are performed exactly in the same way for any reset
flow.
Possible use case: reset only `EPHEMERAL` partition.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Regular upgrade path takes just one reboot, but it requires all the
processes to be stopped on the node before upgrade might proceed. Under
some circumstances and with potential Talos bugs it might not work
rendering Talos upgrades almost impossible.
Staged upgrades build upon regular install flow to run the upgrade on
the node reboot. Such upgrades require two reboots of the node, and it
requires two pulls of the installer image, but they should be much less
suspicious to the failure. Once the upgrade is staged, node can be
rebooted in any possible way, including hard reset and upgrade is
performed on the next boot.
New ADV format was implemented as well to allow to store install image
ref/options across reboots. New format allows for bigger values and
takes 50% of the `META` partition. Old ADV is still kept for
compatibility reasons.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This makes sure Talos won't pick up any potential leftover data on fresh
install. On upgrade contents of META partitions are preserved anyways.
Fixes#2919
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This allows boards to provide kernel args at install time. We need this so that
we can set the console.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
This introduces the notion of a "board" in Talos. A board is an interface that is capable
of modifying the installation in specific ways for a given SBC. This also adds support for the
libretech_all_h3_cc_h5.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
Fixes were applied automatically.
Import ordering might be questionable, but it's strict:
* stdlib
* other packages
* same package imports
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This will make it more obvious when installer got started, and when it
starts to wipe a disk (which might take some time).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>