To be able to support arm64 native SDK without cross builds, we should
make generate_au_zip support both architectures, amd64 and arm64.
Without doing that, `build_image` fails with `ERROR : Required
WHITE_LIST items ld-linux-x86-64.so.2 not found!!!`, because the
script recognizes only amd64 libs in WHITE_LIST.
We should first determine the architecture in build_image, before
running generate_au_zip, and pass the architecture, either amd64 or
arm64. Also add allow_list and ld_linux parameters to necessary
functions.
Previously before https://github.com/kinvolk/flatcar-scripts/pull/134,
`bootstrap_sdk` was looking at the wrong path
(/usr/lib/rust-*/rustlib/aarch64-unknown-linux-gnu instead of
/usr/lib/rustlib/aarch64-unknown-linux-gnu). As a result, Rust got always
removed and rebuilt in `install_cross_rust`, which resulted in
`flatcar-sdk/crossdev/dev-lang/rust/rust-1.54.0-1.xpak` being created.
Now legitimate changes of https://github.com/kinvolk/flatcar-scripts/pull/134
prevent the rebuild from happening. The path already exists in a stage4
SDK build, because the seed stage already has cross-compilers so the
Rust upgrade has all the right cross-paths.
That's why SDK builds with only stage4 failed when it tries uploading Rust
packages like the following. On the other hand, full SDK builds with stage1
to 4 worked well, because in that case Rust is rebuilt anyway.
```
INFO bootstrap_sdk: Uploading cross toolchain packages to
gs://flatcar-jenkins/developer/sdk/amd64/2021.08.04+dev-flatcar-master-3209
CommandException: No URLs matched:
/mnt/host/source/src/build/catalyst/packages/flatcar-sdk/crossdev/*
CommandException: No URLs matched:
/tmp/tmp.xyjXbCFhUc//mnt/host/source/src/build/catalyst/packages/flatcar-sdk/crossdev/*.sig
CommandException: 2 files/objects could not be transferred.
```
To fix that, we have to skip uploading packages when the crossdev
directory does not exist.
Debugged and suggested by @jepio
We are enabling CgroupV2 support globally, which requires Docker 20.10.
It is possible to return to CgroupV1 locally via kernel commandline, but
that will still work with Docker 20. If someone really needs older
Docker versions we will recommend to also fetch torcx packages from
older releases or rely on upstream binaries.
The vmlinuz kernel image gets installed to /usr/boot/ but isn't usable
for dm-verity until it gets copied over to /boot/flatcar/ and the hash
gets embedded at a particular offset. The file in /usr/boot/ uses space
while it's not having a real purpose as long as dm-verity is used.
Delete the vmlinuz file under /usr/boot/ to free up space. When
generating the ISO image we use the vmlinuz file from /boot/flatcar/
which also has the advantage that we only distribute a single vmlinuz
file with one particular checksum.
For consistency with code further down in the file: aarch64 cross compilation only applies when CBUILD is x86,
for native aarch64 builds rust is guaranteed to have aarch64 rustlibs.
The defaults already give more space than the ext4 defaults but it's
recommended to use the mixed mode for filesystems smaller than 1-5 GB.
Another aspect is the duplication of metadata and while it currently is
off it's actually related to the underlying block device and could
change as soon as the block device type changes.
Select the mixed mode that uses a merged area for data and metadata
blocks. Also ensure that no metadata duplication gets enabled
automatically.
The compression feature of btrfs allows us to store more in the
size-limited /usr and OEM partitions. The size should of course still
be monitored to not bloat the image but more headroom helps to try
things out quickly without hitting the hard limit which fails the
build.
Use btrfs for the OEM partition but with zlib compression because
the outdated GRUB version doesn't support zstd yet.
New subvolumes currently can't be used for the OEM partition as default
subvolumes because GRUB tries to read the grub.cfg from the top
subvolume (at least with our old version). (We could however use
subvolumes for the /usr partition when switching to btrfs if that
makes any sense.)
The limited /usr and OEM partiton size is a challenge when adding new
packages or updating a package. Since the disk layout can't be changed
for compatibility reasons when updating an existing instance, we can't
simply try out something without ensuring first that enough space is
there by removing something else. This situation can be relaxed by
leveraging btrfs compression. There was some support for btrfs but it
was a bit outdated and didn't allow to configure compression or setting
read-only flags.
Fix the btrfs support, allow to mark the default subvolume as read only
and add a compression variable that allows to select a compression
algorithm. Instead of enabling compression by setting the mount option,
we can set the filesystem attribute which has the benefit that
compression is still used with the default mount options for this (top)
directory and its contents. While for the ext2 /usr partition a hack
existed to force read-only mode by modifying some bytes and checking
these bytes could also be used to know if read-only should be used to
prevent corruption of dm-verity data, we rather check directly whether
dm-verity is active for this partition and mount it read-only (and
with the norecovery option to really prevent any write attempt).
The newly enabled update test performs an update from the built image
to itself. This is useful to test that the update mechanism didn't
break but it doesn't say if the built image will be accepted as update
from the previous official release.
Introduce an additional kola run that begins from the previous official
release and tests to update to the built image. Since the test does two
updates it also covers the case of updating from the built image to the
built image. Thus, we can skip the test in the normal run.
This new kola run is done first to keep the qemu-latest symlink valid
for the main test suite.
When performing a full bootstrap (stage1-4), the stage1 results are currently
discarded because of the logic in catalyst_build: the first build stage uses
the "seed" and every following stage uses the previous stages results *but*
stage1 is built before catalyst_build. So from the point of view of
catalyst_build, stage2 is the first one and uses the seed tarball.
To make sure stage1 results are used if it was built, set the SEED variable to
the latest stage1 location.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The rust ebuild has some magic to detect cross-toolchains present on the
system and enable building additional cross targets. The code to trigger
the rebuild of rust is part of install_cross_rust, and checks whether
the cross directories exist in the rust installation. If they don't,
then rust is removed and rebuilt to allow for the auto-detection to
happen.
Right now there are two issues with the code. Firstly, the path that is
checked is wrong, which leads to rust always being removed and rebuilt.
The path checked is /usr/lib/rust-*/rustlib but /usr/lib/rustlib is
where the files are installed.
The second issue is that it checks for aarch64 dirs when CHOST is
aarch64-cros-linux-gnu. However, on an aarch64 host the aarch64 dirs
will already exist from building the sdk itself. The rust ebuild is not
ready to handle aarch64 hosts yet and blows up. The correct behavior is
to combine the check for CHOST with a check for the right CBUILD.
On an aarch64 host we should presumably check for the x86 CHOST and rust
dirs, but that can be added later, because it needs more work.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The arm64 profiles don't specify SYMLINK_LIB=yes, which makes sense
since arm64 systems don't support multilib in the way that we are used
to from x86. What this means is that build artifacts are installed into
separate lib and lib64 directories. The root overlay installed in stage4
needs to check for SYMLINK_LIB before trying to create a symlink,
otherwise it fails to be applied because it collides with the directory
in the rootfs.
This uncovered a second minor issues - the rust toolchain bootstrap
scripts checked for /usr/lib64/rust*, but the ebuild installs to
/usr/lib/rust.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The previously used uuid 4f68bce3-e8cd-4db1-96e7-fbcaf984b709 is valid
for x86_64 root partitions, which resulted in the dev container not
working with systemd-nspawn on aarch64. systemd-nspawn fails with:
No suitable root partition found in image
Change the partition uuid to the architecture agnostic one documented
in the man page:
A GUID partition table (GPT) with a single partition of type 0fc63daf-8483-4772-8e79-3d69d8477de4.
This makes systemd-nspawn happy on aarch64.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The kola update tests need a dev-key-signed update payload. This was
lacking and caused the update tests to be skipped.
Generate the test update payload for both dev builds and release builds
and run the kola tests for both. The test update payload has a special
name to not confuse it with the real update payload for releases, and
we keep the previous behavior to sign releases. Therefore, the
generate_update function wasn't used but the extract_update function
extended with generating the additional test payload.
The logic of the inline bash scripts of each job was sometimes
separated into the flatcar-scripts/jenkins/*.sh helpers but mostly
part of the Groovy file. This coupling had its advantages but also
downsides when special cases needed to be added for different release
versions. Other issues were that the inline scripts needed the
backslash character to be escaped twice and Jenkins was not good in
terminating the child processes when stopping a job. Having inline
bash scripts in Groovy also mandated the use of Jenkins to build and
release Flatcar Container Linux which hinders test builds in other CI
platforms.
Move the inline bash scripts fully to to the files in
flatcar-scripts/jenkins/ and create new ones for job that didn't have
a script there yet. Also invoke them through a systemd-run wrapper
script which ensures that all child processes are terminated and also
sets up /opt/bin as additional path for the static lbzcat binary.
A workaround for bash 4 was needed to use a temporary file instead of
the <(cmd) bash feature which caused a strange syntax error, otherwise
the bash commands are moved as they are.
This change removes 8 years old code from the toolchains build which
tries to update SDK libraries for unknown reasons, breaking the
toolchains build in the glibc-2.33 update.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change uses portage-stable and coreos-overlay from the local SDK
chroot (from /var/lib/gentoo/repos) in the stage 1 SDK bootstrap build.
This is part 2 of the SDK bootstrap stage 1 fix (part 1 is covered in
64d8a73ac0), which ensures stage 1 does
not introduce any changes in its ebuilds over the seed SDK.
The change also introduces an option to consciously divert from the
above enforcement by use of command line parameters:
--stage1_overlay_ref <gitref> will check out coreos-overlay and use
<gitref> for stage 1 instead of the
local SDK's
/var/lib/gentoo/repos/coreos-overlay
--stage1_portage_ref <gitref> will check out portage-stable and use
<gitref> for stage 1 instead of the
local SDK's
/var/lib/gentoo/repos/gentoo
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change to stage 4 of the SDK bootstrap process will keep a
snapshot of coreos-overlay in the SDK tarball. This snapshot can be
used in future SDK bootstraps' stage1 to ensure a clean stage 1 output
without any package updates.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change updates the stage1 SDK bootstrap build to use local
("known good") package ebuilds only, preventing updated package ebuilds
to apply in stage 1. This fixes SDK build breakage we observed when
upgrading core libraries like readline.
The change also removes the seed update from stage 1 as it should not
be needed anymore now that we postpone any package updates to stage 2.
The following package ebuild repos are used for stage 1:
- for portage-stable, we simply copy /var/gentoo/repos/gentoo
from the SDK root.
- coreos-overlay is more complicated since ebuilds are missing from
the SDK. So we grok the version the SDK was built with from
/mnt/host/source/.repo/manifests/default.xml
and then we create a local stage 1 clone of
https://github.com/kinvolk/coreos-overlay.git
in which we then check out the revision noted in the default mnifest.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>