The defaults already give more space than the ext4 defaults, but mixed
mode is recommended for filesystems smaller than roughly 1-5 GB.
Another aspect is metadata duplication: while it is currently off, the
default depends on the underlying block device and could change as soon
as the block device type changes.
Select mixed mode, which uses a merged area for data and metadata
blocks, and ensure that no metadata duplication gets enabled
automatically.
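A minimal sketch of such an invocation, assuming standard mkfs.btrfs
flags (the device path is illustrative):

    # Create the filesystem in mixed mode and force the "single" profile
    # for both data and metadata so DUP is never picked automatically.
    mkfs.btrfs --mixed --data single --metadata single /dev/loop0p6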
The compression feature of btrfs allows us to store more in the
size-limited /usr and OEM partitions. The size should of course still
be monitored so the image doesn't bloat, but more headroom makes it
easier to try things out quickly without hitting the hard limit that
fails the build.
Use btrfs for the OEM partition, but with zlib compression, because
the outdated GRUB version doesn't support zstd yet.
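A sketch of one way to apply the compression (the commit may use
chattr or the btrfs property mechanism; device and mount point are
illustrative):

    # Set the compression property on the mounted top directory; unlike
    # a mount option, it then also applies under default mount options.
    mount /dev/loop0p6 /mnt/oem
    btrfs property set /mnt/oem compression zlib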
New subvolumes currently can't be used as default subvolumes for the
OEM partition because GRUB tries to read grub.cfg from the top
subvolume (at least with our old version). (We could, however, use
subvolumes for the /usr partition when switching to btrfs, if that
makes any sense.)
The limited /usr and OEM partition size is a challenge when adding new
packages or updating a package. Since the disk layout can't be changed
for compatibility reasons when updating an existing instance, we can't
simply try something out without first ensuring that enough space is
available by removing something else. This situation can be relaxed by
leveraging btrfs compression. There was some support for btrfs, but it
was a bit outdated and didn't allow configuring compression or setting
read-only flags.
Fix the btrfs support, allow marking the default subvolume as
read-only, and add a compression variable for selecting a compression
algorithm. Instead of enabling compression via the mount option, we
set the filesystem attribute, which has the benefit that compression
is still used with the default mount options for this (top) directory
and its contents. For the ext2 /usr partition, a hack existed that
forced read-only mode by modifying some bytes; checking those bytes
could also have told us whether read-only should be used to prevent
corruption of dm-verity data. Instead, we check directly whether
dm-verity is active for the partition and, if so, mount it read-only
(and with the norecovery option to really prevent any write attempt).
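A rough sketch of that check, assuming a dm-verity device named "usr"
(all names illustrative):

    # Mount /usr read-only, and with norecovery, only when dm-verity is
    # active for the partition, so no write attempt can corrupt it.
    if veritysetup status usr >/dev/null 2>&1; then
        mount -o ro,norecovery /dev/mapper/usr /usr
    fi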
The newly enabled update test performs an update from the built image
to itself. This is useful to verify that the update mechanism didn't
break, but it doesn't tell us whether the built image will be accepted
as an update from the previous official release.
Introduce an additional kola run that starts from the previous official
release and tests updating to the built image. Since this test performs
two updates, it also covers updating from the built image to the built
image, so we can skip that test in the normal run.
This new kola run is done first to keep the qemu-latest symlink valid
for the main test suite.
When performing a full bootstrap (stage1-4), the stage1 results are currently
discarded because of the logic in catalyst_build: the first build stage uses
the "seed" and every following stage uses the previous stage's results, *but*
stage1 is built before catalyst_build runs. So from the point of view of
catalyst_build, stage2 is the first stage and uses the seed tarball.
To make sure the stage1 results are used when stage1 was built, set the SEED
variable to the latest stage1 location.
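A sketch of the idea (variable names and paths are illustrative):

    # If stage1 was built in this run, seed the following stages from
    # its output instead of from the original seed tarball.
    if [[ "${STAGE1_BUILT}" == "yes" ]]; then
        SEED="${BUILD_DIR}/catalyst/builds/flatcar-sdk/stage1-${ARCH}-latest"
    fi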
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The rust ebuild has some magic to detect cross-toolchains present on the
system and enable building additional cross targets. The code to trigger
the rebuild of rust is part of install_cross_rust, and checks whether
the cross directories exist in the rust installation. If they don't,
then rust is removed and rebuilt to allow for the auto-detection to
happen.
Right now there are two issues with this code. First, the path that is
checked is wrong, which leads to rust always being removed and rebuilt:
the check looks at /usr/lib/rust-*/rustlib, but /usr/lib/rustlib is
where the files are installed.
The second issue is that it checks for aarch64 dirs whenever CHOST is
aarch64-cros-linux-gnu. However, on an aarch64 host the aarch64 dirs
already exist from building the sdk itself, and the rust ebuild is not
ready to handle aarch64 hosts yet and blows up. The correct behavior is
to combine the check for CHOST with a check for the right CBUILD.
On an aarch64 host we should presumably check for the x86 CHOST and rust
dirs, but that can be added later, because it needs more work.
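A sketch of the combined check (triples, paths, and the rebuild steps
are illustrative):

    # Only remove and rebuild rust for the aarch64 cross target when the
    # build host itself is not aarch64 and the target dirs are missing.
    if [[ "${CHOST}" == aarch64-cros-linux-gnu && "${CBUILD}" != aarch64-* ]] &&
       [[ ! -d /usr/lib/rustlib/aarch64-unknown-linux-gnu ]]; then
        sudo emerge --unmerge dev-lang/rust
        sudo emerge --oneshot dev-lang/rust
    fi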
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The arm64 profiles don't specify SYMLINK_LIB=yes, which makes sense
since arm64 systems don't support multilib the way we are used to from
x86. This means build artifacts are installed into separate lib and
lib64 directories. The root overlay installed in stage4 needs to check
for SYMLINK_LIB before trying to create a symlink; otherwise applying
it fails because the symlink collides with the directory in the rootfs.
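A sketch of the guard (paths illustrative):

    # Create the lib64 -> lib symlink only on profiles that use
    # SYMLINK_LIB; on arm64, lib and lib64 are separate directories.
    if [[ "$(portageq envvar SYMLINK_LIB)" == "yes" ]]; then
        ln -sfT lib "${ROOT}/usr/lib64"
    fi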
This uncovered a second minor issue: the rust toolchain bootstrap
scripts checked for /usr/lib64/rust*, but the ebuild installs to
/usr/lib/rust.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The previously used uuid 4f68bce3-e8cd-4db1-96e7-fbcaf984b709 is only
valid for x86_64 root partitions, which resulted in the dev container
not working with systemd-nspawn on aarch64. systemd-nspawn fails with:

  No suitable root partition found in image
Change the partition uuid to the architecture-agnostic one documented
in the man page:

  "A GUID partition table (GPT) with a single partition of type
  0fc63daf-8483-4772-8e79-3d69d8477de4."

This makes systemd-nspawn happy on aarch64.
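For illustration, stamping that type GUID on the image's single
partition could look like this (image name illustrative):

    # Set the "Linux filesystem data" type GUID on partition 1.
    sgdisk --typecode=1:0fc63daf-8483-4772-8e79-3d69d8477de4 \
        flatcar_developer_container.bin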
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The kola update tests need a dev-key-signed update payload. This was
lacking and caused the update tests to be skipped.
Generate the test update payload for both dev builds and release builds,
and run the kola tests for both. The test update payload has a special
name so it isn't confused with the real update payload for releases, and
we keep the previous behavior of signing releases. Therefore, the
generate_update function isn't used; instead, the extract_update
function was extended to also generate the additional test payload.
The logic of the inline bash scripts of each job was sometimes
split out into the flatcar-scripts/jenkins/*.sh helpers but mostly
lived in the Groovy file. This coupling had its advantages but also
downsides when special cases needed to be added for different release
versions. Other issues were that backslashes in the inline scripts
needed to be escaped twice, and that Jenkins was not good at
terminating the child processes when stopping a job. Having inline
bash scripts in Groovy also mandated the use of Jenkins to build and
release Flatcar Container Linux, which hinders test builds on other CI
platforms.
Move the inline bash scripts fully to the files in
flatcar-scripts/jenkins/ and create new ones for jobs that didn't have
a script there yet. Also invoke them through a systemd-run wrapper
script, which ensures that all child processes are terminated and also
sets up /opt/bin as an additional path for the static lbzcat binary.
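A minimal sketch of such a wrapper (flags chosen for illustration):

    #!/bin/bash
    # Run the job script in a transient systemd unit so that stopping
    # the Jenkins job reliably kills all child processes; prepend
    # /opt/bin so the static lbzcat binary is found.
    exec systemd-run --wait --pipe --collect \
        --setenv=PATH="/opt/bin:${PATH}" \
        "$@"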
A workaround was needed for bash 4 to use a temporary file instead of
the <(cmd) process substitution, which caused a strange syntax error;
otherwise the bash commands are moved as they are.
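A sketch of the workaround (command names illustrative):

    # Instead of: while read -r line; do ...; done < <(generate_list)
    tmpfile="$(mktemp)"
    trap 'rm -f "${tmpfile}"' EXIT
    generate_list > "${tmpfile}"
    while read -r line; do
        process "${line}"
    done < "${tmpfile}"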
This change removes 8-year-old code from the toolchains build that
tried to update SDK libraries for unknown reasons, breaking the
toolchains build in the glibc-2.33 update.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change uses portage-stable and coreos-overlay from the local SDK
chroot (from /var/lib/gentoo/repos) in the stage 1 SDK bootstrap build.
This is part 2 of the SDK bootstrap stage 1 fix (part 1 is covered in
64d8a73ac0), which ensures stage 1 does
not introduce any changes in its ebuilds over the seed SDK.
The change also introduces options to consciously diverge from the
above enforcement by use of command line parameters:

  --stage1_overlay_ref <gitref>   check out coreos-overlay and use
                                  <gitref> for stage 1 instead of the
                                  local SDK's
                                  /var/lib/gentoo/repos/coreos-overlay

  --stage1_portage_ref <gitref>   check out portage-stable and use
                                  <gitref> for stage 1 instead of the
                                  local SDK's
                                  /var/lib/gentoo/repos/gentoo
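For example, a stage 1 build pinned to specific refs could be started
like this (script name and refs illustrative):

    ./bootstrap_sdk --stage1_overlay_ref my-overlay-branch \
                    --stage1_portage_ref my-portage-branch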
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change to stage 4 of the SDK bootstrap process keeps a
snapshot of coreos-overlay in the SDK tarball. This snapshot can be
used in stage 1 of future SDK bootstraps to ensure a clean stage 1
output without any package updates.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change updates the stage1 SDK bootstrap build to use local
("known good") package ebuilds only, preventing updated package ebuilds
from applying in stage 1. This fixes SDK build breakage we observed when
upgrading core libraries like readline.
The change also removes the seed update from stage 1, as it should no
longer be needed now that we postpone any package updates to stage 2.
The following package ebuild repos are used for stage 1:
- For portage-stable, we simply copy /var/lib/gentoo/repos/gentoo
  from the SDK root.
- coreos-overlay is more complicated since its ebuilds are missing
  from the SDK. So we extract the version the SDK was built with from
  /mnt/host/source/.repo/manifests/default.xml, then create a local
  stage 1 clone of https://github.com/kinvolk/coreos-overlay.git
  in which we check out the revision noted in the default manifest
  (sketched below).
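A sketch of those steps (the manifest parsing is simplified here):

    # Extract the pinned coreos-overlay revision from the default
    # manifest and check it out in a local stage 1 clone.
    rev="$(sed -n 's/.*coreos-overlay.*revision="\([^"]*\)".*/\1/p' \
        /mnt/host/source/.repo/manifests/default.xml)"
    git clone https://github.com/kinvolk/coreos-overlay.git /tmp/coreos-overlay
    git -C /tmp/coreos-overlay checkout "${rev}"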
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
The `--jobs` parameter that some scripts defined was not used anywhere
in jenkins or mantle, so the value of the parameter always ended up
being equal to `${NUM_JOBS}` set by `common.sh`. Also, even if the
`--jobs` parameter was used for some script, that script usually
didn't forward the jobs value to other scripts, so those scripts
ended up using `${NUM_JOBS}` again. Further, the `${FLAGS_jobs}`
variable was used by some functions in the build library, and those
functions were sometimes invoked by scripts that didn't define the
`${FLAGS_jobs}` variable at all. It is tedious to track which script
should actually define the parameter and where it should be forwarded.
Just get rid of this half-working pretense. If you want to affect how
many jobs `emerge` uses, export the `NUM_JOBS` environment variable
before calling any script.
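For example (job count and script name illustrative):

    # Limit emerge to four parallel jobs for every script in this run.
    export NUM_JOBS=4
    ./build_packages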
For `EMERGE_FLAGS` and `REBUILD_FLAGS` we unconditionally set the
`--jobs` flag's value to `${NUM_JOBS}`, because these flags are passed
to `emerge`. On the other hand, we drop the `--jobs` parameter from the
`UPDATE_ARGS` variable, because this variable is passed to `setup_board`
or `update_chroot`, which don't have this flag any more.
Now, in the OEM ACI creation step we make use of the jobs param.
Without this flag, an empty string is passed to emerge, which results
in failure.
Signed-off-by: Sayan Chowdhury <sayan.chowdhury2012@gmail.com>
They end up using emerge_to_image, which uses the `$FLAGS_jobs`
parameter. It seems the new portage does not like getting the parameter
as `--jobs=` (with an empty value).
The script is written in python2 and imports portage code. Since
portage is going to become python3-only, the script needs porting
to python3. This is not a high priority right now, because the script
does not seem to be used by other scripts or jenkins.
The script needs to be ported because it imports portage code,
which became python3-only.
The porting I did is likely a lousy job, but at least it stopped
failing with some p(yt)hony errors.
Previously we broke only the cycle caused by sys-apps/util-linux,
while disabling the cryptsetup USE flag in systemd to avoid another
cycle. That worked before, because the follow-up merge of the rest of
the packages built sys-fs/cryptsetup before sys-apps/systemd. After an
update, the new portage orders the builds in a different way:
sys-apps/systemd ended up being built before sys-fs/cryptsetup, and
that failed during the configure phase because of unmet dependencies.
Better to build all the packages taking part in the loop (not counting
the virtual packages), so that we become less reliant on the package
build ordering. It will take slightly more time, as we build a couple
more packages.
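A sketch of the resulting merge (the package set comes from the cycle
described above; the board wrapper name is illustrative):

    # Build every non-virtual package in the loop in one explicit pass,
    # so success no longer depends on the order portage picks.
    emerge-${BOARD} --oneshot \
        sys-apps/util-linux sys-fs/cryptsetup sys-apps/systemd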
Instead of rebuilding just one package and maybe rebuilding others as
a fallout, force rebuilding all the mentioned packages. This makes the
build process a bit more robust against changes in package build
ordering.
This may be useful when breaking multiple dep loops that have some
common packages, so that we build them all only once.