- updated github actions for runc, containerd, and docker to not handle
nonexistent ebuilds in app-torcx/ anymore
- removed spurious package_run_dependencies from build_image_util.sh
- build_sysext: generate pkginfo before mangle script runs
use zstd for compression; add cli flag to select compression
- ci_automation_common.sh: remove spurious `/` from match string
- coreos, board-packages, bootengine: bump ebuild revisions
- kernel commonconfig: add squashfs zstd support
Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
This change refactors base OS sysext builds to use a separate build
script `build_library/sysext_prod_builder`, which is called from
`build_library/prod_image_util.sh` when `build_image` runs.
This allows for better separation of cleanup traps: prod image sysext
builds need its own trap / cleanup function for temporary build
directories and loopback mounts.
Prod sysext builds properly generate lincense and SBOM information, and
provide detailed file listings and disk space usage stats.
- SBOM / licenses JSON now include all packages of the
final image, i.e. a combined list of base image and all base OS
sysexts.
- Packages lists, files list and detailed files list include the sysext
squashfs files for the base image, and separate sections with files /
packages lists for each sysext.
- Disk usage contains both final disk image usage as well as usage of
each individual sysext squashfs.
This change refactors sysext builds during build_image and generalises
the code (no hard-coded containerd and docker anymore).
A command line option is added to build_image for sysexts to include in
the OS image. It defaults to containerd and docker but may be set to
arbitrary packages. The command line supports simple depenencies, i.e.
the "docker" sysext will re-use package information from the
"containerd" sysext and not include another containerd.
Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
This change removes torcx libraries, references, and commandline options
from build automation scripts and from build_library/.
Containerd and docker are shipped via sysexts which are included in the
base image.
Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
The `localedef` tool expects `/usr/lib/locale` directory to
exist. This directory used to be created by the `sys-libs/glibc`
package (with the `keepdir` directive), but after the update of the
package, the locale generation stuff (and the `keepdir` directive )was
moved to the `sys-libs/locale-gen` package. This package is not
installed in the production images, so the `/usr/lib/locale` directory
was not created. In such a situation, calling localedef to generate
C.UTF-8 locale resulted in an error like:
cannot create temporary file: ${SOME_ROOTFS}/usr/lib/locale/locale-archive.ufpG15: No such file or directory
Create the directory before calling localedef to fix the problem.
The temporary /etc backup created during emerging packages should only
contain empty files that will make sure that the symlinks pointing to
files within the /etc backup won't dangle at any time.
We used to create a base_image_var.conf tmpfiles config file that
contained information about directories under /var that weren't
covered by any other tmpfiles config file. Recently some package
update started installing a directory under /var that belonged to a
user/group not found directly in passwd/group file in /etc. This
user/group was defined in passwd/group in /usr/share/baselayout, but
at the early boot, these are not yet checked for user/group
information, so systemd-tmpfiles running inside initrd failed when
trying to create such an entry using the base_image_var.conf tmpfiles
config file.
Split the base_image_var.conf into two files - base_image_var.conf and
base_image_var_late.conf. The former will only contain entries owned
by user/group that are supposed to exist very early in the boot, while
the latter will contain the rest of directories - those will be
created later during the boot.
We already drop tmpfile rules that we don't need because we ship the
files through our /etc overlay. However, some rules weren't dropped
because they used tabs and not spaces (/etc/selinux/, /etc/iscsi and
/etc/ssl/*).
Drop rule lines for /etc that use tabs. Also rules modifiers like ! to
only do it during boot or - to allow failure will be removed but those
with + or = will stay as they to explicit recreation.
The existing tmpfile logic took care of folders that the ebuild keepdir
directive wanted to exist on the OS. However, files and symlinks were
not created, causing them to be missing if we didn't explicitly modify
the ebuild files in coreos-overlay to use tmpfiles or patching of
paths to be in /usr. We need a logic to provide /etc files from the
current /usr partition without getting stale. This can be done best
with an overlay mount which requires to keep the original /etc files
under /usr.
Move the final /etc folder of the image build to /usr/share/flatcar/etc
to serve as lower layer in the overlay. Also remove any state from the
rootfs to make sure that we don't rely on it when testing our images
before the release. What we get with an overlay mount is essentially a
similar behavior to a 3-way merge because as long as the user didn't
change the files, the old version is replaced with the new version and
as soon as the user did changes, that file is frozen and wins over the
provided old (in case of a rollback) or new versions from /usr. It does
not work on file lines but on whole file contents, yet that is also
what rpm-ostree does to my knowledge. Also, run tmpfiles once and do
the SELinux labeling to prevent files being created in the upperdir
because they were missing in the lowerdir, or because they had missing
SELinux labels.
Emerge flags are cryptic in general, but short flags even more so, so
expand them. While at it, I noticed some places where bash arrays
could be used, so convert those places too.
Timestamp and user/group information are out, in are device ID and
inode number. That way, the file can be used for accounting size
differences of files/image.
We need to fix another symlink created by gcc-config, so extend the
function that was doing it for some other links and rename it - it's
not about only liblto any more.
Some packages are currently missing from the /usr/share/SLSA directory
compared to flatcar_production_image_packages.txt. For torcx packages,
extract the reports from the torcx bundle when adding it to the rootfs.
For initramfs packages, as a substitute we enumerate build dependencies
of coreos-kernel (image_packages_implicit()). At this time these are
bootengine and intel-microcode.
Since v0.51.0 syft supports generating parsing the gentoo package
database. This is a first go at integrating that into our image build
process. This doesn't yet include packages inside torcx packages, or the
kernel, or initramfs-only packages.
There is some cruft left after grub hashes generation. After the
contents are zipped into archive, they don't need to be around any
more.
Try to remove the rootfs directory after unmounting the
image. disk_util can recreate it again if there is a need for it.
Remove the build directory used for generating ACI images - it's not
needed after successful installation.
Disable ebuild-locks for the emerge command that creates the image.
Ebuild-locks protect unsandboxed ebuild phases from running
concurrently, but also slow things down greatly when a lot of
concurrency would otherwise be possible. The image build phase merges a
big amount of binary packages, and I am not aware of us having any
phases that risk concurrently modifying shared files.
I have been testing this for the last months and have not seen any
failures. The time savings are significant: this cuts image build time
from 20m to 10m for me.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Package users nowadays get created through systemd-sysuser files.
Gentoo uses the acct-user|groups packages to allocate stable IDs for
these users. Since they get created at runtime, we have the problem
that they end up in /etc/passwd at boot time which would be fine if
they follow the acct-user allocations but it could also be that there
is a package that uses its own sysuser files, leading to dynamic ID
allocation which we can't control and may result in ugly user ID
mismatches that are hard to resolve again. Normally we intend to ship
all system users under /usr/share/baselayout/passwd so that /etc/passwd
is really left to the user's own entries.
Generate the /etc/passwd sysuser entries at image build time and move
these entries over to /usr/share/baselayout/passwd so that all
system users reside in this database. We should still ensure to have
acct-user packages for all system users or at least hardcoded user
IDs, therefore, add a check for that.
Stop relying on github redirects, they are a mixed blessing and using
them broke emerge-gitclone inside dev-container in silent way. The
script could not find a desired revision of portage-stable or
coreos-overlay, because it tried to pull from kinvolk instead of
flatcar-linux github org. The redirects seem to hinder fetching a
specific commit, so the script pulled something else (HEAD or main?).
To be able to support arm64 native SDK without cross builds, we should
make generate_au_zip support both architectures, amd64 and arm64.
Without doing that, `build_image` fails with `ERROR : Required
WHITE_LIST items ld-linux-x86-64.so.2 not found!!!`, because the
script recognizes only amd64 libs in WHITE_LIST.
We should first determine the architecture in build_image, before
running generate_au_zip, and pass the architecture, either amd64 or
arm64. Also add allow_list and ld_linux parameters to necessary
functions.
The vmlinuz kernel image gets installed to /usr/boot/ but isn't usable
for dm-verity until it gets copied over to /boot/flatcar/ and the hash
gets embedded at a particular offset. The file in /usr/boot/ uses space
while it's not having a real purpose as long as dm-verity is used.
Delete the vmlinuz file under /usr/boot/ to free up space. When
generating the ISO image we use the vmlinuz file from /boot/flatcar/
which also has the advantage that we only distribute a single vmlinuz
file with one particular checksum.
The limited /usr and OEM partiton size is a challenge when adding new
packages or updating a package. Since the disk layout can't be changed
for compatibility reasons when updating an existing instance, we can't
simply try out something without ensuring first that enough space is
there by removing something else. This situation can be relaxed by
leveraging btrfs compression. There was some support for btrfs but it
was a bit outdated and didn't allow to configure compression or setting
read-only flags.
Fix the btrfs support, allow to mark the default subvolume as read only
and add a compression variable that allows to select a compression
algorithm. Instead of enabling compression by setting the mount option,
we can set the filesystem attribute which has the benefit that
compression is still used with the default mount options for this (top)
directory and its contents. While for the ext2 /usr partition a hack
existed to force read-only mode by modifying some bytes and checking
these bytes could also be used to know if read-only should be used to
prevent corruption of dm-verity data, we rather check directly whether
dm-verity is active for this partition and mount it read-only (and
with the norecovery option to really prevent any write attempt).
The kola update tests need a dev-key-signed update payload. This was
lacking and caused the update tests to be skipped.
Generate the test update payload for both dev builds and release builds
and run the kola tests for both. The test update payload has a special
name to not confuse it with the real update payload for releases, and
we keep the previous behavior to sign releases. Therefore, the
generate_update function wasn't used but the extract_update function
extended with generating the additional test payload.
The `--jobs` parameter that some scripts defined was not used anywhere
in jenkins or mantle. So the value of the parameter always ended up
being equal to `${NUM_JOBS}` set by `common.sh`. Also, even if the
`--jobs` parameter was used for some script, that script usually
didn't forward the jobs value to other scripts, so the other scripts
ended up using `${NUM_JOBS}` again. Also, the `${FLAGS_jobs}` variable
was used by some functions in the build library, and those functions
were sometimes invoked by scripts that didn't define the
`${FLAGS_jobs}` variable. It is tedious to track which script should
actually define the parameter, and where it should be forwarded.
Just get rid of this half-working pretense. If you want to affect how
many jobs `emerge` uses, export the `NUM_JOBS` environment variable
before calling any script.
For `EMERGE_FLAGS` and `REBUILD_FLAGS` we unconditionally specify the
`--jobs` flag's value to `${NUM_JOBS}` because they are passed to
`emerge`. On the other hand we drop the `--jobs` parameter from the
`UPDATE_ARGS` variable, because this variable passed to `setup_board`
or `update_chroot`, which don't have this flag any more.
When a license file is newly added, portage may not yet have it in the
shared folder and the license inclusion step fails.
Fall back to the source repositories and look for the license file
there, too. Print a warning if not found instead of failing to build.