Since v0.51.0, syft supports parsing the Gentoo package database. This
is a first go at integrating that into our image build process. It
doesn't yet cover packages inside torcx packages, the kernel, or
initramfs-only packages.
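A minimal sketch of the kind of invocation this enables; the board root
path and output file name are illustrative, not the exact ones used by
the build scripts:
```
# Generate an SPDX SBOM from the Gentoo package database in the board root.
syft packages dir:/build/amd64-usr -o spdx-json > flatcar_sbom.spdx.json
```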
There is some cruft left over after the GRUB hashes are generated. Once
the contents are zipped into the archive, they don't need to stay
around any more.
Try to remove the rootfs directory after unmounting the
image. disk_util can recreate it if it is needed again.
Remove the build directory used for generating ACI images - it's not
needed after successful installation.
Disable ebuild-locks for the emerge command that creates the image.
Ebuild-locks protect unsandboxed ebuild phases from running
concurrently, but also slow things down greatly when a lot of
concurrency would otherwise be possible. The image build phase merges a
large number of binary packages, and I am not aware of us having any
phases that risk concurrently modifying shared files.
I have been testing this for the last few months and have not seen any
failures. The time savings are significant: this cuts image build time
from 20m to 10m for me.
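A minimal sketch of disabling the feature for a single emerge
invocation; the variables, flags, and package atom are illustrative,
not the exact build_image command:
```
# Disable ebuild-locks only for this binary-package merge into the image
# root; everything else keeps the default FEATURES from make.conf.
sudo FEATURES="-ebuild-locks" emerge \
    --root="${ROOT_FS_DIR}" --root-deps=rdeps --usepkgonly \
    --jobs="${NUM_JOBS}" coreos-base/coreos
```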
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Package users nowadays get created through systemd-sysusers files.
Gentoo uses the acct-user/acct-group packages to allocate stable IDs
for these users. Since the users get created at runtime, they end up in
/etc/passwd at boot time. That would be fine if they followed the
acct-user allocations, but a package could also use its own sysusers
files, leading to dynamic ID allocation which we can't control and
which may result in ugly user ID mismatches that are hard to resolve
again. Normally we intend to ship all system users under
/usr/share/baselayout/passwd so that /etc/passwd is really left to the
user's own entries.
Generate the /etc/passwd sysusers entries at image build time and move
these entries over to /usr/share/baselayout/passwd so that all system
users reside in this database. We should still ensure that we have
acct-user packages, or at least hardcoded user IDs, for all system
users; therefore, add a check for that.
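A simplified sketch of the generation step, assuming ROOT_FS_DIR points
at the mounted image root; the exact merge logic in the build scripts
may differ:
```
# Snapshot /etc/passwd, materialize all sysusers.d entries at build time,
# then move only the newly generated entries over to the baselayout
# database so /etc/passwd stays reserved for the user's own entries.
cp "${ROOT_FS_DIR}/etc/passwd" /tmp/passwd.before
sudo systemd-sysusers --root="${ROOT_FS_DIR}"
comm -13 <(sort /tmp/passwd.before) <(sort "${ROOT_FS_DIR}/etc/passwd") \
    | sudo tee -a "${ROOT_FS_DIR}/usr/share/baselayout/passwd" > /dev/null
sudo cp /tmp/passwd.before "${ROOT_FS_DIR}/etc/passwd"
```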
Stop relying on GitHub redirects; they are a mixed blessing, and using
them silently broke emerge-gitclone inside the dev container. The
script could not find the desired revision of portage-stable or
coreos-overlay because it tried to pull from the kinvolk GitHub org
instead of flatcar-linux. The redirects seem to hinder fetching a
specific commit, so the script pulled something else (HEAD or main?).
To be able to support a native arm64 SDK without cross builds, we
should make generate_au_zip support both architectures, amd64 and
arm64. Without that, `build_image` fails with `ERROR : Required
WHITE_LIST items ld-linux-x86-64.so.2 not found!!!`, because the
script only recognizes amd64 libraries in WHITE_LIST.
We should first determine the architecture in build_image before
running generate_au_zip, and pass it along as either amd64 or arm64.
Also add allow_list and ld_linux parameters to the necessary functions.
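A sketch of the per-architecture dynamic loader selection; the variable
names are illustrative, not necessarily those used in generate_au_zip:
```
# Pick the allow-listed dynamic loader for the target architecture.
case "${BOARD_ARCH}" in
  amd64) LD_LINUX="ld-linux-x86-64.so.2" ;;
  arm64) LD_LINUX="ld-linux-aarch64.so.1" ;;
  *) echo "Unsupported architecture: ${BOARD_ARCH}" >&2; exit 1 ;;
esac
```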
The vmlinuz kernel image gets installed to /usr/boot/ but isn't usable
for dm-verity until it gets copied over to /boot/flatcar/ and the hash
gets embedded at a particular offset. The file in /usr/boot/ uses up
space while serving no real purpose as long as dm-verity is used.
Delete the vmlinuz file under /usr/boot/ to free up space. When
generating the ISO image we use the vmlinuz file from /boot/flatcar/,
which also has the advantage that we only distribute a single vmlinuz
file with one particular checksum.
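In build script terms this boils down to something like the following,
with the path variable being illustrative:
```
# The kernel is already available in /boot/flatcar/ with the verity hash
# embedded; the copy under /usr/boot/ only wastes space on /usr.
sudo rm -f "${ROOT_FS_DIR}"/usr/boot/vmlinuz*
```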
The limited /usr and OEM partition size is a challenge when adding new
packages or updating a package. Since the disk layout can't be changed
for compatibility reasons when updating an existing instance, we can't
simply try something out without first ensuring that enough space is
available by removing something else. This situation can be relaxed by
leveraging btrfs compression. There was some support for btrfs, but it
was a bit outdated and didn't allow configuring compression or setting
read-only flags.
Fix the btrfs support, allow marking the default subvolume as
read-only, and add a compression variable that allows selecting a
compression algorithm. Instead of enabling compression through a mount
option, we set the filesystem attribute, which has the benefit that
compression is still used with the default mount options for this (top)
directory and its contents. For the ext2 /usr partition, a hack existed
to force read-only mode by modifying some bytes, and checking those
bytes could also have told us whether read-only should be used to
prevent corruption of the dm-verity data; instead, we check directly
whether dm-verity is active for the partition and, if so, mount it
read-only (and with the norecovery option to really prevent any write
attempt).
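A sketch of the btrfs handling described above; the mount point, device
variables, and chosen algorithm are illustrative:
```
# Enable compression via the filesystem attribute on the top directory so
# that the default mount options still give compression for everything
# created below it.
sudo btrfs property set "${mnt}" compression zstd
# ... populate the filesystem ...
# Mark the default subvolume read-only once the contents are in place.
sudo btrfs property set "${mnt}" ro true
sudo umount "${mnt}"
# When dm-verity protects the partition, mount it read-only and with
# norecovery so that no write is ever attempted.
sudo mount -o ro,norecovery "${verity_dev}" /usr
```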
The kola update tests need a dev-key-signed update payload. This was
lacking and caused the update tests to be skipped.
Generate the test update payload for both dev builds and release builds
and run the kola tests for both. The test update payload has a special
name so it is not confused with the real update payload for releases,
and we keep the previous behavior of signing releases. Therefore, the
generate_update function wasn't used; instead, the extract_update
function was extended to generate the additional test payload.
The `--jobs` parameter that some scripts defined was not used anywhere
in jenkins or mantle. So the value of the parameter always ended up
being equal to `${NUM_JOBS}` set by `common.sh`. Also, even if the
`--jobs` parameter was used for some script, that script usually
didn't forward the jobs value to other scripts, so the other scripts
ended up using `${NUM_JOBS}` again. Also, the `${FLAGS_jobs}` variable
was used by some functions in the build library, and those functions
were sometimes invoked by scripts that didn't define the
`${FLAGS_jobs}` variable. It is tedious to track which script should
actually define the parameter, and where it should be forwarded.
Just get rid of this half-working pretense. If you want to affect how
many jobs `emerge` uses, export the `NUM_JOBS` environment variable
before calling any script.
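For example:
```
# Cap emerge parallelism for the whole build by exporting NUM_JOBS before
# invoking any of the scripts.
export NUM_JOBS=4
./build_packages
./build_image prod
```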
For `EMERGE_FLAGS` and `REBUILD_FLAGS` we unconditionally set the
`--jobs` flag to `${NUM_JOBS}`, because they are passed to `emerge`. On
the other hand, we drop the `--jobs` parameter from the `UPDATE_ARGS`
variable, because this variable is passed to `setup_board` or
`update_chroot`, which no longer have this flag.
When a license file is newly added, portage may not yet have it in the
shared folder and the license inclusion step fails.
Fall back to the source repositories and look for the license file
there, too. Print a warning if not found instead of failing to build.
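A sketch of the fallback lookup; the repository paths and variable
names are illustrative:
```
# Look for the license text in portage's shared licenses directory first,
# then fall back to the source repositories; warn instead of failing.
found=""
for dir in /usr/portage/licenses \
           "${SRC_ROOT}/third_party/portage-stable/licenses" \
           "${SRC_ROOT}/third_party/coreos-overlay/licenses"; do
  if [[ -f "${dir}/${license_name}" ]]; then
    cp "${dir}/${license_name}" "${license_dir}/"
    found="yes"
    break
  fi
done
if [[ -z "${found}" ]]; then
  echo "Warning: license ${license_name} not found; skipping" >&2
fi
```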
For some Unicode characters in ca-certificates file names, "rev"
complains about an "invalid or incomplete multibyte or wide character"
and gives no output.
Filter out any unexpected characters for "rev" and replace them with "?"
so that "ls some?name" will still resolve the original name.
The license JSON file only included the package names but no other
metadata. Also, since the file was not on the image itself, it had to
be downloaded.
Add more metadata to the license JSON and store it on the image.
This includes the source package of all torcx packages that are
installed on disk, including cases where multiple versions of the
same package are available.
This moves the default symlinking logic into build_image as well.
It assumes that a torcx store is available locally with all images
referenced in the torcx manifest.
This is accomplished with a highly indented double for loop, but I
think it's still decently readable.
This adds the option --torcx_store to specify the path to a
directory containing torcx images to be baked into the OS image. A
blank string can be given instead of a path to restore the previous
behavior and leave an empty vendor store.
The default value is the default path created by build_torcx_store,
which is used when build_packages updates torcx images. This means
that the current pattern "./build_packages && ./build_image prod"
should result in a fully updated OS image with all torcx images
available in the vendor store.
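Usage would look roughly like this (the blank-string form restores the
old behavior):
```
# Bake in the torcx images from the default store that build_packages
# populated via build_torcx_store:
./build_packages && ./build_image prod
# Point at a different store, or pass a blank string to leave the vendor
# store empty:
./build_image --torcx_store=/path/to/store prod
./build_image --torcx_store='' prod
```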
This changes the format from:
sys-apps/systemd-212-r8::coreos GPL-2 LGPL-2.1 MIT public-domain
to a JSON structure:
[
  {
    "project": "sys-apps/systemd-212-r8::coreos",
    "license": ["GPL-2", "LGPL-2.1", "MIT", "public-domain"]
  }
]
We don't have to worry about changing the format because the previous
format was never published. This is designed to match the
bill-of-materials [1] format so that it can be consumed by the site.
[1]: https://github.com/coreos/license-bill-of-materials
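A hypothetical sketch of converting the old one-line-per-package format
into this structure with jq, not necessarily how build_image actually
produces it:
```
# Each input line is "category/package-version::repo LICENSE [LICENSE...]".
while read -r project licenses; do
  jq -n --arg project "${project}" --arg lic "${licenses}" \
     '{project: $project, license: ($lic | split(" "))}'
done < license_list.txt | jq -s '.'
```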
INFO build_oem_aci: Writing coreos_oem_gce_aci_stage_packages.txt
awk: cmd. line:1: fatal: cannot open file `/build/amd64-usr/var/db/pkg//DEPEND' for reading (No such file or directory)
INFO build_oem_aci: Writing coreos_oem_gce_aci_stage_licenses.txt
awk: cmd. line:1: fatal: cannot open file `/build/amd64-usr/var/db/pkg//DEPEND' for reading (No such file or directory)
This is useful for emerges that target incomplete root filesystems,
such as the ACI-building emerges. There are cases where the #!
(shebang) check is expected to fail while doing those.
This reverts commit a7ffba9a9f.
The build_image script can build multiple formats. When our
releases and automated builds are creating developer containers and
production images from the same command, the verity flag would be
disabled while building the container and remain disabled when building
the production image. This resulted in verity being disabled in all of
our builds.
They're not in the root fs, but they are in the initramfs. Handle this
by augmenting the package list with packages that are both
- build dependencies of coreos-kernel, and
- configured to cause rebuilds of coreos-kernel when their sub-slot
changes.
To clean things up and prepare for arm64 support, move all the
enable_rootfs_verification processing into one location and add some
comments.
Signed-off-by: Geoff Levand <geoff@infradead.org>
To make verity work, both enable_rootfs_verification and enable_verity
need to be set; with only one of them, verity just gets half enabled.
Remove the enable_verity flag and do the full verity setup when
enable_rootfs_verification is set.
Signed-off-by: Geoff Levand <geoff@infradead.org>
The disable_read_write variable was just a copy of FLAGS_enable_rootfs_verification,
so to make things less confusing just use FLAGS_enable_rootfs_verification.
Signed-off-by: Geoff Levand <geoff@infradead.org>
The Xen loader in GRUB never received support for our hacky scheme of
adding the verity hash to the kernel cmdline. Disable till that's fixed.
Partially reverts 2016567 and 533b1b9.
Consolidates two very similar flags into one and fixes an issue where
verity could get enabled in the GRUB config when rootfs verification
was turned off (e.g. on arm64, which cannot use verity yet).
PROD_IMAGE is a flag that indicates a production image should be
built, and will be set for dev builds if the user specifies that
both dev and prod images should be built. build_image was
incorrectly using the PROD_IMAGE variable to conditionally do some
setup depending on the image type.
Add a new variable IMAGE_BUILD_TYPE that can be tested for the type
of image currently being built and replace the PROD_IMAGE usage.
Signed-off-by: Geoff Levand <geoff@infradead.org>
We need to ship some PCR measurements alongside images in order to make it
easier for admins to provide an appropriate policy. Add some tooling to
generate the appropriate hashes during build, pack those into a zip file
and upload it.
Allows build_image to be used without first running build_packages.
Note: setup_board --force is required before build_packages will work
properly after doing this since baselayout won't be installed otherwise.
- May be sourced early, so explicitly die if source fails.
- Add a function for getting the latest version of a package.
- Read PROVIDES metadata using portageq, enabling data to be read from
binary packages in addition to installed packages. The performance hit
is not an issue here, and this is needed to support empty build roots
(see the sketch below).
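A sketch of the portageq query mentioned in the last item above; the
variables and package atom are illustrative:
```
# Read the PROVIDES (soname) metadata for a binary package without it
# having to be installed in the build root.
portageq metadata "${BOARD_ROOT}" binary "${latest_pkg}" PROVIDES
```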
This variable was semi-deprecated ages ago so `version.txt` could follow
a similar variable naming pattern to `os-release`. Finally drop usage of
it here in favor of `$COREOS_VERSION`.
Add the necessary variables in grub.cfg and populate the EFI
partition with the arm64 EFI executable and modules.
Signed-off-by: Andrej Rosano <andrej@inversepath.com>
Failing to explicitly set the selinux policy store to operate on may
result in semodule installing the policy in an incorrect location. Pass
it on the command line in order to avoid this.
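A sketch of the explicit invocation; the store name and module path are
illustrative:
```
# Operate on the named policy store instead of whatever semodule guesses.
semodule -s mcs -i "${policy_module}.pp"
```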
ldconfig does not work for non-native arches. Create a new
build_image routine run_ldconfig that uses qemu user emulation
to run the board ldconfig on the board rootfs when the board and
SDK arches are different.
See: http://code.google.com/p/chromium/issues/detail?id=378377
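A minimal sketch of the approach, assuming a qemu user-mode binary is
available in the SDK; the function body, variable names, and qemu
binary name are illustrative:
```
run_ldconfig() {
  local root_fs_dir="$1"
  if [[ "${BOARD_ARCH}" == "$(uname -m)" ]]; then
    # Native build: the SDK's ldconfig can operate on the board root directly.
    sudo ldconfig -r "${root_fs_dir}"
  else
    # Cross build: run the board's own ldconfig under qemu user emulation.
    sudo qemu-aarch64 -L "${root_fs_dir}" \
        "${root_fs_dir}/sbin/ldconfig" -r "${root_fs_dir}"
  fi
}
```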
Prior to calling run_ldconfig, the board rootfs must have ldconfig
installed. To arrange this, move the call to run_ldconfig to after
the base package install.
Fixes build_image errors like these when building for arm64:
/sbin/ldconfig: /lib64/libXXX is for unknown machine 183.
Signed-off-by: Geoff Levand <geoff@infradead.org>
The new Python script check_root uses data that portage already
maintains on what shared libraries packages need or provide, instead of
re-scanning whatever ELF files can be found. This is much more
comprehensive, but there is a bit of a transition issue for folks with
long-lived SDKs: packages built with portage older than 2.2.18 do not
include this data. As such, for now the check is non-fatal and provides
a command you can use to refresh locally installed packages.
The code checking for conflicts between top-level directories and /usr
has also been rewritten. Both tests are now considerably faster.
Once we're signing the root filesystem, we're not going to be able to boot
the kernel from there. Copy the kernel out to the EFI System Partition and
sign it.
There isn't a sane way for users to know the licenses of individual
packages in built CoreOS images. The information is hidden
away back in the original ebuilds. This extends our existing package
list with a new file that also includes licenses:
```
app-admin/flannel-0.3.0-r3::coreos Apache-2.0
app-admin/fleet-0.9.1::coreos Apache-2.0
app-admin/locksmith-0.2.3::coreos Apache-2.0
app-admin/sdnotify-proxy-0.1.0::coreos Apache-2.0
app-admin/sudo-1.8.10_p2::portage-stable ISC BSD
app-admin/toolbox-0.0.0-r4::coreos Apache-2.0
app-arch/bzip2-1.0.6-r6::portage-stable BZIP2
app-arch/gzip-1.5::portage-stable GPL-3
app-arch/tar-1.27.1-r2::portage-stable GPL-3+
...
```
This uses our new GRUB2 features to handle GPT priority partition
selection, terminal selection, OEM tweaks, etc. The old SYSLINUX and
PV-GRUB configs are now unused except for maintaining compatibility
with older installs. Of the old configs only the ones that
coreos-postinst copies are needed. The new setup supports using GRUB2
under Xen, giving us automatic fallback support on all of our platforms
for the very first time!
Since grub.cfg is copied into place instead of generated, build_image's
--boot_args option is no longer supported. It could be re-added later
with some sed goo but for now it is easy enough to just edit grub.cfg.
The new grub install script must be called after the image is unmounted
and the old bootloaders script doesn't need to touch grub at all. For
now we will continue to use the existing syslinux configs but
interpreted by grub. Beyond the grub menu flashing by during boot
everything should still be functionally equivalent.
Passing ROOT= as an environment variable to the board wrapper scripts
doesn't work; the wrapper unconditionally overrides it. This means that
so far our packages.txt files have listed the contents of
/build/amd64-usr instead of the image. Fix this by calling equery
directly instead.
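A sketch of the direct call, assuming equery honors the ROOT
environment variable the way portage itself does; the exact flags and
output path in the script may differ:
```
# Query the image root directly instead of going through the board wrapper,
# which overrides ROOT=.
ROOT="${root_fs_dir}" equery --no-color list '*' \
    > "${BUILD_DIR}/package_list.txt"
```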
We have long since stopped installing anything to the /boot directory of
the root filesystem. Mount the ESP partition to /boot for consistency
with the discoverable partitions spec.
Create profile as a real directory instead of a symlink to the board
root's configuration. Normally the board root does not modify this but
it is useful for build_image to use it to modify package.provided.
Instead of gluing in a special PROD_INSTALL_MASK for all images use
profiles to configure the differences between the base build root,
production images, and developer images. This offers much more
flexibility and is needed for providing a full dev environment in
developer images.
Using parallel_emerge has been disabled by default for all commands
except build_image for quite a while now; build_image kept it just
because it was still a bit faster than normal emerge. Keeping
parallel_emerge complicates future changes to build_image, so
build_image needs to drop it entirely. Since that means nothing uses it
by default, we might as well just rip out support for it entirely.
Missed this in 7231b95a: the update zip should still be built when the
usr partition is extracted for generating updates, but build_image
itself no longer generates and signs the update.
The current generate_update function is now less useful; the important
part that we need is just the partition image. Also, by defaulting to
extracting the partition, the old cros_generate_update, which is still
in use by devserver, can be removed entirely; devserver will just
expect the extracted partition image instead.
Evaluating this as a user config causes it to block on
coreos-environment-setup.service which will wait on networking. This
makes it hard to add extra tricks for testing/debugging situations where
networking is failing. For example, to trigger dhcpcd if networkd dies:
#cloud-config
write_files:
  - path: /etc/systemd/system/systemd-networkd.service.d/dhcpcd.conf
    content: |
      [Unit]
      OnFailure=dhcpcd.service
      [Service]
      Restart=no
The new --developer_data option can be used to specify a path to a cloud
config to bundle into the image. If none is provided but a shared user
password (for core) is set then generate a config to set that password.
This lets us use the same mechanism for setting the default password for
both disk and PXE images.
To make it possible to plop a CoreOS install into a simple
single-filesystem image for use as a container, some things like
configuring bootloaders need to be skipped.
Use what was the base image build function as setup/finalize steps in
the dev and prod build functions. This eliminates duplicate code
that mounted and unmounted the filesystem images.
We need some more control over exactly what lands in dev vs prod images
which will require letting them diverge in what is currently the common
base image step. There isn't any real need for the base image in the
first place other than to speed up building both dev and prod images at
the same time but that isn't common enough to worry about.
As part of this cleanup also remove references to CHROMEOS_* variables
and the recovery image that never actually existed in CoreOS.