This is some fallout from converting scripts from python2 to
python3. Output received from the functions in subprocess module now
return bytearrays, but we operate on them as if they were a text. So
decode the bytearrays to strings. Otherwise we are either getting some
junk values passed to the command line utilities (for example:
`b'/dev/loop2'` instead of `/dev/loop2`), or exceptions are thrown,
because a function expected a string.
This is just a conversion done by 2to3 with a manual updates of
shebangs to mention python3 explicitly. The fixups for bytearray vs
string issues will follow up.
To be able to support arm64 native SDK without cross builds, we should
make generate_au_zip support both architectures, amd64 and arm64.
Without doing that, `build_image` fails with `ERROR : Required
WHITE_LIST items ld-linux-x86-64.so.2 not found!!!`, because the
script recognizes only amd64 libs in WHITE_LIST.
We should first determine the architecture in build_image, before
running generate_au_zip, and pass the architecture, either amd64 or
arm64. Also add allow_list and ld_linux parameters to necessary
functions.
The vmlinuz kernel image gets installed to /usr/boot/ but isn't usable
for dm-verity until it gets copied over to /boot/flatcar/ and the hash
gets embedded at a particular offset. The file in /usr/boot/ uses space
while it's not having a real purpose as long as dm-verity is used.
Delete the vmlinuz file under /usr/boot/ to free up space. When
generating the ISO image we use the vmlinuz file from /boot/flatcar/
which also has the advantage that we only distribute a single vmlinuz
file with one particular checksum.
For consistency with code further down in the file: aarch64 cross compilation only applies when CBUILD is x86,
for native aarch64 builds rust is guaranteed to have aarch64 rustlibs.
The defaults already give more space than the ext4 defaults but it's
recommended to use the mixed mode for filesystems smaller than 1-5 GB.
Another aspect is the duplication of metadata and while it currently is
off it's actually related to the underlying block device and could
change as soon as the block device type changes.
Select the mixed mode that uses a merged area for data and metadata
blocks. Also ensure that no metadata duplication gets enabled
automatically.
The compression feature of btrfs allows us to store more in the
size-limited /usr and OEM partitions. The size should of course still
be monitored to not bloat the image but more headroom helps to try
things out quickly without hitting the hard limit which fails the
build.
Use btrfs for the OEM partition but with zlib compression because
the outdated GRUB version doesn't support zstd yet.
New subvolumes currently can't be used for the OEM partition as default
subvolumes because GRUB tries to read the grub.cfg from the top
subvolume (at least with our old version). (We could however use
subvolumes for the /usr partition when switching to btrfs if that
makes any sense.)
The limited /usr and OEM partiton size is a challenge when adding new
packages or updating a package. Since the disk layout can't be changed
for compatibility reasons when updating an existing instance, we can't
simply try out something without ensuring first that enough space is
there by removing something else. This situation can be relaxed by
leveraging btrfs compression. There was some support for btrfs but it
was a bit outdated and didn't allow to configure compression or setting
read-only flags.
Fix the btrfs support, allow to mark the default subvolume as read only
and add a compression variable that allows to select a compression
algorithm. Instead of enabling compression by setting the mount option,
we can set the filesystem attribute which has the benefit that
compression is still used with the default mount options for this (top)
directory and its contents. While for the ext2 /usr partition a hack
existed to force read-only mode by modifying some bytes and checking
these bytes could also be used to know if read-only should be used to
prevent corruption of dm-verity data, we rather check directly whether
dm-verity is active for this partition and mount it read-only (and
with the norecovery option to really prevent any write attempt).
The rust ebuild has some magic to detect cross-toolchains present on the
system and enable building additional cross targets. The code to trigger
the rebuild of rust is part of install_cross_rust, and checks whether
the cross directories exist in the rust installation. If they don't,
then rust is removed and rebuilt to allow for the auto-detection to
happen.
Right now there are two issues with the code. Firstly, the path that is
checked is wrong, which leads to rust always being removed and rebuilt.
The path checked is /usr/lib/rust-*/rustlib but /usr/lib/rustlib is
where the files are installed.
The second issue is that it checks for aarch64 dirs when CHOST is
aarch64-cros-linux-gnu. However, on an aarch64 host the aarch64 dirs
will already exist from building the sdk itself. The rust ebuild is not
ready to handle aarch64 hosts yet and blows up. The correct behavior is
to combine the check for CHOST with a check for the right CBUILD.
On an aarch64 host we should presumably check for the x86 CHOST and rust
dirs, but that can be added later, because it needs more work.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The arm64 profiles don't specify SYMLINK_LIB=yes, which makes sense
since arm64 systems don't support multilib in the way that we are used
to from x86. What this means is that build artifacts are installed into
separate lib and lib64 directories. The root overlay installed in stage4
needs to check for SYMLINK_LIB before trying to create a symlink,
otherwise it fails to be applied because it collides with the directory
in the rootfs.
This uncovered a second minor issues - the rust toolchain bootstrap
scripts checked for /usr/lib64/rust*, but the ebuild installs to
/usr/lib/rust.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The previously used uuid 4f68bce3-e8cd-4db1-96e7-fbcaf984b709 is valid
for x86_64 root partitions, which resulted in the dev container not
working with systemd-nspawn on aarch64. systemd-nspawn fails with:
No suitable root partition found in image
Change the partition uuid to the architecture agnostic one documented
in the man page:
A GUID partition table (GPT) with a single partition of type 0fc63daf-8483-4772-8e79-3d69d8477de4.
This makes systemd-nspawn happy on aarch64.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The kola update tests need a dev-key-signed update payload. This was
lacking and caused the update tests to be skipped.
Generate the test update payload for both dev builds and release builds
and run the kola tests for both. The test update payload has a special
name to not confuse it with the real update payload for releases, and
we keep the previous behavior to sign releases. Therefore, the
generate_update function wasn't used but the extract_update function
extended with generating the additional test payload.
This change removes 8 years old code from the toolchains build which
tries to update SDK libraries for unknown reasons, breaking the
toolchains build in the glibc-2.33 update.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change to stage 4 of the SDK bootstrap process will keep a
snapshot of coreos-overlay in the SDK tarball. This snapshot can be
used in future SDK bootstraps' stage1 to ensure a clean stage 1 output
without any package updates.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
This change updates the stage1 SDK bootstrap build to use local
("known good") package ebuilds only, preventing updated package ebuilds
to apply in stage 1. This fixes SDK build breakage we observed when
upgrading core libraries like readline.
The change also removes the seed update from stage 1 as it should not
be needed anymore now that we postpone any package updates to stage 2.
The following package ebuild repos are used for stage 1:
- for portage-stable, we simply copy /var/gentoo/repos/gentoo
from the SDK root.
- coreos-overlay is more complicated since ebuilds are missing from
the SDK. So we grok the version the SDK was built with from
/mnt/host/source/.repo/manifests/default.xml
and then we create a local stage 1 clone of
https://github.com/kinvolk/coreos-overlay.git
in which we then check out the revision noted in the default mnifest.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
The `--jobs` parameter that some scripts defined was not used anywhere
in jenkins or mantle. So the value of the parameter always ended up
being equal to `${NUM_JOBS}` set by `common.sh`. Also, even if the
`--jobs` parameter was used for some script, that script usually
didn't forward the jobs value to other scripts, so the other scripts
ended up using `${NUM_JOBS}` again. Also, the `${FLAGS_jobs}` variable
was used by some functions in the build library, and those functions
were sometimes invoked by scripts that didn't define the
`${FLAGS_jobs}` variable. It is tedious to track which script should
actually define the parameter, and where it should be forwarded.
Just get rid of this half-working pretense. If you want to affect how
many jobs `emerge` uses, export the `NUM_JOBS` environment variable
before calling any script.
For `EMERGE_FLAGS` and `REBUILD_FLAGS` we unconditionally specify the
`--jobs` flag's value to `${NUM_JOBS}` because they are passed to
`emerge`. On the other hand we drop the `--jobs` parameter from the
`UPDATE_ARGS` variable, because this variable passed to `setup_board`
or `update_chroot`, which don't have this flag any more.
The script needs to be ported, because it is importing portage code
which became python3 only.
The porting I did is likely a lousy job, but at least it stopped
failing with some p(yt)hony errors.
I have no idea how this thing worked before - the repos never were in
/usr/portage nor in /usr/local/portage… But the newer version of
portage seems to be pretty picky about the validity of repos location,
so fix them.
When a license file is newly added, portage may not yet have it in the
shared folder and the license inclusion step fails.
Fall back to the source repositories and look for the license file
there, too. Print a warning if not found instead of failing to build.
The build_image script invokes the create_dev_container function, and
passes the `FLAGS_group` as param. Use the param, to generate the
binhost URL instead of using the DEFAULT_GROUP which stays as developer
always.
Fixes: kinvolk/Flatcar#298
Signed-off-by: Sayan Chowdhury <sayan@kinvolk.io>
Kernel source tree started to have a broken link
`tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S`.
Especially in case of Kernel 5.8.18, like:
```
broken link: /usr/src/linux-5.8.18-coreos/tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S
ERROR build_packages: test_image_content: Failed symlink check
```
Ignore the symlink when checking broken symlinks.
Setting the invalid CCACHE_ variables resulted in strange failure
in projects depending on meson, newer version like 0.55.3. For example
systemd build fails like the following errors:
```
* ACCESS DENIED: utimes: /mnt/host/source/ccache
* ACCESS DENIED: utimes: /mnt/host/source/ccache
F: utimes
S: deny
P: /mnt/host/source/ccache
A: /mnt/host/source/ccache
R: /mnt/host/source/ccache
C: ccache cc /build/amd64-usr/var/tmp/portage/sys-apps/systemd-246/work/systemd-246-abi_x86_64.amd64/meson-private/sanitycheckc.c -o /build/amd64-usr/var/tmp/portage/sys-apps/systemd-246/work/systemd-246-abi_x86_64.amd64/meson-private/sanitycheckc.exe -O1 -pipe -pipe -D_FILE_OFFSET_BITS=64
```
We should not set up ccache at all, as it has been already disabled in
coreos-overlay repo.
The SDK now includes a Rust version with the aarch64 cross-compilation
libraries and the toolchain job doesn't build it anymore. Yet it was
still recompiled because the path had changed.
Remove the adjustment of the download URL and any automatic building
of Rust. Just issue a warning so that any problem can be spotted easily.
This change does not affect the SDK bootstrapping (full or just stage4)
but affects ./build_packages and the toolchains job. For the toolchains
job the crossdev setup is missing anyway and rebuilding wouldn't help
but only downloading, yet since in stage4 there are no binary package
URLs at all, it's best to remove this step and if it is needed later,
the warning will help.
The default update seed command does only specify gcc which leads to
an error because »The short ebuild name "gcc" is ambiguous«.
Choose the standard package name instead of the cross compiler packages
which are only known to emerge because we build them as part of an SDK
release now.
The VM hardware and OS type versions were outdated and resulted in
features not being available by default.
Choose a newer ESXi host version (requires 6.5) and set the guest
OS type to Linux 3.x 64 bit.
Before, we were relying on the toolchains job to build and upload
packages that were part of the SDK. With this change, all packages that
should be part of the SDK are built and uploaded by the SDK job. The
toolchains job only builds toolchain packages specific for the release.
This change includes several adjustments done to both the SDK and the
toolchains jobs to make this work:
* Make the SDK job build all cross toolchains, including Rust
* Stop building Rust in the toolchains job and use the one in the SDK
instead.
* In toolchain_util.sh: detect when the symlink folder for crossdev
packages is missing and run crossdev to create it during
update_chroot setup.
* Make it possible to build the SDK starting from stage 4 instead of
stage 1, to make the SDK building faster for PR branches / nightlies
(full build should still be done for releases / weeklies).
For some unicode characters in ca-certificates file names "rev" complains
about an "invalid or incomplete multibyte or wide character"
and gives no output.
Filter out any unexpected characters for "rev" and replace them with "?"
so that "ls some?name" will still resolve the original name.
Toolchain utils have installed only `dev-lang/rust`. It could result
in version mismatch between `virtual/rust` and `dev-lang/rust`, because
`dev-lang/rust` does not automatically pull in `virtual/rust`.
So install `virtual/rust` instead of `dev-lang/rust`.
Install `virtual/rust` to avoid version conflicts that happen in case of
rust versions in the SDK being different from those in the new ebuilds.
`/usr/share/catalyst/targets/stage1/stage1-chroot.sh` installs gcc and
its dependencies, including `dev-lang/rust`, while `virtual/rust` does
not get updated. That results in version conflicts between
`virtual/rust` and `dev-lang/rust`. To avoid such an issue, we should
update also `virtual/rust` when building stage1. Since `virtual/rust`
automatically pulls in `dev-lang/rust`, we do not need to explicitly
specify `dev-lang/rust` here.
The license JSON file did only include the package names but not
any other metadata. Also since the file was not on the image itself,
it had to be downloaded.
Add more metadata to the license JSON and store it on the image.
systemd and sudo are already fixed. Git was fixed by updating to 2.23.2,
not 2.24.1. Samba is 2 years old and customized, thus difficult to update.
file, Python, and gdb are only in the SDK.
If a user or old software creates the flag file on the old CoreOS location,
nothing would happen.
Check the old location, too, so that Ignition is rerun.
The dev build SDKs are not in $FLATCAR_DEV_BUILDS/sdk but published under
$FLATCAR_DEV_BUILDS/developer/sdk.
Add an environment variable to specify where the SDK is to be found
but default to $FLATCAR_DEV_BUILDS/sdk if it is not specified.
From Jenkins this variable is exported as DOWNLOAD_ROOT_SDK.
Two Flatcar versions were used in /etc/portage/make.conf both in the SDK
and in the boards.
Use only a single version by default to get the expected results and not
something else when using binary packages.
The Rust crossdev package was never uploaded to /sdk/ and always
had to be compiled again.
Upload it in a separate toolchain-arm64 directory because /Packages in /crossdev/
doesn't refer to the Rust package and its use flags.
The BINHOST was still configured to be the CoreOS CL upstream location
which does not work for independent Flatcar CL releases. This broke
binary package installation in the development container.
Use the correct BINHOST to fix installation of binary packages in the
development container.
The configuration variables for the Ignition configuration also serve as
data source for coreos-cloudinit config data (which includes plain scripts).
Document them properly and also call out that the networking variables only
work if coreos-cloudinit data is used.
For some use cases, too few networking variables were available. Add secondary
routing variables for the main network interface and add a second interface.
There was a logical mistake in Ignition that caused ignition.config.*
only to work when it was part of the ovfenv. Thus they were added but
the old CoreOS variables marked deprecated and kept. With both as OVF
variables each of them worked but directly specifying ignition.config.*
as guest variable still didn't because of the logical mistake.
Now there is a fix and both work well when specified directly as guest
variable (https://github.com/flatcar-linux/ignition/pull/11).
Delete the old CoreOS OVF variables because they just clutter the UI
and only the Ignition variables should be used in the UI.
Write out an iPXE script file for Packet.
The script uses relative URLs to refer to
the other PXE files and thus can be copied
along with the files to any server.
This is useful because it saves the creation
of an iPXE script for a release/channel on a
third-party service. For CI testing it is
also helpful because the script does not only
end up on the release server but also already
on the Google buckets, refering to unpublished
PXE payloads.
For the Ignition variables to be usable they need to be
specified in the OVF.
Call out that the CoreOS variables are deprecated to
reduce confusion when both are displayed besides each other.
Create a tar ball with the contents of the / and /usr partitions
to be used as follows with systemd-nspawn (via machinectl):
machinectl import-tar flatcar-container.tar.gz flatcar-container
machinectl start flatcar-container
machinectl shell flatcar-container
or with docker by converting it to an OCI image:
docker import -c "CMD /bin/bash" flatcar-container.tar.gz flatcar-container
Since the new "prodtar" command relies on the results of the "prod" command,
it bundles it so that "prod prodtar" and "prodtar" is the same.
When loop device partition nodes aren't cleaned up, building images
will fail with:
mkfs.vfat: Partitions or virtual mappings on device '/dev/loop0', not making filesystem (use -I to override)
Just add the flag unconditionally to work around it.
The Perl update will break SDK bootstrapping during seed update, so
disable it again. This can be reverted after bumping the SDK to a
version that includes the new Perl.
The custom sys-firmware/edk2 package has been replaced by Gentoo's
sys-firmware/edk2-ovmf package now that only amd64 is supported.
This partially reverts 1761d9d071 .