When started by the Flatcar core user, the SDK failed to use UID 500
because inside the SDK the core user from nss-altfiles already exists
with that ID. As a result, the SDK user kept UID 1000 and ran into
permission errors.
Allow reusing an existing ID for the SDK user. However, this only
works when usermod doesn't find a process using that ID, and we had
a race between the SDK entry points called by "docker start" and by
"docker exec". The race is unwanted anyway because we don't want to
execute the commands while setup_board is still running. Solve it by
setting the entrypoint for "docker start" directly to "bash -l" in
"docker create" (this is also what the entry point does as its last
step: sudo su -l).
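A minimal sketch of the "docker create" change; the container and image names are illustrative:
```bash
# Hypothetical sketch: "docker start" now goes straight into a login shell,
# so it cannot race with the entry point run via "docker exec"
docker create --name flatcar-sdk-container \
    --entrypoint /bin/bash \
    ghcr.io/flatcar/flatcar-sdk-all:latest -l
```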
Some of the signing may happen inside the SDK container, so make sure
to forward the SIGNER environment variable; it will be used by the
signing function once that is introduced.
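A sketch of the forwarding, assuming a docker_args array is assembled before the container is run:
```bash
# Hypothetical sketch: pass SIGNER through to the container when it is set
if [[ -n "${SIGNER}" ]]; then
    docker_args+=( --env "SIGNER=${SIGNER}" )
fi
```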
This change temporarily disables the Gentoo sandbox when updating the
SDK, to work around sandbox permission errors that some package builds
(e.g. Go) run into.
Fixes errors such as:
```
Building Go cmd/dist using /usr/lib/go-bootstrap. (go1.5.3 linux/amd64)
* /var/tmp/portage/sys-apps/sandbox-2.12/work/sandbox-2.12/libsandbox/trace.c:do_peekstr():125: failure (Operation not permitted):
* ISE:do_peekstr:process_vm_readv(6863, 0x00007ffe4a502180{0x00007f01abd3e010, 0x570}, 1, 0x00007ffe4a502190{0x000000c820012a90, 0x570}, 1, 0) failed: Operation not permitted
* ERROR: dev-lang/go-1.17.8::coreos failed (compile phase):
```
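A minimal sketch of disabling the sandbox, assuming the standard Gentoo FEATURES mechanism; the emerge arguments are illustrative:
```bash
# Hypothetical sketch: turn the sandbox off for this emerge run only
FEATURES="-sandbox -usersandbox" emerge --update --deep --newuse @world
```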
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
So far the new pipeline only produced "developer" images.
Derive the channel (group) for the image from git tags like
"alpha-1234.0.0", and use the same logic when determining the channel
in the QEMU update test.
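A sketch of deriving the channel from the tag; variable names and the fallback are illustrative:
```bash
# Hypothetical sketch: derive the channel (group) from a tag like "alpha-1234.0.0"
tag="$(git describe --tags --abbrev=0)"
channel="${tag%%-*}"
case "${channel}" in
    alpha|beta|stable|lts) : ;;    # recognised release channels
    *) channel="developer" ;;      # anything else stays a developer image
esac
```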
This change introduces update_sdk_container_image, a script to generate
a new SDK container image based on an existing SDK container. The
script is meant to be used for minor / patch level SDK changes (like
test suite updates).
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
In bce3bd9031, we added support for podman
for building and running the SDK container. The presence of podman is
auto-detected in sdk_container_common.sh. However, podman is preferred
over docker, requiring users to use *sudo* (which podman requires and
docker does not).
This change uses docker when present and podman otherwise. It also
improves podman detection: 'podman' uses argv[0] in its version string,
so if 'docker' is a symlink to 'podman', the version output reports
'docker'. This broke the SDK container on hosts that have such a
symlink, since 'podman' was then run without 'sudo'.
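A sketch of the detection, assuming the real binary behind a possible symlink is resolved:
```bash
# Hypothetical sketch: prefer docker, fall back to podman, and unmask a
# "docker" that is really a symlink to podman (which needs sudo)
if command -v docker >/dev/null 2>&1; then
    if [[ "$(readlink -f "$(command -v docker)")" == *podman* ]]; then
        docker_cmd="sudo docker"
    else
        docker_cmd="docker"
    fi
elif command -v podman >/dev/null 2>&1; then
    docker_cmd="sudo podman"
fi
```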
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
sdk_entry.sh is expected to be called by the root user, so we set USER
root:root. We also add a "root" entry to passwd and group, since it
does not exist in the SDK tarball.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
The creation of the target version file failed:
/home/sdk/sdk_entry.sh: line 32: /build/amd64-usr/etc/target-version.txt: Permission denied
Use root permissions to create the file.
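A sketch of the fix; the path follows the error message above, the version variable is illustrative:
```bash
# Hypothetical sketch: let root create the file instead of the sdk user
echo "${VERSION}" | sudo tee /build/amd64-usr/etc/target-version.txt >/dev/null
```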
When the docker wrapper script for Podman is used, we need to
explicitly create a root-user container with "sudo podman".
Podman also has its own bridge for root-user containers which we need
to detect, and it requires explicitly naming the Docker Hub Caddy
image.
Add a "$docker" variable that uses "sudo podman" as needed, and also
check which bridge interface to use. The filter had to be changed
because it didn't work with Podman. Use the Docker Hub Caddy image
explicitly.
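A sketch of the three pieces; the bridge-detection filter and names are illustrative:
```bash
# Hypothetical sketch: pick the runtime, its bridge, and a fully qualified
# Caddy image name that works for both docker and podman
if docker --version 2>/dev/null | grep -qi podman; then
    docker="sudo podman"
else
    docker="docker"
fi
bridge="$(ip -o link show type bridge | awk -F': ' '{print $2; exit}')"
${docker} run -d --name caddy docker.io/library/caddy
```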
This change ensures the binpkg host is updated if the board (OS) version
differs from the SDK version.
This is to ensure /build/[arch] uses the correct binary package cache.
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
To execute the compiled binaries in /build/arm64-usr we rely on
qemu-user binfmt emulation and have to tell it where the root
filesystem is via QEMU_LD_PREFIX, because build systems don't chroot
into /build/arm64-usr themselves (running without the chroot only works
by chance on amd64, where host and target have similar glibc versions
and so on). The env var setup was done in
/etc/profile.d/qemu-aarch64.sh but is no longer read since the
container runs the shell not as a login shell.
Add the login options to the bash and su calls when starting the
container.
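For illustration, the profile script that sets the variable, and the login-shell invocations that make it get sourced again (exact call sites in the start-up scripts vary):
```bash
# /etc/profile.d/qemu-aarch64.sh -- only sourced by *login* shells
export QEMU_LD_PREFIX=/build/arm64-usr

# Hypothetical sketch of the container start-up calls with login options:
bash -l            # instead of plain "bash"
sudo su -l sdk     # instead of "sudo su sdk"
```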
This change introduces a containerised SDK as a replacement for cork SDK
operations. It also simplifies versioning by removing the need for
manifest repos as well as the use of the "repo" tool, using git
submodules for coreos-overlay and portage-stable instead.
The following feature scripts are added (see the usage sketch after the
lists):
- run_sdk_container: Run a command in an SDK container, using the
current scripts repo + ebuild submodules.
- bootstrap_sdk_container / build_sdk_container_image: Bootstrap a new
SDK and create an SDK container from the resulting SDK tarball.
The following additions have been made to SDK scripts:
- setup_board: add a --pkgdir parameter to use a custom binary package
directory.
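A hypothetical usage sketch; flags and arguments are illustrative, not the scripts' exact interfaces:
```bash
# Run a one-off build step inside the SDK container
./run_sdk_container -t ./build_packages --board=amd64-usr

# Bootstrap a fresh SDK, then package the resulting tarball as a container
./bootstrap_sdk_container
./build_sdk_container_image flatcar-sdk-amd64.tar.bz2
```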
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
The dev build SDKs are not in $FLATCAR_DEV_BUILDS/sdk but published under
$FLATCAR_DEV_BUILDS/developer/sdk.
Add an environment variable to specify where the SDK is to be found,
defaulting to $FLATCAR_DEV_BUILDS/sdk if it is not specified.
From Jenkins this variable is exported as DOWNLOAD_ROOT_SDK.
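A sketch of the defaulting logic:
```bash
# Hypothetical sketch: honour DOWNLOAD_ROOT_SDK if set, else use the default
DOWNLOAD_ROOT_SDK="${DOWNLOAD_ROOT_SDK:-${FLATCAR_DEV_BUILDS}/sdk}"
```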
We were chowning the host directory, not the one in the chroot.
Host gpg >= 2.1.13 puts the gpg-agent socket in /run/user/UID/gnupg,
which is bind-mounted into the chroot, but the SDK gpg was ignoring it
because /run/user/UID was not owned by UID. This broke tag signing with
YubiKeys.
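A sketch of the corrected chown, assuming CHROOT points at the chroot directory:
```bash
# Hypothetical sketch: chown the directory *inside* the chroot, not the
# host path, so the SDK gpg accepts the bind-mounted agent socket
sudo chown "${SUDO_UID}" "${CHROOT}/run/user/${SUDO_UID}"
```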
COREOS_BUILD_ID is set to a default value in common.sh if unset in the
environment. When entering the chroot, this default value should not be
promoted into the environment: doing so causes catalyst to reuse stale
builds and multiple build_image runs to conflict with each other.
Commit 09851b84 mistakenly didn't do a recursive bind, so if the host
system has anything mounted under the chroot directory for some reason,
the bind would hide those mounts. A recursive bind ensures existing
mounts remain exposed as they were before.
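The difference, sketched (mount points are illustrative):
```bash
# Plain bind: submounts under ${SRC} are NOT visible at the target
sudo mount --bind "${SRC}" "${CHROOT}${DST}"
# Recursive bind: existing submounts stay exposed, as they did before
sudo mount --rbind "${SRC}" "${CHROOT}${DST}"
```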
The path of $GNUPGHOME outside the chroot may not make sense inside the
chroot. Although that's probably not a big deal, there's no need to keep
the outside value; instead, just bind it to the usual spot.
When running under Jenkins, $GNUPGHOME may be located under the current
build directory instead of $HOME to avoid conflicting with other jobs
on the same build host.
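A sketch of the bind, with the inside path illustrative:
```bash
# Hypothetical sketch: whatever $GNUPGHOME points at outside, mount it at
# the usual location inside the chroot
if [[ -n "${GNUPGHOME}" ]]; then
    sudo mount --bind "${GNUPGHOME}" "${CHROOT}/home/${SDK_USER}/.gnupg"
fi
```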
The distfiles cache is always under .cache in the repo tree but there is
a lot of extra logic to make that configurable along with compatibility
symlinks for previous locations. Just yank it all out.
The version of repos.conf/coreos.conf that catalyst needs isn't valid
for normal SDK chroots and causes env-update to spew errors when it is
run prior to update_chroot which configures portage properly.
A step in reducing the amount of initialization code required: drop
needless symlinks under /usr/local/portage to the portage trees. Just
configure portage to point directly at the source instead. Only crossdev
remains in that location because it is a locally managed overlay.
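A sketch of pointing portage directly at the tree; the paths follow the usual /mnt/host/source layout but are illustrative:
```bash
# Hypothetical sketch: configure the overlay location directly instead of
# symlinking it under /usr/local/portage
sudo tee /etc/portage/repos.conf/coreos.conf >/dev/null <<'EOF'
[coreos-overlay]
location = /mnt/host/source/src/third_party/coreos-overlay
EOF
```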
This code is not applicable to us; it predates CoreOS and is a weird
thing for common.sh to be doing anyway. Instead, always define
CHROOT_TRUNK_DIR as /mnt/host/source and create ~/trunk in make_chroot.
Currently, building images on older kernels fails because mkfs.btrfs
enables an incompatible feature, 'extref', by default. We never really
made this requirement explicit, and the SDK in general has continued to
maintain compatibility with older kernels. Make the requirement explicit
so users get errors sooner and there is a clear line for what kernel
features can be used in the SDK.
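A sketch of such an explicit check; the minimum version and message are illustrative (btrfs 'extref' needs roughly kernel 3.7 or newer):
```bash
# Hypothetical sketch: fail fast with a clear message on old kernels
min_kver="3.7"
cur_kver="$(uname -r | cut -d- -f1)"
if [[ "$(printf '%s\n' "${min_kver}" "${cur_kver}" | sort -V | head -n1)" != "${min_kver}" ]]; then
    echo "Kernel ${cur_kver} is too old for the SDK; need >= ${min_kver}" >&2
    exit 1
fi
```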
Newer git ebuilds have decided that the "git-prompt" script isn't really
bash completion and stopped installing it via that mechanism. Instead
they started installing it in /usr/share/docs, which gets compressed by
default and whose path is based on the ebuild version. The path changed
again in 1.9.3 to /usr/share/git, uncompressed, which makes it actually
usable, but 1.9.3 or later isn't stable yet. We can re-enable it the
next time git gets updated; it's not worth fussing over the current
brokenness right now.
Using parallel_emerge has been disabled by default for all commands
except build_image for quite a while now; build_image kept it only
because it was still a bit faster than plain emerge. Keeping
parallel_emerge complicates future changes to build_image, so
build_image needs to drop it. Since that means nothing uses it by
default, we might as well rip out support for it entirely.
Previously /etc/os-release was installed both by set_lsb_release and
the baselayout package. Now it is only installed by set_lsb_release,
but when baselayout is upgraded it removes /etc/os-release. So the
first update_chroot works, but the second detects the chroot's version
incorrectly and tries to apply the one-time updates in this directory.
Both of those updates are very old, so we can just delete them. The
second run will now fix up /etc/os-release and we can all move on and
be happy.
The host system's PATH may not match the one required by the SDK.
When going through the enter_chroot script it gets reset because bash is
invoked as a login shell, but this doesn't happen when using the plain
old chroot command.
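A sketch of resetting PATH when bypassing enter_chroot with a plain chroot; the value shown is a typical default, for illustration:
```bash
# Hypothetical sketch: don't inherit the host PATH into the chroot
sudo chroot "${CHROOT}" /usr/bin/env \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    /bin/bash -l
```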
Fixes https://github.com/coreos/scripts/pull/290