Instead of patching portage to support the `disabled` flag now we just
patch it to leave the `[gentoo]` section out of the default repos.conf.
Follow up to 585275b268
PROD_IMAGE is a flag that indicates a production image should be
built, and will be set for dev builds if the user specifies that
both dev and prod images should be built. build_image was
incorrectly using the PROD_IMAGE variable to conditionally do some
setup depending on the image type.
Add a new variable IMAGE_BUILD_TYPE that can be tested for the type
of image currently being built and replace the PROD_IMAGE usage.
Signed-off-by: Geoff Levand <geoff@infradead.org>
A bunch of packages install PAM configuration fragments in /etc. Rather than
modify them all to install into /usr/lib, just move the entire directory at
image build time.
We need to ship some PCR measurements alongside images in order to make it
easier for admins to provide an appropriate policy. Add some tooling to
generate the appropriate hashes during build, pack those into a zip file
and upload it.
profile is already set up to source /usr/share/baselayout/profile.env
but it never has because I forgot to add this line during the migration
to amd64-usr images. Sure took us a while to notice that one... :(
This resolves two issues:
- Large dependencies are *never* built during image_to_vm,
build_packages must now handle that.
- Since build_packages can't reasonably do the oem-* packages (they all
conflict with each other) we do want to build them from the ebuild.
This is now enforced so an old binpkg is never used. This resolves
the confusion people have always had when editing oem ebuilds but
getting a stale build instead.
Allows build_image to be used without first running build_packages.
Note: setup_board --force is required before build_packages will work
properly after doing this since baselayout won't be installed otherwise.
- May be sourced early, so explicitly die if source fails.
- Add a function for getting the latest version of a package.
- Read PROVIDES metadata using portageq, enabling data to be read from
binary packages in addition to installed packages. The performance
cost is not a concern here and this is needed to support empty build roots.
Most vm images have an expanded root partition to make them practical to
use as-is. Some deployments may not want such a large root, putting most
storage on other volumes.
This variable was semi-deprecated ages ago so `version.txt` could follow
a similar variable naming pattern to `os-release`. Finally drop usage of
it here in favor of `$COREOS_VERSION`.
The one-liner `[[ -z ${PIPESTATUS[*]#0} ]]` no longer works because the
expansion still includes spaces even if all the values are zero. Somehow
that didn't matter in bash 4.2 but it does matter in 4.3 to be consistent
with the general behavior of variables in [[ tests.
The console often contains very useful information in the event of a
hard crash. In such situations there's no ability to unblank the console
via keypress because the kernel won't handle the interrupt.
Since CoreOS is a server/cluster operating system, there won't generally
be monitors connected benefitting from a blanked console. Disabling the
blanking altogether allows the frame buffer contents to always be
visible, even when the kernel can't handle keypresses.
coreos.first_boot=1 will no longer trigger disk-guid randomization, so
manual ignition triggers in diskless/pxe scenarios may succeed. Instead
we explicitly request the randomization when first_boot=1 was added by
grub finding the 00000000-0000-0000-0000-000000000001 disk-guid.
In order to boot properly we need `rootflags=rw mount.usrflags=ro` on
the command line. These have been built into the kernel directly but for
arm64 builds the built in options seem to be ignored.
Add the necessary variables in grub.cfg and populate the EFI
partition with arm64 efi executable and modules.
Signed-off-by: Andrej Rosano <andrej@inversepath.com>
Now that ccache is turned on by default in the profile portage complains
a lot if ccache isn't actually installed, sleeping 5 seconds for each
error message. Since pkgcache is in use ccache isn't going to make that
much of a difference but getting rid of those 5 second sleeps will. :)
Failing to explicitly set the selinux policy store to operate on may
result in semodule installing the policy in an incorrect location. Pass
it on the command line in order to avoid this.
ldconfig does not work for non-native arches. Create a new
build_image routine run_ldconfig that uses qemu user emulation
to run the board ldconfig on the board rootfs when the board and
SDK arches are different.
See: http://code.google.com/p/chromium/issues/detail?id=378377
Prior to calling run_ldconfig the board rootfs must have ldconfig
installed. To arrange this move the call of run_ldconfig to after
the base package install.
Fixes build_image errors like these when building for arm64:
/sbin/ldconfig: /lib64/libXXX is for unknown machine 183.
Signed-off-by: Geoff Levand <geoff@infradead.org>
This reverts commit 39bb800f16.
This change disabled a number of features so it isn't suitable for the
generic VMware templates. We need to re-trace our steps to list exactly
what tools/systems weren't accepting the linux26 type.
The new python script check_root uses data that portage already
maintains on what shared libraries packages need or provide instead of
re-scanning whatever ELF files that can be found. This is much more
comprehensive but there is a bit of a transition issue for folks with
long-lived SDKs: packages built with portage older than 2.2.18 do not
include this data. As such for now the check is non-fatal and provides a
command you can use to refresh locally installed packages.
The code checking for conflicts between top level directories and /usr
has also been rewritten. Both tests now are considerably faster.
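For anyone curious what the check boils down to, here is a minimal
sketch, not the actual check_root code. It assumes the per-package
soname data has already been loaded into plain sets; the dict layout,
the `find_missing_libs` name and the `app-misc/example` package are
made up for illustration.

```python
def find_missing_libs(packages):
    """Map each package to the sonames it needs that nothing provides.

    `packages` maps a package name to a dict with hypothetical keys
    'provides' and 'needs', each a set of soname strings.
    """
    provided = set()
    for info in packages.values():
        provided.update(info.get("provides", ()))

    missing = {}
    for name, info in packages.items():
        unmet = sorted(set(info.get("needs", ())) - provided)
        if unmet:
            missing[name] = unmet
    return missing


packages = {
    "sys-libs/glibc": {"provides": {"libc.so.6"}, "needs": set()},
    "app-arch/bzip2": {"provides": {"libbz2.so.1"}, "needs": {"libc.so.6"}},
    # Hypothetical package with a dependency nobody provides.
    "app-misc/example": {"provides": set(),
                         "needs": {"libc.so.6", "libmissing.so.1"}},
}
print(find_missing_libs(packages))
```

A package built with an older portage would simply have no data to
load, which is why the real check can only warn for now.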
SDK tarballs have a .DIGESTS file but it is created by catalyst instead
of the upload_image function. In order to support plain GPG signing but
avoid re-generating .DIGESTS we need to move that code out of
upload_image to a new function. upload_files shouldn't do it itself
because it is also used for portage binary packages which shouldn't be
signed (there is no point, nothing would verify the signatures).
The grub configuration needs some updates to handle dealing with booting
the kernel from the ESP rather than from inside the image. We also want to
be able to avoid dealing with signing the config file, so build it into the
binary. Finally, rather than having to cope with signing grub modules, build
the ones we need to boot into the grub image.
Once we're signing the root filesystem, we're not going to be able to boot
the kernel from there. Copy the kernel out to the EFI System Partition and
sign it.
Add qemu_uefi_secure target for building Secure Boot images. These are
identical to qemu_uefi images with the exception that the test keys have
been installed into the flash image, enabling Secure Boot by default. In
addition, sign the grub binary with the test keys during build when
producing unofficial images.
Adds CROSS_PROFILES, BOARD_CHOSTS, and BOARD_PROFILES definitions to support a
generic arm64-usr board.
get_portage_arch() is updated to convert aarch64 correctly.
Signed-off-by: Geoff Levand <geoff@infradead.org>
This is required for the eventual removal of `$PORTDIR` and
`$PORTDIR_OVERLAY` and ensures toolchain rebuilds/updates with
`./build_packages --nousepkg` don't erroneously try to use ebuilds from
`/usr/portage` inside of the SDK.
In order to fix up the build_toolchains script the crossdev overlay
needs to be set up properly; previously only setup_board did it.
Overall silences a lot of warnings and fixes an issue with crossdev:
/usr/bin/emerge-wrapper: line 48: /eclass/toolchain-funcs.eclass: No such file or directory
/usr/bin/emerge-wrapper: line 49: tc-arch: command not found
The portage CBUILD and HOSTCC variables need to be set to the SDK host to get
a proper cross build when building target binaries.
Change _configure_sysroot to use the CBUILD environment variable to set the
CBUILD and HOSTCC variables of ${ROOT}/etc/portage/make.conf. Also, fix up all
calls to _configure_sysroot to set the CBUILD environment variable.
Fixes setup_board failure when the host and target architectures differ.
Signed-off-by: Geoff Levand <geoff@infradead.org>
[marineam: fixed a copy/paste error]
Previously fsck output was suppressed to reduce the amount of noise in
build logs on the assumption that fsck really shouldn't have a reason to
fail. The filesystem is freshly created after all. However some users
have reported that fsck is failing but without error messages we don't
know why.
There isn't a sane way for users to know the licenses of individual
packages in built CoreOS images. The information is hidden
away back in the original ebuilds. This extends our existing package
list with a new file that also includes licenses:
```
app-admin/flannel-0.3.0-r3::coreos Apache-2.0
app-admin/fleet-0.9.1::coreos Apache-2.0
app-admin/locksmith-0.2.3::coreos Apache-2.0
app-admin/sdnotify-proxy-0.1.0::coreos Apache-2.0
app-admin/sudo-1.8.10_p2::portage-stable ISC BSD
app-admin/toolbox-0.0.0-r4::coreos Apache-2.0
app-arch/bzip2-1.0.6-r6::portage-stable BZIP2
app-arch/gzip-1.5::portage-stable GPL-3
app-arch/tar-1.27.1-r2::portage-stable GPL-3+
...
```
"./build_image prod" can already select the package that pulls in
everything built into an image via the --base_pkg command line flag.
This adds an equivalent option for "./build_image dev": a new
--base_dev_pkg flag that passes a package name into create_dev_img() in
dev_image_util.sh the same way --base_pkg is passed into
create_prod_image() in prod_image_util.sh.
This changes the default 'bytes-per-inode' ratio from 16K to 4K,
the block size. To prevent this from wasting too much space change the
inode size from the default 256 to the minimum size, 128. Larger inodes
are used to store extended attributes more efficiently but since we do
not use SELinux the majority of files do not have security attributes.
These defaults may be modified via the new `bytes_per_inode` and
`inode_size` options.
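To put rough numbers on the trade-off, the sketch below estimates the
space taken by inodes for a given filesystem size. The function name
and the 1 GiB example size are made up for illustration, and real
mke2fs rounds inode counts per block group so actual figures will
differ slightly.

```python
def inode_table_bytes(fs_bytes, bytes_per_inode, inode_size):
    """Approximate space reserved for inodes (ignores per-group rounding)."""
    inode_count = fs_bytes // bytes_per_inode
    return inode_count * inode_size


GIB = 1024 ** 3
MIB = 1024 ** 2

old = inode_table_bytes(GIB, 16 * 1024, 256)  # 16K ratio, 256 byte inodes
new = inode_table_bytes(GIB, 4 * 1024, 128)   # 4K ratio, 128 byte inodes
big = inode_table_bytes(GIB, 4 * 1024, 256)   # 4K ratio, inode size unchanged

print(old // MIB, new // MIB, big // MIB)  # prints "16 32 64"
```

So under these assumptions quadrupling the inode count only doubles the
overhead once the inode size is halved.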
Fix parsing the following output:
[ebuild N ] dev-libs/gmp-5.1.3-r1 to /usr/x86_64-cros-linux-gnu/
[ebuild UD] sys-libs/timezone-data-2013d [2014i-r1] to /usr/x86_64-cros-linux-gnu/
The previous regex did not account for upgrades and got confused by the
`[2014i-r1]` listing and gobbled up too much of the string. I am not
sure *why* portage is reporting an upgrade when --emptytree is also used
but there it is. Match all not-] characters instead.
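A minimal sketch of the idea, not the exact pattern used in the
scripts: the `EBUILD_RE` and `parse_package` names are hypothetical and
only show how matching "everything that is not `]`" stops at the end of
the status block instead of running on to the old-version listing.

```python
import re

# Skip the "[ebuild ...]" status block by matching every character that
# is not "]", then capture the package atom that follows it.
EBUILD_RE = re.compile(r"^\[ebuild[^\]]*\]\s+(\S+)")


def parse_package(line):
    """Return the package atom from an emerge output line, or None."""
    match = EBUILD_RE.match(line)
    return match.group(1) if match else None


lines = [
    "[ebuild N ] dev-libs/gmp-5.1.3-r1 to /usr/x86_64-cros-linux-gnu/",
    "[ebuild UD] sys-libs/timezone-data-2013d [2014i-r1] to /usr/x86_64-cros-linux-gnu/",
]
for line in lines:
    print(parse_package(line))
```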
ccache is causing issues in other builds so disable it
everywhere to be safe. Disable the autoresume feature because our build
process doesn't actually make use of it.
Adding the update step appears to break permissions on the distfiles
directory. Ensure the portage user is correct and set the permissions on
directories it needs to write to in advance.
When bootstrapping a SDK we need to update GCC dependencies to ensure
the GCC built for stage1 is linked against the same library versions as
those that are included in the stage1. Without this updating the mpc
library just results in a broken stage1.
Probing all filesystem types on all block devices appears to hang
booting Amazon EC2 HVM instances. The console output is unreliably
buffered so there is no information on what the failure actually is. On
the up side we can work around it easily by only searching the GPT which
appears to be safe.
To aid testing things under Xen it helps to have a machine locally that
actually runs Xen! This isn't a particularly great setup but it works
well enough to simplify my own testing. Must be used with a developer
image and packages built with `USE=vm-testing` set to include the Xen
userspace tools.
This uses our new GRUB2 features to handle GPT priority partition
selection, terminal selection, OEM tweaks, etc. The old SYSLINUX and
PV-GRUB configs are now unused except for maintaining compatibility
with older installs. Of the old configs only the ones that
coreos-postinst copies are needed. The new setup supports using GRUB2
under Xen, giving us automatic fallback support on all of our platforms
for the very first time!
Since grub.cfg is copied into place instead of generated, build_image's
--boot_args option is no longer supported. It could be re-added later
with some sed goo but for now it is easy enough to just edit grub.cfg.
Mark the initial copy of CoreOS as 'successful' and with a non-zero
priority. Required to boot with a stricter interpretation of the
partition selection scheme which ignores partitions that have a priority
of zero. The new grub implementation follows this rule and is what the
original ChromeOS spec used too.
For the sake of completeness if multiple partitions are configured in
the json file with this feature they will be prioritized in disk-order.
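The stricter selection rule can be sketched roughly as follows. This
is an illustration rather than the grub implementation: the dict keys
are hypothetical stand-ins for the GPT priority/tries/successful
attributes, and breaking ties by disk order is my assumption.

```python
def pick_boot_partition(partitions):
    """Pick the partition to boot, skipping any with priority zero.

    `partitions` is a list in disk order; each entry is a dict with
    hypothetical keys 'label', 'priority', 'tries' and 'successful'.
    """
    candidates = [
        p for p in partitions
        if p["priority"] > 0 and (p["successful"] or p["tries"] > 0)
    ]
    if not candidates:
        return None
    # max() returns the first maximal entry, so ties go to disk order.
    return max(candidates, key=lambda p: p["priority"])


fresh_install = [
    {"label": "USR-A", "priority": 1, "tries": 0, "successful": True},
    {"label": "USR-B", "priority": 0, "tries": 0, "successful": False},
]
print(pick_boot_partition(fresh_install)["label"])  # prints "USR-A"
```

With the old all-zero attributes this function would return None,
which is the failure this change avoids.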
The VHD format actually uses 2MB blocks internally so the 1MB alignment
used in e77e4e54 wasn't sufficient to prevent other tools from further
adjusting the image size to align it. Additionally a 1MB alignment may
be triggering a bug in OpenStack or XenServer disk resizing that renders
that partial block at the end of the old image size unmapped/unavailable.
So far the default iteration order of python dicts has mostly matched
the order that we want the partitions on disk but this is not always the
case. I caught the BIOS-BOOT partition being ordered on disk after the
USR-A partition. Nothing bad came of this but consistency is good.
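A small sketch of what explicit ordering looks like, assuming the
layout is a dict keyed by partition number strings as loaded from
JSON. The sample data and the helper name are illustrative only.

```python
# Deliberately listed out of order to mimic arbitrary dict iteration.
layout = {
    "3": {"label": "USR-A"},
    "1": {"label": "EFI-SYSTEM"},
    "2": {"label": "BIOS-BOOT"},
}


def partitions_in_disk_order(parts):
    """Return (number, partition) pairs sorted by numeric partition number."""
    # Convert to int so that "10" sorts after "2" rather than before it.
    return sorted(parts.items(), key=lambda item: int(item[0]))


for num, part in partitions_in_disk_order(layout):
    print(num, part["label"])
```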
The new disk size alignment left too much extra space at the end of the
disk which would lead to pointless resizing on first boot. Fill in the
extra space so that no more than 1MB is left unused.
The VHD disk format internally includes CHS addressing and qemu-img
respectfully aligns disk images to the common 16 heads 63 sectors
geometry when possible. This is unfortunate since images uploaded to
Azure must also be aligned to 1MB as we normally do.
Since qemu-img doesn't have a way to handle this well right now adjust
our existing alignment logic to create disk images aligned to both.
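One way to satisfy both constraints at once is to round up to their
least common multiple. The sketch below assumes 512 byte sectors and
is only an illustration of the arithmetic, not the actual alignment
code; the names and the 9 GiB example size are hypothetical.

```python
from math import gcd

SECTOR = 512
MIB = 1024 * 1024
CHS_CYLINDER = 16 * 63 * SECTOR  # 16 heads x 63 sectors per track


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def align_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return -(-size // alignment) * alignment


BOTH = lcm(CHS_CYLINDER, MIB)  # works out to 63 MiB

size = 9 * 1024 * MIB  # hypothetical 9 GiB disk
aligned = align_up(size, BOTH)
print(BOTH // MIB, aligned // MIB)  # prints "63 9261"
```

Because 63 MiB is a fairly coarse step this can add a noticeable amount
of padding to the end of the disk.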
I am unsure exactly what situation is causing the loopback partition
device node to not exist when it is being mounted but this should help
work around the situation and log loudly about it so we can hopefully
figure out where to dig further.
Version 4 is too low. Some VMware products even crash trying to
upgrade it to a greater version (VMware Fusion 6 Pro). Having at
least 7 will allow us to use some modern features in most VMware
products, such as enabling vmxnet3 virtual network adapters or adding
much more memory and cpu cores to virtual machines.
Pruning files via INSTALL_MASK in the profile is a bit more appropriate
since it allows us to keep most of that info in one place. The only
parts that need to be deleted or adjusted here are inputs and outputs of
`env-update` which has to be run after everything is installed.
Previously we didn't actually clean up `env.d` at all which led at
least one user to think they should edit those files and run
`env-update` themselves but we don't ship that tool on prod images.
This sets the IMG_FORCE_OEM_PACKAGE variable to the supplied string. If a
':' is present, what follows it gets put in the IMG_FORCE_OEM_USE variable
and what precedes it stays in IMG_FORCE_OEM_PACKAGE.
_get_vm_opt() has been modified to generally support forced overrides such
as this one, simply set variables named IMG_FORCE_$opt.
Now you can do things like:
    for fmt in cloudstack \
               digitalocean \
               ec2-compat:ec2 \
               ec2-compat:openstack \
               ec2-compat:brightbox \
               exoscale \
               gce \
               hyperv \
               rackspace \
               rackspace-onmetal; do
        ./image_to_vm.sh --format=qemu --oem_pkg=$fmt
        ../build/images/amd64-usr/latest/coreos_developer_qemu.sh -curses
    done
rather than having to modify build_library/vm_image_util.sh to test oem
builds in qemu.
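The `package:use` split described above amounts to the following,
shown here as a Python sketch of the behavior rather than the shell
code in image_to_vm.sh. The function name is made up; the two return
values stand in for IMG_FORCE_OEM_PACKAGE and IMG_FORCE_OEM_USE.

```python
def parse_oem_pkg(value):
    """Split 'package[:use]' into (package, use).

    `use` is an empty string when no ':' is present.
    """
    package, _, use = value.partition(":")
    return package, use


print(parse_oem_pkg("ec2-compat:openstack"))  # ('ec2-compat', 'openstack')
print(parse_oem_pkg("gce"))                   # ('gce', '')
```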
My primary use case for this flag is to fix booting with UEFI firmware
which can have problems when mixed with KVM, adding kexec into the mix
doesn't help matters either. The current version of OVMF can boot from
virtio drives just fine so that is now enabled and KVM is disabled.
So the -s option can also mean sloooooooow but boots!
The new grub install script must be called after the image is unmounted
and the old bootloaders script doesn't need to touch grub at all. For
now we will continue to use the existing syslinux configs but
interpreted by grub. Beyond the grub menu flashing by during boot
everything should still be functionally equivalent.
This script replaces the standard grub-install tool to give us some more
control over what is going on and ensure grub-install's auto-detection
magic doesn't make any incorrect choices. Also this script sets up a
loopback device and mounts the EFI partition in just the right way for
grub-bios-setup's auto-detection magic to work correctly.
I've chosen not to adapt disk_util to use partitioned loop devices to
make grub happy because ensuring loop devices get cleaned up properly
for the general case gets tricky and less robust.
Passing ROOT= as an environment variable to board wrapper scripts
doesn't work; the script unconditionally overrides it. This means so far
our packages.txt files have listed the contents of /build/amd64-usr
instead of the image. Fix this by calling equery directly instead.
Not currently used, this configuration which sets up grub to re-use the
syslinux configuration only works with recent git versions, not any
releases. Compatibility is also limited because the serial configuration
in syslinux must be duplicated in the grub config.
We don't need to do anything like manually install the MBR boot code
for grub but we do need to continue to expose the ESP partition as a
hybrid partition to support pvgrub.
Calling cgpt create when resizing zeros the MBR boot code. This worked
with the syslinux setup because the boot code was re-written. When not
using syslinux it is easier to just preserve the existing MBR instead.
Unlike SYSLINUX, GRUB2 does not recommend embedding itself in a FAT
filesystem. Instead GRUB2 prefers embedding in the space between the MBR
and first partition or using a dedicated partition that is safe from
tampering by fs utilities. In our case the space after the MBR is where
the GPT lives so we need to use the extra partition scheme instead.
The 64MB "BOOT-B" partition has never been used so we can replace it
with a 2MB partition which is more than enough for GRUB.
We have long since stopped installing anything to the /boot directory of
the root filesystem. Mount the ESP partition to /boot for consistency
with the discoverable partition spec.
Normally GCC is installed in a way that allows installing multiple
versions and switching between them. Our production images do not need
this and additionally the only things from the GCC package that are
needed are the shared libraries. To ensure these libraries are *always*
locatable regardless of the presence of /etc/ld.so.conf and
/etc/ld.so.cache we can install those libraries to plain old /usr/lib.
The GCC packages don't have a built in way to do this but we can get
away with extracting the libraries directly from the binary package.
This is actually similar to what ChromeOS did with a few exceptions:
- We use a native GCC build instead of the cross toolchain
- The archive is properly extracted from the package instead of feeding
the package directly to tar and ignoring the resulting warnings.
As an added benefit switching from a blacklist to a whitelist ensures
that extra cruft does not slip through the cracks, saving 5-10MB.
Create profile as a real directory instead of a symlink to the board
root's configuration. Normally the board root does not modify this but
it is useful for build_image to use it to modify package.provided.
Normally Gentoo expects moving between major GCC releases to be a manual
step. In our case we want this to always be automatic, otherwise the GCC
version won't be switched at all.
Apparently expanding an empty string before a variable assignment forces
that assignment to be interpreted as a command instead. Instead of an
empty string use env as our sudo alternative when running as root.
Needed for portage 2.2. Sync URIs are included but not very useful yet
because portage only can do `git pull` but not `git clone`. An extra
helper script will be required to do the initial clone it seems.
Binary packages may be useful for re-installing a package with a
different INSTALL_MASK. Can be used to install debug symbols.
Instead of gluing in a special PROD_INSTALL_MASK for all images use
profiles to configure the differences between the base build root,
production images, and developer images. This offers much more
flexibility and is needed for providing a full dev environment in
developer images.
Using parallel_emerge has been disabled by default for all commands
except build_image for quite a while now, build_image kept it just
because it was still a bit faster than normal emerge. Keeping
parallel_emerge complicates future changes to build_image so it needs to
drop it entirely. Since that means nothing uses it by default we might
as well just rip out support for it entirely.
VHD was just for testing, raw is more useful for published images.
coreos-install will now be able to install working xen instances:
coreos-install -d /dev/xvda -o xen -c cloud-config.yml
Missed this in 7231b95a, the update zip should still be built when the
usr partition is extracted for generating updates but build_image itself
is not generating and signing the update.
The current generate_update function is now less useful, the important
part that we need is just the partition image now. Also by defaulting to
extracting the partition the old cors_generate_update which is still in
use by devserver can be removed entirely, devserver will just expect the
extracted partition image instead.
Attempting to work around an apparent race in mtools, the command
'extlinux' these days is just the install tool for mounted partitions
while 'syslinux' is for unmounted devices.
Evaluating this as a user config causes it to block on
coreos-environment-setup.service which will wait on networking. This
makes it hard to add extra tricks for testing/debugging situations where
networking is failing. For example, to trigger dhcpcd if networkd dies:
    #cloud-config
    write_files:
      - path: /etc/systemd/system/systemd-networkd.service.d/dhcpcd.conf
        content: |
          [Unit]
          OnFailure=dhcpcd.service
          [Service]
          Restart=no
The new Update() performs the same tasks as the old Resize()
in addition to formatting previously-unformatted partitions. This
allows children disk-layouts to repartition the base layout in
addition to resizing.
I started to move board files under a boards/ directory similar to how
the SDK is under sdk/ but didn't do so everywhere. This should finish
the job so everything is consistent now.
Note: This prefix is only used in developer and buildbot uploads. When
final releases are copied to $channel.release.core-os.net it doesn't use
the prefix since a) I already published urls without the prefix and b)
no sdk files are ever posted to the public release locations.
btrfs isn't designed for small volumes and can run out of space sooner
than one would expect in our current setup, particularly with docker.
To try to improve the situation always create the filesystem initially
as 2GB instead of 512MB using the default settings: metadata is
duplicated, data is single, not mixed. The mixed setting may have been
partly why our performance can be so poor. For the default vm layout
use 6GB instead of 3GB, about what we use for EC2.
Since the new bucket scheme uploads images to a private staging area
first we need to configure the final location to generate vagrant json
metadata correctly.
- Automated builds drop SDK and binary packages into
gs://builds.developer.core-os.net/ and the new download URL is
http://builds.developer.core-os.net/ (COREOS_DEV_BUILDS)
- Change default upload path to gs://users.developer.core-os.net/ for
misc developer builds. Official builds go elsewhere and will just be
configured in buildbot/jenkins so some COREOS_OFFICIAL stuff is gone.
- Automated builds of images go to a private bucket,
gs://builds.release.core-os.net which later gets copied to
gs://alpha.release.core-os.net and friends by core_promote.
The new --developer_data option can be used to specify a path to a cloud
config to bundle into the image. If none is provided but a shared user
password (for core) is set then generate a config to set that password.
This lets us use the same mechanism for setting the default password for
both disk and PXE images.
This image type is the same as the developer image except that it is a
single root filesystem and is bootable via systemd-nspawn. This may
become obsolete eventually when it becomes possible to boot the normal
disk images under nspawn but it is useful for testing until then.
The partition type is defined by the Discoverable Partitions Spec.
http://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/
To make it possible to plop a CoreOS install into a simple
single-filesystem image for use as a container some things like
configuring bootloaders need to be skipped.
To behave more like setup_board/build_packages update_chroot should
fully configure portage to make sure everything is accurate.
Now binhosts are defined in make.conf.host_setup so the static config in
coreos-overlays doesn't need to refer to version.txt. setup_board
already made this change in 7a43a07f.
Define path locations to reduce dependency between static configs in
coreos-overlays and the behavior of the scripts repo. Spreading
configuration across two repos makes everything harder to understand.
Eventually everything should either be defined in profiles in
coreos-overlays or minimal auto-generated config files here in scripts.
Use what was the base image build function as setup/finalize steps in
the dev and prod build functions. This eliminates duplicate code
that mounted and unmounted the filesystem images.
We need some more control over exactly what lands in dev vs prod images
which will require letting them diverge in what is currently the common
base image step. There isn't any real need for the base image in the
first place other than to speed up building both dev and prod images at
the same time but that isn't common enough to worry about.
As part of this cleanup also remove references to CHROMEOS_* variables
and the recovery image that never actually existed in CoreOS.
For generating images for groups other than the one given to build_image
run this script along with the usual image_to_vm.sh commands. To avoid
ambiguity with the 'latest' symlink, this script creates $group-latest
symlinks instead. build_image creates the new symlink too.
Only the key is needed, and currently the vagrant OEM is completely
broken outside of vagrant. This gets vmware_insecure images back into
the state that they were before cloud config came along. :)
This adds two new optional build steps. The first user of these is the
vagrant images but many of the targets can be simplified now.
- fs_hook: Anything that needs to happen before unmounting the image.
This happens after the OEM is installed but before disk images are
made. It can be used to copy any data out of the image.
- bundle_format: Many VM types ship as some sort of archive format
rather than plain disk images as this script originally assumed.
Adding this final step lets us stop using the conf step awkwardly.
Vagrant now ships with a Vagrantfile and related code included in the
OEM package. This lets us version our vagrant-side code along with the
images themselves as well as making the coreos-vagrant repo optional again.
The coreos-vagrant code will still be useful for handling the fancier
cluster configuration stuff but no longer has to carry the plugin code.
This should make it less difficult for people to add kernel options for
debugging. Without a prompt/timeout the user must be holding down space
or some other key while syslinux loads but it may not be possible for
the user to provide input quite that fast. Only a half second to
avoid needlessly increasing boot times in the common case.
Using the classic mbr.bin was only needed during the transition from
syslinux 3 to 6 because the behavior of gptmbr.bin changed after 3.
Now that the transition is done and cgpt supports the new scheme it
is time we switched back. This avoids depending on using a hybrid MBR.
The .DIGESTS format is clunky and annoying. It also requires users to
perform two steps to verify images using GPG. Instead support signing
all files directly so there is no need for .DIGESTS.
The old DIGESTS code will remain in place for now but after a few
releases I plan on deleting it.
The use of getopts was leading to conflicts between this script's short
options and qemu's long options. For example -serial was getting
interpreted as -s -- erial which is not very helpful.
We added a new https certificate on the new update service and changed
the hostname to be consistent with all of the other endpoints. Update
the new images to use this.
The old URL http://public.roller.core-os.net will remain working until
all of the old clients have been updated.
Installing to a temporary directory and then copying over the final
contents of /usr/share/oem allows more complicated OEM packages such as
python to be configured with --prefix=/usr/share/oem while previously
the atypical use of ROOT=/usr/share/oem would have complicated things.
This can be used by update_engine as a quick test to determine if it is
running on a system that it can handle. This avoids needing something
like the 'coreos.diskless' kernel command line flag.
If QEMU is given a uuid systemd will detect that and in turn use it for
the machine-id. This made the bug causing the machine-id to be always
re-generated on boot harder to notice since it didn't happen on QEMU.
Taking a bit of a new approach to booting PXE images here for both
amd64-generic and amd64-usr. Instead of requiring the user to specify
squashfs and tmpfs on the kernel command line we can simply provide
defaults in the initrd's fstab.
Now with SYSLINUX 6 we can use the same bootloader on EFI and BIOS
systems. This replaces our previous reliance on building default kernel
options into the kernel image itself.
- Remove custom COREOS_* attributes from /etc/lsb-release
- Move dev image logic to dev_image_util
For extra fun fix detection of local host URL for devserver.
- Remove weirdly verbose "DESCRIPTION" format.
- Add COREOS_RELEASE_BOARD back to /usr/share/coreos/release
This is mostly just so update_engine and gmerge report the correct
board name to devserver, informative-only on prod images.
- Remove version info from /etc/gentoo-release
- Switch from 'track' to 'group' terminology.
Fix the problem of: "Specified switch root path %s does not seem to be an
OS tree. /etc/os-release is missing" because dracut doesn't have an
/usr/share/coreos/os-release file.
cgpt now supports generating hybrid MBRs and the classic style mbr.bin
from any version of SYSLINUX should work the same with the hybrid MBR.
The other code, gptmbr.bin, changed after SYSLINUX 3. Switching lets me
play with different versions of SYSLINUX without breaking everything.
With this change all images feature a hybrid MBR so the special case for
some VM platforms has been removed.
This doesn't make things go faster and I am suspicious that it makes
things worse. For example /etc/xml/catalog winds up empty from time to
time and I wonder if this locking is related to that.
While attempting to fix the easy to mix up DIGESTS names in feb59db9f I
stumbled across yet another way that the DIGESTS names were a bit
unpredictable: previously a .bz2 got stuck into the file name when
upload_images automatically compressed some file types. The new code
missed this and never added the .bz2. Correct this now, both for
image_to_vm which has a pile of glue to keep the legacy names in place
for now and build_image which I never intended to change.
Vagrant reads this file to determine that we are CoreOS... so let's not
break that just yet. A PR to switch to os-release has been posted:
https://github.com/mitchellh/vagrant/pull/2985
Some day gentoo-release will be dropped but that day is not today.
Make it possible for other scripts to share the same value for our
release repository and equally easy to override with a custom value.
Also allow setting the root from the command line in addition to the
environment. Usually --upload_root is better to use than --upload_path.
Switch from naming DIGESTS based on disk image name to a common prefix.
old: coreos_production_qemu_image.img.DIGESTS ->
new: coreos_production_qemu.DIGESTS
The old behavior wasn't very consistent since plain disk images aren't
used by all types and the code implementing that was easy to break,
namely by mistake coreos_production_pxe_image.cpio.gz.DIGESTS became
coreos_production_pxe.vmlinuz.DIGESTS a couple releases ago.
The old names will continue to exist as well for the time being to avoid
breaking existing install/download scripts and the original pxe DIGESTS
name is back.
For multi-file uploads we should explicitly declare what the name of the
.DIGESTS file should be instead of using the first file name. Relying on
the ordering was subtle and easy to break.
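
A minimal sketch of the idea, not the actual upload code: the helper
below (write_digests is a hypothetical name, and the one-line-per-file
sha512 format is an assumption) takes the DIGESTS path as an explicit
argument, so the output name never depends on file ordering.

```python
import hashlib
import os


def write_digests(paths, digests_path):
    """Write one '<sha512>  <file name>' line per input file.

    The caller names digests_path explicitly, so the result does not
    depend on which file happens to come first in paths.
    """
    lines = []
    for path in paths:
        with open(path, "rb") as f:
            digest = hashlib.sha512(f.read()).hexdigest()
        lines.append(f"{digest}  {os.path.basename(path)}\n")
    with open(digests_path, "w") as out:
        out.writelines(lines)
    return digests_path
```

A caller uploading the pxe kernel and initrd together would pass both
paths plus a shared name such as coreos_production_pxe.DIGESTS.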
To avoid having to sync directory creation and tmpfiles installed by
ebuilds just require ebuilds to do `keepdir /etc/path` and this script
will handle the rest. A possible extension would be to update
gen_tmpfiles.py to handle symlinks in addition to directories.
The funky UUID and other special settings should only be applied to
coreos-rootfs and coreos-usr partitions which will never be fscked. When
STATE becomes ROOT in -usr images it gets fscked while mounted read-only
and fsck updates the filesystem's UUID if it is blank. Turns out this
causes disagreement between the kernel and the disk leading to bad
things. A related issue was fixed in a newer version of tune2fs but
unless I missed it the same bugfix didn't make it into e2fsck so
updating wouldn't resolve the issue.
http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.42.9
Check if the disk layout is a /usr layout and if so hack the USR-A
partition, not whatever is mounted to /. Also use the new functionality
in disk_util for this as it can look up partitions by label.
The basic infrastructure to support this is now in place. Add a new
board that uses the experimental coreos/amd64/usr profile /usr based
disk layouts. This is just enough to successfully build images, they
aren't bootable yet.
The boot kernel parameters change depending on whether the new /usr
scheme is in use. Pass the disk layout to the bootloader config script
and adjust generated configs accordingly.
Nothing from chromeos-common.sh is needed for image building now. Also
kill off build_common.sh which was just a weird way of sourcing
common.sh. The two piddly functions it provided fit better in
build_image_util.sh
Now disk_util is aware of the weird ext2 read-only hack, both by
providing a command to manipulate it and support in the mount command to
automatically set the 'ro' mount option for filesystems with it.
Making mount aware of the hack makes it much easier to mount prod
images with a mix of read-only and read-write filesystems.
write_gpt --update <img> will read an existing image and make sure all
existing partitions will not get moved or truncated in the new layout.
This is mostly useful for resizing the final partition or just rewriting
metadata like partition types and labels.
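
The safety check described above could look roughly like this. It is a
sketch only: check_update and the {number: (first_block, block_count)}
mapping are hypothetical, and treating a removed partition as an error
is an assumption, not something disk_util is known to do.

```python
def check_update(old, new):
    """Return a list of problems; an empty list means the update is safe.

    Both arguments map partition number -> (first_block, block_count).
    """
    problems = []
    for num, (old_start, old_count) in sorted(old.items()):
        if num not in new:
            # Assumption: dropping an existing partition is also rejected.
            problems.append(f"partition {num} removed")
            continue
        new_start, new_count = new[num]
        if new_start != old_start:
            problems.append(f"partition {num} moved")
        if new_count < old_count:
            problems.append(f"partition {num} truncated")
    return problems
```

Growing the final partition passes this check because only the block
count increases while the start stays put.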
cros_make_image_bootable now only is relevant for prod images, so move
the remaining code to prod_image_util in a similar scheme that base and
dev images use.
Lots of things are either unused or meaningless. A particularly creative
one is the fact that there are command line flags for mount point
locations that are then overwritten.
The verification flag was being passed through to the bootloader
template script but no longer had any effect.
Force the base image to always remain writable, its only purpose is to
be modified in a later build step anyway.
Merge GetPartitionTable and partition alignment from WritePartitionTable
into LoadPartitionConfig so that all this config manipulation code is in
one place and inheritance from the 'base' layout is more predictable.
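
As an illustration of inheriting from the 'base' layout, here is a
sketch under assumed data: the {layout: {partition number: settings}}
shape and the resolve_layout name are hypothetical, not the real
LoadPartitionConfig.

```python
import copy


def resolve_layout(layouts, name):
    """Start from 'base' and apply the named layout's overrides on top."""
    merged = copy.deepcopy(layouts["base"])
    if name != "base":
        for num, overrides in layouts[name].items():
            # Per-partition settings from the named layout win over base.
            merged.setdefault(num, {}).update(overrides)
    return merged
```

Doing the merge in one function means every caller sees the same
resolved values, whichever layout type was requested.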
This isn't a feature we've been using as far as I know and if someone
needs a custom partition layout it's probably better to just add it to
the json file. Removing this avoids some complexity.
Move from optparse to argparse. Move layout file and layout type to
global options with reasonable default values so every command doesn't
need to specify them. Adjust calling scripts to match.
For now layout type is being passed via an environment variable
DISK_LAYOUT_TYPE but this is a temporary situation.
Now uses the package database instead of filesystem so the check works
even if /bin and friends are symlinks to /usr. Also disable the
whitelist and check that the expected symlinks are correct if the
symlink-usr USE flag is enabled.
When calling update_chroot with --usepkg --nogetbinpkg the default
emerge command line will force binary packages for the toolchain but if
the packages are not available locally building via crossdev is
required. The crossdev bootstrap process rebuilds the toolchain a couple
of times with different USE flags, so if binary packages are forced the
second stage gets skipped, resulting in a broken gcc and glibc install.
This makes it possible to toggle parallel_emerge just as other scripts
do. In other scripts update the help string to be more specific, the
--jobs option can be used to control parallelism.
Although it didn't seem to be causing any bugs the global variables in
toolchain_util conflicted with some names used elsewhere. Clean that up
by adding an S to the array names that didn't already have one.
When calling update_chroot with --nousepkg it is silly to always force a
rebuild of the cross toolchain. Change the test to work regardless of
whether binary packages are enabled by checking if anything needs to be
built from source.
Now this code can be shared with setup_board. Only required if
setup_board is called with --nousepkg which is rare to never but feels
like the correct thing to do. Alternatively setup_board could always
use binary packages (as it basically does now).
Right now there is some funky logic to either use a previous build as a
seed or the current SDK tarball if it happens to have been downloaded.
This is a bit confusing and doesn't work reliably since it is reasonable
for there to be neither a previous build nor the current SDK available if
the SDK chroot was created some time ago. Fix this by using the new SDK
library and always use the latest SDK, downloading it if needed.
A number of places refer to these paths and that number is going to
grow. Since the standard pattern is to use environment variables for
commonly used paths it is time to add ones for these:
REPO_CACHE_DIR
REPO_MANIFESTS_DIR
This scheme only works robustly with kexec. Until the happy day that
kexec is supported on Xen (or when Xen is dead, long live Xen!) we
shouldn't bother trying. This allows us to use kernel modules again.
Previously the code in base_image_util.sh properly handled the disk
layout command line flag but the spaghetti code later on calls a
function from disk_layout_util.sh which only returned 'base' resulting
in a bit of a mess if something other than 'base' is used. Sync up the
two code paths to avoid that...
Use 2*CPUs for the target load average but add load average throttling
to emerge in addition to make. Also work around how catalyst sets
FEATURES so we can disable extra locking for hopefully faster builds.
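
A rough sketch of the throttling arithmetic. Using the CPU count for
--jobs is an assumption (the text above only fixes the load average at
2*CPUs), and parallel_options is a hypothetical helper, not part of the
catalyst scripts.

```python
import os


def parallel_options(cpus=None):
    """Return make and emerge options throttled by load average."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    load = cpus * 2  # target load average: 2 * CPUs
    # Assumption: one job per CPU; both make and emerge accept these flags.
    opts = f"--jobs={cpus} --load-average={load}"
    return {"MAKEOPTS": opts, "EMERGE_DEFAULT_OPTS": opts}
```

Both tools stop starting new jobs once the system load average exceeds
the given value, which is why the same flags are passed to each.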
This replaces the cross-toolchain compile step in bootstrap_sdk and adds the
ability to build native toolchains using the cross toolchain. This is just
the first step towards actually providing the native toolchain in a container.
We don't need to reserve space on disk just to reserve partition
numbers. And now that partitions are aligned these blank spots grew
from 512 bytes to 1MB which is not much but still silly.
When using anything other than classic spinning disks with 512-byte sectors
it is generally best to maintain some alignment with the underlying
physical sector or erase block size. The default alignment most
partitioning tools use these days is 1MB (2048 sectors). Also sometimes
qemu-img requires disk sizes to be aligned to 64KB.
The existing code arbitrarily multiplies START_SECTOR by 512 converting
from blocks/sectors to bytes, but blocks was the correct unit to begin
with. Also the secondary GPT area is not considered but that was OK
because the bogus unit conversion oversized our disks by almost 16MB.
Instead of relying on bugs properly reserve 34 sectors at each end of
the disk. (Well, we could get away with only 33 at the end since it
doesn't have an MBR but meh.)
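
The arithmetic above, worked through in sectors rather than bytes. This
is a sketch with hypothetical names (place_partitions, align_up); only
the 2048-sector alignment, the 64KB image alignment and the 34-sector
reservation come from the description.

```python
SECTOR_SIZE = 512              # bytes per logical sector
ALIGN_SECTORS = 2048           # 1MB partition alignment
IMAGE_ALIGN_BYTES = 64 * 1024  # qemu-img sometimes wants 64KB multiples
GPT_RESERVED = 34              # sectors kept free at each end of the disk


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


def place_partitions(sizes):
    """Lay out partitions of the given sizes (in sectors).

    Returns (start sector of each partition, total disk size in sectors).
    """
    starts = []
    cursor = GPT_RESERVED  # skip the MBR and primary GPT
    for size in sizes:
        start = align_up(cursor, ALIGN_SECTORS)
        starts.append(start)
        cursor = start + size
    total = cursor + GPT_RESERVED  # room for the secondary GPT
    total_bytes = align_up(total * SECTOR_SIZE, IMAGE_ALIGN_BYTES)
    return starts, total_bytes // SECTOR_SIZE
```

Everything stays in sectors until the final image-size rounding, which
avoids the stray multiply-by-512 the old code relied on.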
Previously shifts were added into the getopts loop to work around
differences between different sh implementations but that causes getopts
to end the loop early. Instead use an intermediate variable to work
around inconsistent OPTIND behavior and explicitly check for the --
separator. Tested in bash, dash, and ash.
We don't have any particular reason for the weird hackery required to
install packages into /usr/local instead of root. The rootfs image is
already being modified a little, so we might as well modify it a lot. :)
Vagrant users are accustomed to much larger disk sizes so let's give it
to them. I'm leaving the others as-is since it is easier to grow than
shrink disks if anyone has a particular size they need.
Use the smaller base format for 'raw' disk images since these will
usually be dd'd to a block device to create AMIs and what not. For
images using qcow2 and vmdk stick with the larger vm size.
This reverts commit b97cfe126f.
The minor device numbers of loop partitions are allocated dynamically
which significantly complicates running under Docker which uses a static
/dev. Rolling this back until we can rely on /dev being dynamic.
If git is installed via coreos-dev in the STATE partition it will need
some help finding its install location since it was built thinking it
would be installed in /usr rather than /usr/local.
This avoids the need to dd individual filesystem images into a complete
disk image, just mount the partitions directly from a loop device
covering the whole image. This does add the requirement that mkfs run as
root but that isn't a problem.
These are just cluttering things and adding an element of "how does this
work?" because base_image_util was defaulting to the "usb" layout in
some places and "base" in others.
This change removes /usr/sbin/write_gpt.sh from images which we have no
use for. This allows us to drop the indirection of writing partition
tables by first writing out a script to call. Now cgpt.py can call cgpt
directly to initialize the partition layout. This opens the way for
further improvements to how disk images are created.
This currently does nothing because our state partition is not partition
number 1. Even if it did we don't really need it since we rely on
expanding on boot instead.
Remove --verity_*: Unused, we don't support verity
Remove --usb_disk: Unused, we use PARTUUID now.
Remove --enable_serial: Unused, and serial is enabled for syslinux
Right now the initial (pre image_to_vm) images oversize the root
partitions, creating the expected 1GB filesystem in a 2GB partition.
image_to_vm later shrinks the partition back down to match. Just start
out with 1GB partitions to begin with instead.
This one is more automagical and sets up ssh keys from ssh-agent and the
user's home directory by default. Also adds an option for setting the
ssh port so it can be something other than 2222. Script should be
sufficiently portable, tested in bash, dash, and ash.
Useful for qemu -nographic or any other situation where serial is
easier to get at than VGA. It may be possible that in some setups ttyS0
isn't appropriate but we can figure out a way to customize kernel
options if/when that ever comes up.
Remove unused dev/dm-0 vs dm-1 logic from verity and the associated
rootwait option it required (meaningless with our initrd). Move old
cros_legacy to common instead of using it in every command line option.
We should remove it entirely soon since it isn't useful for us. Remove
unneeded intel graphics modeset option.
I want to start including version info in SDK builds as an alternative
scheme to the existing "chroot_version_hooks" system which always
assumes freshly unpacked SDKs are the latest regardless of what version
they actually were.
The recommended command using the config file was triggering a massive
memory leak in qemu because it was adding both the default virtual
hardware nic as well as the virtio nic. This could be worked around by
adding something like -net none or moving all the -net commands from the
file to the command line but eh. Clearly qemu config files are used and
tested by nobody else so let's just use a trusty script instead.
Vagrant will need the virtualbox ovf plus its own Vagrantfile config.
After this we will need an optional "package" step to this script to
take these files and bundle them into a .box tarball. This could also be
used to switch from .ovf+vmdk for plain virtualbox images to a bundled
.ova archive which combines the two.
Trying to include version info by adding the directory name to VM image
names didn't work and a better solution is for build_image to write out
a version.txt file. This should also fix an issue where uploading from
image_to_vm.sh didn't always go to the same location as the images
uploaded from build_image did.
As of Linux 3.2 loopback supports discard by punching holes in the
underlying file. This doesn't actually seem to impact things right now
since we are writing to fresh filesystems but might as well do this to
prevent wasted space from sneaking in later on.
Enable sparse files for all dd and cp commands and replace some dd
commands that are really better off being truncate commands.
While in the neighborhood there were a number of useless sudo commands
for things that just happen to be in sbin. Call them directly instead.
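
For illustration, the truncate approach expressed in Python. The
create_sparse_image helper is hypothetical; the build scripts
themselves call the truncate command.

```python
import os


def create_sparse_image(path, size_bytes):
    """Create a file of size_bytes without writing any data blocks."""
    with open(path, "wb") as image:
        # Extending a file with truncate leaves a hole instead of zeros.
        image.truncate(size_bytes)
    return os.stat(path)
```

On filesystems that support sparse files, st_blocks * 512 from the
returned stat result is typically far smaller than st_size.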
Its single use is in build_common and even then having a little progress
bar for copying images isn't that interesting, they just get lost in the
noise of the emerge output. Keep it simple, use cp.
/mnt/stateful_partition was already a little unruly with
/mnt/stateful_partition/home and /mnt/stateful_partition/var_overlay
serving similar functional purposes.
Then we needed to also add /opt and /srv overlays.
I also have wanted to get rid of the ugly and weird
/mnt/stateful_partition name so let's just have one big move.
/mnt/stateful_partition -> /media/state
/mnt/stateful_partition/var_overlay -> /media/overlays/var
/mnt/stateful_partition/home -> /media/overlays/home
From there we add /media/overlays/srv and /media/overlays/opt
The basic system directory structure including the lib symlinks was
fixed for sysroot in the following commits but the image build uses an
entirely different bit of code to do the exact same set of hacks. Port
those changes to the image building code to hopefully make all happy.
2ae0c30f4eac931bd088
Add --production_track argument to set_lsb_release and
cros_make_image_bootable to support using the production update service
on developer builds of the 'prod' image. This replaces the previous hack
of setting COREOS_OFFICIAL=1 in the middle of the build.
Since lsb-release doesn't exist prior to the first call to
set_lsb_release switch to sudo_clobber instead of append. That way if it
is called a second time later the contents aren't duplicated.
Write the info to gentoo-release and os-release as well so everything
gets the same information.
Last minute bug slipped in because of a line I commented out since the
current coreos kernel doesn't support virtio block devices (that change
coming soon). Qemu doesn't tolerate any spaces before # in comments.
The old script was heading towards spaghetti code realm. This breaks up
all the image variations such as hybrid MBR, OEM packages, etc into
configuration options and small functions that actually do the work.
All this is in the new vm_image_util.sh library but the command line
parsing and overall procedure remains in image_to_vm.sh
As part of this we gain support for putting some qemu options in a
config file as well as Xen virtual machines using pygrub and pvgrub.
Lots of generally unused options have been removed to simplify things
and keep output file names consistent.
Reintroduce unique A/B menu.lsts to work around the kexec problems that
we have. Essentially instead of always using boot_kernel on pvgrub
systems use the A/B kernels installed at update time to the boot
partition.
Switching the toolchain to upstream Gentoo brought this directory back
and based on the Chromium OS history keeping this directory out of the
builds is a bit tedious. Keeping image sizes down isn't *that* important
right now so just let it be.
This adds the boot_kernel to the build boot partition and updates the
relevant config files. Mission accomplished.
TODO: Update the installer to not worry about moving files around
anymore
Sync up bootstrap_sdk with other tools by using the common upload
functions. As part of this refactor release_util a bit to provide a
truly generic upload function.
This will be used to upload the latest images built from master, we
don't need every build so we just want to upload to a 'master'
directory, not one named for the current version.
Add core_upload_update to au-generator.zip which requires some extra
logic to make it runnable anywhere it may be. To organize the code a
little better all the delta_generator calls have been moved to
cros_generate_update_payload. core_upload_update is now just a wrapper
around cros_generate_update_payload and core-admin.
Cgpt was moved and a symlink based wrapper was added. That wrapper will
be improved soon; when that's true we'll need to change this back.
A specific note... cgpt is currently statically linked. If that wrapper does
not remain statically linked, then a simple revert won't be enough.
BUG=chromium-os:39814
TEST=Manual au-generate.zip creation.
Change-Id: I2705b1eddd8ef28c7eb099512513daf80f586218
Reviewed-on: https://gerrit.chromium.org/gerrit/45128
Reviewed-by: Chris Sosa <sosa@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Make use of the new partition UUIDs for ROOT-A and ROOT-B in the root=
kernel parameters provided by the legacy (non-kexec) bootloaders. This
makes all of our images bootable as-is without having to pass them
through image_to_vm.sh. :-D
Before we can switch from using device names in root= to partition table
UUIDs we need some values that will remain consistent across upgrades
since the partition table is not updated when filesystems are.
vboot_reference now recognizes coreos-reserved and coreos-rootfs. Use
these prefixes so we stop using the chromeos GUIDs.
Test-plan: Tested on a VM and it boots and updates.
A few things here:
- Source manifests/version.txt directly instead of coreos-version.sh
- Remove Chrome branch from target image directory names.
- Use proper version instead of timestamp for catalyst builds.
- Move lsb_release script from coreos-overlay to build_library.
During builds var_overlay is always mounted over /var. We want to do the
same at run time but we also want to ensure everything expected to be
there always is. After emerge completes gen_tmpfiles.py will scan /var
for any .keep files that were installed and record their parent
directories' permissions and ownership to /usr/lib/tmpfiles.d. On each
boot systemd will automatically recreate anything that goes missing.
This also means that going forward any ebuild that needs a directory in
/var (or anywhere else the stateful partition is bound) can simply rely
on the 'keepdir' ebuild function instead of adding things to
coreos_startup.
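
A sketch of the scan described above, not the real gen_tmpfiles.py:
tmpfiles_lines is a hypothetical name, and emitting numeric uid/gid
values in the tmpfiles.d 'd' lines is an assumption made to keep the
example self-contained.

```python
import os
import stat


def tmpfiles_lines(root, subdir="var"):
    """Build tmpfiles.d 'd' lines for directories holding .keep files."""
    lines = []
    for dirpath, _dirnames, filenames in os.walk(os.path.join(root, subdir)):
        # keepdir leaves a marker file whose name starts with ".keep".
        if not any(name.startswith(".keep") for name in filenames):
            continue
        st = os.stat(dirpath)
        mode = stat.S_IMODE(st.st_mode)
        path = "/" + os.path.relpath(dirpath, root)
        lines.append(f"d {path} {mode:04o} {st.st_uid} {st.st_gid} - -")
    return sorted(lines)
```

Writing the returned lines to a file under /usr/lib/tmpfiles.d is what
lets systemd recreate the directories on each boot.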
As outlined here we need a new partition layout, this patch makes the
necessary changes:
https://groups.google.com/forum/#!topic/coreos-dev/bA7gwGGoTng
The first big change is making all of the scripts obey partition numbers
based on labels in the disk_layout.json. This makes it much easier to
change later on.
The second big change is in the layout itself. The json file was updated
to reflect the document above.
And finally the grub boot configuration needed for pv-grub and pygrub
were added to the create_legacy_bootloader_templates.sh library utility.
Everything seems to work and boot now.
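
Looking up a partition number by label might be sketched as follows.
The {number: {"label": ...}} structure is an assumed simplification of
disk_layout.json, and partition_number is a hypothetical helper.

```python
import json


def partition_number(layout, label):
    """Return the partition number whose entry carries the given label."""
    for num, part in layout.items():
        if part.get("label") == label:
            return int(num)
    raise KeyError(f"no partition labeled {label!r}")


# Assumed, simplified layout data for illustration only.
EXAMPLE = json.loads('{"3": {"label": "ROOT-A"}, "9": {"label": "STATE"}}')
```

Scripts that ask for "STATE" instead of hard-coding 9 keep working when
the layout file is renumbered later.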
This is a bit of a hack but I wanted to see if it had any utility during
development before making it all pretty. Essentially this is a copy of
build_image but instead of building up an entire image it simply puts
the files into directories on disk to be run with systemd-nspawn/lxc/etc
So it is a bit complicated but essentially gtest pulls in python which
pulls in python-updater which wants portage so portage gets installed in
the real root not the dev one. Just leave it for now.
On Fedora 18 on Gnome 3.0 something is making the first attempt at
unmounting return busy. Unfortunately, the return code is 32 every time
so we have to parse the output of umount :( :( :(
Change-Id: I7f94bf6c2059c7e7cb4fb173d9ffbabd59f2b24f
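
The retry-on-busy workaround, sketched in Python rather than the shell
the scripts use. umount_with_retry is a hypothetical name, and matching
the word "busy" assumes untranslated umount messages (for example
LC_ALL=C).

```python
import subprocess
import time


def umount_with_retry(mountpoint, attempts=3, delay=1.0, run=subprocess.run):
    """Retry umount while its output says the target is busy."""
    for _ in range(attempts):
        result = run(
            ["umount", mountpoint],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        if result.returncode == 0:
            return True
        # The exit code does not distinguish "busy", so inspect the text.
        if "busy" not in result.stdout:
            raise RuntimeError(f"umount {mountpoint} failed: {result.stdout.strip()}")
        time.sleep(delay)
    return False
```

The run parameter exists only so the behaviour can be exercised with a
fake command runner instead of a real mount.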