This changes the oem-package for vagrant to vagrant-virtualbox,
which uses ignition instead of cloud-clonfig and sets the oem id
to "virtualbox" so that ignition can handle the machine correctly
This adds the option --torcx_store to specify the path to a
directory containing torcx images to be baked into the OS image. A
blank string can be given instead of a path to restore the previous
behavior and leave an empty vendor store.
The default value is the default path created by build_torcx_store,
which is used when build_packages updates torcx images. This means
that the current pattern "./build_packages && ./build_image prod"
should result in a fully updated OS image with all torcx images
available in the vendor store.
When OVA template is not being used, the default dhcp value yes will
trigger cloud-init to generate a 00-.network file, which will break
network connectivity Intermittently. Please see the details here:
https://github.com/coreos/bugs/issues/1802#issuecomment-297847614
When building dev images, the PORTAGE_BINHOST value during build
time is written to the image's make.conf. This breaks the default
binary package setup, since Jenkins is using gs:// URLs for signed
package verification and authenticated downloads, and the make.conf
doesn't inherit the GS_* variables to handle those schemes.
This should be reverted when signed packages are properly supported
by default in the dev images.
This changes the format from:
sys-apps/systemd-212-r8::coreos GPL-2 LGPL-2.1 MIT public-domain
to a JSON structure:
[
{
"project": "sys-apps/systemd-212-r8::coreos",
"license": ["GPL-2", "LGPL-2.1", "MIT", "public-domain"]
}
]
We don't have to worry about the changing format because the previous
format was never published. This is designed to match the
bill-of-materials [1] format so that it can be consumed by the site.
[1]: https://github.com/coreos/license-bill-of-materials
INFO build_oem_aci: Writing coreos_oem_gce_aci_stage_packages.txt
awk: cmd. line:1: fatal: cannot open file `/build/amd64-usr/var/db/pkg//DEPEND' for reading (No such file or directory)
INFO build_oem_aci: Writing coreos_oem_gce_aci_stage_licenses.txt
awk: cmd. line:1: fatal: cannot open file `/build/amd64-usr/var/db/pkg//DEPEND' for reading (No such file or directory)
We've always generated these license manifests (detailing which ebuilds
are covered by which license), but never published them. This adds these
manifests to the list of published files so that they are publicly
available.
os-release is requested in bug reports, and knowing which board
the problem occurred on is often helpful.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Detect first boot based on the existence of a coreos/first_boot file
in the EFI partition, and set "coreos.first_boot=detected" command line
argument when found. We use "detected" rather than "1" so the initramfs
knows that it should mount the ESP and delete the file. This lets us
defer clearing the first-boot flag until Ignition has run successfully,
without having to change the disk GUID after filesystems are mounted.
Continue detecting the first-boot disk GUID and adding the command-line
argument to randomize it, since we still want unique disk GUIDs
regardless of Ignition.
This is useful for emerges that are meant for incomplete rootfs's, such
as ACI building emerges. There are cases where the #! check is expected
to fail while doing those.
This reverts commit a7ffba9a9f.
The build_image script can build multiple formats. When our
releases and automated builds are creating developer containers and
production images from the same command, the verity flag would be
disabled while building the container and remain disabled when building
the production image. This resulted in no verity in all our builds.
Allow "coreos-install -o vmware_raw" to install Container Linux with
the vmware OEM.
Use base DISK_LAYOUT to reduce the minimum disk size.
Fixescoreos/bugs#359.
They're not in the root fs, but they are in the initramfs. Handle this
by augmenting the package list with packages that are both
- build dependencies of coreos-kernel, and
- configured to cause rebuilds of coreos-kernel when their sub-slot
changes.
To clean things up and prepare for arrm64 support move
all the enable_rootfs_verification processing into one
location and add some comments.
Signed-off-by: Geoff Levand <geoff@infradead.org>
To make verity work both enable_rootfs_verification and enable_verity
need to be set. Without one verity just gets half enabled. Remove
the enable_verity flag and do the full verity setup when
enable_rootfs_verification is set.
Signed-off-by: Geoff Levand <geoff@infradead.org>
The disable_read_write variable was just a copy of FLAGS_enable_rootfs_verification,
so to make things less confusing just use FLAGS_enable_rootfs_verification.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Add a new grub variable extra_options, the contents of which is
added to the linux command line. Use extra_options to set
the ACPI options needed for arm64.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Two new image types have been added:
1. parallels - this produces VM images with extension pvm.tgz that can be loaded directly into Parallels Desktop
2. vagrant_parallels - this produces a Vagrant box that works with parallels vagrant provider (http://parallels.github.io/vagrant-parallels/)
Just like vmdk and others we rely on qemu-img to convert raw images. Support for Parallels disk images was added to qemu-img in version 2.4.
I also removed the box files from the actual image since there are not needed in /usr/share/oem.
Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
The ACI root is created by reusing the create_prod_image function
to install a base meta-package. It then runs a script to customize
the file structure as required by agent software (if necessary),
writes a manifest file from a supplied template, and then packages
it all into a tar file.
The Xen loader in GRUB never received support for our hacky scheme of
adding the verity hash to the kernel cmdline. Disable till that's fixed.
Partially reverts 2016567 and 533b1b9.
Consolidates two very similar flags into one and fix an issue where
verity could get enabled in the GRUB config when rootfs verification was
turned off (e.g. on arm64 which cannot use verity yet).
workaround for bootstrap_sdk on an Ubuntu host where /dev/shm is a
symlink to /run/shm. Since we mount the hosts /dev (for losetup) this
interferes with building python 2.7. The workaround is to disable the
/dev/shm during python builds. A longer term fix would be to not mount
the hosts /dev. Thanks for marineam for suggesting the fix on IRC.
If the gptprio.next command fails to give us something to boot we
shouldn't try! In order to diagnose why the failure happened halt
immediately so the user can see the error message.
Once we've built the packages, verify against the Gentoo Linux Security
Advisories to ensure that we're not shipping anything with known
vulnerabilities.
Instead of patching portage to support the `disabled` flag now we just
patch it to leave the `[gentoo]` section out of the default repos.conf.
Follow up to 585275b268
PROD_IMAGE is a flag that indicates a production image should be
built, and will be set for dev builds if the user specifies that
both dev and prod images should be built. build_image was
incorrectly using the PROD_IMAGE variable to conditionaly do some
setup depending on the image type.
Add a new variable IMAGE_BUILD_TYPE that can be tested for the type
of image currently being built and replace the PROD_IMAGE usage.
Signed-off-by: Geoff Levand <geoff@infradead.org>
A bunch of packages install PAM configuration fragments in /etc. Rather than
modify them all to install into /usr/lib, just move the entire directory at
image build time.
We need to ship some PCR measurements alongside images in order to make it
easier for admins to provide an appropriate policy. Add some tooling to
generate the appropriate hashes during build, pack those into a zip file
and upload it.
profile is already set up to source /usr/share/baselayout/profile.env
but it never has because I forgot to add this line during the migration
to amd64-usr images. Sure took us a while to notice that one... :(
This resolves two issues:
- Large dependencies are *never* built during image_to_vm,
build_packages must now handle that.
- Since build_packages can't resonably do the oem-* packages (they all
conflict with eachother) we do want to build them from the ebuild.
This is now enforced so a old binpkg is never used. This resolves
confusing issues people have always had while when editing oem
ebuilds but getting a stale build instead.
Allows build_image to be used without first running build_packages.
Note: setup_board --force is required before build_packages will work
properly after doing this since baselayout won't be installed otherwise.
- May be sourced early, so explicitly die if source fails.
- Add a function for getting the latest version of a package.
- Read PROVIDES metadata using portageq, enabling data to be read from
binary packages in addition to installed packages. The performance
issue is not an issue here and needed to support empty build roots.
Most vm images have an expanded root partiton to make them practical to
use as-is. Some deployments may not want such a large root, putting most
storage on other volumes.
This variable was semi-deprecated ages ago so `version.txt` could follow
a similar variable naming pattern to `os-release`. Finally drop usage of
it here in favor of `$COREOS_VERSION`.
The one-liner `[[ -z ${PIPESTATUS[*]#0} ]]` no longer works because the
expansion still includes spaces even if all the values are zero. Somehow
that didn't matter in bash 4.2 but it does mater in 4.3 to be consistent
with the general behavior of variables in [[ tests.
The console often contains very useful information in the event of a
hard crash, in such situations there's no ability to unblank the console
via keypress because the kernel won't handle the interrupt.
Since CoreOS is a server/cluster operating system, there won't generally
be monitors connected benefitting from a blanked console. Disabling the
blanking altogether allows the frame buffer contents to always be
visible, even when the kernel can't handle keypresses.
coreos.first_boot=1 will no longer trigger disk-guid randomization, so
manual ignition triggers in diskless/pxe scenarios may succeed. Instead
we explicitly request the randomization when first_boot=1 was added by
grub finding the 00000000-0000-0000-0000-000000000001 disk-guid.
In order to boot properly we need `rootflags=rw mount.usrflags=ro` on
the command line. These have been build into the kernel directly but for
arm64 builds the built in options seem to be ignored.
Add the necessary variables in grub.cfg and populate the EFI
partition with arm64 efi executable and modules.
Signed-off-by: Andrej Rosano <andrej@inversepath.com>
Now that ccache is turned on by default in the profile portage complains
a lot if ccache isn't actually installed, sleeping 5 seconds for each
error message. Since pkgcache is in use ccache isn't going to make that
much of a difference but getting rid of those 5 second sleeps will. :)
Failing to explicitly set the selinux policy store to operate on may
result in semodule installing the policy in an incorrect location. Pass
it on the command line in order to avoid this.
ldconfig does not work for non-native arches. Create a new
build_image routine run_ldconfig that uses qemu user emulation
to run the board ldconfig on the board rootfs when the board and
SDK arches are different.
See: http://code.google.com/p/chromium/issues/detail?id=378377
Prior to calling run_ldconfig the board rootfs must have ldconfig
installed. To arrange this move the call of run_ldconfig to after
the base package install.
Fixes build_image errors like these when building for arm64:
/sbin/ldconfig: /lib64/libXXX is for unknown machine 183.
Signed-off-by: Geoff Levand <geoff@infradead.org>
This reverts commit 39bb800f16.
This change disabled a number of features so it isn't suitable for the
generic VMware templates. We need to re-trace our steps to list exactly
what tools/systems weren't accepting the linux26 type.
The new python script check_root uses data that portage already
maintains on what shared libraries packages need or provide instead of
re-scanning whatever ELF files that can be found. This is much more
comprehensive but there is a bit of a transition issue for folks with
long-lived SDKs: packages built with portage older than 2.2.18 do not
include this data. As such for now the check is non-fatal and provides a
command you can use to refresh locally installed packages.
The code checking for conflicts between top level directories and /usr
has also been rewritten. Both tests now are considerably faster.
SDK tarballs have a .DIGESTS file but it is created by catalyst instead
of the upload_image function. In order to support plain GPG signing but
not avoid re-generating .DIGESTS we need to move that code out of
upload_image to a new function. upload_files shouldn't do it itself
because it is also used for portage binary packages which shouldn't be
signed (there is no point, nothing would verify the signatures).
The grub configuration needs some updates to handle dealing with booting
the kernel from the ESP rather than from inside the image. We also want to
be able to avoid dealing with signing the config file, so build it into the
binary. Finally, rather than having to cope with signing grub modules, build
the ones we need to boot into the grub image.
Once we're signing the root filesystem, we're not going to be able to boot
the kernel from there. Copy the kernel out to the EFI System Partition and
sign it.
Add qemu_uefi_secure target for building Secure Boot images. These are
identical to qemu_uefi images with the exception that the test keys have
been installed into the flash image, enabling Secure Boot by default. In
addition, sign the grub binary with the test keys during build when
producing unofficial images.
Adds CROSS_PROFILES, BOARD_CHOSTS, and BOARD_PROFILES definitions to support a
generic arm64-usr board.
get_portage_arch() is updated to convert aarch64 correctly.
Signed-off-by: Geoff Levand <geoff@infradead.org>
This is required for the eventual removal of `$PORTDIR` and
`$PORTDIR_OVERLAY` and ensures toolchain rebuilds/updates with
`./build_packages --nousepkg` don't erroniously try to use ebuilds from
`/usr/portage` inside of the SDK.
In order to fix up the build_toolchains script the crossdev overlay
needs to be setup properly, previously only setup_board did it.
Overall silences a lot of warnings and fixes an issue with crossdev:
/usr/bin/emerge-wrapper: line 48: /eclass/toolchain-funcs.eclass: No such file or directory
/usr/bin/emerge-wrapper: line 49: tc-arch: command not found
The portage CBUILD and HOSTCC variables need to be set to the SDK host to get
a proper cross build when building target binaries.
Change _configure_sysroot to use the CBUILD environment variable to set the
CBUILD and HOSTCC variables of ${ROOT}/etc/portage/make.conf. Also, fix up all
calls to _configure_sysroot to set the CBUILD environment variable.
Fixes setup_board failure when the host and target architectures differ.
Signed-off-by: Geoff Levand <geoff@infradead.org>
[marineam: fixed a copy/paste error]
Previously fsck output was suppressed to reduce the amount of noise in
build logs on the assumption that fsck really shouldn't have a reason to
fail. The filesystem is freshly created after all. However some users
have reported that fsck is failing but without error messages we don't
know why.
There isn't a sane way for users to know the licenses of individual
packages in CoreOS images in built images. The information is hidden
away back in the original ebuilds. This extends our existing package
list with a new file that also includes licenses:
```
app-admin/flannel-0.3.0-r3::coreos Apache-2.0
app-admin/fleet-0.9.1::coreos Apache-2.0
app-admin/locksmith-0.2.3::coreos Apache-2.0
app-admin/sdnotify-proxy-0.1.0::coreos Apache-2.0
app-admin/sudo-1.8.10_p2::portage-stable ISC BSD
app-admin/toolbox-0.0.0-r4::coreos Apache-2.0
app-arch/bzip2-1.0.6-r6::portage-stable BZIP2
app-arch/gzip-1.5::portage-stable GPL-3
app-arch/tar-1.27.1-r2::portage-stable GPL-3+
...
```
- "./build_image prod" already has the ability to specify which package will specify all the packages that should be pulled in and built into an image by specifying a package name using the --base_pkg command line flag. This creates an equivalent option for "./build_image dev" creating a --base_dev_pkg flag that passes a package name into the create_dev_img() function in dev_image_util.sh the same way that --base_pkg is passed into create_prod_image() inside prod_image_util.sh.
This change changes the default 'bytes-per-inode' ration from 16K to 4K,
the block size. To prevent this from wasting too much space change the
inode size from the default 256 to the minimum size, 128. Larger inodes
are used to store extended attributes more efficiently but since we do
not use SELinux the majority of files do not have security attributes.
These defaults may be modified via the new `bytes_per_inode` and
`inode_size` options.
Fix parsing the following output:
[ebuild N ] dev-libs/gmp-5.1.3-r1 to /usr/x86_64-cros-linux-gnu/
[ebuild UD] sys-libs/timezone-data-2013d [2014i-r1] to /usr/x86_64-cros-linux-gnu/
The previous regex did not account for upgrades and got confused by the
`[2014i-r1]` listing and goobbled up too much of the string. I am not
sure *why* portage is reporting an upgrade when --emptytree is also used
but there it is. Match all not-] characters instead.
Disable ccache as it is causing issues in other builds so disable it
everywhere to be safe. Disable the autoresume feature because our build
process doesn't actually make use of it.
Adding the update step appears to break permissions on the distfiles
directory. Ensure the portage user is correct and set the permissions on
directories it needs to write to in advance.
When bootstrapping a SDK we need to update GCC dependencies to ensure
the GCC built for stage1 is linked against the same library versions as
those that are included in the stage1. Without this updating the mpc
library just results in a broken stage1.
Probing all filesystem types on all block devices appears to hang
booting Amazon EC2 HVM instances. The console output is unreliably
buffered so there is no information on what the failure actually is. On
the up side we can work around it easily by only searching the GPT which
appears to be safe.
To aid testing things under Xen it helps to have a machine locally that
actually runs Xen! This isn't a particularly great setup but it works
well enough to simplify my own testing. Must be used with a developer
image and packages built with `USE=vm-testing` set to include the Xen
userspace tools.
This uses our new GRUB2 features to handle GPT priority partition
selection, terminal selection, OEM tweaks, etc. The old SYSLINUX and
PV-GRUB configs are now unused except for maintaining compatibility
with older installs. Of the old configs only the ones that
coreos-postinst copies are needed. The new setup supports using GRUB2
under Xen, giving us automatic fallback support on all of our platforms
for the very first time!
Since grub.cfg is copied into place instead of generated, build_image's
--boot_args option is no longer supported. It could be re-added later
with some sed goo but for now it is easy enough to just edit grub.cfg.
Mark the initial copy of CoreOS as 'successful' and with a non-zero
priority. Required to boot with a stricter interpretation of the
partition selection scheme which ignores partitions that have a priority
of zero. The new grub implementation follows this rule and is what the
original ChromeOS spec used too.
For the sake of completeness if multiple partitions are configured in
the json file with this feature they will be prioritized in disk-order.
The VHD format actually uses 2MB blocks internally so the 1MB alignment
used in e77e4e54 wasn't sufficent to prevent other tools from further
adjusting the image size to align it. Additionally a 1MB alignment may
be triggering a bug in OpenStack or XenServer disk resizing that renders
that partial block at the end of the old image size unmapped/unavailabe.
So far the default iteration order of python dicts has mostly matched
the order that we want the partitions on disk but this is not always the
case. I caught the BIOS-BOOT partition being ordered on disk after the
USR-A partition. Nothing bad came of this but consistancy is good.
The new disk size alignment left too much extra space at the end of the
disk which would lead to pointless resizing on first boot. Fill in the
extra space so that no more than 1MB is left unused.
The VHD disk format internally includes CHS addressing and qemu-img
respectfully aligns disk images to the common 16 heads 63 sectors
geometry when possible. This is unfortunate since images uploaded to
Azure must also be aligned to 1MB we normally do.
Since qemu-img doesn't have a way to handle this well right now adjust
our existing alignment logic to create disk images aligned to both.
I am unsure exactly what situation is causing the loopback partition
device node to not exist when it is being mounted but this should help
work around the situation and log loudly about it so we can hopefully
figure out where to dig further.
Version 4 is too low. Some VMware products even crash trying to
upgrade it to a greater version (VMware Fusion 6 Pro). Having at
least 7 will allow us to use some modern features in most VMware
products, such as enabling vmxnet3 virtual network adapters or adding
much more memory and cpu cores to virtual machines.
Pruning files via INSTALL_MASK in the profile is a bit more apropriate
since it allows us to keep most of that info in one place. The only
parts that need to be deleted or adjusted here are inputs and outputs of
`env-update` which has to be run after everything is installed.
Previously we didn't actually clean up `env.d` at all which lead at
least one user to think they should edit those files and run
`env-update` themselves but we don't ship that tool on prod images.
This sets the IMG_FORCE_OEM_PACKAGE variable to the supplied string. If a
':' is present, what follows it gets put in the IMG_FORCE_OEM_USE variable
and what precedes in the former.
_get_vm_opt() has been modified to generally support forced overrides such
as this one, simply set variables named IMG_FORCE_$opt.
Now you can do things like:
for fmt in cloudstack \
digitalocean \
ec2-compat:ec2 \
ec2-compat:openstack \
ec2-compat:brightbox \
exoscale \
gce \
hyperv \
rackspace \
rackspace-onmetal; do
./image_to_vm.sh --format=qemu --oem_pkg=$fmt
../build/images/amd64-usr/latest/coreos_developer_qemu.sh -curses
done
rather than having to modify build_library/vm_image_util.sh to test oem
builds in qemu.
My primary use case for this flag is to fix booting with UEFI firmware
which can have problems when mixed with KVM, adding kexec into the mix
doesn't help matters either. The current version of OVMF can boot from
virtio drives just fine so that is now enabled and KVM is disabled.
So the -s option can also mean sloooooooow but boots!
The new grub install script must be called after the image is unmounted
and the old bootloaders script doesn't need to touch grub at all. For
now we will continue to use the existing syslinux configs but
interpreted by grub. Beyond the grub menu flashing by during boot
everything should still be functionally equivalent.
This script replaces the standard grub-install tool to give us some more
control over what is going and ensure grub-install's auto-detection
magic doesn't make any incorrect choices. Also this script sets up a
loopback device and mounts the EFI partition in just the right way for
grub-bios-setup's auto-detection magic to work correctly.
I've chosen not to adapt disk_util to use partitioned loop devices to
make grub happy because ensuring loop devices get cleaned up properly
for the general case gets tricky and less robust.
The passing ROOT= as an environment variable to board wrapper scripts
doesn't work, the script unconditionally overrides it. This means so far
our packages.txt files have listed the contents of /build/amd64-usr
instead of the image. Fix this by calling equery directly instead.
Not currently used, this configuration which sets up grub to re-use the
syslinux configuration only works with recent git versions, not any
releases. Compatibility is also limited because the serial configuration
in syslinux must be duplicated in the grub config.
We don't need to do anything like manually install the MBR boot code
for grub but we do need to continue to expose the ESP partition as a
hybrid partition to support pvgrub.
Calling cgpt create when resizing zeros the MBR boot code. This worked
with the syslinux setup because the boot code was re-written. When not
using syslinux it is easier to just preserve the existing MBR instead.
Unlike SYSLINUX, GRUB2 does not recommend embedding itself in a FAT
filesystem. Instead GRUB2 prefers embedding in the space between the MBR
and first partition or using a dedicated partition that is safe from
tampering by fs utilities. In our case the space after the MBR is where
the GPT lives so we need to use the extra partition scheme instead.
The 64MB "BOOT-B" partition has never been used so we can replace it
with a 2MB partition which is more than enough for GRUB.
We have long since stopped installing anything to the /boot directory of
the root filesystem. Mount the ESP partition to /boot for consistancy
with the discoverable partition spec.
Normally GCC is installed in a way that allows installing multiple
versions and switching between them. Our production images do not need
this and additionally the only things from the GCC package that are
needed are the shared libraries. To ensure these libraries are *always*
locatable regardless of the presence of /etc/ld.so.conf and
/etc/ld.so.cache we can install those libraries to plain old /usr/lib.
The GCC packages don't have a built in way to do this but we can get
away with extracting the libraries directly from the binary package.
This is actually similar to what ChromeOS did with a few exceptions:
- We use a native GCC build instead of the cross toolchain
- The archive is properly extracted from the package instead of feeding
the package directly to tar and ignoring the resulting warnings.
As an added benefit switching from a blacklist to a whitelist ensures
that extra cruft does not slip through the cracks, saving 5-10MB.
Create profile as a real directory instead of a symlink to the board
root's configuration. Normally the board root does not modify this but
it is useful for build_image to use it to modify package.provided.
Normally Gentoo expects moving between major GCC releases to be a manual
step. In our case we want this to always be automatic, otherwise the GCC
version won't be switched at all.
Apparently expanding an empty string before a variable assignment forces
that assignment to be interpreted as a command instead. Instead of an
empty string use env as our sudo alternative when running as root.
Needed for portage 2.2. Sync URIs are included but not very useful yet
because portage only can do `git pull` but not `git clone`. An extra
helper script will be required to do the initial clone it seems.
Binary packages may be useful for re-installing a package with a
different INSTALL_MASK. Can be used to install debug symbols.
Instead of gluing in a special PROD_INSTALL_MASK for all images use
profiles to configure the differences between the base build root,
production images, and developer images. This offers much more
flexibility and is needed for providing a full dev environment in
developer images.
Using parallel_emerge has been disabled by default for all commands
except build_image for quite a while now, build_image kept it just
because it was still a bit faster than normal emerge. Keeping
parallel_emerge complicates future changes to build_image so it needs to
drop it entirely. Since that means nothing uses it by default we might
as well just rip out support for it entirely.
VHD was just for testing, raw is more useful for published images.
coreos-install will now be able to install working xen instances:
coreos-install -d /dev/xvda -o xen -c cloud-config.yml
Missed this in 7231b95a, the update zip should still be built when the
usr partition is extracted for generating updates but build_image itself
is not generating and signing the update.
The current generate_update function is now less useful, the important
part that we need is just the partition image now. Also by defaulting to
extracting the partition the old cors_generate_update which is still in
use by devserver can be removed entirely, devserver will just expect the
extracted partition image instead.
Attempting to work around an apparent race in mtools, the command
'extlinux' these days is just the install tool for mounted partitions
while 'syslinux' is for unmounted devices.
Evaluating this as a user config causes it to block on
coreos-environment-setup.service which will wait on networking. This
makes it hard to add extra tricks for testing/debugging situations where
networking is failing. For example, to trigger dhcpcd if networkd dies:
#cloud-config
write_files:
- path: /etc/systemd/system/systemd-networkd.service.d/dhcpcd.conf
content: |
[Unit]
OnFailure=dhcpcd.service
[Service]
Restart=no
The new Update() performs the same tasks as the old Resize()
in addition to formatting previously-unformatted partitions. This
allows children disk-layouts to repartition the base layout in
addition to resizing.
I started to move board files under a boards/ directory similar to how
the SDK is under sdk/ but didn't do so everywhere. This should finish
the job so everything is consistent now.
Note: This prefix is only used in developer and buildbot uploads. When
final releases are copied to $channel.release.core-os.net it doesn't use
the prefix since a) I already published urls without the prefix and b)
no sdk files are ever posted to the public release locations.
btrfs isn't designed for small volumes and can run out of space sooner
than one would expect in our current setup, particularly with docker.
To try to improve the situation always create the filesystem initially
as 2GB instead of 512MB using the default settings: metadata is
duplicated, data is single, not mixed. The mixed setting may have been
partly why our performance can be so poor. For the default vm layout
use 6GB instead of 3GB, about what we use for EC2.
Since the new bucket scheme uploads images to a private staging area
first we need to configure the final location to generate vagrant json
metadata correctly.
- Automated builds drop SDK and binary packages into
gs://builds.developer.core-os.net/ and the new download URL is
http://builds.developer.core-os.net/ (COREOS_DEV_BUILDS)
- Change default upload path to gs://users.developer.core-os.net/ for
misc developer builds. Official builds go elsewhere and will just be
configured in buildbot/jenkins so some COREOS_OFFICIAL stuff is gone.
- Automated builds of images go to a private bucket,
gs://builds.release.core-os.net which later gets copied to
gs://alpha.release.core-os.net and friends by core_promote.
The new --developer_data option can be used to specify a path to a cloud
config to bundle into the image. If none is provided but a shared user
password (for core) is set then generate a config to set that password.
This lets us use the same mechanism for setting the default password for
both disk and PXE images.
This image type is the same as the developer image except that it is a
single root filesystem and is bootable via systemd-nspawn. This may
become obsolete eventually when it becomes possible to boot the normal
disk images under nspawn but it is useful for testing until then.
The partition type is defined by the Discoverable Partitions Spec.
http://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/
To make it possible to plop a CoreOS install into a simple
single-filesystem image for use as a container some things like
configuring bootloaders need to be skipped.
To behave more like setup_board/build_packages update_chroot should
fully configure portage to make sure everything is accurate.
Now binhosts are defined in make.conf.host_setup so the static config in
coreos-overlays doesn't need to refer to version.txt. setup_board
already made this change in 7a43a07f.
Define path locations to reduce dependency between static configs in
coreos-overlays and the behavior of the scripts repo. Spreading
configuration across two repos makes everything harder to understand.
Eventually everything should either be defined in profiles in
coreos-overlays or minimal auto-generated config files here in scripts.
Use what was the base image build function as setup/finalize steps in
the dev and prod build functions. This eliminates duplicate code
that mounted and unmounted the filesystem images.
We need some more control over exactly what lands in dev vs prod images
which will require letting them diverge in what is currently the common
base image step. There isn't any real need for the base image in the
first place other than to speed up building both dev and prod images at
the same time but that isn't common enough to worry about.
As part of this cleanup also remove references to CHROMEOS_* variables
and the recovery image that never actually existed in CoreOS.
For generating images for groups other than the one given to build_image
run this script along with the usual image_to_vm.sh commands. To avoid
ambiguity with the 'latest' symlink, this script creates $group-latest
symlinks instead. build_image creates the new symlink too.
Only the key is needed, and currently the vagrant OEM is completely
broken outside of vagrant. This gets vmware_insecure images back into
the state that they were before cloud config came along. :)
This adds two new optional build steps. The first user of these is the
vagrant images but many of the targets can be simplified now.
- fs_hook: Anything that needs to happen before unmounting the image.
This happens after the OEM is installed but before disk images are
made. It can be used to copy any data out of the image.
- bundle_format: Many VM types ship as some sort of archive format
rather than plain disk images as this script originally assumed.
Adding this final step lets us stop using the conf step awkwardly.
Vagrant now ships with a Vagrantfile and related code included in the
OEM package. This lets us version our vagrant-side code along with the
images themselves as well making the coreos-vagrant repo optional again.
The coreos-vagrant code will still be useful for handling the fancier
cluster configuration stuff but no longer has to carry the plugin code.
This should make it less difficult for people to add kernel options for
debugging. Without a prompt/timeout the user must be holding down space
or some other key while syslinux loads but it may not be possible for
the user to do so provide input quite that fast. Only a half second to
avoid needlessly increasing boot times in the common case.
Using the classic mbr.bin was only needed during the transition from
syslinux 3 to 6 because the behavior of gptmbr.bin changed after 3.
Now that the transition is done and cgpt supports the new scheme now it
is time we switched back. This avoids depending on using a hybrid MBR.
The .DIGESTS format is clunky and annoying. It also requires uses to
perform two steps to verify images using GPG. Instead support signing
all files directly so there is no need for .DIGESTS.
The old DIGESTS code will remain in place for now but after a few
releases I plan on deleting it.
The use of getopts was leading to conflicts between this script's short
options and qemu's long options. For example -serial was getting
interpreted as -s -- erial which is not very helpful.
We added a new https certificate on the new update service and changed
the hostname to be consistent with all of the other endpoints. Update
the new images to use this.
The old URL http://public.roller.core-os.net will remain working until
all of the old client have been updated.
Installing to a temporary directory and then copying over the final
contents of /usr/share/oem allows more complicated OEM packages such as
python to be configured with --prefix=/usr/share/oem while previously
the atypical use of ROOT=/usr/share/oem would have complicated things.
This can be used by update_engine as a quick test to determine if it is
running on a system that it can handle. This avoids needing something
like the 'coreos.diskless' kernel command line flag.
If QEMU is given a uuid systemd will detect that and in turn use it for
the machine-id. This made the bug causing the machine-id to be always
re-generated on boot harder to notice since it didn't happen on QEMU.
Taking a bit of a new approach to booting PXE images here for both
amd64-generic and amd64-usr. Instead of requiring the user to specify
squashfs and tmpfs on the kernel command line we can simply provide
defaults in the initrd's fstab.
Now with SYSLINUX 6 we can use the same bootloader on EFI and BIOS
systems. This replaces our previous reliance on building default kernel
options into the kernel image itself.
- Remove custom COREOS_* attributes from /etc/lsb-release
- Move dev image logic to dev_image_util
For extra fun fix detection of local host URL for devserver.
- Remove weirdly verbose "DESCRIPTION" format.
- Add COREOS_RELEASE_BOARD back to /usr/share/coreos/release
This is mostly just so update_engine and gmerge report the correct
board name to devserver, informative-only on prod images.
- Remove version info from /etc/gentoo-release
- Switch from 'track' to 'group' terminology.
Fix the problem of: "pecified switch root path %s does not seem to be an
OS tree. /etc/os-release is missing" because dracut doesn't have an
/usr/share/coreos/os-release file.
cgpt now supports generating hybrid MBRs and the classic style mbr.bin
from any version of SYSLINUX should work the same with the hybrid MBR.
The other code, gptmbr.bin, changes after SYSLINUX 3. Switching lets me
play with different versions of SYSLINUX without breaking everything.
With this change all images feature a hybrid MBR so the special case for
some VM platforms has been removed.
This doesn't make things go faster and I am suspicious that it makes
things worse. For example /etc/xml/catalog winds up empty from time to
time and I wonder if this locking is related to that.
While attempting to fix the easy to mix up DIGESTS names in feb59db9f I
stumbled across yet another way that the DIGESTS names were a bit
unpredictable: previously a .bz2 got stuck into the file name when
upload_images automatically compressed some file types. The new code
missed this and never added the .bz2. Correct this now, both for
image_to_vm which has a pile of glue to keep the legacy names in place
for now and build_image which I never intended to change.
Vagrant reads this file to determine that we are CoreOS... so lets not
break that just yet. A PR to switch to os-release has been posted:
https://github.com/mitchellh/vagrant/pull/2985
Some day gentoo-release will be dropped but that day is not today.
Make it possible for other scripts to share the same value for our
release repository and equally easy to override with a custom value.
Also allow setting the root from the command line in addition to the
environment. Usually --upload_root is better to use than --upload_path.
Switch from naming DIGESTS based in disk image name to a common prefix.
old: coreos_production_qemu_image.img.DIGESTS ->
new: coreos_production_qemu.DIGESTS
The old behavior wasn't very consistent since plain disk images aren't
used by all types and the code implementing that was easy to brake,
namely by mistake coreos_production_pxe_image.cpio.gz.DIGESTS became
coreos_production_pxe.vmlinuz.DIGESTS a couple releases ago.
The old names will continue to exist as well for the time being to avoid
breaking existing install/download scripts and the original pxe DIGESTS
name is back.
For multi-file uploads we should explicitly declare what the name of the
.DIGESTS file should be instead of using the first file name. Relying on
the ordering was subtle and easy to break.
To avoid having to sync directory creation and tmpfiles installed by
ebuilds just require ebuilds to do `keepdir /etc/path` and this script
will handle the rest. A possible extension would be to update to handle
symlinks in addition to directories in gen_tmpfiles.py
The funky UUID and other special settings should only be applied to
coreos-rootfs and coreos-usr partitions which will never be fscked. When
STATE becomes ROOT in -usr images it gets fsked while mounted read-only
and fsck updates the filesystem's UUID if it is blank. Turns out this
causes disagreement between the kernel and the disk leading to bad
things. A related issue was fixed in a newer version of tune2fs but
unless I missed it the same bugfix didn't make it into e2fsck so
updating wouldn't resolve the issue.
http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.42.9
Check if the disk layout is a /usr layout and if so hack the USR-A
partition, not whatever is mounted to /. Also use the new functionality
in disk_util for this as it can look up partitions by label.
The basic infrastructure to support this is now in place. Add a new
board that uses the experimental coreos/amd64/usr profile /usr based
disk layouts. This is just enough to successfully build images, they
aren't bootable yet.
The boot kernel parameters change depending on whether the new /usr
scheme is in use. Pass the disk layout to the bootloader config script
and adjust generated configs accordingly.
Nothing from chromeos-common.sh is needed for image building now. Also
kill off build_common.sh which was just a weird way of sourcing
common.sh. The two piddly functions it provided fit better in
build_image_util.sh