The increased /boot and /usr partitions meant that we grew all images
types. The raw image had the root partition decreased a bit but the VM
images not, and AWS and Akamai images even got aligned to also have the
larger VM rootfs instead of the raw rootfs. All image types are way
smaller than Azure with its 30 GB size and thus the size had to be
increased. For Azure, however, we already have enough free space and it
is good to avoid increasing the image size because this requires action
for those cases where users had assumed that the image fits into a
hardcoded 30 GB disk.
Reduce the root partition by the amount of blocks that is the difference
between the old and current full disk image size for Azure.
Signed-off-by: Kai Lueke <kailuke@microsoft.com>
The /usr partition was too small some time ago and we gained space again
by switching to btrfs with compression and also removing/splitting out
content. The /boot partition is too small all the time and we added
many hacks to fit the kernel+initrd under 60 MB. To handle the case
where the /oem partition is too small for the A/B-updated OEM extensions
we added the workaround to write the inactive one (or both) to the
rootfs. All this would not be needed if we had increased the partition
sizes a few years ago so that we could now assume that most nodes have
the increased sizes and we can make use of them. Still, we can do it now
to prepare for the next time when in five or ten years we have serious
size problems and run out of workarounds. We have to do the change now
and wait a few years so that most nodes have been provisioned with the
new layout. Then we can drop the workarounds and have a full featured
kernel and initrd, and we can also increase the /usr filesystem to make
use of the larger partition. Ideally we use large enough sizes that we
never have to worry again but since we also want to support small ARM
boards which might only have 8 GB internal storage, let's target this
when increasing the partition sizes. With 1 GB /boot, two 2 GB /usr, and
1 GB /oem partitions we are already at 6 GB, leaving 2 GB for the
rootfs. For now, reduce the extracted /usr update payload size to the
current combined filesystem and verity data usage (same size as before).
The rootfs size was also reduced for the initial .bin image so that we
don't overshoot 8 GB - it will be resized to fit the disk anyway on
first boot.
Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com>
Signed-off-by: Kai Lueke <kailuke@microsoft.com>
This change removes the legacy_boot flag from the EFI system partition.
We already have a BIOS boot partition which should offer compatibility with
legacy bios systems.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The compression feature of btrfs allows us to store more in the
size-limited /usr and OEM partitions. The size should of course still
be monitored to not bloat the image but more headroom helps to try
things out quickly without hitting the hard limit which fails the
build.
Use btrfs with zstd compression for the /usr partition. While for ext2
a hack exists to force read-only mounts by manipulating some bytes of
the filesystem, on btrfs we can use the subvolume read-only flag
instead which also works for the default top level subvolume. However,
it also makes also sense to mount the filesystem with the "norecovery"
mount option to prevent any write attempts even when the "ro" option is
set (not needed when using dm-verity in read-only mode but when
directly mounting without dm-verity). A new subvolumes is not created
because subvolumes don't offer anything special as long as we use the
A/B partition update mechanism (but they could be an alternative for
that). Note that switching to the btrfs on the /usr partition is only
possible when the Flatcar Stable release has all patches in
update-engine and seismograph's rootdev.
This reverts commit bb9ddfb08a83a456fc56962e62538bdfc0031a2f,
meaning that the planned change is now done and we switch the OEM
partition to btrfs. The reason for the revert is resolved in
https://github.com/kinvolk/ignition/pull/22
The compression feature of btrfs allows us to store more in the
size-limited /usr and OEM partitions. The size should of course still
be monitored to not bloat the image but more headroom helps to try
things out quickly without hitting the hard limit which fails the
build.
Use btrfs for the OEM partition but with zlib compression because
the outdated GRUB version doesn't support zstd yet.
New subvolumes currently can't be used for the OEM partition as default
subvolumes because GRUB tries to read the grub.cfg from the top
subvolume (at least with our old version). (We could however use
subvolumes for the /usr partition when switching to btrfs if that
makes any sense.)
The previously used uuid 4f68bce3-e8cd-4db1-96e7-fbcaf984b709 is valid
for x86_64 root partitions, which resulted in the dev container not
working with systemd-nspawn on aarch64. systemd-nspawn fails with:
No suitable root partition found in image
Change the partition uuid to the architecture agnostic one documented
in the man page:
A GUID partition table (GPT) with a single partition of type 0fc63daf-8483-4772-8e79-3d69d8477de4.
This makes systemd-nspawn happy on aarch64.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Mark the initial copy of CoreOS as 'successful' and with a non-zero
priority. Required to boot with a stricter interpretation of the
partition selection scheme which ignores partitions that have a priority
of zero. The new grub implementation follows this rule and is what the
original ChromeOS spec used too.
For the sake of completeness if multiple partitions are configured in
the json file with this feature they will be prioritized in disk-order.
The VHD format actually uses 2MB blocks internally so the 1MB alignment
used in e77e4e54 wasn't sufficent to prevent other tools from further
adjusting the image size to align it. Additionally a 1MB alignment may
be triggering a bug in OpenStack or XenServer disk resizing that renders
that partial block at the end of the old image size unmapped/unavailabe.
The new disk size alignment left too much extra space at the end of the
disk which would lead to pointless resizing on first boot. Fill in the
extra space so that no more than 1MB is left unused.
The VHD disk format internally includes CHS addressing and qemu-img
respectfully aligns disk images to the common 16 heads 63 sectors
geometry when possible. This is unfortunate since images uploaded to
Azure must also be aligned to 1MB we normally do.
Since qemu-img doesn't have a way to handle this well right now adjust
our existing alignment logic to create disk images aligned to both.
The new grub install script must be called after the image is unmounted
and the old bootloaders script doesn't need to touch grub at all. For
now we will continue to use the existing syslinux configs but
interpreted by grub. Beyond the grub menu flashing by during boot
everything should still be functionally equivalent.
Unlike SYSLINUX, GRUB2 does not recommend embedding itself in a FAT
filesystem. Instead GRUB2 prefers embedding in the space between the MBR
and first partition or using a dedicated partition that is safe from
tampering by fs utilities. In our case the space after the MBR is where
the GPT lives so we need to use the extra partition scheme instead.
The 64MB "BOOT-B" partition has never been used so we can replace it
with a 2MB partition which is more than enough for GRUB.
We have long since stopped installing anything to the /boot directory of
the root filesystem. Mount the ESP partition to /boot for consistancy
with the discoverable partition spec.
btrfs isn't designed for small volumes and can run out of space sooner
than one would expect in our current setup, particularly with docker.
To try to improve the situation always create the filesystem initially
as 2GB instead of 512MB using the default settings: metadata is
duplicated, data is single, not mixed. The mixed setting may have been
partly why our performance can be so poor. For the default vm layout
use 6GB instead of 3GB, about what we use for EC2.
This image type is the same as the developer image except that it is a
single root filesystem and is bootable via systemd-nspawn. This may
become obsolete eventually when it becomes possible to boot the normal
disk images under nspawn but it is useful for testing until then.
The partition type is defined by the Discoverable Partitions Spec.
http://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/