Experimentation suggests Alibaba Cloud API calls are extremely
unreliable, with a failure rate around 1%. It is therefore necessary
to allow for retrying basically every API call.
Some API calls (e.g. DescribeImages or ModifyImageAttribute) are
naturally idempotent and so safe to retry. Some non-idempotent API
calls (e.g. CopyImage) support explicit idempotence tokens. The
remaining API calls may simply fail on a retry, if the original
request happened to succeed but failed to return a response.
We could write convoluted retry logic around the non-idempotent calls,
but this would substantially increase the complexity of the already
unnecessarily complex code. For now, we assume that retrying
non-idempotent requests is probably more likely to fix transient
failures than to cause additional problems.
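As a rough illustration of this policy, the sketch below retries a call a fixed number of times. The helper name "retry", the attempt count, and the delay are hypothetical and are not taken from the actual scripts. A non-idempotent call wrapped this way may still fail on the retry if the first attempt succeeded but its response was lost.

```python
import time


def retry(call, attempts=5, delay=0.0, retriable=(Exception,)):
    """Call call() until it succeeds or the attempts are exhausted.

    Hypothetical helper: the name, defaults, and exception filter are
    illustrative assumptions, not the importer's real interface.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retriable:
            # Re-raise the error once the final attempt has failed
            if attempt == attempts:
                raise
            time.sleep(delay)


# Simulate an API call that fails twice before succeeding
outcomes = iter([ConnectionError("timeout"), ConnectionError("reset"), "ok"])


def flaky_call():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = retry(flaky_call, attempts=5)
```

A real wrapper would probably restrict "retriable" to the SDK's transient error types rather than catching every exception.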
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The CopyImage API call does work, but is unacceptably slow due to rate
limiting. Importing a full set of images to all regions can take
several hours (and is likely to fail at some point due to transient
errors in making API calls).
Resort to a mixture of strategies to get images imported to all
regions:
- For regions with working OSS that are not blocked by Chinese state
censorship laws, upload the image files to an OSS bucket and then
import the images.
- For regions with working OSS that are blocked by Chinese state
censorship laws but that have working FC, use a temporary FC
function to copy the image files from the uncensored OSS buckets
and then import the images. Attempt downloads from a variety of
uncensored buckets, since cross-region OSS traffic tends to
experience a failure rate of around 10% of requests.
- For regions that have working OSS but are blocked by Chinese state
censorship laws and do not have working FC, or for regions that
don't even have working OSS, resort to using CopyImage to copy the
previously imported images from another region. Spread the
imports across as many source regions as possible to minimise the
effect of the CopyImage rate limiting.
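The selection between these strategies can be sketched as a small decision function. The three boolean flags, the enum, and the round-robin assignment of source regions are assumptions made for illustration; the real importer may determine region capabilities and choose CopyImage sources quite differently.

```python
from enum import Enum


class Strategy(Enum):
    UPLOAD = "upload to OSS and import"
    FC_COPY = "copy via temporary FC function and import"
    COPY_IMAGE = "CopyImage from another region"


def choose_strategy(has_oss, censored, has_fc):
    """Pick an import strategy from three (assumed) capability flags."""
    if has_oss and not censored:
        return Strategy.UPLOAD
    if has_oss and has_fc:
        # OSS works but is censored: use FC to reach uncensored buckets
        return Strategy.FC_COPY
    # Censored without FC, or no working OSS at all
    return Strategy.COPY_IMAGE


def assign_sources(targets, sources):
    """Spread CopyImage targets round-robin across the source regions."""
    if not sources:
        raise ValueError("at least one source region is required")
    return {
        target: sources[index % len(sources)]
        for index, target in enumerate(targets)
    }


# Placeholder region names, not real Alibaba Cloud region identifiers
plan = assign_sources(["target-1", "target-2", "target-3"],
                      ["source-a", "source-b"])
```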
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Spinning up ECS instances is supported in all ECS regions (unlike
Function Compute), but turns out to be unacceptably unreliable since
Alibaba Cloud has a very irritating tendency to fail to launch ECS
instances for a variety of spurious and unpredictable reasons.
Rewrite the censorship bypass mechanism to use the (extremely slow)
CopyImage API call to copy an imported image from an uncensored region
to a censored region.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Function Compute is unsupported in several Alibaba Cloud regions.
Rewrite the censorship bypass mechanism to access OSS buckets using a
temporary ECS instance instead of a temporary Function Compute
function.
Importing images now requires that the account has been prepared using
the "ali-setup" script, which creates the necessary role, VPCs, and
vSwitches to allow ECS instances to be launched in each region.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Importing images into Alibaba Cloud currently relies upon using a
temporary Function Compute function to work around Chinese state
censorship laws that prevent direct access to OSS bucket contents in
mainland China regions.
Unfortunately, Alibaba Cloud regions are extremely asymmetric in terms
of feature support. (For example, some regions do not even support
IPv6 networking.) Several mainland China regions do not support
Function Compute, and so this workaround is not available for those
regions.
A possible alternative censorship workaround is to create temporary
ECS virtual machine instances instead of temporary Function Compute
functions. This requires the existence of a role that can be used by
ECS instances to access OSS. We cannot use the AliyunFcDefaultRole
that is currently used by Function Compute, since this role cannot be
assumed by ECS instances.
Creating roles is a privileged operation, and it would be sensible to
assume that the image importer (which may be running as part of a
GitHub Actions workflow) may not have permission to itself create a
suitable temporary role. The censorship bypass role must therefore be
set up once in advance by a suitably privileged user.
Add the ability to create a suitable censorship bypass role to the
Alibaba Cloud setup utility.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Creating ad hoc instances in Alibaba Cloud is extremely cumbersome and
tedious due to the need to specify an explicit vSwitch and security
group, with no defaults being available.
Add a utility that will create a VPC within each region, a vSwitch
within each zone within each region, and a security group within each
region.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Update the descriptive text for the disk log console tools to remove
references to INT13, since these now work for both BIOS and UEFI disk
log consoles.
Leave the script names as {aws,gce,ali}-int13con, to avoid breaking
any existing tooling that might use these names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the UEFI CPU architecture to be detected for the partitioned
disk images generated by genfsimg as of commit 2c84b68 ("[build] Use a
partition table in generated USB disk images").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Following the examples of aws-int13con and gce-int13con, add a utility
that can be used to read the INT13 console log from a used iPXE boot
disk in Alibaba Cloud Elastic Compute Service (ECS).
We cannot reliably access the used iPXE boot disk (or a snapshot
created from it) since OSS buckets in mainland China cannot be
accessed due to Chinese laws. We therefore create a snapshot and
attach this snapshot as a data disk to a temporary Linux instance, as
we do in Google Compute Engine.
Unlike in Google Compute Engine, we cannot reliably capture serial
port output from the temporary Linux instance. Issuing the relevant
GetInstanceConsoleOutput API call will cause the output to be captured
once and (unpredictably) cached. Without knowing in advance precisely
when the output is complete, we cannot use this approach to capture
the relevant part of the output.
We therefore use an Alibaba Cloud Linux image that includes the Cloud
Assistant Agent. This allows us to use the RunCommand API call to run
a command on the instance and capture the output, all done via the
control plane so that we are not dependent on having direct network
access to the temporary instance.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Following the examples of aws-import and gce-import, add a utility
that can be used to upload an iPXE disk image to Alibaba Cloud Elastic
Compute Service (ECS) as a bootable image.
The iPXE disk image is first uploaded to a temporary Object Storage
Service (OSS) bucket and then imported as an ECS image. The temporary
bucket is deleted after use.
As with Google Compute Engine, an appropriate image family name is
identified automatically: "ipxe" for BIOS images, "ipxe-uefi-x86-64"
for x86_64 UEFI images, and "ipxe-uefi-arm64" for AArch64 UEFI images.
This allows the latest image within each family to be launched without
needing to know the precise image name.
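The family naming scheme amounts to a simple lookup. In this sketch the function name and the architecture keys ("x86_64", "arm64", and None for BIOS) are assumptions; only the three family names come from the description above.

```python
# Family names as described above; the dictionary keys are assumed
FAMILIES = {
    None: "ipxe",
    "x86_64": "ipxe-uefi-x86-64",
    "arm64": "ipxe-uefi-arm64",
}


def image_family(uefi_arch=None):
    """Return the image family for a detected UEFI architecture.

    A value of None is taken to mean a BIOS image.
    """
    try:
        return FAMILIES[uefi_arch]
    except KeyError:
        raise ValueError("unsupported architecture: %r" % (uefi_arch,))
```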
Copies of the images are uploaded to all selected regions. One major
complication is that OSS buckets in mainland China can be created but
cannot be accessed due to Chinese laws, which require an ICP filing
for any bucket hosted in mainland China. We work around this
restriction by first uploading the image to a region outside mainland
China and then using a temporary Function Compute function running in
each region to copy the images to the OSS bucket via the internal OSS
endpoints, which are not subject to the same restrictions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The storage client is currently constructed with the project inferred
from the environment, rather than using the project specified via the
command line arguments.
Fix by passing the project name to the storage client constructor.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Making images public is blocked by default in new AWS regions. Remove
this block automatically whenever creating a public image.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the "--retain <N>" option to limit the number of retained old AMI
images (within the same family, architecture, and public visibility).
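One way to picture the retention rule is sketched below. The dictionary keys and the helper name are hypothetical, and the sketch assumes that its input already contains only the old images, each with an ISO 8601 creation timestamp.

```python
from collections import defaultdict


def images_to_prune(images, retain):
    """Return the old images that exceed the retention limit.

    Images are grouped by (family, arch, public); within each group the
    newest `retain` images are kept and the rest are returned.
    """
    if retain < 0:
        raise ValueError("retain must not be negative")
    groups = defaultdict(list)
    for image in images:
        key = (image["family"], image["arch"], image["public"])
        groups[key].append(image)
    doomed = []
    for group in groups.values():
        # ISO 8601 timestamps sort chronologically as plain strings
        group.sort(key=lambda image: image["created"], reverse=True)
        doomed.extend(group[retain:])
    return doomed


old_images = [
    {"name": "a", "family": "iPXE", "arch": "x86_64", "public": True,
     "created": "2024-01-01T00:00:00Z"},
    {"name": "b", "family": "iPXE", "arch": "x86_64", "public": True,
     "created": "2024-03-01T00:00:00Z"},
    {"name": "c", "family": "iPXE", "arch": "x86_64", "public": True,
     "created": "2024-02-01T00:00:00Z"},
    {"name": "d", "family": "iPXE", "arch": "arm64", "public": True,
     "created": "2024-01-15T00:00:00Z"},
]
pruned = [image["name"] for image in images_to_prune(old_images, retain=1)]
```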
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for easier identification of images and snapshots created by the
aws-import script by adding tags for image family (e.g. "iPXE") and
architecture (e.g. "x86_64") to both.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Following the example of aws-int13con, add a utility that can be used
to read the INT13 console log from a used iPXE boot disk in Google
Compute Engine.
There seems to be no easy way to directly read the contents of either
a disk image or a snapshot in Google Cloud. Work around this
limitation by creating a snapshot and attaching this snapshot as a
data disk to a temporary Linux instance, which is then used to echo
the INT13 console log to the serial port.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Following the example of aws-import, add a utility that can be used to
upload an iPXE disk image to Google Compute Engine as a bootable
image. For example:
  make CONFIG=cloud EMBED=config/cloud/gce.ipxe \
       bin-x86_64-pcbios/ipxe.usb bin-x86_64-efi/ipxe.usb
  make CONFIG=cloud EMBED=config/cloud/gce.ipxe \
       CROSS=aarch64-linux-gnu- bin-arm64-efi/ipxe.usb
  ../contrib/cloud/gce-import -p \
       bin-x86_64-pcbios/ipxe.usb \
       bin-x86_64-efi/ipxe.usb \
       bin-arm64-efi/ipxe.usb
The iPXE disk image is automatically wrapped into a tarball containing
a single file named "disk.raw", uploaded to a temporary bucket in
Google Cloud Storage, and used to create a bootable image. The
temporary bucket is deleted after use.
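The wrapping step can be approximated in memory with the standard tarfile module. The function name is hypothetical, and the choice of gzip compression with the GNU tar format is an assumption about what the image import accepts rather than something stated above.

```python
import io
import tarfile


def wrap_disk_image(raw):
    """Wrap raw disk image bytes in a tarball containing only disk.raw."""
    buffer = io.BytesIO()
    # Assumed container settings: gzip-compressed GNU-format tar
    with tarfile.open(fileobj=buffer, mode="w:gz",
                      format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name="disk.raw")
        info.size = len(raw)
        tar.addfile(info, io.BytesIO(raw))
    return buffer.getvalue()


# Round-trip a small dummy image to show the resulting layout
tarball = wrap_disk_image(b"\x00" * 1024)
with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:gz") as tar:
    names = tar.getnames()
    size = tar.getmember("disk.raw").size
```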
An appropriate image family name is identified automatically: "ipxe"
for BIOS images, "ipxe-uefi-x86-64" for x86_64 UEFI images, and
"ipxe-uefi-arm64" for AArch64 UEFI images. This allows the latest
image within each family to be launched without needing to know the
precise image name.
Google Compute Engine images are globally scoped and are available
(and cached upon first use) in all regions. The initial placement of
the image may be controlled indirectly by using the "--location"
option to specify the Google Cloud Storage location used for the
temporary upload bucket: the image will then be created in the closest
multi-region to the storage location.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some AWS instance types still do not support serial console output or
screenshots. For these instance types, the only viable way to extract
debugging information is to use the INT13 console (which is already
enabled via CONFIG=cloud for all AWS images).
Obtaining the INT13 console output can be very cumbersome, since there
is no direct way to read from an AWS volume. The simplest current
approach is to stop the instance under test, detach its root volume,
and reattach the volume to a Linux instance in the same region.
Add a utility script aws-int13con to retrieve the INT13 console output
by creating a temporary snapshot, reading the first block from the
snapshot, and extracting the INT13 console partition content.
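The extraction step can be sketched as a walk over a classic MBR partition table held in the first block. The partition type value 0xe0 and the helper name are assumptions made for illustration, and the sketch presumes that the whole log partition lies within the block that was read.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # start of the four MBR partition entries
ENTRY_SIZE = 16
LOG_PARTITION_TYPE = 0xE0   # assumed type of the INT13 console partition


def extract_partition(block, ptype=LOG_PARTITION_TYPE):
    """Return the content of the first MBR partition with type ptype."""
    if block[510:512] != b"\x55\xaa":
        raise ValueError("missing MBR boot signature")
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        entry_type = block[offset + 4]
        # Starting LBA and sector count are little-endian 32-bit values
        start, count = struct.unpack_from("<II", block, offset + 8)
        if entry_type == ptype:
            end = (start + count) * SECTOR_SIZE
            if end > len(block):
                raise ValueError("partition extends beyond the block")
            return bytes(block[start * SECTOR_SIZE:end])
    raise LookupError("no partition of type 0x%02x" % ptype)


# Build a synthetic block: MBR plus a two-sector log partition at LBA 1
block = bytearray(4 * SECTOR_SIZE)
block[TABLE_OFFSET + 4] = LOG_PARTITION_TYPE
struct.pack_into("<II", block, TABLE_OFFSET + 8, 1, 2)
block[510:512] = b"\x55\xaa"
block[SECTOR_SIZE:SECTOR_SIZE + 11] = b"hello world"
log = extract_partition(block)
```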
Signed-off-by: Michael Brown <mcb30@ipxe.org>
AMI names must be unique within a region. Add a --overwrite option
that allows an existing AMI of the same name to be deregistered (and
its underlying snapshot deleted).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow both x86_64 and arm64 images to be imported in a single import
command, thereby allowing for e.g.
  make CONFIG=cloud EMBED=config/cloud/aws.ipxe bin/ipxe.usb
  make CONFIG=cloud EMBED=config/cloud/aws.ipxe \
       CROSS=aarch64-linux-gnu- bin-arm64-efi/ipxe.usb
  ../contrib/cloud/aws-import -w amilist.txt -p \
       bin/ipxe.usb bin-arm64-efi/ipxe.usb
This simplifies the process of generating a single amilist.txt file
for inclusion in the documentation at https://ipxe.org/howto/ec2
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The AWS console user interface provides no convenient way to sort AMIs
by creation date.
Provide a default AMI name constructed from the current date and CPU
architecture, to simplify the task of finding the most recent iPXE AMI
in a given AWS region.
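A default name of this kind could be built as follows. The exact format string and the function name are assumptions; the point is only that a name beginning with an ISO date sorts chronologically.

```python
from datetime import date


def default_ami_name(arch, today=None):
    """Build a default AMI name from the date and CPU architecture.

    The "iPXE (<date> <arch>)" layout is a hypothetical format.
    """
    today = today or date.today()
    return "iPXE (%s %s)" % (today.isoformat(), arch)


example = default_ami_name("x86_64", date(2021, 5, 1))
```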
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an option to generate the amilist.txt list of current AMIs, as
included in the EC2 documentation at https://ipxe.org/howto/ec2
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a utility that can be used to upload an iPXE disk image to AWS EC2
as an Amazon Machine Image (AMI). For example:
  make CONFIG=cloud EMBED=config/cloud/aws.ipxe bin/ipxe.usb
  ../contrib/cloud/aws-import -p -n "iPXE 1.21.1" bin/ipxe.usb
Uploads are performed in parallel across all regions, and use the EBS
direct APIs to avoid the need to store temporary files in S3 or to run
VM import tasks.
Signed-off-by: Michael Brown <mcb30@ipxe.org>