Alibaba Cloud will refuse to use images for some instance types unless
the image is explicitly marked as supporting NVMe disks.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The underlying snapshots are not automatically deleted along with the
image, and there is no flag that can be set to cause them to be
automatically deleted.
Tag the underlying snapshots for deletion before deleting the image,
delete the image, and then delete any such tagged snapshots (including
any that may remain from a previous failed deletion attempt).
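The ordering above (tag, delete image, sweep) can be sketched as follows.
This is a minimal illustration over in-memory dictionaries, not the real
API calls: the tag name "pending-delete" and the function and parameter
names are assumptions made for this example only.

```python
# Hypothetical tag marking a snapshot as belonging to a deleted image
DELETE_TAG = "pending-delete"


def delete_image_and_snapshots(images, snapshots, image_id):
    """Delete an image and every snapshot tagged for deletion.

    images:    dict mapping image ID -> list of underlying snapshot IDs
    snapshots: dict mapping snapshot ID -> set of tags
    """
    # 1. Tag the underlying snapshots while the image still identifies them
    for snapshot_id in images[image_id]:
        snapshots[snapshot_id].add(DELETE_TAG)
    # 2. Delete the image itself
    del images[image_id]
    # 3. Sweep all tagged snapshots, including any left over from an
    #    earlier attempt that failed between steps 2 and 3
    tagged = [sid for sid, tags in snapshots.items() if DELETE_TAG in tags]
    for snapshot_id in tagged:
        del snapshots[snapshot_id]
```

Tagging before deleting the image means that a failure part way
through leaves the orphaned snapshots identifiable on the next run.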
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Experimentation suggests Alibaba Cloud API calls are extremely
unreliable, with a failure rate around 1%. It is therefore necessary
to allow for retrying basically every API call.
Some API calls (e.g. DescribeImages or ModifyImageAttribute) are
naturally idempotent and so safe to retry. Some non-idempotent API
calls (e.g. CopyImage) support explicit idempotence tokens. The
remaining API calls may simply fail on a retry, if the original
request happened to succeed but failed to return a response.
We could write convoluted retry logic around the non-idempotent calls,
but this would substantially increase the complexity of the already
unnecessarily complex code. For now, we assume that retrying
non-idempotent requests is probably more likely to fix transient
failures than to cause additional problems.
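A minimal sketch of such a blanket retry wrapper is shown below. The
function name, the default attempt count, and the choice to treat every
exception as transient are assumptions for illustration, not a
description of the actual code.

```python
import time


def retry(call, attempts=5, delay=0.0):
    """Invoke call() until it succeeds or attempts are exhausted.

    attempts is assumed to be at least 1.  Every exception is treated
    as transient (an assumption); the final failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)
```

For a non-idempotent call, a retry after a lost response may itself
fail, which is the trade-off accepted above.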
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The CopyImage API call does work, but is unacceptably slow due to rate
limiting. Importing a full set of images to all regions can take
several hours (and is likely to fail at some point due to transient
errors in making API calls).
Resort to a mixture of strategies to get images imported to all
regions:
- For regions with working OSS that are not blocked by Chinese state
  censorship laws, upload the image files to an OSS bucket and then
  import the images.
- For regions with working OSS that are blocked by Chinese state
  censorship laws but that have working FC, use a temporary FC
  function to copy the image files from the uncensored OSS buckets
  and then import the images.  Attempt downloads from a variety of
  uncensored buckets, since cross-region OSS traffic tends to
  experience a failure rate of around 10% of requests.
- For regions that have working OSS but are blocked by Chinese state
  censorship laws and do not have working FC, or for regions that
  don't even have working OSS, resort to using CopyImage to copy the
  previously imported images from another region.  Spread the
  imports across as many source regions as possible to minimise the
  effect of the CopyImage rate limiting.
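The three cases above amount to a small decision function plus a
round-robin assignment of CopyImage source regions. The sketch below
assumes three boolean capability flags per region; the function names,
flag names, and strategy labels are hypothetical.

```python
from itertools import cycle


def choose_strategy(oss_works, censored, fc_works):
    """Pick the import strategy for one region."""
    if oss_works and not censored:
        return "upload"        # upload to OSS directly, then import
    if oss_works and censored and fc_works:
        return "fc-copy"       # copy via a temporary FC function
    return "copy-image"        # fall back to the slow CopyImage call


def assign_sources(dest_regions, source_regions):
    """Spread CopyImage destinations across the source regions."""
    return dict(zip(dest_regions, cycle(source_regions)))
```

Cycling through the sources keeps any single source region from
absorbing all of the rate-limited CopyImage requests.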
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Spinning up ECS instances is supported in all ECS regions (unlike
Function Compute), but turns out to be unacceptably unreliable since
Alibaba Cloud has a very irritating tendency to fail to launch ECS
instances for a variety of spurious and unpredictable reasons.
Rewrite the censorship bypass mechanism to use the (extremely slow)
CopyImage API call to copy an imported image from an uncensored region
to a censored region.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Function Compute is unsupported in several Alibaba Cloud regions.
Rewrite the censorship bypass mechanism to access OSS buckets using a
temporary ECS instance instead of a temporary Function Compute
function.
Importing images now requires that the account has been prepared using
the "ali-setup" script, which creates the necessary role, VPCs, and
vSwitches to allow ECS instances to be launched in each region.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the UEFI CPU architecture to be detected for the partitioned
disk images generated by genfsimg as of commit 2c84b68 ("[build] Use a
partition table in generated USB disk images").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Following the examples of aws-import and gce-import, add a utility
that can be used to upload an iPXE disk image to Alibaba Cloud Elastic
Compute Service (ECS) as a bootable image.
The iPXE disk image is first uploaded to a temporary Object Storage
Service (OSS) bucket and then imported as an ECS image. The temporary
bucket is deleted after use.
As with Google Compute Engine, an appropriate image family name is
identified automatically: "ipxe" for BIOS images, "ipxe-uefi-x86-64"
for x86_64 UEFI images, and "ipxe-uefi-arm64" for AArch64 UEFI images.
This allows the latest image within each family to be launched without
needing to know the precise image name.
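The family selection can be sketched as a simple lookup. The family
names are those listed above; the function name and the architecture
keys ("x86_64", "arm64", or None for BIOS) are assumptions for
illustration.

```python
def image_family(uefi_arch=None):
    """Return the image family for a detected UEFI architecture.

    uefi_arch is None for a BIOS image (hypothetical convention).
    """
    families = {
        None: "ipxe",
        "x86_64": "ipxe-uefi-x86-64",
        "arm64": "ipxe-uefi-arm64",
    }
    try:
        return families[uefi_arch]
    except KeyError:
        raise ValueError(f"unsupported UEFI architecture: {uefi_arch}")
```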
Copies of the images are uploaded to all selected regions. One major
complication is that OSS buckets in mainland China can be created but
cannot be accessed due to Chinese laws, which require an ICP filing
for any bucket hosted in mainland China. We work around this
restriction by first uploading the image to a region outside mainland
China and then using a temporary Function Compute function running in
each region to copy the images to the OSS bucket via the internal OSS
endpoints, which are not subject to the same restrictions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>