We still perform some tests using Gen 1 on amd64. Standard_D2s_v6 does
not support this, but v5 will presumably be the last version that does,
so hardcode that case.
For Gen 2, you need to set the SKU for amd64 to work, and it has to use
the gallery like arm64 already does.
Using the gallery is possibly slightly slower, so ideally we would only
upload the image once like we do for AWS, but let's just get it working
for now.
Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>
It happens that we have some leftover instances running in an "error"
state (the error comes from the OpenStack scheduled deletion). This
leads to instance creation errors during the test because quota limits
are hit.
Let's clean up everything before running the new tests.
This won't impact tests from other channels as OpenStack is limited to
one CI job at a time.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Hetzner is having some capacity issues[^1]:
- amd64: CPX plans (CPX11 to CPX51) - Falkenstein (FSN) and Nuremberg (NBG)
- arm64: CAX plans (CAX11 to CAX41) - Helsinki (HEL) and Nuremberg (NBG)
Let's switch the location:
* Helsinki (hel1) for amd64
* Keep Falkenstein (fsn1) for arm64
[^1]: https://status.hetzner.com/incident/aa5ce33b-faa5-4fd0-9782-fde43cd270cf
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Kola's logic for choosing BIOS vs EFI isn't too smart, and not
specifying --qemu-ovmf-vars leads to it passing -bios to QEMU. This
doesn't make sense for arm64, but it did work anyway with the old
firmware in raw format. The new firmware in QCOW2 format doesn't work
this way.
Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>
No need for garbage collection since one temporary project with a 1h
lifespan is allocated for each run.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Co-authored-by: Julian Tölle <julian.toelle97@gmail.com>
This adds support for providing a value for the newly introduced
--azure-kola-vnet kola parameter through the environment. This parameter is
meant to indicate that kola is running inside of a vnet in Azure and the kola
created storage account will be restricted to being accessed from that vnet.
This lets us disable public access to storage accounts.
Needs a corresponding change to Jenkins jobs, because we have no way of
determining what vnet a worker node is connected to programmatically. So it
needs to be defined by the job.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This change makes QEMU_UPDATE_PAYLOAD configurable via
ci-automation/settings.env where it was hard-wired before.
The change also fixes fall-out in qemu_update.sh by ensuring a local tmp
directory is created before it is used by the test.
Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
Switch to using a managed identity instead of file based credentials for
running kola/ore (not plume). This covers our test subscription, but not our
publishing subscription.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The kola run didn't pick up the version that was set up in the build
because the git changes from that step are lost.
Redo the version setup in the kola run to use the same version, and
skip the kola update test if no update payload can be found. In the
future we should copy it over from the GitHub Action artifact.
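A minimal sketch of the skip decision described above, assuming the payload is a local file. The function name `update_test_decision` is hypothetical and not taken from the actual workflow; it only reports whether the update test should run.

```shell
# Hypothetical helper: print "run" when the update payload exists and is
# non-empty, otherwise print "skip" so the caller can leave out the
# kola update test.
update_test_decision() {
    local payload="$1"
    if [ -s "${payload}" ]; then
        echo "run"
    else
        echo "skip"
    fi
}
```

A caller could compare the output against "skip" and drop the update test from the kola invocation in that case.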
The vendor tools on the OEM partition weren't updated. We now want to
ship them as systemd-sysext images which we can easily update. This
change extends the Flatcar A/B update mechanism to cover the OEM
systemd-sysext images. The same mechanism is also able to support
"official" Flatcar extensions, e.g., a ZFS extension.
I seem to have problems getting network connectivity inside the
QEMU VM when running the tests on the Azure machine. I don't know
what the cause is, but for the dev container tests these problems can
be worked around by using the locally provided dev container
image. Make it possible by specifying QEMU_DEVCONTAINER_FILE in the
environment.
This change updates the GitHub Actions kola test runner workflow to use
the new, separated artifacts produced by ci.yaml.
Further, it adds a fix for the devcontainer tests. Devcontainer and bin
packages used in the devcontainer tests are now served from a local
temporary web server.
The change also adds the qemu_update test and provides the respective
update payload.
Lastly, the tests now use a local torcx_manifest.json produced by
ci.yaml, which points to a torcx tarball also served by the local
temporary web server.
For the GitHub CI we have to use --qemu-skip-mangle because the LXC
containers don't have access to loop devices. Running with
--qemu-skip-mangle means that the serial console does not get captured
completely because systemd and dracut messages are missing, and thus we
don't catch these errors in kola.
Make the skipping conditional and use it in Jenkins at least for the
nightlies and releases.
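One way the conditional skipping could look, as a sketch only. The variable name `QEMU_SKIP_MANGLE` and the helper `qemu_mangle_flag` are assumptions for illustration; only the `--qemu-skip-mangle` flag itself comes from the description above.

```shell
# Hypothetical helper: print the kola flag only when the environment asks
# for mangling to be skipped (e.g. in LXC containers without loop devices).
# QEMU_SKIP_MANGLE is an assumed variable name, defaulting to "false".
qemu_mangle_flag() {
    if [ "${QEMU_SKIP_MANGLE:-false}" = "true" ]; then
        echo "--qemu-skip-mangle"
    fi
}
```

The GitHub CI would then export the variable as "true", while Jenkins leaves it unset so that the serial console is captured completely.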
To ensure that we can update from very old releases, add a test with a
fixed old release, here the Stable release that introduced arm64
support, to have the same test logic for both architectures.
Recently we ran into sporadic corruption issues for AWS EC2 AMIs.
We use the streamOptimized VMDK format and it seems to cause problems
on the AWS side, regardless of whether it was created by qemu-img or
vmdk-convert.
Switch to using the plain AMI images for uploading as a workaround.
The qemu_update vendor test was downloading the wrong LTS image when it
was testing the old LTS image. This is because it was using the current
symlink, which for the LTS channel will always point to the new LTS. The
old LTS is available under the current-${YEAR} symlink. We can get the
year from the lts-info file.
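The symlink choice can be sketched roughly as below. The function name `lts_symlink` and its arguments are hypothetical; reading the year out of the lts-info file is left to the caller because the file's format is not described here.

```shell
# Hypothetical helper: print the symlink name to download from.
#   $1: channel name
#   $2: year of the old LTS (optional, assumed to come from lts-info)
# Without a year, or for non-LTS channels, fall back to "current".
lts_symlink() {
    local channel="$1"
    local year="${2:-}"
    if [ "${channel}" = "lts" ] && [ -n "${year}" ]; then
        echo "current-${year}"
    else
        echo "current"
    fi
}
```

Calling it with the LTS channel and a year yields the year-specific symlink, while every other combination keeps the previous behaviour.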
* Add SKIP_COPY_TO_BINCACHE environment variable that will skip
uploading test results to bincache. This is useful if we want to
upload test results as artifacts on GitHub.
* Make QEMU_IMAGE_NAME configurable.
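A sketch of how such a switch might guard the upload step. The function name `maybe_copy_to_bincache` is hypothetical, the upload is only echoed, and treating any non-empty value other than "0" as "skip" is an assumption about the variable's semantics.

```shell
# Hypothetical wrapper around the bincache upload step.
# Assumption: a non-empty SKIP_COPY_TO_BINCACHE other than "0" disables it.
maybe_copy_to_bincache() {
    local results="$1"
    if [ -n "${SKIP_COPY_TO_BINCACHE:-}" ] && [ "${SKIP_COPY_TO_BINCACHE}" != "0" ]; then
        echo "skipping bincache upload of ${results}"
        return 0
    fi
    # The real upload command would go here; this sketch only reports it.
    echo "uploading ${results} to bincache"
}
```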
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The new build pipeline compresses images already but uploaded both the
compressed and uncompressed files because the whole build folder gets
uploaded.
Add a new flag "--only_store_compressed" to the image generation which
deletes the uncompressed file after compression is done. Uncompressed
images are still supported if specified in the flag
"image_compression_formats".
Closes https://github.com/flatcar-linux/Flatcar/issues/793
The kola test run time shouldn't be longer than the GC duration to
prevent failing tests caused by GC interference.
Align the Azure kola timeout with the GC duration.
The Azure tests use similar logic to the GCE tests, where the
instance type parameter normally used in AWS/Equinix Metal tests is
used here to specify whether the VM gets started in Gen V1 or V2 mode.
Signed-off-by: Sayan Chowdhury <schowdhury@microsoft.com>
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
The logic we had in some tests for covering different instance types
is now easier to reuse for testing the GVNIC mode in GCE.
Align the GCE test with AWS and DigitalOcean to test an additional
"instance type" (here just changing the NIC) and break the retest spin
in case it gets called for arm64.
The test framework from the AWS PR allows us to align the logic which
also addresses some bugs we had here.
Port the Equinix Metal test over to the new framework (and also use
different test basenames per architecture while at it which could
otherwise result in clashes).