The new m3.small instance does not have official Flatcar support yet,
but we can already cover it in our PXE boot release tests.
The c3.small instances are legacy and m3.small is the new smallest
type.
`c3.large.arm64` instances of Equinix Metal are available in either the
`DA` or the `DC` metro. However, arm64 CI builds recently started to fail
because too few servers are available in the DA metro. As the DC metro has
more servers available, let's change the metro to DC.
How to check how many servers are available in a specific metro:
```
# Capacity query for the DC metro.
# Note: no line-continuation backslashes inside the single-quoted JSON;
# they would be passed literally and make the body invalid JSON.
curl -X POST \
  -H "Content-Type: application/json" -H "X-Auth-Token: ..." \
  https://api.equinix.com/metal/v1/capacity/metros \
  -d '{"servers": [ {
    "metro": "dc",
    "plan": "c3.large.arm64",
    "quantity": 34
  } ] }'

# Capacity query for the DA metro.
curl -X POST \
  -H "Content-Type: application/json" -H "X-Auth-Token: ..." \
  https://api.equinix.com/metal/v1/capacity/metros \
  -d '{"servers": [ {
    "metro": "da",
    "plan": "c3.large.arm64",
    "quantity": 17
  } ] }'
```
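The two requests above differ only in metro and quantity, so they could be wrapped in a small helper. This is a minimal sketch, not part of the change itself: the function names `capacity_payload` and `check_capacity` and the `METAL_AUTH_TOKEN` variable are assumptions made for illustration.

```sh
# Hypothetical helper: print the JSON body for a capacity query.
# $1 = metro, $2 = plan, $3 = quantity
capacity_payload() {
  printf '{"servers": [{"metro": "%s", "plan": "%s", "quantity": %s}]}\n' \
    "$1" "$2" "$3"
}

# Hypothetical wrapper around the curl call shown above.
# METAL_AUTH_TOKEN is an assumed variable name holding the API token.
check_capacity() {
  curl -sS -X POST \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: ${METAL_AUTH_TOKEN}" \
    https://api.equinix.com/metal/v1/capacity/metros \
    -d "$(capacity_payload "$1" "$2" "$3")"
}
```

With the token exported, `check_capacity dc c3.large.arm64 34` would send the same request as the first command above.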
The kola test run time shouldn't be longer than the GC duration to
prevent failing tests caused by GC interference.
Align the Azure kola timeout with the GC duration.
With the limit of 2 parallel tests, meaning 6 machines, the test time
is ~10 hours which is longer than the GC time. It seems that the
regional capacity is not so limited at the moment and we can try to
increase the number of machines.
Adjust the timeout to reflect the GC time and increase the parallel
tests to 3, meaning 9 machines.
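The relation between parallel tests and machines can be written down as a quick calculation. This is only a sketch: the factor of 3 machines per test is inferred from "2 parallel tests, meaning 6 machines", and the names `MACHINES_PER_TEST` and `machines_for` are hypothetical.

```sh
# Assumption: each parallel test uses 3 machines, as implied by
# "2 parallel tests, meaning 6 machines".
MACHINES_PER_TEST=3

# Print how many machines are provisioned at once for a given parallelism.
machines_for() {
  echo $(( $1 * MACHINES_PER_TEST ))
}
```

`machines_for 2` prints 6 and `machines_for 3` prints 9, matching the numbers above.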
Azure ARM64 instances entered preview, so produce images for them regularly
with every release from now on. Flatcar has supported Azure ARM64 since the
first release with the 5.15 kernel, which was around 3139.0.0.
The GCP Pro test is failing because the hostname is longer than 63 characters:
```
Apr 5 19:52:27.522820 kubelet[1762]: E0405 19:52:27.522513 1762 kubelet_node_status.go:93] "Unable to register node with API server" err="Node \"jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal\" is invalid: metadata.labels: Invalid value: \"jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal\": must be no more than 63 characters" node="jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal"
```
Let's remove `jenkins` and `gce` from the hostname; this
information is not critical for debugging purposes.
The hostname should now look like
"basic-5-91a967ef5450cb932bc5.c.flatcar-212911.internal" or
"pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal".
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
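The 63-character limit from the kubelet error can be checked against the old and new hostnames with plain shell string lengths. A minimal sketch; the helper names `name_length` and `fits_label_limit` are hypothetical.

```sh
# Hypothetical helper: print the length of a name in characters.
name_length() {
  printf '%s\n' "${#1}"
}

# Hypothetical helper: succeed only if the name is at most 63 characters,
# the limit quoted in the kubelet error message.
fits_label_limit() {
  [ "${#1}" -le 63 ]
}
```

The old name `jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal` is 64 characters long and fails the check, while the `pro-5-...` and `basic-5-...` names are 52 and 54 characters long and pass.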
There was a regression for an instance type which could have been
prevented by testing.
Add the same extended test logic used for Equinix Metal to AWS.
Sometimes the Linux build system changes, which leads to an unexpected
kernel config.
Print changes in the kernel config as part of the image diff report.
Rename sztd to zst and amend the changelog. The zstd binary generates a
compressed file with the .zst extension by default.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change makes the Jenkins job output OpenStack images using the gzip
compression format. This allows OpenStack users to directly consume images
by simply specifying the URL to the image. Glance will then download the
image, decompress it and add it to its catalogue.
Fixes #575
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Using the tags on the branch is not enough to find the channel that the
dev build should be related to.
Use the base channel variable which was introduced for this.
Often a change results in unexpected effects on the image, e.g., when
a wrong package version gets chosen or the package installs files under
/etc, or binaries of library dependencies get pulled in. Besides
inspecting the image manually, the package-diff tool also gives
valuable insights.
Run the package-diff tool against the last release and print
the image URL alongside for convenience.
The default branch of both repos, coreos-overlay and portage-stable,
should be `main`. If we check out the `master` branch, which contains
invalid source code that was deprecated many years ago, the build could
sometimes fail, e.g. when trying to build perl 5.26.2 with gcc 10.
Simply delete the code checking out branches, as the part is already
being handled in emerge-gitclone.
The number of tests increased. While we don't yet have a way to reduce
the testing time, let's increase the timeout.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
We override `PARALLEL_TESTS` because a kola run with PARALLEL_TESTS >= 4
causes the tests to provision >= 12 ARM servers at the same time. As the
da11 region does not have that many free ARM servers, the whole test run
will fail. With PARALLEL_TESTS=2 the total number of servers stays < 10.
In addition, we override `timeout` to 10 hours, because it takes more
than 8 hours to run all tests with only 2 tests in parallel.
Equinix Metal ARM servers are not yet available hourly in the default `sv15`
region, so we override `PACKET_REGION` to `Dallas`, where they are available.
We do not override `PACKET_REGION` for both boards at the top level because we
need to keep proximity for PXE booting.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Currently the os/sdk and os/toolchains jobs perform a chroot update whose
results are immediately discarded because the rest of the build uses a fresh
chroot and catalyst. Towards the end of a release period this can extend the
build time by about an hour (longer if rust is involved).
Introduce a `--setuponly` flag that bails after the chroot configuration and
thus skips the chroot update.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
and add the script used for that purpose. This requires access to a GitHub PAT
with 'repo:status' permissions.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Currently the kubeadm tests fail on arm64 because the instance type
only offers 1 vCPU:
cluster.go:117: error execution phase preflight: [preflight] Some fatal errors occurred:
cluster.go:117: [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
Switch to the next larger instance type, which has 2 vCPUs.
If the test is run for ARM64, there is no need to run `update_chroot`
since there is no SDK.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
The SDK can either be a release SDK or a dev build SDK which are stored
in different paths. DOWNLOAD_ROOT_SDK should be based on the
SDK_URL_PATH value which indicates whether it's a release or dev build
path.
bootstrap_sdk runs catalyst.sh, which will try to download the SDK if the
digest verification fails.
Importing the DIGEST allows skipping this step and continuing with the
previously downloaded SDK.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
When PORTAGE_REF or OVERLAY_REF is a number, we can change the way the refspec
is constructed to allow fetching a PR instead of a branch. Checking for
equality using '[' works to detect numbers; bash's '[[' doesn't, because it
evaluates both operands as arithmetic expressions instead of rejecting
non-numeric strings.
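The number check can be sketched as follows. This is an illustration under assumptions, not the code of this change: the helper names `is_number` and `refspec_for` are hypothetical, and the refspec layout is assumed to follow GitHub's `refs/pull/<N>/head` convention for pull requests.

```sh
# Hypothetical helper: succeeds only if $1 is an integer. With '[', a
# non-numeric operand makes -eq fail with an error, which is silenced here.
is_number() {
  [ "$1" -eq "$1" ] 2>/dev/null
}

# Hypothetical helper: print a fetch ref for a PR number or a branch name.
# refs/pull/<N>/head is the ref GitHub publishes for pull request N.
refspec_for() {
  if is_number "$1"; then
    echo "refs/pull/$1/head"
  else
    echo "refs/heads/$1"
  fi
}
```

`refspec_for 123` prints `refs/pull/123/head` and `refspec_for main` prints `refs/heads/main`. In bash, `[[ main -eq main ]]` succeeds because `main` is evaluated as an unset variable with value 0, which is why '[[' cannot be used for this check.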
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>