The new m3.small instance does not have official Flatcar support yet
but we can already cover it in our PXE boot release tests.
The c3.small instances are legacy and m3.small is the new smallest
type.
`c3.large.arm64` instances of Equinix Metal are available in metro
either `DA` or `DC`. However, recently arm64 CI builds started to fail
due to too few servers available in the DA metro. As the DC metro has
more servers available, let's change metro to DC.
How to check how many servers are available in a specific metro:
```
curl -X POST \
-H "Content-Type: application/json" -H "X-Auth-Token: ..." \
https://api.equinix.com/metal/v1/capacity/metros \
-d '{"servers": [ { \
"metro": "dc", \
"plan": "c3.large.arm64", \
"quantity": 34 \
} ] }'
curl -X POST \
-H "Content-Type: application/json" -H "X-Auth-Token: ..." \
https://api.equinix.com/metal/v1/capacity/metros \
-d '{"servers": [ { \
"metro": "da", \
"plan": "c3.large.arm64", \
"quantity": 17 \
} ] }'
```
With the limit of 2 parallel tests, meaning 6 machines, the test time
is ~10 hours which is longer than the GC time. It seems that the
regional capacity is not so limited at the moment and we can try to
increase the number of machines.
Adjust the timeout to reflect the GC time and increase the parallel
tests to 3, meaning 9 machines.
number of test increased. While we don't have yet a way to reduce
testing time, let's increase the timeout.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
We override `PARALLEL_TESTS`, because kola run with PARALLEL_TESTS >= 4
causes the tests to provision >= 12 ARM servers at the same time. As the
da11 region does not have that many free ARM servers, the whole tests
will fail. With PARALLEL_TESTS=2 the total number of servers stays < 10.
In addition, we override `timeout` to 10 hours, because it takes more
than 8 hours to run all tests only with 2 tests in parallel.
Equinix Metal ARM server are not yet hourly available in the default `sv15` region
so we override the `PACKET_REGION` to `Dallas` since it's available in this region.
We do not override `PACKET_REGION` for both board on top level because we need to keep proximity
for PXE booting.
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
The cl.basic and cl.internet tests are different tests which wasn't
clear before. Also, the grep process returns an exit code of 1 if it
didn't find a match, causing the job to cancel. The list of tests is
space separated and should not be quoted but on the other hand, we
do have to handle a literal *.
Look for the right test and handle the grep exit code, and disable
globs for the subshell for preserving a literal *.
The Linux 5.10 stable kernel introduced a regression that we didn't
catch because we only run kola on one hardware type in Equinix Metal.
Validate that a simple network test works on various instance types of
the current hardware generation.
The logic of the inline bash scripts of each job was sometimes
separated into the flatcar-scripts/jenkins/*.sh helpers but mostly
part of the Groovy file. This coupling had its advantages but also
downsides when special cases needed to be added for different release
versions. Other issues were that the inline scripts needed the
backslash character to be escaped twice and Jenkins was not good in
terminating the child processes when stopping a job. Having inline
bash scripts in Groovy also mandated the use of Jenkins to build and
release Flatcar Container Linux which hinders test builds in other CI
platforms.
Move the inline bash scripts fully to to the files in
flatcar-scripts/jenkins/ and create new ones for job that didn't have
a script there yet. Also invoke them through a systemd-run wrapper
script which ensures that all child processes are terminated and also
sets up /opt/bin as additional path for the static lbzcat binary.
A workaround for bash 4 was needed to use a temporary file instead of
the <(cmd) bash feature which caused a strange syntax error, otherwise
the bash commands are moved as they are.