38 Commits

Author SHA1 Message Date
Kai Lueke
41506d0e39 jenkins/kola/azure.sh: Align the timeout with the GC duration
The kola test run time shouldn't be longer than the GC duration to
prevent failing tests caused by GC interference.
Align the Azure kola timeout with the GC duration.
2022-06-02 14:01:11 +09:00
Kai Lueke
730e07fd9e jenkins/kola/packet.sh: Remove hardcoded arm64 parallel test limit
The arm64 tests on EM sometimes hit the timeout.
Remove the hardcoded limit of 3 tests to default to 4 and otherwise
use the overwritten parameter.
2022-06-02 13:59:34 +09:00
Kai Lueke
60dfe14460 jenkins/kola/packet: try to reduce test time by increasing parallelism
With the limit of 2 parallel tests, meaning 6 machines, the test time
is ~10 hours which is longer than the GC time. It seems that the
regional capacity is not so limited at the moment and we can try to
increase the number of machines.
Adjust the timeout to reflect the GC time and increase the parallel
tests to 3, meaning 9 machines.
2022-05-04 16:50:14 +09:00
Kai Lueke
2c29875627 Use gangue compiled from Jenkins
These scripts happened to use the copy of gangue in the SDK, which isn't
expected because they should use the binaries compiled by Jenkins.
2022-04-21 11:34:36 +09:00
Mathieu Tortuyaux
ae73d66a07
kola/gce: shrink hostname to be lower than 63 char
GCP Pro is failing because hostname is > 63 char:
```
Apr  5 19:52:27.522820 kubelet[1762]: E0405 19:52:27.522513    1762 kubelet_node_status.go:93] "Unable to register node with API server" err="Node \"jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal\" is invalid: metadata.labels: Invalid value: \"jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal\": must be no more than 63 characters" node="jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal"
```

Let's remove `jenkins` and `gce` from the hostname; this
information is not critical for debugging purposes.

Hostnames should now look like
"basic-5-91a967ef5450cb932bc5.c.flatcar-212911.internal" or
"pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal"

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-04-06 16:45:31 +02:00
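The length limit behind ae73d66a07 can be checked with a few lines of shell. This is a minimal sketch, not code from the repository: `name_fits_limit` is a hypothetical helper, and the limit of 63 is taken from the kubelet error quoted above.

```shell
# Hypothetical helper (not from the repository): succeed only if the given
# name is at most 63 characters long, the limit quoted in the kubelet error.
name_fits_limit() {
  [ "${#1}" -le 63 ]
}

old_name="jenkins-gce-pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal"
new_name="pro-5-91a967ef5450cb932bc5.c.flatcar-212911.internal"

# Print the length of each name together with the verdict.
for n in "${old_name}" "${new_name}"; do
  if name_fits_limit "${n}"; then
    echo "ok (${#n} chars): ${n}"
  else
    echo "too long (${#n} chars): ${n}"
  fi
done
```

The same check could run before provisioning so that an overlong name fails early instead of at node registration.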
Jeremi Piotrowski
b353d8cf6b jenkins/kola/azure: forward azure private networking parameters to kola
These allow the configuration of virtual network for the created
instances to join, and tell kola to use the private instance IP for
connectivity.
2022-03-14 15:35:24 +01:00
Kai Lueke
d26530d4cb jenkins/kola/packet|aws: silence debug output of concurrent test
The concurrent tests' debug output is not helping and causes confusion.
2022-03-10 15:42:45 +01:00
Kai Lueke
f2c7bcb78d jenkins/kola/aws: run tests on additional instance types
There was a regression for an instance type which could have been
prevented by testing.
Add the same extended test logic used for Equinix Metal to AWS.
2022-03-10 15:10:02 +01:00
Mathieu Tortuyaux
e63a191c16
Merge pull request #222 from flatcar-linux/tormath1/timeout-em
kola/em: increase timeout
2022-02-22 18:04:31 +01:00
Mathieu Tortuyaux
6d0d7ea2ba
Merge pull request #224 from flatcar-linux/tormath1/https
jenkins/kola: use httpS URL for PXE boot
2022-02-11 15:09:15 +01:00
Mathieu Tortuyaux
bd30be56ee
jenkins/kola: use httpS URL for PXE boot
Follow-up of:
* https://github.com/flatcar-linux/mantle/pull/288
* https://github.com/flatcar-linux/Flatcar/issues/527

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-02-09 10:46:42 +01:00
Dongsu Park
39301b007f jenkins: do not check out branches of coreos-overlay and portage-stable
The default branch of both repos, coreos-overlay and portage-stable,
should be `main`. If we check out the `master` branch, which contains
invalid source code that was deprecated many years ago, the build could
sometimes fail, e.g. when trying to build perl 5.26.2 with gcc 10.

Simply delete the code checking out branches, as the part is already
being handled in emerge-gitclone.
2022-02-08 12:26:47 +01:00
Mathieu Tortuyaux
5c4ac96f69
kola/em: increase timeout
The number of tests increased. While we don't yet have a way to reduce
testing time, let's increase the timeout.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-02-06 11:56:12 +01:00
Mathieu Tortuyaux
373976b1eb
jenkins/kola/packet: use metro instead of facility
Follow up of: https://github.com/flatcar-linux/mantle/pull/281

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-01-26 15:49:30 +01:00
Jeremi Piotrowski
7cafff85f4 jenkins/kola/azure: make use of --azure-use-gallery parameter
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-01-07 13:56:32 +01:00
Dongsu Park
5c391e9008 jenkins: override PARALLEL_TESTS for ARM servers in da11
We override `PARALLEL_TESTS`, because kola run with PARALLEL_TESTS >= 4
causes the tests to provision >= 12 ARM servers at the same time. As the
da11 region does not have that many free ARM servers, the whole tests
will fail. With PARALLEL_TESTS=2 the total number of servers stays < 10.
In addition, we override `timeout` to 10 hours, because it takes more
than 8 hours to run all tests only with 2 tests in parallel.
2021-11-25 16:55:10 +01:00
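The server counts in 5c391e9008 follow from simple arithmetic. The sketch below assumes 3 machines per parallel test, which is inferred from the figures in these messages (2 tests for 6 machines, 4 tests for 12); `machines_for` and the default value are hypothetical and only illustrate the override.

```shell
# Assumption inferred from the commit messages: each parallel test
# provisions 3 machines (2 -> 6, 3 -> 9, 4 -> 12).
MACHINES_PER_TEST=3

# Hypothetical helper: number of machines provisioned at the same time.
machines_for() {
  echo $(( $1 * MACHINES_PER_TEST ))
}

# Use the value passed in from the job if set, else an illustrative default.
PARALLEL_TESTS="${PARALLEL_TESTS:-2}"
echo "PARALLEL_TESTS=${PARALLEL_TESTS} needs $(machines_for "${PARALLEL_TESTS}") machines"
```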
Mathieu Tortuyaux
c746ab2333
kola/packet: override EM region for ARM64 server
Equinix Metal ARM servers are not yet available hourly in the default `sv15` region,
so we override `PACKET_REGION` to `Dallas`, where they are available.
We do not override `PACKET_REGION` for both boards at the top level because we need to keep proximity
for PXE booting.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2021-11-22 19:43:27 +01:00
Sayan Chowdhury
2fe1e979c0
jenkins/kola/packet: Remove c3.small.x86 to try running cl.internet
Signed-off-by: Sayan Chowdhury <schowdhury@microsoft.com>
2021-10-22 21:21:39 +05:30
Kai Lueke
0bd6d1aae5 jenkins/kola/aws.sh: use larger arm64 instance type for kubeadm
Currently the kubeadm tests fail on arm64 because the instance type
only offers 1 vCPU:
cluster.go:117: error execution phase preflight: [preflight] Some fatal errors occurred:
cluster.go:117: [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2

Switch to the next larger instance type which has 2 vCPUs.
2021-10-21 14:03:39 +02:00
Sayan Chowdhury
0028f95a26
packet: Update the base URL to point to bucket.release.f-ln
Signed-off-by: Sayan Chowdhury <schowdhury@microsoft.com>
2021-10-21 14:01:09 +05:30
Sayan Chowdhury
e04af554fa
do: Update the base URL to point to bucket.release.f-ln
Signed-off-by: Sayan Chowdhury <schowdhury@microsoft.com>
2021-10-21 12:03:54 +05:30
Mathieu Tortuyaux
5c304ffac9
jenkins/kola/qemu: run update_chroot only for amd
If the test is run for ARM64, there is no need to run `update_chroot`
since there is no SDK.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2021-10-19 11:29:56 +02:00
Kai Lueke
a668d961a9 jenkins: use the SDK_URL_PATH path for DOWNLOAD_ROOT_SDK
The SDK can either be a release SDK or a dev build SDK which are stored
in different paths. DOWNLOAD_ROOT_SDK should be based on the
SDK_URL_PATH value which indicates whether it's a release or dev build
path.
2021-10-15 14:35:08 +02:00
Mathieu Tortuyaux
4e1e707628
jenkins/kola/container: pass PORTAGE_BINHOST to container
Otherwise, it was failing since we check for unbound variable:
```
/bin/bash: line 1: PORTAGE_BINHOST: unbound variable
```

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2021-10-13 10:57:35 +02:00
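The failure in 4e1e707628 is what a shell running with `set -u` does when a variable is missing from its environment. The sketch below reproduces that with `env -i`, which stands in for the container's clean environment; `run_inner` and the placeholder URL are assumptions for illustration, not the repository's code.

```shell
# Hypothetical stand-in for the command executed inside the container.
# env -i starts from an empty environment; "$@" are VAR=value pairs to forward.
run_inner() {
  env -i "$@" /bin/sh -c 'set -u; echo "binhost=${PORTAGE_BINHOST}"'
}

PORTAGE_BINHOST="https://example.invalid/binhost"   # placeholder value

# Without forwarding, the inner shell aborts on the unset variable.
run_inner 2>/dev/null || echo "inner shell failed: variable not forwarded"

# Forwarding the variable explicitly makes it visible to the inner shell.
run_inner "PORTAGE_BINHOST=${PORTAGE_BINHOST}"
```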
Mathieu Tortuyaux
47f5feff68
jenkins/kola/container: fix verify-key sharing in systemd container
`$verify_key` actually holds `--verify-key=verify.asc` so of course
`systemd-nspawn` fails since it does not expect `--verify-key` value.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2021-10-12 16:27:29 +02:00
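One way to recover the bare file name from a value like the one described in 47f5feff68 is a parameter expansion that strips the option prefix. The commit's actual fix is not shown here, so treat this as a sketch; `verify_key_file` is a hypothetical variable name.

```shell
# Value as described in the commit message.
verify_key="--verify-key=verify.asc"

# Strip the leading option name to get only the file name.
verify_key_file="${verify_key#--verify-key=}"

echo "option: ${verify_key}"
echo "file:   ${verify_key_file}"
```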
Jeremi Piotrowski
24128594e0
Merge pull request #163 from flatcar-linux/tormath1-jepio/ci-authenticated
jenkins: use private bucket with authentication for fetching binaries.
2021-10-12 13:33:22 +02:00
Jeremi Piotrowski
cbf003e617 jenkins: use 'cork create' instead of 'cork update'
because we need to pass google credentials to update_chroot, and 'cork update'
doesn't support that.

Add --sdk-url-path to sdk.sh for new cork default.
2021-10-12 13:32:07 +02:00
Mathieu Tortuyaux
7ef55eb15d
kola/dev-container: override binhost to use private GCS
in this commit we make sure to use GCS bucket for dev container tests by
providing the required credentials and the associated fetch command.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2021-10-11 14:42:09 +02:00
Mathieu Tortuyaux
998d2f4fc6
jenkins: add --json-key to cork update commands
it pulls https://github.com/flatcar-linux/mantle/pull/239 to be able to
use `--json-key` in order to access private GCS bucket

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2021-10-07 14:41:16 +02:00
Kai Lueke
e24c456889 jenkins/kola/packet.sh: fix check for skipping machine type tests
The cl.basic and cl.internet tests are different tests which wasn't
clear before. Also, the grep process returns an exit code of 1 if it
didn't find a match, causing the job to cancel. The list of tests is
space separated and should not be quoted but on the other hand, we
do have to handle a literal *.
Look for the right test and handle the grep exit code, and disable
globs for the subshell for preserving a literal *.
2021-09-30 11:50:30 +02:00
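The three points of e24c456889 (unquoted word splitting, a literal `*`, and grep's exit code) can be sketched as follows. `split_tests` and `has_test` are hypothetical helpers, not the functions used in packet.sh.

```shell
# Hypothetical helper: print each entry of a space-separated list on its own
# line. Globbing is disabled in the subshell only, so a literal * survives.
split_tests() {
  (
    set -f
    for t in $1; do   # intentionally unquoted to split on spaces
      printf '%s\n' "$t"
    done
  )
}

# Hypothetical helper: succeed if test name $1 is an exact entry of list $2.
has_test() {
  split_tests "$2" | grep -qxF -- "$1"
}

# grep exits with 1 when nothing matches; testing it in an `if` keeps a
# job running under `set -e` from being cancelled.
if has_test "cl.internet" "cl.basic cl.internet"; then
  echo "run cl.internet"
else
  echo "skip cl.internet"
fi

split_tests "cl.basic *"
```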
Kai Lueke
6a04b54f0b jenkins: run simple network test on different hardware
The Linux 5.10 stable kernel introduced a regression that we didn't
catch because we only run kola on one hardware type in Equinix Metal.
Validate that a simple network test works on various instance types of
the current hardware generation.
2021-09-28 18:10:24 +02:00
Jeremi Piotrowski
c8dd87c095 jenkins: add script to run kola arm64 tests under docker
Included is a dockerfile that installs system deps of kola in a debian:11
image. For the test script, the control flow is:

qemu_uefi.sh
  qemu_uefi_arm64.sh
    (docker)
      qemu_common.sh

qemu_common uses the 'NATIVE_ARM64' variable passed by the jenkins job to control the behavior.
The differences are:

* use git directly to fetch (and verify) the manifest
* set up some symlinks so that /var/tmp is on the same BTRFS partition as $PWD/tmp
* set up symlinks so that we don't have to fix up the installation of mantle to the chroot
* run things directly instead of in chroot through cork

The whole script is executed as root, because kola requires root privileges
anyway and making kvm and sudo work with an arbitrary host user inside the
container would require a custom entrypoint to set up groups.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2021-09-06 14:08:11 +02:00
Kai Lueke
0e8ea8b9d7 jenkins/kola/qemu_common.sh: continue regular tests even if update test fails
The update test runs first but if it fails, we still want to continue
with the regular test suite.
2021-09-02 10:38:02 +02:00
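The behavior described in 0e8ea8b9d7 can be sketched by recording the first failure instead of aborting. `run_update_test` and `run_regular_tests` are hypothetical stand-ins for the kola invocations, and reporting the failure at the end is an assumption of this sketch, not something the commit message states.

```shell
# Hypothetical stand-ins for the two kola runs.
run_update_test()   { echo "update test"; return 1; }   # simulate a failure
run_regular_tests() { echo "regular tests"; return 0; }

run_all() {
  rc=0
  # Remember the failure but keep going, so the regular suite still runs.
  run_update_test || rc=$?
  run_regular_tests || rc=$?
  return "$rc"
}

run_all || echo "overall result: failed (exit code $?)"
```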
Jeremi Piotrowski
a2b3950ac2 jenkins: add support for running tests on Azure Gen2 VMs
This requires passing the --azure-hyper-v-generation=V2 argument to kola. The
vhd/image is the same as for azure gen1 vms, the azure_gen2 specifier is only
for jenkins usage.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2021-08-26 19:14:10 +02:00
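Since the `azure_gen2` specifier only exists on the Jenkins side, the script has to translate it into the kola argument named in a2b3950ac2. A minimal sketch of that mapping, where `kola_generation_arg` and the way the specifier is passed in are assumptions:

```shell
# Hypothetical helper: map the Jenkins-side platform specifier to the extra
# kola argument. Only the flag itself is taken from the commit message.
kola_generation_arg() {
  case "$1" in
    azure_gen2) echo "--azure-hyper-v-generation=V2" ;;
    *)          echo "" ;;
  esac
}

for platform in azure azure_gen2; do
  echo "${platform}: extra args = '$(kola_generation_arg "${platform}")'"
done
```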
Kai Lüke
e96c1c5e6d jenkins/kola/qemu(_uefi): run update test from previous release
The newly enabled update test performs an update from the built image
to itself. This is useful to test that the update mechanism didn't
break but it doesn't say if the built image will be accepted as update
from the previous official release.

Introduce an additional kola run that begins from the previous official
release and tests to update to the built image. Since the test does two
updates it also covers the case of updating from the built image to the
built image. Thus, we can skip the test in the normal run.
This new kola run is done first to keep the qemu-latest symlink valid
for the main test suite.
2021-07-27 11:51:58 +02:00
Kai Lüke
1b70f59cd0 jenkins/kola: share a single qemu script file
2021-07-26 15:01:24 +02:00
Kai Lüke
177bea4a74 Generate test update payload and run the kola update test
The kola update tests need a dev-key-signed update payload. This was
lacking and caused the update tests to be skipped.
Generate the test update payload for both dev builds and release builds
and run the kola tests for both. The test update payload has a special
name to not confuse it with the real update payload for releases, and
we keep the previous behavior to sign releases. Therefore, the
generate_update function wasn't used; instead, the extract_update function
was extended to generate the additional test payload.
2021-07-12 18:49:54 +02:00
Kai Lüke
8eaef708be jenkins: move all inline bash scripts to flatcar-scripts
The logic of the inline bash scripts of each job was sometimes
separated into the flatcar-scripts/jenkins/*.sh helpers but mostly
part of the Groovy file. This coupling had its advantages but also
downsides when special cases needed to be added for different release
versions. Other issues were that the inline scripts needed the
backslash character to be escaped twice and Jenkins was not good at
terminating the child processes when stopping a job. Having inline
bash scripts in Groovy also mandated the use of Jenkins to build and
release Flatcar Container Linux which hinders test builds in other CI
platforms.
Move the inline bash scripts fully to the files in
flatcar-scripts/jenkins/ and create new ones for jobs that didn't have
a script there yet. Also invoke them through a systemd-run wrapper
script which ensures that all child processes are terminated and also
sets up /opt/bin as additional path for the static lbzcat binary.
A workaround for bash 4 was needed to use a temporary file instead of
the <(cmd) bash feature which caused a strange syntax error, otherwise
the bash commands are moved as they are.
2021-06-30 16:31:58 +02:00
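The temporary-file workaround mentioned at the end of 8eaef708be can be sketched like this. `produce_list` is a hypothetical command and `sort` merely stands in for whatever tool reads the file; the scripts' actual commands are not shown in the message.

```shell
# Hypothetical command whose output another tool needs to read from a file.
produce_list() { printf '%s\n' b a c; }

# Instead of process substitution such as: sort <(produce_list)
# write the output to a temporary file and pass the file name.
tmpfile="$(mktemp)"
produce_list > "${tmpfile}"
sorted="$(sort "${tmpfile}")"
rm -f "${tmpfile}"

echo "${sorted}"
```

The temporary file costs an extra cleanup step but avoids depending on how a particular bash version parses `<(cmd)`.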