279 Commits

Author SHA1 Message Date
Krzesimir Nowak
66e72c79a0 ci-automation: Properly escape parameters passed to bash
Forwarding parameters to another bash invocation through a string
interpreted as a bash script is a bit troublesome. It is not enough to
wrap a parameter like 'foo bar' in escaped double quotes (\") to avoid
it being split into two parameters by bash executing the script
string. It mostly works, but there's always a risk of having a path
where this breaks. It's rare.

Wrapping in escaped quotes, be they double or single, also won't
work for passing an array of parameters, so it's even easier here to
trigger globbing or bracket expansion or another unwanted splitting of
supposedly one parameter into multiple. Globbing can be temporarily
disabled with 'set -f' or 'set -o noglob', but this still leaves all
the other special bash characters unescaped. So each parameter in the
array should be escaped before it is put into the script string.

The escaping can be done with `printf` and its `%q` formatter, so
let's do so. For single parameters it is as simple as
`foo_escaped=$(printf '%q' "${foo}")`, for arrays a loop needs to be
used.
2023-04-27 12:55:00 +02:00
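The escaping described in 66e72c79a0 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `escape_params` and the sample parameters are made up for the example.

```shell
#!/bin/bash
# Escape every parameter with printf '%q' so that the result can be
# embedded in a string that another bash process interprets as a script.
# escape_params is a hypothetical helper name, not taken from the repo.
escape_params() {
    local param
    local escaped=()
    for param in "$@"; do
        escaped+=( "$(printf '%q' "${param}")" )
    done
    # Join with spaces; each element is already protected against word
    # splitting, globbing and other expansions.
    printf '%s\n' "${escaped[*]}"
}

params=( 'foo bar' '*.txt' '{a,b}' "it's" )
script="printf '<%s>\n' $(escape_params "${params[@]}")"
# The inner bash receives each parameter as exactly one argument.
bash -c "${script}"
```

Joining the escaped words with plain spaces is enough here because `%q` already quotes every character that bash would otherwise treat specially.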
Krzesimir Nowak
0b83fbf127 sdk_bootstrap: Push a branch also for two-phase SDK nightly builds
The two-phase nightly builds create an intermediate tag first, which
didn't match the main nightly tag regexp before. Because of that, the
commit was not pushed to the main branch. The following final SDK
build had a version that matched the regexp, but the last commit (with
the intermediate tag on it) wasn't in main, and thus was also not
pushed.
2023-04-24 14:41:22 +02:00
Thilo Fromm
401af830d1 scripts, CI, workflows: remove submodule handling (main) 2023-04-13 12:26:36 +02:00
Thilo Fromm
f07cb5f781 tapfile_helper ff.: support TAP and Markdown output
This change adds markdown output support to tapfile helper.
tap_generate_report() has been refactored to use low-level output
functions to write tests; TAP and markdown output is supported and both
are generated by default. Also, it should be straightforward to add
other output formats by implementing the respective low level print
functions.

The markdown output is now used by run-kola-tests.yaml to generate step
output and, if run from a PR, add a comment with test results to the PR.

Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
2023-03-29 21:05:47 +02:00
Thilo Fromm
27d540692f run-kola-tests.yaml: use new artifacts, local web server
This change updates the github actions kola test runner workflow to use
the new, separated artifacts produced by ci.yaml.

Further, it adds a fix for the devcontainer tests. Devcontainer and bin
packages used in the devcontainer tests are now served from a local
temporary web server.

The change also adds the qemu_update test and provides the respective
update payload.

Lastly, the tests now use a local torcx_manifest.json produced by
ci.yaml, which points to a torcx tarball also served by the local
temporary web server.
2023-03-28 10:02:17 +02:00
Kai Lueke
d53d415ef8 Run kola without --qemu-skip-mangle on Jenkins
For the GitHub CI we have to use --qemu-skip-mangle because the LXC
containers don't have access to loop devices. Running with
--qemu-skip-mangle means that the serial console does not get captured
completely because systemd and dracut messages are missing, and thus we
don't catch these errors in kola.
Make the skipping conditional and use it in Jenkins at least for the
nightlies and releases.
2023-03-24 22:17:13 +09:00
Kai Lueke
87e13eb3de ci-automation: Allow git to work on directory owned by other user
The get_git_channel function failed to work which resulted in the
Alpha release job skipping the AWS publishing for the Alpha channel
because it defaulted to the developer channel as fallback when git
refused to work on the directory owned by the build user while running
as the root user. A new version of git caused this behavior change; it also
prints an error message explaining that safe.directory has to be set.

Set the git config entry safe.directory for the /work path when
entering the mantle container where git runs as root while working on
the directory owned by the build user.
2023-02-14 11:39:33 +09:00
Krzesimir Nowak
50183b48b8 ci-automation: Get two files to build vms instead of a whole directory
Getting the contents of the directory in the buildcache involves using
rsync with some ssh invocation to log in as a bincache user. It's not
a thing that will work locally unless the user gets ahold of the SSH
key allowing the user to log in to buildcache as a bincache user.

Replace it by downloading two files that are actually needed for
building vms: an image file and the version file. This just uses curl
and is accessible for everyone.
2023-02-08 14:50:36 +01:00
Krzesimir Nowak
46a250bf33 ci-automation: Report file size changes
This uses the new size-change-report.sh script to print out some
information about largest files being added/removed and files with
greatest increase/decrease in file size between two versions of the
image.
2023-02-02 10:05:02 +01:00
Krzesimir Nowak
219326392c ci-automation: Try reporting the changes in initrd too
This relies on flatcar_production_image_initrd_contents.txt being
uploaded to the server. It also exports the WITHWTD environment
variable with the value 1, which makes the package-diff script try
the wtd contents file variant first.
2023-02-02 10:04:40 +01:00
Mathieu Tortuyaux
b8cafc1b9f
gc: pass OPENSTACK_CREDS to mantle container
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2023-01-03 09:24:16 +01:00
Kai Lueke
bc3a9aeacd qemu_update: Add update test from an old release
To ensure that we can update from very old releases, add a test with a
fixed old release, here the Stable release that introduced arm64
support to have the same test logic for both architectures.
2022-11-29 16:51:27 +01:00
Krzesimir Nowak
fbb962c7f6 ci-automation: Add an environment variable to skip build shortcuts
This will be used for the "run all tests" days in Jenkins.
2022-11-03 12:00:10 +01:00
Kai Lueke
3cb9736c33 ci-automation: Use plain AMI image for uploads
Recently we ran into sporadic corruption issues for AWS EC2 AMIs.
We use the streamOptimized VMDK format and it seems to cause problems
on the AWS side, regardless of whether it was created by qemu-img or
vmdk-convert.
Switch to using the plain AMI images for uploading as a workaround.
2022-10-28 17:21:39 +02:00
Kai Lueke
25dbccc14d ci-automation: Support local patches
For embargoed releases it is useful to apply patches locally to build
with them before they are public. This allows pushing the same patches
to the repo during the Flatcar release at the embargo lift. The result
is the same (as long as the scripts patches did not change parts of the
setup logic that was running before they got applied), we can just build
earlier and thus do the Flatcar release directly on the embargo lift
instead of having to wait with the build because it would require the
patches to be in the repos.
2022-10-27 11:53:33 +02:00
Krzesimir Nowak
06d2aabaa2 ci-automation/vendor-testing/vmware.sh: Fix unbound variable use
This gets triggered when the test is rerun and an existing image is
reused.
2022-10-11 15:25:56 +02:00
Jeremi Piotrowski
de132c62d5
Merge pull request #521 from flatcar/jepio/gpg-import-batch
ci-automation: use --batch when importing gpg key
2022-10-06 09:52:07 +02:00
Kai Lueke
00223be1c7 ci-automation/release: Only upload SDK if a new one was built
A release includes an SDK if its SDK version is the release version.
Only then we need to upload a new SDK container image.
2022-10-04 14:24:28 +02:00
Jeremi Piotrowski
6e11ae3394 ci-automation: use --batch when importing gpg key
All invocations of gpg in ci-automation pass --batch as an argument except the
import. Be consistent by having it included everywhere. Additionally, since
ci-automation runs wrapped in a systemd service, no tty is available so batch
is needed for correctness.
2022-10-04 10:22:43 +02:00
Mathieu Tortuyaux
289cc52c5f
automation/gc: add openstack garbage collector
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-09-29 11:21:25 +02:00
Mathieu Tortuyaux
de8b4eae6a
ci-automation: add openstack to tested vendors
Missing link to enable the tests in the Flatcar test suite.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-09-29 11:21:25 +02:00
Kai Lueke
89495373d9 ci-automation: Ensure to use latest container image
The container image was only created if it didn't exist locally. This
would result in fixes not being in a downstream job that is scheduled
to a different worker node on Jenkins that has a stale copy.
For the build automation we will now always download the latest
container tarball based on comparing the image ID from a new artifact,
and for registry images we pull the container image to make sure that
we don't use a stale copy when we rebuild.
2022-09-29 10:04:23 +02:00
Kai Lüke
dca21df916
Merge pull request #513 from flatcar/kai/container-fallback
ci-automation: Fallback also to the mirror for container download
2022-09-27 17:49:53 +02:00
Kai Lueke
20643b260e ci-automation: Fallback also to the mirror for container download
When there is no SDK container image in the registry, the fallback
looks at bincache but bincache isn't backed up and may be cleaned of
old releases. While this won't be the regular case, the container
image registry may be unavailable (or renamed as happened now), or
people would like to rerun the image job which relies on the packages
container.
2022-09-27 15:53:33 +02:00
Krzesimir Nowak
24213a5c96 ci-automation: Download correct previous image for LTS release
qemu_update vendor test was downloading a wrong LTS image when it was
testing the old LTS image. This is because it was using a current
symlink, which for LTS channel will always point to the new LTS. Old
LTS is available under current-${YEAR} symlink. We can get the
information about year from the lts-info file.
2022-09-27 11:56:39 +02:00
Krzesimir Nowak
2606380396 ci-automation: Fix unbound variable errors
FLATCAR_VERSION and FLATCAR_SDK_VERSION are defined in the version
file, so it should be sourced before trying to use those. Here we try
to do it in a limited scope.

Also, SDK container link should use the dockerized version in a
directory name.
2022-09-27 10:55:08 +02:00
Kai Lueke
326c645647 ci-automation: Fix syntax error 2022-09-26 17:24:53 +02:00
Kai Lueke
bca6e6e41d ci-automation: Don't skip nightly build when the previous one failed
Currently we skip the nightly build if there are no changes. This
doesn't work well when the previous build failed, because the rerun
becomes a no-op and cannot fix the failure.
Check if the main artifacts we expect from a step are found, as simple
heuristic on whether a rerun is needed.
2022-09-26 17:06:21 +02:00
Kai Lueke
18627499c1 Annotate a copied function
I found a duplicate function and verified that it's the only one via
comm -12 <(sort ci-automation/ci_automation_common.sh) <(sort sdk_lib/sdk_container_common.sh) | grep function
I'm not sure if this is due to a case where we only import one but
can't import the other, hence I'm not deleting it now.
2022-09-26 15:39:45 +02:00
Kai Lueke
3fef1eb801 ci-automation/release: Set up secret envs 2022-09-22 18:31:50 +02:00
Kai Lueke
ffee812d32 ci-automation/release: Run plume release only once
We need to run plume only once for each arch, move it out of the loop.
Also, address some smaller things that shellcheck complains about.
2022-09-22 18:31:50 +02:00
Kai Lueke
79d89faf91 ci-automation/secret_to_file: Fix usage from subshell
This failed when used from ( secret_to_file ... VAR ; cat $VAR )
because ( ) starts a new subshell PID and secret_to_file's returned
/proc/PID/fd/X path was then using the wrong PID.
2022-09-22 18:31:50 +02:00
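The PID mismatch behind 79d89faf91 can be reproduced with a few lines. This is only a sketch of the underlying bash behavior, not the actual `secret_to_file` code; the variable `main_pid` is introduced for the illustration.

```shell
#!/bin/bash
# Inside ( ) bash forks a new process, but $$ keeps reporting the PID of
# the main shell. BASHPID reports the PID of the process that is really
# running and that owns any file descriptors opened inside the subshell.
main_pid=$$
(
    if [[ "$$" == "${main_pid}" && "${BASHPID}" != "$$" ]]; then
        echo "in subshell: \$\$ is still the main shell, BASHPID is not"
    fi
    # A /proc path for a descriptor opened here has to be built from
    # BASHPID; /proc/$$/fd/N would point into the parent process.
    echo "descriptor paths live under /proc/${BASHPID}/fd/"
)
```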
Kai Lueke
ef8f20f9dd ci-automation/release: Disable GCS auth for plume pre-release
When GCS auth is expected, plume would upload the AMI list to GCS.
2022-09-22 18:31:50 +02:00
Mathieu Tortuyaux
593cf19a7a release: get product IDs from Jenkins
the JSON object is passed from the Groovy script to the release script,
we just need to extract the correct AWS Marketplace product ID based on
the "<channel>-<arch>".

Exception for the stable-amd64 where we also need to get the stable-pro
product ID.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-09-22 18:31:50 +02:00
Mathieu Tortuyaux
27b62deb81 sdk_container: publish the SDK on a Docker registry
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-09-22 18:31:50 +02:00
Kai Lueke
20ed1ad3a4 ci-automation/release.sh: Run plume to release cloud images
The mantle plume tool has two steps, pre-release is the mere upload and
release is the publication. In the past this was used to run the tests
in between, but we don't do this anymore.
Run plume pre-release and release in a single job. Since plume can't
push to GCS in our case, we upload the files to bincache. Also do the
cloudformation update which was previously done in
flatcar-build-scripts but could only be run after the sync to Origin.
It requires the "aws" tool in the mantle container until we implement
this in plume directly.
2022-09-22 18:31:48 +02:00
Krzesimir Nowak
1585ede78a ci-automation: Implement a stricter image version check
I made a mistake and wrote a version like main-3363-0.0-stuff (note a
dash instead of a dot after the first number). Surprisingly the build
chugged along just fine almost until the end of the image job - it
detected invalid version string when the job wanted to create a
version.txt file:

ERROR   build_image: script called: build_image '--board=amd64-usr' '--group=developer' '--output_root=/home/sdk/build/images' '--only_store_compressed' '--torcx_root=/home/sdk/build/torcx' 'prodtar' 'container'
ERROR   build_image: Backtrace:  (most recent call is last)
ERROR   build_image:   file build_image, line 196, called: split_ver '3363' 'SPLIT'
ERROR   build_image:   file common.sh, line 192, called: die 'Invalid version string '3363''
ERROR   build_image:
ERROR   build_image: Error was:
ERROR   build_image:   Invalid version string '3363'

Let's have a stricter version check in the beginning of the build
process, so the process fails sooner rather than later.
2022-09-19 12:12:37 +02:00
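An early check in the spirit of 1585ede78a might look like the sketch below. The function name `check_version_string`, the list of channels and the exact regexp are assumptions made for the illustration; the check in the repository may accept a different set of strings.

```shell
#!/bin/bash
# Assumed shape: <channel>-<major>.<minor>.<patch>[-<build_id>].
# The channel names and the regexp are illustrative assumptions.
check_version_string() {
    local version="${1}"
    local re='^(main|alpha|beta|stable|lts)-[0-9]+\.[0-9]+\.[0-9]+(-.+)?$'
    if [[ ! "${version}" =~ ${re} ]]; then
        echo "ERROR: invalid version string '${version}'" >&2
        return 1
    fi
}

check_version_string 'main-3363.0.0-stuff' && echo "accepted"
check_version_string 'main-3363-0.0-stuff' || echo "rejected before the build starts"
```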
Kai Lueke
91a26e5e1e Use new github org name "flatcar"
The "flatcar-linux" github org was renamed to "flatcar". There are no
github redirections in place and we have to update all links.
2022-09-14 14:33:27 +02:00
Kai Lueke
edba76c012 Use ghcr.io/flatcar, there are no redirects
The GitHub org rename also moved the ghcr.io container image repo but
in contrast to git repos, there are no redirects!
2022-09-14 14:33:24 +02:00
Krzesimir Nowak
1ecea3544f ci-automation: Change the way we prepare torcx manifest for testing
Now URLs for torcx packages are always present in the torcx manifest,
but for releases they may be pointing to the origin server where the
packages will be eventually uploaded. At the time of running the
tests, those packages are still only in the build cache, so change the
URLs to point to the build cache, so the test can pass.
2022-09-06 14:00:50 +02:00
Krzesimir Nowak
b2d6f7fc6e *: Allow specifying extra URLs for torcx packages
Torcx manifest may contain paths and URLs as locations of
packages. There are two kinds of packages - vendored and
extra. Vendored packages normally have two locations - path to the
directory inside the image where the package is (which is why it's
called vendored), and a URL to the package on some remote
server. Extra packages only have a URL. But the URLs are added only
when we tell the build_torcx_store script to upload the packages at
the same time, which is what the old build pipeline was doing. With
the new pipeline, the upload happens as a separate step, thus the
upload is disabled when invoking build_torcx_store, and so the
packages are not getting URLs set. This change went unnoticed, because
a kola test checking the generated torcx manifest was only checking if
there is at least one location, either path or URL, and all the new
releases have no extra packages, only vendored ones.

When backporting the new pipeline to old LTS, the kola tests started
to fail, because old LTS had one extra package, and this is how I
noticed the problem.
2022-09-06 14:00:50 +02:00
Kai Lueke
b30654ef22 ci-automation: Prepare release job
The old pipeline had a release job where mantle's plume release tool
was invoked to publish the cloud images.
Implement a release job in the new pipeline with the same goals and
eventually even more automation.
2022-09-05 11:41:41 +02:00
Kai Lueke
1319e4c95a ci-automation: Move image change report to own file
To review the image changes and the changelog more easily and in case
of fixes, iterate over it without rebuilding the image, move this logic
to its own file where a new job could call it.
2022-09-05 11:41:41 +02:00
Kai Lüke
7b7c3e5b76
Merge pull request #425 from flatcar-linux/kai/em-m3
Cover Equinix Metal m3.small.x86 instances in release test
2022-09-01 13:34:20 +02:00
Krzesimir Nowak
8b52a9b04c ci-automation: Use an array for storing failed tests 2022-08-31 09:37:18 +02:00
Krzesimir Nowak
8cd06230ba ci-automation: Print failed tests nicer
Instead of printing failed tests like this:

    Failed tests: kubeadm.v1.25.0.cilium.base
    kubeadm.v1.24.1.cilium.base

Do it like this:

    Failed tests:
    kubeadm.v1.25.0.cilium.base
    kubeadm.v1.24.1.cilium.base
2022-08-31 09:37:18 +02:00
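With the failed tests kept in an array (8b52a9b04c), the output format from 8cd06230ba can be produced with a single `printf`. A minimal sketch; the function name `print_failed_tests` is hypothetical.

```shell
#!/bin/bash
# Print a header followed by one failed test per line.
# Nothing is printed when there are no failed tests.
print_failed_tests() {
    if [[ ${#} -gt 0 ]]; then
        echo "Failed tests:"
        printf '%s\n' "$@"
    fi
}

failed_tests=( kubeadm.v1.25.0.cilium.base kubeadm.v1.24.1.cilium.base )
print_failed_tests "${failed_tests[@]}"
```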
Krzesimir Nowak
9e05a07a77 ci-automation: Return 1 on broken cycle
We have set success to true when the test cycle was broken, which was
a hacky way to avoid printing the give up message. But this setting
success to true also meant that the script returned with status 0,
which is wrong.

Add another variable for controlling printing the give up message.
2022-08-31 09:37:18 +02:00
Krzesimir Nowak
6c77ebde54 ci-automation: Break test cycle properly
Create a tapfile and break out of the loop.
2022-08-31 09:37:18 +02:00
Kai Lueke
b8133d92a0 Cover Equinix Metal m3.small.x86 instances in release test
The new m3.small instance does not have official Flatcar support yet
but we can already cover it in our PXE boot release tests.
The c3.small instances are legacy and m3.small is the new smallest
type.
2022-08-24 18:57:17 +02:00
Krzesimir Nowak
73bb00a9d0 ci-automation: Break retest cycle properly in qemu on arm64
Rerunning the test will always yield the same result in this case, so
it's pointless.
2022-08-24 13:48:35 +02:00
Krzesimir Nowak
2d226f864e ci-automation/packages.sh: Fix access to unbound variable
We were running the run_sdk_container script with passing a value of a
variable named version to the script through the -v flag. But nowhere
is the variable defined. This worked under jenkins, because jenkins
job has a version parameter that gets exported into environment under
the same name. But running it manually outside jenkins revealed the
bug.

The script should have been using a vernum variable. Now, the
difference between this variable and the version variable is that
"version" was in form of <channel>-<version>-<build_id>, whereas
"vernum" comes without the channel part. Fortunately,
"run_sdk_container" was stripping the channel part before using this
value, so it makes no difference whether we pass
main-3333.0.0-some-id or just 3333.0.0-some-id.
2022-08-24 13:48:35 +02:00
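The reason both forms work in 2d226f864e is that the channel prefix is dropped before the value is used. A sketch of such stripping, assuming the channel is a lowercase word followed by a dash; `strip_channel` is a hypothetical name, not the code in run_sdk_container.

```shell
#!/bin/bash
# Drop a leading "<channel>-" prefix if there is one; a value that
# already starts with the numeric version is returned unchanged.
strip_channel() {
    local version="${1}"
    local re='^[a-z]+-([0-9].*)$'
    if [[ "${version}" =~ ${re} ]]; then
        version="${BASH_REMATCH[1]}"
    fi
    echo "${version}"
}

strip_channel 'main-3333.0.0-some-id'
strip_channel '3333.0.0-some-id'
```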
Krzesimir Nowak
1974033edd ci-automation: Sync used EquinixMetal region to use for ARM64 servers
Recently we changed the region from DA (Dallas) to DC (Washington),
because there are more ARM64 servers available. Reflect this change in
the new pipeline too.
2022-08-05 11:14:36 +02:00
Krzesimir Nowak
661a4067a1 ci-automation/vendor-testing/azure.sh: Use an array for extra instance types 2022-08-03 16:23:15 +02:00
Krzesimir Nowak
23a05949c1 ci-automation/vendor-testing/azure.sh: Use proper machine size on arm64 2022-08-03 16:22:38 +02:00
Krzesimir Nowak
4d09ab35d6 ci-automation/vendor-testing/azure.sh: Fix unbound variable use
This gets triggered when the test is rerun and an existing image is
reused.
2022-08-03 15:21:00 +02:00
Krzesimir Nowak
7f5282e259 ci-automation/vendor-testing/azure.sh: Fix hyperv generation argument
The "v" must be a capital letter. It seems that Azure got pickier about
parameters it accepts.
2022-08-03 15:21:00 +02:00
Kai Lueke
5e0dc0a85d ci-automation: Move git tagging into own script
When the build system runs the packages jobs for both architectures in
parallel and has to create a new tag, tagging fails due to the race in
the tagging.
Move the git tagging to its own script that is run from a new top-level
job that starts the packages jobs for both architectures.
2022-07-18 19:20:44 +02:00
Krzesimir Nowak
a96a66d222
Merge pull request #376 from flatcar-linux/krnowak/digests
ci-automation: Generate digests for artifacts
2022-07-14 14:42:49 +02:00
Kai Lüke
f83ee4f9a1
Merge pull request #375 from flatcar-linux/kai/print-changelog
ci-automation: Show changes by finding the previous channel
2022-07-14 13:44:22 +02:00
Kai Lueke
da370b54c1 ci-automation: Show changes by finding the previous channel
The image comparison was done against the old release in the channel
we release to instead of the previous release with the same major
version. This means when a channel transition happens we see a large
diff instead of the diff against the previous release. While not bad
for finding problems, this is normally not needed. However, we want
to have two changelogs generated, one against the old release in the
channel we release to and one against the previous release with the same
major version when a transition happens. There was no changelog
printing yet, and this is added now.
2022-07-13 19:11:50 +02:00
Kai Lüke
76b47a00b2
Merge pull request #374 from gabriel-samfira/make-workflow-pluggable
Make the kola test workflow reusable
2022-07-13 18:09:43 +02:00
Gabriel Adrian Samfira
b518c3cdb8
Disable image mangle in qemu tests
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-07-13 14:19:50 +03:00
Gabriel Adrian Samfira
7dc45a4a1f
Make QEMU_UEFI_BIOS configurable
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-07-12 19:46:27 +03:00
Krzesimir Nowak
4e6f44e7b8 ci-automation: Generate digests files for the built artifacts 2022-07-12 16:59:14 +02:00
Krzesimir Nowak
d475d36766 ci-automation: Add a function for generating digests
It works in a similar way to sign_artifacts - it takes a signer, a
list of files and directories, and generates digests next to the
respective files.
2022-07-12 16:59:14 +02:00
Krzesimir Nowak
133cb6b52f ci-automation: Factor out listing files into a separate function
This will come in handy when listing files for creating digests files.
2022-07-12 16:59:14 +02:00
Gabriel Adrian Samfira
dc8cf9c18d
Add configuration options to test functions
* Add SKIP_COPY_TO_BINCACHE environment variable that will skip
    uploading test results to bincache. This is useful if we want to
    upload test results as artifacts on github.
  * make QEMU_IMAGE_NAME configurable

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-07-11 13:32:25 +03:00
Kai Lueke
ef9b0ff820 ci-automation: Only store compressed images
The new build pipeline compresses images already but uploaded both the
compressed and uncompressed files because the whole build folder gets
uploaded.
Add a new flag "--only_store_compressed" to the image generation which
deletes the uncompressed file after compression is done. Uncompressed
images are still supported if specified in the flag
"image_compression_formats".

Closes https://github.com/flatcar-linux/Flatcar/issues/793
2022-07-05 16:13:22 +02:00
Kai Lueke
c1f1404df8 ci-automation: Run package-diff to report image changes
The original pipeline has package-diff commands to print out image
differences compared to the last release. This is used for the release
Go/No-Go QA checks.
Add the same logic to the new pipeline.
2022-06-29 15:23:16 +02:00
Kai Lueke
1b3e9ef188 ci-automation: Use the package container for VM image building
The image job builds an image container that is multiple GBs big and
takes >10 mins to be loaded in the vms job. The vms job can also do its
work by running from the packages container from the packages job when
it fetches the built image from bincache first and assuming the images
job copies it there.
Skip generating the image container and instead use the packages
container for VM image building by copying the image folder first to
bincache and then retrieving it from there. While reworking this we
also address the issue that the VMs container had used the same name
for both architectures, causing a race when both run in parallel on
the same worker.
2022-06-29 15:23:16 +02:00
Kai Lueke
aae09eef4b ci-automation: align VM image compression with existing pipeline
In jenkins/vms.sh the Digital Ocean and OpenStack images get also
compressed as gzip.
Do so for the new pipeline, too.
2022-06-28 18:08:53 +02:00
Gabriel Adrian Samfira
6e03ea1821 Add CI workflow 2022-06-24 15:35:23 +00:00
Krzesimir Nowak
c3e5e754e9
Merge pull request #334 from flatcar-linux/krnowak/sign-images
ci-automation: Sign the artifacts
2022-06-03 17:29:13 +02:00
Krzesimir Nowak
527bd2237b ci-automation: Sign artifacts and upload the signatures
It uses the SIGNER environment variable to decide whether the
signatures should be created or not. It expects the key of the SIGNER
to exist in GPGHOME, and that's what gpg_setup.sh is already doing.

In some places we need to recursively change the owner of the
directory that contains artifacts to be signed, otherwise we won't be
able to create new files with signatures there. This is because some
of the artifacts are either created inside the SDK container (so the
created files belong to root outside the container) or are created
with `sudo`.
2022-06-03 14:59:38 +02:00
Krzesimir Nowak
0e0eb67ca2 ci-automation: Set up keys for signing
Not used for anything yet. This sets up a temporary GPGHOME directory
and a trap that will remove it after we are done.
2022-06-03 14:59:26 +02:00
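The temporary-directory-plus-trap pattern from 0e0eb67ca2 can be sketched as below. The wrapper name `with_temp_gpghome` is hypothetical, and the sketch exports `GNUPGHOME`, the environment variable gpg itself reads, which is assumed to be what the commit message calls GPGHOME.

```shell
#!/bin/bash
# Run a command with a throwaway GnuPG home directory. The body is a
# subshell ( ... ), so the export and the trap do not leak to the caller.
with_temp_gpghome() (
    GNUPGHOME=$(mktemp -d)
    export GNUPGHOME
    # Remove the directory when the subshell exits, on success or failure.
    trap 'rm -rf "${GNUPGHOME}"' EXIT
    "$@"
)

# The command sees the temporary directory; it is gone afterwards.
with_temp_gpghome sh -c 'test -d "${GNUPGHOME}" && echo "temporary home exists"'
```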
Krzesimir Nowak
090d7ec176 ci-automation: Run functions in subshells
The functions are sourcing other files that define global variables,
so they will spill into the caller's shell unnecessarily. We will also
add some functionality that uses traps in follow-up commits, so it's
good to limit the scope of traps too.
2022-06-03 14:58:29 +02:00
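The scoping effect that 090d7ec176 relies on is visible in a two-function comparison. Both function names and the variable `build_setting` are made up for this sketch.

```shell
#!/bin/bash
# A function body in { } runs in the caller's shell; a body in ( ) runs
# in a subshell, so globals it defines disappear when it returns.
leaky()  { build_setting='from leaky'; }
scoped() ( build_setting='from scoped' )

unset build_setting
scoped
echo "after scoped: ${build_setting-<unset>}"
leaky
echo "after leaky:  ${build_setting-<unset>}"
```

The same applies to traps: a trap installed inside the `( )` body fires when that subshell exits and does not replace a trap of the caller.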
Krzesimir Nowak
698d0de129 ci-automation: Trivial fixes
Dropped some trailing whitespace, fixed a typo. Trivial.
2022-06-03 14:56:51 +02:00
Krzesimir Nowak
cec96aeec5 ci-automation/vendor-testing/azure.sh: Small fixes
Fix some comments, quote some variables.
2022-06-02 18:40:06 +02:00
Kai Lueke
6c0fb8959d ci-automation/vendor-testing/azure.sh: Align timeout with GC duration
The kola test run time shouldn't be longer than the GC duration to
prevent failing tests caused by GC interference.
Align the Azure kola timeout with the GC duration.
2022-06-02 14:08:28 +09:00
Sayan Chowdhury
42608d3c67
ci-automation/azure: Add initial container tests infra for Azure (#274)
The Azure tests use logic similar to the GCE tests, where the
instance type parameter normally used in AWS/Equinix Metal tests is
here used to specify whether the VM gets started in Gen V1 or V2 mode.

Signed-off-by: Sayan Chowdhury <schowdhury@microsoft.com>
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-05-27 08:01:59 +02:00
Kai Lueke
cee8a6aadf ci-automation: Push version file early
When a nightly build is started that pushes the version file to the
branch it was doing so only at the end of the build, causing the push
to fail if something else got merged in between.
Push the version file early by generating it the same way it would be
generated by the run_sdk_container/bootstrap_sdk_container scripts.
In the case of the SDK the version file gets the same version for the
OS and the SDK. Add some explanations about the version formats. Note
that the scripts will still rewrite the file but it should be a no-op.
2022-05-23 22:40:02 +09:00
Kai Lueke
95367851fa ci-automation/sdk_bootstrap.sh: Allow omitting the optional parameters
The coreos/portage refs were allowed to be empty strings but the way
the function was run from Groovy the lack of quoting caused the empty
strings to be missing parameters.
Since the two parameters are meant to be optional, support omitting
them.
2022-05-23 19:29:39 +09:00
Krzesimir Nowak
8c3d7b977b ci-automation: Fix potential use of unbound variable error
`local -a stuff` does not make `stuff` a bound array variable, so
checking length of the array will trigger an error about unbound
variable. Fortunately, `local stuff=()` does the trick.
2022-05-11 12:43:08 +02:00
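The safe form named in 8c3d7b977b can be sketched like this. The function `count_items` is hypothetical; it runs in a subshell so that `set -u` does not affect the caller.

```shell
#!/bin/bash
# Under 'set -u', initialize local arrays with =() so that asking for
# their length is safe even before anything was appended.
count_items() (
    set -u
    local stuff=()    # bound, empty array
    local item
    for item in "$@"; do
        stuff+=( "${item}" )
    done
    echo "${#stuff[@]}"
)

count_items
count_items a b
```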
Krzesimir Nowak
bf1bc21498 ci-automation: Do not rerun tests on unrelated instances
We forgot to clear the array with instance tests to rerun, so the list
grew from one iteration to another when going over all the instance
types. I did not spot it before, because I tested it with only one
extra instance.
2022-05-11 12:15:52 +02:00
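The fix in bf1bc21498 amounts to resetting the array at the start of every iteration. A simplified sketch; the function name, the "instance:test1,test2" input format and the variable names are assumptions for the illustration.

```shell
#!/bin/bash
# Each argument looks like "<instance>:<test>,<test>,...".
rerun_plan() {
    local spec instance failed tests_to_rerun
    for spec in "$@"; do
        instance="${spec%%:*}"
        # Reset for every instance type; without this line the list
        # would keep growing from one iteration to the next.
        tests_to_rerun=()
        IFS=',' read -r -a failed <<<"${spec#*:}"
        tests_to_rerun+=( "${failed[@]}" )
        echo "${instance}: ${tests_to_rerun[*]}"
    done
}

rerun_plan 'type-a:test1,test2' 'type-b:test3'
```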
Kai Lüke
42fd4919c4
Merge pull request #331 from flatcar-linux/kai/equinix-metal-refactor
ci-automation/vendor-testing/equinix_metal.sh: Use test framework
2022-05-11 19:12:32 +09:00
Kai Lueke
3fd7825310 ci-automation/vendor-testing/gce.sh: Test GVNIC and break retest cycle
The logic we had in some tests for covering different instance types
is now easier to reuse for testing the GVNIC mode in GCE.
Align the GCE test with AWS and DigitalOcean to test an additional
"instance type" (here just changing the NIC) and break the retest spin
in case it gets called for arm64.
2022-05-11 12:07:58 +09:00
Kai Lueke
9fe14ffe34 ci-automation/vendor-testing/equinix_metal.sh: Use test framework
The test framework from the AWS PR allows us to align the logic which
also addresses some bugs we had here.
Port the Equinix Metal test over to the new framework (and also use
different test basenames per architecture while at it which could
otherwise result in clashes).
2022-05-11 11:39:30 +09:00
Krzesimir Nowak
d60d514482 ci-automation: Make AWS test script to work 2022-05-10 12:46:33 +02:00
Krzesimir Nowak
6278762fa8 ci-automation: Add helper functions for running tests on multiple instances 2022-05-10 12:46:33 +02:00
Krzesimir Nowak
e1d9beaeee ci-automation: Fix passing multiple test names to vendor scripts 2022-05-10 12:46:02 +02:00
Krzesimir Nowak
f0765e22c3 ci-automation: Let vendor scripts know if this is their first run
I will need it to correctly handle test reruns as we will need to
handle passed test names differently on first runs than on reruns.
2022-05-10 12:46:02 +02:00
Dongsu Park
76abe0d9cb ci-automation: Add WIP AWS test script for CI automation 2022-05-10 12:45:43 +02:00
Krzesimir Nowak
a8ac124d53 ci-automation: Add new vendor test for VMware 2022-05-06 12:57:20 +02:00
Krzesimir Nowak
3b3cffabc8 ci-automation: Fix credentials handling in digital ocean 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
3c119f14b2 ci-automation: Fix secret file handling
It can't be done in a subshell, because the file will be gone after
subshell quits.
2022-05-06 09:16:23 +02:00
Krzesimir Nowak
413689c779 ci-automation: Rename some variables and make them overridable 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
197e9a334f ci-automation: Add secrets handling 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
cd2f3f0d6c ci-automation: Drop boilerplate code from digital ocean test 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
2fe896b558 ci-automation: Add retest cycle breaking functionality 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
9650650e4b ci-automation: Add URL template handling 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
b6bb07acdc ci-automation: Initial test script for Digital Ocean 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
d26f2b3b39 ci-automation: Use vendor_test.sh in equinix_metal and gce tests 2022-05-05 13:07:52 +02:00
Krzesimir Nowak
1d6f38a72e ci-automation: Reduce boilerplate in vendor tests
Move the common setup to the vendor_test.sh script, which will be
sourced by the vendor scripts.
2022-05-05 12:57:14 +02:00
Kai Lueke
f7edd4e061 ci-automation: add GCE image test
The GCE image test runs on a single instance type for now. In the
future it would be good to test the new NIC type with the cl.internet
test.
2022-05-05 16:52:42 +09:00
Mathieu Tortuyaux
550e702f90 ci-container/test: add equinix-metal test script
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-05-04 16:34:37 +09:00
Kai Lüke
28ee2a3256
Merge pull request #298 from flatcar-linux/kai/test-lts
CI: Support comparing to current LTS and updating from it
2022-04-29 16:34:47 +09:00
Mathieu Tortuyaux
4bd316ac74
Merge pull request #272 from flatcar-linux/tormath1/pxe
ci-automation/vm: build PXE if Equinix Metal is built
2022-04-28 11:52:52 +02:00
Kai Lueke
9a98cc2917 ci-automation/vms: handle platform names and generate the image formats
The kola test scripts are named by the platforms. The image naming is
also quite difficult to know and remember, e.g., whether "ami" or
"ami_vmdk" is needed for AWS tests and whether it's "vmware" or
"vmware_ova".

To address these problems the vms build stage now accepts the platform
names as format input, and for each platform it will automatically
generate the needed image types to run the tests.
2022-04-28 17:15:02 +09:00
Kai Lueke
c4af762e26 ci-automation/garbage_collect: clean up kola cloud resources
The garbage collect job should also clean up kola resources if a test
job failed to do so due to forced termination or misbehavior. The
cleanup is done by "ore" which needs credentials like kola.

Run ore from the mantle container image. Unfortunately Docker does not
support Podman's --env-host option and the env vars had to be passed
explicitly. While --env-file=<(env) would work it contains a lot of
variables that cause the container to behave a bit weird.
2022-04-28 16:27:14 +09:00
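A hedged sketch of passing variables explicitly, as the message above describes. The helper build_env_args, the env_args array, and the variable names in the usage note are hypothetical; the only Docker behaviour relied on is that 'docker run --env NAME' without '=value' copies NAME from the caller's environment.

```shell
#!/usr/bin/env bash
# Collect '--env NAME' arguments for those of the given variable names
# that are set in the current shell. Unset names are skipped.
build_env_args() {
    env_args=()
    local name
    for name in "$@"; do
        # ${!name+set} expands to "set" only if the variable named by
        # $name exists (bash indirect expansion).
        if [ -n "${!name+set}" ]; then
            env_args+=(--env "${name}")
        fi
    done
}
```

A caller could then run something like: build_env_args EXAMPLE_CLOUD_TOKEN EXAMPLE_CLOUD_REGION, followed by docker run --rm "${env_args[@]}" with the image and command of its choice. The variables must be exported for Docker to see their values.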
Kai Lueke
856929d357 CI: Support comparing to current LTS and updating from it
When the restriction that the CI can't access the LTS release is gone
we can support running the image comparison and the kola update test.
2022-04-26 15:00:31 +09:00
Krzesimir Nowak
1916936e34 ci-automation: Update test.sh script docs
We are not using SDK container for running the tests any more - it was
replaced with mantle container. Update the docs accordingly.
2022-04-20 16:34:07 +02:00
Mathieu Tortuyaux
19ca42b3dd
ci-automation/vm: build PXE if Equinix Metal is built
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-04-20 16:33:00 +02:00
Kai Lueke
98e9947a06 ci-automation: silence rsync
The rsync copy logs made it hard to navigate the job output.
Remove the --progress and -v flags.
2022-04-20 19:13:02 +09:00
Kai Lueke
da0380c7e8 Run CI container pipeline kola tests with the new mantle image
The SDK container does not exist for arm64 and is quite heavy. We
currently also resort to an unconditional rebuild of mantle inside
the SDK.
Use the new mantle container image to run kola tests, and pin its
version through a text file that gets updated by GitHub Actions.
2022-04-20 19:13:02 +09:00
Mathieu Tortuyaux
de7e05403b
ci-automation/vms: rename equinix_metal to packet
This is required to keep "packet" in the SDK lingo while the user can
use the "equinix_metal" term.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Co-authored-by: Krzesimir Nowak <knowak@microsoft.com>
2022-04-13 13:09:51 +02:00
Kai Lüke
7376494ef2
Merge pull request #266 from flatcar-linux/kai/sdk-from-release-tag
ci-automation: use a single git tag and skip nightlies with no changes
2022-04-04 17:12:36 +02:00
Kai Lueke
bd970357c8 ci-automation: use a single git tag and skip nightlies with no changes
The pipeline created two tags if an SDK was built, one for the SDK and
one for the OS build (which was a free-standing tag or a local state
that was equivalent to the existing tag of the same name). The
nightlies created update commits on the main branch, even if no change
was done, and on the release branches we lacked these commits.

Create the release tag in the nightly SDK bootstrap already and reuse
it for the nightly OS build. Instead of local state, checkout the
existing tags explicitly. Extend the nightly update commit logic to
cover release branches and detect if we can skip building because no
changes were done.
2022-04-01 17:18:51 +02:00
Thilo Fromm
6dcfd9aeb6 ci-automation/test.sh: remove PARALLEL_TEST passing (move to CI)
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-04-01 13:59:47 +02:00
Thilo Fromm
1e0dc777fe ci-automation/test.sh: export PARALLEL_TESTS
Export PARALLEL_TESTS in the container's .env file to ensure it is
passed to the vendor script.
2022-03-23 12:11:12 +01:00
Thilo Fromm
9b83d3e80b
Merge pull request #258 from flatcar-linux/t-lo/ci-automation-tests-use-latest-kola
ci-automation/test.sh: use the latest kola from coreos-overlay
2022-03-16 17:04:16 +01:00
Kai Lueke
c149d24ced run_sdk_container: Fall back to tar ball download for SDK image
The nightly SDK image is not pushed to a registry but has to be
downloaded from the build server as tar ball.
Fall back to the tar ball import for a better user experience.
To reuse the ci logic it had to support the "docker" env variable.
The use of the pigz container is not always needed if the user has
pigz available.
2022-03-16 15:31:03 +01:00
Thilo Fromm
6286b0a442 ci-automation/test.sh: use the latest kola from coreos-overlay 2022-03-16 14:14:46 +01:00
Thilo Fromm
a9700e16fb ci-automation/tapfile_helper_lib.sh: remove non-printable ASCII
Jenkins TAP file parser does not process non-printable ASCII characters
but bails out. This change removes all ASCII < 0x1F, so non-printable
characters are not included in the TAP report.

Fixes
    Caused by: unacceptable character '' (0x1B) special characters are not allowed
2022-03-16 09:37:48 +01:00
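One way to drop control characters such as the 0x1B (ESC) byte from the error above, shown as a sketch rather than the actual change. The function name strip_control_chars is hypothetical, and keeping tab and newline is an assumption made here so that the output stays line-oriented.

```shell
#!/usr/bin/env bash
# Delete ASCII control characters from stdin, but keep tab (011) and
# newline (012). The octal ranges cover 0x00-0x08 and 0x0B-0x1F.
strip_control_chars() {
    tr -d '\000-\010\013-\037'
}
```

For example, piping a line with ANSI colour codes through strip_control_chars removes the ESC bytes and leaves the printable remainder of each escape sequence in place.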
Thilo Fromm
53c90388c0 ci-automation/vendor-testing/qemu_update.sh: fix unbound
One-line fix to resolve
    ci-automation/vendor-testing/qemu_update.sh: line 64: testscript: unbound variable
error.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-15 17:39:28 +01:00
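For context, a sketch of how an "unbound variable" error arises under 'set -u' and one common way to guard against it. This is not necessarily the one-line fix that was applied; the variable name testscript is taken from the error message, and describe_testscript is a hypothetical function.

```shell
#!/usr/bin/env bash
# The function body is a subshell, so 'set -u' does not leak to the caller.
describe_testscript() (
    set -u
    # "${testscript:-}" expands to an empty string instead of aborting
    # with "unbound variable" when testscript was never assigned.
    if [ -z "${testscript:-}" ]; then
        printf 'testscript is not set\n'
    else
        printf 'testscript=%s\n' "${testscript}"
    fi
)
```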
Kai Lüke
d3aa4f1331
Merge pull request #247 from flatcar-linux/kai/set-official
ci-automation: set images as official based on version
2022-03-10 12:41:10 +01:00
Thilo Fromm
b6caa4163d
Merge pull request #246 from flatcar-linux/t-lo/ci-automation-tests-pass-parallel-env-to-container
automation/test.sh: pass PARALLEL_TESTS to container
2022-03-10 12:24:55 +01:00
Kai Lüke
fff05f00c5
Merge pull request #245 from flatcar-linux/kai/sdk-tests
ci-automation: print kola command line
2022-03-10 12:06:07 +01:00
Thilo Fromm
194c503b56
Merge pull request #249 from flatcar-linux/t-lo/ci-automation-tapfile-ascii
ci-automation/tapfile_helper_lib.sh: only ASCII chars
2022-03-10 11:34:04 +01:00
Thilo Fromm
aa6a742efa ci-automation/tapfile_helper_lib.sh: only ASCII chars
This change removes all non-ASCII characters from test debug / error
output when ingesting the output for inclusion in the TAP report.
Jenkins TAP parser does not handle some unicode chars, leading to tap
parser errors with e.g. Cilium output (which uses unicode).

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-10 10:20:11 +01:00
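A minimal sketch of filtering test output down to ASCII before it goes into a TAP report. The function name strip_non_ascii is hypothetical and the helper library may do this differently.

```shell
#!/usr/bin/env bash
# Keep only bytes 0x00-0x7F. LC_ALL=C makes tr work on single bytes, so
# every byte of a multi-byte UTF-8 character is dropped.
strip_non_ascii() {
    LC_ALL=C tr -cd '\000-\177'
}
```

Fed the UTF-8 text "café ok", this yields "caf ok": the two bytes that encode "é" are removed and nothing is substituted for them.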
Jeremi Piotrowski
25cf7c4fc5 ci-automation/tapfile_helper_lib: fix committing last transaction
Move the final db commit to inside the subshell. Since the while loop
runs inside a subshell, the SQL variable outside of the subshell is not
modified, and so the last contents of the SQL variable are dropped. This
shows up when the last couple test cases don't have an error message,
and simply append the transaction to 'SQL'. They are never written to
the db.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-03-10 09:14:42 +01:00
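The behaviour described above can be reproduced in a few lines. This is an illustrative sketch, not the library code: the function names and the table name t are hypothetical, and printing SQL stands in for the final db commit.

```shell
#!/usr/bin/env bash
# Buggy shape: the while loop is the right-hand side of a pipeline, so it
# runs in a subshell. Its changes to SQL are gone once the loop finishes.
collect_buggy() {
    local SQL=''
    printf '%s\n' "$@" | while read -r name; do
        SQL="${SQL}INSERT INTO t VALUES ('${name}');"
    done
    printf '%s' "${SQL}"    # prints nothing: SQL is still empty here
}

# Fixed shape: the final step happens inside the same subshell as the
# loop, where the accumulated SQL is still visible.
collect_fixed() {
    printf '%s\n' "$@" | {
        SQL=''
        while read -r name; do
            SQL="${SQL}INSERT INTO t VALUES ('${name}');"
        done
        printf '%s' "${SQL}"
    }
}
```

Calling collect_buggy a b prints nothing, while collect_fixed a b prints both INSERT statements.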
Thilo Fromm
d266229434
ci-automation/test.sh: handle unset PARALLEL_TESTS
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-03-10 08:24:44 +01:00
Kai Lueke
c3ae1ce3b0 ci-automation: set images as official based on version
The image needs to be set into official mode through a helper script
(see jenkins/images.sh) and the COREOS_OFFICIAL variable needs to be
set for prod_image_util.sh/build_image_util.sh/grub_install.sh.
2022-03-09 18:19:52 +01:00
Thilo Fromm
8ca2393eb8 automation/test.sh: pass PARALLEL_TESTS to container
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-09 17:38:15 +01:00
Kai Lueke
79e07bca44 ci-automation: print kola command line
For running kola manually and knowing which parameters were set, it
helps to print the kola command line in the job.
2022-03-09 16:44:06 +01:00
Kai Lüke
0cc95e3b3e
Merge pull request #244 from flatcar-linux/kai/set-group
Fix and improve channel handling
2022-03-04 14:22:10 +01:00
Kai Lueke
db7220eced ci-automation: set the channel from the git tag
Until now we had only "developer" images in the new pipeline.
Based on the git tag like "alpha-1234.0.0" set the channel (group) for
the image and also use this logic when finding the channel in the QEMU
update test.
2022-03-04 13:49:18 +01:00
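A sketch of deriving the channel from a tag such as "alpha-1234.0.0". The function name channel_from_tag is hypothetical, and falling back to "developer" for any other tag is an assumption based on the message above.

```shell
#!/usr/bin/env bash
# Print the channel (group) for a git tag of the form <channel>-<version>.
# Tags without a known channel prefix are treated as developer builds.
channel_from_tag() {
    case "$1" in
        alpha-*|beta-*|stable-*|lts-*)
            # Strip everything from the first '-' onwards.
            printf '%s\n' "${1%%-*}"
            ;;
        *)
            printf 'developer\n'
            ;;
    esac
}
```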
Thilo Fromm
eba1cdb4c2 ci-automation/tapfile_helper_lib.sh: fix CI TAP parse errors
This change fixes and adds more string chars escaping in the test error
debug output ("\" are removed and a bug in removing '"' is fixed),
addressing a parser error the CI encountered when ingesting TAP output.

Furthermore, line numbering is shortened, and test names have a spurious
"-" prefix removed.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-03 16:29:38 +01:00
Thilo Fromm
2b2bfad5e1 ci-automation/tapfile_helper_lib.sh: use subshell, break lines after 200
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-02 15:35:57 +01:00
Thilo Fromm
22af2876e6 ci-automation/tapfile_helper_lib.sh: read test output from file
This change updates the tapfile helper to read test output from a file
instead of passing it inline in the SQL INSERT statement.
This fixes an issue with large error output from tests, which breaks
tap_ingest_tapfile():
    ci-automation/tapfile_helper_lib.sh: line 31: /usr/bin/sqlite3: Argument list too long
Error observed with the cl.toolbox.dnf-install test, which can generate
8000 lines of output. Fix tested with the same output.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-02 13:02:52 +01:00
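The general technique can be sketched without a database. Below, 'env true' and 'wc -c' are stand-ins for the real external command, the 2 MiB payload size is arbitrary, and all variable names are hypothetical. The point is only that a large payload passed as a command-line argument can exceed the kernel's limit, while a file path never does.

```shell
#!/usr/bin/env bash
# Create a payload of about 2 MiB, larger than a single argument of an
# external command may be on Linux.
payload_file=$(mktemp)
head -c 2097152 /dev/zero | tr '\0' 'x' > "${payload_file}"

# Inline: the whole payload becomes one argument. On Linux this typically
# fails with "Argument list too long".
if env true "$(cat "${payload_file}")" 2>/dev/null; then
    inline_result='ok'
else
    inline_result='failed'
fi

# Via file: only the short path is passed and the consumer reads the
# file itself, so the payload size no longer matters.
file_result=$(env sh -c 'wc -c < "$1"' sh "${payload_file}" | tr -d ' ')
rm -f "${payload_file}"
```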
Kai Lüke
ef9914e06e
Merge pull request #235 from flatcar-linux/kai/add-update-test
ci-automation: add update test
2022-03-02 11:27:43 +01:00
Thilo Fromm
b72586c4de
Merge pull request #236 from flatcar-linux/t-lo/tool-for-fetching-build-stage-images
ci-automation/util/fetch_image.sh: fetch CI build stage image
2022-03-02 10:10:17 +01:00
Thilo Fromm
637e5d52ec
Apply suggestions of my favourite proofreader
Krzesimir continues to save me from embarrassing spelling mistakes 💙

Co-authored-by: Krzesimir Nowak <knowak@microsoft.com>
2022-03-02 10:09:59 +01:00
Kai Lueke
a9c3a31efb ci-automation: add missing update test
The kola update test was missing. It is performed as update from the
old image to the newly built payload to ensure that the new image is
compatible with old clients.
2022-02-28 15:59:17 +01:00
Thilo Fromm
88a4df98b1
Merge pull request #239 from flatcar-linux/t-lo/ci-automation-add-qemu-uefi-test
ci-automation/vendor-testing: add qemu_uefi
2022-02-25 12:15:01 +01:00
Thilo Fromm
1b4022d237
Merge pull request #237 from flatcar-linux/t-lo/container-builds-update-version-in-main-branch
ci-automation: SDK build updates version.txt in main branch
2022-02-25 12:14:24 +01:00
Thilo Fromm
9afab3aac4 ci-automation/vendor-testing: add qemu_uefi
This change adds the qemu_uefi.sh vendor test. It reuses most of the
implementation in qemu.sh (qemu_uefi.sh is a soft-link to qemu.sh).

This also enables qemu testing for ARM64.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 13:54:58 +01:00
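A script that is reachable under two names can branch on the name it was invoked with. The following is a sketch of that idea only; the function uses_uefi is hypothetical and the real qemu.sh may tell the two cases apart differently.

```shell
#!/usr/bin/env bash
# Succeed if the given script path names the UEFI variant.
uses_uefi() {
    case "$(basename "$1")" in
        qemu_uefi.sh) return 0 ;;
        *) return 1 ;;
    esac
}
```

Inside the shared script this would be called as uses_uefi "$0", which sees "qemu_uefi.sh" when the script is started through the soft-link and "qemu.sh" otherwise.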
Thilo Fromm
308a2a2ab6 ci-automation/sdk_bootstrap: check submodules for nightly branch push
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 12:06:28 +01:00
Thilo Fromm
82da911c27 ci-automation/sdk_bootstrap: Only push to main in nightlies
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 11:02:34 +01:00
Thilo Fromm
39b65765b4 ci-automation: fix test_update_reruns typo
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 10:11:38 +01:00
Thilo Fromm
7cadae957a ci-automation: SDK build updates version.txt in main branch
This change has sdk_bootstrap update the origin branch when run from the
main branch, updating the SDK and OS version in 'main' for each SDK
bootstrap build.

Release / maintenance branches have the SDK version set in the
versionfile at release time. But main is never updated.

Updating the versionfile in main when a new SDK is built ensures that
dev branches based on main will also use the correct SDK version (e.g.
in subsequent CI builds).
2022-02-23 20:41:57 +01:00