Commit Graph

256 Commits

Author SHA1 Message Date
Kai Lueke
20643b260e ci-automation: Fallback also to the mirror for container download
When there is no SDK container image in the registry, the fallback
looks at bincache but bincache isn't backed up and may be cleaned of
old releases. While this won't be the regular case, the container
image registry may be unavailable (or renamed as happened now), or
people would like to rerun the image job which relies on the packages
container.
2022-09-27 15:53:33 +02:00
Krzesimir Nowak
24213a5c96 ci-automation: Download correct previous image for LTS release
qemu_update vendor test was downloading a wrong LTS image when it was
testing the old LTS image. This is because it was using a current
symlink, which for LTS channel will always point to the new LTS. Old
LTS is available under current-${YEAR} symlink. We can get the
information about year from the lts-info file.
2022-09-27 11:56:39 +02:00
Krzesimir Nowak
2606380396 ci-automation: Fix unbound variable errors
FLATCAR_VERSION and FLATCAR_SDK_VERSION are defined in the version
file, so it should be sourced before trying to use those. Here we try
to do it in a limited scope.

Also, SDK container link should use the dockerized version in a
directory name.
2022-09-27 10:55:08 +02:00
Kai Lueke
326c645647 ci-automation: Fix syntax error 2022-09-26 17:24:53 +02:00
Kai Lueke
bca6e6e41d ci-automation: Don't skip nightly build when the previous one failed
Currently we skip the nightly build if there are no changes. This
didn't work well because a rerun after a failed build became a no-op
and so could not fix the failure.
Check if the main artifacts we expect from a step are found, as a simple
heuristic on whether a rerun is needed.
2022-09-26 17:06:21 +02:00
Kai Lueke
18627499c1 Annotate a copied function
I found a duplicate function and verified that it's the only one via
comm -12 <(sort ci-automation/ci_automation_common.sh) <(sort sdk_lib/sdk_container_common.sh) | grep function
I'm not sure if this is due to a case where we only import one but
can't import the other, hence I'm not deleting it now.
2022-09-26 15:39:45 +02:00
Kai Lueke
3fef1eb801 ci-automation/release: Set up secret envs 2022-09-22 18:31:50 +02:00
Kai Lueke
ffee812d32 ci-automation/release: Run plume release only once
We need to run plume only once for each arch, move it out of the loop.
Also, address some smaller things that shellcheck complains about.
2022-09-22 18:31:50 +02:00
Kai Lueke
79d89faf91 ci-automation/secret_to_file: Fix usage from subshell
This failed when used from ( secret_to_file ... VAR ; cat $VAR )
because ( ) starts a new subshell PID and secret_to_file's returned
/proc/PID/fd/X path was then using the wrong PID.
2022-09-22 18:31:50 +02:00
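The PID mix-up described in 79d89faf91 can be reproduced with a few lines of bash. This is only a sketch and not the real secret_to_file code: proc_fd_path is a hypothetical helper. It relies on the documented bash behavior that `$$` keeps the top-level shell's PID inside `( )` and `$( )`, while `BASHPID` names the process that is actually running.

```bash
# Hypothetical helper, not the real secret_to_file: builds a /proc path
# for descriptor $1 as seen by the process that is currently executing.
proc_fd_path() {
    local fd="$1"
    # Inside ( ... ) or $( ... ), $$ still expands to the top-level shell's
    # PID, while BASHPID is the PID of the process running this line.
    printf '/proc/%s/fd/%s\n' "${BASHPID}" "${fd}"
}

# Compare the two expansions from inside a subshell.
( echo "top-level PID: $$, subshell PID: ${BASHPID}" )
```

A path built from `$$` inside a subshell would point at the parent shell, which does not necessarily hold the descriptor that was opened in the subshell.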
Kai Lueke
ef8f20f9dd ci-automation/release: Disable GCS auth for plume pre-release
When GCS auth is expected, plume would upload the AMI list to GCS.
2022-09-22 18:31:50 +02:00
Mathieu Tortuyaux
593cf19a7a release: get product IDs from Jenkins
the JSON object is passed from the Groovy script to the release script,
we just need to extract the correct AWS Marketplace product ID based on
the "<channel>-<arch>".

Exception for the stable-amd64 where we also need to get the stable-pro
product ID.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-09-22 18:31:50 +02:00
Mathieu Tortuyaux
27b62deb81 sdk_container: publish the SDK on a Docker registry
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-09-22 18:31:50 +02:00
Kai Lueke
20ed1ad3a4 ci-automation/release.sh: Run plume to release cloud images
The mantle plume tool has two steps, pre-release is the mere upload and
release is the publication. In the past this was used to run the tests
in between, but we don't do this anymore.
Run plume pre-release and release in a single job. Since plume can't
push to GCS in our case, we upload the files to bincache. Also do the
cloudformation update which was previously done in
flatcar-build-scripts but could only be run after the sync to Origin.
It requires the "aws" tool in the mantle container until we implement
this in plume directly.
2022-09-22 18:31:48 +02:00
Krzesimir Nowak
1585ede78a ci-automation: Implement a stricter image version check
I made a mistake and wrote a version like main-3363-0.0-stuff (note a
dash instead of a dot after the first number). Surprisingly the build
chugged along just fine almost until the end of the image job - it
detected invalid version string when the job wanted to create a
version.txt file:

ERROR   build_image: script called: build_image '--board=amd64-usr' '--group=developer' '--output_root=/home/sdk/build/images' '--only_store_compressed' '--torcx_root=/home/sdk/build/torcx' 'prodtar' 'container'
ERROR   build_image: Backtrace:  (most recent call is last)
ERROR   build_image:   file build_image, line 196, called: split_ver '3363' 'SPLIT'
ERROR   build_image:   file common.sh, line 192, called: die 'Invalid version string '3363''
ERROR   build_image:
ERROR   build_image: Error was:
ERROR   build_image:   Invalid version string '3363'

Let's have a stricter version check in the beginning of the build
process, so the process fails sooner rather than later.
2022-09-19 12:12:37 +02:00
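An early check of the kind 1585ede78a describes could look roughly like the sketch below. The function name, the channel list, and the exact pattern are assumptions made for illustration; the pattern used by the pipeline may differ.

```bash
# Hypothetical early validation of "<channel>-<major>.<minor>.<patch>[-<build_id>]".
# The channel list and the pattern are assumptions, not taken from the repo.
is_valid_image_version() {
    local version="$1"
    local re='^(main|alpha|beta|stable|lts)-[0-9]+\.[0-9]+\.[0-9]+(-.+)?$'
    [[ "${version}" =~ ${re} ]]
}

# Fail fast, before any expensive build step has started.
if ! is_valid_image_version "main-3363-0.0-stuff"; then
    echo "Invalid version string 'main-3363-0.0-stuff'" >&2
fi
```

With this pattern the mistyped version from the commit message is rejected at the start of the job, because a dot is required directly after the first number.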
Kai Lueke
91a26e5e1e Use new github org name "flatcar"
The "flatcar-linux" github org was renamed to "flatcar". There are no
github redirections in place and we have to update all links.
2022-09-14 14:33:27 +02:00
Kai Lueke
edba76c012 Use ghcr.io/flatcar, there are no redirects
The GitHub org rename also moved the ghcr.io container image repo but
in contrast to git repos, there are no redirects!
2022-09-14 14:33:24 +02:00
Krzesimir Nowak
1ecea3544f ci-automation: Change the way we prepare torcx manifest for testing
Now URLs for torcx packages are always present in the torcx manifest,
but for releases they may be pointing to the origin server where the
packages will be eventually uploaded. At the time of running the
tests, those packages are still only in the build cache, so change the
URLs to point to the build cache, so the test can pass.
2022-09-06 14:00:50 +02:00
Krzesimir Nowak
b2d6f7fc6e *: Allow specifying extra URLs for torcx packages
Torcx manifest may contain paths and URLs as locations of
packages. There are two kinds of packages - vendored and
extra. Vendored packages normally have two locations - path to the
directory inside the image where the package is (which is why it's
called vendored), and a URL to the package on some remote
server. Extra packages only have a URL. But the URLs are added only
when we tell the build_torcx_store script to upload the packages at
the same time, which is what the old build pipeline was doing. With
the new pipeline, the upload happens as a separate step, thus the
upload is disabled when invoking build_torcx_store, and so the
packages are not getting URLs set. This change went unnoticed, because
a kola test checking the generated torcx manifest was only checking if
there is at least one location, either path or URL, and all the new
releases have no extra packages, only vendored ones.

When backporting the new pipeline to old LTS, the kola tests started
to fail, because old LTS had one extra package, and this is how I
noticed the problem.
2022-09-06 14:00:50 +02:00
Kai Lueke
b30654ef22 ci-automation: Prepare release job
The old pipeline had a release job where mantle's plume release tool
was invoked to publish the cloud images.
Implement a release job in the new pipeline with the same goals and
eventually even more automation.
2022-09-05 11:41:41 +02:00
Kai Lueke
1319e4c95a ci-automation: Move image change report to own file
To review the image changes and the changelog more easily and in case
of fixes, iterate over it without rebuilding the image, move this logic
to its own file where a new job could call it.
2022-09-05 11:41:41 +02:00
Kai Lüke
7b7c3e5b76
Merge pull request #425 from flatcar-linux/kai/em-m3
Cover Equinix Metal m3.small.x86 instances in release test
2022-09-01 13:34:20 +02:00
Krzesimir Nowak
8b52a9b04c ci-automation: Use an array for storing failed tests 2022-08-31 09:37:18 +02:00
Krzesimir Nowak
8cd06230ba ci-automation: Print failed tests nicer
Instead of printing failed tests like this:

    Failed tests: kubeadm.v1.25.0.cilium.base
    kubeadm.v1.24.1.cilium.base

Do it like this:

    Failed tests:
    kubeadm.v1.25.0.cilium.base
    kubeadm.v1.24.1.cilium.base
2022-08-31 09:37:18 +02:00
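The output format from 8cd06230ba is easy to get from a bash array, which is also what 8b52a9b04c switches to. A minimal sketch, with print_failed_tests as a hypothetical name:

```bash
# Hypothetical helper: prints a header line followed by one test name per line.
print_failed_tests() {
    local -a failed=("$@")
    if [[ ${#failed[@]} -eq 0 ]]; then
        return 0
    fi
    echo "Failed tests:"
    # printf repeats the format once for every array element.
    printf '%s\n' "${failed[@]}"
}

print_failed_tests kubeadm.v1.25.0.cilium.base kubeadm.v1.24.1.cilium.base
```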
Krzesimir Nowak
9e05a07a77 ci-automation: Return 1 on broken cycle
We have set success to true when the test cycle was broken, which was
a hacky way to avoid printing the give up message. But this setting
success to true also meant that the script returned with status 0,
which is wrong.

Add another variable for controlling printing the give up message.
2022-08-31 09:37:18 +02:00
Krzesimir Nowak
6c77ebde54 ci-automation: Break test cycle properly
Create a tapfile and break out of the loop.
2022-08-31 09:37:18 +02:00
Kai Lueke
b8133d92a0 Cover Equinix Metal m3.small.x86 instances in release test
The new m3.small instance does not have official Flatcar support yet
but we can already cover it in our PXE boot release tests.
The c3.small instances are legacy and m3.small is the new smallest
type.
2022-08-24 18:57:17 +02:00
Krzesimir Nowak
73bb00a9d0 ci-automation: Break retest cycle properly in qemu on arm64
Rerunning the test will always yield the same result in this case, so
it's pointless.
2022-08-24 13:48:35 +02:00
Krzesimir Nowak
2d226f864e ci-automation/packages.sh: Fix access to unbound variable
We were running the run_sdk_container script while passing the value of a
variable named version to the script through the -v flag. But nowhere
is the variable defined. This worked under jenkins, because jenkins
job has a version parameter that gets exported into environment under
the same name. But running it manually outside jenkins revealed the
bug.

The script should have been using a vernum variable. Now, the
difference between this variable and the version variable is that
"version" was in form of <channel>-<version>-<build_id>, whereas
"vernum" comes without the channel part. Fortunately,
"run_sdk_container" was stripping the channel part before using this
value, so it makes no difference whether we pass
main-3333.0.0-some-id or just 3333.0.0-some-id.
2022-08-24 13:48:35 +02:00
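The reason both forms are accepted, as 2d226f864e explains, is that the channel prefix is stripped before the value is used. The sketch below shows one way to do such stripping; strip_channel is a hypothetical function and is not copied from run_sdk_container.

```bash
# Hypothetical helper: removes a leading "<channel>-" if one is present.
strip_channel() {
    local version="$1"
    # Only strip when the string starts with lowercase letters followed by a
    # dash, so "3333.0.0-some-id" is left untouched.
    if [[ "${version}" =~ ^[a-z]+-(.+)$ ]]; then
        version="${BASH_REMATCH[1]}"
    fi
    echo "${version}"
}

strip_channel "main-3333.0.0-some-id"
strip_channel "3333.0.0-some-id"
```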
Krzesimir Nowak
1974033edd ci-automation: Sync used EquinixMetal region to use for ARM64 servers
Recently we changed the region from DA (Dallas) to DC (Washington),
because there are more ARM64 servers available. Reflect this change in
the new pipeline too.
2022-08-05 11:14:36 +02:00
Krzesimir Nowak
661a4067a1 ci-automation/vendor-testing/azure.sh: Use an array for extra instance types 2022-08-03 16:23:15 +02:00
Krzesimir Nowak
23a05949c1 ci-automation/vendor-testing/azure.sh: Use proper machine size on arm64 2022-08-03 16:22:38 +02:00
Krzesimir Nowak
4d09ab35d6 ci-automation/vendor-testing/azure.sh: Fix unbound variable use
This gets triggered when the test is rerun and an existing image is
reused.
2022-08-03 15:21:00 +02:00
Krzesimir Nowak
7f5282e259 ci-automation/vendor-testing/azure.sh: Fix hyperv generation argument
The "v" must be a capital letter. It seems that Azure got pickier about
the parameters it accepts.
2022-08-03 15:21:00 +02:00
Kai Lueke
5e0dc0a85d ci-automation: Move git tagging into own script
When the build system runs the packages jobs for both architectures in
parallel and has to create a new tag, tagging fails due to the race in
the tagging.
Move the git tagging to its own script that is run from a new top-level
job that starts the packages jobs for both architectures.
2022-07-18 19:20:44 +02:00
Krzesimir Nowak
a96a66d222
Merge pull request #376 from flatcar-linux/krnowak/digests
ci-automation: Generate digests for artifacts
2022-07-14 14:42:49 +02:00
Kai Lüke
f83ee4f9a1
Merge pull request #375 from flatcar-linux/kai/print-changelog
ci-automation: Show changes by finding the previous channel
2022-07-14 13:44:22 +02:00
Kai Lueke
da370b54c1 ci-automation: Show changes by finding the previous channel
The image comparison was done against the old release in the channel
we release to instead of the previous release with the same major
version. This means when a channel transition happens we see a large
diff instead of the diff against the previous release. While not bad
for finding problems, this is normally not needed. However, we want
to have two changelogs generated, one against the old release in the
channel we release to and one against the previous release with the same
major version when a transition happens. There was no changelog
printing yet, and this is added now.
2022-07-13 19:11:50 +02:00
Kai Lüke
76b47a00b2
Merge pull request #374 from gabriel-samfira/make-workflow-pluggable
Make the kola test workflow reusable
2022-07-13 18:09:43 +02:00
Gabriel Adrian Samfira
b518c3cdb8
Disable image mangle in qemu tests
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-07-13 14:19:50 +03:00
Gabriel Adrian Samfira
7dc45a4a1f
Make QEMU_UEFI_BIOS configurable
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-07-12 19:46:27 +03:00
Krzesimir Nowak
4e6f44e7b8 ci-automation: Generate digests files for the built artifacts 2022-07-12 16:59:14 +02:00
Krzesimir Nowak
d475d36766 ci-automation: Add a function for generating digests
It works in a similar way to sign_artifacts - it takes a signer, a
list of files and directories, and generates digests next to the
respective files.
2022-07-12 16:59:14 +02:00
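Writing digests next to the files, as d475d36766 describes, can be sketched as follows. This is a simplified illustration: write_digests is a hypothetical name, the ".DIGESTS" suffix and the use of sha256sum are assumptions, and the signer argument and directory handling mentioned in the commit are left out.

```bash
# Hypothetical sketch: for each file, write "<file>.DIGESTS" next to it.
# The suffix and the hash algorithm are assumptions.
write_digests() {
    local file dir base
    for file in "$@"; do
        dir=$(dirname "${file}")
        base=$(basename "${file}")
        # Run inside the directory so the digest file records a bare file
        # name and can be verified from wherever the files end up.
        ( cd "${dir}" && sha256sum "${base}" > "${base}.DIGESTS" )
    done
}
```

A downloaded pair can then be checked with `sha256sum -c <file>.DIGESTS` from the directory that contains both files.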
Krzesimir Nowak
133cb6b52f ci-automation: Factor out listing files into a separate function
This will come in handy when listing files for creating digests files.
2022-07-12 16:59:14 +02:00
Gabriel Adrian Samfira
dc8cf9c18d
Add configuration options to test functions
  * Add SKIP_COPY_TO_BINCACHE environment variable that will skip
    uploading test results to bincache. This is useful if we want to
    upload test results as artifacts on github.
  * Make QEMU_IMAGE_NAME configurable

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-07-11 13:32:25 +03:00
Kai Lueke
ef9b0ff820 ci-automation: Only store compressed images
The new build pipeline compresses images already but uploaded both the
compressed and uncompressed files because the whole build folder gets
uploaded.
Add a new flag "--only_store_compressed" to the image generation which
deletes the uncompressed file after compression is done. Uncompressed
images are still supported if specified in the flag
"image_compression_formats".

Closes https://github.com/flatcar-linux/Flatcar/issues/793
2022-07-05 16:13:22 +02:00
Kai Lueke
c1f1404df8 ci-automation: Run package-diff to report image changes
The original pipeline has package-diff commands to print out image
differences compared to the last release. This is used for the release
Go/No-Go QA checks.
Add the same logic to the new pipeline.
2022-06-29 15:23:16 +02:00
Kai Lueke
1b3e9ef188 ci-automation: Use the package container for VM image building
The image job builds an image container that is multiple GBs big and
takes >10 mins to be loaded in the vms job. The vms job can also do its
work by running from the packages container from the packages job when
it fetches the built image from bincache first and assuming the images
job copies it there.
Skip generating the image container and instead use the packages
container for VM image building by copying the image folder first to
bincache and then retrieving it from there. While reworking this we
also address the issue that the VMs container had used the same name
for both architectures, causing a race when both run in parallel on
the same worker.
2022-06-29 15:23:16 +02:00
Kai Lueke
aae09eef4b ci-automation: align VM image compression with existing pipeline
In jenkins/vms.sh the Digital Ocean and OpenStack images get also
compressed as gzip.
Do so for the new pipeline, too.
2022-06-28 18:08:53 +02:00
Gabriel Adrian Samfira
6e03ea1821 Add CI workflow 2022-06-24 15:35:23 +00:00
Krzesimir Nowak
c3e5e754e9
Merge pull request #334 from flatcar-linux/krnowak/sign-images
ci-automation: Sign the artifacts
2022-06-03 17:29:13 +02:00
Krzesimir Nowak
527bd2237b ci-automation: Sign artifacts and upload the signatures
It uses the SIGNER environment variable to decide whether the
signatures should be created or not. It expects the key of the SIGNER
to exist in GPGHOME, and that's what gpg_setup.sh is already doing.

In some places we need to recursively change the owner of the
directory that contains artifacts to be signed, otherwise we won't be
able to create new files with signatures there. This is because some
of the artifacts are either created inside the SDK container (so the
created files belong to root outside the container) or are created
with `sudo`.
2022-06-03 14:59:38 +02:00
Krzesimir Nowak
0e0eb67ca2 ci-automation: Set up keys for signing
Not used for anything yet. This sets up a temporary GPGHOME directory
and a trap that will remove it after we are done.
2022-06-03 14:59:26 +02:00
Krzesimir Nowak
090d7ec176 ci-automation: Run functions in subshells
The functions are sourcing other files that define global variables,
so they will spill into the callers shell unnecessarily. We will also
add some functionality that uses traps in follow-up commits, so it's
good to limit the scope of traps too.
2022-06-03 14:58:29 +02:00
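The scoping effect that 090d7ec176 relies on can be shown with a function whose body is written in `( )` instead of `{ }`. The sketch below uses hypothetical names; it only illustrates that variables and an EXIT trap set inside such a function stay inside the subshell.

```bash
# Function body in ( ) instead of { }: every call runs in a subshell.
with_temp_dir() (
    tmpdir=$(mktemp -d)
    # The EXIT trap fires when this subshell ends, not when the caller exits.
    trap 'rm -rf "${tmpdir}"' EXIT
    leaked_var="only visible inside the subshell"
    echo "${tmpdir}"
)

dir=$(with_temp_dir)
# The directory is already gone and neither variable reached the caller.
echo "exists: $([[ -e "${dir}" ]] && echo yes || echo no)"
echo "leaked_var in caller: '${leaked_var-}'"
```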
Krzesimir Nowak
698d0de129 ci-automation: Trivial fixes
Dropped some trailing whitespace, fixed a typo. Trivial.
2022-06-03 14:56:51 +02:00
Krzesimir Nowak
cec96aeec5 ci-automation/vendor-testing/azure.sh: Small fixes
Fix some comments, quote some variables.
2022-06-02 18:40:06 +02:00
Kai Lueke
6c0fb8959d ci-automation/vendor-testing/azure.sh: Align timeout with GC duration
The kola test run time shouldn't be longer than the GC duration to
prevent failing tests caused by GC interference.
Align the Azure kola timeout with the GC duration.
2022-06-02 14:08:28 +09:00
Sayan Chowdhury
42608d3c67
ci-automation/azure: Add initial container tests infra for Azure (#274)
The Azure tests use a similar logic as the GCE tests where the
instance type parameter normally used in AWS/Equinix Metal tests is
here used to specify whether the VM gets started in Gen V1 or V2 mode.

Signed-off-by: Sayan Chowdhury <schowdhury@microsoft.com>
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-05-27 08:01:59 +02:00
Kai Lueke
cee8a6aadf ci-automation: Push version file early
When a nightly build is started that pushes the version file to the
branch it was doing so only at the end of the build, causing the push
to fail if something else got merged in between.
Push the version file early by generating it the same way it would be
generated by the run_sdk_container/bootstrap_sdk_container scripts.
In the case of the SDK the version file gets the same version for the
OS and the SDK. Add some explanations about the version formats. Note
that the scripts will still rewrite the file but it should be a no-op.
2022-05-23 22:40:02 +09:00
Kai Lueke
95367851fa ci-automation/sdk_bootstrap.sh: Allow omitting the optional parameters
The coreos/portage refs were allowed to be empty strings but the way
the function was run from Groovy the lack of quoting caused the empty
strings to be missing parameters.
Since the two parameters are meant to be optional, support omitting
them.
2022-05-23 19:29:39 +09:00
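Supporting omitted trailing parameters, as in 95367851fa, usually comes down to a default-value expansion. A minimal sketch with a hypothetical function and hypothetical parameter names:

```bash
set -u

# Hypothetical stand-in for the bootstrap entry point; names are illustrative.
bootstrap_demo() {
    local seed_version="$1" version="$2"
    # ${3-} expands to an empty string when the argument is missing, so an
    # omitted parameter and an empty one are treated the same, even with -u.
    local coreos_ref="${3-}" portage_ref="${4-}"
    echo "seed=${seed_version} version=${version} coreos=${coreos_ref:-<default>} portage=${portage_ref:-<default>}"
}

bootstrap_demo 3200.0.0 3300.0.0
bootstrap_demo 3200.0.0 3300.0.0 "" ""
```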
Krzesimir Nowak
8c3d7b977b ci-automation: Fix potential use of unbound variable error
`local -a stuff` does not make `stuff` a bound array variable, so
checking length of the array will trigger an error about unbound
variable. Fortunately, `local stuff=()` does the trick.
2022-05-11 12:43:08 +02:00
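The working form from 8c3d7b977b looks like this in isolation. The function name is hypothetical; the point is only that assigning an empty list makes the array a bound variable under `set -u`.

```bash
set -u

count_items() {
    # "local -a items" would only declare the name; per the commit message,
    # reading its length then tripped the unbound-variable check.
    # Assigning an empty list binds the variable.
    local items=()
    echo "${#items[@]}"
}

count_items
```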
Krzesimir Nowak
bf1bc21498 ci-automation: Do not rerun tests on unrelated instances
We forgot to clear the array with instance tests to rerun, so the list
grew from one iteration to another when going over all the instance
types. I did not spot it before, because I tested it with only one
extra instance.
2022-05-11 12:15:52 +02:00
Kai Lüke
42fd4919c4
Merge pull request #331 from flatcar-linux/kai/equinix-metal-refactor
ci-automation/vendor-testing/equinix_metal.sh: Use test framework
2022-05-11 19:12:32 +09:00
Kai Lueke
3fd7825310 ci-automation/vendor-testing/gce.sh: Test GVNIC and break retest cycle
The logic we had in some tests for covering different instance types
is now easier to reuse for testing the GVNIC mode in GCE.
Align the GCE test with AWS and DigitalOcean to test an additional
"instance type" (here just changing the NIC) and break the retest spin
in case it gets called for arm64.
2022-05-11 12:07:58 +09:00
Kai Lueke
9fe14ffe34 ci-automation/vendor-testing/equinix_metal.sh: Use test framework
The test framework from the AWS PR allows us to align the logic which
also addresses some bugs we had here.
Port the Equinix Metal test over to the new framework (and also use
different test basenames per architecture while at it which could
otherwise result in clashes).
2022-05-11 11:39:30 +09:00
Krzesimir Nowak
d60d514482 ci-automation: Make AWS test script to work 2022-05-10 12:46:33 +02:00
Krzesimir Nowak
6278762fa8 ci-automation: Add helper functions for running tests on multiple instances 2022-05-10 12:46:33 +02:00
Krzesimir Nowak
e1d9beaeee ci-automation: Fix passing multiple test names to vendor scripts 2022-05-10 12:46:02 +02:00
Krzesimir Nowak
f0765e22c3 ci-automation: Let vendor scripts know if this is their first run
I will need it to correctly handle test reruns as we will need to
handle passed test names differently on first runs than on reruns.
2022-05-10 12:46:02 +02:00
Dongsu Park
76abe0d9cb ci-automation: Add WIP AWS test script for CI automation 2022-05-10 12:45:43 +02:00
Krzesimir Nowak
a8ac124d53 ci-automation: Add new vendor test for VMware 2022-05-06 12:57:20 +02:00
Krzesimir Nowak
3b3cffabc8 ci-automation: Fix credentials handling in digital ocean 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
3c119f14b2 ci-automation: Fix secret file handling
It can't be done in a subshell, because the file will be gone after
subshell quits.
2022-05-06 09:16:23 +02:00
Krzesimir Nowak
413689c779 ci-automation: Rename some variables and make them overridable 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
197e9a334f ci-automation: Add secrets handling 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
cd2f3f0d6c ci-automation: Drop boilerplate code from digital ocean test 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
2fe896b558 ci-automation: Add retest cycle breaking functionality 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
9650650e4b ci-automation: Add URL template handling 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
b6bb07acdc ci-automation: Initial test script for Digital Ocean 2022-05-06 09:16:23 +02:00
Krzesimir Nowak
d26f2b3b39 ci-automation: Use vendor_test.sh in equinix_metal and gce tests 2022-05-05 13:07:52 +02:00
Krzesimir Nowak
1d6f38a72e ci-automation: Reduce boilerplate in vendor tests
Move the common setup to the vendor_test.sh script, which will be
sourced by the vendor scripts.
2022-05-05 12:57:14 +02:00
Kai Lueke
f7edd4e061 ci-automation: add GCE image test
The GCE image test runs on a single instance type for now. In the
future it would be good to test the new NIC type with the cl.internet
test.
2022-05-05 16:52:42 +09:00
Mathieu Tortuyaux
550e702f90 ci-container/test: add equinix-metal test script
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-05-04 16:34:37 +09:00
Kai Lüke
28ee2a3256
Merge pull request #298 from flatcar-linux/kai/test-lts
CI: Support comparing to current LTS and updating from it
2022-04-29 16:34:47 +09:00
Mathieu Tortuyaux
4bd316ac74
Merge pull request #272 from flatcar-linux/tormath1/pxe
ci-automation/vm: build PXE if Equinix Metal is built
2022-04-28 11:52:52 +02:00
Kai Lueke
9a98cc2917 ci-automation/vms: handle platform names and generate the image formats
The kola test scripts are named by the platforms. The image naming is
also quite difficult to know and remember, e.g., whether "ami" or
"ami_vmdk" is needed for AWS tests and whether it's "vmware" or
"vmware_ova".

To address these problems the vms build stage now accepts the platform
names as format input, and for each platform it will automatically
generate the needed image types to run the tests.
2022-04-28 17:15:02 +09:00
Kai Lueke
c4af762e26 ci-automation/garbage_collect: clean up kola cloud resources
The garbage collect job should also clean up kola resources if a test
job failed to do so due to forced termination or misbehavior. The
cleanup is done by "ore" which needs credentials like kola.

Run ore from the mantle container image. Unfortunately Docker does not
support Podman's --env-host option and the env vars had to be passed
explicitly. While --env-file=<(env) would work it contains a lot of
variables that cause the container to behave a bit weird.
2022-04-28 16:27:14 +09:00
Kai Lueke
856929d357 CI: Support comparing to current LTS and updating from it
When the restriction that the CI can't access the LTS release is gone
we can support running the image comparison and the kola update test.
2022-04-26 15:00:31 +09:00
Krzesimir Nowak
1916936e34 ci-automation: Update test.sh script docs
We are not using SDK container for running the tests any more - it was
replaced with mantle container. Update the docs accordingly.
2022-04-20 16:34:07 +02:00
Mathieu Tortuyaux
19ca42b3dd
ci-automation/vm: build PXE if Equinix Metal is built
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
2022-04-20 16:33:00 +02:00
Kai Lueke
98e9947a06 ci-automation: silence rsync
The rsync copy logs made it hard to navigate the job output.
Remove the --progress and -v flags.
2022-04-20 19:13:02 +09:00
Kai Lueke
da0380c7e8 Run CI container pipeline kola tests with the new mantle image
The SDK container does not exist for arm64 and is quite heavy. We
currently also resort to an unconditional rebuilding of mantle inside
the SDK.
Use the new mantle container image to run kola tests, and pin its
version through a text file that gets updated by GitHub Actions.
2022-04-20 19:13:02 +09:00
Mathieu Tortuyaux
de7e05403b
ci-automation/vms: rename equinix_metal to packet
this is required to keep "packet" in the SDK lingo while the user can
use "equinix_metal" term.

Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Co-authored-by: Krzesimir Nowak <knowak@microsoft.com>
2022-04-13 13:09:51 +02:00
Kai Lüke
7376494ef2
Merge pull request #266 from flatcar-linux/kai/sdk-from-release-tag
ci-automation: use a single git tag and skip nightlies with no changes
2022-04-04 17:12:36 +02:00
Kai Lueke
bd970357c8 ci-automation: use a single git tag and skip nightlies with no changes
The pipeline created two tags if an SDK was built, one for the SDK and
one for the OS build (which was a free-standing tag or a local state
that was equivalent to the existing tag of the same name). The
nightlies created update commits on the main branch, even if no change
was done, and on the release branches we lacked these commits.

Create the release tag in the nightly SDK bootstrap already and reuse
it for the nightly OS build. Instead of local state, checkout the
existing tags explicitly. Extend the nightly update commit logic to
cover release branches and detect if we can skip building because no
changes were done.
2022-04-01 17:18:51 +02:00
Thilo Fromm
6dcfd9aeb6 ci-automation/test.sh: remove PARALLEL_TEST passing (move to CI)
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-04-01 13:59:47 +02:00
Thilo Fromm
1e0dc777fe ci-automation/test.sh: export PARALLEL_TESTS
Export PARALLEL_TESTS in the container's .env file to ensure it is
passed to the vendor script.
2022-03-23 12:11:12 +01:00
Thilo Fromm
9b83d3e80b
Merge pull request #258 from flatcar-linux/t-lo/ci-automation-tests-use-latest-kola
ci-automation/test.sh: use the latest kola from coreos-overlay
2022-03-16 17:04:16 +01:00
Kai Lueke
c149d24ced run_sdk_container: Fall back to tar ball download for SDK image
The nightly SDK image is not pushed to a registry but has to be
downloaded from the build server as tar ball.
Fall back to the tar ball import for a better user experience.
To reuse the ci logic it had to support the "docker" env variable.
The use of the pigz container is not always needed if the user has
pigz available.
2022-03-16 15:31:03 +01:00
Thilo Fromm
6286b0a442 ci-automation/test.sh: use the latest kola from coreos-overlay 2022-03-16 14:14:46 +01:00
Thilo Fromm
a9700e16fb ci-automation/tapfile_helper_lib.sh: remove non-printable ASCII
Jenkins TAP file parser does not process non-printable ASCII characters
but bails out. This change removes all ASCII < 0x1F, so non-printable
characters are not included in the TAP report.

Fixes
    Caused by: unacceptable character '' (0x1B) special characters are not allowed
2022-03-16 09:37:48 +01:00
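Filtering control characters out of test output, as a9700e16fb does, can be done with a single tr call. The sketch below is an illustration and not the code from tapfile_helper_lib.sh; strip_control_chars is a hypothetical name, and keeping tab and newline is an assumption.

```bash
# Hypothetical filter: deletes control bytes below 0x20 from stdin, but keeps
# tab (0x09) and newline (0x0A) so the line structure survives.
strip_control_chars() {
    LC_ALL=C tr -d '\000-\010\013-\037'
}

# The ESC byte (0x1B) of an ANSI color sequence is dropped; the printable
# remainder of the sequence ("[1m") stays.
printf 'ok \033[1mbold\n' | strip_control_chars
```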
Thilo Fromm
53c90388c0 ci-automation/vendor-testing/qemu_update.sh: fix unbound
One-line fix to resolve
    ci-automation/vendor-testing/qemu_update.sh: line 64: testscript: unbound variable
error.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-15 17:39:28 +01:00
Kai Lüke
d3aa4f1331
Merge pull request #247 from flatcar-linux/kai/set-official
ci-automation: set images as official based on version
2022-03-10 12:41:10 +01:00
Thilo Fromm
b6caa4163d
Merge pull request #246 from flatcar-linux/t-lo/ci-automation-tests-pass-parallel-env-to-container
automation/test.sh: pass PARALLEL_TESTS to container
2022-03-10 12:24:55 +01:00
Kai Lüke
fff05f00c5
Merge pull request #245 from flatcar-linux/kai/sdk-tests
ci-automation: print kola command line
2022-03-10 12:06:07 +01:00
Thilo Fromm
194c503b56
Merge pull request #249 from flatcar-linux/t-lo/ci-automation-tapfile-ascii
ci-automation/tapfile_helper_lib.sh: only ASCII chars
2022-03-10 11:34:04 +01:00
Thilo Fromm
aa6a742efa ci-automation/tapfile_helper_lib.sh: only ASCII chars
This change removes all non-ASCII characters from test debug / error
output when ingesting the output for inclusion in the TAP report.
Jenkins TAP parser does not handle some unicode chars, leading to tap
parser errors with e.g. Cilium output (which uses unicode).

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-10 10:20:11 +01:00
Jeremi Piotrowski
25cf7c4fc5 ci-automation/tapfile_helper_lib: fix committing last transaction
Move the final db commit to inside the subshell. Since the while loop
runs inside a subshell, the SQL variable outside of the subshell is not
modified, and so the last contents of the SQL variable are dropped. This
shows up when the last couple test cases don't have an error message,
and simply append the transaction to 'SQL'. They are never written to
the db.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-03-10 09:14:42 +01:00
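The behavior behind 25cf7c4fc5 is a general bash property: a `while` loop on the receiving end of a pipe runs in a subshell, so variables changed in the loop are gone once it finishes. The sketch below uses hypothetical functions and a made-up statement format; it is not the tapfile helper code.

```bash
# The loop runs in a subshell because of the pipe, so the caller's "sql"
# is never modified and the final echo prints an empty string.
collect_lost() {
    local sql="" line
    printf '%s\n' "$@" | while read -r line; do
        sql+="INSERT ${line};"
    done
    echo "${sql}"
}

# Moving the final use of "sql" into the same subshell keeps the contents,
# which mirrors moving the last db commit inside the subshell.
collect_kept() {
    local sql="" line
    printf '%s\n' "$@" | {
        while read -r line; do
            sql+="INSERT ${line};"
        done
        echo "${sql}"
    }
}

collect_lost a b
collect_kept a b
```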
Thilo Fromm
d266229434
ci-automation/test.sh: handle unset PARALLEL_TESTS
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-03-10 08:24:44 +01:00
Kai Lueke
c3ae1ce3b0 ci-automation: set images as official based on version
The image needs to be set into official mode through a helper script
(see jenkins/images.sh) and the COREOS_OFFICIAL variable needs to be
set for prod_image_util.sh/build_image_util.sh/grub_install.sh.
2022-03-09 18:19:52 +01:00
Thilo Fromm
8ca2393eb8 automation/test.sh: pass PARALLEL_TESTS to container
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-09 17:38:15 +01:00
Kai Lueke
79e07bca44 ci-automation: print kola command line
For running kola manually and knowing which parameters were set, it
helps to print the kola command line in the job.
2022-03-09 16:44:06 +01:00
Kai Lüke
0cc95e3b3e
Merge pull request #244 from flatcar-linux/kai/set-group
Fix and improve channel handling
2022-03-04 14:22:10 +01:00
Kai Lueke
db7220eced ci-automation: set the channel from the git tag
For now we had only "developer" images in the new pipeline.
Based on the git tag like "alpha-1234.0.0" set the channel (group) for
the image and also use this logic when finding the channel in the QEMU
update test.
2022-03-04 13:49:18 +01:00
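As an illustration of deriving the channel from a tag such as "alpha-1234.0.0", here is a minimal sketch. The function name `channel_from_tag`, the list of recognised channels, and the fallback to "developer" for any other prefix are assumptions made for this example; the actual logic in ci-automation may differ.

```shell
# Hypothetical helper: print the channel (group) encoded in a git tag.
# Anything that does not start with a known channel name is treated as a
# "developer" build (an assumption for this sketch).
channel_from_tag() {
    tag="$1"
    # Keep everything before the first "-", e.g. "alpha" from "alpha-1234.0.0".
    prefix="${tag%%-*}"
    case "${prefix}" in
        alpha|beta|stable|lts) echo "${prefix}" ;;
        *) echo "developer" ;;
    esac
}
```

For example, `channel_from_tag "alpha-1234.0.0"` prints the tag's channel prefix, and the same helper could be reused wherever a test needs to know the channel.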
Thilo Fromm
eba1cdb4c2 ci-automation/tapfile_helper_lib.sh: fix CI TAP parse errors
This change fixes and extends character escaping in the test error
debug output ("\" are removed and a bug in removing '"' is fixed),
addressing a parser error the CI encountered when ingesting TAP output.

Furthermore, line numbering is shortened, and test names have a spurious
"-" prefix removed.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-03 16:29:38 +01:00
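The character removal mentioned in the commit above could look roughly like the following sketch. The function name `sanitize_tap_line` is hypothetical, and using `tr` is an assumption; the helper library may implement this differently.

```shell
# Hypothetical helper: drop backslashes and double quotes from one line of
# test output before it is embedded in a TAP report.
sanitize_tap_line() {
    # printf '%s' does not interpret backslashes in its argument;
    # tr -d '\\"' deletes every backslash and every double quote.
    printf '%s\n' "$1" | tr -d '\\"'
}
```

A call such as `sanitize_tap_line 'say \"hi\"'` passes the text through with those two characters removed and everything else unchanged.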
Thilo Fromm
2b2bfad5e1 ci-automation/tapfile_helper_lib.sh: use subshell, break lines after 200
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-02 15:35:57 +01:00
Thilo Fromm
22af2876e6 ci-automation/tapfile_helper_lib.sh: read test output from file
This change updates the tapfile helper to read test output from a file
instead of passing it inline in the SQL INSERT statement.
This fixes an issue with large error output from tests, which breaks
tap_ingest_tapfile():
    ci-automation/tapfile_helper_lib.sh: line 31: /usr/bin/sqlite3: Argument list too long
Error observed with the cl.toolbox.dnf-install test, which can generate
8000 lines of output. Fix tested with the same output.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-03-02 13:02:52 +01:00
Kai Lüke
ef9914e06e
Merge pull request #235 from flatcar-linux/kai/add-update-test
ci-automation: add update test
2022-03-02 11:27:43 +01:00
Thilo Fromm
b72586c4de
Merge pull request #236 from flatcar-linux/t-lo/tool-for-fetching-build-stage-images
ci-automation/util/fetch_image.sh: fetch CI build stage image
2022-03-02 10:10:17 +01:00
Thilo Fromm
637e5d52ec
Apply suggestions of my favourite proofreader
Krzesimir continues to save me from embarrassing spelling mistakes 💙

Co-authored-by: Krzesimir Nowak <knowak@microsoft.com>
2022-03-02 10:09:59 +01:00
Kai Lueke
a9c3a31efb ci-automation: add missing update test
The kola update test was missing. It is performed as an update from the
old image to the newly built payload to ensure that the new image is
compatible with old clients.
2022-02-28 15:59:17 +01:00
Thilo Fromm
88a4df98b1
Merge pull request #239 from flatcar-linux/t-lo/ci-automation-add-qemu-uefi-test
ci-automation/vendor-testing: add qemu_uefi
2022-02-25 12:15:01 +01:00
Thilo Fromm
1b4022d237
Merge pull request #237 from flatcar-linux/t-lo/container-builds-update-version-in-main-branch
ci-automation: SDK build updates version.txt in main branch
2022-02-25 12:14:24 +01:00
Thilo Fromm
9afab3aac4 ci-automation/vendor-testing: add qemu_uefi
This change adds the qemu_uefi.sh vendor test. It reuses most of the
implementation in qemu.sh (qemu_uefi.sh is a soft-link to qemu.sh).

This also enables qemu testing for ARM64.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 13:54:58 +01:00
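One common way for a soft-linked script to share an implementation is to branch on the name it was invoked under. The sketch below shows that pattern only; the function name `test_mode_for` and the "uefi"/"bios" labels are hypothetical, and this is not necessarily how qemu.sh distinguishes the two cases.

```shell
# Hypothetical helper: choose a test mode from the script name, so that
# qemu_uefi.sh (a soft-link to qemu.sh) can reuse the same file.
test_mode_for() {
    case "$(basename "$1")" in
        qemu_uefi.sh) echo "uefi" ;;
        *) echo "bios" ;;
    esac
}

# Inside the shared script, "$0" carries the name used for the invocation.
mode="$(test_mode_for "$0")"
```

With this pattern, adding a new variant only needs another soft-link and another `case` branch.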
Thilo Fromm
308a2a2ab6 ci-automation/sdk_bootstrap: check submodules for nightly branch push
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 12:06:28 +01:00
Thilo Fromm
82da911c27 ci-automation/sdk_bootstrap: Only push to main in nightlies
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 11:02:34 +01:00
Thilo Fromm
39b65765b4 ci-automation: fix test_update_reruns typo
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-24 10:11:38 +01:00
Thilo Fromm
7cadae957a ci-automation: SDK build updates version.txt in main branch
This change has sdk_bootstrap update the origin branch when run from the
main branch, updating the SDK and OS version in 'main' for each SDK
bootstrap build.

Release / maintenance branches have the SDK version set in the
versionfile at release time. But main is never updated.

Updating the versionfile in main when a new SDK is built ensures that
dev branches based on main will also use the correct SDK version (e.g.
in subsequent CI builds).
2022-02-23 20:41:57 +01:00
Thilo Fromm
248ffcef03 ci-automation/util/fetch_image.sh: fetch CI build stage image
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-23 19:25:12 +01:00
Thilo Fromm
e92521c166 ci-automation/test*: per-image summary, honor parallel
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-23 11:30:54 +01:00
Thilo Fromm
3253435d6c ci-automation/test.sh: publish to _kola_tmp, not debug/_kola_tmp
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-22 18:27:36 +01:00
Thilo Fromm
3397167b5e
Update ci-automation/test.sh: fix typo.
Co-authored-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-02-22 18:13:15 +01:00
Thilo Fromm
8157bf0302 ci-automation: publish test results, add to GC
This change adds copying test results to the build cache server, and
adds respective deletion to the garbage collector.

Also, the patch fixes an issue with torcx publishing (manifest
publishing had arch hard-coded).

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-22 16:22:57 +01:00
Thilo Fromm
081df6cd2c ci-automation/packages.sh: fix torcx URL, add manifest
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-22 15:44:04 +01:00
Thilo Fromm
4f39e0112f ci-automation/tests.sh: use http in torcx manifest
Use HTTP instead of HTTPS because Ignition does not recognise
Let's Encrypt certificates, leading to test breakage in
docker.torcx-manifest-pkgs.

Add a note in settings.env to explicitly call out HTTP requirement of
build cache server.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-21 17:23:53 +01:00
Thilo Fromm
0fa985b872 ci-automation/test.sh: stage torcx manifest
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-21 16:30:30 +01:00
Thilo Fromm
1045fd5ac8 ci-automation/README.md: add docs for qemu test 2022-02-21 13:57:11 +01:00
Thilo Fromm
bee5ac7f74 ci-automation/tapfile: enforce foreign keys, simplify 2022-02-21 12:56:45 +01:00
Thilo Fromm
cafa385164 ci-automation: publish torcx json and use in tests
This change updates the package build script to publish the torcx
manifest file to the build cache so it can be used by tests.
It also updates the generic test script to use the SDK container instead
of the packages container image, and to download and use the torcx
manifest from the build cache.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-18 15:52:47 +01:00
Thilo Fromm
a5b958fd07 ci-automation/test.sh: fix reruns, set retry to 20
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-18 14:40:18 +01:00
Thilo Fromm
95ef0b7322 ci-automation: git author and curl verboseness
- Git author configuration moves to the tagging function and is put under
  a condition so as not to pollute people's workspaces.
- curl is now less verbose since it was spamming logs with TLS debug
  information.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-17 12:31:07 +01:00
Thilo Fromm
3a416fbf32 ci-automation testing: address PR review comments
- add cleanup script to test.sh
- remove wrapper function from qemu test

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-17 12:30:36 +01:00
Thilo Fromm
6c76bfa1cd
ci-automation/tapfile_helper_lib.sh: add @pothos' retcode fix
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-02-17 10:45:10 +01:00
Thilo Fromm
5bfe2f395c
Apply @pothos' suggestions from code review
Co-authored-by: Kai Lüke <pothos@users.noreply.github.com>
2022-02-17 10:29:05 +01:00
Thilo Fromm
f6f44e2ca8 ci-automation: first stab at adding testing
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-02-16 19:59:45 +01:00
flatcar-ci
0bbae51a5e settings / ci-automation: remove "binpkg" prefix
The original intention of the "binpkg" prefix in the CI binary package
cache URL was to separate packages from other build artifacts like
containers, images, and SDK tarballs. The motivation was to separate
developer content (binary packages) from CI automation artifacts
(everything else), since binary packages are not used by the CI.

This broke assumptions in scripts which use the binary host URL for
other things than packages - e.g. SDK tarballs or images. These
scripts would get a bincache URL with "binpkg/" prepended, while CI
automation would *not* use that prefix.

This change removes the use of "binpkg/" altogether since it would not
work as intended without more significant changes to build scripts.
2022-01-11 09:56:21 +01:00
Thilo Fromm
e076931c7d ci-automation/garbage_collect: fix binpkg url
garbage_collect.sh was using 'docker_vernum' where it should have been
using 'vernum' (as push_pkgs.sh does).

Also, make sure release directories are removed, not just packages.

Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-01-10 14:07:33 +01:00
Thilo Fromm
a6ddcda88e ci-automation: Apply suggestions from PR review
Co-authored-by: Krzesimir Nowak <knowak@microsoft.com>
2022-01-10 11:41:03 +01:00
Thilo Fromm
0ecd0be77a ci-automation/README.md: pkg publish, bin cache added
Signed-off-by: Thilo Fromm <thilo@kinvolk.io>
2022-01-07 17:47:50 +01:00
flatcar-ci
7d76cfedf7 ci-automation + setup_board: publish and use binpkgs
This change adds a job to the CI automation that publishes binary
packages to the build cache server.

Also, setup_board is updated to use the buildcache package cache if a
nightly build version is detected.

Signed-off-by: flatcar-ci <infra+ci@flatcar-linux.org>
2022-01-07 17:16:44 +01:00
Kai Lueke
e85a16fe1c ci-automation: allow to optionally push and sign the commit
For test builds the commit that updates the submodules can be free-
standing but for releases we need to push it to the branch and also
sign the tag.
Add optional arguments that are used by the tag-release script in
flatcar-build-scripts.
2022-01-05 15:25:31 +01:00