245 Commits

Author SHA1 Message Date
Andrey Smirnov
2fb00344ab chore: upgrade Go to 1.14.3 and use toolchain for race detector
With Go 1.14.3 we can run race-enabled code on musl libc, so this opens a
path to running unit-tests-race under the Talos environment with rootfs,
enabling all the tests to run under the race detector.

Also fixed the test run by specifying the platform in the test environment.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-05-25 08:35:11 -07:00
Spencer Smith
6383a78065 chore: serialize firecracker e2e tests
This PR will ensure that the firecracker provision tests will only run
after a successful e2e_firecracker run. This is being added in hopes of
freeing up some resources during CI testing and making things more
stable.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-05-11 12:25:14 -07:00
Andrey Smirnov
23be80fd96 test: stabilize tests by bumping timeouts
Bump timeouts for reset API test as K8s control plane teardown might
take 3 minutes on its own.

Bump Go Firecracker SDK timeout when talking to firecracker process.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-05-06 08:26:18 -07:00
Spencer Smith
c1b6f05b00 chore: use clusterctl and v1alpha3 providers for tests
This PR will update our testing code to make use of the clusterctl tool,
as well as use the newer versions of various providers and updated
manifests.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-05-01 07:42:19 -07:00
Andrew Rynhard
8af77c0f3d release(v0.5.0-alpha.2): prepare release
This is the official v0.5.0-alpha.2 release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-28 09:35:44 -07:00
Andrew Rynhard
3332ca58d3 release(v0.5.0-alpha.1): prepare release
This is the official v0.5.0-alpha.1 release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-21 11:52:29 -07:00
Andrew Rynhard
5f996e737d chore: use a single CHANGELOG
Instead of keeping a CHANGELOG for each release in the master branch, a
single CHANGELOG should be used since it will move into release branches
anyways. This prevents us from having to keep the files in sync across
master and the release branch. This also adds better tooling for
generating the CHANGELOG.md.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-17 11:24:48 -07:00
Andrew Rynhard
4ccd4d5364 fix: set ephemeral partition to max size
This sets the size of the ephemeral partition to the maximum
allowed size at installation time. We have reports of `xfs_growfs` causing
extremely slow boot times when the disk is 1TB or more. In our research
we found evidence that `xfs_growfs` is an expensive operation when
growing to a size of 10 times or more of the base. Instead, users should
create the disk close to the max disk size at install time, because
`mkfs.xfs` handles larger disks better.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-17 07:08:04 -07:00
Spencer Smith
8d2f8d6127 chore: remove random.trust_cpu references
This PR removes the references to adding in the random CPU trust to the
kernel for all v0.4 docs, as well as in the iso command in the
installer. This is no longer needed with the newer linux kernel.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-04-14 17:10:56 -07:00
Andrew Rynhard
a10acd592a chore: address random CI nits
This PR does the following:

- updates the conform config
- cleans up conform scopes
- moves slash commands to the talos-bot
- adds a check list to the pull request template
- disables codecov comments
- uses `BOT_TOKEN` so all actions are performed as the talos-bot user
- adds a `make conformance` target to make it easy for contributors to
check their commit before creating a PR
- bumps golangci-lint to v1.24.0

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-13 13:01:14 -07:00
Andrey Smirnov
2d5c6f4c10 test: serialize docs step execution
`make docs` removes and then regenerates the contents of some docs, so it
might cause random `-dirty` issues when running concurrently with build
steps.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-07 23:46:16 +03:00
Spencer Smith
3a4eaeeef0 feat: upgrade kubernetes to 1.18
This PR will pull in the latest release of k8s 1.18 so we can start
validating it through our test suite.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-26 14:59:43 -04:00
Andrey Smirnov
104af4380e feat: make --wait default option to talosctl cluster create
It seems to be useful enough to be the default one and it prevents
simple mistakes while trying to access the cluster which is not ready
yet.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-25 06:36:43 -07:00
Andrey Smirnov
e38cde9b48 chore: update upgrade tests for new version, split into two tracks
This updates upgrade tests to run two flows with 3+1 clusters:

1. 0.3 -> current (testing upgrade with partition wiping)
2. 0.4-alpha.7 -> current (testing upgrade without partition wiping,
boot-a/boot-b)

There is also a small upgrade test with preserve enabled for a single-node cluster.

Provision tests are now split into two parallel tracks in Drone.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-24 15:30:00 -07:00
Spencer Smith
3485ea9f09 fix: update k8s to 1.17.3
This PR will update k8s to v1.17.3 to address CVEs mentioned in https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/kubernetes-security-announce/2UOlsba2g0s

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-23 17:08:52 -07:00
Andrew Rynhard
c6581fabac feat: build talosctl for ARM v7
This adds an ARM v7 build of `talosctl`.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-03-21 18:35:00 -07:00
Andrew Rynhard
43662e4a24 feat: build talosctl for ARM64
This adds an ARM64 build of `talosctl`.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-03-21 16:40:52 -07:00
Andrew Rynhard
5dbc26c7a3 feat: rename osctl to talosctl
This is a rename of the osctl binary. We decided that talosctl is a
better name for the Talos CLI. This does not break any APIs, but does
make older documentation only accurate for previous versions of Talos.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-03-20 19:07:39 -07:00
Andrey Smirnov
2e3681054d chore: improve handling of etcd responses in bootkube pre-func
Make more attempts and wait for the response. Treat an empty response as
no error (this is what to expect when the key is not set yet).
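
The retry behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual bootkube pre-func code: `get_with_retry`, its `fetch` callback, and the use of `ConnectionError` as the transient failure are all hypothetical names and assumptions chosen for the example.

```python
import time
from typing import Callable, Optional


def get_with_retry(
    fetch: Callable[[], Optional[bytes]],
    attempts: int = 5,
    delay: float = 0.0,
) -> Optional[bytes]:
    """Call fetch() up to `attempts` times and return the first response.

    An empty response (None or b"") is returned as-is instead of being
    treated as an error, since that is what to expect when the key is
    not set yet. Only ConnectionError (an assumed stand-in for a
    transient failure) triggers another attempt.
    """
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return fetch()  # may be empty; the caller decides what that means
        except ConnectionError as err:
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)  # wait before the next attempt
    raise RuntimeError(f"no response after {attempts} attempts") from last_error
```

The point of the sketch is the distinction between "no answer yet" (retry) and "answered with nothing" (success with an empty value).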

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-06 21:06:48 +03:00
Andrey Smirnov
d5d3035c8c test: enable upgrade tests 0.4.x -> latest
With the fix #1904, it's now possible to upgrade 0.4.x with
`machine.File` extra files (caused by registry mirror for
registry.ci.svc).

Bump resources for upgrade tests in an attempt to speed them up.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-26 00:09:32 +03:00
Andrey Smirnov
923ef4537b test: implement new class of tests: provision tests (upgrades)
This class of tests is included/excluded by build tags, but as it is
pretty different from other integration tests, we build it as a separate
executable. Provision tests provision a cluster for the test run, perform
some actions, and verify the results (could be upgrade, reset, scale
up/down, etc.)

There is now a framework for implementing upgrade tests. The first test
upgrades from the latest 0.3 (0.3.2 at the moment) to the current version
of Talos (being built in CI). The test starts by booting with the 0.3
kernel/initramfs, runs the 0.3 installer to install a 0.3.2 cluster, waits
for bootstrap, and then upgrades to 0.4 in a rolling fashion. As
Firecracker supports a bootloader, this boots the 0.4 system from the boot
disk (as installed by the installer).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-21 07:04:03 -08:00
Andrey Smirnov
5f330f1f64 chore: push installer & talos images to the CI registry on every build
This enables a way to run the matching installer image in firecracker
tests. The new image is used in firecracker tests, along with bootloader
support for booting the installed kernel/initramfs, which opens a path to
upgrade tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-18 07:32:45 -08:00
Andrew Rynhard
c9a8605f87 chore: move golangci-lint.yaml to .golangci.yml
This allows local runs of golangci-lint to use the default config path.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-02-18 07:10:21 -08:00
Andrey Smirnov
f51e9a14fe chore: build app container images skipping export to host
Container images for `apid`, `networkd`, etc. are now built inside the
buildkit using the `img` tool. This means that all the dependencies are
now controlled in `buildkit` and many more stages can run in parallel
without problems (overwriting content in `_out/images`).

This also simplifies Drone configuration, as we can let buildkit handle
the dependencies. I also enabled more stages to run in parallel.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-14 13:17:25 -08:00
Spencer Smith
1d73a9e6d1 chore: only run ok-to-test when PR
This PR fixes a quick bug in CI where the ok-to-test step in drone was
running after a merge to master.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-02-04 10:27:46 -08:00
Spencer Smith
c825b83d47 chore: support slash commands in drone
This PR adds the necessary drone step to check for the `ok-to-test`
label before running any testing against a PR.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-02-04 12:57:16 -05:00
Andrey Smirnov
01d696ed10 chore: update golangci-lint-1.23.3
`gomnd` disabled, as it complains about every number used in the code,
and `wsl` became much more thorough.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-04 08:56:39 -08:00
Spencer Smith
05aad743df chore: update capi-upstream
This PR will bring in the latest v1alpha2-supporting release of the upstream capi provider.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-31 11:30:16 -05:00
Andrey Smirnov
0afd0f651b chore: provide provisioned cluster info to integration test
The integration test can optionally consume the cluster state generated by
the call to `osctl cluster create` and use it to discover nodes in
integration tests.

This means that CLI tests can now use it as a discovery source, and
API/K8s tests use it by default as well.

The flat list of nodes is to be replaced by something more complex in the
next iteration, but it's good enough for this PR.

As a demo, add a CLI test with multiple nodes (dmesg).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-31 18:21:30 +03:00
Andrew Rynhard
88667641df chore: refactor E2E scripts
This PR aims to simplify our E2E scripts.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-26 20:47:25 -08:00
Andrew Rynhard
c359caef3d chore: fix CI
We need `DOCKER_NET` to be set.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-24 12:42:43 -08:00
Andrew Rynhard
f87c6d74d3 chore: use firecracker in basic-integration
This adds a basic integration step that uses firecracker.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-23 05:52:22 -08:00
Spencer Smith
e0181c85eb feat: allow ability to customize containerd
This PR will allow for any toml files added into `/var/cri/conf.d` to be
picked up and parsed as a containerd config. This should allow users a
nice way to add additional configs by passing extra files in machine
config like:

```
machine:
  ...
  files:
    - content: |
        [metrics]
          address = "0.0.0.0:11234"
      path: /var/cri/conf.d/metrics.toml
      op: create
```
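
As a rough illustration of the `files` entry shown above, the sketch below builds the same structure programmatically. It is written in Python purely for illustration: `containerd_file_entry` is a hypothetical helper, and the checks on the file name (a `.toml` suffix, no path separators) are assumptions for the example rather than validation that Talos is known to perform.

```python
from pathlib import PurePosixPath
from typing import Dict

# Directory named in the commit message above.
CRI_CONF_DIR = PurePosixPath("/var/cri/conf.d")


def containerd_file_entry(name: str, toml_content: str) -> Dict[str, str]:
    """Build one `machine.files` entry for a containerd config fragment.

    The name checks below are assumptions made for this sketch.
    """
    if "/" in name:
        raise ValueError("name must be a plain file name, not a path")
    if not name.endswith(".toml"):
        raise ValueError("config fragments are expected to be .toml files")
    return {
        "content": toml_content,
        "path": str(CRI_CONF_DIR / name),
        "op": "create",
    }


entry = containerd_file_entry(
    "metrics.toml",
    '[metrics]\n  address = "0.0.0.0:11234"\n',
)
```

Serializing `entry` under `machine.files` would give a fragment equivalent to the YAML example above.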

Will close #1718.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-22 17:08:10 -05:00
Spencer Smith
60260c85d1 feat: upgrade kubernetes version to 1.17.1
This PR will bring in the latest point release of k8s 1.17

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-17 09:39:26 -08:00
Andrew Rynhard
40f803de66 chore: run sonobuoy in quick mode
This adds sonobuoy's quick mode test to basic integration.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-17 09:25:43 -08:00
Andrew Rynhard
6533a41da7 chore: fix E2E script
The basic integration cluster name was changed in a previous PR. This
aligns the E2E script with the new naming conventions, and mounts the
correct integration test binary.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-10 07:53:26 -08:00
Andrew Rynhard
d824d0bfdb chore: publish boot.tar.gz
This adds a convenience tarball that bundles vmlinuz and initramfs.xz
into a single archive.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-09 12:38:21 -08:00
Andrew Rynhard
d123d24b93 chore: allow docgen to ignore a struct
Using a well known comment (docgen: nodoc), we can now tell docgen to
ignore certain structs.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-09 11:44:43 -05:00
Spencer Smith
04639824b9 chore: disable iso artifact publication
This PR will disable iso publication for now. We plan to reincorporate the
ability to use ISOs once we've researched #1722.

Will close #1442

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-07 07:01:46 -08:00
Andrew Rynhard
794d9e6066 chore: update all target in Makefile
We should build the most common things by default.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-06 11:08:27 -08:00
Andrew Rynhard
f0732cafcf chore: fix release dependency
The GitHub release should depend on the push step instead of
push-latest.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-02 06:25:21 -08:00
Andrew Rynhard
0778214f1f chore: fix push events
Now that we have a push target and push-% target, we can simplify the drone
conditions. This updates the conditions so that the latest channel updates
on pushes to master, the edge channel updates on successful nightly cron, and
an image with the standard tag is pushed in all events except pull requests.
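
The push rules described above amount to a small decision table. The following Python sketch encodes them for illustration only; it is not the Drone configuration itself, and `tags_to_push`, the event names, the `"nightly"` cron name, and the `"standard"` placeholder for the regular image tag are all assumptions made for the example.

```python
from typing import List


def tags_to_push(event: str, branch: str = "", cron: str = "") -> List[str]:
    """Return the image tags to push for a CI event (hypothetical names)."""
    tags: List[str] = []
    if event != "pull_request":
        # The image with the standard tag is pushed on everything but PRs.
        tags.append("standard")
    if event == "push" and branch == "master":
        # The latest channel follows pushes to master.
        tags.append("latest")
    if event == "cron" and cron == "nightly":
        # The edge channel follows the nightly cron run.
        tags.append("edge")
    return tags
```

For example, `tags_to_push("push", branch="master")` yields the standard tag plus `latest`, while a pull request yields nothing.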

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-01 13:00:12 -08:00
Andrew Rynhard
288d4d0b51 chore: push latest tag on tag events
This ensures that the latest tag is updated on git tag events.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-01 11:41:49 -08:00
Andrew Rynhard
e6a16d5572 chore: use the correct condition for latest and edge pushes
This updates the drone conditions to push the latest tag only
for pushes to the master branch. Additionally, the edge tag will be
pushed only when the nightly cron is executed.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-01 10:23:11 -08:00
Andrew Rynhard
7f2483e848 chore: fix releases
The GitHub release plugin doesn't allow directories, and has no way to
tell it to ignore a path. The workaround is to be explicit about what
files we want in a release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-30 18:13:14 -08:00
Andrey Smirnov
ebd40bd0eb chore: use osctl cluster --wait in basic-integration
There are a few workarounds for the way Drone runs the integration test:
DinD runs as a separate pod, and we can only access the ports it exposes
on the "host", while from the Talos cluster this endpoint is not reachable.

So internally Talos nodes still use addresses like "10.5.0.2", while the
test uses "docker" to access the cluster (that's the name of the `docker`
service in the pipeline).

When running locally, 127.0.0.1 is used as the endpoint, which should work
fine on both OS X and Linux.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-30 15:15:42 -08:00
Andrew Rynhard
9faf4907f2 chore: exclude cron events in push-latest step
We shouldn't push latest on cron events.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-27 17:23:50 -08:00
Andrew Rynhard
0b23727ad3 chore: fix conformance
The drone linter complained about duplicate steps. This removes them.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-27 15:38:14 -08:00
Andrew Rynhard
c8d3da5376 chore: add more functions to the release script
This adds functionality to update the CHANGELOG, and cherry-pick a
commit into a release branch.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-26 08:52:56 -08:00
Andrew Rynhard
6aa4a9e305 chore: remove gitmeta references
We no longer depend on gitmeta.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-26 07:05:09 -08:00