240 Commits

Author SHA1 Message Date
Patatman
90acb01a4e docs: digital rebar docs
Digital rebar docs in the guide section.

Signed-off-by: Patatman <git@jeursen.nl>
2020-06-30 18:52:39 -07:00
Andrey Smirnov
e46a09f56a chore: make default pipeline run shorter integration test
This moves full integratation test and provision tests to
the `integration` pipeline.

Docker test wasn't affected much, as anyways docker can't run long
integration tests, so it mostly affects firecracker and provision tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-01 00:14:55 +03:00
Andrew Rynhard
d0d2ac3c74 test: default to using the bootstrap API
This moves our test scripts to using the bootstrap API. Some
automation around invoking the bootstrap API was also added
to give the same ease of use when creating clusters with the
CLI.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-06-24 08:46:10 -07:00
Spencer Smith
e03a68f8eb feat: update k8s and sonobuoy versions
This PR will update k8s to the latest 1.18 release and bump sonobuoy to
help resolve some e2e flakes. Also adds some retry logic around the
sonobuoy run.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-10 06:47:36 -07:00
Spencer Smith
c1b6f05b00 chore: use clusterctl and v1alpha3 providers for tests
This PR will update our testing ocde to make use of the clusterctl tool,
as well as use the newer versions of various providers and updated
manifests.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-05-01 07:42:19 -07:00
Spencer Smith
8d2f8d6127 chore: remove random.trust_cpu references
This PR removes the references to adding in the random CPU trust to the
kernel for all v0.4 docs, as well as in the iso command in the
installer. This is no longer needed with the newer linux kernel.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-04-14 17:10:56 -07:00
Spencer Smith
3a4eaeeef0 feat: upgrade kubernetes to 1.18
This PR will pull in the latest release of k8s 1.18 so we can start
validating it through our test suite.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-26 14:59:43 -04:00
Andrey Smirnov
104af4380e feat: make --wait default option to talosctl cluster create
It seems to be useful enough to be the default one and it prevents
simple mistakes while trying to access the cluster which is not ready
yet.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-25 06:36:43 -07:00
Andrey Smirnov
e38cde9b48 chore: update upgrade tests for new version, split into two tracks
This updates upgrade tests to run two flows with 3+1 clusters:

1. 0.3 -> current (testing upgrade with partition wiping)
2. 0.4-alpha.7 -> current (testing upgrade without partition wiping,
boot-a/boot-b)

And small upgrade with preserve enabled for single-node cluster.

Provision tests are now split into two parallel tracks in Drone.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-24 15:30:00 -07:00
Spencer Smith
3485ea9f09 fix: update k8s to 1.17.3
This PR will update k8s to v1.17.3 to address CVEs mentioned in https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/kubernetes-security-announce/2UOlsba2g0s

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-23 17:08:52 -07:00
Andrew Rynhard
5dbc26c7a3 feat: rename osctl to talosctl
This is a rename of the osctl binary. We decided that talosctl is a
better name for the Talos CLI. This does not break any APIs, but does
make older documentation only accurate for previous versions of Talos.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-03-20 19:07:39 -07:00
Andrey Smirnov
2e3681054d chore: improve handling of etcd responses in bootkube pre-func
Try more attempts, wait for the response. Treat empty response as no
error (as this is what to expect when key is not set yet).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-06 21:06:48 +03:00
Andrey Smirnov
d5d3035c8c test: enable upgrade tests 0.4.x -> latest
With the fix #1904, it's now possible to upgrade 0.4.x with
`machine.File` extra files (caused by registry mirror for
registry.ci.svc).

Bump resources for upgrade tests in attempt to speed it up.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-26 00:09:32 +03:00
Andrey Smirnov
923ef4537b test: implement new class of tests: provision tests (upgrades)
This class of tests is included/excluded by build tags, but as it is
pretty different from other integration tests, we build it as separate
executable. Provision tests provision cluster for the test run, perform
some actions and verify results (could be upgrade, reset, scale up/down,
etc.)

There's now framework to implement upgrade tests, first of the tests
tests upgrade from latest 0.3 (0.3.2 at the moment) to current version
of Talos (being built in CI). Tests starts by booting with 0.3
kernel/initramfs, runs 0.3 installer to install 0.3.2 cluster, wait for
bootstrap, followed by upgrade to 0.4 in rolling fashion. As Firecracker
supports bootloader, this boots 0.4 system from boot disk (as installed
by installer).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-21 07:04:03 -08:00
Andrey Smirnov
5f330f1f64 chore: push installer & talos images to the CI registry on every build
This enables a way to run the matching installer image in firecracker
tests. New image is used in firecracker tests and bootloader support to
use installed kernel/initramfs, which opens path for upgrade tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-18 07:32:45 -08:00
Spencer Smith
05aad743df chore: update capi-upstream
This PR will bring in the latest v1alpha2-supporting release ofthe upstream capi provider

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-31 11:30:16 -05:00
Andrey Smirnov
0afd0f651b chore: provide provisioned cluster info to integration test
Integration test can optionally consume cluster state as generated by
the call to `osctl cluster create` and use it to discover nodes in
integration tests.

This means that now CLI tests can use that as discovery source, and
API/K8s tests by default as well.

Flat list of nodes is to be replaced by something more complex in the
next iteration, but it's good for this PR.

As a demo, add CLI test with multiple nodes (dmesg).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-31 18:21:30 +03:00
Andrew Rynhard
88667641df chore: refactor E2E scripts
This PR aims to simplify our E2E scripts.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-26 20:47:25 -08:00
Andrew Rynhard
f87c6d74d3 chore: use firecracker in basic-integration
This adds a basic integration step that uses firecracker.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-23 05:52:22 -08:00
Spencer Smith
60260c85d1 feat: upgrade kubernetes version to 1.17.1
This PR will bring in the latest point release of k8s 1.17

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-17 09:39:26 -08:00
Andrew Rynhard
40f803de66 chore: run sonobuoy in quick mode
This adds sonobuoy's quick mode test to basic integration.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-17 09:25:43 -08:00
Andrew Rynhard
6533a41da7 chore: fix E2E script
The basic integration cluster name was changed in a previous PR. This
aligns the E2E script with the new naming conventions, and mounts the
correct integration test binary.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-10 07:53:26 -08:00
Andrew Rynhard
794d9e6066 chore: update all target in Makefile
We should build the most common things by default.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-06 11:08:27 -08:00
Andrey Smirnov
ebd40bd0eb chore: use osctl cluster --wait in basic-integration
There are few workarounds for Drone way of running integration test:
DinD runs as a separate pod, and we can only access its exposed on the
"host" ports, while from Talos cluster this endpoint is not reachable.

So internally Talos nodes still use addresses like "10.5.0.2", while
test is using "docker" to access it (that's name of the `docker` service
in the pipeline).

When running locally, 127.0.0.1 is used as endpoint, which should work
fine both on OS X and Linux.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-30 15:15:42 -08:00
Andrew Rynhard
5a7eb631b2 feat: add installer command to installer container
This replaces the entrypoint.sh shell script with a go binary.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-26 06:41:25 -08:00
Andrew Rynhard
31baa14e36 feat: add support for tftp download
This adds support for downloading the machine config over TFTP. This
will allow users to avoid having to setup an HTTP server, and use
whatever they are using for PXE.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-18 09:28:38 -08:00
Andrew Rynhard
49de6e96b3 chore: fix KVM test
This adds hostname and domain name options to the DHCP response.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-18 09:13:03 -08:00
Brad Beam
9584b47cd7 feat: Upgrade kubernetes to 1.17.0
Primarily doc/constant changes.

Added additionnal bits to `docs` target in makefile to generate osctl
docs as well as config files. Explicitly define a HOME variable so we
get consistent home directories for talosconfig variables in our docs.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-12-10 16:03:35 -08:00
Andrew Rynhard
5fb12f735a chore: make the CNI configurable in local KVM test
This changes the local KVM setup to use a configurable CNI URL.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-10 08:08:26 -08:00
Andrey Smirnov
399aeda0b9 feat: rename confusing target options, --endpoints, etc.
Fixes #1610

1. In `talosconfig`, deprecate `Target` in favor of `Endpoints`
(client-side LB to come next).

2. In `osctl`, use `--nodes` in place of `--target`.

3. In `osctl` add option `--endpoints` to override `Endpoints` for the
call.

Other changes are just updates to catch up with the changes. Most
probably I missed something... And CAPI provider needs update.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-10 02:23:54 +03:00
Andrew Rynhard
f7f85d7585 chore: upgrade sonobuoy to v0.16.5
This brings in the latest sonobuoy.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-05 14:36:04 -08:00
Andrey Smirnov
edb40437ec feat: add support for osctl logs -f
Now default is not to follow the logs (which is similar to `kubectl logs`).

Integration test was added for `Logs()` API and `osctl logs` command.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-05 13:58:52 -08:00
Spencer Smith
509ec5b6ff chore: update gcp disk sizes
This PR updates the disks to 100GB for hopes of better disk perf.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-12-05 13:55:40 -08:00
Andrew Rynhard
9f9fd02ceb chore: fix conformance
The `--e2e-parallel` flag seems to skip all tests when running in
certified-conformance mode. This reverts that change, and also adds a
check that fails if the conformance tests do not pass. This ensures that
we are not publishing broken versions of our edge release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-01 17:08:57 -08:00
Andrew Rynhard
712275dfea chore: upgrade sonobuoy
This upgrades sonouoy and additionally adds the `--e2e-parallel` flag to
hopefully speed things up.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-11-28 12:13:17 -08:00
Andrew Rynhard
031c65be47 feat: add IMA policy
This creates an IMA policy at boot. It uses the default TCB policy with
a dont_measure rule for XFS.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-11-26 16:49:48 -08:00
Andrew Rynhard
103620dc5c chore: add ability to specify custom intaller to libvirt setup
This is useful when developing Talos.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-11-25 14:57:18 -08:00
Andrew Rynhard
ae83221e4a test: add integration test for full boot sequence
This adds an integration test that can be ran on a KVM enabled Linux
machine. It makes use of docker, matchbox, dnsmasq, libvirt, and HAproxy
to create an HA cluster.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-11-15 09:02:52 -08:00
Spencer Smith
6d5bbaf7c8 chore: re-enable e2e for aws clusters
This PR adds in the necessary manifests and fixes to deploy aws clusters
as part of e2e testing.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-11-07 15:32:14 -05:00
Andrey Smirnov
8fdf71789e test: add 'integration-test' to e2e runs
Also refactored `integration-test` build as a generic step to be shared
by basic-integration and e2e-integration steps.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-07 06:30:34 -08:00
Spencer Smith
ce7a0e36cc chore: re-enable e2e testing
This PR will re-enable e2e testing by using the new cluster api
bootstrap provider and various infra providers.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-11-05 16:53:38 -05:00
Andrey Smirnov
b0aef2cf22 test: add integration test framework
This is just first steps and core foundation.

It can be used like:

```
make integration.test
osctl cluster create
build/integration.test -test.v
```

This should run the test against the Docker instance.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-05 17:21:38 +03:00
Brad Beam
a4e1479b07 refactor: Move kubeconfig to machined
This moves the Kubeconfig api endpoint to machined and consolidates the
"read a file" code into machined. This also changes Kubeconfig to
use the CopyOut method which changes Kubeconfig to a streaming grpc call.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-11-04 14:45:23 -08:00
Andrew Rynhard
a3dc6adec1 chore: remove unused files
This removes unused files in hack.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-10-31 22:46:38 -07:00
Andrew Rynhard
3c6d0135d0 feat: upgrade Kubernetes to 1.16.2
This brings in 1.16.2 modules and bumps the default hyperkube image.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-10-30 06:35:12 -07:00
Brad Beam
573cce8d18 feat: Add APId
This PR introduces APId. This service replaces the frontend functionality
previously provided by OSD. The main driver for this is two fold:

1. Create a single purpose application to expose the talos api

2. Make use of code generation to DRY api changes

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-25 13:02:33 -05:00
Andrew Rynhard
7a4b4d42b5 docs: add v0.3 AWS guide
This adds documentation for v0.3 AWS users.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-10-23 15:20:07 -07:00
Andrew Rynhard
638d36bce7 fix: ensure control plane endpoint is set
We were mistakenly overwriting the control plane endpoint in the
`generate` command. This fixes that and adds a simple validation of the
endpoint field in the config. We should expand the validation to ensure
that a valid IP or DNS name have been provided.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-10-21 19:44:20 -07:00
Brad Beam
e6bf92ce31 feat(osd): Enable hitting multiple OSD endpoints
This enables the ability to specify additional <talos> endpoints to connect to
to pull back data.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-16 15:30:25 -05:00
Spencer Smith
5d5f530bb0 chore: update sonobuoy for conformance tests
This PR updates the sonobuoy version. We're currently running
conformance tests with 0.15.x

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-10-10 18:26:05 -07:00