talos

mirror of https://github.com/siderolabs/talos.git synced 2025-11-25 20:51:47 +01:00

Author	SHA1	Message	Date
Andrey Smirnov	11c2b8f80c	test: bump defaults for provision tests resources Our defaults are too low now. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-12-07 07:01:41 -08:00
Andrey Smirnov	621968977e	feat: update kubernetes to 1.20.0-rc.0 Talos 0.8 is going to ship with K8s 1.20. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-12-02 10:50:58 -08:00
Andrey Smirnov	28ba6e416e	feat: update Kubernetes to v1.20.0-beta.2 Talos 0.8 is going to ship with K8s 1.20.x. Changes to support new `control-plane` label, upgrade-k8s supports automated fixups for 1.20. See also: https://github.com/talos-systems/bootkube-plugin/pull/22 Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-11-25 06:39:14 -08:00
Andrey Smirnov	1add26b42a	chore: bump K8s to 1.19.4 in e2e scripts with CABPT version This should fix the problem with the kubelet image. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-11-16 07:18:44 -08:00
Andrey Smirnov	61facf700a	chore: build arm64 images in CI This changes installer image/iso output to be tar via stdout (optionally), so that we can copy back artifacts back from remote docker daemon. Fixes #2776 Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-11-13 12:34:48 -08:00
Andrey Smirnov	df6ad3fa80	feat: upgrade Kubernetes default version to 1.19.4 k8s.io modules don't have 1.19.4 tag yet :( Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-11-12 08:51:04 -08:00
Andrey Smirnov	350d75eb46	feat: build talosctl-cni-bundle, use it in talosctl for QEMU This builds a bundle with CNI plugins for talosctl which is automatically downloaded by `talosctl` if CNI plugins are missing. CNI directories are moved by default to the `~/.talos/cni` path. Also add a bunch of pre-flight checks to the QEMU provisioner to make it easier to bootstrap the Talos QEMU cluster. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-10-30 16:30:37 -07:00
Artem Chernyshev	061b296530	feat: allow specifying user-disks in talosctl cluster create User-disks are supported by QEMU and Firecracker providers. Can be defined by using the following parameters: ``` --user-disk /mount/path:1GB ``` Can get more than 1 user disk. Same set of user disks will be created for all master and worker nodes. Additionally enable user-disks in qemu e2e test. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-10-30 08:44:08 -07:00
Andrew Rynhard	1b0ed13231	docs: move to gridsome Brings in a new theme, improved content, and restructured layout. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-10-26 21:14:14 -07:00
Spencer Smith	7b4633b35d	chore: update CI scripts This PR pulls in the latest version of our CAPI providers, as well as makes some minor tweaks to our bash scripts to disable terminal output of commands during certain actions. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-10-22 09:00:41 -07:00
Andrew Rynhard	7017327059	chore: update qemu hack script to use ISO This can serve as an example of providing the config via an ISO, and simplify local setups a bit. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-09-30 10:58:42 -07:00
Andrey Smirnov	ff0d4b305a	feat: build Talos images/artifacts for amd64/arm64 By default, build outside of Drone works the same and builds only amd64 version, loads images back into dockerd, etc. If multiple platforms are used, multi-arch images are built which can't be exported to docker or to `.tar` image, they're always pushed to the registry (even for PR builds to our internal CI registry). Artifacts as files (initramfs, kernel) now have `-arch` suffix: `vmlinuz-amd64`, `initramfs-amd64.xz`. "Magic" script normalizes output paths depending on whether single platform or multiple platforms were given. VM provisioners accept magic `${ARCH}` in initramfs/kernel paths which gets replaced by cluster architecture. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-09-27 10:32:07 -07:00
Andrew Rynhard	7d2741fc4b	chore: migrate to ghcr.io Move to GHCR. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-09-23 15:06:30 -07:00
Andrey Smirnov	788cd15c29	test: add e2e test to the provision (upgrade) tests Add sonobuoy runner code with log fetching on failure. Use hand-picked set of e2e tests to run: verify basic pod functionality, verify service connectivity. Add option `--run-e2e` to the `talosctl health` to run quick e2e test to verify cluster health. Add option to run provision tests with custom CNI, run one track of provision tests with Cilium. Bump Cilium to 1.8.2. Talos 0.6 won't uncordon node automatically after upgrade from 0.5, as 0.5 doesn't put annotation. Workaround that in upgrade tests. Bump upgrade test version to 0.6.0 release. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-09-08 13:26:31 -07:00
Andrew Rynhard	1a4059a553	feat: add grub bootloader This moves to using grub instead of syslinux. BREAKING CHANGE: Single node upgrades will fail in this change. This will also break the A/B fallback setup since this version introduces an entirely new partition scheme, that any fallback will not know about. We plan on addressing these issues in a follow up change. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-09-01 12:06:43 -07:00
Andrey Smirnov	59adf7315d	feat: provide option to run Talos under UEFI in QEMU This also adds integration pipeline tests for UEFI. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-28 12:51:10 -07:00
Spencer Smith	c07ce17b7a	fix: update e2e scripts to work with python3 This PR replaces python with python3 Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-08-11 12:38:52 -07:00
Spencer Smith	303c477051	chore: update capi CI manifests to use control planes This PR will update the CI testing to make use of our control plane provider, as well as the other CAPI components. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-08-11 11:14:44 -04:00
Andrey Smirnov	55f3249783	test: use registry mirrors in CI This relies on registry caching mirrors running in the CI. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-31 16:30:41 +03:00
Andrey Smirnov	58aa2b75bb	test: destroy clusters in e2e tests (qemu/firecracker) As the build runs inside containers which are part of a single pod, we need to clean up networking bits (bridge interface, etc.), so that it doesn't cause problems for other steps. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-31 06:21:09 -07:00
Andrey Smirnov	a5d64d97c1	test: update qemu/firecracker provisioners Fixes #2363 #2364 #2370 #2371 Several changes packed together: * use compressed `vmlinuz` everywhere, firecracker provisioner uncompresses it before first use, drop `vmlinux` * handle reboots in qemu launcher to support reset API case, update empty disk check to handle reset behavior (erasing partition table) * make bootloader support default in provisioners, and flag to disable that * early support for target architecture for qemu provisioner This should allow us to use `qemu` in CI/CD (not included into this PR): integration test passes with qemu. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-30 21:17:25 +03:00
Artem Chernyshev	c6eb18eed5	feat: qemu provisioner Starts and stops qemu VMs, has some initial configuration subset. Sets up networking through CNI tools, sets up DHCP server which gives IP addresses to nodes. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-07-28 14:55:35 -07:00
Andrey Smirnov	6a81f30941	test: provide node discovery for cli tests via kubectl Fixes #2330 CLI tests require node discovery as `--nodes` flag is enforced for most of the `talosctl commands`. For clusters created via `talosctl cluster create`, cluster provisioner state provides all the necessary information, but clusters created via CAPI don't have the state attached. API tests rely on Talos and Kubernetes APIs to fetch kubeconfig and access Nodes K8s API. CLI tests should rely only on CLI tools, so we use `kubectl get nodes` + `talosctl kubeconfig` to fetch list of master and worker nodes. This discovery method relies on "bootstrap" node being set in `talosconfig` (to fetch `kubeconfig`). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-28 11:35:47 -07:00
Andrey Smirnov	76c44ac468	test: remove apid load balancer for firecracker We're not using load balancer for `apid` (always using client-side load balancing), so we can remove this safely. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-28 20:21:21 +03:00
Andrew Rynhard	1f31d24e55	chore: use Kubernetes pipelines This moves to using Kubernetes pipelines. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-07-27 12:09:53 -07:00
Andrey Smirnov	3d8418a689	feat: force nodes to be set in `talosctl` commands using the API With load-balancing enabled by default running `talosctl` without `--nodes` is risky, as it might hit any control plane by default without `--nodes`. Only two commands do not enforce this check, as they do their own node contexts: `crashdump` and `health` (client-side). Integration tests were updated to always supply `--nodes` cli argument, while doing that I refactored the storage for discovered nodes to use existing `cluster.Info` interface. The downside is that with e2e CAPI tests CLI tests will be mostly skipped as we don't support discovery in CLI tests at the momemnt. This can be fixed by using `talosctl kubeconfig` + `kubectl get nodes` for node discovery. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-21 12:17:43 -07:00
Spencer Smith	f290f88160	chore: update clusterctl for CI testing This PR brings in the latest version of clusterctl that has built-in support for the talos repos. I'll be chasing this with a move to using the control-plane provider as well! Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-07-15 19:33:59 -04:00
Andrey Smirnov	9590030a84	feat: print crash dump in `talosctl cluster create` on failure When cluster fails to be bootstrapped or it fails the health check, it's hard to find the root cause without the logs. This change adds optional crashdump (it dumps firecracker logs or docker logs) after provisioning failure. It's not enabled by default. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-10 11:54:07 -07:00
Andrey Smirnov	4f5660b22b	test: fix sonobuoy delete It expects kubeconfig as required argument. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-09 18:46:57 +03:00
Spencer Smith	67cddaff44	chore: wait for resource deletion in sonobuoy This PR fixes the fix where we try to cleanup sonobuoy. We did that successfully, but still got errors b/c we were immediately trying to create service accounts in a namespace that was being deleted. This should fix that. The sonobuoy default wait period is 1hr, should be plenty. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-07-07 10:58:47 -07:00
Spencer Smith	13bd77355e	chore: cleanup sonobuoy after failed attempts This PR will make sure that, if we're going to retry sonobuoy, we run the delete command first to clean up any dangling resources. Closes #2266. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-07-06 11:46:49 -07:00
Andrey Smirnov	3ae5e0e749	test: add short integration test with custom CNI This adds new flug to `cluster create` to launch cluster with custom CNI, `integration` pipeline gets a new step to run short test with Cilium 1.8.0 CNI. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-01 11:19:19 -07:00
Patatman	90acb01a4e	docs: digital rebar docs Digital rebar docs in the guide section. Signed-off-by: Patatman <git@jeursen.nl>	2020-06-30 18:52:39 -07:00
Andrey Smirnov	e46a09f56a	chore: make default pipeline run shorter integration test This moves full integratation test and provision tests to the `integration` pipeline. Docker test wasn't affected much, as anyways docker can't run long integration tests, so it mostly affects firecracker and provision tests. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-01 00:14:55 +03:00
Andrew Rynhard	d0d2ac3c74	test: default to using the bootstrap API This moves our test scripts to using the bootstrap API. Some automation around invoking the bootstrap API was also added to give the same ease of use when creating clusters with the CLI. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-06-24 08:46:10 -07:00
Spencer Smith	e03a68f8eb	feat: update k8s and sonobuoy versions This PR will update k8s to the latest 1.18 release and bump sonobuoy to help resolve some e2e flakes. Also adds some retry logic around the sonobuoy run. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-06-10 06:47:36 -07:00
Spencer Smith	c1b6f05b00	chore: use clusterctl and v1alpha3 providers for tests This PR will update our testing ocde to make use of the clusterctl tool, as well as use the newer versions of various providers and updated manifests. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-05-01 07:42:19 -07:00
Spencer Smith	8d2f8d6127	chore: remove random.trust_cpu references This PR removes the references to adding in the random CPU trust to the kernel for all v0.4 docs, as well as in the iso command in the installer. This is no longer needed with the newer linux kernel. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-04-14 17:10:56 -07:00
Spencer Smith	3a4eaeeef0	feat: upgrade kubernetes to 1.18 This PR will pull in the latest release of k8s 1.18 so we can start validating it through our test suite. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-03-26 14:59:43 -04:00
Andrey Smirnov	104af4380e	feat: make `--wait` default option to `talosctl cluster create` It seems to be useful enough to be the default one and it prevents simple mistakes while trying to access the cluster which is not ready yet. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-03-25 06:36:43 -07:00
Andrey Smirnov	e38cde9b48	chore: update upgrade tests for new version, split into two tracks This updates upgrade tests to run two flows with 3+1 clusters: 1. 0.3 -> current (testing upgrade with partition wiping) 2. 0.4-alpha.7 -> current (testing upgrade without partition wiping, boot-a/boot-b) And small upgrade with preserve enabled for single-node cluster. Provision tests are now split into two parallel tracks in Drone. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-03-24 15:30:00 -07:00
Spencer Smith	3485ea9f09	fix: update k8s to 1.17.3 This PR will update k8s to v1.17.3 to address CVEs mentioned in https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/kubernetes-security-announce/2UOlsba2g0s Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-03-23 17:08:52 -07:00
Andrew Rynhard	5dbc26c7a3	feat: rename osctl to talosctl This is a rename of the osctl binary. We decided that talosctl is a better name for the Talos CLI. This does not break any APIs, but does make older documentation only accurate for previous versions of Talos. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-03-20 19:07:39 -07:00
Andrey Smirnov	2e3681054d	chore: improve handling of etcd responses in bootkube pre-func Try more attempts, wait for the response. Treat empty response as no error (as this is what to expect when key is not set yet). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-03-06 21:06:48 +03:00
Andrey Smirnov	d5d3035c8c	test: enable upgrade tests 0.4.x -> latest With the fix #1904, it's now possible to upgrade 0.4.x with `machine.File` extra files (caused by registry mirror for registry.ci.svc). Bump resources for upgrade tests in attempt to speed it up. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-02-26 00:09:32 +03:00
Andrey Smirnov	923ef4537b	test: implement new class of tests: provision tests (upgrades) This class of tests is included/excluded by build tags, but as it is pretty different from other integration tests, we build it as separate executable. Provision tests provision cluster for the test run, perform some actions and verify results (could be upgrade, reset, scale up/down, etc.) There's now framework to implement upgrade tests, first of the tests tests upgrade from latest 0.3 (0.3.2 at the moment) to current version of Talos (being built in CI). Tests starts by booting with 0.3 kernel/initramfs, runs 0.3 installer to install 0.3.2 cluster, wait for bootstrap, followed by upgrade to 0.4 in rolling fashion. As Firecracker supports bootloader, this boots 0.4 system from boot disk (as installed by installer). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-02-21 07:04:03 -08:00
Andrey Smirnov	5f330f1f64	chore: push installer & talos images to the CI registry on every build This enables a way to run the matching installer image in firecracker tests. New image is used in firecracker tests and bootloader support to use installed kernel/initramfs, which opens path for upgrade tests. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-02-18 07:32:45 -08:00
Spencer Smith	05aad743df	chore: update capi-upstream This PR will bring in the latest v1alpha2-supporting release ofthe upstream capi provider Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-01-31 11:30:16 -05:00
Andrey Smirnov	0afd0f651b	chore: provide provisioned cluster info to integration test Integration test can optionally consume cluster state as generated by the call to `osctl cluster create` and use it to discover nodes in integration tests. This means that now CLI tests can use that as discovery source, and API/K8s tests by default as well. Flat list of nodes is to be replaced by something more complex in the next iteration, but it's good for this PR. As a demo, add CLI test with multiple nodes (dmesg). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-01-31 18:21:30 +03:00
Andrew Rynhard	88667641df	chore: refactor E2E scripts This PR aims to simplify our E2E scripts. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-01-26 20:47:25 -08:00

1 2 3

122 Commits