These tests rely on node uptime checks, which are quite flaky.
The following fixes were applied:
* the code was refactored into a common method shared between the reset and reboot tests
(the reboot-all-nodes test does its checks differently, so it wasn't updated)
* each request to read uptime times out after 5 seconds, so that checks
don't wait forever when a node is down (or the connection is aborted)
* to account for a node being available but reporting low uptime at the
beginning of the test, extra elapsed time is added to the check condition
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR updates the timeout values used in e2e testing. We were
hitting issues in GCP on the reboot test, as the nodes seemed to take
a few minutes to become responsive again. I also moved the
"cluster health" check in the node-by-node reboot test to use the
default suite context, so it has a timeout of 30m instead of the 5m
it had initially. This also seems to solve the node-by-node test
bailing out early.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
This is a rename of the osctl binary. We decided that talosctl is a
better name for the Talos CLI. This does not break any APIs, but does
make older documentation only accurate for previous versions of Talos.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Every node is reset, rebooted, and comes back up again, except for the
init node, due to known issues with the init node bootstrapping the etcd
cluster from scratch when metadata is missing (as the node was wiped).
The planned workaround is to prohibit resetting the init node (should be
coming next).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
As calls to the nodes are proxied through `apid` on the init node, we
can't reboot all nodes concurrently: the init node might already be down
by the time any other node is rebooted.
The test was rewritten to reboot all the nodes in a single multi-node
request.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This complements the "rolling restart" RebootNodeByNode test by providing
more of a disaster scenario, in which all the nodes are restarted at once.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR contains generic, simple TCP load balancer code, plus glue code
for the firecracker provisioner to use this load balancer.
The K8s control plane is passed through the load balancer, while the
Talos API is forwarded only to the init node (for now, as some APIs,
including kubeconfig, don't work with a non-init node).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The reboot test does node-by-node reboots followed by cluster health
checks (the same checks the provisioner does). Additional changes:
* fixed a minor bug with `Read()` returning a `Reader` instead of a
`ReadCloser`
* allowed `bootkube` to be `Skipped` (for a rebooted node)
* added support for running checks via a provided client instance
* implemented a generic capability to skip tests based on cluster
platform
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>