12 Commits

Andrey Smirnov
0af7624c7d fix: resolve race condition in createNodes
Due to the race, the main goroutine might consume all the errors from
`errCh` and close `nodesCh`, so a node goroutine might panic on sending
to the closed channel.

```
panic: send on closed channel

goroutine 40 [running]:
github.com/talos-systems/talos/internal/pkg/provision/providers/firecracker.(*provisioner).createNodes.func1(0x26ab668, 0xc00025a000, 0xc0005a83c0, 0xc00029d540, 0xc000536120, 0xc000464540, 0xc000041d80, 0x18, 0xc0006d406c, 0x4, ...)
	/src/internal/pkg/provision/providers/firecracker/node.go:55 +0x1fa
created by github.com/talos-systems/talos/internal/pkg/provision/providers/firecracker.(*provisioner).createNodes
	/src/internal/pkg/provision/providers/firecracker/node.go:50 +0x1ca
```
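
For reference, a minimal sketch of the safe pattern (names are
illustrative, not the actual provisioner code): close the results
channel only after every worker goroutine is known to be done, e.g.
via a `sync.WaitGroup`:

```
package main

import (
	"fmt"
	"sync"
)

// createNodes sketch: each worker sends either a node name or an error.
// Closing nodesCh only after wg.Wait() guarantees no worker can hit
// "send on closed channel", which is what the original race allowed.
func createNodes(count int) ([]string, error) {
	nodesCh := make(chan string, count)
	errCh := make(chan error, count)

	var wg sync.WaitGroup

	for i := 0; i < count; i++ {
		wg.Add(1)

		go func(i int) {
			defer wg.Done()

			// ... boot the VM here; on failure: errCh <- err; return
			nodesCh <- fmt.Sprintf("node-%d", i)
		}(i)
	}

	wg.Wait()      // all sends have completed by this point
	close(nodesCh) // now closing is safe
	close(errCh)

	for err := range errCh {
		if err != nil {
			return nil, err
		}
	}

	nodes := make([]string, 0, count)
	for node := range nodesCh {
		nodes = append(nodes, node)
	}

	return nodes, nil
}

func main() {
	nodes, err := createNodes(4)
	fmt.Println(nodes, err)
}
```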

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-10 14:15:41 -07:00
Spencer Smith
b84d5e2660 feat: allow for exposing ports on docker clusters
This PR introduces a `-p/--exposed-ports` flag to talosctl, which
enables port forwards on worker nodes only. This makes ingresses
possible on docker clusters, so we can hopefully use ingress for the
initial Arges bootstrapping. The approach is modeled after how KIND
supports ingress [here](https://kind.sigs.k8s.io/docs/user/ingress/).
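
Under the hood, exposing a port on a worker container boils down to a
Docker port binding; a rough sketch using the Docker SDK types (a
hypothetical helper, not the actual talosctl code):

```
package main

import (
	"fmt"

	"github.com/docker/docker/api/types/container"
	"github.com/docker/go-connections/nat"
)

// exposedPortsToHostConfig is a hypothetical helper: it turns
// hostPort -> containerPort pairs into Docker port bindings, which is
// roughly what exposing ports on a worker container amounts to.
func exposedPortsToHostConfig(pairs map[string]string) (*container.HostConfig, error) {
	bindings := nat.PortMap{}

	for hostPort, containerPort := range pairs {
		port, err := nat.NewPort("tcp", containerPort)
		if err != nil {
			return nil, err
		}

		bindings[port] = []nat.PortBinding{
			{HostIP: "0.0.0.0", HostPort: hostPort},
		}
	}

	return &container.HostConfig{PortBindings: bindings}, nil
}

func main() {
	cfg, err := exposedPortsToHostConfig(map[string]string{"8080": "80"})
	fmt.Println(cfg.PortBindings, err)
}
```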

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-30 15:24:25 -04:00
Andrey Smirnov
d5d3035c8c test: enable upgrade tests 0.4.x -> latest
With fix #1904, it's now possible to upgrade from 0.4.x with
`machine.File` extra files (caused by the registry mirror for
registry.ci.svc).

Bump resources for the upgrade tests in an attempt to speed them up.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-26 00:09:32 +03:00
Andrey Smirnov
76c2038b13 chore: implement loadbalancer for firecracker provisioner
This PR contains a generic, simple TCP load balancer, plus glue code
for the firecracker provisioner to use it.

The K8s control plane is passed through the load balancer, while the
Talos API is passed only to the init node (for now, as some APIs,
including kubeconfig, don't work with non-init nodes).
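
A minimal sketch of a load balancer of this kind (round-robin over
upstreams, piping bytes both ways; illustrative, not the actual
package):

```
package main

import (
	"io"
	"log"
	"net"
	"sync/atomic"
)

// loadBalancer proxies each accepted TCP connection to one of the
// upstreams, picked round-robin — the same basic idea as passing the
// K8s control plane through a simple TCP load balancer.
type loadBalancer struct {
	upstreams []string
	next      uint64
}

func (lb *loadBalancer) pick() string {
	n := atomic.AddUint64(&lb.next, 1)
	return lb.upstreams[int(n)%len(lb.upstreams)]
}

func (lb *loadBalancer) handle(conn net.Conn) {
	defer conn.Close()

	upstream, err := net.Dial("tcp", lb.pick())
	if err != nil {
		log.Printf("dial upstream: %v", err)
		return
	}
	defer upstream.Close()

	go io.Copy(upstream, conn) // client -> upstream
	io.Copy(conn, upstream)    // upstream -> client
}

func (lb *loadBalancer) run(addr string) error {
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return err
	}

	for {
		conn, err := l.Accept()
		if err != nil {
			return err
		}

		go lb.handle(conn)
	}
}

func main() {
	lb := &loadBalancer{upstreams: []string{"10.5.0.2:6443", "10.5.0.3:6443"}}
	log.Fatal(lb.run("127.0.0.1:6443"))
}
```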

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-13 23:07:13 +03:00
Andrey Smirnov
fae5e6915d chore: rework firecracker code around upstream Go SDK + PRs
This removes the use of a private fork with custom `ip=` kernel
argument handling and switches fully to the upstream version.

The Firecracker Go SDK version is `master` plus the following PRs:

* https://github.com/firecracker-microvm/firecracker-go-sdk/pull/167
* https://github.com/firecracker-microvm/firecracker-go-sdk/pull/177
* https://github.com/firecracker-microvm/firecracker-go-sdk/pull/178

MTU handling support was implemented as well.

Changes:

* the hostname for each node is passed via the `talos.hostname=` kernel arg (sketched below)
* IP configuration is generated by the SDK from the CNI result
* fixed bugs with wrong netmask
* nameservers & MTU are passed via the Talos config
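
A sketch of what the resulting SDK config looks like (`KernelArgs` is
the SDK's field; paths and values here are placeholders):

```
package main

import (
	"fmt"

	firecracker "github.com/firecracker-microvm/firecracker-go-sdk"
)

func main() {
	// Only the hostname goes in via kernel args (talos.hostname=);
	// the ip= configuration is generated by the SDK from the CNI result.
	cfg := firecracker.Config{
		SocketPath:      "/tmp/firecracker.sock",
		KernelImagePath: "/path/to/vmlinux",
		KernelArgs:      "console=ttyS0 reboot=k panic=1 talos.hostname=talos-worker-1",
	}

	fmt.Println(cfg.KernelArgs)
}
```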

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-29 02:35:15 +03:00
Andrey Smirnov
9da687d2a3 test: firecracker provisioner fixes, implement cluster destroy
This implements `osctl cluster destroy` for Firecracker and adds a
new utility command, `osctl cluster show`.

Firecracker mode now has a control process for the firecracker VMs,
allowing clean reboots and background operation.

Lots of small fixes to Firecracker mode: clean CNI shutdown, cleaning
up the netns, etc.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-21 17:11:06 -08:00
Andrey Smirnov
2bf8540855 test: provision Talos clusters via Firecracker VMs
This is the initial PR, pushing the first version of the code; it has
several known problems which are going to be addressed in follow-up
PRs:

1. there's no "cluster destroy", so the only way to stop the VMs is to
`pkill firecracker`

2. the provisioner creates state in `/tmp` and never deletes it; the
state is required to keep the cluster running after `osctl cluster
create` finishes

3. there's no controller process around firecracker to support
reboots/CNI cleanup (vethxyz interfaces linger on the host, as they're
never cleaned up)

The plan is to create some structure in `~/.talos` to manage cluster
state, e.g. `~/.talos/clusters/<name>` which will contain all the
required files (disk images, file sockets, VM logs, etc.). This
directory structure will also work as a way to detect running clusters
and clean them up.

For point 3, `osctl cluster create` is going to exec a lightweight
process that controls the firecracker VM process and simulates VM
reboots: firecracker exits cleanly when the VM reboots, so the
controller restarts it.
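
A minimal sketch of that supervision loop (flags and paths are
placeholders, not the actual controller):

```
package main

import (
	"log"
	"os/exec"
)

// superviseVM restarts firecracker whenever it exits cleanly — a clean
// exit is how a VM reboot shows up — and gives up on a real failure.
func superviseVM(socketPath string) error {
	for {
		cmd := exec.Command("firecracker", "--api-sock", socketPath)

		if err := cmd.Run(); err != nil {
			return err // real failure: stop supervising
		}

		log.Println("firecracker exited cleanly (VM reboot), restarting")
	}
}

func main() {
	if err := superviseVM("/tmp/firecracker.sock"); err != nil {
		log.Fatal(err)
	}
}
```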

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-16 00:27:08 +03:00
Andrew Rynhard
898cf01f0a refactor: unify generate type and machine type
We have been using two packages that define a config type and a machine
type, when really they are one and the same. This unifies the types down
to one set.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-10 16:46:28 -08:00
Spencer Smith
75d9f7b454 feat: support configurable docker-based clusters
This PR will allow users to issue `osctl config generate`, tweak the
configs to their liking, then use those configs to call `osctl cluster
create`.

Example workflow:

```
osctl config generate my-cluster https://10.5.0.2:6443 -o ./my-cluster

** tweaky tweak **

osctl cluster create --name my-cluster --input-dir "$PWD/my-cluster"
```

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-08 14:11:56 -05:00
Spencer Smith
6722a52aba chore: allow re-use of docker network for local clusters
This PR will allow users to use an existing docker network for their
Talos cluster. Hopefully this will be useful for those wanting further
control and configuration of their local docker clusters, as well as
for us during CI. The docker network can be pre-created with something
like: `docker network create my-cluster --subnet 192.168.0.0/24 --label
talos.owned=true --label talos.cluster.name=my-cluster`. Note that the
labels are prerequisites for our discovery and re-use of these networks.
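
The discovery side boils down to filtering networks by those labels; a
sketch with the Docker Go client (the client calls are real, the wiring
is illustrative):

```
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}

	// Discover pre-created networks via the labels mentioned above.
	args := filters.NewArgs(
		filters.Arg("label", "talos.owned=true"),
		filters.Arg("label", "talos.cluster.name=my-cluster"),
	)

	networks, err := cli.NetworkList(context.Background(), types.NetworkListOptions{Filters: args})
	if err != nil {
		log.Fatal(err)
	}

	for _, network := range networks {
		fmt.Println(network.Name, network.ID)
	}
}
```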

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-01-03 16:21:07 -05:00
Andrey Smirnov
ebd40bd0eb chore: use osctl cluster --wait in basic-integration
There are a few workarounds for the Drone way of running the
integration test: DinD runs as a separate pod, and we can only access
its ports exposed on the "host", while from the Talos cluster this
endpoint is not reachable.

So internally Talos nodes still use addresses like "10.5.0.2", while
the test uses "docker" to access them (that's the name of the `docker`
service in the pipeline).

When running locally, 127.0.0.1 is used as the endpoint, which should
work fine on both OS X and Linux.
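
The endpoint selection comes down to something like this sketch (the
`CI` environment check is an assumption, not the actual detection
logic):

```
package main

import (
	"fmt"
	"os"
)

// apiEndpoint picks the endpoint for talking to the cluster: inside the
// Drone pipeline the DinD service is reachable as "docker", while
// locally the exposed ports are on 127.0.0.1.
func apiEndpoint() string {
	if os.Getenv("CI") != "" {
		return "docker"
	}

	return "127.0.0.1"
}

func main() {
	fmt.Println(apiEndpoint())
}
```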

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-30 15:15:42 -08:00
Andrey Smirnov
0081ac5fac refactor: extract Talos cluster provisioner as common code
This extracts the Docker Talos cluster provisioner as common code
which can be shared between `osctl cluster` and the integration test.

There should be almost no functional changes.

As a proof of concept, abstract cluster readiness checks were
implemented based on the provisioned cluster state. They implement the
same checks as `basic-integration.sh` in pure Go via the Talos/K8s
clients.

The `conditions` package was promoted from machined-internal to
`internal/pkg`, as it is used to run the checks.
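
In the spirit of such checks, a minimal sketch of a poll-until-ready
condition (a hypothetical shape, not the actual `conditions` API):

```
package main

import (
	"context"
	"fmt"
	"time"
)

// Condition is a check that either passes (nil) or fails with an error.
type Condition func(ctx context.Context) error

// WaitFor retries the check on an interval until it passes or the
// context is done — the same basic shape as a cluster readiness check.
func WaitFor(ctx context.Context, interval time.Duration, check Condition) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		if err := check(ctx); err == nil {
			return nil
		}

		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	start := time.Now()

	// Example check: "cluster is ready" after two seconds.
	err := WaitFor(ctx, 100*time.Millisecond, func(ctx context.Context) error {
		if time.Since(start) > 2*time.Second {
			return nil
		}
		return fmt.Errorf("not ready yet")
	})

	fmt.Println("ready:", err)
}
```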

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-27 12:14:19 -08:00