Due to the race, the main goroutine might consume all the errors from
`errCh` and close `nodesCh`, so a node goroutine might panic on a send
to the closed channel.
```
panic: send on closed channel
goroutine 40 [running]:
github.com/talos-systems/talos/internal/pkg/provision/providers/firecracker.(*provisioner).createNodes.func1(0x26ab668, 0xc00025a000, 0xc0005a83c0, 0xc00029d540, 0xc000536120, 0xc000464540, 0xc000041d80, 0x18, 0xc0006d406c, 0x4, ...)
/src/internal/pkg/provision/providers/firecracker/node.go:55 +0x1fa
created by github.com/talos-systems/talos/internal/pkg/provision/providers/firecracker.(*provisioner).createNodes
/src/internal/pkg/provision/providers/firecracker/node.go:50 +0x1ca
```
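The usual fix for this pattern is to let only the sending side close the
channel, and only after every sender has finished. A minimal sketch of that
pattern (assumed structure for illustration, not the actual provisioner code):
```
package main

import (
	"fmt"
	"sync"
)

func main() {
	names := []string{"master-1", "master-2", "worker-1"}

	nodesCh := make(chan string)

	var wg sync.WaitGroup

	for _, name := range names {
		wg.Add(1)

		go func(name string) {
			defer wg.Done()

			// If the receiving side closed nodesCh early (the bug above),
			// this send would panic with "send on closed channel".
			nodesCh <- name
		}(name)
	}

	// Fix: close nodesCh only once every sender goroutine is done.
	go func() {
		wg.Wait()
		close(nodesCh)
	}()

	for node := range nodesCh {
		fmt.Println("created node:", node)
	}
}
```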
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR will introduce a `-p/--exposed-ports` flag to talosctl. This
flag allows enabling port forwards on worker nodes only, which makes
ingresses possible on docker clusters so we can hopefully use ingress
for Arges initial bootstrapping. I modeled this after how KIND allows
ingresses [here](https://kind.sigs.k8s.io/docs/user/ingress/).
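A hypothetical invocation (the exact port-spec syntax is an assumption modeled
on Docker-style `host:container` mappings, not taken from this PR):
```
talosctl cluster create --name my-cluster -p 8080:80,8443:443
```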
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
With fix #1904, it's now possible to upgrade 0.4.x clusters with
`machine.File` extra files (caused by the registry mirror for
registry.ci.svc).
Bump resources for the upgrade tests in an attempt to speed them up.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR contains generic, simple TCP load balancer code, and glue code for
the firecracker provisioner to use this load balancer.
The K8s control plane is passed through the load balancer, while the Talos
API is forwarded only to the init node (for now, as some APIs, including
kubeconfig, don't work with non-init nodes).
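A minimal sketch of the idea (illustrative only, not the actual package
introduced here): accept TCP connections and proxy each one to an upstream
picked round-robin.
```
package main

import (
	"io"
	"log"
	"net"
	"sync/atomic"
)

type tcpLB struct {
	upstreams []string
	next      uint64
}

// pick returns the next upstream address in round-robin order.
func (lb *tcpLB) pick() string {
	n := atomic.AddUint64(&lb.next, 1)

	return lb.upstreams[int(n)%len(lb.upstreams)]
}

// handle proxies a single client connection to one upstream.
func (lb *tcpLB) handle(conn net.Conn) {
	defer conn.Close()

	upstream, err := net.Dial("tcp", lb.pick())
	if err != nil {
		log.Printf("dial upstream: %v", err)

		return
	}
	defer upstream.Close()

	// Pipe bytes in both directions until either side closes the connection.
	go io.Copy(upstream, conn) //nolint:errcheck
	io.Copy(conn, upstream)    //nolint:errcheck
}

func main() {
	lb := &tcpLB{upstreams: []string{"10.5.0.2:6443", "10.5.0.3:6443", "10.5.0.4:6443"}}

	l, err := net.Listen("tcp", "127.0.0.1:16443")
	if err != nil {
		log.Fatal(err)
	}

	for {
		conn, err := l.Accept()
		if err != nil {
			log.Fatal(err)
		}

		go lb.handle(conn)
	}
}
```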
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This implements `osctl cluster destroy` for Firecracker and adds a
new utility command, `osctl cluster show`.
Firecracker mode now has a control process for the firecracker VMs, allowing
clean reboots and background operation.
Lots of small fixes to Firecracker mode: clean CNI shutdown, cleaning up
the netns, etc.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This is the initial PR pushing the first version of the code; it has several
known problems which are going to be addressed in follow-up PRs:
1. there's no "cluster destroy", so the only way to stop the VMs is to
`pkill firecracker`
2. the provisioner creates state in `/tmp` and never deletes it; this is
required to keep the cluster running after `osctl cluster create` finishes
3. it doesn't run any controller process around firecracker to support
reboots/CNI cleanup (vethxyz interfaces linger on the host as
they're never cleaned up)
The plan is to create some structure in `~/.talos` to manage cluster
state, e.g. `~/.talos/clusters/<name>`, which will contain all the
required files (disk images, socket files, VM logs, etc.). This
directory structure will also work as a way to detect running clusters
and clean them up.
For point 3, `osctl cluster create` is going to exec a lightweight
process that controls the firecracker VM process and simulates VM reboots
by restarting firecracker whenever it exits cleanly (which is what happens
when the VM reboots).
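A hypothetical sketch of these two pieces (assumed names and layout, not the
actual implementation): per-cluster state under `~/.talos/clusters/<name>`,
plus a supervisor loop that restarts firecracker on clean exit to simulate a
reboot.
```
package main

import (
	"log"
	"os"
	"os/exec"
	"path/filepath"
)

// stateDir resolves the per-cluster state directory, e.g. ~/.talos/clusters/<name>.
func stateDir(clusterName string) (string, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}

	dir := filepath.Join(home, ".talos", "clusters", clusterName)

	return dir, os.MkdirAll(dir, 0o755)
}

// superviseVM re-launches firecracker as long as it exits cleanly, since a
// clean exit is how a guest-initiated reboot manifests; any other exit stops
// the loop and propagates the error.
func superviseVM(args ...string) error {
	for {
		cmd := exec.Command("firecracker", args...)
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr

		if err := cmd.Run(); err != nil {
			return err
		}

		log.Println("firecracker exited cleanly, restarting VM to simulate reboot")
	}
}

func main() {
	dir, err := stateDir("talos-default")
	if err != nil {
		log.Fatal(err)
	}

	log.Println("cluster state directory:", dir)

	// Firecracker command-line arguments are omitted here on purpose; the
	// real provisioner would wire up the API socket, disks, etc. from dir.
	if err := superviseVM(); err != nil {
		log.Fatal(err)
	}
}
```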
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
We have been using two packages that define a config type and a machine
type, when really they are one and the same. This unifies the types down
to one set.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This PR will allow users to issue `osctl config generate`, tweak the
configs to their liking, then use those configs to call `osctl cluster
create`.
Example workflow:
```
osctl config generate my-cluster https://10.5.0.2:6443 -o ./my-cluster
** tweaky tweak **
osctl cluster create --name my-cluster --input-dir "$PWD/my-cluster"
```
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
This PR will allow users to use an existing docker network for their
talos cluster. This should be useful for those wanting further
control and configuration of their local docker clusters, as well as
possibly useful for us during CI. The docker networks can be pre-created
with something like: `docker network create my-cluster --subnet
192.168.0.0/24 --label talos.owned=true --label
talos.cluster.name=my-cluster`. Note that the labels are prerequisites for our discovery and re-use of these networks.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
There are a few workarounds for the way Drone runs the integration test:
DinD runs as a separate pod, and we can only access the ports it exposes
on the "host", while this endpoint is not reachable from the Talos cluster.
So internally Talos nodes still use addresses like "10.5.0.2", while the
test uses "docker" to access it (that's the name of the `docker` service
in the pipeline).
When running locally, 127.0.0.1 is used as the endpoint, which should work
fine on both OS X and Linux.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This extracts the Docker Talos cluster provisioner as common code
which can be shared between `osctl cluster` and the integration test.
There should be almost no functional changes.
As a proof of concept, abstract cluster readiness checks were implemented
based on the provisioned cluster state. They implement the same checks as
`basic-integration.sh` in pure Go via the Talos/K8s clients.
The `conditions` package was promoted from machined-internal to
`internal/pkg`, as it is used to run the checks.
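A hypothetical sketch of one such readiness check in pure Go (using client-go
directly; package, function names and polling interval are assumptions, not
the actual checks from this PR): poll the Kubernetes API until every node
reports Ready, roughly what `basic-integration.sh` did with kubectl.
```
package check

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// AllNodesReady polls the Kubernetes API until the expected number of nodes
// is registered and every node reports the Ready condition, or ctx expires.
func AllNodesReady(ctx context.Context, clientset kubernetes.Interface, expected int) error {
	for {
		nodes, err := clientset.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
		if err == nil && len(nodes.Items) == expected && allReady(nodes.Items) {
			return nil
		}

		select {
		case <-ctx.Done():
			return fmt.Errorf("waiting for nodes to be ready: %w", ctx.Err())
		case <-time.After(5 * time.Second):
		}
	}
}

// allReady reports whether every node has the Ready condition set to True.
func allReady(nodes []corev1.Node) bool {
	for _, node := range nodes {
		ready := false

		for _, cond := range node.Status.Conditions {
			if cond.Type == corev1.NodeReady && cond.Status == corev1.ConditionTrue {
				ready = true
			}
		}

		if !ready {
			return false
		}
	}

	return true
}
```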
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>