14 Commits

Author SHA1 Message Date
Andrey Smirnov
dd8336c9ee
fix: refresh kubelet self-issued serving certificates
Kubelet doesn't refresh self-issued serving certificates, so force it by
removing the cert on each restart.

Fix the code that forces a rejoin when the nodename changes. It was
broken because it checked the serving certificate instead of the client
certificate, and it only worked by accident when controlplane-issued
serving certificates were not in use.

Fixes #7235

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-18 22:19:34 +04:00
Noel Georgi
d1a61fd343
chore: bump golangci-lint
Bump golangci-lint and fix up new warnings. Ignore the check for unused
function parameters: it is noisy and makes interface implementations
confusing to read.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-22 19:55:38 +05:30
Andrey Smirnov
82e8c9e1f6
fix: workaround panic in the kubelet service controller
The traceback:

```
user: warning: [2022-12-02T17:31:09.496341098Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.KubeletServiceController", "error": "controller \"k8s.KubeletServiceController\" panicked: runtime error: invalid memory address or nil pointer dereference\n\ngoroutine 308 [running]:\nruntime/debug.Stack()\n\t/toolchain/go/src/runtime/debug/stack.go:24 +0x65\ngithub.com/cosi-project/runtime/pkg/controller/runtime.(*adapter).runOnce.func2()\n\t/.cache/mod/github.com/cosi-project/runtime@v0.1.1/pkg/controller/runtime/adapter.go:403 +0x5d\npanic({0x2b7b600, 0x536c7c0})\n\t/toolchain/go/src/runtime/panic.go:884 +0x212\ngithub.com/talos-systems/talos/internal/app/machined/pkg/controllers/k8s.updateKubeconfig(0xc0000d49b0?)\n\t/src/internal/app/machined/pkg/controllers/k8s/kubelet_service.go:302 +0xb8\ngithub.com/talos-systems/talos/internal/app/machined/pkg/controllers/k8s.(*KubeletServiceController).Run(0xc000956030, {0x389f7c0, 0xc000808040}, {0x38bce60, 0xc0000dfa80}, 0x0?)\n\t/s...
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-06 20:53:30 +04:00
Andrey Smirnov
a505b8909a
fix: update COSI and reset restart backoff on success
See https://github.com/cosi-project/runtime/pull/191

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-06 17:43:26 +04:00
Andrey Smirnov
96aa9638f7
chore: rename talos-systems/talos to siderolabs/talos
There's a cyclic dependency on the siderolink library, which imports
Talos machinery back. We will fix that after Talos is pushed under the
new name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-03 16:50:32 +04:00
Dmitriy Matrenichev
fc48849d00
chore: move maps/slices/ordered to gen module
Use github.com/siderolabs/gen

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-21 20:22:43 +03:00
Andrey Smirnov
a6b010a8b4
chore: update Go to 1.19, Linux to 5.15.58
See https://go.dev/doc/go1.19

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-03 17:03:58 +04:00
Utku Ozdemir
bb4abc0961
fix: regenerate kubelet certs when hostname changes
Clear the kubelet certificates and kubeconfig when the hostname changes so that, on the next start, kubelet goes through the bootstrap process, new certificates are generated, and the node joins the cluster under the new name.

Fixes siderolabs/talos#5834.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-07-21 01:54:15 +02:00
Utku Ozdemir
87ea1d9611
fix: update kubelet kubeconfig when cluster control plane endpoint changes
Overwrite cluster's server URL in the kubeconfig file used by kubelet when the cluster control plane endpoint is changed in machineconfig, so that kubelet doesn't lose connectivity to kube-apiserver.

Closes siderolabs/talos#4470.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-07-16 14:19:25 +02:00
Andrey Smirnov
c1aed62405
fix: wait for /var to be mounted in kubelet service controller
This is a cosmetic fix: when `KubeletServiceController` tries to write
files to `/etc/kubernetes` before `/var` is mounted, it fails.
The controller is restarted, but every restart involves a backoff that
grows longer each time.

On the first boot, or when EPHEMERAL is encrypted, mounting might take
considerable time (seconds), so in that time the controller might enter
a backoff long enough to delay the whole boot sequence: it won't finish
before `kubelet` is started.

By waiting for `EPHEMERAL` to be mounted before starting the controller
we eliminate long backoff cycles.

Also fix a bug where the `StartAllServices` task might start the kubelet
early (before `KubeletServiceController` is actually going to start it).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-02 01:00:47 +04:00
Dmitriy Matrenichev
6351928611
chore: redo pointer with github.com/siderolabs/go-pointer module
With the advent of generics, redo pointer functionality and remove github.com/AlekSi/pointer dependency.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-05-02 02:17:13 +04:00
Andrey Smirnov
b2bf3117ff
feat: implement extension services
Fixes #4694

User services run alongside Talos system services.
The root filesystem of every user service container should already be
present in the Talos root filesystem.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-02-22 23:11:20 +03:00
Andrey Smirnov
2e735714d9
fix: derive machine-id from node identity
Fixes #4759

This uses existing features: Talos always generates a 32-byte random node
identity, and we use the first 16 bytes of it to generate a `machine-id`
in a compliant format and mount that into the `kubelet` container.
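A minimal sketch of such a derivation, assuming the identity is available as raw bytes and that "compliant format" means the usual `machine-id` layout of 32 lowercase hexadecimal characters. The function name `machineIDFromIdentity` is hypothetical, and the real derivation in Talos may differ (for example, it may hash the identity first).

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// machineIDFromIdentity (hypothetical helper) hex-encodes the first
// 16 bytes of the node identity, yielding 32 lowercase hex characters.
func machineIDFromIdentity(identity []byte) (string, error) {
	if len(identity) < 16 {
		return "", errors.New("node identity is shorter than 16 bytes")
	}

	return hex.EncodeToString(identity[:16]), nil
}

func main() {
	// Stand-in for the 32-byte random node identity.
	identity := make([]byte, 32)
	if _, err := rand.Read(identity); err != nil {
		panic(err)
	}

	id, err := machineIDFromIdentity(identity)
	if err != nil {
		panic(err)
	}

	fmt.Println(id, len(id))
}
```

Because the value is a pure function of the identity, it stays stable across reboots for as long as the node identity does.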

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-10 15:59:07 +03:00
Andrey Smirnov
d2fd7c2170
feat: make kubelet service apply changes immediately
The gist is that the `kubelet` service code now only manages the
container lifecycle, while `kubelet` configuration is managed in the
controllers and resources.

New resources:

* `secrets.Kubelet` contains Kubelet PKI derived directly from the
machine configuration
* `k8s.KubeletConfig` contains Kubelet non-secret config derived
directly from the machine configuration
* `k8s.NodeIPConfig` contains configuration for picking the Node IP for
the kubelet (from machine configuration)
* `k8s.NodeIP` contains actual Node IPs picked from the node addresses
based on `NodeIPConfig`
* `k8s.KubeletSpec` contains final `kubelet` container configuration,
including merged arguments, KubeletConfig, etc. It is derived from
`KubeletConfig`, `Nodename` and `NodeIP`.

The final controller, `KubeletServiceController`, writes configuration
and PKI to disk and manages start/restart of the `kubelet` service,
which is a pure wrapper around the container lifecycle.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-03 23:02:49 +03:00