There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
It got broken with the changes to the kubelet now sourcing static pods
from a HTTP internal server.
As we don't want it to be broken, and to make health checks better, add
a new check to make sure kubelet reports control plane static pods as
running. This coupled with API server check should make it more
thorough.
Also add logging when static pod definitions are updated (they were
previously there for file-based implementation). These logs are very
helpful for troubleshooting.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
When we query kubelet API to populate the StaticPodStatuses, instead of checking for ownerReferences to be empty, we check the annotation "kubernetes.io/config.source" value so we avoid including standalone pods (that are regular pods but not part of a replicaset).
We also optimize their fetching by avoiding to unmarshal the fields we do not need.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This changes the way Kubernetes nodename is computed: it is set by the
controller based on the hostname and machine configuration, and pulled
from the resource when needed.
Kubelet client now also uses nodename to fix the certifcate mismatch
issue on AWS.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Kubelet might be running either self-signed cert (by default) or API
server issued cert (signed by the CA). User might switch between the two
methods, so instead of guessing based on filesystem contents, accept
both Kubernetes CA and self-signed cert (if available).
Spotted by @aceat64
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
If kubelet is configured to issue certificates from the control plane,
`/var/lib/kubelet/pki/kubelet.crt` file is never created, and cluster CA
canv be used to verify the TLS connection.
Use k8s `RESTClient` instead of a custom client, this also results in
much more descriptive error messages if API call fails.
Fix a problem in apid on worker nodes with issued serving certificates:
`/var/lib/kubelet/pki` doesn't exist by the time `apid` starts.
First write static pods, then try to build kubelet client: for issued
serving kubelet certificates, control plane should be up first.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Control plane components are running as static pods managed by the
kubelets.
Whole subsystem is managed via resources/controllers from os-runtime.
Many supporting changes/refactoring to enable new code paths.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>