The problem is that there's no official way to close the Kubernetes client's
underlying TCP/HTTP connections. So each time Talos initializes a
connection to the control plane endpoint, a new client is built, but this
client is never closed, so the connection stays active on the load
balancers, at the API server level, etc. It also consumes resources in
Talos itself.
We add a way to close the underlying connections by using a helper from the
Kubernetes client libraries to force-close all TCP connections, which
should shut down all HTTP/2 connections as well (see the sketch below).
An alternative approach might be to cache a client for some time, but many
of the clients are created with temporary PKI, so even a cached client
still needs to be closed once it gets stale, and it's not clear how to
recreate a client in case the existing one is broken for one reason or
another (and we need to force a re-connection).
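A minimal sketch of the idea, assuming the client keeps its `*rest.Config` around; the helper name and the way the transport is obtained here are illustrative, not the actual Talos code:
```go
import (
	"k8s.io/client-go/rest"
)

// closeIdleConnections force-closes the TCP connections kept alive by the
// client's transport, which also shuts down HTTP/2 connections.
func closeIdleConnections(config *rest.Config) error {
	// rest.TransportFor returns the round tripper client-go would use for
	// this config (client-go caches transports per config).
	rt, err := rest.TransportFor(config)
	if err != nil {
		return err
	}

	// The concrete round tripper is usually an *http.Transport (possibly
	// wrapped); anything exposing CloseIdleConnections can be closed this way.
	if closer, ok := rt.(interface{ CloseIdleConnections() }); ok {
		closer.CloseIdleConnections()
	}

	return nil
}
```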
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
When forfeiting etcd leadership, the node might still report leadership
status while no longer being the leader by the time the actual API call is
made. We should ignore such an error, as the node is not a leader.
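A sketch of the error handling, assuming etcd's clientv3 API and the error constants from `go.etcd.io/etcd/api/v3/v3rpc/rpctypes`; the surrounding helper is illustrative:
```go
import (
	"context"
	"errors"

	"go.etcd.io/etcd/api/v3/v3rpc/rpctypes"
	clientv3 "go.etcd.io/etcd/client/v3"
)

// forfeitLeadership moves etcd leadership to another member, treating
// "not a leader" as success, since there is nothing left to forfeit.
func forfeitLeadership(ctx context.Context, client *clientv3.Client, transfereeID uint64) error {
	_, err := client.MoveLeader(ctx, transfereeID)
	if err != nil && errors.Is(err, rpctypes.ErrNotLeader) {
		// the node is not (or no longer) the leader, nothing to forfeit
		return nil
	}

	return err
}
```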
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This should fix an error like:
```
failed to create etcd client: error getting kubernetes endpoints: Unauthorized
```
The problem is that the generated cert was used immediately, so even a
slight time sync issue across nodes might render the cert not (yet)
usable. The cert is generated on one node, but might be used on any other
node (as the request goes via the LB).
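A sketch of the usual mitigation with the standard library: backdate `NotBefore` so slight clock skew between the issuing and consuming nodes doesn't make the cert unusable. The skew value and template fields here are illustrative:
```go
import (
	"crypto/x509"
	"crypto/x509/pkix"
	"math/big"
	"time"
)

func newClientCertTemplate(cn string) *x509.Certificate {
	return &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		// backdate NotBefore to tolerate small time sync differences
		// between the node generating the cert and the node using it
		NotBefore:   time.Now().Add(-time.Minute),
		NotAfter:    time.Now().Add(10 * time.Minute),
		KeyUsage:    x509.KeyUsageDigitalSignature,
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	}
}
```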
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR bumps pkgs to v0.7.0-alpha.0, so that we gain a fix for
hotplugging of NVMe drives.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
This removes the `retrying error` messages while waiting for the API server
pod state to reflect changes from the updated static pod definition.
More lines are logged to show progress.
`kube-proxy` is skipped if not found, as we allow it to be disabled (see the sketch after the output below).
```
$ talosctl upgrade-k8s -n 172.20.0.2 --from 1.21.0 --to 1.21.2
discovered master nodes ["172.20.0.2" "172.20.0.3" "172.20.0.4"]
updating "kube-apiserver" to version "1.21.2"
> "172.20.0.2": starting update
> "172.20.0.2": machine configuration patched
> "172.20.0.2": waiting for API server state pod update
< "172.20.0.2": successfully updated
> "172.20.0.3": starting update
> "172.20.0.3": machine configuration patched
> "172.20.0.3": waiting for API server state pod update
< "172.20.0.3": successfully updated
> "172.20.0.4": starting update
> "172.20.0.4": machine configuration patched
> "172.20.0.4": waiting for API server state pod update
< "172.20.0.4": successfully updated
updating "kube-controller-manager" to version "1.21.2"
> "172.20.0.2": starting update
> "172.20.0.2": machine configuration patched
> "172.20.0.2": waiting for API server state pod update
< "172.20.0.2": successfully updated
> "172.20.0.3": starting update
> "172.20.0.3": machine configuration patched
> "172.20.0.3": waiting for API server state pod update
< "172.20.0.3": successfully updated
> "172.20.0.4": starting update
> "172.20.0.4": machine configuration patched
> "172.20.0.4": waiting for API server state pod update
< "172.20.0.4": successfully updated
updating "kube-scheduler" to version "1.21.2"
> "172.20.0.2": starting update
> "172.20.0.2": machine configuration patched
> "172.20.0.2": waiting for API server state pod update
< "172.20.0.2": successfully updated
> "172.20.0.3": starting update
> "172.20.0.3": machine configuration patched
> "172.20.0.3": waiting for API server state pod update
< "172.20.0.3": successfully updated
> "172.20.0.4": starting update
> "172.20.0.4": machine configuration patched
> "172.20.0.4": waiting for API server state pod update
< "172.20.0.4": successfully updated
updating daemonset "kube-proxy" to version "1.21.2"
kube-proxy skipped as DaemonSet was not found
```
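A sketch of the skip logic with client-go; the surrounding helper is illustrative, not the actual upgrade code:
```go
import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func updateKubeProxy(ctx context.Context, clientset *kubernetes.Clientset, version string) error {
	ds, err := clientset.AppsV1().DaemonSets("kube-system").Get(ctx, "kube-proxy", metav1.GetOptions{})
	if apierrors.IsNotFound(err) {
		// kube-proxy is allowed to be disabled, so a missing DaemonSet is not an error
		fmt.Println("kube-proxy skipped as DaemonSet was not found")

		return nil
	}

	if err != nil {
		return err
	}

	// ... patch ds.Spec.Template with the new image version (elided) ...
	_ = ds
	_ = version

	return nil
}
```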
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes #3861
What this change effectively does is turn an immediate reconcile request
into an error return, so that the controller is restarted with a backoff
(sketched below).
More details:
* the root cause of the update/teardown conflict is that a finalizer is
still pending on the tearing-down resource
* the finalizer might not be removed immediately, e.g. if the controller
which put the finalizer is itself in a crash loop
* if the merge controller queues a reconcile immediately, it restarts
itself, but the finalizer is still there, so it once again goes into the
reconcile loop, and that goes on forever until the finalizer is removed;
instead, if the controller fails, it is restarted with exponential
backoff, lowering the load on the system
The change is validated with unit tests reproducing the conflict.
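A sketch of the pattern, assuming a COSI-style controller run loop; the `MergeController` type, the `reconcile` helper, and the exact `Run` signature are approximations, not the actual Talos code:
```go
import (
	"context"
	"fmt"

	"github.com/cosi-project/runtime/pkg/controller"
	"go.uber.org/zap"
)

type MergeController struct{}

// reconcile merges source resources into the final representation (elided).
func (ctrl *MergeController) reconcile(ctx context.Context, r controller.Runtime) error {
	return nil
}

func (ctrl *MergeController) Run(ctx context.Context, r controller.Runtime, logger *zap.Logger) error {
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-r.EventCh():
		}

		if err := ctrl.reconcile(ctx, r); err != nil {
			// previously the controller queued an immediate reconcile and kept
			// spinning while the finalizer was still pending; returning the
			// error instead makes the runtime restart the controller with
			// exponential backoff, lowering the load on the system
			return fmt.Errorf("error reconciling resources: %w", err)
		}
	}
}
```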
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
With the recent changes, the bootstrap API might wait for the time to be in
sync (as apid is launched before time is synced). We set the timeout to
500ms for the bootstrap API call, so there's a chance that a call might
time out, and we should ignore it.
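A sketch of the client-side handling; the `bootstrapNode` callback stands in for the actual bootstrap API call and is purely illustrative:
```go
import (
	"context"
	"errors"
	"time"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

func tryBootstrap(ctx context.Context, bootstrapNode func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, 500*time.Millisecond)
	defer cancel()

	err := bootstrapNode(ctx)
	if errors.Is(err, context.DeadlineExceeded) || status.Code(err) == codes.DeadlineExceeded {
		// apid may still be waiting for time sync, so a timeout here is not fatal
		return nil
	}

	return err
}
```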
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This fixes an endless block in the RemoteGenerator.Close method by
rewriting the RemoteGenerator using the retry package.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This commit also introduces a hidden `--json` flag for the `talosctl version`
command that is not supported and should be reworked in #907.
Refs #3852.
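A sketch of how such a flag can be hidden with cobra; the helper and variable names are illustrative:
```go
import "github.com/spf13/cobra"

func addJSONFlag(versionCmd *cobra.Command, jsonOutput *bool) {
	versionCmd.Flags().BoolVar(jsonOutput, "json", false, "output version information in JSON format")

	// the flag works, but stays hidden from --help until it is properly reworked
	versionCmd.Flags().MarkHidden("json") //nolint:errcheck
}
```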
Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
Basically, all delay options are interlocked with `miimon`: if `miimon`
is zero, all delays are set to zero, and the kernel complains even if a zero
delay attribute is sent while `miimon` is zero.
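A sketch of the resulting normalization, using a hypothetical options map rather than the actual netlink attributes; the point is that delay attributes are only sent when `miimon` is non-zero:
```go
// buildBondOptions is a hypothetical helper showing the interlock:
// delay attributes are dropped entirely when miimon is disabled.
func buildBondOptions(miimon, upDelay, downDelay uint32) map[string]uint32 {
	options := map[string]uint32{}

	if miimon == 0 {
		// the kernel rejects delay attributes (even explicit zeroes)
		// while miimon is zero, so don't send them at all
		return options
	}

	options["miimon"] = miimon
	options["updelay"] = upDelay
	options["downdelay"] = downDelay

	return options
}
```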
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The sequence of events to reproduce the problem:
* some resource is merged into its final representation with ID `x`
* the underlying source resource gets destroyed
* the merge controller marks the final resource `x` for teardown and waits
for the finalizers to be empty
* another source resource appears which gets merged into the same final `x`
* as `x` is in the teardown phase, the spec controller ignores it
* the merge controller doesn't see the problem either, as the spec of `x` is
correct, but the phase is wrong (which the merge controller ignores)
This pulls in a COSI fix to return an error if a resource in the teardown
phase is modified. This way the merge controller knows that the resource `x`
is in the teardown phase, so it should first be fully torn down, and
then the new representation should be re-created as a new resource with the
same ID `x`.
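A sketch of the check the merge controller can now act on, assuming COSI resource metadata; the handling itself is simplified to an error return:
```go
import (
	"fmt"

	"github.com/cosi-project/runtime/pkg/resource"
)

func checkPhase(final resource.Resource) error {
	if final.Metadata().Phase() == resource.PhaseTearingDown {
		// `x` must be fully torn down (finalizers removed, resource destroyed)
		// before the new representation is re-created under the same ID;
		// modifying it in place would be silently ignored by the spec controller
		return fmt.Errorf("resource %s is being torn down", final.Metadata().ID())
	}

	return nil
}
```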
Regression unit tests are included (they don't always reproduce the
sequence of events reliably, but they do with 10% probability).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR makes sure we pin to a known CAPI version because, with the new
v0.4.x released, we'll fail until we support the v1alpha4 APIs.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
* `talosctl config new` now sets endpoints in the generated config.
* Avoid duplication of roles in metadata.
* Remove method name prefix handling. All methods should be set explicitly.
* Add tests.
Closes #3421.
Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
This makes sure that apid can't access any resources other than the ones it
actually needs. This improves security in case of a container breach.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This change is for Theila, which is going to use a gRPC proxy to forward
requests from the TS frontend right to the node's apid.
The gRPC proxy operates on top of `grpc.ClientConn` objects, so getting
this connection from the clients which are already being created is the
easiest path.
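A sketch of how the exposed connection can be consumed by a gRPC proxy director; the accessor and director wiring here are illustrative, not the actual Theila code:
```go
import (
	"context"

	"google.golang.org/grpc"
)

// director picks the backend connection for each proxied call; here it simply
// reuses the connection already established by the Talos client.
func director(talosConn *grpc.ClientConn) func(ctx context.Context, fullMethodName string) (context.Context, *grpc.ClientConn, error) {
	return func(ctx context.Context, fullMethodName string) (context.Context, *grpc.ClientConn, error) {
		return ctx, talosConn, nil
	}
}
```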
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Bump dependencies, clean up go.mod files, and update for netaddr changes
(mostly around `netaddr.IPPrefix` now being an opaque struct with private fields).
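A sketch of the kind of change this requires, assuming the `inet.af/netaddr` API of that time: struct literals are replaced by constructors and accessors.
```go
import "inet.af/netaddr"

func example() {
	// before: netaddr.IPPrefix{IP: ip, Bits: 24} (fields are no longer exported)
	prefix := netaddr.IPPrefixFrom(netaddr.MustParseIP("172.20.0.1"), 24)

	// accessors replace direct field reads
	ip, bits := prefix.IP(), prefix.Bits()

	_ = ip
	_ = bits
}
```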
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This will allow keeping track of when the resource was created and
updated.
The update timestamp is tied to the version bump.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
This PR updates our CI so that when we release Talos, a JSON file
containing our cloud images for AWS will be published as a release
asset.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
This PR can be split into two parts:
* controllers
* apid binding into COSI world
Controllers
-----------
* `k8s.EndpointController` provides control plane endpoints on worker
nodes (it isn't required on control plane nodes for now)
* `secrets.RootController` now provides OS top-level secrets (CA cert)
and secret configuration
* `secrets.APIController` generates API secrets (certificates) in a
slightly different way for workers and control plane nodes: control plane
nodes generate them directly, while workers reach out to `trustd` on control
plane nodes via the `k8s.Endpoint` resource
apid Binding
------------
The `secrets.API` resource provides binding to protobuf by converting
itself back and forth to the protobuf spec.
apid no longer receives the machine configuration; instead it receives a
gRPC-backed socket to access the Resource API. apid watches the `secrets.API`
resource, fetches the certs and CA from it, and uses them in its TLS
configuration.
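A sketch of the apid side, assuming a watcher (the `certProvider` interface here is illustrative) that keeps the latest certificate fetched from the `secrets.API` resource:
```go
import "crypto/tls"

func serverTLSConfig(certProvider interface {
	Current() (*tls.Certificate, error)
}) *tls.Config {
	return &tls.Config{
		// always serve the most recently watched certificate, so a cert
		// rotation in the secrets.API resource doesn't require restarting apid
		GetCertificate: func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
			return certProvider.Current()
		},
	}
}
```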
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This isn't strictly required, but it should be backwards compatible with
Talos 0.10 (networkd).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
We need to be able to run an install with `docker run`. This checks if
we are running from Docker and skips overlay mount checks if we are, as
Docker creates a handful of overlay mounts by default that we can't work
around (not easily, at least).
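A sketch of one common way to detect running under Docker; the actual check in the installer may differ:
```go
import "os"

// runningInDocker reports whether the process appears to run inside a Docker
// container, based on the conventional /.dockerenv marker file.
func runningInDocker() bool {
	_, err := os.Stat("/.dockerenv")

	return err == nil
}
```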
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>