This breaks some pods which specifically drop everything but gain
capabilities back via file capabilities (e.g. `nginx-ingress`).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4243
The idea is to make sure kubelet picks node IP based on filtering by
CIDRs of the node's addresses. The flow is simple - every address is
filtered by subnet and picked if it matches the subnet.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
If Talos is built with `sidero.debug` build tag (`make WITH_DEBUG=1`),
the machine configuration is allowed to use insecure HTTP for the discovery service.
Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
Also ignore expected errors for other platforms to keep controller from
failing over and over again.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4250
Each KubeSpan peer sees each other KubeSpan peer endpoint as it got
connected. If the peer is behind NAT, the discovered endpoint is
different from the endpoints node knows about itself (as it punched a
hole in NAT). This discovered endpoint is pushed to the discovery
service so that every other peer now can use that punched hole to talk
to the peer.
If the endpoint observed is actually in the list of the endpoints
reported by the peer itself, discovery service will take care of
deduplicating them and suppressing updates.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This means that ExternalIPs (as presented by the platform) will be
published as `AddressStatus` resource, and transitively as
`NodeAddresses` (which includes cert generation) and as KubeSpan
endpoints (for KubeSpan connectivity in the cloud).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This fixes a bug when after an error generating certificates controller
gets into a state of not being able to read its expected inputs
(NetworkStatus specifically).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Use `argsbuilder` same way as it's used in services.
Rewrite `kubeProxy` generation code to override default args.
As a consequence of this change now flags do not have determined order
as they all come from a single merged map.
Introduced merge policy in the `ArgsBuilder` to deny overrides for some
arguments and do additive merge of others.
Fixes: https://github.com/talos-systems/talos/issues/4238
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Looks like `zaptest` package when used from the goroutine (like in gRPC
server) results in a potential data race on test tear down.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This provides integration layer with discovery service to provide
cluster discovery (and transitively KubeSpan peer discovery).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This distributes API CA (just the certificate, not the key) to the
worker nodes on config generation, and if the CA cert is present on the
worker node, it verifies TLS connection to the trustd with the CA
certificate.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos supports automatic virtual IP for the control plane with pure
layer 2 connectivity. Hetzner Cloud API supports assigning Floating IPs
to the nodes, this PR combines existing virtual IP functionality with calls
to HCloud API to move the IP address on HCloud side to the leader node.
The only thing which should be supplied in the machine configuration is
the Hetzner Cloud API token, every other setting is automatically
discovered by Talos.
Talos supports two types of floating IPs:
* external Floating IP for external network
* server alias IP for local networks
The controlplane can have only one alias on the local network interface.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4094
Deprecate old networkd APIs, `talosctl interfaces` and `talosctl routes`
now suggest different commands to be used to achieve same task.
TUI installer was updated to stop using Interfaces API.
Those APIs will be completely removed in 0.14.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This basically provides `talosctl get --insecure` in maintenance mode.
Only non-sensitive resources are available (equivalent to having
`os:reader` role in the Talos client certificate).
Changes:
* refactored insecure/maintenance client setup in talosctl
* `LinkStatus` is no longer sensitive as it shows only Wireguard public
key, `LinkSpec` still contains private key for obvious reasons
* maintenance mode injects `os:reader` role implicitly
The motivation behind this PR is to deprecate networkd-era interfaces &
routes APIs which are being used in TUI installer, and we need a
replacement.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4232
The result:
```
talosctl -n 172.20.0.2 get members
NODE NAMESPACE TYPE ID VERSION HOSTNAME MACHINE TYPE OS ADDRESSES
172.20.0.2 cluster Member talos-default-master-1 2 talos-default-master-1 controlplane Talos (v0.13.0-alpha.0-13-gfdd80a12-dirty) ["172.20.0.2","fdd1:f54:2697:3902:44f8:92ff:fe2e:1aea"]
172.20.0.2 cluster Member talos-default-worker-1 1 talos-default-worker-1 worker Talos (v0.13.0-alpha.0-13-gfdd80a12-dirty) ["172.20.0.3","fdd1:f54:2697:3902:d4ba:55ff:fe8a:f551"]
172.20.0.2 cluster Member talos-default-worker-2 1 talos-default-worker-2 worker Talos (v0.13.0-alpha.0-13-gfdd80a12-dirty) ["172.20.0.4","fdd1:f54:2697:3902:e00d:f4ff:fecf:51c8"]
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
The problem is that the kubelet kubeconfig gets created early, but the
actual client key and cert files are not written, so controllers spam
with scary errors that the config is not valid. This PR removes those
scary messages as we wait for the kubeconfig to be usable.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This fixes the bug with host networking pods not being able to reach out
to the Kubernetes services.
This also moves any node-to-node networking over to KubeSpan link as
well.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This concludes basic KubeSpan implementation.
Most of the code is from #3577 with some fixes and refactoring.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Seán C McCord <ulexus@gmail.com>
Co-authored-by: Seán C McCord <ulexus@gmail.com>
Sometimes we cannot redefine user-data in the openstack.
we need to catch this and allow user to apply configuration
through api.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
When `etcd` is stopped, control plane can't function anymore as API
server connects to the local etcd instance.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
KubeSpan manager uses list of KubeSpan peers prepared from the discovery
and local KubeSpan identity to set up and update configuration of the
Wireguard interface.
As new peers are getting added or deleted, manager takes care of
updating the Wireguard config.
Manager also keeps track of all peers and their state coming from the
Wireguard link status: whether the connection is up or not, some stats,
last actually used endpoint, etc.
Manager cycles through the available peer endpoints until it finds the
one which works.
Manager exposes peer status as `PeerStatus` resources.
Example:
```
$ talosctl -n 172.20.0.2 get kubespanpeerstatuses
NODE NAMESPACE TYPE ID VERSION LABEL ENDPOINT STATE RX TX
172.20.0.2 kubespan KubeSpanPeerStatus GpO3gs5n09WpoiVANbzRL5nwrkRi+9Q19qoeC8RTkQ4= 30 talos-default-worker-2 172.20.0.6:51820 up 640 1920
172.20.0.2 kubespan KubeSpanPeerStatus j4CRlKByMcTWOBS2ifZcPzcUr3lXdBOc/I4AxGmhXxI= 30 talos-default-worker-1 172.20.0.5:51820 up 672 1888
172.20.0.2 kubespan KubeSpanPeerStatus o5EPScFrD895A5EpVyKU8hFR+vi25D0CJMYsoaXN3Qk= 28 talos-default-master-3 172.20.0.4:51820 up 640 1920
172.20.0.2 kubespan KubeSpanPeerStatus rBp5wyHdxqZkq5CWher2DcPcGgwHrFOwB6fP/ReFRlE= 16 talos-default-master-2 172.20.0.3:51820 up 432 2088
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Seán C McCord <ulexus@gmail.com>
Co-authored-by: Seán C McCord <ulexus@gmail.com>
Looks like this was an error made long ago, fixed similarly for etcd
in b52b20666.
Signed-off-by: Lennard Klein <lennard.klein@eu.equinix.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
* calculate covering IPPrefixes for the KubeSpan peer `AllowedIPs`,
check for overlap
* don't use KubeSpan IP as potential node endpoint (inception!)
* allow Wireguard config to be applied which doesn't change peer
endpoint
* support for pre-shared Wireguard peer keys
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Seán C McCord <ulexus@gmail.com>
Co-authored-by: Seán C McCord <ulexus@gmail.com>
* cloud-init for scaleway
* set ipv6 to the interface
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
With the last changes, `kube-apiserver` certificates are generated based
on the assigned `NodeAdresses`, machine configuration, etc. Whenver the
certificate is regenerated, `kube-apiserver` is reloaded to pick up the
new cert.
With Virtual IP enabled, Virtual IP address is included into the
certificate from the beginning as it is specified in the machine
configuration, but as virtual IP moves between the nodes this causes
`NodeAddresses` update, which triggers the controller, generates new
certs and reloads `kube-apiserver` at bad time (right after VIP got
moved). Even though the cert generated is identical to the previous one,
the API server reload makes it unavailable for 30-90 seconds.
This change extracts `CertSANs` as a separate resource so that its
updates are suppressed if the CertSANs sources change, but the final
list stays the same, and in turn prevents final certificate from being
updated.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Controller watches cluster Affiliates and generates KubeSpanPeerSpecs
from those which are not local node Affiliates and have KubeSpan
configuration attached.
Example:
```
$ talosctl -n 172.20.0.2 get kubespanpeerspecs
NODE NAMESPACE TYPE ID VERSION LABEL ENDPOINTS
172.20.0.2 kubespan KubeSpanPeerSpec 27E8I+ekrqT21cq2iW6+fDe+H7WBw6q9J7vqLCeswiM= 1 talos-default-worker-1 ["172.20.0.3:51820"]
```
```
$ talosctl -n 172.20.0.3 get kubespanpeerspecs -o yaml
node: 172.20.0.3
metadata:
namespace: kubespan
type: KubeSpanPeerSpecs.kubespan.talos.dev
id: mB6WlFOR66Jx5rtPMIpxJ3s4XHyer9NCzqWPP7idGRo=
version: 1
owner: kubespan.PeerSpecController
phase: running
created: 2021-09-07T19:26:35Z
updated: 2021-09-07T19:26:35Z
spec:
address: fdc8:8aee:4e2d:1202:f073:9cff:fe6c:4d67
additionalAddresses:
- 10.244.1.0/32
endpoints:
- 172.20.0.2:51820
label: talos-default-master-1
```
KubeSpan peers will be used to drive configuration of the KubeSpan
networking components.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This changes machinery API for the configuration to make it more
obvious that the returned value is a list of CIDRs and adjusts usage
accordingly.
For the K8s Address Filter controller, fix the actual bug by parsing
CIDRs as a list of values.
Fixes#4192
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR makes sure that some capabilities (SYS_BOOT and SYS_MODULES) and
never be gained by any process running on Talos except for `machined`
itself.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
* cloud-init for hcloud
* set ipv6 to the interface
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This implements pushing to and pulling from Kubernetes cluster discovery
registry which is simply using extra Talos annotations on the Node
resources.
Note: cluster discovery is still disabled by default.
This means that each Talos node is going to push data from its own local
`Affiliate` structure to the `Node` resource, and also watches the other
`Node`s to scrape data to build `Affiliate`s from each other cluster
member.
Further down the pipeline, `Affiliate` is converted to a cluster
`Member` which is an easy way to see the cluster membership.
In its current form, `talosctl get members` is mostly equivalent to
`kubectl get nodes`, but as we add more registries, it will become more
powerful.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#4139
This builds the local (for the node) `Affiliate` structure which
describes node for the cluster discovery. Dependending on the
configuration, KubeSpan information might be included as well.
`NodeAddresses` were updated to hold CIDRs instead of simple IPs.
The `Affiliate` will be pushed to the registries, while `Affiliate`s for
other nodes will be fetched back from the registries.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This is a PR on a path towards removing `ApplyDynamicConfig`.
This fixes Kubernetes API server certificate generation to use dynamic
data to generate cert with proper SANs for IPs of the node.
As part of that refactored a bit apid certificate generation (without
any changes).
Added two unit-tests for apid and Kubernetes certificate generation.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This adds the ability to specify the subnet that `etcd`'s listen address
should be in. This allows users to ensure that `etcd` is on a private
subnet.
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
This implements abstract `NodeAddressFilter` which can be attached to
build additional `NodeAddress` resources filtering out some entries from
the complete list.
Kubernetes creates two filters to get all node IPs without Kubernetes
CIDRs and vice versa, only Kubernetes CIDRs.
API certificate generation now considers all addresses minus K8s CIDRs.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>