docs: consolidate and expand on discovery service

This PR expands the explanation of the discover service.

Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
This commit is contained in:
Steve Francis 2022-09-30 19:31:28 -07:00 committed by Spencer Smith
parent 139c62d762
commit 54a687fb8e
No known key found for this signature in database
GPG Key ID: 036C4E711644C223
3 changed files with 47 additions and 31 deletions

View File

@ -1,21 +0,0 @@
---
title: "Discovery"
weight: 90
description: "Discover how Sidero Labs implements Talos node discovery."
---
We maintain a public discovery service whereby members of your cluster can use a shared key that is globally unique to coordinate the most basic connection information (i.e. the set of possible "endpoints", or IP:port pairs).
We call this data "affiliate data."
> Note: If KubeSpan is enabled the data has the addition of the WireGuard public key.
Before sending data to the discovery service, Talos will encrypt the affiliate data with AES-GCM encryption and separately encrypt endpoints with AES in ECB mode so that endpoints coming from different sources can be deduplicated server-side.
Each node submits its data encrypted plus it submits the endpoints it sees from other peers to the discovery service.
The discovery service aggregates the data, deduplicates the endpoints, and sends updates to each connected peer.
Each peer receives information back about other affiliates from the discovery service, decrypts it and uses it to drive KubeSpan and cluster discovery.
The discovery service has no persistence.
Data is stored in memory only with a TTL set by the clients (i.e. Talos).
The cluster ID is used as a key to select the affiliates (so that different clusters see different affiliates).
To summarize, the discovery service knows the client version, cluster ID, the number of affiliates, some encrypted data for each affiliate, and a list of encrypted endpoints.

View File

@ -29,7 +29,7 @@ For this discussion, we will point out two of these tiers:
- an external service
- a Kubernetes-based system
See [discovery service]({{< relref "discovery" >}}) to learn more about the external service.
See [discovery service]({{< relref "../talos-guides/discovery" >}}) to learn more about the external service.
The Kubernetes-based system utilises annotations on Kubernetes Nodes which describe each node's public key and local addresses.

View File

@ -1,10 +1,25 @@
---
title: "Discovery"
description: "How to use Talos Linux cluster discovery"
title: "Discovery Service"
description: "Talos Linux Node discovery services"
aliases:
- ../../guides/discovery
---
Talos Linux includes node-discovery capabilities that depend on a discovery registry.
This allows you to see the members of your cluster, and the associated IP addresses of the nodes.
```bash
talosctl get members
NODE NAMESPACE TYPE ID VERSION HOSTNAME MACHINE TYPE OS ADDRESSES
10.5.0.2 cluster Member talos-default-controlplane-1 1 talos-default-controlplane-1 controlplane Talos (v1.2.3) ["10.5.0.2"]
10.5.0.2 cluster Member talos-default-worker-1 1 talos-default-worker-1 worker Talos (v1.2.3) ["10.5.0.3"]
```
There are currently two supported discovery services: a Kubernetes registry (which stores data in the cluster's etcd service) and an external registry service.
Sidero Labs runs a public external registry service, which is enabled by default.
The Kubernetes registry service is disabled by default.
The advantage of the external registry service is that it is not dependent on etcd, and thus can inform you of cluster membership even when Kubernetes is down.
## Video Walkthrough
To see a live demo of Cluster Discovery, see the video below:
@ -13,9 +28,8 @@ To see a live demo of Cluster Discovery, see the video below:
## Registries
Peers are aggregated from a number of optional registries.
By default, Talos will use the `service` registry, while `kubernetes` registry is not enabled.
Either one can be disabled.
Peers are aggregated from enabled registries.
By default, Talos will use the `service` registry, while the `kubernetes` registry is disabled.
To disable a registry, set `disabled` to `true` (this option is the same for all registries):
For example, to disable the `service` registry:
@ -30,9 +44,9 @@ cluster:
Disabling all registries effectively disables member discovery altogether.
> Talos supports the `kubernetes` and `service` registries.
> An enabled discovery service is required for [KubeSpan]({{< relref "../kubernetes-guides/network/kubespan/" >}}) to function correctly.
`Kubernetes` registry uses Kubernetes `Node` resource data and additional Talos annotations:
The `Kubernetes` registry uses Kubernetes `Node` resource data and additional Talos annotations:
```sh
$ kubectl describe node <nodename>
@ -42,7 +56,30 @@ Annotations: cluster.talos.dev/node-id: Utoh3O0ZneV0kT2IUBrh7TgdouRcUW2yz
...
```
`Service` registry uses external [Discovery Service]({{< relref "../../learn-more/discovery/" >}}) to exchange encrypted information about cluster members.
The `Service` registry by default uses a public external Discovery Service to exchange encrypted information about cluster members.
## Discovery Service
Sidero Labs maintains a public discovery service at `https://discovery.talos.dev/` whereby cluster members use a shared key that is globally unique to coordinate basic connection information (i.e. the set of possible "endpoints", or IP:port pairs).
We call this data "affiliate data."
> Note: If KubeSpan is enabled the data has the addition of the WireGuard public key.
Data sent to the discovery service is encrypted with AES-GCM encryption and endpoint data is separately encrypted with AES in ECB mode so that endpoints coming from different sources can be deduplicated server-side.
Each node submits its own data, plus the endpoints it sees from other peers, to the discovery service.
The discovery service aggregates the data, deduplicates the endpoints, and sends updates to each connected peer.
Each peer receives information back from the discovery service, decrypts it and uses it to drive KubeSpan and cluster discovery.
Data is stored in memory only.
The cluster ID is used as a key to select the affiliates (so that different clusters see different affiliates).
To summarize, the discovery service knows the client version, cluster ID, the number of affiliates, some encrypted data for each affiliate, and a list of encrypted endpoints.
The discovery service doesnt see actual node information it only stores and updates encrypted blobs.
Discovery data is encrypted/decrypted by the clients the cluster members.
The discovery service does not have the encryption key.
The discovery service may, with a commercial license, be operated by your organization and can be [downloaded here](https://github.com/siderolabs/discovery-service).
In order for nodes to communicate to the discovery service, they must be able to connect to it on TCP port 443.
## Resource Definitions
@ -65,7 +102,7 @@ spec:
Node identity is used as the unique `Affiliate` identifier.
Node identity resource is preserved in the [STATE]({{< relref "../../learn-more/architecture/#file-system-partitions" >}}) partition in `node-identity.yaml` file.
Node identity resource is preserved in the [STATE]({{< relref "../learn-more/architecture/#file-system-partitions" >}}) partition in `node-identity.yaml` file.
Node identity is preserved across reboots and upgrades, but it is regenerated if the node is reset (wiped).
#### Affiliates