docs: add what's new for v1.7

Initial set of updates for v1.7 without detailed documentation for each
topic.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Andrey Smirnov 2024-04-11 19:25:07 +04:00
parent 908f67fa15
commit d7c3a0735e
13 changed files with 896 additions and 13 deletions

View File

@@ -4,9 +4,9 @@ no_list: true
linkTitle: "Documentation"
cascade:
  type: docs
-  lastRelease: v1.7.0-beta.0
-  kubernetesRelease: "1.30.0-rc.2"
-  prevKubernetesRelease: "1.28.3"
+  lastRelease: v1.7.0-beta.1
+  kubernetesRelease: "1.30.0"
+  prevKubernetesRelease: "1.29.3"
  nvidiaContainerToolkitRelease: "v1.14.5"
  nvidiaDriverRelease: "535.129.03"
  preRelease: true

View File

@@ -0,0 +1,187 @@
---
title: "CA Rotation"
description: "How to rotate Talos and Kubernetes API root certificate authorities."
---
In general, you almost never need to rotate the root CA certificate and key for the Talos API and Kubernetes API.
Talos sets up the root certificate authorities with a lifetime of 10 years, and all Talos and Kubernetes API certificates are issued by these root CAs.
So rotating a root CA is only needed if:
- you suspect that the private key has been compromised;
- you want to revoke access to the cluster for a leaked `talosconfig` or `kubeconfig`;
- the CA is about to expire (once in 10 years).
## Overview
There are some details which make Talos and Kubernetes API root CA rotation slightly different, but the general flow is the same:
- generate a new CA certificate and key;
- add the new CA certificate as 'accepted', so that certificates issued by it are accepted as valid;
- make the new CA the issuing CA, keeping the old CA as accepted;
- refresh all certificates in the cluster;
- remove the old CA from the 'accepted' list.
At the end of the flow, the old CA is completely removed from the cluster, so all certificates issued by it will be considered invalid.
Both rotation flows are described in detail below.
## Talos API
### Automated Talos API CA Rotation
Talos API CA rotation doesn't interrupt connections within the cluster, and it doesn't require a reboot of the nodes.
Run the following command in dry-run mode to see the steps which will be taken:
```shell
$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=true --talos=true --kubernetes=false
> Starting Talos API PKI rotation, dry-run mode true...
> Using config context: "talos-default"
> Using Talos API endpoints: ["172.20.0.2"]
> Cluster topology:
- control plane nodes: ["172.20.0.2"]
- worker nodes: ["172.20.0.3"]
> Current Talos CA:
...
```
No changes are made to the cluster in dry-run mode, so it is safe to run to preview the steps.
Before proceeding, make sure that you can capture the output of the `talosctl` command, as it will contain the new CA certificate and key.
Record a list of Talos API users to make sure they can all be updated with a new `talosconfig`.
Run the following command to rotate the Talos API CA:
```shell
$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=false --talos=true --kubernetes=false
> Starting Talos API PKI rotation, dry-run mode false...
> Using config context: "talos-default-268"
> Using Talos API endpoints: ["172.20.0.2"]
> Cluster topology:
- control plane nodes: ["172.20.0.2"]
- worker nodes: ["172.20.0.3"]
> Current Talos CA:
...
> New Talos CA:
...
> Generating new talosconfig:
context: talos-default
contexts:
talos-default:
....
> Verifying connectivity with existing PKI:
- 172.20.0.2: OK (version {{< release >}})
- 172.20.0.3: OK (version {{< release >}})
> Adding new Talos CA as accepted...
- 172.20.0.2: OK
- 172.20.0.3: OK
> Verifying connectivity with new client cert, but old server CA:
2024/04/17 21:26:07 retrying error: rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: remote error: tls: unknown certificate authority"
- 172.20.0.2: OK (version {{< release >}})
- 172.20.0.3: OK (version {{< release >}})
> Making new Talos CA the issuing CA, old Talos CA the accepted CA...
- 172.20.0.2: OK
- 172.20.0.3: OK
> Verifying connectivity with new PKI:
2024/04/17 21:26:08 retrying error: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"x509: Ed25519 verification failure\" while trying to verify candidate authority certificate \"talos\")"
- 172.20.0.2: OK (version {{< release >}})
- 172.20.0.3: OK (version {{< release >}})
> Removing old Talos CA from the accepted CAs...
- 172.20.0.2: OK
- 172.20.0.3: OK
> Verifying connectivity with new PKI:
- 172.20.0.2: OK (version {{< release >}})
- 172.20.0.3: OK (version {{< release >}})
> Writing new talosconfig to "talosconfig"
```
Once the rotation is done, stash the new Talos CA, and update `secrets.yaml` (if using that for machine configuration generation) with the new CA key and certificate.
The new client `talosconfig` is written to the current directory as `talosconfig`.
You can merge it into the default location with `talosctl config merge ./talosconfig`.
If other client `talosconfig` files need to be generated, use `talosctl config new` with the new `talosconfig`.
> Note: if using the [Talos API access from Kubernetes]({{< relref "./talos-api-access-from-k8s" >}}) feature, pods might need to be restarted manually to pick up the new `talosconfig`.
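For example (a sketch; the output file name and role are illustrative):
```shell
# merge the rotated talosconfig into the default location (~/.talos/config)
talosctl config merge ./talosconfig
# generate an additional read-only client configuration using the new PKI
talosctl --talosconfig ./talosconfig config new reader-talosconfig --roles os:reader
```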
### Manual Steps for Talos API CA Rotation
1. Generate a new Talos CA (e.g. run `talosctl gen secrets` and take the Talos CA from the generated `secrets.yaml`).
2. Patch the machine configuration on all nodes, updating `.machine.acceptedCAs` with the new CA certificate (see the sketch after this list).
3. Generate a `talosconfig` with a client certificate issued by the new CA, but still using the old CA as the server CA, and verify connectivity; Talos should accept the new client certificate.
4. Patch the machine configuration on all nodes, updating `.machine.ca` with the new CA certificate and key, and keeping the old CA certificate in `.machine.acceptedCAs` (on worker nodes, `.machine.ca` doesn't include the key).
5. Generate a `talosconfig` with both the client certificate and the server CA from the new PKI, and verify connectivity.
6. Remove the old CA certificate from `.machine.acceptedCAs` on all nodes.
7. Verify connectivity.
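As a sketch of step 2, assuming the new CA certificate has been base64-encoded into a patch file (the file name, node address, and certificate value are illustrative):
```yaml
# accepted-ca.patch.yaml
machine:
  acceptedCAs:
    - crt: LS0tLS1CRUdJTi4uLg== # base64-encoded new CA certificate
```
```shell
talosctl -n 172.20.0.2 patch mc -p @accepted-ca.patch.yaml
```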
## Kubernetes API
### Automated Kubernetes API CA Rotation
The automated process only rotates the Kubernetes API CA, used by the `kube-apiserver`, `kubelet`, etc.
Other Kubernetes secrets might need to be rotated manually as required.
Kubernetes pods might need to be restarted to handle changes, and communication within the cluster might be disrupted during the rotation process.
Run the following command in dry-run mode to see the steps which will be taken:
```shell
$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=true --talos=false --kubernetes=true
> Starting Kubernetes API PKI rotation, dry-run mode true...
> Cluster topology:
- control plane nodes: ["172.20.0.2"]
- worker nodes: ["172.20.0.3"]
> Building current Kubernetes client...
> Current Kubernetes CA:
...
```
Before proceeding, make sure that you can capture the output of the `talosctl` command, as it will contain the new CA certificate and key.
As Talos API access will not be disrupted, the changes can be rolled back if needed by reverting the machine configuration.
Run the following command to rotate the Kubernetes API CA:
```shell
$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=false --talos=false --kubernetes=true
> Starting Kubernetes API PKI rotation, dry-run mode false...
> Cluster topology:
- control plane nodes: ["172.20.0.2"]
- worker nodes: ["172.20.0.3"]
> Building current Kubernetes client...
> Current Kubernetes CA:
...
> New Kubernetes CA:
...
> Verifying connectivity with existing PKI...
- OK (2 nodes ready)
> Adding new Kubernetes CA as accepted...
- 172.20.0.2: OK
- 172.20.0.3: OK
> Making new Kubernetes CA the issuing CA, old Kubernetes CA the accepted CA...
- 172.20.0.2: OK
- 172.20.0.3: OK
> Building new Kubernetes client...
> Verifying connectivity with new PKI...
2024/04/17 21:45:52 retrying error: Get "https://172.20.0.1:6443/api/v1/nodes": EOF
- OK (2 nodes ready)
> Removing old Kubernetes CA from the accepted CAs...
- 172.20.0.2: OK
- 172.20.0.3: OK
> Verifying connectivity with new PKI...
- OK (2 nodes ready)
> Kubernetes CA rotation done, new 'kubeconfig' can be fetched with `talosctl kubeconfig`.
```
At the end of the process, Kubernetes control plane components will be restarted to pick up CA certificate changes.
Each node's `kubelet` will re-join the cluster with a new client certificate.
A new `kubeconfig` can be fetched from the cluster with the `talosctl kubeconfig` command.
Kubernetes pods might need to be restarted manually to pick up changes to the Kubernetes API CA.
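For example, workloads can be restarted with a rollout to pick up the new CA (the deployment name is illustrative):
```shell
kubectl -n kube-system rollout restart deployment coredns
```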
### Manual Steps for Kubernetes API CA Rotation
The steps are similar [to the Talos API CA rotation](#manual-steps-for-talos-api-ca-rotation), but use:
- `.cluster.acceptedCAs` in place of `.machine.acceptedCAs`;
- `.cluster.ca` in place of `.machine.ca`;
- `kubeconfig` in place of `talosconfig`.

View File

@@ -30,10 +30,12 @@ machine:
image: nginx
```
-Talos renders static pod definitions to the `kubelet` manifest directory (`/etc/kubernetes/manifests`), `kubelet` picks up the definition and launches the pod.
+Talos renders static pod definitions to the `kubelet` using a local HTTP server; `kubelet` picks up the definition and launches the pod.
Talos accepts changes to the static pod configuration without a reboot.
To see a full list of static pods, use `talosctl get staticpods`, and to see the status of the static pods (as reported by the `kubelet`), use `talosctl get staticpodstatus`.
## Usage
Kubelet mirrors pod definition to the API server state, so static pods can be inspected with `kubectl get pods`, logs can be retrieved with `kubectl logs`, etc.

View File

@@ -0,0 +1,69 @@
---
title: "Watchdog Timers"
description: "Using hardware watchdogs to workaround hardware/software lockups."
---
Talos Linux now supports configuration of hardware watchdog timers.
A hardware watchdog timer allows the system to be reset (rebooted) if the software stack becomes unresponsive.
Please consult your hardware/VM documentation for the availability of hardware watchdog timers.
## Configuration
To discover the available watchdog devices, run:
```shell
$ talosctl ls /sys/class/watchdog/
NODE NAME
172.20.0.2 .
172.20.0.2 watchdog0
172.20.0.2 watchdog1
```
The implementation of the watchdog device can be queried with:
```shell
$ talosctl read /sys/class/watchdog/watchdog0/identity
i6300ESB timer
```
To enable the watchdog timer, patch the machine configuration with the following:
```yaml
# watchdog.yaml
apiVersion: v1alpha1
kind: WatchdogTimerConfig
device: /dev/watchdog0
timeout: 5m
```
```shell
talosctl patch mc -p @watchdog.yaml
```
Talos Linux will set up the watchdog timer with a 5-minute timeout, and it will keep resetting the timer to prevent the system from rebooting.
If the software becomes unresponsive, the watchdog timer will expire, and the system will be reset by the watchdog hardware.
## Inspection
To inspect the watchdog timer configuration, run:
```shell
$ talosctl get watchdogtimerconfig
NODE NAMESPACE TYPE ID VERSION DEVICE TIMEOUT
172.20.0.2 runtime WatchdogTimerConfig timer 1 /dev/watchdog0 5m0s
```
To inspect the watchdog timer status, run:
```shell
$ talosctl get watchdogtimerstatus
NODE NAMESPACE TYPE ID VERSION DEVICE TIMEOUT
172.20.0.2 runtime WatchdogTimerStatus timer 1 /dev/watchdog0 5m0s
```
Current status of the watchdog timer can also be inspected via Linux sysfs:
```shell
$ talosctl read /sys/class/watchdog/watchdog0/state
active
```

View File

@ -18,9 +18,9 @@ description: "Table of supported Talos Linux versions and respective platforms."
| - SBCs | Banana Pi M64, Jetson Nano, Libre Computer Board ALL-H3-CC, Nano Pi R4S, Pine64, Pine64 Rock64, Radxa ROCK Pi 4c, Radxa Rock4c+, Raspberry Pi 4B, Raspberry Pi Compute Module 4 | Banana Pi M64, Jetson Nano, Libre Computer Board ALL-H3-CC, Nano Pi R4S, Orange Pi R1 Plus LTS, Pine64, Pine64 Rock64, Radxa ROCK Pi 4c, Raspberry Pi 4B, Raspberry Pi Compute Module 4 | | - SBCs | Banana Pi M64, Jetson Nano, Libre Computer Board ALL-H3-CC, Nano Pi R4S, Pine64, Pine64 Rock64, Radxa ROCK Pi 4c, Radxa Rock4c+, Raspberry Pi 4B, Raspberry Pi Compute Module 4 | Banana Pi M64, Jetson Nano, Libre Computer Board ALL-H3-CC, Nano Pi R4S, Orange Pi R1 Plus LTS, Pine64, Pine64 Rock64, Radxa ROCK Pi 4c, Raspberry Pi 4B, Raspberry Pi Compute Module 4 |
| - local | Docker, QEMU | Docker, QEMU | | - local | Docker, QEMU | Docker, QEMU |
| **Cluster API** | | | | **Cluster API** | | |
| [CAPI Bootstrap Provider Talos](https://github.com/siderolabs/cluster-api-bootstrap-provider-talos) | >= 0.6.3 | >= 0.6.3 | | [CAPI Bootstrap Provider Talos](https://github.com/siderolabs/cluster-api-bootstrap-provider-talos) | >= 0.6.5 | >= 0.6.3 |
| [CAPI Control Plane Provider Talos](https://github.com/siderolabs/cluster-api-control-plane-provider-talos) | >= 0.5.4 | >= 0.5.4 | | [CAPI Control Plane Provider Talos](https://github.com/siderolabs/cluster-api-control-plane-provider-talos) | >= 0.5.6 | >= 0.5.4 |
| [Sidero](https://www.sidero.dev/) | >= 0.6.2 | >= 0.6.2 | | [Sidero](https://www.sidero.dev/) | >= 0.6.4 | >= 0.6.2 |
## Platform Tiers ## Platform Tiers

View File

@@ -6,4 +6,209 @@ description: "List of new and shiny features in Talos Linux."
See also [upgrade notes]({{< relref "../../talos-guides/upgrading-talos/">}}) for important changes.
-TBD
+## Important Changes
* The [default NTP server](#time-sync) was changed from `pool.ntp.org` to `time.cloudflare.com` (the default is only used if no NTP servers are specified in the machine configuration).
* Talos Linux [now](#iptables) forces `kubelet` and `kube-proxy` to use `iptables-nft` instead of `iptables-legacy`.
* Single-board computer (SBC) images are no longer part of the Talos release assets; please read [SBC](#sbc) before upgrading.
* Talos clusters created with `talosctl cluster create` in Docker mode now use a [random port](#containers-docker) for the Kubernetes and Talos APIs.
## Security
### CA Rotation
Talos Linux now supports [rotating the root CA certificate and key]({{< relref "../../advanced/ca-rotation" >}}) for Talos API and Kubernetes API.
## Networking
### Device Selectors
Talos Linux now supports a `physical: true` qualifier for [device selectors]({{< relref "../../talos-guides/network/device-selector" >}}); it selects physical (non-virtual) network interfaces (e.g. `en0` is selected, while `bond0` is not).
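A minimal sketch of a machine configuration using the new qualifier to enable DHCP on all physical interfaces:
```yaml
machine:
  network:
    interfaces:
      - deviceSelector:
          physical: true # match only non-virtual interfaces
        dhcp: true
```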
### DNS Caching
Talos Linux now provides a [caching DNS resolver]({{< relref "../../talos-guides/network/host-dns" >}}) for host workloads (including host networking pods).
The host DNS resolver is enabled by default for clusters created with Talos 1.7.
### Time Sync
The default [NTP server]({{< relref "../../talos-guides/configuration/time-sync" >}}) was changed from `pool.ntp.org` to `time.cloudflare.com`.
The default server is only used if the user does not specify any NTP servers in the configuration.
Talos Linux can now also sync to PTP devices (e.g. provided by the hypervisor), skipping the network time servers.
In order to activate PTP sync, set `machine.time.servers` to the PTP device name (e.g. `/dev/ptp0`):
```yaml
machine:
time:
servers:
- /dev/ptp0
```
### SideroLink HTTP Proxy
[SideroLink]({{< relref "../../talos-guides/network/siderolink" >}}) connections can now proxy Wireguard UDP packets over the existing HTTP/2 SideroLink API connection (for networks where the UDP protocol is filtered, but HTTP is allowed).
## Kubernetes
### API Server Service Account Key
When generating machine configuration, Talos Linux 1.7.0 uses an RSA key for the Kubernetes API server service account instead of an ECDSA key, to provide better compatibility with external OpenID Connect implementations.
### IPTables
Talos Linux now forces `kubelet` and `kube-proxy` to use `iptables-nft` instead of `iptables-legacy` (`xtables`), which was the default before Talos 1.7.0.
Container images based on `iptables-wrapper` should work without changes, but anything calling `iptables` directly in `legacy` mode should be updated to use `iptables-nft`.
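To check which backend an `iptables` binary uses, inspect its version string; the mode is printed in parentheses (output shown is an example):
```shell
$ iptables --version
iptables v1.8.9 (nf_tables)
```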
## Platforms
### New Supported Platforms
Talos Linux now supports:
* [OpenNebula](https://opennebula.io/) platform ([Talos platform `opennebula`]({{< relref "../../talos-guides/install/virtualized-platforms/opennebula" >}}))
* [Akamai Connected Cloud](https://www.linode.com/) provider ([Talos platform `akamai`]({{< relref "../../talos-guides/install/cloud-platforms/akamai" >}}))
### Containers (`docker`)
The `talosctl cluster create` command can now create [multiple Talos clusters on the same machine]({{< relref "../../talos-guides/install/local-platforms/docker" >}}).
The Kubernetes and Talos APIs are mapped to a random port on the host machine.
Talos Linux now also uses the DNS resolver provided by the container runtime when running inside a container.
### Talos-in-Kubernetes
Talos Linux now supports running as a [pod in Kubernetes]({{< relref "../../talos-guides/install/cloud-platforms/kubernetes" >}}), e.g. to run control plane nodes inside an existing Kubernetes cluster.
## SBC
Talos has split SBC (single-board computer) support into separate repositories.
SBC-specific release assets are no longer published as part of Talos releases.
Starting with Talos v1.7.0, the default Talos `installer` image no longer works for SBCs and will fail the upgrade if used.
SBC images and installers can be generated on the fly using [Image Factory](https://factory.talos.dev), or using [imager]({{< relref "../../talos-guides/install/boot-assets">}}) for custom images.
The list of official SBC images supported by Image Factory can be found in the [overlays](https://github.com/siderolabs/overlays/) repository.
To upgrade an SBC running Talos 1.6 to Talos 1.7, generate an `installer` image with an SBC overlay and use it to upgrade the cluster, as sketched below.
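For example, an installer image for a Raspberry Pi could be built with `imager` (a sketch; the overlay name and overlay image version are illustrative, check the overlays repository for current values):
```shell
docker run --rm -t -v $PWD/_out:/out ghcr.io/siderolabs/imager:{{< release >}} installer \
  --arch arm64 \
  --overlay-name rpi_generic \
  --overlay-image ghcr.io/siderolabs/sbc-raspberrypi:v0.1.0
```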
## System Extensions
### Extension Services Configuration
Talos now supports supplying configuration files and environment variables for extension services.
The extension service configuration is a separate config document.
An example is shown below:
```yaml
---
apiVersion: v1alpha1
kind: ExtensionServiceConfig
name: nut-client
configFiles:
- content: MONITOR ${upsmonHost} 1 remote pass password
mountPath: /usr/local/etc/nut/upsmon.conf
environment:
- UPS_NAME=ups
```
For documentation, see [Extension Services Config Files]({{< relref "../../reference/configuration/extensions/extensionserviceconfig" >}}).
> **Note**: The use of `environmentFile` in extension service [spec]({{< relref "../../advanced/extension-services">}}) is now deprecated and will be removed in a future release of Talos,
> use `ExtensionServiceConfig` instead.
### New Extensions
Talos Linux v1.7 introduces new [extensions](https://github.com/siderolabs/extensions):
* `kata-containers`
* `spin`
* `v4l-uvc-drivers`
* `vmtoolsd-guest-agent`
* `wasmedge`
* `xen-guest-agent`
## Logging
### Additional Tags
Talos Linux now supports setting [extra tags]({{< relref "../../talos-guides/configuration/logging" >}}) when sending logs in JSON format:
```yaml
machine:
logging:
destinations:
- endpoint: "udp://127.0.0.1:12345/"
format: "json_lines"
extraTags:
server: s03-rack07
```
### Syslog
Talos Linux now starts a basic syslog receiver listening on `/dev/log`.
The receiver can parse most RFC 3164 and RFC 5424 messages and writes them out as JSON-formatted messages.
The logs can be viewed via `talosctl logs syslogd`.
This is mostly implemented for extension services that log to syslog.
## Miscellaneous
### Kubernetes Upgrade
The [command]({{< relref "../../kubernetes-guides/upgrading-kubernetes" >}}) `talosctl upgrade-k8s` now supports specifying custom image references for Kubernetes components via `--*-image` flags.
The default behavior is unchanged, and the flags are optional.
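For example, to pull control plane images from a registry mirror (a sketch; the mirror hostname is illustrative, and the flag names follow the `--*-image` pattern, see `talosctl upgrade-k8s --help` for the full list):
```shell
talosctl -n <CONTROLPLANE> upgrade-k8s --to 1.30.0 \
  --apiserver-image registry.example.com/kube-apiserver \
  --proxy-image registry.example.com/kube-proxy
```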
### KubeSpan
Talos Linux now disables by default the [KubeSpan]({{< relref "../../talos-guides/network/kubespan" >}}) feature that harvests additional endpoints from KubeSpan members.
This feature turned out to be less helpful than expected and caused unnecessary performance issues.
Previous behavior can be restored with:
```yaml
machine:
network:
kubespan:
harvestExtraEndpoints: true
```
### Secure Boot ISO
Talos Linux now provides a way to configure the systemd-boot 'secure-boot-enroll' option while [generating]({{< relref "../../talos-guides/install/boot-assets" >}}) a SecureBoot ISO image:
```yaml
output:
kind: iso
isoOptions:
sdBootEnrollKeys: force # default is still if-safe
outFormat: raw
```
### Hardware Watchdog Timers
Talos Linux now supports configuration of [hardware watchdog timers]({{< relref "../../advanced/watchdog" >}}).
If enabled, and the machine becomes unresponsive, the hardware watchdog will reset the machine.
The watchdog can be enabled with the following [configuration document]({{< relref "../../reference/configuration/runtime/watchdogtimerconfig" >}}):
```yaml
apiVersion: v1alpha1
kind: WatchdogTimerConfig
device: /dev/watchdog0
timeout: 3m0s
```
## Component Updates
* Linux: 6.6.26
* etcd: 3.5.11
* Kubernetes: 1.30.0
* containerd: 1.7.15
* runc: 1.1.12
* Flannel: 0.24.4
Talos is built with Go 1.22.2.

View File

@@ -56,6 +56,8 @@ $ talosctl -n 172.20.1.2 logs -k kube-system/kube-proxy-gfkqj:kube-proxy:ad5e8ddc7e46
[...]
```
If some host workloads (e.g. system extensions) send syslog messages, they can be retrieved with the `talosctl logs syslogd` command.
## Sending logs
### Service logs

View File

@@ -0,0 +1,97 @@
---
title: "Time Synchronization"
description: "Configuring time synchronization."
---
Talos Linux itself does not require time to be synchronized across the cluster, but as Talos Linux and Kubernetes components issue certificates
with expiration dates, it is recommended to have time synchronized across the cluster.
Some workloads (e.g. Ceph) might require time to be in sync across the machines in the cluster due to the design of the application.
Talos Linux tries to launch the API even if time is not in sync, and if time jumps as a result of NTP sync, the API certificates will be rotated automatically.
Some components like `kubelet` and `etcd` wait for the time to be in sync before starting, as they don't support graceful certificate rotation.
By default, Talos Linux uses `time.cloudflare.com` as the NTP server, but it can be overridden in the machine configuration, or provided via DHCP, kernel args, platform sources, etc.
Talos Linux implements the SNTP protocol to sync time with NTP servers.
## Observing Status
Current time sync status can be observed with:
```shell
$ talosctl get timestatus
NODE NAMESPACE TYPE ID VERSION SYNCED
172.20.0.2 runtime TimeStatus node 2 true
```
The list of servers Talos Linux is syncing with can be observed with:
```shell
$ talosctl get timeservers
NODE NAMESPACE TYPE ID VERSION TIMESERVERS
172.20.0.2 network TimeServerStatus timeservers 1 ["time.cloudflare.com"]
```
More detailed logs about the time sync process can be queried with:
```shell
$ talosctl logs controller-runtime | grep -i time.Sync
172.20.0.2: 2024-04-17T18:32:16.690Z DEBUG NTP response {"component": "controller-runtime", "controller": "time.SyncController", "clock_offset": "37.060204ms", "rtt": "3.044816ms", "leap": 0, "stratum": 3, "precision": "29ns", "root_delay": "70.617676ms", "root_dispersion": "259.399µs", "root_distance": "37.090645ms"}
172.20.0.2: 2024-04-17T18:32:16.690Z DEBUG sample stats {"component": "controller-runtime", "controller": "time.SyncController", "jitter": "150.196588ms", "poll_interval": "34m8s", "spike": false}
172.20.0.2: 2024-04-17T18:32:16.690Z DEBUG adjusting time (slew) by 37.060204ms via 162.159.200.1, state TIME_OK, status STA_PLL | STA_NANO {"component": "controller-runtime", "controller": "time.SyncController"}
172.20.0.2: 2024-04-17T18:32:16.690Z DEBUG adjtime state {"component": "controller-runtime", "controller": "time.SyncController", "constant": 7, "offset": "37.060203ms", "freq_offset": -1302069, "freq_offset_ppm": -19}
```
## Using PTP Devices
When running in a VM on a hypervisor, instead of doing network time sync, Talos can sync the time to the hypervisor clock (if supported by the hypervisor).
To check if the PTP device is available:
```shell
$ talosctl ls /sys/class/ptp/
NODE NAME
172.20.0.2 .
172.20.0.2 ptp0
```
Make sure that the PTP device is provided by the hypervisor, as some PTP devices don't provide an accurate time value without proper setup:
```shell
$ talosctl read /sys/class/ptp/ptp0/clock_name
KVM virtual PTP
```
To enable PTP sync, set `machine.time.servers` to the PTP device path (e.g. `/dev/ptp0`):
```yaml
machine:
time:
servers:
- /dev/ptp0
```
After setting the PTP device, Talos will sync the time to the PTP device instead of using the NTP server:
```text
172.20.0.2: 2024-04-17T19:11:48.817Z DEBUG adjusting time (slew) by 32.223689ms via /dev/ptp0, state TIME_OK, status STA_PLL | STA_NANO {"component": "controller-runtime", "controller": "time.SyncController"}
```
## Additional Configuration
Talos NTP sync can be disabled with the following machine configuration patch:
```yaml
machine:
time:
disabled: true
```
When time sync is disabled, Talos assumes that time is always in sync.
Time sync can also be configured on a best-effort basis, where Talos will try to sync time for the specified period; if it fails to do so, time is assumed to be in sync when the period expires:
```yaml
machine:
time:
bootTimeout: 2m
```

View File

@@ -0,0 +1,105 @@
---
title: "Kubernetes"
description: "Running Talos Linux as a pod in Kubernetes."
---
Talos Linux can be run as a pod in Kubernetes similar to running Talos in [Docker]({{< relref "../local-platforms/docker" >}}).
This can be used e.g. to run controlplane nodes inside an existing Kubernetes cluster.
Talos Linux running in Kubernetes is not the full Talos Linux experience, as it runs in a container using the host's kernel and network stack.
Some operations like upgrades and reboots are not supported.
## Prerequisites
* a running Kubernetes cluster
* a `talos` container image: `ghcr.io/siderolabs/talos:{{< release >}}`
## Machine Configuration
Machine configuration can be generated using the [Getting Started]({{< relref "../../../introduction/getting-started" >}}) guide.
The machine install disk will be ignored, as will the install image.
The Talos version is driven by the container image being used.
The required machine configuration patch to enable using container runtime DNS:
```yaml
machine:
features:
hostDNS:
enabled: true
forwardKubeDNSToHost: true
```
The Talos and Kubernetes APIs can be exposed using Kubernetes services or load balancers, so that they can be accessed from outside the cluster.
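A sketch of a `LoadBalancer` service exposing the Talos API of such pods (the service name and selector are illustrative):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: talos-api
spec:
  type: LoadBalancer
  selector:
    app: talos-in-kubernetes
  ports:
    - name: talos-api
      protocol: TCP
      port: 50000
      targetPort: 50000
```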
## Running Talos Pods
There are many ways to run Talos in Kubernetes (StatefulSet, Deployment, single Pod), so only some basic guidance is provided here.
### Container Settings
```yaml
env:
- name: PLATFORM
value: container
image: ghcr.io/siderolabs/talos:{{< release >}}
ports:
- containerPort: 50000
name: talos-api
protocol: TCP
- containerPort: 6443
name: k8s-api
protocol: TCP
securityContext:
privileged: true
readOnlyRootFilesystem: true
seccompProfile:
type: Unconfined
```
### Submitting Initial Machine Configuration
Initial machine configuration can be submitted using `talosctl apply-config --insecure` when the pod is running, or it can be submitted
via the `USERDATA` environment variable containing base64-encoded machine configuration.
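A sketch of the `USERDATA` approach in the container settings (the value shown is a truncated placeholder):
```yaml
env:
  - name: USERDATA
    # produced with: base64 -w0 controlplane.yaml
    value: "bWFjaGluZTo..."
```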
### Volume Mounts
Three ephemeral mounts are required for `/run`, `/system`, and `/tmp` directories:
```yaml
volumeMounts:
- mountPath: /run
name: run
- mountPath: /system
name: system
- mountPath: /tmp
name: tmp
```
```yaml
volumes:
- emptyDir: {}
name: run
- emptyDir: {}
name: system
- emptyDir: {}
name: tmp
```
Several other mountpoints are required, and they should persist across pod restarts, so a `PersistentVolume` should be used for them:
```yaml
volumeMounts:
- mountPath: /system/state
name: system-state
- mountPath: /var
name: var
- mountPath: /etc/cni
name: etc-cni
- mountPath: /etc/kubernetes
name: etc-kubernetes
- mountPath: /usr/libexec/kubernetes
name: usr-libexec-kubernetes
- mountPath: /opt
name: opt
```
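These persistent mounts can be backed by `PersistentVolumeClaim`s, for example (the claim names are illustrative):
```yaml
volumes:
  - name: system-state
    persistentVolumeClaim:
      claimName: talos-system-state
  - name: var
    persistentVolumeClaim:
      claimName: talos-var
  # ... one claim per persistent mount listed above
```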

View File

@@ -28,10 +28,10 @@ Further, when running on a Mac in docker, due to networking limitations, VIPs a
Creating a local cluster is as simple as:
```bash
-talosctl cluster create --wait
+talosctl cluster create
```
-Once the above finishes successfully, your talosconfig(`~/.talos/config`) will be configured to point to the new cluster.
+Once the above finishes successfully, your `talosconfig` (`~/.talos/config`) and `kubeconfig` (`~/.kube/config`) will be configured to point to the new cluster.
> Note: Startup times can take up to a minute or more before the cluster is available.
@@ -40,6 +40,23 @@ Talosctl can operate on one or all the nodes in the cluster - this makes clust
`talosctl config nodes 10.5.0.2 10.5.0.3`
The Talos and Kubernetes APIs are mapped to a random port on the host machine; the retrieved `talosconfig` and `kubeconfig` are configured automatically to point to the new cluster.
The Talos API endpoint can be found using `talosctl config info`:
```bash
$ talosctl config info
...
Endpoints: 127.0.0.1:38423
```
The Kubernetes API endpoint is available with `talosctl cluster show`:
```bash
$ talosctl cluster show
...
KUBERNETES ENDPOINT https://127.0.0.1:43083
```
## Using the Cluster
Once the cluster is available, you can make use of `talosctl` and `kubectl` to interact with the cluster.
@@ -54,6 +71,32 @@ To cleanup, run:
talosctl cluster destroy
```
## Multiple Clusters
Multiple Talos Linux clusters can be created on the same host; each cluster needs to have:
- a unique name (default is `talos-default`)
- a unique network CIDR (default is `10.5.0.0/24`)
To create a new cluster, run:
```bash
talosctl cluster create --name cluster2 --cidr 10.6.0.0/24
```
To destroy a specific cluster, run:
```bash
talosctl cluster destroy --name cluster2
```
To switch between clusters, use the `--context` flag:
```bash
talosctl --context cluster2 version
kubectl --context admin@cluster2 get nodes
```
## Running Talos in Docker Manually
To run Talos in a container manually, run:
@@ -77,3 +120,6 @@ docker run --rm -it \
-e PLATFORM=container \
ghcr.io/siderolabs/talos:{{< release >}}
```
The machine configuration submitted to the container should have the [host DNS feature]({{< relref "../../../reference/configuration/v1alpha1/config#Config.machine.features.hostDNS" >}}) enabled with `forwardKubeDNSToHost` set to `true`.
It is used to forward DNS requests to the resolver provided by Docker (or other container runtime).

View File

@@ -0,0 +1,117 @@
---
title: "Host DNS"
description: "How to configure Talos host DNS caching server."
---
Starting with 1.7.0, Talos Linux provides a caching DNS resolver for host workloads (including host networking pods).
The host DNS resolver is enabled by default for clusters created with Talos 1.7, and it can be enabled manually on upgrade.
## Enabling Host DNS
Use the following machine configuration patch to enable the host DNS resolver:
```yaml
machine:
features:
hostDNS:
enabled: true
```
Host DNS can be disabled by setting `enabled: false` as well.
## Operations
When enabled, Talos Linux starts a DNS caching server on the host, listening on address `127.0.0.53:53` (both TCP and UDP protocols).
The host `/etc/resolv.conf` file is rewritten to point to the host DNS server:
```shell
$ talosctl read /etc/resolv.conf
nameserver 127.0.0.53
```
All host-based workloads will use the host DNS server for name resolution.
The host DNS server forwards requests to the upstream DNS servers, which are either acquired automatically (DHCP, platform sources, kernel args) or specified in the machine configuration.
The upstream DNS servers can be observed with:
```shell
$ talosctl get resolvers
NODE NAMESPACE TYPE ID VERSION RESOLVERS
172.20.0.2 network ResolverStatus resolvers 2 ["8.8.8.8","1.1.1.1"]
```
Logs of the host DNS resolver can be queried with:
```shell
talosctl logs dns-resolve-cache
```
Upstream server status can be observed with:
```shell
$ talosctl get dnsupstream
NODE NAMESPACE TYPE ID VERSION HEALTHY ADDRESS
172.20.0.2 network DNSUpstream 1.1.1.1 1 true 1.1.1.1:53
172.20.0.2 network DNSUpstream 8.8.8.8 1 true 8.8.8.8:53
```
## Forwarding `kube-dns` to Host DNS
When host DNS is enabled, by default the `kube-dns` service (CoreDNS in Kubernetes) uses the upstream DNS servers to resolve external names.
Talos can instead forward `kube-dns` requests to the host DNS resolver, so that the cache is shared between the host and `kube-dns`:
```yaml
machine:
features:
hostDNS:
enabled: true
forwardKubeDNSToHost: true
```
This configuration should be applied to all nodes in the cluster; if enabled after cluster creation, restart the `coredns` pods in Kubernetes to pick up the changes.
When `forwardKubeDNSToHost` is enabled, Talos Linux allocates the 9th IP address in the `serviceSubnet` range for the host DNS server, and the `kube-dns` service is configured to use this IP address as the upstream DNS server:
```shell
$ kubectl get services -n kube-system host-dns
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
host-dns ClusterIP 10.96.0.9 <none> 53/UDP,53/TCP 27s
$ talosctl read /system/resolved/resolv.conf
nameserver 10.96.0.9
```
With this configuration, `kube-dns` service forwards all DNS requests to the host DNS server, and the cache is shared between the host and `kube-dns`.
## Resolving Talos Cluster Member Names
Host DNS can be configured to resolve Talos cluster member names to IP addresses, so that the host can communicate with the cluster members by name.
Sometimes machine hostnames are already resolvable by the upstream DNS, but this might not always be the case.
Enabling the feature:
```yaml
machine:
features:
hostDNS:
enabled: true
resolveMemberNames: true
```
When enabled, Talos Linux uses [discovery]({{< relref "../discovery" >}}) data to resolve Talos cluster member names to IP addresses:
```shell
$ talosctl get members
NODE NAMESPACE TYPE ID VERSION HOSTNAME MACHINE TYPE OS ADDRESSES
172.20.0.2 cluster Member talos-default-controlplane-1 1 talos-default-controlplane-1 controlplane Talos ({{< release >}}) ["172.20.0.2"]
172.20.0.2 cluster Member talos-default-worker-1 1 talos-default-worker-1 worker Talos ({{< release >}}) ["172.20.0.3"]
```
With the example output above, the `talos-default-worker-1` name will resolve to `172.20.0.3`.
Example usage:
```shell
talosctl -n talos-default-worker-1 version
```
When combined with `forwardKubeDNSToHost`, the `kube-dns` service will also resolve Talos cluster member names to IP addresses.

View File

@@ -0,0 +1,36 @@
---
title: "SideroLink"
description: "Point-to-point management overlay Wireguard network."
---
SideroLink provides a secure point-to-point management overlay network for Talos clusters.
Each Talos machine configured to use SideroLink will establish a secure Wireguard connection to the SideroLink API server.
SideroLink provides an overlay network using ULA IPv6 addresses, allowing Talos Linux machines to be managed even if direct access to their IP addresses is not possible.
SideroLink is a foundational building block of [Sidero Omni](https://www.siderolabs.com/platform/saas-for-kubernetes/).
## Configuration
SideroLink is configured by providing the SideroLink API server address, either via kernel command line argument `siderolink.api` or as a [config document]({{< relref "../../reference/configuration/siderolink/siderolinkconfig" >}}).
An example SideroLink API URL: `https://siderolink.api/?jointoken=token&grpc_tunnel=yes`.
If the URL scheme is `grpc://`, the connection will be established without TLS; otherwise, the connection will be established with TLS.
If specified, the join token `token` will be sent to the SideroLink server.
If `grpc_tunnel` is set to `yes`, the Wireguard traffic will be tunneled over the same SideroLink API gRPC connection instead of using plain UDP.
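A sketch of the equivalent machine configuration document, using the example URL above:
```yaml
apiVersion: v1alpha1
kind: SideroLinkConfig
apiUrl: https://siderolink.api/?jointoken=token&grpc_tunnel=yes
```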
## Connection Flow
1. Talos Linux creates an ephemeral Wireguard key.
2. Talos Linux establishes a gRPC connection to the SideroLink API server, sends its own Wireguard public key, join token and other connection settings.
3. If the join token is valid, the SideroLink API server sends back the Wireguard public key of the SideroLink API server, and two overlay IPv6 addresses: machine address and SideroLink server address.
4. Talos Linux configures the Wireguard interface with the received settings.
5. Talos Linux monitors the status of the Wireguard connection and re-establishes it if needed.
## Operations with SideroLink
When SideroLink is configured, Talos maintenance mode API listens only on the SideroLink network.
Maintenance mode API over SideroLink allows operations which are not generally available over the public network: getting the Talos version, getting sensitive resources, etc.
Talos Linux always provides the Talos API over SideroLink, and automatically allows access over SideroLink even if the [Ingress Firewall]({{< relref "./ingress-firewall" >}}) is enabled.
Wireguard connections should still be allowed by the Ingress Firewall.
SideroLink only allows point-to-point connections between Talos machines and the SideroLink management server, two Talos machines cannot communicate directly over SideroLink.

View File

@@ -36,9 +36,11 @@ For example, if upgrading from Talos 1.0 to Talos 1.2.4, the recommended upgrade
## Before Upgrade to {{% release %}}
### Extension Configuration
If running `tailscale` or `nut-client` extension, follow the below steps for upgrade.
-### nut-client
+#### nut-client
First start by editing the machine config in `staged` mode (`talosctl edit mc --mode=staged`) and remove the `.machine.files` section that adds the `nut-client` config.
@@ -46,7 +48,7 @@ Now upgrade talos to `{{% release %}}`, the `nut-client` service would now be wa
Create a config document as described in the `nut-client` [README](https://github.com/siderolabs/extensions/blob/main/power/nut-client/README.md#usage) and apply the patch.
-### tailscale
+#### tailscale
First start by editing the machine config in `staged` mode (`talosctl edit mc --mode=staged`) and remove the `.machine.files` section that adds the `tailscale` auth key file.
@@ -56,6 +58,13 @@ Create a config document as described in the `tailscale` [README](https://github
Please review the [release notes]({{< relref "../introduction/what-is-new" >}}) for any changes that may affect your cluster.
### SBC
SBC images and installers can be generated on the fly using [Image Factory](https://factory.talos.dev) or using [imager]({{< relref "./install/boot-assets">}}) for custom images.
The list of official SBC images supported by Image Factory can be found in the [overlays](https://github.com/siderolabs/overlays/) repository.
To upgrade an SBC running Talos 1.6 to Talos 1.7, generate an `installer` image with an SBC overlay and use it to upgrade the cluster.
## Video Walkthrough
To see a live demo of an upgrade of Talos Linux, see the video below:
@@ -111,7 +120,15 @@ future.
## Machine Configuration Changes
-TBD
+* new configuration documents:
* [ExtensionServiceConfig]({{< relref "../reference/configuration/extensions/extensionserviceconfig" >}})
* [WatchdogTimerConfig]({{< relref "../reference/configuration/runtime/watchdogtimerconfig" >}})
* new fields in [v1alpha1.Config]({{< relref "../reference/configuration/v1alpha1/config" >}}) document:
* [`.machine.acceptedCAs`]({{< relref "../reference/configuration/v1alpha1/config#Config.machine" >}})
* [`.cluster.acceptedCAs`]({{< relref "../reference/configuration/v1alpha1/config#Config.cluster" >}})
* [`.machine.features.hostDNS`]({{< relref "../reference/configuration/v1alpha1/config#Config.machine.features.hostDNS" >}})
* [`.machine.network.interfaces[].deviceSelector.physical`]({{< relref "../reference/configuration/v1alpha1/config#Config.machine.network.interfaces..deviceSelector" >}})
* [`.machine.logging.extraTags`]({{< relref "../reference/configuration/v1alpha1/config#Config.machine.logging" >}})
## Upgrade Sequence