15 Commits

Author SHA1 Message Date
LukasAuerbeck
c81ce8cfb0
feat: support controlplane resources configuration
Fixes #7379

Add the ability to configure the controlplane static pod resources via
the APIServer, ControllerManager and Scheduler configs.

Signed-off-by: LukasAuerbeck <17929465+LukasAuerbeck@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-07 22:44:56 +04:00
Artem Chernyshev
ce63abb219
feat: add KMS assisted encryption key handler
Talos now supports a new type of encryption key which relies on sealing/unsealing randomly generated bytes with a KMS server:

```
systemDiskEncryption:
  ephemeral:
    keys:
      - kms:
          endpoint: https://1.2.3.4:443
        slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).
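
As a rough illustration of the seal/unseal idea only, the sketch below generates a random disk key and seals it with a KMS. Everything here is an assumption made for the example: the `KMS` interface, the `localKMS` type (an in-process AES-GCM stand-in for a remote server) and `NewDiskKey` are hypothetical names and do not reflect the real gRPC API or the Talos implementation.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// KMS is a hypothetical stand-in for the remote KMS server API.
type KMS interface {
	Seal(plaintext []byte) ([]byte, error)
	Unseal(sealed []byte) ([]byte, error)
}

// localKMS is an in-process fake that seals data with AES-GCM.
// A real KMS would keep the master key on the server side.
type localKMS struct{ aead cipher.AEAD }

func newLocalKMS(masterKey []byte) (*localKMS, error) {
	block, err := aes.NewCipher(masterKey)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &localKMS{aead: aead}, nil
}

// Seal encrypts plaintext and prepends the random nonce to the result.
func (k *localKMS) Seal(plaintext []byte) ([]byte, error) {
	nonce := make([]byte, k.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return k.aead.Seal(nonce, nonce, plaintext, nil), nil
}

// Unseal splits off the nonce and decrypts the remainder.
func (k *localKMS) Unseal(sealed []byte) ([]byte, error) {
	n := k.aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed data too short")
	}
	return k.aead.Open(nil, sealed[:n], sealed[n:], nil)
}

// NewDiskKey generates 32 random bytes and returns them together with
// their sealed form (assumption: only the sealed form would be persisted).
func NewDiskKey(kms KMS) (key, sealed []byte, err error) {
	key = make([]byte, 32)
	if _, err = rand.Read(key); err != nil {
		return nil, nil, err
	}
	sealed, err = kms.Seal(key)
	return key, sealed, err
}

func main() {
	kms, err := newLocalKMS(make([]byte, 32)) // all-zero demo master key, not for real use
	if err != nil {
		panic(err)
	}
	key, sealed, err := NewDiskKey(kms)
	if err != nil {
		panic(err)
	}
	unsealed, err := kms.Unseal(sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", string(unsealed) == string(key))
}
```

In this sketch the node never needs the master key: it only holds the sealed blob and asks the KMS to unseal it when the disk key is needed.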

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-07 19:02:39 +03:00
Utku Ozdemir
0d313b9733
feat: add reboot-mode flag to talosctl upgrade
Allow specifying the reboot mode during upgrades by introducing the `--reboot-mode` flag, similar to the `--mode` flag of the reboot command.

Closes siderolabs/talos#7302.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-06-26 17:37:19 +02:00
Dmitriy Matrenichev
445f5ad542
feat: support API server load balancer
This commit adds support for an API server load balancer. A quick way to enable it is during cluster creation using the new `api-server-balancer-port` flag (0 by default, meaning disabled). When enabled, all API requests will be routed across
cluster control plane endpoints.

Closes #7191

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-06-16 10:09:20 -04:00
Andrey Smirnov
0a99965efb
refactor: replace uncordonNode with controllers
Fixes #7233

Waiting for node readiness now happens in the `MachineStatus` controller
which won't mark the node as ready until the Kubernetes `Node` is ready.

Handling cordoning/uncordoning happens with the help of an additional
resource in `NodeApplyController`.

A new controller provides a reactive `NodeStatus` resource to see the
current status of the Kubernetes `Node`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-13 21:48:42 +04:00
Andrey Smirnov
dbaf5c6997
refactor: task labelControlPlane into controllers
See #7233

The controlplane label is simply injected into existing controller-based
node label flow.

For the default controlplane NoSchedule taint, an additional controller &
resource were implemented to handle node taints.

This also fixes a problem with `allowSchedulingOnControlPlanes` not
being reactive to config changes - now it is.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-12 15:25:13 +04:00
Andrey Smirnov
e7be6ee7c3
refactor: make event log streaming fully reactive
I ended up completely rewriting the controller, simplifying the flow
(somewhat) so that there's just a single control flow in the controller,
while reading from v1alpha1 events is converted to reading from a
channel.

Fixes #7227

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-08 23:13:33 +04:00
Andrey Smirnov
5dab45e869
refactor: allow kmsg log streaming to be reconfigured on the fly
Fixes #7226

This follows the same flow as other similar changes: split out logging
configuration as a separate resource, sourced for now from the cmdline.

Rewrite the controller to allow multiple log outputs and add send retries.
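
The "multiple log outputs with send retries" idea can be sketched roughly as below. This is a minimal illustration, not the controller's code: the `Sender` interface, `SendToAll` and `flakySender` are hypothetical names, and a real implementation would likely add backoff between attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// Sender is a hypothetical log destination.
type Sender interface {
	Send(msg string) error
}

// SendToAll delivers msg to every sender, trying each one up to
// maxAttempts times. It returns a joined error covering the senders
// that never succeeded, or nil if all deliveries succeeded.
func SendToAll(senders []Sender, msg string, maxAttempts int) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var errs []error
	for i, s := range senders {
		var err error
		for attempt := 1; attempt <= maxAttempts; attempt++ {
			if err = s.Send(msg); err == nil {
				break
			}
		}
		if err != nil {
			errs = append(errs, fmt.Errorf("sender %d: %w", i, err))
		}
	}
	return errors.Join(errs...)
}

// flakySender fails a fixed number of times before accepting messages.
type flakySender struct {
	failuresLeft int
	delivered    []string
}

func (f *flakySender) Send(msg string) error {
	if f.failuresLeft > 0 {
		f.failuresLeft--
		return errors.New("temporary failure")
	}
	f.delivered = append(f.delivered, msg)
	return nil
}

func main() {
	stable := &flakySender{}
	flaky := &flakySender{failuresLeft: 2}
	err := SendToAll([]Sender{stable, flaky}, "kernel: link up", 3)
	fmt.Println("error:", err, "delivered:", len(stable.delivered), len(flaky.delivered))
}
```

A failure on one output does not block delivery to the others in this sketch; each output is retried independently.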

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-06 15:56:24 +04:00
Dmitriy Matrenichev
8a02ecd4cb
chore: add endpoints balancer controller
This PR adds support for creating a list of API endpoints (each is a pair of host and port).

It gets them from
- Machine config cluster endpoint.
- Localhost with LocalAPIServerPort if the machine is a control plane node.
- netip.Addr[0] and port from affiliates if they are control plane nodes.

For #7191

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-06-05 20:47:52 -04:00
Andrey Smirnov
bab484a405
feat: use stable network interface names
Use `udevd` rules to create stable interface names.

Link controllers should wait for `udevd` to settle down, otherwise link
rename will fail (interface should not be UP).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-01 21:29:12 +04:00
Andrey Smirnov
10155c390e
feat: enable xfs project quota support, kubelet feature
This is controlled with a feature flag which gets enabled automatically
for Talos 1.5+.

Fixes #7181

If enabled, configures kubelet to use project quotas to track xfs volume
usage, which is much more efficient than doing `du` periodically.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-19 20:33:39 +04:00
Andrey Smirnov
bb02dd263c
chore: drop deprecated stuff for Talos 1.5
* drop old resources API, which was deprecated a long time ago
* use bootstrapped event in `talosctl get --watch` to better align
  columns in the table output

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-18 19:46:37 +04:00
Utku Ozdemir
62c6e9655c
feat: introduce siderolink config resource & reconnect
Introduce a new resource, `SiderolinkConfig`, to store SideroLink connection configuration (API endpoint for now).

Introduce a controller for this resource which populates it from the kernel cmdline.

Rework the SideroLink `ManagerController` to take this new resource as input and reconfigure the link on changes.

Additionally, if the SideroLink connection is lost, reconnect to it and reconfigure the links/addresses.

Closes siderolabs/talos#7142, siderolabs/talos#7143.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-05-05 17:04:34 +02:00
Andrey Smirnov
860002c735
fix: don't reload control plane pods on cert SANs changes
Fixes #7159

The change looks big, but it's actually pretty simple inside: the static
pods had an annotation tracking the version of the secrets, which
forced control plane pods to reload on any change. At the same time,
`kube-apiserver` can reload certificate inputs automatically from files
without a restart.

So the inputs were split: the dynamic (for kube-apiserver) inputs don't
need a reload, so their version is not tracked in the static pod
annotation and changes to them don't cause a reload. The previous
non-dynamic resource still causes a reload, but it doesn't get updated
when e.g. node addresses change.

Much more refactoring could be done here, as the resource chain is a bit
of a mess, but I wanted to keep the number of changes minimal to keep
this backportable.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-05 16:59:09 +04:00
Andrey Smirnov
d9bdea2b54
chore: fork docs and compatibility modules for Talos 1.5
Getting ready for the next Talos 1.5.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-04-27 15:36:31 +04:00