6 Commits

Author SHA1 Message Date
Dmitrii Sharshakov
9758bd4fe0
feat: update Go to 1.26
Via tools/pkgs, also pulling in Clang-built Linux

Update go.mod dependencies

Fix linter errors with new golangci-lint, modernize, use new()

Signed-off-by: Dmitrii Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2026-02-19 22:15:19 +01:00
Laura Brehm
d43a01ccbd
feat: implement talosctl debug
This implements a way to run a debug container with a provided image on
the node.

The container runs with a privileged profile, which allows issuing
debugging commands (e.g. using advanced network tools) to troubleshoot a
machine.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2026-02-04 21:26:09 +04:00
Andrey Smirnov
154952175a
fix: disable swap for system services
If system services including kubelet/CRI start using swap, it might lead
to extreme performance degradation.

Disable swap for all system services except for dashboard (which is not
critical).

```
NAME                                                                          SwapCurrent   SwapPeak   SwapHigh   SwapMax    ZswapCurrent   ZswapMax   ZswapWriteback
.                                                                                unset         unset      unset      unset      unset          unset   1
├──init                                                                            0 B           0 B        max        0 B        0 B            max   1
├──podruntime                                                                      0 B           0 B        max        max        0 B            max   1
│   ├──etcd                                                                        0 B           0 B        max        0 B        0 B            max   1
│   ├──kubelet                                                                     0 B           0 B        max        0 B        0 B            max   1
│   └──runtime                                                                     0 B           0 B        max        0 B        0 B            max   1
└──system                                                                          0 B           0 B        max        max        0 B            max   1
    ├──apid                                                                        0 B           0 B        max        0 B        0 B            max   1
    ├──dashboard                                                                   0 B           0 B        max        max        0 B            max   1
    ├──runtime                                                                     0 B           0 B        max        0 B        0 B            max   1
    ├──trustd                                                                      0 B           0 B        max        0 B        0 B            max   1
```

Refactor the etcd cgroup to use the same common pattern while keeping the
same settings (but limiting swap).
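
A minimal sketch of the policy shown in the table above, assuming swap is limited through the cgroup v2 `memory.swap.max` file. The helper names `swapMaxFor` and `applySwapLimit` are hypothetical and not taken from the Talos code base; the demo writes into a temporary directory rather than a real cgroup mount.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// swapMaxFor returns the value to write to memory.swap.max for a service.
// Hypothetical helper: "0" forbids swap, "max" leaves it unlimited.
func swapMaxFor(service string) string {
	if service == "dashboard" {
		// The dashboard is not critical, so it may keep using swap.
		return "max"
	}

	return "0"
}

// applySwapLimit writes the swap limit into the given cgroup directory.
func applySwapLimit(cgroupDir, service string) error {
	limitFile := filepath.Join(cgroupDir, "memory.swap.max")

	return os.WriteFile(limitFile, []byte(swapMaxFor(service)), 0o644)
}

func main() {
	// A temporary directory stands in for the cgroup filesystem.
	base, err := os.MkdirTemp("", "cgroup-sketch")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(base)

	for _, service := range []string{"apid", "dashboard", "kubelet"} {
		dir := filepath.Join(base, service)
		if err := os.MkdirAll(dir, 0o755); err != nil {
			fmt.Println("error:", err)
			return
		}

		if err := applySwapLimit(dir, service); err != nil {
			fmt.Println("error:", err)
			return
		}

		fmt.Printf("%s: memory.swap.max = %s\n", service, swapMaxFor(service))
	}
}
```

Keeping the decision in one small function mirrors the "common pattern" idea: every service goes through the same path, and the dashboard is the only exception.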

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-12-26 18:25:25 +04:00
Dmitrii Sharshakov
69ab076b4d
fix: re-create cgroups when restarting runners
Make sure processes are only launched into freshly-created cgroups
with all limits set when they are restarted.

This also allows processes to restart after their cgroup has been killed
via the cgroup.kill mechanism.
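
The idea can be sketched as "remove, create, set limits, then launch". This is an illustration under assumptions, not the actual runner code: `recreateCgroup` and `demo` are hypothetical names, and a temporary directory stands in for the cgroup filesystem (a real cgroup can only be removed once it has no live processes).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// recreateCgroup drops any existing cgroup directory, creates it again and
// writes all limits before a process is started in it.
// Hypothetical helper, not the Talos implementation.
func recreateCgroup(dir string, limits map[string]string) error {
	if err := os.RemoveAll(dir); err != nil {
		return fmt.Errorf("remove stale cgroup: %w", err)
	}

	if err := os.MkdirAll(dir, 0o755); err != nil {
		return fmt.Errorf("create cgroup: %w", err)
	}

	for file, value := range limits {
		if err := os.WriteFile(filepath.Join(dir, file), []byte(value), 0o644); err != nil {
			return fmt.Errorf("set %s: %w", file, err)
		}
	}

	return nil
}

// demo simulates a restart and reports whether state from the previous run
// survived, together with the freshly written swap limit.
func demo() (staleSurvived bool, swapMax string, err error) {
	base, err := os.MkdirTemp("", "cgroup-sketch")
	if err != nil {
		return false, "", err
	}
	defer os.RemoveAll(base)

	dir := filepath.Join(base, "system", "apid")
	stale := filepath.Join(dir, "stale")

	// State left behind by the previous run of the service.
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return false, "", err
	}
	if err := os.WriteFile(stale, []byte("old"), 0o644); err != nil {
		return false, "", err
	}

	if err := recreateCgroup(dir, map[string]string{"memory.swap.max": "0"}); err != nil {
		return false, "", err
	}

	_, statErr := os.Stat(stale)

	data, err := os.ReadFile(filepath.Join(dir, "memory.swap.max"))
	if err != nil {
		return false, "", err
	}

	return statErr == nil, string(data), nil
}

func main() {
	stale, swapMax, err := demo()
	fmt.Println("stale state survived:", stale, "swap max:", swapMax, "error:", err)
}
```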

Fixes #11785

Signed-off-by: Dmitrii Sharshakov <dmitry.sharshakov@siderolabs.com>
2025-09-12 11:33:15 +02:00
Andrey Smirnov
6b15ca19cd
fix: audit and fix cgroup reservations
Fixes: #7081

Review all reservations and limits set, test under stress load (using
both memory and CPU).

The goal: system components (Talos itself) and runtime (kubelet, CRI)
should survive under extreme resource starvation (workloads consuming
all CPU/memory).

Uses #9337 to visualize changes, but doesn't depend on it.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-20 22:22:28 +04:00
Andrey Smirnov
66f3ffdd4a
fix: ensure that Talos runs in a pod (container)
Drop the cleanup of Kubernetes manifests stored as static files (it is
only needed for upgrades from 1.2.x).

Fix Talos handling of the cgroup hierarchy: if started in a container in
a non-root cgroup hierarchy, use that to handle proper cgroup paths.
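
One way to picture this, as a sketch only: read the process's own cgroup v2 entry (the `0::<path>` line of `/proc/self/cgroup`) and prefix every service cgroup with it. Whether Talos detects the hierarchy this way is an assumption here; `cgroupV2Root` and `resolve` are hypothetical names, and the sample paths are made up.

```go
package main

import (
	"bufio"
	"fmt"
	"path"
	"strings"
)

// cgroupV2Root extracts the cgroup v2 path from the contents of a
// /proc/<pid>/cgroup file. The unified hierarchy entry looks like "0::<path>".
func cgroupV2Root(procCgroup string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(procCgroup))

	for scanner.Scan() {
		if rest, ok := strings.CutPrefix(scanner.Text(), "0::"); ok {
			return rest, true
		}
	}

	return "", false
}

// resolve places a service cgroup name under the detected root.
func resolve(root, name string) string {
	return path.Join("/", root, name)
}

func main() {
	// Sample file contents; a real caller would read /proc/self/cgroup.
	samples := []string{
		"0::/\n",                   // started in the root cgroup
		"0::/kubepods/pod-example\n", // started inside a pod (made-up path)
	}

	for _, sample := range samples {
		root, ok := cgroupV2Root(sample)
		if !ok {
			fmt.Println("no cgroup v2 entry found")
			continue
		}

		fmt.Println(resolve(root, "system/apid"))
	}
}
```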

Add a test for a simple TinK mode (Talos-in-Kubernetes).

Update the docs.

Fixes #8274

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-20 15:06:48 +04:00