Explicitly enable access to host DNS from pod/service IPs.
Also fix the Kubernetes health checks to assert that the number of ready
pods matches the expectation; otherwise the check might skip a pod (e.g.
the `kube-proxy` one) which is not ready, allowing the test to proceed too
early.
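
As an illustration of the ready-pod assertion described above, here is a
minimal sketch. The `Pod` type and `assertReadyPods` function are
hypothetical simplifications, not the actual check in the Talos code base,
which works with the Kubernetes API types.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod status.
type Pod struct {
	Name  string
	Ready bool
}

// assertReadyPods returns an error unless exactly `expected` pods are ready,
// so a pod that is not ready cannot be silently skipped.
func assertReadyPods(pods []Pod, expected int) error {
	ready := 0

	var notReady []string

	for _, p := range pods {
		if p.Ready {
			ready++
		} else {
			notReady = append(notReady, p.Name)
		}
	}

	if ready != expected {
		return fmt.Errorf("expected %d ready pods, got %d (not ready: %v)", expected, ready, notReady)
	}

	return nil
}

func main() {
	pods := []Pod{
		{Name: "kube-proxy-a", Ready: true},
		{Name: "kube-proxy-b", Ready: false},
	}

	// One pod is not ready, so the assertion reports an error.
	fmt.Println(assertReadyPods(pods, 2))
}
```

Comparing against an expected count, rather than only inspecting the pods
that happen to be listed as ready, is what keeps the check from passing too
early.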
Update DNS test to print more logs on error.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Re-structure k8s components health checks so that K8s health can be
independently checked without auxiliary components being up.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Talos diagnostics analyzes the current system state and produces detailed
warnings about system misconfiguration which might be tricky to figure
out otherwise.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The previous implementation used the old events API, which had several
issues:
* buffer overruns and weird checks
* a long timeout even if all the nodes are booted up
Replace that with direct reading of `MachineStatus` resource which is
available since Talos 1.2.0.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
`config.Container` implements a multi-doc container which implements
both the `Container` interface (encoding, validation, etc.) and the `Config`
interface (accessing parts of the config).
Refactor `generate` and `bundle` packages to support multi-doc, and
provide backwards compatibility.
Implement a first (mostly example) machine config document for
SideroLink API URL.
Many places don't properly support multi-doc yet (e.g. config patches).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Add cilium e2e tests. The existing cilium check was very old; update it to
the latest cilium version and also add a test for KPR strict mode.
Signed-off-by: Noel Georgi <git@frezbo.dev>
This PR adds two additional checks which are performed during the boot sequence and in `talosctl health`. They ensure that nodes have enough memory and disk.
- Boot check will print a warning if memory / disk size is not sufficient.
- Health check will fail if memory / disk size is not sufficient.
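
A minimal sketch of how one shared check can back both behaviours. The
minimum values and the function names (`checkResources`, `bootCheck`,
`healthCheck`) are assumptions made for illustration and are not the actual
Talos limits or API.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Hypothetical minimums, chosen only for illustration.
const (
	minMemoryBytes uint64 = 2 << 30  // 2 GiB
	minDiskBytes   uint64 = 10 << 30 // 10 GiB
)

// checkResources reports every resource that falls below its minimum.
func checkResources(memoryBytes, diskBytes uint64) error {
	var errs []error

	if memoryBytes < minMemoryBytes {
		errs = append(errs, fmt.Errorf("memory %d B is below the minimum %d B", memoryBytes, minMemoryBytes))
	}

	if diskBytes < minDiskBytes {
		errs = append(errs, fmt.Errorf("disk %d B is below the minimum %d B", diskBytes, minDiskBytes))
	}

	// errors.Join returns nil when there is nothing to report.
	return errors.Join(errs...)
}

// bootCheck only logs a warning, so the boot sequence continues.
func bootCheck(memoryBytes, diskBytes uint64) {
	if err := checkResources(memoryBytes, diskBytes); err != nil {
		log.Printf("WARNING: %v", err)
	}
}

// healthCheck propagates the error, so the health check fails.
func healthCheck(memoryBytes, diskBytes uint64) error {
	return checkResources(memoryBytes, diskBytes)
}

func main() {
	bootCheck(1<<30, 20<<30)

	if err := healthCheck(1<<30, 20<<30); err != nil {
		fmt.Println("health check failed:", err)
	}
}
```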
Closes #6467
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
There's a cyclic dependency on the siderolink library, which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
It got broken with the changes to the kubelet now sourcing static pods
from an internal HTTP server.
As we don't want it to be broken, and to make health checks better, add
a new check to make sure the kubelet reports control plane static pods as
running. This, coupled with the API server check, should make the health
checks more thorough.
Also add logging when static pod definitions are updated (they were
previously there for file-based implementation). These logs are very
helpful for troubleshooting.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Add a new health check which verifies that the etcd members match the control plane nodes. Closes siderolabs#5553.
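
A minimal sketch of the comparison such a check needs, assuming both sides
are identified by plain strings (for example node addresses). The
`diffMembers` and `checkEtcdMembers` names are hypothetical and do not
mirror the actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// diffMembers compares etcd members with control plane nodes.
// missing: control plane nodes that are not etcd members.
// unexpected: etcd members that are not control plane nodes.
func diffMembers(etcdMembers, controlPlaneNodes []string) (missing, unexpected []string) {
	members := make(map[string]struct{}, len(etcdMembers))
	for _, m := range etcdMembers {
		members[m] = struct{}{}
	}

	nodes := make(map[string]struct{}, len(controlPlaneNodes))
	for _, n := range controlPlaneNodes {
		nodes[n] = struct{}{}
	}

	for n := range nodes {
		if _, ok := members[n]; !ok {
			missing = append(missing, n)
		}
	}

	for m := range members {
		if _, ok := nodes[m]; !ok {
			unexpected = append(unexpected, m)
		}
	}

	// Map iteration order is random; sort for stable error messages.
	sort.Strings(missing)
	sort.Strings(unexpected)

	return missing, unexpected
}

// checkEtcdMembers fails when the two sets differ in either direction.
func checkEtcdMembers(etcdMembers, controlPlaneNodes []string) error {
	missing, unexpected := diffMembers(etcdMembers, controlPlaneNodes)
	if len(missing) > 0 || len(unexpected) > 0 {
		return fmt.Errorf("etcd members mismatch: missing %v, unexpected %v", missing, unexpected)
	}

	return nil
}

func main() {
	err := checkEtcdMembers(
		[]string{"10.0.0.1", "10.0.0.2", "10.0.0.9"},
		[]string{"10.0.0.1", "10.0.0.2", "10.0.0.3"},
	)
	fmt.Println(err)
}
```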
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
We're no longer testing against Talos <= 0.8, so there is no reason to
run this check (even if it's a no-op).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Verify upgrade flow using the same version of the installer.
Run that with disk encryption enabled.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
The filesystem creation step is moved to a later stage: when Talos mounts
the partition for the first time.
Now it checks whether the partition has any filesystem and, if not, formats
it right before mounting.
Additionally refactored mount options a bit:
- replaced separate options with a set of binary flags.
- implemented pre-mount and post-unmount hooks.
Also fixed typos in a couple of places and increased the timeout for `apid ready`.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
For 0.6 -> 0.7 upgrade, in any case config.yaml is preserved and moved
from `/boot` to `/system/state`.
For a single-node upgrade, the `EPHEMERAL` partition is not touched and other
partitions are re-created as needed.
Bump provision tests to 0.6/0.7 upgrades as we get closer to the new
release.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.
`pkg/machinery` is published as a Go module inside the Talos repository.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This change is only moving packages and updating import paths.
Goal: expose `internal/pkg/provision` as `pkg/provision` to enable other
projects to import the Talos provisioning library.
As cluster checks are almost always required as part of the provisioning
process, package `internal/pkg/cluster` was also made public as
`pkg/cluster`.
Other changes were direct dependencies discovered by `importvet` which
were updated.
Public packages (useful, general purpose packages with stable API):
* `internal/pkg/conditions` -> `pkg/conditions`
* `internal/pkg/tail` -> `pkg/tail`
Private packages (used only internally by the provisioning library):
* `internal/pkg/inmemhttp` -> `pkg/provision/internal/inmemhttp`
* `internal/pkg/kernel/vmlinuz` -> `pkg/provision/internal/vmlinuz`
* `internal/pkg/cniutils` -> `pkg/provision/internal/cniutils`
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>