Run a health check before the test, as the test depends on CoreDNS being
healthy, and previous tests might disturb the cluster.
Also refactor by using watch instead of retries, make pods terminate
fast.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Provide a trace for each step of the reset sequence taken, so if one of
those fails, integration test produces a meaningful message instead of
proceeding and failing somewhere else.
More cleanup/refactor, should be functionally equivalent.
Fixes#8635
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
More specifically, pick up `/etc/resolv.conf` contents by default when
in container mode, and use that as a base resolver for the host DNS.
Fixes#8303
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This fixes an issue with a single controlplane cluster.
Properly present all accepted CAs to the apiserver, in the test let the
cluster fully recovery between two CA rotations performed.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Dynamically map Kubernetes and Talos API ports to an available port on
the host, so every cluster gets its own unique set of parts.
As part of the changes, refactor the provision library and interfaces,
dropping old weird interfaces replacing with (hopefully) much more
descriprive names.
Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8361
Talos requires v2 (circa 2008), but VMs are often configured to limit
the exposed features to the baseline (v1).
```
[ 0.779218] [talos] [initramfs] booting Talos v1.7.0-alpha.1-35-gef5bbe728-dirty
[ 0.779806] [talos] [initramfs] CPU: QEMU Virtual CPU version 2.5+, 4 core(s), 1 thread(s) per core
[ 0.780529] [talos] [initramfs] x86_64 microarchitecture level: 1
[ 0.781018] [talos] [initramfs] it might be that the VM is configured with an older CPU model, please check the VM configuration
[ 0.782346] [talos] [initramfs] x86_64 microarchitecture level 2 or higher is required, halting
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This allows to roll all nodes to use a new CA, to refresh it, or e.g.
when the `talosconfig` was exposed accidentally.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This controller combines kobject events, and scan of `/sys/block` to
build a consistent list of available block devices, updating resources
as the blockdevice changes.
Based on these resources the next step can run probe on the blockdevices
as they change to present a consistent view of filesystems/partitions.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The current code was stipping non-`v1alpha1.Config` documents. Provide a
proper method in the config provider, and update places using it.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Let's add a very basic test for the Kata Containers extension, mimicing
what's already in place for gVisor.
This depends on the work being done in:
https://github.com/siderolabs/extensions/pull/279
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
Fixes#8057
I went back and forth on the way to fix it exactly, and ended up with a
pretty simple version of a fix.
The problem was that discovery service was removing the member at the
initial phase of reset, which actually still requires KubeSpan to be up:
* leaving `etcd` (need to talk to other members)
* stopping pods (might need to talk to Kubernetes API with some CNIs)
Now leaving discovery service happens way later, when network
interactions are no longer required.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#4421
See documentation for details on how to use the feature.
With `talosctl cluster create`, firewall can be easily test with
`--with-firewall=accept|block` (default mode).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This PR adds support for custom node taints. Refer to `nodeTaints` in the `configuration` for more information.
Closes#7581
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fixes#7873
Some services which perform mounts inside the container which require
mounts to propagate back to the host (e.g. `stargz-snapshotter`) require
this configuration setting.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
As Talos doesn't consume `.machine.install` if already installed, there
is no point in validating it once already installed.
This fixes a problem users often run into: after a reboot/upgrade the
system disk blockdevice name changes, due to the kernel upgrade, or just
unpredictable behavior of device discovery. Talos fails to boot as it
can't validate the machine config, while it's already installed, so
actual blockdevice name doesn't matter.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
drop `UpdateEndpointSuite` suite since KubePrism is enabled by default
starting Talos 1.6 and the test never passes since K8s node is always
ready since it can connect to api server over KubePrism.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Move drone extensions integration to a function. This allows us to
re-use the code and just depend on a single step rather than explicitly
defining all dependencies.
Signed-off-by: Noel Georgi <git@frezbo.dev>
This is required for https://github.com/siderolabs/sidero/pull/1070, as
we need to allow DHCP traffic from Sidero controller running in a VM
through the bridge to other VMs.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Talos now supports new type of encryption keys which rely on Sealing/Unsealing randomly generated bytes with a KMS server:
```
systemDiskEncryption:
ephemeral:
keys:
- kms:
endpoint: https://1.2.3.4:443
slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
See #7233
The controlplane label is simply injected into existing controller-based
node label flow.
For controlplane taint default NoScheduleTaint, additional controller &
resource was implemented to handle node taints.
This also fixes a problem with `allowSchedulingOnControlPlanes` not
being reactive to config changes - now it is.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR adds support for creating a list of API endpoints (each is pair of host and port).
It gets them from
- Machine config cluster endpoint.
- Localhost with LocalAPIServerPort if machine is control panel.
- netip.Addr[0] and port from affiliates if they are control panels.
For #7191
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
`config.Container` implements a multi-doc container which implements
both `Container` interface (encoding, validation, etc.), and `Conifg`
interface (accessing parts of the config).
Refactor `generate` and `bundle` packages to support multi-doc, and
provide backwards compatibility.
Implement a first (mostly example) machine config document for
SideroLink API URL.
Many places don't properly support multi-doc yet (e.g. config patches).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See #7230
Refactor more config interfaces, move config accessor interfaces
to different package to break the dependency loop.
Make `.RawV1Alpha1()` method typed to avoid type assertions everywhere.
No functional changes.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See #7230
This is a step towards preparing for multi-doc config.
Split the `config.Provider` interface into parts which have different
implementation:
* `config.Config` accesses the config itself, it might be implemented by
`v1alpha1.Config` for example
* `config.Container` will be a set of config documents, which implement
validation, encoding, etc.
`Version()` method dropped, as it makes little sense and it was almost
not used.
`Raw()` method renamed to `RawV1Alpha1()` to support legacy direct
access to `v1alpha1.Config`, next PR will refactor more to make it
return proper type.
There will be many more changes coming up.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>