Many changes to the nftables backend which will be used in the follow-up
PR with #4421.
1. Add support for chain policy: drop/accept.
2. Properly handle match on all IPs in the set (`0.0.0.0/0` like).
3. Implement conntrack state matching.
4. Implement multiple ifname matching in a single rule.
5. Implement anonymous counters.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Implement initial set of backend controllers/resources to handle
nftables chains/rules etc.
Replace the KubeSpan nftables operations with controller-based.
See #4421
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This commit deprecates those things:
- Removes the support of `.persist` flag. From now, it should always be enabled or not defined in the config.
- Removes the documentation for `.bootloader`. It never worked anyway.
- Adds a warning for `.machine.install.extensions`, suggests to use boot-assets.
Closes#7972Closes#7507
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This PR does those things:
- It allows API calls `MetaWrite` and `MetaRead` in maintenance mode.
- SystemInformation resource now waits for available META
- SystemInformation resource now overwrites UUID from META if there is an override
- META now supports "UUID override" and "unique token" keys
- ProvisionRequest now includes unique token and Talos version
For #7694
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Add security state resource that describes the state of Talos SecureBoot
and PCR signing key fingerprints.
The UKI fingerprint is currently not populated.
Fixes: #7514
Signed-off-by: Noel Georgi <git@frezbo.dev>
Fixes#6391
Implement a set of APIs and commands to manage images in the CRI, and
pre-pull images on Kubernetes upgrades.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#7430
Introduce a set of resources which look similar to other API
implementations: CA, certs, cert SANs, etc.
Introduce a controller which manages the service based on resource
state.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#7379
Add possibility to configure the controlplane static pod resources via
APIServer, ControllerManager and Scheduler configs.
Signed-off-by: LukasAuerbeck <17929465+LukasAuerbeck@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos now supports new type of encryption keys which rely on Sealing/Unsealing randomly generated bytes with a KMS server:
```
systemDiskEncryption:
ephemeral:
keys:
- kms:
endpoint: https://1.2.3.4:443
slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Allow specifying the reboot mode during upgrades by introducing `--reboot-mode` flag, similar to the `--mode` flag of the reboot command.
Closessiderolabs/talos#7302.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This commit adds support for API load balancer. Quick way to enable it is during cluster creation using new `api-server-balancer-port` flag (0 by default - disabled). When enabled all API request will be routed across
cluster control plane endpoints.
Closes#7191
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fixes#7233
Waiting for node readiness now happens in the `MachineStatus` controller
which won't mark the node as ready until Kubernetes `Node` is ready.
Handling cordoning/uncordining happens with help of additional resource
in `NodeApplyController`.
New controller provides reactive `NodeStatus` resource to see current
status of Kubernetes `Node`.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See #7233
The controlplane label is simply injected into existing controller-based
node label flow.
For controlplane taint default NoScheduleTaint, additional controller &
resource was implemented to handle node taints.
This also fixes a problem with `allowSchedulingOnControlPlanes` not
being reactive to config changes - now it is.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
I ended up completely rewriting the controller, simplifying the flow
(somewhat) so that there's just a single control flow in the controller,
while reading from v1alpha1 events is converted to reading from a
channel.
Fixes#7227
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes#7226
This follows same flow as other similar changes - split out logging
configuration as a separate resource, source it for now in the cmdline.
Rewrite the controller to allow multiple log outputs, add send retries.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR adds support for creating a list of API endpoints (each is pair of host and port).
It gets them from
- Machine config cluster endpoint.
- Localhost with LocalAPIServerPort if machine is control panel.
- netip.Addr[0] and port from affiliates if they are control panels.
For #7191
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Use `udevd` rules to create stable interface names.
Link controllers should wait for `udevd` to settle down, otherwise link
rename will fail (interface should not be UP).
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This is controlled with a feature flag which gets enabled automatically
for Talos 1.5+.
Fixes#7181
If enabled, configures kubelet to use project quotas to track xfs volume
usage, which is much more efficient than doing `du` periodically.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
* drop old resources API, which was deprecated long time ago
* use bootstrapped event in `talosctl get --watch` to better align
columns in the table output
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Introduce a new resource, `SiderolinkConfig`, to store SideroLink connection configuration (api endpoint for now).
Introduce a controller for this resource which populates it from the Kernel cmdline.
Rework the SideroLink `ManagerController` to take this new resource as input and reconfigure the link on changes.
Additionally, if the siderolink connection is lost, reconnect to it and reconfigure the links/addresses.
Closessiderolabs/talos#7142, siderolabs/talos#7143.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Fixes#7159
The change looks big, but it's actually pretty simple inside: the static
pods had an annotation which tracks a version of the secrets which
forced control plane pods to reload on a change. At the same time
`kube-apiserver` can reload certificate inputs automatically from files
without restart.
So the inputs were split: the dynamic (for kube-apiserver) inputs don't
need to be reloaded, so its version is not tracked in static pod
annotation, so they don't cause a reload. The previous non-dynamic
resource still causes a reload, but it doesn't get updated when e.g.
node addresses change.
There might be many more refactoring done, the resource chain is a bit
of a mess there, but I wanted to keep number of changes minimal to keep
this backportable.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes: https://github.com/siderolabs/talos/issues/7017
Should allow external services to detect which user block devices might
need to be wiped during reset.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Network probes are configured with the specs, and provide their output
as a status.
At the moment only platform code can configure network probes.
If any network probes are configured, they affect network.Status
'Connectivity' flag.
Example, create the probe:
```
talosctl -n 172.20.0.3 meta write 0xa '{"probes": [{"interval": "1s", "tcp": {"endpoint": "google.com:80", "timeout": "10s"}}]}'
```
Watch probe status:
```
$ talosctl -n 172.20.0.3 get probe
NODE NAMESPACE TYPE ID VERSION SUCCESS
172.20.0.3 network ProbeStatus tcp:google.com:80 5 true
```
With failing probes:
```
$ talosctl -n 172.20.0.3 get probe
NODE NAMESPACE TYPE ID VERSION SUCCESS
172.20.0.3 network ProbeStatus tcp:google.com:80 4 true
172.20.0.3 network ProbeStatus tcp:google.com:81 1 false
$ talosctl -n 172.20.0.3 get networkstatus
NODE NAMESPACE TYPE ID VERSION ADDRESS CONNECTIVITY HOSTNAME ETC
172.20.0.3 network NetworkStatus status 5 true true true true
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
talosctl netstat -k show all host and non-hostnetwork pods sockets/connections.
talosctl netstat namespace/pod shows sockets/connections of a specific pod +
autocompletes in the shell.
Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This adds support for automatically registering node hostnames in DNS by
sending the current hostname to DHCP via option 12. If the current hostname is
updated, issue a new DISCOVER to propagate the update to DHCP (updating the
hostname on lease renewals is not universally supported by DHCP servers). This
addition maintains the previous functionality where the node can also request
its hostname from the DHCP server. The received hostname will be processed and
prioritized as usual by the `network.HostnameSpecController`.
This change set also contains fixes to make DHCP renewals compliant with RFC
2131, specifically avoiding sending the server identifier and requested IP
address when issuing renewals using a previous offer. This also uncovered
issues and missing features in the upstream `insomniacslk/dhcp` library, the
fixes and improvements for which are now finally merged.
Sending hostname updates have been tested against `dnsmasq` and the built-in
DHCP + DNS services in Windows Server. Hostname retrieval from DHCP and edge
cases with overridden hostnames from different configuration layers have been
extensively tested against `dnsmasq`.
Signed-off-by: Dennis Marttinen <twelho@welho.tech>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This allows to put keys to META partition.
META contents can be viewed with `talosctl get metakeys`.
There is not real usecase for it yet, but the next PRs will introduce
two special keys which can be written:
* platform network config for `metal`
* `${code}` variable
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Implement the new summary dashboard with node info and logs.
Replace the previous metrics dashboard with the new dashboard which has multiple screens for node summary, metrics and editing network config.
Port the old metrics dashboard to the tview library and assign it to be a screen in the new dashboard, accessible by F2 key.
Add a new resource, infos.cluster.talos.dev which contains the cluster name and id of a node.
Disable the network config editor screen in the new dashboard until it is fully implemented with its backend.
Closessiderolabs/talos#4790.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Use a global instance, handle loading/saving META in global context.
Deprecate legacy syslinux ADV, provide an easier interface for
consumers.
Expose META as resources.
Fix the bootloader revert process (it was completely broken for quite a
while :sad:).
This is a first step which mostly does preparation work, real changes
will come in the next PRs:
* add APIs to write to META
* consume META keys for platform network config for `metal`
* custom key for URL `${code}`
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes: https://github.com/siderolabs/talos/issues/6815
Additionally, make it possible to run reset in maintenance mode: to
enable a way for resetting system disk and remove all traces of Talos
from it.
The new reset flow works in a separate sequence, changed disk probe
lookup to check the boot partition instead of the ephemeral one.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
`structprotogen` now supports generating enums directly instead of using predeclared file and hardcoded types. To use this functionality, simply put `structprotogen:gen_enum` in the comment above const block, you want to have the proto definitions for.
Closes#6215
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This allows to safely recover out of space quota issues, and perform
degragmentation as needed.
`talosctl etcd status` command provides lots of information about the
cluster health.
See docs for more details.
Fixes#4889
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This feature allows us to use only IPv4 or IPv6 stack to reach the peers.
Also, it can help to not share the node-specific IPs,
which cannot be accessible at all.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
We add the `nodeLabels` key to the machine config to allow users to add
node labels to the kubernetes Node object. A controller
reads the nodeLabels from the machine config and applies them via the
kubernetes API.
Older versions of talosctl will throw an unknown keys error if `edit mc`
is called on a node with this change.
Fixes#6301
Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
We add a controller that provides the etcd member id as a resource
and change the etcd related commands to support member ids next to
hostnames.
Fixes: #6223
Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>