7 Commits

Author SHA1 Message Date
Oguz Kilcan
d27624abc6
chore: rekres and bump go to 1.25.2
Rekres, fix linter issues, bump go to 1.25.2
See groups.google.com/g/golang-nuts/c/Gxn25BP4MXk/m/3KrM-XBOBAAJ

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-10-08 13:22:55 +02:00
Artem Chernyshev
5047a625f7
feat: compute unique token status for each node
The node token status can be used to check if the machine has the unique
token generated. It also shows the exact token state:

- `NONE` - token is supported, but not yet generated.
- `UNSUPPORTED` - Talos is < 1.6.x, so token won't be generated.
- `EPHEMERAL` - token is generated, but is not persistent, so join token
  rotation and machine reboot will make the node disconnect.
- `PERSISTENT` - token is generated and is persisted to the `META`
  partition.

If the node unique token status is `NONE`, the same controller will try
to generate the node unique token.
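
As a minimal sketch of how the four states above could be derived, assuming
a hypothetical `nodeInfo` summary (the type, its fields and the function
name are illustrative and are not taken from the Omni code base):

```go
package main

import "fmt"

// TokenStatus enumerates the four states listed in the commit message.
type TokenStatus string

const (
	StatusNone        TokenStatus = "NONE"
	StatusUnsupported TokenStatus = "UNSUPPORTED"
	StatusEphemeral   TokenStatus = "EPHEMERAL"
	StatusPersistent  TokenStatus = "PERSISTENT"
)

// nodeInfo is a hypothetical summary of what a controller might know
// about a node. It is an assumption made for this sketch.
type nodeInfo struct {
	talosMajor, talosMinor int  // Talos version reported by the node
	tokenGenerated         bool // a unique token exists for the node
	persistedToMeta        bool // the token was written to the META partition
}

// computeTokenStatus maps the node summary to one of the four states.
func computeTokenStatus(n nodeInfo) TokenStatus {
	switch {
	case n.talosMajor < 1 || (n.talosMajor == 1 && n.talosMinor < 6):
		// Talos older than 1.6 never gets a unique token.
		return StatusUnsupported
	case !n.tokenGenerated:
		// Supported, but nothing generated yet: the caller may generate one.
		return StatusNone
	case n.persistedToMeta:
		return StatusPersistent
	default:
		return StatusEphemeral
	}
}

func main() {
	n := nodeInfo{talosMajor: 1, talosMinor: 10, tokenGenerated: true}
	fmt.Println(computeTokenStatus(n))
}
```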

Fixes: https://github.com/siderolabs/omni/issues/1348

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-30 11:57:24 +03:00
Artem Chernyshev
f458211621
fix: allow encoding join tokens using v1 version
This can help to support compatibility for the newer providers vs older
Omni versions.

Also support the legacy flow in the infra providers common library: if the
provider join config doesn't exist, fall back to using `ConnectionParams`
along with join token version 1.
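
The fallback can be sketched roughly as follows. The `joinSource` type, the
`resolveJoinSource` function and the use of empty strings to mean "resource
does not exist" are assumptions made for illustration; only the rule itself
(no provider join config means `ConnectionParams` plus token version 1)
comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// joinSource is a hypothetical result type describing where the join
// settings came from.
type joinSource struct {
	payload      string // contents of the chosen resource
	legacy       bool   // true when the ConnectionParams fallback was used
	tokenVersion int    // only set on the legacy path (version 1)
}

// resolveJoinSource prefers the provider join config and falls back to
// ConnectionParams with join token version 1 when it is missing.
// An empty string stands in for "resource does not exist" in this sketch.
func resolveJoinSource(providerJoinConfig, connectionParams string) (joinSource, error) {
	if providerJoinConfig != "" {
		return joinSource{payload: providerJoinConfig}, nil
	}

	if connectionParams == "" {
		return joinSource{}, errors.New("neither provider join config nor connection params exist")
	}

	return joinSource{payload: connectionParams, legacy: true, tokenVersion: 1}, nil
}

func main() {
	src, err := resolveJoinSource("", "legacy-connection-params")
	if err != nil {
		panic(err)
	}

	fmt.Println(src.legacy, src.tokenVersion)
}
```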

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-22 21:14:20 +03:00
Artem Chernyshev
ab1f7cc7fa
feat: implement multiple token support and token management
Added new pages for join token management.
The token can be created, revoked and deleted.
Tokens with an expiration are also supported.

Introduced new resources:
- `JoinToken` - keeps the token; its ID is the token string.
- `JoinTokenStatus` - is generated by the controller, which calculates the
  current token state: active, revoked or expired. It also records
  whether the token is the default one.
- `DefaultJoinToken` - is the singleton resource that keeps the current
  default token ID.

The behavior of siderolink manager was changed to create a default
`JoinToken` resource out of whatever is currently stored in the
`siderolink.Config` resource.

`siderolink.ConnectionParams` is now generated by the controller.
It's using the default token from the `DefaultJoinToken` resource.

Infra providers will get their own unique tokens, but they won't use them
until the library is updated.
So they will still rely on the default join token until updated.

Dropped `siderolink.ConnectionParams` usage in most places. This
resource is kept only for backward compatibility reasons.

Fixes: https://github.com/siderolabs/omni/issues/907

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-15 18:32:06 +03:00
Artem Chernyshev
a7ac63725d
chore: rewrite join config generation
Now the machine join config is always generated when there's a `machine`
resource. It will automatically populate the correct parameters for the
machine API URL, logs and events.
If the machine is managed by an infra provider, it will populate its
request ID too.

The default provider join config is also generated, but it is not used
in the common infra provider library, because it's easier to just
generate the config at the moment it's going to be used.

The code for the siderolink join config generation was unified in all
the places, and is now in `client/pkg/siderolink`.

A new management API, `GetMachineJoinConfig`, was introduced for
downloading the join config in the UI.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-10 13:41:38 +03:00
Artem Chernyshev
b264a412c2
fix: properly support the PXE and ISO machines in the secure tokens flow
The unique token flow was reworked to support machines running from PXE
and ISO.

As they do not support META persistence, Omni doesn't enforce secure
tokens for them.
But to distinguish machines and make the UUID conflict resolution work,
Omni now calculates the node fingerprints out of the MAC addresses of
all physical interfaces on the node.

So now each unique token consists of two parts:

- fingerprint.
- a random string.
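
One way to picture the two-part token is the sketch below. The commit does
not say how the fingerprint is actually computed, so the normalization,
sorting and SHA-256 hashing here are assumptions chosen only to make the
result independent of interface order; the type and function names are
hypothetical as well.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// uniqueToken models the two parts named above.
type uniqueToken struct {
	Fingerprint string
	Random      string
}

// fingerprint hashes the MAC addresses of the physical interfaces.
// Lowercasing and sorting make the result stable regardless of the
// order in which the interfaces are enumerated (an assumption).
func fingerprint(macs []string) string {
	normalized := make([]string, 0, len(macs))
	for _, m := range macs {
		normalized = append(normalized, strings.ToLower(strings.TrimSpace(m)))
	}

	sort.Strings(normalized)

	sum := sha256.Sum256([]byte(strings.Join(normalized, ",")))

	return hex.EncodeToString(sum[:])
}

// newUniqueToken combines the fingerprint with 16 random bytes.
func newUniqueToken(macs []string) (uniqueToken, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return uniqueToken{}, err
	}

	return uniqueToken{Fingerprint: fingerprint(macs), Random: hex.EncodeToString(buf)}, nil
}

func main() {
	tok, err := newUniqueToken([]string{"aa:bb:cc:00:11:22", "AA:BB:CC:00:11:23"})
	if err != nil {
		panic(err)
	}

	fmt.Println(tok.Fingerprint, tok.Random)
}
```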

Omni detects Talos installation on the machine in the following way:

- check if the pending machine status exists and it detected the system
  disk.
- override the previous check if the existing link is labeled as having
  Talos installed.
- lastly, if the `MachineStatus` exists, override all checks with the
  installed label from it (this covers the bare-metal provider workflow,
  where a machine goes from installed to not installed and is PXE booted).

Then when a machine joins Omni with some token, Omni checks if the
random part is equal. If it is equal, the machine is immediately
accepted.

If the random part is different and the fingerprint matches:
- if Talos is installed - reject the machine and log a warning.
- if Talos is not installed - replace the existing link with the new one
  (only if the request has a valid join token).

Then, if nothing matches, the UUID conflict resolution kicks in.
The provisioner creates a `PendingMachine` which is marked with the UUID
conflict label, and the `PendingMachineStatus` controller generates a
random UUID for the node.
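
The join decision described above can be summarized in a small sketch. The
types, names and the choice to reject a replacement request that lacks a
valid join token are assumptions made for illustration, not the actual
provisioner code.

```go
package main

import "fmt"

// uniqueToken models the two-part token: fingerprint plus random string.
type uniqueToken struct {
	Fingerprint string
	Random      string
}

// decision is the outcome of comparing an incoming token with the one
// stored for the existing link.
type decision string

const (
	accept       decision = "accept"
	reject       decision = "reject"
	replaceLink  decision = "replace-link"
	uuidConflict decision = "uuid-conflict"
)

// decide follows the order given in the commit message.
func decide(existing, incoming uniqueToken, talosInstalled, validJoinToken bool) decision {
	if incoming.Random == existing.Random {
		return accept
	}

	if incoming.Fingerprint == existing.Fingerprint {
		if talosInstalled {
			return reject // the caller is expected to log a warning
		}

		if validJoinToken {
			return replaceLink
		}

		// Assumption: without a valid join token the link is left alone.
		return reject
	}

	return uuidConflict
}

func main() {
	existing := uniqueToken{Fingerprint: "fp-1", Random: "r-1"}

	fmt.Println(decide(existing, uniqueToken{Fingerprint: "fp-1", Random: "r-1"}, true, true))
	fmt.Println(decide(existing, uniqueToken{Fingerprint: "fp-1", Random: "r-2"}, false, true))
	fmt.Println(decide(existing, uniqueToken{Fingerprint: "fp-2", Random: "r-2"}, false, true))
}
```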

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-03-03 19:54:58 +03:00
Artem Chernyshev
03604222ea
feat: support passing extra data through the siderolink join token
This extra data is used in the infra provider to add the annotation to the
`siderolink.Link` as early as possible.
Then the `Machine` controller is changed to skip the `Links` that have
annotation `omni.sidero.dev/infra-provider` and do not have the label
`omni.sidero.dev/machine-request`.
This change makes the system ignore inconsistent `Links` until they are
fully populated.
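
The skip rule can be illustrated with a minimal sketch. The `link` type is
a simplified stand-in for `siderolink.Link` metadata; only the annotation
and label keys are taken from the description above.

```go
package main

import "fmt"

const (
	infraProviderAnnotation = "omni.sidero.dev/infra-provider"
	machineRequestLabel     = "omni.sidero.dev/machine-request"
)

// link is a simplified, hypothetical view of a Link's metadata.
type link struct {
	annotations map[string]string
	labels      map[string]string
}

// shouldSkip reports whether a link was created by an infra provider
// but is not yet tied to a machine request, so it should be ignored.
func shouldSkip(l link) bool {
	_, fromProvider := l.annotations[infraProviderAnnotation]
	_, hasRequest := l.labels[machineRequestLabel]

	return fromProvider && !hasRequest
}

func main() {
	partial := link{annotations: map[string]string{infraProviderAnnotation: "example-provider"}}

	fmt.Println(shouldSkip(partial))
}
```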

Also changed the infra provider interface to take siderolink connection
params as string instead of the resource.

Fixes: https://github.com/siderolabs/omni/issues/603

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2024-09-05 14:57:51 +03:00