Commit Graph

9 Commits

Author SHA1 Message Date
Artem Chernyshev
5047a625f7
feat: compute unique token status for each node
Some checks failed
default / default (push) Has been cancelled
default / e2e-backups (push) Has been cancelled
default / e2e-cluster-import (push) Has been cancelled
default / e2e-forced-removal (push) Has been cancelled
default / e2e-omni-upgrade (push) Has been cancelled
default / e2e-scaling (push) Has been cancelled
default / e2e-short (push) Has been cancelled
default / e2e-short-secureboot (push) Has been cancelled
default / e2e-templates (push) Has been cancelled
default / e2e-upgrades (push) Has been cancelled
default / e2e-workload-proxy (push) Has been cancelled
The node token status can be used to check if the machine has the unique
token generated. It also shows the exact token state:

- `NONE` - token is supported, but not yet generated.
- `UNSUPPORTED` - Talos is < 1.6.x, so token won't be generated.
- `EPHEMERAL` - token is generated, but is not persistent, so join token
  rotation and machine reboot will make the node to disconnect.
- `PERSISTENT` - token is generated and is persisted to the `META`
  partition.

If the node unique status is `NONE` the same controller will try to
generate node unique token.

Fixes: https://github.com/siderolabs/omni/issues/1348

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-30 11:57:24 +03:00
Artem Chernyshev
0591d2eeba
feat: implement join token management CLI
Some checks are pending
default / default (push) Waiting to run
default / e2e-backups (push) Blocked by required conditions
default / e2e-cluster-import (push) Blocked by required conditions
default / e2e-forced-removal (push) Blocked by required conditions
default / e2e-omni-upgrade (push) Blocked by required conditions
default / e2e-scaling (push) Blocked by required conditions
default / e2e-short (push) Blocked by required conditions
default / e2e-short-secureboot (push) Blocked by required conditions
default / e2e-templates (push) Blocked by required conditions
default / e2e-upgrades (push) Blocked by required conditions
default / e2e-workload-proxy (push) Blocked by required conditions
The commands added:
```
omnictl jointoken create
omnictl jointoken delete
omnictl jointoken renew
omnictl jointoken revoke
omnictl jointoken unrevoke
omnictl jointoken make-default
```

Fixes: https://github.com/siderolabs/omni/issues/1093

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-28 20:38:24 +03:00
Artem Chernyshev
781a04e186
chore: extract node unique token to the separate resource
Some checks failed
default / default (push) Has been cancelled
default / e2e-backups (push) Has been cancelled
default / e2e-cluster-import (push) Has been cancelled
default / e2e-forced-removal (push) Has been cancelled
default / e2e-omni-upgrade (push) Has been cancelled
default / e2e-scaling (push) Has been cancelled
default / e2e-short (push) Has been cancelled
default / e2e-short-secureboot (push) Has been cancelled
default / e2e-templates (push) Has been cancelled
default / e2e-upgrades (push) Has been cancelled
default / e2e-workload-proxy (push) Has been cancelled
Link is readable by the users. As the token is sensitive it's better to
save it in some resource that is not readable by the `omnictl`.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-25 14:06:09 +03:00
Oguz Kilcan
f8de9a6d96
feat: add support for imported cluster secrets
Some checks failed
default / default (push) Has been cancelled
default / e2e-backups (push) Has been cancelled
default / e2e-cluster-import (push) Has been cancelled
default / e2e-forced-removal (push) Has been cancelled
default / e2e-omni-upgrade (push) Has been cancelled
default / e2e-scaling (push) Has been cancelled
default / e2e-short (push) Has been cancelled
default / e2e-short-secureboot (push) Has been cancelled
default / e2e-templates (push) Has been cancelled
default / e2e-upgrades (push) Has been cancelled
default / e2e-workload-proxy (push) Has been cancelled
Introduce new resource `ImportedClusterSecrets` for importing an existing secrets bundle.
Add new field `imported` to `ClusterSpec` for utilizing resource `ImportedCreatedSecrets`.
Add new field `imported` to `ClusterSecrets` for pointing out source of the secrets bundle.

This is a feature-gated feature to allow using an existing secrets bundle (`talos gen secrets`) while creating a new Cluster. Cluster created with this method are marked as `tainted`. This feature is part of a story to facilitate importing existing talos clusters to omni.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-07-16 12:34:47 +02:00
Artem Chernyshev
ab1f7cc7fa
feat: implement multiple token support and token management
Added new pages for join token management.
The token can be created, revoked and deleted.
Also support tokens with the expiration.

Introduced new resources:
- `JoinToken` - keeps the token, it's ID is the token string.
- `JoinTokenStatus` - is generated by the controller, it calculates the
  information about the current token state: active, revoked or expired.
  And also has the information about if the token is default.
- `DefaultJoinToken` - is the singleton resource that keeps the current
  default token id.

The behavior of siderolink manager was changed to create a default
`JoinToken` resource out of whatever is currently stored in the
`siderolink.Config` resource.

`siderolink.ConnectionParams` is now generated by the controller.
It's using the default token from the `DefaultJoinToken` resource.

Infra providers will get their own unique tokens, but they won't use it
until the library is updated.
So they will still rely on the default join token until updated.

Dropped `siderolink.ConnectionParams` usage in most of the places. This
resource is kept only for the backward compatibility reasons.

Fixes: https://github.com/siderolabs/omni/issues/907

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-15 18:32:06 +03:00
Utku Ozdemir
f1b47f08d9
feat: log and store redacted machine config diffs
Change the RedactedClusterMachineConfig controller to also compute diffs between each config change and store them in a new resource.

Additionally, log this diff and include its creation in the audit logs.

Clean up old diffs with both size (count) and time-based retention.
Include these diffs in the support bundles.

The resource ID follows the following pattern: `<machine-id>-<timestamp>`, e.g., `34bafa44-e994-4911-9c1a-609cccefee93-2025-07-04T19:05:40.181Z`.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-07-10 14:49:09 +02:00
Artem Chernyshev
a7ac63725d
chore: rewrite join config generation
Now the machine join config is always generate when there's a `machine`
resource. It will automatically populate the correct parameters for the
machine API URL, logs and events.
If the machine is managed by an infra provider it will populate it's
request ID too.

The default provider join config is also generated, but it is not used
in the common infra provider library, because it's easier to just
generate the config at the moment it's going to be used.

The code for the siderolink join config generation was unified in all
the places, and is now in `client/pkg/siderolink`.

The new management API introduced for downloading the join config in the
UI `GetMachineJoinConfig`.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-10 13:41:38 +03:00
Artem Chernyshev
122b79605f
test: run Omni as part of integration tests
Some checks are pending
default / default (push) Waiting to run
default / e2e-backups (push) Blocked by required conditions
default / e2e-forced-removal (push) Blocked by required conditions
default / e2e-scaling (push) Blocked by required conditions
default / e2e-short (push) Blocked by required conditions
default / e2e-short-secureboot (push) Blocked by required conditions
default / e2e-templates (push) Blocked by required conditions
default / e2e-upgrades (push) Blocked by required conditions
default / e2e-workload-proxy (push) Blocked by required conditions
This enables test coverage, builds Omni with race detector.

Also redone the COSI state creation flow: no more callbacks.
The state is now an Object, which has `Stop` method, that should be
called when the app stops.
All defers were moved into the `Stop` method basically.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-06-18 16:20:11 +03:00
Artem Chernyshev
c9c4c8e10d
test: use go test to build and run Omni integration tests
Some checks failed
default / default (push) Has been cancelled
default / e2e-backups (push) Has been cancelled
default / e2e-forced-removal (push) Has been cancelled
default / e2e-scaling (push) Has been cancelled
default / e2e-short (push) Has been cancelled
default / e2e-short-secureboot (push) Has been cancelled
default / e2e-templates (push) Has been cancelled
default / e2e-upgrades (push) Has been cancelled
default / e2e-workload-proxy (push) Has been cancelled
All test modules were moved under `integration` tag and are now in
`internal/integration` folder: no more `cmd/integration-test`
executable.

New Kres version is able to build the same executable from the tests
directory instead.

All Omni related flags were renamed, for example `--endpoint` ->
`--omni.endpoint`.

2 more functional changes:

- Enabled `--test.failfast` for all test runs.
- Removed finalizers, which were running if the test has failed.

Both of these changes should make it easier to understand the test
failure: Talos node logs won't be cluttered with the finalizer tearing
down the cluster.

Fixes: https://github.com/siderolabs/omni/issues/1171

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-06-03 15:07:00 +03:00