80 Commits

Author SHA1 Message Date
Utku Ozdemir
26798512e8
chore: bump deps, rekres, Talos 1.12.6, Kubernetes 1.35.3
Bump all dependencies. Update default Talos version to 1.12.6 and default Kubernetes version to 1.35.3.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-03-24 10:33:53 +01:00
Utku Ozdemir
2977f05381
feat: allow empty subdomain for workload proxy
Allow setting the workload proxy subdomain to an empty string when useOmniSubdomain is true. This exposes services directly as subdomains of Omni (e.g., grafana.omni.example.com), which is the simplest possible setup for on-prem deployments needing only a wildcard DNS and cert on the Omni domain.
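
As a rough illustration of the hostname shape described above, here is a minimal sketch. The `serviceHost` helper and the `<alias>.<subdomain>.<omni domain>` layout for the non-empty case are assumptions made for illustration, not Omni's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// serviceHost is a hypothetical helper: it joins the service alias, the
// optional workload proxy subdomain, and the Omni domain into a hostname.
// An empty subdomain places the service directly under the Omni domain.
func serviceHost(alias, subdomain, omniDomain string) string {
	parts := []string{alias}
	if subdomain != "" {
		parts = append(parts, subdomain)
	}
	parts = append(parts, omniDomain)

	return strings.Join(parts, ".")
}

func main() {
	// Empty subdomain: only a wildcard DNS record and cert for *.omni.example.com are needed.
	fmt.Println(serviceHost("grafana", "", "omni.example.com")) // prints "grafana.omni.example.com"

	// Non-empty subdomain ("proxy" is an illustrative value).
	fmt.Println(serviceHost("grafana", "proxy", "omni.example.com")) // prints "grafana.proxy.omni.example.com"
}
```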

Continuation of https://github.com/siderolabs/omni/pull/2538.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-03-19 12:07:38 +01:00
Utku Ozdemir
df236c3da6
fix: remove Election.Resign call to fix data race with Campaign
When multiple runners compete for the same election key, RunElections overwrites the elections map entry. This causes StopElections from the first runner to call stop() on the second runner's etcdElections instance, whose Campaign() may still be writing to the Election's internal fields (leaderKey, leaderSession). Calling Resign() concurrently reads those same fields, causing a data race that flakes in CI under the race detector.

Remove the Resign() call and rely solely on session.Close(), which revokes the lease and deletes all keys attached to it — achieving the same cleanup. Also make campaignErrCh buffered to prevent a goroutine leak when run() exits via session.Done() before Campaign() completes.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-03-17 15:30:15 +01:00
Oguz Kilcan
cf7d752453
feat: enforce configurable machine registration limit
Add `account.maxRegisteredMachines` config option to cap the number of registered machines. The provision handler atomically checks the limit under a mutex before creating new Link resources, returning ResourceExhausted when the cap is reached.

Introduce a Notification resource type (ephemeral namespace) so controllers can surface warnings to users. `omnictl` displays all active notifications on every command invocation. Frontend part of showing notifications will be implemented in a different PR.

MachineStatusMetricsController creates a warning notification when the registration limit is reached and tears it down when it's not.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-03-16 12:48:47 +01:00
Edward Sammut Alessi
433fe435db
chore: bump default talos version
Bump default talos version to 1.12.5

Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>
2026-03-10 15:25:18 +01:00
Oguz Kilcan
e3df911d48
feat: enforce configurable limits on user and service account creation
Add state validation that rejects identity creation when the configured maximum number of users or service accounts is reached. The gRPC resource and management servers now use the validated state so these limits are enforced for all creation paths (CLI, UI, API). Identity is created before the user resource so the validation fires before any side effects.

Also adds create validation for the join token name, e2e Playwright tests covering the UI, and an AccountLimits integration test covering the API and CLI for limit enforcement.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-02-26 13:47:52 +01:00
Oguz Kilcan
1abd7ce6e9
chore: bump default talos version
Bump default talos version to 1.12.4

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-02-19 14:28:32 +01:00
Utku Ozdemir
8f5d64f86f
test: add embedded etcd smoke test to helm e2e
Add a two-phase approach to the helm e2e test: first install Omni with
embedded etcd and run a smoke test (omnictl get defaultjointoken),
then uninstall and reinstall with external etcd for the full
integration suite.

Other changes:
- Extract reusable extract_sa_key function
- Split helm values into base + external etcd overlay to remove duplication
- Move helm test values to hack/test/helm/templates/ and drop .envsubst suffix
- Fix empty string arg bug in configure_registry_mirrors (remove dead else branch)

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-02-17 15:24:00 +01:00
Utku Ozdemir
ccc197b258
refactor: replace the old helm chart with the new one
Since we don't want to support/maintain the old chart anymore, we simply replace it with the new chart.

Added a validation which fails on upgrades from the old one to the new one.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-02-17 12:29:38 +01:00
Utku Ozdemir
52f249dbcc
feat: make more things configurable in the helm chart
Add support for priorityClassName, terminationGracePeriodSeconds, dnsPolicy/dnsConfig, initContainers, extraContainers (sidecars), and custom labels on all services.

Also, fix some unit tests and add additional unit tests.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-02-16 13:59:08 +01:00
Utku Ozdemir
fbf36740f2
test: add unit and e2e tests to the helm chart
Add helm unit tests (via helm-unittest) covering services, ingresses, HTTPRoutes, secrets, PrometheusRules and ServiceAccounts. Add a helm-based e2e test workflow that deploys Omni on a Talos cluster with Traefik and etcd, runs integration tests including workload proxy, and verifies the full stack end-to-end. Add a configurable TestOptions struct to the workload proxy test to allow running with smaller scale in helm e2e.

Signed-off-by: Kevin Tijssen <kevin.tijssen@siderolabs.com>
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-02-16 13:58:56 +01:00
Utku Ozdemir
0c2c5c1cc3
test: use envsubst in tests and do small improvements
Now that we have envsubst in the build container, we can simplify our scripts a bit.

Also do other cosmetic improvements in the test scripts.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-02-16 11:04:25 +01:00
Oguz Kilcan
bd86ff3127
chore: remove deprecated migration flags, config fields, and migration code
The deprecated flags and config fields kept for the SQLite migration period (v1.4.0) have been removed along with all automatic migration code for BoltDB secondary storage, file-based audit logs, file-based discovery service snapshots, and circular buffer machine logs.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-02-16 10:38:53 +01:00
Utku Ozdemir
d1c869a9d8
chore: bump deps, rekres
Bump all dependencies.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-02-12 20:43:45 +01:00
Utku Ozdemir
1e24fd222d
feat: implement helm chart v2
**Helm Chart v2:**
- Add new Helm chart with comprehensive configuration via values.yaml
- Support for both Kubernetes Ingress and Gateway API
- Built-in validation for required fields and URL consistency
- Prometheus metrics and ServiceMonitor support
- Detailed documentation with examples for Traefik
- Workload proxy setup guide

**Deploy directory reorganization:**
- Move Docker Compose files to `deploy/compose/`
- Move existing Helm chart to `deploy/helm/omni/`
- Add top-level `deploy/README.md` pointing to deployment options
- Add deprecation warning to v1 Helm chart

**Documentation:**
- Add link to Helm chart in root README

Co-authored-by: Kevin Tijssen <kevin.tijssen@siderolabs.com>
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2026-01-30 14:09:27 +01:00
Oguz Kilcan
4978834232
test: fix failing workload proxy tests
Fix failing workload proxy tests

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-01-28 13:32:27 +01:00
Edward Sammut Alessi
0f8a3d6c6f
test(e2e): add an e2e test for exposed services
Add an E2E test which adds an nginx service through an inlineManifests config patch on the control plane, and checks that it is accessible.

Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>
2026-01-23 18:57:19 +01:00
Edward Sammut Alessi
d3ae77c0cc
chore: bump copyright to 2026
Bump copyright for conformance to 2026

Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>
2026-01-21 15:30:49 +01:00
Oguz Kilcan
f56551abc3
chore: move some tests from e2e upgrades e2e test to misc upgrades test
Move some tests from the e2e-upgrades test to e2e-misc-upgrades to shorten the overall run time, because the e2e-upgrades test was taking too long.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-01-17 14:50:36 +01:00
Oguz Kilcan
85d099489f
chore: separate integration-tests
Separate integration/e2e tests for qemu and talemu, so we can run them in parallel.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-01-14 15:45:59 +01:00
Oguz Kilcan
ef2d931aac
chore: rekres and bump deps
* Rekres
* Bump deps
* Update default versions for talos and kubernetes

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2026-01-09 11:34:03 +01:00
Utku Ozdemir
9bf690ef2e
refactor: do SQLite migrations unconditionally, rework the config flags
Remove the flags for turning on SQLite storage for:
- Discovery service state
- Audit logs
- Machine logs

Instead, migrate them unconditionally to SQLite on the next startup.

Remove many flags which are no longer meaningful. Only keep the ones which are required for the migrations.

Additionally: Make the `--sqlite-storage-path` (or its config counterpart `.storage.sqlite.path`) required with no default value, as a default value does not make sense for it in most cases.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-12-12 12:47:04 +01:00
Oguz Kilcan
bc2a5a9986
chore: prepare omni with talos v1.12.0-beta.1
Prepare omni for upcoming talos version 1.12.0-beta.1.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-12-06 16:55:35 +01:00
Utku Ozdemir
52360252e6
fix: do not clear schematic meta values for non-UKI machines
META section updates are a no-op for non-UKI machines, but the recent changes in the kernel args PR started clearing them anyway (since we now always compute the schematic ID). This changed the schematic ID, which caused cluster machines to be upgraded and restarted.

Remove the UKI check and always keep META values as-is.

Update the integration tests to:
- Also include META values.
- Make Omni upgrade test pick both UKI and non-UKI machines.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-11-20 12:41:18 +01:00
Utku Ozdemir
db97e09291
chore: bump Kubernetes version to 1.34.2
Updated the default Kubernetes version to 1.34.2 and adjusted related
version constants in the integration script and Go files.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-11-14 16:57:00 +01:00
Utku Ozdemir
7468e6ea02
chore: rekres, make linters happy, bump Go, deps and Talos versions
Bump Go to 1.25.4, default Talos version to 1.11.5.
Bump all Go dependencies.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-11-10 23:51:22 +01:00
Andrey Smirnov
75a9f3ee9f
feat: use sqlite as secondary resource storage
This pulls in https://github.com/cosi-project/state-sqlite/pull/2

Fixes https://github.com/siderolabs/omni/issues/1770

See https://github.com/siderolabs/omni/issues/1768

Sample migration logs:

```
2025-11-05T11:18:47.340Z        INFO       omni/state_sqlite.go:122        migrated resources from BoltDB to SQLite        {"namespace": "metrics", "type": "EtcdBackupOverallStatuses.omni.sidero.dev", "count": 1}
2025-11-05T11:18:47.340Z        INFO       omni/state_sqlite.go:122        migrated resources from BoltDB to SQLite        {"namespace": "metrics", "type": "EtcdBackupStatuses.omni.sidero.dev", "count": 0}
2025-11-05T11:18:47.342Z        INFO       omni/state_sqlite.go:122        migrated resources from BoltDB to SQLite        {"namespace": "metrics", "type": "MachineStatusLinks.omni.sidero.dev", "count": 2}
2025-11-05T11:18:47.342Z        INFO       omni/state_sqlite.go:67 removed old BoltDB database after migration     {"path": "_out/secondary-storage/bolt.db"}
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-11-05 15:40:24 +04:00
Utku Ozdemir
15deddde56
feat: implement extra kernel args support
(Re)implement the kernel args support functionality in the following way:
- Only support UKI or UKI-like (>=1.12 with GrubUseUKICmdline) systems.
- In `MachineStatusController`:
  - When we see a machine for the first time, do a one-time operation of extracting the extra kernel args from it and store them in the newly introduced `KernelArgs` resource. This resource is user-owned from that point on.
  - Mark the `MachineStatus` with an annotation as "its kernel args are initialized".
  - Start storing the raw schematic.
  - Take a one-time snapshot of the extensions on the machine and set them as "initial extensions". They might not be the "actual initial", i.e., the set of extensions when we actually saw the machine for the first time, but we do this on a best-effort basis. We need this, since now we cannot simply go back to the initial schematic ID when all extensions are removed - kernel args are also included in the schematic.
  - Start collecting the kernel cmdline from Talos machines as well.
- Adapt the `SchematicConfiguration` controller to not revert to the initial schematic ID ever - it now always computes the needed schematic - when it wants to revert to the initial set of extensions, it uses the new field on the `MachineStatus`.
- Introduce the resource `MachineUpgradeStatus` and its controller `MachineUpgradeStatusController`, which handles the maintenance mode upgrades when kernel args are updated. The controller is named this way, since our long-term plan is to centralize all upgrade calls to be done from this controller. Currently, it does not change Talos version or the set of extensions. It works only in maintenance mode, only for kernel args changes (when supported).
- Introduce the resource `KernelArgsStatus` and its controller `KernelArgsStatusController`, which provides information about the kernel args updates. Its status is reliable in both maintenance and non-maintenance modes.
- Build a UI to update these args (with @Unix4ever's help).

Co-authored-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-10-28 14:44:48 +01:00
Utku Ozdemir
02425267fe
test: improve integration tests
- Make sure the console output of QEMU is sent to `console=ttyS0` when non-UKI is used.
- Use the new `cluster create` arg `--skip-injecting-extra-cmdline` to make sure `console=ttyS0` kernel arg is not duplicated.
- Get rid of `SUDO_USER` var.
- Add the missing `--omni.output-dir` flag to make sure the support bundles are collected to proper destinations.
- Gather all artifacts to be collected under `TEST_OUTPUTS_DIR` for better organization in the test artifacts archive.
- Quote some strings.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-10-28 09:53:03 +01:00
Artem Chernyshev
b5765d8d1c
test: use bridge IP for WireGuard in CI
It was using the local pod IP, which generated a new schematic every time
the test ran.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-10-20 12:22:58 +03:00
Utku Ozdemir
d0c8b1666b
chore: bump Talos to 1.11.3, reorder CI workflow jobs
Make unit tests and lint run before the integration tests.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-10-17 13:30:35 +02:00
Utku Ozdemir
049ab877e9
chore: revert 'feat: add support for updating kernel args'
This reverts commit ae9d7cca4b3ef2c5923cc6476042a575d4158eee.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-10-14 18:40:58 +02:00
Oguz Kilcan
0d58ade7bf
feat: implement cluster import
Allow importing an existing Talos cluster into Omni using `omnictl cluster import`.

Closes: #1315

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-10-14 17:01:07 +02:00
Utku Ozdemir
c88503dcba
chore: bump default Talos version, deps, rekres, re-generate
Bump everything to appropriate versions. Remove some unused imports.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-10-13 11:23:45 +02:00
Utku Ozdemir
ae9d7cca4b
feat: add support for updating kernel args
Allow updating kernel args, similar to the set of extensions.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-10-08 12:08:53 +02:00
Oguz Kilcan
1f098cfafe
test: improve test cluster creation for e2e tests
* Improve test cluster creation for e2e tests
* Remove partial config apply after vm wipe because it's no longer necessary

Co-authored-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-09-18 11:27:28 +02:00
Oguz Kilcan
21cd39155c
chore: rekres and fix e2e test runs
Rekres to bring back the "retrieve PR labels" step in the default job, and use a location relative to the GH workspace instead of `/tmp` for local storage.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-09-17 17:17:11 +02:00
Edward Sammut Alessi
5ab4fe4156
chore: migrate omni e2e tests to javascript
Migrate the omni E2E tests to javascript inside the frontend space

Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>
2025-09-16 19:08:01 +02:00
Oguz Kilcan
1b4de5b798
feat: abort ongoing cluster import process
Added a new omnictl command for aborting the cluster import process and removing the created resources (e.g. `Cluster`, `MachineSet`s, `MachineSetNode`s) without resetting the machines.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-09-15 15:06:31 +02:00
Oguz Kilcan
9b5e552353
chore: rekres and bump deps
* Rekres
* Bump deps
* Update default versions for talos and kubernetes

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-09-15 11:31:14 +02:00
Noel Georgi
2d30614cc7
chore(ci): rekres to use action runner groups
Rekres to use action runner groups.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2025-09-11 17:39:00 +04:00
Orzelius
906df9a6a4
chore: remove the usage of --input-dir flag in tests
The flag is being removed in Talos.

Signed-off-by: Orzelius <33936483+Orzelius@users.noreply.github.com>
2025-09-10 10:04:22 +09:00
Utku Ozdemir
c9657cbb62
feat: add jitter and lazy health checks to exposed services
- Add jitter to the exposed service health checks, so they spread evenly even when the services are all reconciled at the exact same time.
- Add the "lazy" logic to the current workload proxy health checks by wrapping the "regular" LB with a lazy LB wrapper. With this, we gain:
  - Health checks are started only when an exposed service is attempted to be accessed ("dialed").
  - They are stopped after 5 minutes of inactivity.

Depends on siderolabs/go-loadbalancer#24.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2025-08-25 23:23:28 +02:00
Oguz Kilcan
928b7d8948
test: fix omni upgrade e2e test
Fix omni upgrade e2e test

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-08-08 14:11:04 +02:00
Oguz Kilcan
e740c8b7c2
test: fix registry mirror config format in integration tests
We were rendering registry mirrors incorrectly. There was an extra character and line breaks weren't taken into account.

Also upgraded github.com/go-chi/chi/v5 to latest version to make govulncheck happy.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-07-29 11:16:13 +02:00
Artem Chernyshev
0fc13bbf04
test: run Omni upgrade tests against latest stable
Fixes: https://github.com/siderolabs/omni/issues/683

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-07-28 14:24:12 +03:00
Oguz Kilcan
f8de9a6d96
feat: add support for imported cluster secrets
Introduce new resource `ImportedClusterSecrets` for importing an existing secrets bundle.
Add new field `imported` to `ClusterSpec` for utilizing resource `ImportedClusterSecrets`.
Add new field `imported` to `ClusterSecrets` for pointing out source of the secrets bundle.

This feature is gated behind a feature flag and allows using an existing secrets bundle (`talosctl gen secrets`) while creating a new Cluster. Clusters created with this method are marked as `tainted`. This feature is part of a story to facilitate importing existing Talos clusters into Omni.

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
2025-07-16 12:34:47 +02:00
Artem Chernyshev
a7fe525ce1
test: test updating from old Omni version to the current
CI will run integration tests from the previous release first, then run
them from the current commit using another set of tests.

Fixes: https://github.com/siderolabs/omni/issues/1132

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-06-24 19:24:19 +03:00
Artem Chernyshev
122b79605f
test: run Omni as part of integration tests
This enables test coverage, builds Omni with race detector.

Also reworked the COSI state creation flow: no more callbacks.
The state is now an object with a `Stop` method that should be
called when the app stops.
Essentially, all defers were moved into the `Stop` method.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-06-18 16:20:11 +03:00
Artem Chernyshev
ccd55cc8fb
feat: rewrite Omni config management
Omni can now be configured via a config file instead of the command line
flags.
The `--config-path` flag will now read the config provided in YAML
format.
The config structure was completely changed. It was not public before,
so it's fine to ignore backward compatibility.
The command line flags were not changed.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2025-06-09 14:44:29 +03:00