mirrors/omni - omni - gitea@git.xfx1.de

mirror of https://github.com/siderolabs/omni.git synced 2026-05-05 06:36:12 +02:00

Author	SHA1	Message	Date
Utku Ozdemir	2fe716d2c9	chore: enable go linting for build tags, fix linting errors Add the build tags we were using, `integration` and `tools`, to be included in the linting/formatting of golangci-lint. Rename the build tag `tools` to `sidero.tools` to avoid colliding with the same named build tag in `github.com/johannesboyne/gofakes3` package - otherwise the dependency was failing to compile due to having multiple package names in the same package. Fix all the linting errors surfaced by this enablement. Also, temporarily re-enabled `nolintlint` to find the nolint directives which were no longer necessary and removed them. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-04-29 21:18:45 +02:00
Edward Sammut Alessi	d3592671ec	feat: download talosctl directly from factory Download talosctl binaries from factory instead of Github Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-04-29 17:06:25 +02:00
Utku Ozdemir	f9dd849153	feat: introduce powered off machine state and power on support Machines that were shutting down and then disconnect are now shown as "Powered Off" in the UI instead of being stuck in "Shutting Down" with a greyed-out unreachable state. For machines managed by a static infra provider, shutting down a machine now prevents the provider from automatically powering it back on due to cluster allocation. The provider honors the shutdown request until the machine goes through a deallocation cycle, at which point the request is considered stale. Intentionally powered-off machines are also excluded from the "disconnected machines" list on the frontend when destroying a cluster, to avoid them being force-destroyed. The shutdown modal in the frontend now calls a new management API endpoint instead of the Talos API directly. The CLI gains `omnictl machine shutdown` and `omnictl machine power-on` commands. Closes siderolabs/omni#1634. Part of siderolabs/omni-infra-provider-bare-metal#103. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-04-24 13:57:12 +02:00
Artem Chernyshev	725f41d4ee	fix: properly display service account expiration time in the UI The old code was incorrectly picking the public key. Fixes: https://github.com/siderolabs/omni/issues/2717 Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-04-23 21:19:54 +03:00
Edward Sammut Alessi	c5a4310570	feat(frontend): add support modal to omni Add a support modal to Omni, providing links to github issues, support, docs, community links, and office hours. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-04-23 15:46:42 +02:00
Edward Sammut Alessi	be67f710f8	feat: allow reader access to join token Explicitly allow readers to read join tokens Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-04-21 16:28:32 +02:00
Oguz Kilcan	475e3660d7	feat: add Talos version end-of-support notifications and metrics * Track machines running Talos versions approaching or past end of support relative to MinTalosVersion. * Replace the config-driven non-ImageFactory deprecation notification with hardcoded constants and add two new notifications (approaching end of support, end of support reached) with corresponding Prometheus metrics. * Add startup validation hooks (currently disabled) that will refuse to start when unsupported machines are detected. * Fix frontend notification namespace from Default to Ephemeral. Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-04-20 17:11:49 +02:00
Edward Sammut Alessi	488b020b2e	feat: add more filters to audit logs Add multiple new filters to audit logs. Through the UI, there will be a generic search box and the ability to sort columns. Through the CLI, there will be support for the same plus also direct filters for event_type, resource_type, resource_id, cluster_id, and actor. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-04-15 11:03:54 +02:00
Utku Ozdemir	590ea2e370	feat: add per-key creation and last-active tracking for service accounts Add creation timestamps and per-key last-active tracking to service account key listings. The `omnictl serviceaccount list` command now shows KEY CREATED and KEY LAST ACTIVE columns for each public key, alongside the existing SA-level LAST ACTIVE. A new PublicKeyLastActive resource tracks per-key usage. The activity interceptor now extracts the signing key fingerprint from the auth context and records last-used timestamps per key, with independent debouncing. The ServiceAccountStatusController aggregates this data into the service account status for display. A cleanup controller removes PublicKeyLastActive resources when their corresponding public key is torn down. Closes: siderolabs/omni#2661 Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-04-14 21:12:30 +02:00
Edward Sammut Alessi	cad3713552	feat: implement eula guard for omni Implement a guard for Omni to prevent usage until users accept an EULA through the UI or a startup flag. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-04-13 16:49:51 +02:00
Artem Chernyshev	e4760526f2	feat: support `omnictl edit` command Works same way as `talosctl edit`, `kubectl edit`. Fixes: https://github.com/siderolabs/omni/issues/905 Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-04-07 15:47:31 +03:00
Artem Chernyshev	43be52c7b1	chore: bump sqlite metrics collector timeout and interval Timeout: 10s -> 60s Interval: 60s -> 120s Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-04-02 14:14:25 +03:00
Artem Chernyshev	6efb0f2f0a	feat: support Kubernetes manifests in the cluster templates Fixes: https://github.com/siderolabs/omni/issues/2172 Leverage kubernetes manifest resources and expose them through cluster templates. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-03-26 14:10:14 +03:00
Artem Chernyshev	ada0360837	feat: add a way to sync Kubernetes manifests in Omni Manifests support two modes: - `FULL` - Omni will keep the manifest in sync always. - `ONE_TIME` - Omni will apply the manifest only if it doesn't exist. If the manifest is removed by hand and then changed in Omni it will be applied too. Manifests are applied using server-side apply, Omni now has three inventories: `omni-internal-inventory`, `omni-user-inventory` and `omny-sync-one-time`: - User inventory will be used for user managed manifests. - Internal one will be used for the manifests which are created by Omni controllers (workloadproxy, advanced healthcheck service). - One time inventory is used with NoPrune enabled. If the manifest is applied it's just removed from the list of applied manifests: that ensures that manifest changes are not going to happen. Manifests also support setting namespace to all namespaced resources. It might be useful for the huge manifest files which are supplied without the namespace (similar to `kubectl apply -n namespace -f manifest.yaml`). Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-03-23 15:29:49 +03:00
Oguz Kilcan	b9cabbd95c	feat: add deprecation notification for non-ImageFactory machines Warn users when machines are provisioned without ImageFactory by creating a notification resource when invalid schematics are detected. The notification is gated behind a configurable flag under notifications.nonImageFactoryDeprecation with customizable title/body. Also adds omni_machines_invalid_schematic Prometheus metric, exposes the count in MachineStatusMetricsSpec, and adds a Machine Status section to the Grafana dashboard. Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-03-20 15:47:53 +01:00
Oguz Kilcan	cf7d752453	feat: enforce configurable machine registration limit Add `account.maxRegisteredMachines` config option to cap the number of registered machines. The provision handler atomically checks the limit under a mutex before creating new Link resources, returning ResourceExhausted when the cap is reached. Introduce a Notification resource type (ephemeral namespace) so controllers can surface warnings to users. `omnictl` displays all active notifications on every command invocation. Frontend part of showing notifications will be implemented in a different PR. MachineStatusMetricsController creates a warning notification when the registration limit is reached and tears it down when it's not. Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-03-16 12:48:47 +01:00
Utku Ozdemir	1e9b733cb0	chore: bump deps, rekres Bump all dependencies. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-03-10 18:31:38 +01:00
Artem Chernyshev	543cf70b5b	chore: force SSA manifests sync mode for Talos >= 1.13 Backend now automatically switches between legacy and SSA modes for different Talos versions. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-03-05 20:30:13 +03:00
Oguz Kilcan	a9f2937ced	feat: add OIDC token cache isolation for generated kubeconfigs Support isolated OIDC token cache directories in generated `kubeconfig`s to prevent token conflicts when switching between users/clusters. Configurable via server flags and omnictl `--oidc-cache-base-dir` `--oidc-cache-isolation`. Also upgrade exec credential API to v1 and add interactiveMode field. Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-03-04 09:53:05 +01:00
Artem Chernyshev	f8a42eeb04	chore: move graceful upgrades to the lowest level Rewrite `TalosUpgradeStatus` controller to use the completely different flow: - update all `ClusterMachineTalosVersion` resources immediately. - to control quotas and rollout sequence use `UpgradeRollout` resource, it has a single field which is a map of MachineSetName -> Current Quota: - if control plane is updating it sets quota 0 on all other machine sets. - the number of not running/unhealthy machines is subtracted from the quota. - quota is now copied from the new `UpgradeStrategy`, so it's possible to have more than one machine updated in parallel. - `ClusterMachineConfigStatus` controller now adds a new finalizer for upgrades on all `ClusterMachines` which are currently being updated to acquire/release locks and reads quotas from the `UpgradeRollout`. Fixes: https://github.com/siderolabs/omni/issues/2393 Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-03-03 20:02:59 +03:00
Oguz Kilcan	6d03fc7cdb	feat: track user and service account last activity * Add `IdentityLastActive` resource to record the last time each identity(`User`/`ServiceAccount`) made a gRPC call. * Add `IdentityStatusController` to aggregate identity, user role, and last-active data into an ephemeral `IdentityStatus` resource. * Expose last_active in ListUsers/ListServiceAccounts gRPC responses, omnictl CLI output, and the frontend Users/ServiceAccounts views. * Add `UserMetricsController` exposing `omni_users` (total) and `omni_active_users` (7d/30d windows) Prometheus gauges. Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-03-03 13:53:29 +01:00
Edward Sammut Alessi	5fccd82b6e	feat: add talos_version and kubernetes_version to clusterstatus Add talos_version and kubernetes_version to ClusterStatusSpec, so as to not need to also query ClusterSpec. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-02-26 15:58:39 +01:00
Oguz Kilcan	da60807d48	feat: add ManagementService gRPC endpoints for user operations Migrate user create, list, update, and destroy operations from direct resource manipulation to dedicated ManagementService gRPC endpoints, matching the existing service account pattern. Direct Identity/User resource mutations are now restricted, and the CLI, frontend, and client library are updated to use the new endpoints. Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-02-26 09:33:27 +01:00
Artem Chernyshev	47fb4dd792	feat: allow resetting node unique tokens This allows token rotation and disaster recovery if the token gets rejected by Omni. Introduced the new CLI command for that: ``` omnictl configure machine <id> --reset-node-unique-token ``` Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-02-25 13:42:12 +03:00
Artem Chernyshev	69c2759b8b	fix: break the dep loop in the cluster machine config status controller Extract the fields required by the `MachineConfigStatusController` to a separate resource. Otherwise there's circular loop: `MachinePendingUpdates` -> `MachineSetStatus` -> `MachineConfigStatus` -> `MachinePendingUpdates`... Also change the way machine pending is calculated: do not delete the pending machine updates resource if the Talos version/schematic is not in sync. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-02-17 00:28:32 +03:00
Oguz Kilcan	afdf123e29	feat: add support for Kubernetes CA rotation Add support for Kubernetes CA rotation Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-02-14 11:32:00 +01:00
Utku Ozdemir	d1c869a9d8	chore: bump deps, rekres Bump all dependencies. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-02-12 20:43:45 +01:00
Utku Ozdemir	0906bcc23c	fix: prevent unwanted upgrades of non-image-factory machines The schematic comparison logic had an edge case: if a machine predates the image factory, it is installed via a `ghcr.io` installer image (or a custom one). Those machines do not have the schematic meta extension on them, and Omni creates a synthetic schematic ID and properties for those. These properties do not have the "actual" kernel args of the machine, but rather, Omni sets them as what it thinks they should be (the "correct" siderolink args from the Omni perspective). Later, if Omni's siderolink API advertised URL gets updated, it wrongly detects those synthetic kernel args to be the "new ones (with the new URL)", hence, the desired vs actual schematic comparison returns a mismatch. And Omni does an unnecessary upgrade to that machine. Fix this by using the "current (non-protected) args of the machine" as the synthetic args in such cases. Those "current" args will be synthetic themselves (since we cannot read them from the machine, as it does not have schematic info on it), but, it will prevent changes when the advertised URL changes. Additionally, we have two checks to detect a schematic mismatch in the `ClusterMachineConfigStatus` controller - make them check the mismatch in the same way, to be more consistent. Unrelated to this bug, also fix the `SchematicReady` check (introduced in 1.5) to treat invalid schematics as valid, as otherwise we cannot create clusters from non-factory images. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-02-05 13:56:49 +01:00
Utku Ozdemir	c319d7bcf2	fix: fix schematic generation for machines in agent mode We had an issue with bare metal provider where two different schematic IDs would fight each other, causing machine to get installed with a wrong schematic ID, only to be upgraded to the correct one immediately, and in some cases, go into an upgrade loop between a correct and an incorrect schematic. The cause: Omni treated schematics it observed when the machine in agent mode dialed in, and stored the information it received (like kernel args and initial schematic info). This was wrong, as agent mode information is essentially meaningless. Fix this by changing the simple check of "was the schematic info for machine X ever observed" to be "is the schematic info for machine X ready". The readiness check involves schematic being populated and machine not being in agent mode. This change caused `SchematicConfiguration` resource to not be generated before the machine leaves the agent mode, and caused a side effect: `InfraMachineController` would not receive Talos version from it and would not populate it on the `InfraMachine` resource. And this would cause BM provider to never get notified about the fact that the machine is allocated to a cluster, and would not power it on (to PXE boot it to "regular" Talos, for it to receive the "install" call to Omni). Change that controller to get the Talos version info directly from the Cluster resource. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-02-03 11:46:15 +01:00
Oguz Kilcan	c6cc25c73c	feat: add support for Talos CA rotation Add support for Talos CA rotation Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-01-30 09:59:25 +01:00
Edward Sammut Alessi	7376edafc1	fix(installation-media): fix bug when setting arch to amd64 Fix a bug when setting arch to AMD64 which was enum value 0 and was being omitted in responses to frontend. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-01-29 11:30:40 +01:00
Utku Ozdemir	98ef83ee42	fix: fix config patches encryption when encryption is disabled When the resource compression was disabled in the Omni config, we were not generating the ClusterMachineConfigPatches correctly. The issue was: it was attempting to "force-compress" the ClusterMachineConfigPatches when any of the patches' size was above the threshold. But when it was trying to do that, it did not override the global setting of false. The default setting for resource compression is `true`, but when a config file is used to configure Omni, and it was not specified in the config YAML, it was getting overwritten to be `false` due to the boolean merging behavior, which was fixed in https://github.com/siderolabs/omni/pull/2150. Also: fix the compression kicking in even in cases when it is disabled in config but above the threshold. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-01-26 19:04:21 +01:00
Artem Chernyshev	41506f72f8	chore: move graceful config rollout logic to the lowest controller level Now graceful config rollout is handled by the `ClusterMachineConfigStatusController`. It calculates the available update quota by adding finalizers on the `ClusterMachine` resources. By counting the resources with the finalizers it tracks the remaining quota. It now also calculates the pending changes which are not yet applied to the machine in the `MachinePendingUpdates`. Pending changes are not yet shown in the UI anywhere. Fixes: https://github.com/siderolabs/omni/issues/1929 Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2026-01-19 16:30:28 +03:00
Edward Sammut Alessi	fb08dcaa2d	feat(frontend): add extra information to userpilot Add extra information to userpilot as per [RFD-28](https://www.notion.so/siderolabs/RFD-28-Userpilot-Implementation-and-Omni-Usage-Analytics-6024961dd8cc43eea0ef7797841be51b?d=2dfb1211badf809fbd3f001c79056040#118b1211badf804b9de5edde0f22da96) Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-01-12 11:52:01 +01:00
Edward Sammut Alessi	66e243a233	refactor(installation-media): add metal id const and use gets where possible Add PlatformMetalID constant to frontend and use it where relevant as an ID. Also update some places in backend with the same idea. There were some lingering uses of List requests in places where Get requests were more suited and those have been replaced too. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-01-09 17:22:26 +01:00
Oguz Kilcan	ef2d931aac	chore: rekres and bump deps * Rekres * Bump deps * Update default versions for talos and kubernetes Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2026-01-09 11:34:03 +01:00
Edward Sammut Alessi	950ca1b0a3	refactor(installation-media): extract schematic generation and download links Extract schematic generation and download links from the confirmation step of the installation media wizard to allow for re-use inside the download modal of the list view. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2026-01-08 18:17:11 +01:00
Utku Ozdemir	535d733ea6	chore: drop migrations older than v1.1.0 Drop old migrations and deprecated types which were kept only for the migrations. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2026-01-06 14:50:11 +01:00
Edward Sammut Alessi	5c98d44bdf	chore: implement `InstallationMediaConfig` resource This resource is going to be used to store the saved installation media presets generated by the UI wizard. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2025-12-29 17:41:45 +01:00
Edward Sammut Alessi	7ffe5a4db8	feat(installation-media): allow submitting bootloader to schematic request Allow submitting the bootloader option when creating a schematic. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2025-12-16 15:40:54 +01:00
Artem Chernyshev	d3e4884ba7	chore: add new fields to the `CreateSchematic` Omni API Now it's possible to pass the `overlay` ID directly to the request. `MediaId` is also still supported, but is there only for the backward compatibility. `InstallationMedia` resources will be used only in the `omnictl download`. Updated the Wizard UI to no longer use `InstallationMedia` resources. Dropped `pxe_url` from the `CreateSchematic` response, as all required arguments are now on the client side (if not using `InstallationMedia` resources). Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2025-12-16 15:06:08 +03:00
Artem Chernyshev	aa6acff632	chore: support resource list based filtering in the `DependencyGraph` This will allow seeing which controllers are using the defined resources. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2025-12-15 21:00:01 +03:00
Artem Chernyshev	ee926cd9eb	feat: add a way to switch gRPC tunnel mode for the connected machines Fixes: https://github.com/siderolabs/omni/issues/1816 Introduce a new command: ``` omnictl configure machine <id> --siderolink-connection=[udp|http-tunnel|auto] ``` Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2025-12-12 22:59:33 +03:00
Spencer Smith	914c8c0ba1	feat: add min-commit flag for omni This PR would add a new flag for minimum committed machines. This data would come from stripe and if the user's Omni environment has fewer than the committed machines, we'd just report the minimum specified. Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>	2025-12-12 09:25:42 -05:00
Edward Sammut Alessi	4d11b75e03	feat: return schematic yml when creating installation media Return the final schematic YML to display in the frontend when creating installation media Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2025-12-11 17:43:14 +01:00
Tim Jones	d68562f595	feat: add labels to talos version metric Add labels for the assigned cluster and connection status to the `omni_machines_version` metric. Closes #1967 Signed-off-by: Tim Jones <tim.jones@siderolabs.com>	2025-12-10 13:26:36 +01:00
Oguz Kilcan	bc2a5a9986	chore: prepare omni with talos v1.12.0-beta.1 Prepare omni for upcoming talos version 1.12.0-beta.1. Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>	2025-12-06 16:55:35 +01:00
Edward Sammut Alessi	24ed384afb	fix(installation-media): only list architectures supported by providers Only list the architectures supported by the providers as defined in the API. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2025-12-04 13:44:19 +01:00
Edward Sammut Alessi	9826116e85	fix(installation-media): adjust secureboot support check Adjust the secure boot support check in the machine arch step to match how it works in factory. Signed-off-by: Edward Sammut Alessi <edward.sammutalessi@siderolabs.com>	2025-12-03 21:19:55 +01:00
Artem Chernyshev	8b5c29b303	feat: support locks,node delete and restore when using machine classes Make `MachineSetNode` created without an owner by the `MachineSetNodeController`. Fixes: https://github.com/siderolabs/omni/issues/1450 Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2025-11-21 20:44:17 +03:00


179 Commits