talos

mirror of https://github.com/siderolabs/talos.git synced 2025-08-07 23:27:07 +02:00

Author	SHA1	Message	Date
Andrey Smirnov	badbc51e63	refactor: rewrite code to include preliminary support for multi-doc `config.Container` implements a multi-doc container which implements both `Container` interface (encoding, validation, etc.), and `Config` interface (accessing parts of the config). Refactor `generate` and `bundle` packages to support multi-doc, and provide backwards compatibility. Implement a first (mostly example) machine config document for SideroLink API URL. Many places don't properly support multi-doc yet (e.g. config patches). Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2023-05-31 18:38:05 +04:00
Andrey Smirnov	96aa9638f7	chore: rename talos-systems/talos to siderolabs/talos There's a cyclic dependency on siderolink library which imports talos machinery back. We will fix that after we get talos pushed under a new name. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2022-11-03 16:50:32 +04:00
Dmitriy Matrenichev	b59ca5810e	chore: move from inet.af/netaddr to net/netip and go4.org/netipx Closes #6007 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-08-25 17:51:32 +03:00
Dmitriy Matrenichev	29bd632401	chore: remove old build tags syntax This commit removes lines containing the old build tag syntax. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-08-24 17:27:01 +03:00
Utku Ozdemir	8d2be5e315	feat: extend node definition used in health checks Introduce `cluster.NodeInfo` to represent the basic info about a node which can be used in the health checks. This information, where possible, will be populated by the discovery service in following PRs. Part of siderolabs#5554. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2022-06-13 14:13:42 +02:00
Dmitriy Matrenichev	e06e1473b0	feat: update golangci-lint to 1.45.0 and gofumpt to 0.3.0 - Update golangci-lint to 1.45.0 - Update gofumpt to 0.3.0 - Fix gofumpt errors - Add goimports and format imports since gofumports is removed - Update Dockerfile - Fix .golangci.yml configuration - Fix linting errors Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2022-03-24 08:14:04 +04:00
Andrey Smirnov	3e100aa977	test: workaround EventsWatch test flakiness This test sometimes fails with a message like: ``` === RUN TestIntegration/api.EventsSuite/TestEventsWatch assertion_compare.go:323: Error Trace: events.go:88 Error: "0" is not greater than or equal to "14" Test: TestIntegration/api.EventsSuite/TestEventsWatch Messages: [] ``` I believe the root cause is that the initial (first event) delivery might be more than 100ms, so instead of waiting for 100ms for each event, block for 500ms for all events to arrive. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2021-10-15 12:51:56 +03:00
Andrey Smirnov	a059454045	chore: build using Go 1.17 `initramfs` size for amd64 shrinks by 1.3 MiB. Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>	2021-09-13 22:33:47 +03:00
Alexey Palazhchenko	eea750de2c	chore: rename "join" type to "worker" Closes #3413. Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>	2021-07-09 07:10:45 -07:00
Andrey Smirnov	2ea20f598a	feat: replace timed with time sync controller This is a complete rewrite of time sync process. Now the time sync process starts early at boot time, and it adapts to configuration changes: * before config is available, `pool.ntp.org` is used * once config is available, configured time servers are used Controller updates same time sync resource as other controllers had dependency on, so they have a chance to wait for the time sync event. Talos services which depend on time now wait on same resource instead of waiting on timed health. New features: * time sync now sticks to the particular time server unless there's an error from that server, and server is changed in that case, this improves time sync accuracy * time sync acts on config changes immediately, so it's possible to reconfigure time sync at any time * there's a new 'epoch' field in time sync resources which allows time-dependent controllers to regenerate certs when there's a big enough jump in time Features to implement later: * apid shouldn't depend on timed, it should be started early and it should regenerate certs on time jump * trustd should be updated in same way Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-29 09:29:43 -07:00
Andrey Smirnov	31e56e63db	fix: update in-cluster kubeconfig validity to match other certs Talos generates in-cluster kubeconfig for the kube-scheduler and kube-controller-manager to authenticate to kube-apiserver. Bug was that validity of that kubeconfig was set to 24h by mistake. Fix that by bumping validity to default for other Kubernetes certs (1 year). Add a certificate refresh at 50% of the validity. Fix bugs with copying secret resources which was leading to updates not being propagated correctly. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2021-03-01 11:16:04 -08:00
Andrey Smirnov	3dae6df27b	test: stabilize upgrade test by running health check several times For single node clusters, control plane is unstable after reboot, run health check several times to let it settle down to avoid failures in subsequent checks. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-12-11 08:31:01 -08:00
Andrey Smirnov	bddd4f1bf6	refactor: move external API packages into `machinery/` This moves `pkg/config`, `pkg/client` and `pkg/constants` under `pkg/machinery` umbrella. And `pkg/machinery` is published as Go module inside Talos repository. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-17 09:56:14 -07:00
Andrey Smirnov	3d8418a689	feat: force nodes to be set in `talosctl` commands using the API With load-balancing enabled by default running `talosctl` without `--nodes` is risky, as it might hit any control plane by default without `--nodes`. Only two commands do not enforce this check, as they do their own node contexts: `crashdump` and `health` (client-side). Integration tests were updated to always supply `--nodes` cli argument, while doing that I refactored the storage for discovered nodes to use existing `cluster.Info` interface. The downside is that with e2e CAPI tests CLI tests will be mostly skipped as we don't support discovery in CLI tests at the moment. This can be fixed by using `talosctl kubeconfig` + `kubectl get nodes` for node discovery. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-21 12:17:43 -07:00
Andrey Smirnov	1a0e1bc393	chore: update module dependencies Fixes #2316 Simply update dependencies we don't track on version level to be compatible with Talos components (like etcd or k8s). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-16 12:00:50 -07:00
Andrey Smirnov	5ecddf2866	feat: add round-robin LB policy to Talos client by default Handling of multiple endpoints has already been implemented in #2094. This PR enables round-robin policy so that grpc picks up new endpoint for each call (and not send each request to the first control plane node). Endpoint list is randomized to handle cases when only one request is going to be sent, so that it doesn't go always to the first node in the list. grpc handles dead/unresponsive nodes automatically for us. `talosctl cluster create` and provision tests switched to use client-side load balancer for Talos API. On the additional improvements we got: * `talosctl` now reports correct node IP when using commands without `-n`, not the loadbalancer IP (if using multiple endpoints of course) * loadbalancer can't provide reliable handling of errors when upstream server is unresponsive or there're no upstreams available, grpc returns much more helpful errors Fixes #1641 Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-09 08:35:15 -07:00
Andrey Smirnov	4cc074cdba	feat: implement API access to event history 1. Add [xid-based](https://github.com/rs/xid) event IDs. Xids are sortable and unique enough. Xids also encode event publishing time with a second precision. 2. Add three ways to look back into event history: based on number of events, on time and ID. Lookup via ID might be used to restart event polling in case of broken API connection from the same moment. 3. Reimplement core event buffer with positions which are always incremented instead of generation+index, this implementation is much more simple (idea from circular buffer). 4. By default, Events API works the same - it shows no history and starts streaming new events only. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-08 10:54:50 -07:00
Andrey Smirnov	a6b3bd2ff6	feat: implement service events This implements service events, adds test for events API based on service events as they're the easiest to generate on demand. Disabled validate test for 'metal' as it validates disk device against local system which doesn't make much sense. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-03 13:52:53 -07:00

18 Commits
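
Commit 4cc074cdba describes reimplementing the core event buffer with positions that are always incremented instead of generation+index. The following is a minimal sketch of that idea only, not the Talos implementation: the `Event`, `Buffer`, `Publish` and `Tail` names are hypothetical, and plain integer IDs stand in for the xid-based IDs the commit mentions. It shows how a single monotonically increasing position gives both the storage slot (position modulo capacity) and a simple way to look back by number of events.

```go
package main

import "fmt"

// Event is a hypothetical payload type used for illustration only.
type Event struct {
	ID   int64 // position at which the event was published (stand-in for an xid)
	Data string
}

// Buffer is a fixed-capacity circular buffer addressed by a position
// that only ever grows; the slot is derived as pos % capacity.
type Buffer struct {
	items []Event
	pos   int64 // total number of events published so far
}

// NewBuffer creates a buffer; capacity is assumed to be positive.
func NewBuffer(capacity int) *Buffer {
	return &Buffer{items: make([]Event, capacity)}
}

// Publish stores the event in its slot, overwriting the oldest entry
// once the buffer is full, and advances the position.
func (b *Buffer) Publish(data string) Event {
	ev := Event{ID: b.pos, Data: data}
	b.items[b.pos%int64(len(b.items))] = ev
	b.pos++

	return ev
}

// Tail returns up to n of the most recent events, oldest first.
// Events that have already been overwritten are silently skipped.
func (b *Buffer) Tail(n int) []Event {
	if n < 0 {
		n = 0
	}

	start := b.pos - int64(n)

	// oldest is the smallest position still present in the buffer.
	oldest := b.pos - int64(len(b.items))
	if oldest < 0 {
		oldest = 0
	}

	if start < oldest {
		start = oldest
	}

	out := make([]Event, 0, b.pos-start)
	for p := start; p < b.pos; p++ {
		out = append(out, b.items[p%int64(len(b.items))])
	}

	return out
}

func main() {
	buf := NewBuffer(3)
	for _, s := range []string{"a", "b", "c", "d", "e"} {
		buf.Publish(s)
	}

	for _, ev := range buf.Tail(2) {
		fmt.Println(ev.ID, ev.Data)
	}
}
```

Because the position never wraps, a reader that remembers the last ID it saw could resume from that position after a broken connection, which is the kind of lookup by ID the commit message describes. Locking and streaming of new events are left out of this sketch.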