111 Commits

Author SHA1 Message Date
Dmitriy Matrenichev
fc48849d00
chore: move maps/slices/ordered to gen module
Use github.com/siderolabs/gen

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-21 20:22:43 +03:00
Andrey Smirnov
8b09bd4b04
feat: update Kubernetes to v1.26.0-alpha.1
Talos 1.3.0 will ship with Kubernetes 1.26.0.

See https://github.com/kubernetes/kubernetes/releases/tag/v1.26.0-alpha.1

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-21 18:42:31 +04:00
Utku Ozdemir
0847400f72
fix: prevent panic on health check if a member has no IPs
If a member has no IP addresses, prevent cluster health checks from failing with a panic by checking for the length of member IPs and not assuming there's always at least 1 IP.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-09-02 15:16:59 +02:00
Utku Ozdemir
0b339a9dc5
feat: track progress of action API calls
Track the progress of the long-running actions `reboot`, `reset`, `upgrade` and `shutdown` on the client side by default, unless `--no-wait=true` is specified.

Use the events API to follow the events using the actor ID of the action and display it using an stderr reporter with a spinner.

Closes siderolabs/talos#5499.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-08-29 22:54:40 +02:00
Andrey Smirnov
cdd0f08bc5
feat: check client <> server version in some Talos commands
Talos commands which are sensitive to resource API changes:

* `get`
* `edit`, `patch`
* `upgrade-k8s`

Commands with upcoming changes for actorID:

* `reboot`
* `reset`
* `shutdown`
* `upgrade`

Fixes #6101

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-26 18:37:51 +04:00
Dmitriy Matrenichev
b59ca5810e
chore: move from inet.af/netaddr to net/netip and go4.org/netipx
Closes #6007

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-08-25 17:51:32 +03:00
Andrey Smirnov
9baca49662
refactor: implement COSI resource API for Talos
Overview: deprecate existing Talos resource API, and introduce new COSI
API.

Consequences:

* COSI API can only go via one-2-one proxy (`client.WithNode`)
* client-side API access is way easier with `state.State` wrappers
* lots of small changes on the client side to use new APIs

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-12 22:31:54 +04:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Andrey Smirnov
a6b010a8b4
chore: update Go to 1.19, Linux to 5.15.58
See https://go.dev/doc/go1.19

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-03 17:03:58 +04:00
Andrey Smirnov
0cdf222431
fix: retry Conflict errors when upgrading k8s manifests
Fixes #5985

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-29 13:20:04 +04:00
Andrey Smirnov
2b23fabcc1
docs: use SVG image for K8s conformance
It doesn't accept PNG images.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-22 22:15:46 +04:00
Andrey Smirnov
b816d0b600
docs: fix the vendor information for Kubernetes conformance tests
As we submit results to Certified Kubernetes, we provide metadata which
should be updated now, and also we lost the logo in our assets.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-21 22:25:10 +04:00
Andrey Smirnov
a167a54021
test: fix CLI nodes discovery without provisioner data
When integration tests run without data from Talos provisioner (e.g.
against AWS/GCP), it should work only with `talosconfig` as an input.

This specific flow was missing filling out `infoWrapper` properly.

Clean up things a bit by reducing code duplication.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-21 18:42:26 +04:00
Utku Ozdemir
6759fcd4ae
feat: use discovery service on cluster health checks
Query the discovery service to fetch the node list and use the results in health checks. Closes siderolabs#5554.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-06-15 16:01:38 +02:00
Utku Ozdemir
8d2be5e315
feat: extend node definition used in health checks
Introduce `cluster.NodeInfo` to represent the basic info about a node which can be used in the health checks. This information, where possible, will be populated by the discovery service in following PRs. Part of siderolabs#5554.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-06-13 14:13:42 +02:00
Dmitriy Matrenichev
4dbbf4ac50
chore: add generic methods and use them part #2
Use things from #5702.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-06-09 23:10:02 +08:00
Dmitriy Matrenichev
70fc424099
chore: add generic methods and use them
Things like ToSet, Keys etc...

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-06-09 02:59:23 +08:00
Utku Ozdemir
c19dd1b892
feat: add 'etcd members should be control plane nodes' health check
Add new health check which checks if the etcd members match the control plane nodes. Closes siderolabs#5553.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-06-07 10:34:38 +02:00
Dmitriy Matrenichev
bf7a6443ee
feat: add 'etcd membership is consistent across nodes' health check
Add new health check which waits for all etcd members. Closes #5552.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-05-20 21:51:17 +08:00
Andrey Smirnov
5a91f6076d
fix: ignore completed pods in cluster health check
This fixes an error when integration test become stuck with the message
like:

```
waiting for coredns to report ready: some pods are not ready: [coredns-868c687b7-g2z64]
```

After some random sequence of node restarts one of the pods might become
"stuck" in `Completed` state (as it is shown in `kubectl get pods`)
blocking the check, as the pod will never become ready.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-05-16 14:28:25 +03:00
Andrey Smirnov
f1f43131f8
fix: strip 'v' prefix from versions on Kubernetes upgrade
This fixes an issue when `talosctl upgrade-k8s` fails with unhelpful
message if the version is specified as `v1.23.5` vs. `1.23.5`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-22 14:59:12 +03:00
Andrey Smirnov
4eb9f45cc8
refactor: split polymorphic K8sControlPlane into typed resources
Having polymorphic (spec type depends on ID) resources is not a good
idea, and it's not compatible with protobuf encoding.

Introduce new resources for each polymorphic sub-spec using new Go 1.18
generic typed.Resource to reduce the boilerplate code.

(Still needs proper deepcopy-gen, but I'm skipping it for now, as
K8sControlPlane had also broken deep copy).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-19 16:53:09 +03:00
Andrey Smirnov
8af50fcd27
fix: correct cri package import path
Containerd CRI plugin was merged into the main repo, but we were using
old import path, so our constants coming from the module were outdated.

This fixes the image version for the pause container.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-14 16:27:45 +03:00
Andrey Smirnov
0cb84e8c1a
fix: correctly parse tags out of images
Use the last `:` in the image reference.

Handle the case when no version was discovered.

See https://github.com/siderolabs/theila/issues/138

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-07 19:32:12 +03:00
Andrey Smirnov
2ca5279e56
fix: retry manifest updates in upgrade-k8s
This showed up recently frequently in integration-provision tests
(might be related to Kubernetes upgrade), but anyways errors should be
retried.

Refactored the function to extract the retryable part.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-01 16:20:25 +03:00
Andrey Smirnov
ca8b9c0a3a
feat: update Kubernetes to 1.24.0-alpha.4
See https://github.com/kubernetes/kubernetes/releases/tag/v1.24.0-alpha.4

Fix some incompatibilities around dropped flags/API versions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-30 22:59:07 +03:00
Dmitriy Matrenichev
e06e1473b0
feat: update golangci-lint to 1.45.0 and gofumpt to 0.3.0
- Update golangci-lint to 1.45.0
- Update gofumpt to 0.3.0
- Fix gofumpt errors
- Add goimports and format imports since gofumports is removed
- Update Dockerfile
- Fix .golangci.yml configuration
- Fix linting errors

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-03-24 08:14:04 +04:00
Artem Chernyshev
27af5d41c6
feat: pause the boot process on some failures instead of rebooting
Some failures can be fixed by updating the machine configuration.
Now `userDisks` and `userFiles` do not make Talos to enter into reboot
loop but pause for 35 minutes.

Additionally, `apid` and `machined` are now started right after
containerd is up and running.

That makes it possible for the operator to connect to the node using
talosctl and fix the config.

Fixes: https://github.com/talos-systems/talos/issues/4669
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-03-21 17:39:45 +03:00
Andrey Smirnov
50594ab1a7
fix: ignore terminated pods in pod health checks
With graceful kubelet shutdown (#5108), after graceful node restart pods
on the restarted node might stay in the status `Terminated` which breaks
the check on pod readiness.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-17 19:17:56 +03:00
Andrew Rynhard
84ee1795dc
docs: update logo
Changes the logo and reformats the description on the front page.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2022-03-16 15:57:56 +03:00
Noel Georgi
dcde2c4f68
chore: update k8s upgrade message
Update k8s upgrade message

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-01-31 16:49:25 +05:30
Artem Chernyshev
831f65a07f
fix: close client provider instead of Talos client in the upgrade module
Otherwise it breaks Theila, which never closes Talos clients during
operation.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-27 15:07:28 +03:00
Seán C McCord
6af83afd5a
fix: handle multiple-IP cluster nodes
Allow cluster nodes to have multiple internal IP addresses when checking
for all Kubernetes nodes.

Fixes #4807

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2022-01-17 11:41:54 -05:00
Artem Chernyshev
2f2bdb26aa
feat: replace flags with --mode in apply, edit and patch commands
Fixes: https://github.com/talos-systems/talos/issues/4588

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-13 16:09:53 +03:00
Andrey Smirnov
2f4b9d8d6d
feat: make machine configuration read-only in Talos (almost)
Talos shouldn't try to re-encode the machine config it was provided
with.

So add a `ReadonlyWrapper` around `*v1alpha1.Config` which makes sure
that raw config object is not available anymore (it's a private field),
but config accessors are available for read-only access.

Another thing that `ReadonlyWrapper` does is that it preserves the
original `[]byte` encoding of the config keeping it exactly same way as
it was loaded from file or read over the network.

Improved `talosctl edit mc` to preserve the config as it was submitted,
and preserve the edits on error from Talos (previously edits were lost).

`ReadonlyWrapper` is not used on config generation path though - config
there is represented by `*v1alpha.Config` and can be freely modified.

Why almost? Some parts of Talos (platform code) patch the machine
configuration with new data. We need to fix platforms to provide
networking configuration in a different way, but this will come with
other PRs later.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-28 20:12:55 +03:00
Andrey Smirnov
f49f40a336
fix: pass path to conformance retrieve results
Sonobouy once again changed the API in a way that breaks our tool.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-22 17:28:05 +03:00
Andrey Smirnov
dc9a0cfe94
chore: bump Go dependencies
Bump all dependencies, update `grpc.WithInsecure()` which is deprecated
now.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-20 23:05:32 +03:00
Andrey Smirnov
97ffa7a645
feat: upgrade kubelet version in talosctl upgrade-k8s
Fixes #4656

As now changes to kubelet configuration can be applied without a reboot,
`talosctl upgrade-k8s` can handle the kubelet upgrades as well.

The gist is simply modifying machine config and waiting for `Node`
version to be updated, rest of the code is required for reliability of
the process.

Also fixed a bug in the API while watching deleted items with
tombstones.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-08 21:12:17 +03:00
Andrey Smirnov
753a82188f
refactor: move pkg/resources to machinery
Fixes #4420

No functional changes, just moving packages around.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-15 19:50:35 +03:00
Alexey Palazhchenko
95105071de
chore: fix simple issues found by golangci-lint
Avoid slice mutation with append.
Simplify code.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-12 15:20:28 +00:00
Alexey Palazhchenko
8e8687d759
fix: use temporary sonobuoy version
`replace` should be removed when v0.55.1+ is released.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-12 11:34:09 +00:00
Alexey Palazhchenko
d6147eb17d
chore: update sonobuoy
See https://github.com/vmware-tanzu/sonobuoy/issues/1520.

Closes #4516.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-11 14:53:54 +00:00
Artem Chernyshev
261c497c71
feat: implement talosctl support command
Fixes: https://github.com/talos-systems/talos/issues/4406

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-11-08 16:20:50 +03:00
Andrey Smirnov
ae5af9d3fa
feat: update Kubernetes to 1.23.0-alpha.3
See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1230-alpha3

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-22 14:59:41 +03:00
Artem Chernyshev
e3e2113adc
feat: upgrade CoreDNS during upgrade-k8s call
Fixes: https://github.com/talos-systems/talos/issues/4065

Get all Talos generated manifests and apply them, wait for deployments to be
updated and to become ready.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-10-13 15:47:06 +03:00
Andrey Smirnov
a1c9d64907
fix: update the way results are retrieved for certified conformance
Looks like we bumped sonobuoy library, and it silently changed a lot of
things in the way it works with the results.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-09-13 23:32:59 +03:00
Alexey Palazhchenko
d53e9e8963
chore: use named constants
Just for consistency.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-09-07 12:13:48 +00:00
Alexey Palazhchenko
032e7c6b86
chore: import yaml.v3 consistently
Do not use yaml.v2.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-26 11:36:50 +00:00
Artem Chernyshev
2b614e430e
feat: check if cluster has deprecated resources versions
Fixes: https://github.com/talos-systems/talos/issues/4026

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-08-18 23:26:36 +03:00
Alexey Palazhchenko
09d70b7eaf feat: update Kubernetes to v1.22.0
Closes #3967.
Closes #3997.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-06 09:06:32 -07:00