65 Commits

Author SHA1 Message Date
Dmitriy Matrenichev
45e6e27af7
chore: bump runtime
Use new functions and methods from runtime module.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-05-11 17:18:08 -04:00
Noel Georgi
c63cf90e32
feat: update k8s to v1.27.0-beta.0
Update k8s to v1.27.0-beta.0

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-21 23:59:17 +05:30
Andrey Smirnov
8ea4bfad8f
refactor: improve the kubernetes upgrade flow
Use new version of go-kubernetes, and move the `kube-proxy` DaemonSet
update to follow common logic of bootstrap manifests update.

This fixes a confusing behavior when after `k8s-upgrade` the version of
`kube-proxy` is not updated in the machine config.

See https://github.com/siderolabs/go-kubernetes/pull/3

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-06 15:01:29 +04:00
Andrey Smirnov
230e46e567
refactor: extract parts of kubernetes libraries
The shared code is going out to the
github.com/siderolabs/go-kubernetes library.

The code will be used in Talos and other projects using same features.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-22 14:56:49 +04:00
Andrey Smirnov
1e1aa84f6c
fix: kubernetes removed resource version check
Not all Kubernetes deprecated resources are same - if the old API
version is deprecated, but new one is available, API server handles
trnasition for us. If some resource is removed completely, we need to
check for it. This reduces number of items to check, and simplifies the
check.

Move the check under the umbrella of the 'upgrade pre-checks', and make
it actually fatal.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-10 14:45:09 +04:00
Andrey Smirnov
89dbb0ecf0
release(v1.4.0-alpha.0): prepare release
This is the official v1.4.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-23 22:32:09 +04:00
Noel Georgi
faf49218ce
feat: add more checks for K8s upgrade
Add more checks for the Talos Kubernetes upgrade.

The removed api-server resources checks are kept as is, needs to be
moved to the new checks as part of #6599.

Fixes: #6444

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-12-14 19:29:19 +05:30
Andrey Smirnov
96aa9638f7
chore: rename talos-systems/talos to siderolabs/talos
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-03 16:50:32 +04:00
Andrey Smirnov
343c55762e
chore: replace talos-systems Go modules with siderolabs
This the first step towards replacing all import paths to be based on
`siderolabs/` instead of `talos-systems/`.

All updates contain no functional changes, just refactorings to adapt to
the new path structure.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-01 12:55:40 +04:00
Dmitriy Matrenichev
fc48849d00
chore: move maps/slices/ordered to gen module
Use github.com/siderolabs/gen

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-21 20:22:43 +03:00
Andrey Smirnov
8b09bd4b04
feat: update Kubernetes to v1.26.0-alpha.1
Talos 1.3.0 will ship with Kubernetes 1.26.0.

See https://github.com/kubernetes/kubernetes/releases/tag/v1.26.0-alpha.1

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-21 18:42:31 +04:00
Andrey Smirnov
9baca49662
refactor: implement COSI resource API for Talos
Overview: deprecate existing Talos resource API, and introduce new COSI
API.

Consequences:

* COSI API can only go via one-2-one proxy (`client.WithNode`)
* client-side API access is way easier with `state.State` wrappers
* lots of small changes on the client side to use new APIs

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-12 22:31:54 +04:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Andrey Smirnov
a6b010a8b4
chore: update Go to 1.19, Linux to 5.15.58
See https://go.dev/doc/go1.19

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-03 17:03:58 +04:00
Andrey Smirnov
0cdf222431
fix: retry Conflict errors when upgrading k8s manifests
Fixes #5985

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-29 13:20:04 +04:00
Dmitriy Matrenichev
4dbbf4ac50
chore: add generic methods and use them part #2
Use things from #5702.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-06-09 23:10:02 +08:00
Andrey Smirnov
f1f43131f8
fix: strip 'v' prefix from versions on Kubernetes upgrade
This fixes an issue when `talosctl upgrade-k8s` fails with unhelpful
message if the version is specified as `v1.23.5` vs. `1.23.5`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-22 14:59:12 +03:00
Andrey Smirnov
4eb9f45cc8
refactor: split polymorphic K8sControlPlane into typed resources
Having polymorphic (spec type depends on ID) resources is not a good
idea, and it's not compatible with protobuf encoding.

Introduce new resources for each polymorphic sub-spec using new Go 1.18
generic typed.Resource to reduce the boilerplate code.

(Still needs proper deepcopy-gen, but I'm skipping it for now, as
K8sControlPlane had also broken deep copy).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-19 16:53:09 +03:00
Andrey Smirnov
0cb84e8c1a
fix: correctly parse tags out of images
Use the last `:` in the image reference.

Handle the case when no version was discovered.

See https://github.com/siderolabs/theila/issues/138

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-07 19:32:12 +03:00
Andrey Smirnov
2ca5279e56
fix: retry manifest updates in upgrade-k8s
This showed up recently frequently in integration-provision tests
(might be related to Kubernetes upgrade), but anyways errors should be
retried.

Refactored the function to extract the retryable part.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-01 16:20:25 +03:00
Andrey Smirnov
ca8b9c0a3a
feat: update Kubernetes to 1.24.0-alpha.4
See https://github.com/kubernetes/kubernetes/releases/tag/v1.24.0-alpha.4

Fix some incompatibilities around dropped flags/API versions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-30 22:59:07 +03:00
Dmitriy Matrenichev
e06e1473b0
feat: update golangci-lint to 1.45.0 and gofumpt to 0.3.0
- Update golangci-lint to 1.45.0
- Update gofumpt to 0.3.0
- Fix gofumpt errors
- Add goimports and format imports since gofumports is removed
- Update Dockerfile
- Fix .golangci.yml configuration
- Fix linting errors

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-03-24 08:14:04 +04:00
Noel Georgi
dcde2c4f68
chore: update k8s upgrade message
Update k8s upgrade message

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-01-31 16:49:25 +05:30
Artem Chernyshev
831f65a07f
fix: close client provider instead of Talos client in the upgrade module
Otherwise it breaks Theila, which never closes Talos clients during
operation.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-27 15:07:28 +03:00
Artem Chernyshev
2f2bdb26aa
feat: replace flags with --mode in apply, edit and patch commands
Fixes: https://github.com/talos-systems/talos/issues/4588

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-13 16:09:53 +03:00
Andrey Smirnov
2f4b9d8d6d
feat: make machine configuration read-only in Talos (almost)
Talos shouldn't try to re-encode the machine config it was provided
with.

So add a `ReadonlyWrapper` around `*v1alpha1.Config` which makes sure
that raw config object is not available anymore (it's a private field),
but config accessors are available for read-only access.

Another thing that `ReadonlyWrapper` does is that it preserves the
original `[]byte` encoding of the config keeping it exactly same way as
it was loaded from file or read over the network.

Improved `talosctl edit mc` to preserve the config as it was submitted,
and preserve the edits on error from Talos (previously edits were lost).

`ReadonlyWrapper` is not used on config generation path though - config
there is represented by `*v1alpha.Config` and can be freely modified.

Why almost? Some parts of Talos (platform code) patch the machine
configuration with new data. We need to fix platforms to provide
networking configuration in a different way, but this will come with
other PRs later.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-28 20:12:55 +03:00
Andrey Smirnov
97ffa7a645
feat: upgrade kubelet version in talosctl upgrade-k8s
Fixes #4656

As now changes to kubelet configuration can be applied without a reboot,
`talosctl upgrade-k8s` can handle the kubelet upgrades as well.

The gist is simply modifying machine config and waiting for `Node`
version to be updated, rest of the code is required for reliability of
the process.

Also fixed a bug in the API while watching deleted items with
tombstones.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-08 21:12:17 +03:00
Andrey Smirnov
753a82188f
refactor: move pkg/resources to machinery
Fixes #4420

No functional changes, just moving packages around.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-15 19:50:35 +03:00
Alexey Palazhchenko
95105071de
chore: fix simple issues found by golangci-lint
Avoid slice mutation with append.
Simplify code.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-12 15:20:28 +00:00
Andrey Smirnov
ae5af9d3fa
feat: update Kubernetes to 1.23.0-alpha.3
See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1230-alpha3

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-22 14:59:41 +03:00
Artem Chernyshev
e3e2113adc
feat: upgrade CoreDNS during upgrade-k8s call
Fixes: https://github.com/talos-systems/talos/issues/4065

Get all Talos generated manifests and apply them, wait for deployments to be
updated and to become ready.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-10-13 15:47:06 +03:00
Alexey Palazhchenko
d53e9e8963
chore: use named constants
Just for consistency.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-09-07 12:13:48 +00:00
Alexey Palazhchenko
032e7c6b86
chore: import yaml.v3 consistently
Do not use yaml.v2.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-26 11:36:50 +00:00
Artem Chernyshev
2b614e430e
feat: check if cluster has deprecated resources versions
Fixes: https://github.com/talos-systems/talos/issues/4026

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-08-18 23:26:36 +03:00
Alexey Palazhchenko
09d70b7eaf feat: update Kubernetes to v1.22.0
Closes #3967.
Closes #3997.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-06 09:06:32 -07:00
Andrey Smirnov
0c7ce1cd81 feat: remove remnants of bootkube support
Fixes #3951

Bootkube support was removed in Talos 0.9. Talos versions 0.9-0.11
support conversion of self-hosted bootkube-based control plane to the
new style control plane running as static pods managed by Talos.

This commit removes all backwards compatibility and removes conversion
code.

For the k8s controllers, `BootstrapStatus` is removed and a dependency
on `etcd` service status is added (as it was implicitly there via
`BootstrapStatus`).

Remove control plane conversion code.

In k8s upgrade code, remove self-hosted part.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-08-03 07:55:42 -07:00
Artem Chernyshev
70d2505b7c fix: do not require ToVersion to be set when detecting version
We do not know the upgrade version when checking components versions in
Theila.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-07-21 08:51:26 -07:00
Artem Chernyshev
f8f1c83a75 feat: detect the lowest Kubernetes version in upgrade-k8s CLI command
Scan all pods in `kube-system` and find `kube-proxy`, `kube-scheduler`,
`kube-controller-manager` and `kube-apiserver` ones, then check the
lowest version amongst them.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-07-19 08:24:04 -07:00
Artem Chernyshev
2e463348b2 fix: pass all logs through the options.Log method
Looks like I've missed some 🤦

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-07-15 08:32:48 -07:00
Artem Chernyshev
bf61c2cc4a fix: write upgrade logs only to the LogOutput if it's defined
No need to print them to stdout in that case.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-07-15 07:02:45 -07:00
Artem Chernyshev
23ef1d40af chore: add ability to redirect talos upgrade module logs to io.Writer
This is going to be useful in the third party code which is using
upgrade modules, to collect output logs instead of printing them to the
stdout.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-07-13 08:12:06 -07:00
Andrey Smirnov
e883c12b31 fix: make output of upgrade-k8s command less scary
This removes `retrying error` messages while waiting for the API server
pod state to reflect changes from the updated static pod definition.

Log more lines to notify about the progress.

Skip `kube-proxy` if not found (as we allow it to be disabled).

```
$ talosctl upgrade-k8s -n 172.20.0.2 --from 1.21.0 --to 1.21.2
discovered master nodes ["172.20.0.2" "172.20.0.3" "172.20.0.4"]
updating "kube-apiserver" to version "1.21.2"
 > "172.20.0.2": starting update
 > "172.20.0.2": machine configuration patched
 > "172.20.0.2": waiting for API server state pod update
 < "172.20.0.2": successfully updated
 > "172.20.0.3": starting update
 > "172.20.0.3": machine configuration patched
 > "172.20.0.3": waiting for API server state pod update
 < "172.20.0.3": successfully updated
 > "172.20.0.4": starting update
 > "172.20.0.4": machine configuration patched
 > "172.20.0.4": waiting for API server state pod update
 < "172.20.0.4": successfully updated
updating "kube-controller-manager" to version "1.21.2"
 > "172.20.0.2": starting update
 > "172.20.0.2": machine configuration patched
 > "172.20.0.2": waiting for API server state pod update
 < "172.20.0.2": successfully updated
 > "172.20.0.3": starting update
 > "172.20.0.3": machine configuration patched
 > "172.20.0.3": waiting for API server state pod update
 < "172.20.0.3": successfully updated
 > "172.20.0.4": starting update
 > "172.20.0.4": machine configuration patched
 > "172.20.0.4": waiting for API server state pod update
 < "172.20.0.4": successfully updated
updating "kube-scheduler" to version "1.21.2"
 > "172.20.0.2": starting update
 > "172.20.0.2": machine configuration patched
 > "172.20.0.2": waiting for API server state pod update
 < "172.20.0.2": successfully updated
 > "172.20.0.3": starting update
 > "172.20.0.3": machine configuration patched
 > "172.20.0.3": waiting for API server state pod update
 < "172.20.0.3": successfully updated
 > "172.20.0.4": starting update
 > "172.20.0.4": machine configuration patched
 > "172.20.0.4": waiting for API server state pod update
 < "172.20.0.4": successfully updated
updating daemonset "kube-proxy" to version "1.21.2"
kube-proxy skipped as DaemonSet was not found
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-07-01 06:54:36 -07:00
Alexey Palazhchenko
06209bba28 chore: update RBAC rules, remove old APIs
Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-18 09:54:49 -07:00
Andrey Smirnov
5811f4dda1 feat: implement link (interface) controllers
The structure of the controllers is really similar to addresses and
routes:

* `LinkSpec` resource describes desired link state
* `LinkConfig` controller generates `LinkSpecs` based on machine
configuration and kernel cmdline
* `LinkMerge` controller merges multiple configuration sources into a
single `LinkSpec` paying attention to the config layer priority
* `LinkSpec` controller applies the specs to the kernel state

Controller `LinkStatus` (which was implemented before) watches the
kernel state and publishes current link status.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-01 09:36:25 -07:00
Andrey Smirnov
d24df8f844 chore: re-import talos-systems/os-runtime as cosi-project/runtime
No changes, just import path change (as project got moved).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-12 07:44:24 -07:00
Andrey Smirnov
a1e6415403 fix: retry Kubernetes API errors on cordon/uncordon/etc
This extracts function which was used in upgrade/convert flows to retry
transient errors to the main `kubernetes` package, expands it to ignore
timeout errors, and it is now used to retry errors where applicable in
`pkg/kubernetes`.

Fixes #3403

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-02 03:51:40 -07:00
Andrey Smirnov
e039172eda fix: ignore EOF errors from Kubernetes API when converting control plane
During the conversion process, API server goes down, so we can see lots
of network errors including EOF.

Fixes #3404

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-01 10:52:44 -07:00
Andrey Smirnov
672c970739 fix: allow convert-k8s --remove-initialized-keys with K8s cp is down
The command `--remove-initialized-key` is the last resort to convert
control plane when control plane is down for whatever reason, so it
should work when control plane is not available.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-25 14:06:08 -07:00
Alexey Palazhchenko
fb605a0fc5 chore: tweak nolintlint settings
Copy from kres manually for now.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-25 13:56:16 -07:00
Alexey Palazhchenko
1f5a0c4065 fix: resolve the issue with Kubernetes upgrade
Add missing cases, refactoring.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-25 12:48:28 -07:00