Use the new version of go-kubernetes, and move the `kube-proxy` DaemonSet
update so that it follows the common logic for updating bootstrap manifests.
This fixes confusing behavior where, after `k8s-upgrade`, the version of
`kube-proxy` was not updated in the machine config.
See https://github.com/siderolabs/go-kubernetes/pull/3
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
The shared code is moving out to the
github.com/siderolabs/go-kubernetes library.
The code will be used in Talos and in other projects that need the same
features.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Not all deprecated Kubernetes resources are the same: if the old API
version is deprecated but a new one is available, the API server handles
the transition for us. If a resource is removed completely, we need to
check for it. This reduces the number of items to check and simplifies
the check.
Move the check under the umbrella of the 'upgrade pre-checks', and make
it actually fatal.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Add more checks for the Talos Kubernetes upgrade.
The checks for removed api-server resources are kept as is; they need to
be moved to the new checks as part of #6599.
Fixes: #6444
Signed-off-by: Noel Georgi <git@frezbo.dev>
There's a cyclic dependency on the siderolink library, which imports Talos
machinery back. We will fix that after we get Talos pushed under the new
name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This is the first step towards replacing all import paths to be based on
`siderolabs/` instead of `talos-systems/`.
All updates contain no functional changes, just refactoring to adapt to
the new path structure.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Overview: deprecate existing Talos resource API, and introduce new COSI
API.
Consequences:
* COSI API can only go via one-2-one proxy (`client.WithNode`)
* client-side API access is way easier with `state.State` wrappers
* lots of small changes on the client side to use new APIs
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This fixes an issue where `talosctl upgrade-k8s` fails with an unhelpful
message if the version is specified as `v1.23.5` instead of `1.23.5`.
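A minimal sketch of the kind of normalization that makes both spellings
equivalent. `normalizeVersion` is a hypothetical helper name used only for
illustration; it is not taken from the Talos code base, and the actual fix
may be implemented differently.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeVersion is a hypothetical helper: it strips one optional leading
// "v" so that "v1.23.5" and "1.23.5" are treated as the same version.
func normalizeVersion(version string) string {
	return strings.TrimPrefix(version, "v")
}

func main() {
	for _, v := range []string{"v1.23.5", "1.23.5"} {
		fmt.Println(normalizeVersion(v))
	}
}
```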
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Having polymorphic resources (where the spec type depends on the ID) is
not a good idea, and it's not compatible with protobuf encoding.
Introduce new resources for each polymorphic sub-spec, using the new Go
1.18 generic `typed.Resource` to reduce the boilerplate code.
(This still needs proper deepcopy-gen, but I'm skipping it for now, as
K8sControlPlane also had a broken deep copy.)
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Use the last `:` in the image reference.
Handle the case when no version was discovered.
See https://github.com/siderolabs/theila/issues/138
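A rough sketch of why the last `:` matters, assuming a plain `name:tag`
image reference. `versionFromImage` is a hypothetical name, and digest
references (`@sha256:...`) are deliberately not handled here.

```go
package main

import (
	"fmt"
	"strings"
)

// versionFromImage is a hypothetical helper that returns the tag of an image
// reference. It splits on the last ':' so that a registry port such as
// "registry.local:5000" is not mistaken for the tag separator.
// The second return value is false when no version could be discovered.
func versionFromImage(image string) (string, bool) {
	idx := strings.LastIndex(image, ":")
	if idx == -1 || idx == len(image)-1 {
		return "", false
	}

	tag := image[idx+1:]

	// A '/' after the last ':' means the colon belonged to a registry port,
	// so the reference carries no tag at all.
	if strings.Contains(tag, "/") {
		return "", false
	}

	return tag, true
}

func main() {
	for _, image := range []string{
		"registry.local:5000/kube-proxy:v1.21.2",
		"registry.local:5000/kube-proxy",
	} {
		version, ok := versionFromImage(image)
		fmt.Printf("%s -> %q (found: %v)\n", image, version, ok)
	}
}
```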
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This has recently shown up frequently in integration-provision tests
(it might be related to the Kubernetes upgrade), but in any case the
errors should be retried.
Refactored the function to extract the retryable part.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos shouldn't try to re-encode the machine config it was provided
with.
So add a `ReadonlyWrapper` around `*v1alpha1.Config` which makes sure
that raw config object is not available anymore (it's a private field),
but config accessors are available for read-only access.
`ReadonlyWrapper` also preserves the original `[]byte` encoding of the
config, keeping it exactly as it was loaded from a file or read over the
network.
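The idea can be illustrated with a simplified sketch. The `Config` type,
the constructor, and the accessor names below are stand-ins invented for
this example; they are not the actual Talos types or method set.

```go
package main

import "fmt"

// Config is a stand-in for *v1alpha1.Config; the real type is much larger.
type Config struct {
	ClusterName string
}

// ReadonlyWrapper here is a simplified illustration, not the Talos code.
// Both fields are unexported, so code in other packages can neither reach
// the mutable config object nor replace the stored bytes.
type ReadonlyWrapper struct {
	cfg *Config
	raw []byte
}

// NewReadonlyWrapper copies raw so that later changes to the caller's slice
// do not alter the preserved encoding.
func NewReadonlyWrapper(cfg *Config, raw []byte) *ReadonlyWrapper {
	return &ReadonlyWrapper{cfg: cfg, raw: append([]byte(nil), raw...)}
}

// ClusterName is an example of a read-only accessor.
func (w *ReadonlyWrapper) ClusterName() string {
	return w.cfg.ClusterName
}

// Bytes returns a copy of the original encoding, byte for byte.
func (w *ReadonlyWrapper) Bytes() []byte {
	return append([]byte(nil), w.raw...)
}

func main() {
	raw := []byte("cluster: demo # user comment\n")
	w := NewReadonlyWrapper(&Config{ClusterName: "demo"}, raw)

	fmt.Println(w.ClusterName())
	fmt.Print(string(w.Bytes()))
}
```

Because the original bytes are returned instead of a re-encoded config,
details such as comments and field ordering survive unchanged.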
Improved `talosctl edit mc` to preserve the config as it was submitted,
and preserve the edits on error from Talos (previously edits were lost).
`ReadonlyWrapper` is not used on the config generation path though: the
config there is represented by `*v1alpha1.Config` and can be freely modified.
Why almost? Some parts of Talos (platform code) patch the machine
configuration with new data. We need to fix platforms to provide
networking configuration in a different way, but this will come with
other PRs later.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4656
Now that changes to kubelet configuration can be applied without a reboot,
`talosctl upgrade-k8s` can handle the kubelet upgrades as well.
The gist is simply modifying the machine config and waiting for the `Node`
version to be updated; the rest of the code is required for the
reliability of the process.
Also fixed a bug in the API while watching deleted items with
tombstones.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes: https://github.com/talos-systems/talos/issues/4065
Get all Talos generated manifests and apply them, wait for deployments to be
updated and to become ready.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Fixes #3951
Bootkube support was removed in Talos 0.9. Talos versions 0.9-0.11
support conversion of self-hosted bootkube-based control plane to the
new style control plane running as static pods managed by Talos.
This commit removes all backwards compatibility and removes conversion
code.
For the k8s controllers, `BootstrapStatus` is removed and a dependency
on `etcd` service status is added (as it was implicitly there via
`BootstrapStatus`).
Remove control plane conversion code.
In k8s upgrade code, remove self-hosted part.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Scan all pods in `kube-system` and find `kube-proxy`, `kube-scheduler`,
`kube-controller-manager` and `kube-apiserver` ones, then check the
lowest version amongst them.
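A sketch of the "lowest version" step only, assuming the component
versions have already been collected from the pods as plain
`major.minor.patch` strings. The helper names are hypothetical, and
pre-release suffixes are rejected rather than parsed.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "1.21.2" or "v1.21.2" into [major, minor, patch].
func parseVersion(s string) ([3]int, error) {
	var out [3]int

	parts := strings.SplitN(strings.TrimPrefix(s, "v"), ".", 3)
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected version %q", s)
	}

	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected version %q: %w", s, err)
		}

		out[i] = n
	}

	return out, nil
}

// less compares versions numerically, so 1.9.0 sorts before 1.10.0.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}

	return false
}

// lowestVersion returns the lowest of the given component versions.
func lowestVersion(versions []string) (string, error) {
	if len(versions) == 0 {
		return "", fmt.Errorf("no versions discovered")
	}

	lowest := versions[0]

	lowestParsed, err := parseVersion(lowest)
	if err != nil {
		return "", err
	}

	for _, v := range versions[1:] {
		parsed, err := parseVersion(v)
		if err != nil {
			return "", err
		}

		if less(parsed, lowestParsed) {
			lowest, lowestParsed = v, parsed
		}
	}

	return lowest, nil
}

func main() {
	// Example versions for kube-proxy, kube-scheduler,
	// kube-controller-manager and kube-apiserver pods.
	v, err := lowestVersion([]string{"1.21.2", "1.21.0", "1.21.2", "1.21.2"})
	fmt.Println(v, err)
}
```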
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
This is useful for third-party code that uses the upgrade modules and
wants to collect the output logs instead of printing them to stdout.
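One way such a hook could look, as a sketch only: the `Options` struct,
its `LogOutput` field, and the `lineCollector` type are assumptions made
for this example and do not describe the actual upgrade package API.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// Options is a hypothetical options struct for an upgrade routine.
type Options struct {
	// LogOutput receives progress messages; nil means os.Stdout.
	LogOutput io.Writer
}

// Log writes one formatted progress line to the configured output.
func (o Options) Log(format string, args ...interface{}) {
	var w io.Writer = os.Stdout
	if o.LogOutput != nil {
		w = o.LogOutput
	}

	fmt.Fprintln(w, fmt.Sprintf(format, args...))
}

// lineCollector is an example io.Writer that keeps everything written to it,
// standing in for whatever a third-party caller uses to gather logs.
type lineCollector struct {
	lines []string
}

func (c *lineCollector) Write(p []byte) (int, error) {
	c.lines = append(c.lines, string(p))

	return len(p), nil
}

func main() {
	collector := &lineCollector{}
	opts := Options{LogOutput: collector}

	opts.Log("updating %q to version %q", "kube-apiserver", "1.21.2")

	fmt.Printf("collected %d message(s)\n", len(collector.lines))
}
```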
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
This removes `retrying error` messages while waiting for the API server
pod state to reflect changes from the updated static pod definition.
Log more lines to notify about the progress.
Skip `kube-proxy` if not found (as we allow it to be disabled).
```
$ talosctl upgrade-k8s -n 172.20.0.2 --from 1.21.0 --to 1.21.2
discovered master nodes ["172.20.0.2" "172.20.0.3" "172.20.0.4"]
updating "kube-apiserver" to version "1.21.2"
> "172.20.0.2": starting update
> "172.20.0.2": machine configuration patched
> "172.20.0.2": waiting for API server state pod update
< "172.20.0.2": successfully updated
> "172.20.0.3": starting update
> "172.20.0.3": machine configuration patched
> "172.20.0.3": waiting for API server state pod update
< "172.20.0.3": successfully updated
> "172.20.0.4": starting update
> "172.20.0.4": machine configuration patched
> "172.20.0.4": waiting for API server state pod update
< "172.20.0.4": successfully updated
updating "kube-controller-manager" to version "1.21.2"
> "172.20.0.2": starting update
> "172.20.0.2": machine configuration patched
> "172.20.0.2": waiting for API server state pod update
< "172.20.0.2": successfully updated
> "172.20.0.3": starting update
> "172.20.0.3": machine configuration patched
> "172.20.0.3": waiting for API server state pod update
< "172.20.0.3": successfully updated
> "172.20.0.4": starting update
> "172.20.0.4": machine configuration patched
> "172.20.0.4": waiting for API server state pod update
< "172.20.0.4": successfully updated
updating "kube-scheduler" to version "1.21.2"
> "172.20.0.2": starting update
> "172.20.0.2": machine configuration patched
> "172.20.0.2": waiting for API server state pod update
< "172.20.0.2": successfully updated
> "172.20.0.3": starting update
> "172.20.0.3": machine configuration patched
> "172.20.0.3": waiting for API server state pod update
< "172.20.0.3": successfully updated
> "172.20.0.4": starting update
> "172.20.0.4": machine configuration patched
> "172.20.0.4": waiting for API server state pod update
< "172.20.0.4": successfully updated
updating daemonset "kube-proxy" to version "1.21.2"
kube-proxy skipped as DaemonSet was not found
```
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The structure of the controllers is really similar to addresses and
routes:
* `LinkSpec` resource describes desired link state
* `LinkConfig` controller generates `LinkSpecs` based on machine
configuration and kernel cmdline
* `LinkMerge` controller merges multiple configuration sources into a
single `LinkSpec` paying attention to the config layer priority
* `LinkSpec` controller applies the specs to the kernel state
Controller `LinkStatus` (which was implemented before) watches the
kernel state and publishes current link status.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This extracts the function which was used in the upgrade/convert flows to
retry transient errors into the main `kubernetes` package, expands it to
also treat timeout errors as transient, and uses it to retry errors where
applicable in `pkg/kubernetes`.
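The general pattern, sketched with the standard library only. The names
`isRetryable` and `retry` and the example errors are hypothetical; the
real function in `pkg/kubernetes` is not reproduced here and likely
recognizes more error kinds.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// Example errors used in the demonstration below.
var (
	errConnectionClosed = fmt.Errorf("connection closed: %w", io.EOF)
	errNotFound         = errors.New("not found")
)

// isRetryable is a hypothetical classifier: EOF and network timeouts are
// treated as transient, everything else as permanent.
func isRetryable(err error) bool {
	if err == nil {
		return false
	}

	if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
		return true
	}

	var netErr net.Error

	return errors.As(err, &netErr) && netErr.Timeout()
}

// retry calls f up to attempts times, sleeping between transient failures.
// A permanent error is returned immediately and unwrapped.
func retry(attempts int, delay time.Duration, f func() error) error {
	var err error

	for i := 0; i < attempts; i++ {
		if err = f(); err == nil || !isRetryable(err) {
			return err
		}

		if i < attempts-1 {
			time.Sleep(delay)
		}
	}

	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0

	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errConnectionClosed // transient, will be retried
		}

		return nil
	})

	fmt.Println(calls, err)
}
```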
Fixes #3403
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
During the conversion process, the API server goes down, so we can see
lots of network errors, including EOF.
Fixes #3404
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The `--remove-initialized-key` flag is the last resort for converting the
control plane when it is down for whatever reason, so it should work when
the control plane is not available.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>