4 Commits

Author SHA1 Message Date
Andrey Smirnov
a1e6415403 fix: retry Kubernetes API errors on cordon/uncordon/etc
This extracts the function that was used in the upgrade/convert flows to
retry transient errors into the main `kubernetes` package, expands it to
ignore timeout errors, and uses it to retry errors where applicable in
`pkg/kubernetes`.

Fixes #3403

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-02 03:51:40 -07:00
Andrey Smirnov
e039172eda fix: ignore EOF errors from Kubernetes API when converting control plane
During the conversion process, the API server goes down, so we can see
lots of network errors, including EOF.

Fixes #3404

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-01 10:52:44 -07:00
Andrey Smirnov
81acadf345 fix: ignore connection refused errors when updating/converting cp
Without a load balancer, when the API server goes down, there will be
connection refused errors, which should be retried.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-05 06:59:06 -08:00
Andrey Smirnov
117c5c3075 feat: implement command talosctl upgrade-k8s
This command handles upgrading the Kubernetes control plane from 1.18.x
and 1.19.x to 1.19.x.

There's automatic handling of pod-checkpointer to speed up
kube-apiserver upgrades.

A separate PR will add K8s upgrade to integration tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-09-10 14:08:49 -07:00