63 Commits

Author SHA1 Message Date
Andrey Smirnov
edf5777222 feat: add an option to force upgrade without checks
Our upgrades are safe by default: we check etcd health, take locks,
etc. But sometimes an upgrade might be a way to recover a broken (or
semi-broken) cluster, and in that case we need the upgrade to run even
if the checks are not passing. This is not a safe way to do upgrades,
but it might be a way to recover a cluster.
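
A minimal Go sketch of the idea described above, not the actual Talos
implementation: checks run before the upgrade, and a force option lets
the upgrade proceed when they fail. The names `check`, `runUpgrade`
and `unhealthy` are hypothetical, and whether the real code still
evaluates the checks in forced mode is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// check is a hypothetical pre-upgrade safety check (e.g. etcd health).
type check func() error

// errUnhealthy and unhealthy simulate a failing check for the demo.
var errUnhealthy = errors.New("etcd is unhealthy")

func unhealthy() error { return errUnhealthy }

// runUpgrade runs every check before calling upgrade. Without force, the
// first failing check aborts the upgrade. With force, failures are collected
// and returned, but the upgrade still runs.
func runUpgrade(checks []check, force bool, upgrade func() error) (skipped []error, err error) {
	for _, c := range checks {
		if cerr := c(); cerr != nil {
			if !force {
				return nil, fmt.Errorf("pre-upgrade check failed: %w", cerr)
			}
			skipped = append(skipped, cerr)
		}
	}
	return skipped, upgrade()
}

func main() {
	upgrade := func() error { fmt.Println("upgrading"); return nil }

	_, err := runUpgrade([]check{unhealthy}, false, upgrade)
	fmt.Println("safe mode error:", err)

	skipped, err := runUpgrade([]check{unhealthy}, true, upgrade)
	fmt.Println("forced mode ignored checks:", len(skipped), "error:", err)
}
```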

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-09 10:20:03 -08:00
Andrew Rynhard
4734fe7dd3 feat: upgrade CoreDNS to 1.8.0
Brings in v1.8.0 of CoreDNS.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2021-02-08 11:59:12 -08:00
Andrey Smirnov
6cf98a7322 feat: implement IPv6 DHCP client in networkd
This renames the existing `DHCP` implementation to `DHCP4`; the new
client is `DHCP6`.

For now, `DHCP6` is disabled by default and should be explicitly enabled
with the config.

A QEMU testbed for IPv6 is going to be pushed as a separate PR.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-05 02:22:18 -08:00
Andrey Smirnov
42cadf5c51 release(v0.9.0-alpha.0): prepare release
This is the official v0.9.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-02 14:07:26 -08:00
Spencer Smith
e4e6da3881 feat: allow fqdn to be used when registering k8s node
This PR fixes a problem we had with AWS clusters. We now allow the
kubelet to register using the full FQDN instead of just the hostname.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2021-02-01 13:19:48 -05:00
Andrey Smirnov
76a6794436 fix: kill all processes and umount all disk on reboot/shutdown
There are several ways a Talos node might be restarted or shut down:

1. error in sequence (initiated from machined)
2. panic in main goroutine (machined recovers panics)
3. error in sequence (initiated via API, event caught by machined)
4. reboot/shutdown via Talos API

Before this change, paths (1) and (2) were handled in machined, and no
disks were unmounted and no processes were killed, so technically all
the processes were still running and potentially writing to the
filesystems. Paths (3) and (4) tried to stop services (but not pods)
and unmount explicitly mounted filesystems, followed by a reboot
directly from the sequencer (bypassing the machined handler).

There was a bug: user disks were never explicitly unmounted (but
they might have been unmounted if mounted on top of `/var`).

This refactors all the reboot/shutdown paths to flow through machined's
main function: on path (4), an event is sent via the event API from the
sequencer back to machined, and machined initiates the proper shutdown
sequence.

Refactoring in machined leads to all the paths (1)-(4) flowing through
the same function `handle(error)`.
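
As a rough illustration of funnelling panics and ordinary errors into
one handler, here is a minimal Go sketch. It is not the machined code:
`runSequence` and the body of `handle` are hypothetical, and only the
`handle(error)` shape comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// runSequence runs fn and converts a panic into an error, so that panics and
// ordinary errors reach the caller in the same form.
func runSequence(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return fn()
}

// handle is the single exit path in this sketch. It only describes what it
// would do; a real handler would clean up and then reboot or shut down.
func handle(err error) string {
	if err != nil {
		return "cleanup, then restart after failure: " + err.Error()
	}
	return "cleanup, then requested reboot/shutdown"
}

func main() {
	// Ordinary error from a sequence.
	fmt.Println(handle(runSequence(func() error { return errors.New("sequence failed") })))
	// Panic inside the sequence, recovered and routed through the same handler.
	fmt.Println(handle(runSequence(func() error { panic("boom") })))
	// Clean, requested reboot/shutdown.
	fmt.Println(handle(runSequence(func() error { return nil })))
}
```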

Added two additional checks before flushing buffers:

* kill all non-system processes, this also kills all mount namespaces
* unmount any filesystem backed by `/dev/*`

This ensures all filesystems are unmounted before buffers are flushed.
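
The second check can be sketched in Go as follows. This is an
illustration under assumptions, not the Talos code: `devBackedMounts`
is a hypothetical helper that takes text in `/proc/mounts` format and
returns the mount points whose source is under `/dev/`, in reverse
order so that nested mounts come before their parents. It does not
decode escaped characters in paths (such as `\040` for a space) and
does not perform the unmount itself.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// devBackedMounts parses text in /proc/mounts format (source, target, type,
// options, ...) and returns the targets whose source starts with /dev/.
// The result is reversed: /proc/mounts lists mounts in mount order, so
// unmounting in reverse handles nested mounts before their parents.
func devBackedMounts(mounts string) []string {
	var targets []string

	scanner := bufio.NewScanner(strings.NewReader(mounts))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 || !strings.HasPrefix(fields[0], "/dev/") {
			continue
		}
		targets = append(targets, fields[1])
	}

	for i, j := 0, len(targets)-1; i < j; i, j = i+1, j-1 {
		targets[i], targets[j] = targets[j], targets[i]
	}

	return targets
}

func main() {
	// Sample data; a real caller would read /proc/mounts instead.
	sample := "/dev/sda6 /var xfs rw 0 0\n" +
		"proc /proc proc rw 0 0\n" +
		"/dev/sdb1 /var/mnt/data xfs rw 0 0\n"

	for _, target := range devBackedMounts(sample) {
		fmt.Println("would unmount", target)
	}
}
```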

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-29 06:14:07 -08:00
Andrey Smirnov
e0a0f58801 feat: use multi-arch images for k8s and Flannel CNI
Flannel was updated to version 0.13, which has a multi-arch image.

Kubernetes images are multi-arch.

Fixes #3049

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-28 08:26:02 -08:00
Andrey Smirnov
0aaf8fa968 feat: replace bootkube with Talos-managed control plane
Control plane components are running as static pods managed by the
kubelets.

The whole subsystem is managed via resources/controllers from os-runtime.

Many supporting changes/refactoring to enable new code paths.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-26 14:22:35 -08:00
vlad doster
a2b6939c21 docs: update components.md
- fix spelling
- increase readability
- fix grammar
- reduce verbiage

Signed-off-by: Vladislav Doster <mvdoster@gmail.com>
2021-01-26 07:37:17 -08:00
Andrey Smirnov
11863dd74d feat: implement resource API in Talos
This brings in the `os-runtime` package and exposes resources with the
first iteration of a read-only API.

Two Talos resources (and one controller) are implemented:

* legacy.Service resource tracks Talos 'service' `RUNNING` state
* config.V1Alpha1 stores current runtime config

The glue point between the existing runtime and the new os-runtime based
runtime is in the `v1alpha2` implementation and the `V1Alpha2()`
sub-interfaces of the existing `Runtime`, `State`, `Controller` interfaces.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-19 11:45:46 -08:00
Andrey Smirnov
d71ac4c4ff feat: update Kubernetes to 1.20.2
Minor point release, official changelog:

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-15 09:06:18 -08:00
Artem Chernyshev
9883d0af19 feat: support Wireguard networking
This is the first iteration of Wireguard network support.
What was done:
- kernel was updated to enable Wireguard kernel module.
- changed networkd to support creating Wireguard device type.
- used wgctrl to configure wireguard.
- updated `talosctl cluster create` to support generating Wireguard
network configuration automatically by just specifying the network cidr.
- added docs about Wireguard support/how to use it.
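
To illustrate what generating configuration from just a network CIDR
could involve, here is a minimal Go sketch that hands out one address
per node from a given prefix. It is an assumption about the general
approach, not the `talosctl` implementation: `allocate` is a
hypothetical helper, it starts at the first address after the network
address, and it does not reserve a broadcast address.

```go
package main

import (
	"fmt"
	"net/netip"
)

// allocate returns n consecutive addresses from cidr, starting right after
// the network address. Each result keeps the prefix length of the network,
// e.g. 10.1.0.1/24. It fails if the prefix is too small for n nodes.
func allocate(cidr string, n int) ([]netip.Prefix, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}

	prefix = prefix.Masked() // normalize to the network address
	addr := prefix.Addr()
	out := make([]netip.Prefix, 0, n)

	for i := 0; i < n; i++ {
		addr = addr.Next()
		if !addr.IsValid() || !prefix.Contains(addr) {
			return nil, fmt.Errorf("%s has no room for %d nodes", cidr, n)
		}
		out = append(out, netip.PrefixFrom(addr, prefix.Bits()))
	}

	return out, nil
}

func main() {
	addrs, err := allocate("10.1.0.0/24", 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	for i, a := range addrs {
		fmt.Printf("node %d: %s\n", i, a)
	}
}
```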

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-01-14 15:51:14 +03:00
Andrew Rynhard
00d345fd3a docs: add v0.9 docs
Adds documentation for v0.9, copied from v0.8.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2021-01-13 15:42:25 +03:00