Commit Graph

42 Commits

Author SHA1 Message Date
Alexey Palazhchenko
f7d276b854 chore: remove old osctl reference
One place was missed.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-19 08:08:58 -07:00
Andrey Smirnov
f0512dfce9 feat: update Kubernetes to 1.20.5
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changelog-since-v1204

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-19 03:14:46 -07:00
Andrey Smirnov
cbc38418d8 release(v0.10.0-alpha.0): prepare release
This is the official v0.10.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-17 08:40:09 -07:00
Seán C McCord
1362966ff5 docs: rewrite getting-started for ISO
Update the Getting Started documentation to reflect the new ISO-based
installation method.

Fixes #3016

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2021-03-12 04:44:10 -08:00
Andrey Smirnov
49853fc2ec fix: mkdir source of the extra mounts for the kubelet
This makes sure source directory exists before performing mount
operation.

Also adds an ability to patch the config bundle configs with JSON patch,
which is exposed in `talosctl cluster create`, this allowed me to easily
test this fix:

```
talosctl cluster create ... --config-patch='[{"op": "add", "path": "/machine/kubelet/extraMounts", "value": [{"destination": "/var/log/containers", "type": "bind", "source": "/var/log/containers", "options": ["rshared", "rbind", "rw"]}]}]'
```
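As a rough illustration of what the JSON patch above does to a config, the following sketch applies "add" operations to nested dictionaries. It is not the Talos implementation: `apply_add` is a hypothetical helper, the starting `config` is a heavily trimmed stand-in for a real machine config, and only the "add" operation on objects is handled.

```python
import copy
import json


def apply_add(document, path, value):
    """Apply one JSON Patch "add" operation to nested dicts (objects only)."""
    result = copy.deepcopy(document)
    # Split the JSON Pointer into tokens and undo the ~1 and ~0 escapes.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    target = result
    for token in tokens[:-1]:
        target = target[token]  # parent objects must already exist
    target[tokens[-1]] = value
    return result


patch = json.loads(
    '[{"op": "add", "path": "/machine/kubelet/extraMounts", "value": '
    '[{"destination": "/var/log/containers", "type": "bind", '
    '"source": "/var/log/containers", "options": ["rshared", "rbind", "rw"]}]}]'
)

# Hypothetical, heavily trimmed machine config used only for this sketch.
config = {"machine": {"kubelet": {}}}
for op in patch:
    if op["op"] != "add":
        raise ValueError("this sketch handles only the add operation")
    config = apply_add(config, op["path"], op["value"])
```

The patch fails with a `KeyError` if `/machine/kubelet` is missing, which mirrors the JSON Patch rule that the parent of an added member must already exist.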

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-05 11:47:55 -08:00
Andrey Smirnov
ec72ae892b release(v0.9.0-alpha.5): prepare release
This is the official v0.9.0-alpha.5 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-03 12:04:05 -08:00
Andrey Smirnov
60b7f79fd8 feat: add --on-reboot flag to talosctl edit/patch machineConfig
This allows applying config even if the sequencer is locked, to recover from
configuration mistakes.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-03 08:48:29 -08:00
Andrey Smirnov
60aa011c7a feat: rename namespaces, resources, types etc
See https://github.com/talos-systems/os-runtime/pull/12 for new naming
conventions.

No functional changes.

Additionally implements printing extra columns in `talosctl get xyz`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-02 13:34:15 -08:00
Andrey Smirnov
3a2caca781 release(v0.9.0-alpha.4): prepare release
This is the official v0.9.0-alpha.4 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-02 12:50:20 -08:00
Andrey Smirnov
a12a5dd255 release(v0.9.0-alpha.3): prepare release
This is the official v0.9.0-alpha.3 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-01 12:55:08 -08:00
Artem Chernyshev
376fdcf6cb feat: implement etcd remove-member cli command
Fixes: https://github.com/talos-systems/talos/issues/3219

We already have `etcd leave`, which makes the node exclude itself from
etcd members.
But if the node can't remove itself because it has no
connection to etcd, we need this `etcd remove-member` CLI command, which
removes that node's membership by running on a different node.

No unit tests for that as it's going to destroy the test cluster.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-01 07:55:08 -08:00
Andrey Smirnov
d173fd4c01 feat: update etcd to 3.4.15
See https://github.com/etcd-io/etcd/releases/tag/v3.4.15

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-01 06:16:40 -08:00
Andrey Smirnov
c7ee239087 fix: show stopped/exited containers via CRI inspector
This fixes output of `talosctl containers` to show failed/exited
containers so that it's possible to see e.g. `kube-apiserver` container
when it fails to start. This also enables using ID from the container
list to see logs of failing containers, so it's easy to debug issues
when control plane pods don't start because of wrong configuration.

Also remove option to use either CRI or containerd inspector, default to
containerd for system namespace and to CRI for kubernetes namespace.

The only side effect is that we can't see `kubelet` container in the
output of `talosctl containers -k`, but `kubelet` itself is available in
`talosctl services` and `talosctl logs kubelet`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-26 14:45:13 -08:00
Andrey Smirnov
d7cdc8cc15 feat: implement simple layer 2 shared IP for CP
This adds a VIP (virtual IP) option to the network configuration of an
interface, which will allow a set of nodes to share a floating IP
address among them.  For now, this is restricted to control plane use
and only a single shared IP is supported.

Fixes #3111

Signed-off-by: Seán C McCord <ulexus@gmail.com>
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-26 14:14:34 -08:00
Artem Chernyshev
041620c852 feat: implement talosctl edit and patch config commands
Fixes: https://github.com/talos-systems/talos/issues/3209

Using parts of `kubectl` package to run the editor.
Also using the same approach as in `kubectl edit` command:
- add commented section to the top of the file with the description.
- if the config has errors, display validation errors in the commented
section at the top of the file.
- retry apply config until it succeeds.
- abort if no changes were detected or if the edited file is empty.

Patch currently supports jsonpatch only and can read it either from the
file or from the inline argument.
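The edit flow described above can be sketched roughly as follows. This is only an illustration of the loop, not the code in this commit: `edit_until_applied` and its `edit`, `validate`, and `apply` callbacks are hypothetical names, and the editor is passed in as a function so the sketch stays self-contained.

```python
def edit_until_applied(config, edit, validate, apply):
    """Run an edit/validate/apply loop; return True if a change was applied."""
    errors = []
    current = config
    while True:
        # Validation errors from the previous round go into the comment header.
        header_lines = ["Edit the config below."] + errors
        header = "".join(f"# {line}\n" for line in header_lines)
        edited = edit(header + current)
        # Drop comment lines before comparing and validating.
        body = "".join(
            line for line in edited.splitlines(keepends=True)
            if not line.startswith("#")
        )
        if not body.strip() or body == config:
            return False  # abort: empty file or no changes detected
        errors = validate(body)
        if not errors:
            apply(body)
            return True
        current = body  # keep the user's edits for the next attempt
```

`validate` is assumed to return a list of error strings (empty when the config is valid). Stripping every line that starts with `#` is a simplification made for this sketch.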

https://asciinema.org/a/wPawpctjoCFbJZKo2z2ATDXeC

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-26 02:00:20 +03:00
Andrey Smirnov
589d01892c fix: update the layout of the Disks API to match proxying requirements
Fixes #3199

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-24 11:33:15 -08:00
Andrey Smirnov
5aa75e020e release(v0.9.0-alpha.2): prepare release
This is the official v0.9.0-alpha.2 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-20 14:46:12 -08:00
Andrey Smirnov
8789849c70 feat: add support for extra volume mounts for control plane pods
This allows mounting extra volumes into Talos-managed control plane
static pods. With additional options like extra files, any additional
content/configuration can be mounted.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-20 08:50:09 -08:00
Andrey Smirnov
2b76c4890f feat: add an option to disable kube-proxy manifest
This option drops the kube-proxy manifest from the list of bootstrap
manifests. It might be used with CNIs which don't need `kube-proxy`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-19 07:26:34 -08:00
Andrey Smirnov
e9fc54f6e3 feat: update Kubernetes to 1.20.3
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changelog-since-v1202

Also updates pkgs for:

* talos-systems/pkgs#238 (raspberrypi-firmware update)
* talos-systems/pkgs#242 (Linux 5.10.17 + init_on_free=0)

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-19 05:22:34 -08:00
Artem Chernyshev
54d6a45217 feat: add state encryption support
State partition encryption support adds a new section to the machine config
and a new step to the sequencer flow, which saves the encryption
configuration object as a JSON-serialized value in the META partition.

Everything else is the same as for the ephemeral partition.
Additionally enabled state partition encryption in the disk encryption
integration tests.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-18 06:55:22 -08:00
Andrey Smirnov
8e35560baa release(v0.9.0-alpha.1): prepare release
This is the official v0.9.0-alpha.1 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-18 04:45:53 -08:00
Andrey Smirnov
7751920dba feat: add a tool and package to convert self-hosted CP to static pods
This is required to upgrade from Talos 0.8.x to 0.9.x. After the cluster
is fully upgraded, control plane is still self-hosted (as it was
bootstrapped with bootkube).

Tool `talosctl convert-k8s` (and the library behind it) performs the
conversion from the self-hosted control plane to static pods.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-17 23:26:57 -08:00
Artem Chernyshev
58ff2c9808 feat: implement ephemeral partition encryption
This PR introduces the first part of disk encryption support.
New config section `systemDiskEncryption` was added into MachineConfig.
For now it contains only Ephemeral partition encryption.

Encryption itself supports two kinds of keys for now:
- node id deterministic key.
- static key which is hardcoded in the config and mainly used for test
purposes.

`talosctl cluster create` can now be told to encrypt the ephemeral partition
by using the `--encrypt-ephemeral` flag.

Additionally:
- updated pkgs library version.
- changed Dockerfile to copy cryptsetup deps from pkgs.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-17 13:39:04 -08:00
Andrey Smirnov
e5bd35ae3c feat: add resource watch API + CLI
This uses API in `os-runtime` to pull the initial list of resources +
updates for resource by type.
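The "initial list, then updates" pattern can be sketched as below. This does not use the `os-runtime` API: `watch`, the event labels, and the `None` end-of-stream marker are all assumptions made for illustration.

```python
import queue


def watch(snapshot, updates):
    """Yield the initial list of resources, then updates as they arrive."""
    for resource in snapshot:
        yield ("existing", resource)
    while True:
        event = updates.get()  # blocks until the next update is available
        if event is None:      # hypothetical end-of-stream marker
            return
        yield ("updated", event)


updates = queue.Queue()
updates.put({"id": "kubelet", "running": False})
updates.put(None)

events = list(watch([{"id": "kubelet", "running": True}], updates))
```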

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-17 13:24:47 -08:00
Andrey Smirnov
cc83b83808 feat: rename apply-config --no-reboot to --on-reboot
This explains the intention better: config is applied on reboot, and
makes it easy to distinguish it from `apply-config --immediate` which
applies config immediately without a reboot (that is coming in a
different PR).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-17 12:49:47 -08:00
Andrey Smirnov
d99a016af2 fix: correct response structure for GenerateConfig API
Also fix recovery grpc handler to print panic stacktrace to the log.

Any API should follow the structure compatible with apid proxying
injection of errors/nodes.

Explicitly fail GenerateConfig API on worker nodes, as it panics on
worker nodes (missing certificates in node config).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-11 06:34:10 -08:00
Andrey Smirnov
daea9d3811 feat: support version contract for Talos config generation
This allows generating current-version Talos configs (by default) or
backwards compatible configuration (e.g. for Talos 0.8).

`talosctl gen config` defaults to current version, but explicit version
can be passed to the command via flags.

`talosctl cluster create` defaults to install/container image version,
but that can be overridden. This makes `talosctl cluster create` now
compatible with 0.8.1 images out of the box.

Upgrade tests use contract based on source version in the test.

When used as a library, `VersionContract` can be omitted (defaults to
current version) or passed explicitly. `VersionContract` can be
conveniently parsed from Talos version string or specified as one of the
constants.
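A minimal sketch of the idea of a version contract, assuming a contract is just a major/minor pair that gates which fields get generated. The Python class mirrors the `VersionContract` name only loosely; `parse`, `CURRENT`, `TALOS_0_8`, `generate_config`, and the `new_in_0_9` field are hypothetical and do not reflect the actual Go API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class VersionContract:
    major: int
    minor: int

    @classmethod
    def parse(cls, version):
        """Parse a contract from a version string such as "v0.8.1"."""
        match = re.match(r"^v?(\d+)\.(\d+)", version)
        if match is None:
            raise ValueError(f"cannot parse version {version!r}")
        return cls(int(match.group(1)), int(match.group(2)))


CURRENT = VersionContract(0, 9)    # assumed "current" version for this sketch
TALOS_0_8 = VersionContract(0, 8)  # example of a predefined constant


def generate_config(contract=CURRENT):
    """Generate a config, omitting fields the target version does not know."""
    config = {"version": "v1alpha1"}
    # Hypothetical feature gate: emit the field only for 0.9 and later.
    if contract >= VersionContract(0, 9):
        config["new_in_0_9"] = True
    return config
```

Omitting the argument gives the current-version behavior, while `generate_config(VersionContract.parse("v0.8.1"))` produces output restricted to what an older version would accept.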

Fixes #3130

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-10 13:02:52 -08:00
Andrey Smirnov
7f3dca8e4c test: add support for IPv6 in talosctl cluster create
Modify provision library to support multiple IPs, CIDRs, gateways, which
can be IPv4/IPv6. Based on IP types, enable services in the cluster to
run DHCPv4/DHCPv6 in the test environment.
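Choosing services from the IP types can be sketched with the standard `ipaddress` module. `dhcp_services` and the returned keys are hypothetical names used only for this illustration; the provision library itself is written in Go and may decide differently.

```python
import ipaddress


def dhcp_services(cidrs):
    """Decide which DHCP servers to run based on the address families in use."""
    versions = {ipaddress.ip_network(cidr, strict=False).version for cidr in cidrs}
    return {"dhcpv4": 4 in versions, "dhcpv6": 6 in versions}


dual_stack = dhcp_services(["172.20.0.0/24", "fd83:b1f7:fcb5:2802::/64"])
ipv4_only = dhcp_services(["10.5.0.0/24"])
```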

There's an outstanding bug left with routes not being properly set up in
the cluster, so IPs are not properly routable, but DHCPv6 works and IPs
are allocated (validates DHCPv6 client).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-09 13:28:53 -08:00
Andrew Rynhard
3aaa888f9a docs: fix typos
Fixes a few typos in our docs.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2021-02-09 10:50:53 -08:00
Andrey Smirnov
edf5777222 feat: add an option to force upgrade without checks
Our upgrades are safe by default - we check etcd health, take locks,
etc. But sometimes upgrades might be a way to recover broken (or
semi-broken) cluster, in that case we need upgrade to run even if the
checks are not passing. This is not a safe way to do upgrades, but it
might be a way to recover a cluster.
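The behavior described here amounts to a guard around the pre-upgrade checks. A minimal sketch, assuming checks are simple callables that return `True` when healthy; `upgrade`, `checks`, and `force` are hypothetical names and not the Talos API:

```python
def upgrade(checks, do_upgrade, force=False):
    """Run the upgrade, skipping the safety checks only when force is set."""
    if not force:
        failed = [name for name, check in checks.items() if not check()]
        if failed:
            raise RuntimeError("upgrade checks failed: " + ", ".join(failed))
    do_upgrade()
```

With `force=True` the checks are never evaluated, so a broken etcd cannot block the upgrade; the caller accepts the risk.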

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-09 10:20:03 -08:00
Andrew Rynhard
4734fe7dd3 feat: upgrade CoreDNS to 1.8.0
Brings in v1.8.0 of CoreDNS.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2021-02-08 11:59:12 -08:00
Andrey Smirnov
6cf98a7322 feat: implement IPv6 DHCP client in networkd
This renames the existing `DHCP` implementation to `DHCP4`; the new client is
`DHCP6`.

For now, `DHCP6` is disabled by default and should be explicitly enabled
with the config.

QEMU testbed for IPv6 is going to be pushed as a separate PR.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-05 02:22:18 -08:00
Andrey Smirnov
42cadf5c51 release(v0.9.0-alpha.0): prepare release
This is the official v0.9.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-02 14:07:26 -08:00
Spencer Smith
e4e6da3881 feat: allow fqdn to be used when registering k8s node
This PR fixes a problem we had with AWS clusters. We now allow the
kubelet to register using the full fqdn instead of just hostname.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2021-02-01 13:19:48 -05:00
Andrey Smirnov
76a6794436 fix: kill all processes and umount all disk on reboot/shutdown
There are several ways Talos node might be restarted or shut down:

1. error in sequence (initiated from machined)
2. panic in main goroutine (machined recovers panics)
3. error in sequence (initiated via API, event caught by machined)
4. reboot/shutdown via Talos API

Before this change, paths (1) and (2) were handled in machined, and no
disks were unmounted and no processes were killed, so technically all the
processes were still running and potentially writing to the filesystems.
Paths (3) and (4) try to stop services (but not pods) and unmount
explicitly mounted filesystems, followed by reboot directly from
sequencer (bypassing machined handler).

There was a bug that user disks were never explicitly unmounted (but
they might have been unmounted if mounted on top of `/var`).

This refactors all the reboot/shutdown paths to flow through machined's
main function: on path (4) the event is sent via event API from the
sequencer back to the machined and machined initiates proper shutdown
sequence.

Refactoring in machined leads to all the paths (1)-(4) flowing through
the same function `handle(error)`.

Added two additional checks before flushing buffers:

* kill all non-system processes, this also kills all mount namespaces
* unmount any filesystem backed by `/dev/*`

This ensures all filesystems are unmounted before buffers are flushed.
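The second check can be pictured as a filter over the mount table. The sketch below parses text in the `/proc/mounts` format and picks the mount points whose source is under `/dev/`; `dev_backed_mountpoints` is a hypothetical helper, and the deepest-first ordering is an assumption of this sketch rather than something stated in the commit.

```python
def dev_backed_mountpoints(mounts_text):
    """Return mount points backed by /dev/*, deepest paths first."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("/dev/"):
            points.append(fields[1])

    def depth(path):
        # "/" has depth 1, "/var" depth 2, "/var/mnt" depth 3, and so on.
        return len(path.rstrip("/").split("/"))

    # Nested mounts must be unmounted before their parents.
    return sorted(points, key=depth, reverse=True)


sample = (
    "/dev/sda6 /var xfs rw 0 0\n"
    "proc /proc proc rw 0 0\n"
    "/dev/sdb1 /var/mnt/data ext4 rw 0 0\n"
    "tmpfs /run tmpfs rw 0 0\n"
)
to_unmount = dev_backed_mountpoints(sample)
```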

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-29 06:14:07 -08:00
Andrey Smirnov
e0a0f58801 feat: use multi-arch images for k8s and Flannel CNI
Flannel was updated to version 0.13, which has a multi-arch image.

Kubernetes images are multi-arch.

Fixes #3049

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-28 08:26:02 -08:00
Andrey Smirnov
0aaf8fa968 feat: replace bootkube with Talos-managed control plane
Control plane components are running as static pods managed by the
kubelets.

Whole subsystem is managed via resources/controllers from os-runtime.

Many supporting changes/refactoring to enable new code paths.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-26 14:22:35 -08:00
Andrey Smirnov
11863dd74d feat: implement resource API in Talos
This brings in `os-runtime` package and exposes resources with first
iteration of read-only API.

Two Talos resources (and one controller) are implemented:

* legacy.Service resource tracks Talos 'service' `RUNNING` state
* config.V1Alpha1 stores current runtime config

Glue point between existing runtime and new os-runtime based runtime is
in `v1alpha2` implementation and `V1Alpha2()` sub-interfaces of existing
`Runtime`, `State`, `Controller` interfaces.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-19 11:45:46 -08:00
Andrey Smirnov
d71ac4c4ff feat: update Kubernetes to 1.20.2
Minor point release, official changelog:

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-15 09:06:18 -08:00
Artem Chernyshev
9883d0af19 feat: support Wireguard networking
This is the first iteration of Wireguard network support.
What was done:
- kernel was updated to enable Wireguard kernel module.
- changed networkd to support creating Wireguard device type.
- used wgctrl to configure wireguard.
- updated `talosctl cluster create` to support generating Wireguard
network configuration automatically by just specifying the network cidr.
- added docs about Wireguard support/how to use it.
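One part of generating a Wireguard network configuration from a CIDR is handing each node an address inside that network. The sketch below shows only that step, under the assumption that addresses are assigned sequentially; `assign_addresses` is a hypothetical helper, key generation is left out, and the actual allocation logic in `talosctl cluster create` is not described in this commit.

```python
import ipaddress


def assign_addresses(cidr, nodes):
    """Give each node the next usable host address from the network."""
    network = ipaddress.ip_network(cidr)
    hosts = iter(network.hosts())
    return {node: f"{next(hosts)}/{network.prefixlen}" for node in nodes}


addresses = assign_addresses("10.1.0.0/24", ["master-1", "worker-1"])
```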

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-01-14 15:51:14 +03:00
Andrew Rynhard
00d345fd3a docs: add v0.9 docs
Adds documentation for v0.9, copied from v0.8.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2021-01-13 15:42:25 +03:00