3666 Commits

Author SHA1 Message Date
Steve Francis
54a687fb8e
docs: consolidate and expand on discovery service
This PR expands the explanation of the discovery service.

Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
2022-10-03 20:53:24 -04:00
Andrey Smirnov
139c62d762
feat: allow upgrades in maintenance mode (only over SideroLink)
This implements a simple way to upgrade a Talos node running in
maintenance mode (only if Talos is installed, i.e. if the `STATE` and
`EPHEMERAL` partitions are wiped).

Upgrade is only available over SideroLink for security reasons.

Upgrade in maintenance mode doesn't support any options, and it works
without machine configuration, so proxy environment variables are not
available, registry mirrors can't be used, and extensions are not
installed.

Fixes #6224

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-30 21:16:15 +04:00
Noel Georgi
48dee48057
feat: support mtu for routes
Support setting MTU for routes.

Fixes: #6324

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-30 16:38:22 +05:30
Noel Georgi
1c43c72aeb
docs: fix talos required kernel params
Fix Talos required kernel parameters. `talos.config` is optional.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-29 01:53:55 +05:30
Andrey Smirnov
67cc45ae3f
release(v1.3.0-alpha.0): prepare release
This is the official v1.3.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
pkg/machinery/v1.3.0-alpha.0 v1.3.0-alpha.0
2022-09-28 17:45:28 +04:00
Serge Logvinov
18c377a4d1
feat: customize audit policy
Add resource `AuditPolicyConfigs.kubernetes.talos.dev`.
It can be changed through machine config `cluster.apiServer.auditPolicy`

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-28 13:46:44 +04:00
Noel Georgi
23c9ea46bb
fix: raspberry pi install
Fix raspberry pi install.

Some fixes were missed from #6388

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-28 01:09:28 +05:30
Philipp Sauter
f17cdee167
feat: jsonpath filter for talosctl get outputs
We add a filter to the `talosctl get` command that allows users to
specify a jsonpath filter. Now they can reduce the information that is
printed to only the parts they are interested in.

Fixes #6109

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
2022-09-27 20:47:11 +02:00
Noel Georgi
6bd3cca1a8
chore: generic raspberry pi images
Use generic Raspberry Pi images. Deprecate the RPi4 specific image.

Ref: https://github.com/siderolabs/pkgs/pull/596

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-27 16:39:12 +05:30
Andrey Smirnov
d914ab8bb4
chore: add vulncheck tool as a linter
See https://go.dev/security/vuln/

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-27 14:40:50 +04:00
Kris Reeves
a0151aa13e
feat: add generic rpi u-boot support
This commit adds support for building Talos for the
Compute Module 4 and other generic Raspberry Pi
hardware.

Fixes: #6273

Signed-off-by: Kris Reeves <kris@pressbuttonllc.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-26 21:04:07 +05:30
Andrey Smirnov
30f851d093
chore: bump dependencies
go-mod-outdated

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-26 18:37:38 +04:00
Andrey Smirnov
8b2235c3b6
fix: lookup Equinix Metal bond slaves using 'permanent addr'
See #6333

Using permanent address fixes issues with mis-matching the links after
they got bonded.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-26 18:10:39 +04:00
Noel Georgi
b3257ebb1c
chore: bump kernel to 5.15.70
Bump kernel to [5.15.70](https://github.com/siderolabs/pkgs/pull/594)

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-26 17:34:47 +05:30
Andrey Smirnov
0b2767c164
feat: implement 'permanent addr' in link statuses
Permanent address is only available for physical links, and it might be
different from the 'hardware address': when bonding, 'hardware address'
gets overridden from the bond master, while 'permanent address' still
shows MAC of the interface.

This is part of the fix for the incorrect bonding issue on Equinix Metal.
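
The difference between the two addresses can be sketched in Go. This is a minimal illustration only: the `link` struct, its fields, and `findByPermanentAddr` are hypothetical names, not the Talos link status resource or its API.

```go
package main

import (
	"fmt"
	"strings"
)

// link is a hypothetical, simplified view of a network link.
type link struct {
	Name          string
	HardwareAddr  string // overridden by the bond master once the link is bonded
	PermanentAddr string // still shows the MAC of the physical interface
}

// findByPermanentAddr returns the link whose permanent address matches mac.
func findByPermanentAddr(links []link, mac string) (link, bool) {
	for _, l := range links {
		if l.PermanentAddr != "" && strings.EqualFold(l.PermanentAddr, mac) {
			return l, true
		}
	}
	return link{}, false
}

func main() {
	// After bonding, both slaves report the bond master's hardware address.
	links := []link{
		{Name: "eth0", HardwareAddr: "aa:bb:cc:00:00:01", PermanentAddr: "aa:bb:cc:00:00:01"},
		{Name: "eth1", HardwareAddr: "aa:bb:cc:00:00:01", PermanentAddr: "aa:bb:cc:00:00:02"},
	}

	l, ok := findByPermanentAddr(links, "AA:BB:CC:00:00:02")
	fmt.Println(l.Name, ok) // eth1 true
}
```

Matching on `HardwareAddr` in this example could not tell `eth0` and `eth1` apart, because both report the same value.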

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-26 14:45:46 +04:00
Serge Logvinov
c90e20251d
fix: kubeconfig permission
Set kubeconfig permission to `-rw-------`

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-23 15:00:43 +04:00
Dmitriy Matrenichev
fc48849d00
chore: move maps/slices/ordered to gen module
Use github.com/siderolabs/gen

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-21 20:22:43 +03:00
Andrey Smirnov
8b09bd4b04
feat: update Kubernetes to v1.26.0-alpha.1
Talos 1.3.0 will ship with Kubernetes 1.26.0.

See https://github.com/kubernetes/kubernetes/releases/tag/v1.26.0-alpha.1

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-21 18:42:31 +04:00
Andrey Smirnov
276d4175bb
chore: bump extension versions in testing
Test with recent versions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-21 17:16:31 +04:00
Noel Georgi
357b770cb5
fix: cryptsetup delete slot
Fix cryptsetup delete slot.

Fixes: #6298

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-21 16:37:54 +05:30
Andrey Smirnov
7111288393
fix: continue applying bootstrap manifests on some errors
Fixes #6302

This allows Talos to proceed if some manifest is invalid (or malformed),
while aborting the loop on connection errors (when `kube-apiserver` is not
ready).

This fixes a problem where a single resource might stop all manifests
from being applied, preventing a cluster bootstrap.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-20 22:27:17 +04:00
Andrey Smirnov
ce12c7b380
chore: update COSI runtime to v0.2.0-alpha.1
This adds metadata annotations and fixes some hanging watch loops.

There should be no functional changes for Talos.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-20 22:02:57 +04:00
Noel Georgi
1b435c0b36
chore: bump kernel + ice drivers
Bump kernel to [5.15.69](https://github.com/siderolabs/pkgs/pull/592)
Add Intel ice drivers

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-20 22:05:02 +05:30
Tim Jones
18e041f1ec
docs: fix typo in patching example
Fix missing 'mc' in talosctl patch example command.

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2022-09-20 15:03:31 +02:00
Andrey Smirnov
0ad6452ca1
feat: update CoreDNS to v1.10.0
See https://github.com/coredns/coredns/blob/master/notes/coredns-1.10.0.md

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-19 18:29:54 +04:00
Andrey Smirnov
479f3f52ee
chore: bump dependencies
go-mod-outdated

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-19 18:02:50 +04:00
Andrey Smirnov
e07c6ae99e
feat: update Kubernetes to v1.25.1
See https://github.com/kubernetes/kubernetes/releases/tag/v1.25.1

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-19 16:23:00 +04:00
Andrey Smirnov
13fdfaffc4
test: fix up default branch name
master -> main

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-19 15:35:44 +04:00
Sander Maijers
ef181321a5
docs: add component diagram; K8s & Talos Linux
- Provide editable component diagram
  (diagrams.net).
- Document for both 1.2 and 1.3.

Signed-off-by: Sander Maijers <3374183+sanmai-NL@users.noreply.github.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-19 12:08:11 +05:30
Andrey Smirnov
aade736435
docs: fix missing variable in OpenEBS docs
With the missing variable, it rendered as empty, creating confusion for our
users.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-15 22:24:13 +04:00
Andrey Smirnov
472590aa82
chore: return InvalidArgument on invalid config in maintenance mode
Follow-up fix for #6258

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-15 21:46:48 +04:00
Andrey Smirnov
e5cabd42cc
feat: enable etcd consistency hashcheck
This will be only enabled for Talos v1.3.x.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-15 21:03:40 +04:00
Andrey Smirnov
015535d905
fix: update discovery client with the redirect fix
See https://github.com/siderolabs/discovery-client/pull/4

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-15 20:32:33 +04:00
Noel Georgi
d0c8e7699c
chore: bump kernel and go
Bump kernel to 5.15.68
Bump go to 1.19.1

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-15 21:22:55 +05:30
Andrey Smirnov
985b0c2e79
chore: remove go.work.sum
This file receives many updates, and we don't want to handle them.

Everyone can have it on their local machine.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-15 18:43:35 +04:00
Andrey Smirnov
69124f1026
feat: update etcd to v3.5.5
See https://github.com/etcd-io/etcd/releases/tag/v3.5.5

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-15 17:09:02 +04:00
Pau Campana
1985a796c0
docs: update docs for pod security
Add new section to see how to disable admission control in control
plane.

Signed-off-by: Pau Campana <pau.campanya.soler@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-15 14:41:26 +04:00
Andrey Smirnov
94b088f02f
fix: set etcd options consistently
This fixes an issue introduced in #5879: options should be set the same way
for both `init` and `controlplane` cases.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-14 22:56:26 +04:00
Dmitriy Matrenichev
92ae7ef4b1
fix: fix protoenc encoding for enums and types with custom encoders
This commit bumps protoenc to v0.2.0 and also adds tests to ensure that encoding fixes are working correctly.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-14 17:47:37 +03:00
Noel Georgi
93809017c5
docs: cpu scaling governor knowledgebase
Add docs on setting cpu scaling governor across all CPUs.

Thanks to @nberlee for the [suggestion](https://github.com/siderolabs/talos/issues/4508#issuecomment-1245633679)

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-14 13:20:28 +05:30
Andrey Smirnov
7b270ff33d
test: fix api controller test
Fixing the test to match the implementation.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-13 15:26:32 +04:00
Andrey Smirnov
2dadcd6695
fix: stop worker nodes from acting as apid routers
Don't allow worker nodes to act as apid routers:

* don't try to issue client certificate for apid on worker nodes
* if a worker node receives an incoming connection with `--nodes` set to
  one of the local addresses of the node, it routes the request to
  itself without proxying

The second point allows using `talosctl -e worker -n worker` to connect
directly to the worker if the connection from the control plane is not
available for some reason.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-13 15:07:31 +04:00
Andrey Smirnov
9eaf33f3f2
fix: never sign client certificate requests in trustd
Talos worker nodes use `trustd` API on control plane nodes to issue
certificates for `apid` service. Access to the API is protected with the
Talos join token specified in the machine configuration.

There was no validation of what kind of certificate is requested, so
`trustd` could issue a certificate which is valid for client
authentication with any set of Talos API RBAC roles, including
`os:admin` role allowing full access to the Talos API on control plane
nodes.

See: GHSA-7hgc-php5-77qq
CVE: CVE-2022-36103

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-13 15:06:09 +04:00
Noel Georgi
4367491247
feat: environment vars for extension service
This allows setting environment variables for the extension service.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-13 14:06:55 +05:30
Andrey Smirnov
0c0cb671ea
chore: mark machine configuration validation failure as InvalidArgument
This makes it easier to distinguish between retriable and fatal
failures.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-12 22:04:54 +04:00
Andrey Smirnov
f424e53404
fix: stop containers more thoroughly
Don't skip pods which are not ready; still try to stop containers inside
pod sandboxes that are not ready.

Re-enable the test with Canal CNI (upstream Calico got fixed).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-12 20:16:40 +04:00
Dmitriy Matrenichev
12827b861c
chore: move "implements" checks to compile time
There is no need to use `assert.Implements` since we can express this check at compile time. Go will eliminate `_` variables and any accompanying allocations during the dead-code elimination phase.

This commit also removes the following code:

    tok := new(v1alpha1.ClusterConfig).Token()
    assert.Implements(t, (*config.Token)(nil), tok)

It doesn't check anything, since v1alpha1.ClusterConfig.Token() already returns a config.Token interface.

Also - run `go work sync` and `go mod tidy`.
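
For reference, the usual Go idiom for a compile-time "implements" check looks like the sketch below. The `Token` interface and `clusterToken` type here are stand-ins invented for illustration; they are not the real `config.Token` or `v1alpha1` types.

```go
package main

import "fmt"

// Token is a stand-in interface for this example.
type Token interface {
	ID() string
}

// clusterToken is a stand-in implementation.
type clusterToken struct{ id string }

func (t *clusterToken) ID() string { return t.id }

// Compile-time assertion: the build fails if *clusterToken ever stops
// implementing Token. The blank identifier means nothing is kept at runtime.
var _ Token = (*clusterToken)(nil)

func main() {
	var tok Token = &clusterToken{id: "abc"}
	fmt.Println(tok.ID()) // abc
}
```

If a method is removed or its signature changes, the `var _ Token = ...` line fails to compile, so no test run is needed to catch it.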

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-12 16:57:24 +03:00
Andrey Smirnov
3a67c42cbf
fix: kill the task processes when cleaning up stale task
The bug was triggered by a `containerd` crash (restart): in this case
the runner receives an error as if the process exited.
The runner tries to restart the container, but as the container is still
running, the attempt to delete the task fails.

With this change Talos always tries to kill the running container and
waits for the container to terminate.

The error message when the bug was triggered looks like:

```
service[kubelet](Waiting): Error running Containerd(kubelet), going to restart forever: failed to clean up task "kubelet": task must be stopped before deletion: running: failed precondition
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-12 17:05:13 +04:00
Andrey Smirnov
14a79e325b
chore: bump dependencies
dependabot

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-12 16:38:21 +04:00
Andrey Smirnov
9beee92e71
docs: fix double vv in Kubernetes version
Fixes #6242

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-12 15:36:26 +04:00