3859 Commits

Author SHA1 Message Date
Andrey Smirnov
e51a110f0e
chore: bump dependencies
Go modules, container images.

Fixup for new COSI version: `ResourceDefinition` signature.

Update for new gRPC version: endpoints interface.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-15 15:26:55 +04:00
Noel Georgi
2d01480180
feat: automatically load modules based on hw info
Fixes: #6802

Automatically load kernel modules based on hardware info and modules
alias info. udevd would automatically load modules based on HW
information present.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-14 19:57:13 +05:30
Noel Georgi
7b75cd8b94
fix: kernel module dependency tree generation
This fixes the issue when the overlay mount target directory was used as
lowerdir for the mount, creating extra folders in the extension.

Fix the issue by adding support for normal overlay mounts to use a
source directory when specified.

Also fixes a small issue where a message was logged when the error is nil.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-14 01:07:11 +05:30
Noel Georgi
65d02e5ade
fix: dbus shutdown when it's not initialized
If dbus is not started and a shutdown is called, Talos panics. Fix this by
checking if the mock is nil.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-13 21:12:54 +05:30
Andrey Smirnov
a7079ce85c
fix: quote the ampersand character in GRUB config
Not sure how I missed it in the first PR, but that's the only character
which was not quoted properly.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-13 18:58:34 +04:00
Andrey Smirnov
933ba2d820
fix: display correct blockdevice size
See https://github.com/siderolabs/go-blockdevice/pull/67

Fixes #6836

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-13 16:55:35 +04:00
Andrey Smirnov
c449cb736b
fix: talosctl reboot command passing mode in wait mode
The reboot mode was not passed correctly in wait mode.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-13 16:20:07 +04:00
budimanjojo
34ab0007a6
docs: port is needed for wireguard endpoint
Example of `wireguard.peers[].endpoint` is wrong

Signed-off-by: budimanjojo <budimanjojo@gmail.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-13 12:10:17 +05:30
Andrey Smirnov
1e1aa84f6c
fix: kubernetes removed resource version check
Not all deprecated Kubernetes resources are the same: if the old API
version is deprecated but a new one is available, the API server handles
the transition for us. If some resource is removed completely, we need to
check for it. This reduces the number of items to check and simplifies the
check.

Move the check under the umbrella of the 'upgrade pre-checks', and make
it actually fatal.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-10 14:45:09 +04:00
Andrey Smirnov
dcbcf5a93c
fix: wait for network and retry in platform get config funcs
Wait for the network before trying to access the metadata service.

Retry the calls when appropriate (most platforms use `download.Download`
function which does proper retries).

Co-authored-by: Noel Georgi <git@frezbo.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-09 21:04:43 +04:00
Andrey Smirnov
3d7566ec74
test: update Canal CNI manifest URL
With recent changes to Calico website, old URL returns 404.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-08 23:20:56 +04:00
Andrey Smirnov
e09e106665
fix: default dns domain to 'cluster.local' in local case
One case was missing: when network section is present, but value is
omitted.
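
A minimal sketch of the defaulting rule described above, assuming a hypothetical `ClusterNetwork` type and `dnsDomain` helper (neither is taken from the Talos code base). It covers both the missing section and the present-but-empty value:

```go
package main

import "fmt"

// ClusterNetwork is a hypothetical stand-in for the network section of the
// cluster config; it is not the actual Talos type.
type ClusterNetwork struct {
	DNSDomain string
}

const defaultDNSDomain = "cluster.local"

// dnsDomain returns the configured domain, falling back to the default when
// the section is missing or when the section is present but the value is empty.
func dnsDomain(n *ClusterNetwork) string {
	if n == nil || n.DNSDomain == "" {
		return defaultDNSDomain
	}

	return n.DNSDomain
}

func main() {
	fmt.Println(dnsDomain(nil))                                            // section missing
	fmt.Println(dnsDomain(&ClusterNetwork{}))                              // section present, value omitted
	fmt.Println(dnsDomain(&ClusterNetwork{DNSDomain: "example.internal"})) // explicit value
}
```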

Fixes #6825

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-08 14:35:28 +04:00
Noel Georgi
cc6e37a47f
feat: use process wrapper for dropping capabilities
Use process wrapper introduced in #6814 to drop capabilities. This change
also means the capabilities are dropped per process level and not for
PID 1 (machined), which allows us to drop capabilities per process.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-07 00:49:56 +05:30
Steffen Windoffer
0c6c888745
fix: trackable action flag usage text. --no-wait does not exist
--wait gets set to true

Signed-off-by: Steffen Windoffer <steffen@wind0r.de>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-06 15:26:38 +04:00
Noel Georgi
5cb2915d8e
feat: use wrapper for starting processes
Use a wrapper for starting processes which can setup proper cgroups,
OOMscore, and also drop capabilities for the process, then it calls
`execve`.

The containerd test is also fixed to support cgroups when
running tests in buildkit. It used to pass previously because we did not
return an error if cgroup setup failed.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-03 18:32:09 +05:30
Andrey Smirnov
56d9453261
fix: panic in talosctl cluster show
This might happen with docker provisioner if the network is not found.

Fixes #6793

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-03 14:45:52 +04:00
Andrey Smirnov
38a51191e4
fix: correctly expand parameters in the URL
This fixes multiple issues:

* `log.Fatalf` in the machined code leads to kernel panic
* return URL if some expansion fails
* correctly handle destroyed event (wait for the next one)

Fixes #6807

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-02 18:42:45 +04:00
Andrey Smirnov
af21860a22
fix: return proper error if download attempts time out
Fixes #6795

This fixes a problem with Talos being stuck if the download attempts
time out - the returned context.Canceled error was triggering a
different flow which treats sequence take over as a special case, while
there is no other sequence to run.

The correct error should be a timeout.
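
As an illustration of the distinction (not the actual Talos sequencer code), the standard library exposes the two conditions as separate sentinel errors. The `classify` and `demo` helpers below are hypothetical names used only for this sketch:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// classify is a hypothetical helper that separates a timeout from an
// explicit cancellation.
func classify(err error) string {
	switch {
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.Is(err, context.Canceled):
		return "canceled"
	default:
		return "other"
	}
}

// demo produces one context that times out and one that is canceled,
// and classifies the resulting errors.
func demo() (string, string) {
	timeoutCtx, cancelTimeout := context.WithTimeout(context.Background(), time.Millisecond)
	defer cancelTimeout()

	<-timeoutCtx.Done() // wait for the deadline to pass

	cancelCtx, cancel := context.WithCancel(context.Background())
	cancel()

	return classify(timeoutCtx.Err()), classify(cancelCtx.Err())
}

func main() {
	fmt.Println(demo()) // prints "timeout canceled"
}
```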

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-02 18:19:04 +04:00
Andrey Smirnov
54f7d4c923
fix: correctly quote and unquote strings in GRUB config
One of the fields in the GRUB config - boot arguments - contains
user-controlled input. Talos supports variable expansion in
`talos.config` parameter, and uses `${var}` syntax.

In GRUB config, `}` is a special character, and introduction of `}`
breaks config parsing both for GRUB and Talos.

Correctly escape and unescape special characters.
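
A rough sketch of backslash-based quoting and unquoting, to show the round trip. The set of special characters and the function names `quote` and `unquote` are assumptions made for illustration; they are not taken from the Talos source:

```go
package main

import (
	"fmt"
	"strings"
)

// special is an illustrative (assumed) set of characters to escape.
const special = `\{}$|;<>"`

// quote prefixes every special character with a backslash.
func quote(s string) string {
	var b strings.Builder

	for _, r := range s {
		if strings.ContainsRune(special, r) {
			b.WriteRune('\\')
		}

		b.WriteRune(r)
	}

	return b.String()
}

// unquote removes one level of backslash escaping.
func unquote(s string) string {
	var b strings.Builder

	escaped := false

	for _, r := range s {
		if !escaped && r == '\\' {
			escaped = true

			continue
		}

		b.WriteRune(r)

		escaped = false
	}

	return b.String()
}

func main() {
	in := "talos.config=http://example.com/${mac}"
	q := quote(in)

	fmt.Println(q)
	fmt.Println(unquote(q) == in)
}
```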

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-02 17:11:22 +04:00
Andrey Smirnov
54cf0672a7
fix: omit zero MTU in the machine config
Fixes #6747

The setting `mtu: 0` always meant "don't touch MTU", but the presence of
such line is very confusing.
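
A small sketch of how a zero value can be left out of serialized output. It uses `encoding/json` from the standard library as a stand-in for the YAML encoder, and the `Device` type is hypothetical; the actual change in Talos may be implemented differently:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Device is a hypothetical, simplified network device config.
type Device struct {
	Interface string `json:"interface"`
	// omitempty drops the field from the output when MTU is zero.
	MTU int `json:"mtu,omitempty"`
}

// render serializes the device; it returns an empty string on error.
func render(d Device) string {
	out, err := json.Marshal(d)
	if err != nil {
		return ""
	}

	return string(out)
}

func main() {
	fmt.Println(render(Device{Interface: "eth0"}))            // {"interface":"eth0"}
	fmt.Println(render(Device{Interface: "eth0", MTU: 1500})) // {"interface":"eth0","mtu":1500}
}
```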

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-31 15:56:40 +04:00
Sander Maijers
bdc53ac254
docs: add hyperlink to Docker API docs about config.json
This reduces time needed to navigate docs.

Signed-off-by: Sander Maijers <3374183+sanmai-NL@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-30 23:10:03 +04:00
Andrey Smirnov
b3bc06dd14
chore: bump vtprotobuf to v0.4.0
Use new equality generate check.

It's not being used in Talos a lot, it's almost only in the discovery
API client code.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-30 20:50:45 +04:00
Noel Georgi
0ba5e59f69
fix: drone config for renovate PR's
Fix drone config to exclude renovate pushes.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-01-30 19:01:13 +05:30
Noel Georgi
590a393de9
fix: udevd healthcheck
The previous `udevd` healthcheck was incomplete: if `udevd` took longer
to start up, the initial `udevadm trigger` would silently fail and the
devices would not be set up properly, because `udevadm trigger` returns an
exit code of zero even if `udevd` is not running. This PR fixes that by
first checking if the `udevd` control socket exists, which is a faster
check, then making sure `udevd` is up by running the `udevadm control`
command. This ensures that `udevd` is properly initialized before running
any `udevadm trigger` commands even if `udevd` is restarted/killed.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-01-30 16:12:41 +05:30
Steve Francis
2b6b6deacd
docs: simplify and clarify digital ocean docs
Update Talos install guide for the Digital Ocean cloud platform.

Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2023-01-27 10:13:05 +01:00
Andrey Smirnov
92bc15f7f1
release(v1.4.0-alpha.1): prepare release
This is the official v1.4.0-alpha.1 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
v1.4.0-alpha.1 pkg/machinery/v1.4.0-alpha.1
2023-01-25 22:16:25 +04:00
Andrey Smirnov
e3da4754e7
feat: update Linux to 6.1.7
Bring in latest pkgs/tools.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-25 21:25:51 +04:00
Andrey Smirnov
006449e464
test: build integration test early in the pipeline
If we don't pre-build, it's getting built each time the `e2e-*` step
runs, and we have some running in parallel.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-24 16:20:51 +04:00
Noel Georgi
09aa712642
fix: renovate config
Add proper `extractVersion` regex.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-01-24 15:49:52 +05:30
Utku Ozdemir
2d136f1879
feat: set markdown and html descriptions in config json schema
Set the additional description fields for vscode/monaco/jetbrains editors.

Strip the markdown formatting from the plain description.

Additionally, fix the description of the field `aescbcEncryptionSecret`.

Related to siderolabs/talos#6705.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-01-23 23:45:47 +01:00
Noel Georgi
f0804027a4
fix: renovate config
Fix renovate config

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-01-24 01:11:43 +05:30
Noel Georgi
812a2877cd
chore: bump deps + renovate cleanup
Bump dependencies.
Disable renovate for PR's and skip un-needed update checks.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-01-24 00:42:58 +05:30
Andrey Smirnov
aa9f66c1c8
fix: mark DigitalOcean anchor IP as scope link
This excludes it from the `NodeAddress`.

Needs extra testing to confirm that it actually still works as anchor
IP.

Fixes #6760

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-23 20:35:52 +04:00
Noel Georgi
bb4937f1b3
feat: enable renovate
Enable renovate for timely dependency updates.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-01-23 20:08:37 +05:30
Andrey Smirnov
3e00571627
fix: unwrap gRPC errors on stop/remove pods check
As the client returns wrapped errors, unwrap them using our own method
which does `errors.As` instead of gRPC one which doesn't do unwrapping.
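
A standard-library-only sketch of why unwrapping matters. The `statusError` type is a hypothetical stand-in for a gRPC status error, and `codeOf` is an assumed helper name, not the method used in Talos:

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a hypothetical stand-in for an error carrying a status code.
type statusError struct{ code int }

func (e *statusError) Error() string { return fmt.Sprintf("status code %d", e.code) }

// codeOf walks the wrap chain with errors.As to find a *statusError.
func codeOf(err error) (int, bool) {
	var se *statusError
	if errors.As(err, &se) {
		return se.code, true
	}

	return 0, false
}

// demo compares a direct type assertion with errors.As on a wrapped error.
func demo() (directOK bool, code int, asOK bool) {
	wrapped := fmt.Errorf("stopping pod: %w", &statusError{code: 5})

	// A direct type assertion only inspects the outermost error.
	_, directOK = wrapped.(*statusError)

	code, asOK = codeOf(wrapped)

	return directOK, code, asOK
}

func main() {
	fmt.Println(demo()) // prints "false 5 true"
}
```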

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-23 14:29:04 +04:00
Andrey Smirnov
00e52ae078
fix: build correctly etcd initial cluster URL
The expected format with multiple advertised URLs is:

`name=u1,name=u2`

Previously Talos generated:

`name=u1,u2`

(which is wrong)
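
A minimal sketch of building the value in the expected format, with one `name=URL` pair per advertised URL. The function name `buildInitialCluster` is hypothetical and not taken from the Talos source:

```go
package main

import (
	"fmt"
	"strings"
)

// buildInitialCluster repeats the member name for every advertised URL
// and joins the resulting pairs with commas.
func buildInitialCluster(name string, urls []string) string {
	pairs := make([]string, 0, len(urls))

	for _, u := range urls {
		pairs = append(pairs, name+"="+u)
	}

	return strings.Join(pairs, ",")
}

func main() {
	fmt.Println(buildInitialCluster("name", []string{"u1", "u2"})) // prints "name=u1,name=u2"
}
```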

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-20 22:52:47 +04:00
Utku Ozdemir
ae83b10ae8
feat: create JSON schema for v1alpha1.Config
Extend `docgen` tool to generate a JSON schema for `v1alpha1.Config` if a new optional cli arg is provided.

Extend the YAML-structured code comments on config fields to allow overriding the generated schema.

Add custom schemas for complex types.

Related to siderolabs/talos#6705.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-01-20 15:39:46 +01:00
Andrey Smirnov
703d965951
feat: update Kubernetes to 1.26.1, etcd to 3.5.7
See:

* https://github.com/etcd-io/etcd/releases/tag/v3.5.7
* https://github.com/kubernetes/kubernetes/releases/v1.26.1

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-20 15:41:55 +04:00
Steve Francis
965e645915
docs: update to use talosctl install script
Signed-off-by: Steve Francis <steve.francis@talos-systems.com>

Replaced multiple curl examples to get the correct talosctl with a curl that executes the install script.
2023-01-20 12:07:00 +01:00
Dmitriy Matrenichev
c5954f4345
chore: bump deps
For some reason `go-mod-outdated` didn't work for me, so I had to do
this manually.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-01-19 21:40:00 +03:00
Andrey Smirnov
bb50f6a56d
chore: preallocate disk images for QEMU VMs
This improves the performance of the I/O operations if the underlying
filesystem supports it.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-19 19:09:00 +04:00
Noel Georgi
d4b8b35de7
feat: generate kernel module dependency tree
Run `depmod` during install/upgrades when extensions provide kernel
modules and `modules.dep` needs to be re-generated. This also allows
modules with the same name as ones from the kernel to co-exist. Modules
in the `extras` folder take precedence over the `in-built` ones.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-01-19 18:54:10 +05:30
Andrey Smirnov
18122ae73e
fix: service restart (including extension services)
Fixes #6707

There was a race condition between different parts of the service code:
`Stop` waits for the event which is published before the service is
removed from the `running[id]` map, so if one does `Stop` followed by
`Start` (this is what `services restart` API does), by the time it goes
to `Start` it might be still in the `running[id]` map, so `Start` does
nothing.

Overall this code should be rewritten and simplified, but for now move
the sending of these "terminal" events out, so that by the time the event
is published, the service is stopped and removed from the `running[id]`
map.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-18 14:52:47 +04:00
Andrey Smirnov
680fd5e452
fix: bump COSI runtime with the panic controller restart fix
See https://github.com/cosi-project/runtime/pull/211

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-18 14:25:50 +04:00
Andrey Smirnov
0b65bbfc87
fix: handle overwriting tags in syslinux ADV
This is (still) being used in Talos to handle upgrade rollbacks.

There were multiple problems with this code, and one of them leads to
panic if the tag is written multiple times without deletion:

```
github.com/siderolabs/talos/internal/app/machined/pkg/runtime/v1alpha1/bootloader/adv/syslinux.ADV.SetTagBytes({0xc00175bc00?, 0x1f11dbe?, 0xed4f4d?}, 0x0?, {0xc000afb7f0?, 0x400?, 0x0?})
/src/internal/app/machined/pkg/runtime/v1alpha1/bootloader/adv/syslinux/syslinux.go:125 +0x270
github.com/siderolabs/talos/internal/app/machined/pkg/runtime/v1alpha1/bootloader/adv/syslinux.ADV.SetTag(...)
/src/internal/app/machined/pkg/runtime/v1alpha1/bootloader/adv/syslinux/syslinux.go:95
github.com/siderolabs/talos/cmd/installer/pkg/install.(*Installer).Install(0xc0004374a0, 0x5)
/src/cmd/installer/pkg/install/install.go
```

The `uint8()` conversion was causing overflow and wrong index when ADV
real length is over 255.
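
A short illustration of the truncation, using an arbitrary example length of 300 (the helper name `truncate` is hypothetical). Converting an `int` to `uint8` keeps only the low 8 bits:

```go
package main

import "fmt"

// truncate mimics the problematic conversion: values above 255 wrap around.
func truncate(length int) uint8 {
	return uint8(length)
}

func main() {
	fmt.Println(truncate(200)) // prints 200: fits into 8 bits
	fmt.Println(truncate(300)) // prints 44: 300 mod 256
}
```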

Fix multiple writes of the same tag by deleting previous value first.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-17 23:21:39 +04:00
Serge Logvinov
70d9428a1d
fix: kubespan MSS clamping
Change TCP maximum segment size if it goes through the KubeSpan to match
KubeSpan MTU.
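
The arithmetic behind MSS clamping can be sketched as follows, assuming minimal IP and TCP headers without options. The MTU of 1420 is only an example value, and `mssForMTU` is a hypothetical helper:

```go
package main

import "fmt"

const (
	ipv4HeaderLen = 20 // minimal IPv4 header, no options
	ipv6HeaderLen = 40 // fixed IPv6 header
	tcpHeaderLen  = 20 // minimal TCP header, no options
)

// mssForMTU returns the largest TCP payload that fits into one packet
// of the given MTU.
func mssForMTU(mtu int, ipv6 bool) int {
	if ipv6 {
		return mtu - ipv6HeaderLen - tcpHeaderLen
	}

	return mtu - ipv4HeaderLen - tcpHeaderLen
}

func main() {
	fmt.Println(mssForMTU(1420, false)) // prints 1380
	fmt.Println(mssForMTU(1420, true))  // prints 1360
}
```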

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-17 19:02:33 +04:00
Dmitriy Matrenichev
683b4ccb4f
chore: update Go to 1.19.5 and kernel to 6.1.4
Release notes https://go.dev/doc/devel/release#go1.19.5

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-01-12 17:28:22 +03:00
Andrey Smirnov
062c7d754b
test: fix integration test on cp endpoint update
With #6724, the controlplane node kubelet doesn't use the control plane
endpoint anymore, so run the test on a worker node instead of a cp node.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-12 15:23:14 +04:00
Dmitriy Matrenichev
8e9fc13d7c
feat: implement enum generator for proto files
`structprotogen` now supports generating enums directly instead of using a predeclared file and hardcoded types. To use this functionality, put `structprotogen:gen_enum` in the comment above the const block for which you want proto definitions.

Closes #6215

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-01-11 16:02:21 +03:00
Cees-Jan Kiewiet
771b0dc061
docs: update left over rpi_4 ref to rpi_generic
While following this guide I found that one reference to rpi_4 wasn't
updated to rpi_generic yet, this commit fixes that.

Signed-off-by: Cees-Jan Kiewiet <ceesjank@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-11 15:57:55 +04:00