813 Commits

Author SHA1 Message Date
Dmitriy Matrenichev
5f34f5b41f
chore: rename api load balancer to KubePrism
Closes #7432

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-14 15:23:53 +03:00
Dmitriy Matrenichev
078aac92ee
chore: bump deps
Bump:
- REVERT cilium/cilium-cli to v0.14.7
- github.com/Azure/azure-sdk-for-go/sdk/azcore to v1.7.0
- github.com/Azure/azure-sdk-for-go/sdk/storage/azblob to v1.1.0
- github.com/aws/aws-sdk-go to v1.44.300
- github.com/beevik/ntp to v1.2.0
- github.com/docker/docker to v24.0.4+incompatible
- github.com/gomarkdown/markdown to v0.0.0-20230711084535-11b03c0ae6d6
- github.com/hetznercloud/hcloud-go to v1.48.0
- github.com/iancoleman/orderedmap to v0.3.0
- github.com/jsimonetti/rtnetlink to v1.3.4
- github.com/siderolabs/go-debug to v0.2.3
- golang.org/x/net to v0.12.0
- golang.org/x/tools to v0.11.0
- google.golang.org/genproto/googleapis/rpc to v0.0.0-20230711160842-782d3b101e98
- google.golang.org/grpc to v1.56.2
- google.golang.org/protobuf to v1.31.0

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-14 12:44:58 +03:00
Andrey Smirnov
53873b8444
refactor: move ukify into Talos code
This is an intermediate step to move parts of `ukify` down to the main
Talos source tree, and call it from `talosctl` binary.

The next step will be to integrate it into the imager and move `.uki`
build out of the Dockerfile.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-13 19:14:32 +04:00
Noel Georgi
79365d9bac
feat: tpm2 based disk encryption
Support disk encryption using tpm2 and pre-calculated signed PCR values.

Fixes: #7266

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-12 20:41:28 +05:30
Andrey Smirnov
d32dd3a820
chore: update Go to 1.20.6
See https://go.dev/doc/devel/release#go1.20.6

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-12 15:21:26 +04:00
Andrey Smirnov
8017afb107
feat: implement CRI image management and pre-pull on K8s upgrade
Fixes #6391

Implement a set of APIs and commands to manage images in the CRI, and
pre-pull images on Kubernetes upgrades.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-11 19:25:10 +04:00
Andrey Smirnov
1c2f19b367
feat: update Kubernetes to 1.28.0-alpha.4
The Go modules were not tagged for alpha.4, so using alpha.3 tag.

Talos 1.5 will ship with Kubernetes 1.28.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-11 15:40:24 +04:00
Noel Georgi
94e9891c1b
chore: bump sd-boot to v254-rc1
Bump sd-boot.
Fix parsing PE executable offsets.
Set the PE file alignment to be 512 bytes.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-11 15:52:57 +05:30
Noel Georgi
3206db5289
feat: drop tpm simulator for ukify measure
We do not need a tpm simulator for ukify measure. We can pre-calculate
the values. This also means we can build ukify as a static binary.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-10 19:21:32 +05:30
Artem Chernyshev
ce63abb219
feat: add KMS assisted encryption key handler
Talos now supports a new type of encryption key which relies on sealing/unsealing randomly generated bytes with a KMS server:

```
systemDiskEncryption:
  ephemeral:
    keys:
      - kms:
          endpoint: https://1.2.3.4:443
        slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).
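The seal/unseal flow described above can be pictured with a minimal local sketch. This is not the KMS gRPC API from the linked repository: the `sealer` interface, `localKMS`, and `protectDiskKey` are hypothetical names, and AES-GCM with an in-process master key stands in for whatever the real server does.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// sealer is a hypothetical stand-in for a KMS server: it seals and
// unseals opaque byte strings without revealing its own master key.
type sealer interface {
	Seal(plaintext []byte) ([]byte, error)
	Unseal(sealed []byte) ([]byte, error)
}

// localKMS is an in-process illustration only (assumption: AES-GCM).
type localKMS struct{ aead cipher.AEAD }

func newLocalKMS(masterKey []byte) (*localKMS, error) {
	block, err := aes.NewCipher(masterKey)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &localKMS{aead: aead}, nil
}

// Seal prepends a fresh random nonce to the ciphertext.
func (k *localKMS) Seal(plaintext []byte) ([]byte, error) {
	nonce := make([]byte, k.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return k.aead.Seal(nonce, nonce, plaintext, nil), nil
}

// Unseal splits off the nonce and authenticates the rest.
func (k *localKMS) Unseal(sealed []byte) ([]byte, error) {
	n := k.aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed blob too short")
	}
	return k.aead.Open(nil, sealed[:n], sealed[n:], nil)
}

// protectDiskKey generates random key bytes and seals them; only the
// sealed blob would need to be stored next to the encrypted volume.
func protectDiskKey(s sealer) (diskKey, sealed []byte, err error) {
	diskKey = make([]byte, 32)
	if _, err = rand.Read(diskKey); err != nil {
		return nil, nil, err
	}
	sealed, err = s.Seal(diskKey)
	return diskKey, sealed, err
}

func main() {
	masterKey := make([]byte, 32) // hypothetical KMS-side secret
	if _, err := rand.Read(masterKey); err != nil {
		panic(err)
	}
	kms, err := newLocalKMS(masterKey)
	if err != nil {
		panic(err)
	}
	diskKey, sealed, err := protectDiskKey(kms)
	if err != nil {
		panic(err)
	}
	// On a later boot, the sealed blob is sent back to be unsealed.
	unsealed, err := kms.Unseal(sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", bytes.Equal(diskKey, unsealed))
}
```

In this sketch the node never persists the plaintext key, so unlocking the disk depends on reaching the sealer again.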

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-07 19:02:39 +03:00
Andrey Smirnov
2fec8388fc
chore: bump dependencies
Go modules, pkgs, Cilium CLI, CAPI base version.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-05 18:30:54 +04:00
Andrey Smirnov
c9a9f95611
refactor: extract secure boot certificate generation
Fixes #7412

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-03 16:55:02 +04:00
Noel Georgi
cbdf96d461
feat: support environment file for extensions
Supports setting `environmentFile` for Talos System Extension Services.

Fixes: #7316

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-28 00:21:13 +05:30
Noel Georgi
258f074490
fix: ukify cert generation
Fix ukify cert generation by setting the proper Subj/CN.

Fixes: #7383

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-27 20:31:50 +05:30
Andrey Smirnov
fbebc17f8b
fix: disable LVM backups/archive
Fixes #3129

Talos does not have a good location to keep LVM metadata backups.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-27 17:28:18 +04:00
Noel Georgi
e5306ef263
chore: format and cleanup test scripts
This formats and cleans up the test scripts.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-27 16:53:40 +05:30
Tim Jones
53389b1e72
feat: auto-enroll secure boot keys
Uses the auto-enrollment feature of sd-boot to enroll required UEFI Secure
Boot keys.

Fixes: #7373

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-24 00:44:56 +05:30
Noel Georgi
e1b150a110
release(v1.5.0-alpha.1): prepare release
This is the official v1.5.0-alpha.1 release.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-23 00:40:12 +05:30
Noel Georgi
8daf432b29
chore: bump deps
Bump deps.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-22 22:41:08 +05:30
Andrey Smirnov
fe0f46980f
feat: implement secure boot from disk
This includes sd-boot handling, EFI variables, etc.

There are some TODOs which need to be addressed to make things smooth.

Install to disk, upgrades work.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-16 20:15:16 +05:30
Dmitriy Matrenichev
445f5ad542
feat: support API server load balancer
This commit adds support for an API server load balancer. A quick way to enable it is during cluster creation, using the new `api-server-balancer-port` flag (0 by default, meaning disabled). When enabled, all API requests will be routed across
cluster control plane endpoints.
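As a rough illustration of routing requests across control plane endpoints, here is a minimal sketch. The commit message does not say how an endpoint is chosen, so the round-robin policy, the `roundRobin` type, and the example addresses are all assumptions and not the actual implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out endpoints in turn. The selection policy is an
// assumption made for illustration; the real balancer may differ.
type roundRobin struct {
	endpoints []string
	next      atomic.Uint64
}

// pick returns the next endpoint, or false if none are configured.
// It is safe to call from multiple goroutines.
func (r *roundRobin) pick() (string, bool) {
	if len(r.endpoints) == 0 {
		return "", false
	}
	n := r.next.Add(1) - 1
	return r.endpoints[n%uint64(len(r.endpoints))], true
}

func main() {
	// Hypothetical control plane endpoints.
	rr := &roundRobin{endpoints: []string{
		"10.5.0.2:6443",
		"10.5.0.3:6443",
		"10.5.0.4:6443",
	}}
	for i := 0; i < 4; i++ {
		ep, _ := rr.pick()
		fmt.Println(ep)
	}
}
```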

Closes #7191

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-06-16 10:09:20 -04:00
Dmitriy Matrenichev
665702ddd3
chore: fix cilium e2e tests
The result of the `WITH_CONFIG_PATCH_WORKER` check was overriding any value set in the `CONFIG_PATCH_FLAG` variable.
Move it to a different variable.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-14 15:08:31 +04:00
Andrey Smirnov
e858bca3a2
test: fix cilium integration tests
Due to a bug (?) cilium tests don't clean up all the deployments & pods,
leaving one pod in 'Pending' state.

Kubernetes e2e tests check for !Running pods in `kube-system` namespace.

Fix by moving cilium tests to a separate namespace.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-13 20:41:15 +04:00
Andrey Smirnov
0797b0d168
chore: add a pipeline to test cloud-images step without a release
Also uncomment Azure uploader.

Add the Azure environment variables to the Makefile cloud-images step.

Change disk size to 16GiB and tier to P3.

Add a boolean value to the drone pipeline; the cloud images hack checks the value to determine which Azure Compute Gallery to push images to.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
2023-06-12 19:42:27 +04:00
Noel Georgi
a34a948985
fix: copy missing modules.* files
Copy missing `modules.order`, `modules.builtin` and
`modules.builtin.modinfo` files so tools can read them.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-09 17:25:28 +05:30
Noel Georgi
aef2192a65
chore: use fixed module list
Use a fixed list of modules to copy into Talos initramfs.

This makes sure we can still enable things in the Talos kernel as modules but
not ship them by default in Talos (extra modules could be extensions).

Also fixes: #7341

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-09 00:08:42 +05:30
Andrey Smirnov
a61dcdbbd5
fix: don't load RDMA over Ethernet driver by default
Document the change for upgrade notes.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-08 16:28:32 +04:00
Andrey Smirnov
aac441f618
chore: update Go to 1.20.5, bump dependencies
Go dependencies, new pkgs, extras, etc.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-07 23:40:59 +04:00
Christian Rolland
e6dde8ffc5
feat: add network chaos to qemu development environment
Add flags for configuring the qemu bridge interface with chaos options:
- network-chaos-enabled
- network-jitter
- network-latency
- network-packet-loss
- network-packet-reorder
- network-packet-corrupt
- network-bandwidth

These flags are used in /pkg/provision/providers/vm/network.go at the end of the CreateNetwork function, which first checks whether the network-chaos-enabled flag is set and then whether bandwidth is set. This allows developers to simulate clusters with a degraded WAN connection in the development environment and testing pipelines.

If bandwidth is not set, the other options are enabled.
- Note that if bandwidth is set, the other options such as jitter, latency, packet loss, reordering and corruption will not be used. This is for two reasons:
	- Restricting the bandwidth can often introduce many of the other issues being set by the other options.
	- Setting the bandwidth uses a separate queuing discipline (Token Bucket Filter) from the other options (Network Emulator) and requires a much more complex configuration using a Hierarchical Token Bucket Filter which cannot be configured at a granular enough level using the vishvananda/netlink library.

Adding both queuing disciplines to the same interface may be an option to look into in the future, but it would take more extensive testing and control over many more variables, which I believe is out of the scope of this PR. It is also possible to add custom profiles, but that will also take more research to develop common scenarios which combine different options in a realistic manner.
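The selection logic described above can be sketched as a small decision function. This is not the code in CreateNetwork: the `chaosOptions` struct, its field names, the bandwidth unit, and `chooseQdisc` are hypothetical, and no netlink calls are made.

```go
package main

import (
	"fmt"
	"time"
)

// chaosOptions mirrors the flags listed above. The struct and field
// names are hypothetical, as is the unit of Bandwidth.
type chaosOptions struct {
	Enabled       bool
	Jitter        time.Duration
	Latency       time.Duration
	PacketLoss    float64
	PacketReorder float64
	PacketCorrupt float64
	Bandwidth     int // 0 means "not set"
}

type qdiscKind string

const (
	qdiscNone  qdiscKind = "none"
	qdiscTBF   qdiscKind = "tbf"   // Token Bucket Filter: bandwidth only
	qdiscNetem qdiscKind = "netem" // Network Emulator: all other options
)

// chooseQdisc picks exactly one queuing discipline: bandwidth wins over
// the other options, which are then ignored.
func chooseQdisc(o chaosOptions) qdiscKind {
	switch {
	case !o.Enabled:
		return qdiscNone
	case o.Bandwidth != 0:
		return qdiscTBF
	default:
		return qdiscNetem
	}
}

func main() {
	fmt.Println(chooseQdisc(chaosOptions{Enabled: true, Bandwidth: 100, Latency: 50 * time.Millisecond}))
	fmt.Println(chooseQdisc(chaosOptions{Enabled: true, Latency: 50 * time.Millisecond}))
	fmt.Println(chooseQdisc(chaosOptions{Bandwidth: 100}))
}
```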

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-06 20:15:26 +04:00
Noel Georgi
3a865370f5
feat: qemu secureboot
Add qemu support for secureboot testing via `talosctl cluster create`.

Can be tested via:

```bash
sudo -E _out/talosctl-linux-amd64 cluster create --provisioner=qemu $REGISTRY_MIRROR_FLAGS --controlplanes=1 --workers=1 --iso-path=_out/talos-uki-amd64.iso --with-secureboot=true --with-tpm2=true --skip-injecting-config --with-apply-config
```

This currently supports only booting Talos in SecureBoot mode.
Installation and upgrade support will come in separate PRs.

Fixes: #7324

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-06 19:20:07 +05:30
Noel Georgi
423a31ac9d
chore: deprecate bootloader installer option
Deprecate the `bootloader` installer option. This has not been used in a
long while.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-05 23:21:03 +05:30
Noel Georgi
bfc3419376
chore: add default console args
Add default console args for UKI iso.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-05 19:43:20 +05:30
Noel Georgi
3f68485e44
feat: add uki iso generation
This adds code to generate a UKI ISO (UEFI only).

Fixes: #7261

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-02 22:44:27 +05:30
Andrey Smirnov
bab484a405
feat: use stable network interface names
Use `udevd` rules to create stable interface names.

Link controllers should wait for `udevd` to settle down, otherwise link
rename will fail (interface should not be UP).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-01 21:29:12 +04:00
Andrey Smirnov
a0773f783c
chore: add ukify Go script
This is a port of ukify.py and systemd-measure from systemd.

This requires no actual TPM to be present to calculate the PCR
signatures.
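Why no TPM is needed can be seen from how a PCR is extended: the new value is a hash of the old value concatenated with the measurement digest, so the final value can be replayed in software. The sketch below assumes a SHA-256 bank and an all-zero starting value; the function names and section contents are hypothetical, and it does not reproduce which UKI sections are measured or how the result is signed.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// extendPCR computes new = SHA-256(old || SHA-256(data)), which is the
// extend operation for a SHA-256 PCR bank.
func extendPCR(pcr [sha256.Size]byte, data []byte) [sha256.Size]byte {
	digest := sha256.Sum256(data)
	buf := make([]byte, 0, 2*sha256.Size)
	buf = append(buf, pcr[:]...)
	buf = append(buf, digest[:]...)
	return sha256.Sum256(buf)
}

// predictPCR replays a sequence of measurements in order, starting from
// an all-zero PCR (assumed initial state).
func predictPCR(measurements [][]byte) [sha256.Size]byte {
	var pcr [sha256.Size]byte
	for _, m := range measurements {
		pcr = extendPCR(pcr, m)
	}
	return pcr
}

func main() {
	// Hypothetical measured contents, for illustration only.
	measurements := [][]byte{
		[]byte("kernel image bytes"),
		[]byte("initrd bytes"),
		[]byte("kernel command line"),
	}
	pcr := predictPCR(measurements)
	fmt.Println(hex.EncodeToString(pcr[:]))
}
```

Because the result depends only on the inputs and their order, it can be computed at build time.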

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-05-30 23:33:26 +05:30
Andrey Smirnov
b69e38d1ff
chore: bump dependencies
New pkgs, Linux 6.1.30, Flannel 0.22.0, Go modules.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-29 23:19:44 +04:00
Dmitriy Matrenichev
85d8a16194
chore: bump deps
Bump deps

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-05-22 16:02:15 -04:00
Andrey Smirnov
10155c390e
feat: enable xfs project quota support, kubelet feature
This is controlled with a feature flag which gets enabled automatically
for Talos 1.5+.

Fixes #7181

If enabled, configures kubelet to use project quotas to track xfs volume
usage, which is much more efficient than doing `du` periodically.
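A version-gated default of this kind might look like the sketch below. The `version` type, `parseVersion`, and `diskQuotaSupportEnabled` are hypothetical names for illustration; the actual feature flag and version check in Talos are not shown in this commit message.

```go
package main

import "fmt"

// version holds only the components needed for the gate.
type version struct{ Major, Minor int }

// parseVersion reads the leading "vMAJOR.MINOR" of a version string and
// ignores anything after it (patch level, pre-release suffix).
func parseVersion(s string) (version, error) {
	var v version
	if _, err := fmt.Sscanf(s, "v%d.%d", &v.Major, &v.Minor); err != nil {
		return version{}, fmt.Errorf("parse %q: %w", s, err)
	}
	return v, nil
}

// atLeast reports whether v is at or above major.minor.
func (v version) atLeast(major, minor int) bool {
	return v.Major > major || (v.Major == major && v.Minor >= minor)
}

// diskQuotaSupportEnabled is a hypothetical default: on for 1.5+.
func diskQuotaSupportEnabled(v version) bool {
	return v.atLeast(1, 5)
}

func main() {
	for _, s := range []string{"v1.4.3", "v1.5.0-alpha.1", "v2.0.0"} {
		v, err := parseVersion(s)
		if err != nil {
			panic(err)
		}
		fmt.Println(s, diskQuotaSupportEnabled(v))
	}
}
```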

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-19 20:33:39 +04:00
Andrey Smirnov
eba8185642
release(v1.5.0-alpha.0): prepare release
This is the official v1.5.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-19 18:38:24 +04:00
Andrey Smirnov
383471c3e9
feat: update default Kubernetes to v1.27.2
See https://github.com/kubernetes/kubernetes/releases/v1.27.2

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-19 15:14:17 +04:00
Christian Rolland
e0c1585d30
feat: create azure community gallery image version on release
Create Azure Community Gallery Image Version on release:
- Add /hack/cloud-image-uploader/azure.go
	- Upload vhd file to container for all architectures
	- Create managed disk from vhd file for all architectures
	- Create image version from managed disk for all architectures
- Modify /hack/cloud-image-uploader/main.go
	- Start Community Gallery processes concurrently with AWS upload
- Modify /hack/cloud-image-uploader/options.go
	- Add additional Options for Community Gallery processes
- Modify .drone.jsonnet to use secrets for environment variables
	- The following secrets need to be created for this to work:
		- azure_subscription_id
		- azure_client_id
		- azure_client_secret
		- azure_tenant_id

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>

chore: fix linting errors in readme

Fix linting errors in readme

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>

chore: fix markdown linting errors

Fix markdown linting errors in readme

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>

chore: fix markdown linting errors

Fix markdown linting errors in readme

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>

chore: change disk size to match new 10GB cloud image size

Change disk size to match 10GB cloud image size

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
2023-05-18 17:49:55 -04:00
Dmitriy Matrenichev
61cad86731
chore: bump deps
- github.com/containerd/typeurl to v2.1.1
- github.com/aws/aws-sdk-go to v1.44.264
- alpine to 3.18.0
- node to 20.2.0-alpine
- github.com/containernetworking/plugins to v1.3.0
- github.com/docker/docker to v23.0.6+incompatible
- github.com/hetznercloud/hcloud-go to v1.45.1
- github.com/insomniacslk/dhcp to v0.0.0-20230516061539-49801966e6cb
- github.com/rivo/tview to v0.0.0-20230511053024-822bd067b165
- tools to v1.5.0-alpha.0-7-gd2dde48
- pkgs to v1.5.0-alpha.0-16-g7958db1

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-05-18 01:07:36 -04:00
Andrey Smirnov
01dfd3af7d
feat: update etcd to v3.5.9
See https://github.com/etcd-io/etcd/releases/tag/v3.5.9

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-15 15:59:23 +04:00
Noel Georgi
cc3128d944
chore: bump kernel to 6.1.28
Bump kernel to 6.1.28

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-05-15 14:30:02 +05:30
Noel Georgi
3b36993b99
fix: rlimit nofile test
The test was added at the wrong place.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-05-12 16:20:52 +05:30
Dmitriy Matrenichev
45e6e27af7
chore: bump runtime
Use new functions and methods from runtime module.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-05-11 17:18:08 -04:00
Noel Georgi
4f720d4653
fix: revert: set rlimit explicitly in wrapperd
This reverts commit a2565f67416e9b9bc22f2d5506df9ea7771c0c8c.

The fix in `a2565f67` was actually a no-op, caused by misunderstanding the
fix made in Go and backported to [Go 1.20.4](ecf7e00db8).
The fix gave false confidence that it was working when it was tested
against the Talos `main` branch, since PR #7190 bumped the `x/sys` package
from [v0.7.0 -> v0.8.0](ecf7e00db8). The actual change in `x/sys` can be found at ff18efa0a3, which meant that when updating Go to 1.20.4 the `x/sys` package should have been updated too. The `x/sys` package changed how the syscall to set the rlimit is called: it now goes through the Go stdlib instead of calling the rlimit syscall in the `x/sys` package itself, so combining Go 1.20.4 with an older `x/sys` package means the `RLIMIT_NOFILE` value is not set back to its original value.

The Talos 1.4 release branch currently has `x/sys`
at [v0.7.0](https://github.com/siderolabs/talos/blob/v1.4.3/go.mod#L133),
so the backport would consist of this change along with another commit bumping the `x/sys` package to `v0.8.0`.

Fixes: #7198
Fixes: #7206

Co-authored-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-05-11 23:38:20 +05:30
Andrey Smirnov
e67f3f5c54
feat: linux 6.1.27, containerd 1.6.21, go 1.20.4
Plus a bunch of other dependencies.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-08 20:26:19 +04:00
Andrey Smirnov
d43c61e80f
fix: enforce nolock option for all NFS mounts by default
Talos doesn't have `rpc.statd` running, so mounting without locking is
the only option. Some places in Kubernetes don't allow setting mount
options for NFS, so setting defaults is the only way.
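The effect of such a default can be pictured with a small sketch that adds `nolock` to a comma-separated option string unless the caller already chose a locking mode. This only illustrates the idea; `withDefaultNolock` is a hypothetical helper and not how this commit applies the default.

```go
package main

import (
	"fmt"
	"strings"
)

// withDefaultNolock appends "nolock" to a comma-separated NFS mount
// option string unless "lock" or "nolock" is already present, in which
// case the input is returned unchanged.
func withDefaultNolock(opts string) string {
	var fields []string
	for _, f := range strings.Split(opts, ",") {
		f = strings.TrimSpace(f)
		if f == "" {
			continue
		}
		if f == "lock" || f == "nolock" {
			return opts
		}
		fields = append(fields, f)
	}
	fields = append(fields, "nolock")
	return strings.Join(fields, ",")
}

func main() {
	fmt.Println(withDefaultNolock(""))
	fmt.Println(withDefaultNolock("rw,vers=4"))
	fmt.Println(withDefaultNolock("rw,nolock"))
}
```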

Fixes #6582

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-04 17:26:36 +04:00
Andrey Smirnov
e6fffda013
chore: linux 6.1.26, runc 1.1.7
Update to the latest pkgs.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-04-27 20:31:04 +04:00