29 Commits

Author SHA1 Message Date
Dmitry Sharshakov
4575dd8e74
chore: allow not preallocated disks for QEMU cluster
Preallocation still done by default for correct max usage estimates, but
in development environment it could be beneficial not to use up that
space, so I added a flag to disable preallocation

Signed-off-by: Dmitry Sharshakov <d3dx12.xx@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-23 16:45:44 +04:00
Dmitriy Matrenichev
fa3b933705
chore: replace fmt.Errorf with errors.New where possible
This time use `eg` from `x/tools` repo tool to do this.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-14 17:39:30 +03:00
Andrey Smirnov
383e528df8
chore: allow uuid-based hostnames in talosctl cluster create
This is useful when the VMs are booted without machine config,
so default hostnames based on controlplanes/workers no longer make
sense.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-07 16:22:53 +04:00
Andrey Smirnov
01f0cbe61c
feat: support iPXE direct booting in talosctl cluster create
This embeds a tiny TFTP server which serves UEFI iPXE which embeds a
script that chainloads a given iPXE script.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-19 17:56:08 +04:00
Artem Chernyshev
ce63abb219
feat: add KMS assisted encryption key handler
Talos now supports new type of encryption keys which rely on Sealing/Unsealing randomly generated bytes with a KMS server:

```
systemDiskEncryption:
  ephemeral:
    keys:
      - kms:
          endpoint: https://1.2.3.4:443
        slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-07 19:02:39 +03:00
Christian Rolland
e6dde8ffc5
feat: add network chaos to qemu development environment
Add flags for configuring the qemu bridge interface with chaos options:
- network-chaos-enabled
- network-jitter
- network-latency
- network-packet-loss
- network-packet-reorder
- network-packet-corrupt
- network-bandwidth

These flags are used in /pkg/provision/providers/vm/network.go at the end of the CreateNetwork function to first see if the network-chaos-enabled flag is set, and then check if bandwidth is set.  This will allow developers to simulate clusters having a degraded WAN connection in the development environment and testing pipelines.

If bandwidth is not set, it will then enable the other options.
- Note that if bandwidth is set, the other options such as jitter, latency, packet loss, reordering and corruption will not be used.  This is for two reasons:
	- Restriction the bandwidth can often intoduce many of the other issues being set by the other options.
	- Setting the bandwidth uses a separate queuing discipline (Token Bucket Filter) from the other options (Network Emulator) and requires a much more complex configuration using a Heirarchial Token Bucket Filter which cannot be configured at a granular enough level using the vishvananda/netlink library.

Adding both queuing disciplines to the same interface may be an option to look into in the future, but would take more extensive testing and control over many more variables which I believe is out of the scope of this PR.  It is also possible to add custom profiles, but will also take more research to develop common scenarios which combine different options in a realistic manner.

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-06 20:15:26 +04:00
Andrey Smirnov
badbc51e63
refactor: rewrite code to include preliminary support for multi-doc
`config.Container` implements a multi-doc container which implements
both `Container` interface (encoding, validation, etc.), and `Conifg`
interface (accessing parts of the config).

Refactor `generate` and `bundle` packages to support multi-doc, and
provide backwards compatibility.

Implement a first (mostly example) machine config document for
SideroLink API URL.

Many places don't properly support multi-doc yet (e.g. config patches).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-31 18:38:05 +04:00
Andrey Smirnov
96aa9638f7
chore: rename talos-systems/talos to siderolabs/talos
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-03 16:50:32 +04:00
Andrey Smirnov
30bbf6463a
refactor: use siderolabs/net version with netip.Addr
Replace most of `net.IP` usage in Talos with `netip.Addr`, refactor code
accordingly.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-02 14:21:03 +04:00
Andrey Smirnov
343c55762e
chore: replace talos-systems Go modules with siderolabs
This the first step towards replacing all import paths to be based on
`siderolabs/` instead of `talos-systems/`.

All updates contain no functional changes, just refactorings to adapt to
the new path structure.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-01 12:55:40 +04:00
Andrey Smirnov
2f2d97b6b5
fix: don't wait for the hostname in maintenance mode
Fixes #6119

With new stable default hostname feature, any default hostname is
disabled until the machine config is available.

Talos enters maintenance mode when the default config source is empty,
so it doesn't have any machine config available at the moment
maintenance service is started.

Hostname might be set via different sources, e.g. kernel args or via
DHCP before the machine config is available, but if all these sources
are not available, hostname won't be set at all.

This stops waiting for the hostname, and skips setting any DNS names in
the maintenance mode certificate SANs if the hostname is not available.

Also adds a regression test via new `--disable-dhcp-hostname` flag to
`talosctl cluster create`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-23 17:52:20 +04:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Andrey Smirnov
da2985fe1b
fix: respect local API server port
It wasn't used when building an endpoint to the local API server, so
Talos couldn't talk to the local API server when port was changed from
the default one.

Fixes #5706

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-09 00:33:49 +04:00
Andrey Smirnov
19bf12af07
fix: enable IPv6 in Docker-based Talos clusters
Docker by default disable IPv6 completely in the containers which breaks
SideroLink on Docker-based clusters, as SideroLink is using IPv6
addresses for the Wiregurard tunnel.

This change might break `talosctl cluster create` on host systems which
have IPv6 disabled completely, so provide a flag to revert this
behavior.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-01 20:28:12 +03:00
Andrey Smirnov
f7d1e77769
feat: provide SideroLink client implementation
Related to #4448

The only remaining part is filtering out SideroLink addresses when Talos
looks for a node address.

See also https://github.com/talos-systems/siderolink/pull/2

The way to test it out:

```
$ talosctl cluster create ... --extra-boot-kernel-args
siderolink.api=172.20.0.1:4000
```

(where 172.20.0.1 is the bridge IP)

Run `siderolink-agent` (test implementation):

```
$ sudo _out/siderolink-agent-linux-amd64
```

Now on the host, there should be a `siderolink` Wireguard userspace
tunnel:

```
$ sudo wg
interface: siderolink
  public key: 2aq/V91QyrHAoH24RK0bldukgo2rWk+wqE5Eg6TArCM=
  private key: (hidden)
  listening port: 51821

peer: Tyr6C/F3FFLWtnzqq7Dsm54B40bOPq6++PTiD/zqn2Y=
  endpoint: 172.20.0.1:47857
  allowed ips: fdae:41e4:649b:9303:b6db:d99c:215e:dfc4/128
  latest handshake: 2 minutes, 2 seconds ago
  transfer: 3.62 KiB received, 1012 B sent

...
```

Each Talos node will be registered as a peer, tunnel is established.

You can now ping Talos nodes from the host over the tunnel:

```
$ ping fdae:41e4:649b:9303:b6db:d99c:215e:dfc4
PING fdae:41e4:649b:9303:b6db:d99c:215e:dfc4(fdae:41e4:649b:9303:b6db:d99c:215e:dfc4) 56 data bytes
64 bytes from fdae:41e4:649b:9303:b6db:d99c:215e:dfc4: icmp_seq=1 ttl=64 time=0.352 ms
64 bytes from fdae:41e4:649b:9303:b6db:d99c:215e:dfc4: icmp_seq=2 ttl=64 time=0.437 ms
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-22 16:44:35 +03:00
Alexey Palazhchenko
eea750de2c chore: rename "join" type to "worker"
Closes #3413.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-07-09 07:10:45 -07:00
Andrey Smirnov
33119d2b8e chore: add an option to launch cluster with bad RTC state
This is useful for time sync testing.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-23 13:08:20 -07:00
Andrey Smirnov
82804414fc test: provide a way to force different boot order in provision library
There's no change to the default behavior. This change is going to be
used with Sidero/Sfyra.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-18 13:38:22 -07:00
Andrey Smirnov
7f3dca8e4c test: add support for IPv6 in talosctl cluster create
Modify provision library to support multiple IPs, CIDRs, gateways, which
can be IPv4/IPv6. Based on IP types, enable services in the cluster to
run DHCPv4/DHCPv6 in the test environment.

There's outstanding bug left with routes not being properly set up in
the cluster so, IPs are not properly routable, but DHCPv6 works and IPs
are allocated (validates DHCPv6 client).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-09 13:28:53 -08:00
Artem Chernyshev
a83af03730 refactor: update go-blockdevice and restructure disk interaction code
This refactoring is required to simplify the work to be done to support
disk encryption.

Tried to minimize amount of queries done by `blockdevice` `probe`
methods.
Instead, where we have `runtime.Runtime` we get all required blockdevices
there from blockdevice cache stored in `State().Machine().Disk()`.
This opens a way to store encryption settings in the `Partition`
objects.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-01-28 17:42:09 +03:00
Artem Chernyshev
6540e9bf70 feat: support disk image in talosctl cluster create
Fixes: https://github.com/talos-systems/talos/issues/2973

Can now supply disk image using `--disk-image-path` flag.
May need to enable `--with-apply-config` if it's necessary to bootstrap
nodes properly.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-12-22 17:06:00 +03:00
Andrey Smirnov
c5ffe9f4f7 test: add support for mounting ISO in talosctl cluster create
If disk is empty and ISO path is given, QEMU provisioner mounts ISO on
the first boot.

To drop into maintenance mode:

```
talosctl cluster create --provisioner=qemu --iso-path=./_out/talos-amd64.iso --skip-injecting-config --wait=false
```

Then inject the config, bootstrap the node, wait for it to come up (via
`talosctl cluster health`).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-10 05:55:44 -08:00
Andrey Smirnov
b2b86a622e fix: remove 'token creds' from maintenance service
This fixes the reverse Go dependency from `pkg/machinery` to `talos`
package.

Add a check to `Dockerfile` to prevent `pkg/machinery/go.mod` getting
out of sync, this should prevent problems in the future.

Fix potential security issue in `token` authorizer to deny requests
without grpc metadata.

In provisioner, add support for launching nodes without the config
(config is not delivered to the provisioned nodes).

Breaking change in `pkg/provision`: now `NodeRequest.Type` should be set
to the node type (as config can be missing now).

In `talosctl cluster create` add a flag to skip providing config to the
nodes so that they enter maintenance mode, while the generated configs
are written down to disk (so they can be tweaked and applied easily).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-09 14:10:32 -08:00
Andrey Smirnov
8560fb9662 chore: enable nlreturn linter
Most of the fixes were automatically applied.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-09 06:48:07 -08:00
Andrey Smirnov
350d75eb46 feat: build talosctl-cni-bundle, use it in talosctl for QEMU
This builds a bundle with CNI plugins for talosctl which is
automatically downloaded by `talosctl` if CNI plugins are missing.

CNI directories are moved by default to the `~/.talos/cni` path.

Also add a bunch of pre-flight checks to the QEMU provisioner to make it
easier to bootstrap the Talos QEMU cluster.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-30 16:30:37 -07:00
Artem Chernyshev
061b296530 feat: allow specifying user-disks in talosctl cluster create
User-disks are supported by QEMU and Firecracker providers.
Can be defined by using the following parameters:
```
--user-disk /mount/path:1GB
```

Can get more than 1 user disk.
Same set of user disks will be created for all master and worker nodes.

Additionally enable user-disks in qemu e2e test.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-10-30 08:44:08 -07:00
Andrey Smirnov
d60adf9e3b test: add support for PXE nodes in qemu provision library
This isn't supposed to be used ever in Talos directly, but rather only
in integration tests for Sidero.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-21 06:49:32 -07:00
Andrey Smirnov
bddd4f1bf6 refactor: move external API packages into machinery/
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.

And `pkg/machinery` is published as Go module inside Talos repository.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-17 09:56:14 -07:00
Andrey Smirnov
9379cf9ee1 refactor: expose provision as public package
This change is only moving packages and updating import paths.

Goal: expose `internal/pkg/provision` as `pkg/provision` to enable other
projects to import Talos provisioning library.

As cluster checks are almost always required as part of provisioning
process, package `internal/pkg/cluster` was also made public as
`pkg/cluster`.

Other changes were direct dependencies discovered by `importvet` which
were updated.

Public packages (useful, general purpose packages with stable API):

* `internal/pkg/conditions` -> `pkg/conditions`
* `internal/pkg/tail` -> `pkg/tail`

Private packages (used only on provisioning library internally):

* `internal/pkg/inmemhttp` -> `pkg/provision/internal/inmemhttp`
* `internal/pkg/kernel/vmlinuz` -> `pkg/provision/internal/vmlinuz`
* `internal/pkg/cniutils` -> `pkg/provision/internal/cniutils`

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-12 05:12:05 -07:00