78 Commits

Author SHA1 Message Date
Dmitriy Matrenichev
61cad86731
chore: bump deps
- github.com/containerd/typeurl to v2.1.1
- github.com/aws/aws-sdk-go to v1.44.264
- alpine to 3.18.0
- node to 20.2.0-alpine
- github.com/containernetworking/plugins to v1.3.0
- github.com/docker/docker to v23.0.6+incompatible
- github.com/hetznercloud/hcloud-go to v1.45.1
- github.com/insomniacslk/dhcp to v0.0.0-20230516061539-49801966e6cb
- github.com/rivo/tview to v0.0.0-20230511053024-822bd067b165
- tools to v1.5.0-alpha.0-7-gd2dde48
- pkgs to v1.5.0-alpha.0-16-g7958db1

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-05-18 01:07:36 -04:00
Andrey Smirnov
e67f3f5c54
feat: linux 6.1.27, containerd 1.6.21, go 1.20.4
Plus bunch of other dependencies.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-05-08 20:26:19 +04:00
Nico Berlee
0af8fe2fb5
feat: netstat pod support
talosctl netstat -k show all host and non-hostnetwork pods sockets/connections.
talosctl netstat namespace/pod shows sockets/connections of a specific pod +
autocompletes in the shell.

Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-30 23:39:38 +04:00
Noel Georgi
36a9a208ec
chore: bump deps
Bump deps

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-22 16:37:27 +05:30
Andrey Smirnov
442cb9c1b0
feat: implement APIs to write to META
This allows to put keys to META partition.

META contents can be viewed with `talosctl get metakeys`.

There is not real usecase for it yet, but the next PRs will introduce
two special keys which can be written:

* platform network config for `metal`
* `${code}` variable

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-15 22:17:52 +04:00
Andrey Smirnov
e03902b546
feat: update Go to 1.20.2
Also bump Linux to 6.1.15.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-10 16:41:17 +04:00
Nico Berlee
97048f7c37
feat: netstat in API and client
Implements netstat in Talos API and client (talosctl).

Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-09 15:48:30 +04:00
Artem Chernyshev
b520710810
feat: introduce new flag in reset API that makes Talos reset user disks
Fixes: https://github.com/siderolabs/talos/issues/6815

Additionally, make it possible to run reset in maintenance mode: to
enable a way for resetting system disk and remove all traces of Talos
from it.

The new reset flow works in a separate sequence, changed disk probe
lookup to check the boot partition instead of the ephemeral one.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-02-28 15:10:41 +03:00
Andrey Smirnov
96629d5ba6
feat: implement etcd maintenance commands
This allows to safely recover out of space quota issues, and perform
degragmentation as needed.

`talosctl etcd status` command provides lots of information about the
cluster health.

See docs for more details.

Fixes #4889

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-01-03 23:25:28 +04:00
Andrey Smirnov
89dbb0ecf0
release(v1.4.0-alpha.0): prepare release
This is the official v1.4.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-23 22:32:09 +04:00
Andrey Smirnov
31fb905358
feat: update Linux 6.1.1, containerd 1.6.14
Bumps tools/pkgs/extras to the latest.

Bumps Go modules.

Enables adaptive capacity for COSI state.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-23 20:30:09 +04:00
Andrey Smirnov
f7a9a90db2
chore: update pkgs/tools (Go 1.19.4, containerd 1.6.11)
Update to the latest pkgs/tools to fix the build due to vulncheck.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-09 17:25:47 +04:00
Philipp Sauter
4e114ca120
feat: use the etcd member id for etcd operations instead of hostname
We add a controller that provides the etcd member id as a resource
and change the etcd related commands to support member ids next to
hostnames.

Fixes: #6223

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
2022-11-10 19:17:56 +04:00
Andrey Smirnov
96aa9638f7
chore: rename talos-systems/talos to siderolabs/talos
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-03 16:50:32 +04:00
Andrey Smirnov
30bbf6463a
refactor: use siderolabs/net version with netip.Addr
Replace most of `net.IP` usage in Talos with `netip.Addr`, refactor code
accordingly.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-02 14:21:03 +04:00
Tim Jones
e6fba7d3bc
chore: update dependencies
Updates:
* pkgs v1.3.0-alpha.0-33-g8fe5cbc
* tools v1.3.0-alpha.0-20-g3b5f89a
* aws-sdk-go v1.44.120
* docker v20.10.20+incompatible
* fsnotify v1.6.0
* nftables v0.0.0-20221015190445-4f5cd5826fbd
* gen v0.4.0
* grpc-proxy v0.4.0
* spf13/cobra v1.6.0
* u-root v0.10.0
* x/net v0.1.0
* x/sync v0.1.0
* x/sys v0.1.0
* x/term v0.1.0
* x/time v0.1.0
* grpc v1.50.1
* genproto v0.0.0-20221018160656-63c7b68cfc55
* Linux kernel 5.15.74

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2022-10-21 15:20:01 +04:00
Andrey Smirnov
7d52bad370
feat: update Linux to 5.15.73
Also updates tools/pkgs/extras:

* https://github.com/siderolabs/tools/pull/238
* https://github.com/siderolabs/tools/pull/239
* https://github.com/siderolabs/pkgs/pull/605
* https://github.com/siderolabs/extras/pull/63

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-10-12 23:01:38 +04:00
Noel Georgi
d0c8e7699c
chore: bump kernel and go
Bump kernel to 5.15.68
Bump go to 1.19.1

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-15 21:22:55 +05:30
Utku Ozdemir
b5da686a7b
feat: add actor ID to events & emit an initial empty event
Add a new field `actorID` to the events and populate it with a UUID for the lifecycle actions `reboot`, `reset`, `upgrade` and `shutdown`. This actor ID will be present on all events emitted by this triggered action. We can use this ID later on the client side to be able to track triggered actions.

We also emit an event with an empty payload on the events streaming GRPC endpoint when a client connects. The purpose of this event is to signal to the client that the event streaming has actually started.

Server-side part of siderolabs/talos#5499.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-08-11 15:14:11 +02:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Andrey Smirnov
fe2ee3b100
feat: implement MachineStatus resource
Fixes #5789

Example:

```yaml
spec:
    stage: running
    status:
        ready: false
        unmetConditions:
            - name: staticPods
              reason: kube-system/kube-controller-manager-talos-default-master-1 not ready, kube-system/kube-scheduler-talos-default-master-1 not ready
```

As events (CLI doesn't show full contents):

```
172.20.0.2   cbhf2l6f9lrs738hehfg   talos/runtime/machine.MachineStatusEvent   BOOTING   ready: false, unmet conditions: [time network services]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-01 18:36:10 +04:00
Andrey Smirnov
065b59276c
feat: implement packet capture API
This uses the `go-packet` library with native bindings for the packet
capture (without `libpcap`). This is not the most performant way, but it
allows us to avoid CGo.

There is a problem with converting network filter expressions (like
`tcp port 3222`) into BPF instructions, it's only available in C
libraries, but there's a workaround with `tcpdump`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-19 01:23:09 +04:00
Andrey Smirnov
022581d809
release(v1.2.0-alpha.0): prepare release
This is the official v1.2.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-30 19:01:07 +04:00
Artem Chernyshev
2b03057b91
feat: implement a new mode try in the config manipulation commands
The new mode allows changing the config for a period of time, which
allows trying the configuration and automatically rolling it back in case
if it doesn't work for example.

The mode can only be used with changes that can be applied without a
reboot.

When changed it doesn't write the configuration to disk, only changes it
in memory.
`--timeout` parameter can be used to customize the rollback delay.
The default timeout is 1 minute.

Any consequent configuration change will abort try mode and the last
applied configuration will be used.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-21 20:31:45 +03:00
Artem Chernyshev
2b9722d1f5
feat: add dry-run flag in apply-config and edit commands
Dry run prints out config diff, selected application mode without
changing the configuration.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-14 19:12:57 +03:00
Andrey Smirnov
25d19131d3
release(v1.1.0-alpha.0): prepare release
This is the official v1.1.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-01 18:23:19 +03:00
Dmitriy Matrenichev
e06e1473b0
feat: update golangci-lint to 1.45.0 and gofumpt to 0.3.0
- Update golangci-lint to 1.45.0
- Update gofumpt to 0.3.0
- Fix gofumpt errors
- Add goimports and format imports since gofumports is removed
- Update Dockerfile
- Fix .golangci.yml configuration
- Fix linting errors

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-03-24 08:14:04 +04:00
Andrey Smirnov
59681b8c9a
fix: backport fixes from release-1.0 branch
They were discovered as we tagged 1.0.0 version:

* wrong deprecated version
* incompatibility in extension compatibility checks

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-04 23:28:06 +03:00
Tim Jones
fe40e7b1b3
feat: drain node on shutdown
Cordon & drain a node when the Shutdown message is received.
Also adds a '--force' option to the shutdown command in case the control
plane is unresponsive.

Signed-off-by: Tim Jones <timniverse@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-02-01 00:06:32 +03:00
Artem Chernyshev
2f2bdb26aa
feat: replace flags with --mode in apply, edit and patch commands
Fixes: https://github.com/talos-systems/talos/issues/4588

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-13 16:09:53 +03:00
Andrey Smirnov
cb548a368a
release(v0.15.0-alpha.0): prepare release
This is the official v0.15.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-30 16:27:19 +03:00
Rohit Dandamudi
7f9922296a
feat: add powercycle mode in reboot
- Fixes #4569
- Updated reboot process sequence
- Updted api.descriptors to avoid proto type change linting error https://github.com/talos-systems/talos/pull/4612#discussion_r758599242
Signed-off-by: Rohit Dandamudi <rohit.dandamudi@siderolabs.com>

Signed-off-by: Rohit Dandamudi <rohit.dandamudi@siderolabs.com>
2021-12-02 22:40:04 +05:30
Alexey Palazhchenko
0f169bf9b1
chore: add API deprecations mechanism
Refs #4576.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-30 06:31:55 +00:00
Alexey Palazhchenko
20d39c0b48
chore: format .proto files
Refs #2722.

Co-authored-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-23 15:05:25 +00:00
Artem Chernyshev
f730252579
feat: add new event types
Add config load + validation errors and address + hostnames events.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-11-18 18:48:35 +03:00
Alexey Palazhchenko
6e16fd2fee
chore: update tools, pkgs, and extras
To use Go 1.17.3.

Closes #4493.
Closes #4496.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-08 16:00:00 +00:00
Andrey Smirnov
dadaa65d54
feat: print uid/gid for the files in ls -l
This adds information about file ownership in the long listing which is
crucial sometimes.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-13 00:10:49 +03:00
Andrey Smirnov
5b57a98008
chore: update Go to 1.16.7, Linux to 5.10.57
See:

* https://github.com/talos-systems/tools/pull/144
* https://github.com/talos-systems/pkgs/pull/319
* https://github.com/talos-systems/extras/pull/24

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-12 18:31:03 +03:00
Andrey Smirnov
eefe1c21c3
feat: add new etcd members in learner mode
Fixes #3714

This provides more safe way to join new members to the etcd cluster.

See https://etcd.io/docs/v3.4/learning/design-learner/

With learner mode join there are few differences:

* new nodes are joined one by one, because etcd enforces a single
learner member in the cluster
* learner members are not counted in quorum calculations, so while
learner catches up with the master node, quorum is not affected and
cluster is still operational

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-12 17:56:57 +03:00
Andrey Smirnov
0c7ce1cd81 feat: remove remnants of bootkube support
Fixes #3951

Bootkube support was removed in Talos 0.9. Talos versions 0.9-0.11
support conversion of self-hosted bootkube-based control plane to the
new style control plane running as static pods managed by Talos.

This commit removes all backwards compatibility and removes conversion
code.

For the k8s controllers, `BootstrapStatus` is removed and a dependency
on `etcd` service status is added (as it was implicitly there via
`BootstrapStatus`).

Remove control plane conversion code.

In k8s upgrade code, remove self-hosted part.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-08-03 07:55:42 -07:00
Andrey Smirnov
9c73257cb1 feat: update Go to 1.16.6
See:

* https://github.com/talos-systems/tools/pull/140
* https://github.com/talos-systems/pkgs/pull/300
* https://github.com/talos-systems/extras/pull/21

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-07-14 06:44:22 -07:00
Alexey Palazhchenko
eea750de2c chore: rename "join" type to "worker"
Closes #3413.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-07-09 07:10:45 -07:00
Alexey Palazhchenko
bbf1c091d4 feat: add RBAC to talosctl version output
Refs #3852.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-28 07:10:25 -07:00
Alexey Palazhchenko
06209bba28 chore: update RBAC rules, remove old APIs
Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-18 09:54:49 -07:00
Alexey Palazhchenko
f63ab9dd9b feat: implement talosctl config new command
Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-17 09:06:43 -07:00
Artem Chernyshev
9a91142a38 feat: print complete member info in etcd members
Fixes: https://github.com/talos-systems/talos/issues/3487

Example output:

```
NODE       ID                 HOSTNAME                 PEERS                   CLIENTS
10.5.0.2   c3d3020cf75b8728   talos-default-master-1   https://10.5.0.2:2380   https://10.5.0.2:2379
```

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-04-17 11:07:59 -07:00
Andrey Smirnov
0bd8b0e800 feat: provide an option to recover etcd from data directory copy
Sometimes `talosctl etcd snapshot` might not be available, for example
when etcd is not healthy. In that case it's possible to copy raw etcd
data directory with `talosctl cp /var/lib/etcd .` and use
`member/snap/db` to recover the cluster. But such copy won't pass
integrity checks, so they should be disabled explicitly.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-14 08:25:32 -07:00
Alexey Palazhchenko
29da22d063 feat: add config validation warnings
Closes #3412.
Refs #3413.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-04-08 13:49:58 -07:00
Andrey Smirnov
e0650218a6 feat: support etcd recovery from snapshot on bootstrap
When Talos `controlplane` node is waiting for a bootstrap, `etcd`
contents can be recovered from a snapshot created with
`talosctl etcd snapshot` on a healthy cluster.

Bootstrap process goes same way as before, but the etcd data directory
is recovered from the snapshot.

This flow enables disaster recovery for the control plane: given that
periodic backups are available, destroy control plane nodes, re-create
them with the same config, and bootstrap one node with the saved
snapshot to recover etcd state at the time of the snapshot.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-08 10:15:37 -07:00
Andrey Smirnov
e664362cec feat: add API and command to save etcd snapshot (backup)
This adds a simple API and `talosctl etcd snapshot` command to stream
snapshot of etcd from one of the control plane nodes to the local file.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-02 09:20:16 -07:00