127 Commits

Author SHA1 Message Date
Noel Georgi
48dee48057
feat: support mtu for routes
Support setting MTU for routes.

Fixes: #6324

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-30 16:38:22 +05:30
Serge Logvinov
18c377a4d1
feat: customize audit policy
Add resource `AuditPolicyConfigs.kubernetes.talos.dev`.
It can be changed through machine config `cluster.apiServer.auditPolicy`

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-28 13:46:44 +04:00
Andrey Smirnov
0b2767c164
feat: implement 'permanent addr' in link statuses
Permanent address is only available for physical links, and it might be
different from the 'hardware address': when bonding, 'hardware address'
gets overridden from the bond master, while 'permanent address' still
shows MAC of the interface.

This part of the fix for incorrect bonding issue on Equinix Metal.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-26 14:45:46 +04:00
Andrey Smirnov
ce12c7b380
chore: update COSI runtime to v0.2.0-alpha.1
This adds metadata annotations and fixes some hanging watch loops.

There should be no functional changes for Talos.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-20 22:02:57 +04:00
Noel Georgi
5e21cca52d
feat: support setting kernel parameters
Support setting kernel parameters via machine config.

Fixes: #6206

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-05 23:45:51 +05:30
Dmitriy Matrenichev
bd56621cdf
feat: add structprotogen tool
This commit adds structprotogen tool which is used to generate proto file from Go structs.

Closes #6078.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-05 16:54:00 +03:00
Andrey Smirnov
7e527777e8
chore: update API descriptors
Re-generate protobuf API descriptors in preparation for 1.2.0-beta.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-15 17:35:09 +04:00
Andrey Smirnov
9baca49662
refactor: implement COSI resource API for Talos
Overview: deprecate existing Talos resource API, and introduce new COSI
API.

Consequences:

* COSI API can only go via one-2-one proxy (`client.WithNode`)
* client-side API access is way easier with `state.State` wrappers
* lots of small changes on the client side to use new APIs

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-12 22:31:54 +04:00
Dmitriy Matrenichev
e422ea63d0
chore: add proto definitions for common types
This commit adds proto definitions for this types;
- *url.URL
- netaddr.IP
- netaddr.IPPort
- netaddr.IPPrefix
- *x509.PEMEncodedKey
- *x509.PEMEncodedCertificateAndKey

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-08-12 15:38:31 +03:00
Utku Ozdemir
b5da686a7b
feat: add actor ID to events & emit an initial empty event
Add a new field `actorID` to the events and populate it with a UUID for the lifecycle actions `reboot`, `reset`, `upgrade` and `shutdown`. This actor ID will be present on all events emitted by this triggered action. We can use this ID later on the client side to be able to track triggered actions.

We also emit an event with an empty payload on the events streaming GRPC endpoint when a client connects. The purpose of this event is to signal to the client that the event streaming has actually started.

Server-side part of siderolabs/talos#5499.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-08-11 15:14:11 +02:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Dmitriy Matrenichev
7b80a747bc
feat: add protobuf encoding/decoding for Go structs
This commit adds the support for encoding/decoding Go structs with `protobuf:<n>` tags.

Closes #5940

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-08-10 16:04:08 +03:00
Andrey Smirnov
92314e47bf
refactor: use controllers/resources to feed trustd with data
This is mostly same as the way `apid` consumes certificates generated by
`machined` via COSI API connection.

Service `trustd` consumes two resources:

* `secrets.Trustd` which contains `trustd` server TLS certificates and
  it gets refreshed as e.g. node IP changes
* `secrets.OSRoot` which contains Talos API CA and join token

This PR fixes an issue with `trustd` certs not always including all IPs
of the node, as previously `trustd` certs will only capture addresses of
the node at the moment of `trustd` startup.

Another thing is that refactoring allows to dynamically change API CA
and join token. This needs more work, but `trustd` should now pick up
changes without any additional changes.

Fixes #5863

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-04 23:45:34 +04:00
Andrey Smirnov
fe2ee3b100
feat: implement MachineStatus resource
Fixes #5789

Example:

```yaml
spec:
    stage: running
    status:
        ready: false
        unmetConditions:
            - name: staticPods
              reason: kube-system/kube-controller-manager-talos-default-master-1 not ready, kube-system/kube-scheduler-talos-default-master-1 not ready
```

As events (CLI doesn't show full contents):

```
172.20.0.2   cbhf2l6f9lrs738hehfg   talos/runtime/machine.MachineStatusEvent   BOOTING   ready: false, unmet conditions: [time network services]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-01 18:36:10 +04:00
Artem Chernyshev
ae1bec59e9
feat: allow running only one sequence at a time
Fix `Talos` sequencer to run only a single sequence at the same time.
Sequences priority was updated. To match the table:

| what is running (columns) what is requested (rows) | boot | reboot | reset | upgrade |
|----------------------------------------------------|------|--------|-------|---------|
| reboot                                             | Y    | Y      | Y     | N       |
| reset                                              | Y    | N      | N     | N       |
| upgrade                                            | Y    | N      | N     | N       |

With a small addition that `WithTakeover` is still there.
If set, priority is ignored.

This is mainly used for `Shutdown` sequence invokation.
And if doing apply config with reboot enabled.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-07-27 17:21:36 +03:00
Andrey Smirnov
065b59276c
feat: implement packet capture API
This uses the `go-packet` library with native bindings for the packet
capture (without `libpcap`). This is not the most performant way, but it
allows us to avoid CGo.

There is a problem with converting network filter expressions (like
`tcp port 3222`) into BPF instructions, it's only available in C
libraries, but there's a workaround with `tcpdump`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-19 01:23:09 +04:00
Andrey Smirnov
022581d809
release(v1.2.0-alpha.0): prepare release
This is the official v1.2.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-30 19:01:07 +04:00
Andrey Smirnov
f2997c0f22
chore: bump dependencies
dependabot + go-mod-outdated

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-06 23:27:17 +04:00
Artem Chernyshev
2b03057b91
feat: implement a new mode try in the config manipulation commands
The new mode allows changing the config for a period of time, which
allows trying the configuration and automatically rolling it back in case
if it doesn't work for example.

The mode can only be used with changes that can be applied without a
reboot.

When changed it doesn't write the configuration to disk, only changes it
in memory.
`--timeout` parameter can be used to customize the rollback delay.
The default timeout is 1 minute.

Any consequent configuration change will abort try mode and the last
applied configuration will be used.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-21 20:31:45 +03:00
Artem Chernyshev
2b9722d1f5
feat: add dry-run flag in apply-config and edit commands
Dry run prints out config diff, selected application mode without
changing the configuration.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-14 19:12:57 +03:00
Andrey Smirnov
25d19131d3
release(v1.1.0-alpha.0): prepare release
This is the official v1.1.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-01 18:23:19 +03:00
Tomasz Zurkowski
cc7719c9d0
docs: improve comments in security proto
The existing comments did not match the service definition (they look
like copy paste from another service). I also added a little bit more
comments for the fields in the request and response.

Signed-off-by: Tomasz Zurkowski <zurkowski@google.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-16 14:18:48 +03:00
Caleb Woodbine
d256b5c5e4
docs: fix spelling mistakes
Resolve spelling with `misspell -w .`

Signed-off-by: Caleb Woodbine <calebwoodbine.public@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-15 15:38:25 +03:00
Andrey Smirnov
59681b8c9a
fix: backport fixes from release-1.0 branch
They were discovered as we tagged 1.0.0 version:

* wrong deprecated version
* incompatibility in extension compatibility checks

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-04 23:28:06 +03:00
Tim Jones
fe40e7b1b3
feat: drain node on shutdown
Cordon & drain a node when the Shutdown message is received.
Also adds a '--force' option to the shutdown command in case the control
plane is unresponsive.

Signed-off-by: Tim Jones <timniverse@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-02-01 00:06:32 +03:00
Artem Chernyshev
ebec5d4a0c
feat: support full disk path in the diskSelector
Fixes: https://github.com/talos-systems/talos/issues/4788

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-27 15:23:00 +03:00
Artem Chernyshev
2f2bdb26aa
feat: replace flags with --mode in apply, edit and patch commands
Fixes: https://github.com/talos-systems/talos/issues/4588

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-13 16:09:53 +03:00
Andrey Smirnov
cb548a368a
release(v0.15.0-alpha.0): prepare release
This is the official v0.15.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-30 16:27:19 +03:00
Rohit Dandamudi
7f9922296a
feat: add powercycle mode in reboot
- Fixes #4569
- Updated reboot process sequence
- Updted api.descriptors to avoid proto type change linting error https://github.com/talos-systems/talos/pull/4612#discussion_r758599242
Signed-off-by: Rohit Dandamudi <rohit.dandamudi@siderolabs.com>

Signed-off-by: Rohit Dandamudi <rohit.dandamudi@siderolabs.com>
2021-12-02 22:40:04 +05:30
Alexey Palazhchenko
8d1cbeef9f
chore: add API breaking changes detector
Closes #4576.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-30 15:06:05 +00:00
Alexey Palazhchenko
0f169bf9b1
chore: add API deprecations mechanism
Refs #4576.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-30 06:31:55 +00:00
Alexey Palazhchenko
20d39c0b48
chore: format .proto files
Refs #2722.

Co-authored-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-23 15:05:25 +00:00
Artem Chernyshev
f730252579
feat: add new event types
Add config load + validation errors and address + hostnames events.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-11-18 18:48:35 +03:00
Andrey Smirnov
c97becdd95
chore: remove interfaces and routes APIs
Fixes #4279

These APIs were deprecated in 0.13, now it's time to drop them for 0.14.

They were not used anywhere in Talos, so no changes on Talos side.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-27 15:34:17 +03:00
Andrey Smirnov
b450b7cef0
chore: deprecate Interfaces and Routes APIs
Fixes #4094

Deprecate old networkd APIs, `talosctl interfaces` and `talosctl routes`
now suggest different commands to be used to achieve same task.

TUI installer was updated to stop using Interfaces API.

Those APIs will be completely removed in 0.14.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-09-27 15:21:02 +03:00
Andrey Smirnov
dadaa65d54
feat: print uid/gid for the files in ls -l
This adds information about file ownership in the long listing which is
crucial sometimes.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-13 00:10:49 +03:00
Andrey Smirnov
eefe1c21c3
feat: add new etcd members in learner mode
Fixes #3714

This provides more safe way to join new members to the etcd cluster.

See https://etcd.io/docs/v3.4/learning/design-learner/

With learner mode join there are few differences:

* new nodes are joined one by one, because etcd enforces a single
learner member in the cluster
* learner members are not counted in quorum calculations, so while
learner catches up with the master node, quorum is not affected and
cluster is still operational

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-12 17:56:57 +03:00
Andrey Smirnov
0c7ce1cd81 feat: remove remnants of bootkube support
Fixes #3951

Bootkube support was removed in Talos 0.9. Talos versions 0.9-0.11
support conversion of self-hosted bootkube-based control plane to the
new style control plane running as static pods managed by Talos.

This commit removes all backwards compatibility and removes conversion
code.

For the k8s controllers, `BootstrapStatus` is removed and a dependency
on `etcd` service status is added (as it was implicitly there via
`BootstrapStatus`).

Remove control plane conversion code.

In k8s upgrade code, remove self-hosted part.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-08-03 07:55:42 -07:00
Alexey Palazhchenko
eea750de2c chore: rename "join" type to "worker"
Closes #3413.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-07-09 07:10:45 -07:00
Alexey Palazhchenko
bbf1c091d4 feat: add RBAC to talosctl version output
Refs #3852.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-28 07:10:25 -07:00
Artem Chernyshev
71fff02ff0 fix: revert back resource.proto order
Otherwise it breaks older `talosctl` compatibility.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-06-23 16:37:36 -07:00
Artem Chernyshev
1990ad2525 feat: add created and updated timestamps to the resource metadata
This will allow to keep track of when the resource was created and
updated.
Update is tied to the version bump.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-06-23 13:56:49 -07:00
Andrey Smirnov
d8c2bca1b5 feat: reimplement apid certificate generation on top of COSI
This PR can be split into two parts:

* controllers
* apid binding into COSI world

Controllers
-----------

* `k8s.EndpointController` provides control plane endpoints on worker
nodes (it isn't required for now on control plane nodes)
* `secrets.RootController` now provides OS top-level secrets (CA cert)
and secret configuration
* `secrets.APIController` generates API secrets (certificates) in a bit
different way for workers and control plane nodes: controlplane nodes
generate directly, while workers reach out to `trustd` on control plane
nodes via `k8s.Endpoint` resource

apid Binding
------------

Resource `secrets.API` provides binding to protobuf by converting
itself back and forth to protobuf spec.

apid no longer receives machine configuration, instead it receives
gRPC-backed socket to access Resource API. apid watches `secrets.API`
resource, fetches certs and CA from it and uses that in its TLS
configuration.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-23 13:07:00 -07:00
Alexey Palazhchenko
06209bba28 chore: update RBAC rules, remove old APIs
Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-18 09:54:49 -07:00
Alexey Palazhchenko
f63ab9dd9b feat: implement talosctl config new command
Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-17 09:06:43 -07:00
Artem Chernyshev
14e696d068 feat: update COSI runtime and add support for tail in the Talos gRPC
Updated protobufs to expose tail length option.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-06-03 11:46:39 -07:00
Andrey Smirnov
33db8857aa fix: use COSI runtime DestroyReady input type
See https://github.com/cosi-project/runtime/pull/35

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-01 12:30:52 -07:00
Andrey Smirnov
c3a4173e11 chore: remove security API ReadFile/WriteFile
This seems to be unused completely, and they look scary enough at the
same time.

For better readability and to avoid any pitfalls, better to remove them.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-27 03:48:20 -07:00
Artem Chernyshev
9a91142a38 feat: print complete member info in etcd members
Fixes: https://github.com/talos-systems/talos/issues/3487

Example output:

```
NODE       ID                 HOSTNAME                 PEERS                   CLIENTS
10.5.0.2   c3d3020cf75b8728   talos-default-master-1   https://10.5.0.2:2380   https://10.5.0.2:2379
```

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-04-17 11:07:59 -07:00
Andrey Smirnov
0bd8b0e800 feat: provide an option to recover etcd from data directory copy
Sometimes `talosctl etcd snapshot` might not be available, for example
when etcd is not healthy. In that case it's possible to copy raw etcd
data directory with `talosctl cp /var/lib/etcd .` and use
`member/snap/db` to recover the cluster. But such copy won't pass
integrity checks, so they should be disabled explicitly.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-14 08:25:32 -07:00