Fixes: https://github.com/siderolabs/talos/issues/6045
`talosctl apply-config` now supports a `--config-patch` flag that takes
machine config patches as input.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
This commit adds support for encoding/decoding Go structs with `protobuf:<n>` tags.
Closes #5940
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This PR removes obsolete references
Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
We add a new CRD, `serviceaccounts.talos.dev` (with `tsa` as short name), and its controller which allows users to get a `Secret` containing a short-lived Talosconfig in their namespaces with the roles they need. Additionally, we introduce the `talosctl inject serviceaccount` command to accept a YAML file with Kubernetes manifests and inject them with Talos service accounts so that they can be directly applied to Kubernetes afterwards. If Talos API access feature is enabled on Talos side, the injected workloads will be able to talk to Talos API.
Closes siderolabs/talos#4422.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This replaces the old resource API filter with a new one based on the new
COSI feature for filtering access to resources.
There should be no functional changes.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This adds a new metadata field `node` which always proxies to a single
node without touching any protobuf structs on the way.
With `node`, we can call APIs which do not conform to the Talos API
proxying standards. From the UX point of view things work the same
way, but multiplexing is handled on the client side.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This is mostly the same as the way `apid` consumes certificates generated by
`machined` via COSI API connection.
Service `trustd` consumes two resources:
* `secrets.Trustd`, which contains `trustd` server TLS certificates and
is refreshed when, for example, the node IP changes
* `secrets.OSRoot` which contains Talos API CA and join token
This PR fixes an issue with `trustd` certs not always including all IPs
of the node: previously, `trustd` certs captured only the addresses the
node had at the moment of `trustd` startup.
The refactoring also makes it possible to change the API CA and join
token dynamically. This needs more work, but `trustd` should now pick up
updates without any additional changes.
Fixes #5863
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR adds support for skipping node registration with Kubernetes.
This is an advanced use case and only needs to be used in special cases.
In this mode, the kubelet only runs static pods.
Fixes: #5847
Operations that will be broken:
- `talosctl cluster create` eventually times out, since it expects
nodes to be registered.
- `talosctl health`, since it expects nodes to be registered.
- `talosctl upgrade-k8s`, since it expects nodes to be registered. Static
pods can still be updated by editing the machine config.
Signed-off-by: Noel Georgi <git@frezbo.dev>
This extracts etcd configuration and finalized run arguments as
resources managed by controllers.
The biggest change in terms of UX is that Talos now waits for the etcd
configured subnet to be actually available before starting etcd.
Previously etcd quickly failed if the requested subnet was not available
on the host.
Coupled with other fixes (#5951, #5988), this should bring etcd
join/promote sequence back into proper shape.
I also reverted all temporary measures for discovering etcd endpoints;
etcd join no longer depends on Kubernetes (once again).
Fixes #5889
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #5652
This reworks and unifies HTTP client/transport management in Talos:
* cleanhttp is used everywhere consistently
* DefaultClient uses a pooled client, other clients use a regular
transport
* like before, Proxy vars are inspected on each request (but now
consistently)
* manifest download functions now recreate the client on each run to
pick up latest changes
* the system CA list is picked up from fixed locations, and supports
reloading on changes
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This provides a "quick" shutdown when the API server is going down.
This allows a "clean" shutdown of the `talosctl events` stream
when the server is rebooting.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This is incompatible with Calico and Cilium in default configuration, as
it's not easy to figure out exact PodCIDRs of the node.
We change the default but provide the option to revert the old behavior.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
The sequencer is now smart enough to skip the `LeaveEtcd` and
`CordonAndDrain` steps if the node has not fully joined the cluster
yet.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.
Prior to this commit, a temporary directory created using `ioutil.TempDir`
had to be removed manually by calling `os.RemoveAll`, which was omitted
in some tests. The error handling boilerplate, e.g.

	defer func() {
		if err := os.RemoveAll(dir); err != nil {
			t.Fatal(err)
		}
	}()

is also tedious, but `t.TempDir` handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This uses all available (potential) etcd endpoints, which includes the
member being promoted as well. We avoid failures by iterating over the
list of endpoints on each attempt to make sure each and every endpoint
is tried.
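As a rough illustration of iterating over the endpoint list across attempts, here is a minimal sketch. It is not the code from this commit: `tryEndpoints`, `errDown`, and the sample addresses are hypothetical names chosen for the example, and the real retry logic may pick endpoints differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errDown is a hypothetical error used by the demo below.
var errDown = errors.New("endpoint down")

// tryEndpoints calls try against one endpoint per attempt, advancing through
// the list on every attempt so that retries never stick to one endpoint.
func tryEndpoints(endpoints []string, attempts int, try func(endpoint string) error) error {
	if len(endpoints) == 0 || attempts <= 0 {
		return errors.New("nothing to try")
	}

	var lastErr error

	for i := 0; i < attempts; i++ {
		// Wrap around the list when there are more attempts than endpoints.
		endpoint := endpoints[i%len(endpoints)]

		if lastErr = try(endpoint); lastErr == nil {
			return nil
		}
	}

	return fmt.Errorf("all %d attempts failed: %w", attempts, lastErr)
}

func main() {
	// Sample addresses, assumed for illustration only.
	endpoints := []string{"10.0.0.1:2379", "10.0.0.2:2379", "10.0.0.3:2379"}

	err := tryEndpoints(endpoints, 5, func(endpoint string) error {
		if endpoint != "10.0.0.3:2379" {
			return errDown
		}

		return nil
	})

	fmt.Println("result:", err)
}
```

With three endpoints and at least three attempts, every endpoint is tried at least once before the call gives up.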
Part of #5889
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This fixes the case where an IP which became the default at some point
was later removed completely from the node. In that case Talos should set
the default address to another address, as having a default IP that is
not on the node doesn't make much sense.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Similar to the way kubectl reads kubeconfig, we attempt to load talosconfig file from multiple locations. If the file exists under `/var/run/secrets/talos.dev/config`, we load with higher priority before falling back to `~/.talos/config`. This will allow talosctl to be able to access Talos API from inside a pod when talosconfig is mounted into `/var/run/secrets/talos.dev/config`, similar to the way Kubernetes service account tokens work.
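The lookup order described above can be sketched roughly as follows. This is an illustration only, not the talosctl implementation: `firstExisting` and `fileExists` are hypothetical helpers, and the two paths are taken from the description.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// firstExisting returns the first path for which exists reports true.
// Paths are expected in priority order, highest first.
func firstExisting(paths []string, exists func(string) bool) (string, bool) {
	for _, p := range paths {
		if exists(p) {
			return p, true
		}
	}

	return "", false
}

// fileExists reports whether path can be stat'ed.
func fileExists(path string) bool {
	_, err := os.Stat(path)

	return err == nil
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)

		return
	}

	paths := []string{
		"/var/run/secrets/talos.dev/config",     // mounted into a pod
		filepath.Join(home, ".talos", "config"), // fallback: ~/.talos/config
	}

	if p, ok := firstExisting(paths, fileExists); ok {
		fmt.Println("using talosconfig at", p)
	} else {
		fmt.Println("no talosconfig found")
	}
}
```

Passing the existence check as a function keeps the priority logic independent of the filesystem.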
Part of siderolabs/talos#5980.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This commit adds the gotagsrewrite tool, which is used to add `protobuf:"<n>"` tags to structs with a `//gotagsrewrite:gen` comment. This will be used in conjunction with github.com/siderolabs/protoenc.
Closes #5941
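To make the tag format concrete, here is a small sketch of what an annotated struct might look like after the tool has run, with the tags read back through the standard `reflect` package. The `Endpoint` type and the `fieldNumbers` helper are hypothetical and exist only for this example; this is not how protoenc itself consumes the tags.

```go
package main

import (
	"fmt"
	"reflect"
)

// Endpoint is a hypothetical struct used only for illustration.
//
//gotagsrewrite:gen
type Endpoint struct {
	Host string `protobuf:"1"`
	Port uint16 `protobuf:"2"`
}

// fieldNumbers maps each field name of a struct value to the value of its
// `protobuf` tag. v must be a struct value, not a pointer.
func fieldNumbers(v interface{}) map[string]string {
	t := reflect.TypeOf(v)
	out := make(map[string]string, t.NumField())

	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)

		// Lookup distinguishes a missing tag from an empty one.
		if n, ok := f.Tag.Lookup("protobuf"); ok {
			out[f.Name] = n
		}
	}

	return out
}

func main() {
	for name, n := range fieldNumbers(Endpoint{}) {
		fmt.Printf("%s -> field %s\n", name, n)
	}
}
```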
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This fixes the apid and machined shutdown sequences to gracefully stop
the gRPC server with a timeout.
The sequences are also restructured to stop apid/machined as late as
possible, allowing access to the node while a long sequence is running
(e.g. upgrade or reset).
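The "graceful stop with timeout" pattern can be sketched with the standard library alone. This is a generic illustration, not the code from this commit: `stopWithTimeout` is a hypothetical helper, and in a real gRPC server the two callbacks would presumably wrap the server's graceful and immediate stop methods.

```go
package main

import (
	"fmt"
	"time"
)

// stopWithTimeout runs graceful in the background and waits up to timeout
// for it to return. If it has not returned by then, force is called.
// It reports whether the graceful path completed in time.
//
// Note: if graceful never returns even after force, its goroutine leaks.
func stopWithTimeout(graceful, force func(), timeout time.Duration) bool {
	done := make(chan struct{})

	go func() {
		defer close(done)

		graceful()
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-done:
		return true
	case <-timer.C:
		force()

		return false
	}
}

func main() {
	stuck := make(chan struct{})

	ok := stopWithTimeout(
		func() { <-stuck },     // graceful stop that waits on in-flight work
		func() { close(stuck) }, // forced stop that unblocks it
		100*time.Millisecond,
	)

	fmt.Println("graceful stop completed:", ok)
}
```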
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fix the `Talos` sequencer to run only a single sequence at a time.
Sequence priorities were updated to match the table:
| what is running (columns) what is requested (rows) | boot | reboot | reset | upgrade |
|----------------------------------------------------|------|--------|-------|---------|
| reboot | Y | Y | Y | N |
| reset | Y | N | N | N |
| upgrade | Y | N | N | N |
With a small addition: `WithTakeover` is still there.
If set, priority is ignored.
This is mainly used for invoking the `Shutdown` sequence
and for apply config with reboot enabled.
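The table can be expressed as a small lookup, shown here as a sketch. It assumes that `Y` means the requested sequence may proceed while the given sequence is running; the `allowed` map and `canStart` function are hypothetical names, not the sequencer's actual API.

```go
package main

import "fmt"

// allowed[requested][running] mirrors the table in the text.
var allowed = map[string]map[string]bool{
	"reboot":  {"boot": true, "reboot": true, "reset": true, "upgrade": false},
	"reset":   {"boot": true, "reboot": false, "reset": false, "upgrade": false},
	"upgrade": {"boot": true, "reboot": false, "reset": false, "upgrade": false},
}

// canStart reports whether the requested sequence may start while running
// is in progress. With takeover set, priority is ignored.
func canStart(running, requested string, takeover bool) bool {
	if takeover {
		return true
	}

	// Unknown sequences fall through to false: indexing a nil map is safe.
	return allowed[requested][running]
}

func main() {
	fmt.Println(canStart("boot", "upgrade", false))   // allowed by the table
	fmt.Println(canStart("upgrade", "reboot", false)) // rejected by the table
	fmt.Println(canStart("upgrade", "reset", true))   // takeover ignores priority
}
```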
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
When a message is sent via the proxy, `metadata.error` carries only a string
representation which can't be unmarshalled back into an `error` which we
can match against. A similar fix was already done for "unary" responses,
but we missed the streaming case.
This fixes a spurious failure in integration tests when calling
`talosctl pcap --duration 1s`.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Replace Matcher field with Matcher method and store Op and size data directly in InstallDiskSizeMatcher.
Closes #5860.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fix a small syntax issue.
Signed-off-by: Matthew Richardson <M.Richardson@ed.ac.uk>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This fixes a small bug with stable hostnames when they were only enabled
for control plane nodes.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
The `PacketSource` interface is racy, as it provides a channel to read
packets from, while packets are read in an (invisible) goroutine, so
closing the capture handle creates a data race with reading.
Unwrap that goroutine into an explicit loop to avoid the race.
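The general shape of such an explicit loop is sketched below. It uses a hypothetical `packetReader` interface and a fake `sliceReader` rather than the real capture library, so it only illustrates the idea: reads happen synchronously in the caller, which means the caller decides when reading stops and when the handle can be closed safely.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// packetReader is a hypothetical stand-in for a capture handle that returns
// one packet per call and io.EOF when the capture ends.
type packetReader interface {
	ReadPacket() ([]byte, error)
}

// readLoop reads packets in the calling goroutine until the source is
// exhausted. No hidden goroutine or channel is involved.
func readLoop(r packetReader, handle func([]byte)) error {
	for {
		data, err := r.ReadPacket()
		if errors.Is(err, io.EOF) {
			return nil
		}

		if err != nil {
			return err
		}

		handle(data)
	}
}

// sliceReader is a fake packetReader backed by a slice, for the demo only.
type sliceReader struct{ packets [][]byte }

func (s *sliceReader) ReadPacket() ([]byte, error) {
	if len(s.packets) == 0 {
		return nil, io.EOF
	}

	p := s.packets[0]
	s.packets = s.packets[1:]

	return p, nil
}

func main() {
	r := &sliceReader{packets: [][]byte{[]byte("a"), []byte("bc")}}
	total := 0

	err := readLoop(r, func(p []byte) { total += len(p) })

	fmt.Println("bytes read:", total, "error:", err)
}
```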
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>