22 Commits

Author SHA1 Message Date
Artem Chernyshev
e7e99cf1b3 feat: support disk usage command in talosctl
Usage example:

```bash
talosctl du --nodes 10.5.0.2 /var -H -d 2
NODE       NAME
10.5.0.2   8.4 kB   etc
10.5.0.2   1.3 GB   lib
10.5.0.2   16 MB    log
10.5.0.2   25 kB    run
10.5.0.2   4.1 kB   tmp
10.5.0.2   1.3 GB   .
```

Supported flags:
- `-a` writes counts for all files, not just directories.
- `-d` recursion depth
- '-H' humanize size outputs.
- '-t' size threshold (skip files if < size or > size).

Fixes: https://github.com/talos-systems/talos/issues/2504

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-10-13 09:30:31 -07:00
Andrew Rynhard
d4f103ffcb fix: pass config via stdin
In order to perform upgrades the way we would like, it is important that
we avoid any bind mounts into containers. This change ensures that all
system services get their config via stdin.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-08-20 15:26:13 -07:00
Andrey Smirnov
bddd4f1bf6 refactor: move external API packages into machinery/
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.

And `pkg/machinery` is published as Go module inside Talos repository.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-17 09:56:14 -07:00
Andrey Smirnov
47608fb874 refactor: make pkg/config not rely on machined/../internal/runtime
This makes `pkg/config` directly importable from other projects.

There should be no functional changes.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-29 12:40:12 -07:00
Andrey Smirnov
c54639e541 feat: implement server-side API for cluster health checks
This implements existing server-side health checks as defined in
`internal/pkg/cluster/checks` in Talos API.

Summary of changes:

* new `cluster` API

* `apid` now listens without auth on local file socket

* `cluster` API is for now implemented in `machined`, but we can move it
to the new service if we find it more appropriate

* `talosctl health` by default now does server-side health check

UX: `talosctl health` without arguments does health check for the
cluster if it has healthy K8s to return master/worker nodes. If needed,
node list can be overridden with flags.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-15 13:52:13 -07:00
Andrey Smirnov
cbb7ca8390 refactor: merge osd into machined
This merges `osd` API into `machined`. API was copied from `osd` into
`machined`, and `osd` API was deprecated.

For backwards compatibility, `machined` still implements `osd` API, so
older Talos API clients can still talk to the node without changes.

Docs were updated. No functional changes.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-13 12:50:00 -07:00
Andrew Rynhard
7915c73a86 fix: register event service with router
This adds the events streaming RPC to routerd.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-15 07:33:32 -07:00
Andrew Rynhard
49307d554d refactor: improve machined
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.

A few highlights are:

- no more event bus
- functional approach to tasks (no more types defined for each task)
  - the task function definition now offers a lot more context, like
    access to raw API requests, the current sequence, a logger, the new
    state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
  install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
  install force option

Additionally, this pulls various packes in under machined to make the
code easier to navigate.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-28 08:20:55 -07:00
Spencer Smith
12bfd8dd94 feat: allow for persistence of config data
This PR will allow users to set the `persist: true` value in their
config data to tell talos not to re-pull the config data at each reboot.
The default will still remain as a "pull every time" methodolgy in order
to encourage immutability by default.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-06 11:42:00 -05:00
Andrey Smirnov
a068acfbe4 feat: split routerd from apid
New service `routerd` performs exactly single task: based on incoming
API call service name, it routes the requests to the appropriate Talos
service (`networkd`, `osd`, etc.) Service `routerd` listens of file
socket and routes requests to file sockets.

Service `apid` now does single task as well:

* it either fans out request to other `apid` services running on other
nodes and aggregates responses
* or it forwards requests to local `routerd` (when request destination
is local node)

Cons:

* one more proxying layer on request path

Pros:

* more clear service roles
* `routerd` is part of core Talos, services should register with it to
expose their API; no auth in the service (not exposed to the world)
* `apid` might be replaced with other implementation, it depends on TLS infra,
auth, etc.
* `apid` is better segregated from other Talos services (can only access
`routerd`, can't talk to other Talos services directly, so less exposure
in case of a bug)

This change is no-op to the end users.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-05 22:05:56 +03:00
Andrew Rynhard
ad863a7f92 refactor: rename protobuf services, RPCs, and messages
This PR brings our protobuf files into conformance with the protobuf
style guide, and community conventions. It is purely renames, along with
generated docs.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-11 11:41:40 -08:00
Andrey Smirnov
3a93e65b54 feat: make osd.Dmesg API streaming
This is to prepare for upcoming switch to reading `/dev/kmsg` which
should allow following logs, doing some kind of tail, etc.

The output is far from being perfect, as `dmesg` data is delivered as
single chunk (not as lines), but once server side updates, client side
should match it.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-09 23:52:35 +03:00
Andrey Smirnov
daef87b9c2 refactor: extract TLS bits from apid main.go
No functional changes, just moving code around.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-05 10:27:44 -08:00
Andrey Smirnov
ad2f2574d7 fix: provide a way for client TLS config to use Provider
In `tls.Config`, there are two hooks for getting certificate for client
and server config. So we need separate configuration methods to
configure them both.

Required in apid to provide refreshing TLS client cert to
grpc.ClientConn.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-29 23:35:23 +03:00
Andrey Smirnov
5b7bea2471 feat: use grpc-proxy in apid
This replaces codegen version of apid proxying with
talos-systems/grpc-proxy based version. Proxying is transparent, it
doesn't require exact information about methods and response types. It
requires some common layout response to enhance it properly with node
metadata or errors.

There should be no signifcant changes to the API with the previous
version, but it's worth mentioning a few changes:

1. grpc.ClientConn is established just once per upstream (either local
service or remote apid instance).

2. When called without `-t` (`targets`), apid proxies immediately down
to local service skipping proxying to itself (as before), which results
in empty node metadata in response (before it had local node IP). Might
revert this later to proxy to itself (?).

3. Streaming APIs are now fully supported with multiple targets, but
message definition doesn't contain `ResponseMetadata`, so streaming APIs
are broken now with targets (needs a fix).

4. Errors are now returned as responses with `Error` field set in
`ResponseMetadata`, this requires client library update and `osctl` to
handle it properly.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-29 22:57:25 +03:00
Andrey Smirnov
e658c442a6 feat: implement grpc request loggging
Logging is pretty simple and bare minimum is being logged. I believe
better logging can be provided for apid when it does fan-out, but that
is beyond the scope for the first PR.

Sample logs:

```
$ osctl-linux-amd64 logs machined-api
machined 2019/11/11 21:16:43 OK [/machine.Machine/ServiceList] 0.000ms unary Success (:authority=unix:/run/system/machined/machine.sock;content-type=application/grpc;user-agent=grpc-go/1.23.0)
machined 2019/11/11 21:17:09 Unknown [/machine.Machine/Logs] 0.000ms stream open /run/system/log/machined.log: no such file or directory (:authority=unix:/run/system/machined/machine.sock;content-type=application/grpc;user-agent=grpc-go/1.23.0)
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-11 13:42:08 -08:00
Andrey Smirnov
add4a8d5ab fix: recover from panics in grpc servers
This installs default middleware to recover from panics (convert them to
errors) in all the grpc servers by default.

Slight refactoring to allow that as grpc can only accept Unary/Stream
interceptors only once.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-08 15:28:18 -08:00
Brad Beam
7897374ff1 feat: Add support for streaming apis in apid
This brings in the recent updates to protoc-gen-proxy to allow support
for proxying streaming api requests. We artificially limit it to only the first
target specified in the list while we work through what multi target stream
support looks like.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-11-08 14:22:30 -06:00
Brad Beam
457c6416a6 feat: Add network api to apid
This extends apid to include the network api

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-28 04:21:48 -07:00
Brad Beam
ee24e42319 feat: Add time api to apid
This extends apid to cover the time api.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-25 14:35:14 -07:00
Andrey Smirnov
d3d011c8d2 chore: replace /* */ comments with // comments in license header
This fixes issues with `// +build` directives not being recognized in
source files.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-10-25 14:15:17 -07:00
Brad Beam
573cce8d18 feat: Add APId
This PR introduces APId. This service replaces the frontend functionality
previously provided by OSD. The main driver for this is two fold:

1. Create a single purpose application to expose the talos api

2. Make use of code generation to DRY api changes

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-25 13:02:33 -05:00