50 Commits

Author SHA1 Message Date
Andrey Smirnov
10a40a15d9 fix: extract errors from API response
This PR only touches `Version` method, but I will expand it to other
methods in the next PR.

When proxying to many upstreams, errors are wrapped as responses as we
can't return error and response from grpc call. Reflect-based function
was introduced to filter out responses which contain errors as
multierror. Reflection was used, as each response is a different Go
type, and we can't write a generic function for it.

osctl was updated to support having both resp & err not nil. One failed
response shouldn't result in error.

Re-enabled integration test for multiple targets and version
consistency, need e2e validation.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-05 09:44:10 -08:00
Andrey Smirnov
fc52025490 fix: provide peer remote address for 'NODE': as default in osctl
This change is pretty mechanical, just wrap every API so that remote
peer address is used as default for `resp.Metadata.Hostname`.

This makes `NODE:` non-empty in all the API calls.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-05 00:11:55 +03:00
Andrey Smirnov
5b7bea2471 feat: use grpc-proxy in apid
This replaces codegen version of apid proxying with
talos-systems/grpc-proxy based version. Proxying is transparent, it
doesn't require exact information about methods and response types. It
requires some common layout response to enhance it properly with node
metadata or errors.

There should be no signifcant changes to the API with the previous
version, but it's worth mentioning a few changes:

1. grpc.ClientConn is established just once per upstream (either local
service or remote apid instance).

2. When called without `-t` (`targets`), apid proxies immediately down
to local service skipping proxying to itself (as before), which results
in empty node metadata in response (before it had local node IP). Might
revert this later to proxy to itself (?).

3. Streaming APIs are now fully supported with multiple targets, but
message definition doesn't contain `ResponseMetadata`, so streaming APIs
are broken now with targets (needs a fix).

4. Errors are now returned as responses with `Error` field set in
`ResponseMetadata`, this requires client library update and `osctl` to
handle it properly.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-29 22:57:25 +03:00
Andrew Rynhard
ac089dc330 feat: add read API
This adds an API for reading arbitrary files.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-11-25 10:46:50 -08:00
Sekerin Evgeniy
83d5f4c721 feat: Add context key to osctl
Added context key for change context on osctl

Signed-off-by: Sekerin Evgeniy <sekerin.e.a@gmail.com>
2019-11-13 11:32:15 -08:00
Brad Beam
531e7d8144 feat: Add meminfo api
Add ability to retrieve node memory stats ( /proc/meminfo ).

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-11-10 21:02:43 -06:00
Andrey Smirnov
cdda81df66 test: add k8s integration tests
Once again, mostly groundwork and one simple test for node versions.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-11-06 17:08:44 -08:00
Brad Beam
41a4741bca refactor: Move logs to machined
This moves Logs endpoint to machined to reduce the mount footprint of osd.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-11-04 15:04:13 -08:00
Brad Beam
a4e1479b07 refactor: Move kubeconfig to machined
This moves the Kubeconfig api endpoint to machined and consolidates the
"read a file" code into machined. This also changes Kubeconfig to
use the CopyOut method which changes Kubeconfig to a streaming grpc call.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-11-04 14:45:23 -08:00
Brad Beam
3fd8abf426 chore: Move data messages to common proto
This is to allows reuse across multiple apis.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-11-04 14:24:41 -06:00
Brad Beam
457c6416a6 feat: Add network api to apid
This extends apid to include the network api

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-28 04:21:48 -07:00
Brad Beam
ee24e42319 feat: Add time api to apid
This extends apid to cover the time api.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-25 14:35:14 -07:00
Andrey Smirnov
d3d011c8d2 chore: replace /* */ comments with // comments in license header
This fixes issues with `// +build` directives not being recognized in
source files.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-10-25 14:15:17 -07:00
Brad Beam
251ab16e07 feat: Add node metadata wrapper to machine api
- Added common.proto to host NodeMetadata
- go_package names were fixed up so imports are generated with the proper
  package names
- fixed up build work (dockerfile) to prevent copying the previously
  generated go proto files. This fixes a bug where we could incorrectly
  copy the previously generated protobuf instead of a new one generated
  at an incorrect location/name/etc.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-22 14:42:34 -05:00
Brad Beam
e6bf92ce31 feat(osd): Enable hitting multiple OSD endpoints
This enables the ability to specify additional <talos> endpoints to connect to
to pull back data.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-10-16 15:30:25 -05:00
Andrew Rynhard
d430a37e46 refactor: use go 1.13 error wrapping
This removes the github.com/pkg/errors package in favor of the official
error wrapping in go 1.13.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-10-15 22:20:50 -07:00
Andrey Smirnov
c2cb0f9778 chore: enable 'wsl' linter and fix all the issues
I wish there were less of them :)

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-10-10 01:16:29 +03:00
Andrey Smirnov
bb5f5cc754 chore: bump golangci-lint to 1.20
Memory usage reduced around 8-10x: now it stays stable at 1GB.

I disabled some of the new linters, and one rule which is violated a
lot.

I might make sense to go back and enable `wsl` fixing all the issues
(leaving that for another PR).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-10-09 22:21:08 +03:00
Andrew Rynhard
6ec5cb02cb refactor: decouple grpc client and userdata code
This detangles the gRPC client code from the userdata code. The
motivation behind this is to make creating clients more simple and not
dependent on our configuration format.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-26 14:18:53 -07:00
Andrew Rynhard
9ffa064a70 feat: return a struct for processes RPC
This makes working with the API much cleaner as a client. Using gob
doesn't give the client a well-known type to work with in the API
definition.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-20 16:18:05 -07:00
Andrew Rynhard
3a92537a30 refactor: rename RPCs
The following RPCs have been renamed:

- ps to containers
- top to processes
- df to mounts

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-20 14:33:51 -07:00
Andrew Rynhard
9230ff4e35 feat: return a data structure in version RPC
A byte slice is not very useful. Having a struct with fields makes for a
better experience.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-19 16:58:07 -07:00
Andrew Rynhard
6efd6fbe08 chore: move gRPC API to public
In order for other projects to make use of our APIs, they must not
reside underneath the internal directory. This moves the protobuf
definitions to a top-level "api" directory and scopes them according to
their domain. This change also removes generated code from the gitignore
file so that users don't have to generate the code themseleves.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-19 08:55:13 -07:00
Andrey Smirnov
b68e6395d8 feat(machined): filter actions stop/start/restart on per-service level
This implements 'default deny' policy for service operations via the
API: services do not allow operations.

Service whitelists itself for stop/start/restart by implementing the
interface and returning boolean flag which might depend on userdata.

Machined APIs `Stop/Start` were renamed to `ServiceStop`/`ServiceStart`
to avoid confusion with osd API `Restart` which is not related to
services. Old APIs are deprecated and compatibility code forwards old
APIs to the new code.

`ServiceRestart` API was introduced to distinguish restart action from
stop/start (previously restart was implemented as stop+start in the
CLI).

Service udevd-trigger was whitelisted for all operations (allows
stopping hanging run, restarting to trigger once again).

Services proxyd & ntpd were whitelisted for restart and start (start is
whitelisted to help with service stuck in stopped state while restarting).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-13 00:38:19 +03:00
Andrew Rynhard
d89b199825 chore: change upgrade request "url" to "image"
This aligns the nomenclature used throughout the codebase.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 21:43:20 -07:00
Brad Beam
692571bdec feat(networkd): Add grpc endpoint
Allows us to list routes and interface details

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-25 19:48:08 -07:00
Brad Beam
d36007fb29 feat(osd): Add ntpd client
Allows us to access ntp api

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-25 13:38:34 -07:00
Seán C McCord
d0ff28a8c7 fix: enclose server address is bracks if IPv6
Fixes #980

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-08-10 17:42:17 -07:00
Andrew Rynhard
90c91807bd refactor: restructure the project layout
This change moves packages into more appropriate places.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-01 22:19:42 -07:00
Andrey Smirnov
9c63f4ed0a feat(init): implement complete API for service lifecycle (start/stop)
It is now possible to `start`/`stop`/`restart` any service via `osctl`
commands.

There are some changes in `ServiceRunner` to support re-use (re-entering
running state). `Services` singleton now tracks service running state to
avoid calling `Start()` on already running `ServiceRunner` instance.
Method `Start()` was renamed to `LoadAndStart()` to break up service
loading (adding to the list of service) and actual service start.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-01 11:16:57 -07:00
Andrew Rynhard
b4383e35db feat: move df API to init
This change allows for more accurate mount reporting as /proc/mounts is
a symlink to /proc/self/mounts and contains mounts that are relative to
the running process. In our case this was osd. This caused inaccurate
reporting of mounts since they were relative to osd when we really
wanted mounts relative to machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-24 15:28:37 -07:00
Andrew Rynhard
8e8aae98dd feat: add machined
This commit splits our current init into init and machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-16 13:12:21 -07:00
Andrew Rynhard
5d8ee0a3a5 fix: use existing logic to perform reset
This PR moves the reset API to the init API definition.
It leverages the same code we use for upgrades.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-04 18:26:14 -07:00
Andrey Smirnov
237e903f91 feat(osd): implement CRI inspector for containers (#817)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-02 15:48:00 -07:00
Andrey Smirnov
76071abbb8
feat(init): move 'ls' API to init from osd (#755)
Service `osd` doesn't have access to rootfs, as it is running in a
container, so move API to `init` which has unconstrained access to
rootfs. (This is in line with another API, `osctl cp`).

Fixes: #752

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-21 22:29:39 +03:00
Andrey Smirnov
9ed45f7090 feat(osctl): implement 'cp' to copy files out of the Talos node (#740)
Actual API is implemented in the `init`, as it has access to root
filesystem. `osd` proxies API back to `init` with some tricks to support
grpc streaming.

Given some absolute path, `init` produces and streams back .tar.gz
archive with filesystem contents.

`osctl cp` works in two modes. First mode streams data to stdout, so
that we can do e.g.: `osctl cp /etc - | tar tz`. Second mode extracts
archive to specified location, dropping ownership info and adjusting
permissions a bit. Timestamps are not preserved.

If full dump with owner/permisisons is required, it's better to stream
data to `tar xz`, for quick and dirty look into filesystem contents
under unprivileged user it's easier to use in-place extraction.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-20 17:02:58 -07:00
Andrey Smirnov
0c0a0340b2
fix(osctl): allow '-target' flag for osctl restart (#732)
I couldn't find any use for the `timeout` flag nor the value passed in
the API, but it block much more useful and present in other commands
flag 'target'.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-14 21:37:57 +03:00
Seán C. McCord
532a53bfaf feat(init): Implement 'ls' command (#721)
Fixes #719

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-06-07 10:19:20 -07:00
Andrey Smirnov
f5969d2c6c
fix(osctl): avoid panic on empty 'talosconfig' (#725)
When talosconfig doesn't exist, `osctl` creates empty one behind the
scenes, but that leads to immediate panic if the command tries to build
osd client:

```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x11d6786]

goroutine 1 [running]:
github.com/talos-systems/talos/cmd/osctl/pkg/client.NewDefaultClientCredentials(0x7ffd720f5100, 0xb, 0xc000559ce8, 0x757014, 0xc0000d5500)
	/src/cmd/osctl/pkg/client/client.go:50 +0xa6
github.com/talos-systems/talos/cmd/osctl/cmd.setupClient(0x16ca3f0)
	/src/cmd/osctl/cmd/root.go:100 +0x3d
github.com/talos-systems/talos/cmd/osctl/cmd.glob..func22(0x24ad7c0, 0xc00058c240, 0x0, 0x3)
	/src/cmd/osctl/cmd/ps.go:32 +0x37
github.com/spf13/cobra.(*Command).execute(0x24ad7c0, 0xc0005f8a00, 0x3, 0x4, 0x24ad7c0, 0xc0005f8a00)
	/toolchain/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:766 +0x2ae
github.com/spf13/cobra.(*Command).ExecuteC(0x24ae140, 0x2507030, 0x162f2d7, 0xb)
	/toolchain/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:852 +0x2ec
github.com/spf13/cobra.(*Command).Execute(...)
	/toolchain/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:800
github.com/talos-systems/talos/cmd/osctl/cmd.Execute()
	/src/cmd/osctl/cmd/root.go:93 +0x24f
main.main()
	/src/cmd/osctl/main.go:10 +0x20
```

Fix that by returning explicit error:

```
error getting client credentials: 'context' key is not set in the config
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-07 17:40:28 +03:00
Andrey Smirnov
ad410fb7f2
refactor(osctl): move cli code out of 'client' package (#692)
This moves cli code (rendering output, etc.) out of 'client' package, so
that client package is usable outside of cli.

Consistently accept context as first param to API methods, so that we
can build graceful request cancellation.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-29 01:10:25 +03:00
Andrey Smirnov
75b2ce7fd2
feat(init): implement services list API and osctl service CLI (#662)
This returns list of all the services registered, with their current
status, past events, health state, etc.

New CLI is `osctl service [<id>]`: without `<id>` it prints list of all
the services, with specific `<id>` it provides details for a service.

I decided to create "parallel" data structures in protobuf as Go
structures don't map nicely onto what protoc generates: pointers vs.
values, additional fields like mutexes, etc. Probably there's a better
approach, I'm open for it.

For CLI, I tried to keep CLI stuff in `cmd/` package, and I also created
simple wrapper to remove duplicated code which sets up client for each
command.

Examples:

```
$ osctl service
SERVICE      STATE     HEALTH   LAST CHANGE   LAST EVENT
containerd   Running   OK       21s ago       Health check successful
kubeadm      Running   ?        2s ago        Started task kubeadm (PID 280) for container kubeadm
kubelet      Running   ?        0s ago        Started task kubelet (PID 383) for container kubelet
ntpd         Running   ?        14s ago       Started task ntpd (PID 129) for container ntpd
osd          Running   ?        14s ago       Started task osd (PID 126) for container osd
proxyd       Waiting   ?        14s ago       Waiting for conditions
trustd       Running   ?        14s ago       Started task trustd (PID 125) for container trustd
udevd        Running   ?        14s ago       Started task udevd (PID 130) for container udevd
```

```
$ osctl service proxyd
ID       proxyd
STATE    Running
HEALTH   ?
EVENTS   [Preparing]: Running pre state (22s ago)
         [Waiting]: Waiting for conditions (22s ago)
         [Preparing]: Creating service runner (6s ago)
         [Running]: Started task proxyd (PID 461) for container proxyd (6s ago)
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-17 18:01:12 +03:00
Brad Beam
0b33280915
feat(init): Add upgrade endpoint (#623)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-13 15:15:25 -05:00
Brad Beam
5485b9ecb4
fix(osctl): Fix panic on osctl df if error is returned (#646)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-11 21:52:10 -05:00
Andrey Smirnov
ab2917e833
feat(init): implement init gRPC API, forward reboot to init (#579)
This implements insecure over-file-socket gRPC API for init with two
first simplest APIs: reboot and shutdown (poweroff).

File socket is mounted only to `osd` service, so it is the only service
which can access init API. Osd forwards reboot/shutdown already
implemented APIs to init which actually executes these.

This enables graceful shutdown/reboot with service shutdown, sync, etc.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-04-26 23:04:24 +03:00
Andrew Rynhard
fc05224b4f
feat: add shutdown command (#577)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-04-26 08:53:12 -07:00
Andrew Rynhard
a8fa1f5cd1
feat(osctl): add df command (#569)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-04-26 08:24:31 -07:00
Brad Beam
3f358b12ae
feat(osctl): Add osctl top (#560)
Also adds pkg/proc as the backing package for top data

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-04-23 21:25:41 -05:00
Andrew Rynhard
7688de6a3a
chore: upgrade golangci-lint to v1.16.0 (#515)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-04-09 21:53:35 -07:00
Andrew Rynhard
e18b5086a9
chore: update org to new name (#480)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-04-03 18:29:21 -07:00
Andrew Rynhard
455aeb742c
chore: expose userdata and osctl client packages (#471)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-04-02 17:11:17 -07:00