12 Commits

Author SHA1 Message Date
Andrey Smirnov
a0188aff73
feat(init): implement service dependencies, correct start and shutdown (#680)
This PR introduces dependencies between services. Each service now
has two virtual events associated with it: 'up' (running and healthy)
and 'down' (finished or failed). These events are used to establish
the correct order via the conditions abstraction.

Service image unpacking was moved into the 'pre' stage, simplifying
`init/main.go`; service images are now closer to the code which runs the
service itself.

The 'pre' step now runs after the 'wait' step, and service dependencies
are now mixed into the other conditions of the 'wait' step on startup.
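
The 'up' event ordering described above can be sketched roughly as
follows. This is a minimal illustration, not the code from the PR: the
`service` type, the `startOrder` helper and the dependency chain are all
hypothetical, and the 'up' event is modelled as a channel that is closed
once the service has started.

```go
package main

import (
	"fmt"
	"sync"
)

// service is a hypothetical description of a service and the services
// that must be 'up' before it may start.
type service struct {
	name      string
	dependsOn []string
}

// startOrder launches every service concurrently. Each one blocks until
// the 'up' event of all its dependencies has fired, then fires its own.
// It assumes every dependency is itself listed in services and that the
// graph has no cycles; otherwise a goroutine would block forever.
func startOrder(services []service) []string {
	up := make(map[string]chan struct{}, len(services))
	for _, s := range services {
		up[s.name] = make(chan struct{})
	}

	var (
		mu    sync.Mutex
		order []string
		wg    sync.WaitGroup
	)

	for _, s := range services {
		wg.Add(1)
		go func(s service) {
			defer wg.Done()
			for _, dep := range s.dependsOn {
				<-up[dep] // wait for the dependency's 'up' event
			}
			mu.Lock()
			order = append(order, s.name)
			mu.Unlock()
			close(up[s.name]) // fire this service's 'up' event
		}(s)
	}

	wg.Wait()
	return order
}

func main() {
	// The dependency chain below is illustrative only.
	fmt.Println(startOrder([]service{
		{name: "osd", dependsOn: []string{"trustd"}},
		{name: "trustd", dependsOn: []string{"containerd"}},
		{name: "containerd"},
	}))
}
```

A 'down' event could be modelled the same way, with each service
waiting for the 'down' event of its dependents before stopping.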

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-24 19:17:52 +03:00
Andrey Smirnov
06bff97a3f
refactor: change conditions to be interface, add descriptions (#677)
Conditions are now implemented as an interface with two methods: `Wait`,
which blocks until the condition is true (cancelable via context), and
`String`, which describes what the condition is waiting for.

A generic `WaitForAll` was implemented to wait for multiple conditions
at once.
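
A rough sketch of what such an interface might look like is given
below. Only the names `Wait`, `String` and `WaitForAll` come from the
commit message; the exact signatures, the `chanCondition` type and the
`describeAndWait` helper are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"
)

// Condition follows the two-method shape described in the commit
// message; the signatures are assumed.
type Condition interface {
	Wait(ctx context.Context) error
	String() string
}

// chanCondition is a hypothetical condition that becomes true when its
// channel is closed.
type chanCondition struct {
	name string
	done <-chan struct{}
}

func (c chanCondition) Wait(ctx context.Context) error {
	select {
	case <-c.done:
		return nil
	case <-ctx.Done():
		return ctx.Err() // canceled or timed out before the condition held
	}
}

func (c chanCondition) String() string { return c.name }

// all is true once every wrapped condition is true.
type all []Condition

func (a all) Wait(ctx context.Context) error {
	errCh := make(chan error, len(a)) // buffered so no goroutine blocks
	for _, c := range a {
		go func(c Condition) { errCh <- c.Wait(ctx) }(c)
	}
	for range a {
		if err := <-errCh; err != nil {
			return err
		}
	}
	return nil
}

func (a all) String() string {
	names := make([]string, len(a))
	for i, c := range a {
		names[i] = c.String()
	}
	return strings.Join(names, ", ")
}

// WaitForAll combines several conditions into one.
func WaitForAll(conds ...Condition) Condition { return all(conds) }

// describeAndWait builds two already-satisfied conditions and waits on both.
func describeAndWait() (string, error) {
	a, b := make(chan struct{}), make(chan struct{})
	close(a)
	close(b)
	cond := WaitForAll(chanCondition{"containerd", a}, chanCondition{"network", b})

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return cond.String(), cond.Wait(ctx)
}

func main() {
	desc, err := describeAndWait()
	fmt.Printf("waited for %q: err=%v\n", desc, err)
}
```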

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-21 21:25:08 +03:00
Andrey Smirnov
505b5022c4
feat(init): implement graceful shutdown of 'init' (#562)
The most crucial changes are in `init/main.go`: on shutdown, Talos now
tries to stop all the services gracefully. All the shutdown paths are
unified, including poweroff, reboot and panic handling on startup.

While I was at it, I also fixed a bug with containers failing to start
when an old snapshot is still around.

The service lifecycle is now wrapped with a `ServiceRunner` object which
handles state transitions and captures events related to state changes.
Every change goes to the log as well.
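
The idea of a wrapper that tracks state and records events could look
something like the sketch below. Only the `ServiceRunner` name comes
from the commit message; the state names, the `Event` type and the
`transitions` helper are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sync"
)

// State is a hypothetical set of lifecycle states.
type State int

const (
	StateInitialized State = iota
	StateRunning
	StateFinished
	StateFailed
)

func (s State) String() string {
	return [...]string{"Initialized", "Running", "Finished", "Failed"}[s]
}

// Event records a single state change.
type Event struct {
	From, To State
	Message  string
}

// ServiceRunner wraps a task, tracking its state and the events so far.
type ServiceRunner struct {
	mu     sync.Mutex
	id     string
	state  State
	events []Event
}

// transition records the change as an event and writes it to the log.
func (r *ServiceRunner) transition(to State, msg string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.events = append(r.events, Event{From: r.state, To: to, Message: msg})
	r.state = to
	log.Printf("service[%s]: %s: %s", r.id, to, msg)
}

// Run executes the task once and moves through the matching states.
func (r *ServiceRunner) Run(task func() error) {
	r.transition(StateRunning, "started")
	if err := task(); err != nil {
		r.transition(StateFailed, err.Error())
		return
	}
	r.transition(StateFinished, "exited cleanly")
}

// transitions runs a task that optionally fails and returns the
// recorded state changes as "From->To" strings.
func transitions(fail bool) []string {
	r := &ServiceRunner{id: "demo"}
	r.Run(func() error {
		if fail {
			return errors.New("task failed")
		}
		return nil
	})
	out := make([]string, 0, len(r.events))
	for _, ev := range r.events {
		out = append(out, fmt.Sprintf("%s->%s", ev.From, ev.To))
	}
	return out
}

func main() {
	fmt.Println(transitions(false))
	fmt.Println(transitions(true))
}
```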

There's no way to capture service state yet, but that is planned to be
implemented as RPC API for `init` which is exposed via `osd` to `osctl`.

Future steps:

1. Implement service dependencies for correct startup order and
shutdown order.

2. Implement service health, so that we can say "start trustd when
containerd is up and healthy".

3. Implement gRPC API for init, expose via osd (service status, restart,
poweroff, ...)

4. Implement `String()` for conditions, so that we can see what a
service is waiting on right now.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-04-26 16:53:19 +03:00
Andrey Smirnov
a858cb4986
refactor: extract 'restart' piece of the runners into wrapper runner (#559)
This changes the `runner.Runner` API to support more methods, allowing
the containerd runner to create the container object only once and
start/stop tasks to implement restarts.

New API: `Open()` (initialize), `Run()` (run once until exits), `Stop()`
(stop running instance), `Close()` (free resource, no longer available
for new `Run()`).

So the sequence might be: `Open`, `Run`, `Stop`, `Run`, `Stop`, `Close`.
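
The call sequence above can be sketched as follows. The four method
names come from the commit message, but the argument-free signatures
are an assumption, and `recordingRunner` is a hypothetical stand-in
whose `Run` returns immediately instead of blocking until the task
exits.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Runner uses the method names from the commit message; the signatures
// here are assumed for illustration.
type Runner interface {
	Open() error  // initialize
	Run() error   // run once until the task exits
	Stop() error  // stop the running instance
	Close() error // free resources; no further Run() allowed
}

// recordingRunner is a hypothetical runner that only records calls.
type recordingRunner struct {
	opened bool
	calls  []string
}

func (r *recordingRunner) Open() error {
	r.opened = true
	r.calls = append(r.calls, "Open")
	return nil
}

func (r *recordingRunner) Run() error {
	if !r.opened {
		return errors.New("runner is not open")
	}
	r.calls = append(r.calls, "Run")
	return nil
}

func (r *recordingRunner) Stop() error {
	r.calls = append(r.calls, "Stop")
	return nil
}

func (r *recordingRunner) Close() error {
	r.opened = false
	r.calls = append(r.calls, "Close")
	return nil
}

// lifecycle walks through Open, Run, Stop, Run, Stop, Close and then
// tries one more Run, which is expected to fail after Close.
func lifecycle() (string, error) {
	rec := &recordingRunner{}
	var r Runner = rec

	steps := []func() error{r.Open, r.Run, r.Stop, r.Run, r.Stop, r.Close}
	for _, step := range steps {
		if err := step(); err != nil {
			return "", err
		}
	}

	return strings.Join(rec.calls, " "), r.Run()
}

func main() {
	seq, err := lifecycle()
	fmt.Println(seq)
	fmt.Println("Run after Close:", err)
}
```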

Process and containerd runners were updated for the new API, and the
'restart' part was removed; both runners now only run the task once.

The restart piece was implemented in an abstract way for any wrapped
`runner.Runner` in the `runner/restart` package. It supports three
restart policies: `Once`, `UntilSuccess` and `Forever`.
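
A minimal sketch of how such a wrapper might apply the three policies
is shown below. The policy names come from the commit message;
everything else (the reduced one-method `Runner`, the `restarter` and
`flaky` types, the delay between attempts and the `countRuns` helper)
is an assumption for illustration and not the `runner/restart` code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Runner is reduced to the single method this sketch needs.
type Runner interface {
	Run() error
}

// Policy names follow the commit message.
type Policy int

const (
	Once Policy = iota
	UntilSuccess
	Forever
)

// restarter is a hypothetical wrapper that reruns the wrapped runner
// according to the policy.
type restarter struct {
	wrapped Runner
	policy  Policy
	delay   time.Duration // pause between attempts (assumed)
	stop    chan struct{} // closed to end the restart loop
}

func (r *restarter) Run() error {
	for {
		err := r.wrapped.Run()

		switch r.policy {
		case Once:
			return err
		case UntilSuccess:
			if err == nil {
				return nil
			}
		case Forever:
			// restart regardless of the result
		}

		select {
		case <-r.stop:
			return err
		case <-time.After(r.delay):
		}
	}
}

// flaky fails a fixed number of times and then succeeds.
type flaky struct{ failures, calls int }

func (f *flaky) Run() error {
	f.calls++
	if f.calls <= f.failures {
		return errors.New("simulated failure")
	}
	return nil
}

// countRuns reports how many times the task ran under the policy.
// Do not call it with Forever: nothing closes stop, so it never returns.
func countRuns(policy Policy, failures int) (int, error) {
	task := &flaky{failures: failures}
	r := &restarter{
		wrapped: task,
		policy:  policy,
		delay:   time.Millisecond,
		stop:    make(chan struct{}),
	}
	err := r.Run()
	return task.calls, err
}

func main() {
	n, err := countRuns(Once, 2)
	fmt.Println("Once:", n, err)
	n, err = countRuns(UntilSuccess, 2)
	fmt.Println("UntilSuccess:", n, err)
}
```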

Service API was changed slightly to return the `runner.Runner`
interface, and `system.Services` now handles running the service.

For all the services, the code was adjusted to either return the runner
directly (run once) or wrap it with the `restart` runner to provide a
restart policy.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-04-23 01:25:26 +03:00
Brad Beam
271d28244b
fix(osd): Fix k8s.io namespace logs (#557)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-04-18 08:49:33 -07:00
Andrey Smirnov
d29e27ee33
refactor: containerd runner refactoring and unit-tests (#551)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-04-16 13:56:52 -07:00
Andrew Rynhard
e18b5086a9
chore: update org to new name (#480)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-04-03 18:29:21 -07:00
Andrew Rynhard
455aeb742c
chore: expose userdata and osctl client packages (#471)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-04-02 17:11:17 -07:00
Andrew Rynhard
94b011c724
refactor: use containerd exported defaults (#310)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-01-15 20:29:13 -08:00
Andrew Rynhard
25fca3d68d
feat: import core service containers from local store (#309)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-01-15 18:46:41 -08:00
Andrew Rynhard
ee226dddac
chore: enforce commit and license policies (#304)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-01-13 16:10:49 -08:00
Andrew Rynhard
72eb1b34f5
chore: use buildkit for builds (#295)
2018-12-19 22:22:05 -08:00