Fixes were applied automatically.
Import ordering might be questionable, but it's strict:
* stdlib
* other packages
* same package imports
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under the `pkg/machinery` umbrella.
`pkg/machinery` is published as a Go module inside the Talos repository.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This replaces logging to files (followed via inotify) with a pure in-memory
circular buffer which grows on demand and is capped at a specified maximum
capacity.
The concern with the previous approach was that logs on tmpfs grew without
any bound, potentially consuming all of the node's memory.
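As a rough illustration of the idea (not the actual Talos implementation), a
grow-on-demand circular buffer could look like the sketch below. The type and
function names, the doubling growth policy, and the tiny capacities are all
assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// CircularBuffer is a hypothetical in-memory log buffer. It starts small,
// doubles its storage on demand, and once it reaches maxCapacity it
// overwrites the oldest bytes instead of growing further.
type CircularBuffer struct {
	data        []byte
	maxCapacity int
	off         int64 // total number of bytes ever written
}

// NewCircularBuffer allocates a buffer with the given initial and maximum sizes.
func NewCircularBuffer(initial, max int) (*CircularBuffer, error) {
	if initial <= 0 || initial > max {
		return nil, errors.New("initial capacity must be in (0, max]")
	}
	return &CircularBuffer{data: make([]byte, initial), maxCapacity: max}, nil
}

// Write implements io.Writer and never fails.
func (b *CircularBuffer) Write(p []byte) (int, error) {
	n := len(p)
	for len(p) > 0 {
		// The buffer is exactly full and still below the cap: grow it.
		// Growth only happens before the first wrap, so a plain copy is enough.
		if b.off == int64(len(b.data)) && len(b.data) < b.maxCapacity {
			size := len(b.data) * 2
			if size > b.maxCapacity {
				size = b.maxCapacity
			}
			grown := make([]byte, size)
			copy(grown, b.data)
			b.data = grown
		}
		i := int(b.off % int64(len(b.data)))
		c := copy(b.data[i:], p)
		p = p[c:]
		b.off += int64(c)
	}
	return n, nil
}

// Bytes returns a copy of the retained contents, oldest byte first.
func (b *CircularBuffer) Bytes() []byte {
	size := int64(len(b.data))
	if b.off <= size {
		return append([]byte(nil), b.data[:b.off]...)
	}
	i := int(b.off % size)
	out := make([]byte, 0, size)
	out = append(out, b.data[i:]...)
	return append(out, b.data[:i]...)
}

func main() {
	buf, err := NewCircularBuffer(4, 8)
	if err != nil {
		panic(err)
	}
	buf.Write([]byte("abcdef")) // grows from 4 to 8 bytes
	buf.Write([]byte("ghij"))   // cap reached: the oldest bytes are dropped
	fmt.Printf("%s\n", buf.Bytes()) // prints "cdefghij"
}
```

Memory use is bounded by the maximum capacity regardless of how much a
service logs, which is the property the tmpfs files lacked.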
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.
A few highlights are:
- no more event bus
- functional approach to tasks (no more types defined for each task)
- the task function definition now offers a lot more context, like
access to raw API requests, the current sequence, a logger, the new
state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
install force option
Additionally, this pulls various packages in under machined to make the
code easier to navigate.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This updates upgrade tests to run two flows with 3+1 clusters:
1. 0.3 -> current (testing upgrade with partition wiping)
2. 0.4-alpha.7 -> current (testing upgrade without partition wiping,
boot-a/boot-b)
There is also a small upgrade test with preserve enabled for a single-node cluster.
Provision tests are now split into two parallel tracks in Drone.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR reworks the ordering of our recovery function. It will make sure
we actually show the user the recovery message prior to looking into
whether to auto-reboot.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
This PR allows Talos to respect the panic=0 flag if users pass that in
their kernel args. Doing this makes it easier to catch kernel panics in
debug scenarios and allows the user to manually trigger a restart with
ctrl+alt+del when they're ready.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
This aligns the nomenclature for filesystems like /dev and /proc with
what is used in the kernel code.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Since dmesg is not streamed, it becomes difficult to debug issues with
machined. This fixes that by setting up machined logging to go to
/dev/kmsg and to a log file.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This removes the github.com/pkg/errors package in favor of the official
error wrapping in Go 1.13.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
The gofumports tool does everything that gofumpt does with the addition of
formatting imports. This change proposes the use of the `-local` flag so
that we can have imports separated in the following order:
- standard library
- third party
- Talos specific
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This change aims to make installations more unified and reliable. It
introduces the concept of a mountpoint manager that is capable of
mounting, unmounting, and moving a set of mountpoints in the correct
order.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This change aims to standardize the boot process. It introduces the
concept of a phase, which is composed of tasks. Phases are run serially and
the tasks that make up a phase are run concurrently.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
The responsibility of init should only be to mount the rootfs. This
change moves Talos specific logic into machined. This will allow us to
define a version of Talos in a single binary instead of split across
two. This will enable cleaner upgrades and help make the codebase
easier to reason about.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This PR is needed so that the eth0 device will have the proper MTU when
coming online in Google Cloud.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
Switch from `StringSliceVar` to `StringArrayVar` to maintain commas
in kernel args.
Update entrypoint script to allow specifying extra kernel args.
Remove default console settings in kernel config.
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
Minor improvements to help when debugging.
Without this, if bringing up the default interface fails, the logs can
be misleading.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
In addition to adding a flag, this adds a field to the user data that allows
for extra kernel arguments to be specified.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This makes tests launch their own isolated instance of containerd with
its own root/state directories and listening socket address. Each test
brings this instance up/down on its own.
Add options to override containerd address in the code (used only in the
tests).
Enable parallel go test runs once again.
P.S. I wish I could share that 'SetupSuite' phase across the tests, but
afaik there's no way in Go to share `_test.go` code across packages. If
we put it in a normal package, this might pull test dependencies (like
`testify`) into production code, which I don't like.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
When we receive all the necessary files from trustd, we cancel the context. This
was treated as an error case and a message was logged accordingly. However,
this case is not really an error but rather a signal to stop trying to fetch a
given file.
Fixes #723
Add basic FileSet tests. Minor refactor to FileSet call to allow easier testing.
Add context canceled test for download.
Add config tests and trustd coverage.
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
This PR moves the reset API to the init API definition.
It leverages the same code we use for upgrades.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
First, use a cryptographically secure random number generator.
Second, generate 32 random bytes and don't limit them to any range, as
they're going to be base64-encoded anyway.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This unifies low-level filesystem walker code for `ls` and `cp`.
New features:
* `ls` now reports relative filenames
* `ls` now prints symlink destination for symlinks
* `cp` now properly always reports errors from the API
* `cp` now reports all the errors back to the client
Example for `ls`:
```
osctl-linux-amd64 --talosconfig talosconfig ls -l /var
MODE SIZE(B) LASTMOD NAME
drwxr-xr-x 4096 Jun 26 2019 .
Lrwxrwxrwx 4 Jun 25 2019 etc -> /etc
drwxr-xr-x 4096 Jun 26 2019 lib
drwxr-xr-x 4096 Jun 21 2019 libexec
drwxr-xr-x 4096 Jun 26 2019 log
drwxr-xr-x 4096 Jun 21 2019 mail
drwxr-xr-x 4096 Jun 26 2019 opt
Lrwxrwxrwx 6 Jun 21 2019 run -> ../run
drwxr-xr-x 4096 Jun 21 2019 spool
dtrwxrwxrwx 4096 Jun 21 2019 tmp
-rw------- 14979 Jun 26 2019 userdata.yaml
```
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Service `osd` doesn't have access to rootfs, as it is running in a
container, so move API to `init` which has unconstrained access to
rootfs. (This is in line with another API, `osctl cp`).
Fixes: #752
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The actual API is implemented in `init`, as it has access to the root
filesystem. `osd` proxies the API back to `init` with some tricks to support
gRPC streaming.
Given some absolute path, `init` produces and streams back a .tar.gz
archive with the filesystem contents.
`osctl cp` works in two modes. The first mode streams data to stdout, so
that we can do e.g.: `osctl cp /etc - | tar tz`. The second mode extracts
the archive to a specified location, dropping ownership info and adjusting
permissions a bit. Timestamps are not preserved.
If a full dump with owner/permissions is required, it's better to stream
data to `tar xz`. For a quick and dirty look into filesystem contents
as an unprivileged user, it's easier to use in-place extraction.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
For #711, this should be a complete fix: the test now waits for the container
to be started.
For #712, this is more of a workaround: the timeouts are adjusted to make the
failure less likely. The idea of the test is that the health check should be
aborted on timeout (1ms), while the health check succeeds if not aborted
within 50ms. Before the fix the values were 1ms/10ms, but with concurrent
scheduling there was still a chance that the goroutine exited successfully
after 10ms before the 1ms context deadline was observed.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes: #689, #690
Refactor container inspection code into a package of its own with some
rudimentary tests. Use this package consistently in osd commands dealing
with containers.
Improvements for the next PRs:
* implement API to fetch info about container by ID (to avoid fetching
full list)
* handle and display errors on client side, not to the log of the
server
* more tests, including k8s containers (how can we do that?)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>