88 Commits

Author SHA1 Message Date
Spencer Smith
aed8c06730 chore: rename v1 node configs to v1alpha1
This PR moves to using v1alpha1 as the initial node config version, so
we can graduate these configs a little more cleanly later on.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-09-09 13:03:49 -04:00
Seán C McCord
beecb70374 feat: Allow spec of canonical controlplane addr
Broke the binding between the discrete IP addresses of the control plane
elements and the ControlPlaneEndpoint.  This allows the specification of
a canonical controlplane address which may optionally be a DNS name.

Fixes #1131

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 17:18:52 -07:00
Seán C McCord
47a361c5b6 fix(osctl): use real userdata as defaults for install
This modifies `osctl install` to use the provided userdata as the source
for default installation values.  This allows such things as
userdata-supplied extra kernel parameters to be automatically
included in the bootloader.

Fixes #1102

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 17:00:12 -07:00
Seán C McCord
bcb6a2d3a5 fix: prepend custom options for kernel commandline
Added a decomposition option to the kernel.NewDefaultCmdline() so that
the Defaults can be added _after_ constructing a custom commandline.
This is then implemented for `osctl install`.

Fixes #1128

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 16:58:49 -07:00
Andrew Rynhard
71e8a5fccf chore: remove top output border
This should give it a closer feel to the rest of the UX.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-06 19:48:12 -07:00
Andrew Rynhard
a6e12b498d chore: align time command with output standards
This changes the output to a table writer with all caps for headers.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-05 14:42:43 -07:00
Andrew Rynhard
1bbed6907b chore: fix generate version flag and mark v0 as deprecated
Since the command's name is 'generate' the 'gen' prefix is not needed
in the version flag. The flag is scoped under the generate command so
it should be very clear that the '--version' flag is used to control the
config version.

We also move to defaulting to v0 since v1 is new and still needs to be
tested in the real world. We can default to v1 in the next release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-30 06:59:54 -07:00
Andrew Rynhard
d89b199825 chore: change upgrade request "url" to "image"
This aligns the nomenclature used throughout the codebase.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 21:43:20 -07:00
Andrew Rynhard
2e8f393fc5 chore: remove unused init token
This removes a token that we never used. Right now it's just noise, so
let's remove it.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 21:36:52 -07:00
Andrew Rynhard
1b8bf0d3aa fix: use unique variables for CLI flags
Since the cluster create command and the upgrade command shared a common
variable, and the upgrade defaults to an empty string, we get an invalid
reference format error when attempting to create a cluster. This makes
the variables unique to avoid that.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 19:33:30 -07:00
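
The shared-variable problem described in 1b8bf0d3aa can be reproduced in a few lines. This is a minimal sketch using the standard library `flag` package rather than the flag library `osctl` actually uses; the flag name `image` and the default `example/talos:latest` are assumptions made for illustration. Registering a flag writes its default into the bound variable, so the last registration wins.

```go
package main

import (
	"flag"
	"fmt"
)

// bindShared registers an "image" flag on two flag sets that both write to
// the same variable, then returns whatever default is left in that variable.
func bindShared() string {
	var image string
	create := flag.NewFlagSet("create", flag.ContinueOnError)
	upgrade := flag.NewFlagSet("upgrade", flag.ContinueOnError)
	create.StringVar(&image, "image", "example/talos:latest", "image to run (hypothetical)")
	// Registering the second flag overwrites the first default with "".
	upgrade.StringVar(&image, "image", "", "image to upgrade to (hypothetical)")
	return image
}

// bindUnique gives each command its own variable, so the defaults cannot
// overwrite each other.
func bindUnique() (string, string) {
	var createImage, upgradeImage string
	create := flag.NewFlagSet("create", flag.ContinueOnError)
	upgrade := flag.NewFlagSet("upgrade", flag.ContinueOnError)
	create.StringVar(&createImage, "image", "example/talos:latest", "image to run (hypothetical)")
	upgrade.StringVar(&upgradeImage, "image", "", "image to upgrade to (hypothetical)")
	return createImage, upgradeImage
}

func main() {
	fmt.Printf("shared: %q\n", bindShared()) // shared: ""
	c, u := bindUnique()
	fmt.Printf("unique: %q %q\n", c, u) // unique: "example/talos:latest" ""
}
```

With the shared variable, the create command ends up with an empty image reference, which matches the failure mode the commit message describes.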
Andrew Rynhard
66c848cc0d fix: make --target persistent across all commands
This flag is missing in a number of places. This ensures that all
commands in the future will have this flag. A potential cleanup would
be to hide this flag in commands where it does not make sense. For now I
think it's best to have it everywhere.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 18:57:53 -07:00
Andrew Rynhard
d098785a17 chore: remove local upgrade functionality
We have no need for this anymore since installs and upgrades are now
completely handled in a container.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 18:44:18 -07:00
Andrew Rynhard
4247b1befc chore: output top header in all caps
This changes the top output to be consistent with the rest of the CLI
output.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 18:04:39 -07:00
Andrew Rynhard
d4770d41ad feat: run installs via container
This moves to performing installs via a container.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-27 15:01:20 -05:00
Spencer Smith
f85750cdca feat: generate and use v1 machine configs
This PR will implement the v1 machine config proposal. This will allow
for a streamlined config for talos nodes.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-08-26 19:36:14 -04:00
Brad Beam
692571bdec feat(networkd): Add grpc endpoint
Allows us to list routes and interface details

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-25 19:48:08 -07:00
Brad Beam
d36007fb29 feat(osd): Add ntpd client
Allows us to access ntp api

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-08-25 13:38:34 -07:00
Seán C McCord
7b217c79d7 feat: allow specification of additional API SANs
Adds handler for specification of additional subject alt names (SANs) for
the API Server when generating a new cluster configuration using
`osctl`.

Fixes #800

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-08-21 16:25:54 -07:00
Andrew Rynhard
794c7231f5 feat: run dedicated instance of containerd for system services
In order to facilitate upgrades and resets that are capable of
manipulating the system block device, we need to run an instance of
containerd that has zero dependencies on the disk. We run containerd
purely in memory for running system services.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-19 12:32:59 -07:00
Andrew Rynhard
6940aaf233 fix: verify installation definition
This fixes the possibility of panicking on a nil pointer by running the
verification steps earlier.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-16 09:58:12 -07:00
Andrew Rynhard
a116145c1b feat: rename DATA partition to EPHEMERAL
This changes the data partition name to something more appropriate. We
chose ephemeral to make it very clear that the disk should not be used
for application data.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-15 08:00:22 -07:00
Seán C McCord
ae77d6e053 fix: format IPv6 host entries properly
This reworks a bunch of the formatting for the userdata generation to
output a cleaner talos config when using IPv6 masters and `osctl config
generate`.

Please note that this changes the scope of concern for master indexing,
keeping `osctl` blissfully unaware of the master-reference chaining.
All it does is report the index of the master it is trying to generate.
The generator itself handles the reference chaining.

Fixes #916, fixes #917, and fixes #918

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-08-12 11:35:38 -07:00
Seán C McCord
d0ff28a8c7 fix: enclose server address in brackets if IPv6
Fixes #980

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-08-10 17:42:17 -07:00
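
For reference, the bracket rule that d0ff28a8c7 refers to is what the Go standard library's `net.JoinHostPort` applies. The sketch below only illustrates the rule; it is not taken from the commit, and the `endpoint` helper and port number are assumptions.

```go
package main

import (
	"fmt"
	"net"
)

// endpoint joins a host and a port. net.JoinHostPort wraps the host in
// square brackets when it contains a colon, as IPv6 literals do.
func endpoint(host, port string) string {
	return net.JoinHostPort(host, port)
}

func main() {
	fmt.Println(endpoint("10.0.0.1", "6443"))    // 10.0.0.1:6443
	fmt.Println(endpoint("2001:db8::1", "6443")) // [2001:db8::1]:6443
}
```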
Andrey Smirnov
ae54f7e40d fix: stalls in local Docker cluster boot
The problem was triggered by the udevd trigger; the root cause is not
clear, but the workaround is to disable it in container mode.

Implement CPU/mem limits for `osctl cluster create`, apply defaults,
bump defaults for cicd.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-10 13:31:47 +03:00
Andrew Rynhard
90c91807bd refactor: restructure the project layout
This change moves packages into more appropriate places.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-01 22:19:42 -07:00
Andrew Rynhard
ca35b85300 refactor: improve installation reliability
This change aims to make installations more unified and reliable. It
introduces the concept of a mountpoint manager that is capable of
mounting, unmounting, and moving a set of mountpoints in the correct
order.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-01 11:44:40 -07:00
Andrey Smirnov
9c63f4ed0a feat(init): implement complete API for service lifecycle (start/stop)
It is now possible to `start`/`stop`/`restart` any service via `osctl`
commands.

There are some changes in `ServiceRunner` to support re-use (re-entering
running state). `Services` singleton now tracks service running state to
avoid calling `Start()` on an already running `ServiceRunner` instance.
Method `Start()` was renamed to `LoadAndStart()` to break up service
loading (adding to the list of services) and the actual service start.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-01 11:16:57 -07:00
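
The running-state tracking described in 9c63f4ed0a can be sketched roughly as follows. The `registry` type and its methods are hypothetical stand-ins, not the actual `Services` or `ServiceRunner` code; the sketch only shows the idea of refusing to start something that is already marked as running.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for a singleton that remembers which
// services are currently running.
type registry struct {
	mu      sync.Mutex
	running map[string]bool
}

func newRegistry() *registry {
	return &registry{running: map[string]bool{}}
}

// Start marks id as running and calls run. It returns false without calling
// run when id is already marked as running.
func (r *registry) Start(id string, run func()) bool {
	r.mu.Lock()
	if r.running[id] {
		r.mu.Unlock()
		return false
	}
	r.running[id] = true
	r.mu.Unlock()
	run()
	return true
}

// Stop clears the running mark so the service can be started again.
func (r *registry) Stop(id string) {
	r.mu.Lock()
	delete(r.running, id)
	r.mu.Unlock()
}

func main() {
	r := newRegistry()
	noop := func() {}
	fmt.Println(r.Start("ntpd", noop)) // true
	fmt.Println(r.Start("ntpd", noop)) // false
	r.Stop("ntpd")
	fmt.Println(r.Start("ntpd", noop)) // true
}
```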
Andrey Smirnov
ac963ad7e1 feat(osctl): allow configurable number of masters to cluster create
This allows running tiny Talos clusters (which is sometimes nice for
local testing), e.g. with just a single master and zero workers.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-30 15:32:16 -07:00
Andrew Rynhard
e63c882b89 refactor: split machined into phases
This change aims to standardize the boot process. It introduces the
concept of a phase, which is composed of tasks. Phases are run serially
and the tasks that make up a phase are run concurrently.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-29 12:40:03 -07:00
Andrew Rynhard
6852fa969f chore: create raw image as sparse file
This change reduces the size of raw disk significantly by creating it as
a sparse file.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-25 11:28:07 -07:00
Andrew Rynhard
0ec17e4169 feat: run rootfs from squashfs
This change moves the rootfs to a squashfs image.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-25 08:38:31 -07:00
Andrew Rynhard
b4383e35db feat: move df API to init
This change allows for more accurate mount reporting as /proc/mounts is
a symlink to /proc/self/mounts and contains mounts that are relative to
the running process. In our case this was osd. This caused inaccurate
reporting of mounts since they were relative to osd when we really
wanted mounts relative to machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-24 15:28:37 -07:00
Spencer Smith
6fd685dad0 feat: allow specification of mtu for cluster create
This PR adds the ability to set mtu for the cluster create networks.
Default is 1440, which seems to be the default for calico.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-17 07:34:28 -07:00
Andrew Rynhard
8e8aae98dd feat: add machined
This commit splits our current init into init and machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-16 13:12:21 -07:00
Brad Beam
e9482a4041 fix: Fix integration of extra kernel args
Switch from `StringSliceVar` to `StringArrayVar` to maintain commas
in kernel args.

Update entrypoint script to allow specifying extra kernel args.

Remove default console settings in kernel config.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-07-16 14:38:55 -05:00
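
The difference behind the `StringSliceVar` to `StringArrayVar` switch in e9482a4041 is that a slice-style flag splits each value on commas, while an array-style flag keeps each value verbatim. The sketch below imitates both behaviours with the standard library `flag` package so that it stays self-contained; `splitValue`, `verbatimValue`, and both flag names are hypothetical stand-ins, not the real flag library types.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// splitValue imitates a slice-style flag: every value is split on commas.
type splitValue []string

func (s *splitValue) String() string { return strings.Join(*s, ",") }
func (s *splitValue) Set(v string) error {
	*s = append(*s, strings.Split(v, ",")...)
	return nil
}

// verbatimValue imitates an array-style flag: every value is kept as given.
type verbatimValue []string

func (a *verbatimValue) String() string { return strings.Join(*a, " ") }
func (a *verbatimValue) Set(v string) error {
	*a = append(*a, v)
	return nil
}

// parse feeds args through both flag styles and returns what each collected.
func parse(args []string) (splitValue, verbatimValue, error) {
	var split splitValue
	var verbatim verbatimValue
	fs := flag.NewFlagSet("install", flag.ContinueOnError)
	fs.Var(&split, "split-arg", "kernel arg, split on commas (hypothetical)")
	fs.Var(&verbatim, "extra-kernel-arg", "kernel arg, kept verbatim (hypothetical)")
	err := fs.Parse(args)
	return split, verbatim, err
}

func main() {
	arg := "console=ttyS0,115200"
	split, verbatim, err := parse([]string{"--split-arg", arg, "--extra-kernel-arg", arg})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q\n", []string(split))    // ["console=ttyS0" "115200"]
	fmt.Printf("%q\n", []string(verbatim)) // ["console=ttyS0,115200"]
}
```

A kernel argument such as `console=ttyS0,115200` only survives intact in the verbatim style.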
Andrew Rynhard
0c17564398 chore: move init to /sbin
In order to run Talos with ignite, we need to have init at /sbin/init.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-15 13:26:09 -07:00
Andrew Rynhard
d197d5c6cd feat: add install flag for extra kernel args
In addition to adding a flag, this adds a field to the user data that allows
for extra kernel arguments to be specified.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-12 13:27:44 -07:00
Spencer Smith
ff9934cfe2 chore: update toolchain version and output created config files
Decided to combine two very small changes (which I'm now grumpy at myself for doing).

First, we'll update the toolchain image versions to allow for the use of a new containerd and runc. Also updated go.mod and go.sum to make use of newer containerd version. Closes #743 and #744.

Second, I added the bit of logic to osctl config generate to determine the working directory and let the user know that we created the various yaml files there. Closes #760.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-05 17:59:25 -04:00
Andrew Rynhard
5d8ee0a3a5 fix: use existing logic to perform reset
This PR moves the reset API to the init API definition.
It leverages the same code we use for upgrades.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-04 18:26:14 -07:00
Andrew Rynhard
cca60ed121 fix: probe specified install device (#818)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-02 20:46:29 -07:00
Andrey Smirnov
237e903f91 feat(osd): implement CRI inspector for containers (#817)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-02 15:48:00 -07:00
Andrey Smirnov
0662af19d1 chore: seed math.rand PRNG on startup in every service (#801)
This is important as otherwise `math/rand` outputs a predictable
sequence each time.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-28 11:03:15 -07:00
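
The predictability that 0662af19d1 refers to is easy to demonstrate: a `math/rand` generator built from a fixed seed always yields the same sequence. Go releases before 1.20 behaved as if the global generator had been seeded with 1 unless it was seeded explicitly. The `draw` helper below is an illustrative assumption and uses a local generator rather than the global one.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// draw returns n pseudo-random integers in [0, 100) from a generator that
// was created with the given seed.
func draw(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(100)
	}
	return out
}

func main() {
	// The same seed produces the same sequence on every call and every run.
	fmt.Println(draw(1, 3), draw(1, 3))
	// A time-based seed varies between runs.
	fmt.Println(draw(time.Now().UnixNano(), 3))
}
```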
Andrey Smirnov
17f28d3461 feat(osctl): improve output of stats and ps commands (#788)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-26 15:37:54 -07:00
Andrey Smirnov
6d5ee0ca80 feat(init): unify filesystem walkers for ls/cp APIs (#779)
This unifies low-level filesystem walker code for `ls` and `cp`.

New features:

* `ls` now reports relative filenames
* `ls` now prints symlink destination for symlinks
* `cp` now always properly reports errors from the API
* `cp` now reports all the errors back to the client

Example for `ls`:

```
osctl-linux-amd64 --talosconfig talosconfig ls -l /var
MODE          SIZE(B)   LASTMOD       NAME
drwxr-xr-x    4096      Jun 26 2019   .
Lrwxrwxrwx    4         Jun 25 2019   etc -> /etc
drwxr-xr-x    4096      Jun 26 2019   lib
drwxr-xr-x    4096      Jun 21 2019   libexec
drwxr-xr-x    4096      Jun 26 2019   log
drwxr-xr-x    4096      Jun 21 2019   mail
drwxr-xr-x    4096      Jun 26 2019   opt
Lrwxrwxrwx    6         Jun 21 2019   run -> ../run
drwxr-xr-x    4096      Jun 21 2019   spool
dtrwxrwxrwx   4096      Jun 21 2019   tmp
-rw-------    14979     Jun 26 2019   userdata.yaml
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-26 17:43:09 +03:00
Seán C. McCord
81163cefb4 feat(osd): extend Routes API (#756)
Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-06-22 08:03:13 -07:00
Andrey Smirnov
76071abbb8 feat(init): move 'ls' API to init from osd (#755)
Service `osd` doesn't have access to rootfs, as it is running in a
container, so move API to `init` which has unconstrained access to
rootfs. (This is in line with another API, `osctl cp`).

Fixes: #752

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-21 22:29:39 +03:00
Andrew Rynhard
1f36f0e7df refactor(osctl): use UserHomeDir to detect user home directory (#749)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-06-20 17:57:57 -07:00
Andrey Smirnov
9ed45f7090 feat(osctl): implement 'cp' to copy files out of the Talos node (#740)
The actual API is implemented in `init`, as it has access to the root
filesystem. `osd` proxies the API back to `init` with some tricks to
support grpc streaming.

Given some absolute path, `init` produces and streams back a .tar.gz
archive with the filesystem contents.

`osctl cp` works in two modes. First mode streams data to stdout, so
that we can do e.g.: `osctl cp /etc - | tar tz`. Second mode extracts
archive to specified location, dropping ownership info and adjusting
permissions a bit. Timestamps are not preserved.

If a full dump with owner/permissions is required, it's better to stream
data to `tar xz`; for a quick and dirty look into filesystem contents
under an unprivileged user, it's easier to use in-place extraction.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-20 17:02:58 -07:00
Andrey Smirnov
0c0a0340b2 fix(osctl): allow '-target' flag for osctl restart (#732)
I couldn't find any use for the `timeout` flag nor the value passed in
the API, but it blocks the `target` flag, which is much more useful and
is present in other commands.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-14 21:37:57 +03:00
Andrey Smirnov
fb320a894b fix(osctl): Revert "display non-fatal errors from ps/stats in osctl (#724)" (#727)
This reverts commit f200eb7a8a0b7c2d29710f695000eb7680ce8b7d.

grpc can't send back both response and an error.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-07 22:50:05 +03:00