This adds a "Troubleshooting" section to the documentation along with a
guide on generating a certificate. This covers the scenario in which a
user's certificate has expired.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This adds a note on the usage of random.trust_cpu to get around slow
boot times due to low entropy.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This replaces ECDSA with Ed25519. Ed25519 is considered to be safer and
more trustworthy than ECDSA NIST curves.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This removes the default privileged mode that all containers were
started with and adds the required capabilities on a per-service basis.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
- tweak whitespace between sections
- fix the top menu for small screens
- fix the terminal overlapping on small screens
- tweak wording on a few of the bullet points
- clean up the display of the "certified" logo on small screens
- clean up the "features" grid on medium/large screens
Signed-off-by: Tim Gerla <tim@gerla.net>
This PR introduces APId. This service replaces the frontend functionality
previously provided by OSD. The motivation is twofold:
1. Create a single-purpose application to expose the Talos API
2. Make use of code generation to DRY API changes
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
This PR fixes a bug on macOS where localhost was not added to the
certificate SANs when running `osctl cluster create`. Now that it is
present, we're able to use kubectl again.
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
- Most of the landing page is responsive on small/medium screens now. There are still
some bugs around the ascii cinema.
- Some wording tweaks, mostly I removed words to make things more concise. Feel free
to edit my edits.
- Simplified a couple of HTML constructs.
- Expanded the "features" section into two rows with a placeholder image for the 6th item.
Happy for feedback.
Signed-off-by: Tim Gerla <tim@gerla.net>
Things have changed since v0.2. This is a refresh to make the getting
started guide up to date.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This PR adds the ability for Talos to detect whether the machine config
that it downloads from the platform is a gzipped file. If so, it
decompresses it and overwrites the byte slice that gets written to disk.
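A minimal sketch of how such detection could work, assuming the check is done on the two-byte gzip magic header. The helper names `maybeGunzip` and `gzipBytes` are hypothetical and are not taken from the Talos source:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipMagic is the two-byte header that starts every gzip stream (RFC 1952).
var gzipMagic = []byte{0x1f, 0x8b}

// maybeGunzip (hypothetical name) returns the decompressed payload when data
// starts with the gzip magic bytes, and returns data unchanged otherwise.
func maybeGunzip(data []byte) ([]byte, error) {
	if !bytes.HasPrefix(data, gzipMagic) {
		return data, nil
	}

	r, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("opening gzip stream: %w", err)
	}
	defer r.Close()

	out, err := io.ReadAll(r)
	if err != nil {
		return nil, fmt.Errorf("reading gzip stream: %w", err)
	}

	return out, nil
}

// gzipBytes compresses data. It exists only to build a sample input.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	w.Write(data)
	w.Close()

	return buf.Bytes()
}

func main() {
	plain := []byte("sample machine config\n")

	// Both the plain and the gzipped input should yield the same bytes.
	for _, in := range [][]byte{plain, gzipBytes(plain)} {
		out, err := maybeGunzip(in)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q\n", out)
	}
}
```

Checking the magic bytes rather than a file name or content type means the caller does not need to know how the platform served the config.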
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
There are use cases where a Talos node will not be publicly accessible.
This treats platform external IP errors as non-fatal.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
I can't say exactly how those conflicts happen in the tests, but I tried
to randomize more container IDs and namespace names (both of which feed
into the final abstract Unix socket path).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This is not a 100% fix, as I can't reproduce the hanging tests in a local
environment, but the idea is the following:
1. `reaper.Start()` started the reaper loop in a goroutine, which begins
by subscribing to `SIGCHLD`.
2. `reaper.Start()` only spawned the goroutine and never waited for it.
3. If the reaper goroutine has not run yet after `reaper.Start()`
returns, but a process is created in the test and terminates, `SIGCHLD`
is ignored and the reaper never wakes up to reap the child.
4. The process test hangs as it waits for the reaper to reap the child
and return its exit status.
Sample failures:
```
=== RUN TestProcessSuite/runReaper=true/TestRunLogs
2019/10/15 14:17:41 state Running: Process Process(["/bin/sh" "-c" "echo -n \"Test 1\nTest 2\n\""]) started with PID 11802
coverage: 60.0% of statements
panic: test timed out after 10m0s
```
```
=== RUN TestCmdSuite/runReaper=true/TestRun
true
coverage: 71.4% of statements
panic: test timed out after 10m0s
```
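The race described in the numbered steps can be closed by subscribing to `SIGCHLD` before `Start` returns. The following is a rough Unix-only sketch of that idea, not the actual reaper code: `startReaper`, `reapAll`, and `selfSignalWakesReaper` are hypothetical names, and a self-sent `SIGCHLD` stands in for a child exiting.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// reapAll collects every child that has already exited, without blocking,
// and returns how many were reaped.
func reapAll() int {
	n := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err != nil || pid <= 0 {
			return n
		}
		n++
	}
}

// startReaper is a hypothetical stand-in for reaper.Start(). It subscribes to
// SIGCHLD synchronously, so the subscription exists before the caller can
// spawn a child, and only then starts the loop goroutine.
func startReaper(onWake func(reaped int)) (stop func()) {
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGCHLD)

	done := make(chan struct{})
	go func() {
		for {
			select {
			case <-sigCh:
				onWake(reapAll())
			case <-done:
				return
			}
		}
	}()

	return func() {
		signal.Stop(sigCh)
		close(done)
	}
}

// selfSignalWakesReaper sends SIGCHLD to this process immediately after
// startReaper returns and reports whether the reaper loop woke up.
func selfSignalWakesReaper() bool {
	woke := make(chan struct{}, 1)
	stop := startReaper(func(int) {
		select {
		case woke <- struct{}{}:
		default:
		}
	})
	defer stop()

	if err := syscall.Kill(syscall.Getpid(), syscall.SIGCHLD); err != nil {
		return false
	}

	select {
	case <-woke:
		return true
	case <-time.After(5 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("reaper woke up:", selfSignalWakesReaper())
}
```

Because `signal.Notify` runs in the caller's goroutine, a signal that arrives before the loop goroutine is scheduled is buffered in `sigCh` instead of being lost.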
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The problem was that if a container fails to start, it never reaches
'StateRunning' and the test hangs waiting for that state. An assertion
doesn't abort the whole test (it only aborts the goroutine it was called
from), so it doesn't help.
Fix that by signalling back if some containers fail to start.
This does not fix the root cause, but it should expose the actual failure
happening in this test.
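A sketch of the signalling pattern, assuming the start failures can be reported over a channel. `waitAllRunning`, `startStub`, and `demoTimeout` are hypothetical names used for illustration only, not the test's real helpers:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// demoTimeout bounds the wait in the examples below.
const demoTimeout = 2 * time.Second

// waitAllRunning starts n hypothetical containers concurrently and waits for
// all of them to report "running". A start failure is sent back on its own
// channel, so the waiter returns the error instead of hanging.
func waitAllRunning(n int, start func(i int) error, timeout time.Duration) error {
	running := make(chan int, n)
	failed := make(chan error, n)

	for i := 0; i < n; i++ {
		go func(i int) {
			if err := start(i); err != nil {
				failed <- fmt.Errorf("container %d: %w", i, err)
				return
			}
			running <- i
		}(i)
	}

	deadline := time.After(timeout)
	for seen := 0; seen < n; seen++ {
		select {
		case <-running:
		case err := <-failed:
			return err
		case <-deadline:
			return errors.New("timed out waiting for containers to start")
		}
	}

	return nil
}

// startStub returns a fake start function that fails for container failAt.
// Pass -1 for no failures.
func startStub(failAt int) func(int) error {
	return func(i int) error {
		if i == failAt {
			return errors.New("failed to start")
		}
		return nil
	}
}

func main() {
	fmt.Println(waitAllRunning(3, startStub(-1), demoTimeout)) // <nil>
	fmt.Println(waitAllRunning(3, startStub(1), demoTimeout))  // container 1: failed to start
}
```

The error is returned to the goroutine that owns the test, which can then fail the test there rather than from a worker goroutine.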
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This replaces `time.Sleep()` waits with calls to `retry.Constant` that
wait for a specific condition to be reached.
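To illustrate the difference from a fixed sleep, here is a self-contained sketch of polling for a condition. `pollUntil` and `demo` are hypothetical helpers written for this example; they do not reproduce the `retry.Constant` API:

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// pollUntil (hypothetical helper) calls cond every interval until it returns
// true, or gives up once the next attempt would fall past the timeout.
func pollUntil(timeout, interval time.Duration, cond func() bool) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().Add(interval).After(deadline) {
			return errors.New("condition not reached before timeout")
		}
		time.Sleep(interval)
	}
}

// demo flips a flag from another goroutine after 50ms and waits for it.
func demo() error {
	var ready int32
	go func() {
		time.Sleep(50 * time.Millisecond)
		atomic.StoreInt32(&ready, 1)
	}()

	return pollUntil(2*time.Second, 10*time.Millisecond, func() bool {
		return atomic.LoadInt32(&ready) == 1
	})
}

func main() {
	fmt.Println("wait result:", demo())
}
```

Unlike a fixed sleep, the wait ends as soon as the condition holds and fails with an error if it never does.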
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Failure:
```
--- FAIL: Test_constantRetryer_Retry (7.00s)
--- FAIL: Test_constantRetryer_Retry/test_expected_number_of_retries (2.00s)
constant_test.go:168: expected count of 2, got 3
```
The problem is that the retry interval (1s) aligns perfectly with the
timeout (2s), so depending on which timer fires first, the function might
be called two or three times. Fix that by extending the timeout a bit so
that it fits one more run and no more.
P.S. This test might still be flaky under load if the function doesn't
get a chance to run (starvation). The proper fix is to use fake time in
the tests.
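The arithmetic can be shown on a simulated clock. This is a simplified model, assuming attempts happen at t = 0, interval, 2*interval, and so on; `attempts` and `tieGoesToRetry` are hypothetical names, and the 2.5s value is only an example of a timeout that does not align with the interval:

```go
package main

import (
	"fmt"
	"time"
)

// attempts counts the calls made at t = 0, interval, 2*interval, ... on a
// simulated clock. tieGoesToRetry models which timer wins when a retry tick
// and the timeout fall on the same instant. interval must be positive.
func attempts(timeout, interval time.Duration, tieGoesToRetry bool) int {
	n := 0
	for t := time.Duration(0); ; t += interval {
		if t > timeout || (t == timeout && !tieGoesToRetry) {
			return n
		}
		n++
	}
}

func main() {
	timeouts := []time.Duration{2 * time.Second, 2500 * time.Millisecond}
	for _, timeout := range timeouts {
		fmt.Println(timeout,
			attempts(timeout, time.Second, false),
			attempts(timeout, time.Second, true))
	}
	// Output:
	// 2s 2 3
	// 2.5s 3 3
}
```

When the timeout is not a multiple of the interval, the tie never occurs and the count no longer depends on timer ordering.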
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
- Added common.proto to host NodeMetadata
- Fixed up go_package names so imports are generated with the proper
package names
- Fixed up the build (Dockerfile) to prevent copying the previously
generated Go proto files. This fixes a bug where we could incorrectly
copy a previously generated protobuf file instead of a new one that
was generated at an incorrect location/name/etc.
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
We were mistakenly overwriting the control plane endpoint in the
`generate` command. This fixes that and adds a simple validation of the
endpoint field in the config. We should expand the validation to ensure
that a valid IP or DNS name has been provided.
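As a starting point for the expanded validation, here is a sketch that accepts either an IP address or an RFC 1123 style DNS name. `validateEndpoint` is a hypothetical function, and it assumes the endpoint is a bare host with no scheme or port:

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"regexp"
	"strings"
)

// dnsLabel matches a single DNS label: letters, digits, and hyphens, not
// starting or ending with a hyphen, at most 63 characters.
var dnsLabel = regexp.MustCompile(`^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$`)

// validateEndpoint (hypothetical) accepts an IPv4/IPv6 address or a DNS name.
// It assumes a bare host, without scheme or port.
func validateEndpoint(endpoint string) error {
	if endpoint == "" {
		return errors.New("endpoint is required")
	}

	if net.ParseIP(endpoint) != nil {
		return nil
	}

	if len(endpoint) > 253 {
		return errors.New("endpoint is too long for a DNS name")
	}

	for _, label := range strings.Split(endpoint, ".") {
		if !dnsLabel.MatchString(label) {
			return fmt.Errorf("endpoint %q is not a valid IP or DNS name", endpoint)
		}
	}

	return nil
}

func main() {
	for _, e := range []string{"10.0.0.1", "::1", "cp.example.com", "", "bad_host!"} {
		fmt.Printf("%q: %v\n", e, validateEndpoint(e))
	}
}
```

Note that this check is syntactic only: it does not resolve the name, and an all-numeric string that is not a valid IP can still pass as a DNS name.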
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Big footers seem to be in style nowadays. This adds the CNCF logo to the
footer and increases the footer height. It also moves the certified
Kubernetes logo into the "What is Talos?" section.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This is partially driven by the upcoming API changes, but when we tell
protoc to look for api.proto, it'll find the first match in the
include (`-I`) directives.
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
This adds a feature about how Talos is ephemeral. I feel this is
important to get across to our users.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This adds a little more space between the landing page items to make the
page a little more readable.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Instead of floating the sidebar, we want it to be sticky so that the
footer doesn't cover the bottom of the sidebar.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>