There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Introduce `cluster.NodeInfo` to represent the basic info about a node which can be used in the health checks. This information, where possible, will be populated by the discovery service in following PRs. Part of siderolabs#5554.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This fixes output of `talosctl containers` to show failed/exited
containers so that it's possible to see e.g. `kube-apiserver` container
when it fails to start. This also enables using ID from the container
list to see logs of failing containers, so it's easy to debug issues
when control plane pods don't start because of wrong configuration.
Also remove option to use either CRI or containerd inspector, default to
containerd for system namespace and to CRI for kubernetes namespace.
The only side effect is that we can't see `kubelet` container in the
output of `talosctl containers -k`, but `kubelet` itself is available in
`talosctl services` and `talosctl logs kubelet`.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This enables golangci-lint via build tags for integration tests (this
should have been done long ago!), and fixes the linting errors.
Two tests were updated to reduce flakiness:
* apply config: wait for nodes to issue "boot done" sequence event
before proceeding
* recover: kill pods even if they appear after the initial set gets
killed (potential race condition with previous test).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.
And `pkg/machinery` is published as Go module inside Talos repository.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This makes `pkg/config` directly importable from other projects.
There should be no functional changes.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
With load-balancing enabled by default running `talosctl` without
`--nodes` is risky, as it might hit any control plane by default without
`--nodes`.
Only two commands do not enforce this check, as they do their own node
contexts: `crashdump` and `health` (client-side).
Integration tests were updated to always supply `--nodes` cli argument,
while doing that I refactored the storage for discovered nodes to use
existing `cluster.Info` interface.
The downside is that with e2e CAPI tests CLI tests will be mostly
skipped as we don't support discovery in CLI tests at the momemnt. This
can be fixed by using `talosctl kubeconfig` + `kubectl get nodes` for
node discovery.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
There were three problems:
* cli tests did commands in sequence assuming they all hit the same
node, but with load-balancing it's no longer true
* restart test was affected, as it hit different node for check after
restart, and it succeeded immediately, while on original node process
was still starting which resulted in failure in the next tests; replace
the check to make sure service is up and healthy, so that test leaves
cluster in a good state
* restart API response had wrong format (no message returned) which
resulted in failures with apid proxy (when used with `-n`)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves our test scripts to using the bootstrap API. Some
automation around invoking the bootstrap API was also added
to give the same ease of use when creating clusters with the
CLI.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
I added tests for all the commands which work reliably in container mode.
Some tests are naive, some are more sophisticated. While going
through the tests, I think I found a small bug in `osctl gen keypair`.
When we get reliable KVM tests, I can revisit and add missing
tests for time, reboot, shutdown and friends.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>