Adds a auditd service that gathers all audit logs from kernel.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
Bump golangci-lint and fixup new warnings. Ignore check that checks for
used function parameters, it's kind of noisy and makes it confusing to
read interface implementations.
Signed-off-by: Noel Georgi <git@frezbo.dev>
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Introduce `cluster.NodeInfo` to represent the basic info about a node which can be used in the health checks. This information, where possible, will be populated by the discovery service in following PRs. Part of siderolabs#5554.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This enables golangci-lint via build tags for integration tests (this
should have been done long ago!), and fixes the linting errors.
Two tests were updated to reduce flakiness:
* apply config: wait for nodes to issue "boot done" sequence event
before proceeding
* recover: kill pods even if they appear after the initial set gets
killed (potential race condition with previous test).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.
And `pkg/machinery` is published as Go module inside Talos repository.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
With load-balancing enabled by default running `talosctl` without
`--nodes` is risky, as it might hit any control plane by default without
`--nodes`.
Only two commands do not enforce this check, as they do their own node
contexts: `crashdump` and `health` (client-side).
Integration tests were updated to always supply `--nodes` cli argument,
while doing that I refactored the storage for discovered nodes to use
existing `cluster.Info` interface.
The downside is that with e2e CAPI tests CLI tests will be mostly
skipped as we don't support discovery in CLI tests at the momemnt. This
can be fixed by using `talosctl kubeconfig` + `kubectl get nodes` for
node discovery.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes#2316
Simply update dependencies we don't track on version level to be
compatible with Talos components (like etcd or k8s).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This merges `osd` API into `machined`. API was copied from `osd` into
`machined`, and `osd` API was deprecated.
For backwards compatibility, `machined` still implements `osd` API, so
older Talos API clients can still talk to the node without changes.
Docs were updated. No functional changes.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Handling of multiple endpoints has already been implemented in #2094.
This PR enables round-robin policy so that grpc picks up new endpoint
for each call (and not send each request to the first control plane
node).
Endpoint list is randomized to handle cases when only one request is
going to be sent, so that it doesn't go always to the first node in the
list.
gprc handles dead/unresponsive nodes automatically for us.
`talosctl cluster create` and provision tests switched to use
client-side load balancer for Talos API.
On the additional improvements we got:
* `talosctl` now reports correct node IP when using commands without
`-n`, not the loadbalancer IP (if using multiple endpoints of course)
* loadbalancer can't provide reliable handling of errors when upstream
server is unresponsive or there're no upstreams available, grpc returns
much more helpful errors
Fixes#1641
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This replaces logging to files with inotify following to pure in-memory
circular buffer which grows on demand capped at specified maximum
capacity.
The concern with previous approach was that logs on tmpfs were growing
without any bound potentially consuming all the node memory.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.
A few highlights are:
- no more event bus
- functional approach to tasks (no more types defined for each task)
- the task function definition now offers a lot more context, like
access to raw API requests, the current sequence, a logger, the new
state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
install force option
Additionally, this pulls various packes in under machined to make the
code easier to navigate.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This is a rename of the osctl binary. We decided that talosctl is a
better name for the Talos CLI. This does not break any APIs, but does
make older documentation only accurate for previous versions of Talos.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Fixes#1563
This implements dmesg reading via `/dev/kmsg`, with message parsing and
formatting. Kernel log facility and severity are parsed, timestamp is
calculated relative to boot time (it's accurate unless time jumps a
lot during node lifetime).
New flags to follow dmesg was added, tail flag allows to stream only new
message (ignoring old messages). We could try to implement tailing last
N messages, just a bit more work, open to suggestions (for symmetry with
regular logs).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR brings our protobuf files into conformance with the protobuf
style guide, and community conventions. It is purely renames, along with
generated docs.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Now default is not to follow the logs (which is similar to `kubectl logs`).
Integration test was added for `Logs()` API and `osctl logs` command.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>