3 Commits

Author SHA1 Message Date
Andrey Smirnov
c4fb7dad0e
fix: force DNS runner shutdown on timeout
I observed a random failure (via Sidero Metal integration tests) when
Talos fails to reboot due to the runner never shutting down.

A probable root cause is a bug in the `dns` library (which runs the DNS
server), but we need to ensure this code always terminates, even at the
price of leaking a running DNS server.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-04-24 16:54:09 +04:00
Dmitriy Matrenichev
dab30a8b9f
fix: ensure no goroutines escape in dns controller
- Remove all reliance on finalizers.
- Add `Close` method to CoreDNS `Proxy` struct.
- Wait for `Runner.Serve` to complete.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2025-03-07 22:08:46 +03:00
Dmitriy Matrenichev
4fe6dc8a0a
chore: clean dns code
Split from #9596 (without IPv6 stuff). This PR does the following:
- Refactor `DNSResolveCacheController`: move most of the logic to `dns` package types, simplifying and streamlining it.
- Replace most of the goroutine orchestration with the suture package.
- Support per-item reaction to DNS listeners/servers failing to start. This allows us to ignore IPv6 errors if IPv6 is disabled.
- Support per-item reaction to DNS listeners/servers failing to stop.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-11-08 21:54:28 +03:00