13 Commits

Author SHA1 Message Date
Andrey Smirnov
1849b53881
feat: update dependencies
Bump Go modules, linters, other minor dependencies.

Linux 6.12.17, containerd 2.0.3.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2025-03-04 21:03:43 +04:00
Andrey Smirnov
c735d14928
fix: wait for udevd before starting sync
Fixes #8074

One part of the fix is to wait for udevd to be ready: network interfaces
are not ready before udevd is, so sync is not possible until then
anyway.

The second part is that u-root's rtc package now supports closing RTC
devices, so we can properly open and close the device as part of the
sync loop (vs. the previous workaround with sync.Once).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-11-27 20:33:03 +04:00
Andrey Smirnov
d4a6d017db
fix: ignore invalid NTP responses
Due to a bug introduced when refactoring for PTP devices, invalid NTP
responses (including, for example, NTP kiss of death) were incorrectly
handled when only a single NTP server was used.

The error was logged, but the response was still used to adjust the
time, which led to unexpected time jumps.

Properly ignore any invalid NTP response.
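
As an illustration of the intended behavior, here is a minimal sketch of rejecting a response before its offset is ever used. The `response` type, its fields, and the `usableOffset` helper are hypothetical and are not taken from the actual code; the checks shown (stratum 0 for kiss of death, leap indicator 3 for an unsynchronized server) are only two examples of what "invalid" can mean.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// response is a hypothetical, simplified view of an NTP reply.
type response struct {
	Stratum uint8         // 0 is used by kiss-of-death packets
	Leap    uint8         // 3 means the server clock is not synchronized
	Offset  time.Duration // measured offset between local and server clocks
}

var errInvalidResponse = errors.New("invalid NTP response")

// validate reports whether the response is safe to use for adjusting time.
func validate(r response) error {
	if r.Stratum == 0 {
		return fmt.Errorf("%w: kiss of death", errInvalidResponse)
	}

	if r.Leap == 3 {
		return fmt.Errorf("%w: server not synchronized", errInvalidResponse)
	}

	return nil
}

// usableOffset returns the offset to apply, or ok == false if the
// response must be ignored.
func usableOffset(r response) (offset time.Duration, ok bool) {
	if err := validate(r); err != nil {
		// Logging the error and carrying on is not enough:
		// return early so the offset is never applied.
		return 0, false
	}

	return r.Offset, true
}

func main() {
	_, ok := usableOffset(response{Stratum: 0, Offset: time.Hour})
	fmt.Println(ok) // false: the kiss-of-death response is ignored
}
```

The key point is that the validation result gates the adjustment itself, regardless of how many servers are configured.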

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-12 20:46:44 +04:00
Dmitriy Matrenichev
aca475c665
chore: small usability fixes
* Replace logging.Wrap(log.Writer()) with zaptest.NewLogger(suite.T()) where possible.
* Replace reflect.DeepEqual with ==, slices.Equal or bytes.Equal where possible.
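
A small sketch of the second replacement, using made-up helper names (`sameServers`, `samePayload`) that do not come from the code base. One thing to keep in mind: `reflect.DeepEqual` distinguishes a nil slice from an empty one, while `slices.Equal` and `bytes.Equal` treat them as equal, so the replacement is only safe where that distinction does not matter.

```go
package main

import (
	"bytes"
	"fmt"
	"slices"
)

// sameServers compares two string slices element by element.
func sameServers(a, b []string) bool {
	return slices.Equal(a, b)
}

// samePayload compares two byte slices.
func samePayload(a, b []byte) bool {
	return bytes.Equal(a, b)
}

func main() {
	fmt.Println(sameServers([]string{"a", "b"}, []string{"a", "b"}))
	fmt.Println(samePayload([]byte("x"), []byte("y")))
	// A nil slice and an empty slice compare as equal here.
	fmt.Println(sameServers(nil, []string{}))
}
```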

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-06-10 05:48:11 +03:00
Dmitriy Matrenichev
19f15a840c
chore: bump golangci-lint to 1.57.0
Fix all discovered issues.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-21 01:06:53 +03:00
Dmitriy Matrenichev
fa3b933705
chore: replace fmt.Errorf with errors.New where possible
This time, use the `eg` tool from the `x/tools` repo to do this.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-14 17:39:30 +03:00
Andrey Smirnov
4eab3017b0
fix: calculate log2i properly
Fixes #7080

The real bug was an off-by-one error in the `log2i` implementation; the
other changes are cleanups, as the `x/sys/unix` package now contains all
the constants we need.
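
For reference, an integer base-2 logarithm can be written on top of `math/bits` as below. This is only a sketch of the general technique and not the implementation from this commit; the choice to return 0 for a zero input is an assumption made here for the example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// log2i returns floor(log2(v)) for v > 0.
// Returning 0 for v == 0 is an arbitrary choice for this sketch.
func log2i(v uint64) int {
	if v == 0 {
		return 0
	}

	// bits.Len64 is the number of bits needed to represent v, which is
	// floor(log2(v)) + 1; dropping the "- 1" is an easy off-by-one.
	return bits.Len64(v) - 1
}

func main() {
	fmt.Println(log2i(1), log2i(2), log2i(1023), log2i(1024)) // 0 1 9 10
}
```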

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-08-03 21:17:58 +04:00
Andrey Smirnov
96aa9638f7
chore: rename talos-systems/talos to siderolabs/talos
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-03 16:50:32 +04:00
Andrey Smirnov
343c55762e
chore: replace talos-systems Go modules with siderolabs
This is the first step towards replacing all import paths to be based on
`siderolabs/` instead of `talos-systems/`.

All updates contain no functional changes, just refactorings to adapt to
the new path structure.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-01 12:55:40 +04:00
Andrey Smirnov
a2233bfe46
fix: improve NTP sync process
Fixes #4425

* add more logging for responses and sync process
* adjust time sync constants
* change the way poll interval is chosen (increasing on good sync,
decreasing on variation)
* filter out spikes
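
The poll interval and spike filtering ideas above can be sketched roughly as follows. All constants and function names here (`minPoll`, `maxPoll`, `goodOffset`, `spikeFactor`, `nextPoll`, `isSpike`) are assumptions chosen for illustration; the actual values and flow in the controller may differ.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical tuning constants, for illustration only.
const (
	minPoll     = 32 * time.Second
	maxPoll     = 2048 * time.Second
	goodOffset  = 10 * time.Millisecond
	badOffset   = 100 * time.Millisecond
	spikeFactor = 3
)

// nextPoll doubles the poll interval after a good sync and halves it
// when the offset is large, clamping the result to [minPoll, maxPoll].
func nextPoll(current, offset time.Duration) time.Duration {
	switch {
	case offset.Abs() <= goodOffset:
		current *= 2
	case offset.Abs() > badOffset:
		current /= 2
	}

	if current < minPoll {
		current = minPoll
	}

	if current > maxPoll {
		current = maxPoll
	}

	return current
}

// isSpike reports whether a sample deviates from the running jitter
// estimate by so much that it should be dropped.
func isSpike(offset, jitter time.Duration) bool {
	return jitter > 0 && offset.Abs() > spikeFactor*jitter
}

func main() {
	fmt.Println(nextPoll(64*time.Second, time.Millisecond))
	fmt.Println(nextPoll(64*time.Second, 500*time.Millisecond))
	fmt.Println(isSpike(10*time.Millisecond, time.Millisecond))
}
```

Backing off on good syncs reduces load on the time server, while shortening the interval on variation lets the clock converge faster.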

Based on flow in https://github.com/systemd/systemd/blob/main/src/timesync/timesyncd-manager.c

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-11 20:39:07 +03:00
Artem Chernyshev
1db301edf6
feat: switch controller-runtime to zap.Logger
Enable logging using default development config with some fine tuning.
Additionally, now `info` and below logs go to kmsg.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-05-25 02:15:31 -07:00
Andrey Smirnov
4d50a4edd0
fix: update the way NTP sync uses adjtimex syscall
Fixes #3582

Time adjustment code was rewritten after taking a peek at other time
sync implementations. It looks like `adjtimex` was used incorrectly
before, which led to huge time oscillations and `STA_UNSYNC` being set
by the kernel. Instead of setting the time via `settimeofday`, use
`adjtimex` as well to set the time on a big jump.
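
The slew-vs-jump decision can be sketched without touching the syscall itself. The `stepThreshold` value, the `adjustment` type and `planAdjustment` are hypothetical names for this example. The normalization of the sub-second part to a non-negative value reflects my understanding of what the kernel expects for a step offset passed to `adjtimex`, and should be checked against the man page.

```go
package main

import (
	"fmt"
	"time"
)

// stepThreshold is a hypothetical cut-off: larger offsets are applied
// as a jump, smaller ones are slewed gradually.
const stepThreshold = 400 * time.Millisecond

// adjustment describes how an offset would be handed to the kernel.
type adjustment struct {
	Step bool  // true: jump the clock; false: slew it
	Sec  int64 // whole seconds of the offset (may be negative)
	Nsec int64 // sub-second part, always in [0, 1e9)
}

// planAdjustment splits the offset into seconds and nanoseconds so that
// the nanosecond part is never negative, and picks the adjustment mode.
func planAdjustment(offset time.Duration) adjustment {
	sec := int64(offset / time.Second)
	nsec := int64(offset % time.Second)

	// Go truncates toward zero, so a negative offset yields a negative
	// remainder; borrow one second to make it non-negative.
	if nsec < 0 {
		sec--
		nsec += int64(time.Second)
	}

	return adjustment{
		Step: offset.Abs() > stepThreshold,
		Sec:  sec,
		Nsec: nsec,
	}
}

func main() {
	fmt.Printf("%+v\n", planAdjustment(-1500*time.Millisecond))
	fmt.Printf("%+v\n", planAdjustment(250*time.Microsecond))
}
```

For example, an offset of -1.5s becomes `Sec: -2, Nsec: 500000000`, which still sums to -1.5s.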

With this change, oscillation is pretty stable around zero, in
microsecond range (polling interval lowered for testing):

```
172.20.0.2: 2021/05/06 18:51:28  time.SyncController: adjusting time (slew) by -11.375µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:37  time.SyncController: adjusting time (slew) by 426.276µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:50  time.SyncController: adjusting time (slew) by -622.037µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:58  time.SyncController: adjusting time (slew) by 59.822µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:11  time.SyncController: adjusting time (slew) by 126.855µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:20  time.SyncController: adjusting time (slew) by 17.334µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:28  time.SyncController: adjusting time (slew) by -108.787µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:34  time.SyncController: adjusting time (slew) by -71.687µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:40  time.SyncController: adjusting time (slew) by 114.759µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:47  time.SyncController: adjusting time (slew) by 46.716µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
```

Also one should pick a time server close to the node to get lower RTT
and dispersion.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-07 07:27:18 -07:00
Andrey Smirnov
2ea20f598a feat: replace timed with time sync controller
This is a complete rewrite of the time sync process.

Now the time sync process starts early at boot time, and it adapts to
configuration changes:

* before config is available, `pool.ntp.org` is used
* once config is available, configured time servers are used
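
A minimal sketch of that fallback, assuming the "config not yet available" case is represented by an empty list of servers; the `timeServers` function and `defaultServer` constant are made-up names for this example.

```go
package main

import "fmt"

// defaultServer is used until machine configuration provides time servers.
const defaultServer = "pool.ntp.org"

// timeServers returns the configured servers, or the default one when
// no configuration is available yet (modeled here as an empty slice).
func timeServers(configured []string) []string {
	if len(configured) == 0 {
		return []string{defaultServer}
	}

	return configured
}

func main() {
	fmt.Println(timeServers(nil))
	fmt.Println(timeServers([]string{"time.example.org"}))
}
```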

The controller updates the same time sync resource that other
controllers already depended on, so they have a chance to wait for the
time sync event.

Talos services which depend on time now wait on the same resource
instead of waiting on timed health.

New features:

* time sync now sticks to a particular time server unless that server
returns an error, in which case the server is changed; this improves
time sync accuracy

* time sync acts on config changes immediately, so it's possible to
reconfigure time sync at any time

* there's a new 'epoch' field in time sync resources which allows
time-dependent controllers to regenerate certs when there's a big enough
jump in time
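
The "sticky" server selection from the first item can be sketched as below. The `picker` type and its methods are hypothetical, and rotating to the next server in the list on failure is an assumption; the real controller may choose the replacement differently.

```go
package main

import "fmt"

// picker keeps using the same server until it is marked as failed.
type picker struct {
	servers []string
	current int
}

// Server returns the server currently in use, or "" if there are none.
func (p *picker) Server() string {
	if len(p.servers) == 0 {
		return ""
	}

	return p.servers[p.current]
}

// MarkFailed moves on to the next server, wrapping around at the end.
func (p *picker) MarkFailed() {
	if len(p.servers) == 0 {
		return
	}

	p.current = (p.current + 1) % len(p.servers)
}

func main() {
	p := &picker{servers: []string{"a.example", "b.example"}}

	fmt.Println(p.Server()) // a.example
	p.MarkFailed()
	fmt.Println(p.Server()) // b.example
}
```

Sticking to one server avoids mixing measurements with different round-trip characteristics between polls.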

Features to implement later:

* apid shouldn't depend on timed, it should be started early and it
should regenerate certs on time jump

* trustd should be updated in the same way

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-29 09:29:43 -07:00