Fixes #8074
The first part of the fix is to wait for udevd to be ready: network
interfaces are not available before that, so time sync is not possible
anyway.
The second part is that u-root's rtc package now supports closing RTC
devices, so the device can be properly opened and closed as part of the
sync loop (instead of the previous workaround with sync.Once).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Due to a bug introduced when refactoring for PTP devices, invalid NTP
responses (for example, an NTP kiss of death) were handled incorrectly
when only a single NTP server was used.
The error was logged, but the response was still used to adjust the
time, which led to unexpected time jumps.
Properly ignore any invalid NTP response.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
* Replace `logging.Wrap(log.Writer())` with `zaptest.NewLogger(suite.T())` where possible.
* Replace `reflect.DeepEqual` with `==`, `slices.Equal`, or `bytes.Equal` where possible.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fixes #7080
The real bug was an off-by-one error in the `log2i` implementation; the
other changes are cleanups, as the `x/sys/unix` package now contains all
the constants we need.
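For reference, one way to write an integer base-2 logarithm without an
off-by-one error is shown below. This is a sketch and not the code from
this change: it assumes `log2i` is meant to return floor(log2(v)), and the
choice to return 0 for an input of 0 is also an assumption.

```go
package main

import (
	"fmt"
	"math/bits"
)

// log2i returns floor(log2(v)) for v > 0.
// For v == 0 it returns 0 (an assumption made for this sketch).
func log2i(v uint64) int {
	if v == 0 {
		return 0
	}

	// bits.Len64 is the number of bits needed to represent v,
	// which is floor(log2(v)) + 1 for any non-zero v.
	return bits.Len64(v) - 1
}

func main() {
	for _, v := range []uint64{1, 2, 3, 4, 1024} {
		fmt.Printf("log2i(%d) = %d\n", v, log2i(v))
	}
}
```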
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
There's a cyclic dependency on the siderolink library, which imports
Talos machinery back. We will fix that after Talos is pushed under the
new name.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This is the first step towards replacing all import paths to be based on
`siderolabs/` instead of `talos-systems/`.
All updates contain no functional changes, just refactorings to adapt to
the new path structure.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4425
* add more logging for responses and sync process
* adjust time sync constants
* change the way poll interval is chosen (increasing on good sync,
decreasing on variation)
* filter out spikes
Based on the flow in https://github.com/systemd/systemd/blob/main/src/timesync/timesyncd-manager.c
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Enable logging using the default development config with some
fine-tuning. Additionally, logs at `info` level and below now go to kmsg.
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
Fixes #3582
The time adjustment code was rewritten after looking at other time sync
implementations. It looks like `adjtimex` was used incorrectly before,
which led to huge time oscillations and `STA_UNSYNC` being set by the
kernel. Instead of setting the time via `settimeofday`, `adjtimex` is
now also used to set the time on a big jump.
With this change, the offset stays close to zero, oscillating in the
microsecond range (polling interval lowered for testing):
```
172.20.0.2: 2021/05/06 18:51:28 time.SyncController: adjusting time (slew) by -11.375µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:37 time.SyncController: adjusting time (slew) by 426.276µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:50 time.SyncController: adjusting time (slew) by -622.037µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:58 time.SyncController: adjusting time (slew) by 59.822µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:11 time.SyncController: adjusting time (slew) by 126.855µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:20 time.SyncController: adjusting time (slew) by 17.334µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:28 time.SyncController: adjusting time (slew) by -108.787µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:34 time.SyncController: adjusting time (slew) by -71.687µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:40 time.SyncController: adjusting time (slew) by 114.759µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:47 time.SyncController: adjusting time (slew) by 46.716µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
```
Also, one should pick a time server close to the node to get lower RTT
and dispersion.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This is a complete rewrite of the time sync process.
Now the time sync process starts early at boot time, and it adapts to
configuration changes:
* before config is available, `pool.ntp.org` is used
* once config is available, configured time servers are used
The controller updates the same time sync resource that other
controllers already depended on, so they can wait for the time sync
event. Talos services which depend on time now wait on the same resource
instead of waiting on timed health.
New features:
* time sync now sticks to a particular time server and switches to
another one only when that server returns an error; this improves time
sync accuracy
* time sync acts on config changes immediately, so it's possible to
reconfigure time sync at any time
* there's a new 'epoch' field in time sync resources which allows
time-dependent controllers to regenerate certs when there's a big enough
jump in time
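The "sticky" server selection from the first item above might look
roughly like this. The `picker` type and its methods are hypothetical
names used only for this sketch; the rotation order on failure is also
an assumption.

```go
package main

import "fmt"

// picker keeps using the same server until it is reported as failed.
// This is a hypothetical type for illustration only.
type picker struct {
	servers []string
	current int
}

// Server returns the server to query; repeated calls return the same
// server as long as Failed is not called.
func (p *picker) Server() string {
	if len(p.servers) == 0 {
		return ""
	}

	return p.servers[p.current]
}

// Failed marks the current server as bad and moves on to the next one,
// wrapping around at the end of the list.
func (p *picker) Failed() {
	if len(p.servers) == 0 {
		return
	}

	p.current = (p.current + 1) % len(p.servers)
}

func main() {
	p := &picker{servers: []string{"a.example.org", "b.example.org"}}

	fmt.Println(p.Server()) // same server on every successful sync
	p.Failed()              // an error switches to the next server
	fmt.Println(p.Server())
}
```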
Features to implement later:
* apid shouldn't depend on timed, it should be started early and it
should regenerate certs on time jump
* trustd should be updated in the same way
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>