3 Commits

Author SHA1 Message Date
Artem Chernyshev
1db301edf6 feat: switch controller-runtime to zap.Logger
Enable logging using default development config with some fine tuning.
Additionally, logs at `info` level and below now go to kmsg.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-05-25 02:15:31 -07:00
Andrey Smirnov
4d50a4edd0 fix: update the way NTP sync uses adjtimex syscall
Fixes #3582

The time adjustment code was rewritten after studying other time sync
implementations. `adjtimex` appears to have been used incorrectly
before, which led to huge time oscillations and `STA_UNSYNC` being set
by the kernel. Instead of setting the time via `settimeofday`,
`adjtimex` is now also used to set the time on a big jump.
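
The split between slewing and stepping described above can be sketched
roughly as follows. This is an illustration only, not the code from this
commit: `timexRequest`, `planAdjustment` and the 500ms `stepThreshold`
are hypothetical names and values, and the struct mirrors only the
`struct timex` fields the sketch fills in. The flag values are copied
from `<linux/timex.h>`. No syscall is made, so it runs without
privileges.

```go
package main

import (
	"fmt"
	"time"
)

// Mode and status bits as defined in <linux/timex.h>.
const (
	adjOffset    = 0x0001 // ADJ_OFFSET: slew the clock by Offset
	adjStatus    = 0x0010 // ADJ_STATUS: update the clock status bits
	adjSetOffset = 0x0100 // ADJ_SETOFFSET: add Sec/Nsec to the current time
	adjNano      = 0x2000 // ADJ_NANO: values are given in nanoseconds
	staPLL       = 0x0001 // STA_PLL: enable phase-locked loop updates
)

// stepThreshold is a hypothetical cut-off: smaller offsets are slewed,
// larger ones are applied as a single jump.
const stepThreshold = 500 * time.Millisecond

// timexRequest is a hypothetical, reduced stand-in for struct timex.
type timexRequest struct {
	Modes  uint32
	Offset int64 // slew offset in nanoseconds (used with ADJ_OFFSET)
	Sec    int64 // time.tv_sec (used with ADJ_SETOFFSET)
	Nsec   int64 // time.tv_usec, holding nanoseconds under ADJ_NANO
	Status int32
}

// planAdjustment decides how a measured offset would be handed to adjtimex.
func planAdjustment(offset time.Duration) timexRequest {
	abs := offset
	if abs < 0 {
		abs = -abs
	}

	if abs < stepThreshold {
		// Small offset: let the kernel PLL slew the clock gradually.
		return timexRequest{
			Modes:  adjOffset | adjStatus | adjNano,
			Offset: int64(offset),
			Status: staPLL,
		}
	}

	// Big jump: ADJ_SETOFFSET requires a non-negative sub-second part,
	// so a negative offset borrows one second.
	sec := int64(offset / time.Second)
	nsec := int64(offset % time.Second)
	if nsec < 0 {
		sec--
		nsec += int64(time.Second)
	}

	return timexRequest{Modes: adjSetOffset | adjNano, Sec: sec, Nsec: nsec}
}

func main() {
	fmt.Printf("%+v\n", planAdjustment(426276*time.Nanosecond))
	fmt.Printf("%+v\n", planAdjustment(-1500*time.Millisecond))
}
```

In a real implementation the resulting fields would be copied into the
platform's `Timex` struct and passed to the `adjtimex` syscall, which
needs `CAP_SYS_TIME`.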

With this change, the adjustment oscillates steadily around zero, in the
microsecond range (polling interval lowered for testing):

```
172.20.0.2: 2021/05/06 18:51:28  time.SyncController: adjusting time (slew) by -11.375µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:37  time.SyncController: adjusting time (slew) by 426.276µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:50  time.SyncController: adjusting time (slew) by -622.037µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:51:58  time.SyncController: adjusting time (slew) by 59.822µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:11  time.SyncController: adjusting time (slew) by 126.855µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:20  time.SyncController: adjusting time (slew) by 17.334µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:28  time.SyncController: adjusting time (slew) by -108.787µs via 192.36.143.130, state TIME_OK, status STA_NANO | STA_PLL
172.20.0.2: 2021/05/06 18:52:34  time.SyncController: adjusting time (slew) by -71.687µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:40  time.SyncController: adjusting time (slew) by 114.759µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
172.20.0.2: 2021/05/06 18:52:47  time.SyncController: adjusting time (slew) by 46.716µs via 192.36.143.130, state TIME_OK, status STA_PLL | STA_NANO
```

Also, pick a time server close to the node to get lower RTT and
dispersion.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-07 07:27:18 -07:00
Andrey Smirnov
2ea20f598a feat: replace timed with time sync controller
This is a complete rewrite of the time sync process.

Now the time sync process starts early at boot time, and it adapts to
configuration changes:

* before config is available, `pool.ntp.org` is used
* once config is available, configured time servers are used
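
A minimal sketch of the server selection described by the two bullets
above. `machineConfig` and `timeServers` are hypothetical names, not the
types used in this change, and falling back to the default when the
config lists no servers is an assumption.

```go
package main

import "fmt"

// defaultTimeServer is used until a machine config is available.
const defaultTimeServer = "pool.ntp.org"

// machineConfig is a hypothetical stand-in for the node configuration.
type machineConfig struct {
	TimeServers []string
}

// timeServers returns the servers to sync against. A nil config means
// the config has not been loaded yet. Treating an empty server list the
// same way is an assumption of this sketch.
func timeServers(cfg *machineConfig) []string {
	if cfg == nil || len(cfg.TimeServers) == 0 {
		return []string{defaultTimeServer}
	}

	return cfg.TimeServers
}

func main() {
	fmt.Println(timeServers(nil))
	fmt.Println(timeServers(&machineConfig{TimeServers: []string{"time.example.org"}}))
}
```

Calling `timeServers` again whenever the config changes is what would
let the sync loop pick up new servers without a restart.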

The controller updates the same time sync resource that other
controllers already depended on, so they can still wait for the time
sync event.

Talos services which depend on time now wait on the same resource
instead of waiting on timed health.

New features:

* time sync now sticks to a particular time server and switches to
another server only when that server returns an error; this improves
time sync accuracy

* time sync acts on config changes immediately, so it's possible to
reconfigure time sync at any time

* there's a new 'epoch' field in time sync resources which allows
time-dependent controllers to regenerate certs when there's a big enough
jump in time
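
The sticky server and the 'epoch' counter from the list above could be
sketched like this. Everything here is illustrative: `syncer`,
`syncOnce`, `fixedQuery` and the 15 minute `epochThreshold` are
hypothetical, and the real controller may rotate servers and define a
"big enough jump" differently.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// epochThreshold is a hypothetical bound for a "big enough" time jump.
const epochThreshold = 15 * time.Minute

// queryFunc asks one server for the current clock offset.
type queryFunc func(server string) (time.Duration, error)

// syncer keeps using one server until that server fails.
type syncer struct {
	servers []string
	current int // index of the server currently in use
	Epoch   int // bumped on every big jump in time
}

// syncOnce queries the current server. On error it moves on to the next
// server in the list. On success it bumps Epoch if the offset is large
// enough that certs issued before the jump should be regenerated.
func (s *syncer) syncOnce(query queryFunc) (time.Duration, error) {
	if len(s.servers) == 0 {
		return 0, errors.New("no time servers configured")
	}

	offset, err := query(s.servers[s.current])
	if err != nil {
		s.current = (s.current + 1) % len(s.servers)

		return 0, err
	}

	if offset >= epochThreshold || offset <= -epochThreshold {
		s.Epoch++
	}

	return offset, nil
}

// fixedQuery is a test helper: the named server always fails, every
// other server reports the given offset in nanoseconds.
func fixedQuery(failing string, offsetNanos int64) queryFunc {
	return func(server string) (time.Duration, error) {
		if server == failing {
			return 0, errors.New("server unreachable: " + server)
		}

		return time.Duration(offsetNanos), nil
	}
}

func main() {
	s := &syncer{servers: []string{"a.example.org", "b.example.org"}}

	_, err := s.syncOnce(fixedQuery("a.example.org", 0))
	fmt.Println("first attempt:", err, "now using", s.servers[s.current])

	offset, _ := s.syncOnce(fixedQuery("a.example.org", int64(time.Hour)))
	fmt.Println("offset:", offset, "epoch:", s.Epoch)
}
```

A time-dependent controller would watch the resource carrying `Epoch`
and reissue its certificates whenever the value changes.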

Features to implement later:

* apid shouldn't depend on timed, it should be started early and it
should regenerate certs on time jump

* trustd should be updated in the same way

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-29 09:29:43 -07:00