112 Commits

Author SHA1 Message Date
Andrey Smirnov
b969e7720e chore: update references to old protobuf package
This simply uses new protobuf package instead of old one.

Old protobuf package is still in use by Talos dependencies.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-07-08 05:34:12 -07:00
Andrey Smirnov
f2ae9cd0c1 feat: replace networkd with new network implementation
This removes networkd, updates network ready condition, enables all the
controllers which were previously disabled.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-15 17:37:28 -07:00
Andrey Smirnov
5811f4dda1 feat: implement link (interface) controllers
The structure of the controllers is really similar to addresses and
routes:

* `LinkSpec` resource describes desired link state
* `LinkConfig` controller generates `LinkSpecs` based on machine
configuration and kernel cmdline
* `LinkMerge` controller merges multiple configuration sources into a
single `LinkSpec` paying attention to the config layer priority
* `LinkSpec` controller applies the specs to the kernel state

Controller `LinkStatus` (which was implemented before) watches the
kernel state and publishes current link status.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-01 09:36:25 -07:00
Alexey Palazhchenko
fb605a0fc5 chore: tweak nolintlint settings
Copy from kres manually for now.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-25 13:56:16 -07:00
Artem Chernyshev
65701aa724 fix: resolve the issue with DHCP lease not being renewed
Very easily reproduced when you start a node with a Dynamic IP.
Normally it should renew lease after TTL/2, but that doesn't happen, so
the node starts to get next IP one after another.

After looking at packets sent by other clients, found out that they
have `Client IP address` equal to the IP given by the DHCP server.

Additionally, changed DHCP client to send Request packets directly to the DHCP server after getting an offer.
It looks like DHCP spec states that you should use unicast request directly to DHCP server, not broadcast.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-25 06:44:23 -07:00
Andrey Smirnov
b0209fd29d refactor: move networkd, timed APIs to machined, remove routerd
This moves implementation of the user-facing APIs to the machined, and
as now all the APIs are implemented by machined, remove routerd and
adjust apid to proxy to machined.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-24 00:00:28 -07:00
Andrey Smirnov
89a4b09fe8 refactor: run networkd as a goroutine in machined
This removes networkd as a separate container and image.

Reasons:

* `machined` becomes more and more bound into the core flow - now it
interacts with `etcd` for VIPs, so container has more and more
mounts/permissions
* it should be easier to COSIfy machined piece by piece if we have it
running in the same process
* initramfs size

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-23 06:22:49 -07:00
Alexey Palazhchenko
0dbaeb9e65 chore: update tools, use new generators
To stay current.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-16 11:17:15 -07:00
Andrey Smirnov
4aae924c68 refactor: provide explicit logger for networkd
This changes will make easier running networkd as a goroutine in machined.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-12 09:54:09 -08:00
Artem Chernyshev
22f375300c chore: update golanci-lint to 1.38.0
Fix all discovered issues.
Detected couple bugs, fixed them as well.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-12 06:50:02 -08:00
Alexey Palazhchenko
df52c13581 chore: fix //nolint directives
That's the recommended syntax:
https://golangci-lint.run/usage/false-positives/

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-05 05:58:33 -08:00
Andrey Smirnov
d7cdc8cc15 feat: implement simple layer 2 shared IP for CP
This adds a VIP (virtual IP) option to the network configuration of an
interface, which will allow a set of nodes to share a floating IP
address among them.  For now, this is restricted to control plane use
and only a single shared IP is supported.

Fixes #3111

Signed-off-by: Seán C McCord <ulexus@gmail.com>
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-26 14:14:34 -08:00
Andrey Smirnov
24b4c0bcb3 refactor: add context to the networkd
This changes introduces top-level cancellable on signal context to
networkd to abort operations when networkd is being stopped.

This allows for clean restarts of networkd container, and it is required
to support canceallable context for VIP etcd operations.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-25 09:39:53 -08:00
Spencer Smith
1111edfc76 fix: pass attributes when adding routes
This PR fixes a bug where we removed adding attributes during the
RouteAdd call for rtnetlink. The way that code is implemented means that
if *any* attributes are passed, the defaults are ignored. But we were
expecting defaults to be there. No longer!

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2021-02-11 16:24:49 -08:00
Andrey Smirnov
757cc204ec fix: prefer configured nameservers, fix DHCP6 in container
Always prefer explicitly configured nameservers,
networkd was missing capability to bind address for DHCP6.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-05 17:00:51 -08:00
Andrey Smirnov
6cf98a7322 feat: implement IPv6 DHCP client in networkd
This renames existing 'DHCP' implementation to `DHCP4`, new client is
`DHCP6`.

For now, `DHCP6` is disabled by default and should be explicitly enabled
with the config.

QEMU testbed for IPv6 is going to be pushed as separate PR.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-05 02:22:18 -08:00
Andrey Smirnov
47c260e365 fix: update DHCP client to use Request-Ack sequence after an Offer
The problem was that our DHCP client was always doing
Discover-Offer-Request-Ack sequence as if it doesn't have any IP lease
which was breaking some DHCP servers leading to an error which in turn
makes Talos retry harder hitting the same error over and over again.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-03 13:43:50 -08:00
Andrey Smirnov
512c79e8d6 fix: lower memory usage a bit by disabling memory profiling
As of now, we're not using Go profiling, so it's safe to disable it to
save some memory and CPU costs. Once we start using it, we can re-enable
it conditionally.

Each process allocates around 1.4MiB on amd64 for memory profiling
buckets.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-01 04:49:59 -08:00
Artem Chernyshev
9883d0af19 feat: support Wireguard networking
This the first iteration of Wireguard network support.
What was done:
- kernel was updated to enable Wireguard kernel module.
- changed networkd to support creating Wireguard device type.
- used wgctrl to configure wireguard.
- updated `talosctl cluster create` to support generating Wireguard
network configuration automatically by just specifying the network cidr.
- added docs about Wireguard support/how to use it.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-01-14 15:51:14 +03:00
Andrey Smirnov
af5c34b340 fix: pick first interface valid hostname (vs. last one)
Looks like the code before change in #1578 returned the first hostname
found while interating over interfaces and addressing methods, but #1578
supposedly inadvertently flipped that to iterate over all interfaces (so
last interface wins).

Problem is that both `DHCP` and `Static` addressing methods provide
hostnames, while DHCP hostname comes from DHCP server, while `Static`
defines hostname as `talos-10-5-0-2` (by IP).

If we were to fix it for real, we should build a list of hostname with
priorities coming from different sources and pick a hostname with the
highest priority, so this fix is more of a bandaid rather than a real
fix.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-12 04:18:54 -08:00
Andrey Smirnov
5325a66e3e fix: bring up bonded interfaces correctly on packet
This probably fixes bonding in general if 2nd link in the bond is down.

For packet, set additional options for the bonded interface. In
networkd, add interfaces filtered out by link status as ignored to make
them available as bond subinterfaces.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-29 11:37:55 -08:00
Andrey Smirnov
360d887967 fix: prevent endless loop with DHCP requests in networkd
There were two problems:

* `configureInterfaces` was always failing if interface is already set
up, as the routes already exist

* `renew` was halving the renew interval each time `configureInterface`
fails, which starts at (LeaseTime/2) and goes effectively to zero

This was leading to high networkd CPU usage, storm of DHCP requests on
the network.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-01 08:12:12 -08:00
Andrew Rynhard
10db642b2f feat: add support for the Banana Pi M64
This adds the Banana Pi M64 to the list of supported boards.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-11-30 18:17:37 -08:00
Artem Chernyshev
8aad711f18 feat: implement network interfaces list API
To be used in the interactive installer to configure networking.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-11-27 10:48:45 -08:00
Andrey Smirnov
9a32e34cb1 feat: implement apply configuration without reboot
This allows config to be written to disk without being applied
immediately.

Small refactoring to extract common code paths.

At first, I tried to implement this via the sequencer, but looks like
it's too hard to get it right, as sequencer lacks context and config to
be written is not applied to the runtime.

Fixes #2828

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-23 12:42:44 -08:00
Andrey Smirnov
7767a41d4a feat: set interface MTU in DHCP mode even if DHCP is not successful
Fixes #2789

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-19 10:59:21 -08:00
Seán C McCord
5d4d179cd8 feat: support ipv6 routes
While IPv6 were mostly supported already, there was a single segment in
the interface setup which forced everything into an IPv4 route.
This limitation has been removed.

In so doing, route metrics have been cleaned up a small amount.
This change allows the specification of the route metric from the
config.

Fixes #2772

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2020-11-17 13:11:26 -08:00
Andrey Smirnov
a2efa44663 chore: enable gci linter
Fixes were applied automatically.

Import ordering might be questionable, but it's strict:

* stdlib
* other packages
* same package imports

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-09 08:09:48 -08:00
Andrey Smirnov
8560fb9662 chore: enable nlreturn linter
Most of the fixes were automatically applied.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-09 06:48:07 -08:00
Andrew Rynhard
562ab1d572 chore: update golangci-lint
Brings in the latest version of golangci-lint and addresses errors.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-11-02 20:34:05 -08:00
Spencer Smith
8b5406c889 chore: move to newer release of rtnetlink with fn args
This PR makes use of a new merge into the upstream rtnetlink library
that introduces functional args for adding routes.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-10-22 06:56:22 -07:00
Andrey Smirnov
16b6d344de chore: bump module dependencies in go.mod
This covers most of the packages except for those we have to keep on
hold (etcd and grpc because of etcd).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-20 08:09:42 -07:00
Spencer Smith
4c47fa259c feat: support MTU and route changes for DHCP
This PR updates the behavior of our machine configs with respect to
DHCP-enabled interfaces. Now, if MTU is specified by the user, that
value will take precedence over any setting provided by the DHCP server.

Additionally, any routes specified will be appended to routes specified
by the DHCP server.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-10-16 17:22:47 -07:00
Spencer Smith
7bc3fcf77d feat: support metric values for DHCP
This PR adds a "DHCPOptions" field to the config. This field contains a
single subfield currently, "RouteMetric". Setting this well ensure that
any routes provided from the DHCP server are given this metric upon
injection into the routing table.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-10-16 08:29:04 -07:00
Artem Chernyshev
b53fc45e08 chore: add Context as param to some methods of Platform interface
Context is passed there for proper cancellation and timeouts.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-10-07 14:40:10 -07:00
Spencer Smith
621bda47fe feat: allow for link local networking
This PR allows for the ability to specify neither CIDR nor DHCP in the
talos machine config. The result here should allow for things like SLAAC
addressing with ipv6.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-09-17 09:48:10 -07:00
Seán C McCord
896745a6dd fix: gracefully handle invalid interfaces in bond
An invalid interface set as a bond's subinterface should produce an
error, but it should not panic.

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2020-09-17 01:44:51 -07:00
Andrey Smirnov
f6ecf000c9 refactor: extract packages loadbalancer and retry
This removes in-tree packages in favor of:

* github.com/talos-systems/go-retry
* github.com/talos-systems/go-loadbalancer

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-09-02 13:46:22 -07:00
Andrew Rynhard
d4f103ffcb fix: pass config via stdin
In order to perform upgrades the way we would like, it is important that
we avoid any bind mounts into containers. This change ensures that all
system services get their config via stdin.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-08-20 15:26:13 -07:00
Andrey Smirnov
bddd4f1bf6 refactor: move external API packages into machinery/
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.

And `pkg/machinery` is published as Go module inside Talos repository.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-17 09:56:14 -07:00
Andrey Smirnov
2697b99b7d refactor: extract pkg/net as github.com/talos-systems/net
This extracts common package as new module/repository.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-14 11:04:50 -07:00
Andrew Rynhard
92523bc422 refactor: remove structs from config provider
This make the config provider a pure interface definition by removing
all concrete internal types, and making them an interface.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-08-06 13:21:41 -07:00
Andrey Smirnov
47608fb874 refactor: make pkg/config not rely on machined/../internal/runtime
This makes `pkg/config` directly importable from other projects.

There should be no functional changes.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-29 12:40:12 -07:00
Andrey Smirnov
41d5f7859a chore: update golangci-lint to 1.28.3
Fixes #2272

`gofumpt` is now included into `golangci-lint`, but not the
`gofumports`, so we keep it using it as separate binary, but we keep
versions in sync with `golangci-lint`.

This contains fixes from:

* `gofumpt` (automated, mostly around octal constants)
* `exhaustive` in `switch` statements
* `noctx` (adding context with default timeout to http requests)

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-16 08:05:42 -07:00
Andrew Rynhard
a5a2d959ed feat: upgrade runc to v1.0.0-rc90
This updates runc to the same version vendored by containerd.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-02 13:19:33 -07:00
Andrey Smirnov
686dcc6743 chore: enable 'testpackage' linter
This linter makes sure tests are excercising only public package API.

I fixed all the tests which touch only public API of the packages. For
other test packages I added proper `//nolint` directive.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-30 16:42:28 -07:00
Andrey Smirnov
81d1c2bfe7 chore: enable godot linter
Issues were fixed automatically.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-30 10:39:56 -07:00
Spencer Smith
d57c97fdb6 feat: allow ability to create dummy nics
This PR will introduce a new field to v1alpha1 configs that allows users
to set `dummy: true` when specifying interfaces. If present, we will
create a dummy interface with the CIDR information given. This is useful
for users that don't want to use loopback for things like ECMP (or want
more than one dummy interface).

The created dummy interface looked like this with `ip a`:

```
3: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether 66:4a:e3:5f:38:10 brd ff:ff:ff:ff:ff:ff
    inet 10.254.0.5/32 brd 10.254.0.5 scope global dummy0
       valid_lft forever preferred_lft forever
```

Will close #2186.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-17 17:15:07 -04:00
Frederik Schwan
f57d05e02a fix: ipv6 static default gateway not set if gateway is a LL unicast address
The source address is set by default, which leads to RNETLINK
errors, when the Global Unicast Address is passed as a Source
to a LL Unicast Gateway. Errors of RTNETLINK are now logged.

Signed-off-by: Frederik Schwan <frederik.schwan@linux.com>
2020-04-29 18:23:37 -07:00
Andrew Rynhard
49307d554d refactor: improve machined
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.

A few highlights are:

- no more event bus
- functional approach to tasks (no more types defined for each task)
  - the task function definition now offers a lot more context, like
    access to raw API requests, the current sequence, a logger, the new
    state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
  install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
  install force option

Additionally, this pulls various packes in under machined to make the
code easier to navigate.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-28 08:20:55 -07:00