Previously, IPBlocks (like srcIPBlocks) only contained a single IP
Family which meant that a len() > 0 would indicate that an IP block had
been defined in the NetworkPolicy. However, now the IPBlocks structs are
IP family specific which means that they will always contain 2 entries,
one for the IPv4 family and one of the IPv6 family. Which means that
this condition will evaluate to true for all NetworkPolicies and waste
system resources creating empty ipsets and bad iptables rules.
The previous version of the bgp_policies code only allowed for creating
a policy when the policy didn't exist already. However, with the advent
of dual-stack we need to be able to add / remove statements if we add or
lose a specific IP family (e.g. IPv4 or IPv6) since they are handled in
different statements.
Given that the owner of GoBGP has let us know that policies are
idempotent, this now involves quite a bit of work. We need to follow the
following procedure:
add statements if missing -> add them to a policy -> if policy doesn't
equal the one already in GoBGP -> create the new policy and associate
it -> de-associate the old policy -> remove the old policy
Previously we used to do an idempotent sync all active VIPs any time we
got a service or endpoint update. However, this only worked when we
assumed a single-stack deployment model where IPs were never deleted
unless the whole service was deleted.
In a dual-stack model, we can add / remove LoadBalancer IPs and Cluster
IPs on updates. Given this, we need to take into account the finite
change that happens, and not just revert to sync-all because we'll never
stop advertising IPs that should be removed.
As a fall-back, we still have the outer Run loop that syncs all active
routes every X amount of seconds (configured by user CLI parameter). So
on that timer we'll still have something that syncs all active VIPs and
works as an outer control loop to ensure that desired state eventually
becomes active state if we accidentally remove a VIP that should have
been there.
When a single IP family's set looks to be equal, switch to continue
instead of return so that other families can still be evaluated as those
might have changes.
When enabled, generate the router id by hashing the primary IP.
With this no explicit router id has to be provided on IPv6-only clusters.
Signed-off-by: Erik Larsson <who+github@cnackers.org>
Use different IP ranges in BGP Policies unit test so that it becomes
more obvious when there are unit test failures resulting from
multi-processing of unit tests.
Fixes a problem where a user would end up with redundant external peers
in their BGP policies because getting peers is IP family agnostic and
yet is run twice on the same list.
This also ruined unit test consistency.
Rather than just silently not adding policies for controllers that don't
support a given address family, emit a warning so that it is more
obvious in the logs that kube-router isn't able to add a policy for a
given family when the controller doesn't have that family enabled.
On dual-stack nodes there can still be pods that are single stack. When
this happens there won't be a pod IP for a given family and if
kube-router tries to add rules with a missing pod IP the iptables rules
won't be formatted correctly (because it won't have a valid source or
destination for that family).
So rather than breaking the whole iptables-restore we warn in the logs
and skip the pod policy chains for that family.
Without this logic, it appears that sometimes GoBGP is inclined to match
unintentional routes in policy because of the MATCHSET_ANY declaration
and the way that it interacts with empty sets.
In my testing, without this logic I found that it often resulted in
various routes not being advertised correctly and not even showing up in
GoBGP itself. My current guess is that policy keeps GoBGP from importing
the route into the RIB even from the Protobuf socket connection that
kube-router establishes directly.
We do a lot of getting defined sets for GoBGP and are planning to do
more of it in the future. This commit centralizes the logic for this and
reduces repetition.
Running kube-router without metrics is a perfectly valid way to run
kube-router and as such it shouldn't emit an error message when a user
has not set that flag. Move the message down to Info.
Without this, kube-router would end up sharing the index between ipv4
and ipv6 which would cause it to error out when one incremented beyond
the number of rules that actually existed in the chain.