* build: Dependency bumps to prep for release 2.9.0
* fix(typos): Update typos config to match IZ on txt files in testdata
* chore(lint): Address issues from newer versions of golangci-lint
* fix(dockerfile): Update iptables-wrapper install according to updated installation instructions
- Replace fmt.Errorf %s/%v + err.Error() with %w for proper error
wrapping and errors.Is/As chain support across all packages
- Replace errors.New("msg" + err.Error()) with fmt.Errorf("msg: %w", err)
- Replace strings.Contains(err.Error(), ...) with errors.Is(err,
syscall.EEXIST) and errors.Is(err, syscall.ESRCH) in linux_networking.go
- Remove now-unused IfaceHasAddr and IpvsServerExists string constants
- Replace sort.Strings with slices.Sort in bgp_policies.go, ipset.go,
and testhelpers
- Replace sort.SliceStable with slices.SortStableFunc in bgp_policies.go
- Replace reflect.DeepEqual on []string with slices.Equal in bgp_policies.go
(also fixes bug: was comparing map to slice instead of slice to slice)
- Replace reflect.DeepEqual on []*gobgpapi.Prefix with slices.EqualFunc
comparing exported fields to avoid protobuf internal state comparison
- Replace strings.Index + manual slicing with strings.Cut in docker.go
and classify.go
- Update cni_test.go to use assert.EqualError instead of assert.Equal
for wrapped error comparison
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bgpServerStarted was already fixed. This commit applies the same pattern
to initSrcDstCheckDone and ec2IamAuthorized, which are written from Run()
and the AWS src-dst-check path and read from syncInternalPeers() via
bgp_peers.go — potential data race under concurrent BGP peer syncs.
Also adds TestAtomicBoolFieldsNoConcurrentDataRace to
network_routes_controller_test.go to exercise all three fields under
the race detector.
bgpServerStarted is written in Run() on the controller goroutine and
read from informer callbacks (OnNodeUpdate, handleServiceUpdate,
handleServiceDelete, OnEndpointsAdd, OnEndpointsUpdate) which run on
a separate goroutine. This is a data race.
Replace the plain bool with sync/atomic.Bool, using Store() for writes
and Load() for reads, to make cross-goroutine access safe without
requiring the caller to hold nrc.mu.
This implements support for KEP-1860. When a LoadBalancer ingress has ipMode set to 'Proxy', kube-router will skip adding the IP to the local IPVS table and will not hijack the traffic. If ipMode is 'VIP' or unset, the current behavior is maintained.
Fixes#2014
Moves all Service VIP range configurations into pkg/svcip this is where
validation and querying of ranges goes rather than passing each range to
each controller.
It also centralizes the validation logic since NRC and NSC need
basically equivalent logic. It additionally adds a RangeQuerier
interface for the NPC and LBC controllers which require knowing the
literal ranges.
Replace the misleading kube_router_controller_bgp_peers gauge
which only counts 'cluster nodes' with a new per peer metric
kube_router_bgp_peer_info with 'GaugeVec' that exposes actual
BGP session state from gobgp. labels include peer address, asn,
type, and state. Metric value is 1 if established and 0 otherwise.
Closes: https://github.com/cloudnativelabs/kube-router/issues/848
Signed-off-by: Roman Kuzmitskii <roman@damex.org>
Changes AFI SAFI configuration to:
* Use consolidated logic for AFI SAFI configuration for both internal
peers and external peers
* Configure AFI SAFI regardless of GracefulRestart enablement
* This is important because by default GoBGP only configures a default
AFI SAFI configuration for the address family of its configured
peering IP. Which means that previously dual-stack configurations
that did not enable GracefulRestart would not work (see: #1992)
The problem here stems from the fact that when netpol generates its list of expected ipsets, it includes the inet6:
prefix, however, when the proxy and routing controller sent their list of expected ipsets, they did not do so. This
meant that no matter how we handled it in ipset.go it was wrong for one or the other use-cases.
I decided to standardize on the netpol way of sending the list of expected ipset names so that BuildIPSetRestore() can
function in the same way for all invocations.
Attempt to filter out sets that we are not authoritative for to avoid
race conditions with other operators (like Istio) that might be
attempting to modify ipsets at the same time.
Back in commit 9fd46cc when I was pulling out the krnode struct I made a
mistake in the `syncNodeIPSets()` function and didn't grab the IPs of
all nodes, instead I only grabbed the IP of the current node multiple
times.
This caused other nodes (besides the current one) to get removed from
the `kube-router-node-ips` ipset which ensures that we don't NAT traffic
from pods to nodes (daemons and HostNetwork'd items).
This should fix that problem.
Over time, feedback from users has been that our interpretation of how
the kube-router service.local annotation interacts with the internal
traffic policy has been that it is too restrictive.
It seems like tuning it to fall in line with the local internal traffic
policy is too restrictive. This commit changes that posture, by equating
the service.local annotation with External Traffic Policy Local and
Internal Traffic Policy Cluster.
This means that when service.local is set the following will be true:
* ExternalIPs / LoadBalancer IPs will only be available on a node that
hosts the workload
* ExternalIPs / LoadBalancer IPs will only be BGP advertised (when
enabled) by nodes that host the workload
* Services will have the same posture as External Traffic Policy set to
local
* ClusterIPs will be available on all nodes for LoadBalancing
* ClusterIPs will only be BGP advertised (when enabled) by nodes that
host the workload
* Cluster IP services will have the same posture as Internal Traffic
Policy set to cluster
For anyone desiring the original functionality of the service.local
annotation that has been in place since kube-router v2.1.0, all that
would need to be done is to set `internalTrafficPolicy` to Local as
described here: https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/
Make the health controller more robust and extensible by adding in
constants for heart beats instead of 3 character random strings that are
easy to get wrong.
There are a couple of items that have typically ended up in a no-op for
us when considering routes to inject. However, now that we have a route
map where we track route state, we need this not just to be no-ops, but
also update the route state cache as well to ensure that the route
doesn't get replaced in the future.
When we find tunnels to clean up, we need to not only remove the tunnel
and the route to that tunnel, but also remove the route from the state
map.
When we discover that no route needs to be added to the host because
it's not in the same subnet and we weren't supposed to create a tunnel,
then we also clean it up and ensure that it isn't in our state as well.