177 Commits

Author SHA1 Message Date
Aaron U'Ren
ca6b644d32 test(NSC): mock netlink calls - attempt 1 2026-02-01 11:07:13 -06:00
Aaron U'Ren
048680706c fix(NSC): cleanup historical bad IPv6 TCPMSS vals 2026-02-01 10:56:40 -06:00
Richard Kojedzinszky
ee0940b87c fix(dsr): set TCPMSS based on address family 2026-01-25 12:00:21 -06:00
Cat C
440ad4d0a1 fix: Replace all netlink functions that throw ErrDumpInterrupted with a retry wrapper 2026-01-09 09:17:43 -06:00
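The retry wrapper described in this commit can be sketched roughly as follows. This is a minimal illustration, not kube-router's actual code: `errDumpInterrupted` is a local stand-in for the netlink library's `ErrDumpInterrupted`, and `retryOnInterrupted` and its attempt limit are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// errDumpInterrupted is a local stand-in for netlink.ErrDumpInterrupted so the
// sketch compiles without third-party dependencies (assumption).
var errDumpInterrupted = errors.New("dump interrupted")

// retryOnInterrupted is a hypothetical helper. It calls fn until fn succeeds,
// fails with an error other than errDumpInterrupted, or maxAttempts is reached.
func retryOnInterrupted[T any](maxAttempts int, fn func() (T, error)) (T, error) {
	var result T
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		result, err = fn()
		if !errors.Is(err, errDumpInterrupted) {
			// Either success or an unrelated error: hand it back unchanged.
			return result, err
		}
	}
	return result, fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulated list call that is interrupted twice before succeeding.
	links, err := retryOnInterrupted(3, func() ([]string, error) {
		calls++
		if calls < 3 {
			return nil, errDumpInterrupted
		}
		return []string{"lo", "eth0"}, nil
	})
	fmt.Println(links, err, calls)
}
```

Wrapping the closure rather than each netlink function keeps the retry policy in one place, at the cost of one small closure per call site.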
ccoVeille
1e8976bd79 build(deps): update github.com/ccoveille/go-safecast to v2.0.0 2025-11-08 01:13:51 +01:00
ccoVeille
e8a59fda2e build(deps): bump github.com/ccoveille/go-safecast to 1.8.1 2025-11-03 12:04:58 +01:00
Aaron U'Ren
846fbd8500 fix(ipset): don't strip inet6 prefixing of ipsets
The problem here stems from the fact that when netpol generates its list of expected ipsets, it includes the inet6:
prefix. However, when the proxy and routing controllers sent their lists of expected ipsets, they did not. This
meant that no matter how we handled it in ipset.go, it was wrong for one use case or the other.

I decided to standardize on the netpol way of sending the list of expected ipset names so that BuildIPSetRestore() can
function in the same way for all invocations.
2025-10-27 21:25:33 -05:00
Bukal, Tomáš
720e2ca2bd fix(ipset): store kube-router-local-ips ipset 2025-10-11 08:26:43 -05:00
Aaron U'Ren
6c44013bc5 fix(ipset): ignore non-kube-router ipsets
Attempt to filter out sets that we are not authoritative for to avoid
race conditions with other operators (like Istio) that might be
attempting to modify ipsets at the same time.
2025-10-04 18:30:28 -05:00
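As a rough illustration of the filtering idea in the commit above, the sketch below keeps only set names that match a list of owned prefixes. The prefix list, the `isOwnedSet` and `filterOwned` helpers, and the sample set names are assumptions made for the example; the real filter in kube-router may use different rules.

```go
package main

import (
	"fmt"
	"strings"
)

// ownedPrefixes is an assumed list of name prefixes that kube-router is
// authoritative for. The real list may differ.
var ownedPrefixes = []string{"kube-router-", "KUBE-SRC-", "KUBE-DST-"}

// isOwnedSet reports whether an ipset name looks like one of ours. An optional
// "inet6:" family prefix is ignored for the comparison.
func isOwnedSet(name string) bool {
	name = strings.TrimPrefix(name, "inet6:")
	for _, prefix := range ownedPrefixes {
		if strings.HasPrefix(name, prefix) {
			return true
		}
	}
	return false
}

// filterOwned returns only the set names that isOwnedSet accepts.
func filterOwned(names []string) []string {
	owned := make([]string, 0, len(names))
	for _, name := range names {
		if isOwnedSet(name) {
			owned = append(owned, name)
		}
	}
	return owned
}

func main() {
	// "other-operator-set" is a hypothetical set managed by another operator.
	sets := []string{"kube-router-svip", "inet6:kube-router-local-ips", "other-operator-set"}
	fmt.Println(filterOwned(sets))
}
```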
Aaron U'Ren
d7214cec4f feat(Endpoints): convert Endpoints -> EndpointSlices 2025-09-06 16:27:03 -05:00
Richard Kojedzinszky
c2fd633373 fix(nsc): sync field name 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
7533c183a1 feat(nsc): getMetricsServiceMap() rebuilds only after services changed 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
5efb999169 feat(nsc): replace unsafe.Pointer with atomic.Pointer 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
4e8bb705b5 feat(nsc): move metrics logic to separate file 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
a224198c89 feat(nsc): optimize key in temporary serviceMap 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
4ed0cf4117 feat(nsc): improve Service statistics 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
1b4b6d6b2b feat(nsc): eliminate nested loops in Collect() 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
766627645e feat(nsc): collect service statistics on demand 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
4b4ebec81f feat(nsc): prepare serviceMap to be accessed by collector thread 2025-09-01 21:04:49 -05:00
Aaron U'Ren
69e58eda04 feat(NSC): add some additional debugging to traffic director 2025-06-29 17:42:18 -05:00
Aaron U'Ren
e29b6a3275 fix(NSC): pass fwmark to traffic director as an int
It used to be when we were using iproute2's CLI we needed to have the
fwmark as a hex number so we were passing it as a string in that format.

However, now that we use the netlink library directly, we already have
the fwmark in the form that we need. So instead of doing all of
these string <-> int conversions, let's just keep this simple.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
b070531ec5 fix: add proper nil rule src handling
When ip rules are evaluated in the netlink library, default routes for
src and dst are equated to nil. This makes them difficult to evaluate
and requires additional handling.

I filed an issue upstream so that this could potentially get fixed:
https://github.com/vishvananda/netlink/issues/1080 however if it doesn't
get resolved, this should allow us to move forward.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
4795a07e7c fix(ip rule): use NewRule() for all rule creations
It has proven to be tricky to insert new rules without calling the
designated NewRule() function from the netlink library. Usually attempts
will fail with an operation not supported message.

This improves the reliability of rule insertion.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
f59a4f5ae8 feat: convert execs to ip to netlink calls
Not making direct exec calls to user binary interfaces has long been a
principle of kube-router. When kube-router was first coded, the netlink
library was missing significant features that forced us to exec out.
However, now netlink seems to have most of the functionality that we
need.

This converts all of the places where we can use netlink to use the
netlink functionality.
2025-06-29 17:42:18 -05:00
Manuel Rüger
6a1d15c24c Use golangci-lint 2.0.2 2025-04-23 22:56:24 +02:00
Aaron U'Ren
760fcd5c85 fix(lint): remove non-constant format string (govet) 2025-02-14 14:18:26 -06:00
Aaron U'Ren
48b631c4ea fix(lint): remove unnecessary variable initializations (copyloopvar) 2025-02-14 14:18:26 -06:00
Aaron U'Ren
858fdf659d fix(lint): prevent against integer overflow errors 2025-02-14 14:18:26 -06:00
Dmitry Sharshakov
aa7cffb6f0 fix(NSC): only set rp_filter to 2 if it is 1
Setting rp_filter to 2 when it is 0 overrides its status so that it is always enabled (in loose mode).
This behavior could break some networking solutions because it makes packet admission rules stricter.
2025-01-11 14:37:21 -06:00
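The decision described in the commit above can be sketched as a small pure function. The name `desiredRPFilter` is hypothetical, and reading or writing the actual sysctl is left out of the example.

```go
package main

import "fmt"

// rp_filter values as defined by the Linux kernel.
const (
	rpFilterOff    = 0
	rpFilterStrict = 1
	rpFilterLoose  = 2
)

// desiredRPFilter is a hypothetical helper. It returns the value to write and
// whether a write is needed at all: only strict mode is relaxed to loose mode,
// while a disabled (0) or already loose (2) setting is left untouched.
func desiredRPFilter(current int) (int, bool) {
	if current == rpFilterStrict {
		return rpFilterLoose, true
	}
	return current, false
}

func main() {
	for _, current := range []int{rpFilterOff, rpFilterStrict, rpFilterLoose} {
		value, write := desiredRPFilter(current)
		fmt.Printf("current=%d desired=%d write=%t\n", current, value, write)
	}
}
```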
Aaron U'Ren
44439d6069 feat(NSC): change service.local internal traffic policy posture
Over time, feedback from users has been that our interpretation of how
the kube-router service.local annotation interacts with the internal
traffic policy is too restrictive.

Specifically, treating the annotation as equivalent to the Local internal
traffic policy seems to be too restrictive. This commit changes that
posture by equating the service.local annotation with External Traffic
Policy Local and Internal Traffic Policy Cluster.

This means that when service.local is set the following will be true:

* ExternalIPs / LoadBalancer IPs will only be available on a node that
  hosts the workload
* ExternalIPs / LoadBalancer IPs will only be BGP advertised (when
  enabled) by nodes that host the workload
* Services will have the same posture as External Traffic Policy set to
  local
* ClusterIPs will be available on all nodes for LoadBalancing
* ClusterIPs will only be BGP advertised (when enabled) by nodes that
  host the workload
* Cluster IP services will have the same posture as Internal Traffic
  Policy set to cluster

For anyone desiring the original functionality of the service.local
annotation that has been in place since kube-router v2.1.0, all that
would need to be done is to set `internalTrafficPolicy` to Local as
described here: https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/
2024-12-04 08:14:52 +01:00
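One way to read the posture described in the commit above is sketched below. The `policy` type and `effectivePolicy` function are hypothetical, and the sketch assumes that the annotation only forces the external policy to Local while the internal policy stays whatever the Service spec says (Cluster by default).

```go
package main

import "fmt"

// policy is a hypothetical, simplified view of a Service's traffic policies.
// A false value stands for Cluster and a true value stands for Local.
type policy struct {
	externalLocal bool
	internalLocal bool
}

// effectivePolicy combines the Service spec with the service.local annotation.
// Assumption: the annotation implies externalTrafficPolicy Local and leaves
// internalTrafficPolicy as set in the spec.
func effectivePolicy(spec policy, serviceLocal bool) policy {
	return policy{
		externalLocal: spec.externalLocal || serviceLocal,
		internalLocal: spec.internalLocal,
	}
}

func main() {
	// Annotation set, spec left at defaults.
	fmt.Printf("%+v\n", effectivePolicy(policy{}, true))
	// Annotation set and internalTrafficPolicy explicitly Local, which
	// corresponds to the pre-change behavior.
	fmt.Printf("%+v\n", effectivePolicy(policy{internalLocal: true}, true))
}
```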
Aaron U'Ren
0ac15b273e fact(healthcontroller): make more robust
Make the health controller more robust and extensible by adding in
constants for heart beats instead of 3 character random strings that are
easy to get wrong.
2024-11-21 15:24:09 +01:00
Aaron U'Ren
9fd46cc86d fact(krnode): add node struct abstraction
This prepares the way for broader refactors in the way that we handle
nodes by:

* Separating frequently used node logic from the controller creation
  steps
* Keeping reused code DRY-er
* Adding interface abstractions for key groups of node data and starting
  to rely on those more rather than concrete types
* Separating node data from the rest of the controller data structure so
  that smaller definitions of data can be passed around to functions
  that need it rather than always passing the entire controller, which
  contains more data / surface area than most functions need.
2024-09-29 17:53:36 -05:00
Aaron U'Ren
a0442e5abd fix: allow basic ICMPv6 neighbor discovery
This fixes the problem where, if network policy is applied before any
communication between two pods, all subsequent communication fails
because the two pods aren't able to discover each other.
2024-08-03 14:55:47 -05:00
Aaron U'Ren
71072c1de6 doc(NSC): add extra comments to setupHandlers call 2024-07-31 17:03:16 -05:00
Aaron U'Ren
b217e7b434 fix(NSC): ensure kube-router owns kube-router-svip
Currently, kube-router just lists all IPVS services on the host and then
adds the load balancing service IPs to kube-router-svip blindly.
However, this assumes that the only IPVS entries are entries that
kube-router has originated and that the user isn't using IPVS.

We want to make sure that we are only creating rules for services that
we are authoritative for. So to this end, we now double-check that this
is one of our services before adding rules that may affect it.
2024-07-31 17:03:16 -05:00
Richard Kojedzinszky
9a0230b716 fix(nsc): remove previous TCPMSS rules during setting up DSR 2024-05-27 11:49:14 -05:00
Aaron U'Ren
5d308ac5fa fix(nsc): remove previous TCPMSS rules 2024-05-27 11:49:14 -05:00
Richard Kojedzinszky
989527c6bb feat(nsc): apply TCPMSS rules on kube-bridge interface only 2024-05-27 11:49:14 -05:00
Richard Kojedzinszky
c3b6db955c fix(nsc): TCPMSS rules are created per-service and for reply packets only
Fixes #1671
2024-05-27 11:49:13 -05:00
Aaron U'Ren
317c754af0 fix(hairpin): rely on CNI hairpin mode
Instead of using the hairpin controller that I created, rely
upon the `hairpinMode` option of the bridge CNI, which does this job
better. See https://www.kube-router.io/docs/user-guide/#hairpin-mode for
more information.
2024-04-27 16:39:46 -05:00
Aaron U'Ren
b270750f15 fact: nsc.getPodObjectForEndpoint -> nsc.getPodObjectForEndpointIP 2024-04-27 13:28:09 -05:00
Aaron U'Ren
ecaad2c6e4 fix(cleanup): add missing handlers for cleanup
kube-router v2.X introduced the idea of iptables and ipset handlers that
allow kube-router to be dual-stack capable. However, the cleanup logic
for the various controllers was not properly ported when this happened.
When the cleanup functions run, they often have not had their
controllers fully initialized, as cleanup should not be dependent on
kube-router being able to reach a kube-apiserver.

As such, they were missing these handlers. As a result, they either
silently ended up doing noops or, worse, ran into nil pointer
failures.

This corrects that, so that kube-router no longer fails this way and
cleans up as it had in v1.X.
2024-04-26 14:16:09 -05:00
Aaron U'Ren
b423b1feb1 feat(NSC): ensure rp_filter is set correctly
On Red Hat-based OSes, rp_filter is often set to 1 (strict) instead of
2 (loose). The latter is more permissive and allows the outbound route
for traffic to differ from the route of incoming traffic.
2024-04-24 18:13:26 -05:00
Aaron U'Ren
46a1b17903 feat(go): upgrade 1.20.13 -> 1.21.7 + dep update
Upgrades to Go 1.21.7 now that Go 1.20 is no longer being maintained.

It also resolves the race conditions that we were seeing with BGP
server tests when we upgraded from 1.20 -> 1.21. This appears to be
because an efficiency change in 1.21 caused BGP to write to the
events at the same time that the test harness was trying to read from
them. Solved this in a coarse manner by adding surrounding mutexes to
the test code.

Additionally, upgraded dependencies.
2024-03-02 15:45:54 -06:00
Aaron U'Ren
47fe189fe6 feat(lint): update golangci-lint and fix lint errors 2024-03-02 15:45:54 -06:00
Aaron U'Ren
9a136c1b16 feat(NSC): implement NodePort Health Check
NodePort Health Check has long been part of the Kubernetes API, but
kube-router hasn't implemented it in the past. This is meant to be a
port that is assigned by the kube-controller-manager for LoadBalancer
services that have a traffic policy of `externalTrafficPolicy=Local`.

When set, the k8s networking implementation is meant to open a port and
provide HTTP responses that inform parties external to the Kubernetes
cluster about whether or not a local endpoint exists on the node. It
should return a 200 status if the node contains a local endpoint and
return a 503 status if the node does not contain a local endpoint.

This allows applications outside the cluster to choose their endpoint in
such a way that their source IP could be preserved. For more details
see:
https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-type-loadbalancer
2024-03-01 16:52:05 -06:00
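The 200 / 503 behavior described in the commit above can be sketched with the standard library. This is a minimal illustration rather than kube-router's implementation: `healthHandler`, `statusFor`, the `/healthz` path, and the endpoint-count callback are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthHandler is a hypothetical handler. It answers 200 when the node hosts
// at least one local endpoint for the service and 503 otherwise.
func healthHandler(localEndpoints func() int) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		if localEndpoints() > 0 {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})
}

// statusFor runs a single in-memory request against the handler and returns
// the HTTP status code for a node with n local endpoints.
func statusFor(n int) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
	healthHandler(func() int { return n }).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(1), statusFor(0))
}
```

In a real deployment the handler would be served on the health check node port assigned to the service, so that an external load balancer can skip nodes that answer 503.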
Aaron U'Ren
7aec8d0456 doc(NSC): add comment for hairpin controller 2024-03-01 16:52:05 -06:00
Aaron U'Ren
959022fdca feat(NSC): add endpoint statuses to internal struct
Add isReady, isServing, and isTerminating to internal EndpointSlice
struct so that downstream consumers have more information about the
service to make decisions later on.
2024-03-01 16:52:05 -06:00
Aaron U'Ren
16daa08c7b feat(NSC): add endpoints that are ready or serving
In order to be compliant with upstream network implementation
expectations we choose to proxy an endpoint as long as it is either
ready OR serving. This means that endpoints that are terminating will
still be proxied which makes kube-router conformant with the upstream
e2e tests.
2024-03-01 16:52:05 -06:00
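The "ready OR serving" rule from the commit above, sketched as a predicate. The `endpoint` struct is a simplified, hypothetical stand-in for the internal struct; only the three status field names come from the commit messages.

```go
package main

import "fmt"

// endpoint is a simplified, hypothetical stand-in for the internal endpoint
// struct, carrying the three statuses taken from the EndpointSlice.
type endpoint struct {
	ip            string
	isReady       bool
	isServing     bool
	isTerminating bool
}

// shouldProxy reports whether traffic may be sent to the endpoint. An endpoint
// qualifies when it is ready or serving, so a terminating endpoint that still
// serves traffic keeps being proxied.
func shouldProxy(ep endpoint) bool {
	return ep.isReady || ep.isServing
}

func main() {
	endpoints := []endpoint{
		{ip: "10.0.0.1", isReady: true, isServing: true},
		{ip: "10.0.0.2", isServing: true, isTerminating: true},
		{ip: "10.0.0.3", isTerminating: true},
	}
	for _, ep := range endpoints {
		fmt.Println(ep.ip, shouldProxy(ep))
	}
}
```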
Aaron U'Ren
fcd21b4759 feat: fully support service traffic policies
Adds support for spec.internalTrafficPolicy and fixes support for
spec.externalTrafficPolicy so that it only affects external traffic.

Keeps existing support for kube-router.io/service-local annotation which
overrides both to local when set to true. Any other value in this
annotation is ignored.
2024-01-24 09:05:24 -08:00