219 Commits

Author SHA1 Message Date
ccoVeille
e06ddccabe feat(test): use safecast.RequireConvert as a replacement for safecast.Convert in tests 2025-11-21 21:20:44 -06:00
ccoVeille
1e8976bd79 build(deps): update github.com/ccoveille/go-safecast to v2.0.0 2025-11-08 01:13:51 +01:00
ccoVeille
e8a59fda2e build(deps): bump github.com/ccoveille/go-safecast to 1.8.1 2025-11-03 12:04:58 +01:00
Aaron U'Ren
846fbd8500 fix(ipset): don't strip inet6 prefixing of ipsets
The problem here stems from the fact that when netpol generates its list of expected ipsets, it includes the inet6:
prefix; however, when the proxy and routing controllers sent their lists of expected ipsets, they did not. This
meant that no matter how we handled it in ipset.go, it was wrong for one use case or the other.

I decided to standardize on the netpol way of sending the list of expected ipset names so that BuildIPSetRestore() can
function in the same way for all invocations.
2025-10-27 21:25:33 -05:00
Aaron U'Ren
f44598bcb1 test(ipset): add unit tests for ipset regression testing 2025-10-27 21:25:33 -05:00
Bukal, Tomáš
720e2ca2bd fix(ipset): store kube-router-local-ips ipset 2025-10-11 08:26:43 -05:00
Aaron U'Ren
6c44013bc5 fix(ipset): ignore non-kube-router ipsets
Attempt to filter out sets that we are not authoritative for to avoid
race conditions with other operators (like Istio) that might be
attempting to modify ipsets at the same time.
2025-10-04 18:30:28 -05:00
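The filtering described in the commit above can be sketched roughly as follows. This is only an illustration, not kube-router's actual code: the helper names and the prefix list are hypothetical, and it assumes that owned sets can be recognized by name prefix, with an optional `inet6:` family prefix in front.

```go
package main

import (
	"fmt"
	"strings"
)

// ownedPrefixes is a hypothetical list of name prefixes that mark an ipset
// as managed by kube-router. The real list may differ.
var ownedPrefixes = []string{"kube-router-", "KUBE-"}

// isOwned reports whether a set name looks like one we are authoritative for.
// The "inet6:" family prefix is ignored for matching only; the name itself
// is never modified.
func isOwned(name string) bool {
	bare := strings.TrimPrefix(name, "inet6:")
	for _, p := range ownedPrefixes {
		if strings.HasPrefix(bare, p) {
			return true
		}
	}
	return false
}

// filterOwnedSets keeps only the sets we manage, so that sets created by
// other operators are left alone.
func filterOwnedSets(names []string) []string {
	owned := make([]string, 0, len(names))
	for _, n := range names {
		if isOwned(n) {
			owned = append(owned, n)
		}
	}
	return owned
}

func main() {
	all := []string{"kube-router-local-ips", "istio-inpod-probes", "inet6:kube-router-local-ips"}
	fmt.Println(filterOwnedSets(all))
}
```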
Aaron U'Ren
a4fb70a095 feat(lint): update golangci-lint v2.0.2 -> v2.4.0 2025-09-20 16:30:54 -05:00
Aaron U'Ren
d7214cec4f feat(Endpoints): convert Endpoints -> EndpointSlices 2025-09-06 16:27:03 -05:00
Aaron U'Ren
732d7a72dc fix(nsc): add loadbalancer IPs to metrics 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
c2fd633373 fix(nsc): sync field name 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
b4a9ba70fd fix(nsc): rename network_services_metrics.go 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
7533c183a1 feat(nsc): getMetricsServiceMap() rebuilds only after services changed 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
5efb999169 feat(nsc): replace unsafe.Pointer with atomic.Pointer 2025-09-01 21:04:49 -05:00
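As a rough illustration of the change named in the commit above, the sketch below publishes a map to a reader goroutine through the generic `atomic.Pointer` type (Go 1.19+) instead of `unsafe.Pointer`. The `collector` and `serviceMap` types here are hypothetical stand-ins, not the controller's real types.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// serviceMap is a hypothetical stand-in for the data shared with the
// metrics collector.
type serviceMap map[string]int

// collector holds a typed atomic pointer, so no unsafe casts are needed.
type collector struct {
	services atomic.Pointer[serviceMap]
}

// store publishes a new map. Callers must not mutate m afterwards.
func (c *collector) store(m serviceMap) {
	c.services.Store(&m)
}

// load returns the most recently published map, or nil if none was stored.
func (c *collector) load() serviceMap {
	if p := c.services.Load(); p != nil {
		return *p
	}
	return nil
}

func main() {
	var c collector
	c.store(serviceMap{"default/web": 2})
	fmt.Println(c.load()["default/web"])
}
```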
Richard Kojedzinszky
d0163ab725 feat(nsc): move part of Collect() to getMetricsServiceMap() 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
4e8bb705b5 feat(nsc): move metrics logic to separate file 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
a224198c89 feat(nsc): optimize key in temporary serviceMap 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
4ed0cf4117 feat(nsc): improve Service statistics 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
1b4b6d6b2b feat(nsc): eliminate nested loops in Collect() 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
766627645e feat(nsc): collect service statistics on demand 2025-09-01 21:04:49 -05:00
Richard Kojedzinszky
4b4ebec81f feat(nsc): prepare serviceMap to be accessed by collector thread 2025-09-01 21:04:49 -05:00
Anupam Ghosh
5e397e50e7 fix failed message 2025-08-06 17:01:22 -07:00
Anupam Ghosh
bbb8f3b0d9 disable sloppy_tcp if there is no DSR & Maglev service 2025-08-06 17:01:22 -07:00
Anupam Ghosh
598fc86349 enable sloppy_tcp when DSR and Maglev is enabled 2025-08-06 17:01:22 -07:00
Aaron U'Ren
700620509f feat(DSR): disable routing DSR traffic via kube-bridge
This was originally added in PR #210, but it appears to cause more
problems in my testing scenarios than it solves. When this is enabled,
kube workers cannot reach DSR-enabled services when the traffic is
routed to other nodes in the cluster.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
2ebcac62ec feat(linux_networking): add some additional logging 2025-06-29 17:42:18 -05:00
Aaron U'Ren
8504c52e80 fix(DSR): setup source routing for all external IPs
Previously, kube-router was only considering externalIPs when setting up
source routing policy, notably absent was consideration of LoadBalancer
IPs which are equally important for getting right with DSR.

This appears to have been a long-standing use-case that has never been
correctly considered since kube-router added a LoadBalancer
controller.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
e6edc853fb fix(ipAddrDel): check to see if IP exists on interface before delete
Rather than yolo'ing a delete of the IP on the interface, check to see
if it exists and save the user some warning in their logs.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
69e58eda04 feat(NSC): add some additional debugging to traffic director 2025-06-29 17:42:18 -05:00
Aaron U'Ren
94bfc0d9ba fix(ipAddrDel): check for routes before trying to delete
Instead of deleting and just hoping for the best, this change makes it
so that we check first whether or not a route exists. This helps to
reduce needless warnings that the user receives and is just all around
more accurate.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
e29b6a3275 fix(NSC): pass fwmark to traffic director as an int
It used to be when we were using iproute2's CLI we needed to have the
fwmark as a hex number so we were passing it as a string in that format.

However, now that we use the netlink library directly, we already have
the fwmark in the condition that we need it. So instead of doing all of
these string <-> int conversions, let's just keep this simpler.
2025-06-29 17:42:18 -05:00
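To make the simplification in the commit above concrete, here is a small sketch contrasting the hex-string round trip with passing the mark as an integer. The function names and the `rule` struct are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// fwmarkToHex shows the old style: a CLI wanted the mark as a hex string.
func fwmarkToHex(mark uint32) string {
	return "0x" + strconv.FormatUint(uint64(mark), 16)
}

// fwmarkFromHex is the matching parse step that the old style forced on
// any code that needed the number back.
func fwmarkFromHex(s string) (uint32, error) {
	v, err := strconv.ParseUint(s, 0, 32) // base 0 accepts the "0x" prefix
	if err != nil {
		return 0, err
	}
	return uint32(v), nil
}

// rule is a hypothetical stand-in for a routing rule that carries the mark
// as a number, so no string conversion is needed at all.
type rule struct {
	Mark uint32
}

func main() {
	mark := uint32(1018)
	hex := fwmarkToHex(mark)
	back, err := fwmarkFromHex(hex)
	fmt.Println(hex, back, err)

	// New style: hand the integer straight through.
	fmt.Println(rule{Mark: mark}.Mark)
}
```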
Aaron U'Ren
3c895955f7 fact(utils): factor out single subnet ip logic
Removes repeated logic of calculating IP address subnets for single
subnet hosts and consolidates it in one place.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
b070531ec5 fix: add proper nil rule src handling
When ip rules are evaluated in the netlink library, default routes for
src and dst are equated to nil. This makes it difficult to evaluate
them and requires additional handling for them.

I filed an issue upstream so that this could potentially get fixed:
https://github.com/vishvananda/netlink/issues/1080 however if it doesn't
get resolved, this should allow us to move forward.
2025-06-29 17:42:18 -05:00
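One way to handle the nil case described in the commit above is to treat a nil network as equivalent to a default route when comparing rules. The sketch below uses only the standard `net` package; the helper names are hypothetical, and it deliberately does not distinguish between the IPv4 and IPv6 defaults.

```go
package main

import (
	"fmt"
	"net"
)

// mustCIDR parses a CIDR string and panics on invalid input (demo helper).
func mustCIDR(s string) *net.IPNet {
	_, n, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return n
}

// isDefault reports whether n means "match everything": either nil, or an
// explicit /0 network on the unspecified address.
func isDefault(n *net.IPNet) bool {
	if n == nil {
		return true
	}
	ones, _ := n.Mask.Size()
	return ones == 0 && n.IP.IsUnspecified()
}

// sameRuleNet compares two rule networks, treating nil and a default route
// as equal. Address family is not considered in this sketch.
func sameRuleNet(a, b *net.IPNet) bool {
	if isDefault(a) || isDefault(b) {
		return isDefault(a) && isDefault(b)
	}
	return a.String() == b.String()
}

func main() {
	def := mustCIDR("0.0.0.0/0")
	pod := mustCIDR("10.244.1.0/24")
	fmt.Println(sameRuleNet(nil, def), sameRuleNet(nil, pod))
}
```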
Aaron U'Ren
f2b0d785a0 fact: add ip utils library & add unit testing
Consolidate IP utility functions into a new file and add proper unit
testing. Additionally consolidate logic and references to default route
subnets.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
4795a07e7c fix(ip rule): use NewRule() for all rule creations
It has proven to be tricky to insert new rules without calling the
designated NewRule() function from the netlink library. Usually attempts
will fail with an operation not supported message.

This improves the reliability of rule insertion.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
56076051f8 fix(linux_networking.go): add scope to local routes
In order for a local route to be valid it needs to have the scope set to
host. When we were executing ip commands iproute2 just did this for us
to make the command accurate. Now that we're communicating with the
netlink socket, we need to do this conversion for ourselves.

Without this we get an error that says "invalid argument" from the
netlink subsystem. But if the route isn't local, then most of the
routing logic for services doesn't work correctly because it acts upon
external traffic as well as local traffic which isn't correct.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
80328ace67 fix(linux_networking.go): filter routes to be deleted by table
Previously we were accidentally deleting all routes that were found,
this mimics the previous functionality better by only deleting external
IPs that were found in the externalIPRouteTable that are no longer in
the activeExternalIPs map.

Also improves logging around any routes that are deleted as this is
likely of interest to all kube-router administrators.
2025-06-29 17:42:18 -05:00
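The selection logic described in the commit above can be sketched as a filter over the routes that were found. The `route` struct, the function name, and the table ID below are hypothetical; only the names externalIPRouteTable and activeExternalIPs come from the commit message.

```go
package main

import "fmt"

// externalIPRouteTableID is a hypothetical table number used for the demo.
const externalIPRouteTableID = 100

// route is a minimal stand-in for a kernel route entry.
type route struct {
	Dst   string
	Table int
}

// routesToDelete returns only routes that live in the given table and whose
// destination is no longer an active external IP. Routes in any other table
// are never selected.
func routesToDelete(found []route, table int, activeExternalIPs map[string]bool) []route {
	var stale []route
	for _, r := range found {
		if r.Table != table {
			continue // not ours to touch
		}
		if activeExternalIPs[r.Dst] {
			continue // still in use
		}
		stale = append(stale, r)
	}
	return stale
}

func main() {
	found := []route{
		{Dst: "203.0.113.10", Table: externalIPRouteTableID},
		{Dst: "203.0.113.11", Table: externalIPRouteTableID},
		{Dst: "10.0.0.0/24", Table: 254},
	}
	active := map[string]bool{"203.0.113.10": true}
	for _, r := range routesToDelete(found, externalIPRouteTableID, active) {
		fmt.Printf("deleting stale route %s from table %d\n", r.Dst, r.Table)
	}
}
```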
Aaron U'Ren
f59a4f5ae8 feat: convert execs to ip to netlink calls
Not making direct exec calls to user binary interfaces has long been a
principle of kube-router. When kube-router was first coded, the netlink
library was missing significant features that forced us to exec out.
However, now netlink seems to have most of the functionality that we
need.

This converts all of the places where we can use netlink to use the
netlink functionality.
2025-06-29 17:42:18 -05:00
Manuel Rüger
6a1d15c24c Use golangci-lint 2.0.2 2025-04-23 22:56:24 +02:00
Aaron U'Ren
d8430e21c0 fix(lint): remove nolint for error messages
It looks like they fixed goconst upstream and it no longer checks this
2025-02-14 14:18:26 -06:00
Aaron U'Ren
760fcd5c85 fix(lint): remove non-constant format string (govet) 2025-02-14 14:18:26 -06:00
Aaron U'Ren
48b631c4ea fix(lint): remove unnecessary variable initializations (copyloopvar) 2025-02-14 14:18:26 -06:00
Aaron U'Ren
858fdf659d fix(lint): protect against integer overflow errors 2025-02-14 14:18:26 -06:00
Dmitry Sharshakov
aa7cffb6f0 fix(NSC): only set rp_filter to 2 if it is 1
Setting rp_filter to 2 when it is 0 will override its status to be always enabled (in the loose mode).
This behavior could break some networking solutions as it made packet admission rules more strict.
2025-01-11 14:37:21 -06:00
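The rule from the commit above fits in a few lines. In the kernel's rp_filter setting, 0 means no source validation, 1 means strict mode, and 2 means loose mode. The function name below is hypothetical, and the sketch only decides the value; it does not read or write any sysctl.

```go
package main

import "fmt"

// desiredRPFilter returns the value rp_filter should have and whether a
// write is needed. Only strict mode (1) is relaxed to loose mode (2);
// a disabled filter (0) is left disabled, and loose mode stays as it is.
func desiredRPFilter(current int) (value int, change bool) {
	if current == 1 {
		return 2, true
	}
	return current, false
}

func main() {
	for _, cur := range []int{0, 1, 2} {
		v, change := desiredRPFilter(cur)
		fmt.Printf("current=%d desired=%d change=%v\n", cur, v, change)
	}
}
```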
Aaron U'Ren
44439d6069 feat(NSC): change service.local internal traffic policy posture
Over time, feedback from users has been that our interpretation of how
the kube-router service.local annotation interacts with the internal
traffic policy is too restrictive.

It seems that tuning it to fall in line with the local internal traffic
policy is too restrictive. This commit changes that posture by equating
the service.local annotation with External Traffic Policy Local and
Internal Traffic Policy Cluster.

This means that when service.local is set the following will be true:

* ExternalIPs / LoadBalancer IPs will only be available on a node that
  hosts the workload
* ExternalIPs / LoadBalancer IPs will only be BGP advertised (when
  enabled) by nodes that host the workload
* Services will have the same posture as External Traffic Policy set to
  local
* ClusterIPs will be available on all nodes for LoadBalancing
* ClusterIPs will only be BGP advertised (when enabled) by nodes that
  host the workload
* Cluster IP services will have the same posture as Internal Traffic
  Policy set to cluster

For anyone desiring the original functionality of the service.local
annotation that has been in place since kube-router v2.1.0, all that
would need to be done is to set `internalTrafficPolicy` to Local as
described here: https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/
2024-12-04 08:14:52 +01:00
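The posture described in the commit above can be summarized as a small mapping. This is a simplified sketch under stated assumptions: the types and function are hypothetical, and it models only the two policy fields, not BGP advertisement or IP availability.

```go
package main

import "fmt"

type trafficPolicy string

const (
	policyCluster trafficPolicy = "Cluster"
	policyLocal   trafficPolicy = "Local"
)

// servicePosture is a hypothetical pair of effective traffic policies.
type servicePosture struct {
	External trafficPolicy
	Internal trafficPolicy
}

// effectivePosture applies the annotation on top of the declared policies.
// Empty fields fall back to Cluster, the Kubernetes default. service.local
// forces the external policy to Local and leaves the internal one as
// declared, so setting internalTrafficPolicy to Local restores the older,
// stricter behavior.
func effectivePosture(serviceLocal bool, declared servicePosture) servicePosture {
	eff := declared
	if eff.External == "" {
		eff.External = policyCluster
	}
	if eff.Internal == "" {
		eff.Internal = policyCluster
	}
	if serviceLocal {
		eff.External = policyLocal
	}
	return eff
}

func main() {
	fmt.Println(effectivePosture(true, servicePosture{}))
	fmt.Println(effectivePosture(true, servicePosture{Internal: policyLocal}))
}
```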
Aaron U'Ren
0ac15b273e fact(healthcontroller): make more robust
Make the health controller more robust and extensible by adding in
constants for heartbeats instead of random 3-character strings that are
easy to get wrong.
2024-11-21 15:24:09 +01:00
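The idea in the commit above, named constants in place of short free-form strings, might look something like the sketch below. All type, constant, and method names are hypothetical and are not taken from the health controller itself.

```go
package main

import (
	"fmt"
	"time"
)

// component is a typed identifier, so a mistyped name fails to compile
// instead of silently creating a new, never-checked heartbeat key.
type component string

const (
	componentNetworkPolicy component = "network-policy-controller"
	componentRouting       component = "network-routes-controller"
	componentServices      component = "network-services-controller"
)

// healthTracker remembers when each component last reported in.
type healthTracker struct {
	lastSeen map[component]time.Time
}

func newHealthTracker() *healthTracker {
	return &healthTracker{lastSeen: make(map[component]time.Time)}
}

// record stores a heartbeat for c at the current time.
func (h *healthTracker) record(c component) {
	h.lastSeen[c] = time.Now()
}

// healthy reports whether c has sent a heartbeat within maxAge.
func (h *healthTracker) healthy(c component, maxAge time.Duration) bool {
	last, ok := h.lastSeen[c]
	return ok && time.Since(last) <= maxAge
}

func main() {
	h := newHealthTracker()
	h.record(componentRouting)
	fmt.Println(h.healthy(componentRouting, time.Minute))
	fmt.Println(h.healthy(componentServices, time.Minute))
}
```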
Aaron U'Ren
8bb7cbf451 fix(linux_networking_moq.go): import def order 2024-10-21 15:44:07 -05:00
Aaron U'Ren
e4fa335acb fix(krnode): apply suggestions from code review
Co-authored-by: Tom Wieczorek <twz123@users.noreply.github.com>
2024-09-29 17:53:36 -05:00
Aaron U'Ren
9fd46cc86d fact(krnode): add node struct abstraction
This prepares the way for broader refactors in the way that we handle
nodes by:

* Separating frequently used node logic from the controller creation
  steps
* Keeping reused code DRY-er
* Adding interface abstractions for key groups of node data and starting
  to rely on those more rather than concrete types
* Separating node data from the rest of the controller data structure so
  that smaller definitions of data can be passed around to functions
  that need it rather than always passing the entire controller which
  contains more data / surface area than most functions need.
2024-09-29 17:53:36 -05:00
Aaron U'Ren
a0442e5abd fix: allow basic ICMPv6 neighbor discovery
This fixes the problem where if network policy is applied before any
communication between two pods, all subsequent communication fails
because the two pods aren't able to discover each other.
2024-08-03 14:55:47 -05:00