99 Commits

Author SHA1 Message Date
Aaron U'Ren
054d5d1ceb feat(lint): add basic typos checker to ensure less spelling mistakes in the future 2026-03-15 13:44:32 -05:00
Aaron U'Ren
8aaba6505e test(NSC): add comprehensive TCPMSS unit tests 2026-02-01 10:56:40 -06:00
Cat C
a8326ca382 fix(nrc): Update make test-pretty to test internal subdirectory. Update nlretry and LocalLinkQuerier interface to support passing in contexts 2025-12-28 16:50:40 -06:00
Cat C
8ea5e44db8 fix(nrc): Add netlink.Handle wrapper to retry netlink calls that raise ErrDumpInterrupted errors 2025-12-28 16:50:40 -06:00
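A minimal sketch of the retry idea in 8ea5e44db8, assuming a bounded retry loop around any call that can fail with an interrupted-dump error. The names `retryOnInterrupt` and `errDumpInterrupted` are hypothetical; the sentinel stands in for `netlink.ErrDumpInterrupted` so that the example compiles without the netlink dependency, and the real wrapper may be shaped differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errDumpInterrupted is a stand-in for netlink.ErrDumpInterrupted (assumption:
// used here only so the sketch has no external dependencies).
var errDumpInterrupted = errors.New("dump interrupted")

// retryOnInterrupt calls fn until it succeeds, fails with an unrelated error,
// or maxAttempts calls have been made.
func retryOnInterrupt[T any](maxAttempts int, fn func() (T, error)) (T, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var result T
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		result, err = fn()
		// Success and unrelated errors are returned to the caller unchanged.
		if !errors.Is(err, errDumpInterrupted) {
			return result, err
		}
	}
	return result, fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	links, err := retryOnInterrupt(3, func() ([]string, error) {
		calls++
		if calls < 3 {
			return nil, errDumpInterrupted // simulate an interrupted dump
		}
		return []string{"lo", "eth0"}, nil
	})
	fmt.Println(links, err, calls)
}
```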
Cat C
9e091b8875 feat(NRC): This commit adds support for a consolidated annotation for configuring
a node's BGP peer settings while maintaining backwards support for the existing
annotations to address #1393.
2025-12-15 22:46:22 -06:00
Aaron U'Ren
5c7215da52 fact(service.go): modernize interface{} -> any 2025-12-02 18:22:37 -06:00
Aaron U'Ren
36b3a3aeaa fix(service.go): rely on LabelServiceName only
When resolving EndpointSlice -> Service ownership, do not expose an error if the ownerReference in the metadata does not
agree with the kubernetes.io/service-name label. Instead just use the service-name label.

This aligns better to the way that other network plugins work.
2025-12-02 18:22:37 -06:00
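The label-only lookup described in 36b3a3aeaa could look roughly like the sketch below. The function name `serviceNameForSlice` is hypothetical and the labels are passed as a plain map to keep the example free of Kubernetes imports; the constant mirrors the well-known `kubernetes.io/service-name` label.

```go
package main

import (
	"errors"
	"fmt"
)

// labelServiceName mirrors the well-known EndpointSlice label that names the
// owning Service.
const labelServiceName = "kubernetes.io/service-name"

// serviceNameForSlice returns the owning Service name using only the label,
// ignoring any ownerReference on the object.
func serviceNameForSlice(labels map[string]string) (string, error) {
	name, ok := labels[labelServiceName]
	if !ok || name == "" {
		return "", errors.New("EndpointSlice has no " + labelServiceName + " label")
	}
	return name, nil
}

func main() {
	name, err := serviceNameForSlice(map[string]string{labelServiceName: "web"})
	fmt.Println(name, err)
}
```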
Aaron U'Ren
fcdb0ed8ae fix(node.go): embed root cause errors in returned errors 2025-12-02 18:03:34 -06:00
Aaron U'Ren
846fbd8500 fix(ipset): don't strip inet6 prefixing of ipsets
The problem here stems from the fact that when netpol generates its list of expected ipsets, it includes the inet6:
prefix, however, when the proxy and routing controller sent their list of expected ipsets, they did not do so. This
meant that no matter how we handled it in ipset.go it was wrong for one or the other use-cases.

I decided to standardize on the netpol way of sending the list of expected ipset names so that BuildIPSetRestore() can
function in the same way for all invocations.
2025-10-27 21:25:33 -05:00
Aaron U'Ren
f44598bcb1 test(ipset): add unit tests for ipset regression testing 2025-10-27 21:25:33 -05:00
Aaron U'Ren
6c44013bc5 fix(ipset): ignore non-kube-router ipsets
Attempt to filter out sets that we are not authoritative for to avoid
race conditions with other operators (like Istio) that might be
attempting to modify ipsets at the same time.
2025-10-04 18:30:28 -05:00
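One way to picture the filtering in 6c44013bc5 is a prefix check on set names, as sketched below. The prefixes in `ownedPrefixes` and the helper names are illustrative assumptions, not values taken from the kube-router source.

```go
package main

import (
	"fmt"
	"strings"
)

// ownedPrefixes lists the name prefixes treated as "ours" in this sketch.
// These values are assumptions for illustration only.
var ownedPrefixes = []string{"kube-router-", "KUBE-"}

// isOwned reports whether a set name, with any inet6: prefix ignored, starts
// with one of the owned prefixes.
func isOwned(name string) bool {
	name = strings.TrimPrefix(name, "inet6:")
	for _, p := range ownedPrefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

// filterOwned keeps only the sets this process is authoritative for.
func filterOwned(names []string) []string {
	var owned []string
	for _, n := range names {
		if isOwned(n) {
			owned = append(owned, n)
		}
	}
	return owned
}

func main() {
	all := []string{"kube-router-example", "istio-inbound", "inet6:kube-router-example"}
	fmt.Println(filterOwned(all))
}
```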
Aaron U'Ren
a4fb70a095 feat(lint): update golangci-lint v2.0.2 -> v2.4.0 2025-09-20 16:30:54 -05:00
Aaron U'Ren
d7214cec4f feat(Endpoints): convert Endpoints -> EndpointSlices 2025-09-06 16:27:03 -05:00
Anupam Ghosh
bbb8f3b0d9 disable sloppy_tcp if there is no DSR & Maglev service 2025-08-06 17:01:22 -07:00
Anupam Ghosh
98e38e9e66 get and set with int8 for SysctlConfig 2025-08-06 17:01:22 -07:00
Anupam Ghosh
598fc86349 enable sloppy_tcp when DSR and Maglev are enabled 2025-08-06 17:01:22 -07:00
Aaron U'Ren
3c895955f7 fact(utils): factor out single subnet ip logic
Removes repeated logic of calculating IP address subnets for single
subnet hosts and consolidates it in one place.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
f2b0d785a0 fact: add ip utils library & add unit testing
Consolidate IP utility functions into a new file and add proper unit
testing. Additionally consolidate logic and references to default route
subnets.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
2836065ffe fix(linux_routing.go): choose first rt_tables file
The rt_tables list is already in an ordered form in terms of priority.
Once one is found, it should be considered the optimal one and stop
looking for additional tables.
2025-06-29 17:42:18 -05:00
Aaron U'Ren
f59a4f5ae8 feat: convert execs to ip to netlink calls
Not making direct exec calls to user binary interfaces has long been a
principle of kube-router. When kube-router was first coded, the netlink
library was missing significant features that forced us to exec out.
However, now netlink seems to have most of the functionality that we
need.

This converts all of the places where we can use netlink to use the
netlink functionality.
2025-06-29 17:42:18 -05:00
Manuel Rüger
6a1d15c24c Use golangci-lint 2.0.2 2025-04-23 22:56:24 +02:00
Aaron U'Ren
16d1f6bc79 feat(Makefile): update golangci-lint 1.56.2->1.63.4 2025-02-14 14:18:26 -06:00
Dmitry Sharshakov
aa7cffb6f0 fix(NSC): only set rp_filter to 2 if it is 1
Setting rp_filter to 2 when it is 0 will override its status to be always enabled (in the loose mode).
This behavior could break some networking solutions as it made packet admission rules more strict.
2025-01-11 14:37:21 -06:00
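The rule in aa7cffb6f0 reduces to a small decision, sketched here with a hypothetical helper `desiredRPFilter`. For rp_filter, 0 means disabled, 1 strict and 2 loose; only the strict value is rewritten.

```go
package main

import "fmt"

// desiredRPFilter returns the value that should be in effect and whether a
// write is needed. Only strict mode (1) is relaxed to loose mode (2); a
// disabled (0) or already loose (2) setting is left alone.
func desiredRPFilter(current int) (value int, changed bool) {
	if current == 1 {
		return 2, true
	}
	return current, false
}

func main() {
	for _, current := range []int{0, 1, 2} {
		value, changed := desiredRPFilter(current)
		fmt.Printf("current=%d -> value=%d changed=%t\n", current, value, changed)
	}
}
```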
Aaron U'Ren
1b2fd449c2 fix: remove unreachable / redundant error condition 2024-11-21 15:24:09 +01:00
Aaron U'Ren
2896ae4d1a fix(node_test.go): some minor unit test errors & verbiage 2024-11-21 15:24:09 +01:00
Aaron U'Ren
b09395f547 fix(node): make ordering more specific
We should be returning IP addresses in specific orders based upon what
type of IP they are.
2024-10-21 15:44:07 -05:00
Aaron U'Ren
5e4db3fa33 test(krnode): add unit tests for new functionality 2024-09-29 17:53:36 -05:00
Aaron U'Ren
e4fa335acb fix(krnode): apply suggestions from code review
Co-authored-by: Tom Wieczorek <twz123@users.noreply.github.com>
2024-09-29 17:53:36 -05:00
Aaron U'Ren
9fd46cc86d fact(krnode): add node struct abstraction
This prepares the way for broader refactors in the way that we handle
nodes by:

* Separating frequently used node logic from the controller creation
  steps
* Keeping reused code DRY-er
* Adding interface abstractions for key groups of node data and starting
  to rely on those more rather than concrete types
* Separating node data from the rest of the controller data structure so
  that smaller definitions of data can be passed around to functions
  that need it rather than always passing the entire controller which
  contains more data / surface area than most functions need.
2024-09-29 17:53:36 -05:00
qbnit
ef370d39d8 fix(util): syntax in iptables.go
Fix strange mistake with ';' instead of 'string'.
2024-08-06 16:51:21 -05:00
qbnit
55b830d8f1 test: update utils/iptables_test to align with recent fix
Update tests for code affected in 8dfb5728582771dd6dd0341d3d0a122f14080cd5.
2024-08-06 16:51:21 -05:00
qbnit
1d53ea5764 fix: select ICMP version for common ICMP rules
This fixes kube-router trying to use `icmp` instead of `icmpv6` when creating common ICMP rules with ip6tables.

Fixes: #1712
2024-08-06 16:51:21 -05:00
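A sketch of the protocol selection fixed in 1d53ea5764: the rule arguments differ by family, with ip6tables expecting `icmpv6` and `--icmpv6-type`. The helper name `icmpRuleArgs` and the ACCEPT target are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// icmpRuleArgs builds the match arguments for an ICMP accept rule, choosing
// the protocol and type flag that fit the IP family.
func icmpRuleArgs(isIPv6 bool, icmpType string) []string {
	if isIPv6 {
		return []string{"-p", "icmpv6", "--icmpv6-type", icmpType, "-j", "ACCEPT"}
	}
	return []string{"-p", "icmp", "--icmp-type", icmpType, "-j", "ACCEPT"}
}

func main() {
	fmt.Println("iptables ", strings.Join(icmpRuleArgs(false, "echo-request"), " "))
	fmt.Println("ip6tables", strings.Join(icmpRuleArgs(true, "echo-request"), " "))
}
```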
Aaron U'Ren
a0442e5abd fix: allow basic ICMPv6 neighbor discovery
This fixes the problem where if network policy is applied before any
communication between two pods, all subsequent communication fails
because the two pods aren't able to discover each other.
2024-08-03 14:55:47 -05:00
Natanael Copa
a1125f6e18 fix: ensure that ipv6 is not disabled in kernel
Signed-off-by: Natanael Copa <ncopa@mirantis.com>
2024-06-16 17:24:02 -05:00
Aaron U'Ren
f5167732dc fact(ipset): simplify cleanup code by reducing family complexity 2024-05-13 12:18:05 -05:00
Aaron U'Ren
e4919f3976 fix(ipset.go): make IP families distinct in ipset handler
iptables provides distinct handlers because different commands execute
on different families. As such, in the code, it has become a paradigm to
iterate over the active handlers and execute their functions. However,
ipset does not treat families distinct in any of the list or save
operations and a single command handles both families. The only thing
that is distinct about ipset families is their type and their name
prefix.

This means operations like calling ipset.Save() with their default
implementation will pull both IPv4 and IPv6 sets into the same struct
and comingle them. This can cause problems down the line with any other
logic that assumes they will be distinct like iptables.

This adds logic to separate them out on the Save() operation and ensure
that the sets don't become comingled.
2024-05-13 12:18:05 -05:00
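The separation described in e4919f3976 might be sketched as a partition of saved set names, assuming the `inet6:` name prefix is the discriminator. The real handler may also look at the set type; `splitByFamily` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitByFamily partitions set names into IPv4 and IPv6 groups based on the
// inet6: name prefix.
func splitByFamily(names []string) (v4, v6 []string) {
	for _, n := range names {
		if strings.HasPrefix(n, "inet6:") {
			v6 = append(v6, n)
		} else {
			v4 = append(v4, n)
		}
	}
	return v4, v6
}

func main() {
	v4, v6 := splitByFamily([]string{"set-a", "inet6:set-a", "set-b"})
	fmt.Println("IPv4:", v4)
	fmt.Println("IPv6:", v6)
}
```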
Aaron U'Ren
61fe3a8034 feat(ipset.go): add set type and new line to debug msg 2024-05-13 12:18:05 -05:00
Aaron U'Ren
7755b4a67f fix(node.go): improve logic for GetNodeObject
Before the logic ran like the following in terms of preference:

1. Prefer environment var NODE_NAME
2. Use `os.Hostname()`
3. Fallback to `--hostname-override` passed by user

This didn't make a whole lot of sense, as `--hostname-override` is
directly, and supposedly intentionally, set by the user, therefore it
should be the MOST preferred, not the least preferred. Additionally,
none of the errors encountered were passed back to the user so that
future conditions could be considered, so if there was an error at the
API level, that error was swallowed. Now the logic looks like:

1. Prefer `--hostname-override` if it is set. If it is set and we
   weren't able to resolve to a node object, return the error
2. Use environment var NODE_NAME if it is set. If it is set and we
   weren't able to resolve to a node object, return the error
3. Fallback to `os.Hostname()`. If we weren't able to resolve to a node
   object then return the error and give the user options
2024-04-26 14:16:09 -05:00
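The new preference order in 7755b4a67f can be sketched as below. This covers only the choice of name; the API lookup and its error handling are left out, and `pickNodeName` is a hypothetical helper rather than the actual `GetNodeObject` code.

```go
package main

import (
	"fmt"
	"os"
)

// pickNodeName applies the precedence: --hostname-override, then NODE_NAME,
// then os.Hostname(). It also reports which source supplied the name.
func pickNodeName(hostnameOverride, envNodeName string, hostname func() (string, error)) (name, source string, err error) {
	if hostnameOverride != "" {
		return hostnameOverride, "--hostname-override", nil
	}
	if envNodeName != "" {
		return envNodeName, "NODE_NAME", nil
	}
	h, err := hostname()
	if err != nil {
		return "", "os.Hostname()", fmt.Errorf("could not determine hostname: %w", err)
	}
	return h, "os.Hostname()", nil
}

func main() {
	name, source, err := pickNodeName("", os.Getenv("NODE_NAME"), os.Hostname)
	fmt.Println(name, source, err)
}
```

In the commit's description a failed node lookup for the chosen name is returned as an error rather than falling through to the next source; that behaviour would sit in the caller of a helper like this.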
Aaron U'Ren
c762eaf2e5 feat(ipset): add more name utilities
Naming ipsets with the advent of IPv6 gets tricky because IPv6 ipsets
have to be prefixed with inet6:. This commit adds additional utilities
that help users find the correct name of ipsets.
2024-04-26 13:55:30 -05:00
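A naming utility of the kind c762eaf2e5 describes could be as small as the sketch below, which adds the `inet6:` prefix for IPv6 sets and avoids adding it twice. `ipSetName` is a hypothetical name, not the actual utility.

```go
package main

import (
	"fmt"
	"strings"
)

const inet6Prefix = "inet6:"

// ipSetName returns the set name to use for the given family: IPv6 names get
// the inet6: prefix (once), IPv4 names are returned unchanged.
func ipSetName(base string, isIPv6 bool) string {
	if isIPv6 && !strings.HasPrefix(base, inet6Prefix) {
		return inet6Prefix + base
	}
	return base
}

func main() {
	fmt.Println(ipSetName("example-set", false))
	fmt.Println(ipSetName("example-set", true))
}
```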
Aaron U'Ren
70920609dc fix(rt_tables): add path fallback logic
Ever since version v6.5.0 of iproute2, iproute2 no longer automatically
creates the /etc/iproute2 files, instead preferring to add files to
/usr/lib/iproute2 and then later on /usr/share/iproute2.

This adds fallback path matching to kube-router so that it can find
/etc/iproute2/rt_tables wherever it is defined instead of just failing.

This also means people running kube-router in containers will need to
change their mounts depending on where this file is located on their
host OS. However, ensuring that this file is copied to `/etc/iproute2`
is a legitimate way to ensure that this is consistent across a fleet of
multiple OS versions.
2024-03-25 18:56:23 -05:00
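The fallback in 70920609dc, together with the first-match rule from 2836065ffe, can be sketched as a walk over an ordered candidate list that stops at the first hit. The directories come from the commit message; their priority order and the helper names are assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// rtTablesCandidates is ordered by assumed priority; the first existing file
// wins.
var rtTablesCandidates = []string{
	"/etc/iproute2/rt_tables",
	"/usr/lib/iproute2/rt_tables",
	"/usr/share/iproute2/rt_tables",
}

// firstExisting returns the first path for which exists reports true.
func firstExisting(paths []string, exists func(string) bool) (string, error) {
	for _, p := range paths {
		if exists(p) {
			return p, nil // stop at the first match, do not keep looking
		}
	}
	return "", fmt.Errorf("no rt_tables file found in %v", paths)
}

// fileExists reports whether path exists and is not a directory.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

func main() {
	path, err := firstExisting(rtTablesCandidates, fileExists)
	fmt.Println(path, err)
}
```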
Aaron U'Ren
46a1b17903 feat(go): upgrade 1.20.13 -> 1.21.7 + dep update
Upgrades to Go 1.21.7 now that Go 1.20 is no longer being maintained.

It also resolves the race conditions that we were seeing with BGP
server tests when we upgraded from 1.20 -> 1.21. This appears to be
because some efficiency changed in 1.21 that caused BGP to write to the
events at the same time that the test harness was trying to read from
them. Solved this in a coarse manner by adding surrounding mutexes to
the test code.

Additionally, upgraded dependencies.
2024-03-02 15:45:54 -06:00
Aaron U'Ren
8afdee87d9 fact(NSC): differentiate headless services
Differentiate headless services from ClusterIP being none, in
preparation for handling the service.kubernetes.io/headless label. One
might think that handling these is similar, which it sort of is and sort
of isn't. ClusterIP is an immutable field, whereas labels are mutable.
This separates our handling of ClusterIP none-ness from the presence of
the headless label.

When we consider what to do with ClusterIP being none, that is
fundamentally different, because once it is None, the k8s API guarantees
that the service won't ever change.

Whereas the label can be added and removed.
2024-01-05 10:27:23 -06:00
Jason Piper
fcf0ad913d prometheus metrics: add option to specify listen address
In the situation that you have multiple interfaces/IP addresses,
we want to be able to specify which one we want to expose the
prometheus metrics on.
2023-11-05 18:49:13 -06:00
Aaron U'Ren
9d63cc689b feat(debug): add some extra debug at level 3 2023-10-07 08:52:31 -05:00
Aaron U'Ren
4c6e19f2e1 feat(ipset): consolidate ipset usage across controllers
Before this, we had 2 different ways to interact with ipsets, through
the handler interface which had the best handling for IPv6 because NPC
heavily utilizes it, and through the ipset struct which mostly repeated
the handler logic, but didn't handle some key things.

NPC utilized the handler functions and NSC / NRC mostly utilized the old
ipset struct functions. This caused a lot of duplication between the two
groups of functions and also caused issues with proper IPv6 handling.

This commit consolidates the two sets of usage into just the handler
interface. This greatly simplifies how the controllers interact with
ipsets and it also reduces the logic complexity on the ipset side.

This also fixes up some inconsistency with how we handled IPv6 ipset
names. ipset likes them to be prefixed with inet6:, but we weren't
always doing this in a way that made sense and was consistent across all
functions in the ipset struct.
2023-10-07 08:52:31 -05:00
Aaron U'Ren
da73dea69b feat(NSC): use EndpointSlice instead of Endpoints
With the advent of IPv6 integrated into the NSC we no longer get all IPs
from endpoints, but rather just the primary IP of the pod (which is
often, but not always the IPv4 address).

In order to get all possible endpoint addresses for a given service we
need to switch to using EndpointSlice which also nicely groups addresses
into IPv4 and IPv6 by AddressType and also gives us more information
about the endpoint status by giving us attributes for serving and
terminating, instead of just ready or not ready.

This does mean that users will need to add another permission to their
RBAC in order for kube-router to access these objects.
2023-10-07 08:52:31 -05:00
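To illustrate what da73dea69b gains from EndpointSlice, the sketch below groups serving addresses by address family. The `endpoint` and `endpointSlice` types are simplified stand-ins for the discovery/v1 API (which uses optional boolean conditions), and `servingAddressesByFamily` is a hypothetical helper.

```go
package main

import "fmt"

// endpoint is a simplified stand-in for a discovery/v1 Endpoint.
type endpoint struct {
	addresses []string
	serving   bool
}

// endpointSlice is a simplified stand-in for a discovery/v1 EndpointSlice;
// every slice carries a single address type such as "IPv4" or "IPv6".
type endpointSlice struct {
	addressType string
	endpoints   []endpoint
}

// servingAddressesByFamily collects the addresses of serving endpoints, keyed
// by the address type of the slice they came from.
func servingAddressesByFamily(slices []endpointSlice) map[string][]string {
	out := make(map[string][]string)
	for _, s := range slices {
		for _, ep := range s.endpoints {
			if !ep.serving {
				continue
			}
			out[s.addressType] = append(out[s.addressType], ep.addresses...)
		}
	}
	return out
}

func main() {
	slices := []endpointSlice{
		{addressType: "IPv4", endpoints: []endpoint{{addresses: []string{"10.0.0.5"}, serving: true}}},
		{addressType: "IPv6", endpoints: []endpoint{
			{addresses: []string{"fd00::5"}, serving: true},
			{addresses: []string{"fd00::6"}, serving: false},
		}},
	}
	fmt.Println(servingAddressesByFamily(slices))
}
```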
Aaron U'Ren
25ecb098c6 feat(nsc): add dualstack capabilities 2023-10-07 08:52:31 -05:00
Aaron U'Ren
1d5c9ce25c fix(ecmp_vip): update VIPs based on svc change
Previously we used to do an idempotent sync of all active VIPs any time we
got a service or endpoint update. However, this only worked when we
assumed a single-stack deployment model where IPs were never deleted
unless the whole service was deleted.

In a dual-stack model, we can add / remove LoadBalancer IPs and Cluster
IPs on updates. Given this, we need to take into account the finite
change that happens, and not just revert to sync-all because we'll never
stop advertising IPs that should be removed.

As a fall-back, we still have the outer Run loop that syncs all active
routes every X amount of seconds (configured by user CLI parameter). So
on that timer we'll still have something that syncs all active VIPs and
works as an outer control loop to ensure that desired state eventually
becomes active state if we accidentally remove a VIP that should have
been there.
2023-10-07 08:52:31 -05:00
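The per-update change handling described in 1d5c9ce25c amounts to a set difference between the VIPs a service had before and after the update. A minimal sketch, with `diffVIPs` as a hypothetical helper and plain strings standing in for IP addresses:

```go
package main

import (
	"fmt"
	"sort"
)

// diffVIPs compares the VIPs before and after a service update and returns
// the ones that should newly be advertised and the ones to withdraw.
func diffVIPs(previous, current []string) (toAdvertise, toWithdraw []string) {
	prev := make(map[string]bool, len(previous))
	for _, ip := range previous {
		prev[ip] = true
	}
	cur := make(map[string]bool, len(current))
	for _, ip := range current {
		if cur[ip] {
			continue // skip duplicates in the input
		}
		cur[ip] = true
		if !prev[ip] {
			toAdvertise = append(toAdvertise, ip)
		}
	}
	for ip := range prev {
		if !cur[ip] {
			toWithdraw = append(toWithdraw, ip)
		}
	}
	// Sort for deterministic output.
	sort.Strings(toAdvertise)
	sort.Strings(toWithdraw)
	return toAdvertise, toWithdraw
}

func main() {
	add, remove := diffVIPs(
		[]string{"10.96.0.10", "fd00::10"},
		[]string{"10.96.0.10", "192.0.2.7"},
	)
	fmt.Println("advertise:", add)
	fmt.Println("withdraw: ", remove)
}
```

The periodic full sync mentioned in the commit would still run alongside a diff like this as the outer control loop.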
Aaron U'Ren
ec12fda820 fix(node): do nil checking on FindBestIP util funcs 2023-10-07 08:52:31 -05:00
Aaron U'Ren
4f284be53e fix(NRC): add IPv6 logic to bgp-local-addresses 2023-10-07 08:52:31 -05:00