16 Commits

Author SHA1 Message Date
Aaron U'Ren
a0fe844a93 feat(NSC): honor service-proxy-name label
Abide the service.kubernetes.io/service-proxy-name label as defined by
the upstream standard here:
https://github.com/kubernetes-sigs/kpng/blob/master/doc/service-proxy.md#ignored-servicesendpoints

Resolves the failing e2e test:
should implement service.kubernetes.io/service-proxy-name

Fixes: #979
2024-01-05 10:27:23 -06:00
Aaron U'Ren
0f3714b9b7 fix(hairpin): set hairpin_mode for veth iface
It used to be that the kubelet handled setting hairpin mode for us:
https://github.com/kubernetes/kubernetes/pull/13628

Then this functionality moved to the dockershim:
https://github.com/kubernetes/kubernetes/pull/62212

Then the functionality was removed entirely:
https://github.com/kubernetes/kubernetes/commit/83265c9171f

Unfortunately, it was lost that we ever depended on this in order for
our hairpin implementation to work, if we ever knew it at all.
Additionally, I suspect that containerd and cri-o implementations never
worked correctly with hairpinning.

Without this, the NAT rules that we implement for hairpinning don't work
correctly. Because hairpin_mode isn't implemented on the virtual
interface of the container on the host, the packet bubbles up to the
kube-bridge. At some point in the traffic flow, the route back to the
pod gets resolved to the mac address inside the container, at that
point, the packet's source mac and destination mac don't match the
kube-bridge interface and the packet is black-holed.

This can also be fixed by putting the kube-bridge interface into
promiscuous mode so that it accepts all mac addresses, but I think that
going back to the original functionality of enabling hairpin_mode on the
veth interface of the container is likely the lesser of two evils here
as putting the kube-bridge interface into promiscuous mode will likely
have unintentional consequences.
2023-12-07 12:44:51 -06:00
Aaron U'Ren
da73dea69b feat(NSC): use EndpointSlice instead of Endpoints
With the advent of IPv6 integrated into the NSC we no longer get all IPs
from endpoints, but rather just the primary IP of the pod (which is
often, but not always the IPv4 address).

In order to get all possible endpoint addresses for a given service we
need to switch to using EndpointSlice which also nicely groups addresses
into IPv4 and IPv6 by AddressType and also gives us more information
about the endpoint status by giving us attributes for serving and
terminating, instead of just ready or not ready.

This does mean that users will need to add another permission to their
RBAC in order for kube-router to access these objects.
2023-10-07 08:52:31 -05:00
Aaron U'Ren
25ecb098c6 feat(nsc): add dualstack capabilities 2023-10-07 08:52:31 -05:00
Aaron U'Ren
06f5f8babf feat(go): update package version to /v2
Do the necessary to update kube-router to a new major version following
upstream documentation: https://go.dev/doc/modules/major-version
2023-10-07 08:52:31 -05:00
Aaron U'Ren
4b6cf6c896 fact(protocol): standardize protocol conversions 2022-02-11 17:34:10 -06:00
Aaron U'Ren
28aab6ea20 fact(service_endpoints_sync): simplify external IP logic
This is an attempt to make the external IP logic easier to follow and
more straight forward for future changes like consolidating the iptables
logic.
2022-02-11 17:34:10 -06:00
Aaron U'Ren
5101a4fe81 fix(nsc): remove error for lookupFWMarkByService
lookupFWMarkByService() was previous returning an error when no fwmark
was found in the tracking map for a given service. However, this isn't
really an error condition and shouldn't be treated as such. When it was
treated as an error condition users got a lot of confusing errors in the
logs.
2021-12-03 11:49:28 +01:00
Aaron U'Ren
c3f90c54b3
Fix Misc DSR Issues (#1174)
* fact(NSC): consolidate constants to top

* fix(NSC): increase IPVS add service logging

* fix(NSC): improve logging for FWMark IPVS entries

* fix(NSC): add missing parameter to logging

* feat(NSC): generate unique FW marks

Because we trim the 32-bit FNV-1a hash to 16 bits there is the potential
for FW marks to collide with each other even for unique inputs of IP,
protocol, and port. This reduces that chance up to the 16-bit max by
keeping track of which FW marks we've already allocated and what IP,
protocol, port combo they've been allocated for.

Fixes #1045

* fact(NSC): move utility funcs to utils

* fix(NSC): reduce IPVS service shell outs

This also aligns it more with the almost identical function used for
non-FWmarked services ipvsAddService() which is also called from
setupExternalIPServices and passes in this same list of ipvsServices.

* fix(NSC): fix & consolidate DSR cleanup code

A lot of this is refactor work, but its important to know why the DSR
mangle tables were not being cleaned up in the first place. When we
transitioned to iptables-save to look over the mangle rules, we didn't
realize that iptables-save changes the format of the marks from integer
values (which is what the CLI works with) to hexadecimal.

This made it so that we were never actually matching on a mangle rule,
which left them all behind. When these mangle rules were left, it meant
that IPs that used to be part of a DSR service were essentially
black-holed on the system and were no longer route-able.

Fixes #1167

* doc(dsr): expand DSR documentation

fixes #1055

* ensure active service map is updated for non DSR services

Co-authored-by: Murali Reddy <muralimmreddy@gmail.com>
2021-10-14 16:14:05 +05:30
Aaron U'Ren
5e1d033a44 fix(sysctl): revert is fatal check for some conditions 2021-09-13 17:39:28 -05:00
Aaron U'Ren
feb16d0d0b doc(NSC): add some comments around DSR 2021-09-11 16:20:07 -05:00
Aaron U'Ren
8f3861de40 fact(sysctl): consolidate sysctl usage into utils 2021-09-11 16:20:07 -05:00
Aaron U'Ren
da5f8e0044 fix: address minor PR feedback and misspells 2021-09-11 16:20:07 -05:00
Aaron U'Ren
85f28411dc feat(.golangci.yml): enable long lines linter and remediate 2021-09-11 16:20:07 -05:00
Aaron U'Ren
6208bfac46 feat(.golangci.yml): enable gomnd and remediate 2021-09-11 16:20:07 -05:00
Aaron U'Ren
c5f4c00d63 feat(.golangci.yml): enable dupl and remediate 2021-09-11 16:20:07 -05:00