98 Commits

Author SHA1 Message Date
Aaron U'Ren
fa7bcdeb06 fix(bgp_policies_test.go): use startBgpServer()
Use startBgpServer() rather than doing things individually, so that we
can follow the logic path of how kube-router actually works better. This
allows us to use annotations rather than set stuff manually and allows
us to test more of the code-path of the NRC.

Additionally, this change lets us test some error paths better, for
example making sure that startBgpServer() returns the error we expect
when only one part of the prepend ASN annotation is present.
Previously, we were not testing this code path.
2021-05-17 12:08:36 -05:00
Aaron U'Ren
a5d6560751 fact(bgp_policies_test): move BGP policy tests into their own file 2021-05-17 12:08:36 -05:00
Aaron U'Ren
ef827d3dbf fix: protect uint32 conversion
See the following for more details:
https://github.com/cloudnativelabs/kube-router/security/code-scanning?query=ref%3Arefs%2Fpull%2F1065%2Fmerge+tool%3ACodeQL
2021-04-14 16:23:59 -05:00
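The uint32 protection mentioned in the commit above can be illustrated with a minimal sketch. It is not the code from the commit: the helper names `toUint32` and `parseUint32` are hypothetical, and the sketch only shows the general technique of range-checking before narrowing to `uint32`.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// toUint32 (hypothetical helper) converts an int64 to uint32 and returns an
// error instead of silently wrapping when the value does not fit.
func toUint32(v int64) (uint32, error) {
	if v < 0 || v > math.MaxUint32 {
		return 0, fmt.Errorf("value %d is out of range for uint32", v)
	}
	return uint32(v), nil
}

// parseUint32 (hypothetical helper) parses a decimal string. Passing a bit
// size of 32 makes strconv.ParseUint reject anything larger than a uint32.
func parseUint32(s string) (uint32, error) {
	n, err := strconv.ParseUint(s, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid 32-bit unsigned value %q: %w", s, err)
	}
	return uint32(n), nil
}

func main() {
	asn, err := parseUint32("65000")
	fmt.Println(asn, err)

	_, err = parseUint32("4294967296") // one past the uint32 maximum
	fmt.Println(err)
}
```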
Aaron U'Ren
1816886cb4 fix: remove possible BGP password leak via logs
See:
https://github.com/cloudnativelabs/kube-router/security/code-scanning/1?query=ref%3Arefs%2Fpull%2F1065%2Fmerge
2021-04-14 16:23:59 -05:00
Aaron U'Ren
be01f317c7 fact: other misc cleanups 2021-04-14 16:23:59 -05:00
Aaron U'Ren
0faf772fbd fix: don't overload function names with vars 2021-04-14 16:23:59 -05:00
Aaron U'Ren
53cfbe30eb fix: return early when we might be holding nil references 2021-04-14 16:23:59 -05:00
Aaron U'Ren
4efa5ccc48 fact: remove function parameters that are never referenced 2021-04-14 16:23:59 -05:00
Aaron U'Ren
a86b3fad35 fact: handle errors from Close() explicitly 2021-04-14 16:23:59 -05:00
Aaron U'Ren
96675e620b fix: don't capitalize error messages
It is standard practice in Go to not capitalize error messages:
https://github.com/golang/go/wiki/CodeReviewComments#error-strings
2021-04-14 16:23:59 -05:00
Aaron U'Ren
e9c77d0a35 fix(comments): misspellings and bad doc strings 2021-04-14 16:23:59 -05:00
Manuel Rüger
7d47aefe7d Replace github.com/golang/glog with k8s.io/klog/v2
glog is effectively unmaintained and the kubernetes ecosystem is mainly
using its fork klog

Fixes: #1051
2021-04-11 13:16:03 -05:00
Murali Reddy
c309b276ae skip logging Error when there is no Service object for an Endpoint 2021-03-24 14:30:27 -05:00
Aaron U'Ren
43c3c9de86
Handle headless services (#1047)
* doc(ecmp_vip.go): add info around extra withdraw

Rename getWithdraw to make it more explicit what it's doing here. Also
add documentation as to why this is needed on Update and not
Create/Delete as well as why we only treat externalIPs.

* fix(ecmp_vip.go): remove superfluous AddPolicies

AddPolicies is already called downstream of nrc.OnEndpointsUpdate() so
there is no need to do it here as well; the only result is that this
expensive, idempotent operation is run twice.

* feat: better handling of headless services

Also introduces a consolidated Service utilities section for controller
functionality related to services that is shared.

* fix: add logging back to tryHandleServiceDelete
2021-03-24 08:31:39 +05:30
Murali Reddy
40512f104a serialize the iptables changes by NRC and NPC while starting 2021-03-18 09:21:22 -05:00
yydzhou
49b9add056
Making IPIP/tunnel and override-nexthop independent (#1025)
* enable tunnel plus override-nexthop config

* add docs

* feedback integration

Co-authored-by: deng.zhou <deng.zhou@bytedance.com>
2021-02-09 18:44:56 +05:30
Murali Reddy
54b921f1f8 Merge remote-tracking branch 'iamakulov/master' 2021-01-04 16:56:41 +05:30
Aaron U'Ren
f8aed0c92a fix(nrc): multiple services with the same VIP
Properly consider the readiness of all services in the case where
multiple services share the same VIP. Don't withdraw a VIP just because
one service is not ready.
2020-12-11 11:05:33 -06:00
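The shared-VIP rule described in the commit above can be sketched as follows. This is an illustration under assumptions, not the NRC code: the `service` type and the `splitVIPs` function are hypothetical, and readiness is reduced to a single boolean per service.

```go
package main

import (
	"fmt"
	"sort"
)

// service is a hypothetical, simplified view of a Kubernetes Service.
type service struct {
	name  string
	vips  []string
	ready bool
}

// splitVIPs returns the VIPs that at least one ready service uses (advertise)
// and the VIPs that no ready service uses (withdraw).
func splitVIPs(services []service) (advertise, withdraw []string) {
	readyByVIP := make(map[string]bool)
	for _, svc := range services {
		for _, vip := range svc.vips {
			// A VIP stays advertised as soon as any one service is ready.
			readyByVIP[vip] = readyByVIP[vip] || svc.ready
		}
	}
	for vip, ready := range readyByVIP {
		if ready {
			advertise = append(advertise, vip)
		} else {
			withdraw = append(withdraw, vip)
		}
	}
	// Sort so the result does not depend on map iteration order.
	sort.Strings(advertise)
	sort.Strings(withdraw)
	return advertise, withdraw
}

func main() {
	services := []service{
		{name: "web-a", vips: []string{"10.0.0.1"}, ready: true},
		{name: "web-b", vips: []string{"10.0.0.1", "10.0.0.2"}, ready: false},
	}
	advertise, withdraw := splitVIPs(services)
	fmt.Println("advertise:", advertise)
	fmt.Println("withdraw:", withdraw)
}
```

Here 10.0.0.1 stays advertised because web-a is ready, even though web-b shares it and is not.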
ᗪєνιη ᗷυнʟ
8e3f36c679
Add LoadBalancer to getExternalIPs (#995)
* add LoadBalancer to getExternalIPs

* fix up network_routes_controller tests

* update ecmp_vip tests
2020-10-02 16:34:14 +05:30
Murali Reddy
92b914e7fd review comments 2020-10-01 23:00:36 -05:00
Murali Reddy
7904b7c950 addressing review comments 2020-10-01 23:00:36 -05:00
Murali Reddy
947bb246e4 fix lint error 2020-10-01 23:00:36 -05:00
Murali Reddy
db1bd5611e set mtu in cni spec to auto configure the MTUs of the pods' veth and kube-bridge interfaces
Fixes #165
2020-10-01 23:00:36 -05:00
Aaron U'Ren
824614d162
Add Support for Reading Peer Passwords via a File (#986)
* Add support for reading peer passwords via a file

Syntax of the file is the same as for --peer-router-passwords, that is,
a comma separated list of base64 encoded passwords.

Passwords specified with --peer-router-passwords have precedence over
passwords read from peer-router-passwords-file.

* fix(options): peer password file linting and doc

Co-authored-by: Jean Raby <jean@raby.sh>
2020-09-08 16:16:21 -05:00
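The password handling described in the commit above (a comma separated list of base64 encoded passwords, with the command-line value taking precedence over the file) can be sketched like this. The function names `decodePasswords` and `resolvePasswords` are hypothetical and the file contents are passed in as a string, so this is only an approximation of the behavior, not the actual option-parsing code.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodePasswords (hypothetical helper) splits a comma separated list of
// base64 encoded passwords and decodes each entry.
func decodePasswords(list string) ([]string, error) {
	if strings.TrimSpace(list) == "" {
		return nil, nil
	}
	var out []string
	for i, enc := range strings.Split(list, ",") {
		raw, err := base64.StdEncoding.DecodeString(strings.TrimSpace(enc))
		if err != nil {
			// Report only the position, never the value, to avoid leaking secrets.
			return nil, fmt.Errorf("password at index %d is not valid base64: %w", i, err)
		}
		out = append(out, string(raw))
	}
	return out, nil
}

// resolvePasswords (hypothetical helper) prefers the value given on the
// command line and falls back to the contents of the password file.
func resolvePasswords(flagValue, fileContents string) ([]string, error) {
	if flagValue != "" {
		return decodePasswords(flagValue)
	}
	return decodePasswords(fileContents)
}

func main() {
	// No flag value given, so the file contents are used.
	fromFile, err := resolvePasswords("", "c2VjcmV0,aHVudGVyMg==\n")
	fmt.Println(len(fromFile), err)
}
```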
Murali Reddy
3c734fb96a
merge gobgp-update into master (#982)
* merge gobgp-update into master

* update travis.yaml go version:

* go get github.com/osrg/gobgp to build gobgp

* install git as go get needs it
2020-09-07 10:27:58 +05:30
Ivan Akulov
1a487d2140
Remove options passed to .Refresh()
To match the existing code behavior that existed for at least two years
2020-08-19 21:50:37 +03:00
Aaron U'Ren
e35dc9d61e
Merge pull request #958 from coufalja/random-all
Add --random-fully to MASQ iptables rules to mitigate conntrack issues
2020-08-04 16:49:06 -05:00
jakub.coufal
68dba40d58 Clean original iptables rule if --random-fully is supported 2020-08-04 07:33:17 +02:00
Murali Reddy
a33089d292
[testing] run go linters (#943)
* run go linters for static code checking

* fix(lint): fix all goimports linting errors

* fix(lint): fix all golint errors

* fix(lint): fix all spelling errors

Co-authored-by: Aaron U'Ren <aauren@gmail.com>
2020-07-28 23:52:41 +05:30
jakub.coufal
8d424ea09b Fix pod egress rule cleanup 2020-07-28 14:02:00 +02:00
jakub.coufal
d66a3bb06e Activate --random-fully where supported 2020-07-27 14:43:06 +02:00
Murali Reddy
bb35b9ad2e fix lint error: minor fix to catch the error from .bgpServer.Stop() 2020-07-17 06:54:34 +05:30
Aaron U'Ren
031a9926d6
Merge pull request #786 from jdrahos/rr_ipv4_785
Allow to configure RR cluster id using IPv4 strings
2020-07-16 09:41:13 -05:00
CloudNativer
1c184624d1
Add the bgp-holdtime parameter to adjust the holdtime of BGP negotiation with the connected network devices. (#921)
2020-07-13 09:10:31 -05:00
Aaron U'Ren
b07f53f4b8 fix(graceful_restart): gofmt and doc fixes so unit tests pass 2020-07-10 16:26:54 -05:00
Jean Raby
1c594b2827 Allow setting BGP Graceful restart time from CLI
Default value remains the same as GoBGP (90s)
2020-07-10 13:57:04 -05:00
Murali Reddy
81d717d9af fix false negative errors in creating BGP defined sets 2020-06-11 16:59:09 +05:30
Manuel Rüger
12674d5f8b
Add golangci-lint support (#895)
* Makefile: Add lint using golangci-lint

* build/travis-test.sh: Run lint step

* metrics_controller: Lint

pkg/metrics/metrics_controller.go:150:2: `mu` is unused (structcheck)
        mu          sync.Mutex
        ^
pkg/metrics/metrics_controller.go:151:2: `nodeIP` is unused (structcheck)
        nodeIP      net.IP
        ^

* network_service_graceful: Lint

pkg/controllers/proxy/network_service_graceful.go:21:6: `gracefulQueueItem` is unused (deadcode)
type gracefulQueueItem struct {
     ^
pkg/controllers/proxy/network_service_graceful.go:22:2: `added` is unused (structcheck)
        added   time.Time
        ^
pkg/controllers/proxy/network_service_graceful.go:23:2: `service` is unused (structcheck)
        service *ipvs.Service
        ^

* network_services_controller_test: Lint

pkg/controllers/proxy/network_services_controller_test.go:80:6: func `logf` is unused (unused)

* ecmp_vip: Lint

pkg/controllers/routing/ecmp_vip.go:208:4: S1023: redundant `return` statement (gosimple)
                        return
                        ^

* bgp_peers: Lint

pkg/controllers/routing/bgp_peers.go:331:4: S1023: redundant `return` statement (gosimple)
                        return
                        ^

* bgp_policies: Lint

pkg/controllers/routing/bgp_policies.go:80:3: S1011: should replace loop with `externalBgpPeers = append(externalBgpPeers, nrc.nodePeerRouters...)` (gosimple)
                for _, peer := range nrc.nodePeerRouters {
                ^
pkg/controllers/routing/bgp_policies.go:23:20: ineffectual assignment to `err` (ineffassign)
        podCidrPrefixSet, err := table.NewPrefixSet(config.PrefixSet{
                          ^
pkg/controllers/routing/bgp_policies.go:42:22: ineffectual assignment to `err` (ineffassign)
        clusterIPPrefixSet, err := table.NewPrefixSet(config.PrefixSet{
                            ^
pkg/controllers/routing/bgp_policies.go:33:30: Error return value of `nrc.bgpServer.AddDefinedSet` is not checked (errcheck)
                nrc.bgpServer.AddDefinedSet(podCidrPrefixSet)
                                           ^
pkg/controllers/routing/bgp_policies.go:48:30: Error return value of `nrc.bgpServer.AddDefinedSet` is not checked (errcheck)
                nrc.bgpServer.AddDefinedSet(clusterIPPrefixSet)
                                           ^
pkg/controllers/routing/bgp_policies.go:69:31: Error return value of `nrc.bgpServer.AddDefinedSet` is not checked (errcheck)
                        nrc.bgpServer.AddDefinedSet(iBGPPeerNS)
                                                   ^
pkg/controllers/routing/bgp_policies.go:108:31: Error return value of `nrc.bgpServer.AddDefinedSet` is not checked (errcheck)
                        nrc.bgpServer.AddDefinedSet(ns)
                                                   ^
pkg/controllers/routing/bgp_policies.go:120:30: Error return value of `nrc.bgpServer.AddDefinedSet` is not checked (errcheck)
                nrc.bgpServer.AddDefinedSet(ns)
                                           ^

* network_policy_controller: Lint

pkg/controllers/netpol/network_policy_controller.go:35:2: `networkPolicyAnnotation` is unused (deadcode)
        networkPolicyAnnotation      = "net.beta.kubernetes.io/network-policy"
        ^
pkg/controllers/netpol/network_policy_controller.go:1047:4: SA9003: empty branch (staticcheck)
                        if err != nil {
                        ^
pkg/controllers/netpol/network_policy_controller.go:969:10: SA4006: this value of `err` is never used (staticcheck)
        chains, err := iptablesCmdHandler.ListChains("filter")
                ^
pkg/controllers/netpol/network_policy_controller.go:1568:4: SA4006: this value of `err` is never used (staticcheck)
                        err = iptablesCmdHandler.Delete("filter", "FORWARD", strconv.Itoa(i-realRuleNo))
                        ^
pkg/controllers/netpol/network_policy_controller.go:1584:4: SA4006: this value of `err` is never used (staticcheck)
                        err = iptablesCmdHandler.Delete("filter", "OUTPUT", strconv.Itoa(i-realRuleNo))
                        ^

* network_services_controller: Lint

pkg/controllers/proxy/network_services_controller.go:66:2: `h` is unused (deadcode)
        h      *ipvs.Handle
        ^
pkg/controllers/proxy/network_services_controller.go:879:23: SA1019: client.NewEnvClient is deprecated: use NewClientWithOpts(FromEnv)  (staticcheck)
        dockerClient, err := client.NewEnvClient()
                             ^
pkg/controllers/proxy/network_services_controller.go:944:5: unreachable: unreachable code (govet)
                                glog.V(3).Infof("Waiting for tunnel interface %s to come up in the pod, retrying", KUBE_TUNNEL_IF)
                                ^
pkg/controllers/proxy/network_services_controller.go:1289:5: S1002: should omit comparison to bool constant, can be simplified to `!hasHairpinChain` (gosimple)
        if hasHairpinChain != true {
           ^
pkg/controllers/proxy/network_services_controller.go:1237:43: S1019: should use make(map[string][]string) instead (gosimple)
        rulesNeeded := make(map[string][]string, 0)
                                                 ^
pkg/controllers/proxy/network_services_controller.go:1111:4: S1023: redundant break statement (gosimple)
                        break
                        ^
pkg/controllers/proxy/network_services_controller.go:1114:4: S1023: redundant break statement (gosimple)
                        break
                        ^
pkg/controllers/proxy/network_services_controller.go:1117:4: S1023: redundant break statement (gosimple)
                        break
                        ^
pkg/controllers/proxy/network_services_controller.go:445:21: Error return value of `nsc.publishMetrics` is not checked (errcheck)
                nsc.publishMetrics(nsc.serviceMap)
                                  ^
pkg/controllers/proxy/network_services_controller.go:1609:9: Error return value of `h.Write` is not checked (errcheck)
        h.Write([]byte(ip + "-" + protocol + "-" + port))
               ^
pkg/controllers/proxy/network_services_controller.go:912:13: Error return value of `netns.Set` is not checked (errcheck)
                        netns.Set(hostNetworkNamespaceHandle)
                                 ^
pkg/controllers/proxy/network_services_controller.go:926:13: Error return value of `netns.Set` is not checked (errcheck)
                        netns.Set(hostNetworkNamespaceHandle)
                                 ^
pkg/controllers/proxy/network_services_controller.go:950:13: Error return value of `netns.Set` is not checked (errcheck)
                        netns.Set(hostNetworkNamespaceHandle)
                                 ^
pkg/controllers/proxy/network_services_controller.go:641:9: SA4006: this value of `err` is never used (staticcheck)
        addrs, err := getAllLocalIPs()
               ^

* network_routes_controller: Lint

pkg/controllers/routing/network_routes_controller.go:340:2: S1000: should use for range instead of for { select {} } (gosimple)
        for {
        ^
pkg/controllers/routing/network_routes_controller.go:757:22: Error return value of `nrc.bgpServer.Stop` is not checked (errcheck)
                        nrc.bgpServer.Stop()
                                          ^
pkg/controllers/routing/network_routes_controller.go:770:22: Error return value of `nrc.bgpServer.Stop` is not checked (errcheck)
                        nrc.bgpServer.Stop()
                                          ^
pkg/controllers/routing/network_routes_controller.go:782:23: Error return value of `nrc.bgpServer.Stop` is not checked (errcheck)
                                nrc.bgpServer.Stop()
                                                  ^
pkg/controllers/routing/network_routes_controller.go:717:12: Error return value of `g.Serve` is not checked (errcheck)
        go g.Serve()

* ipset: Lint

pkg/utils/ipset.go:243:23: Error return value of `entry.Set.Parent.Save` is not checked (errcheck)
        entry.Set.Parent.Save()
                             ^

* pkg/cmd/kube-router: Lint

pkg/cmd/kube-router.go:214:26: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
                fmt.Fprintf(os.Stderr, output)
                                       ^
pkg/cmd/kube-router.go:184:15: SA1017: the channel used with signal.Notify should be buffered (staticcheck)
        signal.Notify(ch, syscall.SIGINT, syscall.SIGTERM)
                     ^
pkg/cmd/kube-router.go:94:17: Error return value of `hc.RunServer` is not checked (errcheck)
        go hc.RunServer(stopCh, &wg)
                       ^
pkg/cmd/kube-router.go:112:16: Error return value of `hc.RunCheck` is not checked (errcheck)
        go hc.RunCheck(healthChan, stopCh, &wg)
                      ^
pkg/cmd/kube-router.go:121:12: Error return value of `mc.Run` is not checked (errcheck)
                go mc.Run(healthChan, stopCh, &wg)
                         ^

* cmd/kube-router/kube-router: Lint

cmd/kube-router/kube-router.go:31:24: Error return value of `flag.CommandLine.Parse` is not checked (errcheck)
        flag.CommandLine.Parse([]string{})
                              ^
cmd/kube-router/kube-router.go:33:10: Error return value of `flag.Set` is not checked (errcheck)
        flag.Set("logtostderr", "true")
                ^
cmd/kube-router/kube-router.go:34:10: Error return value of `flag.Set` is not checked (errcheck)
        flag.Set("v", config.VLevel)
                ^
cmd/kube-router/kube-router.go:62:27: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
                        fmt.Fprintf(os.Stdout, http.ListenAndServe("0.0.0.0:6060", nil).Error())
                                               ^

* kube-router_test: Lint

cmd/kube-router/kube-router_test.go:21:10: Error return value of `io.Copy` is not checked (errcheck)
                io.Copy(stderrBuf, stderrR)
                       ^
cmd/kube-router/kube-router_test.go:40:17: Error return value of `docBuf.ReadFrom` is not checked (errcheck)
        docBuf.ReadFrom(docF)
                       ^

* service_endpoints_sync: Lint

pkg/controllers/proxy/service_endpoints_sync.go:460:2: ineffectual assignment to `ipvsSvcs` (ineffassign)
        ipvsSvcs, err := nsc.ln.ipvsGetServices()
        ^
pkg/controllers/proxy/service_endpoints_sync.go:311:5: SA4006: this value of `err` is never used (staticcheck)
                                err = nsc.ln.ipAddrDel(dummyVipInterface, externalIP)
                                ^

* node: Lint

pkg/utils/node.go:19:16: SA1019: clientset.Core is deprecated: please explicitly pick a version if possible.  (staticcheck)
                node, err := clientset.Core().Nodes().Get(nodeName, metav1.GetOptions{})
                             ^
pkg/utils/node.go:27:15: SA1019: clientset.Core is deprecated: please explicitly pick a version if possible.  (staticcheck)
        node, err := clientset.Core().Nodes().Get(hostName, metav1.GetOptions{})
                     ^
pkg/utils/node.go:34:15: SA1019: clientset.Core is deprecated: please explicitly pick a version if possible.  (staticcheck)
                node, err = clientset.Core().Nodes().Get(hostnameOverride, metav1.GetOptions{})
                            ^

* aws: Lint

pkg/controllers/routing/aws.go:31:8: SA4006: this value of `err` is never used (staticcheck)
                URL, err := url.Parse(providerID)
                     ^

* health_controller: Lint

pkg/healthcheck/health_controller.go:54:10: Error return value of `w.Write` is not checked (errcheck)
                w.Write([]byte("OK\n"))
                       ^
pkg/healthcheck/health_controller.go:68:10: Error return value of `w.Write` is not checked (errcheck)
                w.Write([]byte("Unhealthy"))
                       ^
pkg/healthcheck/health_controller.go:159:2: S1000: should use a simple channel send/receive instead of `select` with a single case (gosimple)
        select {
        ^

* network_routes_controller_test: Lint

pkg/controllers/routing/network_routes_controller_test.go:1113:37: Error return value of `testcase.nrc.bgpServer.Stop` is not checked (errcheck)
                        defer testcase.nrc.bgpServer.Stop()
                                                         ^
pkg/controllers/routing/network_routes_controller_test.go:1314:37: Error return value of `testcase.nrc.bgpServer.Stop` is not checked (errcheck)
                        defer testcase.nrc.bgpServer.Stop()
                                                         ^
pkg/controllers/routing/network_routes_controller_test.go:2327:37: Error return value of `testcase.nrc.bgpServer.Stop` is not checked (errcheck)
                        defer testcase.nrc.bgpServer.Stop()
                                                         ^

* .golangci.yml: Increase timeout

Default is 1m, increase to 5m otherwise travis might fail

* Makefile: Update golangci-lint to 1.27.0

* kube-router_test.go: defer waitgroup

Co-authored-by: Aaron U'Ren <aauren@users.noreply.github.com>

* network_routes_controller: Incorporate review

* bgp_policies: Incorporate review

* network_routes_controller: Incorporate review

* bgp_policies: Log error instead

* network_services_controller: Incorporate review

Co-authored-by: Aaron U'Ren <aauren@users.noreply.github.com>
2020-06-03 22:29:06 +02:00
Aaron U'Ren
cb48a7f87b
fix(network_routes): missing node ip -> error log (#904)
Before we used to raise an error when a node was missing an IP, but it
turns out that this is not a required attribute of a node. And while it
is rare that a node would be missing an IP address, a node doesn't
require an IP address or a name or really much of anything in order to
exist.

This brings us to stronger conformance with the Kubernetes API and makes
it so that kube-router logs errors rather than changing its health
status and potentially causing cascading failures across the fleet if a
user adds a node like this.
2020-05-26 00:18:21 +05:30
Aaron U'Ren
d2178da5f2
fix(ecmp_vip): check for nil nodename (#903)
While it is rare for NodeName to be missing, it is not guaranteed to
exist by the Kubernetes API (see link below). This retains checking via
NodeName first if it exists, but if it's nil, rather than segfaulting,
it evaluates via the IP address.

Fixes #781

https://github.com/cloudnativelabs/kube-router/blob/master/vendor/k8s.io/api/core/v1/types.go#L3487
2020-05-24 20:30:18 +05:30
Лач
86ebd286b6
Fix for same issue as #750, but for network_routes_controller 2020-04-27 02:02:34 +05:00
Murali Reddy
0f21f87fd5
withdraw external IP from advertisement only if the deleted service is the last service using external IP (#850)
* withdraw external IP from advertisement only if the deleted service
is the last service using external IP

Fixes #828

* addressing review comment
2020-04-23 08:18:56 +05:30
Murali Reddy
4c764f5486
handle DeletedFinalStateUnknown objects in DeleteFunc handlers (#856)
* in DeleteFunc handlers across the controllers, handle the case where the received object can be of
type DeletedFinalStateUnknown

fixes one of the symptoms (panic on receiving DeletedFinalStateUnknown objects) reported in #712

* address review comments
2020-04-13 15:57:14 +05:30
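The tombstone handling described in the commit above can be sketched as below. To keep the example self-contained, `deletedFinalStateUnknown` is a local stand-in that mirrors the `Key` and `Obj` fields of client-go's `cache.DeletedFinalStateUnknown`, and `service` and `serviceFromDeleteEvent` are hypothetical names; the real handlers work with the client-go and Kubernetes API types.

```go
package main

import "fmt"

// deletedFinalStateUnknown is a local stand-in for client-go's
// cache.DeletedFinalStateUnknown, which an informer may deliver to a
// DeleteFunc when it missed the actual delete event.
type deletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// service is a hypothetical placeholder for the API object being watched.
type service struct {
	Name string
}

// serviceFromDeleteEvent unwraps the object passed to a delete handler.
// It returns an error instead of panicking on an unexpected type.
func serviceFromDeleteEvent(obj interface{}) (*service, error) {
	switch o := obj.(type) {
	case *service:
		return o, nil
	case deletedFinalStateUnknown:
		svc, ok := o.Obj.(*service)
		if !ok {
			return nil, fmt.Errorf("tombstone %q contained unexpected object type %T", o.Key, o.Obj)
		}
		return svc, nil
	default:
		return nil, fmt.Errorf("unexpected object type %T in delete handler", obj)
	}
}

func main() {
	tombstone := deletedFinalStateUnknown{Key: "default/web", Obj: &service{Name: "web"}}
	svc, err := serviceFromDeleteEvent(tombstone)
	fmt.Println(svc.Name, err)
}
```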
Murali Reddy
2c4911b9a9
Fix unit test failure due to switch of listing node API objects from (#869)
API server to cached informer. Modify test to use informer
2020-04-06 18:58:31 +05:30
Murali Reddy
33724aac05
read the necessary API objects from local cache instead of listing from the API server (#864)
Fixes #862
2020-04-03 06:51:52 +05:30
Murali Reddy
9db9a4980b
populate pod CIDR in network routing controller to simulate reading from node spec once at beginning (#844) 2020-02-19 11:24:11 +05:30
Murali Reddy
148736b33d fix gofmt 2020-02-17 02:06:17 +05:30
wu0407
459e52eba2
fix unhealthy on api server down (#813)
* fix router controller unhealthy on api server down

* import glog

* use NetworkRoutingController podCidr

* fix undefined
2020-02-17 01:56:21 +05:30
jdrahos
8023f6a753 Allow to configure cluster id using IPv4 strings 2019-09-24 17:17:22 -04:00
Tom Pointon
d6f9f31a7b Fix: Send BGP Withdrawals for Service VIPs Upon Service Deletion (#756)
* Refactor: separate fetching service VIPs from advertise/withdrawal decision

* Refactor: simplify advertise/withdrawal logic

* Pass svcDeleted param to getVIPsForService

* Don't advertise VIPs from deleted services

* Test for withdrawing VIPs from deleted service

* Refactor: use explicit handleServiceDelete functions
2019-09-19 17:55:15 +05:30