Joakim Karlsson f3e7aced1a Metrics + Logging update (#294)
2018-01-25 22:56:51 +05:30


Metrics

Scraping kube-router metrics with Prometheus

The scope of this document is to describe how to set up the annotations needed for Prometheus to use Kubernetes SD to discover & scrape kube-router pods. For help with installing Prometheus please see their docs.

Metrics options:

  --metrics-path        string               Path to serve Prometheus metrics on ( default: /metrics )
  --metrics-port        uint16 <0-65535>     Prometheus metrics port to use ( default: 0, disabled )

To enable kube-router metrics, start kube-router with --metrics-port set to a port greater than 0.
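
For example (the controller flags shown are illustrative; only --metrics-port, and optionally --metrics-path, are needed for metrics):

  kube-router --run-router=true --run-firewall=true --run-service-proxy=true \
    --metrics-port=8080 --metrics-path=/metrics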

Metrics are generally exported at the same rate as the sync period for each service.

The default values, unless otherwise specified, are:

  • iptables-sync-period - 1 min
  • ipvs-sync-period - 1 min
  • routes-sync-period - 1 min

By enabling Kubernetes SD in the Prometheus configuration & adding the required annotations, Prometheus can automatically discover & scrape kube-router metrics.
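
As a sketch of the Prometheus side (the job name and relabeling rules below follow the common Kubernetes SD example configuration and are an assumption, not something shipped with this repo), a pod scrape job honoring these annotations could look like:

  scrape_configs:
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
        - role: pod
      relabel_configs:
        # keep only pods annotated prometheus.io/scrape: "true"
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        # honor a custom metrics path from prometheus.io/path
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        # honor a custom port from prometheus.io/port
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__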

Version notes

kube-router 0.1.0-rc2 and upwards supports runtime configuration for controlling where to expose the metrics. If you are using an older version, the metrics path & port are locked to /metrics & 8080.

Supported annotations

The following annotations can be set on pods/services to enable automatic SD & scraping:

  • prometheus.io/scrape: Only scrape services that have a value of true.
  • prometheus.io/path: If the metrics path is not /metrics, override this.
  • prometheus.io/port: If the metrics are exposed on a different port to the default, set this to that port.

They are to be set under spec.template.metadata.

For example:

spec:
  template:
    metadata:
      labels:
        k8s-app: kube-router
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"

Available metrics

If metrics are enabled, only the running services' metrics are exposed.

The following metrics are exposed by kube-router, prefixed by kube_router_:

run-router = true

  • controller_bgp_peers Number of BGP peers of the instance
  • controller_bgp_advertisements_received Number of total BGP advertisements received since kube-router start
  • controller_bgp_internal_peers_sync_time Time it took for the BGP internal peer sync loop to complete

run-firewall = true

  • controller_iptables_sync_time Time it took for the iptables sync loop to complete

run-service-proxy = true

  • controller_ipvs_services_sync_time Time it took for the ipvs sync loop to complete
  • controller_ipvs_services The number of ipvs services in the instance
  • controller_ipvs_metrics_export_time The time it took to run the metrics export for IPVS services
  • service_total_connections Total connections made to the service since creation
  • service_packets_in Total number of packets received by the service
  • service_packets_out Total number of packets sent by the service
  • service_bytes_in Total bytes received by the service
  • service_bytes_out Total bytes sent by the service
  • service_pps_in Incoming packets per second
  • service_pps_out Outgoing packets per second
  • service_cps Connections per second
  • service_bps_in Incoming bytes per second
  • service_bps_out Outgoing bytes per second

To get a grouped list of CPS for each service, a Prometheus query could look like: sum(kube_router_service_cps) by (namespace, service_name)
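
Since service_total_connections is a counter accumulating since service creation, per-second values can also be derived in Prometheus itself; for example (the 5m window is an illustrative choice):

  sum(rate(kube_router_service_total_connections[5m])) by (namespace, service_name)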

Grafana Dashboard

This repo contains an example Grafana dashboard utilizing all of the above exposed metrics from kube-router. dashboard