754 Commits

Author SHA1 Message Date
dgrisonnet
02776a1d37 [bot] [main] Automated version update 2021-09-27 09:53:31 +00:00
Arunprasad Rajkumar
c5d265a14e
thanos: bump to latest and add thanosPrometheusCommonDimensions
This commit pulls latest changes from thanos mixins and sets `thanosPrometheusCommonDimensions`
to `namespace, pod` for k8s use case.

Refer https://github.com/thanos-io/thanos/pull/4508 for more details.

Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
2021-09-27 12:07:08 +05:30
Philip Gough
56f96e6389 Adjust dropped metrics from cAdvisor
This change drops pod-centric metrics without a non-empty 'container' label.

Previously we dropped pod-centric metrics without a (pod, namespace) label set;
however, these can be critical for debugging.
2021-09-24 17:24:01 +01:00
Damien Grisonnet
7f1092cdde
Merge pull request #1344 from PhilipGough/MON-1085
jsonnet: Support scraping the config-reloader for AlertManager and Pr…
2021-09-22 16:16:48 +02:00
Philip Gough
7b32afb8aa jsonnet: Support scraping the config-reloader for AlertManager and Prometheus 2021-09-22 14:54:12 +01:00
dgrisonnet
a232cca3b6 [bot] [main] Automated version update 2021-09-20 07:39:09 +00:00
Sylvain Pasche
6d5c1b793c Always generate grafana-config secret
Since https://github.com/brancz/kubernetes-grafana/pull/115, upstream
grafana contains a non-empty config. Generate the grafana-config secret
unconditionally even if no user config is passed.
2021-09-16 14:25:53 +02:00
dgrisonnet
6654c13142 [bot] [main] Automated version update 2021-09-13 07:39:05 +00:00
Damien Grisonnet
6f744e24a5
Merge pull request #1357 from arajkumar/adjust-NodeFilesystemSpaceFillingUp-warning-threshold
Adjust node filesystem space filling up warning threshold to 20%
2021-09-06 19:04:29 +02:00
Arunprasad Rajkumar
4de44139ec
add comments to reason fsSpaceFilling threshold adjustment
Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
2021-09-02 17:38:02 +05:30
Arunprasad Rajkumar
03471fd86f
Adjust threshold for SpaceFillingUp warning alert
Reduce threshold of NodeFilesystemSpaceFillingUp warning alert to 20% space available, instead of 40% (default).

This aligns the threshold with the default kubelet GC values
below [1]:

"imageMinimumGCAge": "2m0s",
"imageGCHighThresholdPercent": 85,
"imageGCLowThresholdPercent": 80,

[1] https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/

Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
2021-09-01 13:29:36 +05:30
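
The arithmetic behind the threshold change above can be sketched as follows. This is an illustration only, not code from the repository: it assumes the kubelet documentation's reading of the two settings (image GC always runs once disk usage exceeds the high threshold and frees space down to the low threshold), and the function names are hypothetical. The alert threshold is expressed as "percent of space available", so the GC values (percent used) are converted to the same unit.

```python
# Default kubelet image GC values quoted in the commit message (percent of disk used).
KUBELET_IMAGE_GC_DEFAULTS = {
    "imageGCHighThresholdPercent": 85,
    "imageGCLowThresholdPercent": 80,
}


def free_space_percent(used_percent):
    """Convert a 'percent used' value into 'percent available'."""
    return 100 - used_percent


def compare_threshold(alert_free_percent, gc=KUBELET_IMAGE_GC_DEFAULTS):
    """Relate an alert threshold (percent available) to the GC thresholds.

    Hypothetical helper for illustration; returns the free-space equivalents
    of both GC thresholds and how far the alert threshold sits above the
    level that GC tries to restore.
    """
    gc_start_free = free_space_percent(gc["imageGCHighThresholdPercent"])
    gc_target_free = free_space_percent(gc["imageGCLowThresholdPercent"])
    return {
        "gc_start_free": gc_start_free,
        "gc_target_free": gc_target_free,
        "gap_to_gc_target": alert_free_percent - gc_target_free,
    }


old = compare_threshold(40)  # previous default warning threshold
new = compare_threshold(20)  # threshold set by this commit
print(old["gap_to_gc_target"], new["gap_to_gc_target"])
```

Under these assumptions the old 40% threshold sits 20 percentage points above the free space that image GC restores, while the new 20% threshold coincides with it.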
dgrisonnet
a1c6a4e21d [bot] [main] Automated version update 2021-08-30 07:39:09 +00:00
simonpasquier
eb52023db2 [bot] [main] Automated version update 2021-08-25 09:37:24 +00:00
Damien Grisonnet
9ef6dff167 jsonnet: unpin dependencies
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-08-20 13:49:12 +02:00
Damien Grisonnet
eca67844af jsonnet: pin and update jsonnet dependencies
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-08-19 16:41:53 +02:00
Damien Grisonnet
b5ec93208b jsonnet: drop deprecated etcd metric
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-08-18 17:27:50 +02:00
Damien Grisonnet
45adc03cfb jsonnet: update prometheus-adapter to v0.9.0
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2021-08-17 18:05:45 +02:00
paulfantom
c4113807fb
jsonnet: set thanos config to null by default
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2021-08-16 15:16:52 +02:00
paulfantom
ad3fc8920e [bot] [main] Automated version update 2021-08-16 08:04:51 +00:00
Dimitrije Manic
12cd7fd9ce Prometheus ruleSelector defaults to all rules 2021-08-11 10:16:24 -04:00
paulfantom
f6d6b30aed
jsonnet: use full dependency path 2021-08-06 14:15:23 +02:00
Damien Grisonnet
33cc694f18
Merge pull request #1308 from PaytmLabs/feature/separate-thanos-rules
Create Thanos Sidecar rules separately from Prometheus ones
2021-08-05 16:19:01 +02:00
Maxime Brunet
961f138dd0
Add back _config.runbookURLPattern for Thanos Sidecar rules 2021-08-04 14:22:06 -07:00
Paweł Krupa
54d8f88162
Merge pull request #1307 from PaytmLabs/feature/addons/aws-vpc-cni
Turn AWS VPC CNI into a control plane add-on
2021-08-04 09:56:50 +02:00
Paweł Krupa
e931a417fc
Merge pull request #1230 from Luis-TT/fix-kube-proxy-dashboard 2021-08-04 09:55:09 +02:00
Luis Vidal Ernst
0b49c3102d Added PodMonitor for kube-proxy 2021-08-03 08:31:49 +02:00
Maxime Brunet
0e7dc97bc5
Create Thanos Sidecar rules separately from Prometheus ones 2021-08-02 12:46:06 -07:00
Maxime Brunet
d3ccfb8220
Turn AWS VPC CNI into a control plane add-on 2021-08-02 11:26:33 -07:00
dgrisonnet
e97eb0fbe9 [bot] [main] Automated version update 2021-08-02 13:37:08 +00:00
Maxime Brunet
b7fe018d29
eks: Revert back to awscni_total_ip_addresses-based alert 2021-07-31 11:37:12 -07:00
Paweł Krupa
b9c73c7b29
Merge pull request #1283 from prashbnair/node-veth
changing node exporter ignore list
2021-07-28 09:17:03 +02:00
Prashant Balachandran
09fdac739d changing node exporter ignore list 2021-07-27 17:17:19 +05:30
Paweł Krupa
785789b776
Merge pull request #1257 from Luis-TT/kube-state-metrics-kubac-proxy-resources 2021-07-27 12:36:26 +02:00
lanmarti
ed48391831 Add resource requests and limits to prometheus-adapter container 2021-07-27 12:19:51 +02:00
Maxime Brunet
3a98a3478c
eks: Fix CNI metrics relabelings
Signed-off-by: Maxime Brunet <maxime.brunet@paytm.com>
2021-07-23 13:39:29 -07:00
Manuel Rüger
acd1eeba4c node.libsonnet: Fix small typo
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2021-07-22 19:14:24 +02:00
paulfantom
cfe830f8f0
jsonnet/kube-prometheus: point to runbooks.prometheus-operator.dev
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2021-07-22 17:30:57 +02:00
Luis Vidal Ernst
9c638162ae Allow customizing of kubeRbacProxy in kube-state-metrics 2021-07-21 13:57:05 +02:00
Paweł Krupa
acea5efd85
Merge pull request #1268 from paulfantom/alerts-best-practices
Alerts best practices
2021-07-21 09:32:32 +02:00
Philip Gough
463ad065d3 jsonnet: Drop cAdvisor metrics with no (pod, namespace) labels while preserving ability to monitor system services resource usage
The following provides a description and cardinality estimation based on the tests in a local cluster:

container_blkio_device_usage_total - useful for containers, but not for system services (nodes*disks*services*operations*2)
container_fs_.*                    - add filesystem read/write data (nodes*disks*services*4)
container_file_descriptors         - file descriptors limits and global numbers are exposed via (nodes*services)
container_threads_max              - max number of threads in cgroup. Usually for system services it is not limited (nodes*services)
container_threads                  - used threads in cgroup. Usually not important for system services (nodes*services)
container_sockets                  - used sockets in cgroup. Usually not important for system services (nodes*services)
container_start_time_seconds       - container start. Possibly not needed for system services (nodes*services)
container_last_seen                - Not needed as system services are always running (nodes*services)
container_spec_.*                  - Everything related to cgroup specification and thus static data (nodes*services*5)
2021-07-20 12:50:02 +01:00
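
The cardinality formulas in the commit message above can be evaluated for a given cluster size with a short script. This is a minimal sketch: the formulas are copied from the message, while the function name and the input values (3 nodes, 2 disks, 10 system services, 4 block I/O operations) are hypothetical and chosen only to make the arithmetic concrete.

```python
def estimate_series(nodes, disks, services, operations):
    """Estimate the series count per dropped metric family.

    The multipliers follow the formulas listed in the commit message;
    the inputs are whatever the caller assumes about the cluster.
    """
    per_service = nodes * services
    return {
        "container_blkio_device_usage_total": nodes * disks * services * operations * 2,
        "container_fs_.*": nodes * disks * services * 4,
        "container_file_descriptors": per_service,
        "container_threads_max": per_service,
        "container_threads": per_service,
        "container_sockets": per_service,
        "container_start_time_seconds": per_service,
        "container_last_seen": per_service,
        "container_spec_.*": per_service * 5,
    }


# Hypothetical cluster size, for illustration only.
estimate = estimate_series(nodes=3, disks=2, services=10, operations=4)
total = sum(estimate.values())

for name, count in estimate.items():
    print(f"{name:40s} {count}")
print(f"{'total':40s} {total}")
```

With these example inputs the block I/O and filesystem families dominate the estimate, since they are the only ones that scale with the number of disks.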
paulfantom
46eb1713a5
jsonnet: remove unused alert unit tests as those are moved to alertmanager repository 2021-07-20 11:14:38 +02:00
paulfantom
8c357c6bde
jsonnet: align alert annotations with best practices
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2021-07-20 10:59:49 +02:00
paulfantom
1a3c610c61 [bot] Automated version update 2021-07-19 13:44:23 +00:00
Paweł Krupa
99ee030de3
Merge pull request #1259 from PaytmLabs/feature/eks/cni-relabel-instance
eks: Relabel instance with node name for CNI DaemonSet
2021-07-19 10:09:09 +02:00
Paweł Krupa
80bb15bedd
Merge pull request #1255 from yeya24/fix-dashboards-definition-length-check 2021-07-19 09:56:09 +02:00
Maxime Brunet
7394929c76
eks: Relabel instance with node name for CNI DaemonSet 2021-07-17 11:28:38 -07:00
Yury Gargay
9b08b941f8 Update kubernetes-mixin
From b710a868a9
2021-07-14 18:51:36 +02:00
ben.ye
43adca8df7 fmt again
Signed-off-by: ben.ye <ben.ye@bytedance.com>
2021-07-13 19:56:38 -07:00
ben.ye
90b2751f06 fmt code
Signed-off-by: ben.ye <ben.ye@bytedance.com>
2021-07-13 19:48:01 -07:00
ben.ye
dee7762ae3 create dashboardDefinitions if rawDashboards or folderDashboards are specified
Signed-off-by: ben.ye <ben.ye@bytedance.com>
2021-07-13 19:39:01 -07:00