86 Commits

Author SHA1 Message Date
beorn7
61617eb2d9 Fix PrometheusRemoteWriteDesiredShards
This rule has the same labels on both sides. We don't want
`group_right` and `on`, we want nothing.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-10-29 00:23:39 +01:00
Callum Styan
da6d46625f Repeat shards panels on the queue label.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-10-21 11:03:50 -07:00
Callum Styan
818974ff8f Rewrite remote write dashboard using base grafonnet.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-10-17 15:40:58 -07:00
Callum Styan
81fa63006c Add additional shards/segment graphs to remote write dashboard.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-10-09 09:59:02 -07:00
Simon Pasquier
e36ab7e192 prometheus-mixin: improve description of sample alerts (#6050)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-09-24 17:44:27 +02:00
Björn Rabenstein
3b3eaf3496 Merge pull request #5787 from cstyan/reshard-max-logging
Add metrics for max/min/desired shards to queue manager.
2019-09-09 22:32:54 +02:00
Callum Styan
a98599bea8 Update remote write max shards alert; properly template/query for max
shards in description.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-09-09 12:01:11 -07:00
Callum Styan
3b75614892 Add a warning alert, since the remote write behind alert will probably
already be going off, about desired shards being higher than max shards.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-08-08 06:45:46 -07:00
Simon Pasquier
dd174963a2 prometheus-mixin: remove PrometheusTSDBWALCorruptions
The counter is only increased when tsdb.Open() is called which
Prometheus does only once in its lifetime (when it initializes). If the
corruption can't be recovered, tsdb.Open() returns an error and
Prometheus exits. Hence the metric is either 0 (no corruption) or 1
(corruption detected and repaired). If the latter, the alert isn't
actionable and the only way to resolve it is to restart Prometheus which
would reset the counter.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-08-06 14:36:56 +02:00
Matthias Loibl
20d12ff1c7 Fix prometheus-mixin dashboards to use grafanaDashboards
Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2019-07-11 15:40:26 +02:00
beorn7
4825585834 Tweak tenses
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-28 17:37:49 +02:00
beorn7
9a2177949d Protect gauge-based alerts against failed scrapes
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-28 16:46:19 +02:00
beorn7
52707535b8 Remove/improve unused variables and weird doc comments
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-28 15:41:31 +02:00
beorn7
7a25a2586d Sync with alerts from kube-prometheus
While doing so, re-introduce the summary/description
annotations. Also, add a few more rules and tweak a few of the
existing ones.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-27 23:50:26 +02:00
beorn7
ded0705bdc Update remote repo for grafana-builder dependency
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-27 14:39:38 +02:00
beorn7
1336a28848 Use a config variable for the Prometheus name
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-27 14:34:11 +02:00
beorn7
613cb5430c Add a "work in progress" disclaimer.
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 23:24:22 +02:00
beorn7
e34af6d4d3 Address various comments from the review
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 23:22:16 +02:00
beorn7
23c03207e9 Fixed indentation
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 20:31:05 +02:00
beorn7
d5845ad05b Fix formatting
This is the outcome of `make fmt`.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 16:23:25 +02:00
beorn7
d45e8a0f61 Adjust to jsonnet v0.13
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 16:22:21 +02:00
beorn7
5c04ef3935 Make README.md immediately useful
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 16:12:59 +02:00
beorn7
ddfabda152 Add Makefile and suitable jsonnet files
This makes the mixins usable as advertised.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 15:30:55 +02:00
beorn7
e943803a3c Add .gitignore file
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 15:22:23 +02:00
Callum Styan
a5762f3681 Add dashboard for remote write to prometheus-mixin.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-06-17 15:02:42 -07:00
Tom Wilkie
38a9bbbec2 Loosen off PrometheusRemoteWriteBehind alert.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-04 12:47:24 +00:00
Tom Wilkie
b615069289 Update metric names.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-01 07:39:48 -08:00
Tom Wilkie
e248ffb220 Add alert for WAL remote write falling behind.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-02-12 15:22:58 +00:00
Tom Wilkie
638204c775 Typo
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-19 12:23:42 +00:00
Tom Wilkie
8f42192e52 Add Prometheus alerts from kube-prometheus, remove the alertmanager alerts.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-19 11:22:55 +00:00
Tom Wilkie
dfbdf8d3bb Add a basic readme with link to the mixin docs.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:23:14 +00:00
Tom Wilkie
5fd712b210 copypasta.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00
Tom Wilkie
50861d586a Alert if more than 1% of alerts fail for a given integration.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00
Tom Wilkie
266ba185fe Remove PromScrapeFailed alert.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00
Tom Wilkie
e8a8ce5654 Basic Prometheus dashboard.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00
Tom Wilkie
ee1427faad Prometheus monitoring mixin for Prometheus itself.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00