Commit Graph

4 Commits

Author SHA1 Message Date
Siavash Safi
2d5d239883
feat(notifier): independent alertmanager queues
Independent Alertmanager queues prevent queue overflow when one or more
Alertmanager instances are unavailable, which could otherwise result in
lost alert notifications.
The buffered queues are managed per AlertmanagerSet; sets are dynamically
added or removed by service discovery or a configuration reload.

The following metrics now include an extra dimension, the alertmanager label:
- prometheus_notifications_dropped_total
- prometheus_notifications_queue_capacity
- prometheus_notifications_queue_length

This change also includes the test from #14099

Closes #7676

Signed-off-by: Siavash Safi <siavash@cloudflare.com>
2025-06-18 10:15:53 +02:00
Siavash Safi
333c0001e2
chore(notifier): remove year from copyrights
Signed-off-by: Siavash Safi <siavash@cloudflare.com>
2025-06-10 11:55:15 +02:00
György Krajcsovits
334e9e1518
notifier: unit test for dropping throughput on stuck AM
Ref: https://github.com/prometheus/prometheus/issues/7676

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Siavash Safi <siavash@cloudflare.com>
2025-04-22 09:54:10 +02:00
Siavash Safi
ef48e4cb9f
chore: refactor notifier package
Split the notifier package into smaller source files.

Signed-off-by: Siavash Safi <siavash@cloudflare.com>
2025-04-03 17:48:04 +11:00