948 Commits

Author SHA1 Message Date
David Leadbeater
e647f7954f
promtool: Add feature flags for promql features (#16443)
These are supported in the main prometheus binary but the feature flags
weren't supported in promtool.

Fixes #16412.

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2025-04-17 10:29:44 +01:00
Julien Pivotto
7370d62acf PromQL: allow arithmetic in durations in PromQL parser
Updated the parser to allow calculations in PromQL durations.

This enables durations in the form of:

  rate(http_requests_total[10m+2s])

The computation of the calculations is done directly at the parse level and does not hit the PromQL Engine.
The lexer has also been updated and improved, in particular for subqueries.

Buxfix: rate(http_requests_total[0]) is no longer allowed.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2025-04-03 10:56:02 +02:00
Dimitar Dimitrov
bd5b2ea95c
ruler notifier: make batch size configurable (#16254)
* ruler notifier: make batch size configurable

In Mimir we experimented with setting a higher value for the batch size.
A 4x increase in batch size decreased the time to process a single notification by about 2x.
This reduces the processing time of the notifications queue and increases the throughput of the queue.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update cmd/prometheus/main.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update docs

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Use a string constant

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add godoc comment on exported constant

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2025-03-24 14:22:19 +00:00
Matthieu MOREL
5fa1146e21
chore: enable gci linter (#16245)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-03-22 15:46:13 +00:00
Ganesh Vernekar
62bf7dbed7
Merge pull request #16224 from pr00se/main-test-data-race
Address data race in main_test.go
2025-03-18 13:49:26 -04:00
Fiona Liao
37c2ebb5fd
Make out-of-order native histograms flag a no-op and always enable (#16207)
* Remove experimental out-of-order native histogram flag

This feature has been available in Prometheus since September 2024,
and has no known issues. Therefore proposing to remove the flag
entirely and always have it on. Note that there are still two
settings that need to be configured (out-of-order time window > 0
and native histograms enabled) for this feature to work.

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Keep feature flag with warning

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update CHANGELOG.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Additional cleanup of comments and test names

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

---------

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-03-18 10:59:02 +00:00
Patryk Prus
c6e4cf03fd
Address data race in main_test.go
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-03-17 17:13:11 -04:00
Arthur Silva Sens
95f49dd84b
Bump prometheus/common to v0.63.0 (#16210)
* Bump prometheus/common to v0.63.0

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

* nolint usage of deprecated model.NameValidationScheme

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

---------

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-03-13 20:42:42 +01:00
Owen Williams
94b43c5d4c utf8: Remove support for legacy global validation setting
Global and Data Source configurations can specify legacy mode, but Prometheus now requires that the overall validation mode be set to UTF-8

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2025-03-13 10:47:24 -04:00
Bartlomiej Plotka
7a7bc65237
Add util/compression package to consolidate snappy/zstd use in Prometheus. (#16156)
# Conflicts:
#	tsdb/db_test.go

Apply suggestions from code review




tmp



Addressed comments.



Update util/compression/buffers.go

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-03-10 10:36:26 +00:00
Arve Knudsen
7cbf749096
Upgrade to github.com/oklog/ulid/v2 (#16168)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-03-05 16:03:25 +01:00
Julius Volz
6054e843fe Fix test race by not calling t.Log() after test completion
Fixes https://github.com/prometheus/prometheus/issues/16169

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2025-03-05 12:20:12 +01:00
co63oc
0e4e5a71bd
Fix typos (#16076)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-02-28 11:24:25 +11:00
Matthieu MOREL
c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
Bartlomiej Plotka
de23a9667c
prw2: Split PRW2.0 from metadata-wal-records feature (#16030)
Rationales:

* metadata-wal-records might be deprecated and replaced going forward: https://github.com/prometheus/prometheus/issues/15911
* PRW 2.0 works without metadata just fine (although it sends untyped metrics as expected).

Signed-off-by: bwplotka <bwplotka@gmail.com>
2025-02-13 12:16:33 +00:00
frazou
9b4c8f6be2
rulefmt: support YAML aliases for Alert/Record/Expr (#14957)
* rulefmt: add tests with YAML aliases for Alert/Record/Expr

Altough somewhat discouraged in favour of using proper configuration
management tools to generate full YAML, it can still be useful in some
situations to use YAML anchors/aliases in rules.

The current implementation is however confusing: aliases will work
everywhere except on the alert/record name and expr

This first commit adds (failing) tests to illustrate the issue, the next
one fixes it. The YAML test file is intentionally filled with anchors
and aliases. Although this is probably not representative of a real-world
use case (which would have less of them), it errs on the safer side.

Signed-off-by: François HORTA <fhorta@scaleway.com>

* rulefmt: support YAML aliases for Alert/Record/Expr

This fixes the use of YAML aliases in alert/recording rule names and
expressions. A side effect of this change is that the RuleNode YAML type is
no longer propagated deeper in the codebase, instead the generic Rule type
can now be used.

Signed-off-by: François HORTA <fhorta@scaleway.com>

* rulefmt: Add test for YAML merge combined with aliases

Currently this does work, but adding a test for the related
functionally here makes sense.

Signed-off-by: David Leadbeater <dgl@dgl.cx>

* rulefmt: Rebase to latest changes

Signed-off-by: David Leadbeater <dgl@dgl.cx>

---------

Signed-off-by: François HORTA <fhorta@scaleway.com>
Signed-off-by: David Leadbeater <dgl@dgl.cx>
Co-authored-by: David Leadbeater <dgl@dgl.cx>
2025-02-13 20:48:33 +11:00
Nicolas Peugnet
898af6102d
promtool: support creating tsdb blocks from a pipe (#16011)
This is very useful when piping the input file to stdin and then using
/dev/stdin as the input file. e.g.

    xzcat dump.xz |
    	promtool tsdb create-blocks-from openmetrics /dev/stdin /tmp/data

Signed-off-by: Nicolas Peugnet <nicolas.peugnet@lip6.fr>
2025-02-13 13:03:31 +11:00
Bartlomiej Plotka
00b69efabb
model/textparse: Change parser interface Metric(...) string to Labels(...) (#16012)
* model/textparse: Change parser interface Metric(...) string to Labels(...)

Simplified the interface given no one is using the return argument.
Renamed for clarity too.

Found and discussed https://github.com/prometheus/prometheus/pull/15731#discussion_r1950916842

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Fixed comments; optimized not needed copy for om and text.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
2025-02-12 15:47:56 +00:00
Jan Fajerski
7f37a008c4
Merge pull request #15540 from mmorel-35/prometheus/common@v0.61.0
chore(deps): use `version.PrometheusUserAgent`
2025-01-28 13:10:48 +01:00
Artur Melanchyk
30112f6ed7
promtool: optimize labels slice allocation
Signed-off-by: Artur Melanchyk <arturmelanchyk@imail.name>
2025-01-24 14:31:59 +01:00
Matthieu MOREL
dd5ab743ea chore(deps): use version.PrometheusUserAgent
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-01-22 07:31:02 +01:00
Giedrius Statkevičius
92218ecb9b promtool: add --ignore-unknown-fields
Add --ignore-unknown-fields that ignores unknown fields in rule group
files. There are lots of tools in the ecosystem that "like" to extend
the rule group file structure but they are currently unreadable by
promtool if there's anything extra. The purpose of this flag is so that
we could use the "vanilla" promtool instead of rolling our own.

Some examples of tools/code:

https://github.com/grafana/mimir/blob/main/pkg/mimirtool/rules/rwrulefmt/rulefmt.go
8898eb3cc5/pkg/rules/rules.go (L18-L25)

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-01-15 11:34:28 +02:00
Arve Knudsen
5df6ea3042
promtool: Support linting of scrape interval (#15719)
* PromTool: Support Scrape Interval Lint Checking

---------

Signed-off-by: zhaowang <zhaowang@apac.freewheel.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: zhaowang <zhaowang@apac.freewheel.com>
2025-01-15 08:45:05 +01:00
sh0rez
5303e515af
remote/otlp: convert delta to cumulative (#15165)
What

Adds support for OTLP delta temporality to the OTLP endpoint.
This is done by calling the deltatocumulative processor from the OpenTelemetry collector during OTLP conversion.

Why

Delta conversion is a naturally stateful process, which requires careful request routing when operated inside a collector.
Prometheus is already stateful and doing the conversion in-server reduces the operational burden on the ingest architecture by only having one stateful component.

How

deltatocumulative is a OTel collector component that works as follows:

* pmetric.Metrics come from a receiver or in this case from the HTTP client
* It operates as an in-place update loop:
    * for each sample, if not delta, leave unmodified
    * if delta, do:
      * state += sample, where state is the in-memory sum of all previous samples
      * sample = state, sample value is now cumulative
    * this is supported for sums (counters), gauges, histograms (old histograms) and exponential histograms (native histograms)
If a series receives no new samples for 5m, its state is removed from memory


Performance

Delta performance is a stateful operation and the OTel code is not highly optimized yet, e.g. it locks the entire processor for each request. Nonetheless, care has been taken to mitigate those effects:

delta conversion is behind a feature flag. If disabled, no conversion code is ever invoked
if enabled, conversion is not invoked if request not actually contains delta samples. This leads to no measureable performance difference between default-cumulative to convert-cumulative (only cumulative, feature on/off)

Signed-off-by: sh0rez <me@shorez.de>
2025-01-14 11:33:31 -03:00
Arve Knudsen
f030894c2c
Fix issues raised by staticcheck (#15722)
Fix issues raised by staticcheck

We are not enabling staticcheck explicitly, though, because it has too many false positives.

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-01-09 17:51:26 +01:00
machine424
9823a93c42
fix(main.go): avoid closing the query engine until it is guaranteed to no longer be in use.
partially reverts https://github.com/prometheus/prometheus/pull/14064

fixes https://github.com/prometheus/prometheus/issues/15232

supersedes https://github.com/prometheus/prometheus/pull/15533

reusing Engine.Close() outside of tests will require more consideration.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-12-30 05:14:44 +01:00
Bartlomiej Plotka
30967330ca
Merge pull request #14755 from prometheus/arthursens/appendct-prwv2
Append CT as zero sample from PRWv2
2024-12-27 12:44:54 +01:00
Bryan Boreham
7b03796d0f
Scraping: stop storing discovered labels (#15261)
Instead of storing discovered labels on every target, recompute them if
required. The `Target` struct now needs to hold some more data required
to recompute them, such as ScrapeConfig.

This moves the load from every Prometheus all of the time, to just when
someone views Service Discovery in the UI.

The way `PopulateLabels` is used changes; you are no longer expected to
call it with a part-populated `labels.Builder`.

The signature of `Target.Labels` changes to take a `labels.Builder`
instead of a `ScratchBuilder`, for consistency with `DiscoveredLabels`.

This will save a lot of work when many targets are filtered out in
relabeling. Combine with `keep_dropped_targets` to avoid ever computing
most labels for dropped targets.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-21 13:33:08 +00:00
Arthur Silva Sens
3b97a6397c
Put PRWv2 created timestamp ingestion behing feature-flag
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-12-20 10:48:46 -03:00
bwplotka
f44bba31b3 config: Remove validation GetScrapeConfigs; require using config.Load.
We should never modify (or even shallow copy) Config after config.Load;
added comments and modified GetScrapeConfigs to do so. For GetScrapeConfigs
the validation (even repeated) was likely doing writes (because global fields was 0). We GetScrapeConfigs concurrently
in tests and ApplyConfig causing test races. In prod there were
races but likelyt only to replace 0 with 0, so not too severe.

I removed validation since I don't see anyone using our config.Config without Load.
I had to refactor one test that was doing it, all others use yaml config.

Fixes #15538
Previous attempt: https://github.com/prometheus/prometheus/pull/15634

Signed-off-by: bwplotka <bwplotka@gmail.com>
2024-12-11 13:31:01 +00:00
Bryan Boreham
dd1d707e1c
Merge pull request #15371 from bboreham/main-naming
[REFACTOR] Small improvement to top-level naming
2024-11-24 17:55:10 +00:00
Ben Ye
40c3ba9fd4
fix promtool analyze block shows metric name with 0 cardinality (#15438)
Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-24 17:30:20 +01:00
Bryan Boreham
3c22b35a03
Merge pull request #15339 from colega/export-go-sync-mutex-wait-total-seconds-total-metric
Export 'go_sync_mutex_wait_total_seconds_total' metric
2024-11-20 13:34:37 +00:00
Björn Rabenstein
384c5951ef
Merge pull request #14489 from harry671003/implement_metadata_limit
storage: Implement limit in mergeGenericQuerier
2024-11-19 17:32:16 +01:00
Ben Kochie
869addfec8
Enable auto-gomaxprocs by default (#15378)
Enable the `auto-gomaxprocs` feature flag by default.
* Add command line flag `--no-auto-gomaxprocs` to disable.

Signed-off-by: SuperQ <superq@gmail.com>
2024-11-12 13:27:01 +01:00
Bryan Boreham
f0bd5fdcc6
Merge pull request #15226 from aniketnk/i15185
Parallelize tests in cmd/promtool/
2024-11-12 11:40:30 +00:00
Julien
e9179cb240
Merge pull request #15011 from roidelapluie/fixreload
Fix auto reload when a config file with a syntax error is reverted
2024-11-12 11:55:21 +01:00
Ben Kochie
0f0deb77a2
Enable auto-gomemlimit by default (#15372)
Enable the `auto-gomemlimit` feature flag by default.
* Add command line flag `--no-auto-gomemlimit` to disable.

Signed-off-by: SuperQ <superq@gmail.com>
2024-11-11 18:26:38 +01:00
Bryan Boreham
de35a40ed5 [REFACTOR] Small improvement to top-level naming
Previously, top-level initialization (including WAL reading) had this at
the top of the stack:

    github.com/oklog/run.(*Group).Run.func1:38

I found this confusing, and thought it had something to do with logging.

Introducing another local function should make this clearer.
Also make the error message user-centric not code-centric.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-11 10:55:05 +00:00
Ben Ye
140f4aa9ae
feat: Allow customizing TSDB postings decoder (#13567)
* allow customizing TSDB postings decoder

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-11 07:59:24 +01:00
Aniket Kaulavkar
650abaccce Parallelize tests in cmd/promtool/
Signed-off-by: Aniket Kaulavkar <aniket.kaulavkar@gmail.com>
2024-11-08 21:54:08 +05:30
🌲 Harry 🌊 John 🏔
f9bc50b247 storage: Implement limit in mergeGenericQuerier
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-11-07 09:08:23 -08:00
Arthur Silva Sens
96c2fe04e7
Merge pull request #15328 from prometheus/deadcode
chore: Remove dead code
2024-11-06 13:20:53 -03:00
Matthieu MOREL
af1a19fc78 enable errorf rule from perfsprint linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-11-06 16:50:36 +01:00
Oleg Zaytsev
50e83e8f98 Export 'go_sync_mutex_wait_total_seconds_total' metric
This metric can be useful to debug mutex contention.

Ref: https://github.com/golang/go/issues/49881
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-05 15:25:55 +01:00
Julien
9d9e86b333 Fix auto reload when a config file with a syntax error is reverted
When we had a syntax error but restored the old file, we did not
re-trigger the config reload, so the config reload metric was showing
that config reload was unsucessful.

I made magic to handle logs in cmd/prometheus.

For now it is a separate file so we can backport this easily.

I will generalize the helper in another PR.

Signed-off-by: Julien <roidelapluie@o11y.eu>
2024-11-05 12:43:34 +01:00
Arthur Silva Sens
88818c9cb3
chore: Remove dead code
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-11-04 13:00:22 -03:00
Alban Hurtaud
4b56af7eb8
Add hidden flag for the delayed compaction random time window (#14919)
* Add hidden flag for the delayed compaction random time window

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Update cmd/prometheus/main.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>

* Update cmd/prometheus/main.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>

* Update tsdb/db.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>

* Fix flag name according to review - add test for delay

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Fix afer main rebase

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Implement review comments

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

* Update generatedelaytest to try with limit values

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>

---------

Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
2024-11-04 08:26:26 +01:00
Vanshika
cccbe72514
TSDB: Fix some edge cases when OOO is enabled (#14710)
Fix some edge cases when OOO is enabled

Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Signed-off-by: Vanshika <102902652+Vanshikav123@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-10-23 17:34:28 +02:00
George Krajcsovits
1b4e7f74e6
feat(tools): add debug printouts to rules unit testing (#15196)
* promtool: Add debug flag for rule tests

This makes it print out the tsdb state (both input_series and rules that
are run) at the end of a test, making reasoning about tests much easier.

Signed-off-by: David Leadbeater <dgl@dgl.cx>

* Reuse generated test name from junit testing

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: David Leadbeater <dgl@dgl.cx>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: David Leadbeater <dgl@dgl.cx>
2024-10-22 15:24:36 +02:00