Commit Graph

15684 Commits

Author SHA1 Message Date
tongjicoder
4fe20fa340 chore: fix some comments
Signed-off-by: tongjicoder <tongjicoder@icloud.com>
2025-05-27 23:14:41 +08:00
Ayoub Mrini
317acb3d68
refactor: use the built-in max/min to simplify the code (#16617)
Signed-off-by: carrychair <linghuchong404@gmail.com>
2025-05-27 14:42:50 +02:00
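For readers unfamiliar with the refactor named in the commit above, the following is a minimal sketch of the kind of simplification the built-in `min` and `max` (available since Go 1.21) allow. The function names `clampOld` and `clampNew` are hypothetical and are not taken from the Prometheus code base.

```go
package main

import "fmt"

// clampOld shows the pre-Go 1.21 style with explicit comparisons.
func clampOld(v, lo, hi int) int {
	if v < lo {
		v = lo
	}
	if v > hi {
		v = hi
	}
	return v
}

// clampNew expresses the same logic with the built-in min and max.
// It assumes lo <= hi, as clampOld implicitly does.
func clampNew(v, lo, hi int) int {
	return min(max(v, lo), hi)
}

func main() {
	for _, v := range []int{-5, 5, 12} {
		fmt.Println(v, clampOld(v, 0, 10), clampNew(v, 0, 10))
	}
}
```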
Zhengke Zhou
45211dc72f
chore: Adjust test and add comment about DNS resolution issue for failing tests (#16200)
* chore: Add comment about DNS resolution issue for failing tests

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

* remove unexported-return

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

---------

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>
2025-05-27 14:40:09 +02:00
Ayoub Mrini
44f78bb3c8
Merge pull request #16623 from machine424/reprep
fix: add reproducer for a dangling-reference issue in parsers and fix
2025-05-27 05:24:48 +02:00
Subhramit Basu
44e27a876e
Add parse alerting for rules files (#16601)
Builds over https://github.com/prometheus/prometheus/pull/16462
Addresses comments, adds invalid rules file

Signed-off-by: subhramit <subhramit.bb@live.in>
Co-authored-by: marcodebba <marcodebonis74@gmail.com>
2025-05-26 18:29:34 +02:00
Antonio Jimenez
2834a665ed
Add support for promoting all OTel resource attributes (#16426)
Add support for promoting all OTel resource attributes via `promote_all_resource_attributes`,
except for those ignored using `ignore_resource_attributes`.

---------

Signed-off-by: Antonio Jimenez <antonjim@thousandEyes.com>
Signed-off-by: Antonio Jimenez <123171955+antonjim-te@users.noreply.github.com>
2025-05-26 18:15:01 +02:00
Joe Harvey
79c9e9348f
ci: address zizmor gh action vulnerabilities (#16530)
* ci: address zizmor gh action vulnerabilities

---------

Signed-off-by: Joe Harvey <51208233+jharvey10@users.noreply.github.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-26 15:38:09 +00:00
bragi92
14fc57e4cf
remote_write azure auth: allow empty client_id to support system assigned managed identity (#16421)
* squash (#1)

* remote-write: allow empty azure client_id to support system assigned managed identity

* add blank line for tests

* remote-write: allow empty azure client_id to support system assigned managed identity

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* add blank line for tests

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

---------

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* treat empty client_id as system-assigned identity; this is a valid case

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>

* rename file 

Signed-off-by: bragi92 <kadubey@microsoft.com>

---------

Signed-off-by: Kaveesh Dubey <kadubey@microsoft.com>
Signed-off-by: bragi92 <kadubey@microsoft.com>
2025-05-24 15:01:49 +02:00
machine424
50a6efd5ec
fix(model/textparse): Labels(): copy the input to avoid dangling references
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-05-23 11:03:48 +02:00
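The general class of bug named in the commit above can be illustrated with a small sketch. It assumes a parser that reuses one internal buffer between entries; the `parser` type and its methods are hypothetical and much simpler than the real `model/textparse` parsers. A value that aliases the buffer changes when the next entry is read, while a copy stays stable.

```go
package main

import "fmt"

// parser is a hypothetical stand-in that reuses one buffer between entries.
type parser struct {
	buf []byte
}

// next loads the next entry into the reused buffer, overwriting the old one.
func (p *parser) next(entry string) {
	p.buf = append(p.buf[:0], entry...)
}

// nameAliased returns a slice that still points into p.buf.
func (p *parser) nameAliased() []byte {
	return p.buf
}

// nameCopied returns an independent copy that survives later next calls.
func (p *parser) nameCopied() []byte {
	return append([]byte(nil), p.buf...)
}

func main() {
	p := &parser{}
	p.next("metric_a")
	aliased := p.nameAliased()
	copied := p.nameCopied()

	// Reading the next entry overwrites the shared buffer in place.
	p.next("metric_b")
	fmt.Printf("aliased=%s copied=%s\n", aliased, copied)
}
```

The copy costs an allocation per call, which is the usual trade-off for a result that remains valid after the parser advances.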
machine424
2bfbd8a714
fix: add reproducer for a dangling-reference issue in parsers
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-05-22 17:22:51 +02:00
Julien
1d9dfde989
Merge pull request #16041 from prymitive/parenExprEnd
Add a test for aggregation wrapped in ParenExpr
2025-05-22 09:32:17 +02:00
machine424
690c9da817 chore(config): add guidelines for adding a new RuntimeConfig field based on learnings from GoGC addition
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-05-21 16:16:05 +02:00
machine424
e809ccb90a test(cmd/main/TestRuntimeGOGCConfig): add checks on reloads as well
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-05-21 16:16:05 +02:00
Ayoub Mrini
2edc3ed6c5
feat(tsdb): introduce --use-uncached-io feature flag and allow using it for chunks writing (#15365)
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Signed-off-by: Ayoub Mrini <ayoubmrini424@gmail.com>
2025-05-21 14:42:30 +02:00
Ryan Wu
091e662f4d
refactor(endpointslice): use service cache.Indexer to achieve better iteration performance (#16365)
* refactor(endpointslice): use cache.Indexer to index endpointslices by LabelServiceName, so we do not have to iterate over all endpoint objects.

Signed-off-by: Ryan Wu <rongjun0821@gmail.com>

* check the type and error early and add 'TestEndpointSliceDiscoveryWithUnrelatedServiceUpdate' unit test to give a regression test

Signed-off-by: Ryan Wu <rongjun0821@gmail.com>

* make service indexer namespaced

Signed-off-by: Ryan Wu <rongjun0821@gmail.com>

* remove unneeded test func

Signed-off-by: Ryan Wu <rongjun0821@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Ryan Wu <rongjun0821@gmail.com>

---------

Signed-off-by: Ryan Wu <rongjun0821@gmail.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
2025-05-20 20:33:25 +02:00
George Krajcsovits
eb940d9c3b
Merge pull request #16565 from prometheus/krajo/intern-custom-values
perf(chunkenc): intern custom values for native histograms
2025-05-20 09:13:12 +02:00
carrychair
e83dc66bdb refactor: use the built-in max/min to simplify the code
Signed-off-by: carrychair <linghuchong404@gmail.com>
2025-05-20 14:36:39 +08:00
György Krajcsovits
772d5ab433
Merge branch 'main' into krajo/intern-custom-values
2025-05-20 08:23:15 +02:00
Julius Volz
6c930e8506
Merge pull request #16605 from prometheus/rules-page-filters
Add health & text filtering on the /rules page
2025-05-19 23:22:25 +02:00
Ayoub Mrini
dd9e18c831
Merge pull request #16008 from jub0bs/cors
util/httputil: Always add Vary header in SetCORS (fixes #15406)
2025-05-19 21:53:11 +02:00
Ayoub Mrini
f2fac45eaf
Merge pull request #16563 from zepellin/patch-1
docs: Fix `metric_name_escaping_scheme` config parameter
2025-05-19 16:31:57 +02:00
jub0bs
4bc8df0f54
util/httputil: Always add Vary header in SetCORS
Closes #15406

Signed-off-by: jub0bs <jcretel-infosec+github@protonmail.com>
2025-05-19 11:42:44 +02:00
Ayoub Mrini
eb8d34c2ad
Merge pull request #16587 from prymitive/discoveryLocks
discovery: Try fixing potential deadlocks in discovery
2025-05-19 11:09:49 +02:00
Ayoub Mrini
4b7321c8e8
Merge pull request #16607 from marcoderama/patch-3
docs: fix typo in operators.md
2025-05-19 10:55:42 +02:00
Bartlomiej Plotka
8e6b008608
feature: type-and-unit-labels (PROM-39 implementation) (#16228)
* feature: type-and-unit-labels (extended MetricIdentity)

Experimental implementation of https://github.com/prometheus/proposals/pull/39

Previous (unmerged) experiments:
* https://github.com/prometheus/prometheus/compare/main...dashpole:prometheus:type_and_unit_labels
* https://github.com/prometheus/prometheus/pull/16025

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Fix compilation errors

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

Lint

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

Revert change made to protobuf 'Accept' header

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

Fix compilation errors for 'dedupelabels' tag

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

* Refactored into schema.Metadata

Signed-off-by: bwplotka <bwplotka@gmail.com>

* textparse: Added tests for PromParse

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add OM tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add proto tests

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add schema label tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* fix tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add promql tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* lint

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-05-17 09:37:25 +00:00
Arthur Silva Sens
5a98246f50
Fix flakiness in TestOTLPWriteHandler (#16608)
2025-05-17 05:35:26 -03:00
Jan-Otto Kröpke
84a3acaf1b
Merge release-3.4 into main (#16609)
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
2025-05-17 10:04:22 +02:00
marcoderama
778d49bbfb
docs: fix typo in operators.md
Signed-off-by: marcoderama <marcoderamagit@gmail.com>
2025-05-16 12:03:24 -07:00
Julius Volz
69ed0a5794 Add health & text filtering on the /rules page
Addresses part of https://github.com/prometheus/prometheus/issues/16515

For now, I'm adding very similar filtering to the /rules page as we have on
the /alerts page, with the difference being:

* The state filter filters by rule health (ok/warn/unknown) instead of
  alert state (firing/pending/inactive)
* We don't collect & show detailed stats on the different state counts as
  we do on the /alerts page

There is a lot of copied / very similar code between those two pages (and
also some others) around filtering and pagination, so maybe there is an
opportunity for more code sharing in the future here.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2025-05-16 17:50:46 +02:00
Julius Volz
77224a6ef1
Merge pull request #16604 from prometheus/fix-dropped-target-counts
SD UI: Better total target count display when using `keep_dropped_targets` option
2025-05-16 14:17:18 +02:00
Julius Volz
5f1c6226e2 SD UI: Better total target count display when using keep_dropped_targets option
Fixes https://github.com/prometheus/prometheus/issues/16586

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2025-05-16 11:07:55 +02:00
Ben Kochie
1eaf12e99b
Add golangci-lint fmt (#16602)
With golangci-lint v2, it now has "formatters" that can be configured.
Add `golangci-lint fmt` to the `make format` in Makefile.common.
* Enable goimports formatter.

Signed-off-by: SuperQ <superq@gmail.com>
2025-05-16 11:05:35 +02:00
Ben Kochie
3eb44003c6
Fixup make proto (#16603)
Use `common-` prefix for `make proto` so downstream projects like
client_golang can implement their own `make proto`.

Signed-off-by: SuperQ <superq@gmail.com>
2025-05-16 09:03:07 +00:00
Lukasz Mierzwa
59761f631b Move m.targetsMtx.Lock down into the loop
Make sure the order of locks is always the same in all functions. In ApplyConfig() we have m.targetsMtx.Lock() after provider is locked, so replicate the same in allGroups().

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-05-15 12:30:48 +01:00
Ayoub Mrini
2690761582
Merge pull request #16558 from wbh1/fix/GOGC-ENV-VAR
fix(config): respect GOGC environment variable if no "runtime" block exists
2025-05-13 23:34:08 +02:00
Julius Volz
df1b4da348
Merge pull request #16597 from prometheus/cleanup-docs-code-and-headers
Clean up codeboxes and headings in docs
2025-05-13 17:05:14 +02:00
Julius Volz
1b818b03d5 Clean up codeboxes and headings in docs
The new docs site will have syntax highlighting, so this adds language tags
to code boxes that are currently missing them. I didn't add `promql` as a
language yet since the highlighter doesn't support it yet, plus a lot of
the PromQL codeboxes in our docs aren't strictly valid PromQL, they are
more like multiple expressions listed in the same code box on multiple
lines. So I'm leaving that for sometime later.

In the HTTP API page, I moved the curl examples from the JSON codeboxes to
their own ones above the JSON output. I considered putting an "Output:"
text between the curl + JSON output, but I think the way it currently looks
without it is probably fine.

I also fixed a number of headings which were at the wrong level relative to
their nesting in the document.

I also removed `go` as a language from the Go template language examples,
because the Go template language isn't Go at all.

I also adjusted the indent on one codebox to be more reasonable (2 spaces
instead of 8).

And then finally, my editor made a bunch of whitespace changes
automatically, like removing trailing spaces.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2025-05-13 15:38:29 +02:00
Will Hegedus
5d94ad9ae0 test(cmd): enable test cases for GOGC environment variable
Signed-off-by: Will Hegedus <whegedus@akamai.com>
2025-05-12 07:49:40 -04:00
Will Hegedus
33578fedb3 fix(config): respect GOGC environment variable if no "runtime" block exists
Fixes: https://github.com/prometheus/prometheus/issues/16334
Related to:
- https://github.com/prometheus/prometheus/pull/15238
- https://github.com/prometheus/prometheus/pull/16052

Currently, when the GOGC environment variable is set -- and no `runtime`
block is set in the Prometheus config file -- it is ignored and the
default value of 75% is always used.

However, if there is an empty runtime block (e.g. `runtime: {}`), _then_
the GOGC environment variable is checked.

This PR changes this behavior to consistently check and use the GOGC
environment variable when it is set (unless the `gogc` field is set in
the `runtime` block of the loaded config file, in which case it still
gives that precedence).

Co-authored-by: Adam Rambo <arambo@protonmail.com>
Signed-off-by: Will Hegedus <whegedus@akamai.com>
2025-05-12 07:37:20 -04:00
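The precedence described in the commit message above can be sketched as follows. This is a simplified illustration: `resolveGoGC` is a hypothetical helper, the config value is modelled as a nil-able pointer, and non-numeric `GOGC` values such as `off` simply fall through to the default here, which may differ from what Prometheus or the Go runtime actually does.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultGoGC is the default percentage named in the commit message.
const defaultGoGC = 75

// resolveGoGC is a hypothetical helper. configGoGC is nil when the loaded
// config file does not set gogc; envGoGC is the raw GOGC environment value.
func resolveGoGC(configGoGC *int, envGoGC string) int {
	// 1. An explicit gogc field in the runtime block wins.
	if configGoGC != nil {
		return *configGoGC
	}
	// 2. Otherwise use the GOGC environment variable when it parses.
	if v, err := strconv.Atoi(envGoGC); err == nil {
		return v
	}
	// 3. Otherwise fall back to the default.
	return defaultGoGC
}

func main() {
	fromConfig := 50
	fmt.Println(resolveGoGC(&fromConfig, os.Getenv("GOGC")))
	fmt.Println(resolveGoGC(nil, os.Getenv("GOGC")))
}
```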
Julius Volz
dbf5d01a62
Fix full-page re-rendering when opening status nav menu (#16590)
When opening the status pages menu while already viewing one of the
status pages, the whole page would be re-rendered because the menu target's
default action of following the current page's URL was not prevented. Also,
we don't need to use a NavLink component for the menu target when we are
not viewing a status page, because then the component won't need to be
highlighted anyway.

Discovered + fixed with the help of react-scan.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2025-05-12 12:17:18 +02:00
Julius Volz
5c06804df8
Optimize memoization and search debouncing on /targets page (#16589)
Moving the debouncing of the search field to the parent component and then
memoizing the ScrapePoolsList component prevents a lot of superfluous
re-renders of the entire scrape pools list that previously got triggered
immediately when you typed in the search box or even just collapsed a pool.
(While the computation of what data to show was already memoized in the
ScrapePoolList component, the component itself still had to re-render a lot
with the same data.)

Discovered this problem + verified fix using react-scan.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2025-05-12 10:39:58 +02:00
Lukasz Mierzwa
7d55ee8cc8 Try fixing potential deadlocks in discovery
Manager.ApplyConfig() uses multiple locks:
- Provider.mu
- Manager.targetsMtx

Manager.cleaner() uses the same locks but in the opposite order:
- First it locks Manager.targetsMtx
- Then it locks Provider.mu

I've seen a few strange cases of Prometheus hanging on shutdown and never completing that shutdown.
From a few traces I was given, it appears that while Prometheus is still running, only discovery.Manager and notifier.Manager are running.
From that trace it also seems like they are stuck on a lock from two functions:
- cleaner waits on a RLock()
- ApplyConfig waits on a Lock()

I cannot reproduce it but I suspect this is a race between locks. Imagine this scenario:
- Manager.ApplyConfig() is called
- Manager.ApplyConfig locks Provider.mu.Lock()
- at the same time cleaner() is called on the same Provider instance and it calls Manager.targetsMtx.Lock()
- Manager.ApplyConfig() now calls Manager.targetsMtx.Lock() but that lock is already held by cleaner() function so ApplyConfig() hangs there
- at the same time cleaner() now wants to lock Provider.mu.RLock() but that lock is already held by Manager.ApplyConfig()
- we end up with both functions locking each other out without any way to break that lock

Re-order lock calls to try to avoid this scenario.
I tried writing a test case for it but couldn't hit this issue.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-05-12 09:13:46 +01:00
Neeraj Gartia
8b0d33e5b2
promql: support variable scalar parameter in aggregations in range queries (#16404)
This fixes the regression introduced in https://github.com/prometheus/prometheus/issues/15971 while preserving the performance improvements.

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2025-05-11 15:40:31 +02:00
Neeraj Gartia
591242901a
promql: Refactor some functions to make them more DRY (#16532)
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2025-05-11 15:16:15 +02:00
hardlydearly
ba4b058b7a refactor: use slices.Contains to simplify code
Signed-off-by: hardlydearly <799511800@qq.com>
2025-05-09 08:27:10 +02:00
Julien
59874fd89c
Merge pull request #16478 from KofClubs/range-vector-1001ms
promql: function selector sometimes misses a sample on dense samples
2025-05-08 12:02:10 +02:00
George Krajcsovits
f0471ff74b
Merge pull request #16552 from charleskorn/charleskorn/fix-16551
promql: don't emit a value from `histogram_fraction` or `histogram_quantile` if classic and native histograms are present at the same timestamp
2025-05-08 09:28:14 +02:00
György Krajcsovits
6c646657d5
perf(chunkenc): intern the custom values for native histograms
The custom values are the "le" bucket boundaries of native histograms
with custom buckets. They are never modified. It is ok to not copy them
when iterating a chunk, just reference them.

If we ever have a function that modifies the custom values, such as
'trim', that function will have to make a copy on write.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-05-07 14:40:45 +02:00
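The idea in the commit message above, sharing read-only bucket boundaries and copying only when something needs to modify them, can be sketched as follows. The `histogram` type and the `at` and `clampUpper` functions are hypothetical simplifications; they are not the chunkenc or histogram APIs.

```go
package main

import "fmt"

// histogram is a hypothetical, simplified stand-in.
type histogram struct {
	// customValues holds the "le" bucket boundaries. The slice is shared
	// between histograms and must be treated as read-only.
	customValues []float64
}

// at returns a histogram that references the shared boundaries
// instead of copying them.
func at(shared []float64) histogram {
	return histogram{customValues: shared}
}

// clampUpper is a hypothetical modifying function. It copies the
// boundaries before writing, so other holders of the slice are unaffected.
func clampUpper(h histogram, limit float64) histogram {
	cp := append([]float64(nil), h.customValues...)
	for i, v := range cp {
		cp[i] = min(v, limit)
	}
	return histogram{customValues: cp}
}

func main() {
	shared := []float64{1, 5, 10}
	a, b := at(shared), at(shared)
	fmt.Println(&a.customValues[0] == &b.customValues[0]) // both reference the same backing array

	c := clampUpper(a, 5)
	fmt.Println(shared, c.customValues)
}
```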
György Krajcsovits
6dc6785473
chore(engine): add simple NHCB benchmark
Copy the benchmark for native histograms with exponential buckets and
adapt it to native histograms with custom buckets.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-05-07 13:54:34 +02:00
Martin Danko
148968e399
docs: Fix metric_name_escaping_scheme config parameter
This is a fix for the documentation update made in https://github.com/prometheus/prometheus/pull/16066, setting the correct name for a configuration value.

Signed-off-by: Martin Danko <46035688+zepellin@users.noreply.github.com>
2025-05-07 12:41:49 +02:00