16360 Commits

Author SHA1 Message Date
Björn Rabenstein
6e88f5171d
Merge pull request #17215 from prometheus/beorn7/histogram
federation: Add NHCB support
2025-09-25 19:31:37 +02:00
beorn7
62eda08a6c web: Add NHCB support to federation
This simply fills the classic buckets of the histogram protobuf with
the content of the custom buckets.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-25 15:54:27 +02:00
Arthur Silva Sens
6471d14602
Add George and Arthur as release Shepherds for 3.7 (#17227)
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-09-25 13:04:32 +00:00
Naman-B-Parlecha
ed67a0cbf1 refactor(histogram): rename types for clarity in histogram conversion tests
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
2025-09-25 17:40:10 +05:30
Naman-B-Parlecha
f71f911040 fix(lint): Changing tests
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
2025-09-25 15:28:25 +05:30
Julien
4199c2f45a
Add anchored and smoothed to vector selectors. (#16457)
* Add anchored and smoothed to vector selectors.

This adds "anchored" and "smoothed" keywords that can be used following a matrix selector.

"Anchored" selects the last point before the range (or the first one after the range) and adds it at the boundary of the matrix selector.

"Smoothed" applies linear interpolation at the edges using the points around the edges. In the absence of a point before or after the edge, the first or the last point is added to the edge, without interpolation.

*Example usage*

* `increase(caddy_http_requests_total[5m] anchored)` (equivalent to *caddy_http_requests_total - caddy_http_requests_total offset 5m*, but takes counter resets into consideration)
* `rate(caddy_http_requests_total[step()] smoothed)`

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

* Update docs/feature_flags.md

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

* Smoothed/Anchored rate: Add more tests

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

* Anchored/Smoothed modifier: error out with histograms

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

---------

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
2025-09-25 11:34:59 +02:00
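The "smoothed" edge behaviour described in 4199c2f45a can be sketched as follows. This is only an illustration of linear interpolation at a range boundary with the stated fallback to the first or last point; it is not the PromQL engine code, it ignores counter resets, and the `sample` type and `smoothedAt` function are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a hypothetical (timestamp, value) pair; timestamps are in ms.
type sample struct {
	T int64
	V float64
}

// smoothedAt estimates the value at boundary timestamp t from samples
// sorted by timestamp. With a point on both sides of t it interpolates
// linearly; with a point on one side only it reuses that point's value.
func smoothedAt(samples []sample, t int64) (float64, bool) {
	if len(samples) == 0 {
		return 0, false
	}
	// i is the index of the first sample at or after t.
	i := sort.Search(len(samples), func(i int) bool { return samples[i].T >= t })
	switch {
	case i == len(samples): // nothing at or after t: use the last point
		return samples[len(samples)-1].V, true
	case samples[i].T == t || i == 0: // exact hit, or nothing before t
		return samples[i].V, true
	}
	prev, next := samples[i-1], samples[i]
	frac := float64(t-prev.T) / float64(next.T-prev.T)
	return prev.V + frac*(next.V-prev.V), true
}

func main() {
	samples := []sample{{T: 0, V: 0}, {T: 10, V: 10}, {T: 20, V: 30}}
	for _, t := range []int64{-5, 5, 15, 25} {
		v, _ := smoothedAt(samples, t)
		fmt.Printf("t=%d v=%g\n", t, v)
	}
}
```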
Naman-B-Parlecha
73904b4c75 refactor(histogram): Converting to Absolute values and fixing the test
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
2025-09-25 03:42:23 +05:30
George Krajcsovits
55a4782eb7
Merge pull request #17214 from prometheus/krajo/native-histogram-schema-wal
Native histograms: ignore invalid schemas from WAL and log
2025-09-24 14:59:18 +02:00
George Krajcsovits
35d9f28c87
Update tsdb/record/record.go
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-09-24 14:27:37 +02:00
György Krajcsovits
30f941c57c
fix(wal): ignore invalid native histogram schemas on load
Reduce the resolution of histograms as needed and ignore invalid
schemas while emitting a warning log.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-24 11:41:25 +02:00
George Krajcsovits
f53782b009
Merge pull request #17213 from prometheus/krajo/native-histogram-schema-reduce
Native histograms: reduce resolution as needed when reading from chunk or remote read
2025-09-24 11:28:35 +02:00
George Krajcsovits
112f91803c
Merge pull request #17189 from prometheus/krajo/native-histogram-schema-validation
fix(nativehistograms): validation should fail on unsupported schemas
2025-09-24 11:27:26 +02:00
Michael Shen
1eaddc64d0
Migrate K8s discovery service queues to use strongly typed queues
Signed-off-by: Michael Shen <mishen@umich.edu>
2025-09-23 20:32:11 -07:00
Michael Shen
9c525b84c4
Add deprecation notice to associated K8s endpoints API objects
Signed-off-by: Michael Shen <mishen@umich.edu>
2025-09-23 20:30:37 -07:00
Michael Shen
1703e54dfd
Update to k8s.io v0.33.5
Signed-off-by: Michael Shen <mishen@umich.edu>
2025-09-23 20:30:36 -07:00
Simon Pasquier
dde7d6ad37
doc: clarify start/end for label API endpoints (#17217)
Because the label API endpoints read from the TSDB indexes, they can
return information for series which are present in the index but have no
samples in the queried interval.

Add similar note for the series endpoint.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2025-09-23 12:03:14 +01:00
György Krajcsovits
a5a6413c1a
better errors naming and formatting, typo fixes
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-23 11:20:55 +02:00
György Krajcsovits
6e42da8904
feat(remote): reduce resolution of native histograms on remote read
If a sample read through remote read has too high resolution,
reduce it to the maximum allowed.

This is a slow path, but we only expect it to happen if the server
side is a newer version that allows higher resolution.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-23 11:20:55 +02:00
György Krajcsovits
b6df8d3274
feat(chunkenc): allow more native histograms schemas
Allow -9..52 schemas instead of just -4..8, but reduce resolution to 8 if
above.

The reduce code path will be slow, but we only expect it to happen if
TSDB already has higher resolution samples and we are in a rollback.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

# Conflicts:
#	model/histogram/generic.go
2025-09-23 11:20:48 +02:00
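To illustrate what reducing resolution means in b6df8d3274, here is a simplified sketch that merges exponential buckets when lowering the schema. It assumes the standard exponential bucket layout, where each step down in schema merges pairs of adjacent buckets, and it represents buckets as a plain index-to-count map rather than the spans and deltas used in the actual encoding. `reduceBuckets` is a hypothetical name, not a function from the repository.

```go
package main

import (
	"fmt"
)

// reduceBuckets merges exponential buckets from schema to the lower
// resolution target schema. Bucket idx covers (base^(idx-1), base^idx],
// so lowering the schema by n steps maps idx to ceil(idx / 2^n), which
// for signed integers is ((idx-1) >> n) + 1.
func reduceBuckets(buckets map[int32]uint64, schema, target int32) (map[int32]uint64, error) {
	if target > schema {
		return nil, fmt.Errorf("cannot increase resolution from schema %d to %d", schema, target)
	}
	shift := schema - target
	out := make(map[int32]uint64, len(buckets))
	for idx, count := range buckets {
		out[((idx-1)>>shift)+1] += count
	}
	return out, nil
}

func main() {
	// Schema 9 is above the maximum of 8 mentioned in the commit message.
	in := map[int32]uint64{-1: 16, 0: 8, 1: 1, 2: 2, 3: 4}
	out, err := reduceBuckets(in, 9, 8)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```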
György Krajcsovits
794c545930
Merge remote-tracking branch 'origin/main' into krajo/native-histogram-schema-validation
2025-09-23 10:51:02 +02:00
Minh Nguyen
d04550a9c4
[RW2] Return 400 error code for wrongly-formatted histograms (#17210)
* return 400 error code

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* fix

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* add more cases

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* format code

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* nit_fixing

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

---------

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-09-23 07:24:46 +02:00
machine424
365409d3be
chore: allow seamless use of testing/synctest for >=go1.24
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-19 22:48:25 +02:00
György Krajcsovits
5b39b79f5a
refactor error creation and tests
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-19 09:26:34 +02:00
György Krajcsovits
b99378f2c4
Merge remote-tracking branch 'origin/main' into krajo/native-histogram-schema-validation
2025-09-19 08:59:00 +02:00
George Krajcsovits
5e6900558a
Apply suggestions from code review
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-09-19 08:58:27 +02:00
beorn7
aac5cc3d99 web: Trim excessive line length in federate.go
Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-19 00:35:12 +02:00
Björn Rabenstein
d5cc5e2738
Merge pull request #17071 from prometheus/beorn7/tsdb
tsdb: Fix commit order for mixed-typed series
2025-09-18 13:55:31 +02:00
George Krajcsovits
95b0d75fbc
Merge pull request #17201 from prometheus/krajo/ignore-duplicate-ct
perf(otlp): reduce logs from OTLP endpoint
2025-09-18 13:37:51 +02:00
Bryan Boreham
c743b2f3cd [PERF] Regex: stop calling Simplify
It slows down compilation and doesn't make any of our benchmarks go faster.
Assumed to be something that helped at an earlier point, but doesn't help now.

Add a benchmark with a more complicated regex to demonstrate the slowdown.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-09-18 11:20:14 +01:00
György Krajcsovits
f0a297bb7c
fix(remote): validate native histogram schema in remote read
When remote read returns chunks, the validation is in tsdb/chunkenc.
However when it returns samples, we need to modify the iterator to
validate.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-18 11:09:45 +02:00
György Krajcsovits
267be7dc20
fix(chunkenc): error out when reading unknown histogram schemas from chunks
Otherwise higher level code like PromQL needs to constantly check if it
can handle the samples.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-18 09:21:03 +02:00
Ayoub Mrini
4917346065
Merge pull request #17203 from machine424/release36
chore: prepare release 3.6.0
v0.306.0 v3.6.0
2025-09-17 21:05:30 +02:00
beorn7
bd0bf66f31 tsdb: Include floatHistograms in headAppender.Rollback()
Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
beorn7
b1fbf4f1e2 tsdb: Refactor staleness marker handling
With the fixed commit order, we can now handle the conversion of float
staleness markers to histogram staleness markers in a more direct way.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
beorn7
385d2800c9 promqltest: Add regression test for mixed-sample commit order
Regression test for:
- https://github.com/prometheus/prometheus/issues/14172
- https://github.com/prometheus/prometheus/issues/15177

Test cases are by @krajorama, taken from commit
b48bc9dc7e2ac553528763297cca73014357d542 .

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
beorn7
7e82bdb75b tsdb: Fix commit order for mixed-typed series
Fixes https://github.com/prometheus/prometheus/issues/15177

The basic idea here is to divide the samples to be committed into (sub)
batches whenever we detect that the same series receives a sample of a
type different from the previous one. We then commit those batches one
after another, and we log them to the WAL one after another, so that
we hit both birds with the same stone. The cost of the stone is that
we have to track the sample type of each series in a map. Given the
amount of things we already track in the appender, I hope that it
won't make a dent. Note that this even addresses the NHCB special case
in the WAL.

This does a few other things that I could not resist picking up along
the way:

- It adds more zeropool.Pools and uses the existing ones more
  consistently. My understanding is that this was merely an oversight.
  Maybe the additional pool usage will compensate for the increased
  memory demand of the map.

- Create the synthetic zero sample for histograms a bit more
  carefully. So far, we created a sample that always went into its own
  chunk. Now we create a sample that is compatible enough with the
  following sample to go into the same chunk. This changed the test
  results quite a bit. But IMHO it makes much more sense now.

- Continuing past efforts, I changed more namings of `Samples` into
  `Floats` to keep things consistent and less confusing. (Histogram
  samples are also samples.) I still avoided changing names in other
  packages.

- I added a few shortcuts `h := a.head`, saving many characters.

TODOs:

- Address @krajorama's TODOs about commit order and staleness handling.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
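The batching idea from 7e82bdb75b can be sketched roughly as below. This is an illustration only, not the head appender code: the `pending` type, the `sampleType` constants, and `splitBatches` are hypothetical, and the sketch assumes that tracking can start afresh with each new batch because a batch only needs to hold one sample type per series.

```go
package main

import (
	"fmt"
)

type sampleType int

const (
	floatSample sampleType = iota
	histogramSample
)

// pending is a hypothetical stand-in for a sample waiting to be committed.
type pending struct {
	SeriesRef uint64
	Type      sampleType
	T         int64
}

// splitBatches starts a new batch whenever a series receives a sample of
// a type different from the one it previously received in the current
// batch. Committing and logging the batches in order then preserves the
// append order across sample types.
func splitBatches(samples []pending) [][]pending {
	var batches [][]pending
	var current []pending
	lastType := map[uint64]sampleType{} // the per-series tracking map
	for _, s := range samples {
		if prev, ok := lastType[s.SeriesRef]; ok && prev != s.Type {
			batches = append(batches, current)
			current = nil
			lastType = map[uint64]sampleType{}
		}
		lastType[s.SeriesRef] = s.Type
		current = append(current, s)
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	batches := splitBatches([]pending{
		{SeriesRef: 1, Type: floatSample, T: 1},
		{SeriesRef: 2, Type: floatSample, T: 1},
		{SeriesRef: 1, Type: histogramSample, T: 2},
		{SeriesRef: 2, Type: histogramSample, T: 2},
		{SeriesRef: 1, Type: floatSample, T: 3},
	})
	for i, b := range batches {
		fmt.Printf("batch %d: %d samples\n", i, len(b))
	}
}
```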
beorn7
46cfc9fb99 tsdb: Extend TestDataNotAvailableAfterRollback
This exposes the omission of float histograms from the rollback.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
Naman-B-Parlecha
5eeba3638d
adding comment for ConvertNHCBToClassicHistogram
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
2025-09-17 15:48:57 +05:30
Naman-B-Parlecha
c8e3f8c97a
drop(flag): moving feature flag to other pr
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
2025-09-17 15:32:16 +05:30
machine424
8462515c75
test(storage/remote/queue_manager_test.go): use synctest in TestShutdown for better
control over time

The test became flaky after it was made to run in parallel
and "fight" for resources

let's hide all of that

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-17 11:20:07 +02:00
Ayoub Mrini
7416f33df5
chore: define golangci-lint version in a single place and bump to v2.4.0 (#17202)
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-17 10:52:09 +02:00
machine424
5af40c2404
chore(workflows/check_release_notes): do not run on dependabot PRs and only run against main
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-17 09:35:59 +02:00
machine424
65b1cd5ae2
chore: prepare release 3.6.0
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-09-17 09:20:59 +02:00
György Krajcsovits
0cf54d7819
perf(otlp): reduce logs from OTLP endpoint
It's not possible to store created timestamp at the same timestamp as
the current sample, so do not even try.

In OpenTelemetry spec, if the start time is unknown, it will be set to
the same timestamp as the first sample.
https://opentelemetry.io/docs/specs/otel/metrics/data-model/#cumulative-streams-handling-unknown-start-time
This means that we will get a lot of "duplicate sample for timestamp"
errors, and we should not log those.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-17 08:50:43 +02:00
George Krajcsovits
ccfda912e3
Merge pull request #17015 from Garbett1/update-fsnotify
chore: update fsnotify
2025-09-16 14:02:08 +02:00
Andrew Hall
aa922ce3b6
Added support for string literals and range results for instant queries in test scripting framework (#17055)
Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-09-16 12:28:19 +01:00
Bryan Boreham
26279e5b6d
Merge pull request #17066 from cuiweixie/reflect.TypeFor-discovery
discovery: refactor to use reflect.TypeFor

Use a neater form, introduced in Go 1.22.
2025-09-16 12:22:14 +01:00
Bryan Boreham
0a3c64631c
Merge pull request #17195 from dancer1325/docs/fix_gettingstarted_outdated_graph_references
docs(): fix gettingStarted outdated graph reference
2025-09-16 12:11:35 +01:00
dancer1325
a14faab435 docs(): fix gettingStarted outdated graph reference
/graph does NOT exist anymore in the new React app. It has been refactored within /query

Signed-off-by: dancer1325 <alfredotic0809@gmail.com>
2025-09-15 17:31:18 +02:00
György Krajcsovits
bdf547ae9c
fix(nativehistograms): validation should fail on unsupported schemas
Histogram.Validate and FloatHistogram.Validate now return error on
unsupported schemas.

The scrape and remote-write handlers reduce the schema to the maximum
allowed if it is above that maximum but not above the theoretical maximum
of 52. For scrape, the maximum is a configuration option; for remote-write,
it is 8.

Note: the OTLP endpoint already does the reduction, without checking that
the schema is below 52, as the spec does not specify a maximum.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-13 16:54:44 +02:00