1435 Commits

Author SHA1 Message Date
geogrego
58dbe927d5 docs: minor improvement for docs
Signed-off-by: geogrego <geogrego@outlook.com>
2025-10-29 14:42:14 +08:00
Fiona Liao
b004db49af
Reduce samples for TestRuntimeRetentionConfigChange (#17422)
* Reduce samples for TestRuntimeRetentionConfigChange

---------

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
2025-10-28 18:23:32 +01:00
Arve Knudsen
df8a9076b9
tsdb: Reduce TestHeadSeriesChunkRace number of iterations to 100 (#17410)
Reduce tsdb.TestHeadSeriesChunkRace number of iterations from 1000 to
100, to stop this test from timing out under CI.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-10-28 13:57:20 +01:00
Minh Nguyen
ad4b59c504
tsdb: Deprecate retention flags; add tsdb.retention runtime configuration (#17026)
* Move storage from CL to config file

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* Fix .md

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* run make cli-documentation

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* fix

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* run make cli-documentation

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* nit_fixed

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* fix

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* add test and update configuration.md

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

* fix lint

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>

---------

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-10-27 14:51:33 +00:00
György Krajcsovits
18efd9d629
feat(ui): mark native histograms as stable in ui strings
Plus some docstrings

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-10-24 12:32:15 +02:00
Ayoub Mrini
504587c724
chore(direct_io): fix constructor's name (#17371)
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-10-23 11:35:16 +02:00
beorn7
ad7d1aed99 Phase out native histogram feature flag
The detailed plan for this is laid out in
https://github.com/prometheus/prometheus/issues/16572 .

This commit adds a global and local scrape config option
`scrape_native_histograms`, which has to be set to true to ingest
native histograms.

To ease the transition, the feature flag is changed to simply set the
default of `scrape_native_histograms` to true.

Further implications:

- The default scrape protocols now depend on the
  `scrape_native_histograms` setting.
- Everywhere else, histograms are now "on by default".

Documentation beyond the one for the feature flag and the scrape
config are deliberately left out. See
https://github.com/prometheus/prometheus/pull/17232 for that.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-10-15 14:50:52 +02:00
Björn Rabenstein
1caac94026
Merge pull request #17302 from prometheus/release-3.7
Merge release-3.7 back into main.
2025-10-07 18:40:08 +02:00
beorn7
e2aed2cd27 tsdb: Disable more tests on MS Windows
Signed-off-by: beorn7 <beorn@grafana.com>
2025-10-07 16:34:59 +02:00
Björn Rabenstein
68e4d4e5eb
Merge pull request #17298 from prometheus/release-3.7
Merging back release-3.7 branch into master
2025-10-07 16:23:36 +02:00
Björn Rabenstein
f2fc492473
Merge pull request #17284 from linasm/custom-bucket-bounds-match-fn
NHCB: Separate CustomBucketBoundsMatch from FloatBucketsMatch
2025-10-07 15:38:59 +02:00
beorn7
51c8e55835 tsdb: Do not track stFloat in typesInBatch explicitly
Signed-off-by: beorn7 <beorn@grafana.com>
2025-10-07 15:01:22 +02:00
beorn7
5f582a7e1f tsdb: Remove leftover debug fmt.Println
Signed-off-by: beorn7 <beorn@grafana.com>
2025-10-07 14:58:25 +02:00
Bartlomiej Plotka
a4da440dad
fix: Fix slicelabels corruption when used with proto decoding (#17150)
* fix: Fix slicelabels corruption when used with proto decoding

Alternative to https://github.com/prometheus/prometheus/pull/16957/

Signed-off-by: bwplotka <bwplotka@gmail.com>

* addressed comments

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
2025-10-07 12:06:48 +01:00
György Krajcsovits
d11ee103ac
perf(tsdb): reuse map of sample types to speed up head appender
While investigating +10% CPU in v3.7 release, found that ~5% is from
expanding the types map. Try reuse.

Also fix a linter error.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-10-07 08:31:20 +02:00
György Krajcsovits
c26a5390aa
perf(tsdb): reuse map of sample types to speed up head appender
While investigating +10% CPU in v3.7 release, found that ~5% is from
expanding the types map. Try reuse.

Also fix a linter error.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-10-06 21:44:34 +02:00
Linas Medziunas
8caf1f1c41 [NHCB] Separate CustomBucketBoundsMatch from floatBucketsMatch
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2025-10-05 22:38:07 +03:00
Patryk Prus
dc3e6af91a
tsdb: Fix appended sample count metrics when converting float staleness markers to histograms (#17241)
tsdb: Fix appended sample count metrics when converting histogram staleness markers

Signed-off-by: Patryk Prus <p@trykpr.us>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2025-09-30 16:49:54 +00:00
George Krajcsovits
35d9f28c87
Update tsdb/record/record.go
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-09-24 14:27:37 +02:00
György Krajcsovits
30f941c57c
fix(wal): ignore invalid native histogram schemas on load
Reduce the resolution of histograms as needed and ignore invalid
schemas while emitting a warning log.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-24 11:41:25 +02:00
György Krajcsovits
a5a6413c1a
better errors naming and formatting, typo fixes
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-23 11:20:55 +02:00
György Krajcsovits
b6df8d3274
feat(chunkenc): allow more native histograms schemas
Allow -9..52 schemas instead of just -4..8, but reduce resolution to 8 if
above.

The reduce code path will be slow, but we only expect it to happen if
TSDB already has higher resolution samples and we are in a rollback.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

# Conflicts:
#	model/histogram/generic.go
2025-09-23 11:20:48 +02:00
György Krajcsovits
5b39b79f5a
refactor error creation and tests
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-19 09:26:34 +02:00
György Krajcsovits
b99378f2c4
Merge remote-tracking branch 'origin/main' into krajo/native-histogram-schema-validation 2025-09-19 08:59:00 +02:00
György Krajcsovits
267be7dc20
fix(chunkenc): error out when reading unknown histogram schemas from chunks
Otherwise higher level code like PromQL needs to constantly check if it
can handle the samples.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-18 09:21:03 +02:00
beorn7
bd0bf66f31 tsdb: Include floatHistograms in headAppender.Rollback()
Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
beorn7
b1fbf4f1e2 tsdb: Refactor staleness marker handling
With the fixed commit order, we can now handle the conversion of float
staleness markers to histogram staleness markers in a more direct way.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
beorn7
7e82bdb75b tsdb: Fix commit order for mixed-typed series
Fixes https://github.com/prometheus/prometheus/issues/15177

The basic idea here is to divide the samples to be commited into (sub)
batches whenever we detect that the same series receives a sample of a
type different from the previous one. We then commit those batches one
after another, and we log them to the WAL one after another, so that
we hit both birds with the same stone. The cost of the stone is that
we have to track the sample type of each series in a map. Given the
amount of things we already track in the appender, I hope that it
won't make a dent. Note that this even addresses the NHCB special case
in the WAL.

This does a few other things that I could not resist to pick up on the
go:

- It adds more zeropool.Pools and uses the existing ones more
  consistently. My understanding is that this was merely an oversight.
  Maybe the additional pool usage will compensate for the increased
  memory demand of the map.

- Create the synthetic zero sample for histograms a bit more
  carefully. So far, we created a sample that always went into its own
  chunk. Now we create a sample that is compatible enough with the
  following sample to go into the same chunk. This changed the test
  results quite a bit. But IMHO it makes much more sense now.

- Continuing past efforts, I changed more namings of `Samples` into
  `Floats` to keep things consistent and less confusing. (Histogram
  samples are also samples.) I still avoided changing names in other
  packages.

- I added a few shortcuts `h := a.head`, saving many characters.

TODOs:

- Address @krajorama's TODOs about commit order and staleness handling.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
beorn7
46cfc9fb99 tsdb: Extend TestDataNotAvailableAfterRollback
This exposes the ommission of float histograms from the rollback.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-09-17 19:22:25 +02:00
Bryan Boreham
11c49151b7
[REFACTOR] TSDB chunks: replace magic numbers with constants (#17095)
For size of header and position of flags byte.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-09-02 16:05:21 +01:00
Bryan Boreham
aa12c0d4c3
Merge pull request #17074 from prymitive/logs
TSDB: Log when GC / block write starts
2025-09-02 12:55:12 +01:00
Bryan Boreham
8e133e100f
Merge pull request #17081 from prometheus/superq/if_err_nil
tsdb: Fixup err nil checks
2025-09-02 12:37:51 +01:00
bwplotka
794bf774c2 Reapply "prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)"
This reverts commit f5fab4757733746a708e7b80324b8929c1b84856.
2025-08-29 08:16:37 +01:00
bwplotka
f5fab47577 Revert "prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)"
This reverts commit c808a71e18d4d1cc91e1d06859ebeae818465324.
2025-08-29 08:15:28 +01:00
Jonathan
c808a71e18
prw: use Unit and Type labels for metadata when feature flag is enabled (#17033)
* chore: send Unit and Type when feature flag is enabled

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused code and comments

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unreal scenario

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused if

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused labels

Signed-off-by: perebaj <perebaj@gmail.com>

* linter

Signed-off-by: perebaj <perebaj@gmail.com>

* enable type and unit through remotewrite config

Signed-off-by: perebaj <perebaj@gmail.com>

* remove test comment and capture type and unit when flag is enabled

Signed-off-by: perebaj <perebaj@gmail.com>

* gofumpt

Signed-off-by: perebaj <perebaj@gmail.com>

* modelTypeToWriteV2Type

Signed-off-by: perebaj <perebaj@gmail.com>

* use NewMetadataFromLabels

Signed-off-by: perebaj <perebaj@gmail.com>

* capture feature flag from main

Signed-off-by: perebaj <perebaj@gmail.com>

* simplifying logic

Signed-off-by: perebaj <perebaj@gmail.com>

* remove unused function

Signed-off-by: perebaj <perebaj@gmail.com>

* formatting code

Signed-off-by: perebaj <perebaj@gmail.com>

* gofumpt

Signed-off-by: perebaj <perebaj@gmail.com>

* remove public var: EnableTypeAndUnitLabels

Signed-off-by: perebaj <perebaj@gmail.com>

* remove enableTypeAndUnitLabels from TestPopulateV2TimeSeries_typeAndUnitLabels

Signed-off-by: perebaj <perebaj@gmail.com>

* remove enableTypeAndUnitLabels from main

Signed-off-by: perebaj <perebaj@gmail.com>

* use schema helper to populate metadata

Signed-off-by: perebaj <perebaj@gmail.com>

* remove metadata since nil is the default value

Signed-off-by: perebaj <perebaj@gmail.com>

* add TestPopulateV2TimeSeries_UnexpectedMetadata

Signed-off-by: perebaj <perebaj@gmail.com>

* Update storage/remote/queue_manager_test.go

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

---------

Signed-off-by: perebaj <perebaj@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2025-08-29 04:10:01 +00:00
Björn Rabenstein
ba808d1736
Merge pull request #17092 from prometheus/beorn7/cleanup
Apply analyzer modernize to the whole codebase
2025-08-28 00:42:33 +02:00
Bryan Boreham
2fb50b12cd
[PERF] TSDB: Optimize appender creation on empty chunks (#16922)
Skip creating an iterator and walking all through any existing values,
when we can easily tell there are no existing values.

This is the normal case - the TSDB head creates an appender immediately
after creating every chunk.

Remove redundant handling of empty chunks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-27 17:11:08 +01:00
beorn7
23f1d3ba25 tsdb: Remove unused Layout() methods
Both `HistogramChunk` and `FloatHistogramChunk` have a `Layout()`
method for historical reasons. As it has turned out, these methods are
unused and also buggy. This commit simply removes them.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 17:01:23 +02:00
beorn7
71c21fb9e4 Fix minor issues after applying analyzer "modernize"
- The tool left an empty line behind that we don't need anymore, see
  https://github.com/prometheus/prometheus/pull/17092. (Arguably not a
  bug in the tool but just our stricter style about empty lines.)

- In tsdb/index/postings_test.go , our (admittedly somewhat
  convoluted) code structure tricked the tool so it spit out something
  that wouldn't even compile.

- storage/remote/queue_manager_test.go is just a minor formatting
  nit.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 15:44:11 +02:00
beorn7
747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Lukasz Mierzwa
31282d67b7 Log when GC / block write starts
Right now Prometheus only logs when these operations are completed.
It's a bit surprising to see suddenly a message saying "I was busy doing X for the past N minutes"
so let's add a message when the operation starts, so it's easier to understand what Prometheus was doing at any point in time
when reading logs.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-08-26 10:30:22 +01:00
SuperQ
b87cbf0294
Fixup err nil checks
Cleanup double `if` statements for errors being nil / not-nil.

Signed-off-by: SuperQ <superq@gmail.com>
2025-08-25 17:37:02 +02:00
Minh Nguyen
c8deefb038
[tsdb] Add CounterResetHint: CounterReset to synthetic zero sample (#17011)
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-08-21 23:26:01 +02:00
Bryan Boreham
498f63e60b
Merge pull request #17029 from pr00se/wal-checkpoint-dropped-samples
TSDB: use timestamps rather than WAL segment numbers to track how long deleted series should be retained in checkpoints
2025-08-20 11:15:10 +01:00
Ganesh Vernekar
a86d9a3858
Merge pull request #16925 from prometheus/codesome/stale-series-tracking
tsdb: Track stale series in the Head block based on stale sample
2025-08-19 15:35:19 -07:00
Patryk Prus
bbc9e47e42
Add comment about differences between agent mode and regular Prometheus
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-19 18:33:52 -04:00
Ganesh Vernekar
3904b3cd5f Restore stale series count from chunk snapshots
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00
Ganesh Vernekar
b29ce3e489 Restore stale series count on WAL replay
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00
Ganesh Vernekar
0c3d3d7466 Test the stale series tracking in Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:37 -07:00
Ganesh Vernekar
7a947d3629 Track stale series in the Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:27 -07:00