1628 Commits

Author SHA1 Message Date
Owen Williams
a5876a0143
tsdb: Reduce test flakiness (#18577)
I have seen some flakiness in these tests, including timeouts. An LLM suggested these fixes to make them more deterministic. They look good to me.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-04-27 10:19:15 +02:00
Denys Sedchenko
ca578101af
feat(tsdb/agent): Implement checkpoint based on series in memory (#17948)
Adds CheckpointFromInMemorySeries option for agent.Options to enable a faster checkpoint implementation that skips segment re-read and just uses in-memory data instead.

* feat: impl agent-specific checkpoint dir
* feat: impl ActiveSeries interface
* feat: use new checkpoint impl
* feat: hide new checkpoint impl behind a feature flag
* feat: add benchmark
* feat: add benchstat case
* feat: use feature flag in bench
* feat: use same labels for persisted state and append
* feat: set WAL segment size
* feat: add checkpoint size metric and bump series size
* feat: wal replay test
* feat: expose new checkpoint opts in cmd flags
* feat: update cli doc
* add ActiveSeries and DeletedSeries doc

Signed-off-by: x1unix <9203548+x1unix@users.noreply.github.com>
Signed-off-by: Denys Sedchenko <9203548+x1unix@users.noreply.github.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2026-04-24 19:42:26 +02:00
Julien Pivotto
f69db5bc54 storage: introduce search interface with scoring and filtering
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-23 15:05:48 +02:00
Julien
3b9caf6564
Merge pull request #18569 from roidelapluie/roidelapluie/labelnames-limit
tsdb: apply LabelNames limit from LabelHints in blockBaseQuerier
2026-04-23 12:28:19 +02:00
Julien Pivotto
a5b5a3329c tsdb: apply LabelNames limit from LabelHints in blockBaseQuerier
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-23 11:05:17 +02:00
George Krajcsovits
c84b0acdb4
test(tsdb): add OOO error coverage for ST zero sample appends (#18554)
* test(tsdb): add OOO error coverage for ST zero sample appends

Add unit tests exercising the out-of-order error paths in
AppendSTZeroSample, AppendHistogramSTZeroSample (AppenderV1), and
the best-effort ST injection in AppenderV2.Append.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* make format

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* test(tsdb): add TestHeadAppenderV2_BestEffortSTZeroSample_OOO

The three OOO cases added to TestHeadAppenderV2_Append_EnableSTAsZeroSample
use a single appender so headChunks is nil at append time; the zero sample
enters the batch and is rejected silently in commitFloats, never reaching
the error-handling branch at line 374 of bestEffortAppendSTZeroSample.

Add a dedicated test that commits the first sample before appending the
second. This makes headChunks non-nil, so appendFloat/appendHistogram/
appendFloatHistogram returns ErrOutOfOrderSample at append time and the
branch at line 374 is actually executed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 09:48:12 +02:00
Arve Knudsen
c7b2210ac3
tsdb: cache collected head chunks on ChunkReader for O(1) lookup (#18302)
tsdb: cache collected head chunks on ChunkReader for O(1) lookup

The query path calls s.chunk() once per chunk meta via
ChunkOrIterableWithCopy. Each call walks the head chunks linked list
from the head to the target position. For a series with N head chunks
iterated oldest-first, total work is O(N²).

Cache the collected []*memChunk slice on headChunkReader, keyed by
series ref, head pointer, and mmapped chunks length. Collected once
per series under lock; reused on subsequent chunk lookups for the same
series. The backing array is reused across series (zero alloc after
first use).

Series with 0 or 1 head chunks skip the cache entirely to avoid
per-series overhead that dominates for typical workloads where most
series have a single head chunk.

The cache is gated behind an enableCache flag, toggled via an optional
chunkCacheToggler interface only when hints.Step > 0 (range queries).
Instant queries only need one chunk per series, so the cache overhead
is not recouped.

Also replace O(N²) linked-list traversals in appendSeriesChunks with
O(N) collectHeadChunks + slice iteration, and thread reusable
headChunksBuf through the index reader paths to avoid per-series
allocations.
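
The change from repeated list walks to a single collection pass can be sketched as follows. This is a minimal illustration, not the actual tsdb code: the `memChunk` type here, `nthOldest`, and `buildList` are hypothetical stand-ins, and only the name `collectHeadChunks` is taken from the message above.

```go
package main

import "fmt"

// memChunk is a hypothetical stand-in for a head chunk. prev points to the
// next-older chunk, so the list is walked from newest to oldest.
type memChunk struct {
	id   int
	prev *memChunk
}

// nthOldest walks the list on every call: O(N) per lookup, so looking up
// every chunk oldest-first costs O(N²) in total.
func nthOldest(head *memChunk, n, total int) *memChunk {
	c := head
	for i := 0; i < total-1-n && c != nil; i++ {
		c = c.prev
	}
	return c
}

// collectHeadChunks walks the list once and returns the chunks oldest-first.
// It reuses buf's backing array when it is large enough.
func collectHeadChunks(head *memChunk, buf []*memChunk) []*memChunk {
	buf = buf[:0]
	for c := head; c != nil; c = c.prev {
		buf = append(buf, c)
	}
	// Reverse in place so that index 0 is the oldest chunk.
	for i, j := 0, len(buf)-1; i < j; i, j = i+1, j-1 {
		buf[i], buf[j] = buf[j], buf[i]
	}
	return buf
}

// buildList creates n chunks with ids 0..n-1 (0 is the oldest) and returns
// the newest one, which is the list head.
func buildList(n int) *memChunk {
	var head *memChunk
	for i := 0; i < n; i++ {
		head = &memChunk{id: i, prev: head}
	}
	return head
}

func main() {
	head := buildList(4)
	chunks := collectHeadChunks(head, nil)
	for i, c := range chunks {
		// Both lookups agree; the slice lookup is O(1) after the single walk.
		fmt.Println(i, c.id, nthOldest(head, i, len(chunks)).id)
	}
}
```

The sketch leaves out the cache key (series ref, head pointer, mmapped chunks length) and the locking described above.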


---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2026-04-17 18:34:41 +02:00
Arve Knudsen
98809e40c6
tsdb: Skip clean series during periodic head chunk mmap (#18272)
tsdb: Skip clean series during periodic head chunk mmap

The periodic mmapHeadChunks cycle previously acquired a per-series
lock on every series, even though typically >99% have nothing to
mmap. This was identified as a CPU bottleneck in Grafana Mimir.

Add a headChunkCount field (sync/atomic.Uint32) to memSeries that
tracks the number of head chunks. It is incremented in
cutNewHeadChunk and the histogram new-chunk paths, and reset by
mmapChunks and truncateChunksBefore. mmapHeadChunks uses a lock-free
Load to skip series with fewer than 2 head chunks, avoiding the
per-series lock for clean series.

sync/atomic.Uint32 (4 bytes) is used instead of go.uber.org/atomic
(8 bytes) to fit in existing struct padding without growing
memSeries. Chunk counts are bounded by the 3-byte field in
HeadChunkRef, so cannot overflow uint32.
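
A minimal sketch of the lock-free skip, under stated assumptions: the `series` type is a hypothetical stand-in for memSeries, and resetting the counter to 1 after mmapping (one open head chunk remains) is an assumption of this sketch rather than something the message spells out.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// series is a hypothetical stand-in for memSeries.
type series struct {
	mu             sync.Mutex
	headChunkCount atomic.Uint32 // number of head chunks, readable without mu
	mmapped        int           // chunks already handed to the mmap path
}

// mmapChunks takes the per-series lock and moves all but the newest head
// chunk to the mmapped set. Resetting the counter to 1 is an assumption.
func (s *series) mmapChunks() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := s.headChunkCount.Load()
	if n < 2 {
		return 0
	}
	moved := int(n - 1)
	s.mmapped += moved
	s.headChunkCount.Store(1)
	return moved
}

// mmapHeadChunks skips clean series with an atomic Load, so the lock is
// taken only for series that have something to mmap.
func mmapHeadChunks(all []*series) (locked, moved int) {
	for _, s := range all {
		if s.headChunkCount.Load() < 2 {
			continue // clean series: no lock acquired
		}
		locked++
		moved += s.mmapChunks()
	}
	return locked, moved
}

func main() {
	all := []*series{{}, {}, {}}
	all[0].headChunkCount.Store(1)
	all[1].headChunkCount.Store(3)
	locked, moved := mmapHeadChunks(all)
	fmt.Println(locked, moved)
}
```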

Also fix pre-existing comment inaccuracies in the touched code:
headChunks.next -> headChunks.prev, mmapHeadChunks() -> mmapChunks()
in the doc comment, and a grammar error.

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-04-14 17:11:35 +02:00
Julien Pivotto
2828c543bc tsdb: reduce chunk segment size in TestDiskFillingUpAfterDisablingOOO
The test only writes ~80 samples, so the default 512MB chunk segment
pre-allocation during compaction is unnecessary. Use 1MB instead to
avoid large file allocations on constrained CI environments.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-02 12:23:55 +02:00
Ayoub Mrini
0dd834e924
Merge pull request #18406 from machine424/depll
test: migrate TestDelayedCompaction to synctest to eliminate flakiness
2026-04-01 16:40:50 +02:00
Björn Rabenstein
4280662cdf
Merge pull request #18304 from crawfordxx/fix-typos-in-comments
Fix typos in comments and metric help strings
2026-04-01 13:45:59 +02:00
Jorge Creixell
4b562bba6e
tsdb: fix prometheus_tsdb_head_chunks going negative after WAL replay (#18401)
* tsdb: fix prometheus_tsdb_head_chunks going negative after WAL replay

  When truncateStaleSeries deletes a series (writing a full-range tombstone
  to the WAL) and the same label set is immediately re-created, WAL replay
  queues the following sequence on the same processor shard for the shared
  memSeries pointer:

    reset(mSeries, M mmappedChunks, walRef=old)
    deleteSeriesByID(old)
    reset(mSeries, N mmappedChunks, walRef=new)

  deleteSeriesByID correctly subtracts M from the gauge but does not clear
  series.mmappedChunks. The subsequent reset subtracts M again, driving
  prometheus_tsdb_head_chunks negative when M > N.

  Fix by setting series.mmappedChunks = nil in deleteSeriesByID after
  accounting for those chunks.

  Fixes #10884

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
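
The double subtraction described in the first commit above can be reproduced with a small model. This is a sketch under assumptions: `head`, `memSeries`, and `replay` are simplified, hypothetical stand-ins, and the gauge is modelled as a plain integer.

```go
package main

import "fmt"

// memSeries is a hypothetical stand-in that tracks only its mmapped chunks.
type memSeries struct {
	mmappedChunks []struct{}
}

// head models prometheus_tsdb_head_chunks as a plain integer.
type head struct {
	chunks int
}

// reset replaces the series' mmapped chunks with n new ones and adjusts the
// gauge by the difference between the new and the old count.
func (h *head) reset(s *memSeries, n int) {
	h.chunks += n - len(s.mmappedChunks)
	s.mmappedChunks = make([]struct{}, n)
}

// deleteSeriesByID subtracts the series' chunks from the gauge. With
// clearChunks set it also drops the slice, which is the fix described above.
func (h *head) deleteSeriesByID(s *memSeries, clearChunks bool) {
	h.chunks -= len(s.mmappedChunks)
	if clearChunks {
		s.mmappedChunks = nil
	}
}

// replay runs the reset / delete / reset sequence on one shared series and
// returns the final gauge value.
func replay(m, n int, clearChunks bool) int {
	h, s := &head{}, &memSeries{}
	h.reset(s, m)
	h.deleteSeriesByID(s, clearChunks)
	h.reset(s, n)
	return h.chunks
}

func main() {
	// Without the fix M is subtracted twice; with it the gauge ends at N.
	fmt.Println(replay(5, 2, false), replay(5, 2, true)) // prints "-3 2"
}
```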

* Simplify test

  - Re-use appending helper
  - Cleanup comments

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

* Improve comments in test

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

* Fix formatting

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

* Improve comment

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

---------

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2026-04-01 11:30:33 +02:00
Rushabh Mehta
a2172f91c1
tsdb: Find the last series ID on startup from the last series id file and WAL scan (#18333)
* Add logic to Head.Init(...) for fast startup

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Add unit tests

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Empty commit to retrigger CI

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Empty commit to retrigger CI

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Make readSeriesStateFile return a struct directly, fix small nits, remove test

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix test for readSeriesStateFile function

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix some more nits, add extra testcase

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

---------

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>
2026-03-31 21:45:53 -07:00
Bartlomiej Plotka
fb38463dfb
Merge pull request #18321 from atoulme/aix
aix: support the aix/ppc64 compilation target
2026-03-31 16:42:20 +02:00
Julien
4b4d5157b8
chunkenc: add tests for XOR2 active ST delta and value branches (#18363)
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-31 15:15:48 +02:00
Kyle Eckhart
37d85980a3
tsdb/agent: fix getOrCreate race (#18292)
* tsdb/agent: fix race in getOrCreate and consolidate series lookup
* tsdb/agent: fix transition window race in SetUnlessAlreadySet
* tsdb/agent: address review feedback and improve BenchmarkGetOrCreate

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

---------

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
2026-03-31 15:08:58 +02:00
machine424
86215cf91f
test: migrate TestDelayedCompaction to synctest to eliminate flakiness
The previous implementation relied on real wall-clock time and busy-loops
(time.Sleep + polling loops) to detect when compaction had finished, making
it both slow and flaky, especially on busy CI environments and on Windows (due to timer imprecision).

Now both subtests run on Windows.

The delay value can be increased (1s → 5s) at zero cost to test runtime.

Also cleaned up shared logic into small helpers and split the no-delay and
delay-enabled cases into separate subtests for clarity.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2026-03-30 23:58:08 +02:00
machine424
dcfb8ce59c
chore: remove util/testutil/synctest now that we use Go>=1.25
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2026-03-30 19:48:39 +02:00
Julien Pivotto
3856195bb8 tsdb: use float64 for retention percentage
The retention.percentage config field was typed as uint, which silently
truncated fractional values. Setting percentage: 1.5 in prometheus.yml
resulted in a retention of 1%, with no warning or error.
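
The truncation can be illustrated with a plain Go conversion. This is a sketch only: it does not reproduce the YAML decoding path, `validatePercentage` is a hypothetical helper, and the 0 to 100 range is an assumption based on the clamp mentioned below.

```go
package main

import (
	"errors"
	"fmt"
)

// validatePercentage is a hypothetical range check; the 0..100 bounds are an
// assumption of this sketch.
func validatePercentage(p float64) error {
	if p < 0 || p > 100 {
		return errors.New("percentage must be between 0 and 100")
	}
	return nil
}

func main() {
	configured := 1.5

	// Converting to an unsigned integer drops the fractional part.
	asUint := uint(configured)
	fmt.Println(asUint, configured)

	// Keeping the float preserves the value and still allows range checks.
	fmt.Println(validatePercentage(configured), validatePercentage(150))
}
```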

Remove the redundant MaxPercentage > 100 clamp in main.go; the config
UnmarshalYAML already returns an error for out-of-range values before
this code is reached.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-26 12:39:22 +01:00
Julien Pivotto
7a1a5e285f chunkenc: add extra tests
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-25 09:59:12 +01:00
Julien Pivotto
d8607cbd9b tsdb/chunkenc: optimise XOR2 and varbit hot paths
Use writeBitsFast instead of writeBits in putVarbitInt/putVarbitUint,
combining prefix and value into a single call per bucket. Inline the
common fast paths in XOR2 Append to avoid encodeJoint and putVarbitInt
calls for the typical dod=0 and 13-bit dod cases.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-25 09:09:46 +01:00
Rushabh Mehta
df61021436
tsdb: Add series_state.json file to wal/ directory to track state (#18303)
* Add series_state.json file creation and update logic.

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Make comments follow the guidelines.

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix linter complaints

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Put PR behind feature flag fast-startup

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Marshal updated information to file directly

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix linter failures

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Move series state code from head.go to head_wal.go

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix nits

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Add unit test

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

---------

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>
2026-03-23 20:46:04 -07:00
Julien
5b96e611dc
Merge pull request #18325 from roidelapluie/roidelapluie/xor2-with-st
tsdb/chunkenc: port XOR2 performance improvements to ST-aware encoding
2026-03-20 16:09:40 +01:00
Julien Pivotto
3b2b42f681 tsdb/chunkenc: add writeBits benchmarks, clarify comments, and simplify encodeJoint
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-20 14:54:48 +01:00
Julien
16876bab95
Merge pull request #18200 from roidelapluie/roidelapluie/retention-validation
Multiple fixes in retention configuration
2026-03-20 12:27:37 +01:00
Julien Pivotto
e865bdd172 tsdb/chunkenc: avoid error allocation in readXOR2ControlFast and add decode tests
Change readXOR2ControlFast to return (uint8, bool) instead of (uint8, error)
to avoid allocating io.EOF on the fast path. Refactor encodeJoint to skip
computing vbits when the value is a stale NaN. Add TestXOR2DecodeFunctionsAcrossPadding
to exercise decodeValue, decodeValueKnownNonZero, and decodeNewLeadingTrailing
across all 64 bit-buffer alignments.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-19 17:00:31 +01:00
Julien Pivotto
7176a6de91 tsdb/chunkenc: port XOR2 performance improvements to ST-aware encoding
Port the following optimizations from the roidelapluie/xor2 branch to
the ST-aware XOR2 implementation on main:

bstream.go:
- Add writeBitsFast() as a writeBits variant that handles the partial
  last byte inline to avoid per-byte writeByte calls and writes
  complete bytes directly to the stream slice; used only by XOR2 to
  leave the shared writeBits unchanged for other encoders
- Add readXOR2ControlFast() for inlinable hot-path control decoding
  that avoids buffer refills for the common 4-bit cases
- Add readUvarint()/readVarint() methods that use direct method calls
  instead of io.ByteReader interface dispatch, reducing GC pressure
  from interior pointer references in findObject

xor2.go:
- Switch all writeBits calls to writeBitsFast
- Use readXOR2ControlFast() + readXOR2Control() fallback in Next()
- Use it.br.readVarint()/readUvarint() instead of binary.ReadVarint/
  ReadUvarint to avoid GC overhead from interface dispatch
- Add 3-bit fast path in decodeValue() to read the full value control
  prefix in one buffer peek rather than up to three separate bit reads
- Add combined 1+sz bit fast path in decodeValueKnownNonZero() to
  fold the control bit and value bits into a single buffer operation
- Add 11-bit combined read in decodeNewLeadingTrailing() to read
  leading (5 bits) and sigbits (6 bits) together
- Pre-compute the value XOR delta in encodeJoint() and pass it to
  writeVDeltaKnownNonZero(delta uint64) to avoid recomputation
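
The combined 11-bit read of leading (5 bits) and sigbits (6 bits) amounts to one read followed by a shift and a mask. A minimal sketch, assuming the leading count occupies the high bits of the field; both helper names are hypothetical.

```go
package main

import "fmt"

// packLeadingSigbits builds the 11-bit field: 5 bits of leading-zero count
// followed by 6 bits of significant-bit count. The ordering is an assumption.
func packLeadingSigbits(leading, sigbits uint8) uint16 {
	return uint16(leading&0x1f)<<6 | uint16(sigbits&0x3f)
}

// splitLeadingSigbits recovers both fields from a single 11-bit read instead
// of two separate bit reads.
func splitLeadingSigbits(v uint16) (leading, sigbits uint8) {
	leading = uint8(v>>6) & 0x1f
	sigbits = uint8(v & 0x3f)
	return leading, sigbits
}

func main() {
	field := packLeadingSigbits(12, 40)
	leading, sigbits := splitLeadingSigbits(field)
	fmt.Println(field, leading, sigbits)
}
```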

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-19 12:14:34 +01:00
Bartlomiej Plotka
2ba3046c80
Merge float st-storage implementation (PROM-60) and initial xor2-encoding (#18062)
* feat(tsdb/chunkenc): add float chunk format with start timestamp support


Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* optimize code path and layout

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* make new format usable in head

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* fix issue with seeking to last sample again

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* fix iterator benchmark for chunks not supporting ST

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* reduce footprint of the xoroptst chunk iterator object

It was 80 bytes with a lot of padding compared to the 56 bytes of the
original xor chunk iterator. Made it 64 bytes, tightly packed.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Fix benchmark expectations on ST in iterator

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* add inclusive delta test case

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* make testcases independent of order

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* drop unused code

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Drop commented out line

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* documentation

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Small simplification in the doc

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add delta st inclusive test case for random vt

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Switch to delta of difference of st to prev t

from delta of delta of st.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Write ST after T and V so we can write a single bit on the second sample

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* verify chunk sample len function

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Reduce size of first st stored a little

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* test the case where st equals the t

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* add st equal t to benchmarks

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* test(chunkenc): test that appender can continue chunks

Test that initializing a chunk appender from an existing chunk
works correctly.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* fix(chunkenc): bug in initializing appender on existing chunk

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Add cases with jitter in the start time as well


Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* tsdb: ST-in-WAL: Counter implementation and benchmarks (#17671)

Initial implementation of https://github.com/prometheus/prometheus/issues/17790.
Only implements ST-per-sample for Counters. Tests and benchmarks updated.

Note: This increases the size of the RefSample object for all users, whether st-per-sample is turned on or not.

Signed-off-by: Owen Williams <owen.williams@grafana.com>

* refactor: sed enableStStorage/enableSTStorage

Signed-off-by: bwplotka <bwplotka@gmail.com>

* feat[scrape]: add ST parsing support to scrape AppenderV2 flow (#18103)

Signed-off-by: bwplotka <bwplotka@gmail.com>

* feat(tsdb): change head opt EnableSTStorage to atomic (#18107)

In downstream projects this needs to be set dynamically per tenant.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Merge pull request #18108 from prometheus/bwplotka/fix

scrape: add tests for ST appending; add warnings for ST feature flag users around _created drop

* refact(tsdb): trivial rename (#18109)

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* fix(tsdb): missing passing head option to wal/wbl write (#18113)

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* feat(tsdb): allow using ST capable XOR chunks - retain format on read (#18013)

* feat(tsdb): allow appending to ST capable XOR chunk optionally

Only for float samples as of now. Supports in-order and out-of-order
samples.

Make sure that on readout the ST capable chunks are returned automatically.
When the chunks are returned as is, this is trivially true.
When a chunk needs to be re-coded due to deletion (tombstone) markers,
we take the encoding of the original chunk.
When a chunk needs to be created from overlapping chunks, we observe
whether ST is zero or not and create the new chunk based on that.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* fix test after merge

Signed-off-by: bwplotka <bwplotka@gmail.com>

* feat: RW2 sending ST support

Signed-off-by: bwplotka <bwplotka@gmail.com>

tmp

Signed-off-by: bwplotka <bwplotka@gmail.com>

* tests: test ST in a cheapest way possible

Signed-off-by: bwplotka <bwplotka@gmail.com>

* tests: add bench CLI recommended invocations

Signed-off-by: bwplotka <bwplotka@gmail.com>

* fixed tests after rebase

Signed-off-by: bwplotka <bwplotka@gmail.com>

* feat(chunkenc): replace xoroptst chunk encoding with xor2

XOR2 is based on https://github.com/prometheus/prometheus/pull/18238
With additional ST support.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* feat: add compliance RW sender test for agent

Signed-off-by: bwplotka <bwplotka@gmail.com>

* feat(agent): add support for appending ST

Signed-off-by: bwplotka <bwplotka@gmail.com>

* replace stray xoroptst words

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* post merge conflict fixes

Signed-off-by: bwplotka <bwplotka@gmail.com>

* feat(tsdb): register st_storage in feature API

Register the st-storage feature flag in the feature registry via
the TSDB options, consistent with how other TSDB features like
exemplar_storage and delayed_compaction are registered.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Coded with Claude Sonnet 4.6.

* Document xor2-encoding feature flag

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Add xor2-encoding feature flag

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Update CHANGELOG

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Fix linting

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Remove setting of xor2 encoding option in db open

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Fix tests

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Fix linting

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Update feature flag description

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Update comments and feature flag description

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* Update documentation for st-storage feature

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>

* st: disconnect st-storage with xor2-encoding given planned experiments (#18316)

* st: disconnect st-storage with xor2-encoding given planned experiments

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Update docs/feature_flags.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Update docs/feature_flags.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Update docs/feature_flags.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Update docs/feature_flags.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Aleksandr Smirnov <5targazer@mail.ru>
Signed-off-by: Mohammad Abbasi <mohammad.v184@gmail.com>
Signed-off-by: matt-gp <small_minority@hotmail.com>
Signed-off-by: Ian Kerins <git@isk.haus>
Signed-off-by: SuperQ <superq@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: ffgan <sudoemt@gmail.com>
Signed-off-by: Patryk Prus <p@trykpr.us>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: 3Juhwan <13selfesteem91@naver.com>
Signed-off-by: Sammy Tran <sammyqtran@gmail.com>
Signed-off-by: Casie Chen <casie.chen@grafana.com>
Signed-off-by: Dan Cech <dcech@grafana.com>
Signed-off-by: kakabisht <kakabisht07@gmail.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Divyansh Mishra <divyanshmishra@Divyanshs-MacBook-Air-3.local>
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
Signed-off-by: Martin Valiente Ainz <64830185+tinitiuset@users.noreply.github.com>
Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Sayuru <71478576+samaras3@users.noreply.github.com>
Signed-off-by: Matt <small_minority@hotmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Sasha <103973965+crush-on-anechka@users.noreply.github.com>
Co-authored-by: Mohammad Abbasi <mohammad.v184@gmail.com>
Co-authored-by: matt-gp <small_minority@hotmail.com>
Co-authored-by: Ian Kerins <git@isk.haus>
Co-authored-by: SuperQ <superq@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Julien <291750+roidelapluie@users.noreply.github.com>
Co-authored-by: ffgan <sudoemt@gmail.com>
Co-authored-by: Patryk Prus <p@trykpr.us>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Joe Adams <github@joeadams.io>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Owen Williams <owen.williams@grafana.com>
Co-authored-by: 3Juhwan <13selfesteem91@naver.com>
Co-authored-by: Casie Chen <casie.chen@grafana.com>
Co-authored-by: Dan Cech <dcech@grafana.com>
Co-authored-by: hridyesh bisht <41201308+kakabisht@users.noreply.github.com>
Co-authored-by: zenador <zenador@users.noreply.github.com>
Co-authored-by: Divyansh Mishra <divyanshmishra@Divyanshs-MacBook-Air-3.local>
Co-authored-by: Varun Chawla <varun_6april@hotmail.com>
Co-authored-by: Martin Valiente Ainz <64830185+tinitiuset@users.noreply.github.com>
Co-authored-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
Co-authored-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Co-authored-by: Linas Medžiūnas <linasm@users.noreply.github.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: beorn7 <beorn@grafana.com>
Co-authored-by: Sayuru <71478576+samaras3@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Carrie Edwards <edwrdscarrie@gmail.com>
2026-03-19 10:51:40 +01:00
Antoine Toulme
d7bc621514 Support the aix/ppc64 compilation target
Signed-off-by: Antoine Toulme <atoulme@splunk.com>
2026-03-18 17:07:44 -07:00
crawfordxx
afaff7e116 Fix typos in comments and metric help strings
- limt -> limit (storage/remote/queue_manager.go metric help text)
- exluded -> excluded (tsdb/compact.go comment)
- wont -> won't (tsdb/head.go comment)

Signed-off-by: crawfordxx <crawfordxx@users.noreply.github.com>
2026-03-18 12:22:09 +08:00
Arve Knudsen
fc1c60d9eb
tsdb: clear pooled objects before returning to sync.Pool (#17895)
Clear WAL replay pool objects before Put() to avoid retaining references
to Labels, Histograms, and other data that could prevent garbage
collection.

The following pools now properly clear their contents:
- wlReplaySeriesPool: clear Labels field
- wlReplaytStonesPool: clear intervals
- wlReplayExemplarsPool: clear Labels field
- wlReplayHistogramsPool: clear histogram pointers
- wlReplayFloatHistogramsPool: clear float histogram pointers
- wlReplayMetadataPool: clear metadata strings
- agent walReplaySeriesPool: clear Labels field
- agent walReplayHistogramsPool: clear histogram pointers
- agent walReplayFloatHistogramsPool: clear float histogram pointers

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-03-17 15:18:44 +01:00
Bartlomiej Plotka
a02e20d98e
Merge branch 'main' into feature/start-time
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2026-03-17 13:06:25 +01:00
Arve Knudsen
6b5c0b327a
tsdb: mmap histogram chunks during WAL replay (#18306)
* tsdb: mmap histogram chunks during WAL replay

The float sample path in processWALSamples calls mmapChunks when a new
chunk is created during WAL replay, but the histogram path was missing
this call. Without it, histogram head chunks accumulate as a linked
list in memory rather than being mmapped, causing unnecessary memory
growth during long WAL replays.

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-03-16 12:08:47 +00:00
Carrie Edwards
a4a17a77cd Update comments and feature flag description
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2026-03-13 07:43:28 -07:00
Carrie Edwards
b575f5e28b Fix linting
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2026-03-12 12:26:42 -07:00
Carrie Edwards
8a02ae58d4 Fix tests
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2026-03-12 12:20:24 -07:00
Carrie Edwards
a0d0a8efe8 Remove setting of xor2 encoding option in db open
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2026-03-12 11:08:33 -07:00
Carrie Edwards
c10abae45e Fix linting
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2026-03-12 11:07:02 -07:00
Carrie Edwards
a679ab5eb4 Add xor2-encoding feature flag
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2026-03-12 11:07:00 -07:00
György Krajcsovits
0dac72ee94
feat(tsdb): register st_storage in feature API
Register the st-storage feature flag in the feature registry via
the TSDB options, consistent with how other TSDB features like
exemplar_storage and delayed_compaction are registered.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Coded with Claude Sonnet 4.6.
2026-03-12 16:01:04 +01:00
bwplotka
3cf43337dc post merge conflict fixes
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-03-12 09:03:08 +00:00
bwplotka
c133a969af Merge branch 'main' into start-time-main-sync 2026-03-12 08:28:15 +00:00
Bartlomiej Plotka
a3217fe94f
Merge pull request #18270 from prometheus/agentrw
feat(agent): fix ST append; add compliance RW sender test for agent
2026-03-12 05:45:59 +01:00
Bartlomiej Plotka
a73202012b
tsdb/wlog[PERF]: optimize WAL watcher reads (up to 540x less B/op; 13000x less allocs/op) (#18250)
See the detailed analysis https://docs.google.com/document/d/1efVAMcEw7-R_KatHHcobcFBlNsre-DoThVHI8AO2SDQ/edit?tab=t.0

I ran extensive benchmarks using synthetic data as well as real WAL segments pulled from the prombench runs.

All benchmarks are here https://github.com/prometheus/prometheus/compare/bwplotka/wal-reuse?expand=1

* optimization(tsdb/wlog): reuse Ref* buffers across WAL watchers' reads

Signed-off-by: bwplotka <bwplotka@gmail.com>

* optimization(tsdb/wlog): avoid expensive error wraps

Signed-off-by: bwplotka <bwplotka@gmail.com>

* optimization(tsdb/wlog): reuse array for filtering

Signed-off-by: bwplotka <bwplotka@gmail.com>

* fmt

Signed-off-by: bwplotka <bwplotka@gmail.com>

* lint fix

Signed-off-by: bwplotka <bwplotka@gmail.com>

* tsdb/record: add test for clear() on histograms

Signed-off-by: bwplotka <bwplotka@gmail.com>

* updated WriteTo with what's currently expected

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-03-11 09:17:13 +00:00
Bartlomiej Plotka
f7c60bf97e
Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2026-03-10 15:55:40 +00:00
György Krajcsovits
a773d3daad
replace stray xoroptst words
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2026-03-10 16:23:08 +01:00
George Krajcsovits
7cbe0e3a5c
tsdb/agent: Prevent duplicate SeriesRefs from being lost in stripeSeries (#17538)
* Show the agent db can hold duplicate series by hash

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

* Prevent duplicate SeriesRefs from being lost in db stripeSeries

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

* Drop default initialized value

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

* More comments and only reset deleted if the new segment is larger

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

* Manually manage db/rw to prevent windows test error

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

* Fix incorrect type from rebase

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

* Use Set in GetOrSet to enforce proper lock ordering

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

* Missing period and left over refactor

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

---------

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
2026-03-10 14:31:48 +01:00
bwplotka
6ab5d8f9be feat(agent): add support for appending ST
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-03-10 12:27:48 +00:00
György Krajcsovits
5e5b14c04b
feat(chunkenc): replace xoroptst chunk encoding with xor2
XOR2 is based on https://github.com/prometheus/prometheus/pull/18238
With additional ST support.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2026-03-06 14:35:06 +01:00
Bartlomiej Plotka
9dc782bf2c
Merge pull request #18220 from prometheus/extended-test
tests(tsdb/wlog): Tighten watcher tail tests
2026-03-05 11:31:04 +01:00