Commit Graph

250 Commits

Author SHA1 Message Date
Matthieu MOREL
5fa1146e21
chore: enable gci linter (#16245)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-03-22 15:46:13 +00:00
Ganesh Vernekar
bc595263c1
Merge pull request #16231 from pr00se/multiref-improvements
TSDB: Handle metadata/tombstones/exemplars for duplicate series during WAL replay
2025-03-19 16:15:50 -04:00
Ziqi Zhao
f6903bcc22
Let HistogramAppender.appendable return CounterResetHeader instead of… (#16195)
Let HistogramAppender.appendable return CounterResetHeader instead of boolean

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>

---------

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2025-03-18 17:40:27 +01:00
Patryk Prus
e4e1b515bc
TSDB: Handle metadata/tombstones/exemplars for duplicate series during WAL replay
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-03-18 12:22:33 -04:00
Fiona Liao
37c2ebb5fd
Make out-of-order native histograms flag a no-op and always enable (#16207)
* Remove experimental out-of-order native histogram flag

This feature has been available in Prometheus since September 2024,
and has no known issues. Therefore proposing to remove the flag
entirely and always have it on. Note that there are still two
settings that need to be configured (out-of-order time window > 0
and native histograms enabled) for this feature to work.

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Keep feature flag with warning

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update CHANGELOG.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Additional cleanup of comments and test names

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

---------

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-03-18 10:59:02 +00:00
Patryk Prus
86eeaf1886
Skip writing series records uniformly across the benchmark, so we skip some OOO series as well
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-03-17 15:17:53 -04:00
Patryk Prus
2147538d1e
Add missing series refs to benchmark
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-03-17 15:17:53 -04:00
Bartlomiej Plotka
7a7bc65237
Add util/compression package to consolidate snappy/zstd use in Prometheus. (#16156)
# Conflicts:
#	tsdb/db_test.go

Apply suggestions from code review

tmp

Addressed comments.

Update util/compression/buffers.go

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-03-10 10:36:26 +00:00
Patryk Prus
61aa82865d
TSDB: keep duplicate series records in checkpoints while their samples may still be present (#16060)
Renames the head's deleted map to walExpiries, and creates entries for any
duplicate series records encountered during WAL replay, with the expiry set
to the highest current WAL segment number. Any subsequent WAL
checkpoints will see the duplicate series entry in the walExpiries map, and
keep the series record until the last WAL segment that could contain its
samples is deleted.

Other considerations:

WBL: series records aren't written to the WBL, so there are no duplicates to deal with
agent mode: has its own WAL replay logic that handles duplicate series records differently, and is outside the scope of this PR
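
The expiry bookkeeping described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the actual Prometheus code: the `walTracker` type, its fields other than the `walExpiries` name, and the `markDuplicate` and `keepInCheckpoint` helpers are hypothetical names made up for the example, and the comparison against the last deleted segment is an assumption drawn from the wording "until the last WAL segment that could contain its samples is deleted".

```go
package main

import "fmt"

// seriesRef is a hypothetical stand-in for a series reference.
type seriesRef uint64

// walTracker is a hypothetical, simplified model of the bookkeeping.
type walTracker struct {
	// live holds series that currently exist in the head.
	live map[seriesRef]struct{}
	// walExpiries maps a series record to the last WAL segment
	// that may still contain samples for it.
	walExpiries map[seriesRef]int
}

func newWALTracker() *walTracker {
	return &walTracker{
		live:        map[seriesRef]struct{}{},
		walExpiries: map[seriesRef]int{},
	}
}

// markDuplicate records a duplicate series record seen during replay.
// Its expiry is the highest WAL segment number known at that point.
func (t *walTracker) markDuplicate(ref seriesRef, highestSegment int) {
	t.walExpiries[ref] = highestSegment
}

// keepInCheckpoint reports whether a checkpoint should keep the series
// record, given the last segment the checkpoint is about to delete.
// Assumption: a record is kept while its expiry segment still exists.
func (t *walTracker) keepInCheckpoint(ref seriesRef, lastDeletedSegment int) bool {
	if _, ok := t.live[ref]; ok {
		return true
	}
	expiry, ok := t.walExpiries[ref]
	return ok && expiry > lastDeletedSegment
}

func main() {
	t := newWALTracker()
	t.markDuplicate(7, 5)

	fmt.Println(t.keepInCheckpoint(7, 3)) // segment 5 still exists
	fmt.Println(t.keepInCheckpoint(7, 5)) // segment 5 is being deleted
}
```

In this sketch a duplicate record survives every checkpoint that leaves its expiry segment on disk, and is dropped by the first checkpoint that deletes that segment.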
2025-03-05 13:45:08 -05:00
Matthieu MOREL
c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
machine424
d644324407
feat(tsdb/(head|agent)): reuse pools across segments to avoid generating garbage during WL replay
This is part of the "reduce WAL replay overhead/garbage" effort to help with https://github.com/prometheus/prometheus/issues/6934.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-02-10 22:40:24 +01:00
Bryan Boreham
6ba25ba93f tsdb tests: avoid 'defer' till end of function
'defer' only runs at the end of the function, so explicitly close the
querier after we finish with it. Also check it didn't error.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-01-27 19:59:43 +00:00
György Krajcsovits
1e420ef373 Merge branch 'main' into cedwards/nhcb-wal-wbl
# Conflicts:
#	tsdb/tsdbutil/histogram.go
2025-01-02 12:50:19 +01:00
Bryan Boreham
cfa32f3d28 TSDB: Move merge of head postings into index
This enables it to take advantage of a more compact data structure
since all postings are known to be `*ListPostings`.

Remove the `Get` member which was not used for anything else, and fix up
tests.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 19:22:30 +00:00
Joel Beckmeyer
39f5a07236 fix TestOOOHeadChunkReader_Chunk on 32-bit
Signed-off-by: Joel Beckmeyer <joel@beckmeyer.us>
2024-12-16 10:45:07 -05:00
Carrie Edwards
a046417bc0 Use new record type only for NHCB
2024-12-06 13:46:20 -08:00
Carrie Edwards
6684344026 Rename old histogram record type, use old names for new records
2024-12-05 09:21:47 -08:00
Carrie Edwards
37df50adb9 Attempt for record type
2024-12-05 09:21:47 -08:00
Fiona Liao
c599d37668
Always return unknown hint for first sample in non-gauge histogram chunk (#15343)
Always return unknown hint for first sample in non-gauge histogram chunk

---------

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-11-12 15:14:06 +01:00
György Krajcsovits
e6a682f046 Reproduce populateWithDelChunkSeriesIterator corrupting chunk meta
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only update when the chunk is completely
new.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-18 10:34:22 +02:00
György Krajcsovits
631fadc4ca Unit test for data race in head.Appender.AppendHistogram
Two Appenders race when creating a series with a native histogram
as the memSeries will be common and the lastHistogram field is written
without lock.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-10 14:10:07 +02:00
Matthieu MOREL
ab64966e9d
fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()" (#15094)
* fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()"

---------

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-10-06 16:35:29 +00:00
Arthur Silva Sens
95a53ef982
Join tests for appending float and histogram CTs
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-26 11:29:31 -03:00
Arthur Silva Sens
6bd9b1a7cc
Histogram CT Zero ingestion
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-26 11:29:22 -03:00
Carrie Edwards
14e3c05ce8
tsdb: Add support for ingestion of out-of-order native histogram samples (#14546)
Add support for ingesting OOO native histograms

* Add flag for enabling and disabling OOO native histogram ingestion
* Update OOO querying tests to include native histogram samples
* Add OOO head tests
* Add test for OOO native histogram counter reset headers

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Carrie Edwards <edwrdscarrie@gmail.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Fiona Liao <fiona.liao@grafana.com>
2024-09-17 11:19:06 +02:00
Harry John
919dc0cbc6
storage: Update LabelQuerier interface to return sorted label values (#14849)
* Change LabelQuerier.LabelValues() to return sorted values

---------

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-09-17 08:55:02 +02:00
Nathan Baulch
50cd453c8f
chore: Fix typos (#14868)
* Fix typos

---------

Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com>
2024-09-10 22:32:03 +02:00
György Krajcsovits
d3f4e7c223 Remove unnecessary conversion
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-09 12:51:02 +02:00
György Krajcsovits
60ab1cc5a5 BUGFIX: TSDB: panic in chunk querier
Followup to #14831

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-09 12:43:02 +02:00
George Krajcsovits
536d9f9ce9
BUGFIX: TSDB: panic in query during truncation with OOO head (#14831)
Check if headQuerier is nil before trying to use it.

* TestQueryOOOHeadDuringTruncate: unit test to check query during truncate
Regression test for #14822

* Simulate race between query and Compact()

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-05 17:17:42 +01:00
Arthur Silva Sens
442f24e099
chore: Simplify TestHeadAppender_AppendCTZeroSample (#14812)
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-02 21:30:37 +01:00
Oleg Zaytsev
ce7d830f1f
Bring back BenchmarkLoadRealWLs (#14757)
This was part of #14525 which was reverted.
I still think that having this benchmark committed to the repo is
useful.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-02 17:20:10 +01:00
Marco Pracucci
ef649d5968
Revert "Store mmMaxTime in same field as seriesShard"
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-08-26 08:56:16 +02:00
Arve Knudsen
3a78e76282 Upgrade golangci-lint to v1.60.1
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-18 12:13:25 +02:00
cuiweiyuan
1800af54f0 chore: fix some function names
Signed-off-by: cuiweiyuan <cuiweiyuan@aliyun.com>
2024-08-15 13:57:21 +08:00
Oleg Zaytsev
0833d2a230
Fix appendable: check whether last val was a histogram (#14613)
* Fix appendable: check whether last val was a histogram

When appending a float, we were checking whether lastValue was equal to
current value, but we didn't check whether last value was a float value.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-08-07 15:02:59 +02:00
Oleg Zaytsev
b7f2f3c3ac
Add BenchmarkLoadRealWLs
This benchmark runs on real WLs rather than fake generated ones.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 10:19:56 +02:00
Carrie Edwards
79b53bd3de Refactor TestWBLReplay to use scenarios
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
Co-authored-by: Fiona Liao <fiona.liao@grafana.com>
2024-07-16 10:53:28 -07:00
Carrie Edwards
2e0e4e9ce9 Add support for handling multiple chunks in OOO head
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Fiona Liao <fiona.liao@grafana.com>
2024-07-16 10:53:09 -07:00
Carrie Edwards
06550883c1 Clean up of tests and test utils
Co-authored-by: Fiona Liao <fiona.liao@grafana.com>

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2024-07-03 09:28:38 -07:00
Carrie Edwards
45a32a29ef Update tsdb tests to use test utils.
Co-authored-by: Fiona Liao <fiona.liao@grafana.com>
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2024-07-03 09:28:38 -07:00
Oleg Zaytsev
fd1a89b7c8
Pass affected labels to MemPostings.Delete() (#14307)
* Pass affected labels to MemPostings.Delete

As suggested by @bboreham, we can track the labels of the deleted series
and avoid iterating through all the label/value combinations.

This looks much faster on the MemPostings.Delete call. We don't have a
benchmark on stripeSeries.gc() where we'll pay the price of iterating
the labels of each one of the deleted series.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 10:28:56 +00:00
Arve Knudsen
b2396c0c8f Upgrade to golangci-lint v1.59.0
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-05-27 22:38:48 +02:00
Alan Protasio
8894d65cd6
Fix head stats and hooks when replaying a corrupted snapshot (#14079)
* Fixing head stats and hooks when replaying a corrupted snapshot

Signed-off-by: alanprot <alanprot@gmail.com>

* Fixing create/removed series metrics

Signed-off-by: alanprot <alanprot@gmail.com>

* Refactoring to have common code between gc and flush method

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* refactor

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-05-24 22:43:21 -04:00
Oleksandr Redko
f10c3454e9 Enable perfsprint linter and fix up code
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>
2024-05-15 17:51:05 +03:00
Matthieu MOREL
6f595c6762
golangci-lint: enable whitespace linter (#13905)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-11 09:27:54 +01:00
Nick Pillitteri
481f14e1c0
TSDB: Don't rely on integer overflow in head compaction check (#13755)
* TSDB: Don't compact the head block when empty

Don't compact the Head block if there have not yet been any samples
appended.

Previously, the logic for determining if the head should be compacted
relied on the default values for min and max time and integer overflow
when they were checked in `Head.compactable()`. The check in
`Head.compactable()` effectively did `math.MinInt64 - math.MaxInt64`
which overflowed and wrapped to `1`. Since `1` is less than `1.5`
times the chunk range, compaction did not happen. This was the correct
behavior but relying on overflow wrapping is surprising.

This change adds a method for checking if the min and max time for the
head is unset and uses it to short-circuit compaction in that case.
It also replaces several explicit checks for the default value to
determine if the head has not yet had any samples added.
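
The wrap-around described in this commit message can be reproduced with a short sketch. The `headTimes` type and its methods are hypothetical stand-ins made up for the example, not the real `Head` implementation. The example assumes, as the message implies, that an empty head uses `math.MaxInt64` as its min time and `math.MinInt64` as its max time, and the chunk range value is arbitrary.

```go
package main

import (
	"fmt"
	"math"
)

// headTimes is a hypothetical stand-in for the head's time bookkeeping.
type headTimes struct {
	minTime, maxTime int64
}

// newHeadTimes returns the assumed "no samples yet" state.
func newHeadTimes() headTimes {
	return headTimes{minTime: math.MaxInt64, maxTime: math.MinInt64}
}

// span computes maxTime - minTime. Go's signed integer arithmetic wraps
// on overflow at run time, so the empty state yields 1.
func (h headTimes) span() int64 { return h.maxTime - h.minTime }

// isUnset reports whether no sample has been appended yet.
func (h headTimes) isUnset() bool {
	return h.minTime == math.MaxInt64 && h.maxTime == math.MinInt64
}

// compactable short-circuits on the empty state instead of relying on
// the wrapped span being small.
func (h headTimes) compactable(chunkRange int64) bool {
	if h.isUnset() {
		return false
	}
	return h.span() > chunkRange/2*3
}

func main() {
	const chunkRange = 1000 // arbitrary value for illustration

	empty := newHeadTimes()
	fmt.Println(empty.span())                  // 1
	fmt.Println(empty.compactable(chunkRange)) // false

	full := headTimes{minTime: 0, maxTime: 2000}
	fmt.Println(full.compactable(chunkRange)) // true
}
```

The explicit `isUnset` check gives the same result as the old overflow-based behavior for an empty head, but the intent is visible in the code rather than hidden in the arithmetic.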

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2024-03-26 12:17:38 +01:00
Bryan Boreham
925134e6de tsdb tests: make work with labels SymbolTable
Need to initialize decoders with SymbolTable.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham
c0e36e6bb3 Standardise exemplar label as "trace_id"
This is consistent with the OpenTelemetry standard, and an example in OpenMetrics.

https://github.com/open-telemetry/opentelemetry-specification/blob/89aa01348139/specification/metrics/data-model.md#exemplars
https://github.com/OpenObservability/OpenMetrics/blob/138654493130/specification/OpenMetrics.md#exemplars-1

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-15 14:20:08 +00:00
Bryan Boreham
12cac5bd5c tsdb tests: use go-cmp instead of DeepEquals
Also one simpler call checking nil.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-08 19:32:33 +00:00