275 Commits

Author SHA1 Message Date
beorn7
747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Bryan Boreham
498f63e60b
Merge pull request #17029 from pr00se/wal-checkpoint-dropped-samples
TSDB: use timestamps rather than WAL segment numbers to track how long deleted series should be retained in checkpoints
2025-08-20 11:15:10 +01:00
Ganesh Vernekar
a86d9a3858
Merge pull request #16925 from prometheus/codesome/stale-series-tracking
tsdb: Track stale series in the Head block based on stale sample
2025-08-19 15:35:19 -07:00
Ganesh Vernekar
7a947d3629 Track stale series in the Head
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2025-08-19 15:07:27 -07:00
Patryk Prus
5cb0192626
Address linter errors
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:25:14 -04:00
Patryk Prus
0fea41ed53
Refactor keep function to work for both agent and non-agent implementations
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:47 -04:00
Patryk Prus
6875022873
Update head.walExpiries with record timestamps during WAL replay
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:47 -04:00
Patryk Prus
218558f543
Store mint rather than the last WAL segment in head.walExpiries during head GC
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:41 -04:00
Matthieu MOREL
cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
Bryan Boreham
a11772234d
Merge pull request #16333 from colega/fix-series-create-gc-race
fix: race condition between series creation and garbage collection
2025-04-17 12:15:11 +01:00
machine424
a825d448da feat(tsdb/(head|agent)): dereference the pools at the end of the WL replay to
not wait for an extra GC cycle until the built-in cleanup mechanism
kicks in

See https://github.com/prometheus/prometheus/pull/15778

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-04-17 13:06:08 +02:00
Ben Kochie
a721daf981
Log WAL segment loading time (#16336)
Improve readability of "WAL segment loaded" by logging the duration
of each load. This helps make it easier to spot slow WAL file load
times.

Signed-off-by: SuperQ <superq@gmail.com>
2025-03-31 06:05:14 +02:00
Oleg Zaytsev
e4fe8d8684
Create memSeries with pendingCommit=true
This fixes TestHead_RaceBetweenSeriesCreationAndGC.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2025-03-27 11:11:57 +01:00
Matthieu MOREL
5fa1146e21
chore: enable gci linter (#16245)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-03-22 15:46:13 +00:00
Fiona Liao
37c2ebb5fd
Make out-of-order native histograms flag a no-op and always enable (#16207)
* Remove experimental out-of-order native histogram flag

This feature has been available in Prometheus since September 2024,
and has no known issues. Therefore proposing to remove the flag
entirely and always have it on. Note that there are still two
settings that need to be configured (out-of-order time window > 0
and native histograms enabled) for this feature to work.

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Keep feature flag with warning

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update CHANGELOG

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update CHANGELOG.md

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Update tsdb/head_append.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>

* Additional cleanup of comments and test names

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>

---------

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-03-18 10:59:02 +00:00
Patryk Prus
401dbacf2e
Add counters for unknown series references during WAL/WBL replay
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-03-17 15:17:53 -04:00
Patryk Prus
61aa82865d
TSDB: keep duplicate series records in checkpoints while their samples may still be present (#16060)
Renames the head's deleted map to walExpiries, and creates entries for any
duplicate series records encountered during WAL replay, with the expiry set
to the highest current WAL segment number. Any subsequent WAL
checkpoints will see the duplicate series entry in the walExpiries map, and
keep the series record until the last WAL segment that could contain its
samples is deleted.

Other considerations:

WBL: series records aren't written to the WBL, so there are no duplicates to deal with
agent mode: has its own WAL replay logic that handles duplicate series records differently, and is outside the scope of this PR
2025-03-05 13:45:08 -05:00
Arve Knudsen
7cbf749096
Upgrade to github.com/oklog/ulid/v2 (#16168)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-03-05 16:03:25 +01:00
Bryan Boreham
42d55505f9
Merge pull request #12659 from prymitive/memChunk
Short-cut common memChunk operations
2025-02-25 11:33:56 +00:00
machine424
d644324407
feat(tsdb/(head|agent)): reuse pools across segments to avoid generating garbage during WL replay
This is part of the "reduce WAL replay overhead/garbage" effort to help with https://github.com/prometheus/prometheus/issues/6934.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-02-10 22:40:24 +01:00
Alan Protasio
9d1abbb9ed
Call PostCreation callback only after the new series is added to the mempotings (#15579)
Signed-off-by: alanprot <alanprot@gmail.com>
2025-01-28 12:11:58 +01:00
Bryan Boreham
ac4f8a5e23
[ENHANCEMENT] TSDB: Improve calculation of space used by labels (#13880)
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels

The labels for each series in the Head take up some some space in the
Postings index, but far more space in the `memSeries` structure.

Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.

Provide three implementations of this function for the three Labels
versions.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-16 09:42:52 +00:00
Pedro Tanaka
bab587b9dc
Agent: allow for ingestion of CT samples (#15124)
* Remove unused option from HeadOptions

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Improve docs for appendable() method in head appender

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Ingest CT (float) samples in Agent DB

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* allow for ingestion of CT native histogram

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding some verification for ct ts

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Validating CT histogram before append and add newly created series to pending series

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* checking the wal for written samples

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Checking for samples in test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding case for validations

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* fixing comparison when dedupelabels is enabled

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* unite tests, use table testing

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Implement CT related methods in timestampTracker for write storage

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding error case to test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* removing unused fields

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Updating lastTs for series when adding CT to invalidate duplicates

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* making sure that updating the lastTS wont cause OOO later on in Commit();

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

---------

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-27 01:06:34 +01:00
Łukasz Mierzwa
b6e22cd346 Short-cut common memChunk operations
memChunk is a linked list, speed up some common operations when there's no need to iterate all elements on the list.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2024-10-25 12:19:20 +01:00
TJ Hoplock
6ebfbd2d54 chore!: adopt log/slog, remove go-kit/log
For: #14355

This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-07 15:58:50 -04:00
György Krajcsovits
44ebbb8458 Fix missing histogram copy in sampleRing
The specialized version of sample add to the ring:
func addH(s hSample, buf []hSample, r *sampleRing) []hSample
func addFH(s fhSample, buf []fhSample, r *sampleRing) []fhSample
already correctly copy histogram samples from the reused hReader, fhReader
buffers, but the generic version does not. This means that the
data is overwritten on the next read if the sample ring has seen histogram
and float samples at the same time and switched to generic mode.

The `genericAdd` function (which was commented anyway) is by now quite
different from the specialized functions so that this commit deletes
it.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-02 13:57:28 +02:00
Carrie Edwards
14e3c05ce8
tsdb: Add support for ingestion of out-of-order native histogram samples (#14546)
Add support for ingesting OOO native histograms

* Add flag for enabling and disabling OOO native histogram ingestion
* Update OOO querying tests to include native histogram samples
* Add OOO head tests
* Add test for OOO native histogram counter reset headers

Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored by: Carrie Edwards <edwrdscarrie@gmail.com>
Co-authored by: Jeanette Tan <jeanette.tan@grafana.com>
Co-authored by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored by: Fiona Liao <fiona.liao@grafana.com>
2024-09-17 11:19:06 +02:00
Nathan Baulch
50cd453c8f
chore: Fix typos (#14868)
* Fix typos

---------

Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com>
2024-09-10 22:32:03 +02:00
George Krajcsovits
536d9f9ce9
BUGFIX: TSDB: panic in query during truncation with OOO head (#14831)
Check if headQuerier is nil before trying to use it.

* TestQueryOOOHeadDuringTruncate: unit test to check query during truncate
Regression test for #14822

* Simulate race between query and Compact()

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-09-05 17:17:42 +01:00
Marco Pracucci
ef649d5968
Revert " Store mmMaxTime in same field as seriesShard"
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-08-26 08:56:16 +02:00
Oleg Zaytsev
0300ad58a9
Revert the option regardless of error
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 11:31:31 +02:00
Oleg Zaytsev
d8e1b6bdfd
Store mmMaxTime in same field as seriesShard
We don't use seriesShard during DB initialization, so we can use the
same 8 bytes to store mmMaxTime, and save those during the rest of the
lifetime of the database.

This doesn't affect CPU performance.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-30 10:20:29 +02:00
Bryan Boreham
d878146c70 TSDB: shrink memSeries by moving bools together
In each case the following member requires 8-byte alignment, so moving
one beside the other shrinks memSeries from 176 to 168 bytes, when
compiled with `-tags stringlabels`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-07-15 16:09:48 +01:00
Bryan Boreham
709c5d6fc3
TSDB: Lock around access to labels in head under -tags dedupelabels (#14322)
* TSDB: Document what needs locking in memSeries

* TSDB: Lock around access to series labels

So we can modify them to reset the symbol-table.

* TSDB: Make label locking conditional on build tag

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-07-05 10:11:32 +01:00
Oleg Zaytsev
fd1a89b7c8
Pass affected labels to MemPostings.Delete() (#14307)
* Pass affected labels to MemPostings.Delete

As suggested by @bboreham, we can track the labels of the deleted series
and avoid iterating through all the label/value combinations.

This looks much faster on the MemPostings.Delete call. We don't have a
benchmark on stripeSeries.gc() where we'll pay the price of iterating
the labels of each one of the deleted series.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 10:28:56 +00:00
Alan Protasio
8894d65cd6
Fix head stats and hooks when replaying a corrupted snapshot (#14079)
* Fixing head stats and hooks when replaying a corrupted snapshot

Signed-off-by: alanprot <alanprot@gmail.com>

* Fixing create/removed series metrics

Signed-off-by: alanprot <alanprot@gmail.com>

* Refactoring to have common code between gc and flush method

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* refactor

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-05-24 22:43:21 -04:00
Nick Pillitteri
481f14e1c0
TSDB: Don't rely on integer overflow in head compaction check (#13755)
* TSDB: Don't compact the head block when empty

Don't compact the Head block if there have not yet been any samples
appended.

Previously, the logic for determining if the head should be compacted
relied on the default values for min and max time and integer overflow
when they were checked in `Head.compactable()`. The check in
`Head.compactable()` effectively did `math.MinInt64 - math.MaxInt64`
which overflowed and wrapped to `1`. Since `1` is less than `1.5`
times the chunk range, compaction did not happen. This was the correct
behavior but relying on overflow wrapping is surprising.

This change add a method for checking if the min and max time for the
head is unset and uses it to short-circuit compaction in that case.
It also replaces several explicit checks for the default value to
determine if the head has not yet had any samples added.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2024-03-26 12:17:38 +01:00
Ben Ye
ceca6c4716
[ENHANCEMENT] TSDB: Log more statistics during startup (#13838)
* log chunk snapshot and mmap chunks replay duration together with total replay duration

Signed-off-by: Ben Ye <benye@amazon.com>
2024-03-26 11:16:27 +00:00
György Krajcsovits
4d4d822c36 Add native histograms to latency/duration metrics
Dogfood native histograms.
Allow dependent projects to migrate to native histograms.

I took the defaults from client_golang.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-01 14:44:38 +01:00
Bryan Boreham
93b72ec5dd tsdb: create SymbolTables for labels as required
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Fiona Liao
52389647b2 Add type label to outOfOrderSamplesAppended metric
Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
2024-02-19 15:24:39 +00:00
Bryan Boreham
cd4562d3a6
Merge pull request #13473 from bboreham/pure-mutex
tsdb: use cheaper Mutex on series
2024-01-30 09:57:08 +00:00
Marco Pracucci
501bc6419e
Add ShardedPostings() support to TSDB (#10421)
This PR is a reference implementation of the proposal described in #10420.

In addition to what described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer an hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing).

Follow up work
As mentioned in #10420, if this PR is accepted I'm also open to upload another foundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes.

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-29 11:57:27 +00:00
Bryan Boreham
66237c1996 tsdb: use cheaper Mutex on series
Mutex is 8 bytes; RWMutex is 24 bytes and much more complicated. Since
`RLock` is only used in two places, `UpdateMetadata` and `Delete`,
neither of which are hotspots, we should use the cheaper one.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-26 11:01:39 +00:00
Bryan Boreham
b9eab6e4b8
tsdb: simplify internal series delete function (#13261)
Lifting an optimisation from Agent code, `seriesHashmap.del` can use
the unique series reference, doesn't need to check Labels.
Also streamline the logic for deleting from `unique` and `conflicts` maps,
and add some comments to help the next person.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-25 11:57:54 +01:00
Bryan Boreham
3f30ad3cc2
Merge pull request #13015 from bboreham/smaller-txring
tsdb: make transaction isolation data structures smaller
2024-01-25 10:48:15 +00:00
Matthieu MOREL
8f6cf3aabb tsdb: use Go standard errors
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-11 12:18:54 +00:00
Arthur Silva Sens
5082655392
Append Created Timestamps (#12733)
* Append created timestamps.

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Log when created timestamps are ignored

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Proposed changes to Append CT PR.

Changes:

* Changed textparse Parser interface for consistency and robustness.
* Changed CT interface to be more explicit and handle validation.
* Simplified test, change scrapeManager to allow testability.
* Added TODOs.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Updates.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Refactor head_appender test

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Fix linter issues

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Use model.Sample in head appender test

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

---------

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Co-authored-by: bwplotka <bwplotka@gmail.com>
2023-12-11 08:43:42 +00:00
Xiaochao Dong
28d8f1650c
tsdb: Make sure the cache for postings cardinality properly honors the label name (#12653)
Add a string remembering which label and limit the cache corresponds to.

Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2023-11-28 13:54:37 +00:00
Arve Knudsen
1200c89d0c
Fix tsdb.stripeSeries.gc so it handles conflicts properly (#13195)
* Fix tsdb.stripeSeries.gc so it handles conflicts properly

tsdb.stripeSeries.gc needs to prune seriesHashmap.conflicts first,
otherwise seriesHashmap replaces the unique field with the first among
the conflicts. Also add regression test.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* TestStripeSeries_gc: Support stringlabels, don't use internals

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-11-28 14:43:35 +01:00