184 Commits

Author SHA1 Message Date
beorn7
71c21fb9e4 Fix minor issues after applying analyzer "modernize"
- The tool left an empty line behind that we don't need anymore, see
  https://github.com/prometheus/prometheus/pull/17092. (Arguably not a
  bug in the tool but just our stricter style about empty lines.)

- In tsdb/index/postings_test.go , our (admittedly somewhat
  convoluted) code structure tricked the tool so it spit out something
  that wouldn't even compile.

- storage/remote/queue_manager_test.go is just a minor formatting
  nit.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 15:44:11 +02:00
beorn7
747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Bryan Boreham
ba2a8aba8c
Merge pull request #17000 from bboreham/clarify-intersect
[REFACTOR] TSDB: Clarify intersectPostings
2025-08-06 12:07:10 +01:00
Bartlomiej Plotka
5df982538f
Merge pull request #16994 from mmorel-35/unused-parameters
chore: enable unused-receiver rule from revive
2025-08-06 10:46:28 +01:00
Ganesh Vernekar
64808d4f56
Merge pull request #16968 from pipiland2612/Remove_label_index
tsdb: Remove writing Label Index and Label Offset Table in the index
2025-08-05 15:12:44 -07:00
Bryan Boreham
e068c7332d [REFACTOR] TSDB: Clarify intersectPostings
This is intended to make `intersectPostings` easier to follow.

Instead of cryptic `arr` and `cur`, name the members `postings` and
`current`.

Instead of updating `cur` to intermediate values encountered during
operations, introduce a local variable `target` meaning the ref we might
expect to find next, and only update `current` when an intersection is
found.

Name the function which implements seeking `Seek` instead of `doNext`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-08-05 13:09:29 +01:00
Alan Protasio
25aee26a57
Improving "Sparse postings" intersection (#13971)
Lets take the given example:

P1: [2, 5, 9, 18, 21]
P2: [3, 7, 14, 19, 21]
P3: [1, 21]

Currently, we would only advance through P1 and P2 until discovering
an intersection and then checking P3. In essence, the traversal order
was: 2, 3, 5, 7, 9, 14, 18, 19, 21 (intersection found).

With the proposed change, P3 is also examined even if P1 and P2
haven't found an intersection yet. This adjustment allows for the
possibility of skipping some iterations.

Post-change, the traversal order becomes: 2, 3, 21 (3 iterations instead of 9).

Signed-off-by: alanprot <alanprot@gmail.com>
2025-08-05 12:22:54 +01:00
Matthieu MOREL
cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
pipiland2612
8b24acb729 Remove label index and labe offset index
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-08-01 13:50:49 +03:00
hardlydearly
ba4b058b7a refactor: use slices.Contains to simplify code
Signed-off-by: hardlydearly <799511800@qq.com>
2025-05-09 08:27:10 +02:00
Andre Branchizio
b07b552139
[PERF] TSDB: Pass down label value limit into implementation (#16158)
* allow limiting label values calls

Signed-off-by: Andre Branchizio <andrejbranch@gmail.com>
2025-05-06 18:54:48 +01:00
Arve Knudsen
e7e3ab2824
Fix linting issues found by golangci-lint v2.0.2 (#16368)
* Fix linting issues found by golangci-lint v2.0.2

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-05-03 19:05:13 +02:00
Matthieu MOREL
c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
Ben Ye
919a5b657e
Expose ListPostings Length via Len() method (#15678)
tsdb: expose remaining ListPostings Length

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-07 17:58:26 +01:00
Bryan Boreham
cfa32f3d28 TSDB: Move merge of head postings into index
This enables it to take advantage of a more compact data structure
since all postings are known to be `*ListPostings`.

Remove the `Get` member which was not used for anything else, and fix up
tests.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 19:22:30 +00:00
Bryan Boreham
0a8779f46d TSDB: Make mergedPostings generic
Now we can call it with more specific types which is more efficient than
making everything go through the `Postings` interface.

Benchmark the concrete type.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 17:09:21 +00:00
Bryan Boreham
1b22242024 TSDB BenchmarkMerge: run fewer sizes
As long as we run small and big sizes, we don't need all the sizes inbetween.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 17:09:21 +00:00
Bryan Boreham
e630ffdbed TSDB: extend BenchmarkMemPostings_PostingsForLabelMatching to check merge speed
We need to create more postings entries so the merger has some work to do.
Not material for the regexp ones as they match so few series.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-20 17:09:21 +00:00
Oleg Zaytsev
17d5bc4e54
Update comment on MemPostings.lvs
There was a missing verb there.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-12-16 17:20:51 +01:00
Bryan Boreham
ac4f8a5e23
[ENHANCEMENT] TSDB: Improve calculation of space used by labels (#13880)
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels

The labels for each series in the Head take up some some space in the
Postings index, but far more space in the `memSeries` structure.

Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.

Provide three implementations of this function for the three Labels
versions.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-16 09:42:52 +00:00
Oleg Zaytsev
cd1f8ac129
MemPostings: keep a map of label values slices (#15426)
While investigating lock contention on `MemPostings`, we saw that lots
of locking is happening in `LabelValues` and
`PostingsForLabelsMatching`, both copying the label values slices while
holding the mutex.

This adds an extra map that holds an append-only label values slice for
each one of the label names. Since the slice is append-only, it can be
copied without holding the mutex.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-29 12:52:56 +01:00
Bryan Boreham
ca3119bd24 TSDB: eliminate one yolostring
When the only use of a []byte->string conversion is as a map key, Go
doesn't allocate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-26 17:21:55 +00:00
Bryan Boreham
e98c19c1ce [PERF] TSDB: Cache all symbols for compaction
Trade a bit more memory for a lot less CPU spent looking up symbols.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-26 17:21:55 +00:00
Oleg Zaytsev
9aa6e041d3
MemPostings: allocate ListPostings once in PFALV (#15465)
Same as #15427 but for the new method added in #14144

Instead of allocating each ListPostings one by one, allocate them all in
one go.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-26 16:03:45 +01:00
Oleg Zaytsev
cc390aab64
MemPostings: allocate ListPostings once in PFLM (#15427)
Instead of allocating ListPostings pointers one by one, allocate a slice
and take pointers from that. It's faster, and also generates less
garbage (NewListPostings is one of the top offenders in number of
allocations).

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-20 17:52:20 +01:00
Arve Knudsen
89bbb885e5
Upgrade to golangci-lint v1.62.0 (#15424)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-20 17:22:20 +01:00
Arve Knudsen
06d54fcc6c
[PERF] TSDB: Optimize inverse matching (#14144)
Simple follow-up to #13620. Modify `tsdb.PostingsForMatchers` to use the optimized tsdb.IndexReader.PostingsForLabelMatching method also for inverse matching.

Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method.

The performance is much improved for a subset of the cases; there are up to
~60% CPU gains and ~12.5% reduction in memory usage. 

Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since
`inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-19 15:49:01 +00:00
Ben Ye
140f4aa9ae
feat: Allow customizing TSDB postings decoder (#13567)
* allow customizing TSDB postings decoder

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-11 07:59:24 +01:00
Oleg Zaytsev
b1e4052682
MemPostings.Delete(): make pauses to unlock and let the readers read (#15242)
This introduces back some unlocking that was removed in #13286 but in a
more balanced way, as suggested by @pracucci.

For TSDBs with a lot of churn, Delete() can take a couple of seconds,
and while it's holding the mutex, reads and writes are blocked waiting
for that mutex, increasing the number of connections handled and memory
usage.

This implementation pauses every 4K labels processed (note that also
compared to #13286 we're not processing all the label-values anymore,
but only the affected ones, because of #14307), makes sure that it's
possible to get the read lock, and waits for a few milliseconds more.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2024-11-05 12:59:57 +01:00
Bryan Boreham
2fbbfc3da8 Revert "Fix MemPostings.Add and MemPostings.Get data race (#15141)"
This reverts commit 50ef0dc954592666a13ff92ef20811f0127c3b49.

Memory allocation goes so high in Prombench that the system is unusable.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-03 12:30:34 +00:00
Bryan Boreham
e2e01c1cff
Merge pull request #15216 from yeya24/log-last-series-labels
log last series labelset when hitting OOO series labels
2024-11-01 14:15:39 +00:00
Oleg Zaytsev
ba11a55df4
Revert "Process MemPostings.Delete() with GOMAXPROCS workers"
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-10-29 17:13:40 +01:00
Ben Ye
99882eec3b log last series labelset when hitting OOO series labels during compaction
Signed-off-by: Ben Ye <benye@amazon.com>
2024-10-24 09:27:15 -07:00
Oleg Zaytsev
50ef0dc954
Fix MemPostings.Add and MemPostings.Get data race (#15141)
* Tests for Mempostings.{Add,Get} data race
* Fix MemPostings.{Add,Get} data race

We can't modify the postings list that are held in MemPostings as they
might already be in use by some readers.

* Modify BenchmarkHeadStripeSeriesCreate to have common labels

If there are no common labels on the series, we don't excercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.

---------

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-10-11 15:21:15 +02:00
Bryan Boreham
54de4fb780
Merge pull request #14975 from colega/process-mempostings-delete-with-gomaxprocs-workers
Process `MemPostings.Delete()` with `GOMAXPROCS` workers
2024-09-29 07:58:42 +01:00
Oleg Zaytsev
ada8a6ef10
Add some more tests for MemPostings_Delete
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-27 10:14:39 +02:00
Oleg Zaytsev
4fd2556baa
Extract processWithBoundedParallelismAndConsistentWorkers
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-26 15:43:19 +02:00
Oleg Zaytsev
ccd0308abc
Don't do anything if MemPostings are empty
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 15:00:10 +02:00
Oleg Zaytsev
9c417aa710
Fix deadlock with empty MemPostings
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 14:08:50 +02:00
Bryan Boreham
5d8f0ef0c2
Merge pull request #14721 from bboreham/exp-grow-postings
[PERF] TSDB: Grow postings by doubling
2024-09-25 10:47:55 +01:00
Oleg Zaytsev
e196b977af
Process MemPostings.Delete() with GOMAXPROCS workers
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.

This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 10:38:47 +02:00
Bryan Boreham
ca673eb749 Merge remote-tracking branch 'origin/release-2.55' into merge-2.55-into-main
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-22 17:49:34 +01:00
Bryan Boreham
31c5760551
Neater string vs byte-slice conversions (#14425)
unsafe.Slice and unsafe.StringData were added in Go 1.20

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-21 12:19:21 +02:00
Ganesh Vernekar
5ccb069414 Backward compatibility with upcoming index v3
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-09-19 10:27:52 +01:00
Bryan Boreham
33adbe47b1 [PERF] TSDB: Grow postings by doubling
Go's built-in append() grows larger slices with factor 1.3, which means we do a lot more allocating and copying for larger postings.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-24 11:16:58 +01:00
beorn7
0f760f63dd lint: Revamp our linting rules, mostly around doc comments
Several things done here:

- Set `max-issues-per-linter` to 0 so that we actually see all linter
  warnings and not just 50 per linter. (As we also set
  `max-same-issues` to 0, I assume this was the intention from the
  beginning.)

- Stop using the golangci-lint default excludes (by setting
  `exclude-use-default: false`. Those are too generous and don't match
  our style conventions. (I have re-added some of the excludes
  explicitly in this commit. See below.)

- Re-add the `errcheck` exclusion we have used so far via the
  defaults.

- Exclude the signature requirement `govet` has for `Seek` methods
  because we use non-standard `Seek` methods a lot. (But we keep other
  requirements, while the default excludes completely disabled the
  check for common method segnatures.)

- Exclude warnings about missing doc comments on exported symbols. (We
  used to be pretty adamant about doc comments, but stopped that at
  some point in the past. By now, we have about 500 missing doc
  comments. We may consider reintroducing this check, but that's
  outside of the scope of this commit. The default excludes of
  golangci-lint essentially ignore doc comments completely.)

- By stop using the default excludes, we now get warnings back on
  malformed doc comments. That's the most impactful change in this
  commit. It does not enforce doc comments (again), but _if_ there is
  a doc comment, it has to have the recommended form. (Most of the
  changes in this commit are fixing this form.)

- Improve wording/spelling of some comments in .golangci.yml, and
  remove an outdated comment.

- Leave `package-comments` inactive, but add a TODO asking if we
  should change that.

- Add a new sub-linter `comment-spacings` (and fix corresponding
  comments), which avoids missing spaces after the leading `//`.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-08-22 17:36:11 +02:00
Arve Knudsen
3a78e76282 Upgrade golangci-lint to v1.60.1
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-18 12:13:25 +02:00
Oleg Zaytsev
726ed124e4
Replace ListPostings.Seek's binary search call by the generic slices.BinarySearch (#14393)
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-02 14:51:05 +01:00
Oleg Zaytsev
fd1a89b7c8
Pass affected labels to MemPostings.Delete() (#14307)
* Pass affected labels to MemPostings.Delete

As suggested by @bboreham, we can track the labels of the deleted series
and avoid iterating through all the label/value combinations.

This looks much faster on the MemPostings.Delete call. We don't have a
benchmark on stripeSeries.gc() where we'll pay the price of iterating
the labels of each one of the deleted series.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 10:28:56 +00:00
Ben Ye
0e6fca8e76 add unit test
Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-16 12:09:42 -07:00