It is fairly common that the next item is the one we want, and cheap
to check.
We could also start the binary search one position on, but strangely
that slows it down.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
- The tool left an empty line behind that we don't need anymore, see
https://github.com/prometheus/prometheus/pull/17092. (Arguably not a
bug in the tool but just our stricter style about empty lines.)
- In tsdb/index/postings_test.go , our (admittedly somewhat
convoluted) code structure tricked the tool so it spit out something
that wouldn't even compile.
- storage/remote/queue_manager_test.go is just a minor formatting
nit.
Signed-off-by: beorn7 <beorn@grafana.com>
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.
This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.
Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.
Signed-off-by: beorn7 <beorn@grafana.com>
This is intended to make `intersectPostings` easier to follow.
Instead of cryptic `arr` and `cur`, name the members `postings` and
`current`.
Instead of updating `cur` to intermediate values encountered during
operations, introduce a local variable `target` meaning the ref we might
expect to find next, and only update `current` when an intersection is
found.
Name the function which implements seeking `Seek` instead of `doNext`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Lets take the given example:
P1: [2, 5, 9, 18, 21]
P2: [3, 7, 14, 19, 21]
P3: [1, 21]
Currently, we would only advance through P1 and P2 until discovering
an intersection and then checking P3. In essence, the traversal order
was: 2, 3, 5, 7, 9, 14, 18, 19, 21 (intersection found).
With the proposed change, P3 is also examined even if P1 and P2
haven't found an intersection yet. This adjustment allows for the
possibility of skipping some iterations.
Post-change, the traversal order becomes: 2, 3, 21 (3 iterations instead of 9).
Signed-off-by: alanprot <alanprot@gmail.com>
This enables it to take advantage of a more compact data structure
since all postings are known to be `*ListPostings`.
Remove the `Get` member which was not used for anything else, and fix up
tests.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Now we can call it with more specific types which is more efficient than
making everything go through the `Postings` interface.
Benchmark the concrete type.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
We need to create more postings entries so the merger has some work to do.
Not material for the regexp ones as they match so few series.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels
The labels for each series in the Head take up some some space in the
Postings index, but far more space in the `memSeries` structure.
Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.
Provide three implementations of this function for the three Labels
versions.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
While investigating lock contention on `MemPostings`, we saw that lots
of locking is happening in `LabelValues` and
`PostingsForLabelsMatching`, both copying the label values slices while
holding the mutex.
This adds an extra map that holds an append-only label values slice for
each one of the label names. Since the slice is append-only, it can be
copied without holding the mutex.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Same as #15427 but for the new method added in #14144
Instead of allocating each ListPostings one by one, allocate them all in
one go.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Instead of allocating ListPostings pointers one by one, allocate a slice
and take pointers from that. It's faster, and also generates less
garbage (NewListPostings is one of the top offenders in number of
allocations).
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Simple follow-up to #13620. Modify `tsdb.PostingsForMatchers` to use the optimized tsdb.IndexReader.PostingsForLabelMatching method also for inverse matching.
Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method.
The performance is much improved for a subset of the cases; there are up to
~60% CPU gains and ~12.5% reduction in memory usage.
Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since
`inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
This introduces back some unlocking that was removed in #13286 but in a
more balanced way, as suggested by @pracucci.
For TSDBs with a lot of churn, Delete() can take a couple of seconds,
and while it's holding the mutex, reads and writes are blocked waiting
for that mutex, increasing the number of connections handled and memory
usage.
This implementation pauses every 4K labels processed (note that also
compared to #13286 we're not processing all the label-values anymore,
but only the affected ones, because of #14307), makes sure that it's
possible to get the read lock, and waits for a few milliseconds more.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
This reverts commit 50ef0dc954592666a13ff92ef20811f0127c3b49.
Memory allocation goes so high in Prombench that the system is unusable.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Tests for Mempostings.{Add,Get} data race
* Fix MemPostings.{Add,Get} data race
We can't modify the postings list that are held in MemPostings as they
might already be in use by some readers.
* Modify BenchmarkHeadStripeSeriesCreate to have common labels
If there are no common labels on the series, we don't excercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.
---------
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.
This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Go's built-in append() grows larger slices with factor 1.3, which means we do a lot more allocating and copying for larger postings.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>