- The tool left an empty line behind that we don't need anymore, see
https://github.com/prometheus/prometheus/pull/17092. (Arguably not a
bug in the tool but just our stricter style about empty lines.)
- In tsdb/index/postings_test.go, our (admittedly somewhat
convoluted) code structure tricked the tool into spitting out something
that wouldn't even compile.
- storage/remote/queue_manager_test.go is just a minor formatting
nit.
Signed-off-by: beorn7 <beorn@grafana.com>
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.
This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency about what
was done automatically.
Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid
micro-discussions in tiny style PRs later.
Signed-off-by: beorn7 <beorn@grafana.com>
This is intended to make `intersectPostings` easier to follow.
Instead of cryptic `arr` and `cur`, name the members `postings` and
`current`.
Instead of updating `cur` to intermediate values encountered during
operations, introduce a local variable `target` meaning the ref we might
expect to find next, and only update `current` when an intersection is
found.
Name the function which implements seeking `Seek` instead of `doNext`.
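For illustration, here is a rough sketch of the reshaped iterator with
simplified stand-in types; the real code in tsdb/index has more to it
(error handling, for one), so treat this only as a sketch of the naming
and control flow described above:

```go
package main

import "fmt"

// Simplified stand-ins for the tsdb/index types.
type SeriesRef uint64

type Postings interface {
    Next() bool
    Seek(v SeriesRef) bool
    At() SeriesRef
}

// intersectPostings keeps its iterators in `postings` (was `arr`) and the
// last ref all of them agreed on in `current` (was `cur`).
type intersectPostings struct {
    postings []Postings
    current  SeriesRef
}

// Seek (formerly doNext) advances every iterator to at least `target`,
// raising `target` whenever one of them overshoots, and only stores into
// `current` once an intersection is found.
func (it *intersectPostings) Seek(target SeriesRef) bool {
    for {
        agreed := true
        for _, p := range it.postings {
            if !p.Seek(target) {
                return false
            }
            if at := p.At(); at > target {
                target = at
                agreed = false
            }
        }
        if agreed {
            it.current = target
            return true
        }
    }
}

func (it *intersectPostings) Next() bool    { return it.Seek(it.current + 1) }
func (it *intersectPostings) At() SeriesRef { return it.current }

// listPostings is a minimal list-backed Postings, just enough to run the sketch.
type listPostings struct {
    list []SeriesRef
    cur  SeriesRef
}

func (p *listPostings) Next() bool {
    if len(p.list) == 0 {
        return false
    }
    p.cur, p.list = p.list[0], p.list[1:]
    return true
}

func (p *listPostings) Seek(v SeriesRef) bool {
    if p.cur >= v {
        return true
    }
    for p.Next() {
        if p.cur >= v {
            return true
        }
    }
    return false
}

func (p *listPostings) At() SeriesRef { return p.cur }

func main() {
    it := &intersectPostings{postings: []Postings{
        &listPostings{list: []SeriesRef{2, 5, 9, 21}},
        &listPostings{list: []SeriesRef{3, 9, 21}},
    }}
    for it.Next() {
        fmt.Println(it.At()) // prints 9, then 21
    }
}
```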
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Let's take the following example:
P1: [2, 5, 9, 18, 21]
P2: [3, 7, 14, 19, 21]
P3: [1, 21]
Currently, we would only advance through P1 and P2 until discovering
an intersection, and only then check P3. In essence, the traversal
order was: 2, 3, 5, 7, 9, 14, 18, 19, 21 (intersection found).
With the proposed change, P3 is also examined even if P1 and P2
haven't found an intersection yet. This adjustment allows for the
possibility of skipping some iterations.
Post-change, the traversal order becomes: 2, 3, 21 (3 iterations instead of 9).
Signed-off-by: alanprot <alanprot@gmail.com>
This enables it to take advantage of a more compact data structure
since all postings are known to be `*ListPostings`.
Remove the `Get` member which was not used for anything else, and fix up
tests.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Now we can call it with more specific types, which is more efficient
than making everything go through the `Postings` interface.
Benchmark the concrete type.
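As a hedged illustration of the shape of such a change (simplified
stand-ins, not the real tsdb/index signatures): with a type parameter
constrained by the Postings interface, a caller that already holds a
[]*ListPostings can pass it straight through, instead of first copying
everything into a []Postings.

```go
package main

import (
    "fmt"
    "slices"
)

type SeriesRef uint64

type Postings interface {
    Next() bool
    At() SeriesRef
}

type ListPostings struct {
    list []SeriesRef
    cur  SeriesRef
}

func (p *ListPostings) Next() bool {
    if len(p.list) == 0 {
        return false
    }
    p.cur, p.list = p.list[0], p.list[1:]
    return true
}

func (p *ListPostings) At() SeriesRef { return p.cur }

// Merge is generic over the concrete postings type; the body here is a toy.
func Merge[P Postings](its ...P) []SeriesRef {
    var out []SeriesRef
    for _, it := range its {
        for it.Next() {
            out = append(out, it.At())
        }
    }
    slices.Sort(out)
    return slices.Compact(out)
}

func main() {
    lps := []*ListPostings{
        {list: []SeriesRef{1, 3, 5}},
        {list: []SeriesRef{3, 6}},
    }
    // P is inferred as *ListPostings; no []Postings conversion is needed.
    fmt.Println(Merge(lps...)) // [1 3 5 6]
}
```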
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
We need to create more postings entries so the merger has some work to do.
This is not material for the regexp cases, as they match so few series.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels
The labels for each series in the Head take up some space in the
Postings index, but far more space in the `memSeries` structure.
Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.
Provide three implementations of this function for the three Labels
versions.
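A sketch of the layering, with hypothetical names and simplified types,
just to show where the boundary sits: the index-side code only calls the
sizing function it was given and never inspects the Labels
representation (slice-based, stringlabels, dedupelabels) itself.

```go
package main

import "fmt"

type Label struct{ Name, Value string }
type Labels []Label

// LabelsSizeFunc reports the memory attributed to one series' labels.
// The name is hypothetical; the real signature may differ.
type LabelsSizeFunc func(Labels) int

// sliceLabelsSize is what an implementation for the plain slice
// representation could look like.
func sliceLabelsSize(ls Labels) int {
    n := 0
    for _, l := range ls {
        n += len(l.Name) + len(l.Value)
    }
    return n
}

// labelsSpaceByMetric stands in for the index-side statistics code: it only
// calls the supplied function, avoiding the layering violation.
func labelsSpaceByMetric(series []Labels, sizeOf LabelsSizeFunc) map[string]int {
    out := map[string]int{}
    for _, ls := range series {
        for _, l := range ls {
            if l.Name == "__name__" {
                out[l.Value] += sizeOf(ls)
            }
        }
    }
    return out
}

func main() {
    series := []Labels{
        {{Name: "__name__", Value: "http_requests_total"}, {Name: "job", Value: "api"}},
        {{Name: "__name__", Value: "http_requests_total"}, {Name: "job", Value: "web"}},
    }
    fmt.Println(labelsSpaceByMetric(series, sliceLabelsSize))
}
```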
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
While investigating lock contention on `MemPostings`, we saw that a lot
of locking happens in `LabelValues` and `PostingsForLabelsMatching`,
both of which copy the label values slices while holding the mutex.
This adds an extra map that holds an append-only label values slice for
each one of the label names. Since the slice is append-only, it can be
copied without holding the mutex.
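A rough sketch of the idea (not the actual MemPostings code): readers
grab the slice header under the lock, which is cheap, and do the actual
copy after releasing it, because existing elements of an append-only
slice never change.

```go
package main

import (
    "fmt"
    "sync"
)

type memPostings struct {
    mtx sync.RWMutex

    // name -> value -> series refs, as in the existing structure.
    m map[string]map[string][]uint64

    // name -> append-only list of values ever added for that name.
    lvs map[string][]string
}

func (p *memPostings) add(name, value string, ref uint64) {
    p.mtx.Lock()
    defer p.mtx.Unlock()
    if p.m[name] == nil {
        p.m[name] = map[string][]uint64{}
    }
    if _, known := p.m[name][value]; !known {
        p.lvs[name] = append(p.lvs[name], value) // append-only
    }
    p.m[name][value] = append(p.m[name][value], ref)
}

func (p *memPostings) labelValues(name string) []string {
    p.mtx.RLock()
    vals := p.lvs[name] // cheap: just the slice header
    p.mtx.RUnlock()

    // The copy happens outside the lock; concurrent appends may grow lvs,
    // but the elements visible through this header are never mutated.
    out := make([]string, len(vals))
    copy(out, vals)
    return out
}

func main() {
    p := &memPostings{m: map[string]map[string][]uint64{}, lvs: map[string][]string{}}
    p.add("job", "api", 1)
    p.add("job", "web", 2)
    fmt.Println(p.labelValues("job")) // [api web]
}
```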
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Same as #15427, but for the new method added in #14144.
Instead of allocating each ListPostings one by one, allocate them all in
one go.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Instead of allocating ListPostings pointers one by one, allocate a slice
and take pointers from that. It's faster, and also generates less
garbage (NewListPostings is one of the top offenders in number of
allocations).
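An illustrative sketch of the allocation pattern (not the exact code):

```go
package main

import "fmt"

type ListPostings struct{ list []uint64 }

// buildOneByOne is the pattern being replaced: one heap allocation per entry.
func buildOneByOne(lists [][]uint64) []*ListPostings {
    out := make([]*ListPostings, 0, len(lists))
    for _, l := range lists {
        out = append(out, &ListPostings{list: l})
    }
    return out
}

// buildInOneGo allocates a single backing slice and hands out pointers into it.
func buildInOneGo(lists [][]uint64) []*ListPostings {
    lps := make([]ListPostings, len(lists)) // one allocation for all entries
    out := make([]*ListPostings, len(lists))
    for i, l := range lists {
        lps[i] = ListPostings{list: l}
        out[i] = &lps[i]
    }
    return out
}

func main() {
    lists := [][]uint64{{1, 2}, {3}, {4, 5, 6}}
    fmt.Println(len(buildOneByOne(lists)), len(buildInOneGo(lists)))
}
```

The trade-off is that the shared backing slice stays reachable for as
long as any single pointer into it does.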
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Simple follow-up to #13620. Modify `tsdb.PostingsForMatchers` to use the optimized `tsdb.IndexReader.PostingsForLabelMatching` method also for inverse matching.
Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method.
The performance is much improved for a subset of the cases; there are up to
~60% CPU gains and ~12.5% reduction in memory usage.
Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since
`inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
This reintroduces some of the unlocking that was removed in #13286, but
in a more balanced way, as suggested by @pracucci.
For TSDBs with a lot of churn, Delete() can take a couple of seconds,
and while it's holding the mutex, reads and writes are blocked waiting
for that mutex, which drives up the number of connections being handled
and the memory usage.
This implementation pauses every 4K labels processed (note that,
compared to #13286, we no longer process all the label values but only
the affected ones, thanks to #14307), makes sure that it's possible to
get the read lock, and then waits a few milliseconds more.
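A hedged sketch of the pacing; the batch size and the steps come from the
description above, everything else is illustrative:

```go
package main

import (
    "sync"
    "time"
)

type label struct{ name, value string }

type memPostings struct {
    mtx sync.RWMutex
    m   map[string]map[string][]uint64
}

func (p *memPostings) deleteLabelValues(affected []label) {
    const pauseEvery = 4096

    p.mtx.Lock()
    for i, lv := range affected {
        delete(p.m[lv.name], lv.value)

        if (i+1)%pauseEvery == 0 && i+1 < len(affected) {
            p.mtx.Unlock()

            // Readers blocked behind our write lock can now proceed; taking
            // the read lock ourselves proves it is obtainable again.
            p.mtx.RLock()
            p.mtx.RUnlock()

            time.Sleep(time.Millisecond)
            p.mtx.Lock()
        }
    }
    p.mtx.Unlock()
}

func main() {
    p := &memPostings{m: map[string]map[string][]uint64{"job": {"api": {1}}}}
    p.deleteLabelValues([]label{{"job", "api"}})
}
```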
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
This reverts commit 50ef0dc954592666a13ff92ef20811f0127c3b49.
Memory allocation goes so high in Prombench that the system is unusable.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Tests for MemPostings.{Add,Get} data race
* Fix MemPostings.{Add,Get} data race
We can't modify the postings lists that are held in MemPostings as they
might already be in use by some readers; see the sketch after this list.
* Modify BenchmarkHeadStripeSeriesCreate to have common labels
If there are no common labels on the series, we don't exercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.
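A sketch of the rule above, not the exact fix that was merged: readers
obtain a postings slice from the map and may keep iterating it after
releasing the lock, so the writer must never modify that slice in place;
it builds a copy and points the map at the copy instead.

```go
package main

import (
    "fmt"
    "sync"
)

type memPostings struct {
    mtx sync.RWMutex
    m   map[string]map[string][]uint64
}

// get mirrors the reader side: the returned slice may still be in use
// after the lock has been released.
func (p *memPostings) get(name, value string) []uint64 {
    p.mtx.RLock()
    defer p.mtx.RUnlock()
    return p.m[name][value]
}

func (p *memPostings) addFor(name, value string, ref uint64) {
    p.mtx.Lock()
    defer p.mtx.Unlock()

    if p.m[name] == nil {
        p.m[name] = map[string][]uint64{}
    }
    old := p.m[name][value]

    // Work on a copy: `old` may be in the hands of a reader right now.
    list := make([]uint64, len(old), len(old)+1)
    copy(list, old)
    list = append(list, ref)
    // Keep the copy sorted by shifting the new ref into place.
    for i := len(list) - 1; i > 0 && list[i] < list[i-1]; i-- {
        list[i], list[i-1] = list[i-1], list[i]
    }

    p.m[name][value] = list
}

func main() {
    p := &memPostings{m: map[string]map[string][]uint64{}}
    p.addFor("job", "api", 2)
    p.addFor("job", "api", 1)
    fmt.Println(p.get("job", "api")) // [1 2]
}
```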
---------
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.
This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Go's built-in append() grows larger slices by a factor of about 1.3, which means we do a lot more allocating and copying for larger postings.
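One way to implement the workaround is a small append helper that grows
the capacity by doubling; this is a sketch, not necessarily the exact
code that was merged:

```go
package main

import "fmt"

// appendWithDoubling appends v to s, but when the slice is full it doubles
// the capacity itself instead of relying on append's growth factor for
// large slices, so big postings lists reallocate and copy less often.
func appendWithDoubling[T any](s []T, v T) []T {
    if cap(s) < len(s)+1 {
        grown := make([]T, len(s), len(s)*2+1)
        copy(grown, s)
        s = grown
    }
    return append(s, v)
}

func main() {
    var refs []uint64
    for i := uint64(0); i < 10; i++ {
        refs = appendWithDoubling(refs, i)
    }
    fmt.Println(len(refs), cap(refs)) // capacity grows 1, 3, 7, 15, ...
}
```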
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Several things done here:
- Set `max-issues-per-linter` to 0 so that we actually see all linter
warnings and not just 50 per linter. (As we also set
`max-same-issues` to 0, I assume this was the intention from the
beginning.)
- Stop using the golangci-lint default excludes (by setting
`exclude-use-default: false`). Those are too generous and don't match
our style conventions. (I have re-added some of the excludes
explicitly in this commit. See below.)
- Re-add the `errcheck` exclusion we have used so far via the
defaults.
- Exclude the signature requirement `govet` has for `Seek` methods
because we use non-standard `Seek` methods a lot. (But we keep other
requirements, while the default excludes completely disabled the
check for common method signatures.)
- Exclude warnings about missing doc comments on exported symbols. (We
used to be pretty adamant about doc comments, but stopped that at
some point in the past. By now, we have about 500 missing doc
comments. We may consider reintroducing this check, but that's
outside of the scope of this commit. The default excludes of
golangci-lint essentially ignore doc comments completely.)
- By no longer using the default excludes, we now get warnings back on
malformed doc comments. That's the most impactful change in this
commit. It does not enforce doc comments (again), but _if_ there is
a doc comment, it has to have the recommended form. (Most of the
changes in this commit are fixing this form.)
- Improve wording/spelling of some comments in .golangci.yml, and
remove an outdated comment.
- Leave `package-comments` inactive, but add a TODO asking if we
should change that.
- Add a new sub-linter `comment-spacings` (and fix corresponding
comments), which catches missing spaces after the leading `//`.
Signed-off-by: beorn7 <beorn@grafana.com>
* Pass affected labels to MemPostings.Delete
As suggested by @bboreham, we can track the labels of the deleted series
and avoid iterating through all the label/value combinations.
This looks much faster on the MemPostings.Delete call. We don't have a
benchmark on stripeSeries.gc(), where we'll pay the price of iterating
the labels of each one of the deleted series.
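A sketch of what the changed interface can look like (simplified types,
locking omitted for brevity; not the exact Prometheus signature): the
caller already knows which label/value pairs belonged to the deleted
series, so Delete only visits those entries instead of walking every
label and value in the index.

```go
package main

import "fmt"

type label struct{ name, value string }

type memPostings struct {
    m map[string]map[string][]uint64
}

func (p *memPostings) deleteSeries(deleted map[uint64]struct{}, affected map[label]struct{}) {
    for lv := range affected { // only the affected pairs, not the whole index
        old := p.m[lv.name][lv.value]
        kept := old[:0:0]
        for _, ref := range old {
            if _, ok := deleted[ref]; !ok {
                kept = append(kept, ref)
            }
        }
        if len(kept) == 0 {
            delete(p.m[lv.name], lv.value)
        } else {
            p.m[lv.name][lv.value] = kept
        }
    }
}

func main() {
    p := &memPostings{m: map[string]map[string][]uint64{
        "job":      {"api": {1, 2}},
        "instance": {"a": {1}, "b": {2}},
    }}
    // Series 1 is being deleted; the caller tracked its labels while gc'ing.
    p.deleteSeries(
        map[uint64]struct{}{1: {}},
        map[label]struct{}{{"job", "api"}: {}, {"instance", "a"}: {}},
    )
    fmt.Println(p.m) // map[instance:map[b:[2]] job:map[api:[2]]]
}
```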
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>