See the detailed analysis: https://docs.google.com/document/d/1efVAMcEw7-R_KatHHcobcFBlNsre-DoThVHI8AO2SDQ/edit?tab=t.0
I ran extensive benchmarks using synthetic data as well as real WAL segments pulled from the prombench runs.
All benchmarks are here: https://github.com/prometheus/prometheus/compare/bwplotka/wal-reuse?expand=1
* optimization(tsdb/wlog): reuse Ref* buffers across WAL watchers' reads
Signed-off-by: bwplotka <bwplotka@gmail.com>
* optimization(tsdb/wlog): avoid expensive error wraps
Signed-off-by: bwplotka <bwplotka@gmail.com>
* optimization(tsdb/wlog): reuse array for filtering
Signed-off-by: bwplotka <bwplotka@gmail.com>
* fmt
Signed-off-by: bwplotka <bwplotka@gmail.com>
* lint fix
Signed-off-by: bwplotka <bwplotka@gmail.com>
* tsdb/record: add test for clear() on histograms
Signed-off-by: bwplotka <bwplotka@gmail.com>
* updated WriteTo with what's currently expected
Signed-off-by: bwplotka <bwplotka@gmail.com>
---------
Signed-off-by: bwplotka <bwplotka@gmail.com>
* RemoveTmpDirs function to tsdbutil
* Refactor db to use RemoveTmpDirs and no longer cleanup checkpoint tmp dirs
* Use RemoveTmpDirs in wlog checkpoint to cleanup all checkpoint tmp folders
* Add tests for RemoveTmpDirs
* Ensure db.Open will still cleanup extra temporary checkpoints
Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
Initial implementation of https://github.com/prometheus/prometheus/issues/17790.
Only implements ST-per-sample for Counters. Tests and benchmarks updated.
Note: This increases the size of the RefSample object for all users, whether st-per-sample is turned on or not.
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Modernize tsdb package by migrating multi-error handling
to the standard library errors package.
* Add a modernized CloseAll helper.
Signed-off-by: SuperQ <superq@gmail.com>
Reduce the resolution of histograms as needed and ignore invalid
schemas while emitting a warning log.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.
This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that it stays transparent what
was done automatically.
Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.
Signed-off-by: beorn7 <beorn@grafana.com>
* Unregister metrics emitted by `remote.WriteStorage` when closed
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Address PR feedback: add test
Signed-off-by: Charles Korn <charles.korn@grafana.com>
---------
Signed-off-by: Charles Korn <charles.korn@grafana.com>
The `:=` causes new variables to be created, which means the outer
slice stays at nil, and new memory is allocated every time round the
loop.
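
A minimal, self-contained sketch of the pattern described above. The function names (`decode`, `shadowed`, `reused`) are hypothetical and not taken from the Prometheus code. In `shadowed`, the `:=` declares a fresh `buf` in each iteration, so the outer slice never receives the grown backing array. In `reused`, plain `=` assigns to the outer variables.

```go
package main

import "fmt"

// decode appends n values to buf and returns the (possibly grown) slice.
func decode(buf []int, n int) ([]int, error) {
	for i := 0; i < n; i++ {
		buf = append(buf, i)
	}
	return buf, nil
}

// shadowed reproduces the bug: the right-hand side reads the outer (nil) buf,
// while `:=` stores the result in a new inner buf that is lost each round.
func shadowed(rounds, n int) int {
	var buf []int
	for r := 0; r < rounds; r++ {
		buf, err := decode(buf[:0], n)
		if err != nil {
			return -1
		}
		_ = buf // inner variable; the outer buf is untouched
	}
	return cap(buf) // capacity of the outer slice
}

// reused declares the variables once and assigns with `=`, so the backing
// array allocated in the first round is reused in later rounds.
func reused(rounds, n int) int {
	var (
		buf []int
		err error
	)
	for r := 0; r < rounds; r++ {
		buf, err = decode(buf[:0], n)
		if err != nil {
			return -1
		}
	}
	return cap(buf)
}

func main() {
	fmt.Println("shadowed outer cap:", shadowed(3, 8))
	fmt.Println("reused outer cap:  ", reused(3, 8))
}
```

The outer capacity stays at zero in the shadowed variant, which is a quick way to confirm that every round allocated from scratch.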
Extracted from https://github.com/prometheus/prometheus/pull/16182
Credit to @bwplotka.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Renames the head's deleted map to walExpiries, and creates entries for any
duplicate series records encountered during WAL replay, with the expiry set
to the highest current WAL segment number. Any subsequent WAL
checkpoints will see the duplicate series entry in the walExpiries map, and
keep the series record until the last WAL segment that could contain its
samples is deleted.
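
As a rough sketch of the keep decision described above, under simplifying assumptions: the type `walExpiries`, the method `keepRecord`, and the parameter `lastDeleted` are hypothetical names chosen for this example, and the real head also keeps records for series that still exist in memory, which is omitted here.

```go
package main

import "fmt"

// walExpiries maps a series reference to the highest WAL segment number that
// may still contain samples for that series (simplified, hypothetical shape).
type walExpiries map[uint64]int

// keepRecord reports whether a checkpoint that allows segments up to and
// including lastDeleted to be removed must still carry the series record.
func (w walExpiries) keepRecord(ref uint64, lastDeleted int) bool {
	keepUntil, ok := w[ref]
	return ok && keepUntil > lastDeleted
}

func main() {
	// Series 42 was seen as a duplicate while segment 7 was the newest.
	w := walExpiries{42: 7}
	fmt.Println(w.keepRecord(42, 5)) // true: segment 7 is still on disk
	fmt.Println(w.keepRecord(42, 7)) // false: segment 7 is being removed
	fmt.Println(w.keepRecord(1, 5))  // false: no expiry entry for series 1
}
```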
Other considerations:
- WBL: series records aren't written to the WBL, so there are no duplicates to deal with.
- Agent mode: has its own WAL replay logic that handles duplicate series records differently, and is outside the scope of this PR.
Fix issues raised by staticcheck
We are not enabling staticcheck explicitly, though, because it has too many false positives.
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Exported the CheckpointPrefix constant to be used in other packages.
Updated references to the constant in db.go and checkpoint.go files.
This change improves code readability and maintainability.
Signed-off-by: johncming <johncming@yahoo.com>
Co-authored-by: johncming <conjohn668@gmail.com>
The segment size was too low for the additional NHCB data, thus it created
more segments than expected. This meant that less data was in the
lower-numbered segments, which meant more was kept.
FAIL: TestCheckpoint (4.05s)
FAIL: TestCheckpoint/compress=none (0.22s)
checkpoint_test.go:361:
Error Trace: /home/krajo/go/github.com/prometheus/prometheus/tsdb/wlog/checkpoint_test.go:361
Error: "0.8586956521739131" is not less than "0.8"
Test: TestCheckpoint/compress=none
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
It was crashing due to uninitialized metrics, and not terminating due to
incorrectly reading segment names.
We need to export `SetMetrics` to avoid the first problem.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
For: #14355
This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type
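
The nil-logger guard mentioned in the last item can be sketched as follows. This is an illustration only: the `component` type and `newComponent` constructor are hypothetical, and falling back to a logger that discards output is an assumption standing in for the `$makeLogger` placeholder above.

```go
package main

import (
	"io"
	"log/slog"
)

// component is a hypothetical type that stores a logger.
type component struct {
	logger *slog.Logger
}

// newComponent tolerates a nil logger. Unlike a nil go-kit/log.Logger passed
// to helper functions, calling With on a nil *slog.Logger would panic, so a
// fallback is substituted first (here: a logger that discards everything).
func newComponent(logger *slog.Logger) *component {
	if logger == nil {
		logger = slog.New(slog.NewTextHandler(io.Discard, nil))
	}
	return &component{logger: logger.With("component", "example")}
}

func main() {
	c := newComponent(nil)
	c.logger.Info("started") // safe: output is discarded
}
```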
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Several things done here:
- Set `max-issues-per-linter` to 0 so that we actually see all linter
warnings and not just 50 per linter. (As we also set
`max-same-issues` to 0, I assume this was the intention from the
beginning.)
- Stop using the golangci-lint default excludes (by setting
`exclude-use-default: false`). Those are too generous and don't match
our style conventions. (I have re-added some of the excludes
explicitly in this commit. See below.)
- Re-add the `errcheck` exclusion we have used so far via the
defaults.
- Exclude the signature requirement `govet` has for `Seek` methods
because we use non-standard `Seek` methods a lot. (But we keep other
requirements, while the default excludes completely disabled the
check for common method signatures.)
- Exclude warnings about missing doc comments on exported symbols. (We
used to be pretty adamant about doc comments, but stopped that at
some point in the past. By now, we have about 500 missing doc
comments. We may consider reintroducing this check, but that's
outside of the scope of this commit. The default excludes of
golangci-lint essentially ignore doc comments completely.)
- By no longer using the default excludes, we now get warnings back on
malformed doc comments. That's the most impactful change in this
commit. It does not enforce doc comments (again), but _if_ there is
a doc comment, it has to have the recommended form. (Most of the
changes in this commit are fixing this form.)
- Improve wording/spelling of some comments in .golangci.yml, and
remove an outdated comment.
- Leave `package-comments` inactive, but add a TODO asking if we
should change that.
- Add a new sub-linter `comment-spacings` (and fix corresponding
comments), which avoids missing spaces after the leading `//`.
Signed-off-by: beorn7 <beorn@grafana.com>