102 Commits

Author SHA1 Message Date
bwplotka
3cf43337dc post merge conflict fixes
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-03-12 09:03:08 +00:00
bwplotka
c133a969af Merge branch 'main' into start-time-main-sync 2026-03-12 08:28:15 +00:00
Bartlomiej Plotka
a73202012b
tsdb/wlog[PERF]: optimize WAL watcher reads (up to 540x less B/op; 13000x less allocs/op) (#18250)
See the detailed analysis https://docs.google.com/document/d/1efVAMcEw7-R_KatHHcobcFBlNsre-DoThVHI8AO2SDQ/edit?tab=t.0

I ran extensive benchmarks using synthetic data as well as real WAL segments pulled from the prombench runs.

All benchmarks are here https://github.com/prometheus/prometheus/compare/bwplotka/wal-reuse?expand=1

* optimization(tsdb/wlog): reuse Ref* buffers across WAL watchers' reads

Signed-off-by: bwplotka <bwplotka@gmail.com>

* optimization(tsdb/wlog): avoid expensive error wraps

Signed-off-by: bwplotka <bwplotka@gmail.com>

* optimization(tsdb/wlog): reuse array for filtering

Signed-off-by: bwplotka <bwplotka@gmail.com>

* fmt

Signed-off-by: bwplotka <bwplotka@gmail.com>

* lint fix

Signed-off-by: bwplotka <bwplotka@gmail.com>

* tsdb/record: add test for clear() on histograms

Signed-off-by: bwplotka <bwplotka@gmail.com>

* updated WriteTo with what's currently expected

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-03-11 09:17:13 +00:00
Bartlomiej Plotka
4e3efec434
Apply suggestions from code review
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2026-03-05 09:31:31 +00:00
bwplotka
3678ff9042 tests(tsdb/wlog): Tighten watcher tail tests
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-03-03 13:06:16 +00:00
bwplotka
d3f4053012 fix test after merge
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-02-23 10:02:32 +00:00
bwplotka
56c46af6a6 Merge branch 'main' into st-f-main 2026-02-23 10:00:39 +00:00
bwplotka
5ac1080a60 refactor: sed enableStStorage/enableSTStorage
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-02-17 11:11:46 +00:00
Kyle Eckhart
ae062151cd
tsdb/wlog: Remove any temproary checkpoints when creating a Checkpoint (#17598)
* RemoveTmpDirs function to tsdbutil
* Refactor db to use RemoveTmpDirs and no longer cleanup checkpoint tmp dirs
* Use RemoveTmpDirs in wlog checkpoint to cleanup all checkpoint tmp folders
* Add tests for RemoveTmpDirs
* Ensure db.Open will still cleanup extra temporary checkpoints

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
2026-02-17 09:23:54 +01:00
Owen Williams
b57f5b59b3
tsdb: ST-in-WAL: Counter implementation and benchmarks (#17671)
Initial implementation of https://github.com/prometheus/prometheus/issues/17790.
Only implements ST-per-sample for Counters. Tests and benchmarks updated.

Note: This increases the size of the RefSample object for all users, whether st-per-sample is turned on or not.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-02-12 13:17:50 -05:00
Ben Kochie
8d491cc642
tsdb: Migrate multi-errors to errors package (#17768)
Modernize tsdb package by migrating multi-error handling
to the standard library errors package.
* Add a modernized CloseAll helper.

Signed-off-by: SuperQ <superq@gmail.com>
2026-02-04 10:41:57 +01:00
Ben Kochie
72a23934ad
Refactor various tsdb sub-packages (#17847)
Migrate various tsdb related packages from `tsdb/errors` to the standard
library `errors` package.

Signed-off-by: SuperQ <superq@gmail.com>
2026-01-13 13:38:58 +00:00
Ben Kochie
e14795bbf4
Remove copyright date from headers (#17785)
Remove copyright dates from various files as part of [PROM-50].

[PROM-50]: https://github.com/prometheus/proposals/blob/main/proposals/0050-remove-copyright-dates.md

Signed-off-by: SuperQ <superq@gmail.com>
2026-01-05 13:46:21 +01:00
Ben Kochie
204249fcb5
Update golangci-lint (#17478)
* Update golangci-lint to v2.6.0
* Fixup various linting issues.
* Fixup deprecations.
* Add exception for `labels.MetricName` deprecation.

Signed-off-by: SuperQ <superq@gmail.com>
2025-11-05 13:47:34 +01:00
Ben Kochie
48956f60d7
Update modernize (#17471)
Apply additional Go modernize tool improvements.

Signed-off-by: SuperQ <superq@gmail.com>
2025-11-04 05:13:49 +00:00
György Krajcsovits
18efd9d629
feat(ui): mark native histograms as stable in ui strings
Plus some docstrings

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-10-24 12:32:15 +02:00
György Krajcsovits
30f941c57c
fix(wal): ignore invalid native histogram schemas on load
Reduce the resolution of histograms as needed and ignore invalid
schemas while emitting a warning log.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-24 11:41:25 +02:00
beorn7
747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Bryan Boreham
498f63e60b
Merge pull request #17029 from pr00se/wal-checkpoint-dropped-samples
TSDB: use timestamps rather than WAL segment numbers to track how long deleted series should be retained in checkpoints
2025-08-20 11:15:10 +01:00
pipiland2612
de93387f0b Parallel tsdb/wlog test
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
2025-08-12 14:12:15 +02:00
Patryk Prus
0fea41ed53
Refactor keep function to work for both agent and non-agent implementations
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:47 -04:00
Patryk Prus
218558f543
Store mint rather than the last WAL segment in head.walExpiries during head GC
Signed-off-by: Patryk Prus <p@trykpr.us>
2025-08-08 14:12:41 -04:00
Matthieu MOREL
cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
Charles Korn
46acc974c0
fix(remote): Unregister metrics emitted by remote.WriteStorage when closed (#16868)
* Unregister metrics emitted by `remote.WriteStorage` when closed

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Address PR feedback: add test

Signed-off-by: Charles Korn <charles.korn@grafana.com>

---------

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2025-07-17 11:32:15 +02:00
Bryan Boreham
30d04792ca
[PERF] Remote-write: re-use memory to read WAL data (#16197)
The `:=` causes new variables to be created, which means the outer
slice stays at nil, and new memory is allocated every time round the
loop.

Extracted from https://github.com/prometheus/prometheus/pull/16182
Credit to @bwplotka.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-03-11 10:49:51 +00:00
Bartlomiej Plotka
7a7bc65237
Add util/compression package to consolidate snappy/zstd use in Prometheus. (#16156)
# Conflicts:
#	tsdb/db_test.go

Apply suggestions from code review




tmp



Addressed comments.



Update util/compression/buffers.go

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-03-10 10:36:26 +00:00
Patryk Prus
61aa82865d
TSDB: keep duplicate series records in checkpoints while their samples may still be present (#16060)
Renames the head's deleted map to walExpiries, and creates entries for any
duplicate series records encountered during WAL replay, with the expiry set
to the highest current WAL segment number. Any subsequent WAL
checkpoints will see the duplicate series entry in the walExpiries map, and
keep the series record until the last WAL segment that could contain its
samples is deleted.

Other considerations:

WBL: series records aren't written to the WBL, so there are no duplicates to deal with
agent mode: has its own WAL replay logic that handles duplicate series records differently, and is outside the scope of this PR
2025-03-05 13:45:08 -05:00
Matthieu MOREL
c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
Arve Knudsen
f030894c2c
Fix issues raised by staticcheck (#15722)
Fix issues raised by staticcheck

We are not enabling staticcheck explicitly, though, because it has too many false positives.

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-01-09 17:51:26 +01:00
György Krajcsovits
1e420ef373 Merge branch 'main' into cedwards/nhcb-wal-wbl
# Conflicts:
#	tsdb/tsdbutil/histogram.go
2025-01-02 12:50:19 +01:00
johncming
061400e31b
tsdb: export CheckpointPrefix constant (#15636)
Exported the CheckpointPrefix constant to be used in other packages.
Updated references to the constant in db.go and checkpoint.go files.
This change improves code readability and maintainability.

Signed-off-by: johncming <johncming@yahoo.com>
Co-authored-by: johncming <conjohn668@gmail.com>
2024-12-29 17:54:45 +01:00
György Krajcsovits
8f572fe905 fix(lint): linter errors
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-10 16:25:20 +01:00
György Krajcsovits
b94c87bea6 fix(test): TestCheckpoint segment size too low
The segment size was too low for the additional NHCB data, thus it created
more segments then expected. This meant that less were in the lower
numbered segments, which meant more was kept.

FAIL: TestCheckpoint (4.05s)
  FAIL: TestCheckpoint/compress=none (0.22s)
        checkpoint_test.go:361:
            	Error Trace:	/home/krajo/go/github.com/prometheus/prometheus/tsdb/wlog/checkpoint_test.go:361
            	Error:      	"0.8586956521739131" is not less than "0.8"
            	Test:       	TestCheckpoint/compress=none

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-10 16:16:46 +01:00
Carrie Edwards
a046417bc0 Use new record type only for NHCB 2024-12-06 13:46:20 -08:00
Carrie Edwards
6684344026 Rename old histogram record type, use old names for new records 2024-12-05 09:21:47 -08:00
Carrie Edwards
454f6d39ca Add separate handling for histograms and custom bucket histograms 2024-12-05 09:21:47 -08:00
Carrie Edwards
37df50adb9 Attempt for record type 2024-12-05 09:21:47 -08:00
Carrie Edwards
6d413fad36 Use histogram records for custom value handling 2024-12-05 09:21:47 -08:00
Carrie Edwards
aa144b7263 Handle custom buckets in WAL and WBL 2024-12-05 09:21:47 -08:00
Bryan Boreham
0ef0b75a4f [TESTS] Remote-Write: Fix BenchmarkStartup
It was crashing due to uninitialized metrics, and not terminating due to
incorrectly reading segment names.

We need to export `SetMetrics` to avoid the first problem.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-15 11:22:07 +00:00
TJ Hoplock
6ebfbd2d54 chore!: adopt log/slog, remove go-kit/log
For: #14355

This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-07 15:58:50 -04:00
machine424
d1b4312f0a fix(wlog/watcher_test.go): make TestRun_AvoidNotifyWhenBehind more resilient
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-09-17 13:11:04 +02:00
Callum Styan
a77f5007f9
fix bug with metadata for rw2 (#14766)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2024-08-30 08:14:20 +01:00
beorn7
0f760f63dd lint: Revamp our linting rules, mostly around doc comments
Several things done here:

- Set `max-issues-per-linter` to 0 so that we actually see all linter
  warnings and not just 50 per linter. (As we also set
  `max-same-issues` to 0, I assume this was the intention from the
  beginning.)

- Stop using the golangci-lint default excludes (by setting
  `exclude-use-default: false`. Those are too generous and don't match
  our style conventions. (I have re-added some of the excludes
  explicitly in this commit. See below.)

- Re-add the `errcheck` exclusion we have used so far via the
  defaults.

- Exclude the signature requirement `govet` has for `Seek` methods
  because we use non-standard `Seek` methods a lot. (But we keep other
  requirements, while the default excludes completely disabled the
  check for common method segnatures.)

- Exclude warnings about missing doc comments on exported symbols. (We
  used to be pretty adamant about doc comments, but stopped that at
  some point in the past. By now, we have about 500 missing doc
  comments. We may consider reintroducing this check, but that's
  outside of the scope of this commit. The default excludes of
  golangci-lint essentially ignore doc comments completely.)

- By stop using the default excludes, we now get warnings back on
  malformed doc comments. That's the most impactful change in this
  commit. It does not enforce doc comments (again), but _if_ there is
  a doc comment, it has to have the recommended form. (Most of the
  changes in this commit are fixing this form.)

- Improve wording/spelling of some comments in .golangci.yml, and
  remove an outdated comment.

- Leave `package-comments` inactive, but add a TODO asking if we
  should change that.

- Add a new sub-linter `comment-spacings` (and fix corresponding
  comments), which avoids missing spaces after the leading `//`.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-08-22 17:36:11 +02:00
Arve Knudsen
3a78e76282 Upgrade golangci-lint to v1.60.1
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-18 12:13:25 +02:00
Arve Knudsen
b5d13a1ab5 Merge remote-tracking branch 'prometheus/main' into arve/wlog-histograms
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-14 19:04:53 +01:00
Bryan Boreham
aa4b056ad0
Merge pull request #13200 from bboreham/wlog-defer
tsdb/wlog: close segment files sooner
2024-08-13 14:11:38 +01:00
machine424
9e43ad2e37 chore(remote_write): clean up as watcher.go is part of wlog now
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-08-05 13:40:23 +02:00
Arve Knudsen
9af19ed856 Merge remote-tracking branch 'prometheus/main' into arve/wlog-histograms
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-07-26 11:51:29 +02:00
Bryan Boreham
646e1e847d Merge branch 'release-2.53' into merge-2.53.1 2024-07-10 11:17:53 +01:00