Commit Graph

60 Commits

Author SHA1 Message Date
Matthieu MOREL
cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
carrychair
e83dc66bdb refactor: use the built-in max/min to simplify the code
Signed-off-by: carrychair <linghuchong404@gmail.com>
2025-05-20 14:36:39 +08:00
machine424
a825d448da feat(tsdb/(head|agent)): dereference the pools at the end of the WL replay to
not wait for an extra GC cycle until the built-in cleanup mechanism
kicks in

See https://github.com/prometheus/prometheus/pull/15778

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-04-17 13:06:08 +02:00
Bartlomiej Plotka
7a7bc65237
Add util/compression package to consolidate snappy/zstd use in Prometheus. (#16156)
# Conflicts:
#	tsdb/db_test.go

Apply suggestions from code review




tmp



Addressed comments.



Update util/compression/buffers.go

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-03-10 10:36:26 +00:00
Patryk Prus
61aa82865d
TSDB: keep duplicate series records in checkpoints while their samples may still be present (#16060)
Renames the head's deleted map to walExpiries, and creates entries for any
duplicate series records encountered during WAL replay, with the expiry set
to the highest current WAL segment number. Any subsequent WAL
checkpoints will see the duplicate series entry in the walExpiries map, and
keep the series record until the last WAL segment that could contain its
samples is deleted.

Other considerations:

WBL: series records aren't written to the WBL, so there are no duplicates to deal with
agent mode: has its own WAL replay logic that handles duplicate series records differently, and is outside the scope of this PR
2025-03-05 13:45:08 -05:00
Ayoub Mrini
e04913aea2
Merge pull request #15778 from machine424/reuse-pools
feat(tsdb/(head|agent)): reuse pools across segments to reduce garbage during WL replay
2025-02-17 12:48:17 +01:00
machine424
d644324407
feat(tsdb/(head|agent)): reuse pools across segments to avoid generating garbage during WL replay
This is part of the "reduce WAL replay overhead/garbage" effort to help with https://github.com/prometheus/prometheus/issues/6934.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-02-10 22:40:24 +01:00
Matthieu MOREL
b472ce7010 chore: enable early-return from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-10 22:08:43 +01:00
Carrie Edwards
a046417bc0 Use new record type only for NHCB 2024-12-06 13:46:20 -08:00
Carrie Edwards
6684344026 Rename old histogram record type, use old names for new records 2024-12-05 09:21:47 -08:00
Carrie Edwards
454f6d39ca Add separate handling for histograms and custom bucket histograms 2024-12-05 09:21:47 -08:00
Carrie Edwards
37df50adb9 Attempt for record type 2024-12-05 09:21:47 -08:00
Pedro Tanaka
bab587b9dc
Agent: allow for ingestion of CT samples (#15124)
* Remove unused option from HeadOptions

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Improve docs for appendable() method in head appender

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Ingest CT (float) samples in Agent DB

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* allow for ingestion of CT native histogram

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding some verification for ct ts

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Validating CT histogram before append and add newly created series to pending series

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* checking the wal for written samples

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Checking for samples in test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding case for validations

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* fixing comparison when dedupelabels is enabled

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* unite tests, use table testing

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Implement CT related methods in timestampTracker for write storage

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding error case to test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* removing unused fields

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Updating lastTs for series when adding CT to invalidate duplicates

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* making sure that updating the lastTS wont cause OOO later on in Commit();

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

---------

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-27 01:06:34 +01:00
Vanshika
cccbe72514
TSDB: Fix some edge cases when OOO is enabled (#14710)
Fix some edge cases when OOO is enabled

Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Signed-off-by: Vanshika <102902652+Vanshikav123@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-10-23 17:34:28 +02:00
TJ Hoplock
6ebfbd2d54 chore!: adopt log/slog, remove go-kit/log
For: #14355

This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-10-07 15:58:50 -04:00
Arthur Silva Sens
6bd9b1a7cc
Histogram CT Zero ingestion
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
2024-09-26 11:29:22 -03:00
Nathan Baulch
50cd453c8f
chore: Fix typos (#14868)
* Fix typos

---------

Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com>
2024-09-10 22:32:03 +02:00
shandongzhejiang
b2712ff284 chore: fix some function names
Signed-off-by: shandongzhejiang <shandongzhejiang@icloud.com>
2024-08-21 11:09:37 +08:00
Sebastian Rabenhorst
05380aa0ac
agent db: make rejecting ooo samples configurable (#14094)
feat: Make OOO ingestion time window configurable for Prometheus Agent.

Signed-off-by: Sebastian Rabenhorst <sebastian.rabenhorst@shopify.com>
2024-06-12 11:07:42 -03:00
Bryan Boreham
93b72ec5dd tsdb: create SymbolTables for labels as required
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Matthieu MOREL
8f6cf3aabb tsdb: use Go standard errors
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-11 12:18:54 +00:00
Arthur Silva Sens
5082655392
Append Created Timestamps (#12733)
* Append created timestamps.

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Log when created timestamps are ignored

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Proposed changes to Append CT PR.

Changes:

* Changed textparse Parser interface for consistency and robustness.
* Changed CT interface to be more explicit and handle validation.
* Simplified test, change scrapeManager to allow testability.
* Added TODOs.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Updates.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Refactor head_appender test

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Fix linter issues

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Use model.Sample in head appender test

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

---------

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Co-authored-by: bwplotka <bwplotka@gmail.com>
2023-12-11 08:43:42 +00:00
Julien Levesy
501f514389
feat(tsdb/agent): notify remote storage when commit happens (#13223)
Signed-off-by: Julien Levesy <jlevesy@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Callum Styan <callumstyan@gmail.com>
2023-12-01 14:00:26 -08:00
Matthieu MOREL
724737006d
tsdb/agent: use Go standard errors package
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-11-08 09:22:31 +01:00
Linas Medziunas
ebed7d0612 Change Validate to be a method on histogram structs
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2023-11-03 16:47:59 +02:00
Linas Medziunas
1f8aea11d6 Move histogram validation code to model/histogram
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2023-11-03 16:17:24 +02:00
Paschalis Tsilias
afab845e65
tsdb/agent: allow ingestion of OOO samples (#12897)
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
2023-10-13 16:33:09 +02:00
Arve Knudsen
6daee89e5f
Add context argument to Querier.Select (#12660)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-12 12:37:38 +02:00
Oleg Zaytsev
6ea6def0d3
Use zeropool when replaying agent's DB WAL (#12651)
Same as https://github.com/prometheus/prometheus/pull/12189 but for
tsdb/agent/db.go

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-08-04 10:39:55 +02:00
Robert Fratto
886945cda7
tsdb/agent: ensure that new series get written to WAL on rollback (#12592)
If a new series is introduced in a storage.Appender instance, that
series should be written to the WAL once the storage.Appender is closed,
even on Rollback.

Previously, new series would only be written to the WAL when calling
Commit. However, because the series is stored in memory regardless,
subsequent calls to Commit may write samples to the WAL which reference
a series ID which that was never written.

Related to #11589. It's likely that this fix also resolves this issue,
but we need more testing from users to see if the problem persists after
this fix; there may be more cases where samples get written to the WAL
in Prometheus Agent mode without the corresponding series record.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2023-07-27 09:28:26 -04:00
Justin Lei
32d87282ad
Add Zstandard compression option for wlog (#11666)
Snappy remains as the default compression but there is now a flag to switch 
the compression algorithm.

Signed-off-by: Justin Lei <justin.lei@grafana.com>
2023-07-11 14:57:57 +02:00
Bryan Boreham
0ab9553611
tsdb: drop deleted series from the WAL sooner (#12297)
`head.deleted` holds the WAL segment in use at the time each series was
removed from the head. At the end of `truncateWAL()` we will delete
all segments up to `last`, so we can drop any series that were last seen
in a segment at or before that point.

(same change in Prometheus Agent too)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-01 16:43:15 +01:00
beorn7
5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `if else` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforemention wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
beorn7
c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golint-ci
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
Vernon Miller
ca0abf26c5
Adds an affirmative log message for successful WAL repair (#12135)
* Adds an affirmative log message for successful WAL repair

Signed-off-by: Vernon Miller <vernon.miller@grafana.com>
Signed-off-by: Vernon Miller <96601789+aldernero@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-03-21 19:33:43 +05:30
Sebastian Rabenhorst
c057318578
agent: native histogram support (#11842)
Signed-off-by: Sebastian Rabenhorst <sebastian.rabenhorst@shopify.com>
2023-01-12 11:13:44 -05:00
Marc Tudurí
9474610baf
Support FloatHistogram in TSDB (#11522)
Extends Appender.AppendHistogram function to accept the FloatHistogram. TSDB supports appending, querying, WAL replay, for this new type of histogram.

Signed-off-by: Marc Tudurí <marctc@protonmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-12-28 14:25:07 +05:30
Bryan Boreham
14ad2e780b Update package tsdb/agent for new labels.Labels type
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Bryan Boreham
6bdecf377c
Switch from 'sanity' to more inclusive lanuage (#9376)
* Switch from 'sanity' to more inclusive lanuage

"Removing ableist language in code is important; it helps to create and
maintain an environment that welcomes all developers of all backgrounds,
while emphasizing that we as developers select the most articulate,
precise, descriptive language we can rather than relying on metaphors.

The phrase sanity check is ableist, and unnecessarily references mental
health in our code bases. It denotes that people with mental illnesses
are inferior, wrong, or incorrect, and the phrase sanity continues to be
used by employers and other individuals to discriminate against these
people."

From https://gist.github.com/seanmhanson/fe370c2d8bd2b3228680e38899baf5cc

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-11-28 17:09:18 +00:00
Ganesh Vernekar
648be89822
Merge remote-tracking branch 'upstream/main' into fix-conflict
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-10-12 14:20:02 +05:30
Jesus Vazquez
775d90d5f8
TSDB: Rename wal package to wlog (#11352)
The wlog.WL type can now be used to create a Write Ahead Log or a Write
Behind Log.

Before the prefix for wbl metrics was
'prometheus_tsdb_out_of_order_wal_' and has been replaced with
'prometheus_tsdb_out_of_order_wbl_'.

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2022-10-10 20:38:46 +05:30
Jesus Vazquez
e934d0f011 Merge 'main' into sparsehistogram
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
2022-10-05 22:14:49 +02:00
Robert Fratto
448cfda6c1
tsdb/agent: fix validation of default options (#9876)
* tsdb/agent: fix application of defaults

MaxTS was being incorrectly constrained to the truncation interval

* add more tests to check validation

* force MaxWALTime = MinWALTime if min > max

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2022-09-27 19:41:43 +05:30
Jesus Vazquez
c1b669bf9b
Add out-of-order sample support to the TSDB (#11075)
* Introduce out-of-order TSDB support

This implementation is based on this design doc:
https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing

This commit adds support to accept out-of-order ("OOO") sample into the TSDB
up to a configurable time allowance. If OOO is enabled, overlapping querying
are automatically enabled.

Most of the additions have been borrowed from
https://github.com/grafana/mimir-prometheus/
Here is the list ist of the original commits cherry picked
from mimir-prometheus into this branch:
- 4b2198d7ec
- 2836e5513f
- 00b379c3a5
- ff0dc75758
- a632c73352
- c6f3d4ab33
- 5e8406a1d4
- abde1e0ba1
- e70e769889
- df59320886

Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* gofumpt files

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Add license header to missing files

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix OOO tests due to existing chunk disk mapper implementation

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix truncate int overflow

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Add Sync method to the WAL and update tests

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* remove useless sync

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Update minOOOTime after truncating Head

* Update minOOOTime after truncating Head

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add a unit test

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Load OutOfOrderTimeWindow only once per appender

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix OOO Head LabelValues and PostingsForMatchers

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix replay of OOO mmap chunks

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Remove unnecessary err check

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Prevent panic with ApplyConfig

Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Run OOO compaction after restart if there is OOO data from WBL

Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Apply Bartek's suggestions

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Refactor OOO compaction

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Address comments and TODOs

- Added a comment explaining why we need the allow overlapping
  compaction toggle
- Clarified TSDBConfig OutOfOrderTimeWindow doc
- Added an owner to all the TODOs in the code

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Run go format

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix remaining review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Change wbl reference when truncating ooo in TestHeadMinOOOTimeUpdate

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix TestWBLAndMmapReplay test failure on windows

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Address most of the feedback

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Refactor the block meta for out of order

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix windows error

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2022-09-20 22:35:50 +05:30
beorn7
c9fd3c235d Merge branch 'main' into sparsehistogram 2022-08-10 17:54:37 +02:00
Paschalis Tsilias
d1122e0743
Introduce TSDB changes for appending metadata to the WAL (#10972)
* Append metadata to the WAL

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Remove extra whitespace; Reword some docstrings and comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use RLock() for hasNewMetadata check

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use single byte for metric type in RefMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Update proposed WAL format for single-byte type metadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Implementa MetadataAppender interface for the Agent

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Address first round of review comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Amend description of metadata in wal.md

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Correct key used to retrieve metadata from cache

When we're setting metadata entries in the scrapeCace, we're using the
p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and
use it as the cache key. When checking for cache entries though, we used
p.Series() as the key, which included the metric name _with_ its labels.
That meant that we were never actually hitting the cache. We're fixing
this by utiling the __name__ internal label for correctly getting the
cache entries after they've been set by setHelp(), setType() or
setUnit().

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Put feature behind a feature flag

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix AppendMetadata docstring

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Reorder WAL format document

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Change error message of AppendMetadata; Fix access of s.meta in AppendMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Reuse temporary buffer in Metadata encoder

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Only keep latest metadata for each refID during checkpointing

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix test that's referencing decoding metadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Avoid creating metadata block if no new metadata are present

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add tests for corrupt metadata block and relevant record type

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix CR comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Extract logic about changing metadata in an anonymous function

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Implement new proposed WAL format and amend relevant tests

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use 'const' for metadata field names

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Apply metadata to head memSeries in Commit, not in AppendMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add docstring and rename extracted helper in scrape.go

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add tests for tsdb-related cases

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix linter issues vol1

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix linter issues vol2

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix Windows test by closing WAL reader files

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use switch instead of two if statements in metadata decoding

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix review comments around TestMetadata* tests

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add code for replaying WAL; test correctness of in-memory data after a replay

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Remove scrape-loop related code from PR

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Address first round of comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Simplify tests by sorting slices before comparison

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix test to use separate transactions

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Empty out buffer and record slices after encoding latest metadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix linting issue

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Update calculation for DroppedMetadata metric

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Rename MetadataAppender interface and AppendMetadata method to MetadataUpdater/UpdateMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Reuse buffer when encoding latest metadata for each series

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix review comments; Check all returned error values using two helpers

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Simplify use of helpers

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Satisfy linter

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
2022-07-19 10:58:52 +02:00
beorn7
3bc711e333 Merge branch 'main' into sparsehistogram 2022-05-04 13:37:13 +02:00
Paschalis Tsilias
40c1efe8bc
tsdb/agent: Ignore duplicate exemplars (#10595)
* tsdb/agent: Ignore duplicate exemplars

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Make each exemplar unique in TestCommit

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Re-Trigger CI for Windows and UI-related steps

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Change test comment to properly re-trigger pipeline

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Defer Close() calls for test agent and segment reader

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
2022-04-18 11:41:04 -04:00
Julien Pivotto
685ce9964d
Merge pull request #10599 from prometheus/release-2.35
Merge back release 2.35
2022-04-15 00:10:06 +02:00
Robert Fratto
286dfc70b7
tsdb/agent: port grafana/agent#676 (#10587)
* tsdb/agent: port grafana/agent#676

grafana/agent#676 fixed an issue where a loading a WAL with multiple
segments may result in ref ID collision. The starting ref ID for new
series should be set to the highest ref ID across all series records
from all WAL segments.

This fixes an issue where the starting ref ID was incorrectly set to the
highest ref ID found in the newest segment, which may not have any ref
IDs at all if no series records have been appended to it yet.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: update terminology (s/ref ID/nextRef)

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2022-04-14 10:27:06 +02:00