74 Commits

Author SHA1 Message Date
Dieter Plaetinck
cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
Mateusz Gozdek
1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Darshan Chaudhary
1f688657bf
Call delete on head if interval overlaps (#9151)
* Call delete on head if interval overlaps

Signed-off-by: darshanime <deathbullet@gmail.com>

* Garbage collect tombstones during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Truncate tombstones before min time during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Lock less by deleting all keys in a single pass

Signed-off-by: darshanime <deathbullet@gmail.com>

* Pass map to DeleteTombstones

Signed-off-by: darshanime <deathbullet@gmail.com>

* Create new slice to replace old one

Signed-off-by: darshanime <deathbullet@gmail.com>
2021-09-16 12:20:03 +05:30
Ganesh Vernekar
ee7e0071d1
Snapshot in-memory chunks on shutdown for faster restarts (#7229)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-06 17:51:01 +01:00
Ganesh Vernekar
59d02b5ef0
tsdb: Block Head GC till pending readers are done reading (#9081)
* tsdb: Block Head GC till pending readers are done reading

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments 2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix the exclusiveness of maxt in WaitForPendingReadersInTimeRange

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-20 14:17:20 +05:30
Martin Disibio
1bcd13d6b5
Exemplar resize (#8974)
* Create experimental circular buffer resize method, benchmarks

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* Optimize exemplar resize to only replay as many exemplars as needed

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* More comments, benchmark AddExemplar

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* optimizations

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* comment

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* Slight refactor of resize benchmark + make use of resize via runtime
reloadable storage config.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Some more config related changes.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address some review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address more review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Refactor to remove usage of noopExemplarStorage and avoid race condition
when resizing from Head code.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix or add comments to clarify some of the new behaviour.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* fix potential panics related to negative exemplar buffer lengths

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Co-authored-by: Callum Styan <callumstyan@gmail.com>
2021-07-20 10:22:57 +05:30
Julien Duchesne
8855c2e626
Add prometheus_tsdb_clean_start metric (#8824)
Add cleanup of the lockfile when the db is cleanly closed

The metric describes the status of the lockfile on startup
0: Already existed
1: Did not exist
-1: Disabled

Therefore, if the min value over time of this metric is 0, that means that executions have exited uncleanly
We can then use that metric to have a much lower threshold on the crashlooping alert:

If the metric exists and it has been zero, two restarts is enough to trigger the alarm
If it does not exist (old prom version for example), the current five restarts threshold remains

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Change metric name + set unset value to -1

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Only check the last value of the clean start alert

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Fix test + nit

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>
2021-06-16 15:03:02 +05:30
Levi Harrison
b5f6f8fb36 Switched to go-kit/log
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-06-11 12:28:36 -04:00
Levi Harrison
7bc11dcb06
React UI: Add Starting Screen (#8662)
* Added walreplay API endpoint

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added starting page to react-ui

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Documented the new endpoint

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed typos

Signed-off-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julius Volz <julius.volz@gmail.com>

* Removed logo

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed isResponding to isUnexpected

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed width of progress bar

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed width of progress bar

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added DB stats object

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Updated starting page to work with new fields

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 2)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 3)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (and also implementing a method this time) (pt. 4)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (and also implementing a method this time) (pt. 5)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed const to let

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 6)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Remove SetStats method

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added comma

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed api

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed to triple equals

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed data response types

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Don't return pointer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed version

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed interface issue

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed pointer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed copying lock value error

Signed-off-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julius Volz <julius.volz@gmail.com>
2021-06-05 15:29:32 +01:00
Ben Ye
0a8912433a
allow compact series merger to be configurable (#8836)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2021-05-18 18:38:37 +02:00
nberkley
f9e2dd0697
Add support for smaller block chunk segment allocations (#8478)
* Add support for --storage.tsdb.max-chunk-size to suport small chunks for space limited prometheus instances.

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update tsdb/compact.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update tsdb/db.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update cmd/prometheus/main.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Change naming scheme to

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Add a lower bound to --storage.tsdb.max-block-chunk-segment-size

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update storage.md to explain what a chunk segment is

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Force tests

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Fix code style

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-04-15 14:25:01 +05:30
Bryan Boreham
c7a62b95ce
GetRef() now returns the label set (#8641)
The purpose of GetRef() is to allow Append() to be called without
the caller needing to copy the labels. To avoid a race where a series
is removed from TSDB between the calls to GetRef() and Append(), we
return TSDB's copy of the labels.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-03-24 15:24:58 +00:00
Bryan Boreham
d614ae9ecf
[RFC] Add method to get reference number for TSDB Appender (#8600)
* Add method to get reference number for TSDB Appender

In situations where we need to copy labels before calling Add(),
GetRef() allows to check first, then call AddFast() in the case that the series
is already known.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Add explicit interface for GetRef() method

Suggested in code review by @bwplotka

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Rename OptionalGetRef to GetRef

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Simplify return value of GetRef()

0 can be relied on to mean 'no reference'

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-03-19 19:28:55 +00:00
Callum Styan
289ba11b79
Add circular in-memory exemplars storage (#6635)
* Add circular in-memory exemplars storage

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Martin Disibio <mdisibio@gmail.com>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>

* Fix some comments, clean up exemplar metrics struct and exemplar tests.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix exemplar query api null vs empty array issue.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>
2021-03-16 15:17:45 +05:30
Arthur Silva Sens
6a3d55db0a
Rolling tombstones clean up (#8007)
* CleanupTombstones refactored, now reloading blocks after every compaction.

The goal is to remove deletable blocks after every compaction and, thus, decrease disk space used when cleaning tombstones.

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Protect DB against parallel reloads

Signed-off-by: ArthurSens <arthursens2005@gmail.com>

* Fix typos

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-02-17 11:02:43 +05:30
Dustin Hooten
b9f0baf6ff
Combine NewHead() args into a HeadOptions struct (#8452)
* Combine NewHead() args into a HeadOptions struct

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>

* remove overrides params

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>

* address pr feedback

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>
2021-02-09 19:42:48 +05:30
Nguyen Le Vu Long
fbe960f2c1
fix: remove pre-2.21 tmp blocks on start (#8353)
* fix: remove pre-2.21 tmp blocks on start

Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>

* fix: commenting

Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
2021-01-09 10:02:26 +01:00
Arthur Silva Sens
7e932637e3
Reload tsdb blocks every minute (#8340)
* Reload tsdb blocks every minute

Signed-off-by: ArthurSens <arthursens2005@gmail.com>

* Proteced tsdb with mutex locks

Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2021-01-07 13:00:08 +05:30
arthursens
8493704b9b Change seconds()*1000 to milliseconds()
Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-12-25 10:45:23 -03:00
Ganesh Vernekar
faa1554aa1
Don't call runtime.GC() after compaction (#8276)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-12-22 14:44:17 +00:00
Arthur Silva Sens
64a106c5dd
Logging added for when compaction takes more than the block time range (#8151)
* Logging added for when compaction takes more than the block time range

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Log only if no errors were already logged

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Log duration as human readable string

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Move logging from compactHead() to Compact()

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Compute duration of all head compactions plus wal truncation

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Remove named return added os first commits

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Address nits

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Change miliseconds to seconds to make fuzzit tests happy

Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2020-12-07 21:29:43 +00:00
Marco Pracucci
db19e05d93
Add option to customise head chunks write buffer size (#8201)
* Add option to customise head chunks write buffer size

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-11-19 18:30:47 +05:30
Bartlomiej Plotka
3d8826a3d4
MultiError: Refactored MultiError for more concise and safe usage. (#8066)
* MultiError: Refactored MultiError for more concise and safe usage.

* Less lines
* Goland IDE was marking every usage of old MultiError "potential nil" error
* It was easy to forgot using Err() when error was returned, now it's safely assured on compile time.

NOTE: Potentially I would rename package to merrors. (: In different PR.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed review comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix after rebase.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-10-28 15:24:58 +00:00
Ganesh Vernekar
3245b3267b
Don't use returned DB to close resources on TSDB startup error (#8113)
* Don't use returned DB to close resources on TSDB startup error

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add unit test and fix another panic

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comment

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-10-28 15:39:03 +05:30
Julien Pivotto
4e5b1722b3
Move away from testutil, refactor imports (#8087)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-22 11:00:08 +02:00
Arthur Silva Sens
c5a832b394
Close resources after failing to startup TSDB (#8031)
* Close resources after failing to startup TSDB

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Return close error instead of logging

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Change named return's name

Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-10-21 20:38:28 +05:30
Brian Brazil
fdf1c29224
Fix metric description of prometheus_tsdb_symbol_table_size_bytes. (#8080)
This is how much memory we use to load in the on-disk
symbol tables, not the size of the tables themselves.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-10-21 14:35:40 +01:00
Bartlomiej Plotka
2fe1e9fa93
Create a checkpoint only at the end of Compact call (#8067)
* Create a checkpoint only at the end of Compact call

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix Bartek's offline reviews

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Introduce TruncateInMemory and TruncateWAL

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Small enhancements and test fixing attempts

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add TestOneCheckpointPerCompactCall

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Don't truncate WAL on block compaction

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Simplified the algo.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Better protection around calling truncateWAL, truncate WAL on Head compaction error

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-10-19 20:57:08 +05:30
Julien Pivotto
59733b1a26
TSDB: use blocks instead of db.blocks in condition (#8068)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-19 16:51:54 +05:30
Gayathri Venkatesh
73e2ce1bd6
Do not ignore reload errors in compactHead (#8051)
* Modified unknownRefs to unknownRefs.Load()

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Modified db.go

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Revert "Modified unknownRefs to unknownRefs.Load()"

This reverts commit 79caf595fa9b9193878dc0dd9dec17d58851ae42.

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Made changes to reload error in db.go

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>
2020-10-14 15:05:24 +05:30
Arthur Silva Sens
4f45e201cc
Promtool tsdb list now prints block sizes (#7993)
* promtool tsdb list now prints blocks' size

Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-10-12 23:15:40 +02:00
johncming
b521612042
tsdb: simplify code. (#7792)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-14 15:15:08 +05:30
johncming
d19fc71903
tsdb: use NewRangeHead instead. (#7793)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-13 10:55:35 +01:00
Bartlomiej Plotka
f16cbc20d6
tsdb: Bug fix for further continued deletions after crash deletions; added more tests. (#7777)
* tsdb: Bug fix for further continued after crash deletions; added more tests.

Additionally: Added log line for block removal.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comment.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-08-11 15:53:23 +01:00
Bartlomiej Plotka
4ae2ef94e0
tsdb: Delete blocks atomically; Remove tmp blocks on start; Added test. (#7772)
## Changes:

* Rename dir when deleting
* Ignoring blocks with broken meta.json on start (we do that on reload)
* Compactor writes <ulid>.tmp-for-creation blocks instead of just .tmp
* Delete tmp-for-creation and tmp-for-deletion blocks during DB open.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-08-11 06:56:08 +01:00
Bartlomiej Plotka
e6d7cc5fa4
tsdb: Added ChunkQueryable implementations to db; unified MergeSeriesSets and vertical to single struct. (#7069)
* tsdb: Added ChunkQueryable implementations to db; unified compactor, querier and fanout block iterating.

Chained to https://github.com/prometheus/prometheus/pull/7059

* NewMerge(Chunk)Querier now takies multiple primaries allowing tsdb DB code to use it.
* Added single SeriesEntry / ChunkEntry for all series implementations.
* Unified all vertical, and non vertical for compact and querying to single
merge series / chunk sets by reusing VerticalSeriesMergeFunc for overlapping algorithm (same logic as before)
* Added block (Base/Chunk/)Querier for block querying. We then use populateAndTomb(Base/Chunk/) to iterate over chunks or samples.
* Refactored endpoint tests and querier tests to include subtests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments from Brian and Beorn.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed snapshot test and added chunk iterator support for DBReadOnly.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed race when iterating over Ats first.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed populate block tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed endpoints test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added test & fixed case of head open chunk.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed DBReadOnly tests and bug producing 1 sample chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added cases for partial block overlap for multiple full chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added extra tests for chunk meta after compaction.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed small vertical merge bug and added more tests for that.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-07-31 16:03:02 +01:00
Annanay
9bba8a6eae Merge branch 'master' into appender-context
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 16:43:18 +05:30
Annanay
89129cd39a Address comments
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 16:41:13 +05:30
Javier Palomo Almena
348ff4285f
tsdb: Replace sync/atomic with uber-go/atomic in tsdb (#7659)
* tsdb/chunks: Replace sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* tsdb/heaad: Replace sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* vendor: Make go.uber.org/atomic a direct dependency

There is no modifications to go.sum and vendor/ because
it was already vendored.

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* tsdb: Remove comments referring to the sync/atomic alignment bug

Related: https://golang.org/pkg/sync/atomic/#pkg-note-BUG

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
2020-07-28 10:12:42 +05:30
Annanay
7f98a744e5 Add context to Appender interface
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-24 19:40:51 +05:30
Ganesh Vernekar
4a8531a64b
BlocksToDelete function in DB options (#7638)
* Optional retention filter for DB

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Specify len for the map creation

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-22 20:49:33 +05:30
Ganesh Vernekar
e65e2e0dac
Fix panic from db metrics (#7501)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-05 10:11:42 +05:30
Bartlomiej Plotka
b788986717
storage: Adjusted fully storage layer support for chunk iterators: Remote read client, readyStorage, fanout. (#7059)
* Fixed nits introduced by https://github.com/prometheus/prometheus/pull/7334
* Added ChunkQueryable implementation to fanout and readyStorage.
* Added more comments.
* Changed NewVerticalChunkSeriesMerger to CompactingChunkSeriesMerger, removed tiny interface by reusing VerticalSeriesMergeFunc for overlapping algorithm for
both chunks and series, for both querying and compacting (!) + made sure duplicates are merged.
* Added ErrChunkSeriesSet
* Added Samples interface for seamless []promb.Sample to []tsdbutil.Sample conversion.
* Deprecating non chunks serieset based StreamChunkedReadResponses, added chunk one.
* Improved tests.
* Split remote client into Write (old storage) and read.
* Queryable client is now SampleAndChunkQueryable. Since we cannot use nice QueryableFunc I moved
all config based options to sampleAndChunkQueryableClient to aboid boilerplate.

In next commit: Changes for TSDB.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-24 14:41:52 +01:00
Simon Pasquier
2f12049371
tsdb: improve logs when encountering corruption (#7308)
* tsdb: improve logs when encountering corruption

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Wrap corrupted block errors

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Add file path to head chunks

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-17 16:40:00 +02:00
Krasimir Georgiev
f4dd45609a
Use min and maxt of the range head when creating a block (#7282)
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-05-22 17:00:06 +05:30
Ganesh Vernekar
1c99adb9fd
Callbacks for lifecycle of series in TSDB (#7159)
* Callbacks for lifecycle of series in TSDB

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add more comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-20 18:52:08 +05:30
Ganesh Vernekar
d4b9fe801f
M-map full chunks of Head from disk (#6679)
When appending to the head and a chunk is full it is flushed to the disk and m-mapped (memory mapped) to free up memory

Prom startup now happens in these stages
 - Iterate the m-maped chunks from disk and keep a map of series reference to its slice of mmapped chunks.
- Iterate the WAL as usual. Whenever we create a new series, look for it's mmapped chunks in the map created before and add it to that series.

If a head chunk is corrupted the currpted one and all chunks after that are deleted and the data after the corruption is recovered from the existing WAL which means that a corruption in m-mapped files results in NO data loss.

[Mmaped chunks format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/head_chunks.md)  - main difference is that the chunk for mmaping now also includes series reference because there is no index for mapping series to chunks.
[The block chunks](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/chunks.md) are accessed from the index which includes the offsets for the chunks in the chunks file - example - chunks of series ID have offsets 200, 500 etc in the chunk files.
In case of mmaped chunks, the offsets are stored in memory and accessed from that. During WAL replay, these offsets are restored by iterating all m-mapped chunks as stated above by matching the series id present in the chunk header and offset of that chunk in that file.

**Prombench results**

_WAL Replay_

1h Wal reply time
30% less wal reply time - 4m31 vs 3m36
2h Wal reply time
20% less wal reply time - 8m16 vs 7m

_Memory During WAL Replay_

High Churn:
10-15% less RAM -  32gb vs 28gb
20% less RAM after compaction 34gb vs 27gb
No Churn:
20-30% less RAM -  23gb vs 18gb
40% less RAM after compaction 32.5gb vs 20gb

Screenshots are in [this comment](https://github.com/prometheus/prometheus/pull/6679#issuecomment-621678932)


Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-06 21:00:00 +05:30
Ben Ye
1e4e37144d
Fixed wrongly handled not ready TSDB on web and API. (#7182)
* fix federate endpoint panic

Signed-off-by: yeya24 <yb532204897@gmail.com>

* Fixed all cases of not ready TSDB being wrongly handled.

* Fixed issue for federation.
* Ensured this will never happen again thanks to interfaces
* Fixes same issue for stats.
* Added tests for readiness.
* Fixed bug in stats. It was:
   status.MaxTime = db.Head().MaxTime()
   status.MinTime = db.Head().MaxTime()


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-04-29 17:16:14 +01:00
Marek Slabicki
8224ddec23
Capitalizing first letter of all log lines (#7043)
Signed-off-by: Marek Slabicki <thaniri@gmail.com>
2020-04-11 09:22:18 +01:00
Brad Walker
3348930df5
Replace fileutil.ReadDir with ioutil.ReadDir (#7029) (#7033)
* tsdb: Replace fileutil.ReadDir with ioutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>

* tsdb: Remove fileutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>
2020-04-06 19:04:20 +05:30