1395 Commits

Author SHA1 Message Date
Fabian Reinartz
dfaf31a1da Move web/httputils to pkg/httputil and add DeadlineClient to it 2015-06-01 21:12:31 +02:00
Fabian Reinartz
2317b001d0 Move flock package to pkg/flock 2015-06-01 21:12:31 +02:00
Fabian Reinartz
3c8fbf1e15 Move test package to pkg/testutil 2015-06-01 21:12:31 +02:00
Fabian Reinartz
aff01e29c3 Limit retrievable samples to retention window.
The storage does not delete data immediately after the retention period.
We don't want to retrieve this data as it causes artifacts.
2015-05-27 13:13:59 +02:00
Fabian Reinartz
a92134a947 Merge pull request #724 from prometheus/fabxc/storage-startup
Read from indexing queue during crash recovery.
2015-05-23 16:50:47 +02:00
Fabian Reinartz
6e319532cf Read from indexing queue during crash recovery.
Change #704 introduced a regression that started reading the queue only
after potential crash recovery. When more than the queue capacity was
indexed, Prometheus deadlocked.
2015-05-23 15:32:35 +02:00
beorn7
dbcb3d9333 Use an RW lock to checkpoint fingerprint mappings.
This has to be backported to 0.13.x.
2015-05-23 14:05:05 +02:00
beorn7
3b9ab546e6 Add metrics to count inconsistencies and fp collisions. 2015-05-21 18:46:20 +02:00
Björn Rabenstein
c44e7cd105 Merge pull request #706 from prometheus/beorn7/persistence2
Improve iterator performance.
2015-05-21 13:48:52 +02:00
Fabian Reinartz
112a778922 Align int64s for atomic operations 2015-05-21 01:38:50 +02:00
beorn7
3b9c421a69 Weed out all the [Gg]et* method names.
The only exception is getNumChunksToPersist to avoid naming the struct
member numChunksToPersist in a weird way.
2015-05-20 19:13:06 +02:00
Julius Volz
267fd34156 Switch Prometheus to use github.com/prometheus/log.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
2015-05-20 18:19:32 +02:00
beorn7
81b190bf45 Remove locking from series iterator. Cache chunk iterators. 2015-05-20 16:19:34 +02:00
beorn7
cd5574bf8a Make chunk and series iterators more efficient. 2015-05-20 16:19:34 +02:00
beorn7
f79c694be5 Add benchmarks for series iterator methods. 2015-05-20 16:19:34 +02:00
Fabian Reinartz
f59a449a24 Fix storage test 2015-05-20 16:12:07 +02:00
Fabian Reinartz
d8440d75f1 Do not start storage processing before Start() is called. 2015-05-19 13:51:45 +02:00
beorn7
d1a93655a1 Fix typo. 2015-05-11 17:15:30 +02:00
beorn7
7c6466d476 Reserve only ~1M FPs for the mapping.
That reduces the chance of having a fingerprint in the reserved area.
2015-05-08 18:10:56 +02:00
beorn7
ac75dc2812 Avoid archive lookup for known mapped FPs. 2015-05-08 16:39:26 +02:00
beorn7
ed810b45bf Improvements after review. 2015-05-08 13:35:39 +02:00
beorn7
c36e0e05f1 Add crash recovery of fingerprint mappings. 2015-05-07 18:58:14 +02:00
beorn7
2235cec175 Handle fingerprint collisions. 2015-05-07 18:17:59 +02:00
Fabian Reinartz
eeca323d24 Merge branch 'master' into promql 2015-05-06 13:04:54 +02:00
beorn7
9820e5fe99 Use FastFingerprint where appropriate. 2015-05-06 12:00:58 +02:00
Scott Worley
e5f92d35fe Fix storage/local tests for 32-bit systems 2015-04-30 14:19:48 -07:00
Fabian Reinartz
32b7595c47 Create promql package with lexer/parser.
This commit creates a (so far unused) package. It contains the a custom
lexer/parser for the query language.

ast.go: New AST that interacts well with the parser.
lex.go: Custom lexer (new).
lex_test.go: Lexer tests (new).
parse.go: Custom parser (new).
parse_test.go: Parser tests (new).
functions.go: Changed function type, dummies for parser testing (barely changed/dummies).
printer.go: Adapted from rules/ and adjusted to new AST (mostly unchanged, few additions).
2015-04-23 16:04:50 +02:00
beorn7
a052d32609 Comment improvement. 2015-04-14 10:49:43 +02:00
beorn7
66fc61f9b7 Make bufPool a member of the persistence struct. 2015-04-14 10:43:09 +02:00
beorn7
b02d900e61 Improve chunk and chunkDesc loading.
Also, clean up some things in the code (especially introduction of the
chunkLenWithHeader constant to avoid the same expression all over the place).

Benchmark results:

BEFORE
BenchmarkLoadChunksSequentially     5000            283580 ns/op          152143 B/op        312 allocs/op
BenchmarkLoadChunksRandomly        20000             82936 ns/op           39310 B/op         99 allocs/op
BenchmarkLoadChunkDescs            10000            110833 ns/op           15092 B/op        345 allocs/op

AFTER
BenchmarkLoadChunksSequentially    10000            146785 ns/op          152285 B/op        315 allocs/op
BenchmarkLoadChunksRandomly        20000             67598 ns/op           39438 B/op        103 allocs/op
BenchmarkLoadChunkDescs            20000             99631 ns/op           12636 B/op        192 allocs/op

Note that everything is obviously loaded from the page cache (as the
benchmark runs thousands of times with very small series files). In a
real-world scenario, I expect a larger impact, as the disk operations
will more often actually hit the disk. To load ~50 sequential chunks,
this reduces the iops from 100 seeks and 100 reads to 1 seek and 1
read.
2015-04-13 21:06:04 +02:00
beorn7
c563398c68 Remove obsolete debug message. 2015-04-13 16:59:52 +02:00
beorn7
c5fa0b90c3 Fix the case where a series in memory has 0 chunks, but chunks on disk.
This is actually completely normal for a freshly unarchived series.

Test added to expose.
2015-04-09 15:57:11 +02:00
Björn Rabenstein
d8e515e9cb Merge pull request #617 from prometheus/influxdb-write-support
Add experimental InfluxDB write support.
2015-04-07 13:23:06 +02:00
Julius Volz
593e565688 Allow writing to InfluxDB/OpenTSDB at the same time. 2015-04-02 20:24:38 +02:00
beorn7
3035b8bfdd Adaptively reduce the wait time for memory series maintenance.
This will make in-memory series maintenance the faster the more chunks
are waiting for persistence.
2015-04-01 17:52:03 +02:00
Julius Volz
61fb688dd9 Add experimental InfluxDB write support. 2015-04-01 02:03:16 +02:00
beorn7
fbc44d8f95 Add benchmark for loading chunks and chunk descs. 2015-03-19 19:28:21 +01:00
beorn7
6a21f73898 Fixes after review. 2015-03-19 17:54:59 +01:00
beorn7
51d35f4481 Instrument series maintenance durations. 2015-03-19 17:06:16 +01:00
beorn7
12ae6e9203 Increase resilience of the storage against data corruption - step 4.
Step 4: Add a configurable sync'ing of series files after modification.
2015-03-19 15:58:02 +01:00
beorn7
11bd9ce1bd Increase resilience of the storage against data corruption - step 3.
Step 3: Remember the mtime of series files and make use of it to
detect series files that are not the one the checkpoint thinks they
are.
2015-03-19 15:44:11 +01:00
beorn7
e25cca823c Increase resilience of the storage against data corruption - step 2.
Step 2: Add a flag -storage.local.pedantic-checks to check every
series file.

Also, remove countPersistedHeadChunks channel, which is unused.
2015-03-19 12:06:15 +01:00
beorn7
3d8d8928be Increase resilience of the storage against data corruption - step 1.
Step 1: Admit the problem by turning the various "panic"s into logged
errors, followed by marking the persistence as dirty.
2015-03-19 11:49:18 +01:00
beorn7
da7c0461c6 Rename persist queue len/cap to num/max chunks to persist.
Remove deprecated flag storage.incoming-samples-queue-capacity.
2015-03-18 19:36:41 +01:00
beorn7
a075900f9a Merge branch 'beorn7/persistence' into beorn7/ingestion-tweaks 2015-03-18 19:09:31 +01:00
beorn7
1d8fc7d56f Change minor things after code review. 2015-03-18 19:09:07 +01:00
beorn7
be11cb2b07 Remove the sample ingestion channel.
The one central sample ingestion channel has caused a variety of
trouble. This commit removes it. Targets and rule evaluation call an
Append method directly now. To incorporate multiple storage backends
(like OpenTSDB), storage.Tee forks the Append into two different
appenders.

Note that the tsdb queue manager had its own queue anyway. It was a
queue after a queue... Much queue, so overhead...

Targets have their own little buffer (implemented as a channel) to
avoid stalling during an http scrape. But a new scrape will only be
started once the old one is fully ingested.

The contraption of three pipelined ingesters was removed. A Target is
an ingester itself now. Despite more logic in Target, things should be
less confusing now.

Also, remove lint and vet warnings in ast.go.
2015-03-15 14:08:22 +01:00
beorn7
0056eaeb4f Redesign series maintenance and chunk persistence. 2015-03-14 22:05:23 +01:00
beorn7
5bea942d8e Improve various things around chunk encoding.
A number of mostly minor things:

- Rename chunk type -> chunk encoding.

- After all, do not carry around the chunk encoding to all parts of
  the system, but just have one place where the encoding for new
  chunks is set based on the flag. The new approach has caveats as
  well, but the polution of so many method signatures is worse.

- Use the default chunk encoding for new chunks of existing
  series. (Previously, only new _series_ would get chunks with the
  default encoding.)

- Use an enum for chunk encoding. (But keep the version number for the
  flag, for reasons discussed previously.)

- Add encoding() to the chunk interface (so that a chunk knows its own
  encoding - no need to have that in a different top-level function).

- Got rid of newFollowUpChunk (which would keep the existing encoding
  for all chunks of a time series). Now only use newChunk(), which
  will create a chunk encoding according to the flag.

- Simplified transcodeAndAdd.

- Reordered methods of deltaEncodedChunk and doubleDeltaEncoded chunk
  to match the order in the chunk interface.

- Only transcode if the chunk is not yet half full. If more than half
  full, add a new chunk instead.
2015-03-14 19:03:20 +01:00
beorn7
9ecf93526d Sync the checkpoints.
Because that's what should be done with checkpoints.
2015-03-11 19:10:51 +01:00