Rationales:
* metadata-wal-records might be deprecated and replaced going forward: https://github.com/prometheus/prometheus/issues/15911
* PRW 2.0 works without metadata just fine (although it sends untyped metrics as expected).
Signed-off-by: bwplotka <bwplotka@gmail.com>
Around Mimir compactions we see the logging in ShardedPostings make massive allocations and drive GC up to 50% of CPU.
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
'defer' only runs at the end of the function, so explicitly close the
querier after we finish with it. Also check it didn't error.
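A rough sketch of the pattern (a minimal illustrative example, not the exact code in this change; it assumes the two-argument db.Querier(mint, maxt) signature):

    func queryBlock(db *tsdb.DB, mint, maxt int64) error {
        q, err := db.Querier(mint, maxt)
        if err != nil {
            return err
        }
        // ... use q ...
        // 'defer q.Close()' would only run when the enclosing function
        // returns; close explicitly once we're done and check the error.
        if err := q.Close(); err != nil {
            return err
        }
        return nil
    }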
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
'defer' only runs at the end of the function, so introduce some more
functions / move the start, so that 'defer' can run at the end of the
logical block.
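Roughly the shape of the change, with hypothetical names (processAll, processOne and blockRange are illustrative only):

    // Before: a defer inside a long loop only fires when the whole
    // function returns. After: each iteration is its own function,
    // so the defer runs at the end of the logical block.
    func processAll(db *tsdb.DB, ranges []blockRange) error {
        for _, r := range ranges {
            if err := processOne(db, r); err != nil {
                return err
            }
        }
        return nil
    }

    func processOne(db *tsdb.DB, r blockRange) error {
        q, err := db.Querier(r.mint, r.maxt)
        if err != nil {
            return err
        }
        defer q.Close() // now runs when this helper returns, i.e. per block
        // ... use q ...
        return nil
    }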
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Compact() is an exported function that handles locking on its own, so we shouldn't hold a lock around it.
Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>
We don't hold db.mtx lock when trying to read db.blocks here so we need a read lock around this loop.
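Sketch of the fix (the loop body is a placeholder, not the actual code):

    // db.blocks may be replaced concurrently by reloadBlocks(),
    // so reading it requires db.mtx held for reading.
    db.mtx.RLock()
    for _, b := range db.blocks {
        _ = b // ... read-only work on each block ...
    }
    db.mtx.RUnlock()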
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
This test ensures that running db.reloadBlocks() and db.CleanTombstones() at the same time doesn't race.
The problem is that CleanTombstones() is a public method while reloadBlocks() is internal.
CleanTombstones() takes the db.cmtx lock, while reloadBlocks() is not protected by any locks at all; it expects the public method through which it was called to do that.
So having a race between these two is not unexpected and we shouldn't really be testing this.
db.cmtx ensures that no other function can modify the list of open blocks, so the scenario tested here cannot happen.
If it did happen it would only be because some other method doesn't acquire the db.cmtx lock, something this test cannot detect.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
This partially reverts ae3d392aa9c3a5c5f92f8116738c5b32c98b09a7.
ae3d392aa9c3a5c5f92f8116738c5b32c98b09a7 added a call to db.mtx.Lock() that lasts for the entire duration of db.reloadBlocks();
previously db.mtx was locked only during the critical part of db.reloadBlocks().
The motivation was to protect against races:
9e0351e161 (r555699794)
The 'reloads' being mentioned are (I think) reloadBlocks() calls, rather than db.reload() or other methods.
TestTombstoneCleanRetentionLimitsRace was added to catch this but I wasn't able to ever get any error out of it, even after disabling all calls to db.mtx in reloadBlocks() and CleanTombstones().
To make things more complicated CleanTombstones() itself calls reloadBlocks(), so it seems that the real issue is that we might have concurrent calls to reloadBlocks().
The problem with this change is that db.reloadBlocks() can take a very long time, because it might need to load very large blocks from disk, which is slow.
While db.mtx is locked a large part of the DB is blocked, including queries, since a db.mtx read lock is needed for the db.Querier() call.
One way this manifests itself is as a gap in all metrics and blocked queries just after a large block compaction happens.
When compaction merges multiple day-or-more blocks into a week-or-more block it creates a single very big block.
After that block is written it needs to be loaded, and that seems to take many seconds (30-45), during which mtx is held and everything is blocked.
Turns out that there is another lock that is more fine grained and aimed at this specific use case:
// cmtx ensures that compactions and deletions don't run simultaneously.
cmtx sync.Mutex
All calls to reloadBlocks() are wrapped inside cmtx lock. The only exception is db.reload() which this change fixes.
We can't add cmtx lock inside reloadBlocks() itself because it's called by a number of functions, some of which are already holding cmtx.
Looking at the code I think it is sufficient to hold cmtx and skip the reloadBlocks()-wide mtx lock.
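So the fix is, roughly (a simplified sketch of the idea, not the full diff): make db.reload() take cmtx like every other caller of reloadBlocks(), and let reloadBlocks() take db.mtx only around the short critical section where the block list is swapped:

    func (db *DB) reload() error {
        db.cmtx.Lock() // serialise with compactions and deletions
        defer db.cmtx.Unlock()
        if err := db.reloadBlocks(); err != nil {
            return fmt.Errorf("reloadBlocks: %w", err)
        }
        // ... rest of reload() ...
        return nil
    }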
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Fix issues raised by staticcheck
We are not enabling staticcheck explicitly, though, because it has too many false positives.
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
When creating dummy data for benchmarks, call `Commit()` periodically to
avoid growing the appender to enormous size.
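For example, something along these lines (a sketch inside a benchmark, assuming head is a *tsdb.Head and b is the *testing.B; not the exact benchmark code):

    app := head.Appender(context.Background())
    for i := 0; i < numSeries; i++ {
        _, err := app.Append(0, labels.FromStrings("foo", strconv.Itoa(i)), ts, 0)
        if err != nil {
            b.Fatal(err)
        }
        // Commit periodically so the open appender stays small instead
        // of accumulating every sample before one huge final Commit().
        if (i+1)%1000 == 0 {
            if err := app.Commit(); err != nil {
                b.Fatal(err)
            }
            app = head.Appender(context.Background())
        }
    }
    if err := app.Commit(); err != nil {
        b.Fatal(err)
    }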
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Exported the CheckpointPrefix constant to be used in other packages.
Updated references to the constant in db.go and checkpoint.go files.
This change improves code readability and maintainability.
Signed-off-by: johncming <johncming@yahoo.com>
Co-authored-by: johncming <conjohn668@gmail.com>
This enables it to take advantage of a more compact data structure
since all postings are known to be `*ListPostings`.
Remove the `Get` member which was not used for anything else, and fix up
tests.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Now we can call it with more specific types which is more efficient than
making everything go through the `Postings` interface.
Benchmark the concrete type.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
We need to create more postings entries so the merger has some work to do.
Not material for the regexp ones as they match so few series.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels
The labels for each series in the Head take up some space in the
Postings index, but far more space in the `memSeries` structure.
Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.
Provide three implementations of this function for the three Labels
versions.
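A hedged sketch of the kind of callback a caller might pass in (the real helpers and their exact signatures may differ per labels implementation):

    // labelsSize estimates the memory taken by one series' labels by
    // summing name and value lengths; a slice-based implementation
    // would add per-label overhead on top of this.
    labelsSize := func(ls labels.Labels) uint64 {
        var size uint64
        ls.Range(func(l labels.Label) {
            size += uint64(len(l.Name) + len(l.Value))
        })
        return size
    }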
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Remove the 2 minute timeout as the default is 2 hours and wouldn't
interfere with the test. Otherwise the extra samples combined with
race detection can push the test over 2 minutes and make it fail.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
The segment size was too low for the additional NHCB data, thus it created
more segments than expected. This meant that fewer records ended up in the lower
numbered segments, which meant more were kept.
FAIL: TestCheckpoint (4.05s)
FAIL: TestCheckpoint/compress=none (0.22s)
checkpoint_test.go:361:
Error Trace: /home/krajo/go/github.com/prometheus/prometheus/tsdb/wlog/checkpoint_test.go:361
Error: "0.8586956521739131" is not less than "0.8"
Test: TestCheckpoint/compress=none
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>