prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2026-04-01 03:41:05 +02:00

Author	SHA1	Message	Date
Julien	16876bab95	Merge pull request #18200 from roidelapluie/roidelapluie/retention-validation Multiple fixes in retention configuration	2026-03-20 12:27:37 +01:00
Carrie Edwards	a0d0a8efe8	Remove setting of xor2 encoding option in db open Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>	2026-03-12 11:08:33 -07:00
Carrie Edwards	a679ab5eb4	Add xor2-encoding feature flag Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>	2026-03-12 11:07:00 -07:00
György Krajcsovits	0dac72ee94	feat(tsdb): register st_storage in feature API Register the st-storage feature flag in the feature registry via the TSDB options, consistent with how other TSDB features like exemplar_storage and delayed_compaction are registered. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Coded with Claude Sonnet 4.6.	2026-03-12 16:01:04 +01:00
Julien Pivotto	3675a5e56c	tsdb: fix unit mismatch in retention duration on config reload conf.StorageConfig.TSDBConfig.Retention.Time is model.Duration which is type-aliased to time.Duration (nanoseconds), but RetentionDuration is int64 in milliseconds. The missing division by time.Millisecond caused the metric prometheus_tsdb_retention_limit_seconds to be reported 1e6 times too large after a config reload. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-02-26 16:44:49 +01:00
Julien Pivotto	dcfa1b96c6	config: validate TSDB retention settings during config parsing Move retention validation from tsdb/db.go into a TSDBRetentionConfig UnmarshalYAML method so that invalid values are rejected at config load/reload time rather than at apply time. - Reject negative retention size values. - Reject retention percentage values above 100. - Simplify ApplyConfig to assign retention values unconditionally, enabling setting a value back to 0 to disable it. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-02-26 15:16:09 +01:00
bwplotka	8f3a6020d8	Merge branch 'main' into st-main-sync2	2026-02-25 13:54:25 +00:00
Julien	9d38077e50	Merge pull request #18080 from ldufr/ldufresne/retention-size-percentage Add percentage based retention	2026-02-24 15:50:36 +01:00
Laurent Dufresne	c76e78d0a4	Added test for percentage-based retention Signed-off-by: Laurent Dufresne <laurent.dufresne@grafana.com>	2026-02-24 15:28:45 +01:00
Laurent Dufresne	971143edac	Added `Retention.Percentage` to config file with runtime config reloading Signed-off-by: Laurent Dufresne <laurent.dufresne@grafana.com>	2026-02-24 15:28:20 +01:00
Jérôme LOYET	696679e50c	Add `storage.tsdb.retention.percentage` config Signed-off-by: Jérôme LOYET <822436+fatpat@users.noreply.github.com> Signed-off-by: Laurent Dufresne <laurent.dufresne@grafana.com>	2026-02-24 15:27:45 +01:00
bwplotka	56c46af6a6	Merge branch 'main' into st-f-main	2026-02-23 10:00:39 +00:00
George Krajcsovits	223f016c44	feat(tsdb): allow using ST capable XOR chunks - retain format on read (#18013 ) * feat(tsdb): allow appending to ST capable XOR chunk optionally Only for float samples as of now. Supports for in-order and out-of-order samples. Make sure that on readout the ST capable chunks are returned automatically. When the chunks are returned as is, this is trivially true. When a chunk needs to be re-coded due to deletion (tombstone) markers, we take the encoding of the original chunk. When a chunk needs to be created from overlapping chunks, we observe whether ST is zero or not and create the new chunk based on that. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2026-02-20 09:15:51 +01:00
Kyle Eckhart	ae062151cd	tsdb/wlog: Remove any temporary checkpoints when creating a Checkpoint (#17598 ) * RemoveTmpDirs function to tsdbutil * Refactor db to use RemoveTmpDirs and no longer cleanup checkpoint tmp dirs * Use RemoveTmpDirs in wlog checkpoint to cleanup all checkpoint tmp folders * Add tests for RemoveTmpDirs * Ensure db.Open will still cleanup extra temporary checkpoints Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>	2026-02-17 09:23:54 +01:00
Ganesh Vernekar	fe5cb190e6	tsdb: Add metrics for stale series compaction (#17957 ) Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>	2026-02-06 09:05:56 +00:00
Ben Kochie	8d491cc642	tsdb: Migrate multi-errors to errors package (#17768 ) Modernize tsdb package by migrating multi-error handling to the standard library errors package. * Add a modernized CloseAll helper. Signed-off-by: SuperQ <superq@gmail.com>	2026-02-04 10:41:57 +01:00
Arve Knudsen	00a7faa2e3	tsdb: fix division by zero in stale series compaction (#17952 ) Guard the stale series ratio calculation by checking numSeries > 0 before computing the ratio. This prevents division by zero when the head has no series. Fixes #17949 Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2026-01-29 08:06:00 +01:00
Bartlomiej Plotka	2597a12080	st: Add a hidden 'st-storage' feature flag for PROM-60 (#17907 ) Signed-off-by: bwplotka <bwplotka@gmail.com> Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2026-01-28 09:05:54 +00:00
Ganesh Vernekar	3e4a094dbb	Add stale_series_compaction_threshold config file option Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>	2026-01-23 18:12:34 -08:00
Ganesh Vernekar	43e69388df	tsdb: Add stale series compaction support in the DB Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>	2026-01-23 18:00:08 -08:00
Ben Kochie	e14795bbf4	Remove copyright date from headers (#17785 ) Remove copyright dates from various files as part of [PROM-50]. [PROM-50]: https://github.com/prometheus/proposals/blob/main/proposals/0050-remove-copyright-dates.md Signed-off-by: SuperQ <superq@gmail.com>	2026-01-05 13:46:21 +01:00
NamanParlecha	c94101d023	TSDB: Option to configure TSDB Block Reload Interval (#16728 ) Add --storage.tsdb.block-reload-interval flag to configure TSDB block reload interval. --------- Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com> Signed-off-by: NamanParlecha <namanparlecha@gmail.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>	2025-12-15 09:31:17 +01:00
Julien	f73aba34cd	Merge pull request #17427 from roidelapluie/roidelapluie/ffapi API: Add a /api/v1/features endpoint	2025-12-10 10:14:03 +01:00
Julien Pivotto	a5671a002f	API: Add a /api/v1/features endpoint Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2025-12-09 16:13:14 +01:00
bwplotka	e7e45090e4	refactor(appenderV2): port TSDB non-head tests Signed-off-by: bwplotka <bwplotka@gmail.com>	2025-12-09 10:39:45 +00:00
bwplotka	0b70a07572	refactor(appenderV2): add TSDB AppenderV2 implementation Signed-off-by: bwplotka <bwplotka@gmail.com> tmp Signed-off-by: bwplotka <bwplotka@gmail.com>	2025-12-09 10:39:43 +00:00
Bartlomiej Plotka	f6ca7145ca	refactor(tsdb): use one test newTestDB constructor (#17638 ) For tests only, we had various ways of opening DB. Reduced to one instead of: * Open * newTestDB * newTestDBOpts * openTestDB This so https://github.com/prometheus/prometheus/pull/17629 is smaller and bit easier. Also for test maintainability and consistency. Signed-off-by: bwplotka <bwplotka@gmail.com>	2025-12-03 07:55:48 +00:00
Łukasz Mierzwa	8a1086a128	feat: Add flag that blocks lvl 1 compactions until upload is confirmed in an external JSON file (#17435 ) * Delay compactions until Thanos uploads all blocks Using Thanos sidecar with Prometheus requires us to disable TSDB compactions on Prometheus side by setting --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to the same value. See https://thanos.io/tip/components/sidecar.md. The main problem this avoids is that Prometheus might compact given block before Thanos uploads it, creating a gap in Thanos metrics. Thanos does not upload compacted blocks because that would upload the same sample multiple times. You can tell Thanos to upload compacted blocks but that is aimed at one time migrations. This patch creates a bridge between Thanos and Prometheus by allowing Prometheus to read the shipper file Thanos creates, where it tracks which blocks were already uploaded, and using that data delays compaction of blocks until they are marked as uploaded by Thanos. Thanks to this both services can coordinate with each other (in a way) and we can stop disabling compaction on Prometheus side when Thanos uploads are enabled. The reason to have this is that disabling compactions have very dramatic performance cost. Since most time series exist for longer than a single block duration (2h by default) large chunks of block index will reference the same series, so 10 * 2h blocks will each have an index that is usually fairly big and is almost the same for all 10 blocks. Compaction de-duplicates the index so merging 10 blocks together would leave us with a single index that is around the same size as each of these 10 2h blocks would have (plus some extra for series that only exists in some blocks, but not all). Every range query that iterates over all 10 blocks would then have to read each index and so we're doing 10x more work then if we had a single compacted block. 
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> * Rename structs and functions to make this more generic Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> * Address review comments Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> * Cache UploadMeta for 1 minute Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> --------- Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-12-02 10:39:45 +00:00
Ben Kochie	204249fcb5	Update golangci-lint (#17478 ) * Update golangci-lint to v2.6.0 * Fixup various linting issues. * Fixup deprecations. * Add exception for `labels.MetricName` deprecation. Signed-off-by: SuperQ <superq@gmail.com>	2025-11-05 13:47:34 +01:00
Minh Nguyen	ad4b59c504	tsdb: Deprecate retention flags; add tsdb.retention runtime configuration (#17026 ) * Move storage from CL to config file Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * Fix .md Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * run make cli-documentation Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * fix Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * run make cli-documentation Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * nit_fixed Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * fix Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * add test and update configuration.md Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> * fix lint Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com> --------- Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>	2025-10-27 14:51:33 +00:00
beorn7	ad7d1aed99	Phase out native histogram feature flag The detailed plan for this is laid out in https://github.com/prometheus/prometheus/issues/16572 . This commit adds a global and local scrape config option `scrape_native_histograms`, which has to be set to true to ingest native histograms. To ease the transition, the feature flag is changed to simply set the default of `scrape_native_histograms` to true. Further implications: - The default scrape protocols now depend on the `scrape_native_histograms` setting. - Everywhere else, histograms are now "on by default". Documentation beyond the one for the feature flag and the scrape config are deliberately left out. See https://github.com/prometheus/prometheus/pull/17232 for that. Signed-off-by: beorn7 <beorn@grafana.com>	2025-10-15 14:50:52 +02:00
Lukasz Mierzwa	31282d67b7	Log when GC / block write starts Right now Prometheus only logs when these operations are completed. It's a bit surprising to see suddenly a message saying "I was busy doing X for the past N minutes" so let's add a message when the operation starts, so it's easier to understand what Prometheus was doing at any point in time when reading logs. Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-08-26 10:30:22 +01:00
Matthieu MOREL	cef219c31c	chore: enable unused-receiver rule from revive Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-08-04 09:43:33 +00:00
sujal shah	4408a6bcaf	api: Create `/status/tsdb/blocks` endpoint. this endpoint serves blocks data to the client. Signed-off-by: sujal shah <sujalshah28092004@gmail.com>	2025-07-04 03:13:54 +05:30
Ayoub Mrini	317acb3d68	refactor: use the built-in max/min to simplify the code (#16617 ) Signed-off-by: carrychair <linghuchong404@gmail.com>	2025-05-27 14:42:50 +02:00
Ayoub Mrini	2edc3ed6c5	feat(tsdb): introduce --use-uncached-io feature flag and allow using it for chunks writing (#15365 ) Signed-off-by: machine424 <ayoubmrini424@gmail.com> Signed-off-by: Ayoub Mrini <ayoubmrini424@gmail.com>	2025-05-21 14:42:30 +02:00
carrychair	e83dc66bdb	refactor: use the built-in max/min to simplify the code Signed-off-by: carrychair <linghuchong404@gmail.com>	2025-05-20 14:36:39 +08:00
Yuchen Wang	5630a3906a	fix typo (#16480 ) Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>	2025-04-25 09:27:58 +02:00
Ryan Wu	7d73c1d3f8	refactor[discovery, tsdb]: simplify error handling and remove redundant checks (#16328 ) * refactor: simplify error handling and remove redundant checks Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * Add the comment for return of reloading blocks failure Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * Add the comment for return of reloading blocks failure Signed-off-by: Ryan Wu <rongjun0821@gmail.com> --------- Signed-off-by: Ryan Wu <rongjun0821@gmail.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>	2025-03-27 12:20:59 +01:00
Fiona Liao	37c2ebb5fd	Make out-of-order native histograms flag a no-op and always enable (#16207 ) * Remove experimental out-of-order native histogram flag This feature has been available in Prometheus since September 2024, and has no known issues. Therefore proposing to remove the flag entirely and always have it on. Note that there are still two settings that need to be configured (out-of-order time window > 0 and native histograms enabled) for this feature to work. Signed-off-by: Fiona Liao <fiona.liao@grafana.com> * Update CHANGELOG Signed-off-by: Fiona Liao <fiona.liao@grafana.com> * Keep feature flag with warning Signed-off-by: Fiona Liao <fiona.liao@grafana.com> * Update CHANGELOG Signed-off-by: Fiona Liao <fiona.liao@grafana.com> * Update tsdb/head_append.go Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com> Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com> * Update CHANGELOG.md Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com> Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com> * Update tsdb/head_append.go Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com> Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com> * Additional cleanup of comments and test names Signed-off-by: Fiona Liao <fiona.liao@grafana.com> --------- Signed-off-by: Fiona Liao <fiona.liao@grafana.com> Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com> Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>	2025-03-18 10:59:02 +00:00
Bartlomiej Plotka	7a7bc65237	Add util/compression package to consolidate snappy/zstd use in Prometheus. (#16156 ) # Conflicts: # tsdb/db_test.go Apply suggestions from code review tmp Addressed comments. Update util/compression/buffers.go Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>	2025-03-10 10:36:26 +00:00
Arve Knudsen	7cbf749096	Upgrade to github.com/oklog/ulid/v2 (#16168 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2025-03-05 16:03:25 +01:00
Lukasz Mierzwa	e3728122b2	Update comments for methods that require a lock Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-01-09 17:20:10 +00:00
Lukasz Mierzwa	a1740cd2e7	Remove unnecessary locks Compact() is an uppercase function that deals with locks on its own, so we shouldn't have a lock around it. Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>	2025-01-09 17:06:05 +00:00
Łukasz Mierzwa	d106b3beb7	Wrap db.blocks read in a read lock We don't hold db.mtx lock when trying to read db.blocks here so we need a read lock around this loop. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2025-01-09 17:06:05 +00:00
Łukasz Mierzwa	b880cea613	Fix locks in db.reloadBlocks() This partially reverts ae3d392aa9c3a5c5f92f8116738c5b32c98b09a7. ae3d392aa9c3a5c5f92f8116738c5b32c98b09a7 added a call to db.mtx.Lock() that lasts for the entire duration of db.reloadBlocks(), previous db.mtx would be locked only during critical part of db.reloadBlocks(). The motivation was to protect against races: `9e0351e161 (r555699794)` The 'reloads' being mentioned are (I think) reloadBlocks() calls, rather than db.reload() or other methods. TestTombstoneCleanRetentionLimitsRace was added to catch this but I wasn't able to ever get any error out of it, even after disabling all calls to db.mtx in reloadBlocks() and CleanTombstones(). To make things more complicated CleanupTombstones() itself calls reloadBlocks(), so it seems that the real issue is that we might have concurrent calls to reloadBlocks(). The problem with this change is that db.reloadBlocks() can take a very long time, that's because it might need to load very large blocks from disk, which is slow. While db.mtx is locked a large chunk of the db is locked, including queries, since db.mtx read lock is needed for db.Querier() call. One of the issues this manifests itself as is a gap in all metrics and blocked queries just after a large block compaction happens. When compaction merges multiple day-or-more blocks into a week-or-more block it create a single very big block. After that block is written it needs to be loaded and that seems to be taking many seconds (30-45), during which mtx is held and everything is blocked. Turns out that there is another lock that is more fine grained and aimed at this specific use case: // cmtx ensures that compactions and deletions don't run simultaneously. cmtx sync.Mutex All calls to reloadBlocks() are wrapped inside cmtx lock. The only exception is db.reload() which this change fixes. 
We can't add cmtx lock inside reloadBlocks() itself because it's called by a number of functions, some of which are already holding cmtx. Looking at the code I think it is sufficient to hold cmtx and skip a reloadBlocks() wide mtx call. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2025-01-09 17:05:39 +00:00
johncming	061400e31b	tsdb: export CheckpointPrefix constant (#15636 ) Exported the CheckpointPrefix constant to be used in other packages. Updated references to the constant in db.go and checkpoint.go files. This change improves code readability and maintainability. Signed-off-by: johncming <johncming@yahoo.com> Co-authored-by: johncming <conjohn668@gmail.com>	2024-12-29 17:54:45 +01:00
Ben Ye	140f4aa9ae	feat: Allow customizing TSDB postings decoder (#13567 ) * allow customizing TSDB postings decoder --------- Signed-off-by: Ben Ye <benye@amazon.com>	2024-11-11 07:59:24 +01:00
Matthieu MOREL	af1a19fc78	enable errorf rule from perfsprint linter Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2024-11-06 16:50:36 +01:00
Alban Hurtaud	4b56af7eb8	Add hidden flag for the delayed compaction random time window (#14919 ) * Add hidden flag for the delayed compaction random time window Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Update cmd/prometheus/main.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> * Update cmd/prometheus/main.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> * Update tsdb/db.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> * Fix flag name according to review - add test for delay Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Fix afer main rebase Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Implement review comments Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Update generatedelaytest to try with limit values Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> --------- Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>	2024-11-04 08:26:26 +01:00

214 Commits