prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2026-05-13 08:36:38 +02:00

Author	SHA1	Message	Date
avilevy	0f38319b92	feat(discovery,scrape): rename startup wait options and add DiscoveryReloadOnStartup - discovery: Rename `SkipInitialWait` to `SkipStartupWait` for clarity. - discovery: Pass `context.Context` to `flushUpdates` to handle cancellation and avoid leaks. - scrape: Add `DiscoveryReloadOnStartup` to `Options` to decouple startup discovery from `ScrapeOnShutdown`. - tests: Refactor `TestTargetSetTargetGroupsPresentOnStartup` and `TestManagerReloader` to use table-driven tests and `synctest` for better stability and coverage. Signed-off-by: avilevy <avilevy@google.com>	2026-03-25 22:00:51 +00:00
avilevy	405d13d479	scrape: Add TestManagerReloader and refactor discovery triggerSync Adds a new TestManagerReloader test suite using synctest to assert behavior of target updates, discovery reload ticker intervals, and ScrapeOnShutdown flags. Updates setupSynctestManager to allow skipping initial config setup by passing an interval of 0. Also renames the 'keep' variable to 'triggerSync' in ApplyConfig inside discovery/manager.go for clarity, and adds a descriptive comment. Signed-off-by: avilevy <avilevy@google.com>	2026-03-25 19:18:22 +00:00
avilevy	5a96218714	feat(discovery): add SkipInitialWait to bypass initial startup delay This adds a SkipInitialWait option to the discovery Manager, allowing consumers sensitive to startup latency to receive the first batch of discovered targets immediately instead of waiting for the updatert ticker. To support this without breaking the immediate dropped target notifications introduced in #13147, ApplyConfig now uses a keep flag to only trigger immediate downstream syncs for obsolete or updated providers. This prevents sending premature empty target groups for brand-new providers on initial startup. Additionally, the scrape manager's reloader loop is updated to process the initial triggerReload immediately, ensuring the end-to-end pipeline processes initial targets without artificial delays. Signed-off-by: avilevy <avilevy@google.com>	2026-03-20 23:34:53 +00:00
avilevy	3018f35527	Merge branch 'refs/heads/main' into skip-wait-for-discovery	2026-03-20 23:21:59 +00:00
Bartlomiej Plotka	776a71749a	Merge pull request #18314 from ridwanmsharif/scrape/fix-jitter scrape: reset ticker to align target scrape times with offset and intervals	2026-03-20 09:52:32 +01:00
Ridwan Sharif	101ae73380	scrape: address comments on PR Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>	2026-03-20 05:58:22 +00:00
Ridwan Sharif	8e8cd480cb	scrape: Introduce an `offsetSeed` option for deterministic scrape offset calculation and utilize it in tests Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>	2026-03-17 20:16:05 +00:00
Ridwan Sharif	695db71c68	scrape: add test for distribution of scrapes Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>	2026-03-17 20:00:03 +00:00
Ridwan Sharif	caa250a29c	scrape: reset ticker to align target scrape times with offset and intervals Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>	2026-03-17 17:49:02 +00:00
avilevy	e22aabf8c9	Merge branch 'refs/heads/main' into skip-wait-for-discovery	2026-03-17 15:43:51 +00:00
Bartlomiej Plotka	a02e20d98e	Merge branch 'main' into feature/start-time Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2026-03-17 13:06:25 +01:00
avilevy18	bdfb3fc232	scrape: add option to manager to allow scraping at shutdown; add initial offset option (#18067) * Adding scrape on shutdown Signed-off-by: avilevy <avilevy@google.com> * scrape: replace skipOffsetting to make the test offset deterministic instead of skipping it entirely Signed-off-by: avilevy <avilevy@google.com> * renamed calculateScrapeOffset to getScrapeOffset Signed-off-by: avilevy <avilevy@google.com> * test(scrape): refactor time-based manager tests to use synctest Addresses PR feedback to remove flaky, time-based sleeping in the scrape manager tests. Add TestManager_InitialScrapeOffset and TestManager_ScrapeOnShutdown to use the testing/synctest package, completely eliminating real-world time.Sleep delays and making the assertions 100% deterministic. - Replaced httptest.Server with net.Pipe and a custom startFakeHTTPServer helper to ensure all network I/O remains durably blocked inside the synctest bubble. - Leveraged the skipOffsetting option to eliminate random scrape jitter, making the time-travel math exact and predictable. - Using skipOffsetting also safely bypasses the global singleflight DNS lookup in setOffsetSeed, which previously caused cross-bubble panics in synctest. - Extracted shared boilerplate into a setupSynctestManager helper to keep the test cases highly readable and data-driven. Signed-off-by: avilevy <avilevy@google.com> * Clarify use cases in InitialScrapeOffset comment Signed-off-by: avilevy <avilevy@google.com> * test(scrape): use httptest for mock server to respect context cancellation - Replaced manual HTTP string formatting over `net.Pipe` with `httptest.NewUnstartedServer`. - Implemented an in-memory `pipeListener` to allow the server to handle `net.Pipe` connections directly. This preserves `synctest` time isolation without opening real OS ports. - Added explicit `r.Context().Done()` handling in the mock HTTP handler to properly simulate aborted requests and scrape timeouts. - Validates that the request context remains active and is not prematurely cancelled during `ScrapeOnShutdown` scenarios. - Renamed `skipOffsetting` to `skipJitterOffsetting`. - Addressed other PR comments. Signed-off-by: avilevy <avilevy@google.com> * tmp Signed-off-by: bwplotka <bwplotka@gmail.com> * exp2 Signed-off-by: bwplotka <bwplotka@gmail.com> * fix Signed-off-by: bwplotka <bwplotka@gmail.com> * scrape: fix scrapeOnShutdown context bug and refactor test helpers The scrapeOnShutdown feature was failing during manager shutdown because the scrape pool context was being cancelled before the final shutdown scrapes could execute. Fix this by delaying context cancellation in scrapePool.stop() until after all scrape loops have stopped. In addition: - Added test cases to verify scrapeOnShutdown works with InitialScrapeOffset. - Refactored network test helper functions from manager_test.go to helpers_test.go. - Addressed other comments. Signed-off-by: avilevy <avilevy@google.com> * Update scrape/scrape.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com> --------- Signed-off-by: avilevy <avilevy@google.com> Signed-off-by: bwplotka <bwplotka@gmail.com> Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com> Co-authored-by: bwplotka <bwplotka@gmail.com>	2026-03-17 10:02:11 +00:00
bwplotka	c133a969af	Merge branch 'main' into start-time-main-sync	2026-03-12 08:28:15 +00:00
Matthieu MOREL	026d284c43	chore: fix httpNoBody issues from gocritic Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2026-03-02 20:06:30 +01:00
bwplotka	8f3a6020d8	Merge branch 'main' into st-main-sync2	2026-02-25 13:54:25 +00:00
avilevy	6c236f11f9	scrape: Bypass initial reload delay for ScrapeOnShutdown In short-lived environments like agent mode or serverless, the default 5-second `DiscoveryReloadInterval` can cause the process to terminate before the scrape manager has a chance to process targets and collect any metrics. Because the discovery manager sends an initial empty update upon configuration followed rapidly by the actual targets, simply waiting for a single reload trigger is insufficient—the real targets would still get trapped behind the ticker delay. This commit introduces an unthrottled startup loop in the `reloader` when `ScrapeOnShutdown` is enabled. It processes all incoming `triggerReload` signals immediately during the first interval. Once the initial tick fires, the `reloader` resets the ticker and falls back into its standard throttled loop, ensuring short-lived processes can discover and scrape targets instantly. Signed-off-by: avilevy <avilevy@google.com>	2026-02-24 20:16:24 +00:00
Julien Pivotto	3ab867b66a	scrape: Fix race condition in scrapeFailureLogger access Remove the separate scrapeFailureLoggerMtx and use targetMtx instead for synchronizing access to scrapeFailureLogger. This fixes a data race where Sync() would read scrapeFailureLogger while holding targetMtx but SetScrapeFailureLogger() would write to it while holding a different mutex. Add regression test to catch concurrent access issues. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-02-23 14:54:03 +01:00
avilevy	df7bd65f58	renamed calculateScrapeOffset to getScrapeOffset Signed-off-by: avilevy <avilevy@google.com>	2026-02-18 21:30:47 +00:00
avilevy	a134497161	scrape: replace skipOffsetting to make the test offset deterministic instead of skipping it entirely Signed-off-by: avilevy <avilevy@google.com>	2026-02-18 21:25:10 +00:00
Bartlomiej Plotka	8d8371244b	Merge pull request #18108 from prometheus/bwplotka/fix scrape: add tests for ST appending; add warnings for ST feature flag users around _created drop	2026-02-18 12:26:17 +00:00
Bartlomiej Plotka	23d2ab447e	feat[scrape]: add ST parsing support to scrape AppenderV2 flow (#18103) Signed-off-by: bwplotka <bwplotka@gmail.com>	2026-02-18 10:15:14 +01:00
avilevy	8f35d4e343	Adding scrape on shutdown Signed-off-by: avilevy <avilevy@google.com>	2026-02-11 21:23:36 +00:00
Bartlomiej Plotka	848b16d686	test: Add benchmark without storage + fix skipRecording mock feature (#17987) * test: Add benchmark without storage Signed-off-by: bwplotka <bwplotka@gmail.com> make bench fair Signed-off-by: bwplotka <bwplotka@gmail.com> tmp Signed-off-by: bwplotka <bwplotka@gmail.com> * Apply suggestions from code review Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> --------- Signed-off-by: bwplotka <bwplotka@gmail.com> Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>	2026-02-02 12:44:11 +00:00
Bartlomiej Plotka	88f6ee4c8e	tests(scrape): add TestScrapeLoopAppend_WithStorage (#17937) Signed-off-by: bwplotka <bwplotka@gmail.com>	2026-01-30 11:44:07 +00:00
Bartlomiej Plotka	36ea75d203	scrape: fix flaky appender test (#17962) Fixes https://github.com/prometheus/prometheus/issues/17941 Signed-off-by: bwplotka <bwplotka@gmail.com>	2026-01-29 10:50:17 +00:00
Aditya Prakash	5e66c9305f	scrape: clarify test channel name in manager_test (#17929) Signed-off-by: Nova <adityaprakash1357908@gmail.com>	2026-01-27 08:57:40 +00:00
Bartlomiej Plotka	bec70227f1	feat(scrape)[PART5b]: Add AppenderV2 support to scrape.NewManager constructor (#17872) * feat(scrape)[PART5b]: Add AppenderV2 support to scrape.NewManager optionally to V1 Signed-off-by: bwplotka <bwplotka@gmail.com> * Update scrape/manager.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * fixes after rebase Signed-off-by: bwplotka <bwplotka@gmail.com> * Apply suggestions from code review Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com> Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> --------- Signed-off-by: bwplotka <bwplotka@gmail.com> Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>	2026-01-23 09:04:05 +00:00
Bartlomiej Plotka	0d116b0994	tests(teststorage): Close Storage in the helper (#17902) Signed-off-by: bwplotka <bwplotka@gmail.com>	2026-01-23 08:41:35 +00:00
Bartlomiej Plotka	664b255699	Merge pull request #17867 from prometheus/bwplotka/a2-scrape-1 refactor(appenderV2)[PART5a]: add AppendableV2 support to scrape loop + tests	2026-01-21 08:21:56 +00:00
Bryan Boreham	2dec6da3d1	[TESTS] Scraping: Reset appender in BenchmarkScrapeLoopAppend Otherwise performance is dominated by adding to a slice that gets longer and longer as the benchmark progresses. I chose to Rollback rather than Commit because that should do less work. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2026-01-17 16:37:21 +00:00
Devarsh	c7bc56cf6c	Add scrape commit and total duration metrics (#17665) * Add scrape commit and total duration metrics Signed-off-by: Devarsh <devarshshah2608@gmail.com> * update metric based on the review Signed-off-by: Devarsh <devarshshah2608@gmail.com> * conditionally record scrape duration Signed-off-by: Devarsh <devarshshah2608@gmail.com> * Fix formatting in scrape.go Signed-off-by: Devarsh <devarshshah2608@gmail.com> --------- Signed-off-by: Devarsh <devarshshah2608@gmail.com>	2026-01-13 14:07:27 -03:00
Marvin Rösch	fff29d330d	[BUGFIX] Scraping: drop sample if relabeling config says so Signed-off-by: Marvin Rösch <marvinroesch99@gmail.com>	2026-01-07 16:11:22 +01:00
Arthur Silva Sens	1e317d0098	Add configuration option to control `extra-scrape-metrics` (#17606)	2026-01-06 09:00:49 -03:00
Ben Kochie	e14795bbf4	Remove copyright date from headers (#17785) Remove copyright dates from various files as part of [PROM-50]. [PROM-50]: https://github.com/prometheus/proposals/blob/main/proposals/0050-remove-copyright-dates.md Signed-off-by: SuperQ <superq@gmail.com>	2026-01-05 13:46:21 +01:00
Arve Knudsen	f0dfb9f802	fix(scrape): use HonorLabels instead of HonorTimestamps in newScrapeLoop (#17731) * fix(scrape): use HonorLabels instead of HonorTimestamps in newScrapeLoop The sampleMutator closure in newScrapeLoop was incorrectly passing HonorTimestamps to mutateSampleLabels instead of HonorLabels. This caused honor_labels configuration to be ignored, with the behavior incorrectly controlled by honor_timestamps instead. Adding TestNewScrapeLoopHonorLabelsWiring integration test that exercises the real newScrapeLoop constructor with HonorLabels and HonorTimestamps set to opposite values to catch this class of wiring bug. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> * Update scrape/scrape_test.go Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> * Add honor_labels=false test case Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>	2025-12-22 16:28:08 +01:00
Bartlomiej Plotka	17e06dbab5	refactor(scrape)[PART2]: simplified scrapeLoop constructors & tests; add teststorage.Appendable mock (#17631) * refactor(scrape): simplified scrapeLoop constructors & tests; add teststorage.Appender mock Signed-off-by: bwplotka <bwplotka@gmail.com> debug * refactor(scrape): simplified newLoop even more Signed-off-by: bwplotka <bwplotka@gmail.com> * refactor(scrape): rename sl -> app, slApp -> app Signed-off-by: bwplotka <bwplotka@gmail.com> * fix TestScrapeLoopRun flakiness Signed-off-by: bwplotka <bwplotka@gmail.com> * fix lint Signed-off-by: bwplotka <bwplotka@gmail.com> * kill unused listSeriesSet code Signed-off-by: bwplotka <bwplotka@gmail.com> * fix closing to not panic Signed-off-by: bwplotka <bwplotka@gmail.com> * added extra benchmark for scrapeAndReport Signed-off-by: bwplotka <bwplotka@gmail.com> * added extra benchmark for restartLoops Signed-off-by: bwplotka <bwplotka@gmail.com> * addressed last comments Signed-off-by: bwplotka <bwplotka@gmail.com> * fix TestConcurrentAppender_ReturnsErrAppender naming Signed-off-by: bwplotka <bwplotka@gmail.com> * addressed small comments Signed-off-by: bwplotka <bwplotka@gmail.com> * refactor(scrape): ensure scrape config is reloaded; added test Signed-off-by: bwplotka <bwplotka@gmail.com> * addressed comments. Signed-off-by: bwplotka <bwplotka@gmail.com> --------- Signed-off-by: bwplotka <bwplotka@gmail.com>	2025-12-22 09:38:48 +00:00
Bryan Boreham	0711e89092	Merge pull request #17530 from bboreham/faster-scrape-relabel [PERF] Scraping: skip an unnecessary step when there are relabel rules	2025-12-16 10:40:16 +00:00
Rushabh Mehta	c2b86775b6	scrape: Fix potential goroutine leak in scrapeAndReport (#17554) * [scrape] Fix potential goroutine leak in scrape loop Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com> * Use correct error var Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com> * Add regression test Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com> --------- Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>	2025-12-12 14:01:57 +01:00
Julien Pivotto	a5671a002f	API: Add a /api/v1/features endpoint Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2025-12-09 16:13:14 +01:00
Bryan Boreham	77ba5c5fbd	[PERF] Scraping: skip an unnecessary step when there are relabel rules Before it would do Builder->Labels->Builder, now we skip the conversions. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2025-11-28 15:07:40 +00:00
harsh kumar	30be1483d1	instrumentation: add native histograms to complement high-traffic summaries (#17374) This adds the following native histograms (with a few classic buckets for backwards compatibility), while keeping the corresponding summaries (same name, just without `_histogram`): - `prometheus_sd_refresh_duration_histogram_seconds` - `prometheus_rule_evaluation_duration_histogram_seconds` - `prometheus_rule_group_duration_histogram_seconds` - `prometheus_target_sync_length_histogram_seconds` - `prometheus_target_interval_length_histogram_seconds` - `prometheus_engine_query_duration_histogram_seconds` Signed-off-by: Harsh <harshmastic@gmail.com> Signed-off-by: harsh kumar <135993950+hxrshxz@users.noreply.github.com> Co-authored-by: Björn Rabenstein <github@rabenste.in>	2025-11-27 18:45:35 +01:00
Björn Rabenstein	b8d19543b8	Add histogram validation in remote-read and during reducing resolution (#17561) ReduceResolution is currently called before validation during ingestion. This will cause a panic if there are not enough buckets in the histogram. If there are too many buckets, the spurious buckets are ignored, and therefore the error in the input histogram is masked. Furthermore, invalid negative offsets might cause problems, too. Therefore, we need to do some minimal validation in reduceResolution. Fortunately, it is easy and shouldn't slow things down. Sadly, it requires returning errors, which triggers a bunch of code changes. Even here there is a bright side: we can get rid of a few panics. (Remember: Don't panic!) In different news, we haven't done a full validation of histograms read via remote-read. This is not so much a security concern (as you can throw off Prometheus easily by feeding it bogus data via remote-read) but more that remote-read sources might be makeshift and could accidentally create invalid histograms. We really don't want to panic in that case. So this commit does not only add a check of the spans and buckets as needed for resolution reduction but also a full validation during remote-read. Signed-off-by: beorn7 <beorn@grafana.com>	2025-11-21 00:22:24 +01:00
Bryan Boreham	b7aae06181	Merge pull request #17114 from bboreham/scrape-stale-by-ref Scraping: detect staleness via unique reference	2025-11-14 18:32:26 +01:00
Bartlomiej Plotka	f50ff0a40a	feat: rename CreatedTimestamp to StartTimestamp (#17523) Partially fixes https://github.com/prometheus/prometheus/issues/17416 by renaming all CT* names to ST* in the whole codebase except RW2 (this is done in separate [PR](https://github.com/prometheus/prometheus/pull/17411)) and PrometheusProto exposition proto. ``` CreatedTimestamp -> StartTimestamp CreatedTimeStamp -> StartTimestamp created_timestamp -> start_timestamp CT -> ST ct -> st ``` Signed-off-by: bwplotka <bwplotka@gmail.com>	2025-11-13 14:17:51 +00:00
Ben Kochie	204249fcb5	Update golangci-lint (#17478) * Update golangci-lint to v2.6.0 * Fixup various linting issues. * Fixup deprecations. * Add exception for `labels.MetricName` deprecation. Signed-off-by: SuperQ <superq@gmail.com>	2025-11-05 13:47:34 +01:00
George Krajcsovits	d7bfc89f7a	Merge pull request #17431 from grafana/thampiotr/upstream-staleness-disabling scrape: Allow disabling end-of-run staleness markers for targets	2025-11-04 13:24:05 +01:00
Piotr	d6848c9f40	scrape: Allow disabling end-of-run staleness markers for targets Signed-off-by: Piotr <17101802+thampiotr@users.noreply.github.com>	2025-11-04 11:59:23 +00:00
Ben Kochie	48956f60d7	Update modernize (#17471) Apply additional Go modernize tool improvements. Signed-off-by: SuperQ <superq@gmail.com>	2025-11-04 05:13:49 +00:00
Julius Volz	0093e2159e	Merge pull request #17337 from prometheus/ui/visualize-relabel-steps ui: Allow viewing detailed relabeling steps for each discovered target	2025-11-02 13:51:55 +01:00
Lukasz Mierzwa	aac472df5b	Fix TestScrapeLoop_HistogramBucketLimit TestScrapeLoop_HistogramBucketLimit tests the bucket limiter but it also sets sample_limit to the same value, which seems incorrect. Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-10-24 10:04:18 +01:00

563 Commits