ReduceResolution is currently called before validation during
ingestion. This will cause a panic if there are not enough buckets in
the histogram. If there are too many buckets, the spurious buckets are
ignored, and therefore the error in the input histogram is masked.
Furthermore, invalid negative offsets might cause problems, too.
Therefore, we need to do some minimal validation in reduceResolution.
Fortunately, it is easy and shouldn't slow things down. Sadly, it
requires returning errors, which triggers a bunch of code changes. But
even here there is a bright side: we can get rid of a few panics.
(Remember: Don't panic!)
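A rough sketch of the kind of minimal check meant here, with
illustrative names and assuming the span/bucket representation from the
model/histogram package:

```go
package sketch

import (
	"fmt"

	"github.com/prometheus/prometheus/model/histogram"
)

// checkSpansVsBuckets verifies that the spans account for exactly the
// number of buckets provided and that no span after the first has a
// negative offset (which would make the spans overlap). Returning an
// error here lets us avoid a panic further down the line.
func checkSpansVsBuckets(spans []histogram.Span, numBuckets int) error {
	var want int
	for i, s := range spans {
		if i > 0 && s.Offset < 0 {
			return fmt.Errorf("span %d has negative offset %d", i, s.Offset)
		}
		want += int(s.Length)
	}
	if want != numBuckets {
		return fmt.Errorf("spans need %d buckets, got %d", want, numBuckets)
	}
	return nil
}
```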
In other news, we haven't done a full validation of histograms
read via remote-read. This is not so much a security concern (as you
can throw off Prometheus easily by feeding it bogus data via
remote-read) but more that remote-read sources might be makeshift and
could accidentally create invalid histograms. We really don't want to
panic in that case. So this commit not only adds a check of the
spans and buckets as needed for resolution reduction but also a full
validation during remote-read.
Signed-off-by: beorn7 <beorn@grafana.com>
* drop extra label from receiver
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* used constant
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
---------
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
Partially fixes https://github.com/prometheus/prometheus/issues/17416 by
renaming all CT* names to ST* in the whole codebase except RW2 (this is
done in a separate
[PR](https://github.com/prometheus/prometheus/pull/17411)) and
PrometheusProto exposition proto.
```
CreatedTimestamp -> StartTimestamp
CreatedTimeStamp -> StartTimestamp
created_timestamp -> start_timestamp
CT -> ST
ct -> st
```
Signed-off-by: bwplotka <bwplotka@gmail.com>
OTLP Receiver: Only write metadata to the WAL when the metadata-wal-records feature is enabled.
---------
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
`histogram.Error` becomes the generic wrapper type for all histogram errors.
This makes it easier and less error-prone, when adding new errors, to
check whether an error is a histogram error, and it makes converting
the errors less error-prone as well.
This changes the type of those specific sentinel errors from error to
`histogram.Error`, but that should almost never matter.
E.g., `errors.Is(err, ErrHistogram...)` would still work out of the box.
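A minimal sketch of the pattern, not the actual implementation (the
sentinel shown is illustrative):

```go
package histogram

import "errors"

// Error is the generic wrapper type for all histogram errors
// (illustrative definition).
type Error struct{ err error }

func (e Error) Error() string { return e.err.Error() }
func (e Error) Unwrap() error { return e.err }

// A sentinel that is now typed as Error rather than plain error.
var ErrHistogramNegativeBucketCount = Error{errors.New("histogram has a bucket whose observation count is negative")}

// IsHistogramError reports whether err is (or wraps) a histogram error.
func IsHistogramError(err error) bool {
	var target Error
	return errors.As(err, &target)
}
```

Since `Error` is a comparable value type, `errors.Is(err, ErrHistogramNegativeBucketCount)`
matches the sentinel directly, and `errors.As` gives a single check for
the whole category.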
Signed-off-by: Laurent Dufresne <laurent.dufresne@grafana.com>
Add logic to the target_info metric generation in the OTLP endpoint so that any samples with the same timestamp for the same (target_info) series are de-duplicated. This comes out of a user's bug report about duplicated target_info samples in Grafana Mimir (which uses the Prometheus target_info generation logic).
If I'm not mistaken, duplicate target_info samples should stem from multiple resources in the same OTLP request being translated to the same target_info label set; they shouldn't be caused by a Prometheus bug.
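A hedged sketch of the de-duplication idea (the sample type and
function are illustrative stand-ins for the actual translation code):

```go
package sketch

import "github.com/prometheus/prometheus/model/labels"

type sample struct {
	lbls labels.Labels
	t    int64
	v    float64
}

// dedupeTargetInfo drops any sample that shares both label set and
// timestamp with an earlier sample, keeping the first occurrence.
func dedupeTargetInfo(samples []sample) []sample {
	type key struct {
		series uint64
		ts     int64
	}
	seen := make(map[key]struct{}, len(samples))
	out := samples[:0]
	for _, s := range samples {
		k := key{series: s.lbls.Hash(), ts: s.t}
		if _, ok := seen[k]; ok {
			continue // duplicate target_info sample for this series and timestamp
		}
		seen[k] = struct{}{}
		out = append(out, s)
	}
	return out
}
```

(A production version would have to consider label hash collisions; the
sketch ignores them.)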
* Add WriteProto method and tests for promtool metrics
This commit adds:
1. WriteProto method to storage/remote/client.go that handles
marshaling and compression of protobuf messages
2. An updated parseAndPushMetrics in cmd/promtool/metrics.go that uses
the new WriteProto method
3. Comprehensive tests for PushMetrics functionality
The WriteProto method provides a cleaner API for sending protobuf
messages without manually handling marshaling and compression.
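Roughly, the helper does the following; this is a sketch, not the
exact signature added to storage/remote/client.go:

```go
package sketch

import (
	"context"
	"fmt"

	"github.com/gogo/protobuf/proto"
	"github.com/golang/snappy"
)

// pusher is an illustrative stand-in for the remote-write client that
// sends an already-encoded request body.
type pusher interface {
	Store(ctx context.Context, compressed []byte) error
}

// writeProto marshals the protobuf message and snappy-compresses it
// before handing it to the client, so callers no longer do this by hand.
func writeProto(ctx context.Context, p pusher, msg proto.Message) error {
	raw, err := proto.Marshal(msg)
	if err != nil {
		return fmt.Errorf("marshaling message: %w", err)
	}
	return p.Store(ctx, snappy.Encode(nil, raw))
}
```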
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* use Write method from exp/api/remote
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix lint
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix test
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* nit fixed
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix lint
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
---------
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
As a follow-up to #17344, add two configuration parameters for controlling label
name translation, both defaulting to on for backwards compatibility (currently
these behaviours are hardcoded as enabled; see the config sketch below the list):
* otlp.label_name_underscore_sanitization => Prefix label names starting with a
single underscore with key_ when translating OTel attribute names
* otlp.label_name_preserve_multiple_underscores => Keep multiple consecutive
underscores in label names when translating OTel attribute names
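A hedged sketch of how this could look in the configuration file
(field placement is illustrative):

```yaml
otlp:
  # Both default to true for backwards compatibility.
  label_name_underscore_sanitization: true
  label_name_preserve_multiple_underscores: true
```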
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
The upgrade to prometheus/otlptranslator@7f02967de0 fixes two label
name translation bugs in legacy name translation mode:
* 'key' is no longer prefixed when label names start with an underscore
* Multiple consecutive underscores are combined into one
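For illustration, with hypothetical attribute names, legacy-mode
translation now behaves like this:

```
_foo -> _foo (previously key_foo)
a__b -> a_b  (previously a__b)
```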
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
The detailed plan for this is laid out in
https://github.com/prometheus/prometheus/issues/16572 .
This commit adds a global and local scrape config option
`scrape_native_histograms`, which has to be set to true to ingest
native histograms.
To ease the transition, the feature flag is changed to simply set the
default of `scrape_native_histograms` to true.
Further implications:
- The default scrape protocols now depend on the
`scrape_native_histograms` setting.
- Everywhere else, histograms are now "on by default".
Documentation beyond that for the feature flag and the scrape config
is deliberately left out. See
https://github.com/prometheus/prometheus/pull/17232 for that.
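A hedged sketch of the new option in a scrape configuration (values
illustrative):

```yaml
global:
  # Default for all scrape configs; the feature flag now merely flips
  # this default to true.
  scrape_native_histograms: false

scrape_configs:
  - job_name: with-native-histograms
    scrape_native_histograms: true
```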
Signed-off-by: beorn7 <beorn@grafana.com>
We have always validated that no bucket count is negative. We
should do the same for the count of observations and the zero bucket.
Note that this was always implied in the protobuf exposition format
because a count or a zero bucket population is ignored if it is not
positive.
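Illustratively, for float histograms the added checks amount to
something like this (a sketch, not the exact code):

```go
package sketch

import (
	"fmt"

	"github.com/prometheus/prometheus/model/histogram"
)

// checkNonNegative sketches the additional validation: the total count
// of observations and the zero bucket population must not be negative,
// just like the regular bucket counts.
func checkNonNegative(h *histogram.FloatHistogram) error {
	if h.Count < 0 {
		return fmt.Errorf("histogram has negative count: %v", h.Count)
	}
	if h.ZeroCount < 0 {
		return fmt.Errorf("histogram has negative zero bucket population: %v", h.ZeroCount)
	}
	return nil
}
```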
Signed-off-by: beorn7 <beorn@grafana.com>
If a sample read through remote read has too high resolution,
reduce it to the maximum allowed.
This is a slow path, but we only expect it to happen if the server
side is a newer version that allows higher resolution.
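The clamping on the read path could look roughly like this (a sketch;
the constant name assumes the model/histogram package):

```go
package sketch

import "github.com/prometheus/prometheus/model/histogram"

// clampResolution reduces a float histogram's resolution to the
// maximum allowed schema if the incoming sample exceeds it. This is
// the slow path described above.
func clampResolution(h *histogram.FloatHistogram) *histogram.FloatHistogram {
	if h.Schema > histogram.ExponentialSchemaMax {
		return h.ReduceResolution(histogram.ExponentialSchemaMax)
	}
	return h
}
```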
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
When remote read returns chunks, the validation is in tsdb/chunkenc.
However, when it returns samples, we need to modify the iterator to
validate.
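A hedged sketch of the idea, wrapping tsdb/chunkenc's iterator
interface (names are illustrative):

```go
package sketch

import "github.com/prometheus/prometheus/tsdb/chunkenc"

// validatingIterator wraps a sample iterator and validates every
// histogram sample as it is read, surfacing an error instead of
// letting an invalid histogram through.
type validatingIterator struct {
	chunkenc.Iterator
	err error
}

func (it *validatingIterator) Next() chunkenc.ValueType {
	typ := it.Iterator.Next()
	switch typ {
	case chunkenc.ValHistogram:
		_, h := it.AtHistogram(nil)
		it.err = h.Validate()
	case chunkenc.ValFloatHistogram:
		_, fh := it.AtFloatHistogram(nil)
		it.err = fh.Validate()
	}
	if it.err != nil {
		return chunkenc.ValNone
	}
	return typ
}

func (it *validatingIterator) Err() error {
	if it.err != nil {
		return it.err
	}
	return it.Iterator.Err()
}
```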
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
control over time
The test became flaky after it was asked to run in parallel and
"fight" for resources.
Let's hide all of that.
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
It's not possible to store a created timestamp at the same timestamp
as the current sample, so do not even try.
In the OpenTelemetry spec, if the start time is unknown, it is set to
the same timestamp as the first sample:
https://opentelemetry.io/docs/specs/otel/metrics/data-model/#cumulative-streams-handling-unknown-start-time
This means that we will get a lot of "duplicate sample for timestamp"
errors, and we should not log those.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Histogram.Validate and FloatHistogram.Validate now return an error on
unsupported schemas.
The scrape and remote-write handlers reduce the schema to the maximum
allowed if it is above that maximum but below the theoretical maximum
of 52. For scrape, the maximum is a configuration option; for
remote-write, it is 8.
Note: The OTLP endpoint already does the reduction, without checking
that the schema is below 52, as the spec does not specify a maximum.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>