In addHistogramDataPoints, exemplars assigned to the +Inf bucket of one
data point were carried over into the _sum and _count Append calls of
the next data point via the shared appOpts. Clear appOpts.Exemplars at
the start of each loop iteration to restore the nil-exemplar semantics
that existed before the AppenderV2 migration.
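
The reset pattern can be sketched as follows. This is a minimal illustration only: `exemplar`, `appendOptions`, `dataPoint`, and `appendCall` are hypothetical stand-ins, not the real AppenderV2 types, and the loop body is heavily simplified.

```go
package main

import "fmt"

// exemplar and appendOptions are hypothetical stand-ins for the real types.
type exemplar struct{ traceID string }

type appendOptions struct {
	Exemplars []exemplar
}

// dataPoint is a simplified histogram data point with optional exemplars.
type dataPoint struct {
	name      string
	exemplars []exemplar
}

// appendCall records one simulated Append invocation for inspection.
type appendCall struct {
	series    string
	exemplars int
}

// addHistogramDataPoints mimics the loop shape described above: a single
// options struct is shared across iterations, so it is reset each time.
func addHistogramDataPoints(points []dataPoint) []appendCall {
	var calls []appendCall
	var appOpts appendOptions // shared across all data points

	for _, pt := range points {
		// Reset first, so _sum and _count never see exemplars left over
		// from the previous data point's +Inf bucket.
		appOpts.Exemplars = nil

		calls = append(calls, appendCall{pt.name + "_sum", len(appOpts.Exemplars)})
		calls = append(calls, appendCall{pt.name + "_count", len(appOpts.Exemplars)})

		// In this sketch, exemplars are attached only to the +Inf bucket.
		appOpts.Exemplars = pt.exemplars
		calls = append(calls, appendCall{pt.name + `_bucket{le="+Inf"}`, len(appOpts.Exemplars)})
	}
	return calls
}

func main() {
	calls := addHistogramDataPoints([]dataPoint{
		{name: "a", exemplars: []exemplar{{traceID: "t1"}}},
		{name: "b"},
	})
	for _, c := range calls {
		fmt.Println(c.series, c.exemplars)
	}
}
```

Without the reset at the top of the loop, the `b_sum` and `b_count` calls in this sketch would see the exemplar that belongs to `a`.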
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
The OTLP write handler and the PRW v2 histogram append path were missing
ErrTooOldSample from their error type checks, causing these errors to
fall through to the default case and return HTTP 500 Internal Server Error.
This triggered unnecessary retries in OTLP clients like the Python SDK.
The PRW v1 write handler (line 115) and the PRW v2 sample append path
(line 377) already correctly handle ErrTooOldSample as a 400, and this
change makes the remaining paths consistent.
Also adds ErrTooOldSample to the v1 sample/histogram log checks so
these errors are properly logged instead of silently returned.
Fixes #16645
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
Initial implementation of https://github.com/prometheus/prometheus/issues/17790.
Only implements ST-per-sample for Counters. Tests and benchmarks updated.
Note: This increases the size of the RefSample object for all users, whether st-per-sample is turned on or not.
Signed-off-by: Owen Williams <owen.williams@grafana.com>
The createAttributes error was incorrectly returning nil instead of err,
causing errors to be silently discarded. This could lead to silent data
loss for sum metrics during OTLP ingestion.
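
A reduced illustration of this bug class, with hypothetical simplified functions (the real createAttributes has a different signature):

```go
package main

import (
	"errors"
	"fmt"
)

// createAttributes is a hypothetical stand-in that fails on an empty name.
func createAttributes(name string) (map[string]string, error) {
	if name == "" {
		return nil, errors.New("empty metric name")
	}
	return map[string]string{"__name__": name}, nil
}

// addSumDataPoint propagates the createAttributes error to its caller.
func addSumDataPoint(name string) error {
	lbls, err := createAttributes(name)
	if err != nil {
		return err // the buggy version had `return nil` here
	}
	_ = lbls // appending the sample is omitted in this sketch
	return nil
}

func main() {
	fmt.Println(addSumDataPoint(""))
	fmt.Println(addSumDataPoint("http_requests_total"))
}
```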
Fixes #17953
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
* improve readability of timeseries filtering by using the slices package
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* ensure that BenchmarkBuildTimeSeries doesn't include the building of
the actual proto in the benchmark results; we only care about the
buildTimeSeries call
Signed-off-by: Callum Styan <callumstyan@gmail.com>
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
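
For illustration, filtering with the slices package can look like the sketch below. The `timeSeries` type and the "drop series without samples" predicate are assumptions made for this example, not the actual filtering criteria in the change.

```go
package main

import (
	"fmt"
	"slices"
)

// timeSeries is a hypothetical, simplified series type.
type timeSeries struct {
	name    string
	samples int
}

// dropEmpty removes series without samples. slices.DeleteFunc works in
// place and returns the shortened slice.
func dropEmpty(series []timeSeries) []timeSeries {
	return slices.DeleteFunc(series, func(ts timeSeries) bool {
		return ts.samples == 0
	})
}

func main() {
	kept := dropEmpty([]timeSeries{
		{name: "a", samples: 1},
		{name: "b", samples: 0},
		{name: "c", samples: 2},
	})
	for _, ts := range kept {
		fmt.Println(ts.name, ts.samples)
	}
}
```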
* otlptranslator: filter __name__ from OTLP attributes to prevent duplicates
OTLP metrics can have a __name__ attribute which, when combined with the
metric name passed via extras, creates duplicate __name__ labels.
This commit filters out any __name__ metric attribute coming from OTLP.
Also rename TestCreateAttributes to TestPrometheusConverter_createAttributes
for consistency, and add test cases for __name__, __type__, and __unit__ OTLP metric attributes.
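
The filtering idea, sketched with hypothetical types (`label` and `buildLabels` are illustrative names; the real createAttributes does considerably more):

```go
package main

import "fmt"

// label is a hypothetical name/value pair.
type label struct{ name, value string }

// buildLabels copies the incoming attributes, skipping any "__name__"
// attribute, and then sets __name__ from the separately supplied metric name.
func buildLabels(attrs []label, metricName string) []label {
	out := make([]label, 0, len(attrs)+1)
	for _, a := range attrs {
		if a.name == "__name__" {
			continue // dropped: the metric name is supplied separately
		}
		out = append(out, a)
	}
	return append(out, label{name: "__name__", value: metricName})
}

func main() {
	lbls := buildLabels([]label{
		{name: "__name__", value: "from_attribute"},
		{name: "job", value: "api"},
	}, "http_requests_total")
	for _, l := range lbls {
		fmt.Printf("%s=%q\n", l.name, l.value)
	}
}
```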
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
* otlptranslator: add label caching for OTLP-to-Prometheus conversion
Add per-request caching to reduce redundant computation and allocations
during OTLP metric conversion:
1. Per-request label sanitization cache: Cache sanitized label names
within a request to avoid repeated string allocations for commonly
repeated labels like __name__, job, instance.
2. Resource-level label caching: Precompute and cache job, instance,
promoted resource attributes, and external labels once per
ResourceMetrics boundary instead of for each datapoint.
3. Scope-level label caching: Precompute and cache scope metadata labels
(otel_scope_name, otel_scope_version, etc.) once per ScopeMetrics
boundary.
4. LabelNamer instance caching: Reuse the LabelNamer struct across
datapoints within the same resource context.
These optimizations significantly reduce allocations and improve latency
for OTLP ingestion workloads with many datapoints per resource/scope.
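
As an illustration of item 1, a per-request sanitization cache could be sketched like this. The `sanitizeCache` type and the sanitization rule (replace anything outside `[a-zA-Z0-9_]` with `_`) are simplifying assumptions, not the actual translator logic.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeCache memoizes sanitized label names for the lifetime of one
// request. It is not safe for concurrent use.
type sanitizeCache struct {
	names  map[string]string
	misses int // number of times sanitize had to run
}

func newSanitizeCache() *sanitizeCache {
	return &sanitizeCache{names: make(map[string]string)}
}

// sanitize replaces every character outside [a-zA-Z0-9_] with '_'.
// This is a simplified rule chosen for the sketch.
func sanitize(name string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			return r
		}
		return '_'
	}, name)
}

// get returns the cached sanitized name, computing it on first use.
func (c *sanitizeCache) get(name string) string {
	if s, ok := c.names[name]; ok {
		return s
	}
	c.misses++
	s := sanitize(name)
	c.names[name] = s
	return s
}

func main() {
	c := newSanitizeCache()
	for i := 0; i < 3; i++ {
		fmt.Println(c.get("http.method"))
	}
	fmt.Println("sanitize calls:", c.misses)
}
```

Creating one cache per request keeps its size bounded and avoids any locking, at the cost of recomputing names across requests.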
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
The benchmark was passing appendMetadata=false to NewCombinedAppender,
which caused UpdateMetadata to never be called on the underlying
noOpAppender. This resulted in app.metadata always being 0, failing
the assertion that metadata count should be positive.
Fix by enabling metadata appending in the benchmark.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
No implementation yet. Just to test the shape of the interface.
AtST is implemented for trivial cases; anything else is hard-coded
to return 0.
Ref: https://github.com/prometheus/prometheus/issues/17791
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
NHCB stands for native histograms with custom buckets.
prompb is used for both remote write 1.0 and remote read. We do not
support NHCB over remote write 1.0; however, we should absolutely
support it for remote read.
Prometheus remote write 1.0 client already refuses to send NHCB.
Prometheus remote write 1.0 server accepts NHCB, but doesn't store
custom values, corrupting the result. I'm now handling NHCB correctly,
instead of refusing or corrupting.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
ReduceResolution is currently called before validation during
ingestion. This will cause a panic if there are not enough buckets in
the histogram. If there are too many buckets, the spurious buckets are
ignored, and therefore the error in the input histogram is masked.
Furthermore, invalid negative offsets might cause problems, too.
Therefore, we need to do some minimal validation in reduceResolution.
Fortunately, it is easy and shouldn't slow things down. Sadly, it
requires returning errors, which triggers a bunch of code changes.
Even here there is a bright side: we can get rid of a few panics.
(Remember: Don't panic!)
In different news, we haven't done a full validation of histograms
read via remote-read. This is not so much a security concern (as you
can throw off Prometheus easily by feeding it bogus data via
remote-read) but more that remote-read sources might be makeshift and
could accidentally create invalid histograms. We really don't want to
panic in that case. So this commit not only adds a check of the
spans and buckets as needed for resolution reduction but also adds a
full validation during remote-read.
Signed-off-by: beorn7 <beorn@grafana.com>
* drop extra label from receiver
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* used constant
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
---------
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
Partially fixes https://github.com/prometheus/prometheus/issues/17416 by
renaming all CT* names to ST* in the whole codebase except RW2 (this is
done in a separate
[PR](https://github.com/prometheus/prometheus/pull/17411)) and the
PrometheusProto exposition proto.
```
CreatedTimestamp -> StartTimestamp
CreatedTimeStamp -> StartTimestamp
created_timestamp -> start_timestamp
CT -> ST
ct -> st
```
Signed-off-by: bwplotka <bwplotka@gmail.com>
OTLP Receiver: Only update metadata to WAL when metadata-wal-records feature is enabled.
---------
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>