There was a bug (due to confusion?) in the local metadata cache, which is keyed
by metric family, not by the series metric name. The fix is to NOT use that local cache
at all (it's still needed for the current metadata API implementation; added a TODO
on how we can get rid of it).
I went ahead and also renamed the Metric field in metadata structs to MetricFamily to make
it clear it's not always __name__.
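To illustrate the mismatch described above, here is a minimal sketch of a cache keyed by metric family being queried with a series name. The Metadata struct, the helper names and the help text are hypothetical stand-ins, not the actual Prometheus types.

```go
package main

import "fmt"

// Metadata is a hypothetical stand-in for a metadata entry. The key field is
// a metric family, which is not always equal to a series' __name__.
type Metadata struct {
	MetricFamily string
	Type         string
	Help         string
}

// newFamilyCache builds a cache keyed by metric family, roughly as a scrape
// would populate it for a classic histogram. The help text is made up.
func newFamilyCache() map[string]Metadata {
	return map[string]Metadata{
		"cortex_request_duration_seconds": {
			MetricFamily: "cortex_request_duration_seconds",
			Type:         "histogram",
			Help:         "example help text",
		},
	}
}

// lookup returns the entry stored under key, if any.
func lookup(cache map[string]Metadata, key string) (Metadata, bool) {
	m, ok := cache[key]
	return m, ok
}

func main() {
	cache := newFamilyCache()
	// The series name carries the _bucket suffix, so it is not a cache key.
	_, ok := lookup(cache, "cortex_request_duration_seconds_bucket")
	fmt.Println("by series name:", ok)
	// The metric family is the key the cache actually uses.
	_, ok = lookup(cache, "cortex_request_duration_seconds")
	fmt.Println("by metric family:", ok)
}
```

A lookup by series name misses for classic histogram series, which is why treating the cache key as __name__ is unsafe.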
Signed-off-by: bwplotka <bwplotka@gmail.com>
Found during testing for
https://github.com/grafana/mimir/issues/9072
Debug printout showed:
KRAJO: seriesName=cortex_request_duration_seconds_bucket,
metricFamily=cortex_request_duration_seconds_bucket,
type=GAUGE,
help=cortex_bucket_index_load_duration_seconds_sum,
unit=
which is nonsense.
I can imagine more cases where this happens and actually makes sense:
some targets might be missing metadata, or a pipeline might lose it.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
What
Adds support for OTLP delta temporality to the OTLP endpoint.
This is done by calling the deltatocumulative processor from the OpenTelemetry collector during OTLP conversion.
Why
Delta conversion is a naturally stateful process, which requires careful request routing when operated inside a collector.
Prometheus is already stateful and doing the conversion in-server reduces the operational burden on the ingest architecture by only having one stateful component.
How
deltatocumulative is an OTel collector component that works as follows:
* pmetric.Metrics come from a receiver or, in this case, from the HTTP client
* It operates as an in-place update loop:
  * for each sample, if not delta, leave unmodified
  * if delta, do:
    * state += sample, where state is the in-memory sum of all previous samples
    * sample = state, so the sample value is now cumulative
* this is supported for sums (counters), gauges, histograms (old histograms) and exponential histograms (native histograms)
If a series receives no new samples for 5m, its state is removed from memory.
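The update loop and the 5m expiry can be sketched as follows. This is a simplified illustration for plain float samples only, not the deltatocumulative code itself; Accumulator, Convert, Expire and the helper names are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultMaxAge matches the 5m idle period mentioned in the description.
const defaultMaxAge = 5 * time.Minute

// entry holds the running sum for one series and when it was last updated.
type entry struct {
	sum      float64
	lastSeen time.Time
}

// Accumulator is a hypothetical, simplified stand-in for the processor state.
type Accumulator struct {
	mu     sync.Mutex
	maxAge time.Duration
	state  map[string]entry
}

func NewAccumulator(maxAge time.Duration) *Accumulator {
	return &Accumulator{maxAge: maxAge, state: make(map[string]entry)}
}

// Convert returns the value to store for a sample. Cumulative samples pass
// through unmodified; delta samples are added to the series' running sum.
func (a *Accumulator) Convert(series string, value float64, isDelta bool, now time.Time) float64 {
	if !isDelta {
		return value
	}
	a.mu.Lock()
	defer a.mu.Unlock()
	e := a.state[series]
	e.sum += value // state += sample
	e.lastSeen = now
	a.state[series] = e
	return e.sum // sample = state
}

// Expire drops state for series that received no samples within maxAge.
func (a *Accumulator) Expire(now time.Time) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for k, e := range a.state {
		if now.Sub(e.lastSeen) >= a.maxAge {
			delete(a.state, k)
		}
	}
}

// at returns a fixed reference time plus the given number of minutes.
func at(minutes int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(minutes) * time.Minute)
}

func main() {
	acc := NewAccumulator(defaultMaxAge)
	fmt.Println(acc.Convert("requests_total", 3, true, at(0))) // running sum 3
	fmt.Println(acc.Convert("requests_total", 4, true, at(1))) // running sum 7
	acc.Expire(at(10))                                          // idle for 9m, state dropped
	fmt.Println(acc.Convert("requests_total", 2, true, at(11))) // starts over at 2
}
```

The single mutex mirrors the coarse locking mentioned under Performance; a real implementation would also need per-type handling for histograms.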
Performance
Delta conversion is a stateful operation and the OTel code is not highly optimized yet, e.g. it locks the entire processor for each request. Nonetheless, care has been taken to mitigate those effects:
* delta conversion is behind a feature flag. If disabled, no conversion code is ever invoked
* if enabled, conversion is not invoked if the request does not actually contain delta samples. This leads to no measurable performance difference between default-cumulative and convert-cumulative (only cumulative, feature on/off)
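The two mitigations amount to a cheap guard in front of the conversion. A minimal sketch, assuming a hypothetical Temporality type and shouldConvert helper rather than the real pmetric API:

```go
package main

import "fmt"

// Temporality is a hypothetical stand-in for the OTLP aggregation temporality.
type Temporality int

const (
	Cumulative Temporality = iota
	Delta
)

// shouldConvert reports whether the conversion code needs to run at all:
// only when the feature flag is on and at least one metric is delta.
func shouldConvert(featureEnabled bool, temporalities []Temporality) bool {
	if !featureEnabled {
		return false
	}
	for _, t := range temporalities {
		if t == Delta {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldConvert(false, []Temporality{Delta}))            // flag off
	fmt.Println(shouldConvert(true, []Temporality{Cumulative}))        // no delta samples
	fmt.Println(shouldConvert(true, []Temporality{Cumulative, Delta})) // conversion needed
}
```

Requests that contain only cumulative samples never reach the locked processor, so they take the same path as with the flag disabled.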
Signed-off-by: sh0rez <me@shorez.de>
Fix issues raised by staticcheck
We are not enabling staticcheck explicitly, though, because it has too many false positives.
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
BuildCompliantName was renamed to BuildCompliantMetricName, and it no longer takes UTF8 support into consideration. It focuses on building a metric name that follows Prometheus conventions.
A new function, BuildMetricName, was added to optionally add unit and type suffixes to OTLP metric names without translating any characters to underscores (_).
In non-UTF-8 mode, use strings.FieldsFunc to split the string into tokens,
as it was before PR #15258. This makes the code more robust and
readable.
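As an illustration of the strings.FieldsFunc approach, the sketch below splits a name on every rune that is not a letter or digit and joins the tokens with underscores. The exact separator predicate and the function names are assumptions for this example, not the code from the PR.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize splits name on every rune that is neither a letter nor a digit.
// Runs of separators produce no empty tokens, which FieldsFunc guarantees.
func tokenize(name string) []string {
	return strings.FieldsFunc(name, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// sanitize joins the tokens with underscores.
func sanitize(name string) string {
	return strings.Join(tokenize(name), "_")
}

func main() {
	fmt.Println(sanitize("http.server.duration"))
	fmt.Println(sanitize("system/cpu-time"))
	fmt.Println(sanitize("a..b"))
}
```

Because FieldsFunc never yields empty fields, consecutive or leading separators do not produce doubled or leading underscores.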
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Fixes UTF-8 aggregator label list items getting mutated with quote marks when String-ified.
Fixes quoted metric names not supported in metric declarations.
Fixes UTF-8 label names not being quoted when String-ified.
Fixes https://github.com/prometheus/prometheus/issues/15470
Fixes https://github.com/prometheus/prometheus/issues/15528
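The first and third fixes can be pictured with the sketch below: label names that are not legacy-compatible are quoted when rendered, and the rendering works on a copy so the caller's label list is left untouched. The regular expression and the function names are assumptions for illustration, not the parser's actual code.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// legacyName matches label names that need no quoting (assumed pattern).
var legacyName = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// quoteIfNeeded returns name unchanged if it is legacy-compatible and a
// double-quoted form otherwise.
func quoteIfNeeded(name string) string {
	if legacyName.MatchString(name) {
		return name
	}
	return strconv.Quote(name)
}

// groupingString renders a "by (...)" clause. It writes into a fresh slice
// so the input labels are never mutated.
func groupingString(labels []string) string {
	out := make([]string, len(labels))
	for i, l := range labels {
		out[i] = quoteIfNeeded(l)
	}
	return "by (" + strings.Join(out, ", ") + ")"
}

func main() {
	labels := []string{"job", "http.status"}
	fmt.Println(groupingString(labels))
	fmt.Println(labels[1]) // still unquoted in the original slice
}
```

Rendering the same expression twice then gives the same string, since the stored labels never pick up quote marks.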
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
It does nothing for standard Prometheus builds with -tags stringlabels,
but the check to see if labels are being replaced has a cost.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
When a remote-write request targets a host name that resolves to multiple IP addresses, this PR makes it possible to force new connections for that request to be created to an IP address chosen at random from those the host name resolves to. The default behavior remains unchanged, i.e., the IP address used to create the connection is the one chosen by Go.
This is an experimental feature and is disabled by default.
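One way to picture the behavior is a dial function that resolves the host itself and picks one of the returned addresses at random. This is a sketch under that assumption, using only the standard library; pickAddr and dialRandomIP are hypothetical names, not the functions added by the PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"net"
)

// pickAddr returns one element of addrs, using intn to choose the index.
// Passing the index function in keeps the choice testable.
func pickAddr(addrs []string, intn func(int) int) (string, error) {
	if len(addrs) == 0 {
		return "", errors.New("no addresses to choose from")
	}
	return addrs[intn(len(addrs))], nil
}

// dialRandomIP resolves the host part of addr and dials a randomly chosen
// IP from the result. It has the shape of http.Transport.DialContext.
func dialRandomIP(ctx context.Context, network, addr string) (net.Conn, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	ips, err := net.DefaultResolver.LookupHost(ctx, host)
	if err != nil {
		return nil, err
	}
	ip, err := pickAddr(ips, rand.Intn)
	if err != nil {
		return nil, err
	}
	var d net.Dialer
	return d.DialContext(ctx, network, net.JoinHostPort(ip, port))
}

func main() {
	// Example addresses only; no network access is needed for this part.
	addrs := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	ip, err := pickAddr(addrs, rand.Intn)
	fmt.Println(ip, err)
}
```

A function like dialRandomIP could be assigned to the DialContext field of an http.Transport, which only consults it when a new connection has to be created.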
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
It was crashing due to uninitialized metrics, and not terminating due to
incorrectly reading segment names.
We need to export `SetMetrics` to avoid the first problem.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>