* fix(promql): histogram_quantile NaN observed in native histogram
Fixes: #16578
See the issue for detailed explanation.
When a histogram had only NaN observations and no normal observations,
we returned 0 from the quantile, which is completely wrong. If there were
normal observations but we went over them, we returned the upper bound of
the existing buckets, however that contradicts expectations on
histogram_fraction. Now we return NaN if the quantile is calculated to be
over all normal observations, falling into NaNs (in a virtual +Inf bucket).
We also return info level annotations if we see any NaN observations.
The annotation calls out if we returned NaN or even if we took the
virtual +Inf bucket into account.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* fix(promql): histogram_fraction NaN observed in native histogram
Fixes: #16580
According to the specification we should not take NaN observations
into account when calculating the fraction. This commit fixes that
and adds an info level annotation to let the user know about this.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
This commit adds the ts_of_(max,min,last)_over_time functions behind the experimental feature flag.
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
The new docs site will have syntax highlighting, so this adds language tags
to code boxes that are currently missing them. I didn't add `promql` as a
language yet since the highlighter doesn't support it yet, plus a lot of
the PromQL codeboxes in our docs aren't strictly valid PromQL, they are
more like multiple expressions listed in the same code box on multiple
lines. So I'm leaving that for sometime later.
In the HTTP API page, I moved the curl examples from the JSON codeboxes to
their own ones above the JSON output. I considered putting an "Output:"
text between the curl + JSON output, but I think the way it currently looks
without it is probably fine.
I also fixed a number of headings which were at the wrong level relative to
their nesting in the document.
I also removed `go` as a language from the Go template language examples,
because the Go template language isn't Go at all.
I also adjusted the indent on one codebox to be more reasonable (2 spaces
instead of 8).
And then finally, my editor made a bunch of whitespace changes
automatically, like removing trailing spaces.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* promql: histogram_fraction for bucket histograms
This PR extends the histogram_fraction function to also work with classic bucket histograms. This is beneficial because it allows expressions like sum(increase(my_bucket{le="0.5"}[10m]))/sum(increase(my_total[10m])) to be written without knowing the actual values of the "le" label, easing the transition to native histograms later on.
It also feels natural since histogram_quantile also can deal with classic histograms.
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
* promql: histogram_fraction for bucket histograms
* Add documentation and reduce code duplication
* Fix a bug in linear interpolation between bucket boundaries
* Add more PromQL tests
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
* Update docs/querying/functions.md
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
---------
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
This is also meant to document the actual implementation, but
see #13934 for the current state.
This also improves and streamlines some parts of the documentation
that are not strictly native histogram related, but are colocated with
them. In particular, the section about aggregation operators got
restructured quite a bit, including the removal of a quite verbose
example for `limit_ratio` (which was just too long an this location
and also a bit questionabl in its usefulness).
Signed-off-by: beorn7 <beorn@grafana.com>
What
Adds support for OTLP delta temporality to the OTLP endpoint.
This is done by calling the deltatocumulative processor from the OpenTelemetry collector during OTLP conversion.
Why
Delta conversion is a naturally stateful process, which requires careful request routing when operated inside a collector.
Prometheus is already stateful and doing the conversion in-server reduces the operational burden on the ingest architecture by only having one stateful component.
How
deltatocumulative is a OTel collector component that works as follows:
* pmetric.Metrics come from a receiver or in this case from the HTTP client
* It operates as an in-place update loop:
* for each sample, if not delta, leave unmodified
* if delta, do:
* state += sample, where state is the in-memory sum of all previous samples
* sample = state, sample value is now cumulative
* this is supported for sums (counters), gauges, histograms (old histograms) and exponential histograms (native histograms)
If a series receives no new samples for 5m, its state is removed from memory
Performance
Delta performance is a stateful operation and the OTel code is not highly optimized yet, e.g. it locks the entire processor for each request. Nonetheless, care has been taken to mitigate those effects:
delta conversion is behind a feature flag. If disabled, no conversion code is ever invoked
if enabled, conversion is not invoked if request not actually contains delta samples. This leads to no measureable performance difference between default-cumulative to convert-cumulative (only cumulative, feature on/off)
Signed-off-by: sh0rez <me@shorez.de>
This commit introduced two field in `/status` endpoint:
- The node currently serving the request.
- The current server time for debugging time drift issues.
fixes#15394.
Signed-off-by: sujal shah <sujalshah28092004@gmail.com>
The `info` function is an experiment to improve UX
around including labels from info metrics.
`info` has to be enabled via the feature flag `--enable-feature=promql-experimental-functions`.
This MVP of info simplifies the implementation by assuming:
* Only support for the target_info metric
* That target_info's identifying labels are job and instance
Also:
* Encode info samples' original timestamp as sample value
* Deduce info series select hints from top-most VectorSelector
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Ying WANG <ying.wang@grafana.com>
Co-authored-by: Augustin Husson <augustin.husson@amadeus.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Put a scalar to query, it can be graphed.
So the doc says "an expression that returns an instant vector is the only type which can be graphed." is not correct?
And also, a query_range, which used for graph, always return a range vector <https://promlabs.com/blog/2020/06/18/the-anatomy-of-a-promql-query/#range-queries> , so it's confusing to read the above statement.
Signed-off-by: Viet Hung Nguyen <hvn@familug.org>
This unifies the documentation of float literals and time durations
and updates all references to the old definitions.
Signed-off-by: beorn7 <beorn@grafana.com>
The OTLP receiver can now considered stable. We've had it for longer
than a year in main and has received constant improvements.
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>