This shows how float counters cannot go below zero when extrapolationg
for rate/increase, and how histograms do not have that protection yet,
leading to an overestimation of the rate/increase.
This also demonstrates edge cases where the count extrapolation does
not need to be limited, but an individual bucket still goes below
zero.
Signed-off-by: beorn7 <beorn@grafana.com>
step() is a new keyword introduced to represent the query step width in duration expressions.
min(a,b) and max(a,b) return the min and max from two duration expressions.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This commit brings back direct mean calculation (for `avg` and
`avg_over_time`) but isn't an outright revert of #16569. It keeps the
improved incremental mean calculation and features generally a bit
cleaner code than before.
Also, this commit...
- ...updates the lengthy comment explaining the whole situation and
trade-offs.
- ...divides the running sum and the Kahan compensation term
separately (in direct mean calculation) to avoid the (unlikely)
possibility that sum and Kahan compensation together ovorflow
float64.
- ...uncomments the tests that should now work again on darwin/arm64.
- ...uncomments the test that should now reliably yield the
(inaccurate) value 0 on all hardware platforms. Also, the test
description has been updated accordingly.
- ...adds avg_over_time tests for zero and one sample in the range.
Signed-off-by: beorn7 <beorn@grafana.com>
The test in question actually worked fine even before #16569. The
finding reported in the comment has turned out to be caused by
something else.
Signed-off-by: beorn7 <beorn@grafana.com>
* fix(promql): histogram_quantile NaN observed in native histogram
Fixes: #16578
See the issue for detailed explanation.
When a histogram had only NaN observations and no normal observations,
we returned 0 from the quantile, which is completely wrong. If there were
normal observations but we went over them, we returned the upper bound of
the existing buckets, however that contradicts expectations on
histogram_fraction. Now we return NaN if the quantile is calculated to be
over all normal observations, falling into NaNs (in a virtual +Inf bucket).
We also return info level annotations if we see any NaN observations.
The annotation calls out if we returned NaN or even if we took the
virtual +Inf bucket into account.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* fix(promql): histogram_fraction NaN observed in native histogram
Fixes: #16580
According to the specification we should not take NaN observations
into account when calculating the fraction. This commit fixes that
and adds an info level annotation to let the user know about this.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
This commit fixes the evaluation of invalid expressions like
`sum(rate(`. Before that, it would trigger a panic in the PromQL engine
because it tried to access an index which is out of range.
The bug was probably introduced by 06d0b063ea.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This PR fixes a bug in ts_of_last_over_time where the float samples
where used when computing the last timestamp of the histogram samples.
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Split unary operator handling in duration expressions into two specific
cases to fix precedence conflicts:
- Handle unary operators with number literals directly
- Handle unary operators with parenthesized expressions separately
This prevents unary minus from incorrectly binding to subsequent
operators in expressions like `foo offset -1^1`, ensuring it parses
as `(foo offset -1) ^ 1` rather than `foo offset (-1^1)`.
Fixes#16711
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This commit adds the ts_of_(max,min,last)_over_time functions behind the experimental feature flag.
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
These tests fail on darwin/arm64.
One is expected, because the test demonstrates the limits of the
numerical accuracy of our methods, and different inaccurate outcomes
on different hardware are expected.
The other two are mysterious at the moment, see
https://github.com/prometheus/prometheus/issues/16714 for detailed
discussion and debugging.
Signed-off-by: beorn7 <beorn@grafana.com>
Restarting the depth-first walk on each leg of a binary expression is
convoluted. ISTM the correct logic is to walk the path backwards to the
first relevant function.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Provide PromQL info annotations when rate()/increase() over series without counter label
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
* Address comments
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
---------
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
As it turns out, if we combine Kahan summation and incremental mean
calculation properly, it works quite well and we do not need to switch
between simple mean calculation and incremental calculation based on
overflow.
This simplifies the code quite a bit.
Signed-off-by: beorn7 <beorn@grafana.com>
This was an oversight because the old tests still happened to pass
with the new behavior, but important test data was excluded at the
left end of the interval, rendering some tests not actually testing
what we want to test.
In the past, we also applied different strategies to adjust the test
(extend the range from 1m to 2m, or set the evaluation timestamp to
45s). This commit unifies things and reduces redundancy.
Signed-off-by: beorn7 <beorn@grafana.com>
The problem is in the counter reset detection. The code that loads the
samples is matrixIterSlice which uses the typed Buffer iterator, which
will preload the integer histogram samples, however the last sample is
always(!) loaded as a float histogram sample in matrixIterSlice and the
optimized iterator fails to detect counter resets in that case.
Also the iterator does not reset lastH, lastFH properly.
Ref: https://github.com/prometheus/prometheus/issues/16681
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Copy the benchmark for native histograms with exponential buckets and
adopt to native histograms with custom buckets.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
This is failing after 26088d01b9518b3315805186dcee534b986bf79f and it seems that currently tested position is incorrect, fix it
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
The position range of nested aggregate expression was wrong, for the
expression "(sum(foo))" the position of "sum(foo)" should be 1-9, but
the parser could not decide the end of the expression on pos 9, instead
it read ahead to pos 10 and then emitted the aggregate. But we only
kept the last closing position (10) and wrote that into the aggregate.
The reason for this is that the parser cannot know from "(sum(foo)" alone
if the aggregate is finished. It could be finished as in "(sum(foo))" but
equally it could continue with group modifier as "(sum(foo) by (bar))".
The fix is to track ")" tokens in a stack in addition to the lastClosing
position, which is overloaded with other things like offset number tracking.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>