promql: histogram_stddev and histogram_stdvar should use arithmetic mean for custom buckets

Signed-off-by: amanycodes <amanycodes@gmail.com>
Aman 2025-04-24 18:18:58 +05:30 committed by GitHub
parent 9659e30dec
commit 26bddcf068
5 changed files with 22 additions and 12 deletions

@ -428,9 +428,11 @@ annotation, you should find and remove the source of the invalid data.
## `histogram_stddev()` and `histogram_stdvar()`
`histogram_stddev(v instant-vector)` returns the estimated standard deviation
-of observations for each histogram sample in `v`, based on the geometric mean
-of the buckets where the observations lie. Float samples are ignored and do not
-show up in the returned vector.
+of observations for each histogram sample in `v`. For this estimation, all observations
+in a bucket are assumed to have the value of the mean of the bucket boundaries. For
+the zero bucket and for buckets with custom boundaries, the arithmetic mean is used.
+For the usual exponential buckets, the geometric mean is used. Float samples are ignored
+and do not show up in the returned vector.
Similarly, `histogram_stdvar(v instant-vector)` returns the estimated standard
variance of observations for each histogram sample in `v`.
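
As a rough illustration of the rule described in the updated documentation, the sketch below picks the single value assumed for all observations in a bucket. `representativeValue` and its `customBuckets` flag are hypothetical names chosen for this example, not part of the Prometheus API.

```go
package main

import (
	"fmt"
	"math"
)

// representativeValue is a hypothetical helper (not a Prometheus function).
// It returns the value assumed for every observation in the bucket
// [lower, upper], following the three cases in the documentation.
func representativeValue(lower, upper float64, customBuckets bool) float64 {
	switch {
	case customBuckets:
		// Custom boundaries: arithmetic mean of the bounds.
		return (lower + upper) / 2
	case lower <= 0 && upper >= 0:
		// Zero bucket of an exponential histogram: treated as 0.
		return 0
	default:
		// Exponential bucket: geometric mean, with the sign of the bucket.
		v := math.Sqrt(lower * upper)
		if upper < 0 {
			v = -v
		}
		return v
	}
}

func main() {
	fmt.Println(representativeValue(1, 4, true))           // custom bucket [1, 4]: 2.5
	fmt.Println(representativeValue(1, 4, false))          // exponential bucket [1, 4]: 2
	fmt.Println(representativeValue(-4, -1, false))        // negative exponential bucket: -2
	fmt.Println(representativeValue(-0.001, 0.001, false)) // zero bucket: 0
}
```

For the same boundaries, the arithmetic mean (2.5) is larger than the geometric mean (2), which is why expected values in tests that use custom buckets shift with this change.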


@ -1350,9 +1350,15 @@ func histogramVariance(vals []parser.Value, enh *EvalNodeHelper, varianceToResul
continue
}
var val float64
-if bucket.Lower <= 0 && 0 <= bucket.Upper {
+switch {
+case sample.H.UsesCustomBuckets():
+// Use arithmetic mean in case of custom buckets.
+val = (bucket.Upper + bucket.Lower) / 2.0
+case bucket.Lower <= 0 && bucket.Upper >= 0:
+// Use zero (effectively the arithmetic mean) in the zero bucket of a standard exponential histogram.
val = 0
-} else {
+default:
+// Use geometric mean in case of standard exponential buckets.
val = math.Sqrt(bucket.Upper * bucket.Lower)
if bucket.Upper < 0 {
val = -val


@ -95,13 +95,13 @@ eval instant at 50m histogram_avg(testhistogram3)
# Test histogram_stddev. This has no classic equivalent.
eval instant at 50m histogram_stddev(testhistogram3)
-{start="positive"} 2.8189265757336734
-{start="negative"} 4.182715937754936
+{start="positive"} 2.7435461458749795
+{start="negative"} 4.187667907081458
# Test histogram_stdvar. This has no classic equivalent.
eval instant at 50m histogram_stdvar(testhistogram3)
-{start="positive"} 7.946347039377573
-{start="negative"} 17.495112615949154
+{start="positive"} 7.527045454545455
+{start="negative"} 17.5365625
# Test histogram_fraction.
#
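
A quick sanity check on the updated expectations in this hunk: each `histogram_stddev` value should be the square root of the corresponding `histogram_stdvar` value. The sketch below compares them with a small tolerance; `consistent` is a hypothetical helper written for this check.

```go
package main

import (
	"fmt"
	"math"
)

// consistent reports whether stddev equals sqrt(stdvar) within tol.
func consistent(stddev, stdvar, tol float64) bool {
	return math.Abs(stddev-math.Sqrt(stdvar)) <= tol
}

func main() {
	// New expected values for testhistogram3 from the hunk above.
	fmt.Println(consistent(2.7435461458749795, 7.527045454545455, 1e-9)) // start="positive"
	fmt.Println(consistent(4.187667907081458, 17.5365625, 1e-9))         // start="negative"
}
```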


@ -337,7 +337,7 @@ load 10m
histogram_stddev_stdvar_3 {{schema:3 count:7 sum:62 z_bucket:1 buckets:[0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 ] n_buckets:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 ]}}x1
eval instant at 10m histogram_stddev(histogram_stddev_stdvar_3)
-{} 42.947236400258
+{} 42.94723640026
eval instant at 10m histogram_stdvar(histogram_stddev_stdvar_3)
{} 1844.4651144196398


@ -1266,9 +1266,11 @@ const funcDocs: Record<string, React.ReactNode> = {
</p>
<p>
<code>histogram_stddev(v instant-vector)</code> returns the estimated standard deviation of observations in a native
-histogram, based on the geometric mean of the buckets where the observations lie. Samples that are not native
-histograms are ignored and do not show up in the returned vector.
+histogram. For this estimation, all observations in a bucket are assumed to have the value of the mean of the bucket boundaries.
+For the zero bucket and for buckets with custom boundaries, the arithmetic mean is used. For the usual exponential buckets,
+the geometric mean is used. Samples that are not native histograms are ignored and do not show up in the returned vector.
</p>
<p>