If a sample read through remote read has too high a resolution,
reduce it to the maximum allowed.
This is a slow path, but we only expect to hit it if the server side
runs a newer version that allows higher resolution.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Allow schemas -9..52 instead of just -4..8, but reduce the resolution
to schema 8 if it is above that.
The reduction code path will be slow, but we only expect to hit it if
the TSDB already has higher-resolution samples and we are in a
rollback.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
When remote read returns chunks, the validation happens in
tsdb/chunkenc. However, when it returns samples, we need to modify the
iterator to validate them.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Otherwise, higher-level code like PromQL would need to constantly
check whether it can handle the samples.
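A hypothetical sketch of such a validating wrapper around `chunkenc.Iterator` (only the float-histogram path is shown; the integer-histogram case would look the same):
```go
import "github.com/prometheus/prometheus/tsdb/chunkenc"

// schemaValidatingIterator wraps the sample iterator returned by remote
// read and validates each histogram as it is decoded, so callers like
// PromQL never see an unsupported sample.
type schemaValidatingIterator struct {
	chunkenc.Iterator
	err error
}

func (it *schemaValidatingIterator) Next() chunkenc.ValueType {
	vt := it.Iterator.Next()
	if vt == chunkenc.ValFloatHistogram {
		_, fh := it.Iterator.AtFloatHistogram(nil)
		if err := fh.Validate(); err != nil {
			it.err = err
			return chunkenc.ValNone
		}
	}
	return vt
}

func (it *schemaValidatingIterator) Err() error {
	if it.err != nil {
		return it.err
	}
	return it.Iterator.Err()
}
```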
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
With the fixed commit order, we can now handle the conversion of float
staleness markers to histogram staleness markers in a more direct way.
Signed-off-by: beorn7 <beorn@grafana.com>
Fixes https://github.com/prometheus/prometheus/issues/15177
The basic idea here is to divide the samples to be committed into
(sub-)batches whenever we detect that the same series receives a
sample of a type different from the previous one. We then commit those
batches one after another, and we log them to the WAL one after
another, so that we kill both birds with the same stone. The cost of
the stone is that we have to track the sample type of each series in a
map. Given the amount of things we already track in the appender, I
hope that it won't make a dent. Note that this even addresses the NHCB
special case in the WAL.
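A hypothetical sketch of the batching idea (types simplified; the real appender tracks more than plain float samples):
```go
import (
	"github.com/prometheus/prometheus/tsdb/chunks"
	"github.com/prometheus/prometheus/tsdb/record"
)

type sampleType uint8

const (
	typeFloat sampleType = iota
	typeHistogram
	typeFloatHistogram
)

// batcher cuts a new (sub-)batch whenever a series switches sample type,
// so batches can be WAL-logged and committed strictly one after another.
type batcher struct {
	lastType map[chunks.HeadSeriesRef]sampleType
	batches  [][]record.RefSample
	cur      []record.RefSample
}

func (b *batcher) add(ref chunks.HeadSeriesRef, s record.RefSample, t sampleType) {
	if b.lastType == nil {
		b.lastType = map[chunks.HeadSeriesRef]sampleType{}
	}
	if prev, ok := b.lastType[ref]; ok && prev != t {
		// Same series, different sample type: close the current batch.
		b.batches = append(b.batches, b.cur)
		b.cur = nil
	}
	b.lastType[ref] = t
	b.cur = append(b.cur, s)
}
```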
This does a few other things that I could not resist picking up along
the way:
- It adds more zeropool.Pools and uses the existing ones more
consistently. My understanding is that this was merely an oversight.
Maybe the additional pool usage will compensate for the increased
memory demand of the map.
- It creates the synthetic zero sample for histograms a bit more
carefully. So far, we created a sample that always went into its own
chunk. Now we create a sample that is compatible enough with the
following sample to go into the same chunk (see the sketch below).
This changed the test results quite a bit, but IMHO it makes much more
sense now.
- Continuing past efforts, I changed more names from `Samples` to
`Floats` to keep things consistent and less confusing. (Histogram
samples are also samples.) I still avoided changing names in other
packages.
- I added a few shortcuts `h := a.head`, saving many characters.
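Regarding the synthetic zero sample, a minimal sketch of "compatible enough" (field names from `model/histogram`; the idea is to copy the layout-relevant fields and zero all counts):
```go
import "github.com/prometheus/prometheus/model/histogram"

// syntheticZeroSample builds a zero-count histogram that shares the
// chunk-layout-relevant fields with the sample that will follow, so both
// can live in the same chunk.
func syntheticZeroSample(next *histogram.Histogram) *histogram.Histogram {
	return &histogram.Histogram{
		Schema:        next.Schema,
		ZeroThreshold: next.ZeroThreshold,
		CustomValues:  next.CustomValues, // keeps NHCB layouts compatible
	}
}
```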
TODOs:
- Address @krajorama's TODOs about commit order and staleness handling.
Signed-off-by: beorn7 <beorn@grafana.com>
It's not possible to store a created timestamp at the same timestamp
as the current sample, so do not even try.
In the OpenTelemetry spec, if the start time is unknown, it is set to
the same timestamp as the first sample:
https://opentelemetry.io/docs/specs/otel/metrics/data-model/#cumulative-streams-handling-unknown-start-time
This means that we would get a lot of duplicate-sample-for-timestamp
errors, and we should not log those.
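A minimal sketch of the guard; the `appendCT` helper itself is hypothetical, and it assumes the `AppendCTZeroSample` method on `storage.Appender`:
```go
import (
	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/storage"
)

// appendCT is a hypothetical helper: only try to store the created
// timestamp when it is strictly before the sample timestamp. Storing it
// at the sample's own timestamp can only fail as a duplicate sample, so
// we skip it up front instead of logging the inevitable error.
func appendCT(app storage.Appender, ref storage.SeriesRef,
	lbls labels.Labels, t, ct int64) (storage.SeriesRef, error) {
	if ct == 0 || ct >= t {
		return ref, nil // unknown or same-timestamp start time: don't try
	}
	return app.AppendCTZeroSample(ref, lbls, t, ct)
}
```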
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Histogram.Validate and FloatHistogram.Validate now return an error on
unsupported schemas.
The scrape and remote-write handlers reduce the schema to the maximum
allowed if it is above that maximum but below the theoretical maximum
of 52. For scrape, the maximum is a configuration option; for remote
write, it is 8.
Note: the OTLP endpoint already does the reduction, without checking
that the schema is below 52, as the spec does not specify a maximum.
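A hypothetical sketch of the accept/reduce/reject decision (`maxSchema` would be the configured scrape limit, or 8 for remote write; only the upper bound is shown, Validate covers the rest):
```go
import (
	"fmt"

	"github.com/prometheus/prometheus/model/histogram"
)

func checkSchema(h *histogram.Histogram, maxSchema int32) (*histogram.Histogram, error) {
	switch {
	case h.UsesCustomBuckets() || h.Schema <= maxSchema:
		return h, nil // within the supported range
	case h.Schema <= 52:
		// Higher resolution than we accept, but still a valid
		// exponential schema: reduce instead of rejecting.
		return h.ReduceResolution(maxSchema), nil
	default:
		return nil, fmt.Errorf("histogram schema %d above theoretical maximum of 52", h.Schema)
	}
}
```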
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
The current code stops the walk after we have found the first relevant
function. However, in expressions with multiple legs, we will then use
the `HistogramStatsIterator` at most once. This change should make
sure we explore all legs.
The added tests make sure we are not using `HistogramStatsIterator`
where we shouldn't (but the opposite can only be seen in a benchmark
or with a more explicit test).
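A hypothetical sketch of the fixed walk, using `parser.Inspect` from promql/parser; `markStatsLeg` is an illustrative stand-in for whatever records the decision:
```go
import "github.com/prometheus/prometheus/promql/parser"

func findStatsLegs(expr parser.Expr) {
	parser.Inspect(expr, func(node parser.Node, _ []parser.Node) error {
		if call, ok := node.(*parser.Call); ok {
			switch call.Func.Name {
			case "histogram_count", "histogram_sum":
				markStatsLeg(call) // hypothetical: record this leg
			}
		}
		// Returning nil keeps descending, so every leg of a
		// multi-legged expression is explored.
		return nil
	})
}
```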
Signed-off-by: beorn7 <beorn@grafana.com>
Prior to #17127, we needed to add another level in the AST to trigger
the usage of `HistogramStatsIterator`. This is fixed now.
Signed-off-by: beorn7 <beorn@grafana.com>
Previously, multiple calls for the same sample returned a wrong
counter reset hint.
This commit also includes a bunch of refactorings that partially have
value on their own. However, the need for them was triggered by the
additional work needed for idempotency, so I included them in this
commit.
Signed-off-by: beorn7 <beorn@grafana.com>
* fix(textparse): implement NHCB parsing in ProtoBuf parser directly
The NHCB conversion does some validation, but we can only return an
error from Parser.Next(), not Parser.Histogram(). So the conversion
needs to happen in Next().
There are two cases:
1. "always_scrape_classic_histograms" is enabled, in which case we
convert after returning the classic series. This is consistent with
the PromParser text parser, which collects the NHCB while emitting the
classic series and then returns the NHCB.
2. "always_scrape_classic_histograms" is disabled, in which case we
never return the classic series.
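A heavily simplified sketch of why the conversion sits in Next() (all helper methods and the type are hypothetical):
```go
func (p *protobufParserSketch) Next() (Entry, error) {
	entry, err := p.advance() // hypothetical: move the underlying parser
	if err != nil {
		return EntryInvalid, err
	}
	if p.convertNHCB && entry == EntryHistogram && p.pendingClassic() {
		// Convert (and validate) here, because Next() is the only
		// method that can return an error; Histogram() later just
		// hands out the stored result.
		if p.nhcb, err = p.convertToNHCB(); err != nil {
			return EntryInvalid, err
		}
	}
	return entry, nil
}
```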
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* refactor(textparse): skip classic series instead of adding NHCB around
Do not return the first classic series from the EntryType state;
switch to EntrySeries instead. This means we need to start the
histogram field state from -3, not -2.
In EntrySeries, skip classic series if needed.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* reuse nhcb converter
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* test(textparse/nhcb): test corrupting metric family name
NHCB parse doesn't always copy the metric name from the underlying
parser. When called via HELP, UNIT, the string is directly referenced
which means that the read-ahead of NHCB can corrupt it.
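A minimal sketch of the hazard the test guards against (`strings.Clone` forces a copy of the backing bytes; the type and method names are illustrative):
```go
import "strings"

type nhcbNameSketch struct {
	family string // metric family name captured from HELP or UNIT
}

func (p *nhcbNameSketch) captureFamily(aliased string) {
	// Without the clone, the string aliases the underlying parser's
	// buffer, and the NHCB read-ahead can rewrite those bytes.
	p.family = strings.Clone(aliased)
}
```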
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* OTLP writer writes directly to appender
Do not convert to the Remote-Write 1.0 protocol; convert to the TSDB
Appender interface instead.
For downstream projects that still convert OTLP to something else
(e.g. Mimir using its own RW 1.0+2.0 compatible protocol), introduce a
compatibility layer between OTLP decoding and the TSDB Appender. This
is the CombinedAppender that hides the implementation. The name is
subject to change.
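An illustrative shape for that layer, not the real method set (the actual interface lives with the OTLP translation code):
```go
import (
	"github.com/prometheus/prometheus/model/exemplar"
	"github.com/prometheus/prometheus/model/histogram"
	"github.com/prometheus/prometheus/model/labels"
)

// CombinedAppender hides whether samples end up in a TSDB Appender or in
// a downstream project's own protocol.
type CombinedAppender interface {
	AppendSample(ls labels.Labels, t int64, v float64) error
	AppendHistogram(ls labels.Labels, t int64, h *histogram.Histogram) error
	AppendExemplar(ls labels.Labels, e exemplar.Exemplar) error
}
```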
---------
Signed-off-by: David Ashpole <dashpole@google.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: David Ashpole <dashpole@google.com>
Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Hide the wrapping of a parser with the NHCB parser inside the New()
function so we can easily add parsers with direct NHCB support.
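A hypothetical sketch of the New() shape (`newInnerParser` stands in for the content-type dispatch; the `NewNHCBParser` arguments are simplified):
```go
func New(b []byte, contentType string, convertNHCB bool, st *labels.SymbolTable) (Parser, error) {
	p, err := newInnerParser(b, contentType, st) // hypothetical dispatch
	if err != nil {
		return nil, err
	}
	// The wrapping happens here, invisibly to callers; a parser that
	// emits NHCB directly can simply skip it.
	if convertNHCB {
		p = NewNHCBParser(p, st, false)
	}
	return p, nil
}
```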
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Remote-Write 1.0 currently attempts to send native histograms with
custom buckets, but these are not actually supported by the RW1
protocol. Drop them, and measure and log the drops instead.
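A minimal sketch of the guard on the sending side (`UsesCustomBuckets` is the real method; the helper is illustrative, and the caller is expected to count and log the drop):
```go
import "github.com/prometheus/prometheus/model/histogram"

// dropForRW1 reports whether a histogram cannot be encoded in the RW1
// wire format; callers count and log the drop instead of sending it.
func dropForRW1(h *histogram.Histogram) bool {
	return h != nil && h.UsesCustomBuckets()
}
```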
Fixes: #17140
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
After an effective Seek, the lastFH isn't the lastFH anymore, so we
should nil it out.
In practice, this should only matter in sub-queries, because we are
otherwise not interested in a counter reset of the first sample
returned after a Seek.
Sub-queries, on the other hand, always do their own counter reset
detection. (For that, they would prefer to see the whole histogram, so
that's another problem for another commit.)
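A simplified sketch of the fix (the real iterator is unexported in promql and tracks more state):
```go
func (hsi *histogramStatsIterator) Seek(t int64) chunkenc.ValueType {
	vt := hsi.Iterator.Seek(t)
	// Whatever we remembered as the previous histogram no longer
	// precedes the current position, so it must not feed counter-reset
	// detection for the next sample.
	hsi.lastFH = nil
	return vt
}
```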
Signed-off-by: beorn7 <beorn@grafana.com>
The `HistogramStatsIterator` is only meant to be used within PromQL.
PromQL only ever uses float histograms. If `HistogramStatsIterator` is
capable of handling integer histograms, it will still be used, for
example by the `BufferedSeriesIterator`, which buffers samples and
will use an integer `Histogram` for it, if the underlying chunk is an
integer histogram chunk (which is common).
However, we can simply intercept the `Next` and `Seek` calls and
pretend to only ever be able to return float histograms. This has the
welcome side effect that we do not have to handle a mix of float and
integer histograms in the `HistogramStatsIterator` anymore.
With this commit, the `AtHistogram` call has been changed to panic so
that we ensure it is never called.
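A simplified sketch of the interception (names as in promql's unexported iterator; error handling elided):
```go
func (hsi *histogramStatsIterator) Next() chunkenc.ValueType {
	vt := hsi.Iterator.Next()
	if vt == chunkenc.ValHistogram {
		// Pretend: PromQL only ever consumes float histograms, so
		// buffering layers never ask for the integer representation.
		return chunkenc.ValFloatHistogram
	}
	return vt
}

func (hsi *histogramStatsIterator) AtHistogram(*histogram.Histogram) (int64, *histogram.Histogram) {
	panic("histogramStatsIterator: AtHistogram must never be called")
}
```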
Benchmark differences between this and the previous commit:
name old time/op new time/op delta
NativeHistograms/histogram_count_with_short_rate_interval-16 837ms ± 3% 616ms ± 2% -26.36% (p=0.008 n=5+5)
NativeHistograms/histogram_count_with_long_rate_interval-16 1.11s ± 1% 0.91s ± 3% -17.75% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_short_rate_interval-16 751ms ± 6% 581ms ± 1% -22.63% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_long_rate_interval-16 1.13s ±11% 0.85s ± 2% -24.59% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
NativeHistograms/histogram_count_with_short_rate_interval-16 531MB ± 0% 148MB ± 0% -72.08% (p=0.008 n=5+5)
NativeHistograms/histogram_count_with_long_rate_interval-16 528MB ± 0% 145MB ± 0% -72.60% (p=0.016 n=5+4)
NativeHistogramsCustomBuckets/histogram_count_with_short_rate_interval-16 452MB ± 0% 145MB ± 0% -67.97% (p=0.016 n=5+4)
NativeHistogramsCustomBuckets/histogram_count_with_long_rate_interval-16 452MB ± 0% 141MB ± 0% -68.70% (p=0.016 n=5+4)
name old allocs/op new allocs/op delta
NativeHistograms/histogram_count_with_short_rate_interval-16 8.95M ± 0% 1.60M ± 0% -82.15% (p=0.008 n=5+5)
NativeHistograms/histogram_count_with_long_rate_interval-16 8.84M ± 0% 1.49M ± 0% -83.16% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_short_rate_interval-16 5.96M ± 0% 1.57M ± 0% -73.68% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_long_rate_interval-16 5.86M ± 0% 1.46M ± 0% -75.05% (p=0.016 n=5+4)
Signed-off-by: beorn7 <beorn@grafana.com>
PR #16702 introduced a regression because it was too strict in
detecting the condition for using the `HistogramStatsIterator`. It
essentially required the triggering function to be buried at least one
level deep.
`histogram_count(sum(rate(native_histogram_series[2m])))` would not
trigger anymore, but
`1*histogram_count(sum(rate(native_histogram_series[2m])))` would.
Ironically, PR #16682 made the performance of the
`HistogramStatsIterator` so much worse that _not_ using it was often
better, but this has to be addressed in a separate commit.
This commit reinstates the previous `HistogramStatsIterator` detection
behavior, as PR #16702 intended to keep it.
Relevant benchmark changes with this commit (i.e. old is without using
`HistogramStatsIterator`, new is with `HistogramStatsIterator`):
name old time/op new time/op delta
NativeHistograms/histogram_count_with_short_rate_interval-16 802ms ± 3% 837ms ± 3% +4.42% (p=0.008 n=5+5)
NativeHistograms/histogram_count_with_long_rate_interval-16 1.22s ± 3% 1.11s ± 1% -9.46% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_short_rate_interval-16 611ms ± 5% 751ms ± 6% +22.87% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_long_rate_interval-16 975ms ± 4% 1131ms ±11% +16.04% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
NativeHistograms/histogram_count_with_short_rate_interval-16 222MB ± 0% 531MB ± 0% +139.63% (p=0.008 n=5+5)
NativeHistograms/histogram_count_with_long_rate_interval-16 323MB ± 0% 528MB ± 0% +63.81% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_short_rate_interval-16 179MB ± 0% 452MB ± 0% +153.07% (p=0.016 n=4+5)
NativeHistogramsCustomBuckets/histogram_count_with_long_rate_interval-16 175MB ± 0% 452MB ± 0% +157.73% (p=0.016 n=4+5)
name old allocs/op new allocs/op delta
NativeHistograms/histogram_count_with_short_rate_interval-16 4.48M ± 0% 8.95M ± 0% +99.51% (p=0.008 n=5+5)
NativeHistograms/histogram_count_with_long_rate_interval-16 5.02M ± 0% 8.84M ± 0% +75.89% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_short_rate_interval-16 3.00M ± 0% 5.96M ± 0% +98.93% (p=0.008 n=5+5)
NativeHistogramsCustomBuckets/histogram_count_with_long_rate_interval-16 2.89M ± 0% 5.86M ± 0% +102.69% (p=0.016 n=4+5)
Signed-off-by: beorn7 <beorn@grafana.com>
- Add a code comment about a counter reset edge case (which is
hopefully not relevant in practice).
- Rename the receiver from `f` to `hsi`. (`f` seemed completely off
as a name. `i` or `it` might have worked, too, but I ended up with
`hsi` as the easiest for the reader.)
Signed-off-by: beorn7 <beorn@grafana.com>
So far, we emitted a `HistogramCounterResetCollisionWarning` when
encountering conflicting counter resets in the calculation of (i)rate
and friends. We even tested for that. However, in the rate
calculation, we are not interested in those collisions. They are
actually expected.
On the other hand, we did not warn about those collisions when doing a
`sum` aggregation, where such a warning would be appropriate.
This commit removes the warning in the former case and adds it in the
latter. Sadly, we cannot really test this as we still remove the
counter reset hint for the first sample in a chunk. (And that's the
only sample where we could get a `NotCounterReset` hint.)
Signed-off-by: beorn7 <beorn@grafana.com>
This is an attempt to make sure that we are not accidentally warning
about conflicting counter resets in rate calculation, see
https://github.com/prometheus/prometheus/pull/17051#issuecomment-3226503416 .
This is done by being more explicit about the warn expectation.
However, as long as
https://github.com/prometheus/prometheus/issues/15346 is not
addressed, we won't be able to trigger the annotation this way anyway.
We can, however, play a trick by wrapping a suitable expression in
`histogram_count` or `histogram_sum`, which will invoke the
`HistogramStatsIterator`, which in turn creates counter reset hints on
the fly. So this commit also adds tests with that, both for absence of
an annotation with `rate` and presence of an annotation with
`sum_over_time`.
Signed-off-by: beorn7 <beorn@grafana.com>
test tbs
Signed-off-by: beorn7 <beorn@grafana.com>