* fix(nhcb): flaky test TestConvertClassicHistogramsToNHCB
The test was e2e, including actually scraping an HTTP endpoint and running
the scrape loop. This led to some timing issues.
I've simplified it to call the scrape loop append directly. I think that
this isn't nice as that is a private interface, but should gets rid of the
flakiness and there's already a bunch of test doing this.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.
This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.
Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.
Signed-off-by: beorn7 <beorn@grafana.com>
Go will not allocate when reading from a map with a key cast from []byte to string.
Also remove some yoloString calls in package `textparse` - call a more suitable library function.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* close sync ch after sender() is stopped
* break if chan is closed
Signed-off-by: liyandi <littlepangdi@163.com>
Co-authored-by: liyandi <liyandi@xiaomi.com>
Fixes#16689
well, maybe not 100%, but should improve it.
Increase the scrape timeout to be more tolerant of slow test and
also use eventually when checking for targets.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Scrape cache is used to emit StaleNaN markers after a series disappears so it should only hold entries for series that did end up in TSDB, which is not always the case due to sample_limit.
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
Tests that look at samples with StaleNaN values will fail because these samples are generated from map iteration and so the order can be unstable.
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
I was confused why there are no StaleNaN markers appended when a scrape hits sample_limit, but reading the code I see that's expected, so add a test for it.
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
Currently all staleness markers are appended for any sample that disappears from scrape cache, even if that sample was never appended to TSDB.
When staleness markers are appended they always use ref=0 as the SeriesRef, so the downstream appender doesn't know if the sample is for a know series or not.
This changes the scrape cache so the map used for staleness tracking stores the cache entry instead of only the label set. Having the cache entry means:
- we can ignore stale samples that didn't end up in TSDB (not in the scrape cache)
- we can append them to TSDB using correct ref value, so the appender knows if they are for know or unknown series
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Reload all scrape pools concurrently
At the moment all scrape pools that need to be reloaded are reloaded one by one. While reloads are ongoing mtxScrape is locked.
For each pool that's being reloaded we need to wait until all targets are updated.
This whole process can take a while and the more scrape pools to reload the longer.
At the same time all pools are independent and there's no real reason to do them one-by-one.
Reload each pool in a seperate goroutine so we finish config reload as ASAP as possible and unlock the mtxScrape.
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Address PR review feedback
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
---------
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
Addresses https://github.com/prometheus/prometheus/issues/16371
This will help with migrating to native histograms with `convert_classic_histograms_to_nhcb` since users may still need to keep the classic histograms during a migration
Signed-off-by: chardch <otwordsne@gmail.com>
The new metric_name_escaping_scheme config option works in parallel with metric_name_validation_scheme and controls which escaping scheme is requested when scraping. When not specified, the scheme will request underscores if the validation scheme is set to legacy, and will request allow-utf-8 when the validation scheme is set to utf8. This setting allows users to allow utf8 names if they like, but explicitly request an escaping scheme rather than UTF-8.
Fixes https://github.com/prometheus/prometheus/issues/16034
Built on https://github.com/prometheus/prometheus/pull/16080
Signed-off-by: Owen Williams <owen.williams@grafana.com>
No need to validate, track, add sample, add exemplar if series isn't
going to be ingested. Basically this is the same as if this series was
dropped by relabelling.
Fixes#16217
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Global and Data Source configurations can specify legacy mode, but Prometheus now requires that the overall validation mode be set to UTF-8
Signed-off-by: Owen Williams <owen.williams@grafana.com>
We use the cache iteration number to detect when the same metric has
occurred twice in a scrape. We need to bump this number at the end of
every scrape, not just successful scrapes.
The `iter` number is also used:
* After a successful scrape, to delete older metrics and metadata.
* To detect when metadata changed in this scrape.
None of those additional cases is broken by incrementing the number
on error.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
..instead of *int64. This is as an optimization and ease of use. We already
accepted in many places (proto histograms, PRW) that CT (or any timestamp really) 0
means not set.
Signed-off-by: bwplotka <bwplotka@gmail.com>
The `Accept` header should not include `escape=allow-utf-8` unless
explicitly requested.
Conveniently, there was already a test covering this header's value, it
just required updating so it also asserts that this value in the header
is not set in the cases we don't expect it to be set. I also converted
those tests into table tests to help make failures clearer.
Issue: https://github.com/prometheus/prometheus/issues/15857
Signed-off-by: Matt Hughes <mhughes@uw.co.uk>
Change case order for switch scrapeLoop
This commit changes the ordering in error identification switch cases for better production performance and adds reasonings behind error switch case order as comments.
---------
Signed-off-by: Laimis Juzeliūnas <asnelaimis@gmail.com>
* model/textparse: Change parser interface Metric(...) string to Labels(...)
Simplified the interface given no one is using the return argument.
Renamed for clarity too.
Found and discussed https://github.com/prometheus/prometheus/pull/15731#discussion_r1950916842
Signed-off-by: bwplotka <bwplotka@gmail.com>
* Fixed comments; optimized not needed copy for om and text.
Signed-off-by: bwplotka <bwplotka@gmail.com>
---------
Signed-off-by: bwplotka <bwplotka@gmail.com>
Also:
* split benchmark functions to make sure no one compares across parsers.
* testdata file have meaningful names reflecting the type representation
* promtestdata.txt now has all types, taken directly from long running Prometheus (https://demo.do.prometheus.io/)
Needed for https://github.com/prometheus/prometheus/pull/15731
Signed-off-by: bwplotka <bwplotka@gmail.com>