Previously the AWS SD ECS Role only discovered instances that used
`awsvpc` network mode, which attaches a dedicated Elastic Network
Interface (ENI). This change adds in additional logic so that we
discover instances that are using `host` and `bridge` networking modes,
where the IP address is that of the EC2 instance that is hosting the
container. Also this change exposes a number of additional labels that
relate to the EC2 instance when the launch type is `EC2`.
Signed-off-by: matt-gp <small_minority@hotmail.com>
* promtool: allow cardinality with metrics linting and add --lint to check metrics
Signed-off-by: ADITYA TIWARI <adityatiwari342005@gmail.com>
* fix/ci: Simplify test case variable declaration
Remove unnecessary variable declaration in test cases.
Signed-off-by: ADITYA TIWARI <142050150+ADITYATIWARI342005@users.noreply.github.com>
* promtool: avoid Tee for --lint=none
Signed-off-by: ADITYA TIWARI <adityatiwari342005@gmail.com>
* promtool: validate at least one feature enabled in check metrics
addresses feedback to ensure the command does something useful
now fails with clear error when both --lint=none and no --extended flag.
Signed-off-by: ADITYA TIWARI <adityatiwari342005@gmail.com>
---------
Signed-off-by: ADITYA TIWARI <adityatiwari342005@gmail.com>
Signed-off-by: ADITYA TIWARI <142050150+ADITYATIWARI342005@users.noreply.github.com>
Add a maximum limit of 10,000 to the TSDB status endpoint to prevent
resource exhaustion from excessively large limit values, as we preallocate
[]Stat for up to the limit: `make([]Stat, 0, length)`.
Note that the endpoint acquires a cardinality mutex during
stats calculation, so this can not be run in parallel.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
This commit adds support for configuring a custom start timestamp
for Prometheus unit tests, allowing tests to use realistic timestamps
instead of starting at Unix epoch 0.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
The return value of functions relating to the current time, e.g. time(),
is set by promtool to start at timestamp 0 at the start of a test's
evaluation.
This has the very nice consequence that tests can run reliably without
depending on when they are run.
It does, however, mean that tests will give out results that can be
unexpected by users.
If this behaviour is documented, then users will be empowered to write
tests for their rules that use time-dependent functions.
(Closes: prometheus/docs#1464)
Signed-off-by: Gabriel Filion <lelutin@torproject.org>
* Delay compactions until Thanos uploads all blocks
Using Thanos sidecar with Prometheus requires us to disable TSDB compactions on Prometheus side by setting --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to the same value. See https://thanos.io/tip/components/sidecar.md. The main problem this avoids is that Prometheus might compact given block before Thanos uploads it, creating a gap in Thanos metrics. Thanos does not upload compacted blocks because that would upload the same sample multiple times. You can tell Thanos to upload compacted blocks but that is aimed at one time migrations. This patch creates a bridge between Thanos and Prometheus by allowing Prometheus to read the shipper file Thanos creates, where it tracks which blocks were already uploaded, and using that data delays compaction of blocks until they are marked as uploaded by Thanos. Thanks to this both services can coordinate with each other (in a way) and we can stop disabling compaction on Prometheus side when Thanos uploads are enabled.
The reason to have this is that disabling compactions have very dramatic performance cost. Since most time series exist for longer than a single block duration (2h by default) large chunks of block index will reference the same series, so 10 * 2h blocks will each have an index that is usually fairly big and is almost the same for all 10 blocks. Compaction de-duplicates the index so merging 10 blocks together would leave us with a single index that is around the same size as each of these 10 2h blocks would have (plus some extra for series that only exists in some blocks, but not all). Every range query that iterates over all 10 blocks would then have to read each index and so we're doing 10x more work then if we had a single compacted block.
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Rename structs and functions to make this more generic
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Address review comments
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Cache UploadMeta for 1 minute
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
---------
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Add a nav title to fix docs website generator.
* Make it more clear that "Prometheus Agent" is a mode, not a seaparate
service.
* Add to index.
* Cleanup some wording.
* Add a downsides section.
Signed-off-by: SuperQ <superq@gmail.com>
Corrected the structure of the explanation regarding how samples are organized in the chunks directory and the handling of deletion records.
Signed-off-by: Raul Leite <sp4wn.root@gmail.com>
Partially fixes https://github.com/prometheus/prometheus/issues/17416 by
renaming all CT* names to ST* in the whole codebase except RW2 (this is
done in separate
[PR](https://github.com/prometheus/prometheus/pull/17411)) and
PrometheusProto exposition proto.
```
CreatedTimestamp -> StartTimestamp
CreatedTimeStamp -> StartTimestamp
created_timestamp -> start_timestamp
CT -> ST
ct -> st
```
Signed-off-by: bwplotka <bwplotka@gmail.com>
* add feature flag for remote write v2
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* change from number to protobuf_message
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix test
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix name
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* run make cli-documentation
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* fix help
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
* run make cli-documentation
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
---------
Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
Fixes: #16572
Mark as stable means that breaking changes are only allowed together with
major version release of Prometheus.
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
As a follow-up to #17344, add two configuration parameters for controlling label
name translation, both defaulting to on for backwards compatibility (currently
these behaviours are hardcoded as enabled):
* otlp.label_name_underscore_sanitization => Prefix label names starting with a
single underscore with key_ when translating OTel attribute names
* otlp.label_name_preserve_multiple_underscores => Keep multiple consecutive
underscores in label names when translating OTel attribute names
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
This adds:
* A `ScrapePoolConfig()` method to the scrape manager that allows getting
the scrape config for a given pool.
* An API endpoint at `/api/v1/targets/relabel_steps` that takes a pool name
and a label set of a target and returns a detailed list of applied
relabeling rules and their output for each step.
* A "show relabeling" link/button for each target on the discovery page
that shows the detailed flow of all relabeling rules (based on the API
response) for that target.
Note that this changes the JSON encoding of the relabeling rule config
struct to output the original snake_case (instead of camelCase) field names,
and before merging, we need to be sure that's ok :) See my comment about
that at https://github.com/prometheus/prometheus/pull/15383#issuecomment-3405591487
Fixes https://github.com/prometheus/prometheus/issues/17283
Signed-off-by: Julius Volz <julius.volz@gmail.com>
The detailed plan for this is laid out in
https://github.com/prometheus/prometheus/issues/16572 .
This commit adds a global and local scrape config option
`scrape_native_histograms`, which has to be set to true to ingest
native histograms.
To ease the transition, the feature flag is changed to simply set the
default of `scrape_native_histograms` to true.
Further implications:
- The default scrape protocols now depend on the
`scrape_native_histograms` setting.
- Everywhere else, histograms are now "on by default".
Documentation beyond the one for the feature flag and the scrape
config are deliberately left out. See
https://github.com/prometheus/prometheus/pull/17232 for that.
Signed-off-by: beorn7 <beorn@grafana.com>
Adds a `config` label (similar to `prometheus_sd_discovered_targets`) to
refresh metrics to help identify the source of refresh issues or
performance stats. In particular for HTTP SD, it can be common to have
multiple disparate HTTP SD sources that should be identified and not
lumped together. For example if one HTTP SD service has failures, that
should be evident in its own time series seperate from other HTTP SD
sources.
`config` seemed more appropriate than `endpoint` as a general standard
for `prometheus_sd` metrics.
Docs were also updated for HTTP SD to point at the new refresh metrics
rather than the older metrics.
Signed-off-by: Will Bollock <wbollock@linode.com>