Adds a CheckpointFromInMemorySeries option to agent.Options that enables a faster checkpoint implementation, which skips re-reading segments and uses in-memory data instead.
* feat: impl agent-specific checkpoint dir
* feat: impl ActiveSeries interface
* feat: use new checkpoint impl
* feat: hide new checkpoint impl behind a feature flag
* feat: add benchmark
* feat: add benchstat case
* feat: use feature flag in bench
* feat: use same labels for persisted state and append
* feat: set WAL segment size
* feat: add checkpoint size metric and bump series size
* feat: wal replay test
* feat: expose new checkpoint opts in cmd flags
* feat: update cli doc
* add ActiveSeries and DeletedSeries doc
Signed-off-by: x1unix <9203548+x1unix@users.noreply.github.com>
Signed-off-by: Denys Sedchenko <9203548+x1unix@users.noreply.github.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
* test(cmd/prometheus): add TestFeatureFlagsDocumented and fix help text
Add TestFeatureFlagsDocumented to ensure that the --enable-feature help text
and docs/feature_flags.md list the same set of flags.
The help text was out of sync with the documentation:
- Flags present in docs but missing from help text: `auto-reload-config`,
`metadata-wal-records`, `otlp-native-delta-ingestion`,
`promql-delayed-name-removal`, `type-and-unit-labels`. Added them.
- Flags present in help text but missing from docs: `auto-gomaxprocs`,
`expand-external-labels`. Removed them.
The help text is now sorted for better readability and kept in sync
with the documentation.
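
The kind of check described above boils down to a two-way set difference. The following is a minimal sketch of that idea, not the actual TestFeatureFlagsDocumented code: the helper name `missingFrom` is hypothetical, and the inputs are abbreviated placeholders (`shared-flag` is invented for illustration), whereas the real test derives both lists from the help text and docs/feature_flags.md.

```go
package main

import (
	"fmt"
	"sort"
)

// missingFrom returns the entries of a that are absent from b, sorted.
// Hypothetical helper for illustration only.
func missingFrom(a, b []string) []string {
	seen := make(map[string]struct{}, len(b))
	for _, s := range b {
		seen[s] = struct{}{}
	}
	var out []string
	for _, s := range a {
		if _, ok := seen[s]; !ok {
			out = append(out, s)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Placeholder inputs; "shared-flag" is an invented name.
	helpFlags := []string{"auto-gomaxprocs", "expand-external-labels", "shared-flag"}
	docFlags := []string{"auto-reload-config", "metadata-wal-records", "shared-flag"}

	// Both differences must be empty for the two sources to be in sync.
	fmt.Println("in docs, missing from help:", missingFrom(docFlags, helpFlags))
	fmt.Println("in help, missing from docs:", missingFrom(helpFlags, docFlags))
}
```

Checking both directions matters: a one-way comparison would catch undocumented flags but not documented flags that the help text omits.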
Also, the parsing of an empty `--enable-feature` was changed to
print `msg="Unknown option for --enable-feature" option=""` instead of nothing.
Signed-off-by: Ayoub Mrini <ayoubmrini424@gmail.com>
* main.go remove default for --enable-feature to avoid unwanted
Signed-off-by: Ayoub Mrini <ayoubmrini424@gmail.com>
---------
Signed-off-by: Ayoub Mrini <ayoubmrini424@gmail.com>
This adds a /api/v1/status/self_metrics endpoint that allows the frontend to
fetch metrics about the server itself, making it easier to construct frontend
pages that show the current server state. This is needed because fetching
metrics from its own /metrics endpoint would be both hard to parse and also
require CORS permissions on that endpoint (for cases where the frontend
dashboard is not the same origin, at least).
Signed-off-by: Julius Volz <julius.volz@gmail.com>
The filter field was documented as targeting the Catalog API but since
PR #17349 it was also passed to the Health API. This broke existing
configs using Catalog-only fields like ServiceTags, which the Health API
rejects (it uses Service.Tags instead).
Introduce a separate health_filter field that is passed exclusively to
the Health API, while filter remains catalog-only. Update the docs to
explain the two-phase discovery (Catalog for service listing, Health for
instances) and the field name differences between the two APIs.
Fixes #18479
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Building off config-specific Prometheus refresh metrics from an earlier
PR (https://github.com/prometheus/prometheus/pull/17138), this deletes
refresh metrics like `prometheus_sd_refresh_duration_seconds` and
`prometheus_sd_refresh_failures_total` when the underlying scrape job
configuration is removed on reload. This reduces unneeded cardinality
from scrape job specific metrics while still preserving metrics that
indicate overall health of a service discovery engine.
For example,
`prometheus_sd_refresh_failures_total{config="linode-servers",mechanism="linode"} 1`
will no longer be exported by Prometheus when the `linode-servers`
scrape job for the Linode service provider is removed. The generic,
service-discovery-specific `prometheus_sd_linode_failures_total` metric
will persist, however.
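
As a rough illustration of the behaviour described above, the sketch below models per-config refresh series and per-mechanism series as plain maps and prunes the former on reload. It is a simplified stand-in under stated assumptions, not the Prometheus implementation: `sketchMetrics`, `refreshKey`, `recordFailure`, and `pruneRemoved` are hypothetical names, and the real code works against a metrics registry rather than maps.

```go
package main

import "fmt"

// refreshKey identifies one per-config refresh series, mirroring the
// config and mechanism labels from the example above.
type refreshKey struct {
	config    string
	mechanism string
}

// sketchMetrics is a hypothetical stand-in for a metrics registry.
type sketchMetrics struct {
	refreshFailures map[refreshKey]float64 // one series per scrape job config
	totalFailures   map[string]float64     // one series per mechanism
}

func newSketchMetrics() *sketchMetrics {
	return &sketchMetrics{
		refreshFailures: map[refreshKey]float64{},
		totalFailures:   map[string]float64{},
	}
}

// recordFailure increments both the per-config and the per-mechanism counter.
func (m *sketchMetrics) recordFailure(config, mechanism string) {
	m.refreshFailures[refreshKey{config, mechanism}]++
	m.totalFailures[mechanism]++
}

// pruneRemoved drops per-config series whose config is no longer active.
// Per-mechanism series are deliberately left untouched.
func (m *sketchMetrics) pruneRemoved(active map[string]bool) {
	for k := range m.refreshFailures {
		if !active[k.config] {
			delete(m.refreshFailures, k)
		}
	}
}

func main() {
	m := newSketchMetrics()
	m.recordFailure("linode-servers", "linode")

	// Simulate a reload in which the linode-servers job was removed.
	m.pruneRemoved(map[string]bool{})

	fmt.Println("per-config series left:", len(m.refreshFailures))
	fmt.Println("per-mechanism failures:", m.totalFailures["linode"])
}
```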
* fix: add targetsMtx lock for targets access
* test: validate refresh/discover metrics are gone
* ref: combine sdMetrics and refreshMetrics
Good idea from @bboreham to combine sdMetrics and refreshMetrics!
They're always passed around together and don't have much of a
reason not to be combined. mechanismMetrics makes it clear what kind of
metrics this is used for (service discovery mechanisms).
---------
Signed-off-by: Will Bollock <wbollock@linode.com>
The `--header` flag was already supported by `promtool query range` but was
missing from `promtool query instant`. This adds the same flag so users can
pass extra HTTP headers (e.g. `X-Scope-OrgID` for multi-tenant setups)
without needing to create an `--http.config.file`.
```
[ENHANCEMENT] promtool: Add `--header` flag to `query instant` command, matching existing `query range` behaviour.
```
Signed-off-by: Hoa <hoameomu@gmail.com>
Add Outscale VM service discovery using osc-sdk-go, including optional secret_key_file support, metrics, docs, and configuration examples. Document the default region (eu-west-2).
Signed-off-by: Aurelien Duboc <aurelienduboc96@gmail.com>
This adds a 'databases' role to digitalocean_sd_config to discover DigitalOcean
Managed Database clusters. It follows the multi-role design pattern by
introducing a 'role' parameter (default: 'droplets').
Includes:
- Support for Managed Databases API.
- Pagination handling for Databases API.
- Comprehensive meta labels for database targets.
- Updated documentation and tests.
Signed-off-by: Vladimir Skesov <skesov@gmail.com>
* promql: add test for info() with data label matcher when info series goes stale
When info() is called with a data label matcher that doesn't match
the empty string (e.g. {data=~".+"}), samples at timestamps where
no info series is available should be dropped rather than falling
back to the original un-enriched series.
This case was missing from the test suite.
* docs: document info() behavior when info series is unavailable
Document that when no matching info series exists at a timestamp,
data label matchers that don't match the empty string cause the
sample to be dropped, while empty-matching matchers or no selector
return the series unenriched.
* promql: add test cases for info() fallback when info series goes stale
Add test cases for info(metric, {data=~".*"}) and info(metric) to
complement the existing info(metric, {data=~".+"}) test case, making
the behavioral contrast explicit: empty-matching matchers and no
selector fall back to the unenriched series, while non-empty-matching
matchers drop the sample.
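
The rule exercised by these tests can be summarised as "fall back only if every data label matcher accepts the empty string". The sketch below models just that decision for regex matchers. It is an illustration, not the PromQL engine's code: `matchesEmpty` and `keepUnenriched` are hypothetical helpers, and the anchoring with `^(?:...)$` is a simplification of how label regexes are fully anchored.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesEmpty reports whether a fully anchored regex matcher accepts "".
func matchesEmpty(pattern string) bool {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString("")
}

// keepUnenriched decides what happens to a sample at a timestamp where no
// info series is available. patterns holds the regexes of the data label
// matchers; an empty slice models info(metric) without a selector.
// true  -> return the original, unenriched series
// false -> drop the sample
func keepUnenriched(patterns []string) bool {
	for _, p := range patterns {
		if !matchesEmpty(p) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(`{data=~".+"} keeps sample:`, keepUnenriched([]string{".+"}))
	fmt.Println(`{data=~".*"} keeps sample:`, keepUnenriched([]string{".*"}))
	fmt.Println(`no selector   keeps sample:`, keepUnenriched(nil))
}
```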
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Update functions.md file, change 'standard variance' to 'variance' in function descriptions
Signed-off-by: Pavel Rysnik <126406830+sakuuj@users.noreply.github.com>
Updated the description of the stdvar operator to clarify that it calculates variance instead of standard variance.
Signed-off-by: Pavel Rysnik <126406830+sakuuj@users.noreply.github.com>
The docs for these functions previously described them as acting on
"each histogram sample," which was ambiguous. Add "native" to clarify
they only operate on native histogram samples, not classic histograms.
This distinction was originally documented but lost when the
experimental feature warnings were removed.
Signed-off-by: Jeremy Rickards <jeremy.rickards@sap.com>
[hcloud.Server.Datacenter] is deprecated and will be removed after 1 July 2026. Use [hcloud.Server.Location] instead.
See https://docs.hetzner.cloud/changelog#2025-12-16-phasing-out-datacenters
Changes to Hetzner meta labels:
- `__meta_hetzner_datacenter`
  - is deprecated for the role `robot` but kept for backward compatibility. Using `__meta_hetzner_robot_datacenter` is preferred.
  - is deprecated for the role `hcloud` and will stop working after 1 July 2026.
- `__meta_hetzner_hcloud_datacenter_location`
  - is deprecated but kept for backward compatibility; the same data is available in the [`hcloud.Server.Location`](https://pkg.go.dev/github.com/hetznercloud/hcloud-go/v2/hcloud#Server) struct.
  - using `__meta_hetzner_hcloud_location` is preferred.
- `__meta_hetzner_hcloud_datacenter_location_network_zone`
  - is deprecated but kept for backward compatibility; the same data is available in the [`hcloud.Server.Location`](https://pkg.go.dev/github.com/hetznercloud/hcloud-go/v2/hcloud#Server) struct.
  - using `__meta_hetzner_hcloud_location_network_zone` is preferred.
- `__meta_hetzner_hcloud_location`
  - replacement label for `__meta_hetzner_hcloud_datacenter_location`
- `__meta_hetzner_hcloud_location_network_zone`
  - replacement label for `__meta_hetzner_hcloud_datacenter_location_network_zone`
- `__meta_hetzner_robot_datacenter`
  - replacement label for `__meta_hetzner_datacenter` with the role `robot`.
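
The label changes above can be condensed into a small lookup table, which may be handy when auditing relabel configurations. This is only a sketch: `deprecatedLabel` and `replacementFor` are hypothetical names that are not part of Prometheus, and assigning the `hcloud_*` labels to the `hcloud` role is an assumption based on their names.

```go
package main

import "fmt"

// deprecatedLabel pairs a deprecated meta label with the SD role it applies to.
type deprecatedLabel struct {
	label string
	role  string
}

// replacements lists only the replacements named in the list above.
// __meta_hetzner_datacenter has no named replacement for the hcloud role.
var replacements = map[deprecatedLabel]string{
	{"__meta_hetzner_datacenter", "robot"}:                                "__meta_hetzner_robot_datacenter",
	{"__meta_hetzner_hcloud_datacenter_location", "hcloud"}:              "__meta_hetzner_hcloud_location",
	{"__meta_hetzner_hcloud_datacenter_location_network_zone", "hcloud"}: "__meta_hetzner_hcloud_location_network_zone",
}

// replacementFor returns the preferred label for a deprecated one, if any.
func replacementFor(label, role string) (string, bool) {
	r, ok := replacements[deprecatedLabel{label, role}]
	return r, ok
}

func main() {
	for _, q := range []deprecatedLabel{
		{"__meta_hetzner_datacenter", "robot"},
		{"__meta_hetzner_datacenter", "hcloud"},
		{"__meta_hetzner_hcloud_datacenter_location", "hcloud"},
	} {
		if r, ok := replacementFor(q.label, q.role); ok {
			fmt.Printf("%s (%s): use %s\n", q.label, q.role, r)
		} else {
			fmt.Printf("%s (%s): no replacement listed\n", q.label, q.role)
		}
	}
}
```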
Signed-off-by: Jonas Lammler <jonas.lammler@hetzner-cloud.de>
Extended Kubernetes SD to support the following pod-based labels:
* `__meta_kubernetes_pod_deployment_name`
* `__meta_kubernetes_pod_cronjob_name`
* `__meta_kubernetes_pod_job_name`
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>