Previously the AWS ECS SD only discovered instances that used the
`awsvpc` network mode, which attaches a dedicated Elastic Network
Interface (ENI). This change adds logic to also discover instances
that use the `host` and `bridge` networking modes, where the IP
address is that of the EC2 instance hosting the container. It also
exposes a number of additional labels relating to the EC2 instance
when the launch type is `EC2`.
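A minimal sketch of the address selection this implies (a hypothetical helper; the real discovery code reads these values from the ECS and EC2 APIs):
```go
package sketch

import "fmt"

// targetAddress illustrates how the scrape address could be chosen
// depending on the task's network mode; names and layout are hypothetical.
func targetAddress(networkMode, eniPrivateIP, instancePrivateIP string, port int) string {
	switch networkMode {
	case "awsvpc":
		// Dedicated ENI: the task has its own private IP address.
		return fmt.Sprintf("%s:%d", eniPrivateIP, port)
	case "host", "bridge":
		// The container is reachable via the hosting EC2 instance's IP.
		return fmt.Sprintf("%s:%d", instancePrivateIP, port)
	default:
		return ""
	}
}
```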
Signed-off-by: matt-gp <small_minority@hotmail.com>
This adds the following native histograms (with a few classic buckets for backwards compatibility), while keeping the corresponding summaries (same name, just without `_histogram`); a registration sketch follows the list:
- `prometheus_sd_refresh_duration_histogram_seconds`
- `prometheus_rule_evaluation_duration_histogram_seconds`
- `prometheus_rule_group_duration_histogram_seconds`
- `prometheus_target_sync_length_histogram_seconds`
- `prometheus_target_interval_length_histogram_seconds`
- `prometheus_engine_query_duration_histogram_seconds`
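For illustration, one of these could be registered with client_golang roughly like this (bucket values and help text are placeholders, not taken from the actual change):
```go
package sketch

import "github.com/prometheus/client_golang/prometheus"

// refreshDuration sketches one of the metrics above: a native histogram
// plus a few classic buckets kept for backwards compatibility.
var refreshDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name: "prometheus_sd_refresh_duration_histogram_seconds",
	Help: "Duration of a service discovery refresh.",
	// A few classic buckets for backwards compatibility (illustrative values).
	Buckets: []float64{0.005, 0.05, 0.5, 5},
	// Enable the native histogram representation.
	NativeHistogramBucketFactor:    1.1,
	NativeHistogramMaxBucketNumber: 100,
})

func init() {
	prometheus.MustRegister(refreshDuration)
}
```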
Signed-off-by: Harsh <harshmastic@gmail.com>
Signed-off-by: harsh kumar <135993950+hxrshxz@users.noreply.github.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
* fix: aws discovery test fix
Fixes a build failure introduced by the merge of https://github.com/prometheus/prometheus/pull/17138;
that PR didn't take into account another PR that had been merged in the meantime:
```
discovery/aws/aws.go:218:54: too many arguments in call to NewEC2Discovery
have (*EC2SDConfig, *slog.Logger, *ec2Metrics)
want (*EC2SDConfig, discovery.DiscovererOptions)
discovery/aws/aws.go:222:66: too many arguments in call to NewLightsailDiscovery
have (*LightsailSDConfig, *slog.Logger, *lightsailMetrics)
want (*LightsailSDConfig, discovery.DiscovererOptions)
```
Signed-off-by: Will Bollock <wbollock@linode.com>
* fix: align ecs style
ECS is a new service discovery mechanism that was added after this PR was merged: https://github.com/prometheus/prometheus/pull/17138
This aligns it with the style of passing a single "opts" argument, as almost all the other
service discovery engines now do.
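A sketch of the aligned constructor shape implied by the build error above (type and field names here are assumptions, not the actual tree):
```go
package sketch

import "log/slog"

// DiscovererOptions stands in for the upstream discovery.DiscovererOptions;
// the exact fields are assumptions for illustration.
type DiscovererOptions struct {
	Logger  *slog.Logger
	Metrics any // discoverer-specific metrics, registered elsewhere
}

type ECSSDConfig struct{}

type ECSDiscovery struct {
	cfg    *ECSSDConfig
	logger *slog.Logger
}

// Before: NewECSDiscovery(conf *ECSSDConfig, logger *slog.Logger, m *ecsMetrics)
// After: a single opts argument, matching the other SD engines.
func NewECSDiscovery(conf *ECSSDConfig, opts DiscovererOptions) (*ECSDiscovery, error) {
	return &ECSDiscovery{cfg: conf, logger: opts.Logger}, nil
}
```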
Signed-off-by: Will Bollock <wbollock@linode.com>
---------
Signed-off-by: Will Bollock <wbollock@linode.com>
Loading the local region from the Instance Metadata Service (IMDS) broke in v3.7. This adds the IMDS call back in order to load the local region when no other method has set it.
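A sketch of the shape of that fallback using the SDK v2 IMDS client (illustrative, not the exact upstream code):
```go
package sketch

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/ec2/imds"
)

// resolveRegion loads the default config and, if nothing else set a
// region, falls back to asking the Instance Metadata Service.
func resolveRegion(ctx context.Context) (string, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return "", err
	}
	if cfg.Region != "" {
		return cfg.Region, nil
	}
	out, err := imds.NewFromConfig(cfg).GetRegion(ctx, &imds.GetRegionInput{})
	if err != nil {
		return "", err
	}
	return out.Region, nil
}
```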
Fixes #17375
Signed-off-by: Joe Adams <github@joeadams.io>
After the upgrade to AWS SDK v2, the EC2 and Lightsail service discovery
stopped working when using the default AWS credential chain (environment
variables, IAM roles, EC2 instance metadata, etc.).
The issue was that the code unconditionally created a StaticCredentialsProvider
with empty credentials when access_key and secret_key were not configured. In
AWS SDK v2, this causes a "static credentials are empty" error and prevents
the SDK from falling back to its default credential chain.
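A sketch of the intended behaviour: only install a static credentials provider when keys are actually configured, otherwise let the SDK's default chain resolve them (option names are from aws-sdk-go-v2; the surrounding function is illustrative):
```go
package sketch

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/credentials"
)

// loadAWSConfig only wires up a StaticCredentialsProvider when keys are set;
// with empty keys the SDK would fail with "static credentials are empty"
// instead of falling back to the default chain (env vars, IAM role, IMDS).
func loadAWSConfig(ctx context.Context, accessKey, secretKey string) (aws.Config, error) {
	var opts []func(*config.LoadOptions) error
	if accessKey != "" && secretKey != "" {
		opts = append(opts, config.WithCredentialsProvider(
			credentials.NewStaticCredentialsProvider(accessKey, secretKey, ""),
		))
	}
	return config.LoadDefaultConfig(ctx, opts...)
}
```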
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Adds a `config` label (similar to `prometheus_sd_discovered_targets`) to
the refresh metrics to help identify the source of refresh issues or
performance statistics. In particular for HTTP SD, it is common to have
multiple disparate HTTP SD sources that should be identified rather than
lumped together. For example, if one HTTP SD service has failures, that
should be evident in its own time series, separate from other HTTP SD
sources.
`config` seemed more appropriate than `endpoint` as a general standard
for `prometheus_sd` metrics.
Docs were also updated for HTTP SD to point at the new refresh metrics
rather than the older metrics.
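For illustration, a refresh metric carrying the new label could be declared along these lines (a client_golang sketch; metric name and help text are illustrative):
```go
package sketch

import "github.com/prometheus/client_golang/prometheus"

// refreshFailures carries a "config" label so failures from one HTTP SD
// source show up as their own series instead of being lumped together.
var refreshFailures = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "prometheus_sd_refresh_failures_total",
		Help: "Number of failed service discovery refreshes.",
	},
	[]string{"config"},
)

// recordFailure increments the counter for the given SD config name.
func recordFailure(cfgName string) {
	refreshFailures.WithLabelValues(cfgName).Inc()
}
```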
Signed-off-by: Will Bollock <wbollock@linode.com>
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.
This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency
about what was done automatically.
Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.
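Typical rewrites from that analyzer look roughly like this (an illustrative before/after, not taken from the actual diff):
```go
package sketch

// Before modernize: a classic counting loop.
func sumOld(xs []int) int {
	total := 0
	for i := 0; i < len(xs); i++ {
		total += xs[i]
	}
	return total
}

// After modernize: range-over-int (Go 1.22+), one of the rewrites the
// analyzer suggests.
func sumNew(xs []int) int {
	total := 0
	for i := range len(xs) {
		total += xs[i]
	}
	return total
}
```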
Signed-off-by: beorn7 <beorn@grafana.com>
AWS SDK v1 reaches end of life soon, so migrate to the v2 SDK. Credential loading should now work more consistently with other projects that use the SDK, loading credentials from the appropriate locations, including environment variables. This affects the EC2 and Lightsail service discovery features.
Signed-off-by: Joe Adams <github@joeadams.io>
* close sync ch after sender() is stopped
* break if chan is closed
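A minimal sketch of the pattern these two bullets describe (names are illustrative, not the actual manager code):
```go
package sketch

// run closes syncCh only after the sender goroutine has stopped, and the
// receiver breaks out once the channel is closed instead of blocking forever.
func run(items []string) {
	syncCh := make(chan string)
	senderDone := make(chan struct{})

	go func() { // sender()
		defer close(senderDone)
		for _, it := range items {
			syncCh <- it
		}
	}()

	go func() {
		<-senderDone // close the sync channel only after sender() is stopped
		close(syncCh)
	}()

	for {
		v, ok := <-syncCh
		if !ok { // break if the channel is closed
			break
		}
		_ = v
	}
}
```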
Signed-off-by: liyandi <littlepangdi@163.com>
Co-authored-by: liyandi <liyandi@xiaomi.com>
If we call ApplyConfig() at the same time the manager is being stopped, we might end up hanging forever.
This is because ApplyConfig() will try to cancel obsolete providers and wait until they are cancelled.
It does so by setting a done() function that calls Done() on a sync.WaitGroup:
```
if len(prov.newSubs) == 0 {
	wg.Add(1)
	prov.done = func() {
		wg.Done()
	}
}
```
then calling prov.cancel() and finally waiting, via a blocking wg.Wait() call, until all providers
have run their done() function.
For each provider there is a goroutine created by calling Manager.startProvider(*Provider):
```
func (m *Manager) startProvider(ctx context.Context, p *Provider) {
	m.logger.Debug("Starting provider", "provider", p.name, "subs", fmt.Sprintf("%v", p.subs))
	ctx, cancel := context.WithCancel(ctx)
	updates := make(chan []*targetgroup.Group)
	p.mu.Lock()
	p.cancel = cancel
	p.mu.Unlock()
	go p.d.Run(ctx, updates)
	go m.updater(ctx, p, updates)
}
```
It creates a context that can be cancelled, and that cancel function becomes prov.cancel; this is what ApplyConfig() will call.
If we look at the body of the updater() method:
```
func (m *Manager) updater(ctx context.Context, p *Provider, updates chan []*targetgroup.Group) {
	// Ensure targets from this provider are cleaned up.
	defer m.cleaner(p)
	for {
		select {
		case <-ctx.Done():
			return
		[...]
```
we can see that it will exit if that context is cancelled, which will trigger a call to Manager.cleaner().
That cleaner() is where done() is called.
So: ApplyConfig() -> calls cancel() -> causes cleaner() to be executed -> calls done().
cancel() is also called from the cancelDiscoverers() method, which is called by Manager.Run() when the Manager is stopping:
```
func (m *Manager) Run() error {
	go m.sender()
	<-m.ctx.Done()
	m.cancelDiscoverers()
	return m.ctx.Err()
}
```
The problem is that if we call ApplyConfig() and stop the manager at the same time, we might end up with the following sequence:
- We call Manager.ApplyConfig()
- We stop the Manager
- Manager.cancelDiscoverers() is called
- Provider.cancel() is called for every Provider
- cancel() causes provider context to be cancelled which terminates updater() for given Provider
- cancelling context causes cleaner() method to be called for given Provider
- cleaner() calls done() and exits
- The Provider is considered stopped at this point; there is no goroutine running that will call done() anymore
- ApplyConfig() iterates over providers and decides that this one is obsolete and must be stopped
- It sets a custom done() function body with a WaitGroup.Done() call in it
- Then ApplyConfig waits until all Providers run done()
- But they are all stopped and no done() will be run
- We wait forever
This only happens if cancelDiscoverers() runs before ApplyConfig(): if ApplyConfig() runs first, done() will be called;
if cancelDiscoverers() runs first, it stops the updater() instances and so done() won't be called anymore.
Part of the problem is that there is no distinction between running and stopped providers. There is a Provider.IsStarted() method
that returns a bool based on the value of the cancel function, but ApplyConfig() doesn't check it.
The second problem is that although there is a mutex on a Provider, it isn't used in much of the code, so two goroutines can try to read and/or write
provider.cancel and/or provider.done at the same time, making it all more likely to race.
The easiest way to fix it is to check, inside ApplyConfig(), whether the provider is started, so we don't try to stop a provider that's already stopped.
For that we need to mark it as stopped after cancel() is called, by setting cancel to nil.
This also needs better lock usage to avoid different parts of the code trying to set cancel and done at the same time.
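A sketch of what that check could look like inside ApplyConfig(), building on the excerpt above (an illustration of the approach, not the exact upstream diff):
```
// Inside ApplyConfig(), while handling an obsolete provider:
prov.mu.Lock()
if prov.cancel == nil {
	// Already stopped by cancelDiscoverers(); nothing will ever call
	// done(), so don't add it to the WaitGroup.
	prov.mu.Unlock()
	continue
}
if len(prov.newSubs) == 0 {
	wg.Add(1)
	prov.done = func() {
		wg.Done()
	}
}
prov.mu.Unlock()
```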
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
Doing a config reload that needs to stop some providers while also sending SIGTERM to Prometheus at the same time can sometimes hang:
```
1: sync.WaitGroup.Wait [83 minutes] [Created by run.(*Group).Run in goroutine 1 @ group.go:37]
sync sema.go:110 runtime_SemacquireWaitGroup(*uint32(#166))
sync waitgroup.go:118 (*WaitGroup).Wait(*WaitGroup(#23))
discovery manager.go:276 (*Manager).ApplyConfig(#23, #167)
main main.go:964 main.func5(#120)
main main.go:1505 reloadConfig({#183, 0x1b}, 1, #40, #43, #50, {#31, 0xa, 0})
main main.go:1182 main.func22()
run group.go:38 (*Group).Run.func1(*Group(#26), #51)
```
Add a test for it.
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>