---
title: Querying examples
nav_title: Examples
sort_rank: 4
---

# Query examples

## Simple time series selection

Return all time series with the metric `http_requests_total`:

    http_requests_total

Return all time series with the metric `http_requests_total` and the given
`job` and `handler` labels:

    http_requests_total{job="apiserver", handler="/api/comments"}

Return a whole range of time (in this case 5 minutes up to the query time)
for the same vector, making it a [range vector](../basics/#range-vector-selectors):

    http_requests_total{job="apiserver", handler="/api/comments"}[5m]

Note that an expression resulting in a range vector cannot be graphed directly,
but can only be viewed in the tabular ("Console") view of the expression browser.

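To graph such data, one option (a sketch using `rate()`, which is covered in more detail below) is to apply a function that turns the range vector back into an instant vector:

    # per-second request rate over the 5-minute window, graphable as an instant vector
    rate(http_requests_total{job="apiserver", handler="/api/comments"}[5m])
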
Using regular expressions, you could select time series only for jobs whose
names match a certain pattern, in this case, all jobs that end with `server`:

    http_requests_total{job=~".*server"}

All regular expressions in Prometheus use [RE2
syntax](https://github.com/google/re2/wiki/Syntax).

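Note that these matchers are fully anchored. As a small sketch of what that means in practice, matching jobs whose names merely *contain* `server` anywhere requires a leading and a trailing `.*`:

    # matchers are fully anchored, so ".*" is needed on both ends to match "server" anywhere in the job name
    http_requests_total{job=~".*server.*"}
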
To select all HTTP status codes except 4xx ones, you could run:

    http_requests_total{status!~"4.."}

## Subquery

Return the 5-minute rate of the `http_requests_total` metric for the past 30 minutes, with a resolution of 1 minute:

    rate(http_requests_total[5m])[30m:1m]

This is an example of a nested subquery. The inner subquery feeding `deriv` specifies a 5-second resolution, while the outer subquery over the `deriv` result (`[10m:]`) uses the default resolution. Note that using subqueries unnecessarily is unwise.

    max_over_time(deriv(rate(distance_covered_total[5s])[30s:5s])[10m:])

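As a contrasting sketch of that last point (using `process_resident_memory_bytes`, a gauge that is not part of the examples above), a subquery over a plain selector is usually unnecessary, because the raw samples can be aggregated directly:

    # unnecessary subquery: re-evaluates the selector at 5-minute steps
    avg_over_time(process_resident_memory_bytes[1h:5m])

    # simpler: aggregate the raw samples over the same window directly
    avg_over_time(process_resident_memory_bytes[1h])
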
## Using functions, operators, etc.

Return the per-second rate for all time series with the `http_requests_total`
metric name, as measured over the last 5 minutes:

    rate(http_requests_total[5m])

Assuming that the `http_requests_total` time series all have the labels `job`
(fanout by job name) and `instance` (fanout by instance of the job), we might
want to sum over the rate of all instances, so we get fewer output time series,
but still preserve the `job` dimension:

    sum by (job) (
      rate(http_requests_total[5m])
    )

If we have two different metrics with the same dimensional labels, we can apply
binary operators to them and elements on both sides with the same label set
will get matched and propagated to the output. For example, this expression
returns the unused memory in MiB for every instance (on a fictional cluster
scheduler exposing these metrics about the instances it runs):

    (instance_memory_limit_bytes - instance_memory_usage_bytes) / 1024 / 1024

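If one side carried an additional label, the vector matching could be adjusted explicitly. As a hypothetical sketch (the `mode` label below is not part of the metrics shown above), `ignoring` excludes a label from the matching:

    # hypothetical: exclude an extra "mode" label on the usage metric when matching
    (instance_memory_limit_bytes - ignoring(mode) instance_memory_usage_bytes) / 1024 / 1024
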
The original memory expression, summed per application and process type, could be written like this:

    sum by (app, proc) (
      instance_memory_limit_bytes - instance_memory_usage_bytes
    ) / 1024 / 1024

If the same fictional cluster scheduler exposed CPU usage metrics like the
following for every instance:

    instance_cpu_time_ns{app="lion", proc="web", rev="34d0f99", env="prod", job="cluster-manager"}
    instance_cpu_time_ns{app="elephant", proc="worker", rev="34d0f99", env="prod", job="cluster-manager"}
    instance_cpu_time_ns{app="turtle", proc="api", rev="4d3a513", env="prod", job="cluster-manager"}
    instance_cpu_time_ns{app="fox", proc="widget", rev="4d3a513", env="prod", job="cluster-manager"}
    ...

...we could get the top 3 CPU users grouped by application (`app`) and process
type (`proc`) like this:

    topk(3, sum by (app, proc) (rate(instance_cpu_time_ns[5m])))

Assuming this metric contains one time series per running instance, you could
count the number of running instances per application like this:

    count by (app) (instance_cpu_time_ns)