mirror of
https://github.com/hashicorp/vault.git
synced 2025-08-18 21:21:06 +02:00
* Document enabling config * Fix nav data JSON after disabling over-zealous prettifier * Address review feedback * Add warning about reloading config during overload * Bad metrics links * Another bad link * Add upgrade note about deprecation --------- Co-authored-by: Mike Palmiotto <mike.palmiotto@hashicorp.com>
82 lines
3.3 KiB
Plaintext
---
layout: docs
page_title: 'Request Limiter'
description: >-
  Vault provides an adaptive concurrency limiter to protect the Vault server
  from overload.
---

# Request Limiter

@include 'alerts/enterprise-only.mdx'

<Warning title="Beta (Deprecated)">

The request limiter was released in Vault 1.16 as a Beta
feature. During Beta evaluation, we found that an alternative approach better met
the needs of our users. This feature will be removed from Vault in a future
release. It is replaced by [adaptive overload protection](/vault/docs/concepts/adaptive-overload-protection).

</Warning>

This document contains conceptual information about the **Request Limiter** and
its user-facing effects.

## Preventing overload

The Request Limiter aims to prevent overload by proactively detecting latency
deviation from a baseline and adapting the number of allowed in-flight requests.

This is done in two phases at the beginning of an HTTP request:

1. Consult the current number of allowed in-flight requests. If the new request
   would exceed this limit, immediately reject it, indicating that the client
   should retry later.

2. If the request is allowed, begin a measurement of its latency, allowing the
   Request Limiter to calculate a new limit.

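As a rough illustration of the two phases above, the following Go sketch admits
or rejects a request against the current limit, then times the admitted request
and adjusts the limit from the observed latency. The `Limiter` type, its fields,
the `ErrOverloaded` error, and the halve-or-increment adjustment rule are all
assumptions made for this example. They are not Vault's implementation or its
actual limit algorithm.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// ErrOverloaded is a hypothetical error telling the caller to retry later.
var ErrOverloaded = errors.New("server overloaded, retry later")

// Limiter is an illustrative adaptive concurrency limiter.
// It is NOT Vault's implementation; the adjustment rule is an assumption.
type Limiter struct {
	mu       sync.Mutex
	limit    int           // currently allowed in-flight requests
	inFlight int           // requests admitted but not yet finished
	min, max int           // bounds for limit
	baseline time.Duration // assumed "healthy" latency
}

func NewLimiter(initial, min, max int, baseline time.Duration) *Limiter {
	return &Limiter{limit: initial, min: min, max: max, baseline: baseline}
}

// Acquire is phase 1: reject immediately if the request would exceed the
// limit. On success it starts the phase 2 latency measurement and returns a
// done function that the caller invokes when the request finishes.
func (l *Limiter) Acquire() (done func(), err error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight >= l.limit {
		return nil, ErrOverloaded
	}
	l.inFlight++
	start := time.Now()
	return func() { l.observe(time.Since(start)) }, nil
}

// observe records a finished request and recalculates the limit:
// halve it when latency deviates far from the baseline, otherwise grow by one.
func (l *Limiter) observe(latency time.Duration) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.inFlight--
	switch {
	case latency > 2*l.baseline:
		l.limit /= 2
		if l.limit < l.min {
			l.limit = l.min
		}
	case l.limit < l.max:
		l.limit++
	}
}

// Limit reports the current number of allowed in-flight requests.
func (l *Limiter) Limit() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.limit
}
```

A handler would call `Acquire` before doing any work, return a retryable error
to the client when it fails, and otherwise defer the returned `done` function so
that the latency sample is recorded when the request completes.
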
## Resource constraints

The Request Limiter intentionally focuses on preventing overload derived from
resource-constrained operations on the Vault server. It targets two
specific types of resource constraints that commonly cause issues in production
workloads:

1. Write latency in the storage backend, resulting in a growing queue of updates
   to be flushed. These writes originate primarily from `Write`-based HTTP methods.

2. CPU utilization caused by computationally expensive PKI issue requests
   (generally for RSA keys). Large numbers of these requests can consume all CPU
   resources, preventing timely processing of other requests such as heartbeats and
   health checks.

Storage constraints can be accounted for by limiting logical requests according
to their `http.Method`. We only measure and limit requests with `Write`-based
HTTP methods. Read requests do not generally cause storage updates, meaning that
their latencies are unlikely to be correlated with storage constraints.

CPU constraints are accounted for using the same underlying library and
technique; however, they require special treatment. The maximum number of
concurrent `pki/issue` requests found in testing (again, specifically for RSA
keys) is far lower than the minimum tolerable write request rate.

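The routing described above could be sketched roughly as follows in Go. The
`LimiterKind` type, the `classify` function, the set of methods treated as
`Write`-based, and the `pki/issue/` path match are all assumptions made for
illustration. They are not taken from Vault's source, and a real deployment may
mount PKI at a different path.

```go
package main

import (
	"net/http"
	"strings"
)

// LimiterKind names which (hypothetical) limiter a request is measured by.
type LimiterKind int

const (
	NoLimiter       LimiterKind = iota // reads: never limited
	WriteLimiter                       // storage-bound writes
	PKIIssueLimiter                    // CPU-bound PKI issue requests
)

// classify picks a limiter for a request. PKI issue requests are checked
// first because they need a much lower concurrency limit than other writes.
func classify(method, path string) LimiterKind {
	switch method {
	case http.MethodPost, http.MethodPut, http.MethodPatch, http.MethodDelete:
		// Assumption: these are the "Write-based" methods.
		if strings.Contains(path, "pki/issue/") {
			return PKIIssueLimiter
		}
		return WriteLimiter
	default:
		// GET, LIST, and other read-style methods are not limited.
		return NoLimiter
	}
}
```

Keeping a separate limiter per kind lets the CPU-bound limit settle at a low
value without also throttling ordinary writes.
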
In both cases, utilization will be effectively throttled before Vault reaches
any degraded state. The resulting `503 - Service Unavailable` is a retryable
HTTP response code, so a client that retries gracefully will eventually
succeed. Clients should handle this by retrying with jitter and exponential
backoff. Vault's API `Client` implementation already does this, using the
go-retryablehttp library.

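For clients that do not use Vault's API `Client`, a minimal retry loop with
exponential backoff and jitter might look like the Go sketch below. It uses only
the standard library rather than go-retryablehttp. The function names, the
100 ms base delay, the 5 s cap, and the choice of "full jitter" (a random delay
between zero and the exponential ceiling) are assumptions for this example.

```go
package main

import (
	"context"
	"io"
	"math/rand"
	"net/http"
	"time"
)

// backoffCeiling returns the un-jittered delay for a 0-based attempt:
// base doubled once per attempt, capped at maxDelay.
func backoffCeiling(attempt int, base, maxDelay time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// doWithRetry sends the request built by newReq and retries on 503 responses.
// newReq is called once per attempt so that request bodies can be recreated.
// After maxAttempts tries, the last response is returned as-is.
func doWithRetry(
	ctx context.Context,
	c *http.Client,
	newReq func(context.Context) (*http.Request, error),
	maxAttempts int,
) (*http.Response, error) {
	const base, maxDelay = 100 * time.Millisecond, 5 * time.Second
	for attempt := 0; ; attempt++ {
		req, err := newReq(ctx)
		if err != nil {
			return nil, err
		}
		resp, err := c.Do(req)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusServiceUnavailable || attempt >= maxAttempts-1 {
			return resp, nil
		}
		// Drain and close the body so the connection can be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()

		// Full jitter: sleep a random duration in [0, ceiling].
		ceiling := backoffCeiling(attempt, base, maxDelay)
		delay := time.Duration(rand.Int63n(int64(ceiling) + 1))
		select {
		case <-time.After(delay):
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
}
```

The jitter spreads retries from many clients over time, which avoids a
synchronized wave of requests arriving just as the server recovers.
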
## Read requests

HTTP methods such as `GET` and `LIST` are not subject to write request
limiting. This allows operators to continue querying server state without
needing to retry.

## Vault server overloaded

When Vault has reached capacity, new requests will be immediately rejected with a
retryable `503 - Service Unavailable`
[error](/vault/docs/concepts/adaptive-overload-protection/vault-server-temporarily-overloaded).