mirror of
https://github.com/hashicorp/vault.git
synced 2025-08-22 15:11:07 +02:00
* Update README Let contributors know that docs will now be located in UDR * Add comments to each mdx doc Comment has been added to all mdx docs that are not partials * chore: added changelog changelog check failure * wip: removed changelog * Fix content errors * Doc spacing * Update website/content/docs/deploy/kubernetes/vso/helm.mdx Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com> --------- Co-authored-by: jonathanfrappier <92055993+jonathanfrappier@users.noreply.github.com> Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
---
layout: docs
page_title: Rate limit quotas - collective, by IP, by entity
description: >-
  Implement protections to prevent misbehaving applications and clients from impacting Vault performance.
---

> [!IMPORTANT]
> **Documentation Update:** Product documentation, which was previously located in this repository under `/website`, is now located in [`hashicorp/web-unified-docs`](https://github.com/hashicorp/web-unified-docs), colocated with all other product documentation. Make contributions to this content in the `web-unified-docs` repo, not this one. Changes made to `/website` content in this repo will not be reflected on the developer.hashicorp.com website.

# Rate limit quotas - collective, by IP, by entity

As the number of Vault client applications increases, the incoming requests to
Vault can degrade Vault's performance. To protect the stability of your Vault
environment and to limit network and storage resource consumption, use [rate
limit quotas](/vault/docs/configuration/create-rate-limit-quota) and [lease
count quotas](/vault/docs/configuration/create-lease-count-quota).

Rate limit quotas enforce API rate limiting using a [token
bucket](https://en.wikipedia.org/wiki/Token_bucket) algorithm. For Vault
Enterprise clusters, the rate limit quota supports a **`group_by`** option that
groups requests by a characteristic they have in common and puts the requests
of each group in the same bucket.

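To make the token bucket idea concrete, here is a minimal Python sketch of the generic algorithm. It is an illustration only and not Vault's implementation: the `TokenBucket` class, its `rate` and `capacity` parameters, and the explicit `now` timestamp are hypothetical names chosen for this example.

```python
class TokenBucket:
    """Generic token bucket (illustrative sketch, not Vault's code)."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate          # tokens added per second (assumed unit)
        self.capacity = capacity  # maximum number of tokens the bucket holds
        self.tokens = capacity    # the bucket starts full
        self.last = now           # timestamp of the previous refill

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then try to spend one token."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # bucket is empty: the request would be rate limited
```

With `group_by`, you can think of each group (for example, each source IP) as having its own bucket of this kind.
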
The available `group_by` modes are:

- `ip` - groups requests by their source IP address (_Default_)

- `none` - groups together all requests that match the rate limit quota rule

- `entity_then_ip` - groups requests by their entity ID for authenticated
  requests that carry one, or by their IP for unauthenticated requests (or
  requests whose authentication is not connected to an entity)

- `entity_then_none` - groups requests by their entity ID when available, and
  groups all remaining requests together (for example, unauthenticated
  requests and requests with authentication that is not connected to an entity)

The `group_by` option with `entity_then_ip` or `entity_then_none` mode allows
you to set a secondary rate limit (**`secondary_rate`**). This rate limit
applies to the requests that fall under the IP or "none" groupings, while the
authenticated requests that contain an entity ID are subject to the primary rate
limit set by the `rate` parameter.

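The grouping modes and the split between `rate` and `secondary_rate` described above can be sketched as a small Python function. This is only a reading aid under stated assumptions: the function name `bucket_for_request`, its arguments, and the key strings such as `"all"` are hypothetical and do not reflect how Vault represents buckets internally.

```python
from typing import Optional, Tuple


def bucket_for_request(
    group_by: str,
    client_ip: str,
    entity_id: Optional[str],
    rate: float,
    secondary_rate: float,
) -> Tuple[str, float]:
    """Return (bucket key, applicable rate) for one request (illustrative)."""
    if group_by == "ip":
        return f"ip:{client_ip}", rate
    if group_by == "none":
        return "all", rate
    if group_by == "entity_then_ip":
        if entity_id:
            return f"entity:{entity_id}", rate
        # No entity: fall back to the source IP and the secondary rate.
        return f"ip:{client_ip}", secondary_rate
    if group_by == "entity_then_none":
        if entity_id:
            return f"entity:{entity_id}", rate
        # No entity: one shared bucket governed by the secondary rate.
        return "all", secondary_rate
    raise ValueError(f"unknown group_by mode: {group_by}")
```

For example, two requests that carry the same entity ID but arrive from different IP addresses map to the same bucket in both entity-based modes.
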
**Example:**

The command below creates a rate limit quota named "my-rate" with a rate of
1,000 requests per second where the `group_by` mode is `entity_then_none`. The
secondary rate is 2,000 requests per second. Each entity is therefore limited
to 1,000 requests per second, regardless of how many IP addresses authenticate
as the same entity. The secondary rate of 2,000 requests per second applies to
all requests that don't have an entity, such as unauthenticated requests.

```shell-session
$ vault write sys/quotas/rate-limit/my-rate \
    rate=1000 \
    group_by=entity_then_none \
    secondary_rate=2000
```

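If you need to issue the same write from a script, the following Python sketch builds the equivalent HTTP request with the standard library. It assumes that the CLI fields map one-to-one to JSON body fields on `POST /v1/sys/quotas/rate-limit/<name>`; the helper name `build_quota_request`, the address, and the token value are placeholders invented for this example. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request


def build_quota_request(
    vault_addr: str,
    token: str,
    name: str,
    rate: float,
    group_by: str,
    secondary_rate: float,
) -> urllib.request.Request:
    """Build (but do not send) the request that creates the quota."""
    payload = {
        "rate": rate,
        "group_by": group_by,
        "secondary_rate": secondary_rate,
    }
    return urllib.request.Request(
        url=f"{vault_addr}/v1/sys/quotas/rate-limit/{name}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder address and token; substitute values for your environment.
request = build_quota_request(
    "http://127.0.0.1:8200", "example-token", "my-rate",
    rate=1000, group_by="entity_then_none", secondary_rate=2000,
)
```

Passing `request` to `urllib.request.urlopen` would send it to a reachable Vault server.
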
The `entity_then_none` and `entity_then_ip` modes group requests based on their
attached entity. This helps when your organization has:

- many workloads using the same IP
- single workloads that use many IPs and may scale up or down
- dynamic IPs that change frequently

The `none` mode creates one bucket for all requests at the designated level
(namespace, mount, or path) for that rate limit. For example, if your
organization provides Vault as a service to your customers, you might isolate
each customer in their own namespace. By default, a rate limit set for the
namespace creates a bucket per IP. If you want a collective rate limit for all
entities and workloads coming into the namespace, use the `none` mode.

<Tip title="You do not configure quotas on entities">

You can configure quotas on namespaces, mounts, paths, and roles, but you cannot
configure a rate limit quota for a specific entity.

Assume you created a rate limit quota on the "customer-A" namespace with a
**group by entity** mode (`entity_then_ip` or `entity_then_none`). Vault checks
the entity ID of the requests coming into the "customer-A" namespace, and
groups them based on the matching entity ID.

</Tip>

## Resource quota best practices

The `group_by` option supplements the existing quota features.

- Use the [Terraform Vault
  provider](https://registry.terraform.io/providers/hashicorp/vault/latest/docs/resources/quota_rate_limit)
  to configure and implement quotas instead of making API calls.

- Define at least one lease count quota to protect your Vault cluster from
  [lease explosions](/vault/docs/configuration/prevent-lease-explosions).

- Configure low limits at the namespace level, and higher limits at the specific
  problematic path. The most granular rate limit quota takes effect.

  Refer to the [Resource Quotas](/vault/docs/concepts/resource-quotas#rate-limit-quota-precedence) page to understand which rate limit quota rule applies to a request.

- Use the `none` and `entity_then_none` modes with caution.
  When you configure a rate limit quota at a high level (for example, a global
  rate limit) with the `none` mode, your Vault environment can become
  unresponsive if a single application purposefully or erroneously exhausts
  the quota. At that point, no other applications or users can send requests.

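The risk described in the last item can be illustrated with a short Python simulation. It is a simplified sketch under stated assumptions: all requests arrive within a single refill interval, so refill is ignored, and the `simulate` function and the client names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def simulate(requests: List[str], capacity: int, shared: bool) -> Dict[str, int]:
    """Count rejected requests per client within one refill interval.

    shared=True models the `none` mode (one bucket for every client);
    shared=False models one bucket per client.
    """
    tokens: Dict[str, int] = {}
    rejected: Counter = Counter()
    for client in requests:
        key = "all" if shared else client
        remaining = tokens.setdefault(key, capacity)
        if remaining > 0:
            tokens[key] = remaining - 1
        else:
            rejected[client] += 1
    return dict(rejected)


# A noisy client sends 5 requests before a well-behaved client sends 2.
traffic = ["noisy-app"] * 5 + ["billing-app"] * 2
shared_result = simulate(traffic, capacity=5, shared=True)
per_client_result = simulate(traffic, capacity=5, shared=False)
```

With a shared bucket, the noisy client consumes every token and both requests from the well-behaved client are rejected. With one bucket per client, no request is rejected.
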
<Tip title="Vault benchmark tool">

To help you measure your Vault environment's performance, you can use the
benchmark tool. Refer to the [Benchmark Vault
performance](/vault/tutorials/operations/benchmark-vault) tutorial to learn
more.

</Tip>