vault/website/content/docs/configuration/identity-based-rate-limit.mdx
Erica Thompson 0660ea6fac
Update README (#31244)
* Update README

Let contributors know that docs will now be located in UDR

* Add comments to each mdx doc

Comment has been added to all mdx docs that are not partials

* chore: added changelog

changelog check failure

* wip: removed changelog

* Fix content errors

* Doc spacing

* Update website/content/docs/deploy/kubernetes/vso/helm.mdx

Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>

---------

Co-authored-by: jonathanfrappier <92055993+jonathanfrappier@users.noreply.github.com>
Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
2025-07-22 08:12:22 -07:00

128 lines
6.2 KiB
Plaintext

---
layout: docs
page_title: Rate limit quotas - collective, by IP, by entity
description: >-
Implement protections to prevent misbehaving applications and clients from impacting Vault performance.
---
> [!IMPORTANT]
> **Documentation Update:** Product documentation, which were located in this repository under `/website`, are now located in [`hashicorp/web-unified-docs`](https://github.com/hashicorp/web-unified-docs), colocated with all other product documentation. Contributions to this content should be done in the `web-unified-docs` repo, and not this one. Changes made to `/website` content in this repo will not be reflected on the developer.hashicorp.com website.
# Rate limit quotas - collective, by IP, by entity
As the number of Vault client applications increases, the incoming requests to
Vault can degrade Vault's performance. To protect your Vault environment's
stability and network, as well as storage resource consumption, use [rate limit
quotas](/vault/docs/configuration/create-lease-count-quota) and [lease count
quotas](/vault/docs/configuration/create-rate-limit-quota).
The rate limit quotas enforce API rate limiting using a [token
bucket](https://en.wikipedia.org/wiki/Token_bucket) algorithm. For Vault
Enterprise clusters, the rate limit quota supports a **`group_by`** option to
define a group of requests based on the characteristic they have in common, and
put them in the same bucket.
The available `group_by` modes are:
- `ip` - groups requests by their source IP address (_Default_)
- `none` - groups together all requests that match the rate limit quota rule
- `entity_then_ip` - groups requests by their entity ID for authenticated
requests that carry one, or by their IP for unauthenticated requests (or
requests whose authentication is not connected to an entity)
- `entity_then_none` - groups requests by their entity ID when available, but
the rest is all grouped together (for example, unauthenticated requests, and
requests with authentication that is not connected to an entity)
The `group_by` option with `entity_then_ip` or `entity_then_none` mode allows
you to set a secondary rate limit (**`secondary_rate`**). This rate limit
applies to the requests that fall under the IP or "none" groupings, while the
authenticated requests that contain an entity ID are subject to the primary rate
limit set by the `rate` parameter.
**Example:**
The command below creates a rate limit quota named "my-rate" with rate of 1,000
requests per second where `group_by` mode is `entity_then_none`. The secondary
rate is 2,000 requests per second. This means 1,000 requests per second for each
entity regardless of how many IP addresses authenticate the same entity. The
secondary rate of 2,000 requests per second applies to all requests that don't
have an entity such as unauthenticated requests.
```shell-session
$ vault write sys/quotas/rate-limit/my-rate \
rate=1000 \
group_by=entity_then_none \
secondary_rate=2000
```
The `entity_then_none` or `entity_then_ip` mode groups requests based on their
attached entity. This helps when your organization has:
- many workloads using the same IP
- single workloads using many IPs which may scale up or down
- dynamic IPs that change frequently
The group by "none" option creates one bucket for all requests at the designated
level (namespace, mount, or path) for that rate limit. For example, if your
organization provides Vault as a service to your customers, you segregate the
customers each into their own namespace. The default behavior of any rate limit
set for the namespace creates a bucket per IP. If the desired behavior is to set
a collective rate limit for all entities and workloads coming into the
namespace, the "none" option can achieve that.
![Diagram indicating possible user paths](/img/resource-quotas/group-by_light.png#light-theme-only)
![Diagram indicating possible user paths](/img/resource-quotas/group-by_dark.png#dark-theme-only)
<Tip title="You do not configure quotas on entities">
You can configure quotas on namespaces, mounts, paths, and roles. But you cannot
configure a rate limit quota for a specific entity.
Assume you created a rate limit quota on "customer-A" namespace with **group by
entity** mode. Vault checks the entity ID of the requests coming into the
"customer-A" namespace, and group them based on the matching entity ID.
</Tip>
## Resource quota best practices
The `group_by` option supplements the existing quota features.
![Diagram indicating possible user paths](/img/resource-quotas/resource-quotas-use-cases_light.png#light-theme-only)
![Diagram indicating possible user paths](/img/resource-quotas/resource-quotas-use-cases_dark.png#dark-theme-only)
- Use [Terraform Vault
provider](https://registry.terraform.io/providers/hashicorp/vault/latest/docs/resources/quota_rate_limit)
to configure and implement quotas instead of making API calls.
- Define at least one lease count quota to protect your Vault cluster from
[lease explosions](/vault/docs/configuration/prevent-lease-explosions).
- Configure low limits at the namespace level, and higher limits at the specific
problematic path. The most granular rate limit quotas takes the effect.
![A diagram to show the granularity of target paths and corresponding rate](/img/resource-quotas/quota-granularity_light.png#light-theme-only)
![A diagram to show the granularity of target paths and corresponding rate](/img/resource-quotas/quota-granularity_dark.png#dark-theme-only)
Refer to the [Resource Quotas](/vault/docs/concepts/resource-quotas#rate-limit-quota-precedence) page to understand which rate limit quota rule applies to a request.
- Use the `none` and `entity_then_none` modes with caution.
When you configure a rate limit quota at a high-level (for example, global
rate limit) with group by **none** mode, your Vault environment can become
vulnerable to becoming unresponsive if a single application purposefully or
erroneously exhausts the quota. At that point, no other applications or users
can send requests.
<Tip title="Vault benchmark tool">
To help you measure your Vault environment's performance, you can use the
benchmark tool. Refer to the [Benchmark Vault
performance](/vault/tutorials/operations/benchmark-vault) tutorial to learn
more.
</Tip>