vault/website/content/docs/configuration/create-rate-limit-quota.mdx
Erica Thompson 0660ea6fac
Update README (#31244)
* Update README

Let contributors know that docs will now be located in UDR

* Add comments to each mdx doc

Comment has been added to all mdx docs that are not partials

* chore: added changelog

changelog check failure

* wip: removed changelog

* Fix content errors

* Doc spacing

* Update website/content/docs/deploy/kubernetes/vso/helm.mdx

Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>

---------

Co-authored-by: jonathanfrappier <92055993+jonathanfrappier@users.noreply.github.com>
Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
2025-07-22 08:12:22 -07:00

851 lines
23 KiB
Plaintext
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
layout: docs
page_title: Create a rate limit quota
description: >-
Step-by-step instructions for creating rate limit quotas to tune the incoming workloads for your Vault mounts.
---
> [!IMPORTANT]
> **Documentation Update:** Product documentation, which were located in this repository under `/website`, are now located in [`hashicorp/web-unified-docs`](https://github.com/hashicorp/web-unified-docs), colocated with all other product documentation. Contributions to this content should be done in the `web-unified-docs` repo, and not this one. Changes made to `/website` content in this repo will not be reflected on the developer.hashicorp.com website.
# Create a rate limit quota
Vault's rate limit quotas allow Vault admins to control how traffic
enters the Vault cluster by setting a limit on the target namespace, mount,
path, or role. It is a part of Vault's core feature set available in both
Community and Enterprise Editions.
<Tip title="Community vs. Enterprise">
The default behavior of rate limit quota is to group incoming requests based on
its source IP address to apply the rate limit. That is the only available mode
for Vault Community Edition.
Vault Enterprise offers additional modes. To learn more, read the [Rate limit
quotas - collective, by IP, by
entity](/vault/docs/configuration/identity-based-rate-limit) page for additional
capability.
</Tip>
## Before you start
- **Confirm you have access to the root or administration namespace for your
Vault instance**. Modifying rate limit quotas is a restricted activity.
## Step 1: Determine the appropriate granularity
The granularity of your rate limits can affect the performance of your Vault
cluster. In particular, if your rate limits cause the number of rejected
requests to increase dramatically, the increased audit logging may impact Vault
performance.
## Step 2: Enable audit log
By default, the requests rejected due to rate limit quota violations are not
written to the audit log. Therefore, if you wish to log the rejected requests
for traceability, you must set the `enable_rate_limit_audit_logging` to `true`
against the `sys/quotas/config` endpoint. The requests rejected due to reaching
the lease count quotas are always logged that you do not need to set any
parameter.
<Note title="Performance consideration">
Enabling the rate limit audit logging may have an impact on the Vault
performance if the volume of rejected requests is large.
</Note>
<Tabs>
<Tab heading="CLI command" group="cli">
1. Enable a file audit device which outputs to `/var/log/vault-audit.log` (or
your desired file location).
```shell-session
$ vault audit enable file file_path="/var/log/vault-audit.log"
```
1. To enable the audit logging for rate limit quotas, execute the following
command.
```shell-session
$ vault write sys/quotas/config enable_rate_limit_audit_logging=true
```
1. Read the quota configuration to verify.
```shell-session
$ vault read sys/quotas/config
Key Value
--- -----
absolute_rate_limit_exempt_paths []
enable_rate_limit_audit_logging true
enable_rate_limit_response_headers false
rate_limit_exempt_paths []
```
</Tab>
<Tab heading="API call using cURL" group="api">
1. Enable file audit device.
First, create the HTTP request payload specifying the audit log path to be
`/var/log/vault-audit.log` (or your desired file location).
```shell-session
$ tee audit-payload.json <<EOF
{
"type": "file",
"options": {
"file_path": "/var/log/vault-audit.log"
}
}
EOF
```
Use the `sys/audit` endpoint to enable file audit log.
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
--data @audit-payload.json \
$VAULT_ADDR/v1/sys/audit/file
```
Set the target `file_path` to your desired location.
1. Create the HTTP request payload to enable the rate limit quota audit logging.
```shell-session
$ tee payload.json <<EOF
{
"enable_rate_limit_audit_logging": true
}
EOF
```
1. Invoke the `sys/quotas/config` endpoint.
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
--data @payload.json \
$VAULT_ADDR/v1/sys/quotas/config
```
1. Read the quota configuration to verify.
```shell-session
$ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
$VAULT_ADDR/v1/sys/quotas/config | jq -r ".data"
```
**Example output:**
<CodeBlockConfig hideClipboard>
```json
{
"absolute_rate_limit_exempt_paths": [],
"enable_rate_limit_audit_logging": true,
"enable_rate_limit_response_headers": false,
"rate_limit_exempt_paths": []
}
```
</CodeBlockConfig>
</Tab>
</Tabs>
## Step 3: Create a rate limit quota
Create a rate limit quota using the following parameters:
- `name` `(string: "")` - Name of the quota rule
- `path` `(string: "")` - Target namespace, mount, or path to apply the quota
rule. It can end with `*` (e.g., `auth/token/create*`). A blank path
configures a global rate limit quota.
- `rate` `float` - Rate for the number of allowed _requests per second_ (RPS)
- `role` `(string: "")` - Login role to apply this quota to. When you set this
parameter, you must configure the path to a valid auth method with a concept
of roles.
- `interval` `(int: 0)` - The duration to enforce rate limiting for (default is
1 second)
- `block_interval` `(string: "")` - If set, when a client reaches a rate limit
threshold, Vault prohibits the client from any further requests until after
the `block_interval` has elapsed.
- `inheritable` `(boolean: false)` - <EnterpriseAlert product="vault" inline /> Determine whether to
apply the quota rule to child namespaces.
- `group_by` `(string: "")` - <EnterpriseAlert product="vault" inline /> Define how to group incoming
requests. Refer to the [identity-based rate limit quotas](/vault/docs/configuration/identity-based-rate-limit) for more details.
- `secondary_rate` `(float: 0.0)` <EnterpriseAlert product="vault" inline /> Can only be set
for the `group_by` modes `entity_then_ip` or `entity_then_none`. This is the rate limit applied
to the requests that fall under the "ip" or "none" groupings, while the authenticated requests
that contain an entity ID are subject to the `rate` field instead. Defaults to the same value
as `rate`.
<Tabs>
<Tab heading="CLI command" group="cli">
Use `vault write` and the `sys/quotas/rate-limit/{quota-name}` path to create a
new rate limit quota.
<CodeBlockConfig hideClipboard>
```shell-session
$ vault write sys/quotas/rate-limit/<QUOTA_NAME> \
name="<QUOTA_RULE_NAME>" \
path="<TARGET_PATH>" \
rate=<ALLOWED_REQUEST_RATE> \
role="<ROLE_NAME>" \
interval=<DURATION_OF_RATE_LIMIT> \
block_interval=<DURATION_TO_BLOCK_REQUESTS> \
inheritable=<BOOLEAN> \
```
</CodeBlockConfig>
**Example:** Create a rate limit quota applies on the Vault cluster.
1. Create a rate limit quota named, "global-rate" which limits inbound workload
to 100 requests per second.
```shell-session
$ vault write sys/quotas/rate-limit/global-rate rate=100
Success! Data written to: sys/quotas/rate-limit/global-rate
```
1. Read the `global-rate` rule to verify its configuration.
```shell-session
$ vault read sys/quotas/rate-limit/global-rate
Key Value
--- -----
block_interval 0
group_by ip
inheritable true
interval 1
name global-rate
path n/a
rate 100
role n/a
type rate-limit
```
<Note>
In absence of `path`, this quota rule applies to the global level instead of
a specific mount or namespace.
</Note>
**Example:** Create a rate limit quota named, "transit-limit" which limits the
access to the Transit secrets engine to be 1,000 requests per minute (60
seconds). It groups requests by their source IP address.
1. Enable Transit secrets engine at `transit`.
```shell-session
$ vault secrets enable transit
Success! Enabled the transit secrets engine at: transit/
```
1. Create a rate limit quota.
```shell-session
$ vault write sys/quotas/rate-limit/transit-limit \
path="transit" \
rate=1000 \
interval=60
```
**Output:**
<CodeBlockConfig hideClipboard>
```plaintext
Success! Data written to: sys/quotas/rate-limit/transit-limit
```
</CodeBlockConfig>
1. Read the `transit-limit` rule to verify its configuration.
```shell-session
$ vault read sys/quotas/rate-limit/transit-limit
```
**Output:**
<CodeBlockConfig hideClipboard highlight="5,7-8,10">
```plaintext
Key Value
--- -----
block_interval 0
group_by ip
inheritable true
interval 60
name transit-limit
path transit/
rate 1000
role n/a
type rate-limit
```
</CodeBlockConfig>
### Path granularity
You can set the `path` to be deeper than the mount point (in this example,
`transit/`).
**Example:** Create a rate limit quota named, "transit-order" to limit the data
encryption requests using `orders` key to be 500 per second.
1. Create an encryption key named, "orders".
```shell-session
$ vault write -f transit/keys/orders
Key Value
--- -----
allow_plaintext_backup false
auto_rotate_period 0s
deletion_allowed false
derived false
exportable false
imported_key false
keys map[1:1695147293]
latest_version 1
min_available_version 0
min_decryption_version 1
min_encryption_version 0
name orders
supports_decryption true
supports_derivation true
supports_encryption true
supports_signing false
type aes256-gcm96
```
1. Create the "transit-order" rate limit quota.
```shell-session
$ vault write sys/quotas/rate-limit/transit-order \
path="transit/encrypt/orders" \
rate=500
```
**Output:**
<CodeBlockConfig hideClipboard>
```plaintext
Success! Data written to: sys/quotas/rate-limit/transit-order
```
</CodeBlockConfig>
1. Verify the rate limit quota configuration.
```shell-session
$ vault read sys/quotas/rate-limit/transit-order
```
**Output:**
<CodeBlockConfig hideClipboard highlight="6-8,10">
```plaintext
Key Value
--- -----
block_interval 0
group_by ip
inheritable true
interval 1
name transit-order
path transit/encrypt/orders
rate 500
role n/a
type rate-limit
```
</CodeBlockConfig>
### Vault Enterprise namespaces
For Vault Enterprise clusters, you can use the `inheritable` parameter to apply
the resource quota set on a namespace to its subsequent child namespaces.
Think of the following namespace hierarchy:
<CodeBlockConfig hideClipboard>
```plaintext
root
└── parent
└── child
└── grand-child
```
</CodeBlockConfig>
Under the `root` namespace, you have a `parent` namespace, and then
`parent/child` and `parent/child/grand-child` namespaces.
You can set the resource quota on the `parent` namespace which gets applied to
its child namespaces inheritably by setting the `inheritable` parameter to
`true`. By default, it is set to `false`.
1. Create a quota rule on the `us-west` namespace which its child namespaces
will inherit. The rate limit is 500 requests per minute.
```shell-session
$ vault write sys/quotas/rate-limit/us-west \
path="us-west" \
rate=500 \
interval=1m \
inheritable=true
```
**Output:**
<CodeBlockConfig hideClipboard>
```plaintext
Success! Data written to: sys/quotas/rate-limit/us-west
```
</CodeBlockConfig>
1. Verify the quota rule.
```shell-session
$ vault read sys/quotas/rate-limit/us-west
Key Value
--- -----
block_interval 0
group_by ip
inheritable true
interval 60
name us-west
path us-west/
rate 500
role n/a
type rate-limit
```
</Tab>
<Tab heading="API call using cURL" group="api">
Create a payload file with your quota settings, and then invoke the
`sys/quotas/rate-limit/{quota-name}` endpoint.
<CodeBlockConfig hideClipboard>
```json
{
"name": "<QUOTA_RULE_NAME>",
"path": "<TARGET_PATH>",
"rate": <ALLOWED_REQUEST_RATE>,
"role": "<ROLE_NAME>",
"interval": <DURATION_OF_RATE_LIMIT>,
"block_interval": <DURATION_TO_BLOCK_REQUESTS>,
"inheritable": <BOOLEAN>
}
```
</CodeBlockConfig>
**Example:** Create a rate limit quota named, "global-rate" which limits inbound
workload to 100 requests per second.
1. Invoke the `sys/quotas/rate-limit` endpoint.
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
--data '{ "rate": 100 }' \
$VAULT_ADDR/v1/sys/quotas/rate-limit/global-rate
```
1. Read the newly created `global-rate` quota rule.
```shell-session
$ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
$VAULT_ADDR/v1/sys/quotas/rate-limit/global-rate | jq -r ".data"
```
**Output:**
<CodeBlockConfig hideClipboard>
```json
{
"block_interval": 0,
"group_by": "ip",
"inheritable": true,
"interval": 1,
"name": "global-rate",
"path": "",
"rate": 100,
"role": "",
"type": "rate-limit"
}
```
</CodeBlockConfig>
<Note>
In absence of `path`, this quota rule applies to the `root` namespace instead
of a specific mount or namespace.
</Note>
**Example:** Create a rate limit quota named, "transit-limit" which limits the
access to the Transit secrets engine to be 1000 requests per minute (60
seconds).
1. Enable Transit secrets engine at `transit`.
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
--data '{"type":"transit"}' \
$VAULT_ADDR/v1/sys/mounts/transit
```
1. Create a "transit-limit" rate limit quota.
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
--data '{"path": "transit", "rate": 1000, "interval": 60 }' \
$VAULT_ADDR/v1/sys/quotas/rate-limit/transit-limit
```
1. Read the `transit-limit` rule to verify its configuration.
```shell-session
$ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
$VAULT_ADDR/v1/sys/quotas/rate-limit/transit-limit | jq -r ".data"
```
**Output:**
<CodeBlockConfig hideClipboard highlight="4,6-7,9">
```json
{
"block_interval": 0,
"group_by": "ip",
"inheritable": true,
"interval": 60,
"name": "transit-limit",
"path": "transit/",
"rate": 1000,
"role": "",
"type": "rate-limit"
}
```
</CodeBlockConfig>
### Path granularity
You can set the `path` to be deeper than the mount point (in this example,
`transit/`).
**Example:** Create a rate limit quota named, "transit-order" to limit the data
encryption requests using `orders` key to be 500 per second.
1. Create an encryption key named, "orders".
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
$VAULT_ADDR/v1/transit/keys/orders
```
1. Create the "transit-order" rate limit quota.
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
--data '{ "path": "transit/encrypt/orders", "rate": 500 }' \
$VAULT_ADDR/v1/sys/quotas/rate-limit/transit-order
```
1. Verify the rate limit quota configuration.
```shell-session
$ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
$VAULT_ADDR/v1/sys/quotas/rate-limit/transit-order | jq -r ".data"
```
**Output:**
<CodeBlockConfig hideClipboard highlight="4,6-7,9">
```json
{
"block_interval": 0,
"group_by": "ip",
"inheritable": true,
"interval": 1,
"name": "transit-order",
"path": "transit/encrypt/orders",
"rate": 500,
"role": "",
"type": "rate-limit"
}
```
</CodeBlockConfig>
### Vault Enterprise namespaces
For Vault Enterprise clusters, you can use the `inheritable` parameter to apply
the resource quota set on a namespace to its subsequent child namespaces.
Think of the following namespace hierarchy:
<CodeBlockConfig hideClipboard>
```plaintext
root
└── parent
└── child
└── grand-child
```
</CodeBlockConfig>
Under the `root` namespace, you have a `parent` namespace, and then
`parent/child` and `parent/child/grand-child` namespaces.
You can set the resource quota on the `parent` namespace which gets applied to
its child namespaces inheritably by setting the `inheritable` parameter to
`true`. By default, it is set to `false`.
1. Create a quota rule on the `us-west` namespace which its child namespace
inherits. The rate limit is 500 requests per minute.
```shell-session
$ curl --header "X-Vault-Token: $VAULT_TOKEN" \
--request POST \
--data '{"path": "us-west", "rate": 500, "interval": 60, "inheritable": true }' \
$VAULT_ADDR/v1/sys/quotas/rate-limit/us-west
```
1. Verify the quota rule.
```shell-session
$ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
$VAULT_ADDR/v1/sys/quotas/rate-limit/us-west | jq -r ".data"
```
**Output:**
<CodeBlockConfig hideClipboard>
```json
{
"block_interval": 0,
"group_by": "ip",
"inheritable": true,
"interval": 60,
"name": "us-west",
"path": "us-west/",
"rate": 500,
"role": "",
"type": "rate-limit"
}
```
</CodeBlockConfig>
</Tab>
<Tab heading="Terraform" group="terraform">
You can use [HashiCorp Terraform](/terraform/docs) to set the rate limit quotas.
**Example:**
1. Create a file named, `main.tf` with the following content.
<CodeBlockConfig lineNumbers>
```hcl
# Use Vault provider
provider vault {}
# Create "global-rate" which limits inbound workload to 100 requests per second
resource "vault_quota_rate_limit" "global" {
name = "global-rate"
path = ""
rate = 100
}
# Create "transit-limit" which limits the access to the Transit secrets engine to be 1000 requests per minute (60 seconds)
resource "vault_quota_rate_limit" "transit-limit" {
name = "transit-limit"
path = "transit/"
rate = 1000
interval = 60
depends_on = [ vault_mount.transit ]
}
# Path granularity: Create a rate limit quota, "transit-order" to limit the data encryption requests using orders key to be 500 per second
resource "vault_quota_rate_limit" "transit-order" {
name = "transit-order"
path = "transit/encrypt/orders"
rate = 500
depends_on = [ vault_mount.transit, vault_transit_secret_backend_key.key ]
}
# Enable transit secrets engine & create a test key
resource "vault_mount" "transit" {
path = "transit"
type = "transit"
description = "Test resource quota"
}
resource "vault_transit_secret_backend_key" "key" {
backend = vault_mount.transit.path
name = "orders"
}
```
</CodeBlockConfig>
The `main.tf` performs the following operations:
- Create a rate limit quota named, "global-rate" which limits inbound workload
to 100 requests per second (line 5 - 9)
- Create a rate limit quota named, "transit-limit" which limits the access to
the Transit secrets engine to be 1000 requests per minute (line 12 - 19)
- Create a rate limit quota named, "transit-order" to limit the data encryption
requests using orders key to be 500 per second (line 22 - 28)
- Enable transit secrets engine for testing (line 31 - 35)
- Create an encryption key named, `orders` for testing (line 37 - 40)
1. Initialize Terraform.
```shell-session
$ terraform init
Initializing the backend...
Initializing provider plugins...
...snip...
Terraform has been successfully initialized!
```
1. After you run `terraform init`, you can verify that it will create the
resources with `terraform plan`.
```shell-session
$ terraform plan
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
...snip...
Plan: 5 to add, 0 to change, 0 to destroy.
```
You should note resources listed in the output.
1. Deploy the resources with `terraform apply`.
```shell-session
$ terraform apply -auto-approve
Terraform used the selected providers to generate the following execution plan.
...snip...
Apply complete! Resources: 5 added, 0 changed, 0 destroyed.
```
To learn more about Terraform, visit the [Terraform tutorials site](/terraform/tutorials).
### Vault Enterprise namespaces
For Vault Enterprise clusters, you can use the `inheritable` parameter to apply
the resource quota set on a namespace to its subsequent child namespaces.
Think of the following namespace hierarchy:
<CodeBlockConfig hideClipboard>
```plaintext
root
└── parent
└── child
└── grand-child
```
</CodeBlockConfig>
Under the `root` namespace, you have a `parent` namespace, and then
`parent/child` and `parent/child/grand-child` namespaces.
You can set the resource quota on the `parent` namespace which gets applied to
its child namespaces inheritably by setting the `inheritable` parameter to
`true`. By default, it is set to `false`.
You can use Terraform to create a quota rule on the us-west namespace which its
child namespaces will inherit. The following Terraform configuration sets the
rate limit to 500 requests per minute.
```hcl
provider vault {}
# Create a "us-west" namespace
resource "vault_namespace" "us-west" {
path = "us-west"
}
# Create a "us-west" rate limit quota
resource "vault_quota_rate_limit" "us-west" {
name = "us-west"
path = "us-west"
rate = 500
interval = 60
inheritable = true
}
```
</Tab>
</Tabs>
## Next steps
Proactive monitoring and periodic usage analysis can help you identify potential
problems before they escalate.
- Brush up on [general Vault resource quotas](/vault/docs/concepts/resource-quotas) in general.
- Learn about [lease count quotas for Vault Enterprise](/vault/docs/enterprise/lease-count-quotas).
- Review [Rate limit quotas - collective, by IP, by entity](/vault/docs/configuration/identity-based-rate-limit) if you are running Vault Enterprise clusters.