[Docs] Adding identity-based rate limit (#30717)

* Adding identity-based rate limit doc
* Remove include
* Add next steps section
* Add best practices section
* Add diagrams
* Add a 'Terraform' tab
* Fix reference link
* Update the title
* Undo the 'AOP' sub-section
* Incorporate the review feedback
* Incorporate the review feedback
* Remove passive voice
* Minor fix
* Update website/content/docs/configuration/create-rate-limit-quota.mdx
  Co-authored-by: Bruno Oliveira de Souza <bruno.souza@hashicorp.com>
* Update website/content/docs/configuration/identity-based-rate-limit.mdx
  Co-authored-by: Bruno Oliveira de Souza <bruno.souza@hashicorp.com>
* Incorporate the review feedback
* Incorporate review feedback
* Update the short name for side-nav
* Some fixes & incorporate review feedback
* minor edit
* Incorporate feedback
* Update the BP section
* Update website/content/docs/concepts/resource-quotas.mdx
  Co-authored-by: Bruno Oliveira de Souza <bruno.souza@hashicorp.com>
* Update website/content/docs/configuration/create-rate-limit-quota.mdx
  Co-authored-by: Bruno Oliveira de Souza <bruno.souza@hashicorp.com>

---------

Co-authored-by: Bruno Oliveira de Souza <bruno.souza@hashicorp.com>
2026-05-06 04:46:25 +02:00 · 2025-06-12 08:44:20 -07:00 · 2025-06-12 08:44:20 -07:00 · 068d576425
commit 068d576425
parent 1bb579639c
11 changed files with 1066 additions and 14 deletions
--- a/website/content/docs/concepts/resource-quotas.mdx
+++ b/website/content/docs/concepts/resource-quotas.mdx
@ -52,6 +52,37 @@ Vault also allows the inspection of the state of rate limiting in a Vault node
 through various [metrics](/vault/docs/internals/telemetry/metrics/core-system#quota-metrics) exposed
 and through enabling optional audit logging.

+<table>
+  <thead>
+    <tr>
+      <th>Feature</th>
+      <th>Description</th>
+      <th>Community Edition</th>
+      <th>Enterprise</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td rowSpan={2}><a href="/vault/docs/configuration/create-rate-limit-quota">Rate limit quotas</a></td>
+      <td>Limit the maximum number of requests per second (RPS) to a system or mount to protect network bandwidth</td>
+      <td style={{verticalAlign: 'middle', textAlign: 'center'}}>&#9989;</td>
+      <td style={{verticalAlign: 'middle', textAlign: 'center'}}>&#9989;</td>
+    </tr>
+    <tr>
+      <td>Identity-based and collective rate limits <a href="/vault/docs/configuration/identity-based-rate-limit">with <code>group_by</code> modes</a></td>
+      <td style={{verticalAlign: 'middle', textAlign: 'center'}}>&#10060;</td>
+      <td style={{verticalAlign: 'middle', textAlign: 'center'}}>&#9989;</td>
+    </tr>
+    <tr>
+      <td><a href="/vault/docs/configuration/create-lease-count-quota">Lease count quotas</a></td>
+      <td>Cap the number of leases generated in a system or mount to protect system stability and storage performance at scale</td>
+      <td style={{verticalAlign: 'middle', textAlign: 'center'}}>&#10060;</td>
+      <td style={{verticalAlign: 'middle', textAlign: 'center'}}>&#9989;</td>
+    </tr>
+  </tbody>
+</table>
+
+
 ## Exempt routes

 By default, the following paths are exempt from rate limiting. However, Vault
@ -84,11 +115,27 @@ set up a local mount on both clusters with the same path, a user that only has
 access to the performance secondary cluster can create/update a quota for this
 path that will apply to the performance primary cluster.

-## Tutorial
+## Rate limit quota precedence

-Refer to [Protecting Vault with Resource
-Quotas](/vault/tutorials/operations/resource-quotas) for a
-step-by-step tutorial.
+You can define quotas on namespaces, mounts, and paths. If the path is an auth
+mount with a concept of roles (such as `/auth/approle/`), the rate limit quota
+restricts login requests to that mount with the specified role. 
+
+Only one rate limit quota rule can exist for any given path, and the most
+granular rate limit quota takes effect on the requests. 
+
+To understand which rate limit quota rule applies to a request, here is the
+order of precedence:
+
+1. The rate limit quota matching the target namespace, mount, and role
+1. The rate limit quota matching the target namespace, mount, and path
+1. The rate limit quota matching the target namespace, mount, and the longest
+   prefix of the target path that ends with a trailing glob (*)
+1. The rate limit quota matching the target namespace and mount
+1. The rate limit quota matching the target namespace
+1. The rate limit quota matching the closest parent namespace, as long as the
+   match has the `inheritable` field set
+1. Global rate limit quota
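
The precedence order above can be sketched as a small ranking function. This is only an illustration of the documented order, not Vault's implementation: the `Quota` class, the `rank` and `resolve` helpers, and the convention that `namespace=None` marks the global quota are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quota:
    """Hypothetical in-memory model of a rate limit quota rule."""
    name: str
    namespace: Optional[str] = None  # None marks the global quota (assumption)
    mount: str = ""                  # "" means the rule is not tied to a mount
    path: str = ""                   # path under the mount; may end with "*"
    role: str = ""
    inheritable: bool = False


def rank(q, ns, mount, path, role):
    """Return a precedence key (lower wins), or None if q does not match."""
    if q.namespace is None:
        return (7, 0)  # 7. global quota
    if q.namespace != ns:
        # 6. closest parent namespace, only if the rule is inheritable
        if q.inheritable and not q.mount and ns.startswith(q.namespace + "/"):
            return (6, -len(q.namespace))
        return None
    if not q.mount:
        return (5, 0)  # 5. namespace only
    if q.mount != mount:
        return None
    if q.role:
        return (1, 0) if q.role == role else None  # 1. namespace, mount, role
    if q.path.endswith("*"):
        prefix = q.path[:-1]
        # 3. longest matching prefix that ends with a trailing glob
        return (3, -len(prefix)) if path.startswith(prefix) else None
    if q.path:
        return (2, 0) if q.path == path else None  # 2. exact path
    return (4, 0)  # 4. namespace and mount


def resolve(quotas, ns, mount, path, role=""):
    """Pick the single most granular quota that applies to a request."""
    best = None
    for q in quotas:
        key = rank(q, ns, mount, path, role)
        if key is not None and (best is None or key < best[0]):
            best = (key, q)
    return best[1] if best else None


quotas = [
    Quota("global-rate"),
    Quota("us-west", namespace="us-west", inheritable=True),
    Quota("transit-limit", namespace="us-west", mount="transit"),
    Quota("transit-glob", namespace="us-west", mount="transit", path="encrypt/*"),
    Quota("transit-order", namespace="us-west", mount="transit", path="encrypt/orders"),
]
```

With these sample rules, a request to `encrypt/orders` on the `transit` mount in `us-west` resolves to `transit-order`, while a request in a namespace with no matching rule falls back to `global-rate`.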

 ## API

--- a/website/content/docs/configuration/create-lease-count-quota.mdx
+++ b/website/content/docs/configuration/create-lease-count-quota.mdx
@ -31,7 +31,6 @@ inheritable or limited to a specific role.
 ## Step 2: Apply the count quota

 <Tabs>
-
 <Tab heading="CLI" group="cli">

 Use `vault write` and the `sys/quotas/lease-count/{quota-name}` mount path to
@ -59,8 +58,8 @@ $ vault write                            \

 Success! Data written to: sys/quotas/lease-count/webapp-tokens
 ```
-</Tab>

+</Tab>
 <Tab heading="API" group="api">

 1. Create a payload file with your quota settings.
@ -103,14 +102,34 @@ Success! Data written to: sys/quotas/lease-count/webapp-tokens

 </Note>

-</Tab>

+</Tab>
+<Tab heading="Terraform" group="terraform">
+
+Use the
+[`vault_quota_lease_count`](https://registry.terraform.io/providers/hashicorp/vault/latest/docs/resources/quota_lease_count)
+resource type to define a lease count quota.
+
+For example, to create a targeted quota limit called **webapp-tokens** on the
+`webapp` role for the `approle` plugin at the default mount path:
+
+
+```hcl
+resource "vault_quota_lease_count" "webapp-tokens" {
+  name = "webapp-tokens"
+  path = "auth/approle"
+  role = "webapp"
+  max_leases = 100
+  depends_on = [ vault_approle_auth_backend_role.webapp ]
+}
+```
+
+</Tab>
 </Tabs>

 ## Step 3: Confirm the quota settings

 <Tabs>
-
 <Tab heading="CLI" group="cli">

 Use `vault read` and the `sys/quotas/lease-count/{quota-name}` mount path to
@ -137,7 +156,6 @@ type           lease-count
 ```

 </Tab>
-
 <Tab heading="API" group="api">

 Call the `sys/quotas/lease-count/{quota-name}` endpoint to display the lease
@ -171,7 +189,6 @@ $ curl                                      \
 ```

 </Tab>
-
 </Tabs>

 ## Next steps 
@ -180,6 +197,6 @@ Proactive monitoring and periodic usage analysis can help you identify potential
 problems before they escalate.

 - Brush up on [general Vault resource quotas](/vault/docs/concepts/resource-quotas).
- Learn about [lease count quotas for Vault Enterprise](/vault/docs/enterprise/lease-count-quotas).
+- Learn about [rate limit quotas](/vault/docs/configuration/create-rate-limit-quota) to control request volume.
 - Learn how to [query audit device logs](/vault/tutorials/monitoring/query-audit-device-logs).
 - Review [key Vault metrics for common health checks](/well-architected-framework/reliability/reliability-vault-monitoring-key-metrics).
--- a/website/content/docs/configuration/create-rate-limit-quota.mdx
+++ b/website/content/docs/configuration/create-rate-limit-quota.mdx
@ -0,0 +1,848 @@
+---
+layout: docs
+page_title: Create a rate limit quota
+description: >-
+  Step-by-step instructions for creating rate limit quotas to tune the incoming workloads for your Vault mounts.
+---
+
+# Create a rate limit quota
+
+Vault's rate limit quotas allow Vault admins to control how traffic
+enters the Vault cluster by setting a limit on the target namespace, mount,
+path, or role. Rate limit quotas are part of Vault's core feature set and are
+available in both Community and Enterprise Editions.
+
+<Tip title="Community vs. Enterprise">
+
+By default, a rate limit quota groups incoming requests by their source IP
+address and applies the rate limit to each group. That is the only mode
+available in Vault Community Edition.
+
+Vault Enterprise offers additional modes. To learn more, read the [Rate limit
+quotas - collective, by IP, by
+entity](/vault/docs/configuration/identity-based-rate-limit) page.
+
+</Tip>
+
+## Before you start 
+
+- **Confirm you have access to the root or administration namespace for your
+  Vault instance**. Modifying rate limit quotas is a restricted activity.
+
+
+## Step 1: Determine the appropriate granularity
+
+The granularity of your rate limits can affect the performance of your Vault
+cluster. In particular, if your rate limits cause the number of rejected
+requests to increase dramatically, the increased audit logging may impact Vault
+performance.
+
+## Step 2: Enable audit log
+
+By default, Vault does not write requests rejected due to rate limit quota
+violations to the audit log. To log the rejected requests for traceability, set
+`enable_rate_limit_audit_logging` to `true` on the `sys/quotas/config`
+endpoint. Vault always logs requests rejected for reaching a lease count quota,
+so those need no additional parameter.
+
+<Note title="Performance consideration">
+
+Enabling rate limit audit logging may affect Vault performance if the volume
+of rejected requests is large.
+
+</Note>
+
+<Tabs>
+<Tab heading="CLI command" group="cli">
+
+1. Enable a file audit device which outputs to `/var/log/vault-audit.log` (or
+   your desired file location).
+
+   ```shell-session
+   $ vault audit enable file file_path="/var/log/vault-audit.log"
+   ```
+
+1. To enable the audit logging for rate limit quotas, execute the following
+   command.
+
+   ```shell-session
+   $ vault write sys/quotas/config enable_rate_limit_audit_logging=true
+   ```
+
+1. Read the quota configuration to verify.
+
+   ```shell-session
+   $ vault read sys/quotas/config
+
+   Key                                   Value
+   ---                                   -----
+   absolute_rate_limit_exempt_paths      []
+   enable_rate_limit_audit_logging       true
+   enable_rate_limit_response_headers    false
+   rate_limit_exempt_paths               []
+   ```
+
+</Tab>
+<Tab heading="API call using cURL" group="api">
+
+1. Enable file audit device.
+
+   First, create the HTTP request payload specifying the audit log path to be
+   `/var/log/vault-audit.log` (or your desired file location).
+
+   ```shell-session
+   $ tee audit-payload.json <<EOF
+   {
+     "type": "file",
+     "options": {
+         "file_path": "/var/log/vault-audit.log"
+     }
+   }
+   EOF
+   ```
+
+   Use the `sys/audit` endpoint to enable file audit log.
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+       --request POST \
+       --data @audit-payload.json \
+       $VAULT_ADDR/v1/sys/audit/file
+   ```
+
+   Set the target `file_path` to your desired location.
+
+1. Create the HTTP request payload to enable the rate limit quota audit logging.
+
+   ```shell-session
+   $ tee payload.json <<EOF
+   {
+      "enable_rate_limit_audit_logging": true
+   }
+   EOF
+   ```
+
+1. Invoke the `sys/quotas/config` endpoint.
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+       --request POST \
+       --data @payload.json \
+       $VAULT_ADDR/v1/sys/quotas/config
+   ```
+
+1. Read the quota configuration to verify.
+
+   ```shell-session
+   $ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
+       $VAULT_ADDR/v1/sys/quotas/config | jq -r ".data"
+   ```
+
+   **Example output:**
+
+   <CodeBlockConfig hideClipboard>
+
+   ```json
+   {
+      "absolute_rate_limit_exempt_paths": [],
+      "enable_rate_limit_audit_logging": true,
+      "enable_rate_limit_response_headers": false,
+      "rate_limit_exempt_paths": []
+   }
+   ```
+
+   </CodeBlockConfig>
+
+</Tab>
+</Tabs>
+
+
+## Step 3: Create a rate limit quota
+
+Create a rate limit quota using the following parameters:
+
+- `name` `(string: "")` - Name of the quota rule
+- `path` `(string: "")` - Target namespace, mount, or path to apply the quota
+  rule. It can end with `*` (e.g., `auth/token/create*`). A blank path
+  configures a global rate limit quota. 
+- `rate` `float` - Rate for the number of allowed _requests per second_ (RPS)  
+- `role` `(string: "")` - Login role to apply this quota to. When you set this
+  parameter, you must configure the path to a valid auth method with a concept
+  of roles.  
+- `interval` `(int: 0)` - The duration to enforce rate limiting for (default is
+  1 second)
+- `block_interval` `(string: "")` - If set, when a client reaches a rate limit
+  threshold, Vault prohibits the client from any further requests until after
+  the `block_interval` has elapsed. 
+- `inheritable` `(boolean: false)` - <EnterpriseAlert product="vault" inline /> Determine whether to
+  apply the quota rule to child namespaces.   
+- `group_by` `(string: "")` - <EnterpriseAlert product="vault" inline /> Define how to group incoming
+  requests. Refer to the [identity-based rate limit quotas](/vault/docs/configuration/identity-based-rate-limit) for more details.
+- `secondary_rate` `(float: 0.0)` - <EnterpriseAlert product="vault" inline /> Can only be set
+  for the `group_by` modes `entity_then_ip` or `entity_then_none`. This is the rate limit applied
+  to the requests that fall under the "ip" or "none" groupings, while the authenticated requests
+  that contain an entity ID are subject to the `rate` field instead. Defaults to the same value
+  as `rate`.
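
As a rough sketch of how the parameters above fit together, the helper below assembles a request body and enforces two constraints stated in the list: `role` needs a `path`, and `secondary_rate` needs one of the `entity_then_*` modes. The function name `build_quota_payload` is hypothetical and the checks happen client-side only; Vault performs its own validation.

```python
import json


def build_quota_payload(rate, path="", role="", interval=None,
                        block_interval=None, inheritable=None,
                        group_by=None, secondary_rate=None):
    """Assemble a body for sys/quotas/rate-limit/{quota-name} (illustrative)."""
    # A role only makes sense on an auth mount, so a path must be given.
    if role and not path:
        raise ValueError("role requires path to point at an auth mount")
    # secondary_rate is documented for the entity_then_* modes only.
    if secondary_rate is not None and group_by not in (
            "entity_then_ip", "entity_then_none"):
        raise ValueError("secondary_rate needs an entity_then_* group_by mode")

    payload = {"rate": rate, "path": path}
    optional = {
        "role": role or None,
        "interval": interval,
        "block_interval": block_interval,
        "inheritable": inheritable,
        "group_by": group_by,
        "secondary_rate": secondary_rate,
    }
    # Leave out anything the caller did not set so Vault applies its defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Serialize a body equivalent to the "transit-limit" example.
print(json.dumps(build_quota_payload(1000, path="transit", interval=60)))
```

Omitting unset fields keeps the payload minimal, so the defaults described in the parameter list remain in effect.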
+
+
+<Tabs>
+<Tab heading="CLI command" group="cli">
+
+Use `vault write` and the `sys/quotas/rate-limit/{quota-name}` path to create a
+new rate limit quota.
+
+<CodeBlockConfig hideClipboard>
+
+```shell-session
+$ vault write sys/quotas/rate-limit/<QUOTA_NAME> \
+    name="<QUOTA_RULE_NAME>" \
+    path="<TARGET_PATH>" \
+    rate=<ALLOWED_REQUEST_RATE> \
+    role="<ROLE_NAME>" \
+    interval=<DURATION_OF_RATE_LIMIT> \
+    block_interval=<DURATION_TO_BLOCK_REQUESTS> \
+    inheritable=<BOOLEAN>
+```
+
+</CodeBlockConfig>
+
+
+**Example:** Create a rate limit quota that applies to the entire Vault cluster.
+
+1. Create a rate limit quota named, "global-rate" which limits inbound workload
+   to 100 requests per second.
+
+   ```shell-session
+   $ vault write sys/quotas/rate-limit/global-rate rate=100
+   Success! Data written to: sys/quotas/rate-limit/global-rate
+   ```
+
+1. Read the `global-rate` rule to verify its configuration.
+
+   ```shell-session
+   $ vault read sys/quotas/rate-limit/global-rate
+
+   Key               Value
+   ---               -----
+   block_interval    0
+   group_by          ip
+   inheritable       true
+   interval          1
+   name              global-rate
+   path              n/a
+   rate              100
+   role              n/a
+   type              rate-limit
+   ```
+
+   <Note>
+
+   In absence of `path`, this quota rule applies to the global level instead of
+   a specific mount or namespace.
+
+   </Note>
+
+**Example:** Create a rate limit quota named, "transit-limit" which limits the
+access to the Transit secrets engine to be 1,000 requests per minute (60
+seconds). It groups requests by their source IP address.
+
+1. Enable Transit secrets engine at `transit`.
+
+   ```shell-session
+   $ vault secrets enable transit
+   Success! Enabled the transit secrets engine at: transit/
+   ```
+
+1. Create a rate limit quota.
+
+   ```shell-session
+   $ vault write sys/quotas/rate-limit/transit-limit \
+      path="transit" \
+      rate=1000 \
+      interval=60
+   ```
+
+   **Output:**
+
+   <CodeBlockConfig hideClipboard>
+
+   ```plaintext
+   Success! Data written to: sys/quotas/rate-limit/transit-limit
+   ```
+
+   </CodeBlockConfig>
+
+1. Read the `transit-limit` rule to verify its configuration.
+
+   ```shell-session
+   $ vault read sys/quotas/rate-limit/transit-limit
+   ```
+
+   **Output:**
+
+   <CodeBlockConfig hideClipboard highlight="5,7-8,10">
+
+   ```plaintext
+   Key               Value
+   ---               -----
+   block_interval    0
+   group_by          ip
+   inheritable       true
+   interval          60
+   name              transit-limit
+   path              transit/
+   rate              1000
+   role              n/a
+   type              rate-limit
+   ```
+
+   </CodeBlockConfig>
+
+
+### Path granularity
+
+You can set the `path` to be deeper than the mount point (in this example,
+`transit/`).
+
+**Example:** Create a rate limit quota named, "transit-order" to limit the data
+encryption requests using `orders` key to be 500 per second.
+
+1. Create an encryption key named, "orders".
+
+   ```shell-session
+   $ vault write -f transit/keys/orders
+
+   Key                       Value
+   ---                       -----
+   allow_plaintext_backup    false
+   auto_rotate_period        0s
+   deletion_allowed          false
+   derived                   false
+   exportable                false
+   imported_key              false
+   keys                      map[1:1695147293]
+   latest_version            1
+   min_available_version     0
+   min_decryption_version    1
+   min_encryption_version    0
+   name                      orders
+   supports_decryption       true
+   supports_derivation       true
+   supports_encryption       true
+   supports_signing          false
+   type                      aes256-gcm96
+   ```
+
+1. Create the "transit-order" rate limit quota.
+
+   ```shell-session
+   $ vault write sys/quotas/rate-limit/transit-order \
+      path="transit/encrypt/orders" \
+      rate=500
+   ```
+
+   **Output:**
+
+   <CodeBlockConfig hideClipboard>
+
+   ```plaintext
+   Success! Data written to: sys/quotas/rate-limit/transit-order
+   ```
+
+   </CodeBlockConfig>
+
+1. Verify the rate limit quota configuration.
+
+   ```shell-session
+   $ vault read sys/quotas/rate-limit/transit-order
+   ```
+
+   **Output:**
+
+   <CodeBlockConfig hideClipboard highlight="6-8,10">
+
+   ```plaintext
+   Key               Value
+   ---               -----
+   block_interval    0
+   group_by          ip
+   inheritable       true
+   interval          1
+   name              transit-order
+   path              transit/encrypt/orders
+   rate              500
+   role              n/a
+   type              rate-limit
+   ```
+
+   </CodeBlockConfig>
+
+
+### Vault Enterprise namespaces
+
+For Vault Enterprise clusters, you can use the `inheritable` parameter to apply
+the resource quota set on a namespace to its subsequent child namespaces.
+
+Think of the following namespace hierarchy:
+
+<CodeBlockConfig hideClipboard>
+
+```plaintext
+root
+   └── parent
+      └── child
+         └── grand-child
+```
+
+</CodeBlockConfig>
+
+Under the `root` namespace, you have a `parent` namespace, and then
+`parent/child` and `parent/child/grand-child` namespaces.
+
+To apply a resource quota set on the `parent` namespace to its child
+namespaces as well, set the `inheritable` parameter to `true`. By default, it
+is set to `false`.
+
+1. Create a quota rule on the `us-west` namespace which its child namespaces
+   will inherit. The rate limit is 500 requests per minute.
+
+   ```shell-session
+   $ vault write sys/quotas/rate-limit/us-west \
+      path="us-west" \
+      rate=500 \
+      interval=1m \
+      inheritable=true
+   ```
+
+   **Output:**
+
+   <CodeBlockConfig hideClipboard>
+
+   ```plaintext
+   Success! Data written to: sys/quotas/rate-limit/us-west
+   ```
+
+   </CodeBlockConfig>
+
+1. Verify the quota rule.
+
+   ```shell-session
+   $ vault read sys/quotas/rate-limit/us-west
+
+   Key               Value
+   ---               -----
+   block_interval    0
+   group_by          ip
+   inheritable       true
+   interval          60
+   name              us-west
+   path              us-west/
+   rate              500
+   role              n/a
+   type              rate-limit
+   ```
+
+</Tab>
+<Tab heading="API call using cURL" group="api">
+
+Create a payload file with your quota settings, and then invoke the
+`sys/quotas/rate-limit/{quota-name}` endpoint.
+
+<CodeBlockConfig hideClipboard>
+
+```json
+{
+  "name": "<QUOTA_RULE_NAME>",
+  "path": "<TARGET_PATH>",
+  "rate": <ALLOWED_REQUEST_RATE>,
+  "role": "<ROLE_NAME>",
+  "interval": <DURATION_OF_RATE_LIMIT>,
+  "block_interval": <DURATION_TO_BLOCK_REQUESTS>,
+  "inheritable": <BOOLEAN>
+}
+```
+
+</CodeBlockConfig>
+
+**Example:** Create a rate limit quota named, "global-rate" which limits inbound
+workload to 100 requests per second.
+
+1. Invoke the `sys/quotas/rate-limit` endpoint.
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+       --request POST \
+       --data '{ "rate": 100 }' \
+       $VAULT_ADDR/v1/sys/quotas/rate-limit/global-rate
+   ```
+
+1. Read the newly created `global-rate` quota rule.
+
+   ```shell-session
+   $ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
+       $VAULT_ADDR/v1/sys/quotas/rate-limit/global-rate | jq -r ".data"
+   ```
+
+   **Output:**
+
+   <CodeBlockConfig hideClipboard>
+
+   ```json
+   {
+      "block_interval": 0,
+      "group_by": "ip",
+      "inheritable": true,
+      "interval": 1,
+      "name": "global-rate",
+      "path": "",
+      "rate": 100,
+      "role": "",
+      "type": "rate-limit"
+   }
+   ```
+
+   </CodeBlockConfig>
+
+   <Note>
+
+   In absence of `path`, this quota rule applies to the global level instead of
+   a specific mount or namespace.
+
+   </Note>
+
+**Example:** Create a rate limit quota named, "transit-limit" which limits the
+access to the Transit secrets engine to be 1000 requests per minute (60
+seconds).
+
+1. Enable Transit secrets engine at `transit`.
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+       --request POST \
+       --data '{"type":"transit"}' \
+       $VAULT_ADDR/v1/sys/mounts/transit
+   ```
+
+1. Create a "transit-limit" rate limit quota.
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+      --request POST \
+      --data '{"path": "transit", "rate": 1000, "interval": 60 }' \
+      $VAULT_ADDR/v1/sys/quotas/rate-limit/transit-limit
+   ```
+
+1. Read the `transit-limit` rule to verify its configuration.
+
+   ```shell-session
+   $ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
+      $VAULT_ADDR/v1/sys/quotas/rate-limit/transit-limit | jq -r ".data"
+   ```
+   **Output:**
+
+   <CodeBlockConfig hideClipboard highlight="4,6-7,9">
+
+   ```json
+   {
+      "block_interval": 0,
+      "group_by": "ip",
+      "inheritable": true,
+      "interval": 60,
+      "name": "transit-limit",
+      "path": "transit/",
+      "rate": 1000,
+      "role": "",
+      "type": "rate-limit"
+   }
+   ```
+
+   </CodeBlockConfig>
+
+
+### Path granularity
+
+You can set the `path` to be deeper than the mount point (in this example,
+`transit/`).
+
+**Example:** Create a rate limit quota named, "transit-order" to limit the data
+encryption requests using `orders` key to be 500 per second.
+
+1. Create an encryption key named, "orders".
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+       --request POST \
+       $VAULT_ADDR/v1/transit/keys/orders
+   ```
+
+1. Create the "transit-order" rate limit quota.
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+      --request POST \
+      --data '{ "path": "transit/encrypt/orders", "rate": 500 }' \
+      $VAULT_ADDR/v1/sys/quotas/rate-limit/transit-order
+   ```
+
+1. Verify the rate limit quota configuration.
+
+   ```shell-session
+   $ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
+      $VAULT_ADDR/v1/sys/quotas/rate-limit/transit-order | jq -r ".data"
+   ```
+   **Output:**
+
+   <CodeBlockConfig hideClipboard highlight="4,6-7,9">
+
+   ```json
+   {
+      "block_interval": 0,
+      "group_by": "ip",
+      "inheritable": true,
+      "interval": 1,
+      "name": "transit-order",
+      "path": "transit/encrypt/orders",
+      "rate": 500,
+      "role": "",
+      "type": "rate-limit"
+   }
+   ```
+
+   </CodeBlockConfig>
+
+
+### Vault Enterprise namespaces
+
+For Vault Enterprise clusters, you can use the `inheritable` parameter to apply
+the resource quota set on a namespace to its subsequent child namespaces.
+
+Think of the following namespace hierarchy:
+
+<CodeBlockConfig hideClipboard>
+
+```plaintext
+root
+   └── parent
+      └── child
+         └── grand-child
+```
+
+</CodeBlockConfig>
+
+Under the `root` namespace, you have a `parent` namespace, and then
+`parent/child` and `parent/child/grand-child` namespaces.
+
+To apply a resource quota set on the `parent` namespace to its child
+namespaces as well, set the `inheritable` parameter to `true`. By default, it
+is set to `false`.
+
+
+1. Create a quota rule on the `us-west` namespace which its child namespace
+   inherits. The rate limit is 500 requests per minute.
+
+   ```shell-session
+   $ curl --header "X-Vault-Token: $VAULT_TOKEN" \
+      --request POST \
+      --data '{"path": "us-west", "rate": 500, "interval": 60, "inheritable": true }' \
+      $VAULT_ADDR/v1/sys/quotas/rate-limit/us-west
+   ```
+
+1. Verify the quota rule.
+
+   ```shell-session
+   $ curl -s --header "X-Vault-Token: $VAULT_TOKEN" \
+      $VAULT_ADDR/v1/sys/quotas/rate-limit/us-west | jq -r ".data"
+   ```
+
+   **Output:**
+
+   <CodeBlockConfig hideClipboard>
+
+   ```json
+   {
+      "block_interval": 0,
+      "group_by": "ip",
+      "inheritable": true,
+      "interval": 60,
+      "name": "us-west",
+      "path": "us-west/",
+      "rate": 500,
+      "role": "",
+      "type": "rate-limit"
+   }
+   ```
+
+   </CodeBlockConfig>
+
+</Tab>
+<Tab heading="Terraform" group="terraform">
+
+You can use [HashiCorp Terraform](/terraform/docs) to set the rate limit quotas.
+
+**Example:**
+
+1. Create a file named, `main.tf` with the following content.
+
+   <CodeBlockConfig lineNumbers>
+
+   ```hcl
+   # Use Vault provider
+   provider vault {}
+
+   # Create "global-rate" which limits inbound workload to 100 requests per second
+   resource "vault_quota_rate_limit" "global" {
+      name = "global-rate"
+      path = ""
+      rate = 100
+   }
+
+   # Create "transit-limit" which limits the access to the Transit secrets engine to be 1000 requests per minute (60 seconds)
+   resource "vault_quota_rate_limit" "transit-limit" {
+      name = "transit-limit"
+      path = "transit/"
+      rate = 1000
+      interval = 60
+
+      depends_on = [ vault_mount.transit ]
+   }
+
+   # Path granularity: Create a rate limit quota, "transit-order" to limit the data encryption requests using orders key to be 500 per second
+   resource "vault_quota_rate_limit" "transit-order" {
+      name = "transit-order"
+      path = "transit/encrypt/orders"
+      rate = 500
+
+      depends_on = [ vault_mount.transit, vault_transit_secret_backend_key.key ]
+   }
+
+   # Enable transit secrets engine & create a test key
+   resource "vault_mount" "transit" {
+      path                      = "transit"
+      type                      = "transit"
+      description               = "Test resource quota"
+   }
+
+   resource "vault_transit_secret_backend_key" "key" {
+      backend = vault_mount.transit.path
+      name    = "orders"
+   }
+   ```
+
+   </CodeBlockConfig>
+
+   The `main.tf` performs the following operations:
+
+   - Create a rate limit quota named, "global-rate" which limits inbound workload
+   to 100 requests per second (line 5 - 9)
+
+   - Create a rate limit quota named, "transit-limit" which limits the access to
+   the Transit secrets engine to be 1000 requests per minute (line 12 - 19)
+
+   - Create a rate limit quota named, "transit-order" to limit the data encryption
+   requests using orders key to be 500 per second (line 22 - 28)
+
+   - Enable transit secrets engine for testing (line 31 - 35)
+
+   - Create an encryption key named, `orders` for testing (line 37 - 40)
+
+1. Initialize Terraform.
+
+   ```shell-session
+   $ terraform init
+
+   Initializing the backend...
+
+   Initializing provider plugins...
+   ...snip...
+
+   Terraform has been successfully initialized!
+   ```
+
+1. After initialization, run `terraform plan` to preview the changes before
+   you apply them.
+
+   ```shell-session
+   $ terraform plan
+
+   An execution plan has been generated and is shown below.
+   Resource actions are indicated with the following symbols:
+   ...snip...
+
+   Plan: 5 to add, 0 to change, 0 to destroy.
+   ```
+
+   The plan output lists the resources that Terraform will create.
+
+1. Deploy the resources with `terraform apply`.
+
+   ```shell-session
+   $ terraform apply -auto-approve
+
+   Terraform used the selected providers to generate the following execution plan.
+   ...snip...
+   Apply complete! Resources: 5 added, 0 changed, 0 destroyed.
+   ```
+
+To learn more about Terraform, visit the [Terraform tutorials site](/terraform/tutorials).
+
+### Vault Enterprise namespaces
+
+For Vault Enterprise clusters, you can use the `inheritable` parameter to apply
+the resource quota set on a namespace to its subsequent child namespaces.
+
+Think of the following namespace hierarchy:
+
+<CodeBlockConfig hideClipboard>
+
+```plaintext
+root
+   └── parent
+      └── child
+         └── grand-child
+```
+
+</CodeBlockConfig>
+
+Under the `root` namespace, you have a `parent` namespace, and then
+`parent/child` and `parent/child/grand-child` namespaces.
+
+To apply a resource quota set on the `parent` namespace to its child
+namespaces as well, set the `inheritable` parameter to `true`. By default, it
+is set to `false`.
+
+You can use Terraform to create a quota rule on the `us-west` namespace which
+its child namespaces will inherit. The following Terraform configuration sets
+the rate limit to 500 requests per minute.
+
+```hcl
+provider vault {}
+
+# Create a "us-west" namespace
+resource "vault_namespace" "us-west" {
+  path = "us-west"
+}
+
+# Create a "us-west" rate limit quota
+resource "vault_quota_rate_limit" "us-west" {
+  name        = "us-west"
+  path        = "us-west"
+  rate        = 500
+  interval    = 60
+  inheritable = true
+
+  # Create the namespace before the quota that targets it
+  depends_on = [ vault_namespace.us-west ]
+}
+```
+
+</Tab>
+</Tabs>
+
+
+## Next steps 
+
+Proactive monitoring and periodic usage analysis can help you identify potential
+problems before they escalate.
+
+- Brush up on [general Vault resource quotas](/vault/docs/concepts/resource-quotas).
+- Learn about [lease count quotas for Vault Enterprise](/vault/docs/enterprise/lease-count-quotas).
+- Review [Rate limit quotas - collective, by IP, by entity](/vault/docs/configuration/identity-based-rate-limit) if you are running Vault Enterprise clusters.
--- a/website/content/docs/configuration/identity-based-rate-limit.mdx
+++ b/website/content/docs/configuration/identity-based-rate-limit.mdx
@ -0,0 +1,124 @@
+---
+layout: docs
+page_title: Rate limit quotas - collective, by IP, by entity
+description: >-
+  Implement protections to prevent misbehaving applications and clients from impacting Vault performance.
+---
+
+# Rate limit quotas - collective, by IP, by entity
+
+As the number of Vault client applications increases, the incoming requests to
+Vault can degrade Vault's performance. To protect your Vault environment's
+stability, network bandwidth, and storage resources, use [rate limit
+quotas](/vault/docs/configuration/create-rate-limit-quota) and [lease count
+quotas](/vault/docs/configuration/create-lease-count-quota).
+
+The rate limit quotas enforce API rate limiting using a [token
+bucket](https://en.wikipedia.org/wiki/Token_bucket) algorithm. For Vault
+Enterprise clusters, the rate limit quota supports a **`group_by`** option to
+define a group of requests based on the characteristic they have in common, and
+put them in the same bucket. 
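
To make the token bucket idea concrete, here is a minimal sketch of the general algorithm. It is not Vault's code: the `TokenBucket` class and its parameters are assumptions chosen to mirror the `rate` and `interval` fields, and the `now` argument exists only so the example can be exercised deterministically.

```python
import time


class TokenBucket:
    """Generic token bucket: holds up to `rate` tokens, refilled over `interval` seconds."""

    def __init__(self, rate, interval=1.0, now=None):
        self.capacity = float(rate)
        self.refill_per_second = rate / interval
        self.tokens = float(rate)  # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Consume one token if available; return False when the limit is hit."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to the elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Each group of requests defined by `group_by` would get its own bucket in this model, so one noisy group exhausts only its own tokens.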
+
+The available `group_by` modes are:
+
+- `ip` - groups requests by their source IP address (_Default_)
+
+- `none` - groups together all requests that match the rate limit quota rule
+
+- `entity_then_ip` - groups requests by their entity ID for authenticated
+  requests that carry one, or by their IP for unauthenticated requests (or
+  requests whose authentication is not connected to an entity)
+
+- `entity_then_none` - groups requests by their entity ID when available, and
+  groups all remaining requests together (for example, unauthenticated requests
+  and requests whose authentication is not connected to an entity)
+
+The `group_by` option with `entity_then_ip` or `entity_then_none` mode allows
+you to set a secondary rate limit (**`secondary_rate`**). This rate limit
+applies to the requests that fall under the IP or "none" groupings, while the
+authenticated requests that contain an entity ID are subject to the primary rate
+limit set by the `rate` parameter.
+
+**Example:**
+
+The command below creates a rate limit quota named "my-rate" with a rate of
+1,000 requests per second and a `group_by` mode of `entity_then_none`. The
+secondary rate is 2,000 requests per second. Each entity can make 1,000 requests
+per second regardless of how many IP addresses its requests come from. The
+secondary rate of 2,000 requests per second applies collectively to all requests
+that don't have an entity, such as unauthenticated requests.
+
+```shell-session
+$ vault write sys/quotas/rate-limit/my-rate \
+    rate=1000 \
+    group_by=entity_then_none \
+    secondary_rate=2000
+```
+
+The `entity_then_none` or `entity_then_ip` mode groups requests based on their
+attached entity. This helps when your organization has:
+
+- many workloads using the same IP
+- single workloads using many IPs which may scale up or down
+- dynamic IPs that change frequently
+
+The `none` mode creates one bucket for all requests at the designated level
+(namespace, mount, or path) for that rate limit. For example, if your
+organization provides Vault as a service to your customers, you might isolate
+each customer in their own namespace. By default, a rate limit set on the
+namespace creates a bucket per IP. To set a collective rate limit for all
+entities and workloads coming into the namespace instead, use the `none` mode.
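+
+As a rough sketch of that scenario, the command below sets one shared bucket
+for a namespace. The quota name "customer-a-collective", the "customer-A/"
+namespace path, and the rate of 500 are hypothetical values chosen for
+illustration, not recommendations.
+
+```shell-session
+# Hypothetical quota name, namespace path, and rate
+$ vault write sys/quotas/rate-limit/customer-a-collective \
+    path="customer-A/" \
+    rate=500 \
+    group_by=none
+```
+
+With this configuration, all requests into the namespace would draw from the
+same bucket regardless of their source IP or entity.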
+
+![Diagram indicating possible user paths](/img/resource-quotas/group-by_light.png#light-theme-only)
+![Diagram indicating possible user paths](/img/resource-quotas/group-by_dark.png#dark-theme-only)
+
+<Tip title="You do not configure quotas on entities">
+
+You can configure quotas on namespaces, mounts, paths, and roles. But you cannot
+configure a rate limit quota for a specific entity.
+
+Assume you created a rate limit quota on the "customer-A" namespace with a
+**group by entity** mode. Vault checks the entity ID of the requests coming into
+the "customer-A" namespace and groups them by matching entity ID.
+
+</Tip>
+
+
+## Resource quota best practices
+
+The `group_by` option supplements the existing quota features.
+
+![Diagram indicating possible user paths](/img/resource-quotas/resource-quotas-use-cases_light.png#light-theme-only)
+![Diagram indicating possible user paths](/img/resource-quotas/resource-quotas-use-cases_dark.png#dark-theme-only)
+
+- Use the [Terraform Vault
+  provider](https://registry.terraform.io/providers/hashicorp/vault/latest/docs/resources/quota_rate_limit)
+  to configure and implement quotas instead of making API calls.
+
+- Define at least one lease count quota to protect your Vault cluster from
+  [lease explosions](/vault/docs/configuration/prevent-lease-explosions).
+
+- Configure low limits at the namespace level, and higher limits at the specific
+  problematic path. The most granular rate limit quota takes effect.
+
+  ![A diagram to show the granularity of target paths and corresponding rate](/img/resource-quotas/quota-granularity_light.png#light-theme-only)
+  ![A diagram to show the granularity of target paths and corresponding rate](/img/resource-quotas/quota-granularity_dark.png#dark-theme-only)
+
+  Refer to the [Resource Quotas](/vault/docs/concepts/resource-quotas#rate-limit-quota-precedence) page to understand which rate limit quota rule applies to a request. 
+
+- Use the `none` and `entity_then_none` modes with caution.
+  When you configure a rate limit quota at a high level (for example, a global
+  rate limit) with the `none` mode, a single application that purposefully or
+  erroneously exhausts the quota can make your Vault environment unresponsive.
+  At that point, no other applications or users can send requests.
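+
+A minimal sketch of the namespace-versus-path guidance above, assuming a
+hypothetical "customer-A" namespace with a KV mount at "kv" that needs more
+headroom than the rest of the namespace. The quota names and rates are
+illustrative only.
+
+```shell-session
+# Hypothetical values: low default limit for the whole namespace
+$ vault write sys/quotas/rate-limit/customer-a-default \
+    path="customer-A/" \
+    rate=100
+
+# Hypothetical values: higher limit for one busy mount in that namespace
+$ vault write sys/quotas/rate-limit/customer-a-kv \
+    path="customer-A/kv" \
+    rate=1000
+```
+
+Requests to the "kv" mount would match the more specific quota, while all other
+requests in the namespace would fall under the lower default.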
+
+<Tip title="Vault benchmark tool">
+
+To help you measure your Vault environment's performance, you can use the
+benchmark tool. Refer to the [Benchmark Vault
+performance](/vault/tutorials/operations/benchmark-vault) tutorial to learn
+more.
+
+</Tip>
--- a/website/data/docs-nav-data.json
+++ b/website/data/docs-nav-data.json
@ -288,11 +288,9 @@
    ]
  },

-
  { "divider": true },
  { "heading": "OPERATIONS" },

-
  {
    "title": "Get Vault",
    "routes": [
@ -393,8 +391,26 @@
      },
      {
        "title": "Create a lease count quota",
+        "badge": {
+          "text": "ENT",
+          "type": "filled",
+          "color": "neutral"
+        },
        "path": "configuration/create-lease-count-quota"
      },
+      {
+        "title": "Create a rate limit quota",
+        "path": "configuration/create-rate-limit-quota"
+      },
+      {
+        "title": "Rate limit quotas group_by modes",
+        "badge": {
+          "text": "ENT",
+          "type": "filled",
+          "color": "neutral"
+        },
+        "path": "configuration/identity-based-rate-limit"
+      },
      {
        "title": "Configure completed request logging",
        "path": "configuration/log-requests-level"
@ -2638,7 +2654,7 @@
      },
      {
        "title": "Rollback upgrades",
-        "path" : "plugins/rollback"
+        "path": "plugins/rollback"
      },
      {
        "title": "Plugin development",
--- a/website/public/img/resource-quotas/group-by_dark.png
+++ b/website/public/img/resource-quotas/group-by_dark.png
--- a/website/public/img/resource-quotas/group-by_light.png
+++ b/website/public/img/resource-quotas/group-by_light.png
--- a/website/public/img/resource-quotas/quota-granularity_dark.png
+++ b/website/public/img/resource-quotas/quota-granularity_dark.png
--- a/website/public/img/resource-quotas/quota-granularity_light.png
+++ b/website/public/img/resource-quotas/quota-granularity_light.png
--- a/website/public/img/resource-quotas/resource-quotas-use-cases_dark.png
+++ b/website/public/img/resource-quotas/resource-quotas-use-cases_dark.png
--- a/website/public/img/resource-quotas/resource-quotas-use-cases_light.png
+++ b/website/public/img/resource-quotas/resource-quotas-use-cases_light.png