vault/website/content/docs/configuration/storage/raft.mdx
Lucy Davinhart || Strawb System b5e58a73b0
Update raft.mdx (#27275)
2024-05-30 10:33:56 -04:00

---
layout: docs
page_title: Integrated Storage - Storage Backends - Configuration
description: >-
  The Integrated Storage (Raft) backend is used to persist Vault's data. Unlike all the other
  storage backends, this backend does not operate from a single source for the
  data. Instead all the nodes in a Vault cluster will have a replicated copy of
  the entire data. The data is replicated across the nodes using the Raft
  Consensus Algorithm.
---
# Integrated storage (Raft) backend
The Integrated Storage backend is used to persist Vault's data. Unlike other storage
backends, Integrated Storage does not operate from a single source of data. Instead
all the nodes in a Vault cluster will have a replicated copy of Vault's data.
Data gets replicated across all the nodes via the [Raft Consensus
Algorithm][raft].
- **High Availability** - the Integrated Storage backend supports high availability.
- **HashiCorp Supported** - the Integrated Storage backend is officially supported
by HashiCorp.
```hcl
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"
}

cluster_addr = "http://127.0.0.1:8201"
```
~> **Note:** When using the Integrated Storage backend, you must provide
[`cluster_addr`](/vault/docs/concepts/ha#per-node-cluster-address) to indicate the address and port used for communication
between the nodes in the Raft cluster.
~> **Note:** When using the Integrated Storage backend, a separate
[`ha_storage`](/vault/docs/configuration#ha_storage)
backend cannot be declared.
~> **Note:** When using the Integrated Storage backend, it is strongly recommended to
set [`disable_mlock`](/vault/docs/configuration#disable_mlock) to `true`, and to disable memory swapping on the system.
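
The three notes above can be read together as one server configuration. The following is a minimal sketch rather than a recommended production setup: the paths, addresses, and the `api_addr` value are placeholder assumptions, and the listener stanza is omitted for brevity.

```hcl
# Sketch only: paths and addresses are placeholders.
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"
}

# Required with Integrated Storage: the address and port used for
# communication between the nodes in the Raft cluster.
cluster_addr = "http://127.0.0.1:8201"

# Assumed placeholder for the client-facing API address.
api_addr = "http://127.0.0.1:8200"

# Strongly recommended with Integrated Storage. Memory swapping should
# also be disabled on the host itself.
disable_mlock = true

# There is intentionally no ha_storage stanza: a separate HA backend
# cannot be declared alongside Integrated Storage.
```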
## `raft` parameters
- `path` `(string: "")` - The file system path where all the Vault data gets
stored.
This value can be overridden by setting the `VAULT_RAFT_PATH` environment variable.
- `node_id` `(string: "")` - The identifier for the node in the Raft cluster.
You can override `node_id` with the `VAULT_RAFT_NODE_ID` environment
variable. When `VAULT_RAFT_NODE_ID` is unset, Vault assigns a random
GUID during initialization and writes the value to `data/node-id` in the
directory specified by the `path` parameter.
- `performance_multiplier` `(integer: 0)` - An integer multiplier used by
servers to scale key Raft timing parameters, where each increment translates to
approximately 1 to 2 seconds of delay. For example, setting the multiplier to "3"
translates to 3 to 6 seconds of total delay. Tuning the multiplier affects the time it
takes Vault to detect leader failures and to perform leader elections, at the
expense of requiring more network and CPU resources for better performance.
Omitting this value or setting it to 0 uses the default timing described below.
Lower values tighten timing and increase sensitivity, while higher
values relax timing and reduce sensitivity.
By default, Vault uses a balanced timing value of 5, which is suitable for most
platforms and scenarios. You should only adjust the timing value when platform
telemetry indicates that a change is needed or different timing is required due
to the overall reliability of your platform (network, etc.).
Setting the timing value to 1 configures Raft to its highest performance (lowest
delay) mode. The maximum allowed value is 10.
- `trailing_logs` `(integer: 10000)` - This controls how many log entries are
left in the log store on disk after a snapshot is made. This should only be
adjusted when followers cannot catch up to the leader due to a very large
snapshot size and high write throughput causing log truncation before a
snapshot can be fully installed. If you need to use this to recover a cluster,
consider reducing write throughput or the amount of data stored on Vault. The
default value is 10000 which is suitable for all normal workloads. The
`trailing_logs` metric is not the same as `max_trailing_logs`.
- `snapshot_threshold` `(integer: 8192)` - This controls the minimum number of Raft
commit entries between snapshots that are saved to disk. This is a low-level
parameter that should rarely need to be changed. Very busy clusters
experiencing excessive disk IO may increase this value to reduce disk IO and
minimize the chances of all servers taking snapshots at the same time.
Increasing this trades off disk IO for disk space since the log will grow much
larger and the space in the `raft.db` file can't be reclaimed until the next
snapshot. Servers may take longer to recover from crashes or failover if this
is increased significantly as more logs will need to be replayed.
- `snapshot_interval` `(integer: 120 seconds)` - The snapshot interval
controls how often Raft checks whether a snapshot operation is
required. Raft randomly staggers snapshots between the configured
interval and twice the configured interval to keep the entire cluster
from performing a snapshot at once. The default snapshot interval is
120 seconds.
- `retry_join` `(list: [])` - A set of connection details for another node in the
cluster, which is used to help nodes locate a leader in order to join a cluster.
There can be one or more [`retry_join`](#retry_join-stanza) stanzas.
If the connection details for all nodes in the cluster are known in advance, you
can include these stanzas to enable nodes to automatically join the Raft cluster.
Once one of the nodes is initialized as the leader, the remaining nodes will use
their [`retry_join`](#retry_join-stanza) configuration to locate the leader and
join the cluster. Note that when using Shamir seal, the joined nodes will still
need to be unsealed manually.
See [the section below](#retry_join-stanza) for the parameters accepted by the
[`retry_join`](#retry_join-stanza) stanza.
- `retry_join_as_non_voter` `(boolean: false)` - <EnterpriseAlert inline />
Configures this node as a permanent non-voter. The node will not participate
in the Raft quorum but will still receive the data replication stream
enhancing the read throughput of the cluster. This option has the same effect
as the [`-non-voter`](/vault/docs/commands/operator/raft#non-voter) flag for
the `vault operator raft join` command, but only affects voting status when
joining via `retry_join` config. You can override the non-voter configuration
by setting the `VAULT_RAFT_RETRY_JOIN_AS_NON_VOTER` environment variable to
any non-empty value. Configuring a node as a non-voter is only valid if there
is at least one `retry_join` stanza.
- `max_entry_size` `(integer: 1048576)` - This configures the maximum number of
bytes for a Raft entry. It applies to both Put operations and transactions.
Any put or transaction operation exceeding this configuration value will cause
the respective operation to fail. Raft has a suggested max size of data in a
Raft log entry. This is based on current architecture, default timing, etc.
Integrated Storage also uses a chunk size that is the threshold used for
breaking a large value into chunks. By default, the chunk size is the same as
Raft's max size log entry. The default value for this configuration is 1048576
bytes, two times the chunking size.
- **Note:** This option corresponds to [Consul's `kv_max_value_size` parameter](/consul/docs/agent/config/config-files#kv_max_value_size) for
Vault clusters using a Consul storage backend. If you are migrating from Consul
storage to Raft Integrated Storage, and have changed this value in Consul from its
default to a value larger than the Integrated Storage default of 1MB, then you will
need to make the same change in Vault's Integrated Storage config.
- `max_mount_and_namespace_table_entry_size` `(integer)` - <EnterpriseAlert
inline /> Overrides `max_entry_size` to set a different limit for the specific
storage entries that contain mount tables, auth tables and namespace
configuration data. If you are reaching limits on the mount table size, you
can use this to increase the number of mounts and namespaces that can be
stored without the risk of other storage entries becoming too large. All other
notes on [`max_entry_size`](#max-entry-size) apply. Before changing this, read
the [Run Vault Enterprise
with many namespaces](/vault/docs/enterprise/namespaces/namespace-limits) guide regarding important performance considerations.
- `autopilot_reconcile_interval` `(string: "10s")` - The interval after
which autopilot picks up any state changes. A state change can mean several
things. For example: a newly joined voter node, initially added to the Raft
cluster as a non-voter by autopilot, has completed the stabilization period and
qualifies for promotion to voter; a node has become unhealthy and needs to be
shown as such in the state API; or a node has been marked as dead and needs to
be evicted from the Raft configuration.
- `autopilot_update_interval` `(string: "2s")` - This is the interval after which
autopilot will poll Vault for any updates to the information it cares about. This
includes things like the autopilot configuration, current autopilot state, raft
configuration, known servers, latest raft index, and stats for all the known servers.
The information that autopilot receives will be used to calculate its next state.
- `autopilot_upgrade_version` `(string: "")` - <EnterpriseAlert inline />
Overrides the version used by Autopilot during [automated
upgrades](/vault/docs/enterprise/automated-upgrades). Vault's build version is
used by default. The string provided must be a valid [Semantic
Version](https://semver.org).
- `autopilot_redundancy_zone` `(string: "")` - <EnterpriseAlert inline />
Specifies a [redundancy zone](/vault/docs/enterprise/redundancy-zones) which
is used by Autopilot to automatically swap out failed servers for enhanced
reliability.
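
As an illustration of how the tuning parameters above fit into the `storage` stanza, here is a hedged sketch. The values are arbitrary examples chosen to make the arithmetic visible, not recommendations, and most clusters should leave these parameters at their defaults. Writing `snapshot_interval` as a bare number of seconds is an assumption based on the documented type.

```hcl
# Sketch only: example values, not tuning advice.
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"

  # 3 increments at roughly 1 to 2 seconds each gives about 3 to 6 seconds
  # of total delay. 1 is the lowest-delay setting and 10 is the maximum.
  performance_multiplier = 3

  # Check whether a snapshot is needed every 240 seconds instead of 120.
  # Raft staggers snapshots between the interval and twice the interval,
  # so between 240 and 480 seconds here.
  snapshot_interval = 240

  # Require at least 16384 commit entries between snapshots (default 8192),
  # trading disk space for less disk IO.
  snapshot_threshold = 16384

  # Raise the entry limit from 1048576 bytes to 2097152 bytes, for example
  # to match a larger kv_max_value_size carried over from Consul.
  max_entry_size = 2097152

  # Autopilot timing, shown here at the documented defaults.
  autopilot_reconcile_interval = "10s"
  autopilot_update_interval    = "2s"
}
```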
<Warning title="Experimental">
- `raft_wal` `(boolean: false)` - Enables the
[write-ahead](/vault/docs/internals/integrated-storage#configurable-raft-log-store)
log store instead of the default of BoltDB.
- `raft_log_verifier_enabled` `(boolean: false)` - Enables the raft log verifier.
The verifier periodically writes small raft logs and verifies checksums to
ensure that data has been written correctly. The verifier works with raft
write-ahead **and** BoltDB log stores.
- `raft_log_verification_interval` `(string: "60s")` - Sets the interval at
which the raft log verifier writes verification logs. The default interval is
`60s` and the minimum supported interval is `10s`. The `raft_log_verification_interval`
parameter has no effect if `raft_log_verifier_enabled` is `false`.
</Warning>
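
The experimental log store parameters above could be combined as in the following sketch. The `30s` interval is an arbitrary example above the documented `10s` minimum, and writing the boolean values as quoted strings is an assumption about the accepted syntax.

```hcl
# Sketch only: experimental parameters, example values.
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"

  # Use the write-ahead log store instead of the default BoltDB store.
  raft_wal = "true"

  # Periodically write small raft logs and verify their checksums.
  # The verifier works with both the write-ahead and BoltDB log stores.
  raft_log_verifier_enabled = "true"

  # Must be at least 10s. Has no effect when the verifier is disabled.
  raft_log_verification_interval = "30s"
}
```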
### `retry_join` stanza
- `leader_api_addr` `(string: "")` - Address of a possible leader node.
- `auto_join` `(string: "")` - Cloud auto-join configuration, using
[go-discover](https://github.com/hashicorp/go-discover) syntax.
- `auto_join_scheme` `(string: "")` - The optional URI protocol scheme for addresses
discovered via auto-join. Available values are `http` or `https`.
- `auto_join_port` `(uint: "")` - The optional port used for addresses discovered
via auto-join.
- `leader_tls_servername` `(string: "")` - The TLS server name to use when
connecting with HTTPS.
Should match one of the names in the [DNS
SANs](https://en.wikipedia.org/wiki/Subject_Alternative_Name) of the remote
server certificate.
See also [Integrated Storage and TLS](/vault/docs/concepts/integrated-storage#autojoin-with-tls-servername).
- `leader_ca_cert_file` `(string: "")` - File path to the CA cert of the
possible leader node.
- `leader_client_cert_file` `(string: "")` - File path to the client certificate
for the follower node to establish client authentication with the possible
leader node.
- `leader_client_key_file` `(string: "")` - File path to the client key for the
follower node to establish client authentication with the possible leader node.
- `leader_ca_cert` `(string: "")` - CA cert of the possible leader node.
- `leader_client_cert` `(string: "")` - Client certificate for the follower node
to establish client authentication with the possible leader node.
- `leader_client_key` `(string: "")` - Client key for the follower node to
establish client authentication with the possible leader node.
Each [`retry_join`](#retry_join-stanza) block may provide TLS certificates via
file paths or as a single-line certificate string value with newlines delimited
by `\n`, but not a combination of both. Each [`retry_join`](#retry_join-stanza)
stanza may contain either a [`leader_api_addr`](#leader_api_addr) value or a
cloud [`auto_join`](#auto_join) configuration value, but not both. When an
[`auto_join`](#auto_join) value is provided, Vault will automatically attempt to
discover and resolve potential Raft leader addresses using [go-discover](https://github.com/hashicorp/go-discover).
See the go-discover
[README](https://github.com/hashicorp/go-discover/blob/master/README.md)
for details on the format of the `auto_join` value.
By default, Vault will attempt to reach discovered peers using HTTPS and port 8200. Operators may override these through the
[`auto_join_scheme`](#auto_join_scheme) and [`auto_join_port`](#auto_join_port)
fields respectively.
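
The rules above can be sketched with two `retry_join` stanzas: one using cloud auto-join with the scheme and port overridden, and one using a fixed leader address with inline certificate strings. The host names, the port `8210`, and the certificate bodies (elided as `...`) are hypothetical placeholders, not values from a real deployment.

```hcl
# Sketch only: host names, port, and certificate contents are placeholders.
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"

  # Cloud auto-join. Discovered peers are contacted over HTTPS on port 8200
  # by default; both are overridden here for illustration.
  # This stanza must not also set leader_api_addr.
  retry_join {
    auto_join        = "provider=aws region=eu-west-1 tag_key=vault tag_value=..."
    auto_join_scheme = "http"
    auto_join_port   = 8210
  }

  # Fixed leader address with inline certificates. Newlines inside each
  # single-line certificate string are written as \n. File-path parameters
  # such as leader_ca_cert_file must not be mixed in with these.
  retry_join {
    leader_api_addr       = "https://vault-0.example.internal:8200"
    leader_tls_servername = "vault.example.internal"
    leader_ca_cert        = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----"
    leader_client_cert    = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----"
    leader_client_key     = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----"
  }
}
```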
Example Configuration:
```hcl
storage "raft" {
  path    = "/Users/foo/raft/"
  node_id = "node1"

  retry_join {
    leader_api_addr         = "http://127.0.0.2:8200"
    leader_ca_cert_file     = "/path/to/ca1"
    leader_client_cert_file = "/path/to/client/cert1"
    leader_client_key_file  = "/path/to/client/key1"
  }
  retry_join {
    leader_api_addr         = "http://127.0.0.3:8200"
    leader_ca_cert_file     = "/path/to/ca2"
    leader_client_cert_file = "/path/to/client/cert2"
    leader_client_key_file  = "/path/to/client/key2"
  }
  retry_join {
    leader_api_addr         = "http://127.0.0.4:8200"
    leader_ca_cert_file     = "/path/to/ca3"
    leader_client_cert_file = "/path/to/client/cert3"
    leader_client_key_file  = "/path/to/client/key3"
  }
  retry_join {
    auto_join = "provider=aws region=eu-west-1 tag_key=vault tag_value=... access_key_id=... secret_access_key=..."
  }
}
```
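
For Vault Enterprise, a node that should join as a permanent non-voter might be configured along the following lines. This is a sketch under assumptions: the node ID and leader address are placeholders, and the quoted boolean is an assumption about the accepted syntax. At least one `retry_join` stanza must be present for the non-voter setting to be valid.

```hcl
# Sketch only: Vault Enterprise, placeholder node ID and address.
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_4"

  # Join as a permanent non-voter: the node receives the replication
  # stream but does not participate in the Raft quorum.
  retry_join_as_non_voter = "true"

  # Only valid together with at least one retry_join stanza.
  retry_join {
    leader_api_addr = "http://127.0.0.2:8200"
  }
}
```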
## Tutorial
Refer to the [Integrated
Storage](/vault/tutorials/raft) series of tutorials to learn more about implementing Vault using Integrated Storage.
[raft]: https://raft.github.io/ 'The Raft Consensus Algorithm'