mirror of
https://github.com/hashicorp/vault.git
synced 2025-09-16 03:11:09 +02:00
* Update README Let contributors know that docs will now be located in UDR * Add comments to each mdx doc Comment has been added to all mdx docs that are not partials * chore: added changelog changelog check failure * wip: removed changelog * Fix content errors * Doc spacing * Update website/content/docs/deploy/kubernetes/vso/helm.mdx Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com> --------- Co-authored-by: jonathanfrappier <92055993+jonathanfrappier@users.noreply.github.com> Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
100 lines
4.9 KiB
Plaintext
100 lines
4.9 KiB
Plaintext
---
|
|
layout: docs
|
|
page_title: Integrated Storage autopilot
|
|
description: Learn about the autopilot subsystem of integrated raft storage in Vault.
|
|
---
|
|
|
|
> [!IMPORTANT]
|
|
> **Documentation Update:** Product documentation, which were located in this repository under `/website`, are now located in [`hashicorp/web-unified-docs`](https://github.com/hashicorp/web-unified-docs), colocated with all other product documentation. Contributions to this content should be done in the `web-unified-docs` repo, and not this one. Changes made to `/website` content in this repo will not be reflected on the developer.hashicorp.com website.
|
|
|
|
# Autopilot
|
|
|
|
Autopilot enables automated workflows for managing Raft clusters. The current
|
|
feature set includes 3 main features: Server Stabilization, Dead Server Cleanup
|
|
and State API. These three features were introduced in Vault 1.7. The Enterprise
|
|
feature set includes 2 main features: Automated Upgrades and Redundancy Zones.
|
|
These two features were introduced in Vault 1.11.
|
|
|
|
## Server stabilization
|
|
|
|
Server stabilization helps to retain the stability of the Raft cluster by safely
|
|
joining new voting nodes to the cluster. When a new voter node is joined to an
|
|
existing cluster, autopilot adds it as a non-voter instead, and waits for a
|
|
pre-configured amount of time to monitor its health. If the node remains
|
|
healthy for the entire duration of stabilization, then that node will be
|
|
promoted as a voter. The server stabilization period can be tuned using
|
|
`server_stabilization_time` (see below).
|
|
|
|
## Dead server cleanup
|
|
|
|
Dead server cleanup automatically removes nodes deemed unhealthy from the
|
|
Raft cluster, avoiding the manual operator intervention. This feature can be
|
|
tuned using the `cleanup_dead_servers`, `dead_server_last_contact_threshold`,
|
|
and `min_quorum` (see below).
|
|
|
|
## State API
|
|
|
|
The [State API](/vault/api-docs/system/storage/raftautopilot#get-cluster-state) provides detailed information about all the nodes in the Raft cluster
|
|
in a single call. This API can be used for monitoring for cluster health.
|
|
|
|
### Follower health
|
|
|
|
Follower node health is determined by 2 factors.
|
|
|
|
- Its ability to heartbeat to leader node at regular intervals. Tuned using
|
|
`last_contact_threshold` (see below).
|
|
- Its ability to keep up with data replication from the leader node. Tuned using
|
|
`max_trailing_logs` (see below).
|
|
|
|
### Default configuration
|
|
|
|
By default, autopilot will be enabled with clusters using Vault 1.7+,
|
|
although dead server cleanup is not enabled by default. Upgrade of
|
|
Raft clusters deployed with older versions of Vault will also transition to use
|
|
autopilot automatically.
|
|
|
|
@include 'autopilot/config.mdx'
|
|
|
|
~> **Note**: Autopilot in Vault does similar things to what autopilot does in
|
|
[Consul](https://www.consul.io/). However, the configuration in these 2 systems
|
|
differ in terms of default values and thresholds; some additional parameters
|
|
might also show up in Vault in comparison to Consul as well. Autopilot in Vault
|
|
and Consul use different technical underpinnings requiring these differences, to
|
|
provide the autopilot functionality.
|
|
|
|
## Automated upgrades
|
|
|
|
[Automated Upgrades](/vault/docs/enterprise/automated-upgrades) lets you automatically upgrade a cluster of Vault nodes to a new version as
|
|
updated server nodes join the cluster. Once the number of nodes on the new version is
|
|
equal to or greater than the number of nodes on the old version, autopilot will promote
|
|
the newer versioned nodes to voters, demote the older versioned nodes to non-voters,
|
|
and initiate a leadership transfer from the older version leader to one of the newer
|
|
versioned nodes. After the leadership transfer completes, the older versioned non-voting
|
|
nodes can be removed from the cluster.
|
|
|
|
## Redundancy zones
|
|
|
|
[Redundancy Zones](/vault/docs/enterprise/redundancy-zones) provide both scaling and resiliency benefits by deploying non-voting
|
|
nodes alongside voting nodes on a per availability zone basis. When using redundancy zones,
|
|
each zone will have exactly one voting node and as many additional non-voting nodes as desired.
|
|
If the voting node in a zone fails, a non-voting node will be automatically promoted to
|
|
voter. If an entire zone is lost, a non-voting node from another zone will be promoted to voter,
|
|
maintaining quorum. These non-voting nodes function not only as hot standbys, but also
|
|
increase read scalability.
|
|
|
|
## Replication
|
|
|
|
DR secondary and Performance secondary clusters have their own autopilot configurations, managed
|
|
independently of their primary.
|
|
|
|
The [Autopilot API](/vault/api-docs/system/storage/raftautopilot) uses DR operation tokens for
|
|
authorization when executed against a DR secondary cluster.
|
|
|
|
## Tutorial
|
|
|
|
Refer to the following tutorials to learn more.
|
|
|
|
- [Integrated Storage Autopilot](/vault/tutorials/raft/raft-autopilot)
|
|
- [Fault Tolerance with Redundancy Zones](/vault/tutorials/raft/raft-redundancy-zones)
|
|
- [Automate Upgrades with Vault Enterprise](/vault/tutorials/raft/raft-upgrade-automation)
|