This PR introduces the CE plumbing for a new HTTP header, called
X-Vault-AOP-Force-Reject, which will force any associated request to
reject storage writes as if Vault were overloaded.
This flag is intended to test end-to-end functionality of write
rejection in Vault. This is specifically useful for testing 503 -
Service Unavailable HTTP response codes during load shedding.
This removes the WebSockets endpoint for events
(which will be moved to the Enterprise repo) and
disables tests that rely on it unless they are
running in Enterprise.
It also updates documentation to document that
events are only available in Vault Enterprise.
This commit introduces two new adaptive concurrency limiters in Vault,
which should handle overloading of the server during periods of
untenable request rate. The limiter adjusts the number of allowable
in-flight requests based on latency measurements performed across the
request duration. This approach allows us to reject entire requests
prior to doing any work and prevents clients from exceeding server
capacity.
The limiters intentionally target two separate vectors that have been
proven to lead to server over-utilization.
- Back pressure from the storage backend, resulting in bufferbloat in
the WAL system. (enterprise)
- Back pressure from CPU over-utilization via PKI issue requests
(specifically for RSA keys), resulting in failed heartbeats.
Storage constraints can be accounted for by limiting logical requests
according to their http.Method. We only limit requests with write-based
methods, since these will result in storage Puts and exhibit the
aforementioned bufferbloat.
CPU constraints are accounted for using the same underlying library and
technique; however, they require special treatment. The maximum number
of concurrent pki/issue requests found in testing (again, specifically
for RSA keys) is far lower than the minimum tolerable write request
rate. Without separate limiting, we would artificially impose limits on
tolerable request rates for non-PKI requests. To specifically target PKI
issue requests, we add a new PathsSpecial field, called limited,
allowing backends to specify a list of paths which should get
special-case request limiting.
For the sake of code cleanliness and future extensibility, we introduce
the concept of a LimiterRegistry. The registry proposed in this PR has
two entries, corresponding with the two vectors above. Each Limiter
entry has its own corresponding maximum and minimum concurrency,
allowing them to react to latency deviation independently and handle
high volumes of requests to targeted bottlenecks (CPU and storage).
In both cases, utilization will be effectively throttled before Vault
reaches any degraded state. The resulting 503 - Service Unavailable is a
retryable HTTP response code, which can be handled to gracefully retry
and eventually succeed. Clients should handle this by retrying with
jitter and exponential backoff. This is done within Vault's API, using
the go-retryablehttp library.
Limiter testing was performed via benchmarks of mixed workloads and
across a deployment of agent pods with great success.
* Re-process .well-known redirects with a recursive handler call rather than a 302 redirect
* Track when the RequestURI mismatches path (in a redirect) and add it to the audit log
* call cancelFunc
* wip
* more pruning
* Integrate OCSP into binary paths PoC
- Simplify some of the changes to the router
- Remove the binary test PKI endpoint
- Switch OCSP to use the new binary paths backend variable
* Fix proto generation and test compilation
* Add unit test for binary request handling
---------
Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
The flag `events.alpha1` will no longer do anything, but we keep it
to prevent breaking users who have it in their configurations or
startup flags, or if it is referenced in other code.
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Updating the license from MPL to Business Source License.
Going forward, this project will be licensed under the Business Source License v1.1. Please see our blog post for more details at https://hashi.co/bsl-blog, FAQ at www.hashicorp.com/licensing-faq, and details of the license at www.hashicorp.com/bsl.
* add missing license headers
* Update copyright file headers to BUS-1.1
* Fix test that expected exact offset on hcl file
---------
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
Co-authored-by: Sarah Thompson <sthompson@hashicorp.com>
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
* Add header operation to sdk/logical
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add support for routing HEAD operations
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
This checks the request against the `read` permission for
`sys/events/subscribe/{eventType}` on the initial subscribe.
Future work includes moving this to its own verb (`subscribe`)
and periodically rechecking the request.
Tested locally by minting a token with the wrong permissions
and verifying that they are rejected as expected, and that
they work if the policy is adjusted to `sys/event/subscribe/*`
(or the specific topic name) with `read` permissions.
I had to change the `core.checkToken()` to be publicly accessible,
as it seems like the easiest way to check the token on the
`logical.Request` against all relevant policies, but without
going into all of the complex logic further in `handleLogical()`.
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
* Refactor existing CRL function to storage getRevocationConfig
* Introduce ocsp_disable config option in config/crl
* Introduce OCSPSigning usage flag on issuer
* Add ocsp-request passthrough within lower layers of Vault
* Add OCSP responder to Vault PKI
* Add API documentation for OCSP
* Add cl
* Revert PKI storage migration modifications for OCSP
* Smaller PR feedback items
- pki.mdx doc update
- parens around logical.go comment to indicate DER encoded request is
related to OCSP and not the snapshots
- Use AllIssuers instead of writing them all out
- Drop zero initialization of crl config's Disable flag if not present
- Upgrade issuer on the fly instead of an initial migration
* Additional clean up backing out the writeRevocationConfig refactoring
* Remove Dirty issuer flag and update comment about not writing upgrade to
storage
* Address PR feedback and return Unknown response when mismatching issuer
* make fmt
* PR Feedback.
* More PR feedback
- Leverage ocsp response constant
- Remove duplicate errors regarding unknown issuers
* identity/oidc: allow filtering the list providers response by an allowed_client_id
* adds changelog
* adds api documentation
* use identity store view in list provider test
* VAULT-1564 report in-flight requests
* adding a changelog
* Changing some variable names and fixing comments
* minor style change
* adding unauthenticated support for in-flight-req
* adding documentation for the listener.profiling stanza
* adding an atomic counter for the inflight requests
addressing comments
* addressing comments
* logging completed requests
* fixing a test
* providing log_requests_info as a config option to determine at which level requests should be logged
* removing a member and a method from the StatusHeaderResponseWriter struct
* adding api docks
* revert changes in NewHTTPResponseWriter
* Fix logging invalid log_requests_info value
* Addressing comments
* Fixing a test
* use an tomic value for logRequestsInfo, and moving the CreateClientID function to Core
* fixing go.sum
* minor refactoring
* protecting InFlightRequests from data race
* another try on fixing a data race
* another try to fix a data race
* addressing comments
* fixing couple of tests
* changing log_requests_info to log_requests_level
* minor style change
* fixing a test
* removing the lock in InFlightRequests
* use single-argument form for interface assertion
* adding doc for the new configuration paramter
* adding the new doc to the nav data file
* minor fix
* handle HTTP PATCH requests as logical.PatchOperation
* update go.mod, go.sum
* a nil response for logical.PatchOperation should result in 404
* respond with 415 for incorrect MIME type in PATCH Content-Type header
* add abstraction to handle PatchOperation requests
* add ACLs for patch
* Adding JSON Merge support to the API client
* add HTTP PATCH tests to check high level response logic
* add permission-based 'kv patch' tests in prep to add HTTP PATCH
* adding more 'kv patch' CLI command tests
* fix TestHandler_Patch_NotFound
* Fix TestKvPatchCommand_StdinValue
* add audit log test for HTTP PATCH
* patch CLI changes
* add patch CLI tests
* change JSONMergePatch func to accept a ctx
* fix TestKVPatchCommand_RWMethodNotExists and TestKVPatchCommand_RWMethodSucceeds to specify -method flag
* go fmt
* add a test to verify patching works by default with the root token
* add changelog entry
* get vault-plugin-secrets-kv@add-patch-support
* PR feedback
* reorder some imports; go fmt
* add doc comment for HandlePatchOperation
* add json-patch@v5.5.0 to go.mod
* remove unnecessary cancelFunc for WriteBytes
* remove default for -method
* use stable version of json-patch; go mod tidy
* more PR feedback
* temp go get vault-plugin-secrets-kv@master until official release
Co-authored-by: Josh Black <raskchanky@users.noreply.github.com>
* removing extra information from the returned error, to avoid leaking it to unauthenticated requests
* removing extra information from the returned error, to avoid leaking it to unauthenticated requests
* Change the error message in a way that is retains the HTTP status code
Co-authored-by: bruj0 <ramakandra@gmail.com>
* removing extra information from the returned error, to avoid leaking it to unauthenticated requests
* removing extra information from the returned error, to avoid leaking it to unauthenticated requests
Co-authored-by: Scott Miller <smiller@hashicorp.com>
* Initial work
* rework
* s/dr/recovery
* Add sys/raw support to recovery mode (#7577)
* Factor the raw paths out so they can be run with a SystemBackend.
# Conflicts:
# vault/logical_system.go
* Add handleLogicalRecovery which is like handleLogical but is only
sufficient for use with the sys-raw endpoint in recovery mode. No
authentication is done yet.
* Integrate with recovery-mode. We now handle unauthenticated sys/raw
requests, albeit on path v1/raw instead v1/sys/raw.
* Use sys/raw instead raw during recovery.
* Don't bother persisting the recovery token. Authenticate sys/raw
requests with it.
* RecoveryMode: Support generate-root for autounseals (#7591)
* Recovery: Abstract config creation and log settings
* Recovery mode integration test. (#7600)
* Recovery: Touch up (#7607)
* Recovery: Touch up
* revert the raw backend creation changes
* Added recovery operation token prefix
* Move RawBackend to its own file
* Update API path and hit it using CLI flag on generate-root
* Fix a panic triggered when handling a request that yields a nil response. (#7618)
* Improve integ test to actually make changes while in recovery mode and
verify they're still there after coming back in regular mode.
* Refuse to allow a second recovery token to be generated.
* Resize raft cluster to size 1 and start as leader (#7626)
* RecoveryMode: Setup raft cluster post unseal (#7635)
* Setup raft cluster post unseal in recovery mode
* Remove marking as unsealed as its not needed
* Address review comments
* Accept only one seal config in recovery mode as there is no scope for migration
* sys/pprof: add pprof routes to the system backend
* sys/pprof: add pprof paths to handler with local-only check
* fix trailing slash on pprof index endpoint
* use new no-forward handler on pprof
* go mod tidy
* add pprof external tests
* disallow streaming requests to exceed DefaultMaxRequestDuration
* add max request duration test
This allows logical operations (along with a non-nil response writer) to
process http handler funcs within the operation function while keeping
auth and audit checks that the logical request flow provides.