* Add the ability to specify extra audit only fields from a plugin
* Add extra auditing fields within the PKI OCSP handler
* Add missing copywrite headers
* Format OCSP dates when non-zero, otherwise specify not set to be clear
* Feedback 2: Only set time fields if not zero instead of non-parsable string
* Serialize JSON fields in SDK response struct
* Perform renames based on RFC feedback
* Resolve OpenAPI test failure
* add cl
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* Vault 42177 Add Backend Field (#12092)
* add a new struct for the total number of successful requests for transit and transform
* implement tracking for encrypt path
* implement tracking in encrypt path
* add tracking in rewrap
* add tracking to datakey path
* add tracking to hmac path
* add tracking to sign path
* add tracking to verify path
* unit tests for verify path
* add tracking to cmac path
* reset the global counter in each unit test
* add tracking to hmac verify
* add methods to retrieve and flush transit count
* modify the methods that store and update data protection call counts
* update the methods
* add a helper method to combine replicated and local data call counts
* add tracking to the endpoint
* fix some formatting errors
* add unit tests to path encrypt for tracking
* add unit tests to decrypt path
* fix linter error
* add unit tests to test update and store methods for data protection calls
* stub fix: do not create separate files
* fix the tracking by coordinating replicated and local data, add unit tests
* update all references to the new data struct
* revert to previous design with just one global counter for all calls for each cluster
* complete external test
* no need to check if current count is greater than 0, remove it
* feedback: remove unnecessary comments about atomic addition, standardize comments
* leave jira id on todo comment, remove unused method
* rename methods by removing HWM and max in names, update jira id in todo comment, update response field key name
* feedback: remove explicit counter in cmac tests, instead put in the expected number
* feedback: remove explicit tracking in the rest of the tests
* feedback: separate transit testing into its own external test
* Update vault/consumption_billing_util_test.go
Co-authored-by: divyaac <divya.chandrasekaran@hashicorp.com>
* update comment after test name change
* fix comments
* fix comments in test
* another comment fix
* feedback: remove incorrect comment
* fix a CE test
* fix the update method: instead of storing max, increment by the current count value
* update the unit test, remove local prefix as argument to the methods since we store only to non-replicated paths
* update the external test
* Adds a field to backend to track billing data
removed file
* Changed implementation to use a map instead
* Some more comments
* Add more implementation
* Edited grpc server backend
* Refactored a bit
* Fix one more test
* Modified map:
* Revert "Modified map:"
This reverts commit 1730fe1f358b210e6abae43fbdca09e585aaaaa8.
* Removed some other things
* Edited consumption billing files a bit
* Testing function
* Fix transit stuff and make sure tests pass
* Changes
* More changes
* More changes
* Edited external test
* Edited some more tests
* Edited and fixed tests
* One more fix
* Fix some more tests
* Moved some testing structures around and added error checking
* Fixed some nits
* Update builtin/logical/transit/path_sign_verify.go
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
* Edited some errors
* Fixed error logs
* Edited one more thing
* Decorate the error
* Update vault/consumption_billing.go
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
---------
Co-authored-by: Amir Aslamov <amir.aslamov@hashicorp.com>
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
* Edited stub function
---------
Co-authored-by: divyaac <divya.chandrasekaran@hashicorp.com>
Co-authored-by: Amir Aslamov <amir.aslamov@hashicorp.com>
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
Co-authored-by: divyaac <divyaac@berkeley.edu>
* Basic refactoring to reuse PKI certs for SSH
* Refactored so that files are moved to CE
* Modified comment
* Renamed CertCountSystemView
* Moved forwarding function and redefined consume function
* Renamed cert view file
* Moved forwarding function and redefined consume function
Small edit
Renamed cert view file
* Fix issues with commit
* Fix consume job
* Removed error
* Update vault/logical_system_helpers.go
---------
Co-authored-by: divyaac <divya.chandrasekaran@hashicorp.com>
Co-authored-by: Victor Rodriguez <vrizo@hashicorp.com>
Collect event subscriber filters on the active node of a cluster as
"cluster wide" filters, and send them from the secondary active to the
primary active node (`SendSecondaryFilters rpc`). The primary active
node forwards events downstream to the secondary active node if the
events match the secondary cluster's subscriber filters
(`RecvPrimaryEvents rpc`). Then the events are further distributed
around the secondary cluster via the existing `RecvActiveNodeEvents`
and `SendStandbyFilters` RPCs.
Events are forwarded downstream to the secondary cluster if the mount
exists on the secondary cluster, i.e. events from mounts with
`local=true` aren't forwarded, and events from mounts that are not
replicated via paths-filter aren't forwarded.
(This is the CE portion of the above^^)
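The forwarding rule described above can be sketched as a single predicate. This is a minimal illustration, not the actual implementation: the `Mount` type, its fields, `shouldForward`, and the exact-match filter set are all hypothetical names introduced here, and real subscriber filters are likely richer than an exact event-type match.

```go
package main

// Mount is a hypothetical, simplified view of a mount for this sketch.
type Mount struct {
	Path        string
	Local       bool // mounted with local=true
	FilteredOut bool // assumed flag: not replicated to the secondary via paths-filter
}

// shouldForward reports whether the primary active node would send an event
// downstream to a secondary cluster. The mount must exist on the secondary
// (neither local nor excluded by a paths-filter), and the event type must
// match one of the secondary's cluster-wide subscriber filters.
// Exact string matching is an assumption made to keep the sketch short.
func shouldForward(m Mount, eventType string, secondaryFilters map[string]bool) bool {
	if m.Local || m.FilteredOut {
		return false
	}
	return secondaryFilters[eventType]
}
```

With a filter set containing only `kv-v2/data-write`, an event of that type from a replicated mount is forwarded, while the same event from a `local=true` mount is not.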
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
* license: update headers to IBM Corp.
* `make proto`
* update offset because source file changed
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
Increment certificate counts in all PKI backends.
Ensure that the PkiCertificateCounter is invoked every time we store and
issue a certificate by any of the PKI backends.
Co-authored-by: Victor Rodriguez <vrizo@hashicorp.com>
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* add ce side code and stubs
* add changelog
* style refactor
* try to use APIPath as mount point instead of request field
* fix linter
* return a response struct instead of a pure timestamp
* add issue time to response
* add ttl to GetRotationInformation response
* rename field for clarity
* update ttl to just seconds
* rename next and last rotation time field; describe what they are
* rename function
* catch up to ent PR
* fix patch merge mistake
* Add an option to allow cert-auth to return metadata about client certs that fail login
* Add cl
* Update SPDX header for sdk/logical/response_test.go
This PR adds the CE plumbing to expose underlying ErrOverloaded errors.
The wrapper allows the HTTP layer to correctly assign 503 status codes
in responses.
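As a rough sketch of the idea, a wrapped sentinel error can be detected with `errors.Is` and mapped to a 503. The `ErrOverloaded` variable below is a stand-in defined for this example, and `wrapStorageError`, `statusFor`, and `errUnrelated` are hypothetical helpers, not the names used in the change.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrOverloaded is a stand-in sentinel for this sketch only.
var ErrOverloaded = errors.New("overloaded")

// errUnrelated is a hypothetical non-overload failure used for comparison.
var errUnrelated = errors.New("unrelated failure")

// wrapStorageError adds context while keeping the underlying error
// reachable through errors.Is / errors.Unwrap.
func wrapStorageError(err error) error {
	return fmt.Errorf("storage write rejected: %w", err)
}

// statusFor maps an error to an HTTP status code, assigning 503 whenever
// ErrOverloaded appears anywhere in the wrap chain.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrOverloaded):
		return http.StatusServiceUnavailable
	default:
		return http.StatusInternalServerError
	}
}
```

Here `statusFor(wrapStorageError(ErrOverloaded))` yields 503, while `statusFor(wrapStorageError(errUnrelated))` falls through to 500.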
This PR introduces the CE plumbing for a new high WritePriority, meant
to bypass rejection from the AOP write controller. We attach this
priority to any request on a sudo path, such that administrators can
still perform necessary operations during an overload.
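A minimal sketch of the priority rule, under the assumption that a write controller consults the priority before rejecting. The type and function names (`WritePriority`, `PriorityNormal`, `PriorityHigh`, `priorityFor`, `admitWrite`) are illustrative and may not match the real identifiers.

```go
package main

// WritePriority is a hypothetical priority attached to a request's writes.
type WritePriority int

const (
	PriorityNormal WritePriority = iota
	PriorityHigh
)

// priorityFor assigns high priority to requests on sudo paths.
func priorityFor(isSudoPath bool) WritePriority {
	if isSudoPath {
		return PriorityHigh
	}
	return PriorityNormal
}

// admitWrite models the controller decision: writes are rejected while
// overloaded unless the request carries high priority.
func admitWrite(overloaded bool, p WritePriority) bool {
	return !overloaded || p == PriorityHigh
}
```

Under this model, `admitWrite(true, priorityFor(true))` still admits the write during an overload, while a request on a non-sudo path is rejected.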
This PR introduces the CE plumbing for a new HTTP header, called
X-Vault-AOP-Force-Reject, which will force any associated request to
reject storage writes as if Vault were overloaded.
This flag is intended to test end-to-end functionality of write
rejection in Vault. This is specifically useful for testing 503 -
Service Unavailable HTTP response codes during load shedding.
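One way to picture the header handling is sketched below. It assumes the header carries a boolean value, which the description does not state; `forceReject` and `treatAsOverloaded` are hypothetical helpers written for this example.

```go
package main

import (
	"net/http"
	"strconv"
)

// Header name taken from the description above.
const forceRejectHeader = "X-Vault-AOP-Force-Reject"

// forceReject reports whether the request asks for forced rejection.
// Assumption: the header value is a boolean such as "true"; a missing
// or unparsable value is treated as false.
func forceReject(h http.Header) bool {
	v, err := strconv.ParseBool(h.Get(forceRejectHeader))
	return err == nil && v
}

// treatAsOverloaded makes storage writes behave as if Vault were
// overloaded when either the real condition or the test header is set.
func treatAsOverloaded(actuallyOverloaded bool, h http.Header) bool {
	return actuallyOverloaded || forceReject(h)
}
```

A test client could then set the header on a write request and assert that the response is 503 without having to generate real load.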
We have many hand-written String() methods (and similar) for enums.
These require more maintenance and are more error-prone than using
automatically generated methods. In addition, the auto-generated
versions can be more efficient.
Here, we switch to using https://github.com/loggerhead/enumer, itself
a fork of the no-longer-maintained https://github.com/diegostamigni/enumer,
which is in turn a fork of the mostly standard tool
https://pkg.go.dev/golang.org/x/tools/cmd/stringer.
We use this fork of enumer for Go 1.20+ compatibility and because
we require the `-transform` flag to be able to generate
constants that match our current code base.
Some enums were not targeted for this change:
* fix VAULT-24372
* use redaction settings in context to redact values in sys/leader
* add tests to check redaction in GetLeaderStatus and GetSealStatus
* add ENT badge to sys/config/ui/custom-messages api-docs page in ToC
* remove unrelated change to website ToC
* add gosimport to make fmt and run it
* move installation to tools.sh
* correct weird spacing issue
* Update Makefile
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
* fix a weird issue
---------
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
This PR introduces a new testonly endpoint for introspecting the
RequestLimiter state. It makes use of the endpoint to verify that changes to
the request_limiter config are honored across reload.
In the future, we may choose to make the sys/internal/request-limiter/status
endpoint available in normal binaries, but this is an expedient way to expose
the status for testing without having to rush the design.
In order to re-use as much of the existing command package utility functionality
as possible without introducing sprawling code changes, I introduced a new
server_util.go and exported some fields via accessors.
The tests shook out a couple of bugs (including a deadlock and lack of
locking around the core limiterRegistry state).
This commit introduces two new adaptive concurrency limiters in Vault,
which should handle overloading of the server during periods of
untenable request rate. The limiter adjusts the number of allowable
in-flight requests based on latency measurements performed across the
request duration. This approach allows us to reject entire requests
prior to doing any work and prevents clients from exceeding server
capacity.
The limiters intentionally target two separate vectors that have been
proven to lead to server over-utilization.
- Back pressure from the storage backend, resulting in bufferbloat in
  the WAL system. (enterprise)
- Back pressure from CPU over-utilization via PKI issue requests
  (specifically for RSA keys), resulting in failed heartbeats.
Storage constraints can be accounted for by limiting logical requests
according to their http.Method. We only limit requests with write-based
methods, since these will result in storage Puts and exhibit the
aforementioned bufferbloat.
CPU constraints are accounted for using the same underlying library and
technique; however, they require special treatment. The maximum number
of concurrent pki/issue requests found in testing (again, specifically
for RSA keys) is far lower than the minimum tolerable write request
rate. Without separate limiting, we would artificially impose limits on
tolerable request rates for non-PKI requests. To specifically target PKI
issue requests, we add a new PathsSpecial field, called limited,
allowing backends to specify a list of paths which should get
special-case request limiting.
For the sake of code cleanliness and future extensibility, we introduce
the concept of a LimiterRegistry. The registry proposed in this PR has
two entries, corresponding with the two vectors above. Each Limiter
entry has its own corresponding maximum and minimum concurrency,
allowing them to react to latency deviation independently and handle
high volumes of requests to targeted bottlenecks (CPU and storage).
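The registry idea can be illustrated with a small stdlib-only sketch. It is not the library or algorithm used in this commit: the adjustment rule below is a plain additive-increase, multiplicative-decrease scheme swapped in for brevity, and every name and number (`Limiter`, `LimiterRegistry`, `NewRegistry`, the keys, the bounds, the latency targets) is an assumption made for illustration.

```go
package main

import (
	"sync"
	"time"
)

// Limiter bounds in-flight requests and adapts its limit to observed latency.
type Limiter struct {
	mu       sync.Mutex
	minLimit int
	maxLimit int
	limit    int
	inFlight int
	target   time.Duration // latency above this is treated as back pressure
}

// NewLimiter starts at the maximum limit and backs off as latency rises.
func NewLimiter(minLimit, maxLimit int, target time.Duration) *Limiter {
	return &Limiter{minLimit: minLimit, maxLimit: maxLimit, limit: maxLimit, target: target}
}

// Acquire admits a request only if the current limit allows it, so a
// rejected request does no work at all.
func (l *Limiter) Acquire() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight >= l.limit {
		return false
	}
	l.inFlight++
	return true
}

// Release records the request latency and adjusts the limit: halve it
// (never below minLimit) on slow requests, grow it by one otherwise.
func (l *Limiter) Release(latency time.Duration) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.inFlight--
	if latency > l.target {
		l.limit /= 2
		if l.limit < l.minLimit {
			l.limit = l.minLimit
		}
	} else if l.limit < l.maxLimit {
		l.limit++
	}
}

// Limit returns the current concurrency limit.
func (l *Limiter) Limit() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.limit
}

// LimiterRegistry keeps one independent limiter per bottleneck.
type LimiterRegistry map[string]*Limiter

// NewRegistry builds the two entries described above; all values are made up.
func NewRegistry() LimiterRegistry {
	return LimiterRegistry{
		"write":        NewLimiter(100, 5000, 50*time.Millisecond),
		"special-path": NewLimiter(2, 100, 500*time.Millisecond),
	}
}
```

Because each entry has its own bounds and latency target, a slowdown in one class of requests shrinks only that class's limit and leaves the other untouched.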
In both cases, utilization will be effectively throttled before Vault
reaches any degraded state. The resulting 503 - Service Unavailable is a
retryable HTTP response code, which can be handled to gracefully retry
and eventually succeed. Clients should handle this by retrying with
jitter and exponential backoff. This is done within Vault's API, using
the go-retryablehttp library.
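For clients that do not go through Vault's API package, the retry policy might look roughly like the following. This is a stdlib-only sketch of exponential backoff with full jitter, not the go-retryablehttp implementation; `backoffCeiling`, `jittered`, and `shouldRetry` are hypothetical helpers, and retrying only on 503 is a simplification chosen for this example.

```go
package main

import (
	"math/rand"
	"net/http"
	"time"
)

// backoffCeiling returns base * 2^attempt, capped at max.
func backoffCeiling(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	return d
}

// jittered picks a uniformly random wait in [0, ceiling] so that many
// rejected clients do not all retry at the same instant.
func jittered(ceiling time.Duration) time.Duration {
	if ceiling <= 0 {
		return 0
	}
	return time.Duration(rand.Int63n(int64(ceiling) + 1))
}

// shouldRetry treats 503 - Service Unavailable as retryable.
func shouldRetry(status int) bool {
	return status == http.StatusServiceUnavailable
}
```

A caller would loop over attempts, sleeping for `jittered(backoffCeiling(attempt, base, max))` whenever `shouldRetry` returns true, and give up after a fixed number of attempts.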
Limiter testing was performed via benchmarks of mixed workloads and
across a deployment of agent pods with great success.