* Return the proper serial number in OCSP verification errors
- We returned the issuer's certificate number instead of the serial
number of the actual certificate we validated from an OCSP request.
- The incorrect serial number in the error is currently never shown
in Vault. The only user of this library is cert-auth,
which swallows errors around revoked certificates and returns
a boolean false instead of the actual error message.
* Add cl
* Use previously formatted serial in error msg
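A minimal sketch of the OCSP fix described above, assuming a hypothetical helper formatSerial and a hypothetical error builder revokedError (neither name is taken from the Vault code base). The point is only that the error is built from the serial of the certificate that was checked, not from the issuer's.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// formatSerial renders raw serial bytes as colon-separated hex, e.g. "1a:2b".
// Hypothetical helper, for illustration only.
func formatSerial(raw []byte) string {
	if len(raw) == 0 {
		raw = []byte{0}
	}
	parts := make([]string, len(raw))
	for i, b := range raw {
		parts[i] = fmt.Sprintf("%02x", b)
	}
	return strings.Join(parts, ":")
}

// revokedError reports the serial of the validated certificate.
// The issuer serial is accepted only to show which value must NOT be used.
func revokedError(subjectSerial, issuerSerial *big.Int) error {
	_ = issuerSerial // deliberately unused: reporting this value was the bug
	return fmt.Errorf("certificate with serial number %s has been revoked",
		formatSerial(subjectSerial.Bytes()))
}

func main() {
	subject := big.NewInt(0x1a2b)
	issuer := big.NewInt(0x01)
	fmt.Println(revokedError(subject, issuer))
}
```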
* Move fetchCertBySerial back into the main PKI package.
In order to avoid polluting the issuing package with StorageContext, move
fetchCertBySerial back to the main PKI package. Note that this requires that
FetchRevocationInfo also be moved back to the main package.
* Run make fmt.
* Move resolveIssuerCRLPath to PKI issuing package.
* Move fetchCertBySerial to PKI issuing package.
* Move fetchRevocationInfo to PKI revocation package.
* Make associateRevokedCertWithIsssuer a method of RevocationInfo.
* Move serialFromCert and normalizeSerial to PKI parsing package.
* Move writeUnifiedRevocationEntry to PKI revocation package.
* Run make fmt.
* Rename crlConfig to CrlConfig.
Rename defaultCrlConfig to DefaultCrlConfig.
* Move CrlConfig and DefaultCrlConfig to new package pki/revocation.
* Rename revocationInfo to RevocationInfo.
* Move RevocationInfo to pki/revocation.
* Add StorageContext interface to PKI's revocation package.
* Add CrlBuilderType interface to pki_backend package.
The purpose of the interface is to make it possible to gradually move (refactor)
CrlBuilder to the revocation package.
* Move CrlConfig and DefaultCrlConfig to package pki_backend.
* Make StorageContext.CrlBuilder() return a CrlBuilderType.
Add methods SetLastDeltaRebuildCheckTime() and ShouldInvalidate() to
CrlBuilderType.
* Move fetchIssuerMapForRevocationChecking to PKI's revocation package.
* Run make fmt.
* Add method storageContext.Logger().
* Add method storageContext.System().
* Add method storageContext.CrlBuilder().
* Add method storageContext.GetUnifiedTransferStatus().
* Add method storageContext.GetPkiManagedView().
* Add method storageContext.GetCertificateCounter().
* Add method storageContext.UseLegacyBundleCaStorage().
* Add method storageContext.GetRevokeStorageLock().
* Add acmeState to acmeContext.
Make acmeState accessible from acmeContext, so that storageContext doesn't have
to be used for this purpose.
* Decouple getAndValidateAcmeRole() from storageContext.Backend.
* Don't access Backend.ciepsState through storageContext.
* Add method storageContext.GetRole().
* Change signature of getCiepsAcmeSettings for CE compatibility.
- Within SCEP we need to expose an unauthed API endpoint that has
a handler for both GET and POST requests. The TestProperAuthing test
did not support this use case; this change adds that support.
Use hash_algorithm parameter on Transit's verify HMAC requests.
Parameter 'algorithm' has been deprecated in favour of 'hash_algorithm', so
update the pathHMACVerify() handler to use it when it is present.
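A rough sketch of the parameter precedence described above. The map stands in for the request's field data, pickHashAlgorithm is a hypothetical name, and the sha2-256 default is an assumption made for the example.

```go
package main

import "fmt"

// defaultHashAlgorithm is an assumed fallback for this sketch.
const defaultHashAlgorithm = "sha2-256"

// pickHashAlgorithm prefers hash_algorithm and falls back to the
// deprecated algorithm parameter only when hash_algorithm is absent.
func pickHashAlgorithm(params map[string]string) string {
	if v, ok := params["hash_algorithm"]; ok && v != "" {
		return v
	}
	if v, ok := params["algorithm"]; ok && v != "" {
		return v
	}
	return defaultHashAlgorithm
}

func main() {
	both := map[string]string{"algorithm": "sha2-256", "hash_algorithm": "sha2-512"}
	legacy := map[string]string{"algorithm": "sha2-384"}
	fmt.Println(pickHashAlgorithm(both))   // hash_algorithm wins
	fmt.Println(pickHashAlgorithm(legacy)) // deprecated parameter still honoured
}
```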
* PKI: Change sign-intermediate to truncate notAfter by default
- The PKI sign-intermediate API allowed an end-user to request a TTL
value that would extend beyond the signing issuer's notAfter. This would
generate an invalid CA chain when properly validated.
- We are now changing the default behavior to truncate the returned certificate
to the signing issuer's notAfter.
- End-users can get the old behavior by configuring the signing issuer's
leaf_not_after_behavior field to permit and calling sign-intermediate
with the new argument enforce_leaf_not_after_behavior set to true. The
new argument can also be used to return an error instead of truncating
if the signing issuer's leaf_not_after_behavior is set to err.
* Add cl
* Add cl and upgrade note
* Apply suggestions from code review
Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>
---------
Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>
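The sign-intermediate notAfter rules described above could be sketched as follows. The function names (effectiveBehavior, resolveNotAfter, hoursFromEpoch) are hypothetical and the code is not the PKI engine's implementation; it only encodes the stated rules: truncate by default, and apply the issuer's leaf_not_after_behavior only when enforce_leaf_not_after_behavior is true.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// effectiveBehavior returns "truncate" unless the caller asked to enforce
// the issuer's configured leaf_not_after_behavior.
func effectiveBehavior(issuerBehavior string, enforce bool) string {
	if !enforce {
		return "truncate"
	}
	return issuerBehavior
}

// resolveNotAfter applies the behavior when the requested notAfter would
// extend beyond the signing issuer's notAfter.
func resolveNotAfter(requested, issuerNotAfter time.Time, behavior string) (time.Time, error) {
	if !requested.After(issuerNotAfter) {
		return requested, nil
	}
	switch behavior {
	case "permit":
		return requested, nil
	case "truncate":
		return issuerNotAfter, nil
	case "err":
		return time.Time{}, errors.New("requested notAfter exceeds the signing issuer's notAfter")
	default:
		return time.Time{}, fmt.Errorf("unknown behavior %q", behavior)
	}
}

// hoursFromEpoch builds example timestamps.
func hoursFromEpoch(h int) time.Time {
	return time.Unix(0, 0).UTC().Add(time.Duration(h) * time.Hour)
}

func main() {
	issuer, requested := hoursFromEpoch(5), hoursFromEpoch(10)
	got, err := resolveNotAfter(requested, issuer, effectiveBehavior("permit", false))
	fmt.Println("truncated to issuer:", got.Equal(issuer), "err:", err)
}
```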
* Certificate Metadata, CE components
* License headers
* make proto
* move pathFetchMetadata to ENT
* move pathFetchMetadata path to ENT
* correct stub sig
* Issuers may not be available in legacy CA storage, shouldn't fail issue/sign
* clarify error msg
* add key types for cmac for transit key creation
* add test for key creation
* fix test logic and add cases
* fix logic for hmac
* add go doc
* fix key size and add check for HMAC key
* Use a less strict URL validation for PKI issuing and crl distribution urls
* comma handling
* limit to ldap
* remove comma hack
* changelog
* Add unit test validating ldap CRL urls
---------
Co-authored-by: Steve Clark <steven.clark@hashicorp.com>
We have many hand-written String() methods (and similar) for enums.
These require more maintenance and are more error-prone than using
automatically generated methods. In addition, the auto-generated
versions can be more efficient.
Here, we switch to using https://github.com/loggerhead/enumer, itself
a fork of the no-longer-maintained https://github.com/diegostamigni/enumer
and a fork of the mostly standard tool
https://pkg.go.dev/golang.org/x/tools/cmd/stringer.
We use this fork of enumer for Go 1.20+ compatibility and because
we require the `-transform` flag to be able to generate
constants that match our current code base.
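For illustration, the kind of hand-written String() method being replaced, next to a go:generate directive of the shape such a generator expects. The ConnState type and the snake transform value are assumptions made for this sketch, not taken from this change; once generated code exists, the hand-written method would be deleted to avoid a duplicate definition.

```go
package main

import "fmt"

// Illustrative directive only: the type name and transform value are assumptions.
//go:generate enumer -type=ConnState -transform=snake

// ConnState is a hypothetical enum used only for this example.
type ConnState int

const (
	ConnStateIdle ConnState = iota
	ConnStateInUse
	ConnStateClosed
)

// String is the hand-maintained form: every new constant needs a new case,
// which is easy to forget.
func (s ConnState) String() string {
	switch s {
	case ConnStateIdle:
		return "idle"
	case ConnStateInUse:
		return "in_use"
	case ConnStateClosed:
		return "closed"
	default:
		return fmt.Sprintf("ConnState(%d)", int(s))
	}
}

func main() {
	fmt.Println(ConnStateInUse, ConnState(42))
}
```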
Some enums were not targeted for this change:
When creating database connections, there is a race
condition when multiple goroutines try to create the
connection at the same time. This happens, for
example, on leadership changes in a cluster.
Normally, the extra database connections are cleaned
up when this is detected. However, some database
implementations, notably Postgres, do not seem to
clean up in a timely manner, and can leak in these
scenarios.
To fix this, we create a global lock when creating
database connections to prevent multiple connections
from being created at the same time.
We also clean up the logic at the end so that
if (somehow) we ended up creating an additional
connection, we use the existing one rather than
the new one. This by itself would solve our
problem long-term; however, it would still involve
many transient database connections being created
and immediately killed on leadership changes.
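A minimal sketch of the approach above, with hypothetical types (connManager, conn) standing in for the real database backend: one global creation lock, plus a re-check so that a caller who lost the race reuses the existing connection instead of creating another.

```go
package main

import (
	"fmt"
	"sync"
)

// conn stands in for a real database connection.
type conn struct{ name string }

type connManager struct {
	createMu sync.Mutex   // global lock: serializes connection creation
	mu       sync.RWMutex // guards conns and created
	conns    map[string]*conn
	created  int
}

func newConnManager() *connManager {
	return &connManager{conns: map[string]*conn{}}
}

func (m *connManager) lookup(name string) (*conn, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	c, ok := m.conns[name]
	return c, ok
}

// get returns the existing connection or creates exactly one.
func (m *connManager) get(name string) *conn {
	if c, ok := m.lookup(name); ok {
		return c
	}
	m.createMu.Lock()
	defer m.createMu.Unlock()
	// Re-check: another goroutine may have created it while we waited.
	if c, ok := m.lookup(name); ok {
		return c
	}
	c := &conn{name: name} // stands in for an expensive dial
	m.mu.Lock()
	defer m.mu.Unlock()
	m.conns[name] = c
	m.created++
	return c
}

func (m *connManager) createdCount() int {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.created
}

// concurrentGets simulates many goroutines racing for one connection.
func concurrentGets(m *connManager, name string, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.get(name)
		}()
	}
	wg.Wait()
}

func main() {
	m := newConnManager()
	concurrentGets(m, "postgres", 50)
	fmt.Println("connections created:", m.createdCount())
}
```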
It's not ideal to have a single global lock for
database connection creation. Some potential
alternatives:
* a map of locks from the connection name to the lock.
The biggest downside is that we probably will want to
garbage collect this map so that we don't have an
unbounded number of locks.
* a small pool of locks, where we hash the connection
names to pick the lock. Using such a pool generally
is a good way to introduce deadlock, but since we
will only use it in a specific case, and the purpose
is to improve performance for concurrent connection
creation, this is probably acceptable.
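The second alternative (a small pool of locks chosen by hashing the connection name) might look roughly like this. The pool size of 16 and the FNV hash are arbitrary choices for the sketch, and lockPool is a hypothetical type.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// poolSize is arbitrary; unrelated names may share a lock, which costs
// some concurrency but keeps the number of locks bounded.
const poolSize = 16

type lockPool struct {
	locks [poolSize]sync.Mutex
}

// index maps a connection name to a slot in the pool.
func (p *lockPool) index(name string) int {
	h := fnv.New32a()
	h.Write([]byte(name)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % poolSize)
}

// lockFor returns the lock guarding creation of the named connection.
func (p *lockPool) lockFor(name string) *sync.Mutex {
	return &p.locks[p.index(name)]
}

func main() {
	var p lockPool
	l := p.lockFor("postgres-prod")
	l.Lock()
	// ... create the connection here; never take a second pool lock
	// while holding this one, or deadlock becomes possible.
	l.Unlock()
	fmt.Println("slot:", p.index("postgres-prod"))
}
```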
Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
* Address OCSP client caching issue
- The OCSP cache built into the client that is used by cert-auth
would cache the responses, but when pulling out a cached value, the
response wasn't validating properly and was then thrown away.
- The issue was around a confusion of the client's internal status
vs the Go SDK OCSP status integer values.
- Add a test that validates the cache is now used
* Add cl
* Fix PKI test failing now due to the OCSP cache working
- Remove the lookup that preceded the revocation: now that the OCSP
cache works, the cached response from that lookup is reused, so the
new revocation was not being seen
* add gosimport to make fmt and run it
* move installation to tools.sh
* correct weird spacing issue
* Update Makefile
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
* fix a weird issue
---------
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
* Transit: Release locks using defer statements
- Leverage defer statements to Unlock the fetched policy
to avoid issues with forgetting to manually Unlock during
each return statement
* Add cl
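The defer pattern from the Transit change, sketched with a hypothetical policy type (the real Transit policy and its lock type differ). A single deferred Unlock covers every return path, including early error returns.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// policy is a stand-in for a lock-protected key policy.
type policy struct {
	mu   sync.Mutex
	name string
	keys int
}

// describe has several return statements; the deferred Unlock runs on all of them.
func describe(p *policy, withKeys bool) (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()

	if p.name == "" {
		return "", errors.New("policy has no name")
	}
	if !withKeys {
		return p.name, nil
	}
	return fmt.Sprintf("%s (%d keys)", p.name, p.keys), nil
}

// isUnlocked checks that the lock was released (TryLock needs Go 1.18+).
func isUnlocked(p *policy) bool {
	if p.mu.TryLock() {
		p.mu.Unlock()
		return true
	}
	return false
}

func main() {
	p := &policy{}
	_, err := describe(p, false)
	fmt.Println("error:", err, "unlocked afterwards:", isUnlocked(p))
}
```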
* Base Binary Cert and CSR Parse functions.
* Add otherSANS parsing.
* Notate what doesn't exist on a CSR.
* Fix otherSans call err-checking and add basic-constraints to CSR
* Move BasicConstraint parsing to be optionally set.
* Refactored to use existing ParseBasicConstraintsExtension.
* Add handling for the ChangeSubjectName ext on CSR that is needed for EST
* Remove ChangeSubjectName - it's an attribute, not an extension, and there is no clean way to parse it, so pare down for now.
* Make these public methods, so they can be used in vault.
* Add unit tests for certutil.ParseCertificateToCreationParameters.
Also add unit tests for certutil.ParseCertificateToFields.
* Cleanup TestParseCertificate.
* Add unit tests for certutil.ParseCsrToCreationParameters and ParseCsrToFields.
* Fix return values for "add_basic_constraints" in certutil.ParseCsrToFields.
Add a test for parsing CSRs where "add_basic_constraints" is false.
* Clear up some todos.
* Add a test for certutil.ParseCertificateToCreationParameters for non-CA cert.
* Tweak TestParseCertificate/full_non_CA_cert.
* Basics of three remaining fields - keyUsage; extKeyUsage; PolicyIdentifiers
* Fix tests and err handling
* Add unit tests for policy_identifiers; ext_key_usage_oids; key_usage
* Add test on ext_key_usage_oids
* Remove duplicate usages elsewhere.
* Add error handling to csr-checks.
* Remove extranames on returned types.
* Remove useless function.
---------
Co-authored-by: Victor Rodriguez <vrizo@hashicorp.com>
* add new plugin wif fields to AWS Secrets Engine
* add changelog
* go get awsutil v0.3.0
* fix up changelog
* fix test and field parsing helper
* godoc on new test
* require role arn when audience set
* make fmt
---------
Co-authored-by: Austin Gebauer <agebauer@hashicorp.com>
Co-authored-by: Austin Gebauer <34121980+austingebauer@users.noreply.github.com>
This commit introduces two new adaptive concurrency limiters in Vault,
which should handle overloading of the server during periods of
untenable request rate. The limiter adjusts the number of allowable
in-flight requests based on latency measurements performed across the
request duration. This approach allows us to reject entire requests
prior to doing any work and prevents clients from exceeding server
capacity.
The limiters intentionally target two separate vectors that have been
proven to lead to server over-utilization.
- Back pressure from the storage backend, resulting in bufferbloat in
the WAL system. (enterprise)
- Back pressure from CPU over-utilization via PKI issue requests
(specifically for RSA keys), resulting in failed heartbeats.
Storage constraints can be accounted for by limiting logical requests
according to their http.Method. We only limit requests with write-based
methods, since these will result in storage Puts and exhibit the
aforementioned bufferbloat.
CPU constraints are accounted for using the same underlying library and
technique; however, they require special treatment. The maximum number
of concurrent pki/issue requests found in testing (again, specifically
for RSA keys) is far lower than the minimum tolerable write request
rate. Without separate limiting, we would artificially impose limits on
tolerable request rates for non-PKI requests. To specifically target PKI
issue requests, we add a new PathsSpecial field, called limited,
allowing backends to specify a list of paths which should get
special-case request limiting.
For the sake of code cleanliness and future extensibility, we introduce
the concept of a LimiterRegistry. The registry proposed in this PR has
two entries, corresponding with the two vectors above. Each Limiter
entry has its own corresponding maximum and minimum concurrency,
allowing them to react to latency deviation independently and handle
high volumes of requests to targeted bottlenecks (CPU and storage).
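A heavily simplified sketch of the registry idea, not the code in this PR. A plain additive-increase/multiplicative-decrease rule stands in for whatever algorithm the underlying library uses, and every name, path, and number below (limiter, registry, "pki/issue/web", the bounds and latency target) is an assumption made for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

// limiter caps in-flight requests and adapts the cap to observed latency.
type limiter struct {
	mu                 sync.Mutex
	minLimit, maxLimit int
	limit, inFlight    int
	target             time.Duration
}

func newLimiter(minLimit, maxLimit int, target time.Duration) *limiter {
	return &limiter{minLimit: minLimit, maxLimit: maxLimit, limit: maxLimit, target: target}
}

// acquire admits the request or rejects it before any work is done.
func (l *limiter) acquire() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight >= l.limit {
		return false
	}
	l.inFlight++
	return true
}

// release records the request latency and adjusts the limit.
func (l *limiter) release(latency time.Duration) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.inFlight--
	if latency > l.target {
		l.limit /= 2 // back off quickly when latency degrades
		if l.limit < l.minLimit {
			l.limit = l.minLimit
		}
	} else if l.limit < l.maxLimit {
		l.limit++ // recover slowly
	}
}

// registry holds one limiter per bottleneck.
type registry struct {
	limiters map[string]*limiter
	limited  map[string]bool // paths flagged for special-case limiting
}

func newRegistry() *registry {
	return &registry{
		limiters: map[string]*limiter{
			"write":        newLimiter(10, 100, 100*time.Millisecond),
			"special-path": newLimiter(1, 2, 100*time.Millisecond),
		},
		limited: map[string]bool{"pki/issue/web": true},
	}
}

// pick returns the limiter for a request, or nil when it is not limited.
func (r *registry) pick(method, path string) *limiter {
	if r.limited[path] {
		return r.limiters["special-path"]
	}
	switch method {
	case http.MethodPost, http.MethodPut, http.MethodPatch, http.MethodDelete:
		return r.limiters["write"]
	}
	return nil
}

// ms converts milliseconds to a Duration.
func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

func main() {
	r := newRegistry()
	l := r.pick(http.MethodPost, "pki/issue/web")
	fmt.Println(l.acquire(), l.acquire(), l.acquire()) // third is rejected
	l.release(ms(500))
	l.release(ms(500))
}
```

A rejected acquire would map to the 503 response; reads are left unlimited here because pick returns nil for them.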
In both cases, utilization will be effectively throttled before Vault
reaches any degraded state. The resulting 503 - Service Unavailable is a
retryable HTTP response code, which can be handled to gracefully retry
and eventually succeed. Clients should handle this by retrying with
jitter and exponential backoff. This is done within Vault's API, using
the go-retryablehttp library.
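As an illustration of what a client should do on a 503, here is a standard-library sketch of retrying with exponential backoff and full jitter. It does not use go-retryablehttp; retryOn503, backoff, and the 1ms base delay are assumptions for the example.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"time"
)

// baseDelay is kept tiny so the example runs quickly.
var baseDelay = time.Millisecond

// backoff returns a random delay in [0, baseDelay * 2^attempt).
func backoff(attempt int) time.Duration {
	if attempt > 10 {
		attempt = 10 // cap the exponent so the ceiling stays bounded
	}
	ceiling := baseDelay << uint(attempt)
	return time.Duration(rand.Int63n(int64(ceiling)))
}

// retryOn503 calls op until it returns something other than 503 or the
// attempt budget is spent. It returns the last status and the number of tries.
func retryOn503(maxAttempts int, op func() int) (status, tries int) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	for tries = 1; ; tries++ {
		status = op()
		if status != http.StatusServiceUnavailable || tries == maxAttempts {
			return status, tries
		}
		time.Sleep(backoff(tries - 1))
	}
}

func main() {
	calls := 0
	status, tries := retryOn503(5, func() int {
		calls++
		if calls < 3 {
			return http.StatusServiceUnavailable // server is shedding load
		}
		return http.StatusOK
	})
	fmt.Println(status, tries)
}
```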
Limiter testing was performed via benchmarks of mixed workloads and
across a deployment of agent pods with great success.
Adds the ability to pin a version for a specific plugin type + name to enable an easier plugin upgrade UX. After pinning and reloading, that version should be the only version in use.
No HTTP API implementation yet for managing pins, so no user-facing effects yet.
* Migration of OtherSANs Parsing Call to SDK helper from pki-issuer
* Based on PR feedback from Steve, remove internal variable, reference certutil directly.