* wip
* Unit test the CRL limit, wire up config
* Bigger error
* API docs
* wording
* max_crl_entries, + ignore 0 or < -1 values to the config endpoint
* changelog
* rename field in docs
* Update website/content/api-docs/secret/pki/index.mdx
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* Update website/content/api-docs/secret/pki/index.mdx
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
---------
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* Move fetchCertBySerial back into the main PKI package.
In order to avoid polluting the issuing package with StorageContext, move
fetchCertBySerial back to the main PKI package. Note that this requires that
FetchRevocationInfo also be moved back to the main package.
* Run make fmt.
* Move resolveIssuerCRLPath to PKI issuing package.
* Move fetchCertBySerial to PKI issuing package.
* Move fetchRevocationInfo to PKI revocation package.
* Make associateRevokedCertWithIsssuer a method of RevocationInfo.
* Move serialFromCert and normalizeSerial to PKI parsing package.
* Move writeUnifiedRevocationEntry to PKI revocation package.
* Run make fmt.
* Rename crlConfig to CrlConfig.
Rename defaultCrlConfig to DefaultCrlConfig.
* Move CrlConfig and DefaultCrlConfig to new package pki/revocation.
* Rename revocationInfo to RevocationInfo.
* Move RevocationInfo to pki/revocation.
* Add StorageContext interface to PKI's revocation package.
* Add CrlBuilderType interface to pki_backend package.
The purpose of the interface is to make it possible to gradually move (refactor)
CrlBuilder to the revocation package.
* Move CrlConfig and DefaultCrlConfig to package pki_backend.
* Make StorageContext.CrlBuilder() return a CrlBuilderType.
Add methods SetLastDeltaRebuildCheckTime() and ShouldInvalidate() to
CrlBuilderType.
* Move fetchIssuerMapForRevocationChecking to PKI's revocation package.
* Run make fmt.
* Add method storageContext.Logger().
* Add method storageContext.System().
* Add method storageContext.CrlBuilder().
* Add method storageContext.GetUnifiedTransferStatus().
* Add method storageContext.GetPkiManagedView().
* Add method storageContext.GetCertificateCounter().
* Add method storageContext.UseLegacyBundleCaStorage().
* Add method storageContext.GetRevokeStorageLock().
* Add acmeState to acmeContext.
Make acmeState accessible from acmeContext, so that storageContext doesn't have
to be used for this purpose.
* Decouple getAndValidateAcmeRole() from storageContext.Backend.
* Don't access Backend.ciepsState through storageContext.
* Add method storageContext.GetRole().
* Change signature of getCiepsAcmeSettings for CE compatibility.
* Certificate Metadata, CE components
* License headers
* make proto
* move pathFetchMetadata to ENT
* move pathFetchMetadata path to ENT
* correct stub sig
* Issuers may not be available in legacy CA storage, shouldn't fail issue/sign
* clarify error msg
* PKI refactoring to start breaking apart monolith into sub-packages
- This was broken down by commit within enterprise for ease of review
but would be too difficult to bring back individual commits back
to the CE repository. (they would be squashed anyways)
- This change was created by exporting a patch of the enterprise PR
and applying it to CE repository
* Fix TestBackend_OID_SANs to not be rely on map ordering
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Updating the license from MPL to Business Source License.
Going forward, this project will be licensed under the Business Source License v1.1. Please see our blog post for more details at https://hashi.co/bsl-blog, FAQ at www.hashicorp.com/licensing-faq, and details of the license at www.hashicorp.com/bsl.
* add missing license headers
* Update copyright file headers to BUS-1.1
* Fix test that expected exact offset on hcl file
---------
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
Co-authored-by: Sarah Thompson <sthompson@hashicorp.com>
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
* Refactor CRL writing config to passthrough cache
When reading the CRL config via API endpoint, always read through to the
disk, updating the cache in the process. Similarly, when writing to the
CRL config, read first from disk (updating the cache), and on write,
also write back through the cache, providing consistency without the
need to invalidate through markConfigDirty(...).
This will form the basis of the new pattern for config caching.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor ACME writing config to passthrough cache
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Replace all time.ParseDurations with testutil.ParseDurationSeconds
* Changelog
* Import formatting
* Import formatting
* Import formatting
* Import formatting
* Semgrep rule that runs as part of CI
* Add infrastructure for warnings on CRL rebuilds
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add warning on issuer missing KU for CRL Signing
When an entire issuer equivalency class is missing CRL signing usage
(but otherwise has key material present), we should add a warning so
operators can either correct this issuer or create an equivalent version
with KU specified.
Resolves: https://github.com/hashicorp/vault/issues/20137
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for issuer warnings
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix return order of CRL builders
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly find certificates for unified delta CRL
When building the unified delta CRL, WAL entries from the non-primary
cluster were ignored. This resulted in an incomplete delta CRL,
preventing some entries from appearing.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly rebuild unified delta CRLs
When deciding if the Unified Delta CRL should be rebuilt, we need to
check the status of all clusters and their last revoked serial numbers.
If any new serial has been revoked on any cluster, we should rebuild the
unified delta CRLs.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly persist Unified Delta CRL build entries
When building the unified CRL, we need to read the last seen serial
number from all clusters, not just the present cluster, and write it
to the last built serial for that cluster's unified delta WAL entry.
This prevents us from continuously rebuilding unified CRLs now that we
have fixed our rebuild heuristic.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix getLastWALSerial for unified delta CRLs
getLastWALSerial ignored its path argument, preventing it from reading
the specified cluster-specific WAL entry. On the primary cluster, this
was mostly equivalent, but now that we're correctly reading WAL entries
and revocations for other clusters, we need to handle reading these
entries correctly.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Copy delta WAL entries in event of failure
Any local delta WAL should be persisted to unified delta WAL space as
well. If such unified persistence fails, we need to ensure that they get
eventually moved up, otherwise they'll remain missing until the next
full CRL rebuild occurs, which might be significantly longer than when
the next delta CRL rebuild would otherwise occur. runUnifiedTransfer
already handles this for us, but it lacked logic for delta WAL serials.
The only interesting catch here is that we refuse to copy any entries
whose full unified revocation entry has not also been written.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Make doUnifiedTransferMissingLocalSerials log an error
This message is mostly an error and would always be helpful information
to have when troubleshooting failures.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Warn on cross-cluster write failures during revoke
When revoking certificates, we log cross-cluster revocation failures,
but we should really expose this information to the caller, that their
local revocation was successful, but their cross-cluster revocation
failed.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Ensure unified delta WAL entry has full entry
Delta WAL entries are empty files whose only information (a revoked
serial number) is contained in the file path. These depend implicitly on
a full revocation entry existing for this file (whether a cross-cluster
unified entry or a local entry).
We should not write unified delta WAL entries without the corresponding
full unified revocation entry existing. Add a warning in this case.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Log, don't err, on unified delta WAL write failure
When the PBPWF fails on the Active node of a PR Secondary cluster with a
read-only failure, there is no value in forwarding this request up to
the Active node of the PR Primary cluster: it does not have the local
revocation context necessary to write a Delta WAL entry for this
request, and would likely end up writing a cross-cluster revocation
entry (if it is enabled) or else erring completely.
Instead, log this error like we do when failing to write unified CRL
entries. Switch both to using Error instead of Debug for this type of
failure.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Forward PKI revocation requests received by standby nodes to active node
- A refactoring that occurred in 1.13 timeframe removed what was
considered a specific check for standby nodes that wasn't required
as a writes should be returning ErrReadOnly.
- That sadly exposed a long standing bug where the errors from the
storage layer were not being properly wrapped, hiding the ErrReadOnly
coming from a write and failing the request.
* Add cl
* Add test for basic PKI operations against standby nodes
* Telemetry Metrics Configuration.
* Err Shadowing Fix (woah, semgrep is cool).
* Fix TestBackend_RevokePlusTidy_Intermediate
* Add Changelog.
* Fix memory leak. Code cleanup as suggested by Steve.
* Turn off metrics by default, breaking-change.
* Show on tidy-status before start-up.
* Fix tests
* make fmt
* Add emit metrics to periodicFunc
* Test not delivering unavailable metrics + fix.
* Better error message.
* Fixing the false-error bug.
* make fmt.
* Try to fix race issue, remove confusing comments.
* Switch metric counter variables to an atomic.Uint32
- Switch the metric counter variables to an atomic variable type
so that we are forced to properly load/store values to it
* Fix race-issue better by trying until the metric is sunk.
* make fmt.
* empty commit to retrigger non-race tests that all pass locally
---------
Co-authored-by: Steve Clark <steven.clark@hashicorp.com>
* Use the unified CRL on legacy CRL paths if UnifiedCRLOnExistingPaths is set
- If the crl configuration option unified_crl_on_existing_paths is set
to true along with the unified_crl feature, provide the unified crl
on the existing CRL paths.
- Added some test helpers to help debugging, they are being used by
the ENT test that validates this feature.
* Rename method to shouldLocalPathsUseUnified
* Allow unification of revocations on other clusters
If a BYOC revocation occurred on cluster A, while the cert was initially
issued and stored on cluster B, we need to use the invalidation on the
unified entry to detect this: the revocation queues only work for
non-PoP, non-BYOC serial only revocations and thus this BYOC would be
immediately accepted on cluster A. By checking all other incoming
revocations for duplicates on a given cluster, we can ensure that
unified revocation is consistent across clusters.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use time-of-use locking for global revocation processing
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Unified revocation migration code
- Add a periodic function that will list the local revocations
and if any are missing from the unified revocation area will
force a write to the unified revocation folder/remote instance.
* PR Feedback
- Do not transfer expired certificates to unified space from local
- Move new periodic code into a periodic.go file
- Add a flag so we only run this stuff once if all is good, with
a force flag if we encounter errors or if unified_crl is toggled
on
* PR feedback take 2
- I missed this in the original review, that we were storing the
unified-crl in a cluster-local storage area so none of the other
hosts would receive it.
- Discovered while writing unit tests, the main cluster had the unified
crl but the other clusters would return an empty response
* Add unified CRL config storage helpers
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add support to build unified CRLs
This allows us to build unified versions of both the complete and delta
CRLs. This mostly involved creating a new variant of the
unified-specific CRL builder, fetching certs from each cluster's storage
space.
Unlike OCSP, here we do not unify the node's local storage with the
cross-cluster storage: this node is the active of the performance
primary, so writes to unified storage happen exactly the same as
writes to cluster-local storage, meaning the two are always in
sync. Other performance secondaries do not rebuild the CRL, and hence
the out-of-sync avoidance that we'd like to solve with the OCSP
responder is not necessary to solve here.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add ability to fetch unified CRLs
This adds to the path-fetch APIs the ability to return the unified CRLs.
We update the If-Modified-Since infrastructure to support querying the
unified CRL specific data and fetchCertBySerial to support all unified
variants. This works for both the default/global fetch APIs and the
issuer-specific fetch APIs.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rebuild CRLs on unified status changes
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Handle rebuilding CRLs due to either changing
This allows detecting if the Delta CRL needs to be rebuilt because
either the local or the unified CRL needs to be rebuilt. We never
trigger rebuilding the unified delta on a non-primary cluster.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Ensure serials aren't added to unified CRL twice
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Write delta WAL entries for unified CRLs
When we'd ordinarily write delta WALs for local CRLs, we also need to
populate the cross-cluster delta WAL. This could cause revocation to
appear to fail if the two clusters are disconnected, but notably regular
cross-cluster revocation would also fail.
Notably, this commit also changes us to not write Delta WALs when Delta
CRLs is disabled (versus previously doing it when auto rebuild is
enabled in case Delta CRLs were later asked for), and instead,
triggering rebuilding a complete CRL so we don't need up-to-date Delta
WAL info.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Update IMS test for forced CRL rebuilds
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Move comment about perf-primary only invalidation
Also remove noisy debug log.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Remove more noisy log statements during queue
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Skip revocation entries from our current cluster
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add locking and comment about tidying revoke queue
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Switch to time.Since for tidy
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor tidyStatuses into path_tidy.go
Leaving these in backend.go often causes us to miss adding useful values
to tidyStatus when we add a new config parameter.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Track the number of deleted revocation request
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow tidy to remove confirmed revocation requests
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add missing field to tidy test
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add global, cross-cluster revocation queue to PKI
This adds a global, cross-cluster replicated revocation queue, allowing
operators to revoke certificates by serial number across any cluster. We
don't support revoking with private key (PoP) in the initial
implementation.
In particular, building on the PBPWF work, we add a special storage
location for handling non-local revocations which gets replicated up to
the active, primary cluster node and back down to all secondary PR
clusters. These then check the pending revocation entry and revoke the
serial locally if it exists, writing a cross-cluster confirmation entry.
Listing capabilities are present under pki/certs/revocation-queue,
allowing operators to see which certs are present. However, a future
improvement to the tidy subsystem will allow automatic cleanup of stale
entries.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow tidying revocation queue entries
No manual operator control of revocation queue entries are allowed.
However, entries are stored with their request time, allowing tidy to,
after a suitable safety buffer, remove these unconfirmed and presumably
invalid requests.
Notably, when a cluster goes offline, it will be unable to process
cross-cluster revocations for certificates it holds. If tidy runs,
potentially valid revocations may be removed. However, it is up to the
administrator to ensure the tidy window is sufficiently long that any
required maintenance is done (or, prior to maintenance when an issue is
first noticed, tidy is temporarily disabled).
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Only allow enabling global revocation queue on Vault Enterprise
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use a locking queue to handle revocation requests
This queue attempts to guarantee that PKI's invalidateFunc won't have
to wait long to execute: by locking only around access to the queue
proper, and internally using a list, we minimize the time spent locked,
waiting for queue accesses.
Previously, we held a lock during tidy and processing that would've
prevented us from processing invalidateFunc calls.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* use_global_queue->cross_cluster_revocation
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Grab revocation storage lock when processing queue
We need to grab the storage lock as we'll actively be revoking new
certificates in the revocation queue. This ensures nobody else is
competing for storage access, across periodic funcs, new revocations,
and tidy operations.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix expected tidy status test
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow probing RollbackManager directly in tests
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Address review feedback on revocationQueue
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add more cancel checks, fix starting manual tidy
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor CRL building into separate functions
This will allow us to add the ability to add and build a unified CRL
across all clusters, reusing logic that is common to both, but letting
each have their own certificate lists.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rename localCRLConfigEntry->internalCRLConfigEntry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rename Delta WALs to Local Delta WALs
This adds clarity that we'll have a separate local and remote Delta CRL
and WALs for each.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rename revokeCert variable to identify serial number formatting
* Refactor out lease specific behavior out of revokeCert
- Isolate the specific behavior regarding revoking lease specific
certificates outside of the revokeCert function and into the only
caller that leveraged used it.
- This allows us to simplify revokeCert a little bit and keeps the
function purely about revoking a certificate
* Within revokeCert short circuit the already revoked use-case
- Make the function a little easier to process by exiting early
if the certificate has already been revoked.
* Do not load certificates from storage multiple times during revocation
- Isolate the loading of a certificate and parsing of a certificate
into a single attempt, either when provided the certificate for BYOC
revocation or strictly from storage for the other revocation types.
* With BYOC write certificate entry using dashes not the legacy colon char
A lot of places took a (context, backend, request) tuple, ignoring the
request proper and only using it for its storage. This (modified) tuple
is exactly the set of elements in the shared storage context, so we
should be using that instead of manually passing all three elements
around.
This simplifies a few places where we'd generate a storage context at
the request level and then split it apart only to recreate it again
later (e.g., CRL building).
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Also remove one duplicate error masked by return.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
When revoking an issuer, we immediately force a full rebuild of all CRLs
(complete and delta). However, we had forgotten to guard the delta CRL's
inclusion of augmented issuers, resulting in double-listing the issuer's
serial number on both the complete and the delta CRL. This isn't
necessary as the delta's referenced complete CRL number has incremented
to the point where the issuer itself was included on the complete CRL.
Avoid this double reference and don't include issuers on delta CRLs;
they should always appear only on the complete CRL.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* PKI: Do not load revoked certificates if CRL has been disabled
- Restore the prior behavior of not reading in all revoked certificates
if the CRL has been disabled as there might be performance issues
if a customer had or is still revoking a lot of certificates.
* Add cl
When adding delta CRL support, we unconditionally added the delta
indicator extension to the main CRL. We shouldn't have done this, and
instead only added it conditionally when we were building delta CRLs.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Basics of Cert-Count Telemetry, changelog, "best attempt" slice to capture (and test for) duplicates, Move sorting of possibleDoubleCountedRevokedSerials to after compare of entries. Add values to counter when still initializing.
Set lists to nil after use, Fix atomic2 import, Delay reporting metrics until after deduplication has completed,
The test works now, Move string slice to helper function; Add backendUUID to gauge name.
* Don't race for CRL rebuilding capability check
Core has recently seen some data races during SystemView/replication
updates between them and the PKI subsystem. This is because this
SystemView access occurs outside of a request (during invalidation
handling) and thus the proper lock isn't held.
Because replication status cannot change within the lifetime of a plugin
(and instead, if a node switches replication status, the entire plugin
instance will be torn down and recreated), it is safe to cache this
once, at plugin startup, and use it throughout its lifetime.
Thus, we replace this SystemView access with a stored boolean variable
computed ahead of time.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Update builtin/logical/pki/backend.go
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add path to manually rebuild delta CRLs
The crl/rotate-delta path behaves like crl/rotate, triggering a
cluster-local rebuild of just the delta CRL. This is useful for when
delta CRLs are enabled with a longer-than-desired auto-rebuild period
after some high-profile revocations occur.
In the event delta CRLs are not enabled, this becomes a no-op.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for Delta CRL rebuilding
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Update documentation about Delta CRLs
Also fixes a omission in the If-Modified-Since docs to mention that the
response header should probably also be passed through.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Issuer renames should invalidate CRL cache times
When an issuer is renamed (or rather, two issuers' names are swapped in
quick succession), this is akin to the earlier identified default issuer
update condition. So, when any issuer is updated, go ahead and trigger
the invalidation logic.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix handling of delta CRL If-Modified-Since
The If-Modified-Since PR was proposed prior to the Delta CRL changes and
thus didn't take it into account. This follow-up commit fixes that,
addressing If-Modified-Since semantics for delta CRL fetching and
ensuring an accurate number is stored.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* honor header if-modified-since if present
* pathGetIssuerCRL first version
* check if modified since for CA endpoints
* fix date comparison for CA endpoints
* suggested changes and refactoring
* add writeIssuer to updateDefaultIssuerId and fix error
* Move methods out of storage.go into util.go
For the most part, these take a SC as param, but aren't directly storage
relevant operations. Move them out of storage.go as a result.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use UTC timezone for storage
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rework path_fetch for better if-modified-since handling
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Invalidate all issuers, CRLs on default write
When the default is updated, access under earlier timestamps will not
work as we're unclear if the timestamp is for this issuer or a previous
issuer. Thus, we need to invalidate the CRL and both issuers involved
(previous, next) by updating their LastModifiedTimes.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for If-Modified-Since
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly invalidate default issuer changes
When the default issuer changes, we'll have to mark the invalidation on
PR secondary clusters, so they know to update their CRL mapping as well.
The swapped issuers will have an updated modification time (which will
eventually replicate down and thus be correct), but the CRL modification
time is cluster-local information and thus won't be replicated.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* make fmt
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor sendNotModifiedResponseIfNecessary
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on if-modified-since
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow generation of up-to-date delta CRLs
While switching to periodic rebuilds of CRLs alleviates the constant
rebuild pressure on Vault during times of high revocation, the CRL
proper becomes stale. One response to this is to switch to OCSP, but not
every system has support for this. Additionally, OCSP usually requires
connectivity and isn't used to augment a pre-distributed CRL (and is
instead used independently).
By generating delta CRLs containing only new revocations, an existing
CRL can be supplemented with newer revocations without requiring Vault
to rebuild all complete CRLs. Admins can periodically fetch the delta
CRL and add it to the existing CRL and applications should be able to
support using serials from both.
Because delta CRLs are emptied when the next complete CRL is rebuilt, it
is important that applications fetch the delta CRL and correlate it to
their complete CRL; if their complete CRL is older than the delta CRL's
extension number, applications MUST fetch the newer complete CRL to
ensure they have a correct combination.
This modifies the revocation process and adds several new configuration
options, controlling whether Delta CRLs are enabled and when we'll
rebuild it.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for delta CRLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on delta CRLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Address review feedback: fix several bugs
Thanks Steve!
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly invoke periodic func on active nodes
We need to ensure we read the updated config (in case of OCSP request
handling on standby nodes), but otherwise want to avoid CRL/DeltaCRL
re-building.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor tidy steps into two separate helpers
This refactors the tidy go routine into two separate helpers, making it
clear where the boundaries of each are: variables are passed into these
method and concerns are separated. As more operations are rolled into
tidy, we can continue adding more helpers as appropriate. Additionally,
as we move to make auto-tidy occur, we can use these as points to hook
into periodic tidying.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor revInfo checking to helper
This allows us to validate whether or not a revInfo entry contains a
presently valid issuer, from the existing mapping. Coupled with the
changeset to identify the issuer on revocation, we can begin adding
capabilities to tidy to update this association, decreasing CRL build
time and increasing the performance of OCSP.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor issuer fetching for revocation purposes
Revocation needs to gracefully handle using the old legacy cert bundle,
so fetching issuers (and parsing them) needs to be done slightly
differently than other places. Refactor this from revokeCert into a
common helper that can be used by tidy.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow tidy to associate revoked certs, issuers
When revoking a certificate, we need to associate the issuer that signed
its certificate back to the revInfo entry. Historically this was
performed during CRL building (and still remains so), but when running
without CRL building and with only OCSP, performance will degrade as the
issuer needs to be found each time.
Instead, allow the tidy operation to take over this role, allowing us to
increase the performance of OCSP and CRL in this scenario, by decoupling
issuer identification from CRL building in the ideal case.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for tidy updates
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on new tidy parameter, metrics
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor tidy config into shared struct
Finish adding metrics, status messages about new tidy operation.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor CRL tests to use /sys/mounts
Thanks Steve for the approach! This also address nits from Kit.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Skip CRL building steps when disabled
This skips a number of steps during CRL build when it is disabled (and
forceNew is not set). In particular, we avoid fetching issuers, we avoid
associating issuers with revocation entries (and building that in-memory
mapping), making CRL building more efficient.
This means that there'll again be very little overhead on clusters with
the CRL disabled.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Prevent revoking roots from appearing on own CRLs
This change ensures that when marking a root as revoked, it no longer
appears on its own CRL. Very few clients support this event (as
generally only leaves/intermediates are checked for presence on a
parent's CRL) and it is technically undefined behavior (if the root is
revoked, its own CRL should be untrusted and thus including it on its
own CRL isn't a safe/correct distribution channel).
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Ensure stability of revInfo issuer identification
As mentioned by Kit, iterating through each revInfoEntry and associating
the first issuer which matches it can cause churn when many (equivalent)
issuers are in the system and issuers come and go (via CRLSigning usage,
which has been modified in this release as well). Because we'd not
include issuers without CRLSigning usage, we'd cause our verification
helper, isRevInfoIssuerValid, to think the issuer ID is no longer value
(when instead, it just lacks crlSigning bits).
We address this by pulling in all issuers we know of for the
identification. This allows us to keep valid-but-not-for-signing
issuers, and use other representatives of their identity set for
signing/building the CRL (if they are enabled for such usage).
As a side effect, we now no longer place these entries on the default
CRL in the event all issuers in the CRL set are without the usage.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
This is only for the last commit.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Identify issuer on revocation
When we attempt to revoke a leaf certificate, we already parse all of
the issuers within the mount (to x509.Certificate) to ensure we don't
accidentally revoke an issuer via the leaf revocation endpoint. We can
reuse this information to associate the issuer (via issuer/subject
comparison and signature checking) to the revoked cert in its revocation
info. This will help OCSP, avoiding the case where the OCSP handler
needs to associate a certificate to its issuer.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add test to ensure issuers are identified
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow automatic rebuilding of CRLs
When enabled, periodic rebuilding of CRLs will improve PKI mounts in two
way:
1. Reduced load during periods of high (new) revocations, as the CRL
isn't rebuilt after each revocation but instead on a fixed schedule.
2. Ensuring the CRL is never stale as long as the cluster remains up,
by checking for next CRL expiry and regenerating CRLs before that
happens. This may increase cluster load when operators have large
CRLs that they'd prefer to let go stale, rather than regenerating
fresh copies.
In particular, we set a grace period before expiration of CRLs where,
when the periodic function triggers (about once a minute), we check
upcoming CRL expirations and check if we need to rebuild the CRLs.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on periodic rebuilding
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow modification of rollback period for testing
When testing backends that use the periodic func, and specifically,
testing the behavior of that periodic func, waiting for the usual 1m
interval can lead to excessively long test execution. By switching to a
shorter period--strictly for testing--we can make these tests execute
faster.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for auto-rebuilding of CRLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Remove non-updating getConfig variant
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Avoid double reload of config
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor existing CRL function to storage getRevocationConfig
* Introduce ocsp_disable config option in config/crl
* Introduce OCSPSigning usage flag on issuer
* Add ocsp-request passthrough within lower layers of Vault
* Add OCSP responder to Vault PKI
* Add API documentation for OCSP
* Add cl
* Revert PKI storage migration modifications for OCSP
* Smaller PR feedback items
- pki.mdx doc update
- parens around logical.go comment to indicate DER encoded request is
related to OCSP and not the snapshots
- Use AllIssuers instead of writing them all out
- Drop zero initialization of crl config's Disable flag if not present
- Upgrade issuer on the fly instead of an initial migration
* Additional clean up backing out the writeRevocationConfig refactoring
* Remove Dirty issuer flag and update comment about not writing upgrade to
storage
* Address PR feedback and return Unknown response when mismatching issuer
* make fmt
* PR Feedback.
* More PR feedback
- Leverage ocsp response constant
- Remove duplicate errors regarding unknown issuers
* Allow marking issuers as revoked
This allows PKI's issuers to be considered revoked and appear on each
others' CRLs. We disable issuance (via removing the usage) and prohibit
modifying the usage via the regular issuer management interface.
A separate endpoint is necessary because issuers (especially if signed
by a third-party CA using incremental serial numbers) might share a
serial number (e.g., an intermediate under cross-signing might share the
same number as an external root or an unrelated intermediate).
When the next CRL rebuild happens, this issuer will then appear on
others issuers CRLs, if they validate this issuer's certificate.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on revoking issuers
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for issuer revocation semantics
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Notate that CRLs will be rebuilt
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix timestamp field from _utc -> to _rfc3339
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Ensure serial-based accesses shows as revoked
Thanks Kit!
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add warning when revoking default issuer
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor serial creation to common helper
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add BYOC revocation to PKI mount
This allows operators to revoke certificates via a PEM blob passed to
Vault. In particular, Vault verifies the signature on the certificate
from an existing issuer within the mount, ensuring that one indeed
issued this certificate. The certificate is then added to storage and
its serial submitted for revocation.
This allows certificates generated with no_store=true to be submitted
for revocation afterwards, given a full copy of the certificate. As a
consequence, all roles can now safely move to no_store=true (if desired
for performance) and revocation can be done on a case-by-case basis.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add docs on BYOC revocation
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add PEM length check to BYOC import
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for BYOC
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Guard against legacy CA bundle usage
This prevents usage of the BYOC cert on a hybrid 1.10/1.12 cluster with
an non-upgraded CA issuer bundle.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* Add PSS signature support to Vault PKI engine
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use issuer's RevocationSigAlg for CRL signing
We introduce a new parameter on issuers, revocation_signature_algorithm
to control the signature algorithm used during CRL signing. This is
because the SignatureAlgorithm value from the certificate itself is
incorrect for this purpose: a RSA root could sign an ECDSA intermediate
with say, SHA256WithRSA, but when the intermediate goes to sign a CRL,
it must use ECDSAWithSHA256 or equivalent instead of SHA256WithRSA. When
coupled with support for PSS-only keys, allowing the user to set the
signature algorithm value as desired seems like the best approach.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add use_pss, revocation_signature_algorithm docs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add PSS to signature role issuance test matrix
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow roots to self-identify revocation alg
When using PSS support with a managed key, sometimes the underlying
device will not support PKCS#1v1.5 signatures. This results in CRL
building failing, unless we update the entry's signature algorithm
prior to building the CRL for the new root.
With a RSA-type key and use_pss=true, we use the signature bits value to
decide which hash function to use for PSS support.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add clearer error message on failed import
When CRL building fails during cert/key import, due to PSS failures,
give a better indication to the user that import succeeded its just CRL
building that failed. This tells them the parameter to adjust on the
issuer and warns that CRL building will fail until this is fixed.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add case insensitive SigAlgo matching
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Convert UsePSS back to regular bool
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor PSS->certTemplate into helper function
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Proper string output on rev_sig_alg display
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Copy root's SignatureAlgorithm for CRL building
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>