* Clean up unused CRL entries when issuer is removed
When a issuer is removed, the space utilized by its CRL was not freed,
both from the CRL config mapping issuer IDs to CRL IDs and from the
CRL storage entry. We thus implement a two step cleanup, wherein
orphaned CRL IDs are removed from the config and any remaining full
CRL entries are removed from disk.
This relates to a Consul<->Vault interop issue (#22980), wherein Consul
creates a new issuer on every leadership election, causing this config
to grow. Deleting issuers manually does not entirely solve this problem
as the config does not fully reclaim space used in this entry.
Notably, an observation that when deleting issuers, the CRL was rebuilt
on secondary clusters (due to the invalidation not caring about type of
the operation); for consistency and to clean up the unified CRLs, we
also need to run the rebuild on the active primary cluster that deleted
the issuer as well.
This approach does allow cleanup on existing impacted clusters by simply
rebuilding the CRL.
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add test case on CRL removal
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Steven Clark <steven.clark@hashicorp.com>
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Updating the license from MPL to Business Source License.
Going forward, this project will be licensed under the Business Source License v1.1. Please see our blog post for more details at https://hashi.co/bsl-blog, FAQ at www.hashicorp.com/licensing-faq, and details of the license at www.hashicorp.com/bsl.
* add missing license headers
* Update copyright file headers to BUS-1.1
* Fix test that expected exact offset on hcl file
---------
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
Co-authored-by: Sarah Thompson <sthompson@hashicorp.com>
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
* Refactor CRL writing config to passthrough cache
When reading the CRL config via API endpoint, always read through to the
disk, updating the cache in the process. Similarly, when writing to the
CRL config, read first from disk (updating the cache), and on write,
also write back through the cache, providing consistency without the
need to invalidate through markConfigDirty(...).
This will form the basis of the new pattern for config caching.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor ACME writing config to passthrough cache
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix various EAB related issues
- List API wasn't plumbed through properly so it did not work as expected
- Use random 32 bytes instead of an EC key for EAB key values
- Update OpenAPI definitions
* Clean up unused EAB keys within tidy
* Move Vault EAB creation path to pki/acme/new-eab
* Update eab vault responses to match up with docs
* Allow creating storageContext with timeout
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add challenge validation engine to ACME
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Initialize the ACME challenge validation engine
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Trigger challenge validation on endpoint submission
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix GetKeyThumbprint to use raw base64
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Point at localhost for testing
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add cleanup of validation engine
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
---------
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Minor follow-ups to #16865
Fix PKI issuer upgrade logic when upgrading to 1.12 or later, to
actually turn off the issuer crl-signing usage when it intended to.
Fix minor typo in docs.
* changelog
When reading the config, we attempt to detect if the running Vault
instance has been changed from its Enterprise status on write.
Similarly, we should detect if the mount is a local mount instead. While
this isn't changeable at runtime, using sys/raw to side-load an invalid
config could be possible.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Telemetry Metrics Configuration.
* Err Shadowing Fix (woah, semgrep is cool).
* Fix TestBackend_RevokePlusTidy_Intermediate
* Add Changelog.
* Fix memory leak. Code cleanup as suggested by Steve.
* Turn off metrics by default, breaking-change.
* Show on tidy-status before start-up.
* Fix tests
* make fmt
* Add emit metrics to periodicFunc
* Test not delivering unavailable metrics + fix.
* Better error message.
* Fixing the false-error bug.
* make fmt.
* Try to fix race issue, remove confusing comments.
* Switch metric counter variables to an atomic.Uint32
- Switch the metric counter variables to an atomic variable type
so that we are forced to properly load/store values to it
* Fix race-issue better by trying until the metric is sunk.
* make fmt.
* empty commit to retrigger non-race tests that all pass locally
---------
Co-authored-by: Steve Clark <steven.clark@hashicorp.com>
* Unified revocation migration code
- Add a periodic function that will list the local revocations
and if any are missing from the unified revocation area will
force a write to the unified revocation folder/remote instance.
* PR Feedback
- Do not transfer expired certificates to unified space from local
- Move new periodic code into a periodic.go file
- Add a flag so we only run this stuff once if all is good, with
a force flag if we encounter errors or if unified_crl is toggled
on
* PR feedback take 2
- I missed this in the original review, that we were storing the
unified-crl in a cluster-local storage area so none of the other
hosts would receive it.
- Discovered while writing unit tests, the main cluster had the unified
crl but the other clusters would return an empty response
* Add unified CRL config storage helpers
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add support to build unified CRLs
This allows us to build unified versions of both the complete and delta
CRLs. This mostly involved creating a new variant of the
unified-specific CRL builder, fetching certs from each cluster's storage
space.
Unlike OCSP, here we do not unify the node's local storage with the
cross-cluster storage: this node is the active of the performance
primary, so writes to unified storage happen exactly the same as
writes to cluster-local storage, meaning the two are always in
sync. Other performance secondaries do not rebuild the CRL, and hence
the out-of-sync avoidance that we'd like to solve with the OCSP
responder is not necessary to solve here.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add ability to fetch unified CRLs
This adds to the path-fetch APIs the ability to return the unified CRLs.
We update the If-Modified-Since infrastructure to support querying the
unified CRL specific data and fetchCertBySerial to support all unified
variants. This works for both the default/global fetch APIs and the
issuer-specific fetch APIs.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rebuild CRLs on unified status changes
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Handle rebuilding CRLs due to either changing
This allows detecting if the Delta CRL needs to be rebuilt because
either the local or the unified CRL needs to be rebuilt. We never
trigger rebuilding the unified delta on a non-primary cluster.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Ensure serials aren't added to unified CRL twice
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add global, cross-cluster revocation queue to PKI
This adds a global, cross-cluster replicated revocation queue, allowing
operators to revoke certificates by serial number across any cluster. We
don't support revoking with private key (PoP) in the initial
implementation.
In particular, building on the PBPWF work, we add a special storage
location for handling non-local revocations which gets replicated up to
the active, primary cluster node and back down to all secondary PR
clusters. These then check the pending revocation entry and revoke the
serial locally if it exists, writing a cross-cluster confirmation entry.
Listing capabilities are present under pki/certs/revocation-queue,
allowing operators to see which certs are present. However, a future
improvement to the tidy subsystem will allow automatic cleanup of stale
entries.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow tidying revocation queue entries
No manual operator control of revocation queue entries are allowed.
However, entries are stored with their request time, allowing tidy to,
after a suitable safety buffer, remove these unconfirmed and presumably
invalid requests.
Notably, when a cluster goes offline, it will be unable to process
cross-cluster revocations for certificates it holds. If tidy runs,
potentially valid revocations may be removed. However, it is up to the
administrator to ensure the tidy window is sufficiently long that any
required maintenance is done (or, prior to maintenance when an issue is
first noticed, tidy is temporarily disabled).
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Only allow enabling global revocation queue on Vault Enterprise
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use a locking queue to handle revocation requests
This queue attempts to guarantee that PKI's invalidateFunc won't have
to wait long to execute: by locking only around access to the queue
proper, and internally using a list, we minimize the time spent locked,
waiting for queue accesses.
Previously, we held a lock during tidy and processing that would've
prevented us from processing invalidateFunc calls.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* use_global_queue->cross_cluster_revocation
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Grab revocation storage lock when processing queue
We need to grab the storage lock as we'll actively be revoking new
certificates in the revocation queue. This ensures nobody else is
competing for storage access, across periodic funcs, new revocations,
and tidy operations.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix expected tidy status test
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow probing RollbackManager directly in tests
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Address review feedback on revocationQueue
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add more cancel checks, fix starting manual tidy
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor CRL building into separate functions
This will allow us to add the ability to add and build a unified CRL
across all clusters, reusing logic that is common to both, but letting
each have their own certificate lists.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rename localCRLConfigEntry->internalCRLConfigEntry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rename Delta WALs to Local Delta WALs
This adds clarity that we'll have a separate local and remote Delta CRL
and WALs for each.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow tidy to backup legacy CA bundles
With the new tidy_move_legacy_ca_bundle option, we'll use tidy to move
the legacy CA bundle from /config/ca_bundle to /config/ca_bundle.bak.
This does two things:
1. Removes ca_bundle from the hot-path of initialization after initial
migration has completed. Because this entry is seal wrapped, this
may result in performance improvements.
2. Allows recovery of this value in the event of some other failure
with migration.
Notably, this cannot occur during migration in the unlikely (and largely
unsupported) case that the operator immediately downgrades to Vault
<1.11.x. Thus, we reuse issuer_safety_buffer; while potentially long,
tidy can always be run manually with a shorter buffer (and only this
flag) to manually move the bundle if necessary.
In the event of needing to recover or undo this operation, it is
sufficient to use sys/raw to read the backed up value and subsequently
write it to its old path (/config/ca_bundle).
The new entry remains seal wrapped, but otherwise isn't used within the
code and so has better performance characteristics.
Performing a fat deletion (DELETE /root) will again remove the backup
like the old legacy bundle, preserving its wipe characteristics.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation about new tidy parameter
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for migration scenarios
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Clean up time comparisons
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add cluster_aia_path templating variable
Per discussion with maxb, allow using a non-Vault distribution point
which may use an insecure transport for RFC 5280 compliance.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Address feedback from Max
Co-authored-by: Max Bowsher <maxbowsher@gmail.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Max Bowsher <maxbowsher@gmail.com>
* Allow templating of cluster-local AIA URIs
This adds a new configuration path, /config/cluster, which retains
cluster-local configuration. By extending /config/urls and its issuer
counterpart to include an enable_templating parameter, we can allow
operators to correctly identify the particular cluster a cert was
issued on, and tie its AIA information to this (cluster, issuer) pair
dynamically.
Notably, this does not solve all usage issues around AIA URIs: the CRL
and OCSP responder remain local, meaning that some merge capability is
required prior to passing it to other systems if they use CRL files and
must validate requests with certs from any arbitrary PR cluster.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation about templated AIAs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* AIA URIs -> AIA URLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* issuer.AIAURIs might be nil
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow non-nil response to config/urls
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Always validate URLs on config update
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Ensure URLs lack templating parameters
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Review feedback
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add automatic tidy of expired issuers
To aid PKI users like Consul, which periodically rotate intermediates,
and provided a little more consistency with older versions of Vault
which would silently (and dangerously!) replace the configured CA on
root/intermediate generation, we introduce an automatic tidy of expired
issuers.
This includes a longer safety buffer (1 year) and logging of the
relevant issuer information prior to deletion (certificate contents, key
ID, and issuer ID/name) to allow admins to recover this value if
desired, or perform further cleanup of keys.
From my PoV, removal of the issuer is thus a relatively safe operation
compared to keys (which I do not feel comfortable removing) as they can
always be re-imported if desired. Additionally, this is an opt-in tidy
operation, not enabled by default. Lastly, most major performance
penalties comes with lots of issuers within the mount, not as much
large numbers of keys (as only new issuer creation/import operations are
affected, unlike LIST /issuers which is a public, unauthenticated
endpoint).
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add test for tidy
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add docs on tidy of issuers
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Restructure logging
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add missing fields to expected tidy output
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Also remove one duplicate error masked by return.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly preserve other issuer config params
When setting a new default issuer, our helper function would overwrite
other parameters in the issuer configuration entry. However, up until
now, there were none.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add new parameter to allow default to follow new
This parameter will allow operators to have the default issuer
automatically update when a new root is generated or a single issuer
with a key (potentially with others lacking key) is imported.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Storage migration tests fail on new members
These internal members shouldn't be tested by the storage migration
code, and so should be elided from the test results.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Follow new issuer on root generation, import
This updates the two places where issuers can be created (outside of
legacy CA bundle migration which already sets the default) to follow
newly created issuers when the config is set.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add test for new default-following behavior
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add new API to PKI to list revoked certificates
- A new API that will return the list of serial numbers of
revoked certificates on the local cluster.
* Add cl
* PR feedback
* Add regression test for default CRL expiry
Also fixes a bug w.r.t. upgrading older entries and missing the Delta
Rebuild Interval field, setting it to the default.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog for earlier PR
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix interoperability concerns with PSS
When Go parses a certificate with rsaPSS OID, it will accept this
certificate but not parse the SubjectPublicKeyInfo, leaving the
PublicKeyAlgorithm and PublicKey fields blank, but otherwise not erring.
The same behavior occurs with rsaPSS OID CSRs.
On the other hand, when Go parses rsaPSS OID PKCS8 private keys, these
keys will fail to parse completely.
Thus, detect and fail on any empty PublicKey certs and CSRs, warning the
user that we cannot parse these correctly and thus refuse to operate.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Run more PKI tests in parallel
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add notes about PSS shortcomings to considerations
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add ability to perform automatic tidy operations
This enables the PKI secrets engine to allow tidy to be started
periodically by the engine itself, avoiding the need for interaction.
This operation is disabled by default (to avoid load on clusters which
don't need tidy to be run) but can be enabled.
In particular, a default tidy configuration is written (via
/config/auto-tidy) which mirrors the options passed to /tidy. Two
additional parameters, enabled and interval, are accepted, allowing
auto-tidy to be enabled or disabled and controlling the interval
(between successful tidy runs) to attempt auto-tidy.
Notably, a manual execution of tidy will delay additional auto-tidy
operations. Status is reported via the existing /tidy-status endpoint.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on auto-tidy
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for auto-tidy
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Prevent race during parallel testing
We modified the RollbackManager's execution window to allow more
faithful testing of the periodicFunc. However, the TestAutoRebuild and
the new TestAutoTidy would then race against each other for modifying
the period and creating their clusters (before resetting to the old
value).
This changeset adds a lock around this, preventing the races.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use tidyStatusLock to gate lastTidy time
This prevents a data race between the periodic func and the execution of
the running tidy.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add read lock around tidyStatus gauges
When reading from tidyStatus for computing gauges, since the underlying
values aren't atomics, we really should be gating these with a read lock
around the status access.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Issuer renames should invalidate CRL cache times
When an issuer is renamed (or rather, two issuers' names are swapped in
quick succession), this is akin to the earlier identified default issuer
update condition. So, when any issuer is updated, go ahead and trigger
the invalidation logic.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix handling of delta CRL If-Modified-Since
The If-Modified-Since PR was proposed prior to the Delta CRL changes and
thus didn't take it into account. This follow-up commit fixes that,
addressing If-Modified-Since semantics for delta CRL fetching and
ensuring an accurate number is stored.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* honor header if-modified-since if present
* pathGetIssuerCRL first version
* check if modified since for CA endpoints
* fix date comparison for CA endpoints
* suggested changes and refactoring
* add writeIssuer to updateDefaultIssuerId and fix error
* Move methods out of storage.go into util.go
For the most part, these take a SC as param, but aren't directly storage
relevant operations. Move them out of storage.go as a result.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use UTC timezone for storage
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rework path_fetch for better if-modified-since handling
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Invalidate all issuers, CRLs on default write
When the default is updated, access under earlier timestamps will not
work as we're unclear if the timestamp is for this issuer or a previous
issuer. Thus, we need to invalidate the CRL and both issuers involved
(previous, next) by updating their LastModifiedTimes.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for If-Modified-Since
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly invalidate default issuer changes
When the default issuer changes, we'll have to mark the invalidation on
PR secondary clusters, so they know to update their CRL mapping as well.
The swapped issuers will have an updated modification time (which will
eventually replicate down and thus be correct), but the CRL modification
time is cluster-local information and thus won't be replicated.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* make fmt
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor sendNotModifiedResponseIfNecessary
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on if-modified-since
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow generation of up-to-date delta CRLs
While switching to periodic rebuilds of CRLs alleviates the constant
rebuild pressure on Vault during times of high revocation, the CRL
proper becomes stale. One response to this is to switch to OCSP, but not
every system has support for this. Additionally, OCSP usually requires
connectivity and isn't used to augment a pre-distributed CRL (and is
instead used independently).
By generating delta CRLs containing only new revocations, an existing
CRL can be supplemented with newer revocations without requiring Vault
to rebuild all complete CRLs. Admins can periodically fetch the delta
CRL and add it to the existing CRL and applications should be able to
support using serials from both.
Because delta CRLs are emptied when the next complete CRL is rebuilt, it
is important that applications fetch the delta CRL and correlate it to
their complete CRL; if their complete CRL is older than the delta CRL's
extension number, applications MUST fetch the newer complete CRL to
ensure they have a correct combination.
This modifies the revocation process and adds several new configuration
options, controlling whether Delta CRLs are enabled and when we'll
rebuild it.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for delta CRLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on delta CRLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Address review feedback: fix several bugs
Thanks Steve!
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly invoke periodic func on active nodes
We need to ensure we read the updated config (in case of OCSP request
handling on standby nodes), but otherwise want to avoid CRL/DeltaCRL
re-building.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add ocsp_expiry configuration field to PKI crl config
- Add a new configurable duration field to the crl configuration to
allow operator control of how long an OCSP response can be cached
for.
- This is useful for how long a server like NGINX/Apache is
allowed to cache the response for OCSP stapling.
- A value of 0 means no one should cache the response.
- Address an issue discovered that we did not upgrade existing crl
configurations properly
* PR feedback
* Allow correct importing of certs without CRL KU
When Vault imports certificates without KU for CRLSign, we shouldn't
provision CRLUsage on the backing issuer; otherwise, we'll attempt to
build CRLs and Go will cause us to err out. This change makes it clear
(at issuer configuration time) that we can't possibly support this
operation and hopefully prevent users from running into the more cryptic
Go error.
Note that this does not apply for OCSP EKU: the EKU exists, per RFC 6960
Section 2.6 OCSP Signature Authority Delegation, to allow delegation of
OCSP signing to a child certificate. This EKU is not necessary on the
issuer itself, and generally assumes issuers are allowed to issue OCSP
responses regardless of KU/EKU.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add docs to clarify issue with import, CRL usage
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Update website/content/api-docs/secret/pki.mdx
* Add additional test assertion
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Previously we used the global backend-set crlLifetime as a default
value. However, this was refactored into a new defaultCrlConfig instead,
which we should reply with when the CRL configuration has not been set
yet. In particular, the 72h default expiry (and new 12h auto-rebuild
grace period) was added and made explicit.
This fixes the broken UI test.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow automatic rebuilding of CRLs
When enabled, periodic rebuilding of CRLs will improve PKI mounts in two
way:
1. Reduced load during periods of high (new) revocations, as the CRL
isn't rebuilt after each revocation but instead on a fixed schedule.
2. Ensuring the CRL is never stale as long as the cluster remains up,
by checking for next CRL expiry and regenerating CRLs before that
happens. This may increase cluster load when operators have large
CRLs that they'd prefer to let go stale, rather than regenerating
fresh copies.
In particular, we set a grace period before expiration of CRLs where,
when the periodic function triggers (about once a minute), we check
upcoming CRL expirations and check if we need to rebuild the CRLs.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on periodic rebuilding
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow modification of rollback period for testing
When testing backends that use the periodic func, and specifically,
testing the behavior of that periodic func, waiting for the usual 1m
interval can lead to excessively long test execution. By switching to a
shorter period--strictly for testing--we can make these tests execute
faster.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for auto-rebuilding of CRLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Remove non-updating getConfig variant
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Avoid double reload of config
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor existing CRL function to storage getRevocationConfig
* Introduce ocsp_disable config option in config/crl
* Introduce OCSPSigning usage flag on issuer
* Add ocsp-request passthrough within lower layers of Vault
* Add OCSP responder to Vault PKI
* Add API documentation for OCSP
* Add cl
* Revert PKI storage migration modifications for OCSP
* Smaller PR feedback items
- pki.mdx doc update
- parens around logical.go comment to indicate DER encoded request is
related to OCSP and not the snapshots
- Use AllIssuers instead of writing them all out
- Drop zero initialization of crl config's Disable flag if not present
- Upgrade issuer on the fly instead of an initial migration
* Additional clean up backing out the writeRevocationConfig refactoring
* Remove Dirty issuer flag and update comment about not writing upgrade to
storage
* Address PR feedback and return Unknown response when mismatching issuer
* make fmt
* PR Feedback.
* More PR feedback
- Leverage ocsp response constant
- Remove duplicate errors regarding unknown issuers
* Migrate existing PKI mounts that only contains a key
- We missed testing a use-case of the migration that someone has a PKI
mount point that generated a CSR but never called set-signed back on
that mount point so it only contains a key.
* Add cl
* Add per-issuer AIA URI information
Per discussion on GitHub with @maxb, this allows issuers to have their
own copy of AIA URIs. Because each issuer has its own URLs (for CA and
CRL access), its necessary to mint their issued certs pointing to the
correct issuer and not to the global default issuer. For anyone using
multiple issuers within a mount, this change allows the issuer to point
back to itself via leaf's AIA info.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on per-issuer AIA info
Also add it to the considerations page as something to watch out for.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for per-issuer AIA information
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor AIA setting on the issuer
This introduces a common helper per Steve's suggestion.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Clarify error messages w.r.t. AIA naming
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Clarify error messages regarding AIA URLs
This clarifies which request parameter the invalid URL is contained
in, disambiguating the sometimes ambiguous usage of AIA, per suggestion
by Max.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Rename getURLs -> getGlobalAIAURLs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correct AIA acronym expansion word orders
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix bad comment suggesting re-generating roots
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add two entries to URL tests
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow marking issuers as revoked
This allows PKI's issuers to be considered revoked and appear on each
others' CRLs. We disable issuance (via removing the usage) and prohibit
modifying the usage via the regular issuer management interface.
A separate endpoint is necessary because issuers (especially if signed
by a third-party CA using incremental serial numbers) might share a
serial number (e.g., an intermediate under cross-signing might share the
same number as an external root or an unrelated intermediate).
When the next CRL rebuild happens, this issuer will then appear on
others issuers CRLs, if they validate this issuer's certificate.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add documentation on revoking issuers
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for issuer revocation semantics
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Notate that CRLs will be rebuilt
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Fix timestamp field from _utc -> to _rfc3339
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Ensure serial-based accesses shows as revoked
Thanks Kit!
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add warning when revoking default issuer
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor serial creation to common helper
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add BYOC revocation to PKI mount
This allows operators to revoke certificates via a PEM blob passed to
Vault. In particular, Vault verifies the signature on the certificate
from an existing issuer within the mount, ensuring that one indeed
issued this certificate. The certificate is then added to storage and
its serial submitted for revocation.
This allows certificates generated with no_store=true to be submitted
for revocation afterwards, given a full copy of the certificate. As a
consequence, all roles can now safely move to no_store=true (if desired
for performance) and revocation can be done on a case-by-case basis.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add docs on BYOC revocation
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add PEM length check to BYOC import
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog entry
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add tests for BYOC
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Guard against legacy CA bundle usage
This prevents usage of the BYOC cert on a hybrid 1.10/1.12 cluster with
an non-upgraded CA issuer bundle.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add PSS signature support to Vault PKI engine
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Use issuer's RevocationSigAlg for CRL signing
We introduce a new parameter on issuers, revocation_signature_algorithm
to control the signature algorithm used during CRL signing. This is
because the SignatureAlgorithm value from the certificate itself is
incorrect for this purpose: a RSA root could sign an ECDSA intermediate
with say, SHA256WithRSA, but when the intermediate goes to sign a CRL,
it must use ECDSAWithSHA256 or equivalent instead of SHA256WithRSA. When
coupled with support for PSS-only keys, allowing the user to set the
signature algorithm value as desired seems like the best approach.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add use_pss, revocation_signature_algorithm docs
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add PSS to signature role issuance test matrix
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add changelog
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow roots to self-identify revocation alg
When using PSS support with a managed key, sometimes the underlying
device will not support PKCS#1v1.5 signatures. This results in CRL
building failing, unless we update the entry's signature algorithm
prior to building the CRL for the new root.
With a RSA-type key and use_pss=true, we use the signature bits value to
decide which hash function to use for PSS support.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add clearer error message on failed import
When CRL building fails during cert/key import, due to PSS failures,
give a better indication to the user that import succeeded its just CRL
building that failed. This tells them the parameter to adjust on the
issuer and warns that CRL building will fail until this is fixed.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add case insensitive SigAlgo matching
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Convert UsePSS back to regular bool
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor PSS->certTemplate into helper function
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Proper string output on rev_sig_alg display
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Copy root's SignatureAlgorithm for CRL building
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
structs and mapstructure aren't really used within Vault much any more,
so we should start removing them. Luckily there was only one externally
accessible place where structs was used (AIA URLs config) so that was
easy to remove. The rest is mostly structure tag changes.
path_roles_tests.go relied on mapstructure in some places that broke,
but otherwise backend_test.go hasn't yet been modified to remove the
dependency on mapstructure. These didn't break as the underlying
CertBundle didn't get mapstructure support removed (as its in the SDK).
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
This will allow us to refactor the storage functions to take additional
parameters (or backend-inferred values) in the future. In particular, as
we look towards adding a storage cache layer, we'll need to add this to
the backend, which is now accessible from all storage functions.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
- Do not set the first issuer we attempt to import as the default issuer unless
it has a corresponding key.
- Add the ability to set a default issuer if none exist and we import it's corresponding key after the fact.
- Add a warning to an end-user if we imported multiple issuers with keys and we
choose one of them as the default value.
* Add a warning when Issuing Certificate set on a role does not resolve.
* Ivanka's requests - add a warning on deleting issuer or changing it's name.
* reduce number of roles to iterate through; only verify roles after migration. ignore roles deleted behind our back.
* Protect against key and issuer name re-use
- While importing keys and issuers verify that the provided name if any has not been used by another key that we did not match against.
- Validate an assumption within the key import api, that we were provided a single key
- Add additional tests on the new key generation and key import handlers.
* Protect key import api end-users from using "default" as a name
- Do not allow end-users to provide the value of default as a name for key imports
as that would lead to weird and wonderful behaviors to the end-user.
* Add missing api-docs for PKI key import
* Address issues with revoke operations pre-migration of PKI issuers
- Leverage the legacyBundleShimID though out the path of CRL building
when legacy storage mode is active.
- Instead of having multiple locations without a lock checking for the
useLegacyBundleCaStorage flag is set, check it once and then use the
same issuerId everywhere
- Address some locking issues that might lead to a bad read/write when
switching from legacy to non-legacy mode on startup and post-migration
* Add test suite for PKI apis pre-migration to new issuer storage format
- Add tests that validate all apis work as expected in pre-migration mode
- Add tests for apis that we don't expect to work, they should return a
migration related error message
- Add some missing validations on various new apis.
- Leverage a Get lookup operation to see if our reference field is a UUID
instead of listing all key/issuers and iterating over the list.
- This should be faster and we get a cached lookup possibly if it was a
UUID entry that we previously loaded.
- Address some small feedback about migration wording as well.