25 Commits

Author SHA1 Message Date
Vault Automation
0c6c13dd38
license: update headers to IBM Corp. (#10229) (#10233)
* license: update headers to IBM Corp.
* `make proto`
* update offset because source file changed

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2025-10-21 15:20:20 -06:00
Ryan Cragun
7af25674b4
VAULT-38884, VAULT-38885: enos(rhel): bump 9.5 to 9.6 and add 10.0 (#31500)
Bump RHEL to 9.6 and remove a test that requires a fixture that was
never merged.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-08-15 10:33:55 -06:00
Ryan Cragun
b76a28a1e0
[VAULT-38883] enos: remove Ubuntu 20.04 from the test matrix (#31482)
Ubunut 20.04 is EOL. Per our support and package policies we no longer
need to develop or test for that platform.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-08-12 15:51:30 -06:00
Luis (LT) Carbonell
4036485739
(enos) Add KMIP Enos Test Suite (#31378)
* (enos) Add KMIP Enos Test Suite

* skip KMIP for CE runs

* reads...

* cleanup variables

* fix
2025-07-29 14:13:28 -04:00
Tin Vo
857e66b3e2
VAULT-35602: Adding Enos OpenLDAP test (#30801)
* VAULT-35602: adding Enos LDAP Tests

* adding godaddy tests

* updating external integration target module name
2025-07-23 13:11:12 -07:00
Ryan Cragun
4889381b07
enos(consul): only use versions that are CE and ENT (#31057)
Right now our logic for consul doesn't consider whether or not the
version is available for ent or ce. Make sure that the versions we used
are available for both.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-06-20 17:21:07 +00:00
Josh Black
5e90024b26
Add Enos benchmark scenario (#30675)
* Add Enos benchmark scenario

* add docs on how to run the scenario

* update description again

* see if this works better if we return an empty map

* hopefully disabling telemetry doesn't crash everything now

* yet another try at making telemetry configurable

* swap consul nodes over to be the same as the vault ones

* adjust up IOPs and add a note about it to the docs

* fix missing variables in the ec2 shim

* randomly pick an az for k6 and metrics instances

* enos(benchmark): futher modularize and make target infra cloud agnostic

The initial goal of this was to resolve an issue where sometimes the
one-or-more target instances would attempt to be provisioned in an
avaliability zone that doesn't support it. The target_ec2_instances
module already supports assigning based on instance offerings so I
wanted to use it for all instances. It also has a side effect of
provisioning instances in parallel to speed up overall scenario time.

I ended up futher modularizing the `benchmark` module into several
sub-modules that perform a single task well, and rely on provisioning in
the root module. This will allow us to utilize the module in other
clouds more easily should we desire to do that in the future.

Signed-off-by: Ryan Cragun <me@ryan.ec>

* add copywrite headers

Signed-off-by: Ryan Cragun <me@ryan.ec>

* address some feedback and limit disk iops to 16k by default

Signed-off-by: Ryan Cragun <me@ryan.ec>

---------

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2025-06-20 09:40:21 -07:00
Luis (LT) Carbonell
ed52371b10
Upgrade FIPS 1402 -> 1403 (#30576)
* Upgrade FIPS 1402 -> 1403

* Clean up

* changelog
2025-05-12 15:01:30 -05:00
Ryan Cragun
f61bd3230c
enos(artifactory): unify dev and test scenario artifactory metadata into new module (#29891)
* enos(artifactory): unify dev and test scenario artifactory metadata into new module

There was previously a lot of shared logic between
`build_artifactory_artifact` and `build_artifactory_package` as it
regards to building an artifact name. When it comes down to it, both
modules are very similar and their only major difference is searching
for any artifact (released or not) by either a combination of
`revision`, `edition`, `version`, and `type` vs. searching for a
released artifact with a combination of `version`, `edition`, and
`type`.

Rather than bolt on new `s390x` and `fips1403` artifact metadata to
both, I factored their metadata for package names and such into a
unified and shared `artifact/metadata` module that is now called by
both.

This was tricky as dev and test scenarios currently differ in what
we pass in as the `vault_version`, but we hope to remove that
difference soon. We also add metadata support for the forthcoming
FIPS 140-3.

This commit was tested extensively, along with other test scenarios
in support for `s390x but will be useful immediately for FIPS 140-3
so I've extracted it out.

Signed-off-by: Ryan Cragun <me@ryan.ec>

* Fix artifactory metadata before merge

The initial pass of the artifactory metadata was largely untested and
extracted from a different branch. After testing, this commit fixes a
few issues with the metadata module.

In order to test this I also had to fix an issue where AWS secrets
engine testing became a requirement but is impossible unless you exectue
against a blessed AWS account that has required roles. Instead, we now
make those verification opt-in via a new variable.

We also make some improvements to the pki-verify-certificates script so
that it works reliably against all our supported distros.

We also update our dynamic configuration to use the updated versions in
samples.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-04-25 14:55:26 -06:00
Guy J Grigsby
08c5a52b02
disable_mlock must now be explicitly included in config (#29974)
* require explicit value for disable_mlock

* set disable_mlock back to true for all docker tests

* fix build error

* update test config files

* change explicit mlock check to apply to integrated storage only.

* formatting and typo fixes

* added test for raft

* remove erroneous test

* remove unecessary doc line

* remove unecessary var

* pr suggestions

* test compile fix

* add mlock config value to enos tests

* enos lint

* update enos tests to pass disable_mlock value

* move mlock error to runtime to check for env var

* fixed mlock config detection logic

* call out mlock on/off tradeoffs to docs

* rewording production hardening section on mlock for clarity

* update error message when missing disable_mlock value to help customers with the previous default

* fix config doc error and update production-hardening doc to align with existing recommendations.

* remove extra check for mlock config value

* fix docker recovery test

* Update changelog/29974.txt

Explicitly call out that Vault will not start without disable_mlock included in the config.

Co-authored-by: Kuba Wieczorek <kuba.wieczorek@hashicorp.com>

* more docker test experimentation.

* passing disable_mlock into test cluster

* add VAULT_DISABLE_MLOCK envvar to docker tests and pass through the value

* add missing envvar for docker env test

* upate additional docker test disable_mlock values

* Apply suggestions from code review

Use active voice.

Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>

---------

Co-authored-by: Kuba Wieczorek <kuba.wieczorek@hashicorp.com>
Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>
2025-04-17 15:35:40 +02:00
Ryan Cragun
ce5885279b
VAULT-31181: Add pipeline tool to Vault (#28536)
As the Vault pipeline and release processes evolve over time, so too must the tooling that drives them. Historically we've utilized a combination of CI features and shell scripts that are wrapped into make targets to drive our CI. While this 
approach has worked, it requires careful consideration of what features to use (bash in CI almost never matches bash in developer machines, etc.) and often requires a deep understanding of several CLI tools (jq, etc). `make` itself also has limitations in user experience, e.g. passing flags.

As we're all in on Github Actions as our pipeline coordinator, continuing to utilize and build CLI tools to perform our pipeline tasks makes sense. This PR adds a new CLI tool called `pipeline` which we can use to build new isolated tasks that we can string together in Github Actions. We intend to use this utility as the interface for future release automation work, see VAULT-27514.

For the first task in this new `pipeline` tool, I've chosen to build two small sub-commands:

* `pipeline releases list-versions` - Allows us to list Vault versions between a range. The range is configurable either by setting `--upper` and/or `--lower` bounds, or by using the `--nminus` to set the N-X to go back from the current branches version. As CE and ENT do not have version parity we also consider the `--edition`, as well as none-to-many `--skip` flags to exclude specific versions.

* `pipeline generate enos-dynamic-config` - Which creates dynamic enos configuration based on the branch and the current list of release versions. It takes largely the same flags as the `release list-versions` command, however it also expects a `--dir` for the enos directory and a `--file` where the dynamic configuration will be written. This allows us to dynamically update and feed the latest versions into our sampling algorithm to get coverage over all supported prior versions.

We then integrate these new tools into the pipeline itself and cache the dynamic config on a weekly basis. We also cache the pipeline tool itself as it will likely become a repository for pipeline specific tooling. The caching strategy for the `pipeline` tool itself will make most workflows that require it super fast.


Signed-off-by: Ryan Cragun <me@ryan.ec>
2024-10-23 15:31:24 -06:00
Ryan Cragun
74b6cc799a
VAULT-29583: Modernize default distributions in enos scenarios (#28012)
* VAULT-29583: Modernize default distributions in enos scenarios

Our scenarios have been running the last gen of distributions in CI.
This updates our default distributions as follows:
  - Amazon: 2023
  - Leap:   15.6
  - RHEL:   8.10, 9.4
  - SLES:   15.6
  - Ubuntu: 20.04, 24.04

With these changes we also unlock a few new variants combinations:
  - `distro:amzn seal:pkcs11`
  - `arch:arm64 distro:leap`

We also normalize our distro key for Amazon Linux to `amzn`, which
matches the uname output on both versions that we've supported.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2024-08-09 13:43:28 -06:00
Ryan Cragun
174da88b9d
VAULT-28146: Add IPV6 support to enos scenarios (#27884)
* VAULT-28146: Add IPV6 support to enos scenarios

Add support for testing all raft storage scenarios and variants when
running Vault with IPV6 networking. We retain our previous support for
IPV4 and create a new variant `ip_version` which can be used to
configure the IP version that we wish to test with.

It's important to note that the VPC in IPV6 mode is technically mixed
and that target machines still associate public IPV6 addresses. That
allows us to execute our resources against them from IPV4 networks like
developer machines and CI runners. Despite that, we've taken care to
ensure that only IPV6 addresses are used in IPV6 mode.

Because we previously had assumed the IP Version, Vault address, and
listener ports in so many places, this PR is essentially a rewrite and
removal of those assumptions. There are also a few places where
improvements to scenarios have been included as I encountered them while
working on the IPV6 changes.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2024-07-30 11:00:27 -06:00
Ryan Cragun
456f180fb9
enos: use the correct install during upgrade (#27496)
Fix the upgrade scenario by utilizing the correct Vault install location
for the initial install and by using different initial upgrade versions.
Before we had upgrade versions that were only available for Enterprise
which would fail on CE.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2024-06-13 22:21:02 +00:00
Ryan Cragun
84935e4416
[QT-697] enos: add descriptions and quality verification (#27311)
In order to take advantage of enos' ability to outline scenarios and to
inventory what verification they perform we needed to retrofit all of
that information to our existing scenarios and steps.

This change introduces an initial set of descriptions and verification
declarations that we can continue to refine over time.

As doing this required that I re-read every scenanario in its entirety I
also updated and fixed a few things along the way that I noticed,
including adding a few small features to enos that we utilize to make
handling initial versions programtic between versions instead of having a
delta between our globals in each branch.

* Update autopilot and in-place upgrade initial versions
* Programatically determine which initial versions to use based on Vault
  version
* Partially normalize steps between scenarios to make comparisons easier
* Update the MOTD to explain that VAULT_ADDR and VAULT_TOKEN have been
  set
* Add scenario and step descriptions to scenarios
* Add initial scenario quality verification declarations to scenarios
* Unpin Terraform in scenarios as >= 1.8.4 should work fine
2024-06-13 11:16:33 -06:00
Rebecca Willett
1f0639a79c
Remove Leap 15.4 from testing matrices and AMI data sources; remove vestiges of Ubuntu 18.04 testing (#27416) 2024-06-10 11:44:32 -04:00
Rebecca Willett
c28739512a
Add Amazon Linux, openSUSE Leap, and SUSE SLES support to Enos scenarios and modules (#25983)
Add Consul edition support to Enos scenarios and modules
Add Linux distros and Consul edition to Enos samples
Bump RHEL versions to 9.3 and 8.9
2024-06-05 12:58:35 -04:00
Ryan Cragun
dced3ad3f0
[VAULT-26888] Create developer scenarios (#27028)
* [VAULT-26888] Create developer scenarios

Create developer scenarios that have simplified inputs designed for
provisioning clusters and limited verification.

* Migrate Artifactory installation module from support team focused
  scenarios to the vault repository.
* Migrate support focused scenarios to the repo and update them to use
  the latest in-repo modules.
* Fully document and comment scenarios to help users outline, configure,
  and use the scenarios.
* Remove outdated references to the private registry that is not needed.
* Automatically configure the login shell profile to include the path to
  the vault binary and the VAULT_ADDR/VAULT_TOKEN environment variables.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2024-05-15 12:10:27 -06:00
Ryan Cragun
27ab988205
[QT-695] Add config_mode variant to some scenarios (#26380)
Add `config_mode` variant to some scenarios so we can dynamically change
how we primarily configure the Vault cluster, either by a configuration
file or with environment variables.

As part of this change we also:
* Start consuming the Enos terraform provider from public Terraform
  registry.
* Remove the old `seal_ha_beta` variant as it is no longer required.
* Add a module that performs a `vault operator step-down` so that we can
  force leader elections in scenarios.
* Wire up an operator step-down into some scenarios to test both the old
  and new multiseal code paths during leader elections.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2024-04-22 12:34:47 -06:00
Ryan Cragun
a087f7b267
[QT-627] enos: add pkcs11 seal testing with softhsm (#24349)
Add support for testing `+ent.hsm` and `+ent.hsm.fips1402` Vault editions
with `pkcs11` seal types utilizing a shared `softhsm` token. Softhsm2 is
a software HSM that will load seal keys from a local disk via pkcs11.
The pkcs11 seal implementation is fairly complex as we have to create a
one or more shared tokens with various keys and distribute them to all
nodes in the cluster before starting Vault. We also have to ensure that
each sets labels are unique.

We also make a few quality of life updates by utilizing globals for
variants that don't often change and update base versions for various
scenarios.

* Add `seal_pkcs11` module for creating a `pkcs11` seal key using
  `softhsm2` as our backing implementation.
* Require the latest enos provider to gain access to the `enos_user`
  resource to ensure correct ownership and permissions of the
  `softhsm2` data directory and files.
* Add `pkcs11` seal to all scenarios that support configuring a seal
  type.
* Extract system package installation out of the `vault_cluster` module
  and into its own `install_package` module that we can reuse.
* Fix a bug when using the local builder variant that mangled the path.
  This likely slipped in during the migration to auto-version bumping.
* Fix an issue where restarting Vault nodes with a socket seal would
  fail because a seal socket sync wasn't available on all nodes. Now we
  start the socket listener on all nodes to ensure any node can become
  primary and "audit" to the socket listner.
* Remove unused attributes from some verify modules.
* Go back to using cheaper AWS regions.
* Use globals for variants.
* Update initial vault version for `upgrade` and `autopilot` scenarios.
* Update the consul versions for all scenarios that support a consul
  storage backend.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-12-08 14:00:45 -07:00
Ryan Cragun
807bacbc9c
test: don't use us-east-1 during an outage (#23396)
An ongoing incident in us-east-1 is impacting CI. We'll temporarily use
Ohio as it's cheaper than California.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-09-28 22:20:08 +00:00
Ryan Cragun
391cc1157a
[QT-602] Run proxy and agent test scenarios (#23176)
Update our `proxy` and `agent` scenarios to support new variants and
perform baseline verification and their scenario specific verification.
We integrate these updated scenarios into the pipeline by adding them
to artifact samples.

We've also improved the reliability of the `autopilot` and `replication`
scenarios by refactoring our IP address gathering. Previously, we'd ask
vault for the primary IP address and use some Terraform logic to determine
followers. The leader IP address gathering script was also implicitly
responsible for ensuring that a found leader was within a given group of
hosts, and thus waiting for a given cluster to have a leader, and also for
doing some arithmetic and outputting `replication` specific output data.
We've broken these responsibilities into individual modules, improved their
error messages, and fixed various races and bugs, including:
* Fix a race between creating the file audit device and installing and starting
  vault in the `replication` scenario.
* Fix how we determine our leader and follower IP addresses. We now query
  vault instead of a prior implementation that inferred the followers and sometimes
  did not allow all nodes to be an expected leader.
* Fix a bug where we'd always always fail on the first wrong condition
  in the `vault_verify_performance_replication` module.

We also performed some maintenance tasks on Enos scenarios  byupdating our
references from `oss` to `ce` to handle the naming and license changes. We
also enabled `shellcheck` linting for enos module scripts.

* Rename `oss` to `ce` for license and naming changes.
* Convert template enos scripts to scripts that take environment
  variables.
* Add `shellcheck` linting for enos module scripts.
* Add additional `backend` and `seal` support to `proxy` and `agent`
  scenarios.
* Update scenarios to include all baseline verification.
* Add `proxy` and `agent` scenarios to artifact samples.
* Remove IP address verification from the `vault_get_cluster_ips`
  modules and implement a new `vault_wait_for_leader` module.
* Determine follower IP addresses by querying vault in the
  `vault_get_cluster_ips` module.
* Move replication specific behavior out of the `vault_get_cluster_ips`
  module and into it's own `replication_data` module.
* Extend initial version support for the `upgrade` and `autopilot`
  scenarios.

We also discovered an issue with undo_logs that has been described in
the VAULT-20259. As such, we've disabled the undo_logs check until
it has been fixed.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-09-26 15:37:28 -06:00
Ryan Cragun
5449a99aba
test: wait for nc to be listening before enabling auditor (#23142)
Rather than assuming a short sleep will work, we instead wait until netcat is listening of the socket. We've also configured the netcat listener to persist after the first connection, which allows Vault and us to check the connection without the process closing.

As we implemented this we also ran into AWS issues in us-east-1 and us-west-2, so we've changed our deploy regions until those issues are resolved.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-09-18 14:47:13 -06:00
Marc Boudreau
e30c50321c
enable all audit devices in Enos's vault_cluster module (#22408) 2023-09-15 10:44:23 -04:00
Ryan Cragun
5f1d2c56a2
[QT-506] Use enos scenario samples for testing (#22641)
Replace our prior implementation of Enos test groups with the new Enos
sampling feature. With this feature we're able to describe which
scenarios and variant combinations are valid for a given artifact and
allow enos to create a valid sample field (a matrix of all compatible
scenarios) and take an observation (select some to run) for us. This
ensures that every valid scenario and variant combination will
now be a candidate for testing in the pipeline. See QT-504[0] for further
details on the Enos sampling capabilities.

Our prior implementation only tested the amd64 and arm64 zip artifacts,
as well as the Docker container. We now include the following new artifacts
in the test matrix:
* CE Amd64 Debian package
* CE Amd64 RPM package
* CE Arm64 Debian package
* CE Arm64 RPM package

Each artifact includes a sample definition for both pre-merge/post-merge
(build) and release testing.

Changes:
* Remove the hand crafted `enos-run-matrices` ci matrix targets and replace
  them with per-artifact samples.
* Use enos sampling to generate different sample groups on all pull
  requests.
* Update the enos scenario matrices to handle HSM and FIPS packages.
* Simplify enos scenarios by using shared globals instead of
  cargo-culted locals.

Note: This will require coordination with vault-enterprise to ensure a
smooth migration to the new system. Integrating new scenarios or
modifying existing scenarios/variants should be much smoother after this
initial migration.

[0] https://github.com/hashicorp/enos/pull/102

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-09-08 12:46:32 -06:00