* license: update headers to IBM Corp.
* `make proto`
* update offset because source file changed
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* [VAULT-39157] enos(cloud): add basic vault cloud scenario
Add the skeleton of a Vault Cloud scenario whereby we create an HCP
network, Vault Cloud cluster, and admin token.
In subsequent PRs we'll wire up building images, waiting on builds, and
ultimately fully testing the resulting image.
* copywrite: add headers
---------
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* Add Enos benchmark scenario
* add docs on how to run the scenario
* update description again
* see if this works better if we return an empty map
* hopefully disabling telemetry doesn't crash everything now
* yet another try at making telemetry configurable
* swap consul nodes over to be the same as the vault ones
* adjust up IOPs and add a note about it to the docs
* fix missing variables in the ec2 shim
* randomly pick an az for k6 and metrics instances
* enos(benchmark): further modularize and make target infra cloud agnostic
The initial goal of this was to resolve an issue where sometimes one or
more target instances would attempt to be provisioned in an
availability zone that doesn't support them. The target_ec2_instances
module already supports assigning based on instance offerings, so I
wanted to use it for all instances. It also has the side effect of
provisioning instances in parallel, which speeds up overall scenario time.
I ended up further modularizing the `benchmark` module into several
sub-modules that perform a single task well, and rely on provisioning in
the root module. This will allow us to utilize the module in other
clouds more easily should we desire to do that in the future.
Signed-off-by: Ryan Cragun <me@ryan.ec>
* add copywrite headers
Signed-off-by: Ryan Cragun <me@ryan.ec>
* address some feedback and limit disk iops to 16k by default
Signed-off-by: Ryan Cragun <me@ryan.ec>
---------
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* enos(artifactory): unify dev and test scenario artifactory metadata into new module
There was previously a lot of shared logic between
`build_artifactory_artifact` and `build_artifactory_package` when it
came to building an artifact name. When it comes down to it, both
modules are very similar; their only major difference is searching
for any artifact (released or not) by a combination of
`revision`, `edition`, `version`, and `type` vs. searching for a
released artifact by a combination of `version`, `edition`, and
`type`.
Rather than bolt on new `s390x` and `fips1403` artifact metadata to
both, I factored their metadata for package names and such into a
unified and shared `artifact/metadata` module that is now called by
both.
This was tricky as dev and test scenarios currently differ in what
we pass in as the `vault_version`, but we hope to remove that
difference soon. We also add metadata support for the forthcoming
FIPS 140-3.
This commit was tested extensively alongside other test scenarios in
support of `s390x`, but it will be useful immediately for FIPS 140-3,
so I've extracted it out.
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Fix artifactory metadata before merge
The initial pass of the artifactory metadata was largely untested and
extracted from a different branch. After testing, this commit fixes a
few issues with the metadata module.
In order to test this I also had to fix an issue where AWS secrets
engine testing had become a requirement but is impossible unless you execute
against a blessed AWS account that has the required roles. Instead, we now
make those verifications opt-in via a new variable.
We also make some improvements to the pki-verify-certificates script so
that it works reliably against all our supported distros.
We also update our dynamic configuration to use the updated versions in
samples.
Signed-off-by: Ryan Cragun <me@ryan.ec>
* add test
* add as module
* more debugging of scenario
* fixes
* smoke test working
* autopilot test working
* revert local autopilot changes, cleanup comments and raft remove peer changes
* enos fmt
* modules fmt
* add vault_install_dir
* skip removal correctly for consul
* lint
* pr fixes
* passed run
* pr comments
* change step name everywhere
* fix
* check correct field
* remove cluster_name
Verify vault secret integrity in unauthenticated I/O streams (audit log, STDOUT/STDERR via the systemd journal) by scanning the text with Vault Radar. We search for both known and unknown secrets using an index of KV v2 values as well as Radar's built-in heuristics for credentials, secrets, and keys (see the sketch below).
The verification has been added to many scenarios where a slight time increase is allowed, as we now have to install Vault Radar and scan the text. In practice this adds less than 10 seconds to the overall duration of a scenario.
In the in-place upgrade scenario we explicitly exclude this verification when upgrading from a version that we know will fail the check. We also make the verification opt-in so as not to require a Vault Radar license to run Enos scenarios, though it will always be enabled in CI.
As part of this we also update our enos workflow to utilize secret values from our self-hosted Vault when executing in the vault-enterprise repo context.
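A minimal Go sketch of the known-secrets side of the check, assuming a KV v2 mount named `secret`; the actual scan is performed by Vault Radar, so this only illustrates how an index of stored values can be matched against captured log text:

```go
package scan

import (
	"strings"

	vault "github.com/hashicorp/vault/api"
)

// knownSecretValues collects every string value stored under a KV v2 mount so
// the scan can look for verbatim leaks. The "secret" mount name is an
// assumption for illustration.
func knownSecretValues(client *vault.Client, mount string) ([]string, error) {
	list, err := client.Logical().List(mount + "/metadata")
	if err != nil || list == nil {
		return nil, err
	}

	var values []string
	keys, _ := list.Data["keys"].([]interface{})
	for _, k := range keys {
		name, _ := k.(string)
		if strings.HasSuffix(name, "/") { // skip nested folders in this sketch
			continue
		}
		secret, err := client.Logical().Read(mount + "/data/" + name)
		if err != nil || secret == nil {
			continue
		}
		data, _ := secret.Data["data"].(map[string]interface{})
		for _, v := range data {
			if s, ok := v.(string); ok && s != "" {
				values = append(values, s)
			}
		}
	}
	return values, nil
}

// leaksIn reports every known secret value that appears verbatim in the
// captured unauthenticated output (audit log, journald text, etc.).
func leaksIn(text string, values []string) []string {
	var leaks []string
	for _, v := range values {
		if strings.Contains(text, v) {
			leaks = append(leaks, v)
		}
	}
	return leaks
}
```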
Signed-off-by: Ryan Cragun <me@ryan.ec>
* VAULT-30819: verify DR secondary leader before unsealing followers
After we've enabled DR replication on the secondary leader, the existing
cluster followers will be resealed with the primary cluster's encryption
keys. We have to unseal the followers to make them available. To ensure
that we take every precaution before attempting to unseal the
followers, we now verify that the secondary leader is the cluster leader,
has a valid merkle tree, and is streaming WALs from the primary cluster
before we attempt to unseal the secondary followers.
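A minimal Go sketch of that verification using the Vault Go API client and the Enterprise `sys/replication/dr/status` endpoint; the field names mirror the documented status response, and the checks are illustrative rather than the exact module logic:

```go
package main

import (
	"fmt"
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	// Client pointed at the DR secondary leader (VAULT_ADDR/VAULT_TOKEN from env).
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	status, err := client.Logical().Read("sys/replication/dr/status")
	if err != nil || status == nil {
		log.Fatalf("failed to read DR replication status: %v", err)
	}

	mode := status.Data["mode"]
	state := status.Data["state"]

	// Only proceed to unseal followers once the secondary is streaming WALs,
	// i.e. it has a valid merkle tree and a live connection to the primary.
	if mode != "secondary" || state != "stream-wals" {
		log.Fatalf("secondary not ready: mode=%v state=%v", mode, state)
	}
	fmt.Println("DR secondary is ready; safe to unseal followers")
}
```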
Signed-off-by: Ryan Cragun <me@ryan.ec>
Fix two occasional flakes in the DR replication scenario:
* Always verify that all nodes in the cluster are unsealed before
verifying test data. Previously we only verified seal status on
followers.
* Fix an occasional timeout when waiting for the cluster to unseal by
rewriting the module to retry for a set duration instead of using
exponential backoff (see the sketch below).
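A minimal Go sketch of the retry-for-a-set-duration approach, assuming the Vault Go API client; the polling interval and timeout values are illustrative:

```go
package main

import (
	"context"
	"log"
	"time"

	vault "github.com/hashicorp/vault/api"
)

// waitUntilUnsealed polls seal status at a fixed interval until the node
// reports unsealed or the deadline passes.
func waitUntilUnsealed(ctx context.Context, client *vault.Client, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()

	for {
		status, err := client.Sys().SealStatus()
		if err == nil && !status.Sealed {
			return nil
		}

		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	if err := waitUntilUnsealed(context.Background(), client, 5*time.Minute); err != nil {
		log.Fatal(err)
	}
}
```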
Signed-off-by: Ryan Cragun <me@ryan.ec>
* [VAULT-30189] enos: verify identity and OIDC tokens
Expand our baseline API and data verification by including the identity
and identity OIDC tokens secrets engines. We now create a test entity,
entity-alias, identity group, various policies, and associate them with
the entity. For the OIDC side, we now configure the OIDC issuer, create
and rotate named keys, create and associate roles with the named key,
and issue and introspect tokens.
During a second phase we also verify that those same entities,
groups, keys, roles, config, etc. all exist with the expected values.
This is useful to test durability after upgrades, migrations, etc.
This change also updates our prior `auth/userpass` and `kv`
verification. We had two modules that were loosely coupled and
interdependent. This restructures them into a single module with
child modules and fixes the assumed values by requiring the read module
to verify against the created state.
Going forward we can continue to extend this secrets engine verification
module with additional create and read checks for new secrets engines.
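A minimal Go sketch of the create-and-introspect flow described above, using the public `identity/` API paths; names such as `enos-key` and `enos-role` are illustrative, and in practice the token request must be made with a token tied to the test entity (e.g. via the test userpass user), since a root token has no entity:

```go
package main

import (
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Create a test entity (names are illustrative).
	if _, err := client.Logical().Write("identity/entity", map[string]interface{}{
		"name":     "enos-test-entity",
		"policies": []string{"default"},
	}); err != nil {
		log.Fatal(err)
	}

	// Create and rotate a named OIDC key, then bind a role to it.
	if _, err := client.Logical().Write("identity/oidc/key/enos-key", map[string]interface{}{
		"allowed_client_ids": []string{"*"},
	}); err != nil {
		log.Fatal(err)
	}
	if _, err := client.Logical().Write("identity/oidc/key/enos-key/rotate", map[string]interface{}{}); err != nil {
		log.Fatal(err)
	}
	if _, err := client.Logical().Write("identity/oidc/role/enos-role", map[string]interface{}{
		"key": "enos-key",
		"ttl": "1h",
	}); err != nil {
		log.Fatal(err)
	}

	// Issue a token for the role and introspect it. This call must be made by
	// a token associated with an entity.
	token, err := client.Logical().Read("identity/oidc/token/enos-role")
	if err != nil || token == nil {
		log.Fatalf("failed to issue OIDC token: %v", err)
	}
	introspect, err := client.Logical().Write("identity/oidc/introspect", map[string]interface{}{
		"token": token.Data["token"],
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("token active: %v", introspect.Data["active"])
}
```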
Signed-off-by: Ryan Cragun <me@ryan.ec>
* auto-roll billing start enos test
* enos: don't expect curl available in docker image (#27984)
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Update interoperability-matrix.mdx (#27977)
Updating the existing Vault/YubiHSM integration with a newer version of Vault as well as now supporting Managed Keys.
* Update hana db pkg (#27950)
* database/hana: use go-hdb v1.10.1
* docs/hana: quotes around password so dashes don't break it
* Clarify audit log failure telemetry docs. (#27969)
* Clarify audit log failure telemetry docs.
* Add the note about the misleading counts
* Auto-rolling billing start docs PR (#27926)
* auto-roll docs changes
* addressing comments
* address comments
* Update website/content/api-docs/system/internal-counters.mdx
Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>
* addressing some changes
* update docs
* update docs with common explanation file
* updated note info
* fix 1.18 upgrade doc
* fix content-check error
* Update website/content/partials/auto-roll-billing-start-example.mdx
Co-authored-by: miagilepner <mia.epner@hashicorp.com>
---------
Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>
Co-authored-by: miagilepner <mia.epner@hashicorp.com>
* docker: add upgrade notes for curl removal (#27995)
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Update vault-plugin-auth-jwt to v0.21.1 (#27992)
* docs: fix upgrade 1.16.x (#27999)
Signed-off-by: Ryan Cragun <me@ryan.ec>
* UI: Add unsupportedCriticalCertExtensions to jwt config expected payload (#27996)
* Client Count Docs Updates/Cleanup (#27862)
* Docs changes
* More condensation of docs
* Added some clarity on date ranges
* Edited wording
* Added estimation client count info
* Update website/content/api-docs/system/internal-counters.mdx
Co-authored-by: miagilepner <mia.epner@hashicorp.com>
---------
Co-authored-by: miagilepner <mia.epner@hashicorp.com>
* update(kubernetes.mdx): k8s-tokenreview URL (#27993)
Co-authored-by: akshya96 <87045294+akshya96@users.noreply.github.com>
* Update programmatic-management.mdx to clarify Terraform prereqs (#27548)
* UI: Replace getNewModel with hydrateModel when model exists (#27978)
* Replace getNewModel with hydrateModel when model exists
* Update getNewModel to only handle nonexistent model types
* Update test
* clarify test
* Fix auth-config models which need hydration not generation
* rename file to match service name
* cleanup + tests
* Add comment about helpUrl method
* Changelog for 1.17.3, 1.16.7 enterprise, 1.15.13 enterprise (#28018)
* changelog for 1.17.3, 1.16.7 enterprise, 1.15.13 enterprise
* Add spacing to match older changelogs
* Fix typo in variables.tf (#27693)
intialize -> initialize
Co-authored-by: akshya96 <87045294+akshya96@users.noreply.github.com>
* Update 1_15-auto-upgrade.mdx (#27675)
* Update 1_15-auto-upgrade.mdx
* Update known issue version numbers for AP issue
---------
Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>
* Update 1_16-default-policy-needs-to-be-updated.mdx (#27157)
Made a few grammar changes plus updated the term from Vault IU to Vault UI
* change instances variable to hosts
* for each hosts
* add cluster addr port
* Add ENVs using NewTestDockerCluster (#27457)
* Add ENVs using NewTestDockerCluster
Previously, NewTestDockerCluster had no means of setting
environment variables. This made it tricky to create tests
for functionality that requires them, like having to set
AWS environment variables.
DockerClusterOptions now exposes an option to pass extra
environment variables to the containers, which are appended
to the existing ones (see the sketch below).
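A minimal Go sketch of how a test might use the new option; the `Envs` field name and the image/env values are assumptions for illustration, not necessarily the exact API added by this PR:

```go
package dockertest

import (
	"testing"

	"github.com/hashicorp/vault/sdk/helper/testcluster/docker"
)

func TestWithAWSEnv(t *testing.T) {
	opts := &docker.DockerClusterOptions{
		ImageRepo: "hashicorp/vault",
		ImageTag:  "latest",
		// Extra environment variables appended to the container's existing
		// ones. The field name `Envs` is an assumption for illustration.
		Envs: []string{
			"AWS_REGION=us-east-1",
			"AWS_ACCESS_KEY_ID=dummy",
			"AWS_SECRET_ACCESS_KEY=dummy",
		},
	}

	cluster := docker.NewTestDockerCluster(t, opts)
	defer cluster.Cleanup()

	// ... exercise functionality that depends on the injected variables ...
}
```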
* adding changelog
* added test case for setting env variables to containers
* fix changelog typo; env name
* Update changelog/27457.txt
Co-authored-by: akshya96 <87045294+akshya96@users.noreply.github.com>
* adding the missing copyright
---------
Co-authored-by: akshya96 <87045294+akshya96@users.noreply.github.com>
* UI: Build KV v2 overview page (#28106)
* move date-from-now helper to addon
* make overview cards consistent across engines
* make kv-paths-card component
* remove overview margin all together
* small styling changes for paths card
* small selector additions
* add overview card test
* add overview page and test
* add default timestamp format
* cleanup paths test
* fix dateFromNow import
* fix selectors, cleanup pki selectors
* and more selector cleanup
* make deactivated state single arg
* fix template and remove @isDeleted and @isDestroyed
* add test and hide badge unless deactivated
* address failings from changing selectors
* oops, not ready to show overview tab just yet!
* add deletionTime to currentSecret metadata getter
* Bump actions/download-artifact from 4.1.7 to 4.1.8 (#27704)
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4.1.7 to 4.1.8.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](65a9edc588...fa0a91b85d)
---
updated-dependencies:
- dependency-name: actions/download-artifact
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: akshya96 <87045294+akshya96@users.noreply.github.com>
* Bump actions/setup-node from 4.0.2 to 4.0.3 (#27738)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4.0.2 to 4.0.3.
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](60edb5dd54...1e60f620b9)
---
updated-dependencies:
- dependency-name: actions/setup-node
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: akshya96 <87045294+akshya96@users.noreply.github.com>
* Add valid IP callout (#28112)
Co-authored-by: Yoko Hyakuna <yoko@hashicorp.com>
* Refactor SSH Configuration workflow (#28122)
* initial copy from other #28004
* pr feedback
* grr
* Bump browser-actions/setup-chrome from 1.7.1 to 1.7.2 (#28101)
Bumps [browser-actions/setup-chrome](https://github.com/browser-actions/setup-chrome) from 1.7.1 to 1.7.2.
- [Release notes](https://github.com/browser-actions/setup-chrome/releases)
- [Changelog](https://github.com/browser-actions/setup-chrome/blob/master/CHANGELOG.md)
- [Commits](db1b524c26...facf10a55b)
---
updated-dependencies:
- dependency-name: browser-actions/setup-chrome
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: divyaac <divya.chandrasekaran@hashicorp.com>
* Bump vault-gcp-secrets-plugin (#28089)
Co-authored-by: divyaac <divya.chandrasekaran@hashicorp.com>
* docs: correct list syntax (#28119)
Co-authored-by: divyaac <divya.chandrasekaran@hashicorp.com>
* add semgrep constraint check in skip step
---------
Signed-off-by: Ryan Cragun <me@ryan.ec>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Adam Rowan <92474478+bear359@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
Co-authored-by: Paul Banks <pbanks@hashicorp.com>
Co-authored-by: Sarah Chavis <62406755+schavis@users.noreply.github.com>
Co-authored-by: miagilepner <mia.epner@hashicorp.com>
Co-authored-by: Scott Miller <smiller@hashicorp.com>
Co-authored-by: Chelsea Shaw <82459713+hashishaw@users.noreply.github.com>
Co-authored-by: divyaac <divya.chandrasekaran@hashicorp.com>
Co-authored-by: Roman O'Brien <58272664+romanobrien@users.noreply.github.com>
Co-authored-by: Adrian Todorov <adrian.todorov@hashicorp.com>
Co-authored-by: VAL <val@hashicorp.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: Owen Zhang <86668876+owenzorrin@users.noreply.github.com>
Co-authored-by: gkoutsou <gkoutsou@users.noreply.github.com>
Co-authored-by: claire bontempo <68122737+hellobontempo@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Frappier <92055993+jonathanfrappier@users.noreply.github.com>
Co-authored-by: Yoko Hyakuna <yoko@hashicorp.com>
Co-authored-by: Angel Garbarino <Monkeychip@users.noreply.github.com>
Co-authored-by: Max Levine <max@maxlevine.co.uk>
Co-authored-by: Steffy Fort <steffyfort@gmail.com>
* VAULT-28146: Add IPV6 support to enos scenarios
Add support for testing all raft storage scenarios and variants when
running Vault with IPV6 networking. We retain our previous support for
IPV4 and create a new variant `ip_version` which can be used to
configure the IP version that we wish to test with.
It's important to note that the VPC in IPV6 mode is technically mixed
and that target machines still have public IPV4 addresses associated with
them. That allows us to execute our resources against them from IPV4
networks like developer machines and CI runners. Despite that, we've taken
care to ensure that only IPV6 addresses are used in IPV6 mode.
Because we had previously assumed the IP version, Vault address, and
listener ports in so many places, this PR is essentially a rewrite that
removes those assumptions. There are also a few places where
improvements to scenarios have been included as I encountered them while
working on the IPV6 changes.
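A minimal Go sketch of the kind of assumption that had to be removed: building an API address that works for both IP versions instead of hard-coding IPV4 formatting:

```go
package main

import (
	"fmt"
	"net"
)

// vaultAddr builds an API address that works for both IP versions by letting
// net.JoinHostPort add the brackets that IPv6 literals require.
func vaultAddr(ip string, port int) string {
	return fmt.Sprintf("https://%s", net.JoinHostPort(ip, fmt.Sprintf("%d", port)))
}

func main() {
	fmt.Println(vaultAddr("10.13.10.8", 8200))    // https://10.13.10.8:8200
	fmt.Println(vaultAddr("2600:1f28::21", 8200)) // https://[2600:1f28::21]:8200
}
```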
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Better handle symlinks in artifact paths (see the sketch after this list).
* Fix a race condition in the local builder where Terraform wouldn't
wait for local builds to finish before attempting to install vault on
target nodes.
* Make building the web ui configurable in the dev scenario.
* Rename `vault_artifactory_artifact` to `build_artifactory_artifact` to
better align with existing "build" modules.
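A minimal Go sketch of resolving symlinked artifact paths before installation; the bundle path is illustrative:

```go
package main

import (
	"log"
	"os"
	"path/filepath"
)

// resolveArtifact follows any symlinks in an artifact path and verifies the
// final target exists before it is handed to the installer.
func resolveArtifact(path string) (string, error) {
	resolved, err := filepath.EvalSymlinks(path)
	if err != nil {
		return "", err
	}
	if _, err := os.Stat(resolved); err != nil {
		return "", err
	}
	return resolved, nil
}

func main() {
	path, err := resolveArtifact("dist/vault_linux_amd64.zip")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("installing artifact from %s", path)
}
```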
Signed-off-by: Ryan Cragun <me@ryan.ec>
* [VAULT-26888] Create developer scenarios
Create developer scenarios that have simplified inputs designed for
provisioning clusters and limited verification.
* Migrate Artifactory installation module from support team focused
scenarios to the vault repository.
* Migrate support focused scenarios to the repo and update them to use
the latest in-repo modules.
* Fully document and comment scenarios to help users outline, configure,
and use the scenarios.
* Remove outdated references to the private registry, which is no longer needed.
* Automatically configure the login shell profile to include the path to
the vault binary and the VAULT_ADDR/VAULT_TOKEN environment variables.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Add `config_mode` variant to some scenarios so we can dynamically change
how we primarily configure the Vault cluster, either by a configuration
file or with environment variables.
As part of this change we also:
* Start consuming the Enos terraform provider from public Terraform
registry.
* Remove the old `seal_ha_beta` variant as it is no longer required.
* Add a module that performs a `vault operator step-down` so that we can
force leader elections in scenarios (see the sketch after this list).
* Wire up an operator step-down into some scenarios to test both the old
and new multiseal code paths during leader elections.
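A minimal Go sketch of forcing and waiting out a leader election with the Vault Go API client; the timeout and polling interval are illustrative:

```go
package main

import (
	"log"
	"time"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Force a leader election by stepping the active node down.
	if err := client.Sys().StepDown(); err != nil {
		log.Fatal(err)
	}

	// Wait (bounded) for a new leader to be elected.
	deadline := time.Now().Add(2 * time.Minute)
	for time.Now().Before(deadline) {
		leader, err := client.Sys().Leader()
		if err == nil && leader.HAEnabled && leader.LeaderAddress != "" {
			log.Printf("new leader: %s", leader.LeaderAddress)
			return
		}
		time.Sleep(3 * time.Second)
	}
	log.Fatal("timed out waiting for a leader after step-down")
}
```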
Signed-off-by: Ryan Cragun <me@ryan.ec>
Add support for testing `+ent.hsm` and `+ent.hsm.fips1402` Vault editions
with `pkcs11` seal types utilizing a shared `softhsm` token. Softhsm2 is
a software HSM that will load seal keys from a local disk via pkcs11.
The pkcs11 seal implementation is fairly complex, as we have to create
one or more shared tokens with various keys and distribute them to all
nodes in the cluster before starting Vault. We also have to ensure that
each set's labels are unique.
We also make a few quality of life updates by utilizing globals for
variants that don't often change and update base versions for various
scenarios.
* Add `seal_pkcs11` module for creating a `pkcs11` seal key using
`softhsm2` as our backing implementation.
* Require the latest enos provider to gain access to the `enos_user`
resource to ensure correct ownership and permissions of the
`softhsm2` data directory and files.
* Add `pkcs11` seal to all scenarios that support configuring a seal
type.
* Extract system package installation out of the `vault_cluster` module
and into its own `install_package` module that we can reuse.
* Fix a bug when using the local builder variant that mangled the path.
This likely slipped in during the migration to auto-version bumping.
* Fix an issue where restarting Vault nodes with a socket seal would
fail because a seal socket sync wasn't available on all nodes. Now we
start the socket listener on all nodes to ensure any node can become
primary and "audit" to the socket listener.
* Remove unused attributes from some verify modules.
* Go back to using cheaper AWS regions.
* Use globals for variants.
* Update initial vault version for `upgrade` and `autopilot` scenarios.
* Update the consul versions for all scenarios that support a consul
storage backend.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Test HA seal migration in the `seal_ha` scenario by removing the primary
seal, ensuring seal rewrap has completed, and verifying that data written
through the original primary seal is available through the new primary seal.
We also add a verification for the seal type at various stages of the scenario.
* Allow configuring the seal alias and priority in the `start_vault`
module.
* Add seal migration to `seal_ha` scenario.
* Verify the data written through the original primary seal after the
seal migration.
* [QT-629] Verify the seal type at various stages in `seal_ha` (see the sketch below).
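A minimal Go sketch of the post-migration checks, assuming the Vault Go API client and the Enterprise `sys/sealwrap/rewrap` endpoint; the expected seal type is illustrative:

```go
package main

import (
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Verify that the expected seal is active after the migration.
	status, err := client.Sys().SealStatus()
	if err != nil {
		log.Fatal(err)
	}
	if status.Type != "awskms" { // expected new primary seal type; illustrative
		log.Fatalf("unexpected seal type: %s", status.Type)
	}

	// Verify that the seal rewrap has completed (Enterprise endpoint).
	rewrap, err := client.Logical().Read("sys/sealwrap/rewrap")
	if err != nil || rewrap == nil {
		log.Fatalf("failed to read rewrap status: %v", err)
	}
	if running, _ := rewrap.Data["is_running"].(bool); running {
		log.Fatal("seal rewrap is still in progress")
	}
	log.Printf("seal type %q verified, rewrap entries: %v", status.Type, rewrap.Data["entries"])
}
```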
Signed-off-by: Ryan Cragun <me@ryan.ec>
Add support for testing Vault Enterprise HA seals by adding
a new `seal_ha` scenario that configures more than one seal type for a
Vault cluster. We also extend existing scenarios to support testing
with or without the Seal HA code path enabled.
* Extract starting vault into a separate enos module to allow for better
handling of complex clusters that need to be started more than once.
* Extract seal key creation into a separate module and provide it to
target modules. This allows us to create more than one seal key and
associate it with instances. This also allows us to forego creating
keys when using shamir seals.
* [QT-615] Add support for configuring more than one seal type in the
`vault_cluster` module.
* [QT-616] Add `seal_ha` scenario
* [QT-625] Add `seal_ha_beta` variant to existing scenarios to test with
both code paths.
* Unpin action-setup-terraform
* Add `kms:TagResource` to service user IAM profile
Signed-off-by: Ryan Cragun <me@ryan.ec>
Update our `proxy` and `agent` scenarios to support new variants and
perform baseline verification and their scenario specific verification.
We integrate these updated scenarios into the pipeline by adding them
to artifact samples.
We've also improved the reliability of the `autopilot` and `replication`
scenarios by refactoring our IP address gathering. Previously, we'd ask
vault for the primary IP address and use some Terraform logic to determine
followers. The leader IP address gathering script was also implicitly
responsible for ensuring that a found leader was within a given group of
hosts (and thus for waiting for a given cluster to have a leader), and also
for doing some arithmetic and emitting `replication`-specific output data.
We've broken these responsibilities into individual modules, improved their
error messages, and fixed various races and bugs, including:
* Fix a race between creating the file audit device and installing and starting
vault in the `replication` scenario.
* Fix how we determine our leader and follower IP addresses. We now query
vault instead of relying on a prior implementation that inferred the followers
and sometimes did not allow all nodes to be an expected leader.
* Fix a bug where we'd always fail on the first wrong condition
in the `vault_verify_performance_replication` module.
We also performed some maintenance tasks on Enos scenarios by updating our
references from `oss` to `ce` to handle the naming and license changes. We
also enabled `shellcheck` linting for enos module scripts.
* Rename `oss` to `ce` for license and naming changes.
* Convert template enos scripts to scripts that take environment
variables.
* Add `shellcheck` linting for enos module scripts.
* Add additional `backend` and `seal` support to `proxy` and `agent`
scenarios.
* Update scenarios to include all baseline verification.
* Add `proxy` and `agent` scenarios to artifact samples.
* Remove IP address verification from the `vault_get_cluster_ips`
modules and implement a new `vault_wait_for_leader` module.
* Determine follower IP addresses by querying vault in the
`vault_get_cluster_ips` module (see the sketch after this list).
* Move replication-specific behavior out of the `vault_get_cluster_ips`
module and into its own `replication_data` module.
* Extend initial version support for the `upgrade` and `autopilot`
scenarios.
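A minimal Go sketch of determining the leader and followers by querying vault directly, as the `vault_get_cluster_ips` module now does; the host addresses and token are illustrative:

```go
package main

import (
	"log"

	vault "github.com/hashicorp/vault/api"
)

// splitLeaderFollowers asks each host directly whether it is the active node
// rather than inferring followers from provisioning order.
func splitLeaderFollowers(addrs []string, token string) (leader string, followers []string, err error) {
	for _, addr := range addrs {
		cfg := vault.DefaultConfig()
		cfg.Address = addr
		client, err := vault.NewClient(cfg)
		if err != nil {
			return "", nil, err
		}
		client.SetToken(token)

		resp, err := client.Sys().Leader()
		if err != nil {
			return "", nil, err
		}
		if resp.IsSelf {
			leader = addr
		} else {
			followers = append(followers, addr)
		}
	}
	return leader, followers, nil
}

func main() {
	leader, followers, err := splitLeaderFollowers(
		[]string{"http://10.13.10.5:8200", "http://10.13.10.6:8200", "http://10.13.10.7:8200"},
		"root",
	)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("leader: %s followers: %v", leader, followers)
}
```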
We also discovered an issue with undo_logs that has been described in
VAULT-20259. As such, we've disabled the undo_logs check until
it has been fixed.
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Updating the license from MPL to Business Source License.
Going forward, this project will be licensed under the Business Source License v1.1. Please see our blog post for more details at https://hashi.co/bsl-blog, FAQ at www.hashicorp.com/licensing-faq, and details of the license at www.hashicorp.com/bsl.
* add missing license headers
* Update copyright file headers to BUS-1.1
* Fix test that expected exact offset on hcl file
---------
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
Co-authored-by: Sarah Thompson <sthompson@hashicorp.com>
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
* Sync missing scenarios and modules
* Clean up variables and examples vars
* Add a `lint` make target for enos
* Update enos `fmt` workflow to run the `lint` target.
* Always use ipv4 addresses in target security groups.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Add an updated `target_ec2_instances` module that is capable of
dynamically splitting target instances over subnets/AZs that are
compatible with the AMI architecture and the associated instance type
for the architecture. Use the `target_ec2_instances` module where
necessary. Ensure that `raft` storage scenarios don't provision
unnecessary infrastructure with a new `target_ec2_shim` module.
After a lot of trial, the state of EC2 spot instance capacity, the
associated APIs, and the current support for different fleet types in the
AWS Terraform provider have proven spot instances too unreliable for
scenario targets.
The current state of each method:
* `target_ec2_fleet`: unusable due to the fact that the `instant` type
does not guarantee fulfillment of either `spot` or `on-demand`
instance request types. The module does support both `on-demand` and
`spot` request types and is capable of bidding across a maximum of
four availability zones, which makes it an attractive choice if the
`instant` type would always fulfill requests. Perhaps a `request` type
with `wait_for_fulfillment` option like `aws_spot_fleet_request` would
make it more viable for future consideration.
* `target_ec2_spot_fleet`: more reliable if bidding for target instances
that have capacity in the chosen zone. Issues in the AWS provider
prevent us from bidding across multiple zones successfully. Over the
last 2-3 months target capacity for the instance types we'd prefer to
use has dropped dramatically and the price is near or at on-demand.
The volatility for nearly no cost savings means we should put this
option on the shelf for now.
* `target_ec2_instances`: the most reliable method we've got. It is now
capable of automatically determining which subnets and availability
zones to provision targets in and has been updated to be usable for
both Vault and Consul targets. By default we use the cheapest medium
instance types that we've found to be reliable for testing vault (see
the sketch after this list).
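A minimal Go sketch (using the AWS SDK) of the lookup the module performs to find availability zones that actually offer a given instance type; the Terraform module does the equivalent with its own data sources, so this is illustrative only:

```go
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/ec2"
	"github.com/aws/aws-sdk-go-v2/service/ec2/types"
)

// zonesOffering returns the availability zones in the current region that
// offer the given instance type, so targets are never scheduled into a zone
// that can't run them.
func zonesOffering(ctx context.Context, client *ec2.Client, instanceType string) ([]string, error) {
	out, err := client.DescribeInstanceTypeOfferings(ctx, &ec2.DescribeInstanceTypeOfferingsInput{
		LocationType: types.LocationTypeAvailabilityZone,
		Filters: []types.Filter{{
			Name:   aws.String("instance-type"),
			Values: []string{instanceType},
		}},
	})
	if err != nil {
		return nil, err
	}

	var zones []string
	for _, offering := range out.InstanceTypeOfferings {
		zones = append(zones, aws.ToString(offering.Location))
	}
	return zones, nil
}

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}

	zones, err := zonesOffering(ctx, ec2.NewFromConfig(cfg), "t3.medium")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("zones offering t3.medium: %v", zones)
}
```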
* Update .gitignore
* enos/modules/create_vpc: create a subnet for every availability zone
* enos/modules/target_ec2_fleet: bid across the maximum of four
availability zones for targets
* enos/modules/target_ec2_spot_fleet: attempt to make the spot fleet bid
across more availability zones for targets
* enos/modules/target_ec2_instances: create module to use
ec2:RunInstances for scenario targets
* enos/modules/target_ec2_shim: create shim module to satisfy the
target module interface
* enos/scenarios: use target_ec2_shim for backend targets on raft
storage scenarios
* enos/modules/az_finder: remove unused module
Signed-off-by: Ryan Cragun <me@ryan.ec>
We seem to hit occasional capacity issues when attempting to launch spot
fleets with arm64 instance types. After checking pricing in the regions
that we use, it appears that current and older generation amd64 t2 and
t3 instance types are running at quite a discount whereas t4 arm64
instances are barely under on-demand price, suggesting limited capacity
for arm64 spot instances at this time. We'll change our default backend
instance architecture to amd64 to bid for the cheaper t2 and t3
instances and increase our `max_price` globally to that of a RHEL
machine running on-demand with a t3.medium.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Begin the process of migrating away from the "strongly encouraged not to
use"[0] EC2 spot fleet API to the more modern `ec2:CreateFleet`.
Unfortunately the `instant` type fleet does not guarantee fulfillment
with either on-demand or spot types. We'll need to add a feature similar
to `wait_for_fulfillment` on the `spot_fleet_request` resource[1] to
`ec2_fleet` before we can rely on it.
We also update the existing target fleets to support provisioning generic
targets. This has allowed us to remove our usage of `terraform-enos-aws-consul`
and replace it with a smaller `backend_consul` module in-repo.
We also remove `terraform-enos-aws-infra` and replace it with two smaller
in-repo modules, `ec2_info` and `create_vpc`. This has allowed us to simplify
the vpc resources we use for each scenario, which in turn allows us to
not rely on flaky resources.
As part of this refactor we've also made it possible to provision
targets using different distro versions.
[0] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-best-practices.html#which-spot-request-method-to-use
[1] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/spot_fleet_request#wait_for_fulfillment
* enos/consul: add `backend_consul` module that accepts target hosts.
* enos/target_ec2_spot_fleet: add support for consul networking.
* enos/target_ec2_spot_fleet: add support for customizing cluster tag
key.
* enos/scenarios: create `target_ec2_fleet` which uses a more modern
`ec2_fleet` API.
* enos/create_vpc: replace `terraform-enos-aws-infra` with smaller and
simplified version. Flatten the networking to a single route on the
default route table and a single subnet.
* enos/ec2_info: add a new module to give us useful ec2 information
including AMI id's for various arch/distro/version combinations.
* enos/ci: update service user role to allow for managing ec2 fleets.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Use the latest version of enos-provider and upstream consul module.
These changes allow us to configure the vault log level in configuration
and also support configuring consul with an enterprise license.
Signed-off-by: Ryan Cragun <me@ryan.ec>
The previous strategy for provisioning infrastructure targets was to use
the cheapest instances that could reliably perform as Vault cluster
nodes. With this change we introduce a new model for target node
infrastructure. We've replaced on-demand instances with a spot
fleet. While the spot price fluctuates based on dynamic pricing,
capacity, region, instance type, and platform, cost savings for our
most common combinations range between 20-70%.
This change only includes spot fleet targets for Vault clusters.
We'll be updating our Consul backend bidding in another PR.
* Create a new `vault_cluster` module that handles installation,
configuration, initializing, and unsealing Vault clusters.
* Create a `target_ec2_instances` module that can provision a group of
instances on-demand.
* Create a `target_ec2_spot_fleet` module that can bid on a fleet of
spot instances.
* Extend every Enos scenario to utilize the spot fleet target acquisition
strategy and the `vault_cluster` module.
* Update our Enos CI modules to handle both the `aws-nuke` permissions
and also the privileges to provision spot fleets.
* Only use us-east-1 and us-west-2 in our scenario matrices, as costs are
lower than in us-west-1.
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Adding an Enos test for undo logs
* fixing a typo
* feedback
* fixing typo
* running make fmt
* removing a dependency
* var name change
* fixing a variable
* fix builder
* fix product version
* adding required fields
* feedback
* add artifact bundle back
* fmt check
* point to correct instance
* minor fix
* feedback
* feedback
Add our initial Enos integration tests to Vault. The Enos scenario
workflow will automatically be run on branches that are created from the
`hashicorp/vault` repository. See the README.md in ./enos for a full
description of how to compose and execute scenarios locally.
* Simplify the metadata build workflow jobs
* Automatically determine the Go version from go.mod
* Add formatting check for Enos integration scenarios
* Add Enos smoke and upgrade integration scenarios
* Add Consul backend matrix support
* Add Ubuntu and RHEL distro support
* Add Vault edition support
* Add Vault architecture support
* Add Vault builder support
* Add Vault Shamir and awskms auto-unseal support
* Add Raft storage support
* Add Raft auto-join voter verification (see the sketch after this list)
* Add Vault version verification
* Add Vault seal verification
* Add in-place upgrade support for all variants
* Add four scenario variants to CI. These test a maximal distribution of
the aforementioned variants with the `linux/amd64` Vault install
bundle.
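A minimal Go sketch of the Raft voter check, reading `sys/storage/raft/configuration` with the Vault Go API client and asserting every server has been promoted to a voter:

```go
package main

import (
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	resp, err := client.Logical().Read("sys/storage/raft/configuration")
	if err != nil || resp == nil {
		log.Fatalf("failed to read raft configuration: %v", err)
	}

	cfg, _ := resp.Data["config"].(map[string]interface{})
	servers, _ := cfg["servers"].([]interface{})
	for _, s := range servers {
		server, _ := s.(map[string]interface{})
		if voter, _ := server["voter"].(bool); !voter {
			log.Fatalf("raft node %v has not been promoted to voter", server["node_id"])
		}
	}
	log.Printf("all %d raft nodes are voters", len(servers))
}
```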
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Rebecca Willett <rwillett@hashicorp.com>
Co-authored-by: Jaymala <jaymalasinha@gmail.com>