Replace our prior implementation of Enos test groups with the new Enos
sampling feature. With this feature we're able to describe which
scenarios and variant combinations are valid for a given artifact and
allow enos to create a valid sample field (a matrix of all compatible
scenarios) and take an observation (select some to run) for us. This
ensures that every valid scenario and variant combination will
now be a candidate for testing in the pipeline. See QT-504[0] for further
details on the Enos sampling capabilities.
Our prior implementation only tested the amd64 and arm64 zip artifacts,
as well as the Docker container. We now include the following new artifacts
in the test matrix:
* CE Amd64 Debian package
* CE Amd64 RPM package
* CE Arm64 Debian package
* CE Arm64 RPM package
Each artifact includes a sample definition for both pre-merge/post-merge
(build) and release testing.
Changes:
* Remove the hand crafted `enos-run-matrices` ci matrix targets and replace
them with per-artifact samples.
* Use enos sampling to generate different sample groups on all pull
requests.
* Update the enos scenario matrices to handle HSM and FIPS packages.
* Simplify enos scenarios by using shared globals instead of
cargo-culted locals.
Note: This will require coordination with vault-enterprise to ensure a
smooth migration to the new system. Integrating new scenarios or
modifying existing scenarios/variants should be much smoother after this
initial migration.
[0] https://github.com/hashicorp/enos/pull/102
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Adding explicit MPL license for sub-package.
This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.
* Updating the license from MPL to Business Source License.
Going forward, this project will be licensed under the Business Source License v1.1. Please see our blog post for more details at https://hashi.co/bsl-blog, FAQ at www.hashicorp.com/licensing-faq, and details of the license at www.hashicorp.com/bsl.
* add missing license headers
* Update copyright file headers to BUS-1.1
* Fix test that expected exact offset on hcl file
---------
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
Co-authored-by: Sarah Thompson <sthompson@hashicorp.com>
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
* Sync missing scenarios and modules
* Clean up variables and examples vars
* Add a `lint` make target for enos
* Update enos `fmt` workflow to run the `lint` target.
* Always use ipv4 addresses in target security groups.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Add an updated `target_ec2_instances` module that is capable of
dynamically splitting target instances over subnet/az's that are
compatible with the AMI architecture and the associated instance type
for the architecture. Use the `target_ec2_instances` module where
necessary. Ensure that `raft` storage scenarios don't provision
unnecessary infrastructure with a new `target_ec2_shim` module.
After a lot of trial, the state of Ec2 spot instance capacity, their
associated APIs, and current support for different fleet types in AWS
Terraform provider, have proven to make using spot instances for
scenario targets too unreliable.
The current state of each method:
* `target_ec2_fleet`: unusable due to the fact that the `instant` type
does not guarantee fulfillment of either `spot` or `on-demand`
instance request types. The module does support both `on-demand` and
`spot` request types and is capable of bidding across a maximum of
four availability zones, which makes it an attractive choice if the
`instant` type would always fulfill requests. Perhaps a `request` type
with `wait_for_fulfillment` option like `aws_spot_fleet_request` would
make it more viable for future consideration.
* `target_ec2_spot_fleet`: more reliable if bidding for target instances
that have capacity in the chosen zone. Issues in the AWS provider
prevent us from bidding across multiple zones succesfully. Over the
last 2-3 months target capacity for the instance types we'd prefer to
use has dropped dramatically and the price is near-or-at on-demand.
The volatility for nearly no cost savings means we should put this
option on the shelf for now.
* `target_ec2_instances`: the most reliable method we've got. It is now
capable of automatically determing which subnets and availability
zones to provision targets in and has been updated to be usable for
both Vault and Consul targets. By default we use the cheapest medium
instance types that we've found are reliable to test vault.
* Update .gitignore
* enos/modules/create_vpc: create a subnet for every availability zone
* enos/modules/target_ec2_fleet: bid across the maximum of four
availability zones for targets
* enos/modules/target_ec2_spot_fleet: attempt to make the spot fleet bid
across more availability zones for targets
* enos/modules/target_ec2_instances: create module to use
ec2:RunInstances for scenario targets
* enos/modules/target_ec2_shim: create shim module to satisfy the
target module interface
* enos/scenarios: use target_ec2_shim for backend targets on raft
storage scenarios
* enos/modules/az_finder: remove unsed module
Signed-off-by: Ryan Cragun <me@ryan.ec>
We seem to hit occasional capacity issues when attempting to launch spot
fleets with arm64 instance types. After checking pricing in the regions
that we use, it appears that current and older generation amd64 t2 and
t3 instance types are running at quite a discount whereas t4 arm64
instances are barely under on-demand price, suggesting limited capacity
for arm64 spot instances at this time. We'll change our default backend
instance architecture to amd64 to bid for the cheaper t2 and t3
instances and increase our `max_price` globally to that of a RHEL
machine running on-demand with a t3.medium.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Begin the process of migrating away from the "strongly encouraged not to
use"[0] Ec2 spot fleet API to the more modern `ec2:CreateFleet`.
Unfortuantely the `instant` type fleet does not guarantee fulfillment
with either on-demand or spot types. We'll need to add a feature similar
to `wait_for_fulfillment` on the `spot_fleet_request` resource[1] to
`ec2_fleet` before we can rely on it.
We also update the existing target fleets to support provisioning generic
targets. This has allowed us to remove our usage of `terraform-enos-aws-consul`
and replace it with a smaller `backend_consul` module in-repo.
We also remove `terraform-enos-aws-infra` and replace it with two smaller
in-repo modules `ec2_info` and `create_vpc`. This has allowed us to simplify
the vpc resources we use for each scneario, which in turn allows us to
not rely on flaky resources.
As part of this refactor we've also made it possible to provision
targets using different distro versions.
[0] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-best-practices.html#which-spot-request-method-to-use
[1] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/spot_fleet_request#wait_for_fulfillment
* enos/consul: add `backend_consul` module that accepts target hosts.
* enos/target_ec2_spot_fleet: add support for consul networking.
* enos/target_ec2_spot_fleet: add support for customizing cluster tag
key.
* enos/scenarios: create `target_ec2_fleet` which uses a more modern
`ec2_fleet` API.
* enos/create_vpc: replace `terraform-enos-aws-infra` with smaller and
simplified version. Flatten the networking to a single route on the
default route table and a single subnet.
* enos/ec2_info: add a new module to give us useful ec2 information
including AMI id's for various arch/distro/version combinations.
* enos/ci: update service user role to allow for managing ec2 fleets.
Signed-off-by: Ryan Cragun <me@ryan.ec>
We seen instances where we try to schedule a spot fleet in the
us-east-1d of the vault CI AWS account and cannot get capacity for our
instance type. That zone currently supports far fewer instance types so
we'll bump our max bid to handle cases where slightly more expensive
instances are available. Most of the time we'll be using much cheaper
instances but it's better to pay a fraction of a cent more than have to
retry the pipeline. As such, we increase our max bid price to something
that will almost certainly be fullfilled.
We also allow our package installer to go ahead when cloud init does not
update sources like we expect. This should handle occasional failures
where cloud-init doesn't update the sources within a reasonable amount
of time.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Use the latest version of enos-provider and upstream consul module.
These changes allow us to configure the vault log level in configuration
and also support configuring consul with an enterprise license.
Signed-off-by: Ryan Cragun <me@ryan.ec>
The previous strategy for provisioning infrastructure targets was to use
the cheapest instances that could reliably perform as Vault cluster
nodes. With this change we introduce a new model for target node
infrastructure. We've replaced on-demand instances for a spot
fleet. While the spot price fluctuates based on dynamic pricing,
capacity, region, instance type, and platform, cost savings for our
most common combinations range between 20-70%.
This change only includes spot fleet targets for Vault clusters.
We'll be updating our Consul backend bidding in another PR.
* Create a new `vault_cluster` module that handles installation,
configuration, initializing, and unsealing Vault clusters.
* Create a `target_ec2_instances` module that can provision a group of
instances on-demand.
* Create a `target_ec2_spot_fleet` module that can bid on a fleet of
spot instances.
* Extend every Enos scenario to utilize the spot fleet target acquisition
strategy and the `vault_cluster` module.
* Update our Enos CI modules to handle both the `aws-nuke` permissions
and also the privileges to provision spot fleets.
* Only use us-east-1 and us-west-2 in our scenario matrices as costs are
lower than us-west-1.
Signed-off-by: Ryan Cragun <me@ryan.ec>