Commit Graph

19 Commits

Author SHA1 Message Date
Mike Palmiotto
43be9fc18a
Request Limiter (#25093)
This commit introduces two new adaptive concurrency limiters in Vault,
which should handle overloading of the server during periods of
untenable request rate. The limiter adjusts the number of allowable
in-flight requests based on latency measurements performed across the
request duration. This approach allows us to reject entire requests
prior to doing any work and prevents clients from exceeding server
capacity.

The limiters intentionally target two separate vectors that have been
proven to lead to server over-utilization.

- Back pressure from the storage backend, resulting in bufferbloat in
  the WAL system. (enterprise)
- Back pressure from CPU over-utilization via PKI issue requests
  (specifically for RSA keys), resulting in failed heartbeats.

Storage constraints can be accounted for by limiting logical requests
according to their http.Method. We only limit requests with write-based
methods, since these will result in storage Puts and exhibit the
aforementioned bufferbloat.

CPU constraints are accounted for using the same underlying library and
technique; however, they require special treatment. The maximum number
of concurrent pki/issue requests found in testing (again, specifically
for RSA keys) is far lower than the minimum tolerable write request
rate. Without separate limiting, we would artificially impose limits on
tolerable request rates for non-PKI requests. To specifically target PKI
issue requests, we add a new PathsSpecial field, called limited,
allowing backends to specify a list of paths which should get
special-case request limiting.

For the sake of code cleanliness and future extensibility, we introduce
the concept of a LimiterRegistry. The registry proposed in this PR has
two entries, corresponding with the two vectors above. Each Limiter
entry has its own corresponding maximum and minimum concurrency,
allowing them to react to latency deviation independently and handle
high volumes of requests to targeted bottlenecks (CPU and storage).

In both cases, utilization will be effectively throttled before Vault
reaches any degraded state. The resulting 503 - Service Unavailable is a
retryable HTTP response code, which can be handled to gracefully retry
and eventually succeed. Clients should handle this by retrying with
jitter and exponential backoff. This is done within Vault's API, using
the go-retryablehttp library.

Limiter testing was performed via benchmarks of mixed workloads and
across a deployment of agent pods with great success.
2024-01-26 14:26:21 -05:00
Austin Gebauer
d90c7e8ab5
systemview: adds method for plugins to generate identity tokens (#24929)
* systemview: adds method for plugins to generate identity tokens

* change test name and godoc

* adds changelog

* make proto to include comment
2024-01-18 11:01:14 -08:00
Steven Clark
3623dfc227
Add support for plugins to specify binary request paths (#23729)
* wip

* more pruning

* Integrate OCSP into binary paths PoC

 - Simplify some of the changes to the router
 - Remove the binary test PKI endpoint
 - Switch OCSP to use the new binary paths backend variable

* Fix proto generation and test compilation

* Add unit test for binary request handling

---------

Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
2023-10-23 17:04:42 -04:00
Violet Hynes
f943c37a83
VAULT-19237 Add mount_type to secret response (#23047)
* VAULT-19237 Add mount_type to secret response

* VAULT-19237 changelog

* VAULT-19237 make MountType generic

* VAULT-19237 clean up comment

* VAULT-19237 update changelog

* VAULT-19237 update test, remove mounttype from wrapped responses

* VAULT-19237 fix a lot of tests

* VAULT-19237 standby test
2023-09-20 09:28:52 -04:00
Johan Brandhorst-Satzkorn
8253e59752
Migrate protobuf generation to Buf (#22099)
* Migrate protobuf generation to Buf

Buf simplifies the generation story and allows us to lean
into other features in the Buf ecosystem, such as dependency
management, linting, breaking change detection, formatting
and remote plugins.

* Format all protobuf files with buf

Also add a CI job to ensure formatting remains consistent

* Add CI job to warn on proto generate diffs

Some files were not regenerated with the latest version
of the protobuf binary. This CI job will ensure we are always
detect if the protobuf files need regenerating.

* Add CI job for linting protobuf files
2023-07-31 18:44:56 +00:00
Hamid Ghaf
e55c18ed12
adding copyright header (#19555)
* adding copyright header

* fix fmt and a test
2023-03-15 09:00:52 -07:00
Christopher Swenson
80485f927b
Add events sending routed from plugins (#18834)
This isn't perfect for sure, but it's solidifying and becoming a useful
base to work off.

This routes events sent from auth and secrets plugins to the main
`EventBus` in the Vault Core. Events sent from plugins are automatically
tagged with the namespace and plugin information associated with them.
2023-02-03 13:24:16 -08:00
Hamid Ghaf
46b9921aae
Allow Token Create Requests To Be Replicated (#18689)
* Allow Token Create Requests To Be Replicated

* adding a test

* revert a test
2023-01-24 14:00:27 -05:00
Alexander Scheel
c042e4dae3
Add path based primary write forwarding (PBPWF) - OSS (#18735)
* Add WriteForwardedStorage to sdk's plugin, logical in OSS

This should allow backends to specify paths to forward write
(storage.Put(...) and storage.Delete(...)) operations for.

Notably, these semantics are subject to change and shouldn't yet be
relied on.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Collect paths for write forwarding in OSS

This adds a path manager to Core, allowing tracking across all Vault
versions of paths which could use write forwarding if available. In
particular, even on OSS offerings, we'll need to template {{clusterId}}
into the paths, in the event of later upgrading to Enterprise. If we
didn't, we'd end up writing paths which will no longer be accessible
post-migration, due to write forwarding now replacing the sentinel with
the actual cluster identifier.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add forwarded writer implementation to OSS

Here, for paths given to us, we determine if we need to do cluster
translation and perform local writing. This is the OSS variant.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Wire up mount-specific request forwarding in OSS

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Clarify that state lock needs to be held to call HAState in OSS

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Move cluster sentinel constant to sdk/logical

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Expose ClusterID to Plugins via SystemView

This will let plugins learn what the Cluster's ID is, without having to
resort to hacks like writing a random string to its cluster-prefixed
namespace and then reading it once it has replicated.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add GRPC ClusterID implementation

For any external plugins which wish to use it.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
2023-01-20 16:36:18 -05:00
Rémi Lapeyre
385b8e8387
Add remote_port in the audit logs when it is available (#12790)
* Add remote_port in the audit logs when it is available

The `request.remote_port` field is now present in the audit log when it
is available:

```
{
  "time": "2021-10-10T13:53:51.760039Z",
  "type": "response",
  "auth": {
    "client_token": "hmac-sha256:1304aab0ac65747684e1b58248cc16715fa8f558f8d27e90fcbcb213220c0edf",
    "accessor": "hmac-sha256:f8cf0601dadd19aac84f205ded44c62898e3746a42108a51105a92ccc39baa43",
    "display_name": "root",
    "policies": [
      "root"
    ],
    "token_policies": [
      "root"
    ],
    "token_type": "service",
    "token_issue_time": "2021-10-10T15:53:44+02:00"
  },
  "request": {
    "id": "829c04a1-0352-2d9d-9bc9-00b928d33df5",
    "operation": "update",
    "mount_type": "system",
    "client_token": "hmac-sha256:1304aab0ac65747684e1b58248cc16715fa8f558f8d27e90fcbcb213220c0edf",
    "client_token_accessor": "hmac-sha256:f8cf0601dadd19aac84f205ded44c62898e3746a42108a51105a92ccc39baa43",
    "namespace": {
      "id": "root"
    },
    "path": "sys/audit/file",
    "data": {
      "description": "hmac-sha256:321a1d105f8c6fd62be4f34c4da4f0e6d1cdee9eb2ff4af0b59e1410950fe86b",
      "local": false,
      "options": {
        "file_path": "hmac-sha256:2421b5bf8dab1f9775b2e6e66e58d7bca99ab729f3f311782fda50717eee55b3"
      },
      "type": "hmac-sha256:30dff9607b4087e3ae6808b4a3aa395b1fc064e467748c55c25ddf0e9b150fcc"
    },
    "remote_address": "127.0.0.1",
    "remote_port": 54798
  },
  "response": {
    "mount_type": "system"
  }
}
```

Closes https://github.com/hashicorp/vault/issues/7716

* Add changelog entry

* Empty commit to trigger CI

* Add test and explicit error handling

* Change temporary file pattern in test
2022-01-26 15:47:15 -08:00
Austin Gebauer
74b74568dd
Adds ability to define an inline policy and internal metadata on tokens (#12682)
* Adds ability to define an inline policy and internal metadata to tokens

* Update comment on fetchEntityAndDerivedPolicies

* Simplify handling of inline policy

* Update comment on InternalMeta

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Improve argument name

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Use explicit SkipIdentityInheritance token field instead of implicit InlinePolicy behavior

* Add SkipIdentityInheritance to pb struct in token store create method

* Rename SkipIdentityInheritance to NoIdentityPolicies

* Merge latest from main and make proto

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2021-10-07 10:36:22 -07:00
Tim Peoples
7bd2be52c8
Update plugin proto to send tls.ConnectionState (Op.2) (#12581) 2021-10-07 08:06:09 -04:00
swayne275
f8f289712a
Introduce Logical Unrecoverable Error, Use it in Expiration Manager (#11477)
* build out zombie lease system

* add typo for CI

* undo test CI commit

* time equality test isn't working on CI, so let's see what this does...

* add unrecoverable proto error, make proto, go mod vendor

* zombify leases if unrecoverable error, tests

* test fix: somehow pointer in pointer rx is null after pointer rx called

* tweaks based on roy feedback

* improve zombie errors

* update which errors are unrecoverable

* combine zombie logic

* keep subset of zombie lease in memory
2021-05-03 17:56:06 -06:00
Michael Golowka
ad90e0b39d
Add user configurable password policies available to secret engines (#8637)
* Add random string generator with rules engine

This adds a random string generation library that validates random
strings against a set of rules. The library is designed for use as generating
passwords, but can be used to generate any random strings.
2020-05-27 12:28:00 -06:00
Brian Kassouf
5fc424d3f5
Add identity templating helper to sdk/framework (#8088)
* Add identity templating helper to sdk/framework

* Cleanup a bit

* Fix length issue when groups/aliases are filtered due to ns

* review feedback
2020-01-06 10:16:52 -08:00
Jeff Mitchell
5559d40cf9
Move SudoPrivilege out of SystemView (#7266)
* Move SudoPrivilege out of SystemView

We only use this in token store and it literally doesn't work anything
that isn't the token store or system mount, so we should stop exposing
something that doesn't work.

* Reconcile extended system view with sdk/logical a bit and put an explanation for why SudoPrivilege isn't moved over
2019-08-26 10:23:46 -04:00
Mike Jarmy
c48159ea3a AWS upgrade role entries (#7025)
* upgrade aws roles

* test upgrade aws roles

* Initialize aws credential backend at mount time

* add a TODO

* create end-to-end test for builtin/credential/aws

* fix bug in initializer

* improve comments

* add Initialize() to logical.Backend

* use Initialize() in Core.enableCredentialInternal()

* use InitializeRequest to call Initialize()

* improve unit testing for framework.Backend

* call logical.Backend.Initialize() from all of the places that it needs to be called.

* implement backend.proto changes for logical.Backend.Initialize()

* persist current role storage version when upgrading aws roles

* format comments correctly

* improve comments

* use postUnseal funcs to initialize backends

* simplify test suite

* improve test suite

* simplify logic in aws role upgrade

* simplify aws credential initialization logic

* simplify logic in aws role upgrade

* use the core's activeContext for initialization

* refactor builtin/plugin/Backend

* use a goroutine to upgrade the aws roles

* misc improvements and cleanup

* do not run AWS role upgrade on DR Secondary

* always call logical.Backend.Initialize() when loading a plugin.

* improve comments

* on standbys and DR secondaries we do not want to run any kind of upgrade logic

* fix awsVersion struct

* clarify aws version upgrade

* make the upgrade logic for aws auth more explicit

* aws upgrade is now called from a switch

* fix fallthrough bug

* simplify logic

* simplify logic

* rename things

* introduce currentAwsVersion const to track aws version

* improve comments

* rearrange things once more

* conglomerate things into one function

* stub out aws auth initialize e2e test

* improve aws auth initialize e2e test

* finish aws auth initialize e2e test

* tinker with aws auth initialize e2e test

* tinker with aws auth initialize e2e test

* tinker with aws auth initialize e2e test

* fix typo in test suite

* simplify logic a tad

* rearrange assignment

* Fix a few lifecycle related issues in #7025 (#7075)

* Fix panic when plugin fails to load
2019-07-05 16:55:40 -07:00
Jeff Mitchell
263b96ef4e
Tokenhelper v2 (#6662)
This provides an sdk util for common token fields and parsing and plumbs it into token store roles.
2019-06-14 10:17:04 -04:00
Jeff Mitchell
170521481d
Create sdk/ and api/ submodules (#6583) 2019-04-12 17:54:35 -04:00