2 Commits

Author SHA1 Message Date
Ryan Cragun
10c4371135
VAULT-34834: pipeline: add better heuristics for changed files (#30284)
* VAULT-34834: pipeline: add better heuristics for changed files

To fully support automated Enterprise to Community backports we need to
have better changed file detection for community and enterprise only
files. Armed with this metadata, future changes will be able to inspect
changed files and automatically remove enterprise only files when
creating the CE backports.

For this change we now have the following changed file groups:
  - autopilot
  - changelog
  - community
  - docs
  - enos
  - enterprise
  - app
  - gotoolchain
  - pipeline
  - proto
  - tools
  - ui
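
As a rough illustration of how path-based grouping of this kind can work, the Go sketch below assigns group names to a changed file and drops files that belong to a given group. The `rules` table, the `groupsFor` and `dropGroup` helpers, and every path pattern are assumptions made for illustration; they are not the checkers added in this change and cover only a subset of the groups listed above.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// rule pairs a group name with a predicate over a repository-relative path.
// Both the type and the patterns below are hypothetical.
type rule struct {
	group string
	match func(name string) bool
}

// rules is an illustrative subset; the real detection rules may differ.
var rules = []rule{
	{"changelog", func(n string) bool { return strings.HasPrefix(n, "changelog/") }},
	{"docs", func(n string) bool { return strings.HasPrefix(n, "website/") }},
	{"enos", func(n string) bool { return strings.HasPrefix(n, "enos/") }},
	{"proto", func(n string) bool { return path.Ext(n) == ".proto" }},
	{"ui", func(n string) bool { return strings.HasPrefix(n, "ui/") }},
	{"enterprise", func(n string) bool { return strings.HasSuffix(n, "_ent.go") }},
}

// groupsFor returns every group whose rule matches the file name.
// A file may belong to several groups, or to none.
func groupsFor(name string) []string {
	var groups []string
	for _, r := range rules {
		if r.match(name) {
			groups = append(groups, r.group)
		}
	}
	return groups
}

// dropGroup returns the files that are not members of the given group,
// for example to strip enterprise-only files from a list.
func dropGroup(files []string, group string) []string {
	var kept []string
	for _, f := range files {
		member := false
		for _, g := range groupsFor(f) {
			if g == group {
				member = true
				break
			}
		}
		if !member {
			kept = append(kept, f)
		}
	}
	return kept
}

func main() {
	files := []string{"vault/core.go", "vault/core_ent.go", "ui/app/app.js"}
	for _, f := range files {
		fmt.Println(f, groupsFor(f))
	}
	fmt.Println(dropGroup(files, "enterprise"))
}
```

Keeping the rules in a table means a new group is one more entry rather than a new code path, which is the property the best-effort categorization relies on.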

Not included in the change, but something I did while updating our
checkers, was to generate a list of files that are included only in
vault-enterprise and run every path through the enterprise detection
rules to ensure that they are categorized appropriately after the
changes in VAULT-35431. While it's possible that they'll drift, our changed
file categorization is best effort anyway and changes will always
happen in vault-enterprise and require a developer to approve the
changes.

We've also included a few new files into the various groups and updated
the various workflows to use the new categories. I've also included a
small change to the pipeline composite action whereby we do not handle
Go module caching. This will greatly reduce work on doc-only branches
that need only ensure that the pipeline binary is compiled.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-04-18 10:54:41 -06:00
Ryan Cragun
c37b3c46b4
VAULT-34822: Add pipeline github list changed-files (#30100)
* VAULT-34822: Add `pipeline github list changed-files`

Add a new `github list changed-files` sub-command to the `pipeline` command and
integrate it into the pipeline. This replaces our previous
`changed-files.sh` script.

This command works quite a bit differently than the full checkout and
diff based solution we used before. Instead of checking out the base ref
and head ref and comparing a diff, we now provide either a pull request
number or git commit SHA and use the GitHub REST API to determine the
changed files.
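
For readers unfamiliar with the API-based approach, here is a minimal sketch of listing a pull request's changed files through GitHub's "list pull request files" REST endpoint. It is not the `pipeline` implementation: the helper names (`parseChangedFiles`, `pullFilesURL`, `listPullFiles`) are hypothetical, the `hashicorp/vault` repository in `main` is an assumption, and pagination beyond the first 100 files is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// changedFile holds the fields we read from each entry of the response.
type changedFile struct {
	Filename string `json:"filename"`
	Status   string `json:"status"`
}

// parseChangedFiles decodes the JSON array returned by the endpoint and
// returns the file names in response order.
func parseChangedFiles(body []byte) ([]string, error) {
	var files []changedFile
	if err := json.Unmarshal(body, &files); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(files))
	for _, f := range files {
		names = append(names, f.Filename)
	}
	return names, nil
}

// pullFilesURL builds the endpoint URL for one page of up to 100 files.
func pullFilesURL(owner, repo string, number int) string {
	return fmt.Sprintf("https://api.github.com/repos/%s/%s/pulls/%d/files?per_page=100",
		owner, repo, number)
}

// listPullFiles fetches a single page; a complete tool would follow pagination.
func listPullFiles(client *http.Client, token, url string) ([]string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/vnd.github+json")
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseChangedFiles(body)
}

func main() {
	token := os.Getenv("GITHUB_TOKEN")
	if token == "" {
		fmt.Println("GITHUB_TOKEN is not set; skipping request")
		return
	}
	// Repository owner and name are assumed for illustration.
	names, err := listPullFiles(http.DefaultClient, token, pullFilesURL("hashicorp", "vault", 30100))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

Separating the parsing from the HTTP call keeps the decoding logic testable without network access or a token.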

This approach has several benefits:
  - Not requiring a local checkout of the repo to get the list of
    changed files. This yields a significant performance improvement in
    `setup` jobs where we typically determine the changed files list.
  - The CLI supports both PRs and commit SHAs.
  - The implementation is portable and doesn't require any system tools
    like `git` or `bash` to be installed.
  - A much more advanced system for adding group metadata to the changed
    files. These groupings are going to be used heavily in future
    pipeline automation work and will be used to make required jobs
    smarter.

The theoretical drawbacks:
  - It requires a `GITHUB_TOKEN` and only works for remote branches or
    commits in GitHub. We could eventually add a local diff sub-command
    or option to work locally, but that was not required for what we're
    trying to achieve here.

While the groupings that I added in this change are quite rudimentary,
the system will allow us to add additional groups with very little
overhead. I tried to make this change more or less a port of the old
system to enable future work. I did include one small change of
behavior, which is that we now build all extended targets if the
`go.mod` or `go.sum` files change. We do this to ensure that dependency
changes don't subtly result in some extended platform breakage.
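
The `go.mod`/`go.sum` behavior described above can be sketched as a simple check over the changed file list. The `buildExtendedTargets` helper is hypothetical, and matching on the base name (so that module files in subdirectories also count) is an assumption rather than something this change states.

```go
package main

import (
	"fmt"
	"path"
)

// buildExtendedTargets reports whether any changed file is a Go module
// file. Matching on the base name is an assumption: it also treats module
// files in subdirectories as dependency changes.
func buildExtendedTargets(changed []string) bool {
	for _, name := range changed {
		switch path.Base(name) {
		case "go.mod", "go.sum":
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(buildExtendedTargets([]string{"README.md", "go.sum"}))
	fmt.Println(buildExtendedTargets([]string{"README.md"}))
}
```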

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-03-28 15:18:52 -06:00