vault/tools/pipeline/internal/cmd/root.go
Ryan Cragun cda9ad3491
VAULT-33074: add github sub-command to pipeline (#29403)
* VAULT-33074: add `github` sub-command to `pipeline`

Investigating test workflow failures is a common task that engineers on the
sustaining rotation perform. It often requires quite a bit of manual labor:
inspecting every failed or cancelled workflow in the GitHub UI on a per
repo/branch/workflow basis and performing root cause analysis.

As we work to improve our pipeline discoverability, this PR adds a new `github`
sub-command to the `pipeline` utility that allows querying for such workflows
and returning either machine readable or human readable summaries in a single
place. Eventually we plan to automate sending a summary of this data to
an OTEL collector, but for now sustaining engineers can use it to query for
workflows using a variety of criteria.

A common pattern for investigating build/enos test failure workflows would be:
```shell
export GITHUB_TOKEN="YOUR_TOKEN"
go run -race ./tools/pipeline/... github list-workflow-runs -o hashicorp -r vault -d '2025-01-13..2025-01-23' --branch main --status failure build
```

This will list `build` workflow runs in the `hashicorp/vault` repository for the
`main` branch with a `status` or `conclusion` of `failure` within the date
range `2025-01-13..2025-01-23`.

A sustaining engineer will likely do this for both the `vault` and
`vault-enterprise` repositories, and for the `enos-release-testing-oss` and
`enos-release-testing-ent` workflows in addition to `build`, in order to
get a full picture of the last week's failures.
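
As a rough sketch (assuming your token has access to both repositories, and
reusing the flags from the example above verbatim), a small shell loop can
cover all of those repo/workflow combinations in one pass:

```shell
export GITHUB_TOKEN="YOUR_TOKEN"

# Query each repo/workflow combination for failed runs on main in the same date range.
for repo in vault vault-enterprise; do
  for workflow in build enos-release-testing-oss enos-release-testing-ent; do
    go run -race ./tools/pipeline/... github list-workflow-runs \
      -o hashicorp -r "$repo" -d '2025-01-13..2025-01-23' \
      --branch main --status failure "$workflow"
  done
done
```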

You can also use this utility to summarize workflows based on other
statuses, branches, HEAD SHAs, event triggers, GitHub actors, etc. For
a full list of filter arguments, pass `-h` to the sub-command.
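
For example, to see every supported filter flag for `list-workflow-runs`:

```shell
go run -race ./tools/pipeline/... github list-workflow-runs -h
```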

> [!CAUTION]
> Be careful not to run this without setting strict filter arguments.
> Failing to do so could result in attempting to summarize far too many
> workflows, which can get your API token disabled for an hour.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2025-01-31 13:48:38 -07:00


// Copyright (c) HashiCorp, Inc.
// SPDX-License-Identifier: BUSL-1.1

package cmd

import (
	"fmt"
	"log/slog"
	"os"

	"github.com/spf13/cobra"
	slogctx "github.com/veqryn/slog-context"
)

// rootCmdCfg contains configuration that is shared by every pipeline sub-command.
type rootCmdCfg struct {
	logLevel string
}

var rootCfg = &rootCmdCfg{}

// newRootCmd creates the root pipeline command, registers its sub-commands,
// and wires up the global flags.
func newRootCmd() *cobra.Command {
	rootCmd := &cobra.Command{
		Use:   "pipeline",
		Short: "Execute pipeline tasks",
		Long:  "Pipeline automation tasks",
	}

	rootCmd.PersistentFlags().StringVar(&rootCfg.logLevel, "log", "warn", "Set the log level. One of 'debug', 'info', 'warn', 'error'")

	rootCmd.AddCommand(newGenerateCmd())
	rootCmd.AddCommand(newGithubCmd())
	rootCmd.AddCommand(newReleasesCmd())

	// Configure the default slog logger from the --log flag before any sub-command runs.
	rootCmd.PersistentPreRunE = func(cmd *cobra.Command, args []string) error {
		var ll slog.Level
		switch rootCfg.logLevel {
		case "debug":
			ll = slog.LevelDebug
		case "info":
			ll = slog.LevelInfo
		case "warn":
			ll = slog.LevelWarn
		case "error":
			ll = slog.LevelError
		default:
			return fmt.Errorf("unsupported log level: %s", rootCfg.logLevel)
		}

		h := slogctx.NewHandler(slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{Level: ll}), nil)
		slog.SetDefault(slog.New(h))

		return nil
	}

	return rootCmd
}

// Execute executes the root pipeline command.
func Execute() {
	rootCmd := newRootCmd()
	rootCmd.SilenceErrors = true // We handle this below

	if err := rootCmd.Execute(); err != nil {
		slog.Default().Error(err.Error())
		os.Exit(1)
	}
}