* [QT-637] Streamline our build pipeline (#24892)
Context
-------
Building and testing Vault artifacts on pull requests and merges is
responsible for about 1/3rd of our overall spend on Vault CI. Of the
artifacts that we ship as part of a release, we do Enos testing scenarios
on the `linux/amd64` and `linux/arm64` binaries and their derivative
artifacts. The extended build artifacts for non-Linux platforms or less
common machine architectures are not tested at this time. They are built,
notarized, and signed as part of every pull request update and merge. As
we don't actually test these artifacts, the only gain we get from this
rather expensive behavior is that we won't merge a change that would prevent
Vault from building on one of the extended targets. Extended platform or
architecture changes are quite rare, so performing this work as frequently
as we do is costly in both money and developer time for little relative
safety benefit.
Goals
-----
Rethink and implement how and when we build binaries and artifacts of Vault
so that we can spend less money on repetitive work while also reducing
the time it takes for the build and test pipelines to complete.
Solution
--------
Instead of building all release artifacts on every push, we'll opt to build
only our testable (core) artifacts. With this change we are introducing a
bit of risk. We could merge a change that breaks an extended platform and
only find out after the fact when we trigger a complete build for a release.
We'll hedge against that risk by building all of the release targets on a
scheduled cadence to ensure that they are still buildable.
We'll make building all of the targets optional on any pull request by
use of a `build/all` label on the pull request.
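The selection rule described above can be sketched in Go. This is only an illustration of the decision, not the workflow's actual implementation: the function names and the extended target list are hypothetical, and the only facts taken from this message are the core `linux/amd64` and `linux/arm64` targets, the scheduled full build, and the `build/all` label.

```go
package main

import "fmt"

// coreTargets are the artifacts that are tested on every push.
var coreTargets = []string{"linux/amd64", "linux/arm64"}

// extendedTargets is an illustrative placeholder, not the real release matrix.
var extendedTargets = []string{"darwin/arm64", "windows/amd64"}

// buildAll reports whether every release target should be built: always on a
// scheduled run, and on a pull request only when it carries the build/all label.
func buildAll(event string, labels []string) bool {
	if event == "schedule" {
		return true
	}
	if event == "pull_request" {
		for _, l := range labels {
			if l == "build/all" {
				return true
			}
		}
	}
	return false
}

// targets returns the platforms to build for the given event and labels.
func targets(event string, labels []string) []string {
	out := append([]string{}, coreTargets...)
	if buildAll(event, labels) {
		out = append(out, extendedTargets...)
	}
	return out
}

func main() {
	fmt.Println(targets("push", nil))
	fmt.Println(targets("pull_request", []string{"build/all"}))
	fmt.Println(targets("schedule", nil))
}
```

A plain push builds only the core targets, while a labeled pull request or a scheduled run builds the full list.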
Further considerations
----------------------
* We want to reduce the total number of workflows and runners for all of our
pipelines if possible. As each workflow runner has infrastructure cost and
runner time penalties, using a single runner over many is often preferred.
* Many of our job runners have been optimized for cost and performance. We
should simplify the choices of which runners to use.
* CRT requires us to use the same build workflow in both CE and Ent.
Historically that meant that modifying `build.yml` in CE would result in a
merge conflict with `build.yml` in Ent, and break our merge workflows.
* Workflow flow control in both `build.yml` and `ci.yml` can be quite
complicated, as each needs to maintain compatibility whether executed as CE
or Ent, and when triggered with various GitHub events like pull_request,
push, and workflow_call, each with their own requirements.
* Many jobs utilize similar patterns of flow control and metadata but are not
reusable.
* Workflow call depth has a maximum of four, so we need to be quite
considerate when calling other workflows.
* Called workflows can only have 10 inputs.
Implementation
--------------
* Refactor the `build.yml` workflow to be agnostic to whether or not it is
executing in CE or Ent. That makes future updates to the build much easier
as we won't have to worry about merge conflicts when the change is merged
downstream.
* Extract common steps in workflows into composite actions that we can reuse.
* Fix bugs where some but not all workflows would use different Git
references when building and testing a pull request.
* Rewrite the application, docs, and UI change helpers as a composite
action. This allows us to re-use this logic to make consistent behavior
choices across build and CI.
* We combine several `build.yml` and `ci.yml` jobs into our final job.
This reduces the number of workflows required for the same behavior while
saving time overall.
* Update most of our action pins.
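As a rough illustration of what the application, docs, and UI change helpers decide, the Go sketch below classifies a list of changed files. It is not the composite action itself: the `classify` function, the `changes` type, and the `website/` and `ui/` path prefixes are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// changes records which areas of the repository a change set touches.
type changes struct {
	App, Docs, UI bool
}

// classify buckets changed file paths by prefix. The prefixes are assumed for
// illustration; anything that is not docs or UI counts as an application change.
func classify(files []string) changes {
	var c changes
	for _, f := range files {
		switch {
		case strings.HasPrefix(f, "website/"):
			c.Docs = true
		case strings.HasPrefix(f, "ui/"):
			c.UI = true
		default:
			c.App = true
		}
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", classify([]string{"website/content/docs/index.mdx"}))
	fmt.Printf("%+v\n", classify([]string{"ui/app/app.js", "vault/core.go"}))
}
```

Sharing one classifier like this is what lets build and CI make the same skip or run decision for a docs-only change.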
Results
-------
| Metric | Before | After | Diff |
|-------------------|----------|---------|-------|
| Duration: | ~14-18m | ~15-18m | ~ = |
| Workflows: | 43 | 18 | - 58% |
| Billable time: | ~1h15m | 16m | - 79% |
| Saved artifacts: | 34 | 12 | - 65% |
Infra costs should map closely to billable time.
Network I/O costs should map closely to the workflow count.
Storage costs should map directly with saved artifacts.
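The Diff column above is a plain percentage reduction relative to the Before value. A small Go check of that arithmetic, with `percentReduction` being a helper name invented for this example and billable time converted to minutes (~1h15m taken as 75):

```go
package main

import (
	"fmt"
	"math"
)

// percentReduction returns how much smaller after is than before, as a
// percentage of before, rounded to the nearest whole number.
func percentReduction(before, after float64) int {
	return int(math.Round((before - after) / before * 100))
}

func main() {
	fmt.Println(percentReduction(43, 18)) // workflows → 58
	fmt.Println(percentReduction(75, 16)) // billable minutes → 79
	fmt.Println(percentReduction(34, 12)) // saved artifacts → 65
}
```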
We could probably get parity with duration by getting more clever with
our UBI container build, as that's where we're seeing the increase. I'm
not yet concerned as it takes roughly the same time for this job to
complete as it did before.
While the CI workflow was not the focus of the PR, some shared
refactoring does show some marginal improvements there.
| Metric | Before | After | Diff |
|-------------------|----------|----------|--------|
| Duration: | ~24m | ~20.4m | - 15% |
| Workflows: | 55 | 47 | - 15% |
| Billable time: | ~4h20m | ~3h36m | - 17% |
Further focus on streamlining the CI workflows would likely result in a
few more marginal improvements, but nothing on the order of what we've seen
with the build workflow.
[0] https://github.com/hashicorp/vault-enterprise/actions/runs/7875954928/job/21490054433?pr=5411#step:3:39
Signed-off-by: Ryan Cragun <me@ryan.ec>
* Fix licensing on various files
* Update CI and release files to BUSL-1.1
* Update offset within config_test_helpers.go
- Fix a test the same way it's been fixed on main/1.15
* combine into one checker
* combine and simplify ci checks
* add to test package list
* remove testing test
* only run deprecations check
* only run deprecations check
* remove unneeded repo check
* fix bash options
Co-authored-by: miagilepner <mia.epner@hashicorp.com>
* deprecation check
* adding script
* add execute permission to script
* revert changes
* adding the script back
* added working script for local and GHA
* give execute permissions
* updating revgrep
* adding changes to script, tools
* run go mod tidy
* removing default ref
* make bootstrap
* adding to makefile
* Migrate subset of CircleCI ci workflow to GitHub Actions
Runs test-go and test-go-remote-docker with a static splitting of test packages
* [skip actions] add comment to explain the purpose of test-generate-test-package-lists.sh and what to do if it fails
* change trigger to push
---------
Co-authored-by: Kuba Wieczorek <kuba.wieczorek@hashicorp.com>
* example for checking go doc tests
* add analyzer test and action
* get metadata step
* install revgrep
* fix for ci
* add revgrep to go.mod
* clarify how analysistest works
Remove gox in favor of go build.
`gox` hasn't had a release in many years, so it is missing
support for many modern systems, like `darwin/arm64`.
In any case, we only use it for dev builds, where we don't even use
its ability to build for multiple platforms. Release builds use
`go build` now.
So, this switches to `go build` everywhere.
I pulled this down and tested it in Windows as well. (Side note: I
couldn't get `gox` to work in Windows, so couldn't build before this
change.)
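For reference, plain `go build` can still target another platform by setting the standard `GOOS` and `GOARCH` environment variables, which is why `gox` is not needed for this. The Go sketch below only prepares such a command and does not run it. The `crossBuildCmd` helper, the output path, and the package path are hypothetical and are not taken from the Vault build scripts.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// crossBuildCmd prepares, but does not run, a `go build` invocation for the
// given platform. The out and pkg arguments are placeholders for illustration.
func crossBuildCmd(goos, goarch, out, pkg string) *exec.Cmd {
	cmd := exec.Command("go", "build", "-o", out, pkg)
	// Inherit the current environment and override the target platform.
	cmd.Env = append(os.Environ(), "GOOS="+goos, "GOARCH="+goarch)
	return cmd
}

func main() {
	cmd := crossBuildCmd("darwin", "arm64", "bin/vault", ".")
	fmt.Println(cmd.Args)
}
```

Calling `cmd.Run()` on the result would perform the build. A dev build for the host platform simply leaves both variables unset.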
* copy over the webui
move web_ui to http
remove web ui files, add .gitkeep
updates, messing with gitkeep and ignoring web_ui
update ui scripts
gitkeep
ignore http/web_ui
Remove debugging
remove the jwt reference, that was from something else
restore old jwt plugin
move things around
Revert "move things around"
This reverts commit 2a35121850f5b6b82064ecf78ebee5246601c04f.
Update ui path handling to not need the web_ui name part
add desc
move the http.FS conversion internal to assetFS
update gitignore
remove bindata dep
clean up some comments
remove asset check script that's no longer needed
Update readme
remove more bindata things
restore asset check
update packagespec
update stub
stub the assetFS method and set uiBuiltIn to false for non-ui builds
update packagespec to build ui
* fail if assets aren't found
* tidy up vendor
* go mod tidy
* updating .circleci
* restore tools.go
* re-re-re-run make packages
* re-enable arm64
* Adding change log
* Removing a file
Co-authored-by: hamid ghaf <hamid@hashicorp.com>