feat(workflows): add label-driven bug-test workflow (#3239) #3257
Merged
Add the third stage (assess → fix → test) of the semi-automated, human-gated bug pipeline. The `bug-test` agentic workflow triggers when a maintainer applies the `bug-test` label, runs the relevant tests in isolation against the fix, compiles a readable pass/fail report, and posts it back as a single issue comment.

- Locates the fix under test: linked PR → named fix branch → current checkout fallback, only ever from origin.
- Stack-agnostic test detection (uv+pytest, npm/pnpm/yarn, go, make) so it is decoupled from Spec Kit specifics and reusable by other projects.
- Runs tests under a timeout as untrusted code; scoped read-only permissions; same URL-safety / untrusted-input guardrails as bug-assess.
- Verification mode compares a generated fix against the historical fix for old/closed bugs to surface discrepancies.
- Optional single result label (tests-passing / tests-failing / tests-inconclusive).

Compiled bug-test.lock.yml with `gh aw compile`.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
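As an illustration of the stack-agnostic detection described in this commit, here is a minimal Python sketch. It is not taken from `bug-test.md`: the marker files, their priority order, and the exact test commands are all assumptions chosen for the example; the commit only names the stacks (uv+pytest, npm/pnpm/yarn, go, make).

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker-file -> command table, checked in order (first match wins).
# Both the marker files and the commands are assumptions for illustration.
STACKS: list[tuple[str, list[str]]] = [
    ("uv.lock", ["uv", "run", "pytest"]),
    ("pyproject.toml", ["uv", "run", "pytest"]),
    ("pnpm-lock.yaml", ["pnpm", "test"]),
    ("yarn.lock", ["yarn", "test"]),
    ("package.json", ["npm", "test"]),
    ("go.mod", ["go", "test", "./..."]),
    ("Makefile", ["make", "test"]),
]


def detect_test_command(repo_root: Path) -> Optional[list[str]]:
    """Return the test command for the first marker file found, else None."""
    for marker, command in STACKS:
        if (repo_root / marker).is_file():
            return command
    # No known stack detected; a caller could report this as inconclusive.
    return None
```

Keeping the table as data rather than branching logic means another project could reuse the same function by editing only `STACKS`.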
Contributor
Pull request overview
Adds a new label-driven gh-aw agentic workflow stage (bug-test) to run relevant tests against a proposed bug fix and post a single compiled test report back to the originating issue, completing the assess → fix → test pipeline.
Changes:
- Introduces a hand-authored `.github/workflows/bug-test.md` workflow prompt/source for the test stage.
- Adds the compiled `.github/workflows/bug-test.lock.yml` generated by `gh aw compile` for execution in GitHub Actions.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/bug-test.md | Defines the label-triggered “bug-test” agent behavior (locate fix artifact, detect test stack, run with timeout, compile report, post comment/label). |
| .github/workflows/bug-test.lock.yml | Compiled, pinned GitHub Actions workflow generated from bug-test.md for actual execution. |
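The "run with timeout" step in the table above could look roughly like the following Python sketch. It is an assumption-laden illustration, not the workflow's actual logic: the function name, the default timeout, the log file name, and the mapping of outcomes to verdicts are hypothetical. It also does not sandbox the command; it only bounds its runtime and keeps the log out of the working tree.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def run_tests(command: list[str], cwd: Path, timeout_s: int = 600) -> str:
    """Run `command` under a timeout and return a verdict string."""
    # Write the log outside the working tree; fall back to the system temp
    # dir when RUNNER_TEMP is unset (for example, outside GitHub Actions).
    log_dir = Path(os.environ.get("RUNNER_TEMP", tempfile.gettempdir()))
    log_path = log_dir / "bug-test.log"  # hypothetical file name
    try:
        with log_path.open("w") as log:
            result = subprocess.run(
                command,
                cwd=cwd,
                stdout=log,
                stderr=subprocess.STDOUT,
                timeout=timeout_s,
                check=False,
            )
    except (subprocess.TimeoutExpired, OSError):
        # Timed out, or the runner could not be started or logged at all.
        return "inconclusive"
    return "passing" if result.returncode == 0 else "failing"
```

Treating a timeout or a missing test runner as `inconclusive` rather than `failing` is a design choice made for this sketch, so that infrastructure problems are not reported as test failures.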
Review details
- Files reviewed: 1/2 changed files
- Comments generated: 2
- Review effort level: Low
… workflow

Align with repo standards (e.g. dependabot PR #3064, other workflows). Manually pinned in the compiled lock file for consistency.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
mnriem
approved these changes
Jul 1, 2026
Collaborator
Thank you!
Summary
Adds the third stage (assess → fix → test) of the semi-automated, human-gated bug pipeline, closing #3239. A new gh-aw agentic workflow, `bug-test`, triggers when a maintainer applies the `bug-test` label, runs the relevant tests in isolation against the fix, compiles a readable pass/fail report, and posts it back as a single comment on the originating issue.

Modeled on the existing `bug-assess` workflow for safety and trigger parity, and decoupled from Spec Kit specifics so other projects can reuse it.

What's included
- `.github/workflows/bug-test.md`: hand-authored agentic workflow source.
- `.github/workflows/bug-test.lock.yml`: compiled with `gh aw compile` v0.79.8 (do not hand-edit).

Behavior
- `issues: labeled` gated to `bug-test`; bot-skip parity with `bug-assess`.
- … `origin` (untrusted references are recorded, never fetched/executed).
- … uv+pytest (default for this repo), npm/pnpm/yarn, go, make; no hardcoded ecosystem.
- … `$RUNNER_TEMP`, never written to the working tree.
- … `test-report.md` with a one-line verdict, counts table, failures, and caveats.
- … `tests-passing` / `tests-failing` / `tests-inconclusive`).
- … `contents`, `issues`, `pull-requests`), identical URL-safety / untrusted-input guardrails, maintainer remains the gatekeeper.
- … `actions/checkout@v7.0.0` to align with other workflows in the repo.

Acceptance criteria
- `bug-test` markdown workflow added under `.github/workflows/` and compiled to its `.lock.yml`.

Notes
- The `bug-fix` stage (Implement label-driven bug fix workflow #3238) is not yet merged; `bug-test` consumes its output (PR/branch) but degrades gracefully when no fix artifact is found, reporting `inconclusive`.
- … `tests-passing` / `tests-failing` / `tests-inconclusive`) are applied only if they exist in the repo; a missing label is a soft no-op and does not block the comment.
- … `gh aw` v0.79.8.
- `actions/checkout` was manually pinned to v7.0.0 to align with repo standards (similar to dependabot PR chore(deps): bump actions/checkout from 6.0.3 to 7.0.0 #3064).

🤖 This PR was authored autonomously by GitHub Copilot (model: Claude Opus 4.8) on behalf of @BenBtg. Each commit carries an `Assisted-by:` trailer.
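The soft no-op for result labels described in the notes can be pictured with a small Python sketch. The function and mapping names are hypothetical, and treating an unrecognized verdict as inconclusive is an assumption of this example, not something the PR states.

```python
from typing import Optional

# Verdict -> label names as listed in the PR description.
RESULT_LABELS = {
    "passing": "tests-passing",
    "failing": "tests-failing",
    "inconclusive": "tests-inconclusive",
}


def pick_result_label(verdict: str, repo_labels: set[str]) -> Optional[str]:
    """Return the label to apply, or None when it does not exist in the repo."""
    # Assumption: any unrecognized verdict is treated as inconclusive.
    label = RESULT_LABELS.get(verdict, RESULT_LABELS["inconclusive"])
    # Soft no-op: a missing label yields None, so posting the comment
    # never depends on labeling succeeding.
    return label if label in repo_labels else None
```

A caller would post the report comment first and only then apply the returned label, if any.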