docs: add ADR-001 for verification output parsing (CG-62)

2025-12-11 07:58:14 -07:00
parent f0e86d6920
commit 563f1ca145
1 changed files with 91 additions and 0 deletions
--- a/project/decisions/ADR-001-verification-output-parsing.md
+++ b/project/decisions/ADR-001-verification-output-parsing.md
@@ -0,0 +1,91 @@
+# ADR-001: Verification Output Parsing for Merge Gating
+
+**Status:** Accepted
+**Date:** 2025-12-11
+**Issue:** CG-62
+
+## Context
+
+The Agent Runner system automates the development workflow, including code verification and merging feature branches to main. The QA Agent performs verification and reports pass/fail status in its output.
+
+A critical bug was discovered where the QA Agent merged branch `issue/CG-19` to main despite reporting "Verification Failed" in its output. The root cause: the merge decision relied solely on the process exit code (0 = success), but the QA Agent could exit with code 0 while still reporting verification failure in its stdout.
+
+### Evidence from CG-19
+
+```
+Comment #5 (QA Agent Verification):
+> ## Verification Failed
+> The developer's comments claim comprehensive implementation but NONE of the
+> retry logic changes were actually implemented.
+
+Comment #6 (QA Agent Merge) - 3 seconds later:
+> ## Branch Merged
+> Feature branch has been merged to main and deleted.
+```
+
+## Decision
+
+Implement a secondary verification check that parses agent stdout for explicit pass/fail markers, independent of exit code.
+
+### Implementation
+
+1. **Added `_detect_verification_result(stdout)` function** that:
+   - Scans stdout for failure patterns (case-insensitive regex)
+   - Returns `(passed: bool, reason: str)` tuple
+   - Falls back to pass if no markers found (preserves exit-code-only behavior for agents without markers)
+
+2. **Failure patterns detected:**
+   - "verification failed"
+   - "requirements not met"
+   - "was not implemented" / "were not implemented"
+   - "none of the ... changes were implemented"
+   - "acceptance criteria not met"
+   - Markdown header "## Verification Failed"
+
+3. **Pass patterns detected:**
+   - "verification passed" / "verification succeeded"
+   - "all acceptance criteria met"
+   - Markdown header "## Verification Passed"
+
+4. **Modified merge logic:**
+   ```python
+   if task.task_type == "verification":
+       stdout_passed, detection_reason = _detect_verification_result(task.stdout or "")
+       logger.info(f"Verification result for {task.issue_id}: exit_code=0, stdout_check={stdout_passed} ({detection_reason})")
+
+       if stdout_passed:
+           self._merge_feature_branch(task)
+       else:
+           # Block merge, add comment, move to Triage
+   ```
+
+## Consequences
+
+### Positive
+
+- **Prevents unverified merges:** Failed verification now correctly blocks merge regardless of exit code
+- **Audit trail:** All merge decisions are logged with the detection reason
+- **User feedback:** Blocked merges add a "Merge Blocked" comment explaining why
+- **Backwards compatible:** Empty stdout or no markers falls back to exit code behavior
+
+### Negative
+
+- **Pattern maintenance:** New failure message formats may need to be added to the patterns list
+- **False positives possible:** If a passing verification mentions failure in a different context (e.g., "Fixed the verification failed issue"), it could be incorrectly blocked
+
+### Neutral
+
+- Issues blocked from merge are moved to Triage state for manual review
+- The original failure pattern from CG-19 ("none of the ... changes were actually implemented") is specifically covered
+
+## Alternatives Considered
+
+1. **Fix agent to always exit non-zero on failure:** Would require changes to all agent implementations and doesn't protect against agent bugs
+2. **Structured output format (JSON):** More robust but requires agent rewrite; stdout parsing is simpler and works with existing agents
+3. **Require explicit pass marker:** Would break backwards compatibility with agents that don't emit markers
+
+## Related
+
+- CG-62: Issue where this fix was implemented
+- CG-19: Issue where the bug originally manifested
+- `/opt/repos/agentrunner/runner.py`: Implementation location (lines 170-220, 429-450)