docs: add ADR-001 for verification output parsing (CG-62)

This commit is contained in:
ClearGrow Agent
2025-12-11 07:58:14 -07:00
parent f0e86d6920
commit 563f1ca145

View File

@@ -0,0 +1,91 @@
# ADR-001: Verification Output Parsing for Merge Gating
**Status:** Accepted
**Date:** 2025-12-11
**Issue:** CG-62
## Context
The Agent Runner system automates the development workflow, including code verification and merging feature branches to main. The QA Agent performs verification and reports pass/fail status in its output.
A critical bug was discovered where the QA Agent merged branch `issue/CG-19` to main despite reporting "Verification Failed" in its output. The root cause: the merge decision relied solely on the process exit code (0 = success), but the QA Agent could exit with code 0 while still reporting verification failure in its stdout.
### Evidence from CG-19
```
Comment #5 (QA Agent Verification):
> ## Verification Failed
> The developer's comments claim comprehensive implementation but NONE of the
> retry logic changes were actually implemented.
Comment #6 (QA Agent Merge) - 3 seconds later:
> ## Branch Merged
> Feature branch has been merged to main and deleted.
```
## Decision
Implement a secondary verification check that parses agent stdout for explicit pass/fail markers, independent of exit code.
### Implementation
1. **Added `_detect_verification_result(stdout)` function** that:
- Scans stdout for failure patterns (case-insensitive regex)
- Returns `(passed: bool, reason: str)` tuple
- Falls back to pass if no markers found (preserves exit-code-only behavior for agents without markers)
2. **Failure patterns detected:**
- "verification failed"
- "requirements not met"
- "was not implemented" / "were not implemented"
- "none of the ... changes were implemented"
- "acceptance criteria not met"
- Markdown header "## Verification Failed"
3. **Pass patterns detected:**
- "verification passed" / "verification succeeded"
- "all acceptance criteria met"
- Markdown header "## Verification Passed"
4. **Modified merge logic:**
```python
if task.task_type == "verification":
stdout_passed, detection_reason = _detect_verification_result(task.stdout or "")
logger.info(f"Verification result for {task.issue_id}: exit_code=0, stdout_check={stdout_passed} ({detection_reason})")
if stdout_passed:
self._merge_feature_branch(task)
else:
# Block merge, add comment, move to Triage
```
## Consequences
### Positive
- **Prevents unverified merges:** Failed verification now correctly blocks merge regardless of exit code
- **Audit trail:** All merge decisions are logged with the detection reason
- **User feedback:** Blocked merges add a "Merge Blocked" comment explaining why
- **Backwards compatible:** Empty stdout or no markers falls back to exit code behavior
### Negative
- **Pattern maintenance:** New failure message formats may need to be added to the patterns list
- **False positives possible:** If a passing verification mentions failure in a different context (e.g., "Fixed the verification failed issue"), it could be incorrectly blocked
### Neutral
- Issues blocked from merge are moved to Triage state for manual review
- The original failure pattern from CG-19 ("none of the ... changes were actually implemented") is specifically covered
## Alternatives Considered
1. **Fix agent to always exit non-zero on failure:** Would require changes to all agent implementations and doesn't protect against agent bugs
2. **Structured output format (JSON):** More robust but requires agent rewrite; stdout parsing is simpler and works with existing agents
3. **Require explicit pass marker:** Would break backwards compatibility with agents that don't emit markers
## Related
- CG-62: Issue where this fix was implemented
- CG-19: Issue where the bug originally manifested
- `/opt/repos/agentrunner/runner.py`: Implementation location (lines 170-220, 429-450)