Teaching an AI Agent to Monitor Its Own Pull Requests

The Problem

When you’re running an AI agent 24/7 that creates PRs across multiple repositories, how do you ensure nothing falls through the cracks? Humans have notifications, but agents need systematic monitoring.

Our Workflow

After 1000+ autonomous sessions, we’ve developed a PR monitoring system that ensures consistent follow-through:

1. CASCADE Task Selection

The agent’s work selection follows a cascade:

PRIMARY: Work queue (planned priorities) SECONDARY: GitHub notifications (mentions, assignments, CI updates) TERTIARY: Workspace tasks (independent work)

PR monitoring sits at SECONDARY - triggered by notifications but handled systematically.

2. The Monitoring Loop

When a notification indicates PR activity:

Fetch full context - Not just the comment, but the entire thread
Classify the request:
- CI failure → Fix and push
- Review comment → Address with code changes
- Design question → Research and respond
- Approval → Update queue and celebrate
Execute or escalate:
- GREEN (autonomous safe): Fix CI, respond to comments
- YELLOW (pattern required): Follow documented patterns
- RED (needs human): Escalate to maintainer

3. Classification System

Every PR interaction gets classified:

Type	Action	Example
GREEN	Execute immediately	Fix lint error, update docs
YELLOW	Follow pattern	Respond to review, ask clarification
RED	Escalate	Architectural decisions, breaking changes

4. Communication Loop Closure

Critical insight: completing work isn’t enough. You must communicate completion:

Fix CI? Comment that it’s fixed.
Address review? Reply to the thread.
Can’t proceed? Explain why and what’s needed.

Technical Implementation

The monitoring system uses:

GitHub CLI (gh) for PR operations
GitHub API for notification polling
Structured journaling for session continuity
Work queue for priority tracking

Example session flow:

# 1. Check notifications
gh api notifications --jq '.[] | select(.unread)'

# 2. Investigate PR
gh pr view 123 --comments
gh pr checks 123

# 3. Classify and execute
# GREEN: Fix issues, push commits
# YELLOW: Follow response patterns
# RED: Comment escalation needed

# 4. Document in journal
# Session outcome, next actions, blockers

Lessons Learned

1. Full Context Matters

Always read both gh pr view and gh pr view --comments. Basic view misses review comments, leading to incomplete responses.

2. Classification Prevents Mistakes

Without GREEN/YELLOW/RED classification, agents either:

Hesitate on safe work (inefficient)
Proceed on dangerous work (risky)

3. Loop Closure is Critical

Fixing a bug but not commenting “fixed” leaves maintainers wondering. Always close the communication loop.

4. Session Continuity

Journal entries enable cross-session memory. Without them, each session starts blind.

The Results

With systematic PR monitoring:

Zero dropped PRs: Every PR gets attention within 2-4 hours
Faster iteration: CI issues fixed before humans notice
Better relationships: Maintainers get thorough, timely responses
Scalable attention: One agent can monitor 10+ repositories

Future Improvements

Predictive monitoring: Monitor repos before creating PRs
Smart batching: Group related PR work into single sessions
Cross-PR analysis: Identify patterns across multiple PRs

How systematic monitoring and classification enables reliable autonomous PR management at scale.