Runs autonomous iterative AI loops for requirements, planning, or building phases using structured prompts and fresh context per iteration.
Ralph Loops Skill
> First time? Read [SETUP.md](./SETUP.md) first to install dependencies and verify your setup.
Autonomous AI agent loops for iterative development. Based on Geoffrey Huntley's Ralph Wiggum technique, as documented by Clayton Farr.
Script: skills/ralph-loops/scripts/ralph-loop.mjs
Dashboard: skills/ralph-loops/dashboard/ (run with node server.mjs)
Templates: skills/ralph-loops/templates/
Archive: ~/clawd/logs/ralph-archive/
---
⚠️ Known Issues
Claude Code Version Compatibility
Claude Code 2.1.29 has a critical bug that spawns orphaned sub-agents consuming 99% CPU. Iterations fail with "exit code null" on first run.
Fix: Downgrade to 2.1.25:
npm install -g @anthropic-ai/claude-code@2.1.25
Verify:
claude --version # Should show 2.1.25
This was discovered 2026-02-01. Check if newer versions fix the issue before upgrading.
---
⚠️ Don't Block the Conversation!
When running a Ralph loop, don't monitor it synchronously. The loop runs as a separate Claude CLI process — you can keep chatting.
❌ Wrong (blocks conversation):
Start loop → sleep 60 → poll → sleep 60 → poll → ... (6 minutes of silence)
✅ Right (stays responsive):
Start loop → "It's running, I'll check periodically" → keep chatting → check on heartbeats
How to monitor without blocking:
1. Start the loop with node ralph-loop.mjs ... (runs in background)
2. Tell human: "Loop running. I'll check progress periodically or you can ask."
3. Check via process poll <sessionId> when asked or during heartbeats
4. Use the dashboard at http://localhost:3939 for real-time visibility
The loop is autonomous — that's the whole point. Don't babysit it at the cost of ignoring your human.
---
Trigger Phrases
When human says:
| Phrase | Action |
|--------|--------|
| "Interview me about system X" | Start Phase 1 requirements interview |
| "Start planning system X" | Run ./loop.sh plan (needs specs first) |
| "Start building system X" | Run ./loop.sh build (needs plan first) |
| "Ralph loop over X" | ASK which phase (see below) |
When Human Says "Ralph Loop" — Clarify the Phase!
Don't assume which phase. Ask:
> "Which type of Ralph loop are we doing?
>
> 1️⃣ Interview — I'll ask you questions to build specs (Phase 1)
> 2️⃣ Planning — I'll iterate on an implementation plan (Phase 2)
> 3️⃣ Building — I'll implement from a plan, one task per iteration (Phase 3)
> 4️⃣ Generic — Simple iterative refinement on a single topic"
Then proceed based on their answer:
| Choice | Action |
|--------|--------|
| Interview | Use templates/requirements-interview.md protocol |
| Planning | Need specs first → run planning loop with PROMPT_plan.md |
| Building | Need plan first → run build loop with PROMPT_build.md |
| Generic | Create prompt file, run ralph-loop.mjs directly |
Generic Ralph Loop Flow (Phase 4)
For simple iterative refinement (not full system builds):
1. Clarify the task — What exactly should be improved/refined?
2. Create a prompt file — Save to /tmp/ralph-prompt-<task>.md
3. Set completion criteria — What signals "done"?
4. Run the loop:
node skills/ralph-loops/scripts/ralph-loop.mjs \
--prompt "/tmp/ralph-prompt-<task>.md" \
--model opus \
--max 10 \
--done "RALPH_DONE"
5. Or spawn as sub-agent for long-running tasks
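The stopping behavior implied by --max and --done can be sketched as follows. This is an illustrative sketch, not the actual ralph-loop.mjs: runLoop and runIteration are hypothetical names, and runIteration stands in for one fresh-context Claude CLI run that returns its text output.

```javascript
// Hypothetical sketch of an iteration loop with a cap and a done marker.
// `runIteration` is an assumed callback, not part of ralph-loop.mjs.
function runLoop({ runIteration, max = 10, doneMarker = "RALPH_DONE" }) {
  for (let i = 1; i <= max; i++) {
    const output = String(runIteration(i));
    // Stop as soon as the agent's output contains the completion signal.
    if (output.includes(doneMarker)) {
      return { iterations: i, done: true };
    }
  }
  // Cap reached without seeing the marker.
  return { iterations: max, done: false };
}

// Usage with a stand-in iteration that finishes on its third run.
const result = runLoop({
  runIteration: (i) => (i === 3 ? "all criteria met RALPH_DONE" : "still refining"),
  max: 10,
});
console.log(result);
```

The cap matters as much as the marker: if the completion criteria are never met, the loop still ends.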
---
Core Philosophy
> "Human roles shift from 'telling the agent what to do' to 'engineering conditions where good outcomes emerge naturally through iteration.'"
> — Clayton Farr
Three principles drive everything:
1. Context is scarce — With ~176K usable tokens from a 200K window, keep each iteration lean
2. Plans are disposable — A drifting plan is cheaper to regenerate than salvage
3. Backpressure beats direction — Engineer environments where wrong outputs get rejected automatically
---
Three-Phase Workflow
┌─────────────────────────────────────────────────────────────────────┐
│ Phase 1: REQUIREMENTS │
│ Human + LLM conversation → JTBD → Topics → specs/*.md │
├─────────────────────────────────────────────────────────────────────┤
│ Phase 2: PLANNING │
│ Gap analysis (specs vs code) → IMPLEMENTATION_PLAN.md │
├─────────────────────────────────────────────────────────────────────┤
│ Phase 3: BUILDING │
│ One task per iteration → fresh context → backpressure → commit │
└─────────────────────────────────────────────────────────────────────┘
Phase 1: Requirements (Talk to Human)
Goal: Understand what to build BEFORE building it.
This is the most important phase. Use structured conversation to:
1. Identify Jobs to Be Done (JTBD)
- What user need or outcome are we solving?
- Not features — outcomes
2. Break JTBD into Topics of Concern
- Each topic = one distinct aspect/component
- Use the "one sentence without 'and'" test
- ✓ "The color extraction system analyzes images to identify dominant colors"
- ✗ "The user system handles authentication, profiles, and billing" → 3 topics
3. Create Specs for Each Topic
- One markdown file per topic in specs/
- Capture requirements, acceptance criteria, edge cases
Template: templates/requirements-interview.md
Phase 2: Planning (Gap Analysis)
Goal: Create a prioritized task list without implementing anything.
Uses PROMPT_plan.md in the loop:
- Study all specs
- Study existing codebase
- Compare specs vs code (gap analysis)
- Generate IMPLEMENTATION_PLAN.md with prioritized tasks
- NO implementation — planning only
Usually completes in 1-2 iterations.
Phase 3: Building (One Task Per Iteration)
Goal: Implement tasks one at a time with fresh context.
Uses PROMPT_build.md in the loop:
1. Read IMPLEMENTATION_PLAN.md
2. Pick the most important task
3. Investigate codebase (don't assume not implemented)
4. Implement
5. Run validation (backpressure)
6. Update plan, commit
7. Exit → fresh context → next iteration
Key insight: One task per iteration keeps context lean. The agent stays in the "smart zone" instead of accumulating cruft.
Why fresh context matters:
- No accumulated mistakes — Each iteration starts clean; previous errors don't compound
- Full context budget — 200K tokens for THIS task, not shared with finished work
- Reduced hallucination — Shorter contexts = more grounded responses
- Natural checkpoints — Each commit is a save point; easy to revert single iterations
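Steps 1 and 2 of the build iteration (read the plan, pick a task) can be sketched as below. The plan format is an assumption: the sketch treats IMPLEMENTATION_PLAN.md as markdown checkboxes in priority order, whereas the real format is whatever the planning loop generates. The agent normally judges which task is most important; picking the first open item is a simplification.

```javascript
// Assumption: tasks are markdown checkboxes in priority order
// ("- [ ] task" is open, "- [x] task" is done). `nextTask` is a hypothetical helper.
function nextTask(planText) {
  for (const line of planText.split(/\r?\n/)) {
    const match = line.match(/^\s*- \[ \] (.+)$/);
    if (match) return match[1].trim();
  }
  return null; // no open tasks left
}

const plan = [
  "- [x] Set up project skeleton",
  "- [ ] Implement color extraction",
  "- [ ] Add sharing links",
].join("\n");

console.log(nextTask(plan));
```

Because every iteration re-reads the plan from disk, no state needs to survive between iterations; the plan file is the only memory.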
---
File Structure
project/
├── loop.sh # Ralph loop script
├── PROMPT_plan.md # Planning mode instructions
├── PROMPT_build.md # Building mode instructions
├── AGENTS.md # Operational guide (~60 lines max)
├── IMPLEMENTATION_PLAN.md # Prioritized task list (generated)
└── specs/ # Requirement specs
├── topic-a.md
├── topic-b.md
└── ...
File Purposes
| File | Purpose | Who Creates |
|------|---------|-------------|
| specs/*.md | Source of truth for requirements | Human + Phase 1 |
| PROMPT_plan.md | Instructions for planning mode | Copy from template |
| PROMPT_build.md | Instructions for building mode | Copy from template |
| AGENTS.md | Build/test/lint commands | Human + Ralph |
| IMPLEMENTATION_PLAN.md | Task list with priorities | Ralph (Phase 2) |
Project Organization (Systems)
For Clawdbot systems, each Ralph project lives in <workspace>/systems/<name>/:
systems/
├── health-tracker/ # Example system
│ ├── specs/
│ │ ├── daily-tracking.md
│ │ └── test-scheduling.md
│ ├── PROMPT_plan.md
│ ├── PROMPT_build.md
│ ├── AGENTS.md
│ ├── IMPLEMENTATION_PLAN.md # ← exists = past Phase 1
│ └── src/
└── activity-planner/
├── specs/ # ← empty = still in Phase 1
└── ...
Phase Detection (Auto)
Detect current phase by checking what files exist:
| What Exists | Current Phase | Next Action |
|-------------|---------------|-------------|
| Nothing / empty specs/ | Phase 1: Requirements | Run requirements interview |
| specs/*.md but no IMPLEMENTATION_PLAN.md | Ready for Phase 2 | Run ./loop.sh plan |
| specs/*.md + IMPLEMENTATION_PLAN.md | Phase 2 or 3 | Review plan, run ./loop.sh build |
| Plan shows all tasks complete | Done | Archive or iterate |
Quick check:
# What phase are we in?
[ -z "$(ls specs/ 2>/dev/null)" ] && echo "Phase 1: Need specs" && exit
[ ! -f IMPLEMENTATION_PLAN.md ] && echo "Phase 2: Need plan" && exit
echo "Phase 3: Ready to build (or done)"
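The same decision table can be written as a pure function, which is easier to reuse from a Node script. This is a sketch under stated assumptions: detectPhase is a hypothetical helper, the caller is expected to list specs/ and check for the plan file itself, and openTasks is an assumed count of unfinished plan items.

```javascript
// Mirrors the phase table. Inputs are plain values so the function stays pure:
// `specFiles` are file names found in specs/, `hasPlan` says whether
// IMPLEMENTATION_PLAN.md exists, `openTasks` is null when unknown.
function detectPhase({ specFiles = [], hasPlan = false, openTasks = null }) {
  const specs = specFiles.filter((name) => name.endsWith(".md"));
  if (specs.length === 0) {
    return { phase: "Phase 1: Requirements", next: "Run requirements interview" };
  }
  if (!hasPlan) {
    return { phase: "Ready for Phase 2", next: "Run ./loop.sh plan" };
  }
  if (openTasks === 0) {
    return { phase: "Done", next: "Archive or iterate" };
  }
  return { phase: "Phase 2 or 3", next: "Review plan, run ./loop.sh build" };
}

console.log(detectPhase({ specFiles: ["sharing.md"], hasPlan: false }).next);
```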
---
JTBD Breakdown
The hierarchy matters:
JTBD (Job to Be Done)
└── Topic of Concern (1 per spec file)
└── Tasks (many per topic, in IMPLEMENTATION_PLAN.md)
Example:
- JTBD: "Help designers create mood boards"
- Topics:
  - Image collection → specs/image-collection.md
  - Color extraction → specs/color-extraction.md
  - Layout system → specs/layout-system.md
  - Sharing → specs/sharing.md
- Tasks: Each spec generates multiple implementation tasks
Topic Scope Test
> Can you describe the topic in one sentence without "and"?
If you need "and" or "also", it's probably multiple topics. Split it.
When to split:
- Multiple verbs in the description → separate topics
- Different user personas involved → separate topics
- Could be implemented by different teams → separate topics
- Has its own failure modes → probably its own topic
❌ "User management handles registration, authentication, profiles, and permissions"
✅ Split into:
- "Registration creates new user accounts from email/password"
- "Authentication verifies user identity via login flow"
- "Profiles let users view and edit their information"
- "Permissions control what actions users can perform"
Counter-example (don't split):
✅ Keep together:
"Color extraction analyzes images and returns dominant color palettes"
Why: "analyzes" and "returns" are steps in one operation, not separate concerns.
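A mechanical version of the scope test can flag candidates for review, though it cannot make the final call. The sketch below is a heuristic only, and scopeFlags is a hypothetical helper: it reports conjunctions and enumerated lists, and it will also flag the counter-example above, which is why a human still decides whether the "and" joins separate concerns or steps in one operation.

```javascript
// Heuristic only: returns reasons a topic description might need splitting.
function scopeFlags(sentence) {
  const flags = [];
  // Whole-word match so "handles" or "understand" do not trigger.
  if (/\b(and|also)\b/i.test(sentence)) flags.push("conjunction");
  // Two or more commas usually mean an enumerated list of concerns.
  if ((sentence.match(/,/g) || []).length >= 2) flags.push("list");
  return flags;
}

const samples = [
  "User management handles registration, authentication, profiles, and permissions",
  "Authentication verifies user identity via login flow",
  "Color extraction analyzes images and returns dominant color palettes",
];
for (const s of samples) console.log(scopeFlags(s), s);
```

A sentence flagged with both "conjunction" and "list" is a strong split candidate; a lone "conjunction" flag is only a prompt to look closer.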
---
Backpressure Mechanisms
Autonomous loops converge when wrong outputs get rejected. Three layers:
1. Downstream Gates (Hard)
Tests, type-checking, linting, build validation. Deterministic.
In AGENTS.md:
Validation
- Tests: npm test
- Typecheck: npm run typecheck
- Lint: npm run lint
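Hard gates are most useful when they run in a fixed order and the first failure stops the iteration. A minimal sketch, assuming an injected exec function that runs a command and returns its exit code (in a real script it could wrap child_process.spawnSync); runGates and the gate objects are hypothetical names.

```javascript
// Runs validation gates in order and stops at the first failure.
// `exec` is an injected function that returns a command's exit code.
function runGates(gates, exec) {
  for (const { name, command } of gates) {
    const code = exec(command);
    if (code !== 0) return { passed: false, failedGate: name };
  }
  return { passed: true, failedGate: null };
}

const gates = [
  { name: "tests", command: "npm test" },
  { name: "typecheck", command: "npm run typecheck" },
  { name: "lint", command: "npm run lint" },
];

// Stand-in executor for illustration: pretend the typecheck fails.
const fakeExec = (command) => (command === "npm run typecheck" ? 1 : 0);
console.log(runGates(gates, fakeExec));
```

Injecting exec keeps the gate logic testable without running npm at all.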
2. Upstream Steering (Soft)
Existing code patterns guide the agent. It discovers conventions through exploration.
3. LLM-as-Judge (Subjective)
For subjective criteria (tone, UX, aesthetics), use another LLM call with binary pass/fail.
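A binary verdict only works as backpressure if the reply is parsed strictly. The sketch below assumes a protocol, invented here for illustration, in which the judge is asked to put PASS or FAIL alone on the first line; parseVerdict is a hypothetical helper, and anything ambiguous counts as a failure.

```javascript
// Strict parser for a binary judge reply. Any reply whose first line is not
// exactly PASS (case-insensitive) is treated as a failure.
function parseVerdict(reply) {
  const firstLine = String(reply).trim().split(/\r?\n/)[0].trim().toUpperCase();
  return firstLine === "PASS";
}

console.log(parseVerdict("PASS\nTone matches the style guide."));
console.log(parseVerdict("Probably a pass, but the intro is weak."));
```

Failing closed means a vague or chatty judge response rejects the output instead of letting it through.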
> Start with hard gates. Add LLM-as-judge for subjective criteria only after mechanical backpressure works.
---
Prompt Structure
Geoffrey's prompts follow a numbered pattern:
| Section | Purpose |
|---------|---------|
| 0a-0d | Orient: Study specs, source, current plan |
| 1-4 | Main instructions: What to do this iteration |
| 999+ | Guardrails: Invariants (higher number = more critical) |
The Numbered Guardrails Pattern
Guardrails use escalating numbers (99999, 999999, 9999999...) to signal priority:
99999. Important: Capture the why in documentation.
999999. Important: Single sources of truth, no migrations.
9999999. Create git tags after successful builds.
99999999. Add logging if needed to debug.
999999999. Keep IMPLEMENTATION_PLAN.md current.
Why this works:
1. Visual prominence — Large numbers stand out, harder to skip
2. Implicit priority — More 9s = more critical (like DEFCON levels in reverse)
3. No collisions — Sparse numbering lets you insert new rules without renumbering
4. Mnemonic — Claude treats these as invariants, not suggestions
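The numbering itself is easy to generate, which helps when guardrails are kept in a list and rendered into the prompt. A small sketch, assuming rules are ordered from least to most critical; formatGuardrails is a hypothetical helper, not part of the templates.

```javascript
// Prefixes each rule with an escalating run of 9s, starting at 99999.
function formatGuardrails(rules, startDigits = 5) {
  return rules.map((rule, i) => `${"9".repeat(startDigits + i)}. ${rule}`);
}

const guardrails = formatGuardrails([
  "Important: Capture the why in documentation.",
  "Important: Single sources of truth, no migrations.",
  "Create git tags after successful builds.",
  "Add logging if needed to debug.",
  "Keep IMPLEMENTATION_PLAN.md current.",
]);
console.log(guardrails.join("\n"));
```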
The "Important:" prefix is deliberate — it triggers Claude's attention.
Key Language Patterns
Use Geoffrey's specific phrasing — it matters:
- "study" (not "read" or "look at")
- "don't assume not implemented" (critical!)
- "using parallel subagents" / "up to N subagents"
- "only 1 subagent for build/tests" (backpressure control)
- "Ultrathink" (deep reasoning trigger)
- "capture the why"
- "keep it up to date"
- "resolve them or document them"
---
Quick Start
1. Set Up Project Structure
mkdir -p myproject/specs
cd myproject
git init # Ralph expects git for commits
# Copy templates
cp .//templates/PROMPT_plan.md .
cp .//templates/PROMPT_build.md .
cp .//templates/AGENTS.md .
cp .//templates/loop.sh .
chmod +x loop.sh
2. Customize Templates (Required!)
PROMPT_plan.md — Replace [PROJECT_GOAL] with your actual goal:
Before:
ULTIMATE GOAL: We want to achieve [PROJECT_GOAL].
After:
ULTIMATE GOAL: We want to achieve a fully functional mood board app with image upload and color extraction.
PROMPT_build.md — Adjust source paths if not using src/:
Before:
0c. For reference, the application source code is in src/*.
After:
0c. For reference, the application source code is in lib/*.
AGENTS.md — Update build/test/lint commands for your stack.
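Placeholder replacement is worth automating so that an unfilled token never reaches the loop. A minimal sketch: fillTemplate is a hypothetical helper, and only [PROJECT_GOAL] is a placeholder confirmed by the text above; any other bracketed name is reported as missing.

```javascript
// Replaces [PLACEHOLDER] tokens and reports any that were left unfilled.
function fillTemplate(template, values) {
  const missing = [];
  const text = template.replace(/\[([A-Z_]+)\]/g, (token, name) => {
    if (Object.prototype.hasOwnProperty.call(values, name)) return values[name];
    missing.push(name);
    return token; // leave unknown placeholders visible
  });
  return { text, missing };
}

const filled = fillTemplate("ULTIMATE GOAL: We want to achieve [PROJECT_GOAL].", {
  PROJECT_GOAL: "a fully functional mood board app",
});
console.log(filled.text);
```

Checking that missing is empty before starting a loop catches templates that were copied but never customized.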
3. Phase 1: Requirements Gathering (Don't Skip!)
This phase happens WITH the human. Use the interview template:
cat .//templates/requirements-interview.md
The workflow:
1. Discuss the JTBD (Job to Be Done) — outcomes, not features
2. Break into Topics of Concern (each passes the "one sentence" test)
3. Write a spec file for each topic: specs/topic-name.md
4. Human reviews and approves specs
Example output:
specs/
├── image-collection.md
├── color-extraction.md
├── layout-system.md
└── sharing.md
4. Phase 2: Planning
./loop.sh plan
Wait for IMPLEMENTATION_PLAN.md to be generated (usually 1-2 iterations). Review it — this is your task list.
5. Phase 3: Building
./loop.sh build 20 # Max 20 iterations
Watch it work. Add backpressure (tests, lints) as patterns emerge. Check commits for progress.
---
Loop Script Options
./loop.sh # Build mode, unlimited
./loop.sh 20 # Build mode, max 20 iterations
./loop.sh plan # Plan mode, unlimited
./loop.sh plan 5 # Plan mode, max 5 iterations
Or use the Node.js wrapper for more control:
node skills/ralph-loops/scripts/ralph-loop.mjs \
--prompt "./PROMPT_build.md" \
--model opus \
--max 20 \
--done "RALPH_DONE"
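The loop.sh argument convention shown above (an optional mode keyword followed by an optional iteration cap) can be sketched as a small parser. This is not the real loop.sh logic; parseLoopArgs is a hypothetical helper that only mirrors the documented invocations.

```javascript
// Parses [mode] [max]: mode defaults to "build", max defaults to unlimited.
function parseLoopArgs(args) {
  let mode = "build";
  let rest = args;
  if (rest[0] === "plan" || rest[0] === "build") {
    mode = rest[0];
    rest = rest.slice(1);
  }
  const max = rest.length > 0 ? Number.parseInt(rest[0], 10) : Infinity;
  if (Number.isNaN(max) || max < 1) {
    throw new Error(`Invalid iteration cap: ${rest[0]}`);
  }
  return { mode, max };
}

console.log(parseLoopArgs(["plan", "5"]));
console.log(parseLoopArgs(["20"]));
```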
---
When to Regenerate the Plan
Plans drift. Regenerate when:
- Ralph is going off track (implementing wrong things)
- Plan feels stale or doesn't match current state
- Too much clutter from completed items
- You've made significant spec changes
- You're confused about what's actually done
Just switch back to planning mode:
./loop.sh plan
Regeneration cost is one Planning loop. Cheap compared to Ralph going in circles.
---
Safety
Ralph requires --dangerously-skip-permissions to run autonomously. This bypasses Claude's permission system entirely.
- Run in isolated environments (Docker, VM)
- Only the API keys needed for the task
- No access to private data beyond requirements
- Restrict network connectivity where possible
- Escape hatches: Ctrl+C stops the loop; git reset --hard reverts uncommitted changes
---
Cost Expectations
| Task Type | Model | Iterations | Est. Cost |
|-----------|-------|------------|-----------|
| Generate plan | Opus | 1-2 | $0.50-1.00 |
| Implement simple feature | Opus | 3-5 | $1.00-2.00 |
| Implement complex feature | Opus | 10-20 | $3.00-8.00 |
| Full project buildout | Opus | 50+ | $15-50+ |
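The table implies a rough per-iteration rate that can be used to budget a capped run. The sketch below derives that rate from the first three rows (pairing the low cost with the low iteration count and the high with the high, which is an assumption) and scales it linearly; actual costs depend on context size per iteration and will vary.

```javascript
// Rows copied from the table above; the open-ended "50+" row is left out.
const COST_ROWS = [
  { task: "Generate plan", iterations: [1, 2], cost: [0.5, 1.0] },
  { task: "Implement simple feature", iterations: [3, 5], cost: [1.0, 2.0] },
  { task: "Implement complex feature", iterations: [10, 20], cost: [3.0, 8.0] },
];

// Lowest and highest per-iteration rate implied by the rows, in dollars.
function impliedRates(rows) {
  const rates = rows.flatMap((r) => [
    r.cost[0] / r.iterations[0],
    r.cost[1] / r.iterations[1],
  ]);
  return { low: Math.min(...rates), high: Math.max(...rates) };
}

// Rough budget range, assuming cost scales linearly with iteration count.
function estimateCost(iterations, rows = COST_ROWS) {
  const { low, high } = impliedRates(rows);
  return { low: iterations * low, high: iterations * high };
}

console.log(impliedRates(COST_ROWS));
console.log(estimateCost(20));
```

The implied rates fall between about $0.30 and $0.50 per Opus iteration, so the linear estimate is deliberately wider than any single table row.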
Tip: Use Sonnet for simpler tasks where plan is clear. Use Opus for planning and complex reasoning.
---
Real-World Results
From Geoffrey Huntley:
- 6 repos generated overnight at YC hackathon
- $50k contract completed for $297 in API costs
- Created entire programming language over 3 months
---
Advanced: Running as Sub-Agent
For long loops, spawn as sub-agent so main session stays responsive:
sessions_spawn({
  task: `cd /path/to/project && ./loop.sh build 20
Summarize what was implemented when done.`,
  label: "ralph-build",
  model: "opus"
})
Check progress:
sessions_list({ kinds: ["spawn"] })
sessions_history({ label: "ralph-build", limit: 5 })
---
Troubleshooting
Ralph keeps implementing the same thing
- Plan is stale → regenerate with ./loop.sh plan
- Backpressure missing → add tests that catch duplicates
Ralph goes in circles
- Add more specific guardrails to prompts
- Check if specs are ambiguous
- Regenerate plan
Context getting bloated
- Ensure one task per iteration (check prompt)
- Keep AGENTS.md under 60 lines
- Move status/progress to IMPLEMENTATION_PLAN.md, not AGENTS.md
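The AGENTS.md size guideline is easy to check mechanically, for example before each loop run. A minimal sketch: checkAgentsLength is a hypothetical helper, and the caller is assumed to have read the file contents into a string.

```javascript
// Checks AGENTS.md content against the ~60 line guideline.
function checkAgentsLength(text, maxLines = 60) {
  const lines = text.split(/\r?\n/);
  // Ignore a single trailing newline so it does not count as an extra line.
  if (lines[lines.length - 1] === "") lines.pop();
  return { lines: lines.length, ok: lines.length <= maxLines };
}

console.log(checkAgentsLength("Validation\n- Tests: npm test\n- Lint: npm run lint\n"));
```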
Tests not running
- Check AGENTS.md has correct validation commands
- Ensure backpressure section in prompt references AGENTS.md
---
Edge Cases
Projects Without Git
The loop script expects git for commits and pushes. For projects without version control:
Option 1: Initialize git anyway (recommended)
git init
git add -A
git commit -m "Initial commit before Ralph"
Option 2: Modify the prompts
- Remove git-related guardrails from PROMPT_build.md
- Remove the git push section from loop.sh
- Use file backups instead: add cp -r src/ backups/iteration-$ITERATION/ to loop.sh
Add to loop.sh before each iteration:
tar -czf "snapshots/pre-iteration-$ITERATION.tar.gz" src/
Very Large Codebases
For codebases with 100K+ lines:
- Reduce subagent parallelism: Change "up to 500 parallel Sonnet subagents" to "up to 50" in prompts
- Scope narrowly: Use focused specs that target specific directories
- Add path restrictions: In AGENTS.md, note which directories are in-scope
- Consider workspace splitting: Treat large modules as separate Ralph projects
When Claude CLI Isn't Available
The methodology works with any Claude interface:
Claude API directly:
# Replace loop.sh with API calls using curl or a script.
# The anthropic-version header is required by the Messages API.
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{"model": "claude-sonnet-4-20250514", "max_tokens": 8192, "messages": [...]}'
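For a script-based replacement, the request for one iteration can be assembled as below. This is a sketch under assumptions: buildRequest is a hypothetical helper, the prompt is taken to be the full text of PROMPT_plan.md or PROMPT_build.md, and a new single-message conversation is sent each iteration to preserve the fresh-context property.

```javascript
// Builds the URL and fetch options for one Messages API call.
function buildRequest(prompt, apiKey, model = "claude-sonnet-4-20250514") {
  return {
    url: "https://api.anthropic.com/v1/messages",
    options: {
      method: "POST",
      headers: {
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
      },
      body: JSON.stringify({
        model,
        max_tokens: 8192,
        // One user message per iteration: no history is carried over.
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Usage (Node 18+ provides a global fetch):
// const { url, options } = buildRequest(promptText, process.env.ANTHROPIC_API_KEY);
// const response = await fetch(url, options);
```

A bare API call has no file or shell tools, so the surrounding script would need to apply any proposed changes and run validation itself.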
Alternative agents:
- Aider: aider --opus --auto-commits
- Continue.dev: Use with Claude API key
- Cursor: Composer mode with PROMPT files as context
The key principles (one task per iteration, fresh context, backpressure) apply regardless of tooling.
Non-Node.js Projects
Adapt AGENTS.md for your stack:
| Stack | Build | Test | Lint |
|-------|-------|------|------|
| Python | pip install -e . | pytest | ruff check . |
| Go | go build ./... | go test ./... | golangci-lint run |
| Rust | cargo build | cargo test | cargo clippy |
| Ruby | bundle install | rspec | rubocop |
Also update path references in prompts (src/* → your source directory).
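The stack table can double as a lookup for generating the validation section of AGENTS.md. A sketch with hypothetical names (STACKS, validationSection); the section layout follows the Validation example used for Node.js projects, and the Python lint command uses the ruff check spelling that recent Ruff releases expect.

```javascript
// Build/test/lint commands per stack, adapted from the table above.
const STACKS = {
  Python: { build: "pip install -e .", test: "pytest", lint: "ruff check ." },
  Go: { build: "go build ./...", test: "go test ./...", lint: "golangci-lint run" },
  Rust: { build: "cargo build", test: "cargo test", lint: "cargo clippy" },
  Ruby: { build: "bundle install", test: "rspec", lint: "rubocop" },
};

// Renders a Validation section for the given stack.
function validationSection(stack) {
  if (!Object.prototype.hasOwnProperty.call(STACKS, stack)) {
    throw new Error(`Unknown stack: ${stack}`);
  }
  const cmds = STACKS[stack];
  return [
    "Validation",
    `- Build: ${cmds.build}`,
    `- Tests: ${cmds.test}`,
    `- Lint: ${cmds.lint}`,
  ].join("\n");
}

console.log(validationSection("Rust"));
```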
---
Learn More
- Geoffrey Huntley: https://ghuntley.com/ralph/
- Clayton Farr's Playbook: https://github.com/ClaytonFarr/ralph-playbook
- Geoffrey's Fork: https://github.com/ghuntley/how-to-ralph-wiggum
---
Credits
Built by Johnathan & Q — a human-AI dyad.
- Twitter: [@spacepixel](https://x.com/spacepixel)
- ClawdHub: [clawhub.ai/skills/ralph-loops](https://www.clawhub.ai/skills/ralph-loops)