v1.0.2

Ralph Loops

Name: Ralph Loops
Author: qlifebot-coder

Runs autonomous iterative AI loops for requirements, planning, or building phases using structured prompts and fresh context per iteration.

Downloads

1.7k

Stars

Versions

Updated

2026-02-24

Install

npx clawhub@latest install ralph-loops

Documentation

Ralph Loops Skill

> First time? Read [SETUP.md](./SETUP.md) first to install dependencies and verify your setup.

Autonomous AI agent loops for iterative development. Based on Geoffrey Huntley's Ralph Wiggum technique, as documented by Clayton Farr.

Script: skills/ralph-loops/scripts/ralph-loop.mjs Dashboard: skills/ralph-loops/dashboard/ (run with node server.mjs) Templates: skills/ralph-loops/templates/ Archive: ~/clawd/logs/ralph-archive/

---

⚠️ Known Issues

Claude Code Version Compatibility

Claude Code 2.1.29 has a critical bug that spawns orphaned sub-agents consuming 99% CPU. Iterations fail with "exit code null" on first run. Fix: Downgrade to 2.1.25:

npm install -g @anthropic-ai/claude-code@2.1.25

Verify:

claude --version  # Should show 2.1.25

This was discovered 2026-02-01. Check if newer versions fix the issue before upgrading.

---

⚠️ Don't Block the Conversation!

When running a Ralph loop, don't monitor it synchronously. The loop runs as a separate Claude CLI process — you can keep chatting.

❌ Wrong (blocks conversation):

Start loop → sleep 60 → poll → sleep 60 → poll → ... (6 minutes of silence)

✅ Right (stays responsive):

Start loop → "It's running, I'll check periodically" → keep chatting → check on heartbeats

How to monitor without blocking:

1. Start the loop with node ralph-loop.mjs ... (runs in background)

2. Tell human: "Loop running. I'll check progress periodically or you can ask."

3. Check via process poll <sessionId> when asked or during heartbeats

4. Use the dashboard at http://localhost:3939 for real-time visibility

The loop is autonomous — that's the whole point. Don't babysit it at the cost of ignoring your human.

---

Trigger Phrases

When human says:

| Phrase | Action |

|--------|--------|

| "Interview me about system X" | Start Phase 1 requirements interview |

| "Start planning system X" | Run ./loop.sh plan (needs specs first) |

| "Start building system X" | Run ./loop.sh build (needs plan first) |

| "Ralph loop over X" | ASK which phase (see below) |

When Human Says "Ralph Loop" — Clarify the Phase!

Don't assume which phase. Ask:

> "Which type of Ralph loop are we doing?

> 1️⃣ Interview — I'll ask you questions to build specs (Phase 1)

> 2️⃣ Planning — I'll iterate on an implementation plan (Phase 2)

> 3️⃣ Building — I'll implement from a plan, one task per iteration (Phase 3)

> 4️⃣ Generic — Simple iterative refinement on a single topic"

Then proceed based on their answer:

| Choice | Action |

|--------|--------|

| Interview | Use templates/requirements-interview.md protocol |

| Planning | Need specs first → run planning loop with PROMPT_plan.md |

| Building | Need plan first → run build loop with PROMPT_build.md |

| Generic | Create prompt file, run ralph-loop.mjs directly |

Generic Ralph Loop Flow (Phase 4)

For simple iterative refinement (not full system builds):

1. Clarify the task — What exactly should be improved/refined?

2. Create a prompt file — Save to /tmp/ralph-prompt-<task>.md

3. Set completion criteria — What signals "done"?

4. Run the loop:

node skills/ralph-loops/scripts/ralph-loop.mjs \ --prompt "/tmp/ralph-prompt-<task>.md" \ --model opus \ --max 10 \ --done "RALPH_DONE"

5. Or spawn as sub-agent for long-running tasks

---

Core Philosophy

> "Human roles shift from 'telling the agent what to do' to 'engineering conditions where good outcomes emerge naturally through iteration."

> — Clayton Farr

Three principles drive everything:

1. Context is scarce — With ~176K usable tokens from a 200K window, keep each iteration lean

2. Plans are disposable — A drifting plan is cheaper to regenerate than salvage

3. Backpressure beats direction — Engineer environments where wrong outputs get rejected automatically

---

Three-Phase Workflow

┌─────────────────────────────────────────────────────────────────────┐
│  Phase 1: REQUIREMENTS                                              │
│  Human + LLM conversation → JTBD → Topics → specs/*.md              │
├─────────────────────────────────────────────────────────────────────┤
│  Phase 2: PLANNING                                                  │
│  Gap analysis (specs vs code) → IMPLEMENTATION_PLAN.md              │
├─────────────────────────────────────────────────────────────────────┤
│  Phase 3: BUILDING                                                  │
│  One task per iteration → fresh context → backpressure → commit     │
└─────────────────────────────────────────────────────────────────────┘

Phase 1: Requirements (Talk to Human)

Goal: Understand what to build BEFORE building it.

This is the most important phase. Use structured conversation to:

1. Identify Jobs to Be Done (JTBD)

- What user need or outcome are we solving?

- Not features — outcomes

2. Break JTBD into Topics of Concern

- Each topic = one distinct aspect/component

- Use the "one sentence without 'and'" test

- ✓ "The color extraction system analyzes images to identify dominant colors"

- ✗ "The user system handles authentication, profiles, and billing" → 3 topics

3. Create Specs for Each Topic

- One markdown file per topic in specs/

- Capture requirements, acceptance criteria, edge cases

Template: templates/requirements-interview.md

Phase 2: Planning (Gap Analysis)

Goal: Create a prioritized task list without implementing anything.

Uses PROMPT_plan.md in the loop:

-Study all specs
-Study existing codebase
-Compare specs vs code (gap analysis)
-Generate IMPLEMENTATION_PLAN.md with prioritized tasks
-NO implementation — planning only

Usually completes in 1-2 iterations.

Phase 3: Building (One Task Per Iteration)

Goal: Implement tasks one at a time with fresh context.

Uses PROMPT_build.md in the loop:

1. Read IMPLEMENTATION_PLAN.md

2. Pick the most important task

3. Investigate codebase (don't assume not implemented)

4. Implement

5. Run validation (backpressure)

6. Update plan, commit

7. Exit → fresh context → next iteration

Key insight: One task per iteration keeps context lean. The agent stays in the "smart zone" instead of accumulating cruft. Why fresh context matters:

-No accumulated mistakes — Each iteration starts clean; previous errors don't compound
-Full context budget — 200K tokens for THIS task, not shared with finished work
-Reduced hallucination — Shorter contexts = more grounded responses
-Natural checkpoints — Each commit is a save point; easy to revert single iterations

---

File Structure

project/
├── loop.sh                    # Ralph loop script
├── PROMPT_plan.md             # Planning mode instructions
├── PROMPT_build.md            # Building mode instructions  
├── AGENTS.md                  # Operational guide (~60 lines max)
├── IMPLEMENTATION_PLAN.md     # Prioritized task list (generated)
└── specs/                     # Requirement specs
    ├── topic-a.md
    ├── topic-b.md
    └── ...

File Purposes

| File | Purpose | Who Creates |

|------|---------|-------------|

| specs/*.md | Source of truth for requirements | Human + Phase 1 |

| PROMPT_plan.md | Instructions for planning mode | Copy from template |

| PROMPT_build.md | Instructions for building mode | Copy from template |

| AGENTS.md | Build/test/lint commands | Human + Ralph |

| IMPLEMENTATION_PLAN.md | Task list with priorities | Ralph (Phase 2) |

Project Organization (Systems)

For Clawdbot systems, each Ralph project lives in <workspace>/systems/<name>/:

systems/
├── health-tracker/           # Example system
│   ├── specs/
│   │   ├── daily-tracking.md
│   │   └── test-scheduling.md
│   ├── PROMPT_plan.md
│   ├── PROMPT_build.md
│   ├── AGENTS.md
│   ├── IMPLEMENTATION_PLAN.md  # ← exists = past Phase 1
│   └── src/
└── activity-planner/
    ├── specs/                  # ← empty = still in Phase 1
    └── ...

Phase Detection (Auto)

Detect current phase by checking what files exist:

| What Exists | Current Phase | Next Action |

|-------------|---------------|-------------|

| Nothing / empty specs/ | Phase 1: Requirements | Run requirements interview |

| specs/*.md but no IMPLEMENTATION_PLAN.md | Ready for Phase 2 | Run ./loop.sh plan |

| specs/*.md + IMPLEMENTATION_PLAN.md | Phase 2 or 3 | Review plan, run ./loop.sh build |

| Plan shows all tasks complete | Done | Archive or iterate |

Quick check:

What phase are we in?
[ -z "$(ls specs/ 2>/dev/null)" ] && echo "Phase 1: Need specs" && exit
[ ! -f IMPLEMENTATION_PLAN.md ] && echo "Phase 2: Need plan" && exit
echo "Phase 3: Ready to build (or done)"

---

JTBD Breakdown

The hierarchy matters:

JTBD (Job to Be Done)
└── Topic of Concern (1 per spec file)
    └── Tasks (many per topic, in IMPLEMENTATION_PLAN.md)

Example:

-JTBD: "Help designers create mood boards"
-Topics:

- Image collection → specs/image-collection.md

- Color extraction → specs/color-extraction.md

- Layout system → specs/layout-system.md

- Sharing → specs/sharing.md

-Tasks: Each spec generates multiple implementation tasks

Topic Scope Test

> Can you describe the topic in one sentence without "and"?

If you need "and" or "also", it's probably multiple topics. Split it.

When to split:

-Multiple verbs in the description → separate topics
-Different user personas involved → separate topics
-Could be implemented by different teams → separate topics
-Has its own failure modes → probably its own topic

Example split:

❌ "User management handles registration, authentication, profiles, and permissions"

✅ Split into:
   - "Registration creates new user accounts from email/password"
   - "Authentication verifies user identity via login flow"  
   - "Profiles let users view and edit their information"
   - "Permissions control what actions users can perform"

Counter-example (don't split):

✅ Keep together:
   "Color extraction analyzes images and returns dominant color palettes"

   Why: "analyzes" and "returns" are steps in one operation, not separate concerns.

---

Backpressure Mechanisms

Autonomous loops converge when wrong outputs get rejected. Three layers:

1. Downstream Gates (Hard)

Tests, type-checking, linting, build validation. Deterministic.

In AGENTS.md
Validation
-Tests: npm test
-Typecheck: npm run typecheck
-Lint: npm run lint

2. Upstream Steering (Soft)

Existing code patterns guide the agent. It discovers conventions through exploration.

3. LLM-as-Judge (Subjective)

For subjective criteria (tone, UX, aesthetics), use another LLM call with binary pass/fail.

> Start with hard gates. Add LLM-as-judge for subjective criteria only after mechanical backpressure works.

---

Prompt Structure

Geoffrey's prompts follow a numbered pattern:

| Section | Purpose |

|---------|---------|

| 0a-0d | Orient: Study specs, source, current plan |

| 1-4 | Main instructions: What to do this iteration |

| 999+ | Guardrails: Invariants (higher number = more critical) |

The Numbered Guardrails Pattern

Guardrails use escalating numbers (99999, 999999, 9999999...) to signal priority:

99999. Important: Capture the why in documentation.

999999. Important: Single sources of truth, no migrations.

9999999. Create git tags after successful builds.

99999999. Add logging if needed to debug.

999999999. Keep IMPLEMENTATION_PLAN.md current.

Why this works:

1. Visual prominence — Large numbers stand out, harder to skip

2. Implicit priority — More 9s = more critical (like DEFCON levels in reverse)

3. No collisions — Sparse numbering lets you insert new rules without renumbering

4. Mnemonic — Claude treats these as invariants, not suggestions

The "Important:" prefix is deliberate — it triggers Claude's attention.

Key Language Patterns

Use Geoffrey's specific phrasing — it matters:

-"study" (not "read" or "look at")
-"don't assume not implemented" (critical!)
-"using parallel subagents" / "up to N subagents"
-"only 1 subagent for build/tests" (backpressure control)
-"Ultrathink" (deep reasoning trigger)
-"capture the why"
-"keep it up to date"
-"resolve them or document them"

---

Quick Start

1. Set Up Project Structure

mkdir -p myproject/specs
cd myproject
git init  # Ralph expects git for commits

Copy templates
cp .//templates/PROMPT_plan.md .
cp .//templates/PROMPT_build.md .
cp .//templates/AGENTS.md .
cp .//templates/loop.sh .
chmod +x loop.sh

2. Customize Templates (Required!)

PROMPT_plan.md — Replace [PROJECT_GOAL] with your actual goal:

Before:
ULTIMATE GOAL: We want to achieve [PROJECT_GOAL].

After:
ULTIMATE GOAL: We want to achieve a fully functional mood board app with image upload and color extraction.

PROMPT_build.md — Adjust source paths if not using src/:

Before:
0c. For reference, the application source code is in src/*.

After:
0c. For reference, the application source code is in lib/*.

AGENTS.md — Update build/test/lint commands for your stack.

3. Phase 1: Requirements Gathering (Don't Skip!)

This phase happens WITH the human. Use the interview template:

cat .//templates/requirements-interview.md

The workflow:

1. Discuss the JTBD (Job to Be Done) — outcomes, not features

2. Break into Topics of Concern (each passes the "one sentence" test)

3. Write a spec file for each topic: specs/topic-name.md

4. Human reviews and approves specs

Example output:

specs/
├── image-collection.md
├── color-extraction.md
├── layout-system.md
└── sharing.md

4. Phase 2: Planning

./loop.sh plan

Wait for IMPLEMENTATION_PLAN.md to be generated (usually 1-2 iterations). Review it — this is your task list.

5. Phase 3: Building

./loop.sh build 20  # Max 20 iterations

Watch it work. Add backpressure (tests, lints) as patterns emerge. Check commits for progress.

---

Loop Script Options

./loop.sh              # Build mode, unlimited
./loop.sh 20           # Build mode, max 20 iterations
./loop.sh plan         # Plan mode, unlimited
./loop.sh plan 5       # Plan mode, max 5 iterations

Or use the Node.js wrapper for more control:

node skills/ralph-loops/scripts/ralph-loop.mjs \
  --prompt "./PROMPT_build.md" \
  --model opus \
  --max 20 \
  --done "RALPH_DONE"

---

When to Regenerate the Plan

Plans drift. Regenerate when:

-Ralph is going off track (implementing wrong things)
-Plan feels stale or doesn't match current state
-Too much clutter from completed items
-You've made significant spec changes
-You're confused about what's actually done

Just switch back to planning mode:

./loop.sh plan

Regeneration cost is one Planning loop. Cheap compared to Ralph going in circles.

---

Safety

Ralph requires --dangerously-skip-permissions to run autonomously. This bypasses Claude's permission system entirely.

Philosophy: "It's not if it gets popped, it's when. And what is the blast radius?" Protections:

-Run in isolated environments (Docker, VM)
-Only the API keys needed for the task
-No access to private data beyond requirements
-Restrict network connectivity where possible
-Escape hatches: Ctrl+C stops the loop; git reset --hard reverts uncommitted changes

---

Cost Expectations

|-----------|-------|------------|-----------|

| Generate plan | Opus | 1-2 | $0.50-1.00 |

| Implement simple feature | Opus | 3-5 | $1.00-2.00 |

| Implement complex feature | Opus | 10-20 | $3.00-8.00 |

| Full project buildout | Opus | 50+ | $15-50+ |

Tip: Use Sonnet for simpler tasks where plan is clear. Use Opus for planning and complex reasoning.

---

Real-World Results

From Geoffrey Huntley:

-6 repos generated overnight at YC hackathon
-$50k contract completed for $297 in API costs
-Created entire programming language over 3 months

---

Advanced: Running as Sub-Agent

For long loops, spawn as sub-agent so main session stays responsive:

sessions_spawn({
  task: cd /path/to/project && ./loop.sh build 20


Summarize what was implemented when done.,
  label: "ralph-build",
  model: "opus"
})

Check progress:

sessions_list({ kinds: ["spawn"] })
sessions_history({ label: "ralph-build", limit: 5 })

---

Troubleshooting

Ralph keeps implementing the same thing

-Plan is stale → regenerate with ./loop.sh plan
-Backpressure missing → add tests that catch duplicates

Ralph goes in circles

-Add more specific guardrails to prompts
-Check if specs are ambiguous
-Regenerate plan

Context getting bloated

-Ensure one task per iteration (check prompt)
-Keep AGENTS.md under 60 lines
-Move status/progress to IMPLEMENTATION_PLAN.md, not AGENTS.md

Tests not running

-Check AGENTS.md has correct validation commands
-Ensure backpressure section in prompt references AGENTS.md

---

Edge Cases

Projects Without Git

The loop script expects git for commits and pushes. For projects without version control:

Option 1: Initialize git anyway (recommended)

git init
git add -A
git commit -m "Initial commit before Ralph"

Option 2: Modify the prompts

-Remove git-related guardrails from PROMPT_build.md
-Remove the git push section from loop.sh
-Use file backups instead: add cp -r src/ backups/iteration-$ITERATION/ to loop.sh

Option 3: Use tarball snapshots

Add to loop.sh before each iteration:
tar -czf "snapshots/pre-iteration-$ITERATION.tar.gz" src/

Very Large Codebases

For codebases with 100K+ lines:

-Reduce subagent parallelism: Change "up to 500 parallel Sonnet subagents" to "up to 50" in prompts
-Scope narrowly: Use focused specs that target specific directories
-Add path restrictions: In AGENTS.md, note which directories are in-scope
-Consider workspace splitting: Treat large modules as separate Ralph projects

When Claude CLI Isn't Available

The methodology works with any Claude interface:

Claude API directly:

Replace loop.sh with API calls using curl or a script
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "content-type: application/json" \
  -d '{"model": "claude-sonnet-4-20250514", "max_tokens": 8192, "messages": [...]}'

Alternative agents:

-Aider: aider --opus --auto-commits
-Continue.dev: Use with Claude API key
-Cursor: Composer mode with PROMPT files as context

The key principles (one task per iteration, fresh context, backpressure) apply regardless of tooling.

Non-Node.js Projects

Adapt AGENTS.md for your stack:

|-------|-------|------|------|

Also update path references in prompts (src/* → your source directory).

---

Learn More

-Geoffrey Huntley: https://ghuntley.com/ralph/
-Clayton Farr's Playbook: https://github.com/ClaytonFarr/ralph-playbook
-Geoffrey's Fork: https://github.com/ghuntley/how-to-ralph-wiggum

---

Credits

Built by Johnathan & Q — a human-AI dyad.

-Twitter: [@spacepixel](https://x.com/spacepixel)
-ClawdHub: [clawhub.ai/skills/ralph-loops](https://www.clawhub.ai/skills/ralph-loops)

Launch an agent with Ralph Loops on Termo.

Use this skill View on ClawHub