Skip to main content

The harness

The whole lesson in one sentence: the harness is the wrapper code around the loop that turns a stateless model call into a running agent, and it stays surprisingly small.

What the harness is, and what it is not

Lesson 1 gave you the loop: model to tool to result to model, until done. That loop is six lines. It is also not enough to run anything real. The harness is everything wrapped around those six lines.

It is not the model. It is not the tools. It is the glue between them. Anthropic's engineering team and the wiki summary land on the same five jobs:

  • Agent loop: how many times you call the model and how you feed tool results back in.
  • State management: session state, conversation history, working memory.
  • Tool dispatch: which tool calls run, in what order, with what retries.
  • Context management: what goes into the window, what comes out, what gets cached.
  • Guardrails: scoped permissions, rate limits, safety checks.

Lesson 1 was the engine. This lesson is the car around it.

The split that makes the rest obvious: brain and hands

Before you build any of those five pieces, draw one line. Anthropic's term for it, from the Claude Managed Agents work, is brain and hands.

The brain is the model plus the harness: the reasoning and the control logic. It decides what should happen. The hands are the sandboxes, the tool execution, the MCP endpoints. They do the thing and touch the real world.

This is not decoration. It tells you where each concern goes:

LayerWhatJob
BrainModel + harness (loop, context, orchestration)Decide what to do
HandsSandboxes, tool execution, MCP endpointsExecute, hit the real world

The payoffs are concrete. Code runs in isolated containers, not in the same process as the model caller, so a bad command wrecks a sandbox and nothing else. As Solomon Hykes put it, an agent is "an LLM wrecking its environment in a loop." You want that environment to be disposable. Credentials live in a vault, not in the context window. To quote the source directly: "access tokens aren't actually within sandboxes nor are they available to the LLM." And you can swap the hands (Daytona, E2B, Modal) without the brain noticing, or scale the brain per session and the hands per tool load, independently.

When you are deciding where a piece of code goes, ask one question: is this deciding, or is this doing? Deciding goes in the brain. Doing goes in the hands.

State, and why you write it to disk

The loop in lesson 1 had no memory beyond the growing message list. That works until the process stops. State management is what survives.

There are two different things people call state, and conflating them causes bugs. One is the agent's memory: what it knows, what it learned. The other is run-state: what persists between separate runs of the agent. Artifacts, PR branches, workflow status. Run-state is the operational layer, and it raises three plain questions, from the run-state research:

  • Observability: how do you see what a run did when you were not watching?
  • State between runs: what survives from one invocation to the next?
  • Secrets: how do credentials reach the runner without leaking through git or logs?

The convergent answer across serious agents is unglamorous: write it to a file. OpenClaw keeps MEMORY.md and memory/YYYY-MM-DD.md. Claude Code keeps CLAUDE.md. Anthropic's memory tool writes to a sandboxed /memories directory. The rule underneath all of them: the model remembers only what gets written to disk. No hidden state. Editable, versionable with git, readable by both humans and the model. Vector stores did not vanish, they got demoted from primary storage to a derived index.

For run-state specifically, the same instinct holds. The cleanest pattern for a scheduled agent is to commit its output to a runs/ subtree in the repo: a git-diffable audit trail, free, unlimited retention, and the agent can read its own prior runs with a plain Bash call. Observability follows the same shape. Claude Code drops a plaintext JSONL transcript per session under ~/.claude/projects/, one JSON object per line: every message, tool call, result, and token count. Your forensic trail is a file you can grep.

Context management and guardrails: the loop's two failure modes

The loop fails in two ways, and two harness jobs exist to catch them.

It fails by filling up. Every turn adds tokens. Anthropic calls the result context rot: as the window fills, the model gets worse. Context management is the countermeasure. You retrieve just-in-time instead of dumping everything in. You compact long trajectories into summaries. You isolate sub-agent work so it does not pollute the main thread. A concrete warning from the anatomy source: install too many MCP servers and their tool descriptions alone can eat 76k of a 200k window before the agent has done anything. The harness curates the window. It does not just append to it.

It fails by doing damage. An agent in a loop, given a bash tool, can delete things. Guardrails are the countermeasure: scoped permissions on tool calls, rate limits, sandboxed execution. This is exactly why the brain-and-hands split exists. The dangerous part runs in the hands, where you can fence it. Steinberger's own security advice after 135,000 OpenClaw instances were found publicly exposed: "If you make sure you are the only person talking to it, the risk profile is much smaller." Guardrails are not paranoia. They are the difference between a tool that runs and a credential leak.

It is still small

Here is the part people do not expect. With all five jobs in it, the harness stays tiny.

Thorsten Ball wrote a working code-editing agent in under 400 lines of Go, and most of that was boilerplate. Geoffrey Huntley's mantra is the 300-line agent: five primitives are enough, read_file, list_files, bash, edit_file with old-string and new-string, and code_search over ripgrep. mini-swe-agent solves 68% of SWE-bench Verified in 100 lines. Thomas Ptacek of Fly.io calls this "the new 'hello world' of AI engineering."

So when a framework offers you a harness, understand the trade. A framework is a reusable harness; a managed platform is a hosted one. They save you the boilerplate, but Anthropic's own guidance is that they obscure the prompts and the loop. You can write the whole thing yourself in an afternoon. Start there, and reach for a framework only when you know exactly which of the five jobs you are paying it to do.

One more thing the harness is not: finished. Anthropic's position is that a harness can never stay static. It has to track the model's capabilities. As models get better at something, the harness should stop micromanaging that something. A harness tuned for Claude 3 may underperform on Claude 5. Harness engineering is a living job, not a one-time build.

Takeaways

  • The harness is five jobs around the loop: loop, state, tool dispatch, context, guardrails. Nothing more.
  • Split brain (decide) from hands (do). Dangerous execution and secrets live in the hands.
  • State means files. The model remembers only what gets written to disk.
  • The two loop failures are filling up (context rot) and doing damage (no guardrails). Each has a harness job to catch it.
  • It stays small: under 400 lines of Go, the 300-line mantra, 68% in 100 lines. Write it before you import it.

Where this goes next

You now have the wrapper, but the hardest of its five jobs got only a paragraph here: deciding what goes into the model's window and what stays out. That is the difference between an agent that stays sharp over a long task and one that rots halfway through. The next lesson, "Context engineering", is about exactly that: treating the context window as the scarce resource it is, and managing it on purpose.