What an AI agent actually is
The whole lesson in one sentence: an AI agent is a model calling tools in a loop until the job is done. Everything else is detail.
The definition that survives contact with reality
Strip away the marketing and every serious definition lands in the same place:
- Anthropic: "models using tools in a loop."
- Simon Willison: "an LLM agent runs tools in a loop to achieve a goal."
- Solomon Hykes, less reassuring: "an LLM wrecking its environment in a loop."
Three people, three reputations, one mechanism. If you remember nothing else from this course, remember the shape:
model → tool → result → model → tool → result → … → done.
This isn't a simplification you'll outgrow. Anthropic ships coding agents, computer-use agents, and search agents that feel completely different to a user, and run almost the same code underneath. The product surface changes. The loop doesn't.
The three parts you actually design
An agent has exactly three moving parts. Two of them are yours.
| Part | What it is | Who decides |
|---|---|---|
| Environment | The system the agent operates in | The use case usually decides for you |
| Tools | How the agent acts and gets feedback | You |
| System prompt | The goal, the constraints, what "good" looks like | You |
Notice what's not on this list: orchestration frameworks, planners, multi-agent hierarchies. None of that is the agent. The agent is the loop plus these three inputs. When you design an agent, you are really only designing two things well: the tools it can reach for, and the instructions that tell it why.
The loop, in six lines
First the shape, then the code. Here is the whole thing as a picture:
And the same thing, framework-free, in code:
while not done:
response = llm(messages + tools)
messages.append(response)
if response.tool_calls:
for call in response.tool_calls:
result = run_tool(call.name, call.args)
messages.append(result)
else:
done = True
Read it once and the mystery drains out. The model gets the conversation so far plus a list of tools it's allowed to use. It either calls a tool (so you run it and hand back the result) or it stops. That's the "hello world" of agent engineering, Thomas Ptacek's words, not mine.
Why the agent thinks out loud (ReAct)
The loop has one gap: how does the model decide which tool to call? The answer is a pattern called ReAct (Yao et al., 2023). On every step it interleaves three things:
- Thought — plain-language reasoning: "that search returned nothing, let me try a synonym."
- Action — the tool call.
- Observation — what the tool returned, fed back into the context.
In a real trace it looks like this:
Thought: I need the weather before I can answer.
Action: get_weather("Amsterdam")
Observation: 18°C, cloudy.
Thought: Now I can respond.
Action: finish("Bring a jacket.")
This is why a good agent trace reads like someone narrating their own work. The thinking isn't decoration, it's what lets the agent recover ("plan A failed, try B") without you hard-coding a planner. Almost every modern agent runs on ReAct or a close cousin.
It's smaller than the hype suggests
A complete coding agent needs about five primitives: read a file, list files, run a shell command, edit a file, and search code. That's it. To put a number on it: mini-swe-agent solves 68% of SWE-bench Verified in 100 lines of code.
So what do LangGraph, CrewAI and friends add? Abstractions on top of this core. Sometimes useful, but Anthropic's own warning is worth tattooing somewhere: frameworks "can obscure the underlying prompts and responses, making them harder to debug." Start at the loop. Add a framework layer only when the pain is concrete, not because the README looks nice.
The question most people skip: should this even be an agent?
Here's the unglamorous part, and the most valuable thing in this lesson. The default answer is no, build a workflow.
A workflow is a decision tree you draw and wire up explicitly. Predictable cost, bounded latency, every branch visible. An agent decides its own path at runtime. More capable, but the cost, the latency, and the consequences of a mistake all go up with that freedom.
So the real decision looks like this:
Everything falls to "workflow" unless all four gates say yes:
- Complex — the problem space is genuinely ambiguous. If you can draw the full decision tree, just build the tree.
- Valuable — the task is worth the tokens an agent burns exploring.
- Capable — the model can actually perform the critical sub-skills.
- Catchable — you can discover its mistakes (tests, CI, human review).
Coding hits all four (tests and CI make errors cheap to find), which is precisely why coding agents work and "an AI agent for everything" doesn't. Don't build an agent because agents are cool. Build one because the decision tree is too big to draw.
A trick you'll use constantly: think like your agent
While the model is reasoning and tools are running, "this is basically equivalent to us closing our eyes for three to five seconds and using the computer in the dark." Every step runs on a narrow slice of context, maybe 10–20k tokens. When your agent does something dumb, don't assume it's broken. Put yourself in its context window and ask: given only what it could see at that step, was the dumb move actually reasonable? Nine times out of ten, the fix is more context or a clearer tool, not a smarter model.
Takeaways
- An agent is a model using tools in a loop. Memorize the shape.
- You design two things: the tools and the system prompt. The loop is boilerplate.
- ReAct (thought → action → observation) is how the loop steers itself.
- It's small — hundreds of lines, not thousands. Frameworks are optional sugar.
- Workflow first. Only go agent when it's complex, valuable, capable and catchable.
Where this goes next
This lesson is the foundation: the loop, the three parts, and the discipline to not reach for an agent when a workflow will do. The chapters that follow build straight up from these six lines, into the harness that runs the loop in production, context engineering, memory, tools, and permissions. More lessons are on the way.