System Design Space

Updated: April 7, 2026 at 9:45 AM

An Illustrated Guide to AI Agents (short summary)

medium

An agent is best understood not as a new kind of magic, but as a system of state, tools, planning, and checks.

The chapter shows where an agentic approach truly pays off and where it only adds orchestration, unpredictability, and operational risk.

For interviews and architecture discussions, it helps move the conversation away from the idea of a "fashionable agent" and toward autonomy boundaries, tool use, failure handling, and the cost of added complexity.

Practical value of this chapter

Agent loop

The book breaks an agent system into state, planning, tools, and checks so you can discuss it as architecture rather than as magic.

Memory and tools

It is a strong guide for explaining why agents need memory, how access to external systems works, and why these layers often break first.

Cost of autonomy

It helps you discuss the trade-offs of autonomy honestly: extra orchestration, cost, unpredictability, and stronger observability requirements.

Interview material

It gives you a clear frame for discussing memory, tool use, planning, multi-agent setups, and safe degradation.

Primary source

O'Reilly Learning

Early Release page for An Illustrated Guide to AI Agents

An Illustrated Guide to AI Agents

Authors: Jay Alammar, Maarten Grootendorst
Publisher: O'Reilly Media, Inc. (Early Release)
Length: Early Release (still being written)

Jay Alammar and Maarten Grootendorst: a practical visual guide to agent systems covering memory, tools, planning, self-checking, multi-agent coordination, and engineering risks.

Why this is the right follow-up to Hands-On LLM

If Hands-On Large Language Models explains how the model itself works, this book explains how to build a working agentic system around it.

Step 1: understand how the LLM works

Transformer internals, tokens, embeddings, inference behavior, and the model's core limitations.

Step 2: design the agentic application

Memory, tools, planning, self-checking, and multi-agent coordination as a distinct engineering problem.

Related chapter

AI Engineering

Production practices for AI systems: evaluation, RAG, agents, and fine-tuning

What is already available in the book

The book is in Early Release, but the published chapters already cover the core of agent architecture. They are enough to build a coherent engineering picture, from memory and tools to planning and multi-agent coordination.

1

Introduction

Why an agentic approach matters and where a one-off LLM call stops being enough.

Chapter components:

  • What should count as an agentic system: the difference between a one-shot LLM call and a loop of planning, acting, and checking.
  • Core components: model, state, orchestrator, tool layer, and observability.
  • Scenarios where a chatbot is no longer enough: multi-step work, API integrations, and longer user-facing processes.
  • Architecture success criteria: reliability, controllability, latency, cost, and reproducibility.
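
The loop of planning, acting, and checking described above can be sketched in a few lines. This is a minimal illustration, not the book's implementation: `AgentState`, `run_agent`, and the deterministic `fake_plan` and `fake_check` functions are hypothetical names that stand in for a real model, and the step budget is an assumed safeguard.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    goal: str
    history: list[str] = field(default_factory=list)  # trace kept for observability
    done: bool = False


def run_agent(
    state: AgentState,
    plan: Callable[[AgentState], str],
    tools: dict[str, Callable[[], str]],
    check: Callable[[AgentState, str], bool],
    max_steps: int = 5,
) -> AgentState:
    """Plan, act, check; repeat until the check passes or the step budget runs out."""
    for step in range(max_steps):
        action = plan(state)                      # planning: pick the next tool
        tool = tools.get(action)
        if tool is None:
            state.history.append(f"step {step}: unknown tool {action!r}")
            continue
        result = tool()                           # acting: call the tool layer
        state.history.append(f"step {step}: {action} -> {result}")
        if check(state, result):                  # checking: decide whether to stop
            state.done = True
            break
    return state


# Hypothetical stand-ins for the model, deterministic so the sketch runs as-is.
def fake_plan(state: AgentState) -> str:
    return "search" if not state.history else "summarize"


def fake_check(state: AgentState, result: str) -> bool:
    return result == "summary ready"


tools = {"search": lambda: "3 documents found", "summarize": lambda: "summary ready"}
final = run_agent(AgentState(goal="answer a question"), fake_plan, tools, fake_check)
print(final.done, len(final.history))
```

A one-shot LLM call corresponds to running the body once with no check. The loop, the explicit state, and the step limit are what make this an agentic system in the sense used here.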

2

Reasoning LLMs

What changes when a model can reason during inference, and how that affects the surrounding architecture.

Chapter components:

  • The difference between a fluent answer and genuine reasoning behavior on harder tasks.
  • How deeper reasoning during inference affects quality, response time, and cost.
  • When reasoning-heavy models are worth it and when a simpler generation path is enough.
  • How to control reasoning quality through step checks, fallback paths, and compute limits.
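
One way to picture step checks, fallback paths, and compute limits together is a small router that tries a cheap path first and escalates only when a check fails. The sketch below is an assumption-laden illustration: `answer_with_budget`, the `fast` and `slow` stand-in models, the token counts, and the `is_number` check are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Answer:
    text: str
    path: str          # "fast", "reasoning", or "fallback"
    tokens_used: int


def answer_with_budget(
    question: str,
    fast_model: Callable[[str], tuple[str, int]],
    reasoning_model: Callable[[str, int], tuple[str, int]],
    step_check: Callable[[str], bool],
    max_reasoning_tokens: int = 2000,
) -> Answer:
    # Simple generation path first: cheap and low latency.
    draft, used = fast_model(question)
    if step_check(draft):
        return Answer(draft, "fast", used)
    # Escalate to the reasoning-heavy path under an explicit compute limit.
    text, extra = reasoning_model(question, max_reasoning_tokens)
    total = used + extra
    if extra <= max_reasoning_tokens and step_check(text):
        return Answer(text, "reasoning", total)
    # Fallback path: refuse rather than return an unchecked answer.
    return Answer("No verified answer; hand off to a human.", "fallback", total)


# Hypothetical stand-ins for real models and a real verifier.
def fast(question: str) -> tuple[str, int]:
    return ("4", 5) if question == "2+2" else ("not sure", 5)


def slow(question: str, limit: int) -> tuple[str, int]:
    return ("391", min(limit, 800))


def is_number(text: str) -> bool:
    return text.isdigit()


easy = answer_with_budget("2+2", fast, slow, is_number)
hard = answer_with_budget("17*23", fast, slow, is_number)
print(easy.path, hard.path, hard.tokens_used)
```

The design choice worth noting is that the check, not the model, decides when extra reasoning is spent, which keeps cost and response time tied to task difficulty.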

3

Memory

Short-term versus long-term memory, context engineering, and state management between agent steps.

Chapter components:

  • What belongs in the context window and what should move into external state.
  • Long-term memory: episodic storage of facts, preferences, and execution artifacts.
  • Retrieval policies: relevance, TTL, summarization, deduplication, and context hygiene.
  • Memory risks: sensitive-data leakage, context drift, and quality loss as state grows.
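
The retrieval policies listed above (relevance, TTL, deduplication) can be sketched as a tiny external memory store. This is only an illustration under stated assumptions: `MemoryStore` and `MemoryItem` are hypothetical names, and word overlap stands in for whatever relevance scoring (for example, embedding similarity) a real system would use.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryItem:
    text: str
    created_at: float
    ttl_seconds: float


class MemoryStore:
    def __init__(self) -> None:
        self._items: dict[str, MemoryItem] = {}

    def add(self, text: str, ttl_seconds: float, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        key = " ".join(text.lower().split())      # normalized key for deduplication
        existing = self._items.get(key)
        if existing is not None:
            existing.created_at = now             # duplicate: refresh instead of storing twice
            existing.ttl_seconds = ttl_seconds
        else:
            self._items[key] = MemoryItem(text, now, ttl_seconds)

    def retrieve(self, query: str, k: int = 2, now: Optional[float] = None) -> list[str]:
        now = time.time() if now is None else now
        query_words = set(query.lower().split())
        scored = []
        for item in self._items.values():
            if now - item.created_at > item.ttl_seconds:
                continue                          # TTL: expired items never reach the context
            overlap = len(query_words & set(item.text.lower().split()))
            if overlap:
                scored.append((overlap, item.text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]   # only the top-k items enter the prompt


store = MemoryStore()
store.add("User prefers metric units", ttl_seconds=3600, now=0)
store.add("user prefers  metric units", ttl_seconds=3600, now=10)  # deduplicated
store.add("Order 1182 shipped", ttl_seconds=60, now=0)
hits = store.retrieve("which units does the user prefer", now=100)
print(hits)
```

Passing `now` explicitly is a convenience for testing; it makes expiry behavior reproducible without waiting on the clock.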

4

Tool Usage, Learning, and Protocols

Tool use, external integrations, and interaction protocols, including MCP.

Chapter components:

  • How to design tool contracts: argument schemas, validation, typing, and responsibility boundaries.
  • The tool execution cycle: selecting an action, handling errors, retry behavior, and post-processing.
  • How MCP and related approaches connect external systems and where their architecture trade-offs live.
  • Why moving from text generation to real-world actions immediately raises the bar for safety and auditability.
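
A tool contract and its execution cycle might look like the following sketch. It is not MCP and not the book's code: `ToolSpec`, `call_tool`, `ToolError`, and the flaky `get_weather` tool are hypothetical, and the choice to retry only on `TimeoutError` is an assumption about which failures count as transient.

```python
from dataclasses import dataclass
from typing import Any, Callable


class ToolError(Exception):
    """Raised when a call violates the tool contract."""


@dataclass
class ToolSpec:
    name: str
    params: dict[str, type]        # argument schema: parameter name -> required type
    run: Callable[..., Any]
    max_retries: int = 2


def call_tool(spec: ToolSpec, args: dict[str, Any]) -> dict[str, Any]:
    # Validation: reject malformed arguments before anything touches the outside world.
    unknown = set(args) - set(spec.params)
    missing = set(spec.params) - set(args)
    if unknown or missing:
        raise ToolError(f"{spec.name}: unknown={sorted(unknown)} missing={sorted(missing)}")
    for key, expected in spec.params.items():
        if not isinstance(args[key], expected):
            raise ToolError(f"{spec.name}: {key} must be {expected.__name__}")
    # Execution with bounded retries; a real system would add backoff between attempts.
    last_error = ""
    for attempt in range(spec.max_retries + 1):
        try:
            result = spec.run(**args)
            return {"ok": True, "result": result, "attempts": attempt + 1}
        except TimeoutError as exc:               # assumed transient, so worth retrying
            last_error = str(exc)
    return {"ok": False, "error": last_error, "attempts": spec.max_retries + 1}


# Hypothetical tool that times out on its first call only.
calls = {"n": 0}


def get_weather(city: str) -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("upstream timeout")
    return f"Sunny in {city}"


weather = ToolSpec("get_weather", {"city": str}, get_weather)
outcome = call_tool(weather, {"city": "Oslo"})
print(outcome)
```

Returning a structured result instead of raising on exhausted retries lets the agent decide how to post-process the failure, while contract violations still fail loudly.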

5

Planning and Reflection

Task decomposition, plan revision, and self-checking as a separate quality layer.

Chapter components:

  • How to break a complex goal into executable sub-tasks and staged plans.
  • When plans need to be revised because new data arrives, tools fail, or priorities change.
  • How reflection and self-checking loops can improve quality and reduce obvious mistakes.
  • Which budget, timeout, and loop-depth limits keep planning from becoming too expensive.
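
Plan revision under hard limits can be illustrated with a short executor. The names `execute_plan`, `run_step`, and `revise` are hypothetical, the step functions are deterministic stand-ins for model and tool calls, and the specific limits (two revisions, a five-second timeout) are arbitrary assumptions.

```python
import time
from typing import Callable


def execute_plan(
    steps: list[str],
    run_step: Callable[[str], bool],
    revise: Callable[[str, list[str]], list[str]],
    max_revisions: int = 2,
    timeout_seconds: float = 5.0,
) -> dict:
    """Run sub-tasks in order; on failure, replan the remaining work within hard limits."""
    deadline = time.monotonic() + timeout_seconds
    queue = list(steps)
    completed: list[str] = []
    revisions = 0
    while queue:
        if time.monotonic() > deadline:           # timeout limit
            return {"status": "timeout", "completed": completed, "revisions": revisions}
        step = queue.pop(0)
        if run_step(step):
            completed.append(step)
            continue
        if revisions >= max_revisions:            # loop-depth limit on replanning
            return {"status": "gave_up", "completed": completed, "revisions": revisions}
        revisions += 1
        queue = revise(step, queue)               # plan revision after a failed step
    return {"status": "done", "completed": completed, "revisions": revisions}


# Hypothetical stand-ins: one step always fails, and the revision swaps in a fallback.
def run_step(step: str) -> bool:
    return step != "call flaky API"


def revise(failed: str, remaining: list[str]) -> list[str]:
    return ["use cached data"] + remaining


plan = ["collect requirements", "call flaky API", "write report"]
result = execute_plan(plan, run_step, revise)
print(result["status"], result["completed"], result["revisions"])
```

If `revise` kept returning the same failing step, the executor would stop with `gave_up` after two revisions rather than loop indefinitely.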

6

Multi-Agent Systems

Role separation, multi-agent coordination, and the price of added architectural complexity, including A2A scenarios.

Chapter components:

  • Role patterns: planner, researcher, executor, reviewer, and handoff rules between them.
  • Coordination topologies: a central coordinator, peer-to-peer collaboration, and hierarchical supervision.
  • State alignment across agents: shared context, messaging protocols, and conflict handling.
  • Key risks: cascading failures, difficult traceability, and higher infrastructure cost.
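
The central-coordinator topology with explicit handoffs can be sketched as follows. This is a toy under stated assumptions, not A2A or any real framework: `Coordinator`, `Message`, the role functions, and the `trace-1` identifier are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Message:
    trace_id: str      # shared id so one request can be followed across agents
    sender: str
    content: str


@dataclass
class Coordinator:
    agents: dict[str, Callable[[str], str]]     # role name -> agent function
    pipeline: list[str]                         # handoff order between roles
    log: list[Message] = field(default_factory=list)

    def run(self, task: str, trace_id: str) -> Optional[str]:
        content = task
        for role in self.pipeline:
            try:
                content = self.agents[role](content)
            except Exception as exc:
                # Stop at the first failure so bad state does not cascade downstream.
                self.log.append(Message(trace_id, role, f"FAILED: {exc}"))
                return None
            self.log.append(Message(trace_id, role, content))
        return content


# Hypothetical role functions standing in for model-backed agents.
agents = {
    "planner": lambda text: f"plan({text})",
    "researcher": lambda text: f"notes({text})",
    "reviewer": lambda text: f"approved({text})",
}
coordinator = Coordinator(agents, ["planner", "researcher", "reviewer"])
output = coordinator.run("compare MCP and A2A", trace_id="trace-1")
print(output)
print([message.sender for message in coordinator.log])
```

Routing every handoff through one coordinator keeps the message log in a single place, which addresses the traceability risk at the price of a central bottleneck.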

Where engineering risks show up

Memory and context management

Without an explicit strategy for storing and refreshing state, the context the agent works with loses relevance quickly. Latency grows, context drifts, and token costs become unpredictable.

Tools and safe integrations

Once tools are connected, the system moves from text generation to action execution. Mistakes in permissions, validation, or retry logic become risks for a live system.
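
A permission check in front of the tool layer is one way to contain such mistakes. The sketch below assumes a simple allowlist model: `ToolPolicy`, `authorize`, the tool names, and the in-memory `audit_log` are hypothetical, and a real system would persist the audit trail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    allowed: frozenset[str]                          # tools this agent may call at all
    needs_approval: frozenset[str] = frozenset()     # tools that require a human sign-off


audit_log: list[dict[str, str]] = []


def authorize(agent: str, tool: str, policy: ToolPolicy, approved: bool = False) -> str:
    if tool not in policy.allowed:
        decision = "deny"                            # default-deny for anything not listed
    elif tool in policy.needs_approval and not approved:
        decision = "pending_approval"
    else:
        decision = "allow"
    # Every decision is recorded, including denials, so actions stay auditable.
    audit_log.append({"agent": agent, "tool": tool, "decision": decision})
    return decision


policy = ToolPolicy(
    allowed=frozenset({"read_ticket", "refund_order"}),
    needs_approval=frozenset({"refund_order"}),
)
decisions = [
    authorize("support-agent", "read_ticket", policy),
    authorize("support-agent", "refund_order", policy),
    authorize("support-agent", "refund_order", policy, approved=True),
    authorize("support-agent", "delete_user", policy),
]
print(decisions)
```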

Planning depth vs cost

Self-checking loops can improve output quality, but they also increase model calls. That means you need hard limits on budget, timeouts, and planning depth.
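
A rough worst-case estimate shows why the limits matter. The arithmetic below assumes each reflection round costs two extra model calls (one critique, one revision); the function names, token counts, and price are hypothetical figures chosen only to make the calculation concrete.

```python
def worst_case_calls(plan_steps: int, reflections_per_step: int) -> int:
    # Per step: one generation call, plus a critique and a revision per reflection round.
    return plan_steps * (1 + 2 * reflections_per_step)


def worst_case_cost(
    plan_steps: int,
    reflections_per_step: int,
    tokens_per_call: int,
    usd_per_1k_tokens: float,
) -> float:
    calls = worst_case_calls(plan_steps, reflections_per_step)
    return calls * tokens_per_call / 1000 * usd_per_1k_tokens


no_reflection = worst_case_calls(plan_steps=4, reflections_per_step=0)
with_reflection = worst_case_calls(plan_steps=4, reflections_per_step=2)
cost = worst_case_cost(4, 2, tokens_per_call=1500, usd_per_1k_tokens=0.01)
print(no_reflection, with_reflection, f"{cost:.2f}")
```

Under these assumptions, two reflection rounds turn 4 model calls into 20, so a cap on planning depth is a direct cap on spend.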

Coordination of multiple agents

A multi-agent setup can separate responsibilities better, but it also makes observability, error tracing, and consistency control more difficult.

Who should read it and how

Best fit for

  • Engineers and tech leads who design AI features as product systems rather than one-off demos.
  • Teams that need a practical engineering picture of memory, tools, planning, and orchestration.
  • Readers who already understand LLM fundamentals and want to move into agent and multi-agent architecture.

Suggested order

  1. Start with Hands-On Large Language Models to lock in the LLM foundation.
  2. Then read An Illustrated Guide to AI Agents as the architectural layer around the model.
  3. After that, reinforce production practices with AI Engineering and Prompt Engineering for LLMs.

What to study in parallel

In the tool-usage and multi-agent sections, it is useful to compare MCP and A2A directly, because protocol choice affects responsibility boundaries, observability, and overall system resilience.
