System Design Space

Updated: March 5, 2026 at 11:16 PM

An Illustrated Guide to AI Agents (short summary)

Primary source

O'Reilly Learning

Early Release page for An Illustrated Guide to AI Agents

An Illustrated Guide to AI Agents

Authors: Jay Alammar, Maarten Grootendorst
Publisher: O'Reilly Media, Inc. (Early Release)
Length: Early Release (still being written)

Jay Alammar and Maarten Grootendorst: a practical guide to AI agents covering memory, tools, planning, reflection, multi-agent coordination, and engineering risks.

An Illustrated Guide to AI Agents (original cover)

Why this is the right follow-up to Hands-On LLM

If Hands-On Large Language Models explains what happens inside the model, this book explains how to build a working agentic system around that model.

Step 1: understand the LLM engine

Transformer internals, tokens, embeddings, inference behavior, and core model limitations.

Step 2: design the agentic application

Memory, tools, planning, reflection, and multi-agent coordination as separate engineering work.

Related chapter

AI Engineering

Production practices for AI systems: evaluation, RAG, agents, and fine-tuning.

What is already available in the book

The book is in Early Release and already covers the core of agent architecture. The currently published chapters are enough to build a coherent mental model of AI agents.

Chapter 1. Introduction

Why an agentic approach is needed, and where a simple LLM call ends and a real system begins.

Chapter components:

  • Definition of an agentic system: the difference between a one-shot LLM call and a loop of planning, acting, and checking.
  • Core components: model, state, orchestrator, tool layer, and observability.
  • Typical scenarios where a chatbot is not enough: multi-step work, API integrations, long-running workflows.
  • Architecture success criteria: reliability, controllability, latency/cost, and reproducibility.
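The loop of planning, acting, and checking that separates an agent from a one-shot LLM call can be sketched in a few lines. This is a minimal illustration, not the book's code: the model call is stubbed out, and the state fields are invented for the example.

```python
# Minimal sketch of an agentic loop: plan -> act -> check, with a step budget.
# `fake_model` stands in for an LLM request; state and trace support observability.

def fake_model(state):
    """Stand-in for an LLM call: proposes the next action from current state."""
    return "finish" if state["progress"] >= 3 else "work"

def run_agent(goal, max_steps=10):
    state = {"goal": goal, "progress": 0, "trace": []}
    for _ in range(max_steps):          # hard stop: cost control and reproducibility
        action = fake_model(state)      # plan: pick the next action
        state["trace"].append(action)   # observability: log every step
        if action == "finish":          # check: stop condition reached
            return state
        state["progress"] += 1          # act: execute the tool / work item
    return state                        # budget exhausted: return best effort

result = run_agent("summarize report")
```

The explicit `max_steps` budget and per-step trace correspond directly to the reliability and reproducibility criteria listed above.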
Chapter 2. Reasoning LLMs

What changes when a model can execute test-time reasoning chains, and how this impacts pipeline design.

Chapter components:

  • Difference between a fluent answer and true reasoning behavior on complex tasks.
  • Test-time reasoning: how deeper reasoning affects quality, response time, and cost.
  • When to choose reasoning models vs a standard generation pipeline.
  • Reasoning quality control: step verification, fallback strategies, and compute limits.
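The "reasoning model vs standard pipeline" decision can be framed as a router with a compute limit and a fallback. The heuristic and model names below are illustrative assumptions, not from the book; a real system would use a trained classifier instead of keyword counting.

```python
# Hypothetical router: send a task to a reasoning model only when estimated
# difficulty justifies the extra latency and cost; otherwise fall back.

def estimate_difficulty(task: str) -> int:
    # Toy heuristic: count multi-step markers in the task text.
    return sum(task.count(w) for w in ("then", "prove", "compare"))

def route(task: str, reasoning_budget_tokens: int = 4000) -> str:
    difficulty = estimate_difficulty(task)
    if difficulty >= 2 and reasoning_budget_tokens > 0:
        return "reasoning-model"     # deeper test-time reasoning, higher cost
    return "standard-model"          # fallback: cheap and fast for simple tasks
```

The budget parameter makes the compute limit explicit: when the reasoning allowance is spent, even hard tasks degrade gracefully to the standard pipeline.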
Chapter 3. Memory

Short-term vs long-term memory, context engineering, and practical state management between agent steps.

Chapter components:

  • Working memory in the context window: what stays in prompt vs what moves outside.
  • Long-term memory: episodic storage of facts, preferences, and execution artifacts.
  • Retrieval policies: relevance, TTL, summarization, deduplication, and context hygiene.
  • Memory risks: sensitive-data leakage, context drift, and quality degradation as state grows.
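The split between verbatim working memory and compressed long-term memory can be sketched as a bounded buffer with summarization on eviction. This is an assumed design, not the book's implementation; the summarization step is stubbed with plain concatenation where a real system would call an LLM.

```python
# Sketch of a working-memory buffer: recent turns stay verbatim in context,
# older turns collapse into a summary to keep token use bounded.
from collections import deque

class Memory:
    def __init__(self, window: int = 3):
        self.recent = deque(maxlen=window)  # short-term: verbatim turns
        self.summary = ""                   # long-term: compressed history

    def add(self, turn: str) -> None:
        if len(self.recent) == self.recent.maxlen:
            evicted = self.recent[0]
            # Stand-in for an LLM summarization call on the evicted turn.
            self.summary = (self.summary + " " + evicted).strip()
        self.recent.append(turn)            # deque drops the oldest turn itself

    def context(self) -> str:
        parts = [f"Summary: {self.summary}"] if self.summary else []
        return "\n".join(parts + list(self.recent))
```

The fixed window is the context-hygiene guardrail: prompt size stays bounded no matter how long the session runs, at the cost of lossy compression of older turns.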
Chapter 4. Tool Usage, Learning, and Protocols

Function calling, external integrations, and interaction protocols (including MCP).

Chapter components:

  • Tool contract design: argument schemas, validation, typing, and clear boundaries.
  • Tool execution cycle: action selection, error handling, retry/idempotency, and post-processing.
  • External system integration via protocols (MCP and related approaches) with explicit trade-offs.
  • Moving from text generation to real-world actions: safety and audit requirements.
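A tool contract with a declared argument schema and validation before execution can be sketched as a small registry. The tool name, schema fields, and decorator are illustrative inventions for this note, not an API from the book or from MCP.

```python
# Sketch of a tool contract: a declared argument schema, validated before the
# tool runs, so malformed model output never reaches the side-effecting code.

TOOLS = {}

def tool(name, schema):
    """Register a function as a callable tool with an argument schema."""
    def wrap(fn):
        TOOLS[name] = {"fn": fn, "schema": schema}
        return fn
    return wrap

@tool("get_weather", schema={"city": str})
def get_weather(city):
    return f"sunny in {city}"   # stub; a real tool would call an external API

def call_tool(name, args):
    spec = TOOLS[name]
    for key, typ in spec["schema"].items():   # validate before any side effect
        if key not in args or not isinstance(args[key], typ):
            raise ValueError(f"bad argument: {key}")
    return spec["fn"](**args)
```

Raising on a bad argument, rather than silently coercing it, keeps the tool boundary auditable: every action either matched its contract or is visible as an error.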
Chapter 5. Planning and Reflection

Task decomposition, plan reassembly, self-critique, and feedback loops for better output quality.

Chapter components:

  • Breaking a complex goal into executable sub-tasks and staged plans.
  • Dynamic plan revision when tools fail, new data arrives, or priorities change.
  • Reflection and self-critique loops to improve reliability and reduce obvious errors.
  • Operational guardrails: budget/timebox, stop conditions, and reflection cost control.
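A reflection loop with a hard cap on model calls can be sketched as follows. Both the critique and the revision are stubbed with trivial string logic; in a real system each would be an LLM call, which is exactly why the round budget matters.

```python
# Sketch of a reflect-and-revise loop with a hard cap on rounds, so reflection
# cost stays bounded even when the critique never fully passes.

def critique(draft: str) -> list:
    """Stand-in for an LLM self-critique call; returns a list of issues."""
    return [] if "conclusion" in draft else ["missing conclusion"]

def revise(draft: str, issues: list) -> str:
    """Stand-in for an LLM revision call that addresses the issues."""
    return draft + " conclusion"

def reflect_loop(draft: str, max_rounds: int = 3) -> str:
    for _ in range(max_rounds):    # guardrail: bounded reflection cost
        issues = critique(draft)
        if not issues:             # stop condition: critique passes
            return draft
        draft = revise(draft, issues)
    return draft                   # budget spent: return best effort
```

Each extra round doubles as both a quality lever and a cost multiplier, which is why `max_rounds` is a product decision, not just a safety net.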
Chapter 6. Multi-Agent Systems

Role separation, multi-agent coordination, and architecture trade-offs (including A2A context).

Chapter components:

  • Role patterns: planner, researcher, executor, reviewer, and handoff rules.
  • Coordination topologies: hub-and-spoke, peer-to-peer, and hierarchical supervisors.
  • State alignment across agents: shared context, messaging protocols, and conflict handling.
  • Key risks: cascading failures, traceability complexity, and infrastructure cost growth.
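The hub-and-spoke topology with role agents sharing state can be sketched as a supervisor routing one context dict through a fixed pipeline. The roles and context keys are invented for illustration; real agents would each wrap an LLM and a tool set.

```python
# Sketch of hub-and-spoke coordination: a supervisor routes a shared context
# through role agents (researcher -> executor -> reviewer) and owns handoffs.

def researcher(ctx):
    ctx["facts"] = ["fact1"]                              # gathers evidence
    return ctx

def executor(ctx):
    ctx["draft"] = f"draft using {len(ctx['facts'])} facts"  # produces output
    return ctx

def reviewer(ctx):
    ctx["approved"] = "draft" in ctx                      # checks the result
    return ctx

def supervisor(task):
    ctx = {"task": task}
    for agent in (researcher, executor, reviewer):  # hub controls the handoffs
        ctx = agent(ctx)                            # shared context is the message bus
    return ctx
```

Because all state passes through the supervisor, this topology is easy to trace; peer-to-peer variants trade that observability for flexibility, which is one of the risks listed above.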

Where engineering risks show up

Memory and context management

Without an explicit memory strategy, relevance drops quickly: latency grows, context drifts, and token costs become unpredictable.

Tools and safe integrations

When tools are connected, the system moves from text generation to action execution. Mistakes in permissions, validation, or idempotency become production risks.
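The idempotency risk mentioned above has a standard mitigation: a client-supplied idempotency key that makes retries safe. The `charge` tool and in-memory dedup store below are hypothetical stand-ins for a real payment API and its server-side state.

```python
# Sketch of an idempotent tool call: a client-supplied key makes retries safe,
# so a timeout-and-retry cannot execute the side effect twice.
import uuid

EXECUTED = {}   # stand-in for the tool's server-side deduplication store

def charge(amount, idempotency_key):
    if idempotency_key in EXECUTED:        # replayed request: return cached result
        return EXECUTED[idempotency_key]
    result = {"charged": amount}           # the side effect happens exactly once
    EXECUTED[idempotency_key] = result
    return result

key = str(uuid.uuid4())
first = charge(10, key)
retry = charge(10, key)   # e.g. the agent retried after a network timeout
```

Without the key, an agent that retries on timeout would double-charge; with it, validation and retry logic become safe defaults rather than production risks.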

Planning depth vs cost

Reflection loops can improve output quality, but they increase model calls. You need strict guardrails on budget, timeouts, and planning depth.

Coordination of multiple agents

A multi-agent setup can separate responsibilities better, but it also complicates observability, error tracing, and global consistency control.

Who should read it and how

Best fit for

  • Engineers and tech leads who design AI features as product systems, not one-off demos.
  • Teams that need a practical mental model of memory, tools, planning, and orchestration.
  • Readers who already understand LLM fundamentals and want to move into agent and multi-agent architecture.

Suggested order

  1. Start with Hands-On Large Language Models to lock in the LLM foundation.
  2. Then read An Illustrated Guide to AI Agents as the architecture layer around the model.
  3. After that, reinforce production practices with AI Engineering and Prompt Engineering for LLMs.

What to study in parallel

In the tool-usage and multi-agent sections, it is useful to compare MCP vs A2A approaches, because protocol choice affects responsibility boundaries and observability of agent systems.



© 2026 Alexander Polomodov