System Design Space

Updated: March 24, 2026 at 2:56 PM

An Illustrated Guide to AI Agents (short summary)


Agents are best understood not as a new kind of magic, but as a system of memory, tools, planning loops, and checks.

The chapter shows where an agentic approach truly pays off and where it makes the product harder because of extra orchestration, unpredictability, and operational risk.

In design reviews, it helps shift the discussion from "should we use an agent?" to concrete questions: autonomy boundaries, tool use, failure handling, and the cost of added complexity.

Practical value of this chapter

Design in practice

Translate guidance on agent systems, tool use, and autonomous behavior control into architecture decisions for data flow, model serving, and quality control points.

Decision quality

Evaluate system quality through both model and platform metrics: precision/recall, latency, drift, cost, and operational risk.

Interview articulation

Frame answers as data -> model -> serving -> monitoring, showing where constraints appear and how you manage them.

Trade-off framing

Make trade-offs explicit for agent systems, tool use, and autonomous behavior control: experiment speed, quality, explainability, resource budget, and maintenance complexity.

Primary source

O'Reilly Learning

Early Release page for An Illustrated Guide to AI Agents


An Illustrated Guide to AI Agents

Authors: Jay Alammar, Maarten Grootendorst
Publisher: O'Reilly Media, Inc. (Early Release)
Length: Early Release (in progress)

Jay Alammar and Maarten Grootendorst: a practical guide to AI agents - memory, tools, planning, reflection, multi-agent coordination, and engineering risks.


Why this is the right follow-up to Hands-On LLM

If Hands-On Large Language Models explains what happens inside the model, this book explains how to build a working agentic system around that model.

Step 1: understand the LLM engine

Transformer internals, tokens, embeddings, inference behavior, and core model limitations.

Step 2: design the agentic application

Memory, tools, planning, reflection, and multi-agent coordination as separate engineering work.

Related chapter

AI Engineering

Production practices for AI systems: evaluation, RAG, agents, finetuning


What is already available in the book

The book is in Early Release and already covers the core of agent architecture. The currently published chapters are enough to build a coherent mental model of AI agents.

1. Introduction

Why an agentic approach is needed, and where a simple LLM call ends and a real system begins.

Chapter components:

  • Definition of an agentic system: the difference between a one-shot LLM call and a loop of planning, acting, and checking.
  • Core components: model, state, orchestrator, tool layer, and observability.
  • Typical scenarios where a chatbot is not enough: multi-step work, API integrations, long-running workflows.
  • Architecture success criteria: reliability, controllability, latency/cost, and reproducibility.
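The difference between a one-shot LLM call and an agentic loop can be sketched in a few lines. This is a toy illustration, not a real implementation: `call_llm` is a hypothetical stand-in for an LLM client, and the `check` function is a placeholder for whatever acceptance test a real system would run.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a real LLM client (assumption for illustration).
def call_llm(prompt: str) -> str:
    return f"draft for: {prompt}"

@dataclass
class AgentState:
    goal: str
    history: list = field(default_factory=list)
    done: bool = False

def check(state: AgentState) -> bool:
    # Toy acceptance test; a real agent would verify tool results or run tests.
    return len(state.history) >= 3

def one_shot(goal: str) -> str:
    # A single call: no state, no checking, no retry.
    return call_llm(goal)

def agent_loop(goal: str, max_steps: int = 5) -> AgentState:
    # Loop of planning, acting, and checking, with an explicit stop condition.
    state = AgentState(goal=goal)
    for step in range(max_steps):
        action = call_llm(f"{goal} (step {step})")  # plan + act
        state.history.append(action)                # accumulate state
        if check(state):                            # check before continuing
            state.done = True
            break
    return state
```

The point of the sketch is structural: the loop owns state, a stop condition, and a step budget, which is exactly where "a simple LLM call ends and a real system begins."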
2. Reasoning LLMs

What changes when a model can execute test-time reasoning chains, and how this impacts pipeline design.

Chapter components:

  • Difference between a fluent answer and true reasoning behavior on complex tasks.
  • Test-time reasoning: how deeper reasoning affects quality, response time, and cost.
  • When to choose reasoning models vs a standard generation pipeline.
  • Reasoning quality control: step verification, fallback strategies, and compute limits.
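The "reasoning model vs standard pipeline" choice is, at heart, a routing decision under a compute budget. A minimal sketch, assuming toy model names and a deliberately naive complexity heuristic (neither comes from the book):

```python
# Hypothetical router between a fast model and a reasoning model.
# Model names and the complexity heuristic are illustrative assumptions.

def estimate_complexity(task: str) -> int:
    # Toy heuristic: questions and multi-step phrasing suggest harder tasks.
    return task.count("?") + task.lower().count("then")

def pick_model(task: str, reasoning_budget_tokens: int = 8000) -> dict:
    if estimate_complexity(task) >= 2:
        # Deeper test-time reasoning: better quality, higher latency and cost,
        # so the token budget acts as an explicit compute limit.
        return {"model": "reasoning-model", "max_tokens": reasoning_budget_tokens}
    # Standard generation pipeline for simple tasks.
    return {"model": "fast-model", "max_tokens": 1000}
```

A production router would use a trained classifier or past-task statistics instead of string heuristics, but the shape of the trade-off (quality vs response time and cost, capped by a budget) stays the same.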
3. Memory

Short-term vs long-term memory, context engineering, and practical state management between agent steps.

Chapter components:

  • Working memory in the context window: what stays in prompt vs what moves outside.
  • Long-term memory: episodic storage of facts, preferences, and execution artifacts.
  • Retrieval policies: relevance, TTL, summarization, deduplication, and context hygiene.
  • Memory risks: sensitive-data leakage, context drift, and quality degradation as state grows.
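The retrieval policies above (relevance, TTL, deduplication) can be shown in a compact sketch. The word-overlap scoring is a toy assumption standing in for a real retrieval model, and the class itself is illustrative, not an API from the book:

```python
import time

# Minimal long-term memory sketch: relevance ranking, TTL expiry, dedup.
class MemoryStore:
    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self.items = {}  # text -> timestamp; dict keys deduplicate entries

    def add(self, text: str, now: float = None):
        self.items[text] = time.time() if now is None else now

    def retrieve(self, query: str, k: int = 3, now: float = None):
        now = time.time() if now is None else now
        # TTL: drop expired entries to keep state from growing unboundedly.
        live = {t: ts for t, ts in self.items.items() if now - ts <= self.ttl}
        self.items = live
        # Relevance: rank surviving entries by word overlap with the query.
        qwords = set(query.lower().split())
        scored = sorted(
            live,
            key=lambda t: len(qwords & set(t.lower().split())),
            reverse=True,
        )
        return scored[:k]
```

Even this toy version makes the risk section concrete: without the TTL and ranking steps, everything that was ever added would flow back into the prompt, which is exactly the latency, drift, and cost failure mode described below.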
4. Tool Usage, Learning, and Protocols

Function calling, external integrations, and interaction protocols (including MCP).

Chapter components:

  • Tool contract design: argument schemas, validation, typing, and clear boundaries.
  • Tool execution cycle: action selection, error handling, retry/idempotency, and post-processing.
  • External system integration via protocols (MCP and related approaches) with explicit trade-offs.
  • Moving from text generation to real-world actions: safety and audit requirements.
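A tool contract with validation, retries, and idempotency can be sketched as follows. The registry layout and schema shape are assumptions for illustration; real systems typically use JSON Schema and a proper idempotency store rather than an in-process set:

```python
# Sketch of a tool contract: typed argument schemas, validation, retries,
# and an idempotency key guarding against duplicated side effects.

TOOLS = {}
_DONE_KEYS = set()  # idempotency keys of already-executed calls

def register_tool(name, schema, fn):
    TOOLS[name] = {"schema": schema, "fn": fn}

def call_tool(name, args, retries=2, idempotency_key=None):
    tool = TOOLS[name]
    # Validate arguments against the declared schema before executing:
    # a clear boundary between text generation and real-world actions.
    for field, ftype in tool["schema"].items():
        if field not in args or not isinstance(args[field], ftype):
            raise ValueError(f"bad argument: {field}")
    # Idempotency: skip side effects already performed under this key.
    if idempotency_key is not None and idempotency_key in _DONE_KEYS:
        return {"status": "skipped"}
    last_err = None
    for _ in range(retries + 1):
        try:
            result = tool["fn"](**args)
            if idempotency_key is not None:
                _DONE_KEYS.add(idempotency_key)
            return {"status": "ok", "result": result}
        except Exception as exc:
            last_err = exc  # retry on transient failure
    raise last_err
```

Validation before execution, and idempotency before retries, is the ordering that keeps a retried action from running a side effect twice.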
5. Planning and Reflection

Task decomposition, plan reassembly, self-critique, and feedback loops for better output quality.

Chapter components:

  • Breaking a complex goal into executable sub-tasks and staged plans.
  • Dynamic plan revision when tools fail, new data arrives, or priorities change.
  • Reflection and self-critique loops to improve reliability and reduce obvious errors.
  • Operational guardrails: budget/timebox, stop conditions, and reflection cost control.
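The reflection loop with a call budget and a stop condition can be sketched in miniature. Here `draft`, `critique`, and `revise` are toy stand-ins for LLM calls; the budget counter is the operational guardrail from the last bullet:

```python
# Reflection sketch: draft, critique, revise, under an explicit call budget.
# All three functions are illustrative stand-ins for real LLM calls.

def draft(task):
    return f"answer to {task}"

def critique(answer):
    # Toy self-critique: flag answers that look underdeveloped.
    return "too short" if len(answer) < 25 else "ok"

def revise(answer, feedback):
    return answer + " (expanded after feedback)"

def reflect(task, max_calls=4):
    calls = 1
    answer = draft(task)
    while calls < max_calls:              # budget guardrail on model calls
        feedback = critique(answer)
        calls += 1
        if feedback == "ok":              # stop condition: critique passes
            break
        answer = revise(answer, feedback)
        calls += 1
    return answer, calls
```

The shape matters more than the toy checks: every pass through the loop costs model calls, so the budget and the stop condition are part of the design, not an afterthought.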
6. Multi-Agent Systems

Role separation, multi-agent coordination, and architecture trade-offs (including A2A context).

Chapter components:

  • Role patterns: planner, researcher, executor, reviewer, and handoff rules.
  • Coordination topologies: hub-and-spoke, peer-to-peer, and hierarchical supervisors.
  • State alignment across agents: shared context, messaging protocols, and conflict handling.
  • Key risks: cascading failures, traceability complexity, and infrastructure cost growth.
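A hub-and-spoke topology with the planner/researcher/executor/reviewer roles can be sketched as a supervisor that owns shared context and the handoff order. All role functions here are illustrative stand-ins for LLM-backed agents:

```python
# Hub-and-spoke sketch: a supervisor routes one task through role agents
# over a shared context dict. Roles are toy stand-ins for real agents.

def planner(ctx):
    ctx["plan"] = ["research", "execute", "review"]
    return ctx

def researcher(ctx):
    ctx["notes"] = f"facts about {ctx['task']}"
    return ctx

def executor(ctx):
    ctx["output"] = f"report using {ctx['notes']}"
    return ctx

def reviewer(ctx):
    # Handoff rule: the reviewer gates the final result.
    ctx["approved"] = "report" in ctx.get("output", "")
    return ctx

ROLES = {"planner": planner, "researcher": researcher,
         "executor": executor, "reviewer": reviewer}

def supervise(task):
    # Hub: the supervisor owns shared state and the handoff order,
    # which is what makes tracing and consistency control tractable.
    ctx = {"task": task}
    for role in ["planner", "researcher", "executor", "reviewer"]:
        ctx = ROLES[role](ctx)
    return ctx
```

Centralizing state in the hub is one answer to the "state alignment" bullet; peer-to-peer topologies trade that single point of control for looser coupling and harder tracing.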

Where engineering risks show up

Memory and context management

Without an explicit memory strategy, relevance drops quickly: latency grows, context drifts, and token costs become unpredictable.

Tools and safe integrations

When tools are connected, the system moves from text generation to action execution. Mistakes in permissions, validation, or idempotency become production risks.

Planning depth vs cost

Reflection loops can improve output quality, but they increase model calls. You need strict guardrails on budget, timeouts, and planning depth.

Coordination of multiple agents

A multi-agent setup can separate responsibilities better, but it also complicates observability, error tracing, and global consistency control.

Who should read it and how

Best fit for

  • Engineers and tech leads who design AI features as product systems, not one-off demos.
  • Teams that need a practical mental model of memory, tools, planning, and orchestration.
  • Readers who already understand LLM fundamentals and want to move into agent and multi-agent architecture.

Suggested order

  1. Start with Hands-On Large Language Models to lock in the LLM foundation.
  2. Then read An Illustrated Guide to AI Agents as the architecture layer around the model.
  3. After that, reinforce production practices with AI Engineering and Prompt Engineering for LLMs.

What to study in parallel

In the tool-usage and multi-agent sections, it is useful to compare MCP vs A2A approaches, because protocol choice affects responsibility boundaries and observability of agent systems.
