System Design Space

Updated: February 21, 2026 at 11:59 PM

AI Engineering (short summary)

Difficulty: hard

Source

Book cube

Post with an overview of the series about AI Engineering.


AI Engineering

Author: Chip Huyen
Publisher: O'Reilly Media, Inc.
Length: 534 pages

Chip Huyen on building AI applications: foundation models, prompting, RAG, agents, finetuning, evaluation and production practices.


AI stack by Chip Huyen

Foundation Models

GPT-4, Claude, Gemini, Llama: choosing a model and understanding its capabilities

Prompting & Context

Prompt engineering, few-shot, chain-of-thought, system prompts

RAG & Knowledge

Retrieval-Augmented Generation, vector stores, embeddings

Agents & Tools

Autonomous agents, function calling, orchestration

Finetuning & Adaptation

SFT, RLHF, LoRA, dataset engineering

Related chapter

Hands-On Large Language Models

A visual introduction to LLMs: tokenization, embeddings, transformers


Key ideas of the book

AI engineering ≠ ML engineering

Model-as-a-service has lowered the entry barrier. AI engineering means building applications based on foundation models, rather than training models from scratch.

Evaluation is the central theme

The deeper AI is embedded into a product, the higher the risk of errors. System validation, AI-as-a-judge and product metrics are essential.

From simple to complex

Development framework: start with prompting, add RAG if necessary, move on to finetuning only when justified.

Production concerns

Latency, cost of inference, stability and graceful degradation - practices for delivering and operating AI features.
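One way to picture graceful degradation under a latency budget is a fallback chain: try the primary model, fall back to a cheaper one on timeout or error, and finally return a static reply. The sketch below is only an illustration under assumptions: `slow_model`, `small_model` and `broken_model` are hypothetical stand-ins for real model calls, and the timeout values are arbitrary.

```python
import concurrent.futures
import time

STATIC_REPLY = "The assistant is temporarily unavailable."

def answer_with_fallback(prompt, primary, fallback, timeout_s=1.0):
    """Return (text, source), degrading from primary to fallback to a static reply."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(primary, prompt)
        return future.result(timeout=timeout_s), "primary"
    except Exception:
        # Covers both a timeout and an error raised by the primary model.
        pass
    finally:
        # Do not block on a slow primary call; let it finish in the background.
        pool.shutdown(wait=False)
    try:
        return fallback(prompt), "fallback"
    except Exception:
        return STATIC_REPLY, "static"

# Hypothetical stand-ins for real model calls.
def slow_model(prompt):
    time.sleep(0.3)
    return "detailed answer"

def small_model(prompt):
    return "short answer"

def broken_model(prompt):
    raise RuntimeError("backend down")

print(answer_with_fallback("hi", slow_model, small_model, timeout_s=0.05))
print(answer_with_fallback("hi", broken_model, broken_model))
```

Returning the source label alongside the text makes it possible to track how often the system runs in degraded mode.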

Related chapter

ML System Design

A practical guide to ML system design interviews


Book structure: 10 chapters

Part I: Foundation

1

Introduction to Building AI Applications

The transition from ML to GenAI, the advantages of foundation models, tokens and multimodality, use cases and AI as a platform.

2

Understanding Foundation Models

Data and languages, transformers and attention mechanism, parameters and context window, post-training (SFT, RLHF), hallucinations.

Part II: AI Application Development

3

Evaluation Methodology

Why AI evaluation is difficult, entropy and perplexity, functional vs non-functional correctness.
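As a rough illustration of how these two quantities relate: perplexity is the exponential of the average per-token cross-entropy. The sketch below assumes natural logarithms; the token probabilities are made up for illustration and are not taken from the book.

```python
import math

def cross_entropy(token_probs):
    """Average negative log-likelihood (in nats) of the tokens that actually occurred."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def perplexity(token_probs):
    """Perplexity is exp(cross-entropy) when entropy is measured in nats."""
    return math.exp(cross_entropy(token_probs))

# Hypothetical probabilities a model assigned to each observed token.
uniform_guess = [0.25, 0.25, 0.25, 0.25]
confident = [0.9, 0.8, 0.95, 0.85]

print(perplexity(uniform_guess))
print(perplexity(confident))
```

A model that assigns probability 0.25 to every observed token has a perplexity of 4, as if it were choosing uniformly among four options at each step; the more confident model scores lower.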

4

Evaluate AI Systems

AI-as-a-judge, pairwise comparisons, benchmarks and their limitations, human baseline, product validation.

5

Prompt Engineering

Prompt structure, few-shot learning, chain-of-thought, system prompts and democratization of development.

6

RAG and Agents

Retrieval-Augmented Generation, vector stores, chunking strategies, agents and function calling.
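The retrieval half of RAG can be sketched in a few lines: split documents into overlapping chunks, embed them, and put the chunks most similar to the query into the prompt. This is a toy sketch, not the book's implementation: the bag-of-words `embed` function is a hypothetical stand-in for a real embedding model, and the in-memory list stands in for a vector store.

```python
import math
import re
from collections import Counter

def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into chunks of `chunk_size` words that overlap by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, stop, step)]

def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, top_k=1):
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

def build_prompt(query, context_chunks):
    """Assemble the augmented prompt passed to the generator model."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Illustrative knowledge base (invented for this example).
knowledge = [
    "LoRA adapts a model by training small low-rank matrices.",
    "Quantization stores weights in fewer bits to cut memory use.",
    "Chunking splits documents before they are indexed.",
]
query = "How does quantization reduce memory?"
print(build_prompt(query, retrieve(query, knowledge)))
```

Chunk size and overlap are tuning knobs here: larger chunks keep more context together, while overlap reduces the chance that a relevant sentence is cut in half at a chunk boundary.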

7

Finetuning

When finetuning is justified, SFT vs RLHF, dataset engineering, LoRA and parameter-efficient adaptation.

Part III: AI Engineering in Production

8

Dataset Engineering

Data collection and preparation, annotation, synthetic data, data flywheel.

9

Inference Optimization

Latency and throughput, quantization, batching, caching, cost optimization.
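To make one of these techniques concrete, here is a minimal sketch of symmetric int8 quantization: floats are mapped to integers in [-127, 127] with a single scale factor and mapped back on use. It assumes one scale per tensor and plain Python lists; production systems typically use finer-grained schemes, and the weights below are invented for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]

# Hypothetical weights.
weights = [0.5, -1.27, 0.0, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

The trade-off is the usual one: each weight needs one byte instead of four (for float32), at the cost of a rounding error of at most half a quantization step per weight.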

10

AI Engineering Architecture and User Feedback

Architecture of AI applications, collection of feedback, continuous improvement, MLOps for GenAI.

Evaluation: key theme of the book

Chip Huyen treats evaluation as the central problem of AI engineering. Two full chapters are devoted to its methodology and practice:

Metrics

Perplexity, BLEU, ROUGE, semantic similarity, task-specific metrics

AI-as-a-Judge

LLM evaluates LLM: prompts for judging, bias and calibration
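One commonly discussed judge bias is position bias: the judge tends to prefer whichever answer is shown first. A simple mitigation is to ask twice with the answers swapped and accept a verdict only when both orderings agree. The sketch below assumes a hypothetical `judge` callable that takes a prompt and returns "A" or "B"; the prompt wording and the stub judges are illustrative, not taken from the book.

```python
JUDGE_PROMPT = """You are comparing two answers to the same question.
Question: {question}
Answer A: {first}
Answer B: {second}
Reply with exactly one letter: A or B."""

def pairwise_judge(question, answer_1, answer_2, judge):
    """Return 1 or 2 if both orderings agree on a winner, otherwise None (a tie)."""
    forward = judge(JUDGE_PROMPT.format(question=question, first=answer_1, second=answer_2))
    swapped = judge(JUDGE_PROMPT.format(question=question, first=answer_2, second=answer_1))
    winner_forward = {"A": 1, "B": 2}.get(forward.strip().upper())
    winner_swapped = {"A": 2, "B": 1}.get(swapped.strip().upper())
    if winner_forward is not None and winner_forward == winner_swapped:
        return winner_forward
    return None

# Hypothetical stand-ins for an LLM call.
def always_first_judge(prompt):
    """A fully position-biased judge: always prefers the first answer."""
    return "A"

def longer_answer_judge(prompt):
    """A judge with a consistent (if naive) preference for the longer answer."""
    first = prompt.split("Answer A: ")[1].split("\nAnswer B: ")[0]
    second = prompt.split("Answer B: ")[1].split("\nReply")[0]
    return "A" if len(first) >= len(second) else "B"

print(pairwise_judge("What is RAG?", "short", "a much longer answer", always_first_judge))
print(pairwise_judge("What is RAG?", "short", "a much longer answer", longer_answer_judge))
```

The position-biased judge yields no verdict because its two answers contradict each other, while the consistent judge picks the same answer in both orderings. The second stub also shows that consistency alone does not imply quality: a judge can be stable and still favor verbosity, which is why calibration against human ratings matters.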

Product Validation

A/B tests, user feedback, business metrics alignment

Chapter Podcast Series

The book is reviewed by Alexander Polomodov (CTO of T-Bank) and Evgeny Sergeev (Engineering Director, Flo).

Issue #1

Preface & Intro Chapter

  • Book overview and 10-chapter structure
  • Transition from ML to GenAI
  • Tokens and multimodality
  • Prompt engineering and democratization
  • Integration of MCP and Claude Desktop

Issue #2

Understanding Foundation Models

  • Model training stages
  • Transformers and the attention mechanism
  • Options and context window
  • Post-training: SFT and RLHF
  • Hallucinations and their causes

Issue #3

Evaluation (Ch. 3–4)

  • Why AI evaluation is difficult
  • Entropy and perplexity
  • AI as a judge
  • Pairwise model comparisons
  • Product validation



© 2026 Alexander Polomodov