Source
Book cube
Post with an overview of the series about AI Engineering.
AI Engineering
Author: Chip Huyen
Publisher: O'Reilly Media, Inc.
Length: 534 pages
Chip Huyen on creating AI applications: foundation models, prompting, RAG, agents, finetuning, quality assessment and production practices.
AI stack by Chip Huyen
Foundation Models
GPT-4, Claude, Gemini, Llama - choosing a model and understanding its capabilities
Prompting & Context
Prompt engineering, few-shot, chain-of-thought, system prompts
RAG & Knowledge
Retrieval-Augmented Generation, vector stores, embeddings
Agents & Tools
Autonomous agents, function calling, orchestration
Finetuning & Adaptation
SFT, RLHF, LoRA, dataset engineering
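To make the RAG layer of the stack concrete, here is a minimal retrieval sketch. It is an illustration rather than code from the book: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector store, and the names `retrieve`, `build_prompt` and `DOCS` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Put the retrieved chunks into the prompt as context for the model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical knowledge base (each string plays the role of one chunk).
DOCS = [
    "LoRA adapts a model by training small low-rank matrices.",
    "Vector stores index embeddings for similarity search.",
    "Quantization reduces inference cost by lowering numeric precision.",
]

if __name__ == "__main__":
    print(build_prompt("How do vector stores use embeddings?", DOCS, k=1))
```

The shape is the same in a production setup: embed the query, rank stored chunks by similarity, and pass the top results to the model as context.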
Related chapter
Hands-On Large Language Models
A visual introduction to LLMs: tokenization, embeddings, transformers
Key ideas of the book
AI engineering ≠ ML engineering
Model-as-a-service has lowered the entry barrier. AI engineering means building applications based on foundation models, rather than training models from scratch.
Evaluation is the central theme
The deeper AI is embedded in a product, the higher the risk of errors. System validation, AI-as-a-judge and product metrics are essential.
From simple to complex
Development framework: start with prompting, add RAG if necessary, move on to finetuning only when justified.
Production concerns
Latency, inference cost, stability and graceful degradation - practices for delivering and operating AI features.
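Graceful degradation can be sketched as a fallback chain. This is one common pattern, not necessarily the one the book prescribes: try the preferred model first, fall back to a cheaper one if it fails, and return a canned message if everything fails. The backend functions and `FALLBACK_MESSAGE` are hypothetical stand-ins for real model clients.

```python
from typing import Callable, Sequence

FALLBACK_MESSAGE = "Sorry, this feature is temporarily unavailable."


def answer_with_fallback(
    prompt: str, backends: Sequence[Callable[[str], str]]
) -> tuple[str, int]:
    """Try each backend in order.

    Returns (text, index of the backend that answered).
    Index -1 means every backend failed and the canned message was used.
    """
    for i, backend in enumerate(backends):
        try:
            text = backend(prompt)
            if text.strip():  # treat empty output as a failure too
                return text, i
        except Exception:
            continue  # a real system would also log the error here
    return FALLBACK_MESSAGE, -1


# Hypothetical backends for illustration only.
def large_model(prompt: str) -> str:
    raise TimeoutError("simulated timeout from the primary model")


def small_model(prompt: str) -> str:
    return f"[small] {prompt}"


if __name__ == "__main__":
    print(answer_with_fallback("hi", [large_model, small_model]))
```

Returning the index of the backend that answered makes it easy to track how often the product is running in degraded mode.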
Related chapter
ML System Design
A practical guide to ML system design interviews
Book structure: 10 chapters
Part I: Foundation
Introduction to Building AI Applications
The transition from ML to GenAI, the advantages of foundation models, tokens and multimodality, use cases and AI as a platform.
Understanding Foundation Models
Data and languages, transformers and the attention mechanism, parameters and context window, post-training (SFT, RLHF), hallucinations.
Part II: AI Application Development
Evaluation Methodology
Why AI evaluation is difficult, entropy and perplexity, functional vs non-functional correctness.
Evaluate AI Systems
AI-as-a-judge, pairwise comparisons, benchmarks and their limitations, human baseline, product validation.
Prompt Engineering
Prompt structure, few-shot learning, chain-of-thought, system prompts and democratization of development.
RAG and Agents
Retrieval-Augmented Generation, vector stores, chunking strategies, agents and function calling.
Finetuning
When finetuning is justified, SFT vs RLHF, dataset engineering, LoRA and efficient adaptation.
Part III: AI Engineering in Production
Dataset Engineering
Data collection and preparation, annotation, synthetic data, data flywheel.
Inference Optimization
Latency and throughput, quantization, batching, caching, cost optimization.
AI Engineering Architecture and User Feedback
Architecture of AI applications, collection of feedback, continuous improvement, MLOps for GenAI.
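The entropy and perplexity topics from the Evaluation Methodology chapter in the outline above can be illustrated with a few lines of arithmetic. This sketch uses the standard definition (perplexity is the exponential of the average negative log-likelihood per token). The probabilities passed in are made-up inputs, not model output.

```python
import math


def cross_entropy(token_probs: list[float]) -> float:
    """Average negative log-likelihood per token, in nats."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs: list[float]) -> float:
    """Perplexity = exp(cross-entropy). Lower means the model is less 'surprised'."""
    return math.exp(cross_entropy(token_probs))


if __name__ == "__main__":
    # Probability the model assigned to each token that actually occurred.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
    print(perplexity([0.9, 0.8, 0.95]))
```

A model that assigns probability 0.25 to every observed token has a perplexity of 4: it is as uncertain as if it were choosing uniformly among four options at each step.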
Evaluation: key theme of the book
Chip Huyen highlights quality assessment as a central issue in AI engineering. Two full chapters are devoted to methodology and practices:
Metrics
Perplexity, BLEU, ROUGE, semantic similarity, task-specific metrics
AI-as-a-Judge
One LLM evaluates another: prompts for judging, bias and calibration
Product Validation
A/B tests, user feedback, business metrics alignment
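One judge bias that is simple to guard against is position bias, where the judge favors whichever answer is shown first. The sketch below is an illustration under assumptions, not the book's implementation: `judge` is a hypothetical callable that wraps an LLM call and returns "first" or "second". Each pair is judged twice with the order swapped, and only a consistent verdict counts.

```python
from typing import Callable

# Hypothetical judge interface: (question, first_answer, second_answer) -> "first" | "second"
Judge = Callable[[str, str, str], str]


def pairwise_verdict(judge: Judge, question: str, answer_a: str, answer_b: str) -> str:
    """Compare two answers twice with swapped positions.

    Returns "A" or "B" if both runs agree, otherwise "tie".
    """
    a_wins_when_first = judge(question, answer_a, answer_b) == "first"
    a_wins_when_second = judge(question, answer_b, answer_a) == "second"
    if a_wins_when_first and a_wins_when_second:
        return "A"
    if not a_wins_when_first and not a_wins_when_second:
        return "B"
    return "tie"  # verdict flipped with position: treat as inconclusive


# Stub judges for illustration; a real judge would prompt an LLM.
def always_first(question: str, first: str, second: str) -> str:
    return "first"  # a judge with pure position bias


def prefers_longer(question: str, first: str, second: str) -> str:
    return "first" if len(first) >= len(second) else "second"


if __name__ == "__main__":
    q = "What is RAG?"
    print(pairwise_verdict(always_first, q, "short", "a much longer answer"))
    print(pairwise_verdict(prefers_longer, q, "short", "a much longer answer"))
```

The position-biased stub yields "tie" because its verdict flips when the order is swapped, which is the signal this check is designed to catch.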
Chapter Podcast Series
The book is reviewed by Alexander Polomodov (CTO of T-Bank) and Evgeny Sergeev (Engineering Director, Flo).
Issue #1
Preface & Intro Chapter
- Book overview and 10-chapter structure
- Transition from ML to GenAI
- Tokens and multimodality
- Prompt engineering and democratization
- Integration of MCP and Claude Desktop
Issue #2
Understanding Foundation Models
- Model training stages
- Transformers and the attention mechanism
- Parameters and context window
- Post-training: SFT and RLHF
- Hallucinations and their causes
