Book cube
Book review by Alexander Polomodov
Prompt Engineering for LLMs
Authors: John Berryman, Albert Ziegler
Publisher: O'Reilly Media, Inc.
Length: 282 pages
John Berryman and Albert Ziegler (early GitHub Copilot engineers) cover the LLM Loop, RAG, agents, workflows, and the transition to context engineering.
Key Idea: LLM Loop
The authors introduce the LLM Loop framework, a cycle for working with the model:
- Retrieval: gathering context
- Snippetizing: cutting it into fragments
- Scoring: assessing relevance
- Assembly: building the prompt
- Post-processing: handling the response
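The loop above can be sketched end to end. This is a toy illustration, not the authors' code: the function names, the word-overlap scoring, and the character budgets are all assumptions standing in for real retrievers, chunkers, and rankers.

```python
def retrieve(query: str, corpus: list[str]) -> list[str]:
    """Retrieval: gather documents that share words with the query."""
    terms = set(query.lower().split())
    return [doc for doc in corpus if terms & set(doc.lower().split())]

def snippetize(docs: list[str], size: int = 50) -> list[str]:
    """Snippetizing: cut documents into fixed-size fragments."""
    return [doc[i:i + size] for doc in docs for i in range(0, len(doc), size)]

def score(query: str, snippets: list[str]) -> list[tuple[int, str]]:
    """Scoring: rank snippets by word overlap with the query."""
    terms = set(query.lower().split())
    return sorted(
        ((len(terms & set(s.lower().split())), s) for s in snippets),
        reverse=True,
    )

def assemble(query: str, ranked: list[tuple[int, str]], budget: int = 200) -> str:
    """Assembly: pack the best snippets into a limited prompt."""
    context, used = [], 0
    for _, snippet in ranked:
        if used + len(snippet) > budget:
            break
        context.append(snippet)
        used += len(snippet)
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

def post_process(completion: str) -> str:
    """Post-processing: trim the model's raw completion."""
    return completion.strip()
```

In a real application each stage would call search indexes, embedding models, and the LLM API; the shape of the pipeline is what the book emphasizes.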
Related book
AI Engineering (Chip Huyen)
A broader view: RAG, agents, finetuning, production
Book structure: 3 parts, 11 chapters
Part I: LLM Basics
The structure and evolution of models, their training and the transition to dialogues.
Introduction to Prompt Engineering
Why LLMs look like “magic”, the evolution of language models, prompt engineering as an engineering discipline.
Understanding LLMs
LLM as a completion engine: tokens, autoregression, hallucinations, temperature, transformer basics.
Moving to Chat
From completion to chat: RLHF, instruct vs chat, alignment tax, API evolution. Prompting as “staging a play” (scenes/roles/cues).
Designing LLM Applications
The key LLM Loop framework: retrieval → snippetizing → scoring → prompt assembly → post-processing.
Part II: Key Techniques
Few-shot examples, RAG to reduce hallucinations, formatting prompts.
Prompt Content
Static (instructions, few-shot) vs dynamic content. RAG: lexical vs neural, embeddings, vector storage, hierarchical summarization.
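The lexical-vs-neural contrast can be reduced to a tiny sketch. The hand-made vectors below are stand-ins; in practice they would come from an embedding model, and the store would be a vector database.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(query_vec: list[float], store: list[tuple[str, list[float]]]) -> str:
    """Neural retrieval: return the stored text closest to the query vector."""
    return max(store, key=lambda item: cosine(query_vec, item[1]))[0]
```

Lexical retrieval matches surface words; vector search like this finds texts that are close in meaning even with no shared vocabulary, which is why the book pairs the two.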
Assembling the Prompt
Packing within the context limit, anatomy of a prompt, document formats, elastic snippets. Valley of Meh: the middle of the prompt "sags"; important content goes near the end.
Taming the Model
Anatomy of a completion: preamble, start/end markers, stop sequences, streaming. Logprobs for confidence. Model selection: quality/price/latency.
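The logprobs-for-confidence idea is simple arithmetic: the joint probability of a completion is the exponential of its summed token log-probabilities. A minimal sketch (the logprob values a real API returns would replace the made-up ones in the test):

```python
import math

def answer_confidence(token_logprobs: list[float]) -> float:
    """Joint probability of a completion from its per-token logprobs."""
    return math.exp(sum(token_logprobs))
```

A low joint probability can be used to flag answers for review or to trigger a retry, which is the kind of use the chapter has in mind.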
Related book
Hands-On Large Language Models
Visual explanation of RAG, agents and LangChain
Part III: Advanced Topics
Agents with memory and tools, workflows, quality assessment.
Conversational Agency
Tool use: tool design, error handling, dangerous actions. Reasoning patterns: CoT, ReAct. Agent and UX assembly.
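A tool-use loop with error handling can be sketched in a few lines. Everything here is an assumption for illustration: the `model` callable stands in for an LLM deciding its next action, and the action dictionary format is invented.

```python
def run_agent(model, tools: dict, question: str, max_steps: int = 5) -> str:
    """Drive the model until it answers or the step budget runs out."""
    observation = question
    for _ in range(max_steps):
        action = model(observation)          # model decides the next action
        if action["type"] == "answer":
            return action["text"]
        tool = tools.get(action["tool"])
        if tool is None:                     # error handling: unknown tool
            observation = f"error: no tool named {action['tool']}"
            continue
        try:
            observation = str(tool(action["input"]))
        except Exception as exc:             # error handling: tool failure
            observation = f"error: {exc}"
    return "gave up"
```

Feeding errors back as observations instead of crashing lets the model recover, and the step budget bounds runaway loops; both points echo the chapter's advice on tool design and dangerous actions.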
LLM Workflows
When a workflow beats an agent. Tasks as building blocks, template prompts. Agent-driven workflows, stateful task agents, roles and delegation.
Evaluating LLM Applications
Offline: example suites, gold standard, LLM-as-judge, SOMA. Online: A/B tests and metrics.
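An offline eval against a gold standard fits in one function. The judge here is a stubbed callable; in the LLM-as-judge pattern the chapter describes, it would itself be an LLM call comparing the answer to the gold reference.

```python
def evaluate(app, judge, suite: list[dict]) -> float:
    """Fraction of suite examples where the judge accepts the app's answer."""
    passed = sum(
        1 for ex in suite
        if judge(ex["question"], ex["gold"], app(ex["question"]))
    )
    return passed / len(suite)
```

Running this over a fixed example suite after every prompt change gives the regression signal the chapter argues for, before any online A/B testing.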
Looking Ahead
Multimodality, UI/UX as part of quality, increased intelligence and speed of models.
Practical insight: Valley of Meh
The middle of the prompt “sags”
Models "see" the beginning and end of a prompt better; information in the middle is often ignored or processed less carefully.
Authors' recommendation:
- Important instructions: at the top (system prompt)
- Critical context: near the end
- Less important material: in the middle
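The ordering recommendation amounts to one assembly function. The section names are illustrative, not from the book:

```python
def order_sections(system: str, critical: str, background: list[str]) -> str:
    """Order prompt sections around the Valley of Meh."""
    parts = [system]          # important instructions at the top
    parts.extend(background)  # less important material in the middle
    parts.append(critical)    # critical context near the end
    return "\n\n".join(parts)
```

Whatever templating layer assembles the prompt, keeping this ordering fixed means retrieved filler can grow or shrink without displacing the instructions or the critical context.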
Relevance in 2026: Prompt → Context Engineering
Since the book's publication, LLM technology has moved forward. Model quality has improved: models understand the user better even without elaborate prompts, and the best techniques are already built into the tools.
Context Engineering
Andrej Karpathy (2025): focus on giving the model the complete environment (data, history, tools) instead of searching for the perfect wording.
PromptOps
Prompt versioning, request quality monitoring, context preparation automation.
Conclusion: the book's fundamental principles remain useful. RAG is now ubiquitous, and chain-of-thought has become standard in AI agents. The authors honestly warned that APIs would become obsolete while the basic ideas would remain.
