System Design Space

Updated: March 24, 2026 at 2:56 PM

Prompt Engineering for LLMs (short summary)


Prompt engineering hits a ceiling quickly if you treat it as hunting for the right wording instead of designing context.

The chapter shows why LLM loops, RAG, agents, and workflows change the subject entirely: answer quality starts depending on the full chain rather than on a single prompt.

In interviews and architecture discussions, that lets you talk about context engineering, retrieval quality, state management, and evaluation instead of falling back to folk wisdom about prompts.

Practical value of this chapter

Design in practice

Translate guidance on prompt engineering methods and model-output quality control into architecture decisions for data flow, model serving, and quality control points.

Decision quality

Evaluate system quality through both model and platform metrics: precision/recall, latency, drift, cost, and operational risk.

Interview articulation

Frame answers as data → model → serving → monitoring, showing where constraints appear and how you manage them.

Trade-off framing

Make trade-offs explicit for prompt engineering methods and model-output quality control: experiment speed, quality, explainability, resource budget, and maintenance complexity.

Source


Book review from Alexander Polomodov


Prompt Engineering for LLMs

Authors: John Berryman, Albert Ziegler
Publisher: O'Reilly Media, Inc.
Length: 282 pages

John Berryman and Albert Ziegler (creators of GitHub Copilot): LLM Loop, RAG, agents, workflows and the transition to context engineering.


Key Idea: LLM Loop

The authors introduce the LLM Loop framework, a cycle of working with the model:

1. Retrieval: getting context
2. Snippetizing: cutting into fragments
3. Scoring: relevance assessment
4. Assembly: building the prompt
5. Post-process: processing the response
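The five stages above can be sketched as a plain pipeline. This is an illustrative sketch, not the authors' code: every function name and the toy retrieval/scoring logic are assumptions, with the model call stubbed out so the flow is runnable.

```python
# Illustrative sketch of the five-stage LLM Loop; function names are assumptions.

def retrieve(query, corpus):
    # Retrieval: gather candidate context for the query (here: naive word match).
    return [doc for doc in corpus if any(w in doc.lower() for w in query.lower().split())]

def snippetize(docs, max_len=80):
    # Snippetizing: cut documents into prompt-sized fragments.
    return [doc[i:i + max_len] for doc in docs for i in range(0, len(doc), max_len)]

def score(query, snippets):
    # Scoring: rank fragments by relevance (here: crude word overlap).
    words = set(query.lower().split())
    return sorted(snippets, key=lambda s: len(words & set(s.lower().split())), reverse=True)

def assemble(query, ranked, budget=200):
    # Assembly: pack the best snippets into a prompt within a context budget.
    context, used = [], 0
    for s in ranked:
        if used + len(s) > budget:
            break
        context.append(s)
        used += len(s)
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:"

def post_process(completion):
    # Post-process: clean the raw completion before showing it to the user.
    return completion.strip()

corpus = ["Tokens are the atomic units an LLM reads.", "Temperature controls sampling randomness."]
prompt = assemble("what are tokens",
                  score("what are tokens", snippetize(retrieve("what are tokens", corpus))))
answer = post_process("  Tokens are the atomic units an LLM reads.  ")
```

In a real system each stage would be swappable: lexical vs neural retrieval, token-aware snippetizing, embedding-based scoring.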

Related chapter

AI Engineering (Chip Huyen)

A broader view: RAG, agents, finetuning, production

Read review

Book structure: 3 parts, 11 chapters

Part I: LLM Basics

The structure and evolution of models, their training and the transition to dialogues.

1. Introduction to Prompt Engineering. Why LLMs look like “magic”, the evolution of language models, prompt engineering as an engineering discipline.
2. Understanding LLMs. The LLM as a completion engine: tokens, autoregression, hallucinations, temperature, transformer basics.
3. Moving to Chat. From completion to chat: RLHF, instruct vs chat, alignment tax, API evolution. Prompting as “staging a play” (scenes/roles/cues).
4. Designing LLM Applications. The key frame of the LLM Loop: retrieval → snippetizing → scoring → prompt assembly → post-processing.

Part II: Key Techniques

Few-shot examples, RAG to reduce hallucinations, formatting prompts.

5. Prompt Content. Static content (instructions, few-shot examples) vs dynamic content. RAG: lexical vs neural retrieval, embeddings, vector storage, hierarchical summarization.
6. Assembling the Prompt. Packing within the context limit, anatomy of a prompt, document formats, elastic snippets. The Valley of Meh: the middle of the prompt “sags”, so important content belongs closer to the end.
7. Taming the Model. Anatomy of a completion: preamble, start/end markers, stop sequences, streaming. Logprobs for confidence. Model selection: quality/price/latency.
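Chapter 7's idea of using logprobs for confidence can be illustrated with a small sketch: many completion APIs return per-token log probabilities, and averaging them gives a rough confidence signal for the whole completion. The numbers below are made-up illustrative values, not real API output.

```python
import math

def completion_confidence(logprobs):
    # Average the per-token log probabilities, then map back to probability
    # space: this is the geometric mean of per-token probabilities. Values
    # near 1.0 mean the model was consistently confident; values near 0 mean
    # at least some tokens were highly uncertain.
    avg_logprob = sum(logprobs) / len(logprobs)
    return math.exp(avg_logprob)

confident = completion_confidence([-0.01, -0.02, -0.05])  # roughly 0.97
uncertain = completion_confidence([-0.01, -2.5, -3.0])    # roughly 0.16
```

A threshold on this score is one simple quality-control point: below it, the system can retry, escalate, or ask the user to rephrase.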

Related chapter

Hands-On Large Language Models

Visual explanation of RAG, agents and LangChain

Read review

Part III: Advanced Topics

Agents with memory and tools, workflows, quality assessment.

8. Conversational Agency. Tool use: tool design, error handling, dangerous actions. Reasoning patterns: CoT, ReAct. Assembling the agent and its UX.
9. LLM Workflows. When a workflow is better than an agent. Tasks as building blocks, template prompts. Agent-driven workflows, stateful task agents, roles and delegation.
10. Evaluating LLM Applications. Offline: example suites, gold standards, LLM-as-judge, SOMA. Online: A/B tests and metrics.
11. Looking Ahead. Multimodality, UI/UX as part of quality, growing model intelligence and speed.
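The tool-use loop from chapter 8 can be sketched minimally. This is a hedged illustration, not the book's code: the fake model, the tool registry, and all names are assumptions, and a real tool would need the error handling and dangerous-action guards the chapter discusses.

```python
def calculator(expression: str) -> str:
    # A sample tool; eval is restricted here, but real tools need stronger
    # sandboxing and error handling.
    try:
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as exc:
        return f"tool error: {exc}"

TOOLS = {"calculator": calculator}

def fake_model(history):
    # Stand-in for an LLM: requests a tool once, then answers from its result.
    if not any(turn[0] == "tool" for turn in history):
        return {"tool": "calculator", "args": "6 * 7"}
    return {"answer": f"The result is {history[-1][1]}."}

def agent_loop(question, max_steps=5):
    history = [("user", question)]
    for _ in range(max_steps):
        action = fake_model(history)
        if "answer" in action:            # the model decided it is done
            return action["answer"]
        tool = TOOLS[action["tool"]]      # dispatch to the requested tool
        history.append(("tool", tool(action["args"])))
    return "gave up"

result = agent_loop("What is 6 * 7?")
```

The `max_steps` cap is the key safety valve: without it, a confused model can loop on tool calls indefinitely.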

Practical insight: Valley of Meh

The middle of the prompt “sags”

Models “see” the beginning and end of a prompt better; information in the middle is often ignored or processed less reliably.

Authors' recommendation:

  • Important instructions: at the top (system prompt)
  • Critical context: near the end
  • Less important material: in the middle
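The placement advice above can be sketched as a small assembly helper, assuming a chat-style message format; the field names and the sample inputs are illustrative assumptions, not the authors' API.

```python
def assemble_messages(instructions, critical_context, background):
    # Important instructions go at the top, in the system message.
    messages = [{"role": "system", "content": instructions}]
    # Less important background sits in the middle; critical context goes last,
    # where the model attends to it most reliably.
    user_parts = background + [critical_context]
    messages.append({"role": "user", "content": "\n\n".join(user_parts)})
    return messages

msgs = assemble_messages(
    instructions="Answer using only the provided context.",
    critical_context="The deploy failed at step 3 with exit code 137.",
    background=["Log excerpt A (low priority)", "Log excerpt B (low priority)"],
)
```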

Relevance in 2026: Prompt → Context Engineering

Since the book's publication, LLM technology has moved forward. Model quality has risen: models understand the user better even without elaborate prompts, and the best techniques are already built into the tools.

Context Engineering

Andrej Karpathy (2025): focus on giving the model the complete environment (data, history, tools) instead of hunting for the ideal wording.

PromptOps

Prompt versioning, request quality monitoring, context preparation automation.

Conclusion: the book's fundamental principles are still useful. RAG is now ubiquitous, and chain-of-thought has become standard in AI agents. The authors warned honestly: APIs will become obsolete, but the core ideas will remain.

