System Design Space

Updated: May 1, 2026 at 6:48 PM

Programming Meanings by Alexey Gusakov (CTO Yandex)

medium

A talk by Yandex CTO Alexey Gusakov on how AI products move from hand-coded algorithms to designing intent, constraints, evaluation loops, and useful system behavior.

The idea of programming meanings matters because it shifts attention from describing an algorithm by hand to designing system behavior.

The chapter shows how intent, constraints, evaluation loops, and the reward model become the working engineering loop of an AI product rather than a thin layer around the model.

For design reviews, it is a strong case for discussing utility design, product semantics, and the part of AI architecture that rarely appears on a standard service diagram.

Practical value of this chapter

Behavior design

The chapter helps you discuss an AI product not only through code and models, but through intent, constraints, and the working evaluation loop.

Quality loop

It shows how product rules, reward design, and evaluation combine into one improvement cycle.

Engineering usefulness

The material makes it clear that usefulness in an AI system is designed and measured rather than automatically produced by a strong model.

Interview material

It gives you concrete examples for discussing product semantics, reward design, and observability in AI features.

Programming Meanings by Alexey Gusakov (CTO Yandex)

A breakdown of Alexey Gusakov's talk on how AI products are moving from hand-coded algorithms toward designing intent, constraints, evaluation loops, and useful system behavior.

Speaker: Alexey Gusakov, CTO of the Search and Advertising Technologies business group (Yandex)
Format: Tech talk on product development and AI system architecture
Focus: LLM assistants, reward modeling, orchestration, and measurable answer quality

Source

Telegram: Book Cube

A review of the talk with engineering and product-architecture takeaways.


What “programming meanings” means

The core idea is that you design not only code paths, but also model behavior through intent, constraints, knowledge context, tools, and success metrics.

In this model, value comes from an iterative loop (hypothesis → prototype → measurement → additional training → integration), not from a single delivery of the “perfect algorithm.”

How the approach evolved

2022: “Product Guru” and the first mistakes

A conversational assistant for product selection showed that a questionnaire disguised as a dialogue frustrates users. Those failures became signals of what actually feels useful.

The turn after ChatGPT

Instead of betting on one big magical release, the team chose an incremental path: improve the existing experience in small, verifiable steps.

Answers grounded in structured sources

The model plans which documents to use and assembles the answer from verifiable fragments rather than generating it out of thin air.
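
A minimal sketch of this idea, assuming a toy keyword-overlap planner in place of the model. The names `Fragment`, `plan_fragments`, and `assemble_answer` are hypothetical and do not come from the talk; the point is only that every sentence in the answer carries the identifier of the source it came from.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    doc_id: str  # hypothetical source identifier
    text: str    # a passage that can be checked against that source


def _keywords(text: str) -> set[str]:
    # Lowercase alphanumeric tokens of 4+ characters; a crude stand-in for relevance.
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= 4}


def plan_fragments(question: str, fragments: list[Fragment], limit: int = 2) -> list[Fragment]:
    """Pick up to `limit` fragments that share keywords with the question."""
    q_words = _keywords(question)
    scored = [(len(q_words & _keywords(f.text)), f) for f in fragments]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in relevant[:limit]]


def assemble_answer(question: str, fragments: list[Fragment]) -> str:
    """Build the answer only from selected fragments, each tagged with its source."""
    chosen = plan_fragments(question, fragments)
    if not chosen:
        # Refuse rather than generate text with no supporting evidence.
        return "No supporting source found."
    return " ".join(f"{f.text} [{f.doc_id}]" for f in chosen)
```

In a real system the planner would be a model call and the fragments would come from retrieval; the refusal branch is the part that keeps the answer checkable.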

A system of constraints instead of one metric

Quality is defined by rules and metrics: factual accuracy, answer length, personalization, variety, and a ban on invented facts.
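
One way to picture a system of constraints is a list of independent checks that each return a named violation. The sketch below is illustrative only: the `Answer` fields, the `MAX_WORDS` budget, and the rule names are assumptions, and fact extraction is represented by plain sets of strings.

```python
from dataclasses import dataclass, field

MAX_WORDS = 120  # hypothetical length budget, not a value from the talk


@dataclass
class Answer:
    text: str
    stated_facts: set[str] = field(default_factory=set)  # facts the answer asserts
    source_facts: set[str] = field(default_factory=set)  # facts present in the sources


def check_constraints(answer: Answer) -> list[str]:
    """Return the names of all violated constraints (empty list means the answer passes)."""
    violations = []
    if not answer.text.strip():
        violations.append("empty")
    if len(answer.text.split()) > MAX_WORDS:
        violations.append("too_long")
    # Any stated fact that no source supports counts as invented.
    if answer.stated_facts - answer.source_facts:
        violations.append("invented_facts")
    return violations
```

Returning all violations instead of a single score keeps each rule visible, so a regression in one constraint is not hidden by gains in another.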

A repeatable learning loop

Answers are evaluated by people and automated checks, then the generative model and reward model are updated, and the changes are measured again through real feedback.

Orchestrating multiple models

Even without changing the base weights, quality grows through a pipeline of multiple models, tools, and additional compute.
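
The talk does not spell out a specific pipeline here. One common pattern that fits the description is best-of-N selection: several generators produce candidates and a reward function picks the winner, which spends extra compute without touching base weights. The sketch below assumes that pattern; `best_of_n` and its parameters are hypothetical names.

```python
from typing import Callable, Sequence

Generator = Callable[[str], str]      # prompt -> candidate answer
Reward = Callable[[str, str], float]  # (prompt, candidate) -> score


def best_of_n(prompt: str, generators: Sequence[Generator], reward: Reward) -> str:
    """Ask every generator for a candidate and return the one the reward scores highest."""
    if not generators:
        raise ValueError("at least one generator is required")
    candidates = [generate(prompt) for generate in generators]
    # On ties, max() keeps the first candidate, so generator order acts as a tiebreak.
    return max(candidates, key=lambda candidate: reward(prompt, candidate))
```

The generators can be different models, the same model with different prompts, or tool-augmented variants; the selection step stays the same.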

Related chapter

AI Engineering

A systems view of the AI product lifecycle in production.


The working loop for model improvement

1. Evaluation

Answers are labeled and ranked by people and automated evaluators.

2. Training

The generative model and reward model are updated.

3. Rollout

Changes move into online experiments and A/B tests.

4. Feedback

Metrics and feedback start the next improvement cycle.
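
To connect steps 1 and 2: a ranking from the evaluation step can be turned into training data for a reward model. A common format is pairwise preferences, which is an assumption here since the talk summary does not name the data format. The helper `ranking_to_pairs` is a hypothetical name.

```python
from itertools import combinations


def ranking_to_pairs(ranked_answers: list[str]) -> list[tuple[str, str]]:
    """Convert a best-first ranking into (preferred, rejected) pairs.

    combinations() keeps input order, so the first element of each pair
    always ranks higher than the second.
    """
    return list(combinations(ranked_answers, 2))
```

A ranking of three answers yields three pairs, so each labeling task produces several training examples for the reward model.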

Common problems and fixes

Optimizing for the evaluator instead of user value

Symptom: The model adapts to the checker: it artificially lengthens answers, copies sources, and adds caveats that do not improve the outcome for the user.

Fix: Use length constraints and penalties for copy-paste and bureaucratic style, and keep caveats only where they are contextually useful.
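
A rough sketch of how such penalties could be layered on top of an evaluator score. All weights, the phrase list, and the function names are illustrative assumptions; copying is approximated by the share of the answer's word 5-grams that appear verbatim in the source.

```python
# Hypothetical boilerplate phrases; a real list would come from product review.
CAVEAT_PHRASES = ("please note", "it is important to", "consult a specialist")


def _ngrams(text: str, n: int) -> list[tuple[str, ...]]:
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def copied_share(answer: str, source: str, n: int = 5) -> float:
    """Fraction of the answer's word n-grams found verbatim in the source."""
    answer_grams = _ngrams(answer, n)
    if not answer_grams:
        return 0.0
    source_grams = set(_ngrams(source, n))
    return sum(g in source_grams for g in answer_grams) / len(answer_grams)


def shaped_reward(base_score: float, answer: str, source: str,
                  max_words: int = 80, caveat_useful: bool = False) -> float:
    """Subtract penalties for padding, copy-paste, and unneeded caveats."""
    length_penalty = 0.01 * max(0, len(answer.split()) - max_words)
    copy_penalty = 0.5 * copied_share(answer, source)
    caveat_count = sum(answer.lower().count(p) for p in CAVEAT_PHRASES)
    # Caveats are only penalized when the context does not call for them.
    caveat_penalty = 0.0 if caveat_useful else 0.1 * caveat_count
    return base_score - length_penalty - copy_penalty - caveat_penalty
```

Whether a caveat is useful is passed in as a flag here; in practice that judgment would itself need a labeled signal.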

Vague product requirements

Symptom: Instructions like “be smart and useful” do not translate into a stable product outcome or a reproducible working loop.

Fix: Treat intent, constraints, test sets, and quality metrics as mandatory release artifacts.
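
As an illustration, the four artifacts can be modeled as a small record with a gate that lists what is missing before release. The `ReleaseSpec` type, its field names, and `missing_artifacts` are hypothetical, chosen only to mirror the items named in the fix.

```python
from dataclasses import dataclass, field


@dataclass
class ReleaseSpec:
    intent: str = ""                                        # what the feature should achieve
    constraints: list[str] = field(default_factory=list)    # rules the answers must obey
    test_set_path: str = ""                                 # where the evaluation set lives
    metric_thresholds: dict[str, float] = field(default_factory=dict)


def missing_artifacts(spec: ReleaseSpec) -> list[str]:
    """Return the names of required artifacts that are absent or empty."""
    present = {
        "intent": bool(spec.intent.strip()),
        "constraints": bool(spec.constraints),
        "test_set": bool(spec.test_set_path.strip()),
        "metrics": bool(spec.metric_thresholds),
    }
    return [name for name, ok in present.items() if not ok]
```

A spec that contains only a slogan such as "be smart and useful" has an intent but fails the gate on the other three items.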

Related topic

Observability & Monitoring Design

How to build observability and alerting for production systems.


What this changes in system design

  • The prompt, rule set, and reward model become system artifacts just like API contracts and source code.
  • You need an observability loop for answer quality: factual accuracy, length, duplication, click-through, satisfaction, and escalation rate.
  • Product development and model development merge into one cycle: hypothesis → experiment → measurement → additional training → staged rollout.
  • The source base and the retrieval loop become critical for verifiability: the assistant has to rely on evidence that can be checked.
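
The quality signals listed above can be aggregated from per-answer events into a periodic snapshot. This is a minimal sketch under assumed names: `AnswerEvent` and its boolean fields stand in for whatever labels and logs a real system would record.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class AnswerEvent:
    words: int               # answer length in words
    factually_correct: bool  # label from a human or automated check
    duplicate: bool          # answer repeats earlier content
    clicked: bool            # user followed a link in the answer
    satisfied: bool          # explicit or inferred satisfaction signal
    escalated: bool          # user was handed off or asked for help elsewhere


def quality_snapshot(events: list[AnswerEvent]) -> dict[str, float]:
    """Aggregate per-answer events into the rates a quality dashboard would track."""
    if not events:
        raise ValueError("no events to aggregate")
    n = len(events)
    return {
        "factual_accuracy": sum(e.factually_correct for e in events) / n,
        "mean_length_words": mean(e.words for e in events),
        "duplication_rate": sum(e.duplicate for e in events) / n,
        "click_through": sum(e.clicked for e in events) / n,
        "satisfaction": sum(e.satisfied for e in events) / n,
        "escalation_rate": sum(e.escalated for e in events) / n,
    }
```

Alert thresholds on these rates are a product decision and are left out of the sketch.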
