System Design Space

Updated: February 21, 2026 at 11:59 PM

Programming meanings


Talk by Yandex CTO Alexey Gusakov on the shift from coding algorithms to designing intentions, constraints, metrics, and reward loops in LLM products.

Programming meanings

Analysis of the talk by Alexey Gusakov (Yandex): how product development is shifting from detailed coding of algorithms to designing intentions, constraints, and reward loops.

Speaker: Alexey Gusakov, CTO of the Search and Advertising Technologies business group (Yandex)
Format: Technology talk / product + ML architecture
Focus: LLM assistants, reward modeling, orchestration, and measurable answer quality

Source

Telegram: book_cube

Review of the talk with engineering and product-architecture conclusions.

What “programming meanings” is

The main idea: you program not only code branches but also model behavior, through intentions, constraints, knowledge context, tools, and success metrics.

In this paradigm, value is created through an iterative cycle: hypothesis → prototype → measurement → fine-tuning → integration, rather than a single delivery of the “ideal algorithm”.

Step by step

2022: “Product Guru” and first mistakes

A conversational assistant for product selection showed that a “questionnaire disguised as a dialogue” irritates users. The mistakes became a source of signals about what makes useful UX.

Pivot after the release of ChatGPT

Instead of a “big magic release,” the team chose an incremental path: improving the existing search results in small, verifiable steps.

Answers from structured sources

The model plans which documents to use and assembles the answer from fragments it has verified, rather than generating it “out of thin air.”
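The plan-then-assemble step can be sketched as follows. This is a minimal illustration, not Yandex's implementation: the `Fragment` type, the score threshold, and `plan_and_assemble` are all hypothetical, and a real planner would be an LLM call rather than a relevance cutoff.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    source_id: str   # document the fragment was retrieved from
    text: str        # verified snippet of source text
    score: float     # retrieval relevance score

def plan_and_assemble(fragments, min_score=0.5, max_fragments=3):
    """Pick the most relevant verified fragments and assemble an answer
    from them, keeping citations so every claim traces to a source."""
    chosen = sorted(
        (f for f in fragments if f.score >= min_score),
        key=lambda f: f.score,
        reverse=True,
    )[:max_fragments]
    answer = " ".join(f.text for f in chosen)
    citations = [f.source_id for f in chosen]
    return answer, citations
```

The key property is that the answer is a composition of checked fragments, so grounding can be audited fragment by fragment.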

A system of constraints instead of a single goal

Quality criteria are set by rules and metrics: factual accuracy, length, personalization, diversity, and a ban on fabricated facts.
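Such a system of constraints can be expressed as an explicit checklist rather than a single score. A minimal sketch, with hypothetical names; the grounding check here is a crude substring test standing in for a real fact-verification model:

```python
def check_constraints(answer: str, sources: list[str],
                      max_chars: int = 400) -> dict[str, bool]:
    """Evaluate an answer against several independent constraints,
    returning a per-rule report instead of one aggregate score."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    return {
        "length_ok": len(answer) <= max_chars,
        "non_empty": bool(answer.strip()),
        # "no fabricated facts": every sentence must appear in a source
        "grounded": all(
            any(sent in src for src in sources) for sent in sentences
        ),
    }

def passes(report: dict[str, bool]) -> bool:
    """Release gate: every constraint must hold simultaneously."""
    return all(report.values())
```

Keeping the rules separate makes failures diagnosable: a violation names the constraint, not just a low score.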

Repeatable learning cycle

AI trainers evaluate the answers, the generative and reward models are retrained, and changes are rolled out and re-measured through feedback.

Orchestration of multiple models

Even without changing the base weights, quality improves through a pipeline of several models, tools, and additional compute.
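One common way to convert extra inference-time compute into quality without touching base weights is best-of-N sampling reranked by a reward model. A sketch under stated assumptions: `generate_candidates` and `reward` are toy stand-ins for a generative model and a reward model, not anything from the talk.

```python
import random

def generate_candidates(prompt: str, n: int, rng: random.Random) -> list[str]:
    """Stand-in for a generative model: produces n candidate drafts."""
    return [f"{prompt} :: draft {rng.randint(0, 999)}" for _ in range(n)]

def reward(answer: str) -> float:
    """Stand-in reward model: a toy heuristic preferring shorter drafts."""
    return -len(answer)

def best_of_n(prompt: str, n: int = 8, seed: int = 0) -> str:
    """Spend extra compute at inference: sample n drafts and keep the
    one the reward model ranks highest. Base weights are never updated."""
    rng = random.Random(seed)
    candidates = generate_candidates(prompt, n, rng)
    return max(candidates, key=reward)
```

The same skeleton generalizes to full orchestration: swap the sampler for several specialized models and the reward call for a routing or verification step.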

Related chapter

AI Engineering

A systematic view of the life cycle of an AI product in production.


ML as an assembly line

1. Evaluation

AI trainers mark and rank answers.

2. Training

The generative model and reward model are updated.

3. Rollout

Changes go into online experiments and A/B tests.

4. Feedback

Metrics and feedback complete the next cycle.
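The four steps above can be sketched as one function whose output seeds the next iteration. All callables are hypothetical stand-ins: `label_fn` mimics AI trainers, `train_fn` updates the models, `deploy_fn` ships the change, and `metric_fn` reads online feedback.

```python
def training_cycle(answers, label_fn, train_fn, deploy_fn, metric_fn):
    """One pass of the evaluate -> train -> rollout -> feedback loop."""
    labels = [label_fn(a) for a in answers]   # 1. evaluation by trainers
    model = train_fn(answers, labels)         # 2. training on the labels
    deploy_fn(model)                          # 3. rollout to experiments
    return metric_fn(model)                   # 4. feedback for next cycle
```

Writing the loop down as code makes the point concrete: the unit of delivery is a cycle, not a model.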

Common problems and fixes

Reward-hacking

Symptom: The model adapts to the evaluator: it artificially lengthens answers, copies sources verbatim, and adds unnecessary disclaimers.

Fix: Length regularization, penalties for copy-paste and bureaucratic boilerplate, targeted style tuning, and contextual use of disclaimers.
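Length regularization is the simplest of these fixes to show. A minimal sketch with illustrative numbers (the target length and weight are assumptions, not values from the talk):

```python
def regularized_reward(base_reward: float, answer: str,
                       target_len: int = 200,
                       length_weight: float = 0.01) -> float:
    """Subtract a penalty proportional to the deviation from a target
    length, so the policy cannot game the evaluator by padding answers."""
    length_penalty = length_weight * abs(len(answer) - target_len)
    return base_reward - length_penalty
```

With these numbers, padding an answer from 200 to 500 characters costs 3.0 reward, which outweighs any gain the evaluator might assign to sheer length.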

Fuzzy product requirements

Symptom: Instructions at the level of “be smart and useful” do not translate into stable product results or a reproducible pipeline.

Fix: Formalized intents, constraints, test cases, and quality metrics as mandatory release artifacts.
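What such a release artifact might look like, sketched as data plus a gate. Everything here (`RELEASE_SPEC`, the field names, `run_release_checks`) is hypothetical, shown only to contrast with "be smart and useful":

```python
# A release spec: intent, constraints, and test cases written down
# as data that CI can execute, instead of an informal instruction.
RELEASE_SPEC = {
    "intent": "answer factual questions about products",
    "constraints": {"max_chars": 300},
    "test_cases": [
        {"query": "battery life?", "must_contain": "hours"},
    ],
}

def run_release_checks(answer_fn, spec=RELEASE_SPEC):
    """Run every test case against the candidate system; return the
    list of failures. An empty list means the release gate passes."""
    failures = []
    for case in spec["test_cases"]:
        ans = answer_fn(case["query"])
        if case["must_contain"] not in ans:
            failures.append((case["query"], "missing required content"))
        if len(ans) > spec["constraints"]["max_chars"]:
            failures.append((case["query"], "too long"))
    return failures
```

The spec, not the prompt alone, becomes the reproducible artifact that each release is checked against.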

Related topic

Observability & Monitoring Design

How to build observability and alerting for production systems.


What does this change in system design?

  • Prompts, rules, and the reward model become first-class system artifacts, on par with API contracts and source code.
  • You need an observability loop for answer quality: truthfulness, length, duplication, CTR/satisfaction, share of escalations.
  • Product and ML development merge into a single cycle: hypothesis → experiment → measurement → fine-tuning → rollout.
  • The reference base and retrieval loop are critical for verifiability: the assistant must rely on checkable sources.



© 2026 Alexander Polomodov