Programming meanings
Analysis of the report by Alexey Gusakov (Yandex): how product development is shifting from detailed coding of algorithms to designing intentions, constraints, and reward cycles.
Source
Telegram: book_cube
A review of the report with engineering and product-architecture takeaways.
What is “meaning programming”
The main idea: you program not only code branches but also model behavior, through intentions, constraints, knowledge context, tools, and success metrics.
In this paradigm, value is created through an iterative cycle: hypothesis → prototype → measurement → fine-tuning → integration, not a single delivery of the "ideal algorithm".
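The cycle above can be sketched as a loop skeleton. Everything here is an illustrative stand-in (the function names, the quality bar, the toy "prototype"), not the report's actual process:

```python
# Minimal sketch of the iterative cycle: hypothesis -> prototype ->
# measurement -> fine-tuning -> integration. All names and numbers
# are illustrative stand-ins, not a real training loop.

def run_cycle(measure, fine_tune, prototype, quality_bar=0.8, max_rounds=5):
    """Iterate until measured quality clears the bar, then integrate."""
    for _ in range(max_rounds):
        score = measure(prototype)
        if score >= quality_bar:
            return prototype, score              # good enough: integrate
        prototype = fine_tune(prototype, score)  # targeted additional training
    return prototype, measure(prototype)         # best effort after max_rounds

# Toy usage: the prototype is a number and "quality" is its value.
proto, score = run_cycle(measure=lambda p: p,
                         fine_tune=lambda p, s: p + 0.25,
                         prototype=0.25)
```

The point of the sketch is structural: measurement gates integration, and fine-tuning is just another step inside the loop rather than a one-off event.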
The path, step by step
2022: “Product Guru” and first mistakes
A conversational assistant for picking products showed that a "questionnaire disguised as a dialogue" irritates users. The mistakes became a source of signals about what makes a useful UX.
Pivot after the release of ChatGPT
Instead of a “big magic release,” the team chose an incremental path: improving the existing output in small, verifiable steps.
Answers from structured sources
The model plans which documents to use and assembles the answer from verified fragments rather than "generating it out of thin air."
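A minimal sketch of this "plan, then assemble from checked fragments" pattern. The tiny keyword retriever, the corpus, and the citation format are all assumptions for illustration, not the actual pipeline:

```python
# Sketch: answer only from retrieved fragments, never from thin air.
# The keyword-overlap retriever and citation format are illustrative.

def retrieve(query, corpus, top_k=2, min_overlap=2):
    """Rank documents by naive keyword overlap; drop weak matches."""
    words = set(query.lower().split())
    scored = [(len(words & set(text.lower().split())), doc_id, text)
              for doc_id, text in corpus.items()]
    scored = [s for s in scored if s[0] >= min_overlap]
    scored.sort(reverse=True)
    return [(doc_id, text) for _, doc_id, text in scored[:top_k]]

def answer(query, corpus):
    """Assemble the answer strictly from retrieved fragments, with sources."""
    fragments = retrieve(query, corpus)
    if not fragments:
        return "No verified source found."   # refuse instead of inventing
    return " ".join(f"{text} [{doc_id}]" for doc_id, text in fragments)

corpus = {
    "doc1": "The return window is 30 days for electronics",
    "doc2": "Shipping is free for orders over 50 dollars",
}
```

The key design choice is the refusal branch: when no fragment clears the relevance bar, the assistant declines rather than generating an unsupported answer.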
A system of constraints instead of a single goal
Quality criteria are set by rules and metrics: truthfulness, length, personalization, variety, and a ban on fabricated facts.
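Such a system of constraints can be expressed as a set of independent, named checks rather than one blended score. The concrete thresholds and the banned-phrase list below are illustrative assumptions:

```python
# Sketch: quality as a set of rule checks, not a single objective.
# Thresholds and the banned-phrase list are illustrative assumptions.

MAX_WORDS = 60
BANNED = ["guaranteed cure", "100% certain"]   # proxy for fabricated claims

def check_answer(text, sources):
    """Return a dict of named constraint verdicts for one answer."""
    words = text.split()
    return {
        "length_ok": len(words) <= MAX_WORDS,
        "has_sources": bool(sources),                    # verifiability proxy
        "no_banned_phrases": not any(b in text.lower() for b in BANNED),
        "varied": len(set(words)) / max(len(words), 1) > 0.5,  # crude variety
    }

verdict = check_answer("This model is 100% certain the cable fits.", [])
```

Keeping the verdicts separate (rather than summing them) makes it clear which constraint a release regresses on.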
Repeatable learning cycle
AI trainers evaluate the answers, the generative and reward models are retrained, and changes are rolled out and measured again against feedback.
Orchestration of multiple models
Even without changing the base weights, quality improves through a pipeline of several models, tools, and additional inference-time compute.
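The orchestration idea can be sketched with frozen "models" as plain functions. The router, drafting, and verification stages are hypothetical stubs, not the production components:

```python
# Sketch: quality from composing several frozen models plus extra compute,
# without touching any base weights. All components are illustrative stubs.

def route(query):
    """A small 'router model': pick a pipeline per query type."""
    return "factual" if "?" in query else "chitchat"

def draft(query, n=3):
    """Spend extra compute: sample several candidate answers."""
    return [f"draft {i}: answer to {query!r}" for i in range(n)]

def verify(candidates):
    """A 'verifier model': score candidates and keep the best."""
    return max(candidates, key=len)        # stand-in for a learned scorer

def orchestrate(query):
    if route(query) == "factual":
        return verify(draft(query))        # best-of-n with a verifier
    return draft(query, n=1)[0]            # cheap single pass for chitchat
```

The best-of-n-plus-verifier pattern is the simplest way to trade extra compute for quality while every model in the pipeline stays frozen.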
Related chapter
AI Engineering
A systematic view of the life cycle of an AI product in production.
ML as an assembly line
1. Evaluation
AI trainers mark and rank answers.
2. Training
The generative model and reward model are updated.
3. Rollout
Changes go out to online experiments and A/B tests.
4. Feedback
Metrics and user feedback close the loop and seed the next cycle.
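Step 2 of this conveyor can be illustrated with the standard pairwise preference loss for a reward model. This is a generic Bradley-Terry sketch on toy feature vectors, not the production setup, where a neural network scores full answers:

```python
import math

# Sketch: gradient steps of a linear reward model on a ranked answer pair.
# Features and learning rate are toy values.

def reward(w, features):
    """Linear reward: dot product of weights and answer features."""
    return sum(wi * fi for wi, fi in zip(w, features))

def pairwise_step(w, better, worse, lr=0.1):
    """Bradley-Terry loss: push reward(better) above reward(worse)."""
    margin = reward(w, better) - reward(w, worse)
    grad_scale = -1.0 / (1.0 + math.exp(margin))   # d(-log sigmoid)/d margin
    return [wi - lr * grad_scale * (b - a)
            for wi, b, a in zip(w, better, worse)]

w = [0.0, 0.0]
# An AI trainer preferred the answer with features [1, 0] over [0, 1].
for _ in range(50):
    w = pairwise_step(w, better=[1.0, 0.0], worse=[0.0, 1.0])
```

After training, the preferred answer scores higher, and that learned preference is what steers the generative model in the next turn of the conveyor.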
Common problems and fixes
Reward-hacking
Symptom: The model games the evaluator: it artificially lengthens answers, copies sources verbatim, and adds unnecessary disclaimers.
Fix: Length regularization, penalties for copy-paste and officialese, targeted style tuning, and contextual use of disclaimers.
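One common way to implement these fixes is to shape the reward directly. The penalty weights and the word-overlap copy detector below are illustrative assumptions:

```python
# Sketch: shaping a raw reward with length regularization and a copy-paste
# penalty. Weights and the overlap heuristic are illustrative assumptions.

def shaped_reward(raw, answer, source, target_len=40,
                  len_weight=0.01, copy_weight=0.5):
    """Subtract penalties for over-long and near-verbatim answers."""
    words = answer.split()
    length_penalty = len_weight * max(0, len(words) - target_len)
    # Crude copy-paste detector: share of answer words taken from the source.
    overlap = len(set(words) & set(source.split())) / max(len(set(words)), 1)
    copy_penalty = copy_weight * overlap if overlap > 0.8 else 0.0
    return raw - length_penalty - copy_penalty
```

Because the penalties enter the reward itself, the model is discouraged from padding or copying during training instead of being filtered after the fact.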
Fuzzy product requirements
Symptom: Instructions at the level of "be smart and useful" do not translate into stable product results or a reproducible pipeline.
Fix: Formalized intents, constraints, test cases, and quality metrics as mandatory release artifacts.
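Formalized test cases can then act as a release gate. The cases, the stub assistant, and the pass bar here are illustrative placeholders:

```python
# Sketch: test cases and quality thresholds as mandatory release artifacts.
# Cases, the stub assistant, and the pass bar are illustrative assumptions.

RELEASE_CASES = [
    {"query": "return policy", "must_include": "30 days"},
    {"query": "shipping cost", "must_include": "free"},
]

def release_gate(assistant, cases=RELEASE_CASES, pass_bar=1.0):
    """Block the rollout unless enough formalized test cases pass."""
    passed = sum(case["must_include"] in assistant(case["query"])
                 for case in cases)
    return passed / len(cases) >= pass_bar

# Stub assistant standing in for the real model:
stub = {"return policy": "Returns accepted within 30 days.",
        "shipping cost": "Shipping is free over $50."}.get
```

Versioning `RELEASE_CASES` alongside the code is what turns vague instructions into a reproducible, regression-checked pipeline.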
Related topic
Observability & Monitoring Design
How to build observability and alerting for production systems.
What this changes in system design
- Prompts, rules, and the reward model become first-class system artifacts, on par with API contracts and source code.
- An observability loop for answer quality is needed: truthfulness, length, duplication, CTR/satisfaction, share of escalations.
- Product and ML development merge into a single cycle: hypothesis → experiment → measurement → fine-tuning → rollout.
- The knowledge base and retrieval loop are critical for verifiability: the assistant must rely on checkable sources.
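In practice the answer-quality observability loop reduces to aggregating a few per-answer signals into dashboard metrics. The log record schema below is an assumption for illustration:

```python
# Sketch: aggregate answer-quality metrics from logged interactions.
# The log record schema is an illustrative assumption.

def quality_dashboard(logs):
    """Compute the monitoring metrics over a batch of answer logs."""
    n = len(logs)
    return {
        "truthful_rate": sum(r["truthful"] for r in logs) / n,
        "avg_length": sum(r["length"] for r in logs) / n,
        "duplicate_rate": sum(r["duplicate"] for r in logs) / n,
        "escalation_share": sum(r["escalated"] for r in logs) / n,
    }

logs = [
    {"truthful": 1, "length": 42, "duplicate": 0, "escalated": 0},
    {"truthful": 0, "length": 80, "duplicate": 1, "escalated": 1},
]
metrics = quality_dashboard(logs)
# -> {'truthful_rate': 0.5, 'avg_length': 61.0,
#     'duplicate_rate': 0.5, 'escalation_share': 0.5}
```

Wired to alerting thresholds, these aggregates play the same role for answer quality that latency and error-rate dashboards play for conventional services.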

