Source
Alexander Polomodov
The approach is based on Alexander's work on conducting System Design interviews at Russian big tech companies.
Long-term preparation for the System Design Interview is not about memorizing ready-made answers but about developing architectural thinking. In this chapter we will take a closer look at which skills are needed at each stage of the interview and which resources help develop them.
Key idea
Let's go through the 7 stages of a System Design interview and consider for each one what you need to know, which books and resources to study, and which practical skills to practice.
Stage 1: Formalization of requirements
At this stage, the most important thing is to ask the right questions to clarify what is in scope for the task and what can safely be left out. You need to know how to collect functional requirements as scenarios and how to define non-functional characteristics.
What you need to know
- Find out the scope of the task through the right questions
- Collect functional requirements as use cases
- Define non-functional requirements (architectural characteristics)
- Prioritize conflicting demands
Requirements Gathering Approaches
Use Cases (UML)
The classic approach, with actors, systems and a list of scenarios. For every scenario, the happy path and the exceptional flows are recorded, which makes it clear how the scenario should work.
Actor → System → Scenario (Happy + Exceptional)
User Story
An informal approach that describes a feature from the point of view of the user who benefits from it.
Jobs to be Done (JTBD)
Focus on outcome and context - what "work" the user "hires" the system to perform. The same product can be “hired” by different users for different purposes.
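The use-case approach described above can be captured as a small record. This is a minimal sketch, not a prescribed format: the `UseCase` class, its field names and the URL-shortener scenario are all hypothetical illustrations.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    """One scenario: who triggers it, the happy path, and what can go wrong."""
    actor: str
    goal: str
    happy_path: list[str]
    # Maps a failure condition to the system's expected reaction.
    exceptional_flows: dict[str, str] = field(default_factory=dict)

    def is_complete(self) -> bool:
        # Treat a scenario as complete only once it names at least one failure mode.
        return bool(self.happy_path) and bool(self.exceptional_flows)


# Hypothetical example scenario, not taken from the chapter.
shorten = UseCase(
    actor="Registered user",
    goal="Create a short link",
    happy_path=[
        "User submits a long URL",
        "System validates the URL",
        "System stores the mapping and returns a short link",
    ],
    exceptional_flows={
        "URL is malformed": "Reject with a validation error",
        "Storage is unavailable": "Return a retryable error",
    },
)
```

Writing the exceptional flows next to the happy path makes it harder to forget them when the interviewer asks what happens on failure.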
Non-functional requirements analysis
To analyze architectural characteristics, I recommend studying the Architecture Tradeoff Analysis Method (ATAM). Key concepts:
- Sensitivity points — decisions affecting only one quality attribute
- Trade-off points — decisions affecting multiple attributes (improving one makes another worse)
- Fit for purpose — the system meets the stated requirements
- Fit for use — the system is convenient for real use
Related chapter
API Gateway
How to design a public API, ensure security and routing of requests.
Stage 2: System Boundaries and Public API
At this stage, you need to determine how the outside world will interact with the system. This includes choosing protocols, data formats, and understanding the network stack.
What you need to know
Network stack
- TCP/IP and UDP - when to use what
- HTTP/1.1, HTTP/2, HTTP/3 - evolution and trade-offs
- WebSocket - for real-time communication
- DNS, CDN, Load Balancers
API styles
- REST — resource-based approach
- gRPC - efficient RPC with protobuf
- GraphQL — flexible requests from the client
- Messaging - asynchronous integration
C4 Model for visualization
I recommend mastering the C4 Model, an approach to visualizing architecture at different levels:
- C1 (Context) - the system and its environment
- C2 (Container) - applications and storage
- C3 (Component) - components inside a container
- C4 (Code) - classes and functions
Stage 3: Basic data flows
At this stage, the main data paths through the system are designed: how information is written and read, and which components are involved.
Write Path
The path data takes on a write, from the client to persistent storage.
- Input Validation
- Write-ahead logging
- Synchronous/asynchronous writes
- Replication and confirmation
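The write-path steps listed above can be sketched as a toy, in-memory leader with followers. This is only an illustration under stated assumptions: `Leader`, `Replica`, `WriteError`, the 1024-character limit and the `required_acks` parameter are hypothetical, the "log" is a Python list rather than a file on disk, and replication is synchronous.

```python
import json


class WriteError(Exception):
    pass


class Replica:
    """In-memory stand-in for a follower node."""

    def __init__(self, healthy: bool = True):
        self.healthy = healthy
        self.data: dict[str, str] = {}

    def apply(self, key: str, value: str) -> bool:
        if not self.healthy:
            return False
        self.data[key] = value
        return True


class Leader:
    def __init__(self, replicas: list[Replica], required_acks: int):
        self.wal: list[str] = []  # append-only log, stands in for a file on disk
        self.data: dict[str, str] = {}
        self.replicas = replicas
        self.required_acks = required_acks

    def write(self, key: str, value: str) -> int:
        # 1. Input validation (the size limit is an arbitrary example).
        if not key or len(value) > 1024:
            raise WriteError("invalid input")
        # 2. Write-ahead logging: record the intent before changing state.
        self.wal.append(json.dumps({"key": key, "value": value}))
        # 3. Apply locally.
        self.data[key] = value
        # 4. Replicate synchronously and count acknowledgements.
        acks = sum(1 for replica in self.replicas if replica.apply(key, value))
        if acks < self.required_acks:
            # Simplification: the local write is not rolled back here.
            raise WriteError(f"only {acks} of {self.required_acks} required acks")
        return acks
```

Raising `required_acks` trades write latency and availability for durability, which is the kind of trade-off worth saying out loud in the interview.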
Read Path
The path data takes on a read, from the incoming request to the response returned to the client.
- Caching (L1, L2, CDN)
- Reading from replicas
- Pagination and streaming
- Materialized views
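Two of the read-path items above, caching and pagination, can be sketched together. This is a minimal illustration, assuming a cache-aside strategy and keyset (cursor) pagination. The `ReadPath` class is hypothetical, and plain dictionaries stand in for the replica and the cache.

```python
from typing import Optional


class ReadPath:
    def __init__(self, replica: dict[str, str]):
        self.replica = replica            # stands in for a read replica
        self.cache: dict[str, str] = {}   # stands in for an in-process or remote cache
        self.replica_reads = 0

    def get(self, key: str) -> Optional[str]:
        # Cache-aside: try the cache first, fall back to the replica on a miss.
        if key in self.cache:
            return self.cache[key]
        self.replica_reads += 1
        value = self.replica.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def page(self, after: Optional[str], limit: int) -> tuple[list[str], Optional[str]]:
        # Keyset pagination: return keys strictly greater than the cursor.
        keys = sorted(k for k in self.replica if after is None or k > after)
        chunk = keys[:limit]
        # A cursor is returned only when more keys remain after this chunk.
        next_cursor = chunk[-1] if len(keys) > limit else None
        return chunk, next_cursor
```

The sketch leaves out cache invalidation and replica lag, which are usually the next questions an interviewer asks about a read path.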
Notations for describing flows
- Sequence Diagram (UML) - sequence of messages between components
- Activity Diagram (UML) - business process with branches and parallel flows
- BPMN - more formal notation for business processes
- Data Flow Diagram - movement of data between processes and storages
Related chapter
Guide to Databases
A framework for designing data schemas and selecting storage.
Stage 4: Conceptual Data Diagram
At this stage, a data model is designed without reference to specific technologies. Entities, their attributes and relationships between them are determined.
Stateful vs Stateless
Stateful components
Store state between requests. Harder to scale.
- Databases
- Caches with persistence
- Message brokers
- Session stores
Stateless components
They do not persist request state and are easy to scale horizontally.
- API servers
- Workers
- Load balancers
- Gateways
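The split above can be illustrated with a stateless API server that delegates all per-session state to a shared store. This is a sketch under assumptions: `SessionStore` and `ApiServer` are hypothetical names, and the store is an in-memory object standing in for an external service.

```python
class SessionStore:
    """Stateful component: stands in for an external store shared by all servers."""

    def __init__(self):
        self._carts: dict[str, list[str]] = {}

    def append(self, session_id: str, item: str) -> list[str]:
        self._carts.setdefault(session_id, []).append(item)
        return list(self._carts[session_id])


class ApiServer:
    """Stateless component: keeps nothing between requests except a store handle."""

    def __init__(self, store: SessionStore):
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> list[str]:
        return self.store.append(session_id, item)


store = SessionStore()
server_a, server_b = ApiServer(store), ApiServer(store)
server_a.add_to_cart("s1", "book")
# A different instance serves the next request and still sees the same cart.
cart = server_b.add_to_cart("s1", "pen")
```

Because either instance can serve any request, adding a third `ApiServer` needs no data migration. The scaling problem moves to the store.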
Related chapter
Learning Domain-Driven Design
Practical DDD: Strategic and tactical design, contexts and events.
Domain-Driven Design
To design complex domains, I recommend studying DDD. Key Concepts:
- Bounded Context — clear boundaries of the model with a single language inside
- Aggregate — cluster of objects forming a consistency unit
- Entity vs Value Object — objects with identity vs immutable values
- Domain Events — events significant for business
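The tactical concepts above map naturally onto small classes. The sketch below is one possible illustration, not a canonical DDD implementation. The `Money`, `Order` and `OrderPlaced` names and the order-placing rules are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Money:
    """Value Object: immutable and compared by value."""
    amount: int      # minor units, for example cents
    currency: str


@dataclass(frozen=True)
class OrderPlaced:
    """Domain Event: a fact that matters to the business."""
    order_id: str
    total: Money


@dataclass(eq=False)
class Order:
    """Entity and aggregate root: identified by order_id, guards its invariants."""
    order_id: str
    lines: list[Money] = field(default_factory=list)
    placed: bool = False

    def __eq__(self, other: object) -> bool:
        # Entities are equal when their identities match, whatever their state.
        return isinstance(other, Order) and other.order_id == self.order_id

    def __hash__(self) -> int:
        return hash(self.order_id)

    def add_line(self, price: Money) -> None:
        if self.placed:
            raise ValueError("cannot modify a placed order")
        if self.lines and price.currency != self.lines[0].currency:
            raise ValueError("mixed currencies")
        self.lines.append(price)

    def place(self) -> OrderPlaced:
        if not self.lines:
            raise ValueError("empty order")
        self.placed = True
        total = Money(sum(line.amount for line in self.lines), self.lines[0].currency)
        return OrderPlaced(self.order_id, total)
```

All changes go through the aggregate root, so the invariants (single currency, no edits after placing) are enforced in one place.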
Stage 5: Selecting Technologies
At this stage, the conceptual design becomes a concrete one: specific technologies are chosen, taking into account their trade-offs and failure domains.
Technology categories
Databases
- PostgreSQL - ACID, complex queries, JSON
- MySQL - reliability, replication
- MongoDB - documents, flexible schema
- Cassandra - high availability, scale
Caching
- Redis - data structures, pub/sub, persistence
- Memcached - simple key-value, multi-threading
Message Queues
- Kafka - high throughput, replay
- RabbitMQ - flexible routing, AMQP
- SQS - managed, serverless
Search
- Elasticsearch - full text search, analytics
- Meilisearch - simple, typo-tolerant
Failure Domains
It's important to understand
For each technology, think about: what happens if it fails? What is the blast radius? How will the system recover? This shows the maturity of the engineer in the interview.
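One way to reason about blast radius is to walk a dependency graph from the failed component to everything that relies on it. The sketch below assumes every dependency is a hard one (no fallbacks or degraded modes). The `DEPENDS_ON` map and the component names are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["postgres", "redis"],
    "worker": ["kafka", "postgres"],
    "postgres": [],
    "redis": [],
    "kafka": [],
}


def blast_radius(failed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents: dict[str, list[str]] = {}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    affected: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected
```

In this toy graph a `postgres` outage reaches `api`, `web` and `worker`, while a `redis` outage reaches only `api` and `web`. In a real design, a cache failure would often degrade the system rather than take it down, which is exactly the distinction to discuss.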
Stage 6: Scaling
When the basic architecture is ready, we need to discuss how the system will grow. What happens when the load increases by 10x, 100x, 1000x?
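A quick back-of-the-envelope calculation helps make the growth question concrete. The function below is a rough sketch. The baseline of 1500 requests per second, the 400 requests per second per instance and the 50% headroom are made-up numbers chosen only for illustration.

```python
import math


def servers_needed(peak_rps: float, rps_per_server: float, headroom: float = 0.5) -> int:
    """Instances required to serve peak_rps while keeping `headroom` spare capacity."""
    usable = rps_per_server * (1 - headroom)
    return math.ceil(peak_rps / usable)


# Hypothetical baseline load and per-instance capacity.
BASELINE_RPS = 1500
plan = {factor: servers_needed(BASELINE_RPS * factor, 400) for factor in (1, 10, 100, 1000)}
```

The instance count is the easy part. The more useful question is which shared component (database, queue, network link) stops scaling first as the factor grows.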
Vertical scaling
Increasing the resources of one machine (CPU, RAM, Disk).
- ✅ Easy to implement
- ✅ No code changes
- ❌ Limited by machine size
- ❌ Single point of failure
Horizontal scaling
Adding new system instances.
- ✅ Almost limitless
- ✅ Fault tolerance
- ❌ More difficult to implement
- ❌ Requires stateless design
Data Scaling Techniques
- Partitioning — separating data by key (user_id, region, time)
- Sharding — distribution of data across several independent databases
- Consistent hashing — minimizing redistribution when adding nodes
- Replication — copies of data for reading and fault tolerance
- CQRS — separation of read and write models
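Consistent hashing from the list above can be sketched with a sorted ring of virtual nodes. This is a minimal illustration, not a production implementation. The `HashRing` class, the node names and the choice of 100 virtual nodes per server are assumptions, and MD5 is used only as a stable, well-spread hash.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # MD5 is used only for a stable, well-spread hash, not for security.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes: list[str], vnodes: int = 100):
        self.vnodes = vnodes
        self._ring: list[tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each physical node owns many positions to even out the distribution.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        # First virtual node clockwise from the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


keys = [f"user_{i}" for i in range(10_000)]
ring = HashRing(["db-1", "db-2", "db-3"])
before = {k: ring.node_for(k) for k in keys}
ring.add("db-4")
moved = sum(1 for k in keys if ring.node_for(k) != before[k])
# For comparison: naive modulo placement when going from 3 to 4 nodes.
naive_moved = sum(1 for k in keys if _hash(k) % 3 != _hash(k) % 4)
```

With the ring, only keys that now belong to `db-4` change owner, roughly a quarter of them on average. With modulo placement, most keys move.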
Stage 7: Advanced Topics
If time remains, discuss topics for long-term system operation. This shows that you think not only about development, but also about production.
Observability
- Metrics (Prometheus, Grafana)
- Logs (ELK, Loki)
- Traces (Jaeger, Zipkin)
- SLI/SLO/SLA
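The relationship between an SLI, an SLO and the resulting error budget comes down to simple arithmetic. The sketch below assumes a request-based availability SLI and a 30-day window. The function names and the sample numbers are hypothetical.

```python
def error_budget(slo: float, total_requests: int, failed_requests: int) -> dict[str, float]:
    """Compare observed failures against what the SLO allows."""
    if not 0 < slo < 1 or total_requests <= 0:
        raise ValueError("slo must be in (0, 1) and total_requests must be positive")
    allowed_failures = (1 - slo) * total_requests
    return {
        "sli": 1 - failed_requests / total_requests,      # measured success ratio
        "allowed_failures": allowed_failures,             # budget in requests
        "budget_remaining": 1 - failed_requests / allowed_failures,
    }


def allowed_downtime_minutes(slo: float, days: int = 30) -> float:
    """Full-outage minutes that fit into the SLO over the given window."""
    return (1 - slo) * days * 24 * 60


# Hypothetical month: 1,000,000 requests, 250 failures, 99.9% availability SLO.
report = error_budget(0.999, 1_000_000, 250)
```

For a 99.9% SLO the budget is about 43 minutes of full outage per 30 days. In the sample above, about three quarters of the request budget is still unspent.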
Deployment
- Blue-green deployment
- Canary releases
- Feature flags
- Rollback strategies
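Canary releases and percentage-based feature flags both need a stable way to decide which users get the new behavior. One common approach is to hash the user into a bucket. The sketch below assumes that approach. The feature name and the version labels are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a user in or out of a percentage rollout."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < percent


def route(user_id: str, canary_percent: int) -> str:
    # Hypothetical version labels for a canary release.
    if in_rollout(user_id, "checkout-v2", canary_percent):
        return "v2-canary"
    return "v1-stable"
```

Because the bucket depends only on the feature name and user id, a user who is in the rollout at 10% stays in it at 50%. Rolling back is a matter of setting the percentage to 0.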
Security
- Authentication & Authorization
- Encryption at rest/in transit
- API rate limiting
- Audit logging
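API rate limiting from the list above is often implemented with a token bucket. The sketch below is one minimal version, under assumptions. The `TokenBucket` class is hypothetical, and time is passed in explicitly so the behavior is easy to test. A real service would more likely read a monotonic clock and keep one bucket per client.

```python
class TokenBucket:
    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start full so an initial burst is allowed
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill based on elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=2.0)
burst = [bucket.allow(now=0.0) for _ in range(3)]   # third call exceeds the burst
later = bucket.allow(now=1.0)                        # one token refilled after 1 second
```

The `capacity` parameter controls how large a burst is tolerated, while `rate` sets the sustained limit.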
Disaster Recovery
- RTO & RPO
- Backup strategies
- Failover automation
- Chaos engineering
Recommended reading
For a deep understanding of System Design, we recommend studying professional literature. Books provide a foundation that will not be out of date in a year and help you understand the principles and not just remember ready-made solutions.
Part 4: Review of System Design Sources
A complete collection of book reviews on distributed systems, architecture, microservices, DDD and SRE - with key ideas and practical conclusions
Conclusion
Long-term training is a marathon, not a sprint. Don't try to read everything in a month. Instead, select 2-3 books from the list and study them deeply. Practice on real projects and in mock interviews.
The main thing is to develop architectural thinking rather than memorize ready-made solutions. The interviewer can always change the conditions of the task, and it is important to be able to adapt.
In the next chapter we will look at short-term preparation tactics: what to do if the interview is in a week.
