At the ArchDays 2022 conference, we conducted a public System Design interview in a format as close as possible to a real interview at Big Tech companies. The task was to design a hotel reservation system. This analysis walks through the entire process, from formalizing requirements to discussing scaling.
Related case
Designing Airbnb
A similar task with geo-search, dynamic pricing, and a two-sided marketplace.
Video recording of the interview
The full recording of the public interview is available on YouTube. I recommend watching it before reading the analysis to see the dynamics of the process.
Statement of the problem
Hotel Booking System
Design a hotel reservation system for the Russian market.
Total number of hotels in the system
Total number of rooms
Functional Requirements
View information
Detailed information about hotels and rooms
Online booking
Select dates and room type with payment
Payment
Integration with payment system
Cancellation
Possibility to cancel a reservation
Important business requirement: Overbooking up to 10%
The system must support overbooking: the ability to sell more rooms than are actually available. This is standard practice in the hotel industry, since some reservations are canceled and some guests do not arrive (no-show).
What was the scope?
Load Estimation (Back of the Envelope)
Important: the average load is small, but there will be significant peaks during seasonal sales, holidays, and flash sales. The system must withstand a load 10-100 times higher than average.
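The arithmetic behind such an estimate can be sketched as follows. The daily reservation count of 200,000 is a hypothetical input, not a figure from the interview. It is chosen only because it lands near the ~2.3 TPS average quoted later in this analysis. The 10-100x multipliers are the ones stated above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_tps(reservations_per_day: float) -> float:
    """Average write rate if reservations were spread evenly over the day."""
    return reservations_per_day / SECONDS_PER_DAY


def peak_range(avg_tps: float, low: int = 10, high: int = 100) -> tuple[float, float]:
    """Peak load window for the 10-100x multipliers mentioned in the text."""
    return avg_tps * low, avg_tps * high


# Hypothetical input: 200,000 reservations per day (an assumption for illustration).
DAILY_RESERVATIONS = 200_000

avg = average_tps(DAILY_RESERVATIONS)
peak_low, peak_high = peak_range(avg)
print(f"average: {avg:.2f} TPS, peaks: {peak_low:.0f} to {peak_high:.0f} TPS")
```

Even the upper end of that range is a few hundred writes per second, which a single well-indexed relational database can usually absorb.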
Public API
/v1/hotels/{hotel_id} - Getting hotel information
/v1/hotels/{hotel_id}/rooms/{room_type_id} - Getting information about the room type
/v1/reservations - Making a reservation
{
  "hotel_id": "123",
  "room_type_id": "456",
  "start_date": "2024-03-15",
  "end_date": "2024-03-20",
  "guest_info": { ... }
}
/v1/reservations/{reservation_id} - Cancellation
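As an illustration of how a service might validate the reservation payload before touching inventory, here is a minimal sketch. The class and function names are hypothetical, and it assumes end_date is the check-out day (so it is not counted as a night), which the interview does not state explicitly. guest_info is left out because its shape is not given.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReservationRequest:
    hotel_id: str
    room_type_id: str
    start_date: date
    end_date: date  # assumed to be the check-out day (exclusive)

    def nights(self) -> int:
        return (self.end_date - self.start_date).days


def parse_reservation(payload: dict) -> ReservationRequest:
    """Parse the JSON body of the reservation request and reject empty stays."""
    request = ReservationRequest(
        hotel_id=str(payload["hotel_id"]),
        room_type_id=str(payload["room_type_id"]),
        start_date=date.fromisoformat(payload["start_date"]),
        end_date=date.fromisoformat(payload["end_date"]),
    )
    if request.nights() < 1:
        raise ValueError("end_date must be after start_date")
    return request


example = parse_reservation({
    "hotel_id": "123",
    "room_type_id": "456",
    "start_date": "2024-03-15",
    "end_date": "2024-03-20",
})
print(example.nights())
```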
Evolution of the data model
One of the key topics of the interview is how to design a data model that supports overbooking and high contention.
Naive approach: Reservations table
Reservation
├── id
├── hotel_id
├── room_type_id
├── room_id (specific room)
├── start_date
├── end_date
├── status
└── guest_id
Problem 1: To check availability you need to do a COUNT on all reservations for the date
Problem 2: Overbooking is difficult to implement - you need to know the exact number of available rooms
Problem 3: Race condition for simultaneous bookings
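To make Problem 1 concrete, here is a rough sketch of the availability check the naive model forces, using an in-memory SQLite database. The status values and sample rows are invented for illustration, and end_date is assumed to be the check-out day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE reservation (
        id INTEGER PRIMARY KEY,
        hotel_id TEXT, room_type_id TEXT,
        start_date TEXT, end_date TEXT, status TEXT
    )""")
conn.executemany(
    "INSERT INTO reservation (hotel_id, room_type_id, start_date, end_date, status) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("123", "456", "2024-03-14", "2024-03-16", "confirmed"),
        ("123", "456", "2024-03-15", "2024-03-20", "confirmed"),
        ("123", "456", "2024-03-15", "2024-03-17", "cancelled"),
        ("123", "456", "2024-03-16", "2024-03-18", "confirmed"),
    ],
)


def reserved_on(conn, hotel_id: str, room_type_id: str, day: str) -> int:
    """Count active reservations covering one night; ISO dates compare as text."""
    row = conn.execute(
        """SELECT COUNT(*) FROM reservation
           WHERE hotel_id = ? AND room_type_id = ? AND status = 'confirmed'
             AND start_date <= ? AND end_date > ?""",
        (hotel_id, room_type_id, day, day),
    ).fetchone()
    return row[0]
```

A five-night stay needs this scan for each of its five nights, and nothing prevents another booking from slipping in between the count and the insert.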
Improved approach: Inventory-based model
RoomTypeInventory
├── hotel_id
├── room_type_id
├── date
├── total_rooms (total rooms of this type)
├── total_reserved (reserved)
└── overbooking_limit (overbooking limit)
Constraint: total_reserved <= total_rooms * (1 + overbooking_limit)
Advantage 1: Instant availability check - one SELECT
Advantage 2: Overbooking is built into the model via overbooking_limit
Advantage 3: Atomic update via UPDATE with condition
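A stay spanning several nights touches one inventory row per night, so all of the conditional updates have to succeed or fail together. The sketch below shows one way this could look on SQLite. It is not the code from the interview. Because SQLite has no exact decimal type, the sketch stores the limit as an integer percentage in a column named overbooking_pct (my naming) rather than the fractional overbooking_limit, which avoids floating-point rounding at the boundary.

```python
import sqlite3
from datetime import date, timedelta

# isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("""
    CREATE TABLE room_inventory (
        hotel_id TEXT, room_type_id TEXT, date TEXT,
        total_rooms INTEGER NOT NULL,
        total_reserved INTEGER NOT NULL DEFAULT 0,
        overbooking_pct INTEGER NOT NULL,
        PRIMARY KEY (hotel_id, room_type_id, date)
    )""")
# Sample data: the third night has a single room, so it sells out first.
conn.executemany(
    "INSERT INTO room_inventory VALUES ('123', '456', ?, ?, 0, 10)",
    [("2024-03-15", 10), ("2024-03-16", 10), ("2024-03-17", 1)],
)


def stay_nights(start: date, end: date) -> list[str]:
    """Nights of a stay; end is treated as the check-out day."""
    return [(start + timedelta(days=i)).isoformat() for i in range((end - start).days)]


def reserve(conn, hotel_id: str, room_type_id: str, start: date, end: date) -> bool:
    """Reserve one room for every night of the stay, or nothing at all."""
    conn.execute("BEGIN IMMEDIATE")
    for night in stay_nights(start, end):
        cur = conn.execute(
            """UPDATE room_inventory
               SET total_reserved = total_reserved + 1
               WHERE hotel_id = ? AND room_type_id = ? AND date = ?
                 AND (total_reserved + 1) * 100 <= total_rooms * (100 + overbooking_pct)""",
            (hotel_id, room_type_id, night),
        )
        if cur.rowcount == 0:  # sold out (or no inventory row) for this night
            conn.execute("ROLLBACK")
            return False
    conn.execute("COMMIT")
    return True


def reserved_on(conn, night: str) -> int:
    return conn.execute(
        "SELECT total_reserved FROM room_inventory WHERE date = ?", (night,)
    ).fetchone()[0]
```

With the sample data, a first stay from 2024-03-15 to 2024-03-18 succeeds, while a second identical stay is rejected on the third night and its increments for the first two nights are rolled back.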
Handling concurrency
A critical topic is how to avoid a race condition when two users try to book the last room at the same time.
Pessimistic Locking (SELECT FOR UPDATE)
BEGIN;
SELECT * FROM room_inventory
WHERE hotel_id = ? AND room_type_id = ? AND date = ?
FOR UPDATE;
-- Check and update
UPDATE room_inventory SET total_reserved = total_reserved + 1 ...
COMMIT;
Pros
Guaranteed consistency, simple logic
Cons
Row locking, reduced throughput, deadlock risk
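The essence of the pessimistic approach is that the check and the update happen while holding an exclusive lock, so competing bookings are serialized. The sketch below is only an in-process analogy: a threading.Lock stands in for the row lock taken by SELECT ... FOR UPDATE, and the class name and the integer-percentage limit are assumptions made for the example.

```python
import threading


class LockedInventory:
    """One inventory row guarded by a lock (stand-in for a database row lock)."""

    def __init__(self, total_rooms: int, overbooking_pct: int) -> None:
        self._lock = threading.Lock()
        self.capacity = total_rooms * (100 + overbooking_pct) // 100
        self.total_reserved = 0

    def reserve(self) -> bool:
        with self._lock:  # everyone else waits here, as with FOR UPDATE
            if self.total_reserved + 1 > self.capacity:
                return False
            self.total_reserved += 1
            return True


inventory = LockedInventory(total_rooms=10, overbooking_pct=10)
results: list[bool] = []


def worker() -> None:
    results.append(inventory.reserve())


# 50 concurrent booking attempts against a capacity of 11 (10 rooms + 10%).
threads = [threading.Thread(target=worker) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The limit is never exceeded, but every attempt queues behind the lock, which is the throughput cost listed above.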
Optimistic Locking (Repeatable Read)
-- Isolation Level: REPEATABLE READ
BEGIN;
UPDATE room_inventory
SET total_reserved = total_reserved + 1
WHERE hotel_id = ? AND room_type_id = ? AND date = ?
  AND total_reserved + 1 <= total_rooms * (1 + overbooking_limit);
-- If affected_rows = 0, no rooms are available
COMMIT;
Pros
High throughput, no explicit locking, atomicity
Cons
Retries required on conflicts
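Where the retry comes from is easier to see in the version-column variant of optimistic locking. This is a common alternative to the conditional UPDATE above, and it was not shown in the interview. The sketch reads a row, then writes only if the version is unchanged. The version column, the before_write hook (a test seam for simulating a competing writer), and the integer-percentage limit are all assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE room_inventory (
        id INTEGER PRIMARY KEY,
        total_rooms INTEGER, total_reserved INTEGER,
        overbooking_pct INTEGER, version INTEGER
    )""")
conn.execute("INSERT INTO room_inventory VALUES (1, 10, 0, 10, 0)")
conn.commit()


def reserve_with_retry(conn, row_id: int, max_attempts: int = 3, before_write=None) -> bool:
    """Read, check, then write only if nobody changed the row in between."""
    for attempt in range(max_attempts):
        total_rooms, reserved, pct, version = conn.execute(
            "SELECT total_rooms, total_reserved, overbooking_pct, version "
            "FROM room_inventory WHERE id = ?",
            (row_id,),
        ).fetchone()
        if (reserved + 1) * 100 > total_rooms * (100 + pct):
            return False  # genuinely sold out, retrying will not help
        if before_write is not None:
            before_write(attempt)  # hook to simulate a concurrent writer
        cur = conn.execute(
            "UPDATE room_inventory SET total_reserved = ?, version = ? "
            "WHERE id = ? AND version = ?",
            (reserved + 1, version + 1, row_id, version),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True  # our snapshot was still current
        # Someone else won the race: loop and re-read fresh values.
    raise RuntimeError("too many conflicts, giving up")
```

Under low contention almost every attempt succeeds on the first try. The retry loop only matters when two writers hit the same row at the same moment.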
Database Constraints
ALTER TABLE room_inventory
ADD CONSTRAINT check_overbooking
CHECK (total_reserved <= total_rooms * (1 + overbooking_limit));
Additional protection at the database level - even with bugs in the code, the system will not violate the limits.
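A quick way to see the constraint acting as a safety net is to run a deliberately buggy, unconditional increment against it. The sketch uses in-memory SQLite and declares the constraint at table creation time. As an assumption of this example, the limit is stored as an integer percentage (overbooking_pct) so that the comparison is exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE room_inventory (
        hotel_id TEXT, room_type_id TEXT, date TEXT,
        total_rooms INTEGER NOT NULL,
        total_reserved INTEGER NOT NULL DEFAULT 0,
        overbooking_pct INTEGER NOT NULL,
        CONSTRAINT check_overbooking
            CHECK (total_reserved * 100 <= total_rooms * (100 + overbooking_pct))
    )""")
# 10 rooms with a 10% limit: 11 reservations is the maximum allowed.
conn.execute("INSERT INTO room_inventory VALUES ('123', '456', '2024-03-15', 10, 11, 10)")


def buggy_increment(conn) -> None:
    """Application bug: increments without checking availability first."""
    conn.execute("UPDATE room_inventory SET total_reserved = total_reserved + 1")


try:
    buggy_increment(conn)
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the database refused the 12th reservation

current = conn.execute("SELECT total_reserved FROM room_inventory").fetchone()[0]
```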
Choice for our scale
At a load of ~2.3 TPS with peaks up to 100 TPS, Optimistic Locking is the optimal choice. It provides high performance without the complexity of distributed locks. Pessimistic Locking makes sense when contention for specific resources is very high.
Scaling Strategies
Time partitioning
Splitting the inventory table by month or quarter.
- Old data can be archived
- Queries work with less data
- Simplifies maintenance operations
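Monthly partitioning could be sketched as below. The interview does not name a database. The generated DDL assumes PostgreSQL declarative range partitioning with a parent table called room_inventory that is partitioned by its date column, and the partition naming scheme is hypothetical.

```python
from datetime import date


def month_bounds(day: date) -> tuple[date, date]:
    """First day of the month containing `day`, and first day of the next month."""
    start = day.replace(day=1)
    end = date(start.year + (start.month == 12), start.month % 12 + 1, 1)
    return start, end


def partition_name(day: date) -> str:
    return f"room_inventory_{day:%Y_%m}"


def partition_ddl(day: date) -> str:
    """DDL for the monthly partition that would hold inventory rows for `day`."""
    start, end = month_bounds(day)
    return (
        f"CREATE TABLE {partition_name(day)} PARTITION OF room_inventory "
        f"FOR VALUES FROM ('{start}') TO ('{end}');"
    )


print(partition_ddl(date(2024, 3, 15)))
```

Archiving old data then amounts to detaching or dropping whole partitions instead of deleting rows one by one.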
Sharding by hotel_id
Distribution of data across different shards based on hotel ID.
- Horizontal write scaling
- Load isolation between hotels
- Consistent hashing for distribution
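A minimal consistent-hashing ring for routing a hotel_id to a shard might look like this. Shard names, the number of virtual nodes, and the choice of SHA-256 are assumptions made for the sketch, not details from the interview.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit position on the ring (the built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, shards: list[str], vnodes: int = 100) -> None:
        # Each shard is placed on the ring many times to even out the distribution.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard) for shard in shards for i in range(vnodes)
        )
        self._positions = [position for position, _ in self._ring]

    def shard_for(self, hotel_id: str) -> str:
        """Walk clockwise from the key's position to the next virtual node."""
        index = bisect.bisect_right(self._positions, _hash(hotel_id)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing(["shard-0", "shard-1", "shard-2"])
print(ring.shard_for("123"))
```

The benefit over a plain hash(hotel_id) % N scheme is that adding a shard relocates only the hotels that land on the new shard, while all other hotels stay where they are.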
Key takeaways from the interview
Start with clarifying questions
Overbooking is a critical requirement that fundamentally changes the architecture. Without asking about it, you can head in the wrong direction.
Improve your design iteratively
The interview showed the evolution from the naive Reservation model to the Inventory-based approach. The interviewer evaluates the train of thought, not just the final answer.
Justify the choice of concurrency strategy
Pessimistic vs. Optimistic locking is a classic trade-off. It is important to explain why a particular approach was chosen for a given load.
Think about scaling ahead
Even if the current workload is light, show that you understand how the system will grow and which strategies to apply.
