Frontend system design case: Design Instagram Feed

A feed only looks simple until you have to balance media loading, caching, ranking, infinite scroll, and the feeling of instant response at the same time. That makes it a strong case for showing how a familiar screen quickly turns into a frontend system with its own pipelines and constraints.

The practical value of the chapter is that it breaks the feed down into engineering choices: pagination, prefetching, caching, render budget, and mobile UX under unstable network conditions. It is useful whenever you need to understand where the interface actually pays for performance and convenience.

For case interviews and architecture reviews, the chapter works well because it shifts the discussion away from pretty UI and toward client-side data flow, backend contracts, refresh strategy, and the trade-offs between perceived performance and implementation complexity.

Practical value of this chapter

Design in practice

Turn guidance on client feed architecture and the UX/latency/cost balance into concrete decisions for composition, ownership, and client-runtime behavior.

Decision quality

Evaluate architecture through measurable outcomes: delivery speed, UI stability, observability, change cost, and operating risk.

Interview articulation

Structure answers as problem -> constraints -> architecture -> trade-offs -> migration path with explicit frontend reasoning.

Trade-off framing

Make the trade-offs in client feed architecture explicit: weigh UX, latency, and cost against team scale, technical debt, performance budget, and long-term maintainability.

Context

Frontend architecture overview

This case focuses on a product feed with high UX pressure and strict performance requirements.

Designing the Instagram feed is not a “render a list of posts” task. The feed has to balance three forces that pull in different directions: personalization, response time, and traffic cost. Relevant content has to arrive fast while scrolling stays smooth on an average mobile device, which is exactly where network and memory are tight.

Problem & Context

The user opens the app and expects the meaningful part of the feed almost instantly; a blank screen for a couple of seconds reads as “the app is lagging.” The UX breaks in four places: a long time to first meaningful feed item, jerky scroll, media that loads late, and likes and comments that behave unpredictably on an unstable network.

Functional requirements

An infinite feed with cursor-based pagination and fast loading of new cards.
Support for videos and images, likes, comments and saves without a full screen reload.
Personalized ordering of posts that blends fresh and most relevant items.
Stable UX on a poor network: skeleton loaders, retries and graceful degradation.

Non-functional requirements

Time to first meaningful feed item under 2 s on an average mobile device.
Scroll stays smooth, without noticeable jank or dropped frames (about a 16.7 ms frame budget on a 60 Hz display).
Conservative use of bandwidth and battery via lazy loading and image optimization.
Resilience to load spikes during mass publications and peak hours.

Scale assumptions

Daily active users (DAU)

50M+

A large share of traffic comes from mobile clients and short sessions.

Feed requests

200k–600k RPS

Peak load is 2–3× the baseline during regional prime-time hours.

Media payload

~200 KB preview / 1–3 MB full

Image and video optimization is critical for latency and CDN cost.

Client memory budget

< 120 MB per feed screen

List virtualization is mandatory for long scrolling sessions.
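The memory budget becomes more tangible with rough arithmetic. The sketch below assumes a decoded bitmap costs about 4 bytes per pixel (RGBA) regardless of its compressed size; the 1080x1350 post size and the choice to reserve half of the 120 MB budget for bitmaps are illustrative assumptions, not figures from this case.

```typescript
// Rough estimate of how many decoded images fit in a memory budget.
// Assumption: a decoded bitmap costs ~4 bytes per pixel (RGBA), no matter
// how small the compressed file was on the wire.
const BYTES_PER_PIXEL = 4;

function decodedImageBytes(widthPx: number, heightPx: number): number {
  return widthPx * heightPx * BYTES_PER_PIXEL;
}

function maxDecodedImages(budgetBytes: number, widthPx: number, heightPx: number): number {
  return Math.floor(budgetBytes / decodedImageBytes(widthPx, heightPx));
}

const MEGABYTE = 1024 * 1024;

// Hypothetical inputs: a 1080x1350 portrait post, and half of the
// 120 MB screen budget reserved for decoded bitmaps.
const bytesPerPost = decodedImageBytes(1080, 1350);
const postsThatFit = maxDecodedImages(60 * MEGABYTE, 1080, 1350);
```

Under these assumptions each post costs about 5.8 MB once decoded and only about ten fit at once, which is why cards far outside the viewport have to release their media.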

Caching strategies

Without a cache, every return to the feed hits the origin, so both latency and the infrastructure bill go up.

Architecture

Feed BFF

Aggregates ranking, content and social metadata, and returns a compact DTO shaped for the UI.

Ranking service

Produces a personalized order of posts and returns a candidate set with scores and reason codes.

Media service + CDN

Generates multi-size previews and manages Cache-Control, signed URLs and progressive delivery.

Interaction service

Likes and comments are handled asynchronously with an optimistic UI and reconciliation on the client.

Feed pipeline: a client request fans out from Feed BFF to ranking and media, passes through cache layers and returns to the device as a compact DTO for a virtualized render.
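The aggregation step in the Feed BFF can be sketched as a pure function. All type and field names here (RankedCandidate, PostContent, SocialMeta, FeedItemDto, buildFeedPage) are hypothetical; the case does not define the actual contracts. The sketch assumes ranking decides the order, missing content drops a card, and missing social metadata degrades to zero counts instead of failing the page.

```typescript
// Hypothetical upstream shapes; the real service contracts are not specified.
interface RankedCandidate { postId: string; score: number; reasonCode: string; }
interface PostContent { postId: string; authorName: string; previewUrl: string; aspectRatio: number; }
interface SocialMeta { postId: string; likeCount: number; likedByViewer: boolean; }

// Compact DTO shaped for the UI: only what a card needs to render.
interface FeedItemDto {
  id: string;
  author: string;
  previewUrl: string;
  aspectRatio: number; // lets the client reserve space before media loads
  likeCount: number;
  likedByViewer: boolean;
}

interface FeedPageDto { items: FeedItemDto[]; nextCursor: string | null; }

function buildFeedPage(
  ranked: RankedCandidate[],
  content: Map<string, PostContent>,
  social: Map<string, SocialMeta>,
  nextCursor: string | null,
): FeedPageDto {
  // Keep the ranking order, highest score first, without mutating the input.
  const ordered = ranked.slice().sort((a, b) => b.score - a.score);
  const items: FeedItemDto[] = [];
  for (const candidate of ordered) {
    const post = content.get(candidate.postId);
    if (!post) continue; // no content: drop the card rather than fail the page
    const meta = social.get(candidate.postId);
    items.push({
      id: post.postId,
      author: post.authorName,
      previewUrl: post.previewUrl,
      aspectRatio: post.aspectRatio,
      // Social metadata is treated as optional and degrades to defaults.
      likeCount: meta?.likeCount ?? 0,
      likedByViewer: meta?.likedByViewer ?? false,
    });
  }
  return { items, nextCursor };
}
```

Scores and reason codes stay on the server in this sketch; whether the client should see them is a separate product decision.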

Deep dives

Pagination and prefetch

Cursor-based pagination prevents gaps and duplicates when the feed shifts. The client prefetches the next batch ahead of a scroll threshold detected via the Intersection Observer API.
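A minimal, DOM-free sketch of the client side of this flow follows. The names FeedPage, FeedList, appendPage, and shouldPrefetch are hypothetical. In a browser, an IntersectionObserver callback on a sentinel card near the end of the list would typically be what calls shouldPrefetch; that wiring is omitted here. The duplicate filter is a defensive assumption in case the server repeats an item across pages.

```typescript
interface FeedPage { ids: string[]; nextCursor: string | null; }
interface FeedList { ids: string[]; nextCursor: string | null; }

// Append a fetched page, skipping ids the client already holds.
function appendPage(list: FeedList, page: FeedPage): FeedList {
  const seen = new Set<string>(list.ids);
  const fresh: string[] = [];
  for (const id of page.ids) {
    if (!seen.has(id)) {
      seen.add(id);
      fresh.push(id);
    }
  }
  // The cursor always comes from the latest page; null means the feed is exhausted.
  return { ids: list.ids.concat(fresh), nextCursor: page.nextCursor };
}

// Decide whether to request the next batch before the user reaches the end.
function shouldPrefetch(
  lastVisibleIndex: number,
  loadedCount: number,
  remainingThreshold: number, // hypothetical tuning knob: cards left before we fetch
  hasMore: boolean,
  requestInFlight: boolean,
): boolean {
  if (!hasMore || requestInFlight) return false;
  const remaining = loadedCount - 1 - lastVisibleIndex;
  return remaining <= remainingThreshold;
}
```

The in-flight guard matters as much as the threshold: without it, a fast scroll can fire several requests with the same cursor.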

Render performance

A long feed rests on three moves: a virtualized list, memoized cards and placeholders for media. Cards outside the viewport must not hold heavy DOM or media resources; otherwise memory use exceeds its budget and the scroll starts dropping frames.
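The core of list virtualization is deciding which cards to mount. The sketch below assumes fixed-height cards to keep the arithmetic visible; real feed cards vary in height and would need measured offsets instead. The names visibleRange and MountRange and the overscan parameter are illustrative.

```typescript
interface MountRange { start: number; end: number; } // end is exclusive

// Compute which item indices should be mounted for the current scroll position.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  itemHeight: number, // assumption: every card has the same height
  itemCount: number,
  overscan: number,   // extra cards kept mounted above and below the viewport
): MountRange {
  if (itemCount <= 0 || itemHeight <= 0) return { start: 0, end: 0 };
  const firstVisible = Math.floor(scrollTop / itemHeight);
  const endVisible = Math.ceil((scrollTop + viewportHeight) / itemHeight);
  return {
    start: Math.max(0, firstVisible - overscan),
    end: Math.min(itemCount, endVisible + overscan),
  };
}

// Example: 500 px cards, 800 px viewport, scrolled 1500 px into 100 items.
const mounted = visibleRange(1500, 800, 500, 100, 1);
```

With these inputs only indices 2 through 5 are mounted out of 100 items; everything else is represented by spacer height alone.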

Optimistic interactions

A like is applied locally immediately and then confirmed by the server. On conflict the client reconciles its state with the server's and shows the user what changed; as a last resort it rolls back and displays a notification.
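The three transitions (apply locally, reconcile with the server, roll back) can be sketched as pure state functions. LikeState, ServerLike, and the function names are hypothetical, and the sketch assumes the server's answer is the source of truth whenever it disagrees with the local guess.

```typescript
interface LikeState {
  liked: boolean;
  count: number;
  pending: boolean;     // a request is in flight
  error: string | null; // message to surface after a rollback
}

interface ServerLike { liked: boolean; count: number; }

// Step 1: apply the like locally before the server responds.
function toggleOptimistic(state: LikeState): LikeState {
  const liked = !state.liked;
  return { liked, count: state.count + (liked ? 1 : -1), pending: true, error: null };
}

// Step 2: reconcile. The server's values replace the local guess, which also
// picks up likes from other users that arrived in the meantime.
function applyServerResult(server: ServerLike): LikeState {
  return { liked: server.liked, count: server.count, pending: false, error: null };
}

// Step 3 (last resort): restore the snapshot taken before the toggle and
// carry a message so the UI can explain that the like did not go through.
function rollbackLike(snapshot: LikeState, message: string): LikeState {
  return { ...snapshot, pending: false, error: message };
}

const beforeTap: LikeState = { liked: false, count: 10, pending: false, error: null };
const afterTap = toggleOptimistic(beforeTap);
const afterFailure = rollbackLike(beforeTap, "Could not save your like. Try again.");
```

Keeping the snapshot outside the state makes the rollback trivial; the cost is that the caller has to hold it until the request settles.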

Cache hierarchy

The hierarchy has four layers: an in-memory screen cache, on-device storage in IndexedDB accessed via a service worker, the browser HTTP cache, and the CDN edge cache. The key goal is to minimize cold fetches when the user returns to the feed.

Cache layers: a warm request is served from the in-memory client cache; a miss falls through IndexedDB and the HTTP cache, and only then reaches the CDN edge and the origin.
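The fall-through behavior can be modeled with a small synchronous sketch. It is a deliberate simplification: real IndexedDB and network access are asynchronous, and the browser HTTP cache and CDN edge are controlled through response headers rather than application code. CacheLayer, memoryLayer, and lookup are hypothetical names, and every layer is backed by a plain Map here.

```typescript
interface CacheLayer<V> {
  label: string;
  get(key: string): V | undefined;
  set(key: string, value: V): void;
}

// Stand-in for any layer; a real IndexedDB layer would be async.
function memoryLayer<V>(label: string): CacheLayer<V> {
  const store = new Map<string, V>();
  return {
    label,
    get: (key) => store.get(key),
    set: (key, value) => { store.set(key, value); },
  };
}

interface LookupResult<V> { value: V; servedBy: string; }

// Try layers from fastest to slowest; on a hit, warm the faster layers above it.
function lookup<V>(
  key: string,
  layers: CacheLayer<V>[],
  loadFromOrigin: (key: string) => V,
): LookupResult<V> {
  let index = 0;
  for (const layer of layers) {
    const hit = layer.get(key);
    if (hit !== undefined) {
      for (const faster of layers.slice(0, index)) faster.set(key, hit);
      return { value: hit, servedBy: layer.label };
    }
    index += 1;
  }
  // Full miss: pay for the origin once, then fill every layer.
  const value = loadFromOrigin(key);
  for (const layer of layers) layer.set(key, value);
  return { value, servedBy: "origin" };
}
```

The sketch leaves out expiry and invalidation, which in practice decide how stale a returning user's feed is allowed to be.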

Trade-offs

Strong personalization improves retention but complicates explainability and the analysis of user complaints.

The more aggressive the prefetch, the smoother the scroll, but the more mobile traffic the client wastes on posts the user never reaches.

A BFF reduces client complexity but adds a critical server layer with a large blast radius: if it degrades, the whole feed degrades with it.

Optimistic updates take the wait out of the UX, but the cost is a careful rollback: on a network error the state has to revert so the user understands the like did not go through.
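One way to soften the prefetch trade-off is to scale prefetch depth with network quality. The sketch below mirrors two fields of the Network Information API (navigator.connection.effectiveType and saveData), which is not available in every browser, so both are optional. The function name prefetchDepth and the specific page counts are illustrative assumptions, not values from this case.

```typescript
// Field names mirror the Network Information API; both may be absent.
interface NetworkHint {
  effectiveType?: "slow-2g" | "2g" | "3g" | "4g";
  saveData?: boolean;
}

// Return how many pages to prefetch ahead of the user.
// The numbers are hypothetical tuning values.
function prefetchDepth(hint: NetworkHint): number {
  // An explicit data-saver preference always wins.
  if (hint.saveData) return 0;
  switch (hint.effectiveType) {
    case "slow-2g":
    case "2g":
      return 0; // fetch only on demand
    case "3g":
      return 1;
    case "4g":
      return 2;
    default:
      return 1; // unknown network: stay conservative
  }
}
```

A client would pass the result into its prefetch logic as the number of pages to keep ahead, falling back to the default branch where the API is missing.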

References

Related chapters

Why do we need frontend architecture? - Where this feed case fits into the broader frontend architecture decisions and how they affect delivery speed.
Caching Strategies - Approaches to multi-layer caching that speed up feed delivery and reduce backend load.
Load Balancing Algorithms - How to scale the feed API under high peak RPS and uneven traffic.
Observability & Monitoring Design - What to measure for scroll smoothness and feed completeness, and how an error budget tells degradation apart from the norm.
Event-Driven Architecture - Asynchronous processing of likes and comments and fan-out feed updates.