The Top 10 Most Common System Design Questions, Explained
A breakdown of the caching, sharding, and scaling strategies behind the industry's most popular interview prompts.
A comprehensive guide to the distributed system concepts hiding behind the most common interview questions.
System design interviews rarely exist to see if you can perfectly clone a billion-dollar company in 45 minutes. Instead, interviewers use specific applications as Trojan horses to test your understanding of fundamental distributed system concepts: caching, partitioning, geospatial indexing, and concurrency.
Here is a detailed architectural breakdown of the 10 most common system design questions and the core engineering problems they are actually testing.
1. Design a URL Shortener (e.g., TinyURL)
The Scenario: Users input a long URL and receive a highly compressed, unique alias. Clicking the alias redirects them to the original site.
The Core Challenge: Collision prevention and read-heavy scaling.
If two users shorten different links, they absolutely cannot receive the same alias. Furthermore, URL shorteners are heavily read-skewed (often a 100:1 read-to-write ratio).
The Architectural Solution:
Alias Generation: Use Base62 encoding (A-Z, a-z, 0-9). A 7-character Base62 string gives you 62^7, or roughly 3.5 trillion, unique combinations.
Collision Avoidance: Instead of randomly generating hashes and checking the database for duplicates, use a dedicated Ticket Server (a centralized auto-incrementing counter) or a coordination service like ZooKeeper to assign a unique integer to each request, then convert that integer to Base62.
Scaling Reads: Put a massive Redis or Memcached layer in front of the database. When a user clicks a link, check the cache first. Only hit the database (often a NoSQL store like Cassandra for easy scaling) on a cache miss.
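Why use ZooKeeper/Ticket Servers? ZooKeeper ensures that every application server gets a unique, non-overlapping range of numerical IDs (e.g., Server A gets 1–10,000, Server B gets 10,001–20,000). Each server can then hand out IDs from its own range locally, with no coordination on individual requests and no chance of two servers issuing the same ID.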
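As a rough illustration of the ID-to-alias step, here is a minimal Python sketch. The alphabet ordering, the RangeAllocator class, and the block size of 10,000 are assumptions made for this example; RangeAllocator is only an in-process stand-in for a Ticket Server or ZooKeeper, not a real client for either.

```python
import string

# Assumed symbol order: digits, then uppercase, then lowercase.
# Any fixed ordering of the 62 symbols works as long as it never changes.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID into a Base62 alias."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(alias: str) -> int:
    """Convert a Base62 alias back into its integer ID."""
    n = 0
    for ch in alias:
        n = n * BASE + ALPHABET.index(ch)
    return n


class RangeAllocator:
    """Hypothetical in-process stand-in for a ticket server.

    Each call to lease() returns a block of IDs that no other caller
    will ever receive, so servers can mint aliases without collisions.
    """

    def __init__(self, block_size: int = 10_000) -> None:
        self._next_start = 1
        self._block_size = block_size

    def lease(self) -> range:
        start = self._next_start
        self._next_start += self._block_size
        return range(start, start + self._block_size)
```

A server would lease one block, call encode_base62 on each ID as requests arrive, and lease another block once the first is exhausted.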
2. Design a Social Media News Feed (e.g., Twitter, Facebook)
The Scenario: A user opens the app and sees a chronologically ordered feed of posts from everyone they follow.
The Core Challenge: The Fan-Out Problem.
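When an average user tweets, the tweet needs to be delivered to roughly 200 followers. When a celebrity with 100 million followers tweets, eagerly writing that tweet into every follower's feed means 100 million writes for a single post, while rebuilding every feed from scratch at read time overloads the database instead.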
The Architectural Solution:
Push Model (Fan-out on write): For normal users, when they post, immediately push that post into the pre-computed, in-memory feed caches (Redis) of all their followers.
Pull Model (Fan-out on read): For celebrities (users with millions of followers), do not push their posts. Instead, when a follower loads their app, the system pulls the celebrity’s posts from the database and merges them with the follower’s cached feed on the fly. This hybrid approach saves massive amounts of compute.
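The hybrid push/pull idea can be sketched in a few lines of Python. Everything here is an in-memory assumption for illustration: the FeedService class, its dictionaries (standing in for Redis feed caches and the posts database), and the celebrity threshold are all hypothetical.

```python
from collections import defaultdict


class FeedService:
    """Toy hybrid fan-out: push for normal users, pull for celebrities."""

    def __init__(self, celebrity_threshold: int = 1_000_000) -> None:
        self.threshold = celebrity_threshold      # assumed cut-off
        self.followers = defaultdict(set)         # author -> followers
        self.following = defaultdict(set)         # user -> authors
        self.feed_cache = defaultdict(list)       # user -> pushed posts
        self.posts_by_author = defaultdict(list)  # stand-in for the database

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.threshold

    def post(self, author: str, timestamp: int, text: str) -> None:
        item = (timestamp, author, text)
        self.posts_by_author[author].append(item)
        if not self.is_celebrity(author):
            # Fan-out on write: copy the post into each follower's cache.
            for follower in self.followers[author]:
                self.feed_cache[follower].append(item)

    def read_feed(self, user: str, limit: int = 10) -> list:
        # Start from the pre-computed cache, then pull celebrity posts.
        # A set removes duplicates if an author crossed the threshold
        # after some of their posts had already been pushed.
        items = set(self.feed_cache[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                items.update(self.posts_by_author[author])
        return sorted(items, reverse=True)[:limit]
```

The write cost for a celebrity post is a single append, and the merge cost is paid only by followers who actually open the app.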
3. Design a Real-Time Chat App (e.g., WhatsApp, Discord)
The Scenario: Users can send instantaneous 1-on-1 messages, see when friends are online, and see “typing...” indicators.
The Core Challenge: Stateful connections and message ordering.
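Standard HTTP is request-response and stateless: the server can only reply to a request, it cannot push a message on its own. Having the phone ask the server “Do I have new messages?” every second (polling) wastes bandwidth and battery, floods the servers with mostly empty requests, and still adds up to a second of delay.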
The Architectural Solution:
WebSockets: Establish persistent, bi-directional WebSocket connections between the client and a fleet of Chat Servers.
Message Routing: When User A messages User B, the server pushes the payload into a message broker (like Apache Kafka or RabbitMQ). A routing service checks a fast Key-Value store to find out exactly which physical Chat Server User B is currently connected to, and forwards the message there.
Presence Service: To track “Online” status, clients send a heartbeat ping every 5 seconds. If the server misses three heartbeats, the user is marked offline.
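The heartbeat rule is simple enough to sketch directly. This is a minimal single-process version; the PresenceTracker class and the injectable clock are assumptions for the example, and a production service would likely keep the timestamps in a shared store with expiry rather than a local dict.

```python
import time

HEARTBEAT_INTERVAL = 5.0    # seconds between client pings (from the text)
MISSED_BEFORE_OFFLINE = 3   # missed pings before a user counts as offline


class PresenceTracker:
    """Tracks the last heartbeat per user and derives online status."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock      # injectable so tests can control time
        self._last_seen = {}     # user_id -> time of last heartbeat

    def heartbeat(self, user_id: str) -> None:
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id: str) -> bool:
        last = self._last_seen.get(user_id)
        if last is None:
            return False
        # Offline once three full intervals pass without a ping.
        silence = self._clock() - last
        return silence < HEARTBEAT_INTERVAL * MISSED_BEFORE_OFFLINE
```

Status is computed lazily on read here, so no background job is needed to flip users offline.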
4. Design a Video Streaming Service (e.g., YouTube, Netflix)
The Scenario: Users upload massive video files, and viewers globally stream them without buffering.
The Core Challenge: Heavy payload processing and global latency.
You cannot stream a raw 4K video file from a server in New York to a mobile phone in Tokyo on a 3G network.
The Architectural Solution:
Transcoding Pipeline: When a video is uploaded to Blob Storage (AWS S3), it triggers a distributed processing queue. Worker nodes chop the video into tiny chunks and encode them into multiple resolutions (1080p, 720p, 480p) and formats asynchronously using a Directed Acyclic Graph (DAG) workflow.
Content Delivery Networks (CDNs): The transcoded video chunks are aggressively cached on edge servers globally. When the user in Tokyo presses play, they are streaming the video from a server physically located in Tokyo, entirely bypassing your main data center.
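To make the DAG workflow concrete, here is a small sketch using Python's standard graphlib module. The task names, including the build_manifest and publish_to_cdn steps, are hypothetical; the point is only that tasks in the same stage have no dependencies on each other and can run on separate workers in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "split_chunks": set(),
    "encode_1080p": {"split_chunks"},
    "encode_720p": {"split_chunks"},
    "encode_480p": {"split_chunks"},
    "build_manifest": {"encode_1080p", "encode_720p", "encode_480p"},
    "publish_to_cdn": {"build_manifest"},
}


def execution_stages(dag: dict) -> list:
    """Group tasks into stages; tasks in one stage can run concurrently."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tasks whose deps are done
        stages.append(ready)
        sorter.done(*ready)                 # unlock their dependents
    return stages
```

For this pipeline, the three encode tasks land in the same stage, which is where the worker fleet gets its parallelism.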
5. Design a Ride-Sharing App (e.g., Uber, Lyft)
The Scenario: Riders request a car, the system finds the nearest driver, and the rider tracks the car moving in real-time.
The Core Challenge: High-frequency spatial indexing.
You have 100,000 drivers updating their GPS coordinates every 3 seconds. The database must instantly query “Who are the 5 closest drivers to this specific coordinate?”
The Architectural Solution:
Geospatial Grids: Divide the map into mathematical grids using S2 Geometry or Geohash. Instead of executing complex SQL distance calculations, you simply query the database for “all drivers currently in Grid ID 9102.”
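In-Memory Location Tracking: With 100,000 drivers reporting every 3 seconds, the system absorbs roughly 33,000 location writes per second, and each value is stale within moments. Persisting every update to disk is too slow and buys nothing, so driver locations and their corresponding grid IDs are stored entirely in RAM using Redis.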
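The grid lookup can be sketched with a deliberately simplified scheme. The code below uses fixed-size latitude/longitude cells as a stand-in for Geohash or S2 (it is not either of them), and the DriverIndex class and the 0.01-degree cell size are assumptions for illustration. One detail worth showing: a rider near a cell edge may be closest to a driver in the adjacent cell, so the query scans the 3x3 block of cells around the rider.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size; about 1.1 km of latitude


def cell_id(lat: float, lng: float) -> tuple:
    """Map a coordinate to the grid cell that contains it."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


class DriverIndex:
    """In-memory map from grid cell to the drivers currently inside it."""

    def __init__(self) -> None:
        self._cells = defaultdict(set)  # cell -> driver ids
        self._driver_cell = {}          # driver id -> current cell

    def update(self, driver_id: str, lat: float, lng: float) -> None:
        new_cell = cell_id(lat, lng)
        old_cell = self._driver_cell.get(driver_id)
        if old_cell == new_cell:
            return  # most pings stay in the same cell: nothing to move
        if old_cell is not None:
            self._cells[old_cell].discard(driver_id)
        self._cells[new_cell].add(driver_id)
        self._driver_cell[driver_id] = new_cell

    def candidates_near(self, lat: float, lng: float) -> set:
        """Drivers in the rider's cell and its eight neighbours."""
        cx, cy = cell_id(lat, lng)
        found = set()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                found |= self._cells.get((cx + dx, cy + dy), set())
        return found
```

The candidate set is small, so exact distances can then be computed for just those drivers to pick the closest five.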
6. Design a Location-Based Search (e.g., Yelp, Tinder)
The Scenario: A user searches for “pizza” or “potential matches” within a specific radius of their current location.
The Core Challenge: Read-heavy spatial queries.
Unlike Uber drivers, restaurants do not move. The data is highly static, but the read volume is astronomical.
The Architectural Solution:
QuadTrees: Build an in-memory tree data structure. A QuadTree recursively subdivides a 2D map into four quadrants. If a quadrant has too many restaurants (e.g., downtown Manhattan), it splits again. To find a pizza place, the system quickly traverses the tree down to the user’s specific geographical quadrant.
Database: Since the data is static, spatial databases like PostGIS (an extension for PostgreSQL) are heavily utilized to handle complex polygon and radius queries.
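Here is a minimal point QuadTree in Python to show the recursive subdivision. The capacity of 4, the depth cap, and the flat x/y coordinates are assumptions for the sketch; a real system would also need to handle latitude/longitude distortion and radius (not just rectangle) queries.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 16  # assumed cap so identical points cannot split forever


@dataclass
class QuadTree:
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int = 4
    depth: int = 0
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, x: float, y: float, name: str) -> bool:
        if not (self.x0 <= x < self.x1 and self.y0 <= y < self.y1):
            return False  # point lies outside this quadrant
        if not self.children:
            if len(self.points) < self.capacity or self.depth >= MAX_DEPTH:
                self.points.append((x, y, name))
                return True
            self._split()  # too dense: subdivide into four quadrants
        return any(child.insert(x, y, name) for child in self.children)

    def _split(self) -> None:
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        d = self.depth + 1
        self.children = [
            QuadTree(self.x0, self.y0, mx, my, self.capacity, d),
            QuadTree(mx, self.y0, self.x1, my, self.capacity, d),
            QuadTree(self.x0, my, mx, self.y1, self.capacity, d),
            QuadTree(mx, my, self.x1, self.y1, self.capacity, d),
        ]
        for px, py, pname in self.points:  # push points down a level
            any(child.insert(px, py, pname) for child in self.children)
        self.points = []

    def query(self, qx0: float, qy0: float, qx1: float, qy1: float) -> list:
        """Return all points inside the half-open query rectangle."""
        if qx1 <= self.x0 or qx0 >= self.x1 or qy1 <= self.y0 or qy0 >= self.y1:
            return []  # no overlap: skip this whole subtree
        hits = [p for p in self.points
                if qx0 <= p[0] < qx1 and qy0 <= p[1] < qy1]
        for child in self.children:
            hits.extend(child.query(qx0, qy0, qx1, qy1))
        return hits
```

Dense areas end up with deep, fine-grained subtrees while empty regions stay as a single node, which is what keeps lookups fast in both downtown and rural areas.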
7. Design a Web Crawler (e.g., Google Search)
The Scenario: Build an automated system that discovers, downloads, and indexes the entire internet.
The Core Challenge: Graph traversal at massive scale and deduplication.
The web graph is full of cycles (Page A links to Page B, which links back to Page A), so a naive crawler can loop forever. You must avoid downloading the same page twice while adhering to per-domain rate limits so you don’t accidentally DDoS a website.
The Architectural Solution:
Distributed URL Frontier: A massive message queue (Kafka) prioritizes URLs to be crawled.
Bloom Filters: To check if you have already crawled a URL, you cannot query a massive database of 10 billion links. Instead, use a Bloom Filter—a highly space-efficient probabilistic data structure that can tell you in microseconds if a URL has definitely not been seen, or possibly been seen.
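A Bloom filter is short enough to sketch in full. The sizes below (about one million bits, seven hash positions) are assumptions chosen for the example, and the seven positions are derived from a single SHA-256 digest by double hashing, which is one common construction rather than the only one.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> list:
        # Derive k bit positions from one digest via double hashing:
        # position_i = (h1 + i * h2) mod m.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd, never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        """False means definitely unseen; True means possibly seen."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A crawler would call might_contain before enqueueing a URL: a False answer is safe to trust, while a True answer either skips the URL (accepting a small miss rate) or triggers a check against the authoritative store.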
8. Design a Distributed Rate Limiter
The Scenario: Build a middleware service that prevents any single API key from making more than 100 requests per minute.
The Core Challenge: Atomic synchronization across multiple servers.
If a user hits Server A, and then immediately hits Server B, both servers need to know the exact state of the user’s request count.
The Architectural Solution:
Algorithms: Implement the Token Bucket or Sliding Window algorithm.
Atomic Operations: Store the counters in a fast, centralized Redis cache. Because fetching a counter, incrementing it, and saving it as three separate commands can cause race conditions between servers, a common fix is a Redis Lua script, which executes the entire check-and-update operation atomically in a single step.
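The Token Bucket logic itself is small. The sketch below runs in a single Python process and uses a threading.Lock to make the check-and-update atomic; that lock is only a stand-in for the atomicity a Redis Lua script would provide across servers. The TokenBucketLimiter class and the injectable clock are assumptions for the example; the defaults follow the 100-requests-per-minute scenario.

```python
import threading
import time


class TokenBucketLimiter:
    """Per-key token bucket; one token is spent per allowed request."""

    def __init__(self, capacity: float = 100,
                 refill_per_sec: float = 100 / 60,
                 clock=time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._buckets = {}              # api_key -> (tokens, last_refill_time)
        self._lock = threading.Lock()   # stand-in for Redis-side atomicity

    def allow(self, api_key: str) -> bool:
        with self._lock:  # read, refill, check, and write as one step
            now = self._clock()
            tokens, last = self._buckets.get(api_key, (self.capacity, now))
            # Refill in proportion to elapsed time, capped at capacity.
            tokens = min(self.capacity,
                         tokens + (now - last) * self.refill_per_sec)
            allowed = tokens >= 1
            if allowed:
                tokens -= 1
            self._buckets[api_key] = (tokens, now)
            return allowed
```

Without the lock, two threads could both read the same token count and both spend the last token, which is the same race the text describes between Server A and Server B.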
9. Design a Distributed Key-Value Store (e.g., Redis, Cassandra)
The Scenario: Design the underlying architecture for a NoSQL database capable of storing petabytes of data across hundreds of machines.
The Core Challenge: Data partitioning, replication, and the CAP Theorem.
When a server inevitably catches fire, no data can be lost, and the system must remain online.
The Architectural Solution:
Consistent Hashing: Map your data keys and your server IPs onto a virtual ring. This allows you to add or remove servers from the cluster without needing to remap every single piece of data in the entire database.
Quorum Consensus: To handle replication, use a leaderless architecture. Each key is replicated to N nodes. A write is considered successful once W nodes acknowledge it, and a read queries R nodes. As long as W + R > N, every read quorum overlaps every write quorum in at least one node, so at least one of the R responses contains the most recent acknowledged write; version numbers or timestamps let the reader pick it out.
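Both ideas can be sketched together: a hash ring that picks the N replicas for a key, and the quorum overlap check. The HashRing class, the use of SHA-256, and the 100 virtual nodes per server are assumptions for this example rather than a description of any particular database.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each server appears at many ring positions to even out load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get_nodes(self, key: str, n: int) -> list:
        """Walk clockwise from the key and collect n distinct servers."""
        distinct = {node for _, node in self._ring}
        n = min(n, len(distinct))
        start = bisect.bisect(self._ring, (_hash(key),))
        replicas = []
        for i in range(len(self._ring)):
            node = self._ring[(start + i) % len(self._ring)][1]
            if node not in replicas:
                replicas.append(node)
                if len(replicas) == n:
                    break
        return replicas

    def get_node(self, key: str) -> str:
        return self.get_nodes(key, 1)[0]


def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when every read quorum must share a node with every write quorum."""
    return w + r > n
```

When a server is added, the only keys that change owner are the ones that now belong to the new server; everything else stays put, which is the property that makes resizing the cluster cheap.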
10. Design a Ticketmaster / Flash Sale System
The Scenario: 5 million fans attempt to buy 50,000 concert tickets at the exact same second.
The Core Challenge: Extreme concurrency and preventing overselling.
You absolutely cannot allow two people to purchase the exact same seat, nor can you let the database crash under the stampede of traffic.
The Architectural Solution:
Queuing: Do not let 5 million users hit the database directly. Place them in a virtual waiting room and process purchase requests sequentially through a message queue.
Redis Pre-Decrementing: Load the total ticket inventory into Redis before the sale starts. When a request comes in, decrement the Redis counter using an atomic operation. Only if the decremented value is >= 0 (that is, the counter was > 0 before the decrement) does the user proceed to the actual checkout flow.
Database Locking: When securing the specific seat, use pessimistic locking (SELECT * FROM seats WHERE id = 123 FOR UPDATE). This locks the database row, preventing any other transaction from modifying that seat until the locking transaction commits or rolls back, for example when the current user completes their payment or their 5-minute timer expires.
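The pre-decrement gate can be simulated in plain Python to show why atomicity matters. InventoryCounter below is an in-process stand-in for an atomic Redis counter (a lock plays the role of Redis executing each command atomically), and the helper names are hypothetical. A request is admitted only when the value after its decrement is still non-negative.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InventoryCounter:
    """Stand-in for an atomic remote counter; each method is one atomic step."""

    def __init__(self, total: int) -> None:
        self._value = total
        self._lock = threading.Lock()

    def decr(self) -> int:
        with self._lock:
            self._value -= 1
            return self._value

    def incr(self) -> int:
        with self._lock:
            self._value += 1
            return self._value


def try_enter_checkout(counter: InventoryCounter) -> bool:
    """Admit the buyer only if a ticket was still available."""
    if counter.decr() >= 0:
        return True
    counter.incr()  # undo, so the counter does not drift further negative
    return False


def simulate_sale(total_tickets: int, buyers: int) -> int:
    """Run many concurrent purchase attempts; return how many were admitted."""
    counter = InventoryCounter(total_tickets)
    with ThreadPoolExecutor(max_workers=32) as pool:
        results = list(pool.map(lambda _: try_enter_checkout(counter),
                                range(buyers)))
    return sum(results)
```

Because the decision is based on the value returned by the decrement itself, and never on a separate read followed by a write, the number of admitted buyers cannot exceed the inventory no matter how the threads interleave. Rejected requests never reach the database at all.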











