Design a URL shortener (the interview classic, done properly)

#system-design #architecture

The URL shortener is the "hello world" of system design — small enough to finish in 45 minutes, deep enough to show real trade-offs.

1. Requirements first, always

  • Shorten a long URL → sho.rt/Ab3xK9
  • Redirect fast (this is 99% of traffic)
  • Scale: say 100M new URLs/month, 10B redirects/month
  • Links live for years; custom aliases optional

2. Napkin math

  • Writes: 100M / month ≈ 40 writes/sec
  • Reads: 10B / month ≈ 4,000 reads/sec — a 100:1 read/write ratio
  • Storage: 100M/month × ~500 bytes × 5 years ≈ 3 TB. Small!

The math tells you the shape: this is a read-heavy, cache-everything problem, not a big-data problem.
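The arithmetic above is easy to check in a few lines. This is a minimal sketch that assumes a 30-day month; the variable names are mine, and the inputs are the figures from the requirements.

```python
# Back-of-envelope check of the numbers above (assumes a 30-day month).
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000

new_urls_per_month = 100_000_000
redirects_per_month = 10_000_000_000
bytes_per_record = 500   # rough size of one stored mapping
retention_years = 5

writes_per_sec = new_urls_per_month / SECONDS_PER_MONTH
reads_per_sec = redirects_per_month / SECONDS_PER_MONTH
read_write_ratio = redirects_per_month / new_urls_per_month

total_records = new_urls_per_month * 12 * retention_years
storage_tb = total_records * bytes_per_record / 1e12

print(f"writes/sec ~ {writes_per_sec:.0f}")
print(f"reads/sec  ~ {reads_per_sec:.0f}")
print(f"read:write = {read_write_ratio:.0f}:1")
print(f"storage    ~ {storage_tb:.1f} TB")
```

The exact values come out slightly under the rounded ones (about 39 writes/sec and 3,858 reads/sec), which is fine for napkin math.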

3. The short code

Base62-encode a unique numeric ID (alphabet: [a-zA-Z0-9]). 7 characters give 62⁷ ≈ 3.5 trillion codes; at 100M new URLs a month, that lasts roughly 2,900 years.
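A minimal Base62 sketch looks like this. The alphabet order (digits, then lowercase, then uppercase) is an assumption; any fixed ordering of the 62 characters works as long as encode and decode agree.

```python
import string

# Assumed alphabet order: 0-9, a-z, A-Z (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Every ID below 62⁷ encodes to at most 7 characters. If you want fixed-width codes, left-pad with the zero symbol, for example encode_base62(n).rjust(7, "0").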

Where does the ID come from? A dedicated ID range allocator: each app server leases a block of IDs (say 100k) from a coordinator, then hands them out locally, touching the coordinator only once per block. No hash collisions to handle, no hot counter.
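Here is one way the leasing scheme could look. It is a sketch under assumptions: Coordinator and LocalIdAllocator are hypothetical names, and the coordinator is an in-memory stand-in for what would need to be a durable, shared store in a real deployment.

```python
import threading


class Coordinator:
    """In-memory stand-in for the shared coordinator (hypothetical).

    A real one must persist the next block start durably.
    """

    def __init__(self, block_size: int = 100_000) -> None:
        self._block_size = block_size
        self._next_start = 0
        self._lock = threading.Lock()

    def lease_block(self) -> range:
        # The only step that is shared across servers: once per block.
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
        return range(start, start + self._block_size)


class LocalIdAllocator:
    """Lives on each app server and hands out IDs from its leased block."""

    def __init__(self, coordinator: Coordinator) -> None:
        self._coordinator = coordinator
        self._ids = iter(())  # empty until the first lease
        self._lock = threading.Lock()  # guards threads on this server only

    def next_id(self) -> int:
        with self._lock:
            try:
                return next(self._ids)
            except StopIteration:
                # Block exhausted (or never leased): fetch a new one.
                self._ids = iter(self._coordinator.lease_block())
                return next(self._ids)
```

One consequence of this design: a server that crashes mid-block never uses the rest of its range, so the ID space has gaps. With 3.5 trillion codes available, that waste is usually acceptable.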

4. Architecture

  • Redirects: HTTP 301 (permanent, cacheable) if analytics don't matter; 302 if they do. Browsers cache a 301, so repeat clicks may never reach your servers; a 302 keeps every hit visible to you.
  • DB: anything key-value shaped works at 3 TB — DynamoDB fits perfectly (key = code, value = URL, TTL for expiry).
  • Cache: traffic is heavily skewed; assume the top 20% of links serve ~90% of redirects. A small Redis in front of the DB then absorbs most reads.
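The read path described above can be sketched as a cache-aside lookup plus a status code choice. This is an illustration only: RedirectService is a hypothetical name, and plain dicts stand in for Redis and the key-value DB so the example runs without any external service.

```python
from typing import Dict, Optional, Tuple


class RedirectService:
    """Cache-aside lookup; dicts stand in for Redis and the DB (assumption)."""

    def __init__(
        self,
        cache: Dict[str, str],
        db: Dict[str, str],
        track_clicks: bool = True,
    ) -> None:
        self.cache = cache
        self.db = db
        self.track_clicks = track_clicks

    def resolve(self, code: str) -> Tuple[int, Optional[str]]:
        """Return (HTTP status, target URL or None)."""
        url = self.cache.get(code)
        if url is None:
            # Cache miss: fall back to the DB and populate the cache.
            url = self.db.get(code)
            if url is None:
                return 404, None
            self.cache[code] = url
        # 302 keeps every click visible to us; 301 lets browsers cache it.
        status = 302 if self.track_clicks else 301
        return status, url
```

For example, with db = {"Ab3xK9": "https://example.com/long"} and an empty cache, the first resolve("Ab3xK9") reads the DB and fills the cache; later calls are served from the cache alone.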

5. What breaks at scale (the discussion that gets you hired)

  • Hot links (a viral tweet) → per-node in-memory cache in front of Redis
  • Analytics at 4k rps → don't write per-click rows synchronously; push events to a queue (Kinesis/SQS) and aggregate
  • Abuse → rate-limit link creation and scan new URLs against a blocklist on write
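To make the analytics point concrete, here is a sketch of the pattern: the redirect path only enqueues an event, and a separate worker drains the queue and aggregates counts in batches. The standard-library queue is a stand-in for Kinesis/SQS, and record_click and drain_and_aggregate are hypothetical names.

```python
import queue
from collections import Counter

# In-process stand-in for a managed queue such as Kinesis or SQS.
click_events: "queue.Queue[str]" = queue.Queue()


def record_click(code: str) -> None:
    """Called on the redirect path: enqueue and return immediately."""
    click_events.put_nowait(code)


def drain_and_aggregate(max_events: int = 10_000) -> Counter:
    """Called by a background worker: collapse many clicks into few writes."""
    counts: Counter = Counter()
    for _ in range(max_events):
        try:
            counts[click_events.get_nowait()] += 1
        except queue.Empty:
            break
    # A real worker would now write one row per code, not one per click.
    return counts
```

The redirect latency stays independent of the analytics store, and the write volume drops from one row per click to one row per code per batch.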

The lesson that generalizes: do the napkin math before drawing boxes — it tells you which problems are real and which are imaginary.