
System Design: URL Shortener

Every system design interview starts here. It looks deceptively simple — take a long URL, make it short, redirect when someone clicks it. But the moment you start thinking about scale, analytics, and latency, it becomes a masterclass in distributed systems tradeoffs.

Let's design a URL shortener handling 100M new URLs per month and 10B redirects per month. Those are roughly bit.ly's numbers.

requirements

Functional:

  • Given a long URL, generate a unique short URL
  • Redirect short URL to the original
  • Custom aliases (optional)
  • Analytics: click count, geographic data, referrer, device
  • Link expiration (TTL)

Non-functional:

  • Redirect latency under 50ms (p99)
  • High availability — a broken redirect loses revenue for every customer
  • Eventual consistency is fine for analytics
  • Short URLs should be as short as possible

back-of-envelope math

100M new URLs/month = ~40 writes/second. That's nothing. The read side is the beast: 10B redirects/month = ~3,800 reads/second, with spikes easily hitting 50k/s during viral content.

Read:write ratio is ~100:1. This is a read-heavy system. Cache everything.

Storage: 100M URLs × 500 bytes average = 50GB/month. After 5 years, 3TB. Fits on a single machine, but we'll replicate for availability.
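
If you want to sanity-check these numbers, the arithmetic fits in a few lines. A throwaway TypeScript sketch, using nothing beyond the figures above:

// Sanity-check the back-of-envelope numbers above.
const SECONDS_PER_MONTH = 30 * 24 * 3600;                         // ~2.59M

const newUrlsPerMonth = 100e6;
const redirectsPerMonth = 10e9;
const avgRecordBytes = 500;

const writesPerSec = newUrlsPerMonth / SECONDS_PER_MONTH;         // ~39
const readsPerSec = redirectsPerMonth / SECONDS_PER_MONTH;        // ~3,860
const storageGBPerMonth = newUrlsPerMonth * avgRecordBytes / 1e9; // 50 GB
const storageTBAfter5Years = storageGBPerMonth * 60 / 1000;       // 3 TB

console.log({ writesPerSec, readsPerSec, storageGBPerMonth, storageTBAfter5Years });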

high-level architecture

┌───────────┐      ┌────────────┐      ┌───────────────┐
│  Client   │─────▶│  API GW /  │─────▶│  URL Service  │
│ (browser) │◀─────│  CDN Edge  │◀─────│  (write path) │
└───────────┘      └─────┬──────┘      └───────┬───────┘
                         │                     │
                   ┌─────▼──────┐      ┌───────▼───────┐
                   │   Redis    │      │   Postgres    │
                   │  (cache)   │      │   (source     │
                   │            │      │   of truth)   │
                   └────────────┘      └───────┬───────┘
                                               │
                                       ┌───────▼───────┐
                                       │   Analytics   │
                                       │   Pipeline    │
                                       │   (Kafka →    │
                                       │  ClickHouse)  │
                                       └───────────────┘

Two distinct paths: write (create short URL) and read (redirect). They have completely different performance characteristics, so we design them separately.

the encoding problem

The core question: how do you turn https://example.com/some/very/long/path?with=params into https://sho.rt/abc123?

Option 1: Hash-based. Hash the URL (MD5 or SHA-256), Base62-encode the digest, and take the first 7 characters. Problem: collisions. Seven Base62 characters give you 62^7 ≈ 3.5 trillion combinations, but the birthday paradox means the first collision is likely after only a couple of million URLs, well within our first month. You need collision detection and retry logic.

Option 2: Counter-based with Base62. Use an auto-incrementing counter, encode it as Base62 (a-z, A-Z, 0-9). Counter value 1,000,000 becomes 4c92 — 4 characters. At 7 characters, Base62 gives you 3.5 trillion unique URLs. No collisions by design.

I'd go with counter-based + Base62. It's deterministic, no collisions, and the short URLs are as short as physically possible.

Counter: 1000000
Base62:  4c92

Counter: 56800235583
Base62:  ZZZZZZ (6 chars, 62^6 ≈ 56.8B URLs)

Encoding: repeatedly divide by 62, map remainders to charset
Charset: 0-9 (0-9), a-z (10-35), A-Z (36-61)
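
A minimal encode function following that procedure, with the charset exactly as listed above (an illustrative TypeScript sketch, not any particular service's code):

// Base62 encode: repeatedly divide by 62, map remainders to the charset.
// Charset order matches the scheme above: 0-9, a-z, A-Z.
const CHARSET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

function toBase62(n: bigint): string {
  if (n === 0n) return '0';
  let code = '';
  while (n > 0n) {
    code = CHARSET[Number(n % 62n)] + code;  // prepend the remainder's character
    n = n / 62n;                             // integer division
  }
  return code;
}

toBase62(1_000_000n);      // "4c92"
toBase62(56_800_235_583n); // "ZZZZZZ"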

The counter source matters. A single auto-increment sequence is a bottleneck. Solutions:

  • Range-based allocation: Each app server pre-fetches a range (e.g., server A gets 1M-2M, server B gets 2M-3M). No coordination needed during normal operation. Zookeeper or etcd manages range assignment.
  • Snowflake-style IDs: Timestamp + machine ID + sequence number. More complex but fully decentralized.

Range-based is simpler and what I'd pick for an MVP that can handle the scale we defined.
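
A sketch of how a range-based allocator might look on each app server. claimRange is a hypothetical stand-in for the Zookeeper/etcd call, stubbed locally so the example runs standalone:

// Each app server claims a block of counter values up front, then hands
// them out locally with no coordination until the block runs out.

interface IdRange { start: number; end: number }   // half-open: [start, end)

// Hypothetical coordinator call. In production this would go to Zookeeper
// or etcd; a local stub keeps the sketch self-contained.
let cursor = 1_000_000;
async function claimRange(size: number): Promise<IdRange> {
  const start = cursor;
  cursor += size;
  return { start, end: cursor };
}

class CounterAllocator {
  private next = 0;
  private end = 0;

  constructor(private readonly blockSize = 100_000) {}

  async nextId(): Promise<number> {
    if (this.next >= this.end) {
      // Block exhausted: claim a fresh range from the coordinator.
      const range = await claimRange(this.blockSize);
      this.next = range.start;
      this.end = range.end;
    }
    return this.next++;
  }
}

// Usage: const id = await new CounterAllocator().nextId(); then Base62-encode it.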

the redirect path (where latency matters)

A redirect is a 301/302 HTTP response. The entire flow:

  1. Client requests sho.rt/abc123
  2. CDN edge checks cache — if hit, return 301 immediately (sub-10ms)
  3. Cache miss → hit Redis — if hit, return 301 (~5ms)
  4. Redis miss → query Postgres — return 301 (~20-50ms), populate Redis

Client ──▶ CDN Edge Cache ──miss──▶ Redis ──miss──▶ Postgres
  ◀── 301   ◀── 301 + cache set      ◀── 301 + cache set

301 vs 302: 301 (permanent redirect) lets browsers cache it forever — great for latency, terrible for analytics. 302 (temporary redirect) forces the browser to hit your server every time. Most shorteners use 302 because analytics is the business model.
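
On the origin side, the lookup is a plain cache-aside read. Here's a sketch of what that handler could look like, assuming Express, ioredis, and pg, plus a hypothetical urls(short_code, long_url) table; it answers with a 302 so every click still reaches our servers:

import Redis from 'ioredis';
import { Pool } from 'pg';
import express from 'express';

const redis = new Redis();          // assumption: Redis reachable locally
const db = new Pool();              // assumption: Postgres config via env vars
const app = express();

app.get('/:code', async (req, res) => {
  const code = req.params.code;

  // 1. Cache hit: answer straight from Redis.
  const cached = await redis.get(`url:${code}`);
  if (cached) return res.redirect(302, cached);

  // 2. Cache miss: fall back to the source of truth.
  const { rows } = await db.query(
    'SELECT long_url FROM urls WHERE short_code = $1',
    [code],
  );
  if (rows.length === 0) return res.status(404).send('Unknown short URL');

  // 3. Populate the cache for the next request, then redirect.
  await redis.set(`url:${code}`, rows[0].long_url, 'EX', 60 * 60 * 24);
  return res.redirect(302, rows[0].long_url);
});

app.listen(3000);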

Cache warming: Popular URLs get cached on first access. The Zipf distribution saves us — a tiny fraction of URLs receive the vast majority of clicks. In practice, a Redis instance with 10GB RAM caches the hot set comfortably.

analytics pipeline

Every redirect fires an event. You can't process this synchronously — that adds latency to the redirect. Instead:

  1. Redirect handler publishes event to Kafka (async, fire-and-forget)
  2. Stream processor enriches events (GeoIP lookup, device parsing from User-Agent)
  3. Aggregated results land in ClickHouse (columnar store optimized for analytics)

This decouples the redirect latency from analytics processing. The redirect stays fast. Analytics are eventually consistent — a 30-second delay is fine.
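
A sketch of that fire-and-forget publish from the redirect handler, assuming kafkajs and a hypothetical clicks topic; the handler never awaits the send, so a slow or unavailable broker cannot slow the redirect:

import { Kafka } from 'kafkajs';

const kafka = new Kafka({ clientId: 'redirect-service', brokers: ['kafka:9092'] });
const producer = kafka.producer();
await producer.connect();

// Called from the redirect handler. Deliberately not awaited there:
// the redirect response must not wait on the analytics pipeline.
export function recordClick(
  shortCode: string,
  req: { headers: Record<string, string | undefined>; ip?: string },
) {
  producer
    .send({
      topic: 'clicks', // hypothetical topic name
      messages: [{
        key: shortCode,
        value: JSON.stringify({
          shortCode,
          ts: Date.now(),
          ip: req.ip,                         // GeoIP enrichment happens downstream
          userAgent: req.headers['user-agent'],
          referrer: req.headers['referer'],
        }),
      }],
    })
    .catch((err) => console.error('click event dropped', err)); // fire-and-forget
}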

key tradeoffs

Deduplication on writes: If two users submit the same URL simultaneously, do they get the same short code? With counter-based encoding, they get different codes for the same URL. That's intentional — it lets you track analytics per creator. If you want deduplication, add a URL → short_code lookup table, but that's an extra read on every write.

Custom aliases: These break the counter-based model. You need a separate lookup to check if the alias is taken. Store custom aliases in the same table but mark them differently. Always validate: no offensive words, no collisions with generated codes.

Link expiration: Add an expires_at column and check it on redirect. Deletion is lazy: don't run batch jobs to purge expired links. When a redirect finds a record whose expires_at has passed, return 404, delete the row, and evict the cache entry.
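
A sketch of that lazy check on the read path, reusing the same hypothetical schema as the redirect sketch above (expires_at as a nullable timestamp); expired rows are pruned only when someone actually requests them:

import Redis from 'ioredis';
import { Pool } from 'pg';

const redis = new Redis();
const db = new Pool();

// Resolve a short code, treating expired links as gone. No batch cleanup
// job: rows are deleted lazily the first time an expired link is requested.
async function resolve(code: string): Promise<string | null> {
  const { rows } = await db.query(
    'SELECT long_url, expires_at FROM urls WHERE short_code = $1',
    [code],
  );
  if (rows.length === 0) return null;

  const { long_url, expires_at } = rows[0];
  if (expires_at && expires_at < new Date()) {
    await db.query('DELETE FROM urls WHERE short_code = $1', [code]); // prune the row
    await redis.del(`url:${code}`);                                   // and its cache entry
    return null;  // caller returns 404
  }
  return long_url;
}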

scaling to billions

At 10B redirects/month, the read path dominates. Scaling strategy:

  • CDN layer: Cloudflare or Fastly edge cache handles 80%+ of redirects. This is your most cost-effective scaling lever.
  • Redis cluster: Consistent hashing across 3-5 nodes. Handles cache misses from CDN.
  • Postgres read replicas: 3 replicas behind PgBouncer. Writes go to the primary, reads to replicas.
  • Horizontal app servers: Stateless, behind a load balancer. Scale to 20+ instances during traffic spikes.

The write path barely needs scaling at 40 writes/second. A single Postgres primary handles that with room to spare.

what I'd build for an MVP

Postgres + Redis + a single Node.js service + Cloudflare CDN. No Kafka, no ClickHouse — just log click events to a clicks table and run batch analytics queries nightly. That handles the first 10M URLs easily. Add the streaming pipeline when analytics latency matters.

The system design interview wants you to go big. Real engineering starts small and scales when the data forces you to.

