System Design: News Feed
The news feed is arguably the most important screen in social media. It's the first thing you see when you open Twitter, Instagram, or LinkedIn. It determines what content gets attention, which creators get reach, and ultimately how much time you spend in the app.
Behind that simple scrolling list is a distributed system that assembles personalized content for hundreds of millions of users, ranked by relevance, in under 200ms. Let's design it.
requirements
Functional:
- Users see a feed of posts from people they follow
- Feed is ranked by relevance (not purely chronological)
- Support text, images, videos, shared links
- New posts appear in followers' feeds within seconds
- Infinite scroll with cursor-based pagination
Non-functional:
- Feed generation latency under 200ms
- Support 500M daily active users
- A user follows up to 5,000 accounts
- The median user's feed draws from ~200 followees
- 99.99% availability
the core problem: fan-out
When User A publishes a post, it needs to appear in the feed of every follower. If A has 10M followers, that's 10M feed entries to create. This is the fan-out problem, and there are two fundamental approaches.
Fan-out on WRITE                    Fan-out on READ
  (push model)                        (pull model)

 User A posts                        User A posts
      │                                   │
      ▼                                   ▼
 Write to ALL                        Write to A's
follower feeds                       own timeline
      │                                   │
      ├──▶ Feed of User 1                 │
      ├──▶ Feed of User 2            User 1 opens app
      ├──▶ Feed of User 3                 │
      ├──▶ ...                            ▼
      └──▶ Feed of User N            Query all followees
                                     Merge + rank + return

Pros: Fast reads                     Pros: Fast writes
Cons: Expensive writes               Cons: Slow reads
      Stale data for edits                 High fan-in
Fan-out on write: Pre-compute every user's feed. When A posts, push the post ID into every follower's feed cache. Reads are instant — just fetch the pre-built feed. Downside: if A has 10M followers, that's 10M writes per post. Celebrity accounts become absurdly expensive.
Fan-out on read: Don't pre-compute anything. When User 1 opens the app, query all 200 accounts they follow, merge the results, rank them, and return. Writes are cheap (one record). Reads are expensive (200 queries merged in real-time).
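The two extremes can be sketched side by side. This is an in-memory stand-in, not a real API — all names here are illustrative:

```python
from collections import defaultdict

follower_feeds = defaultdict(list)   # user_id -> pre-computed feed (push model)
timelines = defaultdict(list)        # author_id -> that author's own posts
followers = defaultdict(set)         # author_id -> who follows them
followees = defaultdict(set)         # user_id -> who they follow

def fanout_on_write(author, post_id):
    """Push model: one write per follower at post time, instant reads."""
    timelines[author].append(post_id)
    for f in followers[author]:
        follower_feeds[f].append(post_id)   # O(follower count) writes

def fanout_on_read(user, limit=20):
    """Pull model: one cheap write at post time, expensive reads."""
    candidates = []
    for author in followees[user]:          # O(followee count) queries
        candidates.extend(timelines[author])
    return sorted(candidates, reverse=True)[:limit]  # merge + "rank"
```

The asymmetry is visible in the loops: `fanout_on_write` pays per follower on every post, `fanout_on_read` pays per followee on every feed load.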
the hybrid approach (what everyone actually uses)
Neither extreme works at scale. The solution is hybrid:
- Fan-out on write for normal users (accounts with < 10k followers)
- Fan-out on read for celebrity/high-follower accounts
Hybrid Fan-out Architecture:
User posts ──▶ Post Service ──▶ Fan-out Service
                                       │
                     ┌─────────────────┼─────────────────┐
                     │                 │                 │
            < 10k followers?           │         ≥ 10k followers?
                     │                 │                 │
                     ▼                 │                 ▼
              Push to each             │            Store in
              follower's               │            celebrity
              feed cache               │            timeline only
                     │                 │                 │
                     └─────────────────┼─────────────────┘
                                       │
                                 Feed Request
                                       │
                                       ▼
                              Merge cached feed
                              + celebrity timelines
                              + rank
                              + return
This hybrid shows up across the industry: Twitter's timeline infrastructure merges celebrity posts in at read time, and Instagram uses a similar split. The threshold (10k followers here) varies by system — some use dynamic thresholds based on factors like post frequency.
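The hybrid split can be captured in a small sketch. This is an in-memory toy, not a production design — the class and method names are invented for illustration:

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # illustrative cutoff; real systems tune this

class HybridFanout:
    """Minimal in-memory sketch of hybrid fan-out."""
    def __init__(self):
        self.followers = defaultdict(set)             # author -> follower ids
        self.feed_cache = defaultdict(list)           # user -> pushed post ids
        self.celebrity_posts = defaultdict(list)      # celebrity -> own posts
        self.celebrities_followed = defaultdict(set)  # user -> celebs followed

    def follow(self, user, author, author_is_celebrity=False):
        self.followers[author].add(user)
        if author_is_celebrity:
            self.celebrities_followed[user].add(author)

    def publish(self, author, post_id):
        if len(self.followers[author]) < CELEBRITY_THRESHOLD:
            for f in self.followers[author]:          # push path: pre-compute
                self.feed_cache[f].append(post_id)
        else:
            self.celebrity_posts[author].append(post_id)  # pull path: store once

    def read_feed(self, user, limit=20):
        # Merge the pre-computed cache with celebrity timelines at read time.
        merged = list(self.feed_cache[user])
        for celeb in self.celebrities_followed[user]:
            merged.extend(self.celebrity_posts[celeb])
        return sorted(merged, reverse=True)[:limit]   # newest-first by post id
```

A celebrity post costs one write no matter how many followers they have; the price is a small merge on every follower's read.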
the full architecture
┌──────────┐    ┌────────────┐    ┌──────────────┐
│  Client  │───▶│   API GW   │───▶│ Post Service │
└──────────┘    └──────┬─────┘    └──────┬───────┘
                       │                 │
                       │          ┌──────▼───────┐
                       │          │   Fan-out    │
                       │          │   Service    │
                       │          └──┬────────┬──┘
                       │             │        │
                 ┌─────▼────┐   ┌────▼────┐ ┌─▼──────────┐
                 │  Feed    │   │  Redis  │ │ Post Store │
                 │ Service  │   │  (feed  │ │ (Postgres/ │
                 │ (merge + │   │  cache) │ │ Cassandra) │
                 │  rank)   │   └─────────┘ └────────────┘
                 └─────┬────┘
                       │
                 ┌─────▼─────┐
                 │  Ranking  │
                 │  Service  │
                 │ (ML model)│
                 └───────────┘
feed cache design
Each user's feed is a sorted set in Redis. The member is the post ID, and the score is either the timestamp or a relevance score.
Redis Feed Cache:
Key: feed:{user_id}
Type: Sorted Set
Members: post IDs
Scores: timestamp (or ranking score)
Example:
feed:user_123
post_789 → 1709142000 (most recent)
post_456 → 1709138400
post_234 → 1709134800
...
(keep last 800 entries, evict oldest)
Limit each user's feed cache to ~800 posts — nobody scrolls past page 20. This keeps Redis memory predictable: 500M users × 800 post IDs × 16 bytes ≈ 6.4TB of raw ID data (actual memory is higher once sorted-set overhead is counted). That's a Redis cluster on the order of 100 nodes, which is large but manageable for a company at this scale.
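The cap-and-evict behavior maps directly onto Redis sorted-set commands (ZADD to insert, ZREMRANGEBYRANK to trim, ZREVRANGE to read newest-first). Here's a pure-Python stand-in for one `feed:{user_id}` key that mirrors those semantics — a real deployment would issue the Redis commands instead:

```python
import bisect

FEED_CAP = 800  # keep roughly the last 800 entries, as above

class FeedCache:
    """Stand-in for one Redis sorted set keyed feed:{user_id}."""
    def __init__(self):
        self.entries = []   # kept sorted as (score, post_id) tuples

    def add(self, post_id, score):
        bisect.insort(self.entries, (score, post_id))  # like ZADD
        if len(self.entries) > FEED_CAP:
            # like ZREMRANGEBYRANK 0 -(FEED_CAP+1): evict lowest scores
            self.entries = self.entries[-FEED_CAP:]

    def page(self, limit=20):
        # like ZREVRANGE 0 limit-1: highest score (newest) first
        return [pid for _, pid in reversed(self.entries[-limit:])]
```

Trimming on every write keeps each key bounded, which is what makes the cluster-wide memory math above predictable.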
ranking algorithm
A chronological feed is simple but produces terrible engagement. The ranking service scores each post by predicted relevance to the specific user.
Feature signals:
- Affinity: How often does this user interact with the author? (likes, comments, DMs)
- Content type: Does this user prefer images, videos, or text?
- Recency: Exponential decay — posts lose relevance over time
- Engagement velocity: How fast is this post getting likes? Early engagement predicts broad appeal
- Social proof: Did the user's close friends engage with it?
Ranking Score (simplified):
score = w1 * affinity(user, author)
+ w2 * content_type_preference(user, post.type)
+ w3 * recency_decay(post.created_at)
+ w4 * engagement_velocity(post)
+ w5 * social_proof(user.friends, post)
Where weights w1-w5 are learned via ML
(typically a gradient-boosted tree or neural ranker)
Two-phase ranking: Don't run the ML model on all 800 cached posts. Phase 1: lightweight scoring to pick the top 200 candidates. Phase 2: full ML model on the 200 candidates to produce the final ranked feed. This keeps latency under 100ms even with a heavy model.
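The two-phase funnel can be sketched concretely. The weights, half-life, and `heavy_score` formula below are invented placeholders standing in for a trained model — only the shape (cheap pass to ~200 candidates, expensive pass on the survivors) reflects the design above:

```python
import math

# Illustrative weights; in production these are learned, not hand-set.
W_AFFINITY, W_RECENCY, W_VELOCITY = 0.5, 0.3, 0.2
HALF_LIFE_S = 6 * 3600  # assumed recency half-life of 6 hours

def recency_decay(created_at, now):
    """Exponential decay: a post loses half its recency value per half-life."""
    age = max(now - created_at, 0.0)
    return math.exp(-math.log(2) * age / HALF_LIFE_S)

def light_score(post, now):
    """Phase 1: cheap features only — used to cut ~800 posts to ~200."""
    return recency_decay(post["created_at"], now)

def heavy_score(post, affinity, now):
    """Phase 2 stand-in: weighted sum in place of the real ML ranker."""
    velocity = post["likes_per_min"] / (1 + post["likes_per_min"])
    return (W_AFFINITY * affinity
            + W_RECENCY * recency_decay(post["created_at"], now)
            + W_VELOCITY * velocity)

def rank_feed(posts, affinities, now, phase1_k=200, page=20):
    # Phase 1: lightweight pre-ranking over the whole cached feed.
    candidates = sorted(posts, key=lambda p: light_score(p, now),
                        reverse=True)[:phase1_k]
    # Phase 2: heavy model runs only on the survivors.
    candidates.sort(key=lambda p: heavy_score(p, affinities[p["author"]], now),
                    reverse=True)
    return candidates[:page]
```

The latency win comes from the funnel shape: the expensive scorer sees 200 posts per request instead of 800.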
pagination
Offset-based pagination breaks in feeds because new posts are constantly inserted. Use cursor-based pagination.
The cursor is the last post ID (or score) the client received. The next page request says "give me 20 posts scored lower than this cursor." This is stable regardless of new insertions.
Request: GET /feed?cursor=post_456&limit=20
Response: {
posts: [...20 posts...],
next_cursor: "post_234",
has_more: true
}
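The request/response shape above can be implemented in a few lines. A sketch, using scores as cursors over an in-memory list (field names match the example, everything else is illustrative):

```python
def feed_page(entries, cursor=None, limit=20):
    """Cursor pagination over (score, post_id) pairs, newest-first.
    `cursor` is the score of the last item the client saw; we return
    only items strictly below it, so inserts at the top of the feed
    cannot shift or duplicate later pages."""
    items = sorted(entries, key=lambda e: e[0], reverse=True)
    if cursor is not None:
        items = [e for e in items if e[0] < cursor]
    page = items[:limit]
    next_cursor = page[-1][0] if len(page) == limit else None
    return {
        "posts": [pid for _, pid in page],
        "next_cursor": next_cursor,
        "has_more": len(items) > limit,
    }
```

Note the stability property: a new post changes page 1 on the next refresh, but page 2 (fetched via the cursor) is unaffected.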
handling post updates and deletes
When a post is edited or deleted after fan-out:
Lazy invalidation: Don't update every follower's feed cache. When the Feed Service retrieves posts from cache, it fetches the full post data from the Post Store. If the post is deleted, skip it. If it's edited, use the latest version. The cache only stores post IDs, not content.
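Lazy invalidation falls out naturally from the read path, since the hydration step already has to fetch post bodies. A sketch — `post_store` here is any mapping of post ID to post record, standing in for the real Post Store:

```python
def hydrate_feed(cached_post_ids, post_store):
    """Turn cached post IDs into full posts, applying lazy invalidation:
    deleted posts are silently skipped, and edits show up automatically
    because we always read the current version from the store."""
    feed = []
    for pid in cached_post_ids:
        post = post_store.get(pid)
        if post is None:        # deleted since fan-out: drop it here
            continue
        feed.append(post)       # latest version, including any edits
    return feed
```

The cost is a slightly short page when many cached posts turn out to be deleted; in practice the Feed Service over-fetches from the cache to compensate.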
Active invalidation for high-priority cases: If a post is deleted for policy violations, you might want to actively remove it from feeds. Publish a "post_deleted" event that the Fan-out Service processes in reverse — remove the post ID from every follower's feed cache.
the cold start problem
New users have no feed cache. Users who haven't opened the app in 6 months have a stale cache. Both hit the same problem: you need to generate a feed from scratch.
Solution: pre-computed "trending" feed. Maintain a global trending feed that's updated every few minutes. Serve this to cold users immediately while building their personalized feed in the background. Within 2-3 sessions, the personalized feed takes over.
Backfill worker: When a user follows someone new, backfill the last 20 posts from that account into the user's feed cache. This prevents the "I followed someone but don't see their posts" problem.
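The backfill step is a small merge. A sketch, assuming both the user's cached feed and the author's timeline are lists of `(timestamp, post_id)` pairs (that representation is an assumption for illustration):

```python
def backfill_on_follow(user_feed, author_posts, backfill_n=20):
    """When a user follows a new account, splice that account's most
    recent posts into the user's cached feed, deduplicated and kept
    in timestamp order."""
    recent = sorted(author_posts)[-backfill_n:]    # last N by timestamp
    return sorted(set(user_feed) | set(recent))    # merge + dedupe
```

In production this runs as an async worker off a "follow" event, so the follow action itself stays fast.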
key tradeoffs
Relevance vs. recency: Ranking by relevance keeps engagement high but creates filter bubbles. Pure chronological gives you everything but buries good content under noise. Most platforms let users choose (Twitter's "For You" vs. "Following" tabs). The ranking model should have a recency floor — no post older than 48 hours in the main feed.
Consistency vs. latency: A user's feed doesn't need to be perfectly consistent. If a post appears 5 seconds late, nobody notices. If feed generation takes 2 seconds, everyone notices. Always favor latency over consistency in feeds.
Storage cost vs. read speed: Fan-out on write burns storage and write throughput. Fan-out on read burns compute on every feed request. The hybrid approach optimizes the common case (normal users, fast reads) and accepts higher latency for the uncommon case (celebrity posts, which get merged on read).
Engagement optimization vs. user well-being: The ranking model can optimize for time spent, but that's how you end up in front of Congress. Modern feed design includes diversity signals (don't show 10 posts about the same topic in a row), serendipity (surface content from outside the user's bubble), and controls (let users say "show less like this").
what the big companies do
- Twitter/X: Hybrid fan-out. The "For You" tab sources candidates through systems like SimClusters (community-based embeddings) and Earlybird (the real-time search index), then scores them with a neural "heavy ranker." A timeline mixer merges and deduplicates.
- Instagram: Fan-out on write for the main feed. ML ranking with a heavy model (they can afford the compute). Stories use a separate, simpler system.
- LinkedIn: Fan-out on write with a multi-objective ranker balancing engagement, content quality, and network diversity.
The pattern: everyone uses a hybrid fan-out with ML ranking. The differentiator is the ranking model quality and how well they handle the celebrity problem.