23 min read Last updated on Feb 6, 2026

Design a Social Feed (Facebook/Instagram)

A comprehensive system design for social feed generation and ranking covering fan-out strategies, ML-powered ranking, graph-based storage, and caching at scale. This design addresses sub-second feed delivery for billions of users with personalized content ranking, handling the “celebrity problem” where a single post can require millions of fan-out operations.

High-level architecture: aggregators query leaf servers for candidates, the ranking service applies ML models, the fan-out service distributes posts to follower feeds, and TAO provides graph storage.

Social feed systems solve three interconnected problems: efficient content distribution (getting posts to followers), personalized ranking (showing the most relevant content first), and low-latency retrieval (sub-second feed loads at massive scale).

Core architectural decisions:

| Decision | Choice | Rationale |
|---|---|---|
| Fan-out strategy | Hybrid push/pull | Push for regular users, pull for celebrities (>10K followers) |
| Content storage | TAO-like graph store | Objects + associations model fits social data naturally |
| Feed ranking | Multi-stage ML funnel | Billions → hundreds → ~50 candidates through progressive filtering |
| Caching | Two-tier regional | Followers handle reads, leaders maintain consistency |
| Consistency | Eventual (~1 min) | Acceptable for social content; strong only for engagement counts |

Key trade-offs accepted:

  • Eventual consistency for feed updates (fresh-enough is good enough)
  • Higher write amplification for regular users (pre-computed feeds)
  • Complex hybrid logic for celebrity handling
  • ML model staleness (hourly retraining) in exchange for training efficiency

What this design optimizes:

  • p99 feed load < 500ms globally
  • Zero data loss for user-generated content
  • Personalized ranking with 1000+ signals
  • Linear horizontal scaling to billions of users
Functional requirements:

| Requirement | Priority | Notes |
|---|---|---|
| Home feed generation | Core | Aggregated posts from followed users |
| Post creation | Core | Text, images, videos with privacy settings |
| Feed ranking | Core | Personalized relevance ordering |
| Real-time feed updates | Core | New posts appear without full refresh |
| Engagement actions | Core | Like, comment, share, save |
| Following/followers | Core | Asymmetric social graph |
| Feed pagination | Core | Infinite scroll with cursor-based loading |
| Post visibility | Extended | Public, friends-only, custom lists |
| Stories/ephemeral content | Extended | 24-hour expiring content |
| Algorithmic vs. chronological toggle | Extended | User preference for feed type |
Non-functional requirements:

| Requirement | Target | Rationale |
|---|---|---|
| Availability | 99.99% (4 nines) | Revenue-critical, user retention |
| Feed load latency | p99 < 500ms | User experience threshold |
| Post publish latency | p99 < 2s | Acceptable for async processing |
| Feed freshness | < 1 minute | Balance between freshness and efficiency |
| Ranking model latency | p99 < 100ms | Real-time personalization requirement |
| Data durability | 99.999999% | No user content loss |

Users:

  • Monthly Active Users (MAU): 2 billion
  • Daily Active Users (DAU): 1 billion (50% of MAU)
  • Peak concurrent users: 200 million (20% of DAU)

Traffic:

  • Feed loads per user per day: 20
  • Daily feed requests: 1B × 20 = 20 billion/day = 230K RPS
  • Peak multiplier: 3× → 700K RPS
  • Posts per user per day: 0.5 average (power law distribution)
  • Daily posts: 500 million posts/day = 5.8K posts/second

Storage:

  • Average post size: 2KB (metadata, excluding media)
  • Media per post: 500KB average (after compression, CDN-served)
  • Daily post storage: 500M × 2KB = 1TB/day (metadata)
  • Daily media storage: 500M × 500KB = 250TB/day
  • Social graph edges: 2B users × 500 avg connections = 1 trillion edges
  • Graph storage: 1T edges × 100 bytes = 100TB

Fan-out estimation:

  • Average followers: 500
  • Posts requiring fan-out: 500M/day
  • Fan-out writes: 500M × 500 = 250 billion writes/day
  • Celebrity optimization: Top 1% (20M users) average 50K followers
  • Without optimization: 20M × 50K × 0.5 posts = 500 billion additional writes/day from celebrities alone (untenable)
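The estimates above can be sanity-checked with simple arithmetic; this sketch uses the assumed figures from this section (1B DAU, 20 loads/user/day, 3× peak multiplier, 20M celebrity accounts averaging 50K followers and 0.5 posts/day):

```typescript
// Back-of-envelope check of the traffic and fan-out estimates above.
const DAU = 1_000_000_000;          // daily active users (assumed)
const FEED_LOADS_PER_USER = 20;     // feed loads per user per day (assumed)
const SECONDS_PER_DAY = 86_400;

const dailyFeedRequests = DAU * FEED_LOADS_PER_USER;                // 20 billion/day
const averageRps = Math.round(dailyFeedRequests / SECONDS_PER_DAY); // ~230K RPS
const peakRps = averageRps * 3;                                     // ~700K RPS

// Celebrity fan-out without the pull optimization:
// 20M celebrity accounts x 50K followers x 0.5 posts/day
const unoptimizedCelebrityWrites = 20_000_000 * 50_000 * 0.5;       // 500 billion/day
```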

Path A: Fan-out on Write (Push)

Best when:

  • Uniform follower distribution (no celebrities)
  • Read-heavy workload (reads >> writes)
  • Real-time feed freshness is critical
  • Simpler operational model preferred

Architecture:

Mermaid diagram

Key characteristics:

  • Pre-computed feeds in Redis/Memcached
  • O(1) feed reads, O(followers) writes
  • Feed cache stores post IDs in sorted order

Trade-offs:

  • Extremely fast feed reads (single cache lookup)
  • Simple feed retrieval logic
  • Predictable read latency
  • Massive write amplification for celebrities
  • High storage cost (duplicate post references)
  • Delayed delivery during fan-out processing

Real-world example: Twitter historically used this approach, storing up to 800 tweets per timeline in Redis. At 39M QPS with 105TB RAM across 10,000+ Redis instances, this works for most users but requires special handling for high-follower accounts.

Path B: Fan-out on Read (Pull)

Best when:

  • Celebrity-heavy platform (many high-follower users)
  • Write-heavy workload
  • Storage cost is a primary concern
  • Acceptable higher read latency

Architecture:

Mermaid diagram

Key characteristics:

  • No write amplification
  • Feed computed on demand
  • Read complexity: O(followees × posts_per_user)

Trade-offs:

  • No write amplification
  • Always fresh content
  • Lower storage requirements
  • Higher read latency (aggregation required)
  • More compute per request
  • Harder to apply complex ranking
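The read-time aggregation a pull model performs can be sketched as a merge over per-followee timelines. The `TimelinePost` shape is hypothetical; a production system would use a heap over per-author cursors instead of materializing and sorting every candidate:

```typescript
// Hypothetical post shape for illustration.
interface TimelinePost { id: string; authorId: string; createdAt: number }

// Merge each followee's time-sorted posts into one feed, newest first.
function mergeTimelines(timelines: TimelinePost[][], limit: number): TimelinePost[] {
  return timelines
    .flat()                                    // naive: touch every candidate
    .sort((a, b) => b.createdAt - a.createdAt) // newest first
    .slice(0, limit);
}
```

This is exactly why the read path costs O(followees × posts_per_user): every request repeats this aggregation.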

Real-world example: Facebook moved to this approach in 2007. The Multifeed system fetches tens of thousands of potential updates per user, then ranks and filters to ~45 items. This requires sophisticated indexing and caching to maintain sub-second latency.

Path C: Hybrid Fan-out

Best when:

  • Mixed follower distribution (most users small, some celebrities)
  • Need to optimize both reads and writes
  • Can handle additional system complexity
  • Facebook/Instagram/Twitter scale

Architecture:

Mermaid diagram

Key characteristics:

  • Threshold-based routing (typically 10K followers)
  • Pre-computed feeds + on-demand celebrity merging
  • Best of both worlds for common cases

Trade-offs:

  • Optimal latency for typical users (pre-computed)
  • Handles celebrities without write explosion
  • Flexible threshold tuning
  • Two code paths to maintain
  • More complex feed read logic
  • Edge cases around threshold (users crossing 10K)

Real-world example: Discord uses this approach: direct fan-out for small servers, Kafka-based distribution for large servers (100+ members). Instagram similarly uses hybrid fan-out with ML-based feed generation.

| Factor | Fan-out Write | Fan-out Read | Hybrid |
|---|---|---|---|
| Read latency | ~10ms | ~200ms | ~50ms |
| Write latency | O(followers) | O(1) | O(followers) for small |
| Celebrity handling | Impossible | Native | Optimized |
| Storage cost | High | Low | Medium |
| Complexity | Low | Medium | High |
| Production examples | Twitter (pre-2015) | Facebook Multifeed | Instagram, Discord |

This article focuses on Path C (Hybrid) because:

  1. Represents modern production architectures at Facebook/Instagram scale
  2. Demonstrates trade-off thinking essential for system design
  3. Handles the full spectrum from regular users to celebrities
  4. Balances latency, storage, and operational complexity
Mermaid diagram

The feed generation layer follows Facebook’s Multifeed architecture with disaggregated tiers:

Aggregator Tier:

  • CPU-intensive query processing and ranking
  • Stateless, horizontally scalable
  • Queries multiple leaf servers in parallel
  • Applies ML ranking models to candidates

Leaf Tier:

  • Memory-intensive, stores recent action indices
  • Indexes posts by author, sorted by time
  • Maintains in-memory structures for fast retrieval
  • Sharded by user ID

Tailer:

  • Real-time data pipeline from Kafka
  • Updates leaf indices as posts are created
  • Handles index rebuilding from persistent storage

Design decisions:

| Decision | Choice | Rationale |
|---|---|---|
| Tier separation | Disaggregated aggregator/leaf | 40% efficiency gain, independent scaling |
| Leaf sharding | By author user_id | Co-locates author's posts for range queries |
| Index structure | Time-sorted skip list | Fast range queries with O(log n) insert |
| Memory management | LRU eviction, flash overflow | Balance hot data in RAM, cold on SSD |

Manages the social graph (follows, friends, blocks) and content relationships:

Data model:

  • Objects: Users, posts, comments, pages (nodes)
  • Associations: Follows, likes, comments, tags (edges)

Design characteristics:

  • Two tables only: objects and associations
  • Associations stored on source object’s shard
  • Enables single-shard queries for common patterns
Mermaid diagram

Handles post distribution to follower feeds:

Routing logic:

interface FanoutRouter {
  routePost(post: Post, author: User): FanoutStrategy
}

type FanoutStrategy =
  | { type: "push"; followers: string[] }
  | { type: "pull"; authorId: string }
  | { type: "hybrid"; pushTo: string[]; markForPull: boolean }

Threshold-based routing:

  • Authors with < 10K followers: Full push fan-out
  • Authors with 10K-1M followers: Hybrid (push to active followers, pull for rest)
  • Authors with > 1M followers: Pull only (mark post for read-time merging)
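A minimal routing implementation following these thresholds might look like the sketch below. The `FanoutStrategy` type is restated for self-containment, and the follower-lookup callbacks are hypothetical stand-ins for graph-store queries:

```typescript
type FanoutStrategy =
  | { type: "push"; followers: string[] }
  | { type: "pull"; authorId: string }
  | { type: "hybrid"; pushTo: string[]; markForPull: boolean };

interface AuthorInfo { id: string; followerCount: number }

const CELEBRITY_THRESHOLD = 10_000;
const PULL_ONLY_THRESHOLD = 1_000_000;

// getFollowers / getActiveFollowers are hypothetical graph-store lookups.
function routePost(
  author: AuthorInfo,
  getFollowers: (id: string) => string[],
  getActiveFollowers: (id: string) => string[],
): FanoutStrategy {
  if (author.followerCount < CELEBRITY_THRESHOLD) {
    return { type: "push", followers: getFollowers(author.id) };
  }
  if (author.followerCount < PULL_ONLY_THRESHOLD) {
    // Push only to recently active followers; everyone else pulls at read time.
    return { type: "hybrid", pushTo: getActiveFollowers(author.id), markForPull: true };
  }
  return { type: "pull", authorId: author.id };
}
```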

Multi-stage funnel progressively narrows candidates:

Mermaid diagram

Get Feed

Endpoint: GET /api/v1/feed

Query Parameters:

| Parameter | Type | Description |
|---|---|---|
| cursor | string | Pagination cursor (opaque) |
| limit | int | Items per page (default: 20, max: 50) |
| refresh | boolean | Force fresh feed generation |

Response (200 OK):

{
  "posts": [
    {
      "id": "post_abc123",
      "author": {
        "id": "user_456",
        "username": "johndoe",
        "displayName": "John Doe",
        "avatarUrl": "https://cdn.example.com/avatars/456.jpg",
        "isVerified": true
      },
      "content": {
        "type": "image",
        "text": "Beautiful sunset! 🌅",
        "media": [
          {
            "id": "media_789",
            "type": "image",
            "url": "https://cdn.example.com/posts/789.jpg",
            "thumbnailUrl": "https://cdn.example.com/posts/789_thumb.jpg",
            "width": 1080,
            "height": 1350,
            "altText": "Sunset over the ocean"
          }
        ]
      },
      "engagement": {
        "likeCount": 1542,
        "commentCount": 89,
        "shareCount": 23,
        "viewCount": 12450,
        "isLiked": false,
        "isSaved": false
      },
      "ranking": {
        "score": 0.89,
        "reason": "friend_interaction"
      },
      "createdAt": "2024-02-03T10:30:00Z",
      "visibility": "public"
    }
  ],
  "pagination": {
    "nextCursor": "eyJ0IjoxNzA2ODg2NDAwfQ",
    "hasMore": true
  },
  "meta": {
    "feedType": "ranked",
    "generatedAt": "2024-02-03T12:00:00Z"
  }
}
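One plausible way to implement the opaque cursor is base64url-encoded JSON carrying the last returned item's timestamp; the sample `nextCursor` value above happens to decode to `{"t":1706886400}`. This is an assumed scheme, not a documented format:

```typescript
// Assumed cursor payload: the ranking timestamp of the last post returned.
interface FeedCursor { t: number }

function encodeCursor(c: FeedCursor): string {
  // Node's Buffer supports base64url directly (no padding characters).
  return Buffer.from(JSON.stringify(c)).toString("base64url");
}

function decodeCursor(s: string): FeedCursor {
  return JSON.parse(Buffer.from(s, "base64url").toString("utf8"));
}
```

Keeping the cursor opaque lets the server change the pagination scheme (e.g. timestamp → ranking score + post ID) without breaking clients.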

Create Post

Endpoint: POST /api/v1/posts

Request:

{
  "content": {
    "text": "Hello world!",
    "mediaIds": ["upload_123", "upload_456"]
  },
  "visibility": "public",
  "allowComments": true,
  "allowSharing": true,
  "location": {
    "latitude": 37.7749,
    "longitude": -122.4194,
    "name": "San Francisco, CA"
  }
}

Response (201 Created):

{
  "id": "post_new789",
  "author": {
    "id": "user_123",
    "username": "currentuser"
  },
  "content": {
    "text": "Hello world!",
    "media": [...]
  },
  "createdAt": "2024-02-03T12:05:00Z",
  "visibility": "public",
  "fanoutStatus": "pending"
}

Feed Stream

Endpoint: GET /api/v1/feed/stream

Event Types:

event: new_post
data: {"postId": "post_xyz", "authorId": "user_456", "preview": "Check out..."}

event: engagement_update
data: {"postId": "post_abc", "likeCount": 1543, "commentCount": 90}

event: post_removed
data: {"postId": "post_old", "reason": "author_deleted"}

Like/Unlike Post

Endpoint: POST /api/v1/posts/{id}/like

Response (200 OK):

{
  "liked": true,
  "likeCount": 1543,
  "timestamp": "2024-02-03T12:10:00Z"
}

Get Comments

Endpoint: GET /api/v1/posts/{id}/comments

Query Parameters:

| Parameter | Type | Description |
|---|---|---|
| cursor | string | Pagination cursor |
| limit | int | Comments per page (default: 20) |
| sort | string | "top" (engagement), "newest", "oldest" |

Response (200 OK):

{
  "comments": [
    {
      "id": "comment_123",
      "author": {
        "id": "user_789",
        "username": "commenter",
        "avatarUrl": "..."
      },
      "text": "Great post!",
      "likeCount": 45,
      "replyCount": 3,
      "isLiked": false,
      "createdAt": "2024-02-03T10:35:00Z",
      "replies": []
    }
  ],
  "pagination": {
    "nextCursor": "...",
    "hasMore": true
  },
  "totalCount": 89
}
Error responses:

| Code | Error | When |
|---|---|---|
| 400 | INVALID_CONTENT | Post content violates rules |
| 401 | UNAUTHORIZED | Missing or invalid token |
| 403 | FORBIDDEN | User blocked or content restricted |
| 404 | POST_NOT_FOUND | Post doesn't exist or is deleted |
| 429 | RATE_LIMITED | Too many requests |
| 503 | FEED_UNAVAILABLE | Feed generation temporarily unavailable |

Rate limits:

| Endpoint | Limit | Window |
|---|---|---|
| Feed load | 60 | per minute |
| Post creation | 10 | per hour |
| Like/comment | 100 | per minute |
| Media upload | 50 | per hour |
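These quotas can be enforced with a simple fixed-window counter; the sketch below keeps state in memory, whereas a production deployment would hold the counters in Redis (e.g. INCR plus EXPIRE per window) so all API gateway instances share them:

```typescript
// Minimal fixed-window rate limiter keyed by user+endpoint.
class RateLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request in a fresh window.
      this.counts.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false; // over quota → 429
    entry.count++;
    return true;
  }
}
```

Fixed windows allow a burst of up to 2× the limit at window boundaries; a sliding-window or token-bucket variant smooths that out at the cost of more state.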
CREATE TABLE objects (
  id BIGINT PRIMARY KEY,
  type VARCHAR(50) NOT NULL,
  data BLOB,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  INDEX idx_type_created (type, created_at)
);

Object types:

| Type | Data Fields |
|---|---|
| user | username, display_name, avatar_url, bio, follower_count |
| post | author_id, content, visibility, like_count, comment_count |
| comment | post_id, author_id, text, like_count |
| media | post_id, url, type, dimensions, alt_text |
CREATE TABLE associations (
  id1 BIGINT NOT NULL,
  assoc_type VARCHAR(50) NOT NULL,
  id2 BIGINT NOT NULL,
  time BIGINT NOT NULL,
  data BLOB,
  PRIMARY KEY (id1, assoc_type, id2),
  INDEX idx_id1_type_time (id1, assoc_type, time DESC)
);

Association types:

| Type | id1 | id2 | Data |
|---|---|---|---|
| follows | follower_id | followee_id | created_at |
| authored | user_id | post_id | - |
| liked | user_id | post_id | created_at |
| commented | user_id | comment_id | - |
| tagged | post_id | user_id | - |

Sharding strategy:

  • Shard by id1 (source object)
  • Co-locates user’s follows, likes, authored posts
  • Enables single-shard queries for common patterns
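Routing by `id1` can be sketched as a hash of the source object's ID; the hash function below is an illustrative FNV-1a-style placement, not TAO's actual scheme (which, like most production systems, uses consistent hashing so shards can be added without remapping most keys):

```typescript
// Illustrative shard placement: every association with the same id1 lands on
// one shard, so "followees of user X" is a single-shard query.
function shardFor(id1: string, shardCount: number): number {
  // FNV-1a-style string hash (stand-in for a production hash function).
  let h = 2166136261;
  for (const ch of id1) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0) % shardCount;
}
```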
# Pre-computed feed (sorted set); score = ranking score or timestamp
ZADD feed:{user_id} {score} {post_id}

# Keep last 500 posts per feed
ZREMRANGEBYRANK feed:{user_id} 0 -501

# Feed metadata (hash)
HSET feed:meta:{user_id} last_generated 1706886400000 version 42 type "ranked"

# Celebrity posts index (for pull-based merging)
ZADD celebrity_posts:{author_id} {timestamp} {post_id}

# User's recent engagement (for ranking features)
ZADD user:engaged:{user_id} {timestamp} {post_id}
EXPIRE user:engaged:{user_id} 604800  # 7 days
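The trim-to-500 semantics of `ZADD` + `ZREMRANGEBYRANK 0 -501` can be modeled in plain TypeScript; this in-memory sketch stands in for the Redis sorted set and is useful for reasoning about the read path (`page` mirrors a `ZREVRANGE`, newest first):

```typescript
// In-memory model of the feed:{user_id} key above.
class FeedCache {
  private entries: { postId: string; score: number }[] = [];

  constructor(private maxSize = 500) {}

  add(postId: string, score: number): void {
    // ZADD upserts: re-adding a member replaces its score.
    this.entries = this.entries.filter((e) => e.postId !== postId);
    this.entries.push({ postId, score });
    this.entries.sort((a, b) => a.score - b.score); // ascending, like a sorted set
    if (this.entries.length > this.maxSize) {
      // ZREMRANGEBYRANK 0 -(maxSize+1): drop lowest-scored (oldest) entries.
      this.entries = this.entries.slice(this.entries.length - this.maxSize);
    }
  }

  page(limit: number): string[] {
    // ZREVRANGE 0 limit-1: newest first.
    return [...this.entries].reverse().slice(0, limit).map((e) => e.postId);
  }
}
```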

Post Index (Leaf Servers)

In-memory index structure on leaf servers:

interface PostIndex {
  // Posts by author, sorted by time (descending)
  authorIndex: Map<UserId, SortedSet<PostId, Timestamp>>
  // Posts by visibility for filtering
  visibilityIndex: Map<Visibility, Set<PostId>>
  // Recent posts global index (for trending)
  recentPosts: SortedSet<PostId, Timestamp>
}

interface SortedSet<K, S> {
  add(key: K, score: S): void
  range(start: S, end: S, limit: number): K[]
  remove(key: K): void
}
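A minimal array-backed implementation of this interface is sketched below (assumed semantics: ascending score order, `range` returning keys with `start <= score <= end`). The leaf servers described earlier would use a skip list for the same O(log n) insert without array shifting; the array version is just easier to follow:

```typescript
// Array-backed SortedSet: binary-search insert keeps items ordered by score.
class ArraySortedSet<K> {
  private items: { key: K; score: number }[] = [];

  add(key: K, score: number): void {
    this.remove(key); // re-adding a key updates its score
    let lo = 0, hi = this.items.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (this.items[mid].score < score) lo = mid + 1;
      else hi = mid;
    }
    this.items.splice(lo, 0, { key, score });
  }

  range(start: number, end: number, limit: number): K[] {
    return this.items
      .filter((i) => i.score >= start && i.score <= end)
      .slice(0, limit)
      .map((i) => i.key);
  }

  remove(key: K): void {
    this.items = this.items.filter((i) => i.key !== key);
  }
}
```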
-- Users table
CREATE TABLE users (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  username VARCHAR(50) UNIQUE NOT NULL,
  display_name VARCHAR(100),
  email VARCHAR(255) UNIQUE NOT NULL,
  password_hash VARCHAR(255) NOT NULL,
  avatar_url TEXT,
  bio TEXT,
  follower_count INT DEFAULT 0,
  following_count INT DEFAULT 0,
  post_count INT DEFAULT 0,
  is_verified BOOLEAN DEFAULT FALSE,
  is_celebrity BOOLEAN DEFAULT FALSE, -- >10K followers
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  INDEX idx_username (username),
  INDEX idx_email (email)
) ENGINE=InnoDB;

-- Posts table
CREATE TABLE posts (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  author_id BIGINT NOT NULL,
  content_text TEXT,
  content_type ENUM('text', 'image', 'video', 'link') NOT NULL,
  visibility ENUM('public', 'friends', 'private') DEFAULT 'public',
  like_count INT DEFAULT 0,
  comment_count INT DEFAULT 0,
  share_count INT DEFAULT 0,
  view_count INT DEFAULT 0,
  is_deleted BOOLEAN DEFAULT FALSE,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  INDEX idx_author_created (author_id, created_at DESC),
  INDEX idx_created (created_at DESC),
  FOREIGN KEY (author_id) REFERENCES users(id)
) ENGINE=InnoDB;

-- Social graph (follow relationships)
CREATE TABLE follows (
  follower_id BIGINT NOT NULL,
  followee_id BIGINT NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (follower_id, followee_id),
  INDEX idx_followee (followee_id),
  FOREIGN KEY (follower_id) REFERENCES users(id),
  FOREIGN KEY (followee_id) REFERENCES users(id)
) ENGINE=InnoDB;

-- Engagement (likes, saves)
CREATE TABLE post_likes (
  user_id BIGINT NOT NULL,
  post_id BIGINT NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (user_id, post_id),
  INDEX idx_post (post_id),
  FOREIGN KEY (user_id) REFERENCES users(id),
  FOREIGN KEY (post_id) REFERENCES posts(id)
) ENGINE=InnoDB;
| Data Type | Store | Rationale |
|---|---|---|
| Social graph | TAO (MySQL-backed) | Optimized for graph queries, association lists |
| User profiles | MySQL | ACID, moderate scale |
| Posts | MySQL + TAO cache | Durability + fast graph queries |
| Feed cache | Redis | Sub-ms reads, sorted sets for ranking |
| Post index | Leaf servers (memory) | Ultra-fast candidate retrieval |
| Media | S3 + CDN | Cost-effective, globally distributed |
| ML features | Feature store | Consistent features for training/serving |
| Analytics | ClickHouse | Time-series, aggregations |
class CandidateRetriever {
  private readonly leafClient: LeafClient
  private readonly taoClient: TAOClient
  private readonly redis: RedisCluster

  async getCandidates(userId: string): Promise<CandidateSet> {
    // 1. Get user's followees
    const followees = await this.taoClient.getAssociations(userId, "follows", { limit: 5000 })

    // 2. Fetch recent posts from each followee (parallel)
    const postPromises = followees.map((followee) =>
      this.leafClient.getAuthorPosts(followee.id2, {
        since: Date.now() - 7 * 24 * 60 * 60 * 1000, // 7 days
        limit: 50,
      }),
    )
    const postsByAuthor = await Promise.all(postPromises)

    // 3. Fetch celebrity posts for pull-based merging
    const celebrityFollowees = followees.filter((f) => f.data.isCelebrity)
    const celebrityPosts = await this.getCelebrityPosts(celebrityFollowees.map((f) => f.id2))

    // 4. Merge and deduplicate
    const allPosts = [...postsByAuthor.flat(), ...celebrityPosts]

    // 5. Apply eligibility filters
    const eligible = await this.filterEligible(userId, allPosts)

    return {
      candidates: eligible,
      source: {
        fromCache: postsByAuthor.flat().length, // posts fetched from leaf indices
        fromCelebrity: celebrityPosts.length,
      },
    }
  }

  private async filterEligible(userId: string, posts: Post[]): Promise<Post[]> {
    // Filter out: blocked authors, hidden posts, already seen
    const [blocked, hidden, seen] = await Promise.all([
      this.taoClient.getAssociations(userId, "blocks"),
      this.redis.smembers(`hidden:${userId}`),
      this.redis.smembers(`seen:${userId}`),
    ])
    const blockedSet = new Set(blocked.map((b) => b.id2))
    const hiddenSet = new Set(hidden)
    const seenSet = new Set(seen)

    return posts.filter(
      (post) =>
        !blockedSet.has(post.authorId) &&
        !hiddenSet.has(post.id) &&
        !seenSet.has(post.id) &&
        this.checkVisibility(userId, post),
    )
  }
}
class FeedRanker {
  private readonly featureStore: FeatureStore
  private readonly modelServer: ModelServer

  async rank(userId: string, candidates: Post[]): Promise<RankedPost[]> {
    // Stage 1: First-pass lightweight ranking
    const stage1 = await this.firstPassRank(userId, candidates)
    const top500 = stage1.slice(0, 500)

    // Stage 2: Second-pass neural network
    const stage2 = await this.secondPassRank(userId, top500)
    const top100 = stage2.slice(0, 100)

    // Stage 3: Final reranking with diversity
    const final = await this.finalRerank(userId, top100)
    return final.slice(0, 50)
  }

  private async firstPassRank(userId: string, candidates: Post[]): Promise<RankedPost[]> {
    // Lightweight features (can compute in-process)
    const features = candidates.map((post) => ({
      postId: post.id,
      recency: this.computeRecency(post.createdAt),
      authorAffinity: this.getAuthorAffinity(userId, post.authorId),
      contentType: post.contentType,
      engagementVelocity: this.getEngagementVelocity(post),
    }))

    // Simple linear model for speed
    const scores = features.map(
      (f) =>
        0.3 * f.recency +
        0.4 * f.authorAffinity +
        0.2 * f.engagementVelocity +
        0.1 * this.contentTypeBoost(f.contentType),
    )
    return this.sortByScore(candidates, scores)
  }

  private async secondPassRank(userId: string, candidates: Post[]): Promise<RankedPost[]> {
    // Fetch rich features from feature store
    const features = await this.featureStore.getBatch(
      candidates.map((c) => ({
        userId,
        postId: c.id,
        authorId: c.authorId,
      })),
    )

    // Neural network scoring
    const scores = await this.modelServer.predict("feed_ranking_v2", features)
    return this.sortByScore(candidates, scores)
  }

  private async finalRerank(userId: string, candidates: RankedPost[]): Promise<RankedPost[]> {
    // Apply diversity rules
    const diversified = this.applyDiversity(candidates, {
      maxPerAuthor: 2,
      contentTypeMix: { image: 0.4, video: 0.3, text: 0.3 },
      maxAds: 3,
    })

    // Apply integrity filters (misinformation, policy violations)
    return this.applyIntegrityFilters(diversified)
  }

  private applyDiversity(posts: RankedPost[], rules: DiversityRules): RankedPost[] {
    const result: RankedPost[] = []
    const authorCounts = new Map<string, number>()
    const typeCounts = new Map<string, number>()

    for (const post of posts) {
      const authorCount = authorCounts.get(post.authorId) || 0
      if (authorCount >= rules.maxPerAuthor) continue

      // Check content type distribution
      const typeCount = typeCounts.get(post.contentType) || 0
      const typeRatio = typeCount / (result.length + 1)
      const maxRatio = rules.contentTypeMix[post.contentType] || 0.5
      if (typeRatio > maxRatio && result.length > 10) continue

      result.push(post)
      authorCounts.set(post.authorId, authorCount + 1)
      typeCounts.set(post.contentType, typeCount + 1)
    }
    return result
  }
}
class FanoutService {
  private readonly CELEBRITY_THRESHOLD = 10_000
  private readonly BATCH_SIZE = 1000

  async fanoutPost(post: Post, author: User): Promise<FanoutResult> {
    const followerCount = author.followerCount
    if (followerCount < this.CELEBRITY_THRESHOLD) {
      return this.pushFanout(post, author)
    } else if (followerCount < 1_000_000) {
      return this.hybridFanout(post, author)
    } else {
      return this.pullOnly(post, author)
    }
  }

  private async pushFanout(post: Post, author: User): Promise<FanoutResult> {
    // Get all followers
    const followers = await this.getFollowers(author.id)

    // Process in batches to avoid memory pressure
    const batches = this.chunk(followers, this.BATCH_SIZE)
    let written = 0
    for (const batch of batches) {
      const pipeline = this.redis.pipeline()
      for (const followerId of batch) {
        // Add to follower's feed with timestamp score
        pipeline.zadd(`feed:${followerId}`, post.createdAt.getTime(), post.id)
        // Trim to max 500 posts
        pipeline.zremrangebyrank(`feed:${followerId}`, 0, -501)
      }
      await pipeline.exec()
      written += batch.length

      // Emit progress for monitoring
      this.metrics.increment("fanout.writes", batch.length)
    }

    return {
      strategy: "push",
      followersReached: written,
      duration: Date.now() - post.createdAt.getTime(),
    }
  }

  private async hybridFanout(post: Post, author: User): Promise<FanoutResult> {
    // Get active followers (engaged in last 7 days)
    const activeFollowers = await this.getActiveFollowers(author.id, 7)

    // Push to active followers only
    const pushResult = await this.pushToFollowers(post, activeFollowers)

    // Mark post for pull-based retrieval by inactive followers
    await this.redis.zadd(`celebrity_posts:${author.id}`, post.createdAt.getTime(), post.id)
    await this.redis.expire(`celebrity_posts:${author.id}`, 7 * 24 * 60 * 60)

    return {
      strategy: "hybrid",
      pushed: pushResult.count,
      markedForPull: author.followerCount - activeFollowers.length,
    }
  }

  private async pullOnly(post: Post, author: User): Promise<FanoutResult> {
    // Only index for pull-based retrieval
    await this.redis.zadd(`celebrity_posts:${author.id}`, post.createdAt.getTime(), post.id)
    // Update author's post index
    await this.leafClient.indexPost(post)

    return {
      strategy: "pull",
      indexed: true,
    }
  }
}

TAO uses a two-tier caching architecture with leasing for consistency:

class TAOCache {
  private readonly leaseTimeout = 10_000 // 10 seconds

  async get(id1: string, assocType: string): Promise<Association[] | null> {
    // Try follower cache first
    const followerResult = await this.followerCache.get(this.cacheKey(id1, assocType))
    if (followerResult) {
      this.metrics.increment("cache.hit.follower")
      return followerResult
    }

    // Try leader cache
    const leaderResult = await this.leaderCache.get(this.cacheKey(id1, assocType))
    if (leaderResult) {
      this.metrics.increment("cache.hit.leader")
      // Populate follower cache
      await this.followerCache.set(this.cacheKey(id1, assocType), leaderResult, { ttl: 300 })
      return leaderResult
    }

    // Cache miss - fetch from MySQL with lease
    return this.fetchWithLease(id1, assocType)
  }

  private async fetchWithLease(id1: string, assocType: string): Promise<Association[]> {
    const leaseKey = `lease:${id1}:${assocType}`

    // Try to acquire lease (ioredis argument order: expiry option before NX)
    const acquired = await this.redis.set(leaseKey, "1", "PX", this.leaseTimeout, "NX")
    if (!acquired) {
      // Another request is fetching - wait and retry
      await this.sleep(100)
      return this.get(id1, assocType)
    }

    try {
      // Fetch from MySQL
      const data = await this.mysql.query(
        `SELECT * FROM associations
         WHERE id1 = ? AND assoc_type = ?
         ORDER BY time DESC`,
        [id1, assocType],
      )

      // Populate both cache tiers
      await Promise.all([
        this.leaderCache.set(this.cacheKey(id1, assocType), data, { ttl: 3600 }),
        this.followerCache.set(this.cacheKey(id1, assocType), data, { ttl: 300 }),
      ])
      return data
    } finally {
      // Release lease
      await this.redis.del(leaseKey)
    }
  }

  async invalidate(id1: string, assocType: string): Promise<void> {
    // Delete from both tiers
    const key = this.cacheKey(id1, assocType)
    await Promise.all([this.followerCache.del(key), this.leaderCache.del(key)])
  }
}

Engagement counts (likes, comments) use write-behind pattern for efficiency:

const FLUSH_INTERVAL_MS = 5000 // 5 seconds

class EngagementService {
  private pendingUpdates = new Map<string, EngagementDelta>()

  async incrementLike(postId: string): Promise<void> {
    // Immediate Redis increment for read consistency
    await this.redis.hincrby(`post:${postId}`, "like_count", 1)
    // Buffer MySQL update
    this.bufferUpdate(postId, { likes: 1 })
  }

  private bufferUpdate(postId: string, delta: EngagementDelta): void {
    const existing = this.pendingUpdates.get(postId) || {
      likes: 0,
      comments: 0,
      shares: 0,
    }
    this.pendingUpdates.set(postId, {
      likes: existing.likes + (delta.likes || 0),
      comments: existing.comments + (delta.comments || 0),
      shares: existing.shares + (delta.shares || 0),
    })
  }

  // Periodic flush to MySQL (the decorator argument must be a module-level
  // constant, not an instance field, so the interval is hoisted above)
  @Scheduled(FLUSH_INTERVAL_MS)
  async flushToMySQL(): Promise<void> {
    const updates = new Map(this.pendingUpdates)
    this.pendingUpdates.clear()

    const queries = Array.from(updates.entries()).map(([postId, delta]) =>
      this.mysql.query(
        `UPDATE posts SET
           like_count = like_count + ?,
           comment_count = comment_count + ?,
           share_count = share_count + ?
         WHERE id = ?`,
        [delta.likes, delta.comments, delta.shares, postId],
      ),
    )
    await Promise.all(queries)
  }
}

For infinite scroll with thousands of potential posts:

interface VirtualFeedConfig {
  containerHeight: number
  estimatedItemHeight: number
  overscan: number // Extra items above/below viewport
}

class VirtualFeed {
  private heightCache = new Map<string, number>()
  private offsetCache: number[] = []

  constructor(private config: VirtualFeedConfig) {}

  calculateVisibleRange(scrollTop: number, posts: Post[]): { start: number; end: number } {
    // Binary search for start index
    const start = this.binarySearchOffset(scrollTop - this.config.overscan * 300)

    // Calculate end based on viewport
    let accumulatedHeight = this.offsetCache[start] || 0
    let end = start
    while (
      end < posts.length &&
      accumulatedHeight < scrollTop + this.config.containerHeight + this.config.overscan * 300
    ) {
      accumulatedHeight += this.getItemHeight(posts[end])
      end++
    }
    return { start, end }
  }

  private getItemHeight(post: Post): number {
    // Check cache first (filled with measured heights after render)
    if (this.heightCache.has(post.id)) {
      return this.heightCache.get(post.id)!
    }

    // Estimate based on content type
    let estimate = 100 // Base height
    if (post.content.media?.length > 0) {
      estimate += 400 // Image/video
    }
    if (post.content.text?.length > 200) {
      estimate += 50 // Long text
    }
    return estimate
  }

  // Called after actual render to update height cache
  onItemRendered(postId: string, actualHeight: number): void {
    this.heightCache.set(postId, actualHeight)
    this.rebuildOffsetCache()
  }
}
class FeedStore {
  private posts = new Map<string, Post>()
  private pendingLikes = new Set<string>()

  async likePost(postId: string): Promise<void> {
    const post = this.posts.get(postId)
    if (!post || this.pendingLikes.has(postId)) return

    // Optimistic update
    this.pendingLikes.add(postId)
    this.updatePost(postId, {
      engagement: {
        ...post.engagement,
        likeCount: post.engagement.likeCount + 1,
        isLiked: true,
      },
    })
    // Re-render immediately
    this.notifySubscribers(postId)

    try {
      await this.api.likePost(postId)
    } catch (error) {
      // Rollback to the pre-like snapshot on failure
      this.updatePost(postId, {
        engagement: {
          ...post.engagement,
          likeCount: post.engagement.likeCount,
          isLiked: false,
        },
      })
      this.notifySubscribers(postId)
    } finally {
      this.pendingLikes.delete(postId)
    }
  }
}
class FeedStreamManager {
  private eventSource: EventSource | null = null
  private reconnectAttempt = 0

  connect(userId: string): void {
    this.eventSource = new EventSource(`/api/v1/feed/stream?userId=${userId}`)

    this.eventSource.addEventListener("new_post", (event) => {
      const data = JSON.parse(event.data)
      this.onNewPost(data)
    })
    this.eventSource.addEventListener("engagement_update", (event) => {
      const data = JSON.parse(event.data)
      this.onEngagementUpdate(data)
    })
    this.eventSource.onerror = () => {
      this.scheduleReconnect()
    }
  }

  private onNewPost(data: NewPostEvent): void {
    // Show "New posts available" indicator instead of auto-inserting.
    // This prevents jarring scroll position changes.
    this.feedStore.setPendingPosts(data.count)
    this.ui.showNewPostsIndicator()
  }

  private onEngagementUpdate(data: EngagementUpdateEvent): void {
    // Update counts in place (non-disruptive)
    this.feedStore.updateEngagement(data.postId, data)
  }

  loadNewPosts(): void {
    // User clicked "Show new posts"
    const pending = this.feedStore.getPendingPosts()
    this.feedStore.prependPosts(pending)
    this.feedStore.clearPendingPosts()
    this.ui.scrollToTop()
  }
}
class FeedPrefetcher {
  private readonly PREFETCH_THRESHOLD = 5 // Posts from bottom
  private loading = false

  onScroll(visibleRange: { start: number; end: number }, totalPosts: number): void {
    const postsRemaining = totalPosts - visibleRange.end
    if (postsRemaining < this.PREFETCH_THRESHOLD && !this.loading) {
      this.prefetchNextPage()
    }
  }

  private async prefetchNextPage(): Promise<void> {
    this.loading = true
    try {
      const cursor = this.feedStore.getNextCursor()
      const response = await this.api.getFeed({ cursor, limit: 20 })

      // Add to store but don't re-render yet
      this.feedStore.appendPosts(response.posts)
      this.feedStore.setNextCursor(response.pagination.nextCursor)
    } finally {
      this.loading = false
    }
  }
}
| Component | Purpose | Options |
|---|---|---|
| API Gateway | Auth, rate limiting | Kong, Envoy, Nginx |
| Graph Store | Social graph | Custom TAO-like, Neo4j, DGraph |
| KV Cache | Feed cache, sessions | Redis, KeyDB, Dragonfly |
| Message Queue | Fan-out, events | Kafka, Pulsar, NATS |
| Relational DB | Users, posts | MySQL, PostgreSQL, CockroachDB |
| Object Store | Media files | MinIO, Ceph, S3-compatible |
| ML Serving | Ranking models | TensorFlow Serving, Triton, Seldon |
| Feature Store | ML features | Feast, Tecton, custom |
Mermaid diagram

Service configurations:

| Service | Configuration | Rationale |
|---|---|---|
| Aggregator | c5.4xlarge (16 vCPU, 32GB) | CPU-bound ranking computation |
| Leaf | r5.8xlarge (32 vCPU, 256GB) | Memory-intensive index storage |
| Fan-out workers | Fargate Spot | Cost-effective async processing |
| ElastiCache Redis | r6g.2xlarge cluster mode | Sub-ms feed cache reads |
| Aurora MySQL | db.r6g.4xlarge Multi-AZ | Durability, read replicas for scale |
| DynamoDB | On-demand mode | TAO-like graph store, auto-scaling |
| SageMaker | p3.2xlarge (1 GPU) | Neural network inference |
Mermaid diagram

Multi-region considerations:

  • Primary region for all writes (US-East)
  • Read replicas in each region for local read latency
  • Redis cross-region replication for feed cache
  • ~50-100ms replication lag acceptable for social content
  • Failover to secondary region if primary unavailable

This design provides a social feed system with:

  1. Sub-500ms feed generation via hybrid fan-out and pre-computed caches
  2. Personalized ranking using multi-stage ML funnel (1000+ signals)
  3. Near-linear horizontal scaling through disaggregated architecture and sharding
  4. Celebrity handling via pull-based merging avoiding write explosion
  5. Eventual consistency acceptable for social content (strong for engagement)

Key architectural decisions:

  • Hybrid fan-out balances latency (push) with scalability (pull)
  • TAO-like graph store optimizes social relationship queries
  • Multi-stage ranking funnel enables complex ML without latency impact
  • Two-tier caching with leasing prevents thundering herd

Known limitations:

  • ~1 minute eventual consistency for feed updates
  • Pull-based celebrity posts may have slightly higher latency
  • ML model staleness between hourly retraining windows
  • Complex operational model with multiple specialized tiers

Future enhancements:

  • Real-time ML model serving for instant personalization
  • Federated learning for privacy-preserving ranking
  • Graph neural networks for improved content understanding
  • Edge-based feed generation for reduced latency
Prerequisites:

  • Distributed systems fundamentals (caching, sharding, replication)
  • ML basics (training vs. serving, feature engineering)
  • Graph database concepts
  • Message queue patterns (pub/sub, fan-out)
| Term | Definition |
|---|---|
| Fan-out | Distributing a post to multiple follower feeds |
| TAO | Facebook's graph store (The Associations and Objects) |
| Aggregator | Service that queries and combines data from multiple sources |
| Leaf server | Memory-intensive server storing indexed data |
| Candidate retrieval | First stage of ranking: gathering potential items |
| Affinity | Strength of relationship between user and content/author |
| Leasing | Cache coordination pattern to prevent thundering herd |
Key takeaways:

  • Hybrid fan-out (push for regular users, pull for celebrities) balances write amplification with read latency
  • TAO-like graph storage with objects and associations models social data naturally and enables efficient graph queries
  • Multi-stage ML ranking (billions → hundreds → 50) enables sophisticated personalization without latency impact
  • Two-tier regional caching with followers and leaders maintains consistency while providing sub-ms reads
  • Eventual consistency (~1 minute) is acceptable for social content; engagement counts use write-behind pattern
  • Scale to 700K RPS feed loads with disaggregated aggregator/leaf architecture
