19 min read Last updated on Feb 6, 2026

Design Spotify Music Streaming

Spotify serves 675+ million monthly active users across 180+ markets, streaming from a catalog of 100+ million tracks. Unlike video platforms where files are gigabytes, audio files are megabytes—but the scale of concurrent streams, personalization depth, and the expectation of instant playback create unique challenges. This design covers the audio delivery pipeline, the recommendation engine that drives 30%+ of listening, offline sync, and the microservices architecture that enables 300+ autonomous teams to ship independently.

Diagram: High-level architecture. Clients connect through the API gateway to microservices; audio is delivered via multi-CDN; events flow through Pub/Sub to analytics.

Spotify’s architecture is shaped by three fundamental constraints:

  1. Audio is lightweight but latency-critical: A 3-minute track at 320 kbps is ~7 MB—trivial compared to video. But users expect instant playback on tap. The architecture optimizes for time-to-first-byte, not throughput.

  2. Personalization is the product: Discovery features (Discover Weekly, Daily Mix, Release Radar) drive 30%+ of streams. The recommendation system processes billions of listening events daily to generate personalized content for each user.

  3. Offline mode is a first-class feature: Premium subscribers can download thousands of tracks. This requires a license management system, intelligent sync, and storage management across devices.

The core mechanisms:

  • Multi-CDN delivery via Akamai, Fastly, and AWS CloudFront, with intelligent routing based on user location and CDN health
  • Ogg Vorbis encoding at multiple bitrates (96-320 kbps) with automatic quality adaptation based on network conditions
  • Cassandra for user data (playlists, listening history) with write-optimized schema design
  • Hybrid recommendation system combining collaborative filtering, content-based analysis (via Echo Nest audio features), and natural language processing
  • Google Cloud Platform for compute, storage, and data processing after 2016 migration from on-premise
| Requirement | Priority | Notes |
| --- | --- | --- |
| Audio playback | Core | Adaptive streaming, gapless playback, crossfade |
| Search | Core | Tracks, artists, albums, playlists, podcasts |
| Playlists | Core | Create, edit, collaborative playlists |
| Library management | Core | Save tracks, albums, follow artists |
| Offline downloads | Core | Premium feature, license-protected |
| Personalized recommendations | Core | Discover Weekly, Daily Mix, Release Radar |
| Social features | Extended | Friend activity, shared playlists |
| Podcasts | Extended | Episodes, shows, in-progress tracking |
| Lyrics | Extended | Synced lyrics display |
| Live events | Out of scope | Concerts, virtual events |
| Audiobooks | Out of scope | Separate purchase model |
| Requirement | Target | Rationale |
| --- | --- | --- |
| Playback availability | 99.99% | Revenue-critical, user retention |
| Time to first audio | p99 < 500 ms | User expectation for instant playback |
| Search latency | p99 < 200 ms | Responsive search experience |
| Recommendation freshness | < 24 hours | Daily personalization updates |
| Offline sync reliability | 99.9% | Downloaded content must play |
| Concurrent streams | 50M+ | Peak evening traffic globally |
| Catalog update latency | < 4 hours | New releases available quickly |

Spotify-scale baseline (2024):

Monthly active users: 675 million
Premium subscribers: 265 million (39%)
Free (ad-supported) users: 410 million (61%)
Catalog:
- Tracks: 100+ million
- Podcasts: 6+ million shows
- New tracks added daily: ~100,000
Streaming traffic:
- Average streams per DAU: ~25 tracks/day
- DAU estimate: 300M (45% of MAU)
- Daily streams: 7.5 billion
- Peak concurrent: ~50M streams
Audio file sizes (3-minute track):
- 96 kbps (Normal): ~2.2 MB
- 160 kbps (High): ~3.6 MB
- 320 kbps (Very High): ~7.2 MB
- Average effective: ~4 MB per track
Daily bandwidth:
- 7.5B streams × 4 MB = 30 PB/day
- With 90% CDN hit rate: 3 PB/day from origin

Storage estimation:

Audio storage:
- 100M tracks × 4 quality levels × 4 MB avg = 1.6 PB
- With metadata, artwork: ~2 PB total
User data:
- Playlists: 675M users × ~500 playlists × ~100 track entries ≈ 34 trillion references; at ~50 bytes each, roughly 1.7 PB
- Listening history: 300M DAU × 100 events/day × 30 days ≈ 900B events/month
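
The traffic and storage arithmetic above can be sanity-checked with a short script using the baseline figures:

```python
# Back-of-envelope check of the traffic and storage estimates above.
DAU = 300_000_000
STREAMS_PER_DAU = 25
AVG_TRACK_MB = 4

daily_streams = DAU * STREAMS_PER_DAU
daily_bandwidth_pb = daily_streams * AVG_TRACK_MB / 1_000_000_000  # MB -> PB
origin_pb = daily_bandwidth_pb * (1 - 0.90)  # 90% CDN hit rate

TRACKS = 100_000_000
QUALITY_LEVELS = 4
audio_storage_pb = TRACKS * QUALITY_LEVELS * AVG_TRACK_MB / 1_000_000_000

print(daily_streams)       # 7500000000
print(daily_bandwidth_pb)  # 30.0
print(origin_pb)           # ~3.0
print(audio_storage_pb)    # ~1.6
```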

Path A: Single CDN

Best when:

  • Smaller scale (< 100M users)
  • Geographic concentration
  • Simpler operations preferred

Architecture:

  • Single CDN provider (e.g., CloudFront)
  • Origin shield layer to reduce origin load
  • Simple routing via DNS

Trade-offs:

  • Simpler vendor management
  • Consistent caching behavior
  • Easier debugging
  • Single point of failure
  • Vendor lock-in on pricing
  • May have regional coverage gaps

Real-world example: SoundCloud relies primarily on AWS CloudFront for audio delivery.

Path B: Multi-CDN

Best when:

  • Massive global scale (100M+ users)
  • Need for high availability
  • Leverage competitive CDN pricing

Architecture:

  • Multiple CDN providers (Akamai, Fastly, AWS)
  • Real-time CDN health monitoring
  • Client-side CDN selection based on performance
  • Specialized CDNs for different content types

Trade-offs:

  • No single point of failure
  • Cost optimization through CDN arbitrage
  • Best performance per region
  • Leverage each CDN’s strengths
  • Complex routing logic
  • Inconsistent caching behavior
  • Multiple vendor relationships

Real-world example: Spotify uses Akamai and AWS for audio streaming, Fastly for images and UI assets.

Path C: P2P-assisted delivery, used by Spotify 2008-2014:

Spotify originally used peer-to-peer technology, with 80% of traffic served by peers in 2011.

Why they moved away:

  • Improved CDN economics at scale
  • Mobile devices (poor P2P participants)
  • Complexity of P2P on modern networks (NAT, firewalls)
  • Sufficient server capacity globally
| Factor | Single CDN | Multi-CDN | P2P-Assisted |
| --- | --- | --- | --- |
| Availability | 99.9% | 99.99% | Variable |
| Setup complexity | Low | High | Very high |
| Operating cost | Medium | Lower at scale | Lowest |
| Mobile support | Full | Full | Limited |
| Latency consistency | High | Medium | Variable |
| Best for | < 100M users | > 100M users | Cost-sensitive |

This article focuses on Path B (Multi-CDN) because:

  1. Spotify scale requires geographic diversity
  2. The multi-CDN pattern demonstrates advanced content delivery
  3. It represents the current industry standard for major streaming services
Diagram: Domain-driven microservices architecture with specialized data stores per domain.

Spotify uses gRPC with Protocol Buffers for inter-service communication:

Why gRPC:

  • Binary serialization (smaller payloads than JSON)
  • Strong typing via protobuf
  • Bidirectional streaming support
  • Code generation for multiple languages

Service mesh:

  • Envoy proxy for load balancing, observability
  • Circuit breakers for fault isolation
  • Automatic retries with exponential backoff
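
The retry behavior can be sketched client-side. A minimal version with exponential backoff and full jitter, where `ConnectionError` stands in for a transient gRPC status code:

```python
import random
import time

def call_with_retries(rpc, max_attempts=4, base_delay=0.05, max_delay=1.0):
    """Retry a failed RPC with exponential backoff and full jitter.

    `rpc` is any zero-argument callable; transient failures are assumed
    to raise ConnectionError (a simplification of real gRPC statuses).
    """
    for attempt in range(max_attempts):
        try:
            return rpc()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: sleep a random amount up to the capped backoff.
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, delay))

# Example: an RPC that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(call_with_retries(flaky))  # ok
```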
  1. User taps play → Client sends play request to Playback Service
  2. Playback Service validates → Checks subscription, licensing, availability
  3. License acquired → DRM key retrieved for encrypted content
  4. CDN URL returned → Client receives signed URL with CDN selection
  5. Audio streamed → Client fetches segments from edge CDN
  6. Prefetch triggered → Next track segments pre-fetched for gapless playback
  7. Event logged → Stream event sent to Pub/Sub for analytics

Every user action generates events:

Play event → Pub/Sub → Dataflow → BigQuery (analytics)
→ Bigtable (real-time features)
→ Recommendation update
Search event → Pub/Sub → Search ranking signals
Follow event → Pub/Sub → Social graph update

Event volume:

  • 1 trillion+ Pub/Sub messages per day
  • Sub-second end-to-end latency for real-time features
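
For concreteness, a play event might be structured like this before being handed to a Pub/Sub publisher client. The field names are illustrative, not Spotify's actual schema:

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass

@dataclass
class PlayEvent:
    """One listening event; field names are illustrative only."""
    event_id: str
    user_id: str
    track_id: str
    context_uri: str   # playlist/album that initiated playback
    position_ms: int   # how far into the track the event fired
    timestamp_ms: int

def make_play_event(user_id, track_id, context_uri, position_ms):
    return PlayEvent(
        event_id=str(uuid.uuid4()),
        user_id=user_id,
        track_id=track_id,
        context_uri=context_uri,
        position_ms=position_ms,
        timestamp_ms=int(time.time() * 1000),
    )

event = make_play_event("user123", "3n3Ppam7vgaVa1iaRUc9Lp",
                        "spotify:playlist:37i9dQZF1DXcBWIGoYBM5M", 30_000)
payload = json.dumps(asdict(event)).encode()  # bytes handed to the publisher
```

Downstream consumers (Dataflow, Bigtable feature writers) all read the same serialized record, which is what lets one event fan out to analytics and real-time features.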
Diagram: Multi-bitrate encoding. Each track is encoded to four quality levels for adaptive streaming.
| Quality | Bitrate | Format | Availability | File size (3 min) |
| --- | --- | --- | --- | --- |
| Low | 24 kbps | Ogg Vorbis | Free/Premium | ~0.5 MB |
| Normal | 96 kbps | Ogg Vorbis | Free/Premium | ~2.2 MB |
| High | 160 kbps | Ogg Vorbis | Free/Premium | ~3.6 MB |
| Very High | 320 kbps | Ogg Vorbis | Premium only | ~7.2 MB |
| Web | 256 kbps | AAC | Web player | ~5.8 MB |

Why Ogg Vorbis:

  • Open-source, royalty-free codec
  • Comparable quality to MP3 at lower bitrates
  • Better than MP3 at same bitrate (especially < 128 kbps)
  • Efficient hardware decoding on mobile devices

AAC for web:

  • Native browser support without plugins
  • Required for Safari/iOS web player

The client dynamically selects quality based on:

if network_type == "cellular" and data_saver_enabled:
    quality = LOW                  # 24 kbps
elif network_type == "cellular":
    quality = NORMAL               # 96 kbps
elif buffering_recently:
    quality = decrease_one_level()
elif buffer_healthy and bandwidth_sufficient:
    quality = user_preference      # up to 320 kbps

Buffer management:

  • Target buffer: 10-30 seconds of audio
  • Low watermark: 5 seconds (trigger quality drop)
  • High watermark: 30 seconds (allow quality increase)
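
The watermark rules above can be expressed as a small controller. A sketch, with quality levels indexed low to high:

```python
def adjust_quality(buffer_seconds, current_level, max_level,
                   low_watermark=5, high_watermark=30):
    """Return the next quality level index given buffer health.

    Levels are ordered low->high (e.g. 0=24 kbps ... 3=320 kbps). This is
    a simplified sketch of the watermark rules described above.
    """
    if buffer_seconds < low_watermark:
        return max(0, current_level - 1)          # starving: drop a level
    if buffer_seconds >= high_watermark:
        return min(max_level, current_level + 1)  # healthy: try a level up
    return current_level                          # in between: hold steady

print(adjust_quality(3, 2, 3))   # 1  (below low watermark)
print(adjust_quality(35, 2, 3))  # 3  (above high watermark)
print(adjust_quality(15, 2, 3))  # 2  (steady)
```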

For seamless album listening:

  1. Prefetch: Start fetching next track when current is 90% complete
  2. Decode ahead: Decode first 5 seconds of next track
  3. Crossfade boundary: Handle precise sample-accurate transitions
  4. Memory management: Release previous track’s buffer

Implementation challenges:

  • Different sample rates between tracks
  • Metadata gaps in some files
  • Client memory constraints on mobile
Diagram: CDN tiering. Akamai/AWS for latency-sensitive audio, Fastly for cacheable assets.
def select_cdn(user_location, content_type, cdns_health):
    """Select the optimal CDN for a request."""
    candidates = []
    for cdn, health in cdns_health.items():
        if not health.is_healthy:
            continue
        latency = get_latency_estimate(cdn, user_location)
        availability = health.availability_99p
        cost = get_cost_per_gb(cdn, user_location)
        score = (
            0.5 * normalize(latency, lower_is_better=True) +
            0.3 * normalize(availability, lower_is_better=False) +
            0.2 * normalize(cost, lower_is_better=True)
        )
        candidates.append((cdn, score))
    return max(candidates, key=lambda x: x[1])[0]
Audio: /{track_id}/{quality}/{segment}.ogg
Images: /{image_id}/{size}.jpg

Cache TTL strategy:

| Content type | TTL | Rationale |
| --- | --- | --- |
| Audio files | 1 year | Immutable content |
| Album artwork | 30 days | Rarely changes |
| Artist images | 7 days | Occasional updates |
| Playlist covers | 1 day | User-generated |
| API responses | 5 minutes | Balance freshness/load |

Audio URLs include authentication:

https://audio-cdn.spotify.com/tracks/{track_id}/320.ogg
?sig={hmac_signature}
&exp={expiration_timestamp}
&uid={user_id}

Signature validation:

  • HMAC-SHA256 with rotating keys
  • 1-hour expiration for streaming URLs
  • Rate limiting per user/IP
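
A sketch of the signing and edge-side verification, assuming a shared rotating secret and the URL layout shown above (the exact message format is an assumption):

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qsl, urlparse

SECRET_KEY = b"rotating-secret"  # assumption: fetched from a key-rotation service

def sign_stream_url(track_id, user_id, quality="320", ttl_seconds=3600):
    """Build a signed audio URL in the shape shown above (HMAC-SHA256, 1 h expiry)."""
    exp = int(time.time()) + ttl_seconds
    message = f"{track_id}:{quality}:{user_id}:{exp}".encode()
    sig = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return (f"https://audio-cdn.spotify.com/tracks/{track_id}/{quality}.ogg"
            f"?sig={sig}&exp={exp}&uid={user_id}")

def verify_stream_url(track_id, quality, user_id, exp, sig):
    """CDN-edge check: reject expired URLs, then recompute and compare the HMAC."""
    if int(exp) < time.time():
        return False
    message = f"{track_id}:{quality}:{user_id}:{exp}".encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

url = sign_stream_url("3n3Ppam7vgaVa1iaRUc9Lp", "user123")
params = dict(parse_qsl(urlparse(url).query))
ok = verify_stream_url("3n3Ppam7vgaVa1iaRUc9Lp", "320", "user123",
                       params["exp"], params["sig"])
print(ok)  # True
```

`hmac.compare_digest` is used instead of `==` so the comparison runs in constant time, closing a timing side channel on the signature check.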

Endpoint: POST /v1/me/player/play

Request:

{
  "context_uri": "spotify:playlist:37i9dQZF1DXcBWIGoYBM5M",
  "offset": {
    "position": 0
  },
  "position_ms": 0
}

Response (204 No Content on success)

Error Responses:

  • 401 Unauthorized: Invalid or expired token
  • 403 Forbidden: Premium required for this feature
  • 404 Not Found: Track/playlist not available
  • 429 Too Many Requests: Rate limit exceeded

Endpoint: GET /v1/tracks/{id}

Response (200 OK):

{
  "id": "3n3Ppam7vgaVa1iaRUc9Lp",
  "name": "Mr. Brightside",
  "duration_ms": 222973,
  "explicit": false,
  "popularity": 87,
  "preview_url": "https://p.scdn.co/mp3-preview/...",
  "album": {
    "id": "4OHNH3sDzIxnmUADXzv2kT",
    "name": "Hot Fuss",
    "images": [
      {
        "url": "https://i.scdn.co/image/...",
        "height": 640,
        "width": 640
      }
    ],
    "release_date": "2004-06-07"
  },
  "artists": [
    {
      "id": "0C0XlULifJtAgn6ZNCW2eu",
      "name": "The Killers"
    }
  ],
  "available_markets": ["US", "GB", "DE", ...]
}

Endpoint: GET /v1/search

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| q | string | Yes | Search query |
| type | string | Yes | Comma-separated: track, artist, album, playlist |
| limit | integer | No | Max results per type (default: 20, max: 50) |
| offset | integer | No | Pagination offset |
| market | string | No | ISO country code for availability filtering |

Response:

{
  "tracks": {
    "items": [...],
    "total": 1000,
    "limit": 20,
    "offset": 0,
    "next": "https://api.spotify.com/v1/search?offset=20&..."
  },
  "artists": {...},
  "albums": {...}
}

Endpoint: POST /v1/users/{user_id}/playlists

Request:

{
  "name": "Road Trip",
  "description": "Songs for the drive",
  "public": false,
  "collaborative": false
}

Response (201 Created):

{
  "id": "7d2D2S5F4d0r33mDf0d33D",
  "name": "Road Trip",
  "owner": {
    "id": "user123",
    "display_name": "John"
  },
  "tracks": {
    "total": 0
  },
  "snapshot_id": "MTY4MzI0..."
}
| Endpoint category | Limit | Window |
| --- | --- | --- |
| Standard endpoints | 100 requests | 30 seconds |
| Search | 30 requests | 30 seconds |
| Player control | 50 requests | 30 seconds |
| Playlist modifications | 25 requests | 30 seconds |
CREATE TABLE tracks (
    id VARCHAR(22) PRIMARY KEY,   -- Spotify base62 ID
    name VARCHAR(500) NOT NULL,
    duration_ms INTEGER NOT NULL,
    explicit BOOLEAN DEFAULT false,
    popularity SMALLINT DEFAULT 0,
    isrc VARCHAR(12),             -- International Standard Recording Code
    preview_url TEXT,
    -- Denormalized for read performance
    album_id VARCHAR(22) REFERENCES albums(id),
    -- Audio features (from Echo Nest analysis)
    tempo DECIMAL(6,3),           -- BPM
    key SMALLINT,                 -- 0-11 pitch class
    mode SMALLINT,                -- 0=minor, 1=major
    time_signature SMALLINT,
    danceability DECIMAL(4,3),
    energy DECIMAL(4,3),
    valence DECIMAL(4,3),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Track-Artist relationship (many-to-many)
CREATE TABLE track_artists (
    track_id VARCHAR(22) REFERENCES tracks(id),
    artist_id VARCHAR(22) REFERENCES artists(id),
    position SMALLINT NOT NULL,   -- Artist order
    PRIMARY KEY (track_id, artist_id)
);

-- Indexes for common queries
CREATE INDEX idx_tracks_album ON tracks(album_id);
CREATE INDEX idx_tracks_popularity ON tracks(popularity DESC);
CREATE INDEX idx_tracks_isrc ON tracks(isrc);

Cassandra excels at playlist storage due to write-heavy patterns:

CREATE TABLE playlists (
    user_id TEXT,
    playlist_id TEXT,
    name TEXT,
    description TEXT,
    is_public BOOLEAN,
    is_collaborative BOOLEAN,
    snapshot_id TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    PRIMARY KEY (user_id, playlist_id)
) WITH CLUSTERING ORDER BY (playlist_id ASC);

-- Cassandra does not allow counter columns alongside regular columns,
-- so follower counts live in a dedicated counter table
CREATE TABLE playlist_follower_counts (
    playlist_id TEXT PRIMARY KEY,
    follower_count COUNTER
);

CREATE TABLE playlist_tracks (
    playlist_id TEXT,
    position INT,
    track_id TEXT,
    added_by TEXT,
    added_at TIMESTAMP,
    PRIMARY KEY (playlist_id, position)
) WITH CLUSTERING ORDER BY (position ASC);

-- Denormalized for efficient ordering by add time
CREATE TABLE playlist_tracks_by_added (
    playlist_id TEXT,
    added_at TIMESTAMP,
    position INT,
    track_id TEXT,
    PRIMARY KEY (playlist_id, added_at, position)
) WITH CLUSTERING ORDER BY (added_at DESC, position ASC);

Why Cassandra for playlists:

  • Write-optimized (append-only storage)
  • Horizontal scaling for 675M users
  • Tunable consistency (eventual for non-critical reads)
  • Counter support for follower counts
CREATE TABLE listening_history (
    user_id TEXT,
    listened_at TIMESTAMP,
    track_id TEXT,
    context_uri TEXT,   -- playlist, album, or artist
    duration_ms INT,
    PRIMARY KEY (user_id, listened_at)
) WITH CLUSTERING ORDER BY (listened_at DESC)
AND default_time_to_live = 7776000;   -- 90 days TTL
{
  "mappings": {
    "properties": {
      "track_id": { "type": "keyword" },
      "name": {
        "type": "text",
        "analyzer": "standard",
        "fields": {
          "exact": { "type": "keyword" },
          "autocomplete": {
            "type": "text",
            "analyzer": "autocomplete"
          }
        }
      },
      "artist_names": {
        "type": "text",
        "fields": {
          "exact": { "type": "keyword" },
          "autocomplete": {
            "type": "text",
            "analyzer": "autocomplete"
          }
        }
      },
      "album_name": { "type": "text" },
      "popularity": { "type": "integer" },
      "duration_ms": { "type": "integer" },
      "explicit": { "type": "boolean" },
      "available_markets": { "type": "keyword" },
      "release_date": { "type": "date" }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}
| Data type | Store | Rationale |
| --- | --- | --- |
| Catalog (tracks, albums, artists) | PostgreSQL | Relational queries, complex joins |
| User data (playlists, saves) | Cassandra | Write-heavy, horizontal scaling |
| Listening history | Cassandra | Time-series, high volume |
| Search index | Elasticsearch | Full-text search, faceting |
| ML features | Cloud Bigtable | Wide columns, sparse data |
| Hot metadata | Redis/Memcached | Sub-ms latency |
| Analytics | BigQuery | Ad-hoc queries, massive scale |
Diagram: Two-stage recommendation. Retrieve candidates via embedding similarity, then rank with an ML model.

Matrix factorization approach:

Given user-track interaction matrix R (675M users × 100M tracks), learn latent factors:

R ≈ U × Vᵀ

Where:

  • U = user factor matrix (675M × 128)
  • V = track factor matrix (100M × 128)

Implementation:

  • Alternating Least Squares (ALS) on Spark
  • Weekly retraining on full dataset
  • Incremental updates for new users/tracks
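
Once the factors are trained, scoring a (user, track) pair is just a dot product of the latent vectors. A stdlib sketch with toy 3-dimensional factors (the numbers are illustrative; real factors are 128-dimensional):

```python
def score(user_vec, track_vec):
    """Predicted affinity = dot product of latent factors (R ≈ U × Vᵀ)."""
    return sum(u * v for u, v in zip(user_vec, track_vec))

# Toy 3-dimensional factors; a real system learns 128 dims via ALS on Spark.
user_u = [0.9, 0.1, 0.4]
track_a = [0.8, 0.0, 0.5]   # similar profile -> higher score
track_b = [0.0, 0.9, 0.1]   # dissimilar profile -> lower score

print(score(user_u, track_a))  # ~0.92
print(score(user_u, track_b))  # ~0.13
```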

Each track has computed audio features:

| Feature | Range | Description |
| --- | --- | --- |
| Tempo | 0-250 BPM | Beats per minute |
| Key | 0-11 | Pitch class (C=0, C#=1, …) |
| Mode | 0-1 | Minor=0, Major=1 |
| Danceability | 0.0-1.0 | Rhythmic suitability for dancing |
| Energy | 0.0-1.0 | Perceptual intensity |
| Valence | 0.0-1.0 | Musical positivity |
| Speechiness | 0.0-1.0 | Presence of spoken words |
| Acousticness | 0.0-1.0 | Acoustic vs. electronic |
| Instrumentalness | 0.0-1.0 | Absence of vocals |
| Liveness | 0.0-1.0 | Presence of audience |
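
Content-based matching can be sketched as cosine similarity over these feature vectors (stdlib only; scaling tempo by its 250 BPM maximum from the table is an assumption, and only a subset of features is used):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors, in [0, 1] here
    because all components are non-negative."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def feature_vector(track):
    """Project audio features onto a comparable scale.
    Tempo is divided by its 250 BPM table maximum (an assumed normalization)."""
    return [track["tempo"] / 250, track["danceability"],
            track["energy"], track["valence"]]

upbeat = {"tempo": 128, "danceability": 0.8, "energy": 0.9, "valence": 0.7}
ballad = {"tempo": 70, "danceability": 0.3, "energy": 0.2, "valence": 0.3}

sim = cosine_similarity(feature_vector(upbeat), feature_vector(ballad))
print(round(sim, 2))
```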

Generation schedule:

  • Runs Sunday night for Monday delivery
  • Pre-computed on Cloud Bigtable
  • 675M personalized 30-track playlists

Algorithm:

  1. User taste profile: Aggregate recent listening into genre/artist weights
  2. Candidate selection: Find tracks listened to by similar users (collaborative)
  3. Audio filtering: Match audio features to user preferences (content-based)
  4. Freshness boost: Prioritize tracks user hasn’t heard
  5. Diversity injection: Ensure variety across genres, artists
  6. Final ranking: ML model predicts skip probability
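
The later stages of this pipeline can be sketched end to end. Everything here is illustrative: `skip_prob` stands in for the ML ranking model's output, and the per-artist cap is an assumed diversity rule:

```python
def discover_weekly(candidates, heard, skip_prob, per_artist_cap=2, size=5):
    """Toy version of steps 4-6 above: drop already-heard tracks (freshness),
    rank by predicted skip probability, and cap tracks per artist (diversity).
    `candidates` are (track_id, artist_id) pairs; `skip_prob` maps
    track_id -> [0, 1], lower meaning the user is less likely to skip."""
    fresh = [(t, a) for t, a in candidates if t not in heard]
    ranked = sorted(fresh, key=lambda ta: skip_prob[ta[0]])
    playlist, per_artist = [], {}
    for track, artist in ranked:
        if per_artist.get(artist, 0) < per_artist_cap:
            playlist.append(track)
            per_artist[artist] = per_artist.get(artist, 0) + 1
        if len(playlist) == size:
            break
    return playlist

candidates = [("t1", "a1"), ("t2", "a1"), ("t3", "a1"),
              ("t4", "a2"), ("t5", "a2"), ("t6", "a3")]
skip_prob = {"t1": 0.1, "t2": 0.2, "t3": 0.15, "t4": 0.5, "t5": 0.4, "t6": 0.3}
print(discover_weekly(candidates, heard={"t5"}, skip_prob=skip_prob))
# ['t1', 't3', 't6', 't4']  (t2 dropped: artist a1 already has 2 tracks)
```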

For real-time recommendations, use Annoy (Approximate Nearest Neighbors Oh Yeah):

Index structure:

  • 128-dimensional embeddings for 100M tracks
  • Forest of random projection trees
  • Trade-off: accuracy vs. query time

Query performance:

  • 10ms for top-100 nearest neighbors
  • 95% recall vs. exact search
  • 100 trees provides good balance
Diagram: Offline download flow. Queue → prioritize → fetch → encrypt → store locally.

DRM implementation:

  • Encrypted audio files using AES-256
  • Per-device keys tied to account
  • Keys stored in secure enclave (iOS) or hardware-backed keystore (Android)

License constraints:

| Constraint | Value | Rationale |
| --- | --- | --- |
| Offline validity | 30 days | Requires periodic online check |
| Device limit | 5 devices | Prevent account sharing |
| Track limit | 10,000 per device | Storage management |
| Concurrent offline | 1 device | Licensing terms |
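
Client-side license enforcement might look like the following sketch. The function shape is hypothetical; the 30-day validity window and 5-device cap come from the constraints table:

```python
import time

OFFLINE_VALIDITY_SECONDS = 30 * 24 * 3600  # 30-day window from the table above
DEVICE_LIMIT = 5

def can_play_offline(last_online_check_ts, device_count, now=None):
    """Gate downloaded playback: the account must have verified online within
    30 days, and the device must fall within the 5-device cap."""
    now = now if now is not None else time.time()
    if device_count > DEVICE_LIMIT:
        return False
    return (now - last_online_check_ts) <= OFFLINE_VALIDITY_SECONDS

now = time.time()
print(can_play_offline(now - 10 * 24 * 3600, device_count=3, now=now))  # True
print(can_play_offline(now - 45 * 24 * 3600, device_count=3, now=now))  # False
```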

Smart downloads:

def prioritize_downloads(playlist, device_storage, user_requested, recent_plays):
    """Prioritize which tracks to download first."""
    scored_tracks = []
    for track in playlist.tracks:
        score = 0
        if track in user_requested:       # user explicitly requested
            score += 100
        if track in recent_plays:         # recently played, likely to replay
            score += 50
        score += track.playlist_position_score   # popularity within playlist
        if track.partial_download:        # resume partial downloads first
            score += 30
        scored_tracks.append((track, score))
    # Download in priority order until storage is full
    for track, _ in sorted(scored_tracks, key=lambda ts: ts[1], reverse=True):
        if device_storage.available >= track.size:
            download(track)
            device_storage.available -= track.size

Eviction policy:

  1. Remove tracks not played in 90+ days
  2. Remove tracks from unfollowed playlists
  3. LRU eviction when approaching storage limit

Storage estimation UI:

Playlist: Road Trip (50 tracks)
Download size: 180 MB (Normal quality)
               350 MB (Very High quality)
Device storage: 2.1 GB available
Diagram: Search pipeline. Parse → correct → expand → search → rank → deduplicate.

Implementation using Elasticsearch:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name.autocomplete": {
              "query": "mr bright",
              "operator": "and"
            }
          }
        },
        {
          "match": {
            "artist_names.autocomplete": {
              "query": "mr bright",
              "operator": "and"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "sort": ["_score", { "popularity": "desc" }],
  "size": 10
}

Performance targets:

  • Typeahead latency: p99 < 50ms
  • Full search latency: p99 < 200ms
  • Index update lag: < 4 hours for new releases
| Signal | Weight | Description |
| --- | --- | --- |
| Text relevance | 0.3 | BM25 score from Elasticsearch |
| Popularity | 0.25 | Global stream count (log-scaled) |
| User affinity | 0.2 | Based on listening history |
| Freshness | 0.15 | Boost for new releases |
| Market availability | 0.1 | Available in user's region |
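
Combining the signals is a weighted sum. A sketch using the table's weights, assuming each signal has already been normalized into [0, 1] (the example values are illustrative):

```python
WEIGHTS = {  # from the ranking-signal table above
    "text_relevance": 0.30, "popularity": 0.25, "user_affinity": 0.20,
    "freshness": 0.15, "market_availability": 0.10,
}

def rank_score(signals):
    """Weighted sum of normalized signals (each assumed already in [0, 1])."""
    return sum(WEIGHTS[name] * value for name, value in signals.items())

# An exact title match the user knows vs. a fuzzy but trending match.
exact_hit = {"text_relevance": 0.95, "popularity": 0.7, "user_affinity": 0.9,
             "freshness": 0.2, "market_availability": 1.0}
fuzzy_hit = {"text_relevance": 0.40, "popularity": 0.9, "user_affinity": 0.1,
             "freshness": 0.8, "market_availability": 1.0}

print(rank_score(exact_hit) > rank_score(fuzzy_hit))  # True
```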

Global player state:

interface PlayerState {
  // Current playback
  currentTrack: Track | null
  position_ms: number
  duration_ms: number
  isPlaying: boolean

  // Queue
  queue: Track[]
  queuePosition: number

  // Context (what initiated playback)
  context: {
    type: "playlist" | "album" | "artist" | "search"
    uri: string
  }

  // Shuffle and repeat
  shuffle: boolean
  repeatMode: "off" | "context" | "track"

  // Device
  activeDevice: Device
  volume: number
}

State synchronization:

  • Local state for immediate UI feedback
  • WebSocket for cross-device sync (Spotify Connect)
  • Optimistic updates with reconciliation
const SEGMENT_SIZE = 10 // seconds of audio per segment (illustrative value)

class AudioBuffer {
  private segments: Map<number, ArrayBuffer> = new Map()
  private prefetchAhead = 30 // seconds

  async ensureBuffered(currentPosition: number): Promise<void> {
    const currentSegment = Math.floor(currentPosition / SEGMENT_SIZE)
    const targetSegment = Math.ceil((currentPosition + this.prefetchAhead) / SEGMENT_SIZE)
    for (let i = currentSegment; i <= targetSegment; i++) {
      if (!this.segments.has(i)) {
        const segment = await this.fetchSegment(i)
        this.segments.set(i, segment)
      }
    }
    // Evict old segments to manage memory
    this.evictOldSegments(currentSegment - 2)
  }
}
| Constraint | Mitigation |
| --- | --- |
| Battery | Batch network requests, use efficient codecs |
| Data usage | Quality auto-adjust, download on WiFi |
| Memory | Limit buffer size, lazy-load images |
| Background | iOS: Background Audio mode; Android: Foreground Service |
| Offline | SQLite for metadata, encrypted file storage |

Web Audio API usage:

const audioContext = new AudioContext()

// Each track gets its own source -> gain -> destination chain,
// so the two tracks can be faded independently
function createTrackNodes(buffer: AudioBuffer) {
  const source = audioContext.createBufferSource()
  const gainNode = audioContext.createGain()
  source.buffer = buffer
  source.connect(gainNode)
  gainNode.connect(audioContext.destination)
  return { source, gainNode }
}

// Crossfade between tracks
function crossfade(current, next, duration: number) {
  const now = audioContext.currentTime
  // Fade out current
  current.gainNode.gain.setValueAtTime(1, now)
  current.gainNode.gain.linearRampToValueAtTime(0, now + duration)
  // Fade in next
  next.gainNode.gain.setValueAtTime(0, now)
  next.gainNode.gain.linearRampToValueAtTime(1, now + duration)
  next.source.start(now)
}
Diagram: GCP deployment. GKE for microservices, managed data services, Pub/Sub for event streaming.
| Service | Use case | Scale |
| --- | --- | --- |
| GKE | Microservices orchestration | 300+ services |
| Cloud Pub/Sub | Event streaming | 1T+ messages/day |
| Cloud Dataflow | Stream/batch processing | Petabytes/day |
| BigQuery | Analytics, ML training | 10M+ queries/month |
| Cloud Bigtable | ML feature store | Petabytes |
| Cloud Storage | Audio files, backups | Exabytes |
| Cloud Spanner | Transactional data | Global consistency |

Migration Story

Timeline:

  • 2016: Announced migration from on-premise to GCP
  • 2017: Fully migrated to Google Cloud
  • Result: 60% cost reduction, faster product development

Key decisions:

  • Kafka → Pub/Sub for event delivery (4x lower latency)
  • Hadoop → Dataflow for batch/stream processing
  • Custom dashboards → BigQuery for analytics
Regions:
- us-central1 (Primary Americas)
- europe-west1 (Primary EMEA)
- asia-east1 (Primary APAC)
Data replication:
- User data: Multi-region Spanner
- Audio: Cloud Storage multi-region
- Analytics: BigQuery cross-region

Spotify open-sourced Backstage in 2020—their internal developer portal:

Features:

  • Service catalog (track all microservices)
  • TechDocs (documentation as code)
  • Software templates (scaffold new services)
  • Plugin ecosystem (integrate with tools)

Impact:

  • 2,200+ contributors
  • 3,000+ adopting companies
  • CNCF incubating project

Designing Spotify-scale music streaming requires different optimizations than video platforms:

Key architectural decisions:

  1. Multi-CDN delivery (Akamai, AWS, Fastly) provides global reach with failover and cost optimization
  2. Ogg Vorbis encoding at multiple bitrates (96-320 kbps) balances quality and bandwidth with adaptive switching
  3. Cassandra for user data handles write-heavy workloads (playlists, history) with horizontal scaling
  4. Hybrid recommendation combining collaborative filtering, audio features, and NLP drives 30%+ of listening
  5. GCP migration (2016-2017) reduced costs 60% while enabling faster product iteration
  6. Event-driven architecture via Pub/Sub processes 1T+ events/day for real-time personalization

What this design optimizes for:

  • Instant playback (< 500ms time-to-first-audio)
  • Seamless cross-device experience (Spotify Connect)
  • Deep personalization (Discover Weekly, Daily Mix)
  • Offline reliability (encrypted downloads with license management)

What this design sacrifices:

  • Lossless audio quality (limited to 320 kbps lossy until recent Premium updates)
  • Real-time social features (friend activity delayed)
  • Podcast transcription/search (limited compared to dedicated platforms)

When to choose this design:

  • Audio streaming at scale (100M+ users)
  • Personalization as core differentiator
  • Need for offline mode with DRM
Foundations worth reviewing alongside this design:

  • CDN architecture: edge caching, origin shield concepts
  • Audio encoding: codecs, bitrates, compression
  • Distributed databases: Cassandra data modeling, consistency trade-offs
  • Recommendation systems: collaborative filtering, content-based filtering basics
  • Stream processing: event-driven architecture, Pub/Sub patterns
| Term | Definition |
| --- | --- |
| ABR | Adaptive Bitrate—dynamically selecting audio quality based on network conditions |
| Ogg Vorbis | Open-source, royalty-free audio codec used by Spotify |
| Gapless playback | Seamless transition between tracks without silence gaps |
| Crossfade | Gradual blend between the end of one track and the start of the next |
| Collaborative filtering | Recommendation based on similar users' behavior |
| Content-based filtering | Recommendation based on item attributes (audio features) |
| Echo Nest | Music intelligence company acquired by Spotify in 2014 |
| Spotify Connect | Protocol for cross-device playback control |
| Pub/Sub | Publish-Subscribe messaging pattern for event streaming |
| Edge n-gram | Tokenization for autocomplete (prefixes: "s", "sp", "spo"…) |
Key takeaways:

  • Spotify serves 675M+ MAU with multi-CDN delivery (Akamai, AWS, Fastly) for global reach
  • Ogg Vorbis encoding at 96-320 kbps with client-side adaptive quality selection
  • Cassandra handles write-heavy user data (playlists, history) with horizontal scaling
  • Hybrid recommendation (collaborative + content-based + NLP) drives 30%+ of streams
  • Event pipeline via Pub/Sub processes 1T+ events/day for real-time personalization
  • Offline mode uses encrypted storage with per-device DRM licensing
  • Backstage developer portal (open-sourced 2020) manages 300+ internal microservices