
Design Uber-Style Ride Hailing

A comprehensive system design for a ride-hailing platform handling real-time driver-rider matching, geospatial indexing at scale, dynamic pricing, and sub-second location tracking. This design addresses the core challenges of matching millions of riders with drivers in real-time while optimizing for ETAs, driver utilization, and surge pricing across global markets.

[Architecture diagram] Client Layer (Rider App, Driver App) → Load Balancer → API Gateway (Auth Service, Rate Limiter) → Core Services (Ride Service, Matching Service "DISCO", Location Service, Pricing Service, Routing & ETA, Payment Service, WebSocket Gateway) → Real-Time Layer (Kafka: Location Queue, Match Queue) → Data Layer (Trip Store: Schemaless, Location Cache: Redis + H3, Driver State: Cassandra, Analytics: HDFS/Hive) → ML Platform (DeepETA Model, Surge Predictor, Fraud Detection)

High-level architecture: Rider and driver apps connect through a WebSocket gateway for real-time updates. The Matching Service (DISCO) uses H3-indexed location data and ML-powered ETA predictions to optimize driver dispatch.

Ride-hailing systems solve three interconnected problems: spatial indexing (finding nearby drivers efficiently), dispatch optimization (matching riders to drivers to minimize wait time globally, not just per-request), and dynamic pricing (balancing supply and demand in real-time).

The core spatial index uses H3 hexagonal cells rather than geohash or quadtrees. Hexagons provide uniform neighbor distances (all 6 neighbors equidistant), enabling accurate proximity searches without the edge artifacts of square cells. Uber open-sourced H3 for this reason—gradient analysis for surge pricing and demand forecasting becomes trivial when all adjacent cells have equal weight.

Dispatch is not “find closest driver.” Traffic, bridges, and one-way streets make Euclidean distance meaningless. The Matching Service (DISCO) uses ETA-based assignment with batch optimization: accumulate requests over a short window (100-200ms), build a bipartite graph of riders and available drivers, and solve the assignment problem to minimize total wait time across all requests. This global optimization outperforms greedy per-request matching by 10-20% on wait times.

Surge pricing uses a hybrid approach: real-time supply/demand ratios within H3 cells set the base multiplier, while ML models adjust for predicted demand (events, weather, historical patterns). The key design insight is that surge must be spatially granular (different blocks have different multipliers) but temporally smooth (avoid jarring jumps that frustrate users).

Feature | Priority | Scope
Request ride (pickup, destination) | Core | Full
Real-time driver matching | Core | Full
Live location tracking (driver → rider) | Core | Full
Dynamic/surge pricing | Core | Full
ETA prediction | Core | Full
Driver dispatch and navigation | Core | Full
Trip lifecycle (start, complete, cancel) | Core | Full
Payment processing | Core | Overview
Driver ratings and feedback | High | Overview
Ride history and receipts | High | Brief
Scheduled rides | Medium | Brief
Ride sharing (UberPool) | Medium | Out of scope
Driver earnings and payouts | Low | Out of scope

Requirement | Target | Rationale
Availability | 99.99% | Revenue-critical; outage = stranded users
Matching latency | p99 < 3 seconds | User perception of "instant" match
Location update latency | p99 < 500ms | Real-time tracking accuracy
ETA accuracy | ±2 minutes median | User trust in time estimates
Surge price staleness | < 30 seconds | Avoid stale pricing on request
Throughput | 1M+ location updates/sec | Global driver fleet scale
Consistency | Eventual (< 5s) for location; strong for payments | Location tolerates staleness; payments cannot

Users and Drivers:

  • Daily trips: 30M trips/day (Uber 2024)
  • Monthly active users: 180M
  • Active drivers: 8.8M globally
  • Peak concurrent: 3M drivers online, 10M active rider sessions

Traffic:

  • Location updates: 3M concurrent drivers × 1 update/4 seconds ≈ 750K updates/sec, rising toward ~2.2M/sec if the full 8.8M-driver fleet were online at once (see the back-of-envelope sketch after these estimates)
  • Ride requests: 30M/day = ~350 RPS average, ~1,500 RPS peak
  • ETA queries: Each match queries 10-20 candidate drivers = 15-30K RPS
  • Price checks: 10× ride requests (users check before confirming) = 15K RPS peak

Storage:

  • Trip record: ~5KB (metadata, route, pricing, payment)
  • Daily trip storage: 30M × 5KB = 150GB/day
  • Location history (hot): 8.8M drivers × 4KB/minute × 60 min = 2TB rolling window
  • Yearly growth: ~55TB/year for trips alone

Message Volume:

  • Kafka: 1 trillion messages/day (Uber published)
  • Real-time events: Location, trip state, notifications
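
These figures can be sanity-checked with simple arithmetic. A back-of-envelope sketch; the constants come from the estimates above, and the ~4x peak-to-average factor is an assumption:

// Back-of-envelope check of the capacity estimates above (illustrative only)
const DAY_SECONDS = 86_400

const dailyTrips = 30_000_000
const concurrentDrivers = 3_000_000 // peak drivers online
const locationIntervalSec = 4 // one GPS update per driver every 4 seconds
const tripRecordKb = 5

const avgRideRps = dailyTrips / DAY_SECONDS // ≈ 347 RPS
const peakRideRps = avgRideRps * 4 // ≈ 1,400 RPS, assuming a ~4x peak factor
const locationUpdatesPerSec = concurrentDrivers / locationIntervalSec // ≈ 750K/sec
const tripStorageGbPerDay = (dailyTrips * tripRecordKb) / 1_000_000 // ≈ 150 GB/day

console.log({ avgRideRps, peakRideRps, locationUpdatesPerSec, tripStorageGbPerDay })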

Path A: Geohash with Greedy Nearest-Driver Matching

Best when:

  • Simpler implementation requirements
  • Moderate scale (< 100K concurrent drivers)
  • Latency constraints relaxed (5+ seconds acceptable)

Key characteristics:

  • Use geohash strings for spatial indexing (e.g., 9q8yyk for SF downtown)
  • Match each request immediately to the nearest available driver
  • Single-request optimization (no batching)

Trade-offs:

Pros:

  • Simple implementation (geohash is well-understood)
  • Lower latency for individual matches
  • Easier to reason about and debug

Cons:

  • Geohash edge effects (neighbors at different distances)
  • Suboptimal global wait times (greedy isn't globally optimal)
  • Struggles with dense urban areas (many equidistant drivers)

Real-world example: Early Uber and Lyft used geohash-based approaches. Worked well at small scale but required rearchitecture as cities densified.
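
For reference, a minimal Path A matcher can lean on Redis GEO commands, which encode positions with a geohash-style index internally. A sketch assuming ioredis and a per-city GEO set; the key name is made up for the example:

import Redis from "ioredis"

const redis = new Redis()

// Greedy nearest-driver matching (Path A): handle each request immediately, no batching
// Assumes driver positions are maintained with GEOADD in a per-city set, e.g. "geo:drivers:sf"
async function greedyMatch(cityKey: string, pickupLat: number, pickupLng: number): Promise<string | null> {
  // GEORADIUS with ASC returns members ordered by distance from the pickup point
  const nearest = (await redis.georadius(cityKey, pickupLng, pickupLat, 3, "km", "COUNT", 1, "ASC")) as string[]
  return nearest.length > 0 ? nearest[0] : null // dispatch straight to the closest driver, or report no supply
}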

Path B: H3 Hexagonal Indexing with Batch Optimization

Best when:

  • High driver density in urban areas
  • Global optimization matters (minimize total wait time)
  • Scale demands efficient spatial queries

Key characteristics:

  • H3 hexagonal cells for uniform spatial indexing
  • Batch requests over 100-200ms windows
  • Solve assignment as bipartite matching problem
  • ETA-based costs, not distance-based

Trade-offs:

Pros:

  • Uniform neighbor distances (no edge artifacts)
  • 10-20% better wait times via global optimization
  • Natural fit for surge pricing (hexagonal heatmaps)
  • Efficient k-ring queries for nearby cells

Cons:

  • Higher implementation complexity
  • Slight latency increase from batching (100-200ms)
  • Requires ML for accurate ETA (not just distance)

Real-world example: Uber developed and open-sourced H3 specifically for this use case. Their DISCO system uses batch optimization to minimize city-wide wait times.

Path C: Quadtree with Streaming Matching

Best when:

  • Variable driver density (sparse suburbs, dense cities)
  • Need adaptive spatial resolution
  • Streaming-first architecture

Key characteristics:

  • Quadtree adapts cell size to driver density
  • Stream processing (Flink/Samza) for continuous matching
  • No batching—process events as they arrive

Trade-offs:

Pros:

  • Adaptive resolution (small cells in dense areas)
  • No batching latency
  • Natural fit for streaming architectures

Cons:

  • Complex rebalancing as density changes
  • Non-uniform neighbor distances (like geohash)
  • Harder to aggregate for surge pricing

Factor | Path A (Geohash) | Path B (H3 Batch) | Path C (Quadtree)
Spatial uniformity | Low (edge effects) | High | Medium
Matching optimality | Greedy (local) | Global | Local
Implementation complexity | Low | High | Medium
Latency | Lowest | +100-200ms | Low
Scale | Moderate | High | High
Best for | MVP, low-density | Dense urban, global | Variable density

This article implements Path B (H3 + Batch Optimization) because Uber’s published architecture demonstrates this approach at scale. The 10-20% improvement in wait times from global optimization justifies the added complexity for a revenue-critical system.

Ride Service

Manages the trip lifecycle from request to completion:

  • Create ride request (validate pickup/destination, check surge)
  • Track the trip state machine: REQUESTED → MATCHED → DRIVER_ARRIVING → IN_PROGRESS → COMPLETED (a transition-map sketch follows this list)
  • Handle cancellations (with appropriate fees/policies)
  • Store trip records for history and analytics
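
A small sketch of that state machine as an explicit transition map; CANCELLED is assumed reachable only from pre-trip states, so adjust to match the actual cancellation policy:

type TripStatus = "REQUESTED" | "MATCHED" | "DRIVER_ARRIVING" | "IN_PROGRESS" | "COMPLETED" | "CANCELLED"

// Allowed transitions for the trip lifecycle above
const TRIP_TRANSITIONS: Record<TripStatus, TripStatus[]> = {
  REQUESTED: ["MATCHED", "CANCELLED"],
  MATCHED: ["DRIVER_ARRIVING", "CANCELLED"],
  DRIVER_ARRIVING: ["IN_PROGRESS", "CANCELLED"],
  IN_PROGRESS: ["COMPLETED"],
  COMPLETED: [],
  CANCELLED: [],
}

function transitionTrip(current: TripStatus, next: TripStatus): TripStatus {
  if (!TRIP_TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal trip transition: ${current} -> ${next}`)
  }
  return next
}

// transitionTrip("REQUESTED", "MATCHED") succeeds; transitionTrip("COMPLETED", "IN_PROGRESS") throws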

Matching Service (DISCO)

The core dispatch engine that matches riders to drivers:

  • Receive ride requests from Ride Service
  • Query Location Service for nearby available drivers
  • Query Routing Service for ETA from each candidate to pickup
  • Build bipartite graph: riders as left vertices, drivers as right vertices
  • Edge weights = ETA (lower is better)
  • Solve assignment problem to minimize total ETA across all pending requests
  • Dispatch selected driver, update driver state to DISPATCHED

Location Service

Tracks real-time driver positions at million-message-per-second scale:

  • Ingest GPS updates from Driver App (every 4-5 seconds)
  • Store current position in Redis with H3 index
  • Maintain hot window of recent positions for trajectory
  • Publish location changes to Kafka for other services
  • Support spatial queries: “drivers within k-rings of H3 cell”

Pricing Service

Calculates fares, including dynamic surge (a simplified fare sketch follows this list):

  • Base fare calculation (distance, time, vehicle type)
  • Surge multiplier lookup (per H3 cell, refreshed every 5-10 minutes)
  • Upfront pricing (show fare before ride request)
  • Promo code and discount application
  • Final fare calculation with actuals (if route differs)
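
A simplified upfront-fare sketch. The rate constants below are illustrative placeholders, not real tariffs; amounts are in cents to match the API responses later in this article:

// Illustrative upfront fare: base + per-km + per-minute, times surge, with a minimum fare
function estimateFare(distanceMeters: number, durationSeconds: number, surgeMultiplier: number): number {
  const baseCents = 250 // assumed base fare
  const perKmCents = 120 // assumed distance rate
  const perMinuteCents = 30 // assumed time rate
  const minimumCents = 700 // assumed minimum fare

  const raw = baseCents + (distanceMeters / 1000) * perKmCents + (durationSeconds / 60) * perMinuteCents
  return Math.max(Math.round(raw * surgeMultiplier), minimumCents)
}

// estimateFare(2100, 480, 1.2) ≈ 890 (cents) with these placeholder rates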

Routing & ETA Service

Provides navigation and time estimates:

  • Road network graph (OpenStreetMap or licensed)
  • Real-time traffic integration
  • ETA prediction using DeepETA model (hybrid physics + ML)
  • Turn-by-turn navigation for Driver App
  • Route optimization for UberPool (not in scope)

Payment Service

Handles financial transactions:

  • Tokenized payment methods (no raw card numbers stored)
  • Charge on trip completion
  • Split payments (multiple riders)
  • Driver payout aggregation
  • Fraud detection integration
Ride request and matching flow:

  1. Rider App → API Gateway: POST /rides (pickup, destination)
  2. API Gateway → Ride Service: createRide(request)
  3. Ride Service → Pricing Service: getPrice(pickup, destination); Pricing calculates base + surge and returns the upfront price
  4. Ride Service → Matching Service: requestMatch(rideId, pickup)
  5. Matching Service → Location Service: getNearbyDrivers(pickupH3, radius=3km); returns 10-20 candidate drivers
  6. Matching Service → ETA Service: batchETA(candidates[], pickup); returns ETAs
  7. Matching Service solves the assignment (batched with other pending requests) and assigns a driver (driverId, ETA)
  8. WebSocket Gateway pushes the match to the rider (driver details + ETA) and the dispatch to the driver (ride request + pickup location)

Driver location update flow (every 4-5 seconds):

  1. Driver App → WebSocket Gateway: GPS update (lat, lng, heading, speed)
  2. WebSocket Gateway publishes to the Kafka location topic
  3. Location Service consumes the location event and computes the H3 cell (resolution 9)
  4. Location Service → Redis Cluster: GEOADD drivers:{h3cell} (lng, lat, driverId); SET driver:{driverId}:location {payload}; EXPIRE 30s (auto-cleanup of stale drivers)
  5. When matching, the Matching Service queries Redis (GEORADIUS, pickup, 3km) and receives drivers with distances

Endpoint: POST /api/v1/rides

// Headers
Authorization: Bearer {access_token}
Content-Type: application/json

// Request body
{
  "pickup": {
    "latitude": 37.7749,
    "longitude": -122.4194,
    "address": "123 Market St, San Francisco, CA"
  },
  "destination": {
    "latitude": 37.7899,
    "longitude": -122.4014,
    "address": "456 Mission St, San Francisco, CA"
  },
  "vehicleType": "UBER_X",
  "paymentMethodId": "pm_abc123",
  "riderCount": 1,
  "scheduledTime": null
}

Response (201 Created):

{
  "rideId": "ride_abc123xyz",
  "status": "REQUESTED",
  "pickup": {
    "latitude": 37.7749,
    "longitude": -122.4194,
    "address": "123 Market St, San Francisco, CA"
  },
  "destination": {
    "latitude": 37.7899,
    "longitude": -122.4014,
    "address": "456 Mission St, San Francisco, CA"
  },
  "estimate": {
    "fareAmount": 1850,
    "fareCurrency": "USD",
    "surgeMultiplier": 1.2,
    "distanceMeters": 2100,
    "durationSeconds": 480
  },
  "etaToPickup": null,
  "driver": null,
  "vehicle": null,
  "createdAt": "2024-01-15T14:30:00Z"
}

Error Responses:

  • 400 Bad Request: Invalid coordinates, unsupported area
  • 402 Payment Required: Payment method declined
  • 409 Conflict: Rider has active ride in progress
  • 429 Too Many Requests: Rate limit exceeded (anti-fraud)
  • 503 Service Unavailable: No drivers in area (with retry-after)

Rate Limits: 10 requests/minute per user (prevents spam requests)
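
A minimal sketch of that limit using a fixed one-minute window in Redis, assuming ioredis; a sliding window or token bucket smooths bursts better in production:

import Redis from "ioredis"

const redis = new Redis()

// Allow at most `limit` ride requests per user per minute
async function allowRideRequest(userId: string, limit = 10): Promise<boolean> {
  const windowKey = `ratelimit:rides:${userId}:${Math.floor(Date.now() / 60_000)}`
  const count = await redis.incr(windowKey)
  if (count === 1) {
    await redis.expire(windowKey, 60) // the window key cleans itself up after a minute
  }
  return count <= limit // false => respond with 429 Too Many Requests
}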

Endpoint: WebSocket wss://location.uber.com/v1/driver

// Client → Server (every 4-5 seconds)
{
  "type": "LOCATION_UPDATE",
  "payload": {
    "latitude": 37.7751,
    "longitude": -122.4183,
    "heading": 45,
    "speed": 12.5,
    "accuracy": 5.0,
    "timestamp": 1705329000000
  }
}

// Server → Client (acknowledgment)
{
  "type": "LOCATION_ACK",
  "payload": {
    "received": 1705329000123
  }
}

Design Decision: WebSocket vs HTTP Polling

Why WebSocket for driver location updates (a minimal client sketch follows this list):

  • Bidirectional: Server can push ride requests instantly
  • Persistent connection: Amortizes TCP handshake cost across thousands of updates
  • Battery efficiency: Single connection vs repeated HTTP requests
  • Sub-second latency: Critical for real-time tracking
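
A minimal driver-side client sketch for the messages above, using the standard browser/React Native WebSocket API; reconnection, backoff, and GPS permission handling are omitted:

// Send a LOCATION_UPDATE every 4 seconds over the persistent connection
function startLocationUpdates(ws: WebSocket, getPosition: () => GeolocationPosition): void {
  const timer = setInterval(() => {
    if (ws.readyState !== WebSocket.OPEN) return
    const { coords } = getPosition()
    ws.send(
      JSON.stringify({
        type: "LOCATION_UPDATE",
        payload: {
          latitude: coords.latitude,
          longitude: coords.longitude,
          heading: coords.heading ?? 0,
          speed: coords.speed ?? 0,
          accuracy: coords.accuracy,
          timestamp: Date.now(),
        },
      }),
    )
  }, 4000)

  ws.addEventListener("message", (event) => {
    const msg = JSON.parse(event.data as string)
    if (msg.type === "LOCATION_ACK") {
      // server acknowledged; msg.payload.received can be used to estimate round-trip latency
    }
  })

  ws.addEventListener("close", () => clearInterval(timer))
}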

Endpoint: GET /api/v1/drivers/nearby

Query Parameters:

Parameter | Type | Description
latitude | float | Pickup latitude
longitude | float | Pickup longitude
vehicleTypes | string | Comma-separated (UBER_X,UBER_BLACK)
radius | integer | Search radius in meters (default: 3000)

Response:

{
  "drivers": [
    {
      "driverId": "driver_xyz",
      "latitude": 37.7755,
      "longitude": -122.418,
      "heading": 90,
      "vehicleType": "UBER_X",
      "etaSeconds": 180
    },
    {
      "driverId": "driver_abc",
      "latitude": 37.774,
      "longitude": -122.42,
      "heading": 270,
      "vehicleType": "UBER_X",
      "etaSeconds": 240
    }
  ],
  "surgeMultiplier": 1.2,
  "surgeExpiresAt": "2024-01-15T14:35:00Z"
}

Design Decision: Why Return Limited Driver Info

The response includes approximate positions (fuzzed to roughly ±100m) rather than exact locations (a fuzzing sketch follows this list):

  • Privacy: Drivers’ real-time positions are sensitive
  • Performance: Fewer precision bits = better compression
  • Purpose: Only needed for UI (show cars on map), not for matching
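
A sketch of that fuzzing: snap coordinates to a coarse grid before returning them to riders. The 0.001-degree step is an assumption chosen to land near the stated ±100m:

// Snap a driver position to roughly a 100m grid so exact locations are never exposed to riders
// ~0.001 degrees of latitude ≈ 111m; the longitude step is widened by 1/cos(latitude)
function fuzzLocation(lat: number, lng: number): { latitude: number; longitude: number } {
  const latStep = 0.001
  const lngStep = 0.001 / Math.cos((lat * Math.PI) / 180)
  return {
    latitude: Math.round(lat / latStep) * latStep,
    longitude: Math.round(lng / lngStep) * lngStep,
  }
}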

Endpoint: GET /api/v1/surge

Query Parameters:

Parameter | Type | Description
latitude | float | Pickup location
longitude | float | Pickup location
vehicleType | string | Vehicle type

Response:

{
  "multiplier": 1.5,
  "reason": "HIGH_DEMAND",
  "h3Cell": "892a100d2c3ffff",
  "expiresAt": "2024-01-15T14:40:00Z",
  "demandLevel": "VERY_HIGH",
  "supplyLevel": "LOW"
}

Design Decision: Surge Expiration

Surge multipliers include an expiresAt timestamp. If the rider’s request comes after expiration, the client must re-fetch. This prevents:

  • Stale high surge (rider sees 2x, actually 1x now—under-charges)
  • Stale low surge (rider sees 1x, actually 2x—creates pricing disputes)

Uber uses a MySQL-based append-only store called Schemaless for trip data. Each “cell” is immutable; updates create new cells.

Primary Store: Schemaless (MySQL-backed)

-- Schemaless stores data as (row_key, column_name, ref_key) tuples
-- row_key: trip_id (UUID)
-- column_name: "base", "driver", "route", "payment", etc.
-- ref_key: version number
-- body: BLOB (protobuf-serialized data)
CREATE TABLE trips (
  added_id BIGINT AUTO_INCREMENT PRIMARY KEY, -- Total ordering
  row_key BINARY(16) NOT NULL,                -- trip_id as UUID
  column_name VARCHAR(64) NOT NULL,
  ref_key BIGINT NOT NULL,                    -- Version/timestamp
  body MEDIUMBLOB NOT NULL,
  created_at DATETIME NOT NULL,
  INDEX idx_row_column (row_key, column_name, ref_key DESC)
);

Trip Data Columns:

Column Name | Contents | When Written
base | Pickup, destination, rider_id, vehicle_type | On request
match | driver_id, vehicle_id, match_time, eta | On match
route | Polyline, distance, duration, waypoints | On trip start
pricing | Base fare, surge, discounts, final amount | On completion
payment | Transaction ID, status, method | On charge
rating | Rider rating, driver rating, feedback | Post-trip

Design Decision: Schemaless vs Traditional SQL

Why Schemaless for trips:

  • Append-only: No in-place updates simplifies consistency
  • Flexible schema: Add new columns without migrations
  • Time-travel: Query any historical version of a trip
  • Sharded by row_key: Trips for same user co-located

Sharding: 4,096 logical shards, hash(trip_id) determines shard.
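
A sketch of how a client library might route to a logical shard and read the latest version of a trip cell, assuming the table above, MySQL 8 (for UUID_TO_BIN), and the mysql2 driver; the hashing scheme is illustrative:

import { createHash } from "node:crypto"
import type { RowDataPacket } from "mysql2"
import type { Pool } from "mysql2/promise"

const LOGICAL_SHARDS = 4096

// hash(trip_id) picks one of 4,096 logical shards, which map onto physical MySQL hosts
function shardFor(tripId: string): number {
  const digest = createHash("md5").update(tripId).digest()
  return digest.readUInt32BE(0) % LOGICAL_SHARDS
}

// Latest version of one "cell" (column) for a trip: highest ref_key wins
async function readLatestCell(pool: Pool, tripId: string, column: string): Promise<Buffer | null> {
  // pool is assumed to already point at the shard returned by shardFor(tripId)
  const [rows] = await pool.query<RowDataPacket[]>(
    "SELECT body FROM trips WHERE row_key = UUID_TO_BIN(?) AND column_name = ? ORDER BY ref_key DESC LIMIT 1",
    [tripId, column],
  )
  return rows.length > 0 ? (rows[0].body as Buffer) : null // protobuf-serialized BLOB
}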

Real-time location index (Redis)

# Driver's current location (expires in 30s if no update)
SET driver:{driver_id}:location '{"lat":37.7751,"lng":-122.4183,"h3":"89283082837ffff","heading":45,"speed":12.5,"status":"AVAILABLE","ts":1705329000}'
EXPIRE driver:{driver_id}:location 30
# Geo-indexed by H3 cell (resolution 9 ≈ 100m cells)
# Score = timestamp for LRU-style queries
ZADD drivers:h3:89283082837ffff 1705329000 driver_123
ZADD drivers:h3:89283082837ffff 1705328995 driver_456
# Status index for quick filtering
SADD drivers:status:AVAILABLE driver_123 driver_456
SREM drivers:status:AVAILABLE driver_789
SADD drivers:status:ON_TRIP driver_789

Design Decision: H3 Resolution 9

Resolution 9 cells are approximately 100m × 100m. This provides:

  • Fine enough granularity for urban density
  • Coarse enough to avoid millions of cells per city
  • Efficient k-ring queries (k ≈ 10 rings covers roughly a 3 km search radius)
Driver state store (Cassandra)

-- Driver profile and state (high availability)
CREATE TABLE driver_state (
  driver_id UUID,
  city_id INT,
  status TEXT, -- OFFLINE, AVAILABLE, DISPATCHED, ON_TRIP
  vehicle_id UUID,
  current_trip_id UUID,
  last_location_update TIMESTAMP,
  rating DECIMAL,
  total_trips INT,
  acceptance_rate DECIMAL,
  PRIMARY KEY ((city_id), driver_id)
) WITH CLUSTERING ORDER BY (driver_id ASC);

-- Per-driver time series (write-heavy)
CREATE TABLE driver_locations (
  driver_id UUID,
  day DATE,
  timestamp TIMESTAMP,
  latitude DOUBLE,
  longitude DOUBLE,
  h3_cell TEXT,
  speed DOUBLE,
  heading INT,
  PRIMARY KEY ((driver_id, day), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

Design Decision: Cassandra for Driver State

  • Write-heavy workload: Millions of location updates/second
  • Partition by city: Co-locates drivers in same market
  • Tunable consistency: Read at LOCAL_ONE for speed, write at LOCAL_QUORUM for durability (see the sketch below)
  • Natural time series: Location history with TTL for retention
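
A sketch of that consistency split with the Node.js cassandra-driver; contact points, data center, and keyspace names are placeholders:

import cassandra from "cassandra-driver"

const client = new cassandra.Client({
  contactPoints: ["cassandra-1", "cassandra-2"], // placeholder hosts
  localDataCenter: "us-east-1", // placeholder DC name
  keyspace: "rides", // placeholder keyspace
})

const { consistencies } = cassandra.types

// Writes at LOCAL_QUORUM: durable within the local data center
async function setDriverStatus(cityId: number, driverId: string, status: string): Promise<void> {
  await client.execute(
    "UPDATE driver_state SET status = ? WHERE city_id = ? AND driver_id = ?",
    [status, cityId, driverId],
    { prepare: true, consistency: consistencies.localQuorum },
  )
}

// Reads at LOCAL_ONE: fastest replica wins; slightly stale driver state is acceptable
async function getDriverStatus(cityId: number, driverId: string) {
  const result = await client.execute(
    "SELECT status, current_trip_id FROM driver_state WHERE city_id = ? AND driver_id = ?",
    [cityId, driverId],
    { prepare: true, consistency: consistencies.localOne },
  )
  return result.first()
}
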
Data Type | Store | Rationale
Trip records | Schemaless (MySQL) | Append-only, time-travel, flexible schema
Real-time location | Redis Cluster | Sub-ms reads, geo queries, TTL
Driver profile/state | Cassandra | High write throughput, tunable consistency
Surge pricing | Redis + Kafka | Low latency reads, event streaming for updates
Payment transactions | PostgreSQL | ACID for financial data
Analytics/ML features | HDFS + Hive | Batch processing, ML training data

H3 is Uber’s open-source hexagonal hierarchical spatial index. It divides Earth into hexagonal cells at 16 resolution levels.

Square grid (geohash): from a center cell X, the 4 cardinal neighbors are at distance 1, while the 4 diagonal neighbors are at distance √2.

Hexagonal grid (H3): all 6 neighbors of X are at distance 1 (uniform).

Key advantage: Hexagons have 6 equidistant neighbors. This eliminates the diagonal distance problem in square grids, which matters for:

  • Surge pricing (adjacent cells should have equal influence)
  • Demand forecasting (spatial smoothing)
  • Driver proximity (accurate radius queries)
Resolution | Avg Edge (km) | Avg Area (km²) | Use Case
7 | 1.22 | 5.16 | City districts, regional surge
8 | 0.46 | 0.74 | Neighborhood level
9 | 0.17 | 0.11 | Driver indexing (≈100m)
10 | 0.066 | 0.015 | Street level
import h3 from "h3-js"

// Convert lat/lng to H3 cell at resolution 9
function locationToH3(lat: number, lng: number): string {
  return h3.latLngToCell(lat, lng, 9)
  // Returns: "89283082837ffff" (64-bit index as hex string)
}

// Get all H3 cells within radius (k-ring)
function getCellsInRadius(centerH3: string, radiusKm: number): string[] {
  // k=1 ≈ 300m, k=3 ≈ 1km, k=10 ≈ 3km at resolution 9
  const k = Math.ceil(radiusKm / 0.17) // ~170m avg edge length per cell at res 9
  return h3.gridDisk(centerH3, k)
  // Returns all cells within k "rings" of center
}

// Example: Find drivers within 2km of pickup
async function findNearbyDrivers(pickupLat: number, pickupLng: number, radiusKm: number = 2): Promise<Driver[]> {
  const centerCell = locationToH3(pickupLat, pickupLng)
  const searchCells = getCellsInRadius(centerCell, radiusKm)

  // Query Redis for drivers in each cell, using pipelining for efficiency
  const pipeline = redis.pipeline()
  for (const cell of searchCells) {
    pipeline.zrange(`drivers:h3:${cell}`, 0, -1)
  }
  const results = await pipeline.exec()

  // Each pipeline result is an [error, members] pair; keep only the member arrays
  const driverIds = new Set((results ?? []).flatMap(([, members]) => (members as string[]) ?? []))
  return fetchDriverDetails([...driverIds])
}

DISCO (Dispatch Optimization) matches riders to drivers using batch optimization rather than greedy nearest-driver assignment.

interface RideRequest {
  rideId: string
  pickupH3: string
  pickupLat: number
  pickupLng: number
  requestTime: number
}

interface DriverCandidate {
  driverId: string
  h3Cell: string
  lat: number
  lng: number
  etaSeconds: number // To pickup
}

// Batch window: accumulate requests for 100-200ms
class MatchingBatcher {
  private pendingRequests: RideRequest[] = []
  private batchInterval = 150 // ms

  constructor() {
    setInterval(() => this.processBatch(), this.batchInterval)
  }

  addRequest(request: RideRequest) {
    this.pendingRequests.push(request)
  }

  private async processBatch() {
    if (this.pendingRequests.length === 0) return
    const requests = this.pendingRequests
    this.pendingRequests = []

    // 1. Find candidate drivers for all requests
    const candidatesMap = await this.findCandidatesForAll(requests)

    // 2. Build bipartite graph
    const graph = this.buildBipartiteGraph(requests, candidatesMap)

    // 3. Solve assignment (minimize total ETA)
    const assignments = this.solveAssignment(graph)

    // 4. Dispatch drivers
    for (const { rideId, driverId, eta } of assignments) {
      await this.dispatchDriver(rideId, driverId, eta)
    }
  }
}

The assignment problem is: given N riders and M drivers with a cost matrix (ETAs), find the assignment that minimizes total cost.

// Cost matrix: riders × drivers
// cost[i][j] = ETA for driver j to reach rider i's pickup
// Use Infinity for infeasible pairs (driver too far)
function solveAssignment(requests: RideRequest[], candidates: Map<string, DriverCandidate[]>): Assignment[] {
  const n = requests.length
  const allDrivers = new Set<string>()
  candidates.forEach((c) => c.forEach((d) => allDrivers.add(d.driverId)))
  const m = allDrivers.size
  const driverList = [...allDrivers]

  // Build cost matrix
  const cost: number[][] = Array(n)
    .fill(null)
    .map(() => Array(m).fill(Infinity))
  for (let i = 0; i < n; i++) {
    const rideCandidates = candidates.get(requests[i].rideId) ?? []
    for (const candidate of rideCandidates) {
      const j = driverList.indexOf(candidate.driverId)
      cost[i][j] = candidate.etaSeconds
    }
  }

  // Hungarian algorithm: O(n³)
  // For large scale, use auction algorithm or relaxation-based approximations
  const assignments = hungarianAlgorithm(cost)
  return assignments.map(([i, j]) => ({
    rideId: requests[i].rideId,
    driverId: driverList[j],
    eta: cost[i][j],
  }))
}

Design Decision: Batch Size and Window

  • Window too short (< 50ms): Not enough requests to optimize
  • Window too long (> 300ms): Noticeable user delay
  • Sweet spot: 100-200ms: 10-50 concurrent requests in dense areas, imperceptible delay

Improvement over greedy: Uber reports 10-20% reduction in average wait times from batch optimization.

Uber’s DeepETA uses a hybrid approach: physics-based routing for the baseline, ML for residual correction.

[DeepETA flow] Pickup and destination feed two components built on the road network graph (OSM): the Routing Engine (Dijkstra/A*) produces a physics-based ETA (Y), and the DeepETA model (a linear transformer) predicts an ML residual (R). Final ETA = Y + R.
interface ETAFeatures {
  // Spatial features
  originH3: string
  destH3: string
  routeH3Cells: string[] // H3 cells along route

  // Temporal features
  hourOfDay: number // 0-23
  dayOfWeek: number // 0-6
  isHoliday: boolean
  minutesSinceMidnight: number

  // Traffic features
  currentTrafficIndex: number // 0-1 (free flow to gridlock)
  historicalTrafficIndex: number // Same time last week
  trafficTrend: number // Improving/worsening

  // Route features
  distanceMeters: number
  numIntersections: number
  numHighwaySegments: number
  routingEngineETA: number // Physics-based baseline

  // Weather (optional)
  precipitation: number
  visibility: number
}

// Model inference
async function predictETA(features: ETAFeatures): Promise<number> {
  // Call ML serving layer (Michelangelo)
  const residual = await mlClient.predict("deepeta", features)
  // Final ETA = routing engine + ML residual
  return features.routingEngineETA + residual
}

Performance:

  • Median latency: 3.25ms
  • P95 latency: 4ms
  • QPS: Hundreds of thousands/second at Uber
interface SurgeCell {
  h3Cell: string // Resolution 7 (larger area)
  demandCount: number // Ride requests in last 5 minutes
  supplyCount: number // Available drivers
  multiplier: number // Calculated surge
  updatedAt: number
}

// Calculate surge for each H3 cell
async function calculateSurge(cityId: string): Promise<Map<string, SurgeCell>> {
  const cells = new Map<string, SurgeCell>()

  // Get H3 cells for the city (resolution 7, ~5km² each)
  const cityCells = getCityH3Cells(cityId, 7)

  for (const h3Cell of cityCells) {
    // Count requests in this cell (last 5 minutes); Redis returns a string or null
    const demandCount = parseInt((await redis.get(`demand:${h3Cell}:5min`)) ?? "0", 10)

    // Count available drivers in this cell
    const childCells = h3.cellToChildren(h3Cell, 9) // Expand to res 9
    let supplyCount = 0
    for (const child of childCells) {
      supplyCount += await redis.scard(`drivers:h3:${child}:available`)
    }

    // Calculate multiplier
    const ratio = supplyCount > 0 ? demandCount / supplyCount : 10
    const multiplier = calculateMultiplier(ratio)

    cells.set(h3Cell, {
      h3Cell,
      demandCount,
      supplyCount,
      multiplier,
      updatedAt: Date.now(),
    })
  }

  return cells
}

function calculateMultiplier(demandSupplyRatio: number): number {
  // Piecewise linear function
  // ratio < 0.5: no surge (1.0x)
  // ratio 0.5-1.0: linear 1.0x-1.5x
  // ratio 1.0-2.0: linear 1.5x-2.5x
  // ratio > 2.0: cap at 3.0x (regulatory/PR limits)
  if (demandSupplyRatio < 0.5) return 1.0
  if (demandSupplyRatio < 1.0) return 1.0 + (demandSupplyRatio - 0.5)
  if (demandSupplyRatio < 2.0) return 1.5 + (demandSupplyRatio - 1.0)
  return Math.min(3.0, 1.5 + demandSupplyRatio - 1.0)
}

Raw surge calculations can be noisy (5-minute windows have high variance). Apply smoothing:

// Exponential moving average to prevent jarring jumps
function smoothSurge(
  currentMultiplier: number,
  previousMultiplier: number,
  alpha: number = 0.3, // Smoothing factor
): number {
  // New surge = α × current + (1-α) × previous
  const smoothed = alpha * currentMultiplier + (1 - alpha) * previousMultiplier

  // Also limit change rate (max ±0.3 per update)
  const maxDelta = 0.3
  const delta = smoothed - previousMultiplier
  if (Math.abs(delta) > maxDelta) {
    return previousMultiplier + Math.sign(delta) * maxDelta
  }
  return smoothed
}

Design Decision: Surge Resolution

  • Spatial: Resolution 7 (~5km²) for surge, resolution 9 (~100m) for driver indexing (see the lookup sketch after this list)
  • Temporal: Recalculate every 5-10 minutes, smooth changes
  • Why larger cells for surge: Surge should be stable across a neighborhood; overly granular surge creates confusion (“why is it 2x here but 1.2x across the street?”)
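
Because drivers are indexed at resolution 9 while surge lives at resolution 7, a lookup just walks up the H3 hierarchy. A sketch using h3-js v4; the surge:{cell} Redis key is an assumed layout, not something defined elsewhere in this design:

import h3 from "h3-js"
import Redis from "ioredis"

const redis = new Redis()

// Map a pickup (or driver) location to its surge cell and read the current multiplier
async function getSurgeMultiplier(lat: number, lng: number): Promise<number> {
  const res9Cell = h3.latLngToCell(lat, lng, 9) // fine-grained cell used for driver indexing
  const surgeCell = h3.cellToParent(res9Cell, 7) // coarser parent cell used for surge pricing

  const cached = await redis.get(`surge:${surgeCell}`) // written by the surge job above (assumed key)
  return cached !== null ? parseFloat(cached) : 1.0 // default to no surge if missing or expired
}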

Problem: Displaying 20+ driver pins updating every 4-5 seconds causes jank on mobile devices.

Solution: Batch Updates + Canvas Rendering

// Instead of updating each marker individually,
// batch updates and render to canvas overlay
class DriverMapLayer {
  private canvas: HTMLCanvasElement
  private drivers: Map<string, DriverPosition> = new Map()
  private pendingUpdates: DriverPosition[] = []
  private rafId: number | null = null

  // Buffer updates, render on next animation frame
  updateDriver(position: DriverPosition) {
    this.pendingUpdates.push(position)
    if (!this.rafId) {
      this.rafId = requestAnimationFrame(() => this.render())
    }
  }

  private render() {
    // Apply all pending updates
    for (const update of this.pendingUpdates) {
      this.drivers.set(update.driverId, update)
    }
    this.pendingUpdates = []
    this.rafId = null

    // Clear and redraw all drivers
    const ctx = this.canvas.getContext("2d")!
    ctx.clearRect(0, 0, this.canvas.width, this.canvas.height)
    for (const driver of this.drivers.values()) {
      this.drawDriver(ctx, driver)
    }
  }

  private drawDriver(ctx: CanvasRenderingContext2D, driver: DriverPosition) {
    const [x, y] = this.latLngToPixel(driver.lat, driver.lng)
    // Draw car icon rotated to heading
    ctx.save()
    ctx.translate(x, y)
    ctx.rotate((driver.heading * Math.PI) / 180)
    ctx.drawImage(this.carIcon, -12, -12, 24, 24)
    ctx.restore()
  }
}
// State machine for ride lifecycle
type RideStatus =
  | "IDLE"
  | "REQUESTING"
  | "MATCHING"
  | "DRIVER_ASSIGNED"
  | "DRIVER_ARRIVING"
  | "DRIVER_ARRIVED"
  | "IN_PROGRESS"
  | "COMPLETED"
  | "CANCELLED"

interface RideState {
  status: RideStatus
  ride: Ride | null
  driver: Driver | null
  driverLocation: LatLng | null
  etaSeconds: number | null
  route: LatLng[] | null
}

// WebSocket message handler
function handleRideUpdate(state: RideState, message: WSMessage): RideState {
  switch (message.type) {
    case "DRIVER_ASSIGNED":
      return {
        ...state,
        status: "DRIVER_ASSIGNED",
        driver: message.driver,
        etaSeconds: message.eta,
      }
    case "DRIVER_LOCATION":
      return {
        ...state,
        driverLocation: message.location,
        etaSeconds: message.eta,
        status: message.eta < 60 ? "DRIVER_ARRIVED" : state.status,
      }
    case "TRIP_STARTED":
      return {
        ...state,
        status: "IN_PROGRESS",
        route: message.route,
      }
    case "TRIP_COMPLETED":
      return {
        ...state,
        status: "COMPLETED",
        ride: { ...state.ride!, fare: message.fare },
      }
    default:
      return state
  }
}

Problem: Riders in areas with poor connectivity may lose connection mid-request.

Solution: Optimistic UI + Request Queue

// Queue ride request locally if offline
class RideRequestQueue {
  private queue: RideRequest[] = []

  async requestRide(request: RideRequest): Promise<void> {
    if (!navigator.onLine) {
      // Store locally
      this.queue.push(request)
      localStorage.setItem("pendingRides", JSON.stringify(this.queue))
      throw new OfflineError("Ride queued, will submit when online")
    }
    return this.submitRide(request)
  }

  // Called when connectivity restored
  async processQueue(): Promise<void> {
    const pending = [...this.queue]
    this.queue = []
    for (const request of pending) {
      try {
        await this.submitRide(request)
      } catch (e) {
        // Re-queue if still failing
        this.queue.push(request)
      }
    }
    localStorage.setItem("pendingRides", JSON.stringify(this.queue))
  }
}

// Listen for online event
window.addEventListener("online", () => {
  rideQueue.processQueue()
})
Component | Requirement | Options
Message Queue | 1M+ msg/sec, ordering | Kafka, Pulsar, RedPanda
Real-time Cache | Sub-ms geo queries | Redis Cluster, KeyDB
Time-series DB | Location history | Cassandra, ScyllaDB, TimescaleDB
Stream Processing | Real-time aggregations | Flink, Kafka Streams, Samza
ML Serving | Low-latency inference | TensorFlow Serving, Triton, custom
Object Storage | Trip receipts, ML models | S3-compatible (MinIO)

[Deployment diagram] Edge layer: Route 53 (latency-based routing) and CloudFront direct traffic to the nearest region. Within each region (US-East, US-West): ALB → EKS cluster (core services) plus an ECS WebSocket fleet; a data tier of Aurora PostgreSQL (payments), ElastiCache Redis (location cache), MSK Kafka (event streaming), and Keyspaces (driver state); SageMaker serves ETA inference. MSK Kafka handles cross-region replication between regions.

Component | AWS Service | Configuration
API Services | EKS (Kubernetes) | 50-500 pods, HPA on CPU/requests
WebSocket Gateway | ECS Fargate | 100-1000 tasks, sticky sessions
Location Cache | ElastiCache Redis | r6g.xlarge cluster mode, 6 shards
Event Streaming | MSK (Kafka) | 150 nodes, 3 AZs, tiered storage
Driver State | Keyspaces (Cassandra) | On-demand capacity
Payments DB | Aurora PostgreSQL | db.r6g.2xlarge, Multi-AZ, read replicas
ML Inference | SageMaker | ml.c5.4xlarge, auto-scaling
Object Storage | S3 + CloudFront | Trip receipts, ML models

Uber operates globally with active-active deployments. Key considerations:

  1. Kafka cross-region replication: uReplicator (Uber’s open-source tool) for zero-data-loss replication
  2. Cassandra multi-DC: LOCAL_QUORUM writes, LOCAL_ONE reads for low latency
  3. Redis geo-replication: Active-passive per region (location data is region-specific)
  4. DNS-based routing: Route users to nearest region based on latency
Managed Service | Self-Hosted | When to Self-Host
MSK | Apache Kafka | Cost at scale (Uber runs own Kafka)
Keyspaces | Apache Cassandra | Specific tuning, cost
ElastiCache | Redis on EC2 | Redis modules, cost
SageMaker | TensorFlow Serving | Custom models, latency requirements

This design prioritizes batch-optimized matching with H3 spatial indexing over simpler greedy approaches. The 10-20% improvement in average wait times justifies the added complexity for a system where every second of reduced wait time translates to user satisfaction and driver utilization.

Key architectural decisions:

  1. H3 hexagonal indexing over geohash: Uniform neighbor distances eliminate edge artifacts in proximity queries and surge calculations. Uber open-sourced H3 for this reason.

  2. Batch optimization over greedy matching: Accumulating requests for 100-200ms and solving the global assignment problem outperforms per-request nearest-driver matching significantly.

  3. Hybrid ETA (physics + ML): The routing engine provides a solid baseline; ML models learn residual corrections for traffic patterns, events, and local conditions.

  4. Schemaless for trips, Cassandra for driver state: Append-only storage with flexible schema handles the write-heavy, evolving trip data model; Cassandra’s tunable consistency fits the driver location workload.

  5. Spatially-granular, temporally-smooth surge: Resolution 7 H3 cells (~5km²) provide stable neighborhood-level pricing; temporal smoothing prevents jarring multiplier changes.

Limitations and future improvements:

  • UberPool matching: Shared rides require solving a more complex routing problem with pickup/dropoff ordering constraints.
  • Predictive dispatch: Pre-positioning drivers based on predicted demand could further reduce wait times.
  • Dynamic pricing experimentation: ML models could optimize multipliers for market equilibrium rather than simple supply/demand ratios.
Prerequisites:

  • Distributed systems fundamentals (CAP theorem, eventual consistency)
  • Database concepts (sharding, replication, time-series data)
  • Graph algorithms (Dijkstra, bipartite matching basics)
  • Basic understanding of ML inference serving
Glossary:

  • H3: Hexagonal Hierarchical Spatial Index—Uber’s open-source geospatial indexing system using hexagonal cells at 16 resolution levels
  • DISCO: Dispatch Optimization—Uber’s matching service that assigns riders to drivers
  • Schemaless: Uber’s MySQL-based append-only data store with flexible schema
  • k-ring: In H3, the set of all cells within k “hops” of a center cell
  • ETA: Estimated Time of Arrival—predicted time for driver to reach pickup or destination
  • Surge: Dynamic pricing multiplier applied when demand exceeds supply
  • Bipartite matching: Assignment problem where two disjoint sets (riders, drivers) are matched to minimize total cost
Key takeaways:

  • H3 spatial indexing provides uniform neighbor distances, enabling accurate proximity queries and smooth surge pricing gradients
  • Batch-optimized matching (100-200ms windows) achieves 10-20% better wait times than greedy nearest-driver assignment
  • Hybrid ETA prediction combines physics-based routing with ML residual correction for ±2 minute accuracy
  • Real-time location tracking at 2M+ updates/second uses Redis with H3-indexed sorted sets and 30-second TTLs
  • Surge pricing operates at H3 resolution 7 (~5km²) with temporal smoothing to prevent jarring changes
  • Schemaless (MySQL) + Cassandra handles the write-heavy workload with append-only trip records and high-throughput driver state
