
Design Uber-Style Ride Hailing

A comprehensive system design for a ride-hailing platform handling real-time driver-rider matching, geospatial indexing at scale, dynamic pricing, and sub-second location tracking. This design addresses the core challenges of matching millions of riders with drivers in real-time while optimizing for ETAs, driver utilization, and surge pricing across global markets.

[Architecture diagram] Client Layer (Rider App, Driver App) → Load Balancer → API Gateway (Auth Service, Rate Limiter) → Core Services (Ride Service, Matching Service "DISCO", Location Service, Pricing Service, Routing & ETA, Payment Service, WebSocket Gateway) → Real-Time Layer (Kafka: Location Queue, Match Queue) → Data Layer (Trip Store: Schemaless, Location Cache: Redis + H3, Driver State: Cassandra, Analytics: HDFS/Hive) → ML Platform (DeepETA Model, Surge Predictor, Fraud Detection)

High-level architecture: Rider and driver apps connect through a WebSocket gateway for real-time updates. The Matching Service (DISCO) uses H3-indexed location data and ML-powered ETA predictions to optimize driver dispatch.

Ride-hailing systems solve three interconnected problems: spatial indexing (finding nearby drivers efficiently), dispatch optimization (matching riders to drivers to minimize wait time globally, not just per-request), and dynamic pricing (balancing supply and demand in real-time).

The core spatial index uses H3 hexagonal cells rather than geohash or quadtrees. Hexagons provide uniform neighbor distances (all 6 neighbors equidistant), enabling accurate proximity searches without the edge artifacts of square cells. Uber open-sourced H3 for this reason—gradient analysis for surge pricing and demand forecasting becomes trivial when all adjacent cells have equal weight.

Dispatch is not “find closest driver.” Traffic, bridges, and one-way streets make Euclidean distance meaningless. The Matching Service (DISCO) uses ETA-based assignment with batch optimization: accumulate requests over a short window (100-200ms), build a bipartite graph of riders and available drivers, and solve the assignment problem to minimize total wait time across all requests. This global optimization outperforms greedy per-request matching by 10-20% on wait times.

Surge pricing uses a hybrid approach: real-time supply/demand ratios within H3 cells set the base multiplier, while ML models adjust for predicted demand (events, weather, historical patterns). The key design insight is that surge must be spatially granular (different blocks have different multipliers) but temporally smooth (avoid jarring jumps that frustrate users).

Feature | Priority | Scope
Request ride (pickup, destination) | Core | Full
Real-time driver matching | Core | Full
Live location tracking (driver → rider) | Core | Full
Dynamic/surge pricing | Core | Full
ETA prediction | Core | Full
Driver dispatch and navigation | Core | Full
Trip lifecycle (start, complete, cancel) | Core | Full
Payment processing | Core | Overview
Driver ratings and feedback | High | Overview
Ride history and receipts | High | Brief
Scheduled rides | Medium | Brief
Ride sharing (UberPool) | Medium | Out of scope
Driver earnings and payouts | Low | Out of scope

Requirement | Target | Rationale
Availability | 99.99% | Revenue-critical; outage = stranded users
Matching latency | p99 < 3 seconds | User perception of "instant" match
Location update latency | p99 < 500ms | Real-time tracking accuracy
ETA accuracy | ±2 minutes median | User trust in time estimates
Surge price staleness | < 30 seconds | Avoid stale pricing on request
Throughput | 1M+ location updates/sec | Global driver fleet scale
Consistency | Eventual (< 5s) for location; strong for payments | Location tolerates staleness; payments cannot

Users and Drivers:

  • Daily trips: 30M trips/day (Uber 2024)
  • Monthly active users: 180M
  • Active drivers: 8.8M globally
  • Peak concurrent: 3M drivers online, 10M active rider sessions

Traffic:

  • Location updates: 3M concurrent drivers × 1 update/4 seconds ≈ 750K updates/sec, rising toward ~2.2M/sec if the full 8.8M-driver fleet were online at once (see the back-of-envelope sketch after these estimates)
  • Ride requests: 30M/day = ~350 RPS average, ~1,500 RPS peak
  • ETA queries: Each match queries 10-20 candidate drivers = 15-30K RPS
  • Price checks: 10× ride requests (users check before confirming) = 15K RPS peak

Storage:

  • Trip record: ~5KB (metadata, route, pricing, payment)
  • Daily trip storage: 30M × 5KB = 150GB/day
  • Location history (hot): 8.8M drivers × 4KB/minute × 60 min = 2TB rolling window
  • Yearly growth: ~55TB/year for trips alone

Message Volume:

  • Kafka: 1 trillion messages/day (Uber published)
  • Real-time events: Location, trip state, notifications
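
These figures can be sanity-checked with simple arithmetic. A back-of-envelope sketch; the constants come from the estimates above, and the ~4x peak-to-average factor is an assumption:

// Back-of-envelope check of the capacity estimates above (illustrative only)
const DAY_SECONDS = 86_400

const dailyTrips = 30_000_000
const concurrentDrivers = 3_000_000 // peak drivers online
const locationIntervalSec = 4 // one GPS update per driver every 4 seconds
const tripRecordKb = 5

const avgRideRps = dailyTrips / DAY_SECONDS // ≈ 347 RPS
const peakRideRps = avgRideRps * 4 // ≈ 1,400 RPS, assuming a ~4x peak factor
const locationUpdatesPerSec = concurrentDrivers / locationIntervalSec // ≈ 750K/sec
const tripStorageGbPerDay = (dailyTrips * tripRecordKb) / 1_000_000 // ≈ 150 GB/day

console.log({ avgRideRps, peakRideRps, locationUpdatesPerSec, tripStorageGbPerDay })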

Path A: Geohash with Greedy Nearest-Driver Matching

Best when:

  • Simpler implementation requirements
  • Moderate scale (< 100K concurrent drivers)
  • Latency constraints relaxed (5+ seconds acceptable)

Key characteristics:

  • Use geohash strings for spatial indexing (e.g., 9q8yyk for SF downtown)
  • Match each request immediately to the nearest available driver
  • Single-request optimization (no batching)

Trade-offs:

Pros:

  • Simple implementation (geohash is well-understood)
  • Lower latency for individual matches
  • Easier to reason about and debug

Cons:

  • Geohash edge effects (neighbors at different distances)
  • Suboptimal global wait times (greedy isn't globally optimal)
  • Struggles with dense urban areas (many equidistant drivers)

Real-world example: Early Uber and Lyft used geohash-based approaches. Worked well at small scale but required rearchitecture as cities densified.
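
For reference, a minimal Path A matcher can lean on Redis GEO commands, which encode positions with a geohash-style index internally. A sketch assuming ioredis and a per-city GEO set; the key name is made up for the example:

import Redis from "ioredis"

const redis = new Redis()

// Greedy nearest-driver matching (Path A): handle each request immediately, no batching
// Assumes driver positions are maintained with GEOADD in a per-city set, e.g. "geo:drivers:sf"
async function greedyMatch(cityKey: string, pickupLat: number, pickupLng: number): Promise<string | null> {
  // GEORADIUS with ASC returns members ordered by distance from the pickup point
  const nearest = (await redis.georadius(cityKey, pickupLng, pickupLat, 3, "km", "COUNT", 1, "ASC")) as string[]
  return nearest.length > 0 ? nearest[0] : null // dispatch straight to the closest driver, or report no supply
}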

Path B: H3 Hexagonal Indexing with Batch Optimization

Best when:

  • High driver density in urban areas
  • Global optimization matters (minimize total wait time)
  • Scale demands efficient spatial queries

Key characteristics:

  • H3 hexagonal cells for uniform spatial indexing
  • Batch requests over 100-200ms windows
  • Solve assignment as bipartite matching problem
  • ETA-based costs, not distance-based

Trade-offs:

Pros:

  • Uniform neighbor distances (no edge artifacts)
  • 10-20% better wait times via global optimization
  • Natural fit for surge pricing (hexagonal heatmaps)
  • Efficient k-ring queries for nearby cells

Cons:

  • Higher implementation complexity
  • Slight latency increase from batching (100-200ms)
  • Requires ML for accurate ETA (not just distance)

Real-world example: Uber developed and open-sourced H3 specifically for this use case. Their DISCO system uses batch optimization to minimize city-wide wait times.

Path C: Quadtree with Streaming Matching

Best when:

  • Variable driver density (sparse suburbs, dense cities)
  • Need adaptive spatial resolution
  • Streaming-first architecture

Key characteristics:

  • Quadtree adapts cell size to driver density
  • Stream processing (Flink/Samza) for continuous matching
  • No batching—process events as they arrive

Trade-offs:

Pros:

  • Adaptive resolution (small cells in dense areas)
  • No batching latency
  • Natural fit for streaming architectures

Cons:

  • Complex rebalancing as density changes
  • Non-uniform neighbor distances (like geohash)
  • Harder to aggregate for surge pricing

Factor | Path A (Geohash) | Path B (H3 Batch) | Path C (Quadtree)
Spatial uniformity | Low (edge effects) | High | Medium
Matching optimality | Greedy (local) | Global | Local
Implementation complexity | Low | High | Medium
Latency | Lowest | +100-200ms | Low
Scale | Moderate | High | High
Best for | MVP, low-density | Dense urban, global | Variable density

This article implements Path B (H3 + Batch Optimization) because Uber’s published architecture demonstrates this approach at scale. The 10-20% improvement in wait times from global optimization justifies the added complexity for a revenue-critical system.

Ride Service

Manages the trip lifecycle from request to completion:

  • Create ride request (validate pickup/destination, check surge)
  • Track the trip state machine: REQUESTED → MATCHED → DRIVER_ARRIVING → IN_PROGRESS → COMPLETED (a transition-map sketch follows this list)
  • Handle cancellations (with appropriate fees/policies)
  • Store trip records for history and analytics
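
A small sketch of that state machine as an explicit transition map; CANCELLED is assumed reachable only from pre-trip states, so adjust to match the actual cancellation policy:

type TripStatus = "REQUESTED" | "MATCHED" | "DRIVER_ARRIVING" | "IN_PROGRESS" | "COMPLETED" | "CANCELLED"

// Allowed transitions for the trip lifecycle above
const TRIP_TRANSITIONS: Record<TripStatus, TripStatus[]> = {
  REQUESTED: ["MATCHED", "CANCELLED"],
  MATCHED: ["DRIVER_ARRIVING", "CANCELLED"],
  DRIVER_ARRIVING: ["IN_PROGRESS", "CANCELLED"],
  IN_PROGRESS: ["COMPLETED"],
  COMPLETED: [],
  CANCELLED: [],
}

function transitionTrip(current: TripStatus, next: TripStatus): TripStatus {
  if (!TRIP_TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal trip transition: ${current} -> ${next}`)
  }
  return next
}

// transitionTrip("REQUESTED", "MATCHED") succeeds; transitionTrip("COMPLETED", "IN_PROGRESS") throws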

Matching Service (DISCO)

The core dispatch engine that matches riders to drivers:

  • Receive ride requests from Ride Service
  • Query Location Service for nearby available drivers
  • Query Routing Service for ETA from each candidate to pickup
  • Build bipartite graph: riders as left vertices, drivers as right vertices
  • Edge weights = ETA (lower is better)
  • Solve assignment problem to minimize total ETA across all pending requests
  • Dispatch selected driver, update driver state to DISPATCHED

Location Service

Tracks real-time driver positions at million-message-per-second scale:

  • Ingest GPS updates from Driver App (every 4-5 seconds)
  • Store current position in Redis with H3 index
  • Maintain hot window of recent positions for trajectory
  • Publish location changes to Kafka for other services
  • Support spatial queries: “drivers within k-rings of H3 cell”

Pricing Service

Calculates fares, including dynamic surge (a simplified fare sketch follows this list):

  • Base fare calculation (distance, time, vehicle type)
  • Surge multiplier lookup (per H3 cell, refreshed every 5-10 minutes)
  • Upfront pricing (show fare before ride request)
  • Promo code and discount application
  • Final fare calculation with actuals (if route differs)
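
A simplified upfront-fare sketch. The rate constants below are illustrative placeholders, not real tariffs; amounts are in cents to match the API responses later in this article:

// Illustrative upfront fare: base + per-km + per-minute, times surge, with a minimum fare
function estimateFare(distanceMeters: number, durationSeconds: number, surgeMultiplier: number): number {
  const baseCents = 250 // assumed base fare
  const perKmCents = 120 // assumed distance rate
  const perMinuteCents = 30 // assumed time rate
  const minimumCents = 700 // assumed minimum fare

  const raw = baseCents + (distanceMeters / 1000) * perKmCents + (durationSeconds / 60) * perMinuteCents
  return Math.max(Math.round(raw * surgeMultiplier), minimumCents)
}

// estimateFare(2100, 480, 1.2) ≈ 890 (cents) with these placeholder rates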

Routing & ETA Service

Provides navigation and time estimates:

  • Road network graph (OpenStreetMap or licensed)
  • Real-time traffic integration
  • ETA prediction using DeepETA model (hybrid physics + ML)
  • Turn-by-turn navigation for Driver App
  • Route optimization for UberPool (not in scope)

Payment Service

Handles financial transactions:

  • Tokenized payment methods (no raw card numbers stored)
  • Charge on trip completion
  • Split payments (multiple riders)
  • Driver payout aggregation
  • Fraud detection integration
Ride request and matching flow:

  1. Rider App → API Gateway: POST /rides (pickup, destination)
  2. API Gateway → Ride Service: createRide(request)
  3. Ride Service → Pricing Service: getPrice(pickup, destination); Pricing calculates base + surge and returns the upfront price
  4. Ride Service → Matching Service: requestMatch(rideId, pickup)
  5. Matching Service → Location Service: getNearbyDrivers(pickupH3, radius=3km); returns 10-20 candidate drivers
  6. Matching Service → ETA Service: batchETA(candidates[], pickup); returns ETAs
  7. Matching Service solves the assignment (batched with other pending requests) and assigns a driver (driverId, ETA)
  8. WebSocket Gateway pushes the match to the rider (driver details + ETA) and the dispatch to the driver (ride request + pickup location)

Driver location update flow (every 4-5 seconds):

  1. Driver App → WebSocket Gateway: GPS update (lat, lng, heading, speed)
  2. WebSocket Gateway publishes to the Kafka location topic
  3. Location Service consumes the location event and computes the H3 cell (resolution 9)
  4. Location Service → Redis Cluster: GEOADD drivers:{h3cell} (lng, lat, driverId); SET driver:{driverId}:location {payload}; EXPIRE 30s (auto-cleanup of stale drivers)
  5. When matching, the Matching Service queries Redis (GEORADIUS, pickup, 3km) and receives drivers with distances

Endpoint: POST /api/v1/rides

// Headers
Authorization: Bearer {access_token}
Content-Type: application/json

// Request body
{
  "pickup": {
    "latitude": 37.7749,
    "longitude": -122.4194,
    "address": "123 Market St, San Francisco, CA"
  },
  "destination": {
    "latitude": 37.7899,
    "longitude": -122.4014,
    "address": "456 Mission St, San Francisco, CA"
  },
  "vehicleType": "UBER_X",
  "paymentMethodId": "pm_abc123",
  "riderCount": 1,
  "scheduledTime": null
}

Response (201 Created):

{
  "rideId": "ride_abc123xyz",
  "status": "REQUESTED",
  "pickup": {
    "latitude": 37.7749,
    "longitude": -122.4194,
    "address": "123 Market St, San Francisco, CA"
  },
  "destination": {
    "latitude": 37.7899,
    "longitude": -122.4014,
    "address": "456 Mission St, San Francisco, CA"
  },
  "estimate": {
    "fareAmount": 1850,
    "fareCurrency": "USD",
    "surgeMultiplier": 1.2,
    "distanceMeters": 2100,
    "durationSeconds": 480
  },
  "etaToPickup": null,
  "driver": null,
  "vehicle": null,
  "createdAt": "2024-01-15T14:30:00Z"
}

Error Responses:

  • 400 Bad Request: Invalid coordinates, unsupported area
  • 402 Payment Required: Payment method declined
  • 409 Conflict: Rider has active ride in progress
  • 429 Too Many Requests: Rate limit exceeded (anti-fraud)
  • 503 Service Unavailable: No drivers in area (with retry-after)

Rate Limits: 10 requests/minute per user (prevents spam requests)
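
A minimal sketch of that limit using a fixed one-minute window in Redis, assuming ioredis; a sliding window or token bucket smooths bursts better in production:

import Redis from "ioredis"

const redis = new Redis()

// Allow at most `limit` ride requests per user per minute
async function allowRideRequest(userId: string, limit = 10): Promise<boolean> {
  const windowKey = `ratelimit:rides:${userId}:${Math.floor(Date.now() / 60_000)}`
  const count = await redis.incr(windowKey)
  if (count === 1) {
    await redis.expire(windowKey, 60) // the window key cleans itself up after a minute
  }
  return count <= limit // false => respond with 429 Too Many Requests
}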

Endpoint: WebSocket wss://location.uber.com/v1/driver

// Client → Server (every 4-5 seconds)
{
  "type": "LOCATION_UPDATE",
  "payload": {
    "latitude": 37.7751,
    "longitude": -122.4183,
    "heading": 45,
    "speed": 12.5,
    "accuracy": 5.0,
    "timestamp": 1705329000000
  }
}

// Server → Client (acknowledgment)
{
  "type": "LOCATION_ACK",
  "payload": {
    "received": 1705329000123
  }
}

Design Decision: WebSocket vs HTTP Polling

Why WebSocket for driver location updates (a minimal client sketch follows this list):

  • Bidirectional: Server can push ride requests instantly
  • Persistent connection: Amortizes TCP handshake cost across thousands of updates
  • Battery efficiency: Single connection vs repeated HTTP requests
  • Sub-second latency: Critical for real-time tracking
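
A minimal driver-side client sketch for the messages above, using the standard browser/React Native WebSocket API; reconnection, backoff, and GPS permission handling are omitted:

// Send a LOCATION_UPDATE every 4 seconds over the persistent connection
function startLocationUpdates(ws: WebSocket, getPosition: () => GeolocationPosition): void {
  const timer = setInterval(() => {
    if (ws.readyState !== WebSocket.OPEN) return
    const { coords } = getPosition()
    ws.send(
      JSON.stringify({
        type: "LOCATION_UPDATE",
        payload: {
          latitude: coords.latitude,
          longitude: coords.longitude,
          heading: coords.heading ?? 0,
          speed: coords.speed ?? 0,
          accuracy: coords.accuracy,
          timestamp: Date.now(),
        },
      }),
    )
  }, 4000)

  ws.addEventListener("message", (event) => {
    const msg = JSON.parse(event.data as string)
    if (msg.type === "LOCATION_ACK") {
      // server acknowledged; msg.payload.received can be used to estimate round-trip latency
    }
  })

  ws.addEventListener("close", () => clearInterval(timer))
}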

Endpoint: GET /api/v1/drivers/nearby

Query Parameters:

Parameter | Type | Description
latitude | float | Pickup latitude
longitude | float | Pickup longitude
vehicleTypes | string | Comma-separated (UBER_X,UBER_BLACK)
radius | integer | Search radius in meters (default: 3000)

Response:

{
  "drivers": [
    {
      "driverId": "driver_xyz",
      "latitude": 37.7755,
      "longitude": -122.418,
      "heading": 90,
      "vehicleType": "UBER_X",
      "etaSeconds": 180
    },
    {
      "driverId": "driver_abc",
      "latitude": 37.774,
      "longitude": -122.42,
      "heading": 270,
      "vehicleType": "UBER_X",
      "etaSeconds": 240
    }
  ],
  "surgeMultiplier": 1.2,
  "surgeExpiresAt": "2024-01-15T14:35:00Z"
}

Design Decision: Why Return Limited Driver Info

The response includes approximate positions (fuzzed to roughly ±100m) rather than exact locations (a fuzzing sketch follows this list):

  • Privacy: Drivers’ real-time positions are sensitive
  • Performance: Fewer precision bits = better compression
  • Purpose: Only needed for UI (show cars on map), not for matching
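
A sketch of that fuzzing: snap coordinates to a coarse grid before returning them to riders. The 0.001-degree step is an assumption chosen to land near the stated ±100m:

// Snap a driver position to roughly a 100m grid so exact locations are never exposed to riders
// ~0.001 degrees of latitude ≈ 111m; the longitude step is widened by 1/cos(latitude)
function fuzzLocation(lat: number, lng: number): { latitude: number; longitude: number } {
  const latStep = 0.001
  const lngStep = 0.001 / Math.cos((lat * Math.PI) / 180)
  return {
    latitude: Math.round(lat / latStep) * latStep,
    longitude: Math.round(lng / lngStep) * lngStep,
  }
}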

Endpoint: GET /api/v1/surge

Query Parameters:

Parameter | Type | Description
latitude | float | Pickup location
longitude | float | Pickup location
vehicleType | string | Vehicle type

Response:

{
  "multiplier": 1.5,
  "reason": "HIGH_DEMAND",
  "h3Cell": "892a100d2c3ffff",
  "expiresAt": "2024-01-15T14:40:00Z",
  "demandLevel": "VERY_HIGH",
  "supplyLevel": "LOW"
}

Design Decision: Surge Expiration

Surge multipliers include an expiresAt timestamp. If the rider’s request comes after expiration, the client must re-fetch. This prevents:

  • Stale high surge (rider sees 2x, actually 1x now—under-charges)
  • Stale low surge (rider sees 1x, actually 2x—creates pricing disputes)

Uber uses a MySQL-based append-only store called Schemaless for trip data. Each “cell” is immutable; updates create new cells.

Primary Store: Schemaless (MySQL-backed)

-- Schemaless stores data as (row_key, column_name, ref_key) tuples
-- row_key: trip_id (UUID)
-- column_name: "base", "driver", "route", "payment", etc.
-- ref_key: version number
-- body: BLOB (protobuf-serialized data)
CREATE TABLE trips (
  added_id BIGINT AUTO_INCREMENT PRIMARY KEY, -- Total ordering
  row_key BINARY(16) NOT NULL,                -- trip_id as UUID
  column_name VARCHAR(64) NOT NULL,
  ref_key BIGINT NOT NULL,                    -- Version/timestamp
  body MEDIUMBLOB NOT NULL,
  created_at DATETIME NOT NULL,
  INDEX idx_row_column (row_key, column_name, ref_key DESC)
);

Trip Data Columns:

Column Name | Contents | When Written
base | Pickup, destination, rider_id, vehicle_type | On request
match | driver_id, vehicle_id, match_time, eta | On match
route | Polyline, distance, duration, waypoints | On trip start
pricing | Base fare, surge, discounts, final amount | On completion
payment | Transaction ID, status, method | On charge
rating | Rider rating, driver rating, feedback | Post-trip

Design Decision: Schemaless vs Traditional SQL

Why Schemaless for trips:

  • Append-only: No in-place updates simplifies consistency
  • Flexible schema: Add new columns without migrations
  • Time-travel: Query any historical version of a trip
  • Sharded by row_key: Trips for same user co-located

Sharding: 4,096 logical shards, hash(trip_id) determines shard.
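
A sketch of how a client library might route to a logical shard and read the latest version of a trip cell, assuming the table above, MySQL 8 (for UUID_TO_BIN), and the mysql2 driver; the hashing scheme is illustrative:

import { createHash } from "node:crypto"
import type { RowDataPacket } from "mysql2"
import type { Pool } from "mysql2/promise"

const LOGICAL_SHARDS = 4096

// hash(trip_id) picks one of 4,096 logical shards, which map onto physical MySQL hosts
function shardFor(tripId: string): number {
  const digest = createHash("md5").update(tripId).digest()
  return digest.readUInt32BE(0) % LOGICAL_SHARDS
}

// Latest version of one "cell" (column) for a trip: highest ref_key wins
async function readLatestCell(pool: Pool, tripId: string, column: string): Promise<Buffer | null> {
  // pool is assumed to already point at the shard returned by shardFor(tripId)
  const [rows] = await pool.query<RowDataPacket[]>(
    "SELECT body FROM trips WHERE row_key = UUID_TO_BIN(?) AND column_name = ? ORDER BY ref_key DESC LIMIT 1",
    [tripId, column],
  )
  return rows.length > 0 ? (rows[0].body as Buffer) : null // protobuf-serialized BLOB
}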

Real-time location index (Redis)

# Driver's current location (expires in 30s if no update)
SET driver:{driver_id}:location '{"lat":37.7751,"lng":-122.4183,"h3":"89283082837ffff","heading":45,"speed":12.5,"status":"AVAILABLE","ts":1705329000}'
EXPIRE driver:{driver_id}:location 30
# Geo-indexed by H3 cell (resolution 9 ≈ 100m cells)
# Score = timestamp for LRU-style queries
ZADD drivers:h3:89283082837ffff 1705329000 driver_123
ZADD drivers:h3:89283082837ffff 1705328995 driver_456
# Status index for quick filtering
SADD drivers:status:AVAILABLE driver_123 driver_456
SREM drivers:status:AVAILABLE driver_789
SADD drivers:status:ON_TRIP driver_789

Design Decision: H3 Resolution 9

Resolution 9 cells are approximately 100m × 100m. This provides:

  • Fine enough granularity for urban density
  • Coarse enough to avoid millions of cells per city
  • Efficient k-ring queries (k ≈ 10 rings covers roughly a 3 km search radius)
Driver state store (Cassandra)

-- Driver profile and state (high availability)
CREATE TABLE driver_state (
  driver_id UUID,
  city_id INT,
  status TEXT, -- OFFLINE, AVAILABLE, DISPATCHED, ON_TRIP
  vehicle_id UUID,
  current_trip_id UUID,
  last_location_update TIMESTAMP,
  rating DECIMAL,
  total_trips INT,
  acceptance_rate DECIMAL,
  PRIMARY KEY ((city_id), driver_id)
) WITH CLUSTERING ORDER BY (driver_id ASC);

-- Per-driver time series (write-heavy)
CREATE TABLE driver_locations (
  driver_id UUID,
  day DATE,
  timestamp TIMESTAMP,
  latitude DOUBLE,
  longitude DOUBLE,
  h3_cell TEXT,
  speed DOUBLE,
  heading INT,
  PRIMARY KEY ((driver_id, day), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

Design Decision: Cassandra for Driver State

  • Write-heavy workload: Millions of location updates/second
  • Partition by city: Co-locates drivers in same market
  • Tunable consistency: Read at LOCAL_ONE for speed, write at LOCAL_QUORUM for durability (see the sketch below)
  • Natural time series: Location history with TTL for retention
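
A sketch of that consistency split with the Node.js cassandra-driver; contact points, data center, and keyspace names are placeholders:

import cassandra from "cassandra-driver"

const client = new cassandra.Client({
  contactPoints: ["cassandra-1", "cassandra-2"], // placeholder hosts
  localDataCenter: "us-east-1", // placeholder DC name
  keyspace: "rides", // placeholder keyspace
})

const { consistencies } = cassandra.types

// Writes at LOCAL_QUORUM: durable within the local data center
async function setDriverStatus(cityId: number, driverId: string, status: string): Promise<void> {
  await client.execute(
    "UPDATE driver_state SET status = ? WHERE city_id = ? AND driver_id = ?",
    [status, cityId, driverId],
    { prepare: true, consistency: consistencies.localQuorum },
  )
}

// Reads at LOCAL_ONE: fastest replica wins; slightly stale driver state is acceptable
async function getDriverStatus(cityId: number, driverId: string) {
  const result = await client.execute(
    "SELECT status, current_trip_id FROM driver_state WHERE city_id = ? AND driver_id = ?",
    [cityId, driverId],
    { prepare: true, consistency: consistencies.localOne },
  )
  return result.first()
}
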
Data Type | Store | Rationale
Trip records | Schemaless (MySQL) | Append-only, time-travel, flexible schema
Real-time location | Redis Cluster | Sub-ms reads, geo queries, TTL
Driver profile/state | Cassandra | High write throughput, tunable consistency
Surge pricing | Redis + Kafka | Low latency reads, event streaming for updates
Payment transactions | PostgreSQL | ACID for financial data
Analytics/ML features | HDFS + Hive | Batch processing, ML training data

H3 is Uber’s open-source hexagonal hierarchical spatial index. It divides Earth into hexagonal cells at 16 resolution levels.

Square grid (geohash): from a center cell X, the 4 cardinal neighbors are at distance 1, while the 4 diagonal neighbors are at distance √2.

Hexagonal grid (H3): all 6 neighbors of X are at distance 1 (uniform).

Key advantage: Hexagons have 6 equidistant neighbors. This eliminates the diagonal distance problem in square grids, which matters for:

  • Surge pricing (adjacent cells should have equal influence)
  • Demand forecasting (spatial smoothing)
  • Driver proximity (accurate radius queries)
Resolution | Avg Edge (km) | Avg Area (km²) | Use Case
7 | 1.22 | 5.16 | City districts, regional surge
8 | 0.46 | 0.74 | Neighborhood level
9 | 0.17 | 0.11 | Driver indexing (≈100m)
10 | 0.066 | 0.015 | Street level
import h3 from "h3-js"

// Convert lat/lng to H3 cell at resolution 9
function locationToH3(lat: number, lng: number): string {
  return h3.latLngToCell(lat, lng, 9)
  // Returns: "89283082837ffff" (64-bit index as hex string)
}

// Get all H3 cells within radius (k-ring)
function getCellsInRadius(centerH3: string, radiusKm: number): string[] {
  // k=1 ≈ 300m, k=3 ≈ 1km, k=10 ≈ 3km at resolution 9
  const k = Math.ceil(radiusKm / 0.17) // ~170m avg edge length per cell at res 9
  return h3.gridDisk(centerH3, k)
  // Returns all cells within k "rings" of center
}

// Example: Find drivers within 2km of pickup
async function findNearbyDrivers(pickupLat: number, pickupLng: number, radiusKm: number = 2): Promise<Driver[]> {
  const centerCell = locationToH3(pickupLat, pickupLng)
  const searchCells = getCellsInRadius(centerCell, radiusKm)

  // Query Redis for drivers in each cell, using pipelining for efficiency
  const pipeline = redis.pipeline()
  for (const cell of searchCells) {
    pipeline.zrange(`drivers:h3:${cell}`, 0, -1)
  }
  const results = await pipeline.exec()

  // Each pipeline result is an [error, members] pair; keep only the member arrays
  const driverIds = new Set((results ?? []).flatMap(([, members]) => (members as string[]) ?? []))
  return fetchDriverDetails([...driverIds])
}

DISCO (Dispatch Optimization) matches riders to drivers using batch optimization rather than greedy nearest-driver assignment.

interface RideRequest {
  rideId: string
  pickupH3: string
  pickupLat: number
  pickupLng: number
  requestTime: number
}

interface DriverCandidate {
  driverId: string
  h3Cell: string
  lat: number
  lng: number
  etaSeconds: number // To pickup
}

// Batch window: accumulate requests for 100-200ms
class MatchingBatcher {
  private pendingRequests: RideRequest[] = []
  private batchInterval = 150 // ms

  constructor() {
    setInterval(() => this.processBatch(), this.batchInterval)
  }

  addRequest(request: RideRequest) {
    this.pendingRequests.push(request)
  }

  private async processBatch() {
    if (this.pendingRequests.length === 0) return
    const requests = this.pendingRequests
    this.pendingRequests = []

    // 1. Find candidate drivers for all requests
    const candidatesMap = await this.findCandidatesForAll(requests)

    // 2. Build bipartite graph
    const graph = this.buildBipartiteGraph(requests, candidatesMap)

    // 3. Solve assignment (minimize total ETA)
    const assignments = this.solveAssignment(graph)

    // 4. Dispatch drivers
    for (const { rideId, driverId, eta } of assignments) {
      await this.dispatchDriver(rideId, driverId, eta)
    }
  }
}

The assignment problem is: given N riders and M drivers with a cost matrix (ETAs), find the assignment that minimizes total cost.

// Cost matrix: riders × drivers
// cost[i][j] = ETA for driver j to reach rider i's pickup
// Use Infinity for infeasible pairs (driver too far)
function solveAssignment(requests: RideRequest[], candidates: Map<string, DriverCandidate[]>): Assignment[] {
  const n = requests.length
  const allDrivers = new Set<string>()
  candidates.forEach((c) => c.forEach((d) => allDrivers.add(d.driverId)))
  const m = allDrivers.size
  const driverList = [...allDrivers]

  // Build cost matrix
  const cost: number[][] = Array(n)
    .fill(null)
    .map(() => Array(m).fill(Infinity))
  for (let i = 0; i < n; i++) {
    const rideCandidates = candidates.get(requests[i].rideId) ?? []
    for (const candidate of rideCandidates) {
      const j = driverList.indexOf(candidate.driverId)
      cost[i][j] = candidate.etaSeconds
    }
  }

  // Hungarian algorithm: O(n³)
  // For large scale, use auction algorithm or relaxation-based approximations
  const assignments = hungarianAlgorithm(cost)
  return assignments.map(([i, j]) => ({
    rideId: requests[i].rideId,
    driverId: driverList[j],
    eta: cost[i][j],
  }))
}

Design Decision: Batch Size and Window

  • Window too short (< 50ms): Not enough requests to optimize
  • Window too long (> 300ms): Noticeable user delay
  • Sweet spot: 100-200ms: 10-50 concurrent requests in dense areas, imperceptible delay

Improvement over greedy: Uber reports 10-20% reduction in average wait times from batch optimization.

Uber’s DeepETA uses a hybrid approach: physics-based routing for the baseline, ML for residual correction.

[DeepETA flow] Pickup and destination feed two components built on the road network graph (OSM): the Routing Engine (Dijkstra/A*) produces a physics-based ETA (Y), and the DeepETA model (a linear transformer) predicts an ML residual (R). Final ETA = Y + R.
interface ETAFeatures {
  // Spatial features
  originH3: string
  destH3: string
  routeH3Cells: string[] // H3 cells along route

  // Temporal features
  hourOfDay: number // 0-23
  dayOfWeek: number // 0-6
  isHoliday: boolean
  minutesSinceMidnight: number

  // Traffic features
  currentTrafficIndex: number // 0-1 (free flow to gridlock)
  historicalTrafficIndex: number // Same time last week
  trafficTrend: number // Improving/worsening

  // Route features
  distanceMeters: number
  numIntersections: number
  numHighwaySegments: number
  routingEngineETA: number // Physics-based baseline

  // Weather (optional)
  precipitation: number
  visibility: number
}

// Model inference
async function predictETA(features: ETAFeatures): Promise<number> {
  // Call ML serving layer (Michelangelo)
  const residual = await mlClient.predict("deepeta", features)
  // Final ETA = routing engine + ML residual
  return features.routingEngineETA + residual
}

Performance:

  • Median latency: 3.25ms
  • P95 latency: 4ms
  • QPS: Hundreds of thousands/second at Uber
interface SurgeCell {
  h3Cell: string // Resolution 7 (larger area)
  demandCount: number // Ride requests in last 5 minutes
  supplyCount: number // Available drivers
  multiplier: number // Calculated surge
  updatedAt: number
}

// Calculate surge for each H3 cell
async function calculateSurge(cityId: string): Promise<Map<string, SurgeCell>> {
  const cells = new Map<string, SurgeCell>()

  // Get H3 cells for the city (resolution 7, ~5km² each)
  const cityCells = getCityH3Cells(cityId, 7)

  for (const h3Cell of cityCells) {
    // Count requests in this cell (last 5 minutes); Redis returns a string or null
    const demandCount = parseInt((await redis.get(`demand:${h3Cell}:5min`)) ?? "0", 10)

    // Count available drivers in this cell
    const childCells = h3.cellToChildren(h3Cell, 9) // Expand to res 9
    let supplyCount = 0
    for (const child of childCells) {
      supplyCount += await redis.scard(`drivers:h3:${child}:available`)
    }

    // Calculate multiplier
    const ratio = supplyCount > 0 ? demandCount / supplyCount : 10
    const multiplier = calculateMultiplier(ratio)

    cells.set(h3Cell, {
      h3Cell,
      demandCount,
      supplyCount,
      multiplier,
      updatedAt: Date.now(),
    })
  }

  return cells
}

function calculateMultiplier(demandSupplyRatio: number): number {
  // Piecewise linear function
  // ratio < 0.5: no surge (1.0x)
  // ratio 0.5-1.0: linear 1.0x-1.5x
  // ratio 1.0-2.0: linear 1.5x-2.5x
  // ratio > 2.0: cap at 3.0x (regulatory/PR limits)
  if (demandSupplyRatio < 0.5) return 1.0
  if (demandSupplyRatio < 1.0) return 1.0 + (demandSupplyRatio - 0.5)
  if (demandSupplyRatio < 2.0) return 1.5 + (demandSupplyRatio - 1.0)
  return Math.min(3.0, 1.5 + demandSupplyRatio - 1.0)
}

Raw surge calculations can be noisy (5-minute windows have high variance). Apply smoothing:

// Exponential moving average to prevent jarring jumps
function smoothSurge(
  currentMultiplier: number,
  previousMultiplier: number,
  alpha: number = 0.3, // Smoothing factor
): number {
  // New surge = α × current + (1-α) × previous
  const smoothed = alpha * currentMultiplier + (1 - alpha) * previousMultiplier

  // Also limit change rate (max ±0.3 per update)
  const maxDelta = 0.3
  const delta = smoothed - previousMultiplier
  if (Math.abs(delta) > maxDelta) {
    return previousMultiplier + Math.sign(delta) * maxDelta
  }
  return smoothed
}

Design Decision: Surge Resolution

  • Spatial: Resolution 7 (~5km²) for surge, resolution 9 (~100m) for driver indexing (see the lookup sketch after this list)
  • Temporal: Recalculate every 5-10 minutes, smooth changes
  • Why larger cells for surge: Surge should be stable across a neighborhood; overly granular surge creates confusion (“why is it 2x here but 1.2x across the street?”)
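
Because drivers are indexed at resolution 9 while surge lives at resolution 7, a lookup just walks up the H3 hierarchy. A sketch using h3-js v4; the surge:{cell} Redis key is an assumed layout, not something defined elsewhere in this design:

import h3 from "h3-js"
import Redis from "ioredis"

const redis = new Redis()

// Map a pickup (or driver) location to its surge cell and read the current multiplier
async function getSurgeMultiplier(lat: number, lng: number): Promise<number> {
  const res9Cell = h3.latLngToCell(lat, lng, 9) // fine-grained cell used for driver indexing
  const surgeCell = h3.cellToParent(res9Cell, 7) // coarser parent cell used for surge pricing

  const cached = await redis.get(`surge:${surgeCell}`) // written by the surge job above (assumed key)
  return cached !== null ? parseFloat(cached) : 1.0 // default to no surge if missing or expired
}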

Problem: Displaying 20+ driver pins updating every 4-5 seconds causes jank on mobile devices.

Solution: Batch Updates + Canvas Rendering

// Instead of updating each marker individually,
// batch updates and render to canvas overlay
class DriverMapLayer {
  private canvas: HTMLCanvasElement
  private drivers: Map<string, DriverPosition> = new Map()
  private pendingUpdates: DriverPosition[] = []
  private rafId: number | null = null

  // Buffer updates, render on next animation frame
  updateDriver(position: DriverPosition) {
    this.pendingUpdates.push(position)
    if (!this.rafId) {
      this.rafId = requestAnimationFrame(() => this.render())
    }
  }

  private render() {
    // Apply all pending updates
    for (const update of this.pendingUpdates) {
      this.drivers.set(update.driverId, update)
    }
    this.pendingUpdates = []
    this.rafId = null

    // Clear and redraw all drivers
    const ctx = this.canvas.getContext("2d")!
    ctx.clearRect(0, 0, this.canvas.width, this.canvas.height)
    for (const driver of this.drivers.values()) {
      this.drawDriver(ctx, driver)
    }
  }

  private drawDriver(ctx: CanvasRenderingContext2D, driver: DriverPosition) {
    const [x, y] = this.latLngToPixel(driver.lat, driver.lng)
    // Draw car icon rotated to heading
    ctx.save()
    ctx.translate(x, y)
    ctx.rotate((driver.heading * Math.PI) / 180)
    ctx.drawImage(this.carIcon, -12, -12, 24, 24)
    ctx.restore()
  }
}
// State machine for ride lifecycle
type RideStatus =
  | "IDLE"
  | "REQUESTING"
  | "MATCHING"
  | "DRIVER_ASSIGNED"
  | "DRIVER_ARRIVING"
  | "DRIVER_ARRIVED"
  | "IN_PROGRESS"
  | "COMPLETED"
  | "CANCELLED"

interface RideState {
  status: RideStatus
  ride: Ride | null
  driver: Driver | null
  driverLocation: LatLng | null
  etaSeconds: number | null
  route: LatLng[] | null
}

// WebSocket message handler
function handleRideUpdate(state: RideState, message: WSMessage): RideState {
  switch (message.type) {
    case "DRIVER_ASSIGNED":
      return {
        ...state,
        status: "DRIVER_ASSIGNED",
        driver: message.driver,
        etaSeconds: message.eta,
      }
    case "DRIVER_LOCATION":
      return {
        ...state,
        driverLocation: message.location,
        etaSeconds: message.eta,
        status: message.eta < 60 ? "DRIVER_ARRIVED" : state.status,
      }
    case "TRIP_STARTED":
      return {
        ...state,
        status: "IN_PROGRESS",
        route: message.route,
      }
    case "TRIP_COMPLETED":
      return {
        ...state,
        status: "COMPLETED",
        ride: { ...state.ride!, fare: message.fare },
      }
    default:
      return state
  }
}

Problem: Riders in areas with poor connectivity may lose connection mid-request.

Solution: Optimistic UI + Request Queue

// Queue ride request locally if offline
class RideRequestQueue {
  private queue: RideRequest[] = []

  async requestRide(request: RideRequest): Promise<void> {
    if (!navigator.onLine) {
      // Store locally
      this.queue.push(request)
      localStorage.setItem("pendingRides", JSON.stringify(this.queue))
      throw new OfflineError("Ride queued, will submit when online")
    }
    return this.submitRide(request)
  }

  // Called when connectivity restored
  async processQueue(): Promise<void> {
    const pending = [...this.queue]
    this.queue = []
    for (const request of pending) {
      try {
        await this.submitRide(request)
      } catch (e) {
        // Re-queue if still failing
        this.queue.push(request)
      }
    }
    localStorage.setItem("pendingRides", JSON.stringify(this.queue))
  }
}

// Listen for online event
window.addEventListener("online", () => {
  rideQueue.processQueue()
})
Component | Requirement | Options
Message Queue | 1M+ msg/sec, ordering | Kafka, Pulsar, RedPanda
Real-time Cache | Sub-ms geo queries | Redis Cluster, KeyDB
Time-series DB | Location history | Cassandra, ScyllaDB, TimescaleDB
Stream Processing | Real-time aggregations | Flink, Kafka Streams, Samza
ML Serving | Low-latency inference | TensorFlow Serving, Triton, custom
Object Storage | Trip receipts, ML models | S3-compatible (MinIO)

[Deployment diagram] Edge layer: Route 53 (latency-based routing) and CloudFront direct traffic to the nearest region. Within each region (US-East, US-West): ALB → EKS cluster (core services) plus an ECS WebSocket fleet; a data tier of Aurora PostgreSQL (payments), ElastiCache Redis (location cache), MSK Kafka (event streaming), and Keyspaces (driver state); SageMaker serves ETA inference. MSK Kafka handles cross-region replication between regions.

Component | AWS Service | Configuration
API Services | EKS (Kubernetes) | 50-500 pods, HPA on CPU/requests
WebSocket Gateway | ECS Fargate | 100-1000 tasks, sticky sessions
Location Cache | ElastiCache Redis | r6g.xlarge cluster mode, 6 shards
Event Streaming | MSK (Kafka) | 150 nodes, 3 AZs, tiered storage
Driver State | Keyspaces (Cassandra) | On-demand capacity
Payments DB | Aurora PostgreSQL | db.r6g.2xlarge, Multi-AZ, read replicas
ML Inference | SageMaker | ml.c5.4xlarge, auto-scaling
Object Storage | S3 + CloudFront | Trip receipts, ML models

Uber operates globally with active-active deployments. Key considerations:

  1. Kafka cross-region replication: uReplicator (Uber’s open-source tool) for zero-data-loss replication
  2. Cassandra multi-DC: LOCAL_QUORUM writes, LOCAL_ONE reads for low latency
  3. Redis geo-replication: Active-passive per region (location data is region-specific)
  4. DNS-based routing: Route users to nearest region based on latency
Managed Service | Self-Hosted | When to Self-Host
MSK | Apache Kafka | Cost at scale (Uber runs own Kafka)
Keyspaces | Apache Cassandra | Specific tuning, cost
ElastiCache | Redis on EC2 | Redis modules, cost
SageMaker | TensorFlow Serving | Custom models, latency requirements

This design prioritizes batch-optimized matching with H3 spatial indexing over simpler greedy approaches. The 10-20% improvement in average wait times justifies the added complexity for a system where every second of reduced wait time translates to user satisfaction and driver utilization.

Key architectural decisions:

  1. H3 hexagonal indexing over geohash: Uniform neighbor distances eliminate edge artifacts in proximity queries and surge calculations. Uber open-sourced H3 for this reason.

  2. Batch optimization over greedy matching: Accumulating requests for 100-200ms and solving the global assignment problem outperforms per-request nearest-driver matching significantly.

  3. Hybrid ETA (physics + ML): The routing engine provides a solid baseline; ML models learn residual corrections for traffic patterns, events, and local conditions.

  4. Schemaless for trips, Cassandra for driver state: Append-only storage with flexible schema handles the write-heavy, evolving trip data model; Cassandra’s tunable consistency fits the driver location workload.

  5. Spatially-granular, temporally-smooth surge: Resolution 7 H3 cells (~5km²) provide stable neighborhood-level pricing; temporal smoothing prevents jarring multiplier changes.

Limitations and future improvements:

  • UberPool matching: Shared rides require solving a more complex routing problem with pickup/dropoff ordering constraints.
  • Predictive dispatch: Pre-positioning drivers based on predicted demand could further reduce wait times.
  • Dynamic pricing experimentation: ML models could optimize multipliers for market equilibrium rather than simple supply/demand ratios.
Prerequisites:

  • Distributed systems fundamentals (CAP theorem, eventual consistency)
  • Database concepts (sharding, replication, time-series data)
  • Graph algorithms (Dijkstra, bipartite matching basics)
  • Basic understanding of ML inference serving
Glossary:

  • H3: Hexagonal Hierarchical Spatial Index—Uber’s open-source geospatial indexing system using hexagonal cells at 16 resolution levels
  • DISCO: Dispatch Optimization—Uber’s matching service that assigns riders to drivers
  • Schemaless: Uber’s MySQL-based append-only data store with flexible schema
  • k-ring: In H3, the set of all cells within k “hops” of a center cell
  • ETA: Estimated Time of Arrival—predicted time for driver to reach pickup or destination
  • Surge: Dynamic pricing multiplier applied when demand exceeds supply
  • Bipartite matching: Assignment problem where two disjoint sets (riders, drivers) are matched to minimize total cost
Key takeaways:

  • H3 spatial indexing provides uniform neighbor distances, enabling accurate proximity queries and smooth surge pricing gradients
  • Batch-optimized matching (100-200ms windows) achieves 10-20% better wait times than greedy nearest-driver assignment
  • Hybrid ETA prediction combines physics-based routing with ML residual correction for ±2 minute accuracy
  • Real-time location tracking at 2M+ updates/second uses Redis with H3-indexed sorted sets and 30-second TTLs
  • Surge pricing operates at H3 resolution 7 (~5km²) with temporal smoothing to prevent jarring changes
  • Schemaless (MySQL) + Cassandra handles the write-heavy workload with append-only trip records and high-throughput driver state
