
Design Google Maps

A system design for a mapping and navigation platform handling tile-based rendering, real-time routing with traffic awareness, geocoding, and offline maps. This design addresses continental-scale road networks (18M+ nodes), sub-second routing queries, and 97%+ ETA accuracy.

[Architecture diagram: web and mobile clients → CDN tile cache and load balancer → application services (tile, routing, geocoding, traffic, search) → data stores (tile object store, road graph with CH index, R-tree/S2 geo index, time-series traffic data, POI database), fed by a data ingestion pipeline (probe collector, map updates, user reports).]

High-level architecture: CDN-cached tiles, specialized services for routing/geocoding/traffic, and multi-source data ingestion.

Google Maps solves three distinct problems that require different architectural approaches:

  1. Map Rendering: Quadtree-based tile pyramid with vector tiles—zoom level 0 is 1 tile, zoom 18 is 69 billion tiles. CDN caching is essential; cache hit rates exceed 95% for popular areas.

  2. Routing: Dijkstra alone takes 4.8 seconds for European road networks (18M nodes). Contraction Hierarchies (CH) preprocessing reduces query time to 163 microseconds—a 30,000x speedup. Traffic-aware routing overlays real-time edge weights on the precomputed hierarchy.

  3. ETA Prediction: Graph Neural Networks model “supersegments” (road sections with shared traffic patterns) to achieve 97%+ accuracy, with up to 50% improvement over baseline in congested cities.

The core tradeoff: preprocessing time vs. query latency. CH requires 5+ minutes to precompute but enables sub-millisecond queries. For traffic updates, a hybrid approach applies real-time weights to the static hierarchy rather than recomputing.

| Feature | Priority | In Scope |
|---|---|---|
| Map tile rendering | Core | Yes |
| Turn-by-turn routing | Core | Yes |
| Real-time traffic | Core | Yes |
| ETA prediction | Core | Yes |
| Geocoding (address → coordinates) | Core | Yes |
| Reverse geocoding (coordinates → address) | Core | Yes |
| POI search | High | Yes |
| Place autocomplete | High | Yes |
| Offline maps | High | Yes |
| Street View | Medium | Brief mention |
| Transit routing | Medium | Out of scope |
| Indoor maps | Low | Out of scope |

| Requirement | Target | Rationale |
|---|---|---|
| Availability | 99.99% | User-facing, safety-critical for navigation |
| Tile latency | p99 < 100ms | Map rendering responsiveness |
| Routing latency | p99 < 500ms | User expectation for route calculation |
| ETA accuracy | 97%+ trips within ±10% | User trust in arrival predictions |
| Offline storage | < 2GB per region | Mobile device constraints |
| Traffic update frequency | < 2 minutes | Real-time usefulness |

Users:

  • DAU: 1 billion (Google Maps actual scale)
  • Peak concurrent: 100M (10% of DAU)

Tile Traffic:

  • Average session: 50 tile requests (zoom/pan interactions)
  • Daily tile requests: 1B × 50 = 50B requests/day
  • Peak RPS: 50B / 86400 × 3 (peak multiplier) ≈ 1.7M RPS

Routing Traffic:

  • Routes per DAU: 2 average
  • Daily routing requests: 2B/day ≈ 23K RPS
  • Peak: 70K RPS

Storage:

  • Global road network: ~1 billion road segments
  • CH index size: 50-100 bytes/node × 1B ≈ 50-100 GB
  • Vector tiles (all zoom levels): ~50 PB
  • Traffic data: 1B segments × 24 hours × 365 days × 4 bytes ≈ 35 TB/year

Path A: Contraction Hierarchies (CH)

Best when:

  • Road network changes infrequently (< daily)
  • Query latency is critical (< 1ms routing)
  • Traffic updates can be overlaid without full recomputation

Architecture:

  • Precompute Contraction Hierarchies offline (5-10 minutes for continental networks)
  • Store preprocessed graph in memory-mapped files
  • Apply traffic as edge weight multipliers at query time

Trade-offs:

  • + Sub-millisecond query time (163 µs median)
  • + Predictable latency under load
  • − Preprocessing blocks road network updates
  • − Memory-intensive (21-68 bytes/node overhead)

Real-world examples: OSRM, GraphHopper, and the Google Maps routing engine.

Path B: ALT (A*, Landmarks, Triangle inequality)

Best when:

  • Road network changes frequently (construction, closures)
  • Real-time edge weights are primary concern
  • Preprocessing time must be minimal

Architecture:

  • Precompute distances to landmark nodes only
  • Use triangle inequality for search pruning
  • Full dynamic edge weights without reprocessing

Trade-offs:

  • + Handles dynamic networks well
  • + Lighter preprocessing (seconds to minutes)
  • − 10-100x slower queries than CH
  • − Landmark selection impacts quality

| Factor | Path A (CH) | Path B (ALT) |
|---|---|---|
| Query time | 163 µs | 1-10 ms |
| Preprocessing | 5-10 minutes | Seconds |
| Dynamic updates | Overlay required | Native support |
| Memory overhead | High (21-68 B/node) | Moderate |
| Best for | Production routing | Experimental/dynamic |

This article focuses on Path A (Contraction Hierarchies) because production mapping services require sub-millisecond query times at scale. Traffic is handled via edge weight overlays rather than full recomputation.

Serves pre-rendered or dynamically generated map tiles using a quadtree addressing scheme.

Tile Addressing (Web Mercator):

/{z}/{x}/{y}.{format}
  • z: Zoom level (0-22)
  • x: Column index (0 to 2^z - 1)
  • y: Row index (0 to 2^z - 1)
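
A minimal sketch (assuming the standard Web Mercator/slippy-map convention) of how a client or tile server maps a WGS84 coordinate to the {z}/{x}/{y} address above; the function name is illustrative:

```python
import math

def lat_lng_to_tile(lat: float, lng: float, z: int) -> tuple[int, int]:
    """Return the Web Mercator (slippy map) tile containing a WGS84 coordinate at zoom z."""
    n = 2 ** z
    x = int((lng + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so antimeridian/pole edge cases stay inside [0, 2^z - 1]
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

# Usage: lat_lng_to_tile(37.422, -122.0841, 15) gives the (x, y) used in /15/{x}/{y}.mvt
```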

Zoom Level Properties:

| Zoom | Tile Count | Meters/Pixel (Equator) | Use Case |
|---|---|---|---|
| 0 | 1 | 156,543 m | World view |
| 10 | 1,048,576 | 152.87 m | City-level |
| 15 | ~1 billion | 4.78 m | Street-level |
| 18 | ~69 billion | 0.60 m | Building-level |
| 20 | ~1.1 trillion | 0.15 m | Maximum detail |

Vector Tiles vs. Raster Tiles:

| Aspect | Raster Tiles | Vector Tiles |
|---|---|---|
| Format | Pre-rendered PNG/JPEG | Protobuf-encoded geometry |
| Size | 100-300 KB | 20-50 KB compressed |
| Styling | Fixed at render time | Client-side, dynamic |
| Scaling | Pixelates | Infinite (vector math) |
| Updates | Full tile replacement | Delta updates possible |

Vector tiles (Mapbox Vector Tile specification) encode features as Protocol Buffers with a default extent of 4096 coordinate units per tile. This enables client-side rendering with custom styles and smooth zooming.

Computes optimal paths using Contraction Hierarchies with traffic overlays.

[Diagram: offline preprocessing (raw road graph → node ordering → contract nodes → add shortcuts → CH graph) and online query (forward search upward from source, backward search upward from target, meet at the top, unpack shortcuts, final path).]

Contraction Hierarchies: offline preprocessing creates shortcuts; online queries search "upward" in the hierarchy from both ends.

How Contraction Hierarchies Work:

  1. Node Ordering: Rank nodes by importance (highways > arterials > local roads). Importance is computed using edge difference, contraction depth, and original edges.

  2. Contraction: Iteratively remove least-important nodes. For each removed node, add “shortcut” edges between its neighbors if the shortest path went through it.

  3. Query: Run bidirectional Dijkstra, but only traverse edges going “upward” in the hierarchy. The searches meet at the highest-importance node on the optimal path.

Performance Benchmarks (North America, 87M vertices):

| Metric | Value |
|---|---|
| Preprocessing time | 307 seconds (~5 min) |
| Query time (median) | 163 microseconds |
| Speedup over Dijkstra | ~30,000x |
| Storage overhead | 21-68 bytes/node |

Traffic-Aware Routing:

Real-time traffic is applied as edge weight multipliers without recomputing the hierarchy:

  1. Collect probe data (GPS traces from devices)
  2. Map-match probes to road segments
  3. Compute segment speeds from probe timestamps
  4. Store a speed ratio per segment: actual_speed / free_flow_speed
  5. At query time: edge_weight = base_weight / speed_ratio, so congested segments get proportionally longer travel times

This hybrid approach preserves sub-millisecond queries while incorporating live traffic.
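
A sketch of that overlay under the assumptions above: the stored speed ratio divides the free-flow edge duration at query time. The helper names (`traffic_adjusted_weight`, `edge_weight`, the `(source, target)` lookup key) are illustrative, not from the source:

```python
def traffic_adjusted_weight(base_duration_s: float, speed_ratio: float) -> float:
    """Inflate a free-flow edge duration by the live speed ratio (actual / free-flow speed).

    A ratio of 0.5 (traffic moving at half the free-flow speed) doubles the edge weight.
    The clamp keeps missing or noisy probe data from zeroing out an edge.
    """
    ratio = min(max(speed_ratio, 0.1), 1.5)
    return base_duration_s / ratio


# During a CH query, look up the live ratio per segment and fall back to free flow (1.0).
def edge_weight(edge, live_ratios: dict) -> float:
    return traffic_adjusted_weight(edge.duration, live_ratios.get((edge.source, edge.target), 1.0))
```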

Collects, processes, and serves real-time traffic data.

Data Sources:

| Source | Quality | Latency | Coverage |
|---|---|---|---|
| GPS probes (mobile) | Medium | Real-time | High (urban) |
| Connected cars (OEM) | High | Real-time | Growing |
| Road sensors | High | Real-time | Limited |
| User reports | Variable | Real-time | Incident-specific |
| Historical patterns | N/A | N/A | Baseline |

Floating Car Data (FCD) Pipeline:

Probe → Map Matching → Segment Assignment → Speed Aggregation → Traffic State
  1. Probe ingestion: Timestamped (lat, lon, speed) tuples
  2. Map matching: Hidden Markov Model assigns probes to road segments
  3. Aggregation: Window-based speed averaging (2-5 minute windows)
  4. Traffic state: Free flow / Light / Moderate / Heavy / Standstill
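
A small sketch of the aggregation and classification steps, using the speed-ratio bands from the traffic-coloring table later in this article; the standstill cutoff and function name are assumptions:

```python
from statistics import mean

def traffic_state(window_speeds_kmh: list, free_flow_kmh: float) -> str:
    """Classify a road segment from probe speeds observed in a 2-5 minute window."""
    if not window_speeds_kmh:
        return "unknown"
    ratio = mean(window_speeds_kmh) / free_flow_kmh
    if ratio > 0.8:
        return "free_flow"
    if ratio > 0.6:
        return "light"
    if ratio > 0.4:
        return "moderate"
    if ratio > 0.1:
        return "heavy"
    return "standstill"   # cutoff below "heavy" is an assumption, not from the source
```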

ETA Prediction with Graph Neural Networks:

Google’s DeepMind collaboration uses GNNs to predict travel times:

  • Supersegments: Groups of adjacent road segments with shared traffic patterns
  • Graph structure: Nodes = supersegments, edges = transitions
  • Features: Historical speeds, time of day, day of week, weather
  • Output: Predicted travel time distribution

Results:

  • 97%+ trips with ETA within ±10%
  • Up to 50% reduction in negative ETA outcomes (Sydney, Tokyo, Berlin)

Converts between addresses and coordinates.

Forward Geocoding Pipeline:

Input: "1600 Amphitheatre Parkway, Mountain View, CA"
Address Parsing (libpostal): {street: "1600 Amphitheatre Parkway", city: "Mountain View", state: "CA"}
Normalization: Expand abbreviations, standardize format
Candidate Generation: Query spatial index for matching addresses
Scoring: Rank by text similarity, location confidence
Output: {lat: 37.4220, lng: -122.0841, confidence: 0.98}
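
A toy version of the scoring step; the candidate fields and the 0.7/0.3 weighting are illustrative assumptions, with `difflib` standing in for a production text-similarity model:

```python
from difflib import SequenceMatcher

def score_candidate(query: str, candidate: dict) -> float:
    """Rank a geocoding candidate by text similarity and location confidence.

    candidate is assumed to carry a formatted_address string and a
    location_confidence value in [0, 1].
    """
    text_sim = SequenceMatcher(None, query.lower(),
                               candidate["formatted_address"].lower()).ratio()
    return 0.7 * text_sim + 0.3 * candidate["location_confidence"]

# Usage: best = max(candidates, key=lambda c: score_candidate(query, c))
```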

Reverse Geocoding:

Given (lat, lon), find the nearest address:

  1. Query R-tree/S2 index for nearby address points
  2. Interpolate street address from road segment data
  3. Return formatted address with administrative hierarchy

Spatial Indexing Options:

| Index | Structure | Use Case |
|---|---|---|
| R-tree | Bounding rectangles | General spatial queries, PostGIS |
| Quadtree | Recursive quadrants | Tile-based lookups |
| S2 Geometry | Spherical cells (Hilbert curve) | Global-scale, hierarchical |

Google uses S2 Geometry for global coverage with hierarchical cell IDs that enable efficient prefix-based queries.
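
S2 cell IDs follow a Hilbert-curve ordering; the sketch below uses a simpler quadkey-style ID to show the same prefix property that makes hierarchical cells efficient to query. The function names are illustrative:

```python
import math

def cell_id(lat: float, lng: float, level: int) -> str:
    """Quadkey-style hierarchical cell id (a simplified stand-in for S2 cell ids).

    Each character subdivides the parent cell into four, so a cell contains exactly
    the points whose finer-level ids start with that cell's id -- which is what makes
    prefix (range) queries over an ordered index efficient.
    """
    n = 2 ** level
    x = min(int((lng + 180.0) / 360.0 * n), n - 1)
    y = min(int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n), n - 1)
    digits = []
    for bit in range(level - 1, -1, -1):
        digits.append(str(((x >> bit) & 1) + 2 * ((y >> bit) & 1)))
    return "".join(digits)


def cell_contains(cell: str, point_cell: str) -> bool:
    """A coarser cell contains a finer one iff its id is a prefix of the finer id."""
    return point_cell.startswith(cell)
```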

Place Autocomplete Data Structures:

| Structure | Lookup Time | Space | Best For |
|---|---|---|---|
| Trie | O(m) | High | Exact prefix match |
| Ternary Search Tree | O(m) | Medium | Space-efficient prefix |
| Pruning Radix Trie | O(m) | Low | Production autocomplete |

Where m = query length.
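
A minimal trie sketch showing why lookup is O(m) and why the space cost is high (candidate IDs are stored along every prefix); names and structure are illustrative:

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.place_ids = []   # places whose name passes through this prefix (capped in practice)


class AutocompleteTrie:
    """Minimal prefix index: a lookup walks one node per query character, so O(m)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, name: str, place_id: str) -> None:
        node = self.root
        for ch in name.lower():
            node = node.children.setdefault(ch, TrieNode())
            node.place_ids.append(place_id)   # storing ids at every node is the "high space" cost

    def lookup(self, prefix: str) -> list:
        node = self.root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return []
        return node.place_ids
```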

Ranking Signals:

  • Text relevance (edit distance, prefix match)
  • Popularity (visit frequency)
  • Recency (user’s recent searches)
  • Proximity (distance from user’s location)
  • Category match (restaurants, gas stations)

Endpoint: GET /tiles/{z}/{x}/{y}.{format}

Path Parameters:

  • z: Zoom level (0-22)
  • x: Tile column
  • y: Tile row
  • format: png, mvt (vector), pbf

Response Headers:

Content-Type: image/png | application/vnd.mapbox-vector-tile
Cache-Control: public, max-age=86400
ETag: "abc123"

Response: Binary tile data

Caching Strategy:

  • CDN cache: 24 hours for zoom < 15, 1 hour for zoom ≥ 15
  • Client cache: ETag-based conditional requests
  • Cache hit rate target: > 95%

Endpoint: POST /routes

Request:

{
  "origin": { "lat": 37.422, "lng": -122.0841 },
  "destination": { "lat": 37.7749, "lng": -122.4194 },
  "waypoints": [{ "lat": 37.5585, "lng": -122.2711 }],
  "mode": "driving",
  "departure_time": "2024-01-15T08:00:00Z",
  "alternatives": true,
  "traffic_model": "best_guess"
}

Response:

{
  "routes": [
    {
      "legs": [
        {
          "distance": {"value": 45000, "text": "45 km"},
          "duration": {"value": 2700, "text": "45 min"},
          "duration_in_traffic": {"value": 3300, "text": "55 min"},
          "steps": [
            {
              "instruction": "Head north on Amphitheatre Pkwy",
              "distance": {"value": 500, "text": "500 m"},
              "duration": {"value": 60, "text": "1 min"},
              "polyline": "encoded_polyline_string",
              "maneuver": "turn-right"
            }
          ]
        }
      ],
      "overview_polyline": "encoded_polyline_string",
      "bounds": {"northeast": {...}, "southwest": {...}},
      "warnings": ["Route includes toll roads"]
    }
  ]
}

Error Responses:

  • 400 Bad Request: Invalid coordinates, missing required fields
  • 404 Not Found: No route found (disconnected points)
  • 429 Too Many Requests: Rate limit exceeded

Rate Limits: 1000 requests/minute per API key

Forward Geocoding:

GET /geocode?address={address}&bounds={sw_lat,sw_lng,ne_lat,ne_lng}

Response:

{
  "results": [
    {
      "formatted_address": "1600 Amphitheatre Parkway, Mountain View, CA 94043",
      "geometry": {
        "location": {"lat": 37.4220, "lng": -122.0841},
        "location_type": "ROOFTOP",
        "viewport": {...}
      },
      "address_components": [
        {"long_name": "1600", "types": ["street_number"]},
        {"long_name": "Amphitheatre Parkway", "types": ["route"]}
      ],
      "place_id": "ChIJ..."
    }
  ]
}

Reverse Geocoding:

GET /geocode?latlng={lat},{lng}

Endpoint: GET /places/autocomplete?input={query}&location={lat},{lng}&radius={meters}

Response:

{
  "predictions": [
    {
      "description": "Googleplex, Mountain View, CA",
      "place_id": "ChIJ...",
      "structured_formatting": {
        "main_text": "Googleplex",
        "secondary_text": "Mountain View, CA"
      },
      "distance_meters": 1200
    }
  ]
}

Debouncing: Client should debounce requests (300ms) to reduce API calls during typing.

Primary Store: Custom binary format for in-memory graph processing

Node:
  - id: uint64
  - lat: float32
  - lon: float32
  - ch_level: uint16        // Contraction hierarchy level

Edge:
  - source: uint64
  - target: uint64
  - distance: uint32        // meters
  - duration: uint32        // seconds (free flow)
  - road_class: uint8       // motorway, primary, secondary, etc.
  - is_shortcut: bool
  - shortcut_middle: uint64 // For path unpacking

Storage:

  • Memory-mapped file for O(1) access
  • Compressed with LZ4 for disk storage
  • Sharded by geographic region (continent-level)
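
A sketch of what memory-mapping that layout could look like, assuming NumPy structured arrays and the illustrative file naming below (not the production binary format):

```python
import numpy as np

# Hypothetical on-disk layout mirroring the Node/Edge records above; field order,
# file naming, and the use of NumPy are illustrative assumptions.
NODE_DTYPE = np.dtype([("id", np.uint64), ("lat", np.float32),
                       ("lon", np.float32), ("ch_level", np.uint16)])
EDGE_DTYPE = np.dtype([("source", np.uint64), ("target", np.uint64),
                       ("distance", np.uint32), ("duration", np.uint32),
                       ("road_class", np.uint8), ("is_shortcut", np.bool_),
                       ("shortcut_middle", np.uint64)])

def open_region_graph(region: str):
    """Memory-map the node and edge arrays so queries only fault in the pages they touch."""
    nodes = np.memmap(f"{region}.nodes", dtype=NODE_DTYPE, mode="r")
    edges = np.memmap(f"{region}.edges", dtype=EDGE_DTYPE, mode="r")
    return nodes, edges
```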

Primary Store: Object storage (S3-compatible) with key-value index

Key: {layer}/{z}/{x}/{y}
Value: {
  tile_data: bytes,
  etag: string,
  generated_at: timestamp,
  source_version: string
}

Tile Generation Pipeline:

  1. Raw map data (OpenStreetMap format)
  2. Feature extraction and simplification per zoom level
  3. Vector tile encoding (Mapbox Vector Tile spec)
  4. Compression (gzip)
  5. Upload to object storage

Primary Store: Time-series database (InfluxDB, TimescaleDB, or custom)

CREATE TABLE traffic_observations (
  segment_id  BIGINT NOT NULL,
  timestamp   TIMESTAMPTZ NOT NULL,
  speed_kmh   SMALLINT,
  probe_count SMALLINT,
  confidence  REAL
);

-- Hypertable for time-series performance
SELECT create_hypertable('traffic_observations', 'timestamp');

-- Index for real-time lookups
CREATE INDEX idx_traffic_segment_time
  ON traffic_observations (segment_id, timestamp DESC);

Retention:

  • Raw observations: 7 days
  • Hourly aggregates: 1 year
  • Daily patterns: 5 years

Primary Store: PostgreSQL with PostGIS extension

CREATE TABLE places (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  location GEOGRAPHY(POINT, 4326) NOT NULL,
  category VARCHAR(50),
  address_components JSONB,
  popularity_score REAL DEFAULT 0,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_places_location ON places USING GIST (location);
CREATE INDEX idx_places_category ON places (category);
CREATE INDEX idx_places_name_trgm ON places USING GIN (name gin_trgm_ops);

| Data Type | Store | Rationale |
|---|---|---|
| Road graph | Memory-mapped file | O(1) access, no serialization overhead |
| Tiles | Object storage + CDN | Static content, high read volume |
| Traffic (real-time) | Time-series DB | Time-windowed queries, retention policies |
| Places/POI | PostgreSQL + PostGIS | Spatial queries, full-text search |
| User data | PostgreSQL | ACID, relational queries |
| Search index | Elasticsearch | Full-text, autocomplete, facets |

[Diagram: node ordering combines edge difference (shortcuts added − edges removed), contraction depth (max level of neighbors), and original edge count into a priority score; preprocessing then contracts nodes from lowest to highest priority, adding shortcuts as needed and recording each node's CH level.]

Node ordering determines preprocessing quality. Lower priority = contracted earlier.

Node Ordering Heuristic:

priority(v) = edge_difference(v) + contract_depth(v) + original_edges(v)

Where:

  • edge_difference: Number of shortcuts that would be added minus edges removed
  • contract_depth: Maximum hierarchy level among neighbors
  • original_edges: Number of non-shortcut edges (prefer removing shortcuts first)

Witness Search:

Before adding a shortcut u→w (through v), check if a shorter path u→w exists without v:

shortcut_needed = (d(u,v) + d(v,w)) < witness_search(u, w, excluding v)

Witness search is a bounded Dijkstra; the bound is the proposed shortcut length.
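
A sketch of that bounded witness search, assuming hypothetical `neighbors`/`weight` accessors on the graph; a shortcut u→w is added only when no path avoiding v is at least as short:

```python
import heapq

def shortcut_needed(neighbors, weight, u, v, w):
    """Witness-search sketch for contracting node v.

    neighbors(x) -> iterable of (y, edge_weight); weight(x, y) -> edge weight (assumed helpers).
    The Dijkstra from u is bounded by the proposed shortcut length d(u,v) + d(v,w).
    """
    limit = weight(u, v) + weight(v, w)
    dist = {u: 0.0}
    heap = [(0.0, u)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist.get(x, float("inf")):
            continue                       # stale heap entry
        if x == w:
            return d > limit               # a witness path <= limit makes the shortcut redundant
        if d > limit:
            break                          # everything left is beyond the bound
        for y, edge_w in neighbors(x):
            if y == v:
                continue                   # exclude the node being contracted
            nd = d + edge_w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return True                            # w unreachable within the bound: shortcut needed
```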

Query Algorithm:

def ch_query(source, target):
    # Bidirectional Dijkstra, relaxing only "upward" edges in the hierarchy
    forward = dijkstra_upward(source)
    backward = dijkstra_upward(target)

    # Find the best meeting point among nodes settled by both searches
    best_dist = float("inf")
    meeting_node = None
    for node in forward.visited & backward.visited:
        dist = forward.dist[node] + backward.dist[node]
        if dist < best_dist:
            best_dist = dist
            meeting_node = node

    # Unpack shortcuts recursively to recover the path over original edges
    return unpack_path(source, meeting_node, target)

Why searching only “upward” works:

The hierarchy guarantees that every shortest path has a representation in the CH graph that climbs in CH level from the source, meets the reverse search at a single highest-level node, and then descends to the target; the backward search explores that descending half as upward edges from the target. Restricting both searches to upward edges therefore preserves correctness while dramatically pruning the search space.

Map matching assigns GPS probes to road segments. The standard approach uses a Hidden Markov Model (HMM):

  • States: road segments within a radius of the GPS point
  • Observations: GPS coordinates
  • Emission probability: based on distance from the GPS point to the road segment
  • Transition probability: based on routing distance between consecutive candidate segments

Algorithm (Viterbi):

  1. For each GPS point, find candidate road segments within 50m radius
  2. Compute emission probabilities: P(gps | segment) ∝ exp(-distance² / σ²)
  3. Compute transition probabilities between consecutive candidates
  4. Find most likely sequence using Viterbi algorithm

Output: Sequence of road segments with timestamps → enables speed computation per segment.
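
A compact Viterbi sketch of this pipeline; the candidate/emission/transition interfaces and the σ and β scale parameters are assumptions, not values from the source:

```python
def viterbi_map_match(candidates, emission_dist, transition_cost, sigma=10.0, beta=5.0):
    """Sketch of HMM map matching (Viterbi over per-fix candidate segments).

    candidates[t]         -- candidate segment ids for GPS fix t
    emission_dist(t, s)   -- metres from fix t to segment s
    transition_cost(a, b) -- routing-distance penalty between consecutive candidates
    sigma, beta           -- assumed GPS-noise and transition scale parameters
    """
    emit = lambda t, s: -(emission_dist(t, s) ** 2) / (2 * sigma ** 2)  # log-Gaussian emission
    trans = lambda a, b: -transition_cost(a, b) / beta                  # log-exponential transition

    score = {s: emit(0, s) for s in candidates[0]}
    back = [{}]
    for t in range(1, len(candidates)):
        new_score, pointers = {}, {}
        for s in candidates[t]:
            prev, best = max(((p, score[p] + trans(p, s)) for p in candidates[t - 1]),
                             key=lambda pair: pair[1])
            new_score[s] = best + emit(t, s)
            pointers[s] = prev
        score, back = new_score, back + [pointers]

    # Backtrack from the best final state to recover the segment sequence.
    path = [max(score, key=score.get)]
    for t in range(len(candidates) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))
```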

Vector Tile Generation:

  1. Feature extraction: Query PostGIS for features in tile bounding box
  2. Simplification: Douglas-Peucker algorithm, tolerance based on zoom level (see the sketch after this list)
  3. Clipping: Clip features to tile boundary with buffer
  4. Encoding: Convert to Mapbox Vector Tile (MVT) format
  5. Compression: gzip for storage/transfer
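
The simplification step (2) is classic Douglas-Peucker; below is a recursive sketch over already-projected (x, y) coordinates in metres, with the helper names being illustrative:

```python
def simplify(points, tolerance_m):
    """Douglas-Peucker line simplification over projected (x, y) points in metres.

    Keeps the endpoints, finds the point farthest from the chord, and recurses
    only if that distance exceeds the zoom-dependent tolerance.
    """
    if len(points) < 3:
        return list(points)
    max_d, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > max_d:
            max_d, idx = d, i
    if max_d <= tolerance_m:
        return [points[0], points[-1]]
    left = simplify(points[: idx + 1], tolerance_m)
    right = simplify(points[idx:], tolerance_m)
    return left[:-1] + right


def _point_segment_distance(p, a, b):
    """Distance from point p to segment ab (planar approximation)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    cx, cy = ax + t * dx, ay + t * dy
    return ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5
```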

Simplification Tolerance:

| Zoom Level | Tolerance (meters) | Rationale |
|---|---|---|
| 0-5 | 1000+ | Only major features visible |
| 6-10 | 100-1000 | Country/state level |
| 11-15 | 10-100 | City level |
| 16+ | 1-10 | Street level, minimal simplification |

MVT Structure:

message Tile {
  repeated Layer layers = 3;
}

message Layer {
  required string name = 1;
  repeated Feature features = 2;
  repeated string keys = 3;     // Shared key list
  repeated Value values = 4;    // Shared value list
  optional uint32 extent = 5;   // Default 4096
}

message Feature {
  optional uint64 id = 1;
  repeated uint32 tags = 2;     // Indices into keys/values
  optional GeomType type = 3;
  repeated uint32 geometry = 4; // Command-encoded
}

Keys and values are deduplicated across features for compression.

Viewport-Based Loading:

interface Viewport {
  center: LatLng
  zoom: number
  bounds: LatLngBounds
}

function getTilesForViewport(viewport: Viewport): TileCoord[] {
  const { bounds, zoom } = viewport
  const tiles: TileCoord[] = []
  const sw = latLngToTile(bounds.southwest, zoom)
  const ne = latLngToTile(bounds.northeast, zoom)
  // Tile y grows southward in Web Mercator, so the northeast corner has the smaller y
  const [minX, maxX] = [sw.x, ne.x]
  const [minY, maxY] = [ne.y, sw.y]
  for (let x = minX; x <= maxX; x++) {
    for (let y = minY; y <= maxY; y++) {
      tiles.push({ z: zoom, x, y })
    }
  }
  return tiles
}

Prefetching:

  • Load tiles 1 level above and below current zoom (for smooth zoom transitions)
  • Load adjacent tiles outside viewport (buffer for panning)
  • Typical: 20-30 tiles per view state

Tile Cache (Client-Side):

class TileCache {
  private cache: Map<string, ImageBitmap> = new Map()
  private maxSize: number = 500 // tiles
  private lru: string[] = []

  get(key: string): ImageBitmap | undefined {
    const tile = this.cache.get(key)
    if (tile) {
      // Move to end of LRU
      this.lru = this.lru.filter((k) => k !== key)
      this.lru.push(key)
    }
    return tile
  }

  set(key: string, tile: ImageBitmap): void {
    if (!this.cache.has(key) && this.cache.size >= this.maxSize) {
      const evict = this.lru.shift()!
      this.cache.delete(evict)
    }
    this.cache.set(key, tile)
    // Avoid duplicate LRU entries when overwriting an existing tile
    this.lru = this.lru.filter((k) => k !== key)
    this.lru.push(key)
  }
}

WebGL Rendering Pipeline:

  1. Parse MVT protobuf → geometry arrays
  2. Upload vertex buffers to GPU
  3. Apply style rules (zoom-dependent line widths, colors)
  4. Render with appropriate shaders

Libraries:

  • Mapbox GL JS / MapLibre GL JS (WebGL-based)
  • Leaflet with vector tile plugins (Canvas/SVG fallback)
  • deck.gl for data visualization layers

Performance Considerations:

  • Batch draw calls per layer
  • Use instanced rendering for repeated symbols (icons)
  • Level-of-detail: reduce vertex count at lower zooms
  • Web Workers for MVT parsing (off main thread)

Polyline Rendering:

interface RouteLayer {
  // Decoded polyline as [lat, lng] pairs
  coordinates: [number, number][]
  // Style
  strokeColor: string
  strokeWidth: number
  // Animation state (for traffic coloring)
  trafficSegments: {
    startIndex: number
    endIndex: number
    severity: "free_flow" | "light" | "moderate" | "heavy"
  }[]
}

Traffic Coloring:

| Severity | Color | Speed Ratio |
|---|---|---|
| Free flow | Green (#4CAF50) | > 0.8 |
| Light | Yellow (#FFEB3B) | 0.6-0.8 |
| Moderate | Orange (#FF9800) | 0.4-0.6 |
| Heavy | Red (#F44336) | < 0.4 |

Animation (Navigation Mode):

  • Update position marker at 60fps
  • Smooth interpolation between GPS updates
  • Re-route detection: compare current position to expected route
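
A sketch of the re-route check from the last bullet: distance from the current GPS fix to the nearest route segment under an equirectangular approximation, with an assumed 30 m threshold and illustrative function names:

```python
import math

def off_route(position, route, threshold_m=30.0):
    """Return True if the current fix is farther than threshold_m from the route polyline.

    position: (lat, lng); route: list of (lat, lng) polyline vertices.
    The planar approximation is fine over the ~100 m scales involved here.
    """
    lat0 = math.radians(position[0])
    metres_per_deg = 111_320.0  # metres per degree of latitude (approx.)

    def to_xy(p):
        return (p[1] * metres_per_deg * math.cos(lat0), p[0] * metres_per_deg)

    px, py = to_xy(position)
    best = float("inf")
    for a, b in zip(route, route[1:]):
        ax, ay = to_xy(a)
        bx, by = to_xy(b)
        dx, dy = bx - ax, by - ay
        if dx == 0 and dy == 0:
            t = 0.0
        else:
            t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
        cx, cy = ax + t * dx, ay + t * dy
        best = min(best, math.hypot(px - cx, py - cy))
    return best > threshold_m
```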

Download Strategy:

interface OfflineRegion {
  bounds: LatLngBounds
  minZoom: number
  maxZoom: number
  includeRouting: boolean
}

function estimateDownloadSize(region: OfflineRegion): number {
  let totalTiles = 0
  for (let z = region.minZoom; z <= region.maxZoom; z++) {
    const tilesAtZoom = countTilesInBounds(region.bounds, z)
    totalTiles += tilesAtZoom
  }
  // Average 30KB per compressed vector tile
  return totalTiles * 30 * 1024
}

Storage Format:

SQLite database with:

  • Tile blobs keyed by (z, x, y)
  • Metadata (region bounds, version, expiry)
  • Road graph subset for offline routing
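
A minimal sketch of such a store using Python's built-in sqlite3; the schema and function names are illustrative (not MBTiles or any production format):

```python
import sqlite3
from typing import Optional

def open_offline_db(path: str) -> sqlite3.Connection:
    """Create or open the offline tile store keyed by (z, x, y)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tiles (
               z INTEGER, x INTEGER, y INTEGER,
               data BLOB, etag TEXT,
               PRIMARY KEY (z, x, y)
           )"""
    )
    return conn

def get_tile(conn: sqlite3.Connection, z: int, x: int, y: int) -> Optional[bytes]:
    row = conn.execute("SELECT data FROM tiles WHERE z=? AND x=? AND y=?", (z, x, y)).fetchone()
    return row[0] if row else None

def put_tile(conn: sqlite3.Connection, z: int, x: int, y: int, data: bytes, etag: str) -> None:
    conn.execute("INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?, ?)", (z, x, y, data, etag))
    conn.commit()
```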

Delta Updates:

  1. Server generates diff between versions
  2. Client downloads only changed tiles (typically 2-8 MB)
  3. Apply patch to local database
  4. Verify integrity with checksums

Bandwidth reduction: up to 75% compared to full re-download.

Tile CDN Requirements:

  • Edge locations in 100+ cities
  • Cache hit rate > 95%
  • Origin shield to protect tile servers
  • Custom cache keys: /{layer}/{z}/{x}/{y}

Cache Hierarchy:

User → Edge PoP → Regional Cache → Origin Shield → Tile Server

Cache TTLs:

| Zoom Level | TTL | Rationale |
|---|---|---|
| 0-10 | 30 days | Rarely changes (coastlines, countries) |
| 11-15 | 7 days | City infrastructure |
| 16-22 | 1 day | Street details, POIs |

Memory Requirements:

  • North America CH graph: ~20 GB RAM
  • Europe CH graph: ~15 GB RAM
  • Global: ~100 GB RAM (sharded)

Deployment Strategy:

  1. Regional sharding: Each region runs its own CH graph
  2. Cross-region routing: Stitch at border nodes
  3. Replication: 3 instances per region for availability
  4. Updates: Blue-green deployment for new CH builds

| Component | Service | Configuration |
|---|---|---|
| Tile CDN | CloudFront | Global edge, S3 origin |
| Tile Storage | S3 | Standard for hot tiles, Glacier for archive |
| Routing Service | ECS Fargate | Memory-optimized (r6g.4xlarge equivalent) |
| Traffic Ingestion | Kinesis | Sharded by region |
| Traffic Processing | Lambda + Kinesis Analytics | Real-time aggregation |
| Geocoding | OpenSearch | Geo queries, autocomplete |
| POI Database | RDS PostgreSQL | PostGIS extension |
| Graph Storage | EFS | Shared memory-mapped files |

[Diagram: CloudFront CDN and an Application Load Balancer at the edge; an ECS cluster in a VPC running the tile, routing, geocoding, and traffic services; a data layer of S3 (tiles), EFS (graph), RDS PostgreSQL, OpenSearch, and Timestream; ingestion via Kinesis and Lambda.]

AWS reference architecture with regional routing service instances and global tile CDN.

| Managed Service | Self-Hosted | When to Self-Host |
|---|---|---|
| CloudFront | Nginx + Varnish | Cost at extreme scale |
| OpenSearch | Elasticsearch | Specific plugins needed |
| RDS PostgreSQL | PostgreSQL on EC2 | PostGIS extensions, cost |
| Kinesis | Apache Kafka | Higher throughput, cost |
| Timestream | InfluxDB | Open-source flexibility |

Tile Service:

  • Cache hit rate (target: > 95%)
  • Tile generation latency (p99 < 200ms)
  • Error rate by zoom level

Routing Service:

  • Query latency (p50, p99)
  • Routes not found rate
  • Traffic overlay staleness

Traffic Service:

  • Probe ingestion lag (target: < 30s)
  • Segment coverage (% of roads with data)
  • ETA accuracy (actual vs. predicted)

| Metric | Warning | Critical |
|---|---|---|
| Tile cache hit rate | < 90% | < 80% |
| Routing p99 latency | > 500ms | > 1s |
| Traffic data lag | > 2 min | > 5 min |
| ETA accuracy | < 95% | < 90% |

This design addresses the core challenges of a mapping platform:

  1. Rendering at scale: Tile pyramid with vector tiles enables efficient caching (95%+ hit rate) and client-side styling. The quadtree structure scales from 1 tile (world view) to 69 billion (building level).

  2. Fast routing: Contraction Hierarchies preprocessing (5 minutes) enables 163µs queries—a 30,000x speedup over Dijkstra. Traffic is overlaid as edge weight multipliers without full recomputation.

  3. Accurate ETAs: Graph Neural Networks on supersegments achieve 97%+ accuracy. The key insight: model road sections with shared traffic patterns rather than individual segments.

Key tradeoffs accepted:

  • Preprocessing time for query speed (CH approach)
  • Storage overhead (21-68 bytes/node) for routing performance
  • Eventual consistency in traffic data (2-minute lag acceptable)

Limitations:

  • CH preprocessing blocks rapid road network updates (construction, closures)
  • Traffic accuracy depends on probe density (sparse in rural areas)
  • Offline routing requires downloading regional graph subsets

Future improvements:

  • Machine learning for route personalization
  • Real-time construction detection from probe data
  • Federated routing across providers

Related concepts worth reviewing:

  • Graph algorithms (Dijkstra, A*)
  • Spatial indexing concepts (R-tree, quadtree)
  • Distributed systems fundamentals
  • CDN caching strategies

| Term | Definition |
|---|---|
| CH (Contraction Hierarchies) | Preprocessing technique that creates shortcuts to speed up routing queries |
| MVT (Mapbox Vector Tile) | Protocol buffer format for encoding map features as vectors |
| FCD (Floating Car Data) | Anonymized GPS traces from vehicles used for traffic estimation |
| Supersegment | Group of adjacent road segments with shared traffic patterns (GNN concept) |
| Map Matching | Algorithm to assign GPS probes to road network segments |
| Web Mercator | Map projection (EPSG:3857) used by most web maps |

Key takeaways:

  • Tile system: Quadtree pyramid with vector tiles (MVT), CDN-cached, 20-30 tiles per viewport
  • Routing: Contraction Hierarchies with 163µs query time, traffic as edge weight overlay
  • Traffic: FCD from probes, 2-minute aggregation windows, GNN-based ETA prediction
  • Geocoding: Address parsing + spatial indexing (R-tree/S2), autocomplete via pruning radix trie
  • Offline: SQLite-stored tiles and graph subset, delta updates reduce bandwidth 75%
  • Scale: 1.7M RPS tiles, 70K RPS routing at peak