Design a URL Shortener: IDs, Storage, and Scale

A URL shortener looks like a one-line problem — store a code → URL row, return a redirect — but the constraints that matter are the ones that surface only at scale: collision-free IDs across many writers, single-digit-millisecond redirects across the planet, hot-key behavior on viral links, and an analytics path that never blocks the redirect. This article walks the design end-to-end at roughly Bitly’s order of magnitude (≈6 billion clicks/month as of 2014, growing past 9 billion by 2017)¹, with each major decision tied to its underlying constraint and trade-off.

High-level architecture: CDN absorbs cached redirects, the redirect path stays free of analytics work, and a separate shortening path handles writes and security scanning.

Mental model

Hold three ideas in mind for the rest of the article:

Read path and write path are different services with different SLOs. The redirect path is single-digit-millisecond and absorbs the entire fan-out of viral traffic. The shortening path runs writes, validation, and malware scanning, and is happy with hundreds of milliseconds.
The short code is the only thing that matters in the hot path. Once you’ve decided how it’s generated, every other choice (storage layout, caching tier, sharding key) follows from that one.
Analytics can never block redirects. Anything you want to count goes onto a queue, period. Counters in the primary store are a footgun (more on Cassandra COUNTER below).

The core decisions, in one table:

Decision	Choice	Why
ID generation	KGS for random codes + Snowflake fallback	Pre-allocated codes guarantee zero collisions; Snowflake handles batch / programmatic writes
Encoding	Base62 (`0-9 A-Z a-z`)	URL-safe, compact, no `+`/`/` to escape
Primary store	Wide-column KV (Cassandra / ScyllaDB / DynamoDB)	Partition-key lookups are O(1); read-heavy 100:1 fits the model
Cache tier	CDN edge → Redis cluster → primary store	CDN absorbs viral spikes; Redis catches everything else
Redirect status	`302 Found` with `Cache-Control: public, max-age=60`	`302` is not heuristically cacheable per RFC 9111; an explicit `max-age` on a response that shared caches may store (`public`, not `private`) keeps CDN absorption while preserving analytics fidelity²
Analytics	Fire-and-forget → Kafka → ClickHouse	Decouples redirect SLO from analytics throughput
Sharding	Consistent hashing on `short_code`	Minimal data movement during cluster changes
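The sharding row is easier to reason about with a concrete ring. The following is a minimal sketch of consistent hashing with virtual nodes, not the partitioner any particular store uses; the `HashRing` class, the node names, and the choice of 100 virtual nodes per node are illustrative assumptions.

```typescript
import { createHash } from "node:crypto"

// Map a string to a 32-bit ring position (first 4 bytes of its SHA-256 digest).
function ringPosition(value: string): number {
  return createHash("sha256").update(value).digest().readUInt32BE(0)
}

class HashRing {
  private points: { pos: number; node: string }[] = []
  private readonly vnodes: number

  constructor(nodes: string[], vnodes = 100) {
    this.vnodes = vnodes
    for (const node of nodes) this.addNode(node)
  }

  // Each physical node contributes `vnodes` points to smooth out the distribution.
  addNode(node: string): void {
    for (let i = 0; i < this.vnodes; i++) {
      this.points.push({ pos: ringPosition(`${node}#${i}`), node })
    }
    this.points.sort((a, b) => a.pos - b.pos)
  }

  // Owner is the first point clockwise from the key; wrap around at the end.
  // A linear scan keeps the sketch short; a binary search is the usual choice.
  nodeFor(shortCode: string): string {
    const pos = ringPosition(shortCode)
    const hit = this.points.find((p) => p.pos >= pos) ?? this.points[0]
    return hit.node
  }
}
```

Adding a fourth node to a three-node ring should move only the keys that the new node takes over (about a quarter of them); every other key keeps its owner.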

Trade-offs you are explicitly accepting:

302 increases origin load relative to 301’s heuristic caching, in exchange for accurate per-click analytics.
A pre-generated key service (KGS) costs storage and operational attention, in exchange for zero collisions.
Async analytics means dashboards lag by 1–5 seconds.
A bloom filter in front of the primary store eliminates most cache-stampede traffic for non-existent codes, at the cost of a small fixed memory footprint and a tunable false-positive rate (see Redirect path).

Requirements

Functional

Requirement	Priority	Notes
Shorten long URLs	Core	Generate a unique short code for any well-formed URL
Redirect short URLs	Core	Return `302` with destination in `Location`
Custom short codes	Core	User-specified aliases (e.g. `suj.ee/launch-2026`)
Link expiration	Core	TTL-based or click-limit
Click analytics	Core	Count, geo, device, referrer
Link management	Extended	Edit destination, disable links
Bulk shortening	Extended	Batch API with job tracking
QR code generation	Extended	On-demand QR for any short URL

Non-functional

Requirement	Target	Why this number
Availability	99.99 %	Short links are embedded in emails, posts, QR codes; a link that fails during an outage cannot be fixed at the place it was published
Redirect latency	p99 < 50 ms	Below the human-perceptible threshold and SEO-relevant for crawlers
Write latency	p99 < 200 ms	Acceptable for an interactive `POST /urls`
Read throughput	100k RPS sustained	Headroom for viral amplification
Write throughput	1k RPS sustained	Most traffic is reads
Durability	11 nines	Equivalent to S3 standard; lost mappings are unrecoverable from the origin
URL lifetime	≥ 5 years default	Permalinks for content; tunable per link

Scale estimation

Working at roughly Bitly’s 2014 order of magnitude:

100:1 read:write ratio. A single viral link can produce 80 % of daily traffic in minutes, so plan for it.
1 M new URLs/day, 100 M redirects/day → ~1.2 k RPS average, ~12 k peak, 100 k+ during a spike.
5-year storage: 1 M URLs/day × 365 × 5 × ~500 B ≈ 0.9 TB of mappings, plus 100 M clicks/day × 200 B ≈ 7 TB/year of analytics.

The numbers say two things: storage is cheap, and the engineering problem is read latency under bursty traffic, not raw capacity.
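The arithmetic behind these estimates is short enough to check in code. This sketch uses the row sizes quoted above (about 500 B per mapping, about 200 B per click event); the function name and return shape are illustrative.

```typescript
const SECONDS_PER_DAY = 86_400

function estimateScale(urlsPerDay: number, redirectsPerDay: number) {
  return {
    avgWriteRps: urlsPerDay / SECONDS_PER_DAY,
    avgReadRps: redirectsPerDay / SECONDS_PER_DAY,
    // 5 years of mappings at ~500 B per row, in decimal terabytes
    mappingTB5y: (urlsPerDay * 365 * 5 * 500) / 1e12,
    // 1 year of click events at ~200 B per event, in decimal terabytes
    clickTBPerYear: (redirectsPerDay * 365 * 200) / 1e12,
  }
}

const estimate = estimateScale(1_000_000, 100_000_000)
console.log(estimate)
```

With the inputs above, the average read rate comes out near 1.2 k RPS, the mapping storage near 0.9 TB, and the analytics volume near 7.3 TB per year.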

ID generation: four real options

Picking how you mint short codes determines almost everything downstream — coordination model, collision risk, code length, and whether the design is horizontally scalable at all.

Option A — Auto-increment counter

A relational BIGINT IDENTITY column hands out IDs; the application Base62-encodes the value.

Counter-based shortening: a single SQL row hands out the next ID; Base62 encoding turns it into the short code.

Pros: trivially correct, compact codes (sequential IDs Base62-encode to short strings), single source of truth.

Cons: a single writer is a single point of failure and a hard ceiling on write throughput; the IDs are guessable, exposing total link counts and enabling enumeration scraping. Workable for an internal tool or an MVP, indefensible at Bitly scale.

Option B — Hash-based generation

Hash the long URL (MD5/SHA-256), truncate to the desired length, retry on collision.

Hash-based shortening: hash → truncate → collision check → store or rehash with a salt.
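A minimal sketch of the hash, truncate, and rehash-with-salt loop described above. The in-memory `taken` map stands in for the real store, and the `salt:url` input format and the five-attempt cap are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto"

const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// Hash the salted URL, keep the first 64 bits, and reduce into the Base62 keyspace.
function hashToCode(longUrl: string, salt: number, length = 7): string {
  const digest = createHash("sha256").update(`${salt}:${longUrl}`).digest()
  let n = digest.readBigUInt64BE(0) % (62n ** BigInt(length))
  let code = ""
  for (let i = 0; i < length; i++) {
    code = ALPHABET[Number(n % 62n)] + code
    n /= 62n
  }
  return code
}

// `taken` maps short code -> long URL and stands in for the primary store.
function shorten(longUrl: string, taken: Map<string, string>, maxAttempts = 5): string {
  for (let salt = 0; salt < maxAttempts; salt++) {
    const code = hashToCode(longUrl, salt)
    const owner = taken.get(code)
    if (owner === undefined) {
      taken.set(code, longUrl)
      return code
    }
    if (owner === longUrl) return code // same URL seen before: natural dedup
    // Otherwise a different URL owns this code: retry with the next salt.
  }
  throw new Error("too many collisions")
}
```

Each retry costs one extra lookup against the store, which is the round trip the collision discussion refers to.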

Pros: deterministic — same long URL always produces the same short code, enabling natural deduplication. No central coordinator.

Cons: collisions are inevitable. With a 7-character Base62 keyspace ($N = 62^7 \approx 3.5 \times 10^{12}$), the expected number of collisions across $n$ writes is approximately $n^2 / (2N)$: roughly 142 k collisions at 1 B URLs (about one write in 7 000) and 14 M at 10 B (about one in 700). Each collision means an extra round trip to the store and a salt-and-rehash retry. It also rules out user-chosen custom codes since they’d have to fit the same scheme.
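The collision figures quoted above agree with the standard birthday-problem approximation: about n²/(2N) expected collisions for n writes into a keyspace of N codes. The sketch below assumes that approximation and evaluates it for the 7-character keyspace.

```typescript
// Birthday-problem approximation; valid while writes are far below the keyspace size.
function expectedCollisions(writes: number, keyspace: number): number {
  return (writes * writes) / (2 * keyspace)
}

const KEYSPACE_7 = 62 ** 7 // 3_521_614_606_208 codes

const at1B = expectedCollisions(1e9, KEYSPACE_7)
const at10B = expectedCollisions(1e10, KEYSPACE_7)
console.log(Math.round(at1B), Math.round(at10B))
```

Dividing the write count by the result gives the "one write in N" ratio: about 7 000 at 1 B URLs and about 700 at 10 B.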

Option C — Snowflake (distributed time-ordered IDs)

Twitter’s Snowflake³ packs a 64-bit ID as [1 bit unused][41 bit timestamp ms since epoch][10 bit worker ID][12 bit sequence], yielding 4 096 IDs per millisecond per node across up to 1 024 nodes: about 4.1 M IDs/s per node, or roughly 4.2 billion IDs/s in aggregate⁴.

Snowflake ID generation per datacenter: each app server holds a generator instance with a unique node ID; Base62 encoding produces the user-visible short code.

Bits	Field	Purpose
41	Timestamp	Milliseconds since a chosen epoch (≈ 69 years headroom)
10	Node ID	1 024 unique generators (datacenter + worker)
12	Sequence	4 096 IDs/ms/node

Pros: no coordination, time-ordered (good for range scans and analytics), proven at Twitter and adopted by Discord for snowflake-style message IDs⁵.

Cons: Base62-encoding a 64-bit integer produces an 11-character code, longer than what KGS or counter approaches yield. Operationally you must (a) prevent two nodes from ever sharing a node ID and (b) survive backwards clock jumps — the canonical implementations refuse to issue IDs until the clock recovers, which is a real availability event.

Note

The epoch in a Snowflake implementation is arbitrary as long as it predates the first issued ID. Twitter chose 1288834974657 ms (2010-11-04T01:42:54.657 UTC)³. Pick your own and document it; you cannot change it later without remapping every existing ID.
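To make the bit layout concrete, here is a small sketch that packs and unpacks the three fields from the table. The helper names are illustrative; only the field widths and the Twitter epoch come from the text above.

```typescript
const TWITTER_EPOCH = 1288834974657n // ms; from the note above

// Layout: [41-bit ms since epoch][10-bit node ID][12-bit sequence]
function composeSnowflake(msSinceEpoch: bigint, nodeId: number, sequence: number): bigint {
  return (msSinceEpoch << 22n) | (BigInt(nodeId) << 12n) | BigInt(sequence)
}

function decodeSnowflake(id: bigint, epoch: bigint = TWITTER_EPOCH) {
  return {
    timestampMs: (id >> 22n) + epoch,      // wall-clock milliseconds
    nodeId: Number((id >> 12n) & 0x3ffn),  // 10 bits
    sequence: Number(id & 0xfffn),         // 12 bits
  }
}

// 41 bits of milliseconds expressed in years
const headroomYears = Number(1n << 41n) / (1000 * 60 * 60 * 24 * 365.25)
console.log(decodeSnowflake(composeSnowflake(1000n, 5, 7), 0n), headroomYears)
```

The headroom works out to a little under 70 years, matching the figure in the table.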

Option D — Pre-generated Key Generation Service (KGS)

An offline process generates all short codes in advance and stores them in a keys_unused table. Application servers fetch batches into a local buffer and atomically move codes from unused → allocated → used.

KGS lifecycle: an offline generator fills the unused-keys table; app servers fetch batches into local caches and mark codes used on assignment.

Pros: zero collisions by construction; codes are short and configurable in length; custom user-chosen codes can be reserved out of the same pool.

Cons: storing the entire 6-character Base62 keyspace ($62^6 \approx 56.8$ billion codes) at 6 B per code is roughly 340 GB raw, which is manageable but not trivial. KGS becomes a critical dependency on the write path, and key exhaustion needs to be alerted on long before it happens. App-server crashes orphan whatever codes were in their local buffer; at scale this is acceptable wastage.

Comparison

Factor	Counter	Hash	Snowflake	KGS
Collision risk	None	Real	None	None
Coordination on write	Required	None	None	Batch fetch
Code length	Shortest	Fixed	11 chars	Configurable
Predictability	High	Low	Medium	Low
Horizontal scale	Poor	Good	Excellent	Good
Custom codes	Hard	Hard	Hard	Native
Operational burden	Low	Medium	Medium	High

This article focuses on a KGS-primary, Snowflake-fallback hybrid. KGS handles user-facing single-link creation (short codes, custom aliases). Snowflake covers programmatic / bulk writes where the longer code is acceptable and you do not want to pay a KGS round-trip.

High-level design

Figure: high-level design, split into the request path and the backing services.

Shortening service

Handles writes — validation, malware scanning, code assignment, persistence.

Decision	Choice	Why
Duplicate handling	Optional dedup, off by default	Tracking links want one code per click campaign; idempotent shortening would defeat that
URL validation	Format synchronously, reachability via async HEAD	Don’t block the user on a slow destination
Scanning	Synchronous for new domains, async for known-good	Fast path for the long tail of safe traffic
Custom codes	Reserved out of the KGS pool	Keeps the code space coherent; one source of truth
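The synchronous half of URL validation can be sketched as below. The 2 048-character cap and the rule that the hostname must contain a dot are assumptions added for illustration; the source only specifies that format is checked synchronously and reachability asynchronously.

```typescript
type Validation = { ok: true; url: URL } | { ok: false; reason: string }

function validateLongUrl(input: string, maxLength = 2048): Validation {
  if (input.length > maxLength) return { ok: false, reason: "too long" }

  let url: URL
  try {
    url = new URL(input) // throws on malformed input
  } catch {
    return { ok: false, reason: "malformed" }
  }

  // Only web destinations; rejects javascript:, data:, file:, and similar schemes.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return { ok: false, reason: "unsupported scheme" }
  }
  // Assumption: require a dotted hostname so that "localhost" is not shortenable.
  if (!url.hostname.includes(".")) {
    return { ok: false, reason: "hostname not fully qualified" }
  }
  return { ok: true, url }
}
```

Reachability is deliberately absent here: a slow destination should delay a background job, not the `POST /urls` response.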

Redirect path

The 99 % case. Must be cheap, predictable, and never block on analytics or scanning.

Redirect sequence: CDN edge cache → Redis hot cache → primary store fallback; analytics is fire-and-forget at every layer.

Why each layer is there:

CDN edge (CloudFront / Fastly / Cloudflare) caches the 302 for the configured max-age. A viral link that goes from 0 → 100 k RPS gets absorbed entirely at the edge after the first hit per POP.
Redis cluster caches everything that escapes the CDN. LRU eviction; a TTL of an hour or so keeps memory bounded.
Bloom filter in Redis keeps cache-stampede-style attacks (spraying random codes that don’t exist) from reaching the primary store. The trade-off is a fixed-cost false-positive rate. For a 1 B-item bloom filter at a 0.1 % false-positive rate, the canonical formula gives ≈ 14.4 bits/element and roughly 1.7 GB of memory⁶.
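The bloom-filter sizing can be reproduced from the standard formulas: bits per item = −ln(p) / (ln 2)², and optimal hash count = bits per item × ln 2. The helper below is a sketch of that arithmetic only, not of any particular library's sizing logic.

```typescript
function bloomSizing(items: number, falsePositiveRate: number) {
  const bitsPerItem = -Math.log(falsePositiveRate) / (Math.LN2 * Math.LN2)
  const hashFunctions = Math.round(bitsPerItem * Math.LN2)
  const totalBytes = Math.ceil((bitsPerItem * items) / 8)
  return { bitsPerItem, hashFunctions, totalBytes }
}

const sizing = bloomSizing(1_000_000_000, 0.001)
console.log(sizing.bitsPerItem.toFixed(2), sizing.hashFunctions, (sizing.totalBytes / 2 ** 30).toFixed(2))
```

For 1 B items at 0.1 % this gives about 14.4 bits per item, 10 hash functions, and about 1.8 GB (1.67 GiB), in line with the figure quoted above.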

Important

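Use 302 Found with Cache-Control: public, max-age=60. RFC 9111 makes 301 heuristically cacheable by default, so caches may keep it for as long as they see fit; 302 is only cacheable when you opt in via explicit freshness headers². The response also has to be storable by shared caches for the CDN to help: the private directive limits storage to the browser's own cache, so use public (or s-maxage). The explicit max-age lets the CDN absorb spikes for a minute while keeping clicks observable.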

Key Generation Service (KGS)

KGS internals: an offline generator fills the unused-keys table; the API distributes batches into per-app-server queues; used keys are tracked separately for forensics.

Allocation flow:

Each app server requests a batch (typically 1 000 codes).
The KGS atomically moves the batch from unused → allocated keyed by the requesting server’s instance ID.
The app server holds the batch in memory and assigns codes locally.
On code use, the row moves from allocated → used.
On graceful shutdown, the server returns its unused tail; on a crash, those codes are orphaned. At scale, a few thousand orphaned codes per crash is irrelevant against billions of available codes.
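The app-server side of this flow can be sketched as a small local buffer. `FetchBatch` is a hypothetical stand-in for the KGS batch call, and the low-watermark refill is an assumption layered on top of the flow above.

```typescript
// Hypothetical KGS call: returns `size` codes already moved to "allocated".
type FetchBatch = (serverId: string, size: number) => Promise<string[]>

class LocalKeyBuffer {
  private codes: string[] = []
  private refill: Promise<void> | null = null
  private readonly serverId: string
  private readonly fetchBatch: FetchBatch
  private readonly batchSize: number
  private readonly lowWatermark: number

  constructor(serverId: string, fetchBatch: FetchBatch, batchSize = 1000, lowWatermark = 200) {
    this.serverId = serverId
    this.fetchBatch = fetchBatch
    this.batchSize = batchSize
    this.lowWatermark = lowWatermark
  }

  // Hand out one code; refill in the background once the buffer runs low.
  async next(): Promise<string> {
    if (this.codes.length <= this.lowWatermark) this.startRefill()
    if (this.codes.length === 0 && this.refill) await this.refill
    const code = this.codes.pop()
    if (code === undefined) throw new Error("KGS unavailable and local buffer is empty")
    return code
  }

  // On graceful shutdown, return the unused tail to the KGS.
  drain(): string[] {
    const rest = this.codes
    this.codes = []
    return rest
  }

  private startRefill(): void {
    if (this.refill) return // a fetch is already in flight
    this.refill = this.fetchBatch(this.serverId, this.batchSize)
      .then((batch) => { this.codes.push(...batch) })
      .catch(() => { /* KGS down: keep serving from whatever is buffered */ })
      .finally(() => { this.refill = null })
  }
}
```

A crash simply loses whatever `codes` held at the time, which is the orphaning the last step accepts.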

Failure handling:

KGS unavailable. App servers continue serving from their local buffers. Sized for ~1 hour of writes, this absorbs short outages; longer than that, the shortening API starts returning 503 while redirects keep working from cache and the primary store.
Key exhaustion. Alert at 80 % of the unused pool consumed; trigger background generation. Never let the remaining pool shrink below what the write path will consume in the time it takes to provision more codes.

Analytics collector

Click data is captured outside the redirect path. The redirect handler emits a fire-and-forget event; everything downstream is best-effort with no SLO impact on the redirect.

```typescript
interface ClickEvent {
  shortCode: string
  timestamp: number

  // Captured at edge
  ipHash: string         // SHA-256 with rotating salt; never store raw IP for >1 day
  userAgent: string
  referer: string | null

  // Enriched downstream
  country: string
  city: string
  deviceType: "mobile" | "desktop" | "tablet"
  browser: string
  os: string

  isBot: boolean
  botType: string | null
}
```

Pipeline:

Redirect service appends to an in-memory buffer; flushed to Kafka in batches of 100 or every second.
Stream processor enriches events (geo-IP, UA parsing, bot detection).
Enriched events land in ClickHouse via batched inserts.
A real-time counter in Redis is updated for the API to read without hitting ClickHouse.

Caution

Do not put click counters in the primary store as Cassandra COUNTER columns. Counter writes are not idempotent; on a write timeout the client cannot tell whether the increment succeeded, so a retry over-counts and a non-retry under-counts⁷. Counter columns also can’t share a table with non-counter data, can’t have TTLs, and can’t be part of a primary key. Keep “fresh” counters in Redis (atomic INCR) and authoritative aggregates in ClickHouse.

API design

Create short URL

POST /api/v1/urls

```json
{
  "url": "https://example.com/very/long/path?with=params",
  "customCode": "launch-2026",
  "expiresAt": "2027-12-31T23:59:59Z",
  "password": "optional-password",
  "maxClicks": 1000,
  "tags": ["campaign-2026", "social"]
}
```

```json
{
  "id": "url_abc123def456",
  "shortCode": "launch-2026",
  "shortUrl": "https://suj.ee/launch-2026",
  "longUrl": "https://example.com/very/long/path?with=params",
  "createdAt": "2026-04-21T10:00:00Z",
  "expiresAt": "2027-12-31T23:59:59Z",
  "isPasswordProtected": true,
  "maxClicks": 1000,
  "clickCount": 0,
  "qrCode": "https://suj.ee/api/v1/urls/url_abc123def456/qr"
}
```

Code	Error	When
400	`INVALID_URL`	Malformed URL or unsupported scheme (reachability is checked asynchronously and does not fail the request)
400	`INVALID_CUSTOM_CODE`	Code contains disallowed characters
409	`CODE_TAKEN`	Custom code already in use
403	`URL_BLOCKED`	Destination flagged by scanner
429	`RATE_LIMITED`	Too many requests

Plan	Create/hour	Create/day
Free	50	500
Pro	500	5 000
Enterprise	5 000	Unlimited

Redirect

GET /{shortCode} returns:

```http
HTTP/1.1 302 Found
Location: https://example.com/very/long/path
Cache-Control: public, max-age=60
X-Robots-Tag: noindex
```

Code	When
404	Short code not found
410	Link expired or disabled
429	Click limit exceeded
403	Password required (renders HTML form)
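The two status tables can be folded into one decision function. This is a sketch: the `LinkState` shape and the order in which the checks run are assumptions, since the source lists the codes but not their precedence.

```typescript
interface LinkState {
  isActive: boolean
  expiresAt: Date | null
  maxClicks: number | null
  clickCount: number
  passwordHash: string | null
}

// Returns the HTTP status for GET /{shortCode}.
function redirectStatus(link: LinkState | null, now: Date, passwordOk = false): number {
  if (link === null) return 404                                        // unknown code
  if (!link.isActive) return 410                                       // disabled
  if (link.expiresAt !== null && link.expiresAt <= now) return 410     // expired
  if (link.maxClicks !== null && link.clickCount >= link.maxClicks) return 429
  if (link.passwordHash !== null && !passwordOk) return 403            // render password form
  return 302
}
```

Checking expiry and disablement before the password keeps a dead link from prompting for credentials it can never use.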

Analytics

GET /api/v1/urls/{id}/analytics

Param	Type	Default	Notes
period	string	`7d`	One of `24h`, `7d`, `30d`, `90d`, `custom`
startDate	ISO8601	—	Required when `period=custom`
endDate	ISO8601	—	Required when `period=custom`
groupBy	string	`day`	`hour`, `day`, `week`, `month`

```json
{
  "urlId": "url_abc123def456",
  "period": { "start": "2026-04-14T00:00:00Z", "end": "2026-04-21T23:59:59Z" },
  "summary": { "totalClicks": 15420, "uniqueClicks": 12350, "botClicks": 1230 },
  "timeSeries": [
    { "date": "2026-04-14", "clicks": 2100, "unique": 1800 },
    { "date": "2026-04-15", "clicks": 2450, "unique": 2100 }
  ],
  "topReferrers": [
    { "referrer": "x.com", "clicks": 5200, "percentage": 33.7 },
    { "referrer": "linkedin.com", "clicks": 3100, "percentage": 20.1 }
  ],
  "topCountries": [
    { "country": "US", "clicks": 6800, "percentage": 44.1 },
    { "country": "UK", "clicks": 2300, "percentage": 14.9 }
  ],
  "devices": {
    "mobile":  { "clicks": 9200, "percentage": 59.7 },
    "desktop": { "clicks": 5800, "percentage": 37.6 },
    "tablet":  { "clicks":  420, "percentage":  2.7 }
  }
}
```

Bulk and listing

POST /api/v1/urls/bulk returns 202 Accepted with a job ID; the caller polls /api/v1/jobs/{id}. GET /api/v1/urls?cursor=...&limit=50 returns paginated user URLs with cursor-based navigation. Cursor pagination is deliberately chosen over offset because the underlying urls_by_user partition is sorted on created_at DESC.
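A cursor for this listing only needs the clustering key of the last row returned. The sketch below assumes an opaque base64url-encoded JSON token and a "fetch limit + 1 rows" convention for detecting a next page; neither is specified in the source.

```typescript
interface Cursor {
  createdAt: string // ISO 8601 timestamp of the last row on the page
  shortCode: string // tie-breaker for rows created in the same millisecond
}

function encodeCursor(cursor: Cursor): string {
  return Buffer.from(JSON.stringify(cursor), "utf8").toString("base64url")
}

function decodeCursor(token: string): Cursor {
  const parsed = JSON.parse(Buffer.from(token, "base64url").toString("utf8"))
  if (typeof parsed?.createdAt !== "string" || typeof parsed?.shortCode !== "string") {
    throw new Error("invalid cursor")
  }
  return { createdAt: parsed.createdAt, shortCode: parsed.shortCode }
}

// `rows` is the result of querying limit + 1 rows, newest first.
function buildPage<T extends Cursor>(rows: T[], limit: number) {
  const items = rows.slice(0, limit)
  const hasMore = rows.length > limit
  const last = items[items.length - 1]
  return {
    items,
    nextCursor: hasMore && last
      ? encodeCursor({ createdAt: last.createdAt, shortCode: last.shortCode })
      : null,
  }
}
```

Because the cursor maps directly onto the clustering key, the next query is a range read that starts where the previous page ended, whereas an offset would have to read and discard every skipped row.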

Data modeling

URL mappings (Cassandra / ScyllaDB / DynamoDB)

The primary table is optimized for redirect lookups by short_code. Cassandra’s LeveledCompactionStrategy is the right choice here: read-heavy workloads benefit from at-most-one-SSTable-per-level guarantees, at the cost of higher write amplification⁸.

```sql
CREATE TABLE url_mappings (
  short_code TEXT,
  long_url   TEXT,
  user_id    UUID,
  created_at TIMESTAMP,
  expires_at TIMESTAMP,
  is_active  BOOLEAN,
  password_hash TEXT,
  max_clicks INT,
  metadata MAP<TEXT, TEXT>,
  PRIMARY KEY (short_code)
) WITH default_time_to_live = 157680000  -- 5 years
  AND compaction = {'class': 'LeveledCompactionStrategy'}
  AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};
```

Note: click_count is not in this table. Counter columns force a separate counter-only table and bring the consistency footguns called out earlier; we keep the source of truth in ClickHouse and a fresh value in Redis.

A secondary table keyed by user supports the dashboard:

```sql
CREATE TABLE urls_by_user (
  user_id    UUID,
  created_at TIMESTAMP,
  short_code TEXT,
  long_url   TEXT,
  click_count BIGINT,
  is_active  BOOLEAN,
  PRIMARY KEY ((user_id), created_at, short_code)
-- Every clustering column is listed explicitly in the ordering clause.
) WITH CLUSTERING ORDER BY (created_at DESC, short_code ASC);
```

This is a denormalized companion table written by the shortening service; the click_count here is a periodically refreshed cache from ClickHouse, not a live counter.

Why a wide-column store at all? O(1) lookups by partition key, horizontal scaling, tunable consistency (read ONE for redirects is fine; write QUORUM for shortening), and built-in TTL for expiration. ScyllaDB is a drop-in replacement that trades JVM dependence for a shard-per-core C++ runtime and reports lower p99 tail latencies in vendor benchmarks⁹. DynamoDB is the AWS-native option with serverless billing; it enforces hard per-partition limits (3 000 RCU / 1 000 WCU) but uses adaptive capacity (“split for heat”) to redistribute hot partitions automatically — important for viral links¹⁰.

Click analytics (ClickHouse)

Raw click events land in a MergeTree partitioned by month; query workloads are served by materialized views.

```sql
CREATE TABLE clicks (
  short_code String,
  clicked_at DateTime64(3),

  ip_hash    FixedString(32),  -- 32 hex chars: first 128 bits of the salted SHA-256
  country    LowCardinality(String),
  city       String,

  device_type Enum8('mobile' = 1, 'desktop' = 2, 'tablet' = 3),
  browser     LowCardinality(String),
  os          LowCardinality(String),

  referrer_domain LowCardinality(String),
  referrer_path   String,

  is_bot   UInt8,
  bot_type LowCardinality(String),

  date Date MATERIALIZED toDate(clicked_at),
  hour UInt8 MATERIALIZED toHour(clicked_at)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(clicked_at)
ORDER BY (short_code, clicked_at)
-- Cast to DateTime so the TTL expression is accepted regardless of DateTime64 support.
TTL toDateTime(clicked_at) + INTERVAL 1 YEAR;
```

For sums (count()), SummingMergeTree is the right materialized-view target. For unique counts, use AggregatingMergeTree with uniqState() and read with uniqMerge() (and countMerge() for the countState() column below). A finalized uniqExact() value stored in a SummingMergeTree is just a number: merges add the per-insert unique counts together, so any visitor who appears in more than one insert block is quietly counted more than once¹¹:

```sql
CREATE MATERIALIZED VIEW clicks_daily_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (short_code, date, country, device_type)
AS SELECT
  short_code,
  date,
  country,
  device_type,
  countState()      AS clicks,
  uniqState(ip_hash) AS unique_clicks
FROM clicks
GROUP BY short_code, date, country, device_type;
```

ClickHouse strengths used here: columnar storage with LowCardinality dictionary encoding for the low-cardinality string columns (country, browser, OS), MergeTree partition pruning by month for time-range queries, and TTL-driven retention. The usual guidance is to keep the total partition count in the tens to low hundreds, and toYYYYMM stays well inside that range¹².

Users and configuration (PostgreSQL)

Relational data — accounts, custom domains, API keys — lives in PostgreSQL because it actually benefits from ACID transactions and joins.

```sql
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email TEXT UNIQUE NOT NULL,
  password_hash TEXT NOT NULL,
  plan TEXT DEFAULT 'free',
  api_key_hash TEXT UNIQUE,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE custom_domains (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id),
  domain TEXT UNIQUE NOT NULL,
  is_verified BOOLEAN DEFAULT false,
  ssl_status TEXT DEFAULT 'pending',
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE api_keys (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id),
  key_hash TEXT UNIQUE NOT NULL,
  name TEXT,
  permissions JSONB DEFAULT '["read", "write"]',
  last_used_at TIMESTAMPTZ,
  expires_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
```

KGS storage (PostgreSQL)

The KGS lives in PostgreSQL because its work is small-volume, transactional, and benefits from FOR UPDATE SKIP LOCKED for collision-free batch allocation:

```sql
CREATE TABLE keys_unused (
  short_code TEXT PRIMARY KEY,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE keys_allocated (
  short_code   TEXT PRIMARY KEY,
  allocated_to TEXT NOT NULL,
  allocated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE keys_used (
  short_code TEXT PRIMARY KEY,
  used_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE OR REPLACE FUNCTION allocate_keys(server_id TEXT, batch_size INT)
RETURNS TABLE(short_code TEXT) AS $$
BEGIN
  RETURN QUERY
  WITH allocated AS (
    DELETE FROM keys_unused
    -- Qualified on purpose: a bare short_code is ambiguous with the
    -- RETURNS TABLE output column inside PL/pgSQL.
    WHERE keys_unused.short_code IN (
      SELECT ku.short_code
      FROM keys_unused ku
      LIMIT batch_size
      FOR UPDATE SKIP LOCKED
    )
    RETURNING keys_unused.short_code
  )
  INSERT INTO keys_allocated (short_code, allocated_to)
  SELECT a.short_code, server_id
  FROM allocated a
  RETURNING keys_allocated.short_code;
END;
$$ LANGUAGE plpgsql;
```

FOR UPDATE SKIP LOCKED is the key primitive — concurrent allocators don’t block on each other’s batches, they each grab the next available rows.

Selection matrix

Data	Store	Why
URL mappings	Cassandra / ScyllaDB / DynamoDB	O(1) by partition key, horizontal scale, TTL
Click events	ClickHouse	Columnar, compression, sub-second aggregates
User accounts	PostgreSQL	ACID, joins, native UUIDs, plenty fast at this size
KGS keys	PostgreSQL	Transactional batch allocation with `SKIP LOCKED`
Hot URL cache	Redis	Sub-ms `GET`, TTL, atomic operations
Rate limits	Redis	Atomic counters, sliding-window via sorted sets
Bloom filter	Redis (`RedisBloom`)	`BF.MEXISTS` membership for cheap negative lookups

Low-level design

Base62 encoder

```typescript
const CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
const BASE = BigInt(62)

export function encodeBase62(num: bigint): string {
  if (num === 0n) return CHARSET[0]
  let result = ""
  while (num > 0n) {
    result = CHARSET[Number(num % BASE)] + result
    num = num / BASE
  }
  return result
}

export function decodeBase62(str: string): bigint {
  let result = 0n
  for (const char of str) {
    const index = CHARSET.indexOf(char)
    if (index === -1) throw new Error(`Invalid character: ${char}`)
    result = result * BASE + BigInt(index)
  }
  return result
}

export function encodeBase62Padded(num: bigint, length: number): string {
  return encodeBase62(num).padStart(length, "0")
}
```

Length	Combinations	Comfortable up to
6	56.8 B	A small public service
7	3.5 T	Bitly-class scale
8	218 T	Internet-scale headroom
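The keyspace sizes in the table follow directly from 62 raised to the code length. The sketch below recomputes them and adds a rough time-to-exhaustion figure; the `yearsToExhaust` helper and its even-consumption assumption are illustrative.

```typescript
function keyspace(length: number): bigint {
  return 62n ** BigInt(length)
}

// Assumes a constant, whole number of new codes per day and no wastage.
function yearsToExhaust(length: number, codesPerDay: number): number {
  return Number(keyspace(length) / BigInt(codesPerDay)) / 365
}

for (const length of [6, 7, 8]) {
  console.log(length, keyspace(length).toString(), yearsToExhaust(length, 1_000_000).toFixed(0))
}
```

At 1 M new URLs per day, even the 6-character keyspace lasts on the order of 150 years, so code length is driven more by enumeration resistance and wastage than by raw exhaustion.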

Snowflake generator

```typescript
const EPOCH = 1735689600000n // 2025-01-01T00:00:00Z; choose your own and never change it.
const NODE_BITS = 10n
const SEQUENCE_BITS = 12n

const MAX_NODE_ID = (1n << NODE_BITS) - 1n
const MAX_SEQUENCE = (1n << SEQUENCE_BITS) - 1n
const NODE_SHIFT = SEQUENCE_BITS
const TIMESTAMP_SHIFT = SEQUENCE_BITS + NODE_BITS

export class SnowflakeGenerator {
  private nodeId: bigint
  private sequence: bigint = 0n
  private lastTimestamp: bigint = -1n

  constructor(nodeId: number) {
    if (nodeId < 0 || BigInt(nodeId) > MAX_NODE_ID) {
      throw new Error(`Node ID must be between 0 and ${MAX_NODE_ID}`)
    }
    this.nodeId = BigInt(nodeId)
  }

  generate(): bigint {
    let timestamp = BigInt(Date.now()) - EPOCH

    if (timestamp < this.lastTimestamp) {
      // Clock moved backwards (NTP step). Refuse rather than risk duplicate IDs.
      throw new Error("Clock moved backwards; refusing to issue an ID")
    }

    if (timestamp === this.lastTimestamp) {
      this.sequence = (this.sequence + 1n) & MAX_SEQUENCE
      if (this.sequence === 0n) {
        timestamp = this.waitNextMillis(this.lastTimestamp)
      }
    } else {
      this.sequence = 0n
    }

    this.lastTimestamp = timestamp
    return (timestamp << TIMESTAMP_SHIFT) | (this.nodeId << NODE_SHIFT) | this.sequence
  }

  private waitNextMillis(lastTimestamp: bigint): bigint {
    let timestamp = BigInt(Date.now()) - EPOCH
    while (timestamp <= lastTimestamp) {
      timestamp = BigInt(Date.now()) - EPOCH
    }
    return timestamp
  }
}
```

The clock-backwards check is the operationally-important detail: NTP slewing is fine, but a sudden step backward would issue duplicates. Throwing is the canonical Twitter-implementation choice; the alternative is a wait-loop that turns a clock event into a service availability event.

Redirect service with bloom filter

```typescript
import { BloomFilter } from "bloom-filters"

// RedisCluster, CassandraClient, AnalyticsCollector and RequestContext are
// application-level types defined elsewhere.

interface RedirectResult {
  found: boolean
  longUrl?: string
  isExpired?: boolean
  requiresPassword?: boolean
}

class RedirectService {
  private readonly bloomFilter: BloomFilter

  constructor(
    private readonly redis: RedisCluster,
    private readonly cassandra: CassandraClient,
    private readonly analytics: AnalyticsCollector,
  ) {
    // 1B items, 0.1% false-positive rate ≈ 14.4 bits/item ≈ 1.7 GB
    // The filter must be loaded with every issued code (at startup and on each
    // new write); an empty filter reports every code as missing.
    this.bloomFilter = BloomFilter.create(1_000_000_000, 0.001)
  }

  async redirect(shortCode: string, context: RequestContext): Promise<RedirectResult> {
    // Definite miss: the code was never issued, so skip Redis and Cassandra.
    if (!this.bloomFilter.has(shortCode)) {
      return { found: false }
    }

    const cached = await this.redis.hgetall(`url:${shortCode}`)
    if (cached && cached.long_url) {
      this.logClick(shortCode, context)
      return this.buildResult(cached)
    }

    const row = await this.cassandra.execute(
      "SELECT * FROM url_mappings WHERE short_code = ?",
      [shortCode],
    )
    if (!row || row.length === 0) {
      return { found: false } // Bloom-filter false positive
    }

    // Populate the hot cache; Redis hashes hold strings, so booleans become "1"/"0".
    const url = row[0]
    await this.redis.hset(`url:${shortCode}`, {
      long_url: url.long_url,
      expires_at: url.expires_at?.toISOString() || "",
      password_hash: url.password_hash || "",
      is_active: url.is_active ? "1" : "0",
    })
    await this.redis.expire(`url:${shortCode}`, 3600)

    this.logClick(shortCode, context)
    return this.buildResult(url)
  }

  // Accepts either a Redis hash (string fields) or a Cassandra row (typed fields).
  private buildResult(data: any): RedirectResult {
    if (data.is_active === "0" || data.is_active === false) return { found: false }
    if (data.expires_at && new Date(data.expires_at) < new Date()) {
      return { found: true, isExpired: true }
    }
    if (data.password_hash) {
      return { found: true, requiresPassword: true, longUrl: data.long_url }
    }
    return { found: true, longUrl: data.long_url }
  }

  // Fire-and-forget: never awaited, so analytics cannot delay the redirect.
  private logClick(shortCode: string, context: RequestContext): void {
    this.analytics
      .log({
        shortCode,
        timestamp: Date.now(),
        ip: context.ip,
        userAgent: context.userAgent,
        referer: context.referer,
      })
      .catch((err) => console.error("Analytics error:", err))
  }
}
```

Sliding-window rate limiter

A standard sorted-set rate limiter. The Lua script keeps the read-modify-write atomic, which matters under burst load:

```typescript
import { randomUUID } from "node:crypto"

interface RateLimitResult {
  allowed: boolean
  remaining: number
  resetAt: number
}

class SlidingWindowRateLimiter {
  constructor(private readonly redis: RedisCluster) {}

  async checkLimit(key: string, limit: number, windowMs: number): Promise<RateLimitResult> {
    const now = Date.now()
    const windowStart = now - windowMs
    // The member is generated client-side: Redis may seed math.random identically
    // on every script run, so two requests in the same millisecond could otherwise
    // produce the same member and be counted once.
    const member = `${now}:${randomUUID()}`

    const result = await this.redis.eval(
      `
      local key = KEYS[1]
      local now = tonumber(ARGV[1])
      local window_start = tonumber(ARGV[2])
      local limit = tonumber(ARGV[3])
      local window_ms = tonumber(ARGV[4])
      local member = ARGV[5]

      redis.call('ZREMRANGEBYSCORE', key, '-inf', window_start)
      local count = redis.call('ZCARD', key)

      if count < limit then
        redis.call('ZADD', key, now, member)
        redis.call('PEXPIRE', key, window_ms)
        return {1, limit - count - 1, now + window_ms}
      else
        local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
        local reset_at = oldest[2] + window_ms
        return {0, 0, reset_at}
      end
      `,
      [key],
      [now, windowStart, limit, windowMs, member],
    )

    return { allowed: result[0] === 1, remaining: result[1], resetAt: result[2] }
  }
}
```

URL scanner

```typescript
import { createHash } from "node:crypto"

// BlocklistService, WebRiskClient, VirusTotalClient and RedisCluster are
// application-level types defined elsewhere.

interface ScanResult {
  isSafe: boolean
  threats: string[]
  scanTime: number
}

class URLScanner {
  constructor(
    private readonly blocklist: BlocklistService,
    private readonly webRisk: WebRiskClient, // Google Web Risk API for commercial use
    private readonly virusTotal: VirusTotalClient,
    private readonly redis: RedisCluster,
    // Assumption: domain age in days comes from a WHOIS/RDAP lookup supplied by the caller.
    private readonly getDomainAge: (domain: string) => Promise<number>,
  ) {}

  async scan(url: string): Promise<ScanResult> {
    const urlHash = this.hashUrl(url)

    const cached = await this.redis.get(`scan:${urlHash}`)
    if (cached) return JSON.parse(cached)

    const domain = new URL(url).hostname
    const threats: string[] = []

    // Cheapest checks first: local blocklist, then the allowlist fast path.
    if (await this.blocklist.contains(domain)) {
      return this.cacheResult(urlHash, { isSafe: false, threats: ["blocklist"], scanTime: Date.now() })
    }

    if (await this.isKnownGood(domain)) {
      return this.cacheResult(urlHash, { isSafe: true, threats: [], scanTime: Date.now() })
    }

    const wrResult = await this.webRisk.lookup(url)
    if (wrResult.threats.length > 0) threats.push(...wrResult.threats)

    // Deeper (slower) scan only for recently registered domains.
    if (await this.isSuspicious(domain)) {
      const vtResult = await this.virusTotal.scan(url)
      if (vtResult.positives > 2) threats.push("malware")
    }

    return this.cacheResult(urlHash, { isSafe: threats.length === 0, threats, scanTime: Date.now() })
  }

  private hashUrl(url: string): string {
    return createHash("sha256").update(url).digest("hex")
  }

  // Safe verdicts are cached for a day, unsafe ones for an hour.
  private async cacheResult(hash: string, result: ScanResult): Promise<ScanResult> {
    const ttl = result.isSafe ? 86400 : 3600
    await this.redis.setex(`scan:${hash}`, ttl, JSON.stringify(result))
    return result
  }

  private async isKnownGood(domain: string): Promise<boolean> {
    return this.redis.sismember("domains:allowlist", domain)
  }

  private async isSuspicious(domain: string): Promise<boolean> {
    const domainAge = await this.getDomainAge(domain)
    return domainAge < 30
  }
}
```

Note

Google Safe Browsing v4 is restricted to non-commercial use. Commercial URL shorteners must use the Google Web Risk API; the Lookup endpoint is rate-limited to 6 000 requests/minute per project, which is the natural ceiling on synchronous scanning¹³.

Analytics pipeline

```typescript
import { createHash } from "node:crypto"

// KafkaProducer, ClickHouseClient, GeoIPService, DeviceParser, BotDetector and
// EnrichedClick are application-level types defined elsewhere.

interface ClickEvent {
  shortCode: string
  timestamp: number
  ip: string
  userAgent: string
  referer: string | null
}

class AnalyticsCollector {
  private readonly buffer: ClickEvent[] = []
  private readonly BUFFER_SIZE = 100
  private readonly FLUSH_INTERVAL = 1000

  constructor(private readonly kafka: KafkaProducer) {
    // Timer-driven flushes have no caller to report to, so log failures here.
    setInterval(() => {
      this.flush().catch((err) => console.error("Analytics flush error:", err))
    }, this.FLUSH_INTERVAL)
  }

  async log(event: ClickEvent): Promise<void> {
    this.buffer.push(event)
    if (this.buffer.length >= this.BUFFER_SIZE) await this.flush()
  }

  private async flush(): Promise<void> {
    if (this.buffer.length === 0) return
    const events = this.buffer.splice(0) // take everything; new events start a fresh batch
    await this.kafka.sendBatch({
      topic: "clicks",
      messages: events.map((e) => ({
        key: e.shortCode,
        value: JSON.stringify(e),
        timestamp: e.timestamp.toString(),
      })),
    })
  }
}

class ClickProcessor {
  constructor(
    private readonly clickhouse: ClickHouseClient,
    private readonly geoIP: GeoIPService,
    private readonly deviceParser: DeviceParser,
    private readonly botDetector: BotDetector,
  ) {}

  async process(event: ClickEvent): Promise<EnrichedClick> {
    const geo = await this.geoIP.lookup(event.ip)
    const device = this.deviceParser.parse(event.userAgent)
    const isBot = this.botDetector.detect(event.userAgent, event.ip)

    return {
      short_code: event.shortCode,
      clicked_at: new Date(event.timestamp),
      ip_hash: this.hashIP(event.ip),  // GDPR: rotating-salt hash; never persist raw IP
      country: geo.country,
      city: geo.city,
      device_type: device.type,
      browser: device.browser,
      os: device.os,
      referrer_domain: this.extractDomain(event.referer),
      referrer_path: this.extractPath(event.referer),
      is_bot: isBot.isBot ? 1 : 0,
      bot_type: isBot.type,
    }
  }

  // IP_SALT must be set; an unset variable would hash the literal string "undefined".
  private hashIP(ip: string): string {
    return createHash("sha256")
      .update(ip + process.env.IP_SALT)
      .digest("hex")
      .substring(0, 32)
  }

  // Missing or unparseable referers become empty strings.
  private extractDomain(referer: string | null): string {
    try { return referer ? new URL(referer).hostname : "" } catch { return "" }
  }

  private extractPath(referer: string | null): string {
    try { return referer ? new URL(referer).pathname : "" } catch { return "" }
  }
}
```

The salt is rotated on a schedule (typically daily), so hashes of the same IP match each other only within one rotation window: long enough to deduplicate same-day visitors, short enough to limit GDPR exposure if any single salt leaks.
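One way to get a daily rotation without a coordination step is to derive each day's salt from a long-lived secret. The sketch below assumes that approach; `dailySalt` and `hashIp` are hypothetical helper names, and the derivation via HMAC is an assumption rather than something the design above prescribes. If the long-lived secret leaks, every day's salt can be recomputed, so a randomly generated salt that is discarded after its window gives a stronger guarantee.

```typescript
import { createHash, createHmac } from "node:crypto"

// Hypothetical helper: derive one salt per UTC day from a long-lived secret.
function dailySalt(secret: string, at: Date): string {
  const day = at.toISOString().slice(0, 10) // "YYYY-MM-DD" in UTC
  return createHmac("sha256", secret).update(day).digest("hex")
}

// Same shape as hashIP in the processor: first 32 hex chars of SHA-256(ip + salt).
function hashIp(ip: string, salt: string): string {
  return createHash("sha256")
    .update(ip + salt)
    .digest("hex")
    .substring(0, 32)
}

// Same visitor, same day: identical hashes, so deduplication works.
// Same visitor, next day: a different hash, so cross-day linking does not.
const today = dailySalt("example-secret", new Date("2024-01-01T10:00:00Z"))
const tomorrow = dailySalt("example-secret", new Date("2024-01-02T10:00:00Z"))
console.log(hashIp("203.0.113.7", today) === hashIp("203.0.113.7", today))
console.log(hashIp("203.0.113.7", today) === hashIp("203.0.113.7", tomorrow))
```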

Failure modes and operational implications

This is the part the design has to survive in production.

Viral hot keys

A single tweet can move a link from 0 to 100 k RPS in minutes. Mitigations stack:

CDN edge caching is the primary defense — once the first request per POP populates the edge cache, the next 60 s are served entirely off the CDN and never touch your infrastructure.
Redis hot-key shielding. If the same code shows up in your Redis cluster’s hottest-key telemetry, optionally pin it to a local in-process cache on the redirect nodes for the duration of the spike.
Per-link rate caps for free-tier abuse: a free-plan link with > 10 M clicks/hour is almost certainly being weaponized; throttle and alert.
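The in-process pinning mentioned in the hot-key mitigation can be as small as a map with expiry times. This is a minimal sketch, not the article's implementation: `LocalHotCache`, `pin`, and the injectable clock are illustrative names, and the TTL should stay short so a deactivated link does not keep redirecting from node memory.

```typescript
// Tiny in-process TTL cache for codes flagged as hot. All names are illustrative.
class LocalHotCache {
  private readonly entries = new Map<string, { longUrl: string; expiresAt: number }>()

  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = () => Date.now(), // injectable for tests
  ) {}

  // Called when hot-key telemetry flags a code.
  pin(shortCode: string, longUrl: string): void {
    this.entries.set(shortCode, { longUrl, expiresAt: this.now() + this.ttlMs })
  }

  // Returns the pinned URL, or undefined so the caller falls back to Redis.
  get(shortCode: string): string | undefined {
    const entry = this.entries.get(shortCode)
    if (!entry) return undefined
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(shortCode)
      return undefined
    }
    return entry.longUrl
  }
}
```

A redirect node would call `get` before Redis and `pin` only for codes the telemetry marks as hot, which keeps the map small.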

Cache-stampede on non-existent codes

A scanner spraying random codes that don’t exist would bypass Redis (miss everywhere) and hammer the primary store. The bloom filter in front of Redis keeps roughly 99.9 % of these requests from ever reaching the primary store. False positives still pass through, which is why the Cassandra read is bounded with a small local connection pool.
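To make the gating concrete, here is a small self-contained Bloom filter. It is a teaching sketch under stated assumptions: the seeded FNV-1a hash with double hashing is chosen for brevity, and a production deployment would more likely use the Redis-side filter than hand-rolled code. The property that matters is visible in the API: `mightContain` returning `false` is always correct, so those requests can be answered with 404 without touching Redis or the primary store.

```typescript
// 32-bit FNV-1a with a seed mixed into the offset basis.
function fnv1a(value: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0
  for (let i = 0; i < value.length; i++) {
    h ^= value.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h
}

class BloomFilter {
  private readonly bits: Uint8Array

  constructor(
    private readonly sizeBits: number,
    private readonly hashCount: number,
  ) {
    this.bits = new Uint8Array(Math.ceil(sizeBits / 8))
  }

  // Double hashing: position_i = (h1 + i * h2) mod m.
  private positions(value: string): number[] {
    const h1 = fnv1a(value, 0)
    const h2 = (fnv1a(value, 0x9e3779b9) | 1) >>> 0 // force an odd step
    const out: number[] = []
    for (let i = 0; i < this.hashCount; i++) {
      out.push((h1 + i * h2) % this.sizeBits)
    }
    return out
  }

  add(value: string): void {
    for (const pos of this.positions(value)) {
      this.bits[Math.floor(pos / 8)] |= 1 << (pos % 8)
    }
  }

  // false => definitely never added; true => probably added.
  mightContain(value: string): boolean {
    return this.positions(value).every(
      (pos) => (this.bits[Math.floor(pos / 8)] & (1 << (pos % 8))) !== 0,
    )
  }
}

const known = new BloomFilter(10_000, 7)
known.add("aZ3x9Qp")
console.log(known.mightContain("aZ3x9Qp")) // a code that was added is never rejected
```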

Abuse and link rot

URL shorteners are an ideal vector for phishing because the destination is opaque¹⁴. Defense:

At creation time: synchronous Web Risk API lookup, with a fast-path allowlist of well-known domains; suspicious domains (recently registered, unusual TLD) get a deeper VirusTotal scan.
Continuously: rescan all live links periodically (a TTL-based queue keyed on last_scanned_at); flip is_active = false on links whose destination later turns malicious. The redirect path checks is_active on every read.
At request time: rate-limit per source IP, and refuse to redirect requests with anomalous patterns (no User-Agent, suspicious Referer).

Link rot is the destination going away — your link is fine, but destination.example.com/page returns 404. Background HEAD-checking and a last_known_status column let you serve a graceful interstitial instead of a hard browser error.

Counter drift

ClickHouse aggregates lag the redirect by 1–5 seconds. If the API is asked for “clicks right now”, read from Redis (INCR-style fresh counter), not ClickHouse. Reconcile the two values periodically; small drift is acceptable, large drift means an analytics pipeline incident worth paging on.
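The reconciliation check reduces to a relative difference against a threshold. The sketch below assumes the 5 % paging threshold from the Monitoring table; `relativeDrift` and `shouldPage` are hypothetical names, and dividing by the larger of the two counts is one reasonable choice among several.

```typescript
// Relative difference between the fresh Redis counter and the ClickHouse total.
function relativeDrift(redisCount: number, clickhouseCount: number): number {
  const reference = Math.max(redisCount, clickhouseCount)
  if (reference === 0) return 0 // no clicks on either side: no drift
  return Math.abs(redisCount - clickhouseCount) / reference
}

// Threshold is an assumption taken from the Monitoring table (> 5 %).
function shouldPage(redisCount: number, clickhouseCount: number, threshold = 0.05): boolean {
  return relativeDrift(redisCount, clickhouseCount) > threshold
}

console.log(shouldPage(1_000, 990)) // 1 % drift: normal pipeline lag
console.log(shouldPage(1_000, 900)) // 10 % drift: likely a pipeline incident
```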

KGS exhaustion

The most embarrassing outage is “out of short codes”. Alert when keys_unused drops below one week of writes at the current burn rate, not below some absolute number. The generator job runs continuously off-peak; if it fails for a day, the alert fires before write capacity is at risk.
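Expressed as code, the alert compares inventory against recent write volume instead of a fixed floor. This is a sketch with hypothetical names (`daysOfInventory`, `kgsInventoryLow`); it assumes the burn rate is estimated from the trailing seven days of writes.

```typescript
// How many days the unused-key pool lasts at the given daily write rate.
function daysOfInventory(keysUnused: number, dailyBurn: number): number {
  if (dailyBurn <= 0) return Infinity // no writes: the pool never drains
  return keysUnused / dailyBurn
}

// Fires when less than `minDays` of inventory remains at the recent burn rate.
function kgsInventoryLow(keysUnused: number, writesLast7Days: number, minDays = 7): boolean {
  const dailyBurn = writesLast7Days / 7
  return daysOfInventory(keysUnused, dailyBurn) < minDays
}

console.log(kgsInventoryLow(5_000_000, 7_000_000)) // 5 days left at 1 M writes/day
console.log(kgsInventoryLow(10_000_000, 7_000_000)) // 10 days left
```

Because the threshold scales with traffic, a growth spurt tightens the alert automatically, which a fixed absolute number would not do.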

Redirect-service partial outage

If Redis is degraded but the primary store is healthy, the redirect path slows down but stays correct. If the primary store is down, the bloom filter and Redis can serve roughly the cache hit rate — typically > 95 % — with the rest returning a temporary 503. Do not return 404 for misses-through-degradation; that bakes a wrong answer into client and CDN caches.
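The 404-versus-503 distinction is easiest to enforce when the lookup result carries three states instead of two. The following is a minimal sketch: the `LookupResult` type, the `decide` function, and the 5-second `Retry-After` value are illustrative assumptions, not part of the design above.

```typescript
type LookupResult =
  | { kind: "hit"; longUrl: string }
  | { kind: "miss" } // the store answered: the code does not exist
  | { kind: "unavailable" } // the store could not be consulted

interface RedirectDecision {
  status: 302 | 404 | 503
  headers: Record<string, string>
}

function decide(result: LookupResult): RedirectDecision {
  switch (result.kind) {
    case "hit":
      return {
        status: 302,
        headers: { Location: result.longUrl, "Cache-Control": "max-age=60" },
      }
    case "miss":
      // Only a confirmed miss may produce a 404.
      return { status: 404, headers: {} }
    case "unavailable":
      // Temporary failure: tell clients to retry and forbid caching the error.
      return {
        status: 503,
        headers: { "Retry-After": "5", "Cache-Control": "no-store" },
      }
  }
}
```

With this shape, a cache miss during a primary-store outage maps to `unavailable`, so the type system makes it hard to return a cacheable 404 by accident.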

Frontend considerations

The redirect handler is the hot path

```ts
// isValidCode, redirectService, and renderPasswordPage are assumed to be defined elsewhere.
export async function handleRedirect(req: Request): Promise<Response> {
  // Parse the path so a query string never becomes part of the code.
  const shortCode = new URL(req.url).pathname.split("/").pop() ?? ""
  if (!isValidCode(shortCode)) {
    return new Response(null, { status: 404 })
  }

  const result = await redirectService.redirect(shortCode, {
    ip: req.headers.get("x-forwarded-for"),
    userAgent: req.headers.get("user-agent"),
    referer: req.headers.get("referer"),
  })

  if (!result.found) return new Response(null, { status: 404 })
  if (result.isExpired) return new Response("Link expired", { status: 410 })
  if (result.requiresPassword) return renderPasswordPage(shortCode)

  return new Response(null, {
    status: 302,
    headers: {
      Location: result.longUrl,
      // "public" lets shared caches (the CDN edge) store the redirect for 60 s;
      // "private" would restrict it to the browser cache and defeat edge caching.
      "Cache-Control": "public, max-age=60",
      "X-Robots-Tag": "noindex",
    },
  })
}
```

302 vs 301 in practice

Concern	`301 Moved Permanently`	`302 Found`
Default cacheability	Heuristically cacheable indefinitely (RFC 9111 §4.2.2)²	Only cacheable with explicit freshness
Click tracking	Misses cached repeat clicks	Tracks every click after `max-age` expires
Updating destination	Browser/CDN may serve stale forever	Reflected after `max-age` expires
CDN strategy	Long TTL is safe — but you can never retract	Short TTL recommended (60 s typical)

Dashboard state

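```ts
import { create } from "zustand"
import { subDays } from "date-fns"

// URLSummary, AnalyticsData, DateRange, and the api client are assumed to be defined elsewhere.

interface DashboardState {
  urls: Map<string, URLSummary>
  selectedUrl: string | null
  analytics: AnalyticsData | null
  dateRange: DateRange
  isLoading: boolean
  // Actions live on the store, so they belong in the state type as well.
  fetchUrls: () => Promise<void>
  selectUrl: (urlId: string) => Promise<void>
  updateDateRange: (range: DateRange) => Promise<void>
  updateClickCount: (urlId: string, count: number) => void
}

const useDashboardStore = create<DashboardState>((set, get) => ({
  urls: new Map(),
  selectedUrl: null,
  analytics: null,
  dateRange: { start: subDays(new Date(), 7), end: new Date() },
  isLoading: false,

  fetchUrls: async () => {
    set({ isLoading: true })
    try {
      const urls = await api.getUrls()
      set({ urls: new Map(urls.map((u) => [u.id, u])) })
    } finally {
      set({ isLoading: false }) // clear the spinner even if the request fails
    }
  },

  selectUrl: async (urlId: string) => {
    set({ selectedUrl: urlId, isLoading: true })
    try {
      const analytics = await api.getAnalytics(urlId, get().dateRange)
      set({ analytics })
    } finally {
      set({ isLoading: false })
    }
  },

  updateDateRange: async (range: DateRange) => {
    set({ dateRange: range })
    const { selectedUrl } = get()
    if (!selectedUrl) return
    set({ isLoading: true })
    try {
      const analytics = await api.getAnalytics(selectedUrl, range)
      set({ analytics })
    } finally {
      set({ isLoading: false })
    }
  },

  // Used by the real-time click stream. Assumption: the map key matches the
  // identifier the stream sends, and URLSummary has a numeric clickCount field.
  updateClickCount: (urlId: string, count: number) => {
    const current = get().urls.get(urlId)
    if (!current) return
    const urls = new Map(get().urls) // copy so subscribers see a new reference
    urls.set(urlId, { ...current, clickCount: count })
    set({ urls })
  },
}))
```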

Real-time click counter

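```ts
class ClickStreamClient {
  private ws: WebSocket | null = null
  private subscriptions = new Set<string>()

  connect(authToken: string): void {
    this.ws = new WebSocket(`wss://api.suj.ee/ws?token=${encodeURIComponent(authToken)}`)
    // Replay subscriptions that were requested before the socket finished opening.
    this.ws.onopen = () => {
      for (const shortCode of this.subscriptions) this.sendSubscribe(shortCode)
    }
    this.ws.onmessage = (event) => {
      const data = JSON.parse(event.data)
      if (data.type === "click") this.handleClick(data.shortCode, data.count)
    }
  }

  subscribe(shortCode: string): void {
    this.subscriptions.add(shortCode)
    // send() throws while the socket is still CONNECTING, so only send when open.
    if (this.ws?.readyState === WebSocket.OPEN) this.sendSubscribe(shortCode)
  }

  private sendSubscribe(shortCode: string): void {
    this.ws?.send(JSON.stringify({ action: "subscribe", shortCode }))
  }

  private handleClick(shortCode: string, count: number): void {
    useDashboardStore.getState().updateClickCount(shortCode, count)
  }
}
```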

The WebSocket reads from the Redis fresh counter (via a thin pub/sub bridge), not from ClickHouse. Real-time means within 1 s of the click; dashboards pulling from ClickHouse are eventually consistent within the materialized-view refresh interval.

Infrastructure

Cloud-agnostic shape

Component	Purpose	Options
CDN	Edge cache, DDoS absorb	Cloudflare, Fastly, CloudFront
Load balancer	Traffic distribution	HAProxy, NGINX, AWS ALB
Application	Redirect, API	Node.js, Go, Rust
KV cache	Hot URLs, rate limits, bloom	Redis Cluster, KeyDB, Dragonfly
Primary store	URL mappings	Cassandra, ScyllaDB, DynamoDB
Analytics store	Click data	ClickHouse, Druid, TimescaleDB
Message queue	Analytics pipeline	Kafka, Pulsar, Redpanda
Object storage	Exports, backups	S3, GCS, MinIO

AWS reference

Service	Configuration	Why
CloudFront	200+ POPs	Global low-latency redirects
Redirect tier (Fargate)	2 vCPU, 4 GB, 50 tasks	Stateless, scales horizontally
Shortening tier (Fargate)	2 vCPU, 4 GB, 10 tasks	Lower traffic, write-heavy
ElastiCache Redis	r6g.xlarge cluster, 3 shards	Hot URLs, rate limits, bloom
Amazon Keyspaces	On-demand	Serverless Cassandra; auto-scales
RDS PostgreSQL	db.r6g.large Multi-AZ	Users, KGS, configuration
MSK	kafka.m5.large × 3	Click event streaming

Managed	Self-hosted alternative	When to switch
Amazon Keyspaces	ScyllaDB on EC2	Cost at scale, p99 sensitivity
ElastiCache	Redis Cluster on EC2	RedisBloom / specific modules
CloudFront	Cloudflare	DDoS protection, predictable pricing
MSK	Redpanda	Lower latency, simpler ops

Monitoring

Metric	Alert at	Action
Redirect latency p99	> 100 ms	Check Redis health, CDN cache-hit ratio
CDN cache-hit ratio	< 80 %	Review `Cache-Control`; scope `Vary` headers
404 rate	> 5 %	Likely a scanner; check WAF rules
KGS unused inventory	< 1 week of writes	Trigger key generation
Analytics lag	> 60 s	Scale Kafka consumers; check ClickHouse
Counter drift (ClickHouse vs Redis)	> 5 %	Pipeline incident; investigate

Each redirect carries a trace ID propagated through CDN → LB → service → cache → store → analytics. Sample rate is 100 % for errors and 1 % for normal traffic — enough to characterize p99 without paying the cost of full sampling at the redirect tier.
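A sampling rule like this is usually made deterministic per trace, so every hop reaches the same decision without coordination. The sketch below is one way to do that under stated assumptions: `shouldSample` is a hypothetical name, and hashing the trace ID with FNV-1a into [0, 1) is a simple stand-in for whatever the tracing library provides.

```typescript
// Keep every error trace; keep a fixed fraction of normal traces,
// chosen deterministically from the trace ID.
function shouldSample(traceId: string, isError: boolean, rate = 0.01): boolean {
  if (isError) return true
  let h = 0x811c9dc5 // 32-bit FNV-1a
  for (let i = 0; i < traceId.length; i++) {
    h ^= traceId.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h / 0x1_0000_0000 < rate // map the hash to [0, 1) and compare
}

console.log(shouldSample("trace-123", true)) // errors are always kept
```

Because the decision depends only on the trace ID, the CDN log processor, the redirect service, and the analytics consumer all keep or drop the same traces.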

Practical takeaways

The redirect path is the product. Optimize relentlessly for it; everything else exists to serve it without slowing it down.
302 with a short explicit Cache-Control: max-age is the right default. 301 only if you genuinely don’t need analytics and are sure the destination is forever.
KGS for primary code generation, Snowflake for bulk/programmatic. This avoids the worst of both — collision overhead and unnecessary code length.
Counters never live in the primary store. Cassandra COUNTER columns invite over- and under-counting on retries; use Redis for fresh values and ClickHouse for authoritative totals.
Bloom filter sizing is math, not vibes. $m = -n \ln p / (\ln 2)^2$; 1 B items at 0.1 % FPR is ≈ 1.7 GiB (≈ 1.8 GB).
Plan for the viral spike, not the steady state. A single link can move 80 % of daily traffic in minutes; CDN-first design is non-negotiable.
Abuse defense is a continuous job, not a creation-time check. Live-rescan known links; flip is_active when a destination turns bad.
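The Bloom-filter sizing takeaway can be checked numerically with the standard formulas $m = -n \ln p / (\ln 2)^2$ for the bit count and $k = (m/n) \ln 2$ for the number of hash functions. The helper name `bloomParams` is illustrative.

```typescript
// Standard Bloom-filter sizing: n expected items, p target false-positive rate.
function bloomParams(n: number, p: number): { bits: number; hashes: number } {
  const bits = Math.ceil((-n * Math.log(p)) / Math.LN2 ** 2)
  const hashes = Math.round((bits / n) * Math.LN2)
  return { bits, hashes }
}

const { bits, hashes } = bloomParams(1e9, 0.001)
const gib = bits / 8 / 2 ** 30
const gb = bits / 8 / 1e9
console.log({ bits, hashes, gib: gib.toFixed(2), gb: gb.toFixed(2) })
```

For one billion items at a 0.1 % false-positive rate this gives roughly 1.44 × 10¹⁰ bits and 10 hash functions: about 1.8 GB in decimal units, or about 1.67 GiB, which is presumably where the ≈ 1.7 figure comes from.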

Appendix

Prerequisites

Distributed-systems fundamentals: replication, consistent hashing, quorum reads/writes.
Database trade-offs: when SQL beats NoSQL and vice versa.
HTTP redirect semantics — RFC 9110 §15.4.2 (301), §15.4.3 (302); RFC 9111 §4.2.2 (heuristic freshness).
Caching strategies: TTL, eviction policies, hot-key behavior.

Terminology

Term	Definition
Base62	Encoding using `0-9 A-Z a-z` for URL-safe short codes
Snowflake ID	Twitter’s distributed ID format: timestamp + worker + sequence in 64 bits
KGS	Key Generation Service — pre-allocates unique short codes
Bloom filter	Probabilistic membership test: a “no” is always correct (no false negatives), a “yes” may be wrong (false positives)
Consistent hashing	Sharding scheme that minimizes data movement on cluster-membership changes
CDN	Content Delivery Network — edge caching for global low latency
Hot key	A cache key receiving disproportionate traffic (viral links)

References

Bitly — Lessons Learned Building a Distributed System That Handles 6 Billion Clicks a Month (2014) — primary source for scale numbers and SOA architecture pattern.
Bitly — NSQ: realtime distributed message processing at scale — origin of NSQ; useful baseline for async messaging design.
Twitter Engineering — Announcing Snowflake (2010) — original Snowflake design.
Snowflake ID — Wikipedia — concise reference for the 64-bit layout.
RFC 9110 — HTTP Semantics, §15.4.2 (301), §15.4.3 (302)
RFC 9111 — HTTP Caching, §4.2.2 (Calculating Heuristic Freshness)
Apache Cassandra — Counter columns (COUNTER data type)
Apache Cassandra — Leveled Compaction Strategy
Google Web Risk API — Quotas and limits
ClickHouse — Use materialized views and Top 10 best practices
AWS Database Blog — DynamoDB partitions, hot keys, and split-for-heat

Bitly: Lessons Learned Building a Distributed System That Handles 6 Billion Clicks a Month, High Scalability summary of a Bitly engineering presentation, July 2014. Subsequent industry coverage put Bitly at 9–11 B clicks/month by late 2017. ↩
RFC 9111 §4.2.2 (Calculating Heuristic Freshness) and RFC 9110 §15.4.3 (302 Found). 301 responses are heuristically cacheable; 302 responses are cacheable only when the response includes explicit freshness information (e.g. Cache-Control: max-age or Expires). ↩ ↩² ↩³
Announcing Snowflake, Twitter (now X) Engineering Blog, June 2010. Custom epoch 1288834974657 ms = 2010-11-04T01:42:54.657 UTC. ↩ ↩²
Snowflake ID, Wikipedia, accessed 2026-04-21. ↩
Discord Developer Documentation — Snowflakes. ↩
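Bloom-filter bit count $m = -\dfrac{n \ln p}{(\ln 2)^2}$ and optimal hash count $k = \dfrac{m}{n} \ln 2$. For $n = 10^9$ and $p = 0.001$: $m \approx 1.44 \times 10^{10}$ bits ($\approx 1.8$ GB, i.e. $\approx 1.7$ GiB), $k \approx 10$. See Bloom filter calculator and the Redis Bloom filter docs bits-per-element table. ↩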
Apache Cassandra — Data Types: counter on counter limitations; Cassandra counter columns: nice in theory, hazardous in practice, Ably Engineering, for a production-experience perspective on retry and replica-failure semantics. ↩
Apache Cassandra — Leveled Compaction Strategy. Note that Cassandra 5.0 introduces UCS (Unified Compaction Strategy); LCS remains valid for read-dominant workloads. ↩
ScyllaDB vs. Apache Cassandra, ScyllaDB. Vendor source — values are indicative, not independent benchmarks. ↩
Scaling DynamoDB: how partitions, hot keys, and split-for-heat impact performance, AWS Database Blog. ↩
See ClickHouse — Use materialized views and the engine-specific docs for SummingMergeTree and AggregatingMergeTree. SummingMergeTree only sums numeric columns on merge; non-additive aggregates such as uniqExact belong in AggregatingMergeTree with *State / *Merge functions. ↩
Top 10 best practices tips for ClickHouse, ClickHouse Blog. Recommends 10–100 partitions total; toYYYYMM is the typical default. ↩
Google Web Risk — Quotas and limits and Safe Browsing APIs (v4) Overview on the non-commercial restriction. ↩
URL shortening allows threats to evade URL filtering and categorization tools, Menlo Security. ↩