Design a Flash Sale System
Building a system to handle millions of concurrent users competing for limited inventory during time-bounded sales events. Flash sales present a unique challenge: extreme traffic spikes (10-100x normal) concentrated in seconds, with zero tolerance for inventory errors. This design covers virtual waiting rooms, atomic inventory management, and asynchronous order processing.
Abstract
Flash sale design centers on three constraints working against each other:
- Traffic absorption — Millions of users arriving in seconds cannot hit your database directly. A CDN-hosted waiting room absorbs the spike; a queue service meters admission at backend capacity.
- Inventory accuracy — Overselling destroys trust. Redis Lua scripts provide atomic “check-and-decrement” operations. Pre-allocation (tokens = inventory) bounds the problem.
- Order durability under load — Synchronous order processing cannot keep up with flash-sale peaks (10K+ orders/sec alongside 500K+ inventory checks/sec). Asynchronous queues decouple order receipt from processing, with guaranteed delivery.
The mental model: waiting room → token gate → atomic inventory → async order queue. Each layer handles one constraint and shields the next.
| Design Decision | Tradeoff |
|---|---|
| CDN waiting room | Absorbs traffic cheaply; adds user-facing latency |
| Token-based admission | Prevents overselling; requires pre-allocation |
| Redis atomic counters | Sub-millisecond inventory checks; single point of failure |
| Async order processing | Handles 100x normal load; delayed confirmation |
Requirements
Functional Requirements
| Feature | Scope | Notes |
|---|---|---|
| Virtual waiting room | Core | Absorbs traffic spike before backend |
| Queue management | Core | FIFO admission with position tracking |
| Inventory reservation | Core | Atomic decrement, no overselling |
| Order placement | Core | Async processing with durability |
| Bot detection | Core | Multi-layer defense |
| Payment processing | Core | Idempotent, timeout-aware |
| Order confirmation | Core | Email/push notification |
| Purchase limits | Extended | 1-2 units per customer |
| VIP early access | Extended | Tiered queue priority |
| Real-time inventory display | Extended | Eventually consistent display |
Non-Functional Requirements
| Requirement | Target | Rationale |
|---|---|---|
| Availability | 99.99% | Revenue impact; Alibaba achieved “zero downtime” during Singles Day |
| Waiting room latency | < 100ms | Static CDN, must feel instant |
| Inventory check latency | < 50ms | Critical path, Redis required |
| Checkout latency | < 5s | User-acceptable; async processing hides backend |
| Queue position accuracy | Real-time | Trust requires visible progress |
| Inventory accuracy | 100% | Zero tolerance for overselling |
| Order durability | Zero loss | Queued orders must survive failures |
Scale Estimation
Traffic Profile:
| Metric | Normal | Flash Sale Peak | Multiplier |
|---|---|---|---|
| Concurrent users | 100K | 10M | 100x |
| Page requests/sec | 10K RPS | 1M RPS | 100x |
| Inventory checks/sec | 1K RPS | 500K RPS | 500x |
| Orders/sec | 100 TPS | 10K TPS | 100x |
Back-of-envelope (1M users, 10K inventory):
- Users arriving in first minute: 1,000,000
- Waiting room page views: 1M × 3 refreshes = 3M requests/min = 50K RPS
- Queue status checks: 1M × 1 check/5 sec = 200K RPS
- Inventory checks (admitted users): 50K users admitted × 1 check = 50K RPS spike
- Orders attempted: 50K (not all convert)
- Orders completed: 10K (inventory limit)

Storage:
- Queue state: 1M users × 100 bytes = 100MB (Redis)
- Order records: 10K orders × 5KB = 50MB (PostgreSQL)
- Event logs: 10M events × 200 bytes = 2GB/sale

Design Paths
Path A: Pre-Allocation Model (Token-Based)
Best when:
- Fixed, known inventory quantity
- Fairness is paramount (ticketing, limited editions)
- High-value items where overselling is catastrophic
Architecture:
Key characteristics:
- Tokens generated equal to inventory before sale starts
- Each admitted user receives one token
- Token guarantees checkout opportunity (not purchase—user may abandon)
- Token expires if unused (returns to pool)
Trade-offs:
- ✅ Zero overselling by construction
- ✅ Predictable admission rate
- ✅ Fair (FIFO or randomized entry)
- ❌ Requires accurate inventory count pre-sale
- ❌ Token management complexity (expiration, reclaim)
- ❌ Abandoned tokens reduce conversion
Real-world example: SeatGeek uses token-based admission for concert ticket sales. Lambda and DynamoDB manage the token lifecycle; tokens expire on purchase or after a 15-minute timeout, returning to the pool for the next user in queue.
Path B: Real-Time Inventory Model (Counter-Based)
Best when:
- Dynamic inventory (multiple warehouses, restocking)
- E-commerce flash sales with variable stock
- Lower-stakes items where occasional overselling is recoverable
Architecture:
Key characteristics:
- No pre-allocation; inventory checked in real-time
- Atomic decrement at checkout (not admission)
- Rate limiting protects backend; doesn’t guarantee purchase
- Inventory can be restocked mid-sale
Trade-offs:
- ✅ Handles dynamic inventory
- ✅ Simpler pre-sale setup
- ✅ Can restock mid-sale
- ❌ Overselling risk if counter and order processing desync
- ❌ Users admitted without guarantee (frustration)
- ❌ Thundering herd on inventory service if rate limiting fails
Real-world example: Alibaba Singles Day uses Redis atomic counters with Lua scripts. Product ID = key, inventory = value. Lua script performs atomic GET + DECR in single operation. Handles 583K operations/second with careful sharding.
Path Comparison
| Factor | Path A (Token) | Path B (Counter) |
|---|---|---|
| Overselling risk | Zero | Low (with proper atomicity) |
| Setup complexity | Higher | Lower |
| Dynamic inventory | Difficult | Native |
| User expectation | Guaranteed opportunity | Best effort |
| Fairness | Explicit (token order) | Implicit (first to checkout) |
| Best for | Ticketing, limited releases | E-commerce, restockable goods |
This Article’s Focus
This article implements Path A (Token-Based) for the core flow because:
- Flash sales typically have fixed, high-value inventory
- Fairness is a differentiator (users accept waiting if fair)
- Zero overselling is non-negotiable for most use cases
Path B implementation details are covered in the Variations section.
High-Level Design
Component Overview
| Component | Responsibility | Technology |
|---|---|---|
| Virtual Waiting Room | Absorb traffic spike, display queue position | Static HTML on CDN |
| Queue Service | Manage admission, assign tokens | Lambda + DynamoDB |
| Inventory Service | Atomic inventory operations | Redis Cluster |
| Order Service | Process orders asynchronously | ECS + SQS |
| Payment Service | Handle payments idempotently | Stripe/Adyen integration |
| Notification Service | Send confirmations | SES + SNS |
| Bot Detection | Filter non-human traffic | WAF + Custom rules |
Request Flow

User → CDN waiting room → Queue Service (admission + checkout token) → Inventory Service (atomic reservation) → Order queue (SQS) → Order processor (payment, confirmation) → Notification Service
API Design
Queue Service APIs
Join Queue
```http
POST /api/v1/queue/join
Authorization: Bearer {user_token}
X-Device-Fingerprint: {fingerprint}

{
  "sale_id": "flash-sale-2024-001",
  "product_ids": ["sku-001", "sku-002"]
}
```

Response (202 Accepted):

```json
{
  "queue_ticket": "qt_abc123xyz",
  "position": 15234,
  "estimated_wait_seconds": 180,
  "status_url": "/api/v1/queue/status/qt_abc123xyz"
}
```

Error responses:

- 400 Bad Request: Invalid sale_id or product not in flash sale
- 403 Forbidden: Bot detected or user already in queue
- 429 Too Many Requests: Rate limit exceeded
Check Queue Status
```http
GET /api/v1/queue/status/{queue_ticket}
```

Response (200 OK):

```json
{
  "queue_ticket": "qt_abc123xyz",
  "status": "waiting",
  "position": 8234,
  "estimated_wait_seconds": 90,
  "poll_interval_seconds": 5
}
```

Status values:

- waiting: In queue, not yet admitted
- admitted: Token assigned, can proceed to checkout
- expired: Waited too long, removed from queue
- completed: Purchased or abandoned checkout
Token Admission (Internal)
When user reaches front of queue:
{ "queue_ticket": "qt_abc123xyz", "status": "admitted", "checkout_token": "ct_xyz789abc", "checkout_url": "/checkout?token=ct_xyz789abc", "token_expires_at": "2024-03-15T10:05:00Z"}Checkout Service APIs
Start Checkout Session
```http
POST /api/v1/checkout/start
Authorization: Bearer {user_token}

{
  "checkout_token": "ct_xyz789abc",
  "product_id": "sku-001",
  "quantity": 1
}
```

Response (201 Created):

```json
{
  "session_id": "cs_def456",
  "reserved_until": "2024-03-15T10:05:00Z",
  "product": {
    "id": "sku-001",
    "name": "Limited Edition Sneaker",
    "price": 299.0,
    "currency": "USD"
  },
  "next_step": "payment"
}
```

Error responses:

- 400 Bad Request: Invalid token or product
- 409 Conflict: Token already used
- 410 Gone: Token expired
- 422 Unprocessable: Inventory exhausted (token invalid)
Submit Order
```http
POST /api/v1/orders
Authorization: Bearer {user_token}

{
  "session_id": "cs_def456",
  "shipping_address": {
    "line1": "123 Main St",
    "city": "San Francisco",
    "state": "CA",
    "postal_code": "94102",
    "country": "US"
  },
  "payment_method_id": "pm_card_visa"
}
```

Response (202 Accepted):

```json
{
  "order_id": "ord_789xyz",
  "status": "processing",
  "estimated_confirmation": "< 60 seconds",
  "tracking_url": "/api/v1/orders/ord_789xyz"
}
```

Design note: Returns 202 (not 201) because order processing is asynchronous. The order is durably queued but not yet confirmed.
Pagination Strategy
Queue status uses cursor-based polling, not traditional pagination:
{ "position": 1234, "poll_interval_seconds": 5, "next_poll_after": "2024-03-15T10:01:05Z"}Rationale: Queue position changes continuously. Polling interval increases as position improves (less uncertainty near front).
Data Modeling
Queue State (DynamoDB)
```text
Table: FlashSaleQueue
Partition Key: sale_id
Sort Key: queue_ticket

Attributes:
- user_id: string
- position: number (GSI for ordering)
- status: enum [waiting, admitted, expired, completed]
- joined_at: ISO8601
- admitted_at: ISO8601 | null
- checkout_token: string | null
- token_expires_at: ISO8601 | null
- device_fingerprint: string
- ip_address: string
```

GSI: sale_id-position-index for efficient position lookups.
Why DynamoDB: Single-digit millisecond latency at any scale, automatic scaling, TTL for expired entries.
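A minimal setup sketch (assumed, not from the original): enabling DynamoDB's native TTL so entries whose `ttl` attribute (written at join time, see the queue service code below) has elapsed are purged automatically.

```typescript
import { DynamoDBClient, UpdateTimeToLiveCommand } from "@aws-sdk/client-dynamodb"

const client = new DynamoDBClient({})

// One-time table configuration; `ttl` holds epoch seconds set in joinQueue
await client.send(
  new UpdateTimeToLiveCommand({
    TableName: "FlashSaleQueue",
    TimeToLiveSpecification: { AttributeName: "ttl", Enabled: true },
  }),
)
```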
Inventory Counter (Redis)
```
# Inventory count per product
SET inventory:sku-001 10000

# Atomic decrement with Lua script
EVAL "
  local count = redis.call('GET', KEYS[1])
  if tonumber(count) > 0 then
    return redis.call('DECR', KEYS[1])
  else
    return -1
  end
" 1 inventory:sku-001
```

Why Lua script: GET and DECR must be atomic. Without Lua, two concurrent requests could both see count=1 and both decrement, causing overselling.
Token Registry (Redis)
```
# Token → user mapping with TTL
SETEX token:ct_xyz789abc 300 "user_123"

# Used tokens (prevent replay)
SADD used_tokens:flash-sale-2024-001 ct_xyz789abc
```

TTL: 5 minutes for checkout tokens. Expired tokens return to the pool.
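A sketch (assumptions: an ioredis client and the keys above) of validating and consuming a checkout token at the start of checkout, with replay prevention:

```typescript
import Redis from "ioredis"

const redis = new Redis()

// Mark the token used first (SADD is atomic: only the first caller gets 1),
// then verify it still exists (not expired) and belongs to this user.
export async function consumeCheckoutToken(
  saleId: string,
  token: string,
  userId: string,
): Promise<boolean> {
  const firstUse = await redis.sadd(`used_tokens:${saleId}`, token)
  if (firstUse === 0) return false // replay attempt

  const owner = await redis.get(`token:${token}`)
  if (owner !== userId) {
    // Ownership/expiry check failed: undo the used-mark so a valid retry can pass
    await redis.srem(`used_tokens:${saleId}`, token)
    return false
  }
  return true
}
```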
Order Schema (PostgreSQL)
```sql
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id),
    sale_id VARCHAR(50) NOT NULL,
    checkout_token VARCHAR(100) NOT NULL UNIQUE,
    status VARCHAR(20) DEFAULT 'pending',

    -- Order details
    product_id VARCHAR(50) NOT NULL,
    quantity INT NOT NULL DEFAULT 1,
    unit_price DECIMAL(10,2) NOT NULL,
    total_amount DECIMAL(10,2) NOT NULL,
    currency VARCHAR(3) DEFAULT 'USD',

    -- Shipping
    shipping_address JSONB NOT NULL,

    -- Payment
    payment_intent_id VARCHAR(100),
    payment_status VARCHAR(20),

    -- Timestamps
    created_at TIMESTAMPTZ DEFAULT NOW(),
    confirmed_at TIMESTAMPTZ,

    -- Idempotency
    idempotency_key VARCHAR(100) UNIQUE
);

CREATE INDEX idx_orders_user ON orders(user_id, created_at DESC);
CREATE INDEX idx_orders_sale ON orders(sale_id, status);
CREATE INDEX idx_orders_payment ON orders(payment_intent_id);
```

Idempotency key: Prevents duplicate orders if the user retries during network issues. Typically {user_id}:{checkout_token}.
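A sketch of how the UNIQUE constraint can back idempotent writes (PostgreSQL; column list and parameter order are illustrative): a retry with the same key returns the existing order id instead of inserting a duplicate.

```sql
-- $9 is the idempotency key ({user_id}:{checkout_token})
WITH ins AS (
    INSERT INTO orders (user_id, sale_id, checkout_token, product_id,
                        quantity, unit_price, total_amount, shipping_address,
                        idempotency_key)
    VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
    ON CONFLICT (idempotency_key) DO NOTHING
    RETURNING id
)
SELECT id FROM ins
UNION ALL
SELECT id FROM orders WHERE idempotency_key = $9
LIMIT 1;
```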
Database Selection Matrix
| Data | Store | Rationale |
|---|---|---|
| Queue state | DynamoDB | Single-digit ms latency, auto-scale, TTL |
| Inventory counters | Redis Cluster | Sub-ms atomic operations |
| Tokens | Redis | TTL, fast lookup |
| Orders | PostgreSQL | ACID, complex queries, durability |
| Event logs | Kinesis → S3 | High throughput, analytics |
| User sessions | Redis | Fast auth checks |
Low-Level Design
Virtual Waiting Room
The waiting room is the first line of defense. It must:
- Absorb millions of requests without backend load
- Provide fair queue positioning
- Communicate progress transparently
Architecture:
Static HTML design:
```html
<!DOCTYPE html>
<html>
  <head>
    <title>Flash Sale - Please Wait</title>
    <meta http-equiv="Cache-Control" content="no-cache" />
  </head>
  <body>
    <div id="waiting-room">
      <h1>You're in the queue</h1>

      <!-- Key UI elements -->
      <div id="position">Position: <span id="pos-number">--</span></div>
      <div id="estimate">Estimated wait: <span id="wait-time">--</span></div>
      <div id="progress-bar">
        <div id="progress-fill" style="width: 0%"></div>
      </div>

      <!-- Status messages -->
      <div id="status-message">Please keep this tab open</div>
      <div id="redirect-notice" style="display:none">Redirecting to checkout...</div>
    </div>

    <script src="/queue-client.js"></script>
  </body>
</html>
```

Queue polling logic:
```typescript
interface QueueStatus {
  status: "waiting" | "admitted" | "expired"
  position?: number
  estimated_wait_seconds?: number
  checkout_url?: string
  poll_interval_seconds: number
}

async function pollQueueStatus(ticket: string): Promise<void> {
  const response = await fetch(`/api/v1/queue/status/${ticket}`)
  const status: QueueStatus = await response.json()

  switch (status.status) {
    case "waiting": {
      updateUI(status.position, status.estimated_wait_seconds)
      // Adaptive interval: the server shortens poll_interval_seconds
      // as the user approaches the front of the queue
      const interval = status.poll_interval_seconds * 1000
      setTimeout(() => pollQueueStatus(ticket), interval)
      break
    }

    case "admitted":
      showRedirectNotice()
      // Small delay for user to see the message
      setTimeout(() => {
        window.location.href = status.checkout_url
      }, 1500)
      break

    case "expired":
      showExpiredMessage()
      break
  }
}

// Start polling on page load
const ticket = new URLSearchParams(window.location.search).get("ticket")
if (ticket) {
  pollQueueStatus(ticket)
}
```

Design decisions:
| Decision | Rationale |
|---|---|
| Static HTML on CDN | Millions of users hitting origin would saturate it; CDN absorbs at edge |
| Client-side polling | Push (WebSocket) at this scale requires massive connection management |
| Adaptive poll interval | Users near the front poll more often, those far back less often; reduces total requests |
| No refresh needed | Single-page polling prevents users from losing position by refreshing |
Queue Service (Token Management)
The queue service manages the FIFO queue and token assignment.
Lambda handler:
```typescript
import { DynamoDB } from "@aws-sdk/client-dynamodb"
import { DynamoDBDocument } from "@aws-sdk/lib-dynamodb"

const ddb = DynamoDBDocument.from(new DynamoDB({}))

interface QueueEntry {
  sale_id: string
  queue_ticket: string
  user_id: string
  position: number
  status: "waiting" | "admitted" | "expired" | "completed"
  checkout_token?: string
}

export async function joinQueue(
  saleId: string,
  userId: string,
  deviceFingerprint: string,
): Promise<{ ticket: string; position: number }> {
  // Check if user already in queue
  const existing = await findUserInQueue(saleId, userId)
  if (existing) {
    return { ticket: existing.queue_ticket, position: existing.position }
  }

  // Get current queue length (approximate, for position)
  const position = await getNextPosition(saleId)

  const ticket = generateTicket()

  await ddb.put({
    TableName: "FlashSaleQueue",
    Item: {
      sale_id: saleId,
      queue_ticket: ticket,
      user_id: userId,
      position: position,
      status: "waiting",
      joined_at: new Date().toISOString(),
      device_fingerprint: deviceFingerprint,
      ttl: Math.floor(Date.now() / 1000) + 3600, // 1 hour TTL
    },
    ConditionExpression: "attribute_not_exists(queue_ticket)",
  })

  return { ticket, position }
}

export async function admitNextUsers(saleId: string, count: number): Promise<void> {
  // Invoked by EventBridge at fixed rate (e.g., every second)
  // Admits 'count' users from front of queue

  const waiting = await ddb.query({
    TableName: "FlashSaleQueue",
    IndexName: "sale_id-position-index",
    KeyConditionExpression: "sale_id = :sid",
    FilterExpression: "#status = :waiting",
    ExpressionAttributeNames: { "#status": "status" },
    ExpressionAttributeValues: {
      ":sid": saleId,
      ":waiting": "waiting",
    },
    Limit: count,
    ScanIndexForward: true, // Ascending by position (FIFO)
  })

  for (const entry of waiting.Items || []) {
    await admitUser(entry as QueueEntry)
  }
}

async function admitUser(entry: QueueEntry): Promise<void> {
  const token = generateCheckoutToken()
  const expiresAt = new Date(Date.now() + 5 * 60 * 1000) // 5 min

  await ddb.update({
    TableName: "FlashSaleQueue",
    Key: { sale_id: entry.sale_id, queue_ticket: entry.queue_ticket },
    UpdateExpression:
      "SET #status = :admitted, checkout_token = :token, token_expires_at = :exp",
    ExpressionAttributeNames: { "#status": "status" },
    ExpressionAttributeValues: {
      ":admitted": "admitted",
      ":token": token,
      ":exp": expiresAt.toISOString(),
    },
  })

  // Also store token in Redis for fast lookup during checkout
  await redis.setex(`token:${token}`, 300, entry.user_id)
}
```

Admission rate control:
The admission rate must match backend capacity. EventBridge triggers admitNextUsers every second:
```text
Admission rate = min(backend_capacity, remaining_inventory / expected_checkout_time)
```

Example:
- Backend can handle 1000 checkouts/sec
- Remaining inventory: 5000
- Average checkout time: 60 seconds
- Admission rate: min(1000, 5000/60) = min(1000, 83) = 83 users/sec
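A sketch of that calculation as it might run in the EventBridge-triggered admitter (function and parameter names are assumptions):

```typescript
// Compute how many users to admit this tick, then call
// admitNextUsers(saleId, rate). Capped by backend capacity and paced
// so concurrent checkouts never exceed remaining inventory.
function admissionRate(
  backendCapacityPerSec: number, // e.g., 1000 checkouts/sec
  remainingInventory: number,    // live Redis counter value
  avgCheckoutSeconds: number,    // e.g., 60
): number {
  const inventoryPaced = Math.floor(remainingInventory / avgCheckoutSeconds)
  return Math.max(0, Math.min(backendCapacityPerSec, inventoryPaced))
}

// admissionRate(1000, 5000, 60) === 83
```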
Design decisions:

| Decision | Rationale |
|---|---|
| DynamoDB for queue | Handles millions of entries with single-digit ms latency |
| Position as GSI | Enables efficient “next N users” query |
| EventBridge for admission | Decouples admission rate from user requests |
| Token in Redis + DynamoDB | Redis for fast checkout validation; DynamoDB for durability |
Inventory Service (Atomic Counters)
The inventory service prevents overselling through atomic operations.
Redis Lua script for atomic reservation:
```lua
-- reserve_inventory.lua
-- KEYS[1] = inventory key (e.g., "inventory:sku-001")
-- KEYS[2] = reserved set key (e.g., "reserved:sku-001")
-- ARGV[1] = user_id
-- ARGV[2] = quantity
-- ARGV[3] = reservation_id
-- ARGV[4] = ttl_seconds

local inventory_key = KEYS[1]
local reserved_key = KEYS[2]
local user_id = ARGV[1]
local quantity = tonumber(ARGV[2])
local reservation_id = ARGV[3]
local ttl = tonumber(ARGV[4])

-- Check current inventory
local available = tonumber(redis.call('GET', inventory_key) or 0)

if available < quantity then
  return { err = 'insufficient_inventory', available = available }
end

-- Atomic decrement
local new_count = redis.call('DECRBY', inventory_key, quantity)

if new_count < 0 then
  -- Defensive guard (the script itself runs atomically): restore and fail
  redis.call('INCRBY', inventory_key, quantity)
  return { err = 'race_condition' }
end

-- Track reservation for expiration
redis.call('HSET', reserved_key, reservation_id, cjson.encode({
  user_id = user_id,
  quantity = quantity,
  created_at = redis.call('TIME')[1]
}))
-- NOTE: EXPIRE applies to the whole hash; per-reservation expiry needs a sweeper
redis.call('EXPIRE', reserved_key, ttl)

return { ok = true, remaining = new_count, reservation_id = reservation_id }
```

Inventory service implementation:
```typescript
import Redis from "ioredis"
import { readFileSync } from "fs"

const redis = new Redis.Cluster([
  { host: "redis-1.example.com", port: 6379 },
  { host: "redis-2.example.com", port: 6379 },
  { host: "redis-3.example.com", port: 6379 },
])

const reserveScript = readFileSync("./reserve_inventory.lua", "utf-8")

interface ReservationResult {
  success: boolean
  reservation_id?: string
  remaining?: number
  error?: string
}

export async function reserveInventory(
  productId: string,
  userId: string,
  quantity: number,
  ttlSeconds: number = 300,
): Promise<ReservationResult> {
  const reservationId = `res_${Date.now()}_${userId}`

  const result = (await redis.eval(
    reserveScript,
    2, // number of keys
    // NOTE: on Redis Cluster, a multi-key script requires both keys in the
    // same hash slot; in practice, key them with a shared hash tag such as
    // inventory:{sku-001} / reserved:{sku-001}
    `inventory:${productId}`,
    `reserved:${productId}`,
    userId,
    quantity.toString(),
    reservationId,
    ttlSeconds.toString(),
  )) as any

  if (result.err) {
    return { success: false, error: result.err }
  }

  return {
    success: true,
    reservation_id: reservationId,
    remaining: result.remaining,
  }
}

export async function releaseReservation(productId: string, reservationId: string): Promise<void> {
  // Called when checkout times out or user abandons
  const reserved = await redis.hget(`reserved:${productId}`, reservationId)
  if (reserved) {
    const { quantity } = JSON.parse(reserved)
    await redis.incrby(`inventory:${productId}`, quantity)
    await redis.hdel(`reserved:${productId}`, reservationId)
  }
}

export async function confirmReservation(productId: string, reservationId: string): Promise<void> {
  // Called after successful payment - just remove from reserved set
  await redis.hdel(`reserved:${productId}`, reservationId)
}
```

Reservation lifecycle: reserve (atomic decrement + TTL) → confirm on successful payment, or release back to inventory on timeout or abandonment.
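One gap worth noting: EXPIRE in the Lua script applies to the whole reserved hash, not to individual fields. A sketch (an assumption, not in the original) of a periodic reclaim job that releases individual reservations past their TTL, reusing the client and releaseReservation above:

```typescript
// Run on a timer (e.g., every 30s). For large hashes, HSCAN would be
// preferable to HGETALL; this keeps the sketch short.
export async function reclaimExpiredReservations(
  productId: string,
  ttlSeconds: number = 300,
): Promise<void> {
  const now = Math.floor(Date.now() / 1000)
  const all = await redis.hgetall(`reserved:${productId}`)

  for (const [reservationId, raw] of Object.entries(all)) {
    const { created_at } = JSON.parse(raw)
    if (now - Number(created_at) > ttlSeconds) {
      // releaseReservation re-credits inventory and deletes the hash field
      await releaseReservation(productId, reservationId)
    }
  }
}
```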
Design decisions:
| Decision | Rationale |
|---|---|
| Lua script | Atomic read-check-decrement prevents race conditions |
| Redis Cluster | Horizontal scaling for high throughput |
| Reservation with TTL | Prevents inventory lock-up from abandoned checkouts |
| Hash for reservations | O(1) lookup/delete by reservation ID |
Order Processing (Async Queue)
Orders are placed on a durable queue for async processing. This decouples order receipt from processing, preventing database overwhelm.
Order submission flow:
```typescript
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs"
import { v4 as uuid } from "uuid"

const sqs = new SQSClient({})
const ORDER_QUEUE_URL = process.env.ORDER_QUEUE_URL!

interface OrderRequest {
  session_id: string
  user_id: string
  product_id: string
  quantity: number
  shipping_address: Address
  payment_method_id: string
}

export async function submitOrder(request: OrderRequest): Promise<{ order_id: string }> {
  const orderId = uuid()
  const idempotencyKey = `${request.user_id}:${request.session_id}`

  // Check for duplicate submission
  const existing = await db.orders.findOne({ idempotency_key: idempotencyKey })
  if (existing) {
    return { order_id: existing.id }
  }

  // Create order record in pending state
  await db.orders.insert({
    id: orderId,
    user_id: request.user_id,
    product_id: request.product_id,
    quantity: request.quantity,
    status: "pending",
    idempotency_key: idempotencyKey,
    created_at: new Date(),
  })

  // Queue for async processing
  await sqs.send(
    new SendMessageCommand({
      QueueUrl: ORDER_QUEUE_URL,
      MessageBody: JSON.stringify({
        order_id: orderId,
        ...request,
      }),
      MessageDeduplicationId: idempotencyKey,
      MessageGroupId: request.user_id, // Ensures per-user ordering
    }),
  )

  return { order_id: orderId }
}
```

Order processor (worker):
```typescript
import { SQSEvent, SQSRecord } from "aws-lambda"
import Stripe from "stripe"

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!)

interface OrderMessage {
  order_id: string
  user_id: string
  product_id: string
  quantity: number
  shipping_address: Address
  payment_method_id: string
  session_id: string
}

export async function handler(event: SQSEvent): Promise<void> {
  for (const record of event.Records) {
    await processOrder(record)
  }
}

async function processOrder(record: SQSRecord): Promise<void> {
  const message: OrderMessage = JSON.parse(record.body)

  try {
    // 1. Verify reservation still valid
    const reservation = await getReservation(message.product_id, message.session_id)
    if (!reservation) {
      await markOrderFailed(message.order_id, "reservation_expired")
      return
    }

    // 2. Process payment (Stripe takes the idempotency key as a request option)
    const paymentIntent = await stripe.paymentIntents.create(
      {
        amount: calculateTotal(message.product_id, message.quantity),
        currency: "usd",
        payment_method: message.payment_method_id,
        confirm: true,
      },
      { idempotencyKey: `payment_${message.order_id}` },
    )

    if (paymentIntent.status !== "succeeded") {
      await releaseReservation(message.product_id, message.session_id)
      await markOrderFailed(message.order_id, "payment_failed")
      return
    }

    // 3. Confirm inventory (remove from reserved set)
    await confirmReservation(message.product_id, message.session_id)

    // 4. Update order status
    await db.orders.update(message.order_id, {
      status: "confirmed",
      payment_intent_id: paymentIntent.id,
      confirmed_at: new Date(),
    })

    // 5. Send confirmation
    await sendOrderConfirmation(message.order_id)
  } catch (error) {
    // Rethrow so SQS retries with exponential backoff
    throw error
  }
}

async function markOrderFailed(orderId: string, reason: string): Promise<void> {
  await db.orders.update(orderId, {
    status: "failed",
    failure_reason: reason,
  })

  // Notify user
  await sendOrderFailureNotification(orderId, reason)
}
```

Dead letter queue handling:
Orders that fail after max retries go to a Dead Letter Queue (DLQ) for manual review:
```typescript
export async function handleDeadLetter(record: SQSRecord): Promise<void> {
  const message = JSON.parse(record.body)

  // Log for investigation
  console.error("Order failed permanently", {
    order_id: message.order_id,
    attempts: record.attributes.ApproximateReceiveCount,
    source_queue: record.attributes.DeadLetterQueueSourceArn,
  })

  // Alert ops team
  await pagerduty.createIncident({
    title: `Flash sale order failed: ${message.order_id}`,
    severity: "high",
  })

  // Release inventory back to pool
  await releaseReservation(message.product_id, message.session_id)
}
```

Design decisions:
| Decision | Rationale |
|---|---|
| SQS FIFO queue | Exactly-once processing, per-user ordering |
| Idempotency key | Prevents duplicate orders on retry |
| Payment before confirmation | Never confirm inventory without successful payment |
| DLQ for failures | Ensures no order is silently lost |
Bot Detection and Fairness
Multi-Layer Bot Defense
Layer 1: Edge defense (WAF)
```yaml
# AWS WAF rules for flash sale
Rules:
  - Name: RateLimitPerIP
    Action: Block
    Statement:
      RateBasedStatement:
        Limit: 100 # requests per 5 minutes per IP
        AggregateKeyType: IP

  - Name: BlockKnownBots
    Action: Block
    Statement:
      IPSetReferenceStatement:
        ARN: arn:aws:wafv2:....:ipset/known-bots

  - Name: GeoRestriction
    Action: Block
    Statement:
      NotStatement:
        Statement:
          GeoMatchStatement:
            CountryCodes: [US, CA, GB, DE] # Allowed countries
```

Layer 2: Application-level detection
```typescript
interface BotSignals {
  score: number
  signals: string[]
}

// async because the residential-proxy lookup is awaited below
export async function detectBot(request: Request): Promise<BotSignals> {
  const signals: string[] = []
  let score = 0

  // Device fingerprint consistency
  const fp = request.headers.get("x-device-fingerprint")
  if (!fp || fp.length < 32) {
    signals.push("missing_fingerprint")
    score += 30
  }

  // Behavioral signals
  const timing = parseTimingHeader(request)
  if (timing.pageLoadToAction < 500) {
    // < 500ms is suspicious
    signals.push("fast_interaction")
    score += 25
  }

  // Browser consistency
  const ua = request.headers.get("user-agent")
  const acceptLang = request.headers.get("accept-language")
  if (isHeadlessBrowser(ua) || !acceptLang) {
    signals.push("headless_indicators")
    score += 40
  }

  // Known residential proxy detection
  const ip = getClientIP(request)
  if (await isResidentialProxy(ip)) {
    signals.push("residential_proxy")
    score += 20
  }

  return { score, signals }
}

export function shouldChallenge(signals: BotSignals): boolean {
  return signals.score >= 50
}

export function shouldBlock(signals: BotSignals): boolean {
  return signals.score >= 80
}
```
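A sketch of how these thresholds might gate the request path (handler names are hypothetical): challenge borderline scores with a CAPTCHA, block clear bots outright.

```typescript
// Returns a Response to short-circuit the request, or null to proceed
export async function gateRequest(request: Request): Promise<Response | null> {
  const verdict = await detectBot(request)

  if (shouldBlock(verdict)) {
    return new Response("Forbidden", { status: 403 })
  }
  if (shouldChallenge(verdict)) {
    // Redirect into a CAPTCHA flow before allowing queue join
    return new Response(null, { status: 302, headers: { Location: "/challenge" } })
  }
  return null // proceed to queue join
}
```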
Layer 3: Queue-level protection

```typescript
export async function validateQueueJoin(
  userId: string,
  deviceFingerprint: string,
  saleId: string,
): Promise<{ allowed: boolean; reason?: string }> {
  // Check for duplicate user
  const existingEntry = await findUserInQueue(saleId, userId)
  if (existingEntry) {
    return { allowed: false, reason: "already_in_queue" }
  }

  // Check for fingerprint reuse (same device, different accounts)
  const fpCount = await countFingerprintInQueue(saleId, deviceFingerprint)
  if (fpCount >= 2) {
    return { allowed: false, reason: "device_limit_exceeded" }
  }

  // Velocity check: how many queues has this user joined recently?
  const recentJoins = await countRecentQueueJoins(userId, 3600) // last hour
  if (recentJoins >= 5) {
    return { allowed: false, reason: "velocity_exceeded" }
  }

  return { allowed: true }
}
```

Fairness Mechanisms
1. FIFO queue with randomized entry window
Users who arrive before sale start are randomized when the sale begins (prevents “refresh at exactly 10:00:00” advantage):
```typescript
export async function openSaleQueue(saleId: string): Promise<void> {
  // Get all users who arrived in pre-sale window (e.g., last 15 minutes)
  const earlyArrivals = await getEarlyArrivals(saleId)

  // Shuffle positions randomly
  const shuffled = shuffleArray(earlyArrivals)

  // Assign positions 1, 2, 3, ...
  for (let i = 0; i < shuffled.length; i++) {
    await updatePosition(shuffled[i].queue_ticket, i + 1)
  }

  // Users arriving after sale start get position = current_max + 1 (true FIFO)
}
```
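The shuffleArray helper above (and cryptoShuffle in the raffle variation later) is left undefined in the original; a minimal unbiased Fisher–Yates sketch backed by Node's CSPRNG:

```typescript
import { randomInt } from "crypto"

// Unbiased Fisher–Yates shuffle using crypto.randomInt, suitable for
// fairness-sensitive ordering. Returns a new array; input is not mutated.
function shuffleArray<T>(items: T[]): T[] {
  const result = [...items]
  for (let i = result.length - 1; i > 0; i--) {
    const j = randomInt(i + 1) // uniform in [0, i]
    ;[result[i], result[j]] = [result[j], result[i]]
  }
  return result
}
```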
2. Per-customer purchase limits

```typescript
export async function validatePurchaseLimit(
  userId: string,
  productId: string,
  quantity: number,
): Promise<boolean> {
  const existingOrders = await db.orders.count({
    user_id: userId,
    product_id: productId,
    status: { $in: ["confirmed", "pending"] },
  })

  const LIMIT_PER_USER = 2
  return existingOrders + quantity <= LIMIT_PER_USER
}
```

Frontend Considerations
Waiting Room UX
Critical UX decisions:
| Decision | Implementation | Rationale |
|---|---|---|
| Progress indicator | Position + estimated time + progress bar | Reduces anxiety; users know they’re progressing |
| No refresh needed | SPA with polling | Prevents users from losing position |
| Transparent communication | Show exact position | Trust requires honesty |
| Graceful degradation | Static HTML | Must work even if JS fails |
Optimistic UI for checkout:
```typescript
async function submitOrder(orderData: OrderData): Promise<void> {
  // Optimistic: show "Processing..." immediately
  setOrderStatus("processing")
  showConfirmationPreview(orderData)

  try {
    const { order_id } = await api.submitOrder(orderData)

    // Poll for confirmation (async processing)
    pollOrderStatus(order_id, (status) => {
      if (status === "confirmed") {
        setOrderStatus("confirmed")
        showSuccessAnimation()
      } else if (status === "failed") {
        setOrderStatus("failed")
        showRetryOption()
      }
    })
  } catch (error) {
    // Revert optimistic UI
    setOrderStatus("error")
    showErrorMessage(error)
  }
}
```
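The pollOrderStatus helper is assumed rather than shown; a minimal sketch that polls the order tracking endpoint from the API section until a terminal state:

```typescript
// Polls GET /api/v1/orders/{order_id} (the tracking_url returned on submit)
// and invokes the callback on each update until confirmed or failed.
async function pollOrderStatus(
  orderId: string,
  onUpdate: (status: string) => void,
  intervalMs = 2000,
): Promise<void> {
  const res = await fetch(`/api/v1/orders/${orderId}`)
  const { status } = await res.json()

  onUpdate(status)
  if (status !== "confirmed" && status !== "failed") {
    setTimeout(() => pollOrderStatus(orderId, onUpdate, intervalMs), intervalMs)
  }
}
```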
Real-Time Queue Updates

Polling vs WebSocket decision:
| Factor | Polling | WebSocket |
|---|---|---|
| Scale | Easy (stateless) | Hard (connection management) |
| Latency | 5-10s | Sub-second |
| Infrastructure | Simple | Complex |
| Battery impact | Higher | Lower |
Chosen: Adaptive polling — Poll every 5s when far from front; every 1s when close.
```typescript
function calculatePollInterval(position: number, totalAhead: number): number {
  const progressPercent = 1 - position / totalAhead

  if (progressPercent > 0.9) return 1000 // Top 10%: 1s
  if (progressPercent > 0.7) return 2000 // Top 30%: 2s
  if (progressPercent > 0.5) return 3000 // Top 50%: 3s
  return 5000 // Back 50%: 5s
}
```

Client State Management
```typescript
interface FlashSaleState {
  // Queue state
  queueTicket: string | null
  position: number | null
  status: "idle" | "queued" | "admitted" | "checkout" | "completed" | "expired"

  // Checkout state
  checkoutToken: string | null
  checkoutExpiresAt: Date | null
  reservationId: string | null

  // Order state
  orderId: string | null
  orderStatus: "pending" | "processing" | "confirmed" | "failed" | null
}

// State persisted to localStorage for tab recovery
function persistState(state: FlashSaleState): void {
  localStorage.setItem("flash-sale-state", JSON.stringify(state))
}

// Restore on page load (handles accidental tab close)
function restoreState(): FlashSaleState | null {
  const saved = localStorage.getItem("flash-sale-state")
  if (!saved) return null

  const state = JSON.parse(saved)

  // Check if checkout token is still valid
  if (state.checkoutExpiresAt && new Date(state.checkoutExpiresAt) < new Date()) {
    return null // Expired, start fresh
  }

  return state
}
```

Infrastructure Design
Cloud-Agnostic Components
| Component | Purpose | Requirements |
|---|---|---|
| CDN | Waiting room, static assets | Edge caching, high throughput |
| Serverless compute | Queue service, APIs | Auto-scale, pay-per-use |
| Key-value store | Inventory counters, tokens | Sub-ms latency, atomic operations |
| Document store | Queue state | Single-digit ms, auto-scale |
| Message queue | Order processing | Durability, exactly-once |
| Relational DB | Orders, users | ACID, complex queries |
AWS Reference Architecture
Service configuration:
| Service | Configuration | Rationale |
|---|---|---|
| CloudFront | Origin: S3 (static), Cache: 1 year | Waiting room must survive origin failure |
| API Gateway | Throttling: 10K RPS, Burst: 5K | Protects backend during spike |
| Lambda | Memory: 1024MB, Timeout: 30s, Reserved: 1000 | Predictable latency under load |
| ElastiCache | Redis Cluster, 3 nodes, r6g.large | Sub-ms latency, failover |
| DynamoDB | On-demand, Auto-scaling | Handles unpredictable load |
| SQS FIFO | 3000 msg/sec, 14-day retention | Order durability |
| RDS | Multi-AZ, db.r6g.xlarge, Read replicas | ACID + read scaling |
Self-Hosted Alternatives
| Managed Service | Self-Hosted Option | Trade-off |
|---|---|---|
| ElastiCache | Redis Cluster on EC2 | More control, operational burden |
| DynamoDB | Cassandra/ScyllaDB | Cost at scale, complexity |
| SQS FIFO | Kafka | Higher throughput, operational complexity |
| Lambda | Kubernetes + KEDA | Fine-grained control, cold starts |
Variations
Path B Implementation: Real-Time Counter Model
For e-commerce with dynamic inventory, replace token-based admission with real-time inventory checks:
```typescript
export async function attemptPurchase(
  productId: string,
  userId: string,
  quantity: number,
): Promise<{ success: boolean; orderId?: string }> {
  // Rate limit first (protect backend)
  const allowed = await rateLimiter.check(userId, "purchase")
  if (!allowed) {
    return { success: false }
  }

  // Atomic inventory check + decrement
  const result = await redis.eval(
    `
    local count = redis.call('GET', KEYS[1])
    if tonumber(count) >= tonumber(ARGV[1]) then
      return redis.call('DECRBY', KEYS[1], ARGV[1])
    else
      return -1
    end
    `,
    1,
    `inventory:${productId}`,
    quantity,
  )

  if (result < 0) {
    return { success: false } // Sold out
  }

  // Proceed to order (inventory already decremented)
  const orderId = await createOrder(productId, userId, quantity)
  return { success: true, orderId }
}
```

Key difference: Inventory is decremented at purchase attempt, not at queue admission. Higher risk of “sold out after waiting” but supports dynamic restocking.
VIP Early Access
Add priority tiers to queue service:
```typescript
interface QueueEntry {
  // ... existing fields
  tier: "vip" | "member" | "standard"
  tierJoinedAt: Date
}

export async function getNextPosition(saleId: string, tier: string): Promise<number> {
  // VIPs get positions 1-1000, members 1001-10000, standard 10001+
  const tierOffsets = { vip: 0, member: 1000, standard: 10000 }
  const offset = tierOffsets[tier]

  const countInTier = await ddb.query({
    TableName: "FlashSaleQueue",
    KeyConditionExpression: "sale_id = :sid",
    FilterExpression: "tier = :tier",
    ExpressionAttributeValues: { ":sid": saleId, ":tier": tier },
  })

  return offset + (countInTier.Count || 0) + 1
}
```

Raffle-Based Allocation
For extremely limited inventory (e.g., 100 items, 1M users), replace queue with raffle:
```typescript
export async function enterRaffle(saleId: string, userId: string): Promise<void> {
  // Entry window: 1 hour before draw
  await ddb.put({
    TableName: "FlashSaleRaffle",
    Item: {
      sale_id: saleId,
      user_id: userId,
      entry_id: uuid(),
      entered_at: new Date().toISOString(),
    },
  })
}

export async function drawWinners(saleId: string, count: number): Promise<string[]> {
  // Get all entries
  const entries = await getAllEntries(saleId)

  // Cryptographically random selection (see the Fisher–Yates sketch above)
  const shuffled = cryptoShuffle(entries)
  const winners = shuffled.slice(0, count)

  // Grant checkout tokens to winners
  for (const winner of winners) {
    await grantCheckoutToken(winner.user_id, saleId)
  }

  return winners.map((w) => w.user_id)
}
```

Conclusion
Flash sale systems require coordinated defense at every layer:
- Traffic absorption: CDN-hosted waiting room prevents backend overwhelm. Static HTML + client-side polling scales to millions of users at the edge.
- Fair admission: Token-based queue management (Path A) guarantees a checkout opportunity. FIFO with randomized early arrival prevents the “refresh race.”
- Inventory accuracy: Redis Lua scripts provide atomic check-and-decrement. Zero overselling by construction, not hope.
- Order durability: Async processing via SQS decouples order receipt from processing. A DLQ ensures no order is silently lost.
- Bot defense: Multi-layer detection (WAF → behavioral → queue-level) raises the bar for attackers without blocking legitimate users.
What this design optimizes for:
- Zero overselling (100% inventory accuracy)
- Fairness (transparent queue position)
- Durability (no lost orders)
- Scalability (1M+ concurrent users)
What it sacrifices:
- Latency (queue wait time)
- Simplicity (multiple coordinated services)
- Dynamic inventory (pre-allocation model)
Known limitations:
- Token expiration requires careful tuning (too short: frustrated users; too long: wasted inventory)
- Sophisticated bots with residential proxies remain challenging
- VIP tiers can feel unfair to standard users
Appendix
Prerequisites
- Distributed systems fundamentals (CAP theorem, consistency models)
- Queue theory basics (FIFO, rate limiting)
- Redis data structures and Lua scripting
- Message queue patterns (at-least-once, exactly-once)
- Payment processing (idempotency, webhooks)
Summary
- Flash sales require a waiting room → token gate → atomic inventory → async order queue architecture
- CDN-hosted waiting room absorbs traffic spikes cheaply and reliably
- Token-based admission (Path A) guarantees purchase opportunity and prevents overselling by construction
- Redis Lua scripts provide atomic inventory operations at 500K+ ops/second
- Async order processing via message queues decouples order receipt from fulfillment
- Multi-layer bot defense (WAF + behavioral + queue-level) raises attack cost without blocking legitimate users
References
- Alibaba Cloud: System Stability for Large-Scale Flash Sales - Alibaba Singles Day architecture
- AWS Prime Day 2025 Metrics - Scale benchmarks
- SeatGeek Virtual Waiting Room Architecture - Token-based queue implementation
- Shopify Flash Sales Architecture - Multi-tenant SaaS approach
- Ticketmaster Queue System - Virtual waiting room UX
- Redis Distributed Locks - Atomic operations patterns
- Martin Kleppmann: Designing Data-Intensive Applications - Distributed systems fundamentals