Offline-First Architecture

Building applications that prioritize local data and functionality, treating network connectivity as an enhancement rather than a requirement—the storage APIs, sync strategies, and conflict resolution patterns that power modern collaborative and offline-capable applications.

Offline-first inverts the traditional web model: instead of fetching data from servers and caching it locally, data lives locally first and syncs to servers when possible. This article explores the browser APIs that enable this pattern, the sync strategies that keep data consistent, and how production applications like Figma, Notion, and Linear solve these problems at scale.

[Diagram] Offline-first architecture: the application reads/writes to local storage first (IndexedDB, Cache API) and the UI renders immediately; the Service Worker intercepts requests and manages caching; background sync and conflict resolution with the API server and database happen when connectivity allows.

Offline-first architecture treats local storage as the primary data source and network as a sync mechanism. The core mental model:

  • Local-first data: Application reads from and writes to local storage (IndexedDB, OPFS) immediately. Network operations are asynchronous background tasks, not blocking user interactions. A sketch of this write path follows the list.

  • Service Workers as network proxy: Service Workers intercept all network requests, enabling caching strategies (cache-first, network-first, stale-while-revalidate) and background sync when connectivity returns.

  • Conflict resolution is the hard problem: When multiple clients modify the same data offline, syncing creates conflicts. Three approaches: Last-Write-Wins (simple but loses data), Operational Transform (requires central server), and CRDTs (mathematically guaranteed convergence but complex).

  • Storage is constrained and unreliable: Browser storage quotas vary wildly (Safari: 1GB, Chrome: 60% of disk). Storage can be evicted without warning unless persistent storage is requested and granted.
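
A minimal sketch of that local-first write path; the helpers (saveDocumentLocally, renderDocument, enqueueForSync) are hypothetical app-level functions over IndexedDB and a sync queue:

async function updateDocument(doc: { id: string; content: string }): Promise<void> {
  await saveDocumentLocally(doc) // IndexedDB write: fast, works offline
  renderDocument(doc) // UI updates immediately, no network round-trip
  void enqueueForSync({ // background task, retried until it succeeds
    operation: "update",
    entity: "documents",
    data: doc,
    timestamp: Date.now(),
  })
}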

| Pattern | Complexity | Data Loss Risk | Offline Duration | Best For |
| --- | --- | --- | --- | --- |
| Cache-only | Low | High (stale data) | Minutes | Static assets |
| Sync queue | Medium | Medium (conflicts) | Hours | Form submissions |
| OT-based | High | Low | Days | Real-time collab |
| CRDT-based | Very High | None | Indefinite | P2P, long offline |

Building offline-first applications means working within browser limitations that don’t exist in native apps.

Main thread contention: IndexedDB operations are asynchronous, but their results are delivered (and structured-clone serialized) on the calling thread, so large reads and writes can still cause jank. Service Workers run on a separate thread but share CPU with the page.

Storage quotas: Browsers limit how much data an origin can store, and quotas vary dramatically:

| Browser | Best-Effort Mode | Persistent Mode | Eviction Behavior |
| --- | --- | --- | --- |
| Chrome | 60% of disk | 60% of disk | LRU when >80% full |
| Firefox | 10% of disk (max 10GB) | 50% of disk (max 8TB) | LRU by origin |
| Safari | ~1GB total | Not supported | 7 days without user interaction |

Safari’s aggressive eviction: Safari deletes all website data (IndexedDB, Cache API, localStorage) after 7 days without user interaction when Intelligent Tracking Prevention (ITP) is enabled. This fundamentally breaks long-term offline storage for Safari users.

“After 7 days of Safari use without user interaction on your site, all the website’s script-writable storage forms are deleted.” — WebKit Blog, 2020

Storage API fragmentation: Different storage mechanisms have different characteristics:

| Storage Type | Max Size | Persistence | Indexed | Transaction Support |
| --- | --- | --- | --- | --- |
| localStorage | 5MB | Persistent | No | No |
| sessionStorage | 5MB | Tab session | No | No |
| IndexedDB | Origin quota | Persistent | Yes | Yes |
| Cache API | Origin quota | Persistent | No | No |
| OPFS | Origin quota | Persistent | No | No |

navigator.onLine is unreliable: This API only indicates whether the browser has a network interface—not whether it can reach the internet. A LAN connection without internet access reports online: true.

// Don't rely on this for actual connectivity
navigator.onLine // true even without internet access

// Instead, detect actual connectivity
async function checkConnectivity(): Promise<boolean> {
  try {
    const response = await fetch("/api/health", {
      method: "HEAD",
      cache: "no-store",
    })
    return response.ok
  } catch {
    return false
  }
}

Network transitions are complex: Users move between WiFi, cellular, and offline. Requests can fail mid-flight. Servers can be reachable but slow. Offline-first apps must handle all these states gracefully.
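
A sketch of reacting to those transitions: treat online/offline events as hints, verify with the checkConnectivity probe above, then drain pending work (processSyncQueue here stands for a queue processor like the one shown later):

async function onPossiblyOnline(): Promise<void> {
  if (await checkConnectivity()) {
    void processSyncQueue() // flush writes queued while offline
  }
}

window.addEventListener("online", () => void onPossiblyOnline())
// Re-check when the tab returns to the foreground after time in the background
document.addEventListener("visibilitychange", () => {
  if (document.visibilityState === "visible") void onPossiblyOnline()
})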

The right offline strategy depends on data characteristics:

| Factor | Simple Offline | Full Offline-First |
| --- | --- | --- |
| Data size | < 10MB | 100MB+ |
| Update frequency | < 1/hour | Real-time |
| Concurrent editors | Single user | Multiple users |
| Offline duration | Minutes | Days/weeks |
| Conflict complexity | Overwrites acceptable | Must preserve all edits |

IndexedDB is the primary storage mechanism for offline-first apps. It’s a transactional, indexed object store that can handle large amounts of structured data.

Transaction model: IndexedDB uses transactions with three modes:

  • readonly: Multiple concurrent reads allowed
  • readwrite: Serialized writes, blocks other readwrite transactions on same object stores
  • versionchange: Schema changes, exclusive access to entire database
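
A sketch of a promisified readwrite transaction using these modes: all writes inside one transaction commit or roll back together, and the transaction auto-commits once its callbacks issue no further requests:

function putRecord(db: IDBDatabase, storeName: string, record: unknown): Promise<void> {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(storeName, "readwrite")
    tx.objectStore(storeName).put(record)
    tx.oncomplete = () => resolve()
    tx.onerror = () => reject(tx.error)
    tx.onabort = () => reject(tx.error)
  })
}
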
// Database initialization with versioning
const DB_NAME = "offline-app"
const DB_VERSION = 2

function openDatabase(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(DB_NAME, DB_VERSION)
    request.onupgradeneeded = (event) => {
      const db = request.result
      const oldVersion = event.oldVersion
      // Version 1: Initial schema
      if (oldVersion < 1) {
        const store = db.createObjectStore("documents", { keyPath: "id" })
        store.createIndex("by_updated", "updatedAt")
      }
      // Version 2: Add sync metadata
      if (oldVersion < 2) {
        const store = request.transaction!.objectStore("documents")
        store.createIndex("by_sync_status", "syncStatus")
      }
    }
    request.onsuccess = () => resolve(request.result)
    request.onerror = () => reject(request.error)
  })
}

Versioning is critical: IndexedDB schema changes require version increments. Opening a database with a lower version than exists fails. The onupgradeneeded handler must handle all version migrations sequentially.

Cursor operations for large datasets: For datasets too large to load entirely, use cursors:

// Note: IndexedDB transactions auto-commit when no request is pending,
// so consumers of this generator must not await unrelated slow work
// between iterations.
async function* iterateDocuments(db: IDBDatabase): AsyncGenerator<Document> {
  const tx = db.transaction("documents", "readonly")
  const store = tx.objectStore("documents")
  const request = store.openCursor()
  while (true) {
    const cursor = await new Promise<IDBCursorWithValue | null>((resolve, reject) => {
      request.onsuccess = () => resolve(request.result)
      request.onerror = () => reject(request.error)
    })
    if (!cursor) break
    const value = cursor.value as Document
    // Advance before yielding so a request stays pending on the transaction
    cursor.continue()
    yield value
  }
}

OPFS provides file system access within the browser sandbox—faster than IndexedDB for binary data and large files.

When to use OPFS over IndexedDB:

  • Binary files (images, videos, documents)
  • Large blobs (>10MB)
  • Sequential read/write patterns
  • Web Workers with synchronous access needed
// OPFS access
async function saveFile(name: string, data: ArrayBuffer): Promise<void> {
  const root = await navigator.storage.getDirectory()
  const fileHandle = await root.getFileHandle(name, { create: true })
  // Async API (main thread or worker)
  const writable = await fileHandle.createWritable()
  await writable.write(data)
  await writable.close()
}

// Sync access handles (dedicated workers only) - much faster.
// Acquiring the handle is still async; only reads/writes are synchronous.
async function saveFileFast(name: string, data: ArrayBuffer): Promise<void> {
  const root = await navigator.storage.getDirectory()
  const fileHandle = await root.getFileHandle(name, { create: true })
  const accessHandle = await fileHandle.createSyncAccessHandle()
  accessHandle.write(data) // synchronous
  accessHandle.flush()
  accessHandle.close()
}

OPFS limitations:

  • No indexing (unlike IndexedDB)—you manage your own file organization
  • Synchronous API only in Web Workers
  • No cross-origin access
  • Same quota as IndexedDB (shared origin quota)

The Storage Manager API provides quota information and persistence requests:

async function checkStorageStatus(): Promise<{
  quota: number
  usage: number
  persistent: boolean
}> {
  const estimate = await navigator.storage.estimate()
  const persistent = await navigator.storage.persisted()
  return {
    quota: estimate.quota ?? 0,
    usage: estimate.usage ?? 0,
    persistent,
  }
}

async function requestPersistence(): Promise<boolean> {
  // Chrome auto-grants for "important" sites (bookmarked, installed PWA)
  // Firefox prompts the user
  // Safari doesn't support persistent storage
  if (navigator.storage.persist) {
    return await navigator.storage.persist()
  }
  return false
}

Persistence reality: Without persistent storage, browsers can evict your data at any time when storage pressure occurs. Chrome uses LRU eviction by origin. Safari’s 7-day limit applies regardless of persistence requests.

Design implication: Never assume local data will survive. Always design for re-sync from server. Treat local storage as a cache that improves UX, not as the source of truth.

Service Workers are JavaScript workers that intercept network requests, enabling offline functionality and background sync.

Service Workers have a distinct lifecycle that affects how updates propagate:

Install → Waiting → Activate → Running → Idle → Terminated

(A fetch event wakes an idle or terminated worker back to Running.)

Installation: Service Worker is downloaded and parsed. install event fires—use this to pre-cache critical assets.

Waiting: New Service Worker waits until all tabs using the old version close. This prevents breaking in-flight requests.

Activation: Old Service Worker is replaced. activate event fires—use this to clean up old caches.

service-worker.ts
// Runs in a Service Worker context; the declare gives TypeScript the right global type
declare const self: ServiceWorkerGlobalScope

const CACHE_VERSION = "v2"
const STATIC_CACHE = `static-${CACHE_VERSION}`
const DYNAMIC_CACHE = `dynamic-${CACHE_VERSION}`

self.addEventListener("install", (event: ExtendableEvent) => {
  event.waitUntil(
    caches.open(STATIC_CACHE).then((cache) => {
      return cache.addAll(["/", "/app.js", "/styles.css", "/offline.html"])
    }),
  )
  // Skip waiting to activate immediately (use carefully)
  // self.skipWaiting();
})

self.addEventListener("activate", (event: ExtendableEvent) => {
  event.waitUntil(
    caches.keys().then((keys) => {
      return Promise.all(
        keys.filter((key) => !key.includes(CACHE_VERSION)).map((key) => caches.delete(key)),
      )
    }),
  )
  // Take control of all pages immediately
  // self.clients.claim();
})

skipWaiting pitfall: Calling skipWaiting() activates the new Service Worker immediately, but existing pages still have old JavaScript. This can cause version mismatches between page code and Service Worker. Only use if your update is backward-compatible.

Jake Archibald’s “Offline Cookbook” defines canonical caching strategies. Each has distinct trade-offs:

Cache-First: Serve from cache, fall back to network. Best for static assets that rarely change.

async function cacheFirst(request: Request): Promise<Response> {
  const cached = await caches.match(request)
  if (cached) return cached
  const response = await fetch(request)
  if (response.ok) {
    const cache = await caches.open(STATIC_CACHE)
    cache.put(request, response.clone())
  }
  return response
}

Network-First: Try network, fall back to cache. Best for frequently-updated content where freshness matters.

async function networkFirst(request: Request, timeout = 3000): Promise<Response> {
  try {
    const controller = new AbortController()
    const timeoutId = setTimeout(() => controller.abort(), timeout)
    const response = await fetch(request, { signal: controller.signal })
    clearTimeout(timeoutId)
    if (response.ok) {
      const cache = await caches.open(DYNAMIC_CACHE)
      cache.put(request, response.clone())
    }
    return response
  } catch {
    const cached = await caches.match(request)
    if (cached) return cached
    throw new Error("Network failed and no cache available")
  }
}

Stale-While-Revalidate: Serve from cache immediately, update cache in background. Best for content where slight staleness is acceptable.

async function staleWhileRevalidate(request: Request): Promise<Response> {
  const cache = await caches.open(DYNAMIC_CACHE)
  const cached = await cache.match(request)
  const fetchPromise = fetch(request).then((response) => {
    if (response.ok) {
      cache.put(request, response.clone())
    }
    return response
  })
  return cached ?? fetchPromise
}

Strategy selection by resource type:

| Resource Type | Strategy | Rationale |
| --- | --- | --- |
| App shell (HTML, JS, CSS) | Cache-first with version | Immutable builds |
| API responses | Network-first | Freshness critical |
| User-generated content | Stale-while-revalidate | UX + eventual freshness |
| Images/media | Cache-first | Rarely change |
| Authentication endpoints | Network-only | Must be fresh |
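
One way to wire these strategies together is a single fetch handler that routes by request type, roughly following the table above (the /api/auth/ prefix is an assumed convention, not a required one):

self.addEventListener("fetch", (event: FetchEvent) => {
  const url = new URL(event.request.url)
  if (url.pathname.startsWith("/api/auth/")) {
    return // network-only: fall through to the browser's default handling
  }
  if (url.pathname.startsWith("/api/")) {
    event.respondWith(networkFirst(event.request))
  } else if (event.request.destination === "image") {
    event.respondWith(cacheFirst(event.request))
  } else {
    event.respondWith(staleWhileRevalidate(event.request))
  }
})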

Background Sync API allows deferring actions until connectivity is available:

// In your application code
async function queueSync(data: SyncData): Promise<void> {
  // Store the data in IndexedDB
  await saveToSyncQueue(data)
  // Register for background sync
  const registration = await navigator.serviceWorker.ready
  await registration.sync.register("sync-pending-changes")
}

// In service worker
self.addEventListener("sync", (event: SyncEvent) => {
  if (event.tag === "sync-pending-changes") {
    event.waitUntil(processSyncQueue())
  }
})

async function processSyncQueue(): Promise<void> {
  const pending = await getPendingSyncItems()
  for (const item of pending) {
    try {
      await fetch("/api/sync", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(item),
      })
      await markSynced(item.id)
    } catch {
      // Rethrow so the sync attempt fails and the browser retries later
      throw new Error("Sync failed")
    }
  }
}

Background Sync limitations:

  • Chrome-only (as of 2024; a page-level fallback sketch follows this list)
  • No guarantee of timing—browser decides when to fire sync event
  • Limited to ~3 minutes of execution time
  • Requires Service Worker to be registered
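
A sketch of that fallback for browsers without Background Sync, reusing processSyncQueue from above:

async function registerSyncOrFallback(): Promise<void> {
  const registration = await navigator.serviceWorker.ready
  if ("sync" in registration) {
    await registration.sync.register("sync-pending-changes")
  } else {
    // No OS-level retry guarantee: drain the queue while a page is open
    window.addEventListener("online", () => void processSyncQueue())
    void processSyncQueue()
  }
}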

Periodic Background Sync: Allows periodic sync even when app is closed. Requires explicit permission and Chrome only:

// Check support and register (periodicSync lives on the registration)
const registration = await navigator.serviceWorker.ready
if ("periodicSync" in registration) {
  const status = await navigator.permissions.query({
    name: "periodic-background-sync" as PermissionName,
  })
  if (status.state === "granted") {
    await registration.periodicSync.register("sync-content", {
      minInterval: 24 * 60 * 60 * 1000, // 24 hours minimum
    })
  }
}

Workbox (Google) encapsulates Service Worker patterns in a production-ready library. It’s used by ~54% of mobile sites with Service Workers.

import { precacheAndRoute } from "workbox-precaching"
import { registerRoute } from "workbox-routing"
import { CacheFirst, NetworkFirst, StaleWhileRevalidate } from "workbox-strategies"
import { BackgroundSyncPlugin } from "workbox-background-sync"
import { ExpirationPlugin } from "workbox-expiration"

declare const self: ServiceWorkerGlobalScope & {
  __WB_MANIFEST: Array<string | { url: string; revision: string | null }>
}

// Precache app shell (injected at build time)
precacheAndRoute(self.__WB_MANIFEST)

// API calls: network-first with background sync fallback
registerRoute(
  ({ url }) => url.pathname.startsWith("/api/"),
  new NetworkFirst({
    cacheName: "api-cache",
    plugins: [
      new BackgroundSyncPlugin("api-queue", {
        maxRetentionTime: 24 * 60, // 24 hours, in minutes
      }),
    ],
  }),
)

// Images: cache-first
registerRoute(
  ({ request }) => request.destination === "image",
  new CacheFirst({
    cacheName: "images",
    plugins: [
      new ExpirationPlugin({
        maxEntries: 100,
        maxAgeSeconds: 30 * 24 * 60 * 60, // 30 days
      }),
    ],
  }),
)

Why use Workbox:

  • Handles cache versioning and cleanup automatically
  • Precaching with revision hashing
  • Built-in plugins for expiration, broadcast updates, background sync
  • Webpack/Vite integration for build-time manifest generation

When clients make changes offline, syncing those changes creates the hardest problems in offline-first architecture.

The simplest conflict resolution: most recent timestamp wins.

interface Document {
  id: string
  content: string
  updatedAt: number // Unix timestamp
}

function resolveConflict(local: Document, remote: Document): Document {
  return local.updatedAt > remote.updatedAt ? local : remote
}

When LWW works:

  • Single-user applications
  • Data where loss is acceptable (analytics, logs)
  • Coarse-grained updates (entire document, not fields)

When LWW fails:

  • Multi-user editing (Alice’s changes overwrite Bob’s)
  • Fine-grained updates (field-level changes lost)
  • Clock skew between clients causes wrong “winner”

Clock skew problem: Client clocks can drift. A device with clock set to the future always wins. Solutions:

  • Use server timestamps (but requires connectivity)
  • Hybrid logical clocks (Lamport timestamp + physical time; sketched below)
  • Vector clocks (discussed below)
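
A minimal hybrid logical clock sketch: the physical component tracks wall-clock time and a logical counter breaks ties, so ordering survives skewed device clocks:

interface HLCTimestamp {
  millis: number // physical component (wall clock)
  counter: number // logical component (tie-breaker)
}

// Generate a timestamp for a local event
function hlcTick(last: HLCTimestamp): HLCTimestamp {
  const now = Date.now()
  return now > last.millis
    ? { millis: now, counter: 0 }
    : { millis: last.millis, counter: last.counter + 1 }
}

// Merge in a remote timestamp: advance past both clocks
function hlcReceive(local: HLCTimestamp, remote: HLCTimestamp): HLCTimestamp {
  const millis = Math.max(Date.now(), local.millis, remote.millis)
  let counter = 0
  if (millis === local.millis && millis === remote.millis) {
    counter = Math.max(local.counter, remote.counter) + 1
  } else if (millis === local.millis) {
    counter = local.counter + 1
  } else if (millis === remote.millis) {
    counter = remote.counter + 1
  }
  return { millis, counter }
}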

Vector clocks track causality—which events “happened before” others—without synchronized physical clocks.

type VectorClock = Map<string, number>

function increment(clock: VectorClock, nodeId: string): VectorClock {
  const newClock = new Map(clock)
  newClock.set(nodeId, (newClock.get(nodeId) ?? 0) + 1)
  return newClock
}

function merge(a: VectorClock, b: VectorClock): VectorClock {
  const result = new Map(a)
  for (const [nodeId, count] of b) {
    result.set(nodeId, Math.max(result.get(nodeId) ?? 0, count))
  }
  return result
}

function compare(a: VectorClock, b: VectorClock): "before" | "after" | "concurrent" {
  let aBefore = false
  let bBefore = false
  const allNodes = new Set([...a.keys(), ...b.keys()])
  for (const nodeId of allNodes) {
    const aCount = a.get(nodeId) ?? 0
    const bCount = b.get(nodeId) ?? 0
    if (aCount < bCount) aBefore = true
    if (bCount < aCount) bBefore = true
  }
  if (aBefore && !bBefore) return "before"
  if (bBefore && !aBefore) return "after"
  return "concurrent" // True conflict
}

Vector clocks detect conflicts but don’t resolve them: When compare returns 'concurrent', you have a true conflict that needs application-specific resolution.

Space overhead: Vector clocks grow with the number of writers: for N clients, each clock carries O(N) entries. Dynamo-style systems use “version vectors” with pruning.

Operational Transform models changes as operations that can be transformed when concurrent.

How OT works:

  1. Client captures operations: insert('Hello', position: 0)
  2. Client applies operation locally (optimistic update)
  3. Client sends operation to server
  4. Server transforms operation against concurrent operations
  5. Server broadcasts transformed operation to other clients
interface TextOperation {
  type: "insert" | "delete"
  position: number
  text?: string // for insert
  length?: number // for delete
}

// Transform op1 given op2 was applied first
function transform(op1: TextOperation, op2: TextOperation): TextOperation {
  if (op2.type === "insert") {
    if (op1.position >= op2.position) {
      return { ...op1, position: op1.position + op2.text!.length }
    }
  } else if (op2.type === "delete") {
    if (op1.position >= op2.position + op2.length!) {
      return { ...op1, position: op1.position - op2.length! }
    }
    // More complex cases: overlapping deletes, etc.
  }
  return op1
}

OT requires central coordination: The server maintains operation history and performs transforms. This means OT doesn’t work for true peer-to-peer or extended offline scenarios.

OT complexity: Transformation functions are notoriously difficult to get right. Google Docs has had OT-related bugs despite years of engineering. The transformation must satisfy mathematical properties (convergence, intention preservation) that are hard to verify.

Where OT excels: Real-time collaborative editing with always-on connectivity. Low latency because changes apply immediately with optimistic updates.

CRDTs are data structures mathematically designed to merge without conflicts. Any order of applying changes converges to the same result.

Two types of CRDTs:

State-based (CvRDT): Replicate entire state, merge using mathematical join.

// G-Counter: Grow-only counter
type GCounter = Map<string, number>

function increment(counter: GCounter, nodeId: string): GCounter {
  const newCounter = new Map(counter)
  newCounter.set(nodeId, (newCounter.get(nodeId) ?? 0) + 1)
  return newCounter
}

function merge(a: GCounter, b: GCounter): GCounter {
  const result = new Map(a)
  for (const [nodeId, count] of b) {
    result.set(nodeId, Math.max(result.get(nodeId) ?? 0, count))
  }
  return result
}

function value(counter: GCounter): number {
  return Array.from(counter.values()).reduce((sum, n) => sum + n, 0)
}

Operation-based (CmRDT): Replicate operations, apply in any order. Requires reliable delivery (all operations eventually arrive).

Common CRDT types:

| CRDT | Use Case | Trade-off |
| --- | --- | --- |
| G-Counter | Likes, views | Grow-only |
| PN-Counter | Votes (up/down) | Two G-Counters |
| G-Set | Tags, followers | Grow-only set |
| OR-Set (Observed-Remove) | General sets | Handles concurrent add/remove |
| LWW-Register | Single values | Last-write-wins |
| RGA (Replicated Growable Array) | Text editing | Complex, high overhead |
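
To make the OR-Set row concrete, a minimal tombstone-based sketch: every add carries a unique tag, and a remove tombstones only the tags it has observed, so a concurrent add with a fresh tag survives the merge:

type Tag = string // e.g. `${clientId}:${seq}`

interface ORSet<T> {
  added: Map<T, Set<Tag>> // tags from add operations
  removed: Set<Tag> // tombstoned tags from remove operations
}

function has<T>(set: ORSet<T>, value: T): boolean {
  const tags = set.added.get(value)
  if (!tags) return false
  for (const t of tags) if (!set.removed.has(t)) return true
  return false
}

function add<T>(set: ORSet<T>, value: T, tag: Tag): void {
  const tags = set.added.get(value) ?? new Set<Tag>()
  tags.add(tag)
  set.added.set(value, tags)
}

function remove<T>(set: ORSet<T>, value: T): void {
  // Tombstone only locally observed tags; a concurrent add survives
  for (const t of set.added.get(value) ?? []) set.removed.add(t)
}

function mergeORSet<T>(a: ORSet<T>, b: ORSet<T>): ORSet<T> {
  const added = new Map(a.added)
  for (const [value, tags] of b.added) {
    added.set(value, new Set([...(added.get(value) ?? []), ...tags]))
  }
  return { added, removed: new Set([...a.removed, ...b.removed]) }
}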

Text CRDTs: For collaborative text editing, specialized CRDTs like RGA, WOOT, or Yjs’s implementation track character positions with unique IDs that survive concurrent edits.

// Simplified RGA node structure
interface RGANode {
  id: { clientId: string; seq: number }
  char: string
  tombstone: boolean // Deleted but kept for ordering
  after: RGANode["id"] | null // Insert position
}

CRDT trade-offs:

  • Pros: Mathematically guaranteed convergence, works fully offline, no central server required
  • Cons: High memory overhead (tombstones, metadata), complex implementation, eventual consistency only

“CRDTs are the only data structures that can guarantee consistency in a fully decentralized system, but many published algorithms have subtle bugs. It’s easy to implement CRDTs badly.” — Martin Kleppmann

Interleaving anomaly: When two users type “foo” and “bar” at the same position, naive CRDTs may produce “fboaor” instead of “foobar” or “barfoo”. Production CRDTs (Yjs, Automerge) handle this with sophisticated tie-breaking.

| Factor | LWW | Vector Clocks | OT | CRDT |
| --- | --- | --- | --- | --- |
| Conflict resolution | Automatic (lossy) | Detect only | Server-based | Automatic (lossless) |
| Offline duration | Any | Any | Short (needs server) | Any |
| Implementation complexity | Low | Medium | High | Very High |
| Memory overhead | Low | Medium | Low | High |
| P2P support | Yes | Partial | No | Yes |
| Data loss risk | High | Application-dependent | Low | None |

Architecture: Service Worker caches static assets and API responses. No local database. Changes require network.

Browser → Service Worker → Cache API → (Network when available)

Best for:

  • Read-heavy applications (news, documentation)
  • Short offline periods (subway, airplane mode)
  • Content that doesn’t change offline

Implementation complexity:

| Aspect | Effort |
| --- | --- |
| Initial setup | Low |
| Feature additions | Low |
| Sync logic | None |
| Testing | Low |

Device/network profile:

  • Works well on: All devices, any network
  • Struggles on: Extended offline, collaborative editing

Trade-offs:

  • Simplest implementation
  • No sync conflicts
  • Limited offline functionality
  • Stale data possible

Architecture: Changes stored in IndexedDB queue, processed when online. Server is source of truth.

App → IndexedDB (queue) → Background Sync → Server → IndexedDB (confirmed)

Best for:

  • Form submissions (surveys, orders)
  • Single-user data (personal notes, todos)
  • Tolerance for occasional conflicts

Implementation complexity:

| Aspect | Effort |
| --- | --- |
| Initial setup | Medium |
| Feature additions | Medium |
| Sync logic | Medium (queue management) |
| Testing | Medium |

Key implementation concerns:

  • Idempotency: Server must handle duplicate submissions
  • Ordering: Queue processes FIFO, but network latency can reorder
  • Failure handling: Permanent failures need user notification
// getSyncQueue, sendToServer, and the other helpers are assumed app-level
// functions over the IndexedDB queue store; MAX_RETRIES is an app constant.
interface SyncQueueItem {
  id: string
  operation: "create" | "update" | "delete"
  entity: string
  data: unknown
  timestamp: number
  retries: number
}

async function processQueue(): Promise<void> {
  const queue = await getSyncQueue()
  for (const item of queue) {
    try {
      await sendToServer(item)
      await removeFromQueue(item.id)
    } catch (error) {
      if (isPermanentError(error)) {
        await markAsFailed(item.id)
        notifyUser(`Failed to sync: ${item.entity}`)
      } else {
        await incrementRetry(item.id)
        if (item.retries >= MAX_RETRIES) {
          await markAsFailed(item.id)
        }
      }
    }
  }
}

Trade-offs:

  • Handles common offline scenarios
  • Server-side conflict resolution
  • May lose changes on permanent failures
  • Doesn’t support real-time collaboration

Architecture: Local CRDT state is authoritative. Peers sync directly or through relay server. No central source of truth.

App → CRDT State (IndexedDB) ↔ Peer/Server ↔ Other Clients' CRDT State

Best for:

  • Collaborative editing (documents, whiteboards)
  • P2P applications
  • Extended offline with multiple editors

Implementation complexity:

| Aspect | Effort |
| --- | --- |
| Initial setup | High |
| Feature additions | High |
| Sync logic | Very High (CRDT implementation) |
| Testing | Very High |

Library options:

| Library | Focus | Bundle Size | Mature |
| --- | --- | --- | --- |
| Yjs | Text/structured data | ~15KB | Yes |
| Automerge | JSON documents | ~100KB | Yes |
| Liveblocks | Real-time + CRDT | SaaS | Yes |
| ElectricSQL | Postgres sync | ~50KB | Emerging |
import * as Y from "yjs"
import { IndexeddbPersistence } from "y-indexeddb"
import { WebsocketProvider } from "y-websocket"

// Create CRDT document
const doc = new Y.Doc()

// Persist to IndexedDB
const persistence = new IndexeddbPersistence("my-doc", doc)

// Sync with server/peers when online
const provider = new WebsocketProvider("wss://sync.example.com", "my-doc", doc)

// Get shared types
const text = doc.getText("content")
const todos = doc.getArray("todos")

// Changes automatically sync
text.insert(0, "Hello")

Trade-offs:

  • True offline-first with guaranteed convergence
  • Supports P2P architecture
  • Complex implementation
  • High memory overhead
  • Eventual consistency only (no transactions)

Choosing a pattern (decision tree):

  • Need offline? No → Traditional SPA
  • Need offline? Yes → How long offline?
    • Minutes → Cache-only PWA
    • Hours/Days → Multi-user edits?
      • No → Sync queue
      • Yes → Always online eventually?
        • Yes → OT-based (Google Docs style)
        • No (P2P/long offline) → CRDT-based (Figma/Notion style)

Challenge: Complex vector graphics with potentially millions of objects, multiple concurrent editors.

Approach:

  • CRDT-based multiplayer engine
  • 30-day offline window (7 days on Safari due to ITP)
  • Changes stored in IndexedDB with timestamp metadata
  • On reconnect, changes merge via CRDT semantics

Technical details:

  • WebAssembly for CRDT operations (performance critical)
  • Custom CRDT for vector graphics (not text-focused)
  • Selective sync—only download what’s viewed, not entire project
  • Background prefetch of likely-needed files

Limitation: Can’t download entire project for offline. Must have previously opened a file within the offline window.

Key insight: “The hardest part isn’t the CRDT—it’s making the UX feel instant while syncing in the background.” — Evan Wallace, Figma CTO

Challenge: Rich text documents with blocks (paragraphs, code, embeds), tables, and databases.

Approach:

  • Custom CRDT system (inspired by Martin Kleppmann’s research)
  • Per-page sync with lastDownloadedTimestamp tracking
  • Selective sync—only fetch pages with newer lastUpdatedTime on server

Technical details:

  • Peritext integration for rich text formatting CRDTs (handles formatting spans)
  • Database views sync separately from underlying data
  • 50-row database limit in initial offline version (increased over time)

Limitation: Non-text properties (select fields, relations) harder to merge. Some conflict resolution requires user intervention for complex database changes.

Source: Notion engineering blog, 2024

Challenge: Project management with issues, projects, and workflows. Must feel instant.

Approach:

  • Bootstrap process downloads initial state
  • WebSocket for incremental delta packets
  • IndexedDB for local cache, not full offline editing
  • Server maintains authoritative sync ID (incremental integer)

Technical details:

  • Sync ID increments with each server-side transaction
  • Client tracks last seen sync ID, requests deltas since that ID (sketched below)
  • Optimistic updates with rollback on server rejection
  • Not true offline-first—designed as connectivity failsafe
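
A sketch of that delta protocol; the endpoint shape, field names, and helpers (getLocalMeta, setLocalMeta, applyToLocalCache) are assumptions for illustration, not Linear's actual API:

interface DeltaPacket {
  syncId: number // server's authoritative, monotonically increasing ID
  changes: Array<{ entity: string; id: string; data: unknown }>
}

async function pullDeltas(): Promise<void> {
  const lastSyncId = (await getLocalMeta("lastSyncId")) ?? 0
  const response = await fetch(`/api/sync/delta?since=${lastSyncId}`)
  const packet: DeltaPacket = await response.json()
  for (const change of packet.changes) {
    await applyToLocalCache(change) // upsert into IndexedDB
  }
  await setLocalMeta("lastSyncId", packet.syncId)
}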

Trade-off accepted: Offline is “best effort”—can view cached data, but edits require eventual connectivity. Simpler than full CRDT but limits offline duration.

Challenge: Collaborative whiteboard with no backend requirement.

Approach:

  • Pseudo-P2P: Central server relays end-to-end encrypted messages
  • State stored in localStorage (keys: excalidraw for objects, excalidraw-state for UI)
  • Union merge for conflict resolution—all elements from all clients combine
  • End-to-end encryption—server never sees content

Technical details:

  • WebSockets via Socket.IO for message relay
  • No server-side storage—all state is client-side
  • Room-based collaboration with shareable links
  • Works fully offline for local edits

Limitation: Union merge means no true delete—“deleted” elements can reappear if another client hadn’t seen the delete. Trade-off for simplicity.
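
A sketch of union merge in this spirit: combine elements from both sides by id, keeping the higher version of each (the Element shape here is illustrative, not Excalidraw's actual schema):

interface Element {
  id: string
  version: number // bumped on every local edit
  isDeleted: boolean // soft delete; can be resurrected by a concurrent edit
}

function unionMerge(local: Element[], remote: Element[]): Element[] {
  const byId = new Map<string, Element>()
  for (const el of [...local, ...remote]) {
    const existing = byId.get(el.id)
    if (!existing || el.version > existing.version) {
      byId.set(el.id, el)
    }
  }
  return [...byId.values()]
}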

async function manageStorageQuota(): Promise<void> {
  const { quota, usage } = await navigator.storage.estimate()
  const usagePercent = ((usage ?? 0) / (quota ?? 1)) * 100
  if (usagePercent > 80) {
    // Proactive cleanup before hitting quota
    await evictOldCache()
  }
  if (usagePercent > 95) {
    // Critical: may start failing writes
    await aggressiveCleanup() // app-specific, more drastic eviction
    notifyUser("Storage nearly full")
  }
}

async function evictOldCache(): Promise<void> {
  const cache = await caches.open("dynamic-cache")
  const requests = await cache.keys()
  // Sort by access time (stored in custom header or IndexedDB metadata)
  const sorted = await sortByLastAccess(requests)
  // Evict oldest 20%
  const toEvict = sorted.slice(0, Math.floor(sorted.length * 0.2))
  await Promise.all(toEvict.map((req) => cache.delete(req)))
}

Quota exceeded handling: When quota is exceeded, IndexedDB and Cache API throw QuotaExceededError. Always wrap storage operations:

async function safeWrite(key: string, value: unknown): Promise<boolean> {
  try {
    await writeToIndexedDB(key, value)
    return true
  } catch (error) {
    if (error instanceof DOMException && error.name === "QuotaExceededError") {
      await evictOldCache()
      try {
        await writeToIndexedDB(key, value)
        return true
      } catch {
        notifyUser("Storage full. Some data may not be saved offline.")
        return false
      }
    }
    throw error
  }
}

Safari’s ITP deletes all script-writable storage after 7 days without user interaction. Mitigation strategies:

  1. Prompt for PWA installation: Installed PWAs are exempt from 7-day limit
  2. Request persistent storage: Not supported in Safari, but doesn’t hurt
  3. Design for re-sync: Assume local data may disappear
  4. Track last interaction: Warn users approaching 7-day cliff
const SAFARI_EVICTION_DAYS = 7

function checkEvictionRisk(): { daysRemaining: number; atRisk: boolean } {
  const lastInteraction = localStorage.getItem("lastInteraction")
  if (!lastInteraction) {
    localStorage.setItem("lastInteraction", Date.now().toString())
    return { daysRemaining: SAFARI_EVICTION_DAYS, atRisk: false }
  }
  const daysSince = (Date.now() - parseInt(lastInteraction)) / (1000 * 60 * 60 * 24)
  const daysRemaining = SAFARI_EVICTION_DAYS - daysSince
  // Update interaction timestamp
  localStorage.setItem("lastInteraction", Date.now().toString())
  return {
    daysRemaining: Math.max(0, daysRemaining),
    atRisk: daysRemaining < 2,
  }
}

Offline-first behavior varies significantly across browsers. Test matrix:

| Scenario | Chrome | Firefox | Safari |
| --- | --- | --- | --- |
| Quota exceeded | QuotaExceededError | QuotaExceededError | QuotaExceededError |
| Persistent storage | Auto-grant for PWAs | User prompt | Not supported |
| Background sync | Supported | Not supported | Not supported |
| Service Worker + private mode | Works | Works | Limited |
| IndexedDB in iframe | Works | Works | Blocked (3rd party) |
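
One way to exercise these scenarios in CI: Playwright can toggle offline mode per browser context. A sketch, assuming a /notes route and "Offline"/"Synced" indicators in the app under test:

import { test, expect } from "@playwright/test"

test("edits made offline survive reconnect", async ({ page, context }) => {
  await page.goto("/notes")
  await context.setOffline(true) // simulate losing connectivity
  await page.getByRole("textbox").fill("written while offline")
  await expect(page.getByText("Offline")).toBeVisible()
  await context.setOffline(false) // reconnect; background sync should run
  await expect(page.getByText("Synced")).toBeVisible()
})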

The mistake: Using navigator.onLine to determine if sync should happen.

// Don't do this
if (navigator.onLine) {
  await syncData()
}

Why it fails: navigator.onLine only checks for network interface, not internet connectivity. LAN without internet, captive portals, and firewalls all report online: true.

The fix: Use actual fetch with timeout as connectivity check:

async function canReachServer(): Promise<boolean> {
  try {
    const controller = new AbortController()
    const timeoutId = setTimeout(() => controller.abort(), 5000)
    const response = await fetch("/api/health", {
      method: "HEAD",
      signal: controller.signal,
      cache: "no-store",
    })
    clearTimeout(timeoutId)
    return response.ok
  } catch {
    return false
  }
}

The mistake: Not handling schema upgrades properly.

// Dangerous: no version handling
const request = indexedDB.open("mydb")
request.onsuccess = () => {
  const db = request.result
  const tx = db.transaction("users", "readwrite") // May not exist!
}

Why it fails: If schema changes, existing users have old schema. Without proper onupgradeneeded, code accessing new object stores crashes.

The fix: Always increment version and handle migrations:

const DB_VERSION = 3 // Increment with each schema change
const request = indexedDB.open("mydb", DB_VERSION)
request.onupgradeneeded = (event) => {
  const db = request.result
  const oldVersion = event.oldVersion
  // Migrate through each version
  if (oldVersion < 1) {
    db.createObjectStore("users", { keyPath: "id" })
  }
  if (oldVersion < 2) {
    db.createObjectStore("settings", { keyPath: "key" })
  }
  if (oldVersion < 3) {
    const users = request.transaction!.objectStore("users")
    users.createIndex("by_email", "email", { unique: true })
  }
}

The mistake: Using skipWaiting() without considering page code compatibility.

Why it fails: Old page JavaScript + new Service Worker can have API mismatches. Cached responses may not match expected format.

The fix: Either reload the page after Service Worker update, or ensure backward compatibility:

// In Service Worker
self.addEventListener("message", (event) => {
  if (event.data === "skipWaiting") {
    self.skipWaiting()
  }
})

// In page
navigator.serviceWorker.addEventListener("controllerchange", () => {
  // New SW took over, reload to ensure consistency
  window.location.reload()
})

The mistake: Caching without eviction policy.

// Grows forever
const cache = await caches.open("api-responses")
cache.put(request, response) // Never cleaned up

Why it fails: Eventually hits quota, causing write failures. User experience degrades suddenly rather than gracefully.

The fix: Implement LRU or time-based eviction:

const MAX_CACHE_ENTRIES = 100
const MAX_CACHE_AGE_MS = 7 * 24 * 60 * 60 * 1000 // 7 days

async function cacheWithEviction(request: Request, response: Response): Promise<void> {
  const cache = await caches.open("api-responses")
  const keys = await cache.keys()
  // Evict if over limit
  if (keys.length >= MAX_CACHE_ENTRIES) {
    await cache.delete(keys[0]) // FIFO, or implement LRU
  }
  // Store with timestamp; pass a clone in if the original response body
  // still needs to be returned to the page (bodies can be read only once)
  const headers = new Headers(response.headers)
  headers.set("x-cached-at", Date.now().toString())
  const newResponse = new Response(response.body, {
    status: response.status,
    headers,
  })
  await cache.put(request, newResponse)
}

The mistake: Assuming conflicts won’t happen because “users don’t edit the same thing.”

Why it fails: Conflicts happen when the same user edits on multiple devices, when sync is delayed, or when retries duplicate operations.

The fix: Design for conflicts from the start:

  • Use idempotent operations with unique IDs (sketched after this list)
  • Implement conflict detection and resolution UI
  • Log conflicts for debugging
  • Test with simulated network partitions
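
A sketch of idempotent mutations: the client assigns each logical operation a unique ID that is stable across retries, so the server can deduplicate repeated deliveries (crypto.randomUUID() is available in modern browsers):

interface Mutation {
  opId: string // generated once, reused on every retry of this operation
  entity: string
  payload: unknown
}

function createMutation(entity: string, payload: unknown): Mutation {
  return { opId: crypto.randomUUID(), entity, payload }
}
// Server side (pseudocode): if opId has been applied before, return the
// stored result instead of applying the mutation again.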

Offline-first architecture inverts the traditional web assumption: data lives locally, sync is background, and network is optional. This enables responsive UX regardless of connectivity but introduces complexity in storage management, sync strategies, and conflict resolution.

Key architectural decisions:

Storage choice: IndexedDB for structured data with indexing needs, OPFS for binary files and performance-critical access, Cache API for HTTP responses. All share origin quota—monitor and manage proactively.

Sync strategy: LWW for simple, loss-tolerant cases. Sync queues for form-style interactions. OT for real-time collaboration with reliable connectivity. CRDTs for true offline-first with guaranteed convergence.

Browser reality: Safari’s 7-day eviction breaks long-term offline. Persistent storage is unreliable. navigator.onLine is useless. Design for data loss and re-sync.

The technology is mature—Yjs, Automerge, and Workbox provide production-ready foundations. The complexity is in choosing the right trade-offs for your use case and handling the edge cases that browser APIs don’t abstract away.

Prerequisites:

  • Browser storage APIs: localStorage, IndexedDB concepts
  • Service Workers: Basic lifecycle and fetch interception
  • Distributed systems basics: Consistency models, network partitions
  • Promises/async: Modern JavaScript async patterns

Glossary:

  • CmRDT: Commutative/operation-based CRDT—replicate operations, apply in any order
  • CvRDT: Convergent/state-based CRDT—replicate state, merge with join function
  • ITP: Intelligent Tracking Prevention—Safari’s privacy feature that limits storage
  • LWW: Last-Write-Wins—conflict resolution where latest timestamp wins
  • OPFS: Origin Private File System—browser file system API
  • OT: Operational Transform—sync strategy that transforms concurrent operations
  • PWA: Progressive Web App—web app with offline capability via Service Worker
  • Tombstone: Marker for deleted item in CRDT—kept for ordering, never truly removed

Key takeaways:

  • Local-first data model: Application reads/writes to IndexedDB or OPFS immediately; network sync is asynchronous
  • Service Workers: Intercept requests, implement caching strategies (cache-first, network-first, stale-while-revalidate), enable background sync
  • Storage constraints: Quotas vary (Safari ~1GB, Chrome 60% disk); Safari evicts after 7 days without interaction; persistent storage helps but isn’t guaranteed
  • Conflict resolution: LWW loses data; OT requires server; CRDTs guarantee convergence but are complex; choose based on offline duration and collaboration needs
  • Production patterns: Figma uses CRDTs with 30-day window; Notion uses CRDTs with selective sync; Linear uses delta sync (not true offline-first); Excalidraw uses union merge with localStorage
