Offline-First Architecture

Building applications that prioritize local data and functionality, treating network connectivity as an enhancement rather than a requirement—the storage APIs, sync strategies, and conflict resolution patterns that power modern collaborative and offline-capable applications.

Offline-first inverts the traditional web model: instead of fetching data from servers and caching it locally, data lives locally first and syncs to servers when possible. This article explores the browser APIs that enable this pattern, the sync strategies that keep data consistent, and how production applications like Figma, Notion, and Linear solve these problems at scale.

Offline-first architecture: the app reads and writes to IndexedDB or OPFS immediately, the Service Worker intercepts fetches and runs background sync, and reconciliation happens between the local store and the server when connectivity returns.

Abstract

Offline-first architecture treats local storage as the primary data source and network as a sync mechanism. The core mental model:

Local-first data: Application reads from and writes to local storage (IndexedDB, OPFS) immediately. Network operations are asynchronous background tasks, not blocking user interactions.
Service Workers as network proxy: Service Workers intercept all network requests, enabling caching strategies (cache-first, network-first, stale-while-revalidate) and background sync when connectivity returns.
Conflict resolution is the hard problem: When multiple clients modify the same data offline, syncing creates conflicts. Three approaches: Last-Write-Wins (simple but loses data), Operational Transform (requires central server), and CRDTs (mathematically guaranteed convergence but complex).
Storage is constrained and unreliable: Browser storage quotas are mostly disk-percentage based (Chrome and Safari around 60% of disk per origin, Firefox 10%/50%), but Safari evicts the entire script-writable surface after seven days of browser use without site interaction. Persistent storage helps in Chromium/Firefox but is a no-op in Safari.

Pattern	Complexity	Data Loss Risk	Offline Duration	Best For
Cache-only	Low	High (stale data)	Minutes	Static assets
Sync queue	Medium	Medium (conflicts)	Hours	Form submissions
OT-based	High	Low	Days	Real-time collab
CRDT-based	Very High	None	Indefinite	P2P, long offline

Local-First Principles

“Offline-first” is sometimes used loosely to mean “the page doesn’t blow up on a flaky train”. The stricter framing comes from Kleppmann et al., “Local-first software” (Ink & Switch, 2019), which sets seven ideals a fully local-first system aims for:

1. No spinners: the app responds to input without round-tripping a server.
2. Your work is not trapped on one device: state syncs across the user’s devices.
3. The network is optional: full functionality offline; sync runs in the background when a connection appears.
4. Seamless collaboration: real-time multi-user editing on par with cloud apps.
5. Long-term preservation: data outlives the application and the company that shipped it.
6. Security and privacy by default: end-to-end encryption; the server holds ciphertext it cannot read.
7. You retain ultimate ownership and control: the user can fork, export, and walk away.

Most production “offline-first” web apps satisfy 1–4 and partially 5; ideals 6 and 7 are usually only met by P2P + CRDT systems like Excalidraw or Ink & Switch’s own research prototypes. Treat the seven ideals as a target gradient, not a binary check — every architecture in this article makes a different cut.

The Challenge

Browser Constraints

Building offline-first applications means working within browser limitations that don’t exist in native apps.

Main thread contention: IndexedDB operations are asynchronous but still affect the main thread. Large reads/writes can cause jank. Service Workers run on a separate thread but share CPU with the page.

Storage quotas: Browsers limit how much data an origin can store, and quotas vary dramatically. The numbers below are from the MDN storage quotas reference (2026):

Browser	Best-Effort Mode	Persistent Mode	Eviction Behavior
Chrome	Up to 60% of disk per origin	Up to 60% of disk per origin	LRU once group quota (80% of disk) is full
Firefox	Smaller of 10% disk or 10 GiB (per eTLD+1)	Up to 50% of disk, capped 8 TiB	LRU per group, after user prompt
Safari	Up to 60% of disk per origin (browser app)	Persistence flag is no-op	7 days of browser use without interaction

Note

Safari’s ~1 GB cap is a frequently repeated but obsolete number. WebKit moved to disk-percentage quotas (60% per origin in the browser app, 15% in embedded WebViews) several releases back. The headline constraint on Safari today is the 7-day eviction, not the size cap.

Safari’s aggressive eviction: Safari deletes all script-writable storage (IndexedDB, Cache API, localStorage, sessionStorage, Service Worker registrations) after seven days of Safari use without a meaningful interaction on the site. “Seven days of Safari use” means days the browser is actively used, not seven calendar days — and the only exemptions are home-screen PWAs, which carry their own usage counter. This fundamentally breaks long-term offline storage for Safari users on web pages.

Storage API fragmentation: Different storage mechanisms have different characteristics:

Storage Type	Max Size	Persistence	Indexed	Transaction Support
localStorage	5MB	Persistent	No	No
sessionStorage	5MB	Tab session	No	No
IndexedDB	Origin quota	Persistent	Yes	Yes
Cache API	Origin quota	Persistent	No	No
OPFS	Origin quota	Persistent	No	No

Network Realities

navigator.onLine is unreliable: MDN explicitly warns that browsers may not have a reliable way to know whether the device can actually reach the internet. The flag only reflects whether the browser has a usable network interface, so a LAN connection behind a captive portal, or a firewall that blocks egress, still reports navigator.onLine === true.

// Don't rely on this for actual connectivity
navigator.onLine // true even without internet access

// Instead, detect actual connectivity
async function checkConnectivity(): Promise<boolean> {
  try {
    const response = await fetch("/api/health", {
      method: "HEAD",
      cache: "no-store",
    })
    return response.ok
  } catch {
    return false
  }
}

Network transitions are complex: Users move between WiFi, cellular, and offline. Requests can fail mid-flight. Servers can be reachable but slow. Offline-first apps must handle all these states gracefully.
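One common way to cope with these transitions is to retry failed requests with capped exponential backoff plus jitter. The sketch below is illustrative only: `backoffDelay`, its parameters, and its default values are assumptions, not part of any browser API.

```typescript
// Hypothetical helper: capped exponential backoff with "full jitter".
// `random` is injectable so the function is deterministic in tests.
function backoffDelay(
  attempt: number, // 0 for the first retry
  baseMs = 500, // assumed starting delay
  capMs = 30000, // assumed upper bound
  random: () => number = Math.random,
): number {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt)
  // Full jitter: pick uniformly in [0, exponential) so clients that
  // reconnect at the same moment do not retry in lockstep.
  return Math.floor(random() * exponential)
}
```

A caller would wait `backoffDelay(n)` milliseconds before retry number `n`, and reset `n` to zero after the first successful response.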

Scale Factors

The right offline strategy depends on data characteristics:

Factor	Simple Offline	Full Offline-First
Data size	< 10MB	100MB+
Update frequency	< 1/hour	Real-time
Concurrent editors	Single user	Multiple users
Offline duration	Minutes	Days/weeks
Conflict complexity	Overwrites acceptable	Must preserve all edits

Storage Layer

IndexedDB: The Foundation

IndexedDB is the primary storage mechanism for offline-first apps. It’s a transactional, indexed object store that can handle large amounts of structured data.

Transaction model: IndexedDB uses transactions with three modes:

readonly: Multiple concurrent reads allowed
readwrite: Serialized writes, blocks other readwrite transactions on same object stores
versionchange: Schema changes, exclusive access to entire database

// Database initialization with versioning
const DB_NAME = "offline-app"
const DB_VERSION = 2

function openDatabase(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(DB_NAME, DB_VERSION)

    request.onupgradeneeded = (event) => {
      const db = request.result
      const oldVersion = event.oldVersion

      // Version 1: Initial schema
      if (oldVersion < 1) {
        const store = db.createObjectStore("documents", { keyPath: "id" })
        store.createIndex("by_updated", "updatedAt")
      }
      // Version 2: Add sync metadata
      if (oldVersion < 2) {
        const store = request.transaction!.objectStore("documents")
        store.createIndex("by_sync_status", "syncStatus")
      }
    }

    request.onsuccess = () => resolve(request.result)
    request.onerror = () => reject(request.error)
  })
}

Versioning is critical: IndexedDB schema changes require a version increment. Opening a database with a version lower than the stored one fails with a VersionError. The onupgradeneeded handler must apply every migration between the old and the new version, in order.
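As the number of schema versions grows, the chain of `if (oldVersion < n)` checks can be replaced by a small migration table. The following is a minimal sketch under stated assumptions: `Migration`, `runMigrations`, and the generic `Ctx` parameter are hypothetical names, and `Ctx` stands in for whatever the upgrade handler needs (in a browser, the database and the versionchange transaction).

```typescript
// Hypothetical migration registry, not an IndexedDB API.
interface Migration<Ctx> {
  version: number // schema version this step produces
  // Keep steps synchronous so the versionchange transaction stays active.
  up: (ctx: Ctx) => void
}

function runMigrations<Ctx>(
  migrations: Migration<Ctx>[],
  ctx: Ctx,
  oldVersion: number,
  newVersion: number,
): number[] {
  const applied: number[] = []
  const ordered = [...migrations].sort((a, b) => a.version - b.version)
  for (const migration of ordered) {
    // Skip steps the stored database already has, and steps beyond the target.
    if (migration.version > oldVersion && migration.version <= newVersion) {
      migration.up(ctx)
      applied.push(migration.version)
    }
  }
  return applied
}
```

Inside an upgrade handler, `oldVersion` would come from the event and `newVersion` from the version passed to `indexedDB.open()`.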

Cursor operations for large datasets: For datasets too large to load entirely, use cursors. Each cursor.continue() re-fires onsuccess on the original request, so a single onsuccess handler per cursor is the correct shape:

async function* iterateDocuments(db: IDBDatabase): AsyncGenerator<Document> {
  const tx = db.transaction("documents", "readonly")
  const store = tx.objectStore("documents")
  const request = store.openCursor()

  while (true) {
    const cursor = await new Promise<IDBCursorWithValue | null>((resolve, reject) => {
      request.onsuccess = () => resolve(request.result)
      request.onerror = () => reject(request.error)
    })
    if (!cursor) return
    yield cursor.value as Document
    cursor.continue()
  }
}

Caution

IndexedDB transactions auto-commit once control returns to the event loop with no pending requests. If you await a Promise that does not chain another IndexedDB request, the transaction commits and the next cursor.continue() throws TransactionInactiveError. Inside an async generator, the consumer’s awaits between yields are exactly such gaps. When consumer code is async, move heavy per-row work outside the cursor: collect IDs first, then re-open a transaction.

Origin Private File System (OPFS)

OPFS provides file system access within the browser sandbox — faster than IndexedDB for binary data and large files. The full surface lives in the WHATWG File System Standard and is documented end-to-end on MDN.

When to use OPFS over IndexedDB:

Binary files (images, videos, documents)
Large blobs (>10 MB)
Sequential read/write patterns
Web Workers where synchronous access is acceptable

Important

All OPFS handle lookups (navigator.storage.getDirectory(), getFileHandle(), getDirectoryHandle()) are asynchronous on every thread. The synchronous file API only kicks in after you have an async file handle and call createSyncAccessHandle() — and the resulting FileSystemSyncAccessHandle only works inside a dedicated Web Worker.

// Main thread (or any worker): asynchronous writable stream
async function saveFile(name: string, data: ArrayBuffer): Promise<void> {
  const root = await navigator.storage.getDirectory()
  const fileHandle = await root.getFileHandle(name, { create: true })

  const writable = await fileHandle.createWritable()
  await writable.write(data)
  await writable.close()
}

// Dedicated Web Worker: synchronous access handle
self.addEventListener("message", async (event: MessageEvent<{ name: string; data: ArrayBuffer }>) => {
  const { name, data } = event.data
  const root = await navigator.storage.getDirectory()
  const fileHandle = await root.getFileHandle(name, { create: true })

  const accessHandle = await fileHandle.createSyncAccessHandle()
  try {
    accessHandle.write(data, { at: 0 })
    accessHandle.flush()
  } finally {
    accessHandle.close()
  }
})

OPFS limitations:

No indexing (unlike IndexedDB) — you manage your own file organization.
The fast sync API is Worker-only; main-thread code uses the async createWritable() writer.
No cross-origin access; sandbox is per origin.
Same quota as IndexedDB (shared origin quota).

Storage Manager API

The Storage Manager API provides quota information and persistence requests:

async function checkStorageStatus(): Promise<{
  quota: number
  usage: number
  persistent: boolean
}> {
  const estimate = await navigator.storage.estimate()
  const persistent = await navigator.storage.persisted()

  return {
    quota: estimate.quota ?? 0,
    usage: estimate.usage ?? 0,
    persistent,
  }
}

async function requestPersistence(): Promise<boolean> {
  // Chrome auto-grants for "important" sites (bookmarked, installed PWA)
  // Firefox prompts the user
  // Safari: the 7-day eviction applies regardless of this request
  if (navigator.storage.persist) {
    return await navigator.storage.persist()
  }
  return false
}

Persistence reality: Without persistent storage, browsers can evict your data at any time when storage pressure occurs. Chrome uses LRU eviction by origin. Safari’s 7-day limit applies regardless of persistence requests.

Design implication: Never assume local data will survive. Always design for re-sync from server. Treat local storage as a cache that improves UX, not as the source of truth.
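One way to act on this is to write a small marker record into the same store as the data after every completed sync. If the marker is missing at startup, the store is either new or was evicted, and the client falls back to a full re-sync. The sketch below assumes that approach; `SyncMarker`, `SyncPlan`, and `planStartupSync` are hypothetical names.

```typescript
// Hypothetical marker stored next to the application data, so that
// eviction removes both together.
interface SyncMarker {
  lastSyncedAt: number // server timestamp of the last completed sync
}

type SyncPlan =
  | { kind: "full-resync" }
  | { kind: "incremental"; since: number }

function planStartupSync(marker: SyncMarker | undefined): SyncPlan {
  // No marker: first run, or the browser evicted the origin's storage.
  if (marker === undefined) return { kind: "full-resync" }
  return { kind: "incremental", since: marker.lastSyncedAt }
}
```

The caller reads the marker from local storage at startup and passes `undefined` when the read finds nothing.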

Service Workers

Service Workers are JavaScript workers that intercept network requests, enabling offline functionality and background sync.

Lifecycle

Service Workers have a distinct lifecycle that affects how updates propagate. The official model lives in the W3C Service Workers spec, §2 Lifecycle; a more readable narrative is on web.dev.

Service Worker lifecycle: install, wait for old clients to close, activate, then idle/running cycles driven by fetch, sync, and message events.

Installation: Service Worker is downloaded and parsed. install event fires — use this to pre-cache critical assets.

Waiting: New Service Worker waits until all clients controlled by the old version close. This prevents breaking in-flight requests.

Activation: Old Service Worker is replaced. activate event fires — use this to clean up old caches.

// service-worker.ts
const CACHE_VERSION = "v2"
const STATIC_CACHE = `static-${CACHE_VERSION}`
const DYNAMIC_CACHE = `dynamic-${CACHE_VERSION}`

self.addEventListener("install", (event: ExtendableEvent) => {
  event.waitUntil(
    caches.open(STATIC_CACHE).then((cache) => {
      return cache.addAll(["/", "/app.js", "/styles.css", "/offline.html"])
    }),
  )
  // Skip waiting to activate immediately (use carefully)
  // self.skipWaiting();
})

self.addEventListener("activate", (event: ExtendableEvent) => {
  event.waitUntil(
    caches.keys().then((keys) => {
      return Promise.all(keys.filter((key) => !key.includes(CACHE_VERSION)).map((key) => caches.delete(key)))
    }),
  )
  // Take control of all pages immediately
  // self.clients.claim();
})

skipWaiting pitfall: Calling skipWaiting() activates the new Service Worker immediately, but existing pages still have old JavaScript. This can cause version mismatches between page code and Service Worker. Only use if your update is backward-compatible.

Caching Strategies

Jake Archibald’s Offline Cookbook defines the canonical caching strategies. Each has distinct trade-offs.

Cache-priority strategies: cache-first returns the cached copy and only goes to the network on a miss; cache-only returns the cached copy or an offline page and never touches the network.

Network-aware strategies: network-first races the network against a timeout and falls back to cache; stale-while-revalidate returns the cached copy immediately and refreshes the cache in the background. Both touch the network on every request.

Cache-First: Serve from cache, fall back to network. Best for static assets that rarely change.

async function cacheFirst(request: Request): Promise<Response> {
  const cached = await caches.match(request)
  if (cached) return cached

  const response = await fetch(request)
  if (response.ok) {
    const cache = await caches.open(STATIC_CACHE)
    cache.put(request, response.clone())
  }
  return response
}

Network-First: Try network, fall back to cache. Best for frequently-updated content where freshness matters.

async function networkFirst(request: Request, timeout = 3000): Promise<Response> {
  const controller = new AbortController()
  const timeoutId = setTimeout(() => controller.abort(), timeout)
  try {
    const response = await fetch(request, { signal: controller.signal })

    if (response.ok) {
      const cache = await caches.open(DYNAMIC_CACHE)
      cache.put(request, response.clone())
    }
    return response
  } catch {
    const cached = await caches.match(request)
    if (cached) return cached
    throw new Error("Network failed and no cache available")
  } finally {
    // Clear the timer on both success and failure
    clearTimeout(timeoutId)
  }
}

Stale-While-Revalidate: Serve from cache immediately, update cache in background. Best for content where slight staleness is acceptable.

async function staleWhileRevalidate(request: Request): Promise<Response> {
  const cache = await caches.open(DYNAMIC_CACHE)
  const cached = await cache.match(request)

  const fetchPromise = fetch(request).then((response) => {
    if (response.ok) {
      cache.put(request, response.clone())
    }
    return response
  })

  if (cached) {
    // A failed revalidation must not surface as an unhandled rejection
    fetchPromise.catch(() => undefined)
    return cached
  }
  return fetchPromise
}

Strategy selection by resource type:

Resource Type	Strategy	Rationale
App shell (HTML, JS, CSS)	Cache-first with version	Immutable builds
API responses	Network-first	Freshness critical
User-generated content	Stale-while-revalidate	UX + eventual freshness
Images/media	Cache-first	Rarely change
Authentication endpoints	Network-only	Must be fresh
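The table above can be turned into a small routing function that a fetch handler consults before dispatching to a strategy. This is a sketch, not a prescribed layout: the URL prefixes (`/auth/`, `/api/content/`, `/api/`) and the names `RouteInput` and `pickStrategy` are assumptions made for illustration.

```typescript
type Strategy = "cache-first" | "network-first" | "stale-while-revalidate" | "network-only"

// Minimal stand-in for the parts of a Request the router looks at.
interface RouteInput {
  pathname: string
  destination: string // e.g. "image", "script", "document", or "" for fetch()
}

function pickStrategy({ pathname, destination }: RouteInput): Strategy {
  // Assumed URL layout: /auth/* for authentication, /api/content/* for
  // user-generated content, /api/* for every other dynamic endpoint.
  if (pathname.startsWith("/auth/")) return "network-only"
  if (pathname.startsWith("/api/content/")) return "stale-while-revalidate"
  if (pathname.startsWith("/api/")) return "network-first"

  switch (destination) {
    case "image":
    case "video":
    case "audio":
      return "cache-first" // media rarely changes
    case "document":
    case "script":
    case "style":
      return "cache-first" // app shell, versioned by the build
    default:
      return "network-first" // unknown resources: prefer freshness
  }
}
```

In a Service Worker, the input would be built from `new URL(request.url).pathname` and `request.destination`.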

Background Sync

The Background Sync API lets you defer an action until connectivity is available:

// In your application code
async function queueSync(data: SyncData): Promise<void> {
  // Store the data in IndexedDB
  await saveToSyncQueue(data)

  // Register for background sync
  const registration = await navigator.serviceWorker.ready
  await registration.sync.register("sync-pending-changes")
}

// In service worker
self.addEventListener("sync", (event: SyncEvent) => {
  if (event.tag === "sync-pending-changes") {
    event.waitUntil(processSyncQueue())
  }
})

async function processSyncQueue(): Promise<void> {
  const pending = await getPendingSyncItems()

  for (const item of pending) {
    const response = await fetch("/api/sync", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(item),
    })
    // fetch() resolves on HTTP errors, so check the status explicitly.
    // Throwing rejects waitUntil(), so the browser retries on a later sync event.
    if (!response.ok) throw new Error(`Sync failed with status ${response.status}`)
    await markSynced(item.id)
  }
}

Background Sync limitations (per MDN compatibility data):

Chromium-only as of 2026 — neither Firefox nor Safari ship SyncManager.
No guarantee of timing — the browser decides when to fire the sync event based on connectivity and engagement signals.
Service worker scripts have a tight execution budget; long-running sync work can be terminated.
Requires an active Service Worker registration on a secure origin.
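Because SyncManager is missing outside Chromium, applications usually need a foreground fallback that drains the queue themselves. The sketch below shows one possible shape: `SyncCapableRegistration` models only the part of the Service Worker registration used here, and `requestSync` and `drainNow` are hypothetical names.

```typescript
// Models only the optional `sync` member of a ServiceWorkerRegistration.
interface SyncCapableRegistration {
  sync?: { register(tag: string): Promise<void> }
}

async function requestSync(
  registration: SyncCapableRegistration,
  tag: string,
  drainNow: () => Promise<void>, // app-provided: send the queue from the page
): Promise<"background-sync" | "foreground"> {
  if (registration.sync) {
    try {
      await registration.sync.register(tag)
      return "background-sync"
    } catch {
      // Registration can be rejected; fall through to the foreground path.
    }
  }
  await drainNow()
  return "foreground"
}
```

On the foreground path, the application would typically call the same function again on startup and whenever a connectivity probe succeeds, since nothing will wake it up once the tab is closed.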

Periodic Background Sync: Allows periodic sync even when the app is closed. Requires the periodic-background-sync permission, an installed PWA, and is also Chromium-only:

const registration = await navigator.serviceWorker.ready

if ("periodicSync" in registration) {
  const status = await navigator.permissions.query({
    name: "periodic-background-sync" as PermissionName,
  })

  if (status.state === "granted") {
    await (registration as any).periodicSync.register("sync-content", {
      minInterval: 24 * 60 * 60 * 1000,
    })
  }
}

Note

minInterval is a lower bound, not a schedule. Chrome may fire the event far less often than requested when site engagement is low, and may skip wake-ups entirely. Treat periodic sync as best-effort freshness, never as a primary sync path.

Workbox

Workbox (Google) encapsulates Service Worker patterns in a production-ready library — precaching with revision hashes, route-level caching strategies, and pluggable expiration / sync queues. Adoption sits in the low single digits of mobile sites that ship a Service Worker per the HTTP Archive 2025 Web Almanac PWA chapter; the bigger story is that overall Service Worker adoption jumped because Google Tag Manager started auto-installing one.

import { precacheAndRoute } from "workbox-precaching"
import { registerRoute } from "workbox-routing"
import { CacheFirst, NetworkFirst } from "workbox-strategies"
import { BackgroundSyncPlugin } from "workbox-background-sync"
import { ExpirationPlugin } from "workbox-expiration"

// Precache app shell (injected at build time)
precacheAndRoute(self.__WB_MANIFEST)

// API calls: network-first with background sync fallback
registerRoute(
  ({ url }) => url.pathname.startsWith("/api/"),
  new NetworkFirst({
    cacheName: "api-cache",
    plugins: [
      new BackgroundSyncPlugin("api-queue", {
        maxRetentionTime: 24 * 60, // 24 hours (value is in minutes)
      }),
    ],
  }),
)

// Images: cache-first
registerRoute(
  ({ request }) => request.destination === "image",
  new CacheFirst({
    cacheName: "images",
    plugins: [
      new ExpirationPlugin({
        maxEntries: 100,
        maxAgeSeconds: 30 * 24 * 60 * 60, // 30 days
      }),
    ],
  }),
)

Why use Workbox:

Handles cache versioning and cleanup automatically
Precaching with revision hashing
Built-in plugins for expiration, broadcast updates, background sync
Webpack/Vite integration for build-time manifest generation

Sync Strategies

When clients make changes offline, syncing those changes creates the hardest problems in offline-first architecture.

Last-Write-Wins (LWW)

The simplest conflict resolution: most recent timestamp wins.

interface Document {
  id: string
  content: string
  updatedAt: number // Unix timestamp
}

function resolveConflict(local: Document, remote: Document): Document {
  return local.updatedAt > remote.updatedAt ? local : remote
}

When LWW works:

Single-user applications
Data where loss is acceptable (analytics, logs)
Coarse-grained updates (entire document, not fields)

When LWW fails:

Multi-user editing (Alice’s changes overwrite Bob’s)
Fine-grained updates (field-level changes lost)
Clock skew between clients causes wrong “winner”

Clock skew problem: Client clocks can drift. A device with clock set to the future always wins. Solutions:

Use server timestamps (but requires connectivity)
Hybrid logical clocks (Lamport timestamp + physical time)
Vector clocks (discussed below)
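Of these options, the hybrid logical clock is the least obvious to implement. The following is a minimal sketch of the usual update rules: keep the largest wall-clock value seen so far and use a counter to order events that share it. The `HLC` shape and the function names are assumptions for illustration, not a library API.

```typescript
interface HLC {
  wallMs: number // largest physical time seen so far
  counter: number // orders events that share the same wallMs
  nodeId: string // tie-breaker between nodes
}

// Local event or outgoing message.
function tick(local: HLC, nowMs: number): HLC {
  const wallMs = Math.max(local.wallMs, nowMs)
  const counter = wallMs === local.wallMs ? local.counter + 1 : 0
  return { ...local, wallMs, counter }
}

// Incoming message stamped with `remote`.
function receive(local: HLC, remote: HLC, nowMs: number): HLC {
  const wallMs = Math.max(local.wallMs, remote.wallMs, nowMs)
  let counter = 0
  if (wallMs === local.wallMs && wallMs === remote.wallMs) {
    counter = Math.max(local.counter, remote.counter) + 1
  } else if (wallMs === local.wallMs) {
    counter = local.counter + 1
  } else if (wallMs === remote.wallMs) {
    counter = remote.counter + 1
  }
  return { ...local, wallMs, counter }
}

// Total order: wall time, then counter, then node id.
function compareHLC(a: HLC, b: HLC): number {
  if (a.wallMs !== b.wallMs) return a.wallMs - b.wallMs
  if (a.counter !== b.counter) return a.counter - b.counter
  return a.nodeId < b.nodeId ? -1 : a.nodeId > b.nodeId ? 1 : 0
}
```

A timestamp produced by `receive` always sorts after the message that caused it, even when the local physical clock is behind. This does not by itself neutralize a device whose clock is far in the future; deployments commonly add a maximum accepted drift and reject timestamps beyond it.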

Vector Clocks

Vector clocks track causality—which events “happened before” others—without synchronized physical clocks.

type VectorClock = Map<string, number>

function increment(clock: VectorClock, nodeId: string): VectorClock {
  const newClock = new Map(clock)
  newClock.set(nodeId, (newClock.get(nodeId) ?? 0) + 1)
  return newClock
}

function merge(a: VectorClock, b: VectorClock): VectorClock {
  const result = new Map(a)
  for (const [nodeId, count] of b) {
    result.set(nodeId, Math.max(result.get(nodeId) ?? 0, count))
  }
  return result
}

function compare(a: VectorClock, b: VectorClock): "before" | "after" | "equal" | "concurrent" {
  let aBefore = false
  let bBefore = false

  const allNodes = new Set([...a.keys(), ...b.keys()])
  for (const nodeId of allNodes) {
    const aCount = a.get(nodeId) ?? 0
    const bCount = b.get(nodeId) ?? 0
    if (aCount < bCount) aBefore = true
    if (bCount < aCount) bBefore = true
  }

  if (aBefore && !bBefore) return "before"
  if (bBefore && !aBefore) return "after"
  if (!aBefore && !bBefore) return "equal" // Identical histories, nothing to resolve
  return "concurrent" // True conflict
}

Vector clocks detect conflicts but don’t resolve them: When compare returns 'concurrent', you have a true conflict that needs application-specific resolution.

Space overhead: Vector clocks grow with the number of writers. For N clients, each clock holds up to N entries, so every versioned record carries O(N) metadata. Dynamo-style systems use “version vectors” with pruning.

Operational Transform (OT)

Operational Transformation models changes as operations that can be transformed when concurrent. The original Ellis & Gibbs paper “Concurrency Control in Groupware Systems” (SIGMOD 1989) introduced it, and Google’s Wave / Docs team eventually shipped it at scale.

How OT works:

Client captures operations: insert('Hello', position: 0)
Client applies operation locally (optimistic update)
Client sends operation to server
Server transforms operation against concurrent operations
Server broadcasts transformed operation to other clients

interface TextOperation {
  type: "insert" | "delete"
  position: number
  text?: string // for insert
  length?: number // for delete
}

// Transform op1 given op2 was applied first
function transform(op1: TextOperation, op2: TextOperation): TextOperation {
  if (op2.type === "insert") {
    if (op1.position >= op2.position) {
      return { ...op1, position: op1.position + op2.text!.length }
    }
  } else if (op2.type === "delete") {
    if (op1.position >= op2.position + op2.length!) {
      return { ...op1, position: op1.position - op2.length! }
    }
    // More complex cases: overlapping deletes, etc.
  }
  return op1
}

OT requires central coordination: The server maintains operation history and performs transforms. This means OT doesn’t work for true peer-to-peer or extended offline scenarios.

OT complexity: Transformation functions are notoriously difficult to get right. Google Docs has had OT-related bugs despite years of engineering. The transformation must satisfy mathematical properties (convergence, intention preservation) that are hard to verify.

Where OT excels: Real-time collaborative editing with always-on connectivity. Low latency because changes apply immediately with optimistic updates.
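To make the convergence property concrete, here is a worked example restricted to insert operations. It is a simplified sketch, not a production transform: `InsertOp`, `applyInsert`, and `transformInsert` are hypothetical names, and ties at the same position are broken by comparing an assumed `site` identifier.

```typescript
interface InsertOp {
  position: number
  text: string
  site: string // assumed unique per client; used only to break ties
}

function applyInsert(doc: string, op: InsertOp): string {
  return doc.slice(0, op.position) + op.text + doc.slice(op.position)
}

// Rewrite `op` so it can be applied after `applied` has already run.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
  const appliedGoesFirst =
    applied.position < op.position ||
    (applied.position === op.position && applied.site < op.site)
  return appliedGoesFirst
    ? { ...op, position: op.position + applied.text.length }
    : op
}

// Two clients edit "abc" concurrently at the same position.
const opA: InsertOp = { position: 1, text: "X", site: "a" }
const opB: InsertOp = { position: 1, text: "Y", site: "b" }

// Replica 1 applies A, then B transformed against A.
const replica1 = applyInsert(applyInsert("abc", opA), transformInsert(opB, opA))
// Replica 2 applies B, then A transformed against B.
const replica2 = applyInsert(applyInsert("abc", opB), transformInsert(opA, opB))
// Both replicas end with the same string despite the different order.
```

Without the tie-break on `site`, both operations would shift each other and the two replicas would diverge, which is the kind of edge case that makes full transform functions hard to verify.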

CRDTs (Conflict-free Replicated Data Types)

CRDTs are data structures mathematically designed to merge without conflicts: any order of applying changes converges to the same result. The foundational treatment is Shapiro et al., “Conflict-free Replicated Data Types” (INRIA RR-7687, 2011).

Two types of CRDTs:

State-based (CvRDT): Replicate entire state, merge using mathematical join.

// G-Counter: Grow-only counter
type GCounter = Map<string, number>

function increment(counter: GCounter, nodeId: string): GCounter {
  const newCounter = new Map(counter)
  newCounter.set(nodeId, (newCounter.get(nodeId) ?? 0) + 1)
  return newCounter
}

function merge(a: GCounter, b: GCounter): GCounter {
  const result = new Map(a)
  for (const [nodeId, count] of b) {
    result.set(nodeId, Math.max(result.get(nodeId) ?? 0, count))
  }
  return result
}

function value(counter: GCounter): number {
  return Array.from(counter.values()).reduce((sum, n) => sum + n, 0)
}

Operation-based (CmRDT): Replicate operations, apply in any order. Requires reliable delivery (all operations eventually arrive).

Common CRDT types:

CRDT	Use Case	Trade-off
G-Counter	Likes, views	Grow-only
PN-Counter	Votes (up/down)	Two G-Counters
G-Set	Tags, followers	Grow-only set
OR-Set (Observed-Remove)	General sets	Handles concurrent add/remove
LWW-Register	Single values	Last-write-wins
RGA (Replicated Growable Array)	Text editing	Complex, high overhead
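The OR-Set row is worth a closer look, because it shows how a CRDT handles a concurrent add and remove of the same element. The sketch below is a simplified state-based variant: every add carries a unique tag, and a remove only cancels the tags it has observed. The type and function names are assumptions, and a real implementation would also need tag generation and tombstone compaction.

```typescript
type Tag = string // assumed globally unique, e.g. clientId plus a sequence number

interface ORSet {
  adds: Map<string, Set<Tag>> // element -> tags of every add seen
  removed: Set<Tag> // tags cancelled by an observed remove
}

function emptyORSet(): ORSet {
  return { adds: new Map(), removed: new Set() }
}

function orAdd(set: ORSet, element: string, tag: Tag): void {
  const tags = set.adds.get(element) ?? new Set<Tag>()
  tags.add(tag)
  set.adds.set(element, tags)
}

// Remove cancels only the add tags this replica has already seen.
function orRemove(set: ORSet, element: string): void {
  for (const tag of set.adds.get(element) ?? []) set.removed.add(tag)
}

function orHas(set: ORSet, element: string): boolean {
  for (const tag of set.adds.get(element) ?? []) {
    if (!set.removed.has(tag)) return true
  }
  return false
}

// Merge is a union of both components, so it is commutative,
// associative, and idempotent.
function orMerge(a: ORSet, b: ORSet): ORSet {
  const out = emptyORSet()
  for (const source of [a, b]) {
    for (const [element, tags] of source.adds) {
      for (const tag of tags) orAdd(out, element, tag)
    }
    for (const tag of source.removed) out.removed.add(tag)
  }
  return out
}
```

If one replica removes an element while another concurrently re-adds it with a fresh tag, the merged set still contains the element, because the remove never observed the new tag.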

Text CRDTs: For collaborative text editing, specialized CRDTs like RGA, WOOT, or Yjs’s implementation track character positions with unique IDs that survive concurrent edits.

// Simplified RGA node structure
interface RGANode {
  id: { clientId: string; seq: number }
  char: string
  tombstone: boolean // Deleted but kept for ordering
  after: RGANode["id"] | null // Insert position
}

CRDT trade-offs:

Pros: Mathematically guaranteed convergence, works fully offline, no central server required
Cons: High memory overhead (tombstones, metadata), complex implementation, eventual consistency only

“CRDTs are the only data structures that can guarantee consistency in a fully decentralized system, but many published algorithms have subtle bugs. It’s easy to implement CRDTs badly.” — Martin Kleppmann

Interleaving anomaly: When two users type “foo” and “bar” at the same position, naive CRDTs (notably Logoot and LSEQ) can produce “fboaor” instead of “foobar” or “barfoo”. Yjs (YATA) and Automerge use heuristics that reduce this in two-replica edits but can still interleave under three-way concurrent inserts; recent algorithms like Fugue and FugueMax (Weidner et al., 2023) explicitly satisfy a maximal non-interleaving property at comparable performance. The seminal write-up of the problem is Kleppmann et al., PaPoC 2019.

Sync Strategy Comparison

Factor	LWW	Vector Clocks	OT	CRDT
Conflict resolution	Automatic (lossy)	Detect only	Server-based	Automatic (lossless)
Offline duration	Any	Any	Short (needs server)	Any
Implementation complexity	Low	Medium	High	Very High
Memory overhead	Low	Medium	Low	High
P2P support	Yes	Partial	No	Yes
Data loss risk	High	Application-dependent	Low	None

Design Paths

Path 1: Cache-Only PWA

Architecture: Service Worker caches static assets and API responses. No local database. Changes require network.

Browser → Service Worker → Cache API → (Network when available)

Best for:

Read-heavy applications (news, documentation)
Short offline periods (subway, airplane mode)
Content that doesn’t change offline

Implementation complexity:

Aspect	Effort
Initial setup	Low
Feature additions	Low
Sync logic	None
Testing	Low

Device/network profile:

Works well on: All devices, any network
Struggles on: Extended offline, collaborative editing

Trade-offs:

Pros: Simplest implementation, no sync conflicts
Cons: Limited offline functionality, stale data possible

Path 2: Sync Queue Pattern

Architecture: Changes stored in IndexedDB queue, processed when online. Server is source of truth.

Sync queue pattern: the user edits, the client (app + Service Worker + IndexedDB) queues the change and renders an optimistic update, then drains the queue to the server when connectivity returns. Each mutation ends in one of three outcomes: 2xx ack, 4xx conflict (permanent error), or 5xx retry on the next sync event.

Best for:

Form submissions (surveys, orders)
Single-user data (personal notes, todos)
Tolerance for occasional conflicts

Implementation complexity:

Aspect	Effort
Initial setup	Medium
Feature additions	Medium
Sync logic	Medium (queue management)
Testing	Medium

Key implementation concerns:

Idempotency: Server must handle duplicate submissions
Ordering: Queue processes FIFO, but network latency can reorder
Failure handling: Permanent failures need user notification

interface SyncQueueItem {
  id: string
  operation: "create" | "update" | "delete"
  entity: string
  data: unknown
  timestamp: number
  retries: number
}

async function processQueue(): Promise<void> {
  const queue = await getSyncQueue()

  for (const item of queue) {
    try {
      await sendToServer(item)
      await removeFromQueue(item.id)
    } catch (error) {
      if (isPermanentError(error)) {
        await markAsFailed(item.id)
        notifyUser(`Failed to sync: ${item.entity}`)
      } else {
        await incrementRetry(item.id)
        // item.retries is the count before this attempt, so add one
        if (item.retries + 1 >= MAX_RETRIES) {
          await markAsFailed(item.id)
        }
      }
    }
  }
}
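The idempotency concern listed above lives on the server side. A minimal sketch, assuming the client sends each queue item's `id` as an idempotency key: the server remembers which keys it has applied and answers retries with the original result instead of applying the mutation twice. `IdempotentReceiver` and its in-memory map are hypothetical; a real server would persist the keys alongside the data.

```typescript
interface Mutation {
  idempotencyKey: string // assumed to be the client's queue item id
  entity: string
  data: unknown
}

interface ReceiveResult {
  status: "applied" | "duplicate"
  version: number // server version produced by the original application
}

class IdempotentReceiver {
  private seen = new Map<string, number>()
  private version = 0

  receive(mutation: Mutation): ReceiveResult {
    const previous = this.seen.get(mutation.idempotencyKey)
    if (previous !== undefined) {
      // Retry of a mutation that already succeeded: do not apply again.
      return { status: "duplicate", version: previous }
    }
    this.version += 1
    this.seen.set(mutation.idempotencyKey, this.version)
    return { status: "applied", version: this.version }
  }
}
```

This covers the case where the server applied a mutation but the acknowledgement was lost, so the client retries an item that is still in its queue.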

Trade-offs:

Pros: Handles common offline scenarios, server-side conflict resolution
Cons: May lose changes on permanent failures, doesn’t support real-time collaboration

Path 3: CRDT-Based Local-First

Architecture: Local CRDT state is authoritative. Peers sync directly or through relay server. No central source of truth.

App → CRDT State (IndexedDB) ↔ Peer/Server ↔ Other Clients' CRDT State

Best for:

Collaborative editing (documents, whiteboards)
P2P applications
Extended offline with multiple editors

Implementation complexity:

Aspect	Effort
Initial setup	High
Feature additions	High
Sync logic	Very High (CRDT implementation)
Testing	Very High

Library options (sizes from Bundlephobia and each project’s release notes; minified-then-gzipped where available):

Library	Focus	Approximate size	Mature
Yjs	Text/structured data	~90 KB min, ~27 KB gzip	Yes
Automerge 2	JSON documents	Rust core compiled to ~200 KB+ Wasm	Yes
Liveblocks	Real-time + CRDT	SaaS (proprietary engine)	Yes
ElectricSQL	Postgres sync	Tens of KB client + Postgres extension	Emerging

import * as Y from "yjs"
import { IndexeddbPersistence } from "y-indexeddb"
import { WebsocketProvider } from "y-websocket"

// Create CRDT document
const doc = new Y.Doc()

// Persist to IndexedDB
const persistence = new IndexeddbPersistence("my-doc", doc)

// Sync with server/peers when online
const provider = new WebsocketProvider("wss://sync.example.com", "my-doc", doc)

// Get shared types
const text = doc.getText("content")
const todos = doc.getArray("todos")

// Changes automatically sync
text.insert(0, "Hello")

Trade-offs:

True offline-first with guaranteed convergence
Supports P2P architecture
Complex implementation
High memory overhead (tombstones and per-operation metadata are retained)
Eventual consistency only (no transactions)

Decision Framework

Decision tree for picking an offline strategy: cache-only PWA for short outages, sync queue for single-user mutations, OT or CRDT for multi-user editing depending on whether the system can rely on a single server or needs P2P. — Pick the offline strategy by asking three questions: how long offline, how many concurrent editors, and whether you can rely on a single server.
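The decision tree can be approximated as a small function. This is a sketch under stated assumptions: the three inputs are reduced to booleans, the order in which the questions are asked is a guess at the diagram's intent, and the names `Requirements` and `pickOfflineStrategy` are hypothetical.

```typescript
type OfflineStrategy = "cache-only-pwa" | "sync-queue" | "operational-transform" | "crdt"

interface Requirements {
  // True when users only need to ride out brief connectivity gaps.
  shortOutagesOnly: boolean
  // True when more than one person can edit the same record concurrently.
  multiUserEditing: boolean
  // True when every edit can be routed through one authoritative server.
  singleServerAvailable: boolean
}

function pickOfflineStrategy(req: Requirements): OfflineStrategy {
  if (req.multiUserEditing) {
    // Multi-user editing: OT needs a central server, CRDTs also work peer-to-peer.
    return req.singleServerAvailable ? "operational-transform" : "crdt"
  }
  // Single-user data: cache-only is enough for short gaps, otherwise queue mutations.
  return req.shortOutagesOnly ? "cache-only-pwa" : "sync-queue"
}
```

Real products rarely fit a single leaf; treat the result as a starting point for discussion rather than a verdict.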

Real-World Implementations

Figma: Hybrid Multiplayer with Local Recovery

Challenge: Complex vector graphics with potentially millions of objects, multiple concurrent editors, must feel instantaneous.

Approach: Figma’s own engineering write-up is explicit that they use a custom system that is “in the same family of solutions as CRDTs (it’s similar to CRDTs but is not a CRDT)” — operations are applied through a centralized server that establishes total order, with last-writer-wins registers per property and tree-structure-aware reparenting rules. They explicitly rejected classical OT as too complex.
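The idea of last-writer-wins registers per property under a server-established total order can be illustrated with a few lines. This is not Figma's code: the `seq` field, the key format, and every name below are assumptions made for the sketch.

```typescript
// One register per (objectId, property). `seq` stands in for a server-assigned sequence number.
interface PropertyWrite {
  objectId: string
  property: string
  value: unknown
  seq: number
}

type DocumentState = Map<string, { value: unknown; seq: number }>

// Apply a write only if it is newer than what the register already holds.
function applyWrite(state: DocumentState, write: PropertyWrite): void {
  const key = `${write.objectId}.${write.property}`
  const current = state.get(key)
  if (current === undefined || write.seq > current.seq) {
    state.set(key, { value: write.value, seq: write.seq })
  }
}

const state: DocumentState = new Map()
applyWrite(state, { objectId: "rect1", property: "fill", value: "red", seq: 2 })
applyWrite(state, { objectId: "rect1", property: "fill", value: "blue", seq: 1 }) // stale, ignored
applyWrite(state, { objectId: "rect1", property: "x", value: 40, seq: 3 }) // different property, kept
```

Because each property has its own register, two people changing different properties of the same object never conflict; only writes to the same property compete, and the later one in server order wins.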

Offline behavior (per the official Figma offline help):

Changes to currently-loaded pages are queued in IndexedDB and replayed on reconnect.
Retention window is 30 days on Chrome / Firefox / Edge / Opera, 7 days on Safari (ITP).
You cannot open a file that wasn’t loaded before going offline; this is recovery for an open editing session, not a full local-first model.

Technical details:

Canvas, geometry, and the multiplayer engine are written in C++ and compiled to WebAssembly (Figma engineering); React handles only UI chrome.
Selective sync — only download what’s viewed, not entire project.
Background prefetch of likely-needed files.

Limitation: There is no “download for offline” mode; if a file hasn’t been opened recently, it isn’t available.

Notion: Block-Based CRDT

Challenge: Rich text documents with blocks (paragraphs, code, embeds), tables, and databases — with millions of users and fan-out collaboration.

Approach (per Notion’s “How we made Notion available offline”, shipped in Notion 2.53 on 2025-08-19):

Pages marked “Available offline” are migrated to a CRDT-backed data model in a local SQLite cache.
Local state is tracked in offline_page and offline_action tables with multiple “reasons a page is offline” so eviction is reference-counted.
Per-page sync watermark is reconciled on reconnect; CRDT semantics merge concurrent text edits automatically.
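Reference-counted eviction by "reason" can be sketched as follows. This is an illustration of the idea only, not Notion's schema: the reason names are hypothetical (loosely based on manual toggles, favourites, and recents), and an in-memory map stands in for the SQLite tables.

```typescript
// Hypothetical reasons a page may be kept offline.
type OfflineReason = "user-toggled" | "favorite" | "recent"

class OfflinePageIndex {
  private reasons = new Map<string, Set<OfflineReason>>()

  addReason(pageId: string, reason: OfflineReason): void {
    const set = this.reasons.get(pageId) ?? new Set<OfflineReason>()
    set.add(reason)
    this.reasons.set(pageId, set)
  }

  // Returns true when the last reason is gone and the page may be evicted.
  removeReason(pageId: string, reason: OfflineReason): boolean {
    const set = this.reasons.get(pageId)
    if (set === undefined) return true
    set.delete(reason)
    if (set.size > 0) return false
    this.reasons.delete(pageId)
    return true
  }

  isOffline(pageId: string): boolean {
    return this.reasons.has(pageId)
  }
}
```

The point of tracking reasons separately is that un-favouriting a page does not delete its local copy if the user also toggled it offline by hand.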

Technical details:

Desktop and mobile apps only — the web client does not yet have offline mode.
Database views download the first 50 rows of the first view for any database marked offline; remaining rows must be opened individually. As of 2026 this cap still holds across all plans.
Free plans manually toggle pages; paid plans also auto-download top favourites and most-recent pages.

Limitation: Non-text properties (select fields, relations, linked databases) cannot merge cleanly. When Notion can’t reconcile, it forks the page with a (Conflict) suffix instead of guessing.

Linear: Delta Sync

Challenge: Project management with issues, projects, and workflows. Must feel instant.

Approach (consolidated from Linear’s own “Scaling the sync engine” talks and the Linear-CTO-endorsed reverse-engineering write-up):

Bootstrap process downloads a normalized object pool into IndexedDB on first load.
WebSocket pushes incremental delta packets keyed by a monotonically increasing lastSyncId.
A MobX-managed in-memory object graph drives the UI; mutations are framed as transactions and queued in IndexedDB until acknowledged.
Server is the authority for total order — no peer-to-peer sync, no CRDT.

Technical details:

Sync ID increments with each server-side transaction; clients reconcile deltas since their last seen ID.
Optimistic updates with rollback on server rejection.
Not true offline-first — designed as a connectivity failsafe rather than a local-first system.
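A client-side sketch of applying delta packets keyed by a monotonically increasing sync ID is shown below. The packet shape (`DeltaPacket`), the store shape, and the function name are assumptions for illustration; Linear's actual wire format is not reproduced here.

```typescript
interface DeltaPacket {
  syncId: number // server-assigned, strictly increasing
  modelId: string
  patch: Record<string, unknown> | null // null means the object was deleted
}

interface ClientStore {
  lastSyncId: number
  objects: Map<string, Record<string, unknown>>
}

// Apply packets in syncId order. Packets at or below lastSyncId are skipped,
// which makes replays after a reconnect harmless. Returns the new lastSyncId.
function applyDeltas(store: ClientStore, packets: DeltaPacket[]): number {
  const ordered = packets.slice().sort((a, b) => a.syncId - b.syncId)
  for (const packet of ordered) {
    if (packet.syncId <= store.lastSyncId) continue
    if (packet.patch === null) {
      store.objects.delete(packet.modelId)
    } else {
      const existing = store.objects.get(packet.modelId) ?? {}
      store.objects.set(packet.modelId, { ...existing, ...packet.patch })
    }
    store.lastSyncId = packet.syncId
  }
  return store.lastSyncId
}
```

On reconnect the client would ask the server for everything after `store.lastSyncId` and feed the response through the same function, so catch-up and live updates share one code path.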

Trade-off accepted: Offline is “best effort” — clients can read cached state and queue mutations briefly, but edits require eventual connectivity. Simpler than CRDT, ships faster, gives up indefinite offline.

Excalidraw: Pseudo-P2P with `localStorage`

Challenge: Collaborative whiteboard with minimal backend.

Approach (per Excalidraw’s E2EE blog post and the P2P feature write-up):

Pseudo-P2P: a Socket.IO relay (excalidraw-room) brokers end-to-end encrypted messages between peers.
Local state lives in localStorage (keys excalidraw for elements, excalidraw-state for UI).
AES-GCM key is generated client-side and embedded in the URL fragment, which never reaches the server — the server only sees opaque ciphertext.
A custom reconciler resolves merge order between peers.

Technical details:

Self-hosting via the open-source excalidraw-room server is required for non-Plus collaboration.
Web Crypto API (window.crypto.subtle) is the encryption primitive, so a secure context is mandatory.
Room-based collaboration with shareable links; works fully offline for local edits.

Limitation: The reconciliation strategy keeps elements alive aggressively to avoid losing concurrent edits, so a peer that didn’t see a delete can reintroduce the element on reconnect. Trade-off for simplicity.
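The limitation is easier to see in code. The sketch below is a simplified version-based reconciler written for illustration, not Excalidraw's implementation; the `SceneElement` shape and the "higher version wins, union by id" rule are assumptions.

```typescript
interface SceneElement {
  id: string
  version: number // bumped on every change to the element
  isDeleted?: boolean
}

// Union both sides by id and keep the higher version of each element.
function reconcileElements(local: SceneElement[], remote: SceneElement[]): SceneElement[] {
  const byId = new Map<string, SceneElement>()
  for (const el of local) byId.set(el.id, el)
  for (const el of remote) {
    const mine = byId.get(el.id)
    if (mine === undefined || el.version > mine.version) byId.set(el.id, el)
  }
  return Array.from(byId.values())
}

// Peer A removed "b" outright; peer B never saw the delete and still holds it.
const peerA: SceneElement[] = [{ id: "a", version: 2 }]
const peerB: SceneElement[] = [{ id: "a", version: 1 }, { id: "b", version: 1 }]
console.log(reconcileElements(peerA, peerB).map((el) => el.id)) // [ 'a', 'b' ]: "b" is back
```

If the delete is instead recorded as a tombstone (the element stays in the list with `isDeleted: true` and a bumped version), the same merge keeps it deleted, at the cost of carrying dead elements around.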

Browser Constraints Deep Dive

Storage quota lifecycle as a state machine: best-effort writes live until usage approaches the origin quota or the group cap (~80% of disk) hits and the browser evicts LRU origins; persistent storage is protected on Chromium and Firefox; Safari's ITP wipes script-writable storage after seven days of browser use without site interaction regardless of persistence. — Storage quota lifecycle: best-effort vs persistent, group-cap LRU eviction on Chromium and Firefox, and Safari's seven-day ITP wipe that ignores the persistence flag.

Storage Quota Management

async function manageStorageQuota(): Promise<void> {
  // estimate() may omit either field, so default both and skip when quota is unknown
  const { quota = 0, usage = 0 } = await navigator.storage.estimate()
  if (quota === 0) return
  const usagePercent = (usage / quota) * 100

  if (usagePercent > 80) {
    // Proactive cleanup before hitting quota
    await evictOldCache()
  }

  if (usagePercent > 95) {
    // Critical: may start failing writes
    await aggressiveCleanup()
    notifyUser("Storage nearly full")
  }
}

async function evictOldCache(): Promise<void> {
  const cache = await caches.open("dynamic-cache")
  const requests = await cache.keys()

  // Sort by access time (stored in custom header or IndexedDB metadata)
  const sorted = await sortByLastAccess(requests)

  // Evict oldest 20%
  const toEvict = sorted.slice(0, Math.floor(sorted.length * 0.2))
  await Promise.all(toEvict.map((req) => cache.delete(req)))
}

Quota exceeded handling: When quota is exceeded, IndexedDB and Cache API throw QuotaExceededError. Always wrap storage operations:

async function safeWrite(key: string, value: unknown): Promise<boolean> {
  try {
    await writeToIndexedDB(key, value)
    return true
  } catch (error) {
    // `error` is `unknown` in strict TypeScript, so narrow before reading `.name`
    if (error instanceof DOMException && error.name === "QuotaExceededError") {
      await evictOldCache()
      try {
        // Retry once after freeing space
        await writeToIndexedDB(key, value)
        return true
      } catch {
        notifyUser("Storage full. Some data may not be saved offline.")
        return false
      }
    }
    throw error
  }
}

Safari’s 7-Day Eviction

Safari’s ITP deletes all script-writable storage for a site after 7 days of browser use without user interaction on that site. Mitigation strategies:

Prompt for PWA installation: Installed PWAs are exempt from the 7-day limit
Request persistent storage: A no-op in Safari, but harmless to call and useful in Chromium and Firefox
Design for re-sync: Assume local data may disappear
Track last interaction: Warn users approaching the 7-day cliff

const SAFARI_EVICTION_DAYS = 7
const MS_PER_DAY = 1000 * 60 * 60 * 24

function checkEvictionRisk(): { daysRemaining: number; atRisk: boolean } {
  const lastInteraction = localStorage.getItem("lastInteraction")
  if (!lastInteraction) {
    // First visit, or storage (including this key) has already been wiped
    localStorage.setItem("lastInteraction", Date.now().toString())
    return { daysRemaining: SAFARI_EVICTION_DAYS, atRisk: false }
  }

  const daysSince = (Date.now() - parseInt(lastInteraction, 10)) / MS_PER_DAY
  const daysRemaining = SAFARI_EVICTION_DAYS - daysSince

  // Update interaction timestamp
  localStorage.setItem("lastInteraction", Date.now().toString())

  return {
    daysRemaining: Math.max(0, daysRemaining),
    atRisk: daysRemaining < 2,
  }
}

Cross-Browser Testing

Offline-first behavior varies significantly across browsers. Test matrix:

Scenario	Chrome	Firefox	Safari
Quota exceeded	QuotaExceededError	QuotaExceededError	QuotaExceededError
Persistent storage	Auto-grant for PWAs	User prompt	Not supported
Background sync	Supported	Not supported	Not supported
Service Worker + private mode	Works	Works	Limited
IndexedDB in iframe	Works	Works	Blocked (3rd party)
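Given the Background Sync row in the matrix, sync code generally needs feature detection with a foreground fallback. The sketch below uses small structural interfaces (`SyncManagerLike`, `RegistrationLike`, both hypothetical) so it does not depend on DOM typings for Background Sync; `requestSync` and the `syncNow` callback are likewise illustrative names.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings for Background Sync.
interface SyncManagerLike {
  register(tag: string): Promise<void>
}
interface RegistrationLike {
  sync?: SyncManagerLike
}

type SyncPath = "background-sync" | "foreground-fallback"

// Prefer Background Sync where the registration exposes it;
// otherwise run the sync immediately from the page.
async function requestSync(
  registration: RegistrationLike,
  tag: string,
  syncNow: () => Promise<void>,
): Promise<SyncPath> {
  if (registration.sync !== undefined) {
    try {
      await registration.sync.register(tag)
      return "background-sync"
    } catch {
      // Registration can be rejected (for example, permission denied); fall through.
    }
  }
  await syncNow()
  return "foreground-fallback"
}
```

In a page, `registration` would come from `navigator.serviceWorker.ready`, and `syncNow` would drain the same queue the Service Worker's `sync` handler drains, so both paths share one implementation.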

Common Pitfalls

1. Trusting navigator.onLine

The mistake: Using navigator.onLine to determine if sync should happen.

// Don't do this
if (navigator.onLine) {
  await syncData()
}

Why it fails: navigator.onLine only reports whether a network interface is connected, not whether the internet is reachable. A LAN without internet, a captive portal, or a blocking firewall all report navigator.onLine === true.

The fix: Use actual fetch with timeout as connectivity check:

async function canReachServer(): Promise<boolean> {
  const controller = new AbortController()
  const timeoutId = setTimeout(() => controller.abort(), 5000)

  try {
    const response = await fetch("/api/health", {
      method: "HEAD",
      signal: controller.signal,
      cache: "no-store",
    })
    return response.ok
  } catch {
    // Network error or timeout abort
    return false
  } finally {
    // Clear the timer on every path, including failures
    clearTimeout(timeoutId)
  }
}

2. Ignoring IndexedDB Versioning

The mistake: Not handling schema upgrades properly.

// Dangerous: no version handling
const request = indexedDB.open("mydb")
request.onsuccess = () => {
  const db = request.result
  const tx = db.transaction("users", "readwrite") // May not exist!
}

Why it fails: When the schema changes, existing users still have the old one. Without a proper onupgradeneeded handler, code that opens a transaction on a new object store throws a NotFoundError.

The fix: Always increment version and handle migrations:

const DB_VERSION = 3 // Increment with each schema change

// Passing the version is what triggers onupgradeneeded for users on an older schema
const request = indexedDB.open("mydb", DB_VERSION)

request.onupgradeneeded = (event) => {
  const db = request.result
  const oldVersion = event.oldVersion

  // Migrate through each version
  if (oldVersion < 1) {
    db.createObjectStore("users", { keyPath: "id" })
  }
  if (oldVersion < 2) {
    db.createObjectStore("settings", { keyPath: "key" })
  }
  if (oldVersion < 3) {
    // Existing stores are reached through the upgrade transaction
    const users = request.transaction!.objectStore("users")
    users.createIndex("by_email", "email", { unique: true })
  }
}
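As the number of versions grows, the chain of `if (oldVersion < n)` checks can be expressed as data. The following is a sketch of that idea, kept independent of IndexedDB so it can be unit-tested; `Migration` and `runMigrations` are hypothetical names, not part of any browser API.

```typescript
// Each entry upgrades the schema from (version - 1) to `version`.
interface Migration<Db> {
  version: number
  up(db: Db): void
}

// Run, in ascending order, every migration newer than the version the user already has.
// Returns the versions that were applied.
function runMigrations<Db>(db: Db, oldVersion: number, migrations: Migration<Db>[]): number[] {
  const pending = migrations
    .filter((m) => m.version > oldVersion)
    .sort((a, b) => a.version - b.version)
  for (const migration of pending) migration.up(db)
  return pending.map((m) => m.version)
}
```

Inside `onupgradeneeded` the call would look like `runMigrations(request.result, event.oldVersion, migrations)`. The `up` functions stay synchronous here because schema changes have to be issued while the upgrade transaction is still active.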

3. Service Worker Update Races

The mistake: Using skipWaiting() without considering page code compatibility.

Why it fails: Old page JavaScript + new Service Worker can have API mismatches. Cached responses may not match expected format.

The fix: Either reload the page after Service Worker update, or ensure backward compatibility:

// In Service Worker
self.addEventListener("message", (event) => {
  if (event.data === "skipWaiting") {
    self.skipWaiting()
  }
})

// In page
navigator.serviceWorker.addEventListener("controllerchange", () => {
  // New SW took over, reload to ensure consistency
  window.location.reload()
})

4. Unbounded Storage Growth

The mistake: Caching without eviction policy.

// Grows forever
const cache = await caches.open("api-responses")
cache.put(request, response) // Never cleaned up

Why it fails: Eventually hits quota, causing write failures. User experience degrades suddenly rather than gracefully.

The fix: Implement LRU or time-based eviction:

const MAX_CACHE_ENTRIES = 100
// Compare against x-cached-at on read to expire stale entries
const MAX_CACHE_AGE_MS = 7 * 24 * 60 * 60 * 1000 // 7 days

async function cacheWithEviction(request: Request, response: Response): Promise<void> {
  const cache = await caches.open("api-responses")
  const keys = await cache.keys()

  // Evict if over limit
  if (keys.length >= MAX_CACHE_ENTRIES) {
    await cache.delete(keys[0]) // FIFO, or implement LRU
  }

  // Store with timestamp
  const headers = new Headers(response.headers)
  headers.set("x-cached-at", Date.now().toString())
  const newResponse = new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  })

  await cache.put(request, newResponse)
}
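The eviction in that function is FIFO. For true LRU, something has to record reads as well as writes. One way, sketched here with a hypothetical `LruIndex` class, is to keep the access order in a `Map`, which iterates in insertion order, and re-insert a key whenever it is touched. The index lives in memory; persisting it (for example in IndexedDB) is left out.

```typescript
// Tracks cache keys in least-recently-used order. A Map iterates in insertion order,
// so re-inserting a key on access moves it to the "most recent" end.
class LruIndex {
  private lastUsed = new Map<string, number>()
  private readonly maxEntries: number

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries
  }

  // Record a read or write. Returns the keys that should now be deleted from the cache.
  touch(key: string, now: number = Date.now()): string[] {
    this.lastUsed.delete(key)
    this.lastUsed.set(key, now)
    const evicted: string[] = []
    while (this.lastUsed.size > this.maxEntries) {
      const oldest = this.lastUsed.keys().next().value
      if (oldest === undefined) break
      this.lastUsed.delete(oldest)
      evicted.push(oldest)
    }
    return evicted
  }
}
```

A caller would invoke `touch(request.url)` on every cache hit and every `cache.put`, then call `cache.delete` for each returned key.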

5. Sync Conflict Denial

The mistake: Assuming conflicts won’t happen because “users don’t edit the same thing.”

Why it fails: Conflicts happen when the same user edits on multiple devices, when sync is delayed, or when retries duplicate operations.

The fix: Design for conflicts from the start:

Use idempotent operations with unique IDs
Implement conflict detection and resolution UI
Log conflicts for debugging
Test with simulated network partitions
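The first three items can be sketched together on the receiving side. The example below is an in-memory stand-in for a server, under assumptions that are not from any particular product: the client generates `opId` once and reuses it on retries, and each mutation carries the record version it was based on. All names are hypothetical.

```typescript
interface Mutation {
  opId: string // client-generated unique ID, reused on every retry
  recordId: string
  baseVersion: number // record version the client edited against
  data: Record<string, unknown>
}

type MutationResult = "applied" | "duplicate" | "conflict"

class RecordStore {
  private seenOps = new Set<string>()
  private records = new Map<string, { version: number; data: Record<string, unknown> }>()
  readonly conflictLog: Mutation[] = []

  apply(m: Mutation): MutationResult {
    // A retry of an operation that already succeeded is acknowledged, not re-applied.
    if (this.seenOps.has(m.opId)) return "duplicate"
    const current = this.records.get(m.recordId) ?? { version: 0, data: {} }
    if (m.baseVersion !== current.version) {
      // Keep the losing edit for debugging or a resolution UI.
      this.conflictLog.push(m)
      return "conflict"
    }
    this.records.set(m.recordId, {
      version: current.version + 1,
      data: { ...current.data, ...m.data },
    })
    this.seenOps.add(m.opId)
    return "applied"
  }
}
```

A client that receives "conflict" can refetch the record and either merge automatically or show the user both versions; "duplicate" can be treated the same as success.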

Conclusion

Offline-first architecture inverts the traditional web assumption: data lives locally, sync is background, and network is optional. This enables responsive UX regardless of connectivity but introduces complexity in storage management, sync strategies, and conflict resolution.

Key architectural decisions:

Storage choice: IndexedDB for structured data with indexing needs, OPFS for binary files and performance-critical access, Cache API for HTTP responses. All share origin quota—monitor and manage proactively.

Sync strategy: LWW for simple, loss-tolerant cases. Sync queues for form-style interactions. OT for real-time collaboration with reliable connectivity. CRDTs for true offline-first with guaranteed convergence.

Browser reality: Safari’s 7-day eviction breaks long-term offline. Persistent storage helps in Chromium and Firefox but is not guaranteed to be granted, and it is a no-op in Safari. navigator.onLine reports only that a network interface is up, so it cannot confirm connectivity. Design for data loss and re-sync.

The platform is mature enough — Yjs, Automerge, Workbox, and ElectricSQL provide production-ready foundations. The complexity now lives in picking the right trade-off for your audience: how long users go offline, how many of them collaborate on the same record, and how much data loss is acceptable when the model can’t reconcile.

Appendix

Prerequisites

Browser storage APIs: localStorage, IndexedDB concepts
Service Workers: Basic lifecycle and fetch interception
Distributed systems basics: Consistency models, network partitions
Promises/async: Modern JavaScript async patterns

Terminology

CmRDT: Commutative/operation-based CRDT; replicas exchange operations, and concurrent operations commute, so they can be applied in any order once causal delivery is ensured
CvRDT: Convergent/state-based CRDT—replicate state, merge with join function
ITP: Intelligent Tracking Prevention—Safari’s privacy feature that limits storage
LWW: Last-Write-Wins—conflict resolution where latest timestamp wins
OPFS: Origin Private File System—browser file system API
OT: Operational Transform—sync strategy that transforms concurrent operations
PWA: Progressive Web App—web app with offline capability via Service Worker
Tombstone: Marker for deleted item in CRDT—kept for ordering, never truly removed

Summary

Local-first data model: Application reads/writes to IndexedDB or OPFS immediately; network sync is asynchronous.
Service Workers: Intercept requests, implement caching strategies (cache-first, network-first, stale-while-revalidate), enable background sync.
Storage constraints: Quotas are mostly disk-percentage in Chrome / Firefox / Safari today; Safari evicts script-writable storage after seven days of browser use without interaction; persistent storage helps but isn’t guaranteed and is a no-op in Safari.
Conflict resolution: LWW loses data; OT requires a server; CRDTs guarantee convergence but are complex and can still interleave under multi-replica edits — choose based on offline duration and collaboration needs.
Production patterns: Figma runs a CRDT-inspired multiplayer engine with a centralized server and a 30-day session-recovery window; Notion ships a CRDT desktop/mobile offline mode capped at 50 rows per database view; Linear is delta sync over WebSocket with lastSyncId, not true offline-first; Excalidraw is pseudo-P2P with E2EE and localStorage.

References

Specifications and standards (tier 1):

Service Workers — W3C — normative specification.
Indexed Database API 3.0 — W3C — IndexedDB normative spec.
File System Standard — WHATWG — OPFS and File System Access normative spec.
Web Periodic Background Synchronization — WICG — periodic sync draft.

Vendor and platform docs (tier 2):

Storage quotas and eviction criteria — MDN — browser storage limits.
Origin private file system — MDN — OPFS surface area.
createSyncAccessHandle() — MDN — sync OPFS access from Workers.
Background Synchronization API — MDN — sync event surface and compatibility.
Service Worker lifecycle — web.dev — install/activate/skipWaiting semantics.
The Offline Cookbook — web.dev — canonical caching strategies.
Origin private file system — web.dev — OPFS guide.
Periodic Background Sync — Chrome for Developers — Chrome heuristics and minInterval behaviour.
Workbox Documentation — Chrome for Developers — Service Worker library.
Tracking Prevention in WebKit — current ITP behaviour.
Full Third-Party Cookie Blocking — WebKit — Safari’s 7-day script-writable-storage policy.

Research papers (tier 3):

Kleppmann, Wiggins, van Hardenberg, McGranaghan, “Local-first software” (Ink & Switch, 2019) — the seven local-first ideals.
Shapiro et al., “Conflict-free Replicated Data Types” (INRIA RR-7687, 2011) — CRDT formal foundations.
Kleppmann & Beresford, “Interleaving anomalies in collaborative text editors” (PaPoC 2019) — interleaving problem statement.
Weidner, Nicolaescu, et al., “Minimizing Interleaving in Collaborative Text Editing” (2023) — Fugue / FugueMax algorithms.
Ellis & Gibbs, “Concurrency Control in Groupware Systems” (SIGMOD 1989) — original OT paper.
CRDTs: The Hard Parts — Martin Kleppmann — practitioner-facing CRDT design challenges.
CRDT Papers Collection — academic CRDT research index.

Production write-ups (tier 5):

Figma’s multiplayer technology — Figma Blog — CRDT-inspired centralized engine.
Figma is powered by WebAssembly — Figma Blog — C++/Wasm canvas engine.
What can I do offline in Figma? — Figma Help — 30-day / 7-day retention windows.
How we made Notion available offline — Notion Blog — block CRDT and SQLite cache.
Notion 2.53 release notes (2025-08-19) — offline launch.
Scaling the Linear Sync Engine — Linear — delta sync at scale.
Reverse engineering Linear’s sync engine — wzhudev (CTO-endorsed) — sync ID and delta packet shape.
Building Excalidraw’s P2P collaboration — Excalidraw Blog — pseudo-P2P architecture.
End-to-end encryption in the browser — Excalidraw Blog — Web Crypto + URL-fragment key.
HTTP Archive 2025 Web Almanac — PWA chapter — Service Worker and Workbox adoption.

Library docs (tier 5):

Yjs documentation — production CRDT library.
y-indexeddb — IndexedDB persistence adapter.
Automerge — JSON CRDT library.