
CRDTs for Collaborative Systems

Conflict-free Replicated Data Types (CRDTs) are data structures mathematically guaranteed to converge to the same state across distributed replicas without coordination. They solve the fundamental challenge of distributed collaboration: allowing concurrent updates while ensuring eventual consistency without locking or consensus protocols.

This article covers CRDT fundamentals, implementation variants, production deployments, and when to choose CRDTs over Operational Transformation (OT).

Mathematical Foundation

Core CRDT Property

[Figure: Replicas 1–3 exchange state/operations and merge; all converge to the same result. Convergence is guaranteed by three properties of the merge function:]

  • Commutative: merge(A,B) = merge(B,A)
  • Associative: merge(merge(A,B),C) = merge(A,merge(B,C))
  • Idempotent: merge(A,A) = A

CRDTs guarantee convergence through mathematical properties of the merge function—order of operations, network duplicates, and timing are irrelevant.

CRDTs guarantee convergence through join-semilattice properties: merge operations must be commutative, associative, and idempotent. This makes them fundamentally different from consensus-based systems—no coordination overhead, no leader election, no blocking on network partitions.

Core mental model:

  • State-based (CvRDT): Send full state, merge via lattice join. Simple semantics, any delivery order works. Cost: state transfer size.
  • Operation-based (CmRDT): Send operations only. Small messages, but requires exactly-once causal delivery. Cost: delivery infrastructure.
  • Delta-state: Hybrid—send incremental state changes. Best of both when implemented correctly.

Key insight: The choice between variants is a delivery infrastructure vs. merge complexity tradeoff. State-based pushes complexity into the merge function and tolerates unreliable networks. Operation-based pushes complexity into the delivery layer and achieves smaller messages.

Production reality: Most systems use hybrid approaches. Figma uses operation-based with server ordering. Yjs and Automerge use optimized delta-state. Riak uses state-based with delta optimization.

Approach 1: Last-Writer-Wins with Wall Clocks

// Naive LWW - seems simple
function merge<T>(a: { value: T; timestamp: number }, b: { value: T; timestamp: number }) {
  return a.timestamp > b.timestamp ? a : b
}

Fails because:

  • Clock skew: Node A’s clock is 5 seconds ahead. Its writes always win regardless of actual order.
  • Lost updates: Two users edit simultaneously. One user’s entire work disappears.
  • Non-deterministic: The same updates processed in different orders can produce different results across replicas (see the toy trace below).
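
To make the non-determinism concrete, here is a toy trace against the merge above—with tied timestamps, the ternary picks a different winner depending on argument order, so two replicas that receive the same writes in different orders diverge permanently:

const a = { value: "alice's edit", timestamp: 1700000000000 }
const b = { value: "bob's edit", timestamp: 1700000000000 } // concurrent write, same timestamp

console.log(merge(a, b).value) // "bob's edit"   - replica 1 received a first
console.log(merge(b, a).value) // "alice's edit" - replica 2 received b first
// Same inputs, different results: the replicas never converge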

Approach 2: Locking/Pessimistic Concurrency

Fails because:

  • Unavailable during partitions: Can’t acquire lock when network splits.
  • Latency penalty: Every operation requires round-trip to lock server.
  • Deadlocks: Distributed deadlock detection is complex and slow.

Approach 3: Consensus on Every Write (Paxos/Raft)

Fails because:

  • Unavailable during partitions: Consensus requires majority quorum.
  • High latency: Multiple round-trips per write (Paxos: 2 RTTs minimum).
  • Doesn’t scale: Every write touches every node. P2P scenarios impossible.

The fundamental tension: strong consistency requires coordination, and coordination sacrifices availability during network partitions (the CAP theorem).

CRDTs resolve this by designing data structures where concurrent operations commute. Instead of preventing conflicts, they make it mathematically impossible for conflicting updates to produce divergent states.

CRDTs are built on join-semilattices—partially ordered sets where any two elements have a least upper bound (join/merge).

Formal requirements for merge function ⊔:

| Property | Definition | Why It Matters |
|---|---|---|
| Commutative | A ⊔ B = B ⊔ A | Order of receiving updates is irrelevant |
| Associative | (A ⊔ B) ⊔ C = A ⊔ (B ⊔ C) | Grouping of merges is irrelevant |
| Idempotent | A ⊔ A = A | Duplicate messages are harmless |

Monotonicity constraint: Updates must be inflations—they can only move “up” in the lattice ordering. You can never return to a previous state.

“Any state-based object satisfying the monotonic semilattice property is strongly eventually consistent.” — Shapiro et al., “A Comprehensive Study of Convergent and Commutative Replicated Data Types” (2011)

CRDTs provide Strong Eventual Consistency:

  1. Eventual delivery: Every update eventually reaches every replica
  2. Convergence: Replicas that have received the same updates are in identical states
  3. Termination: All operations complete locally without blocking

This is stronger than plain eventual consistency—which promises only that replicas converge at some unspecified point—because SEC guarantees operations complete locally without blocking and that replicas with the same updates are already in identical states.

State-Based CRDTs (CvRDT)

How it works:

  1. Each replica maintains full local state
  2. Periodically, replicas exchange their complete state
  3. Receiving replica merges incoming state with local state using join operation
  4. Merge function satisfies semilattice properties

When to choose this path:

  • Network is unreliable (messages may be lost, duplicated, reordered)
  • State size is bounded or compressible
  • Merge function is computationally cheap
  • Simpler implementation is preferred over message efficiency

Key characteristics:

  • Any gossip protocol works—no ordering guarantees needed
  • Duplicate messages are automatically handled (idempotence)
  • Full state must be transmitted on every sync

G-Counter example (state-based):

type NodeId = string

interface GCounter {
  counts: Map<NodeId, number> // Each node tracks its own increment count
}

function increment(counter: GCounter, nodeId: NodeId): GCounter {
  const newCounts = new Map(counter.counts)
  newCounts.set(nodeId, (counter.counts.get(nodeId) ?? 0) + 1)
  return { counts: newCounts }
}

function merge(a: GCounter, b: GCounter): GCounter {
  // Pairwise maximum - satisfies all semilattice properties
  const merged = new Map<NodeId, number>()
  const allNodes = new Set([...a.counts.keys(), ...b.counts.keys()])
  for (const nodeId of allNodes) {
    merged.set(nodeId, Math.max(a.counts.get(nodeId) ?? 0, b.counts.get(nodeId) ?? 0))
  }
  return { counts: merged }
}

function value(counter: GCounter): number {
  return [...counter.counts.values()].reduce((sum, n) => sum + n, 0)
}

Trade-offs:

| Advantage | Disadvantage |
|---|---|
| Simple delivery—any order, duplicates OK | State transfer can be expensive |
| Easy to reason about—state is self-describing | State can grow unbounded (actor IDs, tombstones) |
| Works over unreliable networks | Merge computation on every sync |

Real-world: Riak uses state-based CRDTs with delta optimization. They maintain full state but only transmit deltas when possible.

Operation-Based CRDTs (CmRDT)

How it works:

  1. Each replica maintains local state
  2. Operations are applied locally and broadcast to other replicas
  3. Receiving replicas apply operations to their local state
  4. Operations must commute when applied concurrently

When to choose this path:

  • Reliable, exactly-once, causally-ordered delivery available
  • Operations are small relative to state size
  • Low-latency propagation is critical
  • Can invest in delivery infrastructure

Key characteristics:

  • Small message size (just the operation)
  • Requires reliable causal broadcast—significant infrastructure investment
  • Must handle late joiners (replay history or checkpoint + recent ops)

G-Counter example (operation-based):

type Operation = { type: "increment"; nodeId: string; amount: number }

function apply(counter: number, op: Operation): number {
  // Operations must be delivered exactly once, in causal order
  return counter + op.amount
}

// The delivery layer must guarantee:
// 1. No duplicates
// 2. No lost messages
// 3. Causal ordering (if A caused B, A delivered before B)
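
To make that delivery contract concrete, here is a minimal sketch of a causal-delivery buffer: operations carry per-sender sequence numbers plus the dependencies observed at creation time, and are held back until ready. The types and names (`CausalOp`, `isReady`, `receive`) are illustrative assumptions, not a standard API:

interface CausalOp {
  senderId: string
  seq: number // per-sender sequence number: 1, 2, 3, ...
  deps: Map<string, number> // highest seq observed from every other node at creation time
  op: Operation
}

const delivered = new Map<string, number>() // highest contiguous seq applied, per sender
const pending: CausalOp[] = []

function isReady(msg: CausalOp): boolean {
  // Next in order from its sender, and all causal dependencies already applied
  if (msg.seq !== (delivered.get(msg.senderId) ?? 0) + 1) return false
  for (const [nodeId, seq] of msg.deps) {
    if (nodeId !== msg.senderId && (delivered.get(nodeId) ?? 0) < seq) return false
  }
  return true
}

function receive(msg: CausalOp, applyOp: (op: Operation) => void): void {
  // Duplicate suppression: already-applied sequence numbers are ignored
  if (msg.seq <= (delivered.get(msg.senderId) ?? 0)) return
  pending.push(msg)
  let madeProgress = true
  while (madeProgress) {
    madeProgress = false
    const idx = pending.findIndex(isReady)
    if (idx >= 0) {
      const [ready] = pending.splice(idx, 1)
      delivered.set(ready.senderId, ready.seq)
      applyOp(ready.op) // exactly once, in causal order
      madeProgress = true
    }
  }
}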

Trade-offs:

| Advantage | Disadvantage |
|---|---|
| Small messages (operations only) | Requires reliable causal delivery layer |
| Immediate propagation possible | Must track history for late joiners |
| Lower bandwidth | More complex reasoning about concurrency |

Real-world: Figma uses operation-based approach with their own delivery layer. Server provides ordering and validation. They invested heavily in transport infrastructure to achieve low-latency sync.

Delta-State CRDTs

How it works:

  1. Track changes since last sync as “deltas”
  2. Send only the delta (incremental state change), not full state
  3. Deltas are themselves CRDTs—they merge like states (see the sketch after this list)
  4. Falls back to full state sync when deltas unavailable
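
A sketch of the idea applied to the earlier G-Counter: the mutator returns a small delta that merges through exactly the same join as full state. This illustrates the pattern, not any particular library's API:

// A delta is itself a (partial) GCounter, so it merges with the same join as full state
function incrementWithDelta(counter: GCounter, nodeId: NodeId): { state: GCounter; delta: GCounter } {
  const newCount = (counter.counts.get(nodeId) ?? 0) + 1
  const delta: GCounter = { counts: new Map<NodeId, number>([[nodeId, newCount]]) }
  return { state: merge(counter, delta), delta }
}

// Between syncs, accumulate deltas and ship only the (small) accumulated delta:
//   deltaBuffer = merge(deltaBuffer, delta)
// The receiver runs merge(localState, receivedDelta) - the same code path as
// full-state sync. If a peer is new or a delta was missed, fall back to full state.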

When to choose this path:

  • Want small messages like op-based but unreliable network like state-based
  • State is large but changes are typically small
  • Can track sync points between replicas

Key insight from research:

“Delta-state CRDTs combine the distributed nature of operation-based CRDTs with the uniquely simple model of state-based CRDTs.” — Almeida et al., “Delta State Replicated Data Types” (2018)

Trade-offs:

| Advantage | Disadvantage |
|---|---|
| Small messages in common case | Must track sync state per peer |
| Works over unreliable networks | Delta storage overhead |
| Falls back gracefully to full state | More complex implementation |

Real-world: Yjs and Automerge both use delta-state approaches with sophisticated compression.

Variant Comparison

| Factor | State-Based | Operation-Based | Delta-State |
|---|---|---|---|
| Message size | Full state | Single operation | Incremental delta |
| Delivery requirement | Any (gossip OK) | Exactly-once, causal | Any |
| Late joiner handling | Send current state | Replay history | Send deltas or state |
| Implementation complexity | In merge function | In delivery layer | In delta tracking |
| Network partition tolerance | Excellent | Poor | Excellent |
| Typical latency | Higher (batched) | Lower (immediate) | Medium |

[Decision flowchart: Need CRDT → Network reliability? Unreliable/P2P → State size? Small/bounded → State-based; Large/unbounded → Delta-state. Reliable infrastructure → Operation-based → Can guarantee causal delivery? Yes → Pure operation-based.]

Counter CRDTs

G-Counter (Grow-Only): Covered above. Foundation for other counters.

PN-Counter (Positive-Negative): Two G-Counters—one for increments, one for decrements.

interface PNCounter {
  P: GCounter // Positive increments
  N: GCounter // Negative increments (decrements)
}

// GCounter.increment/value/merge refer to the G-Counter functions defined
// earlier, grouped here as a module for readability
function increment(counter: PNCounter, nodeId: string): PNCounter {
  return { ...counter, P: GCounter.increment(counter.P, nodeId) }
}

function decrement(counter: PNCounter, nodeId: string): PNCounter {
  return { ...counter, N: GCounter.increment(counter.N, nodeId) }
}

function value(counter: PNCounter): number {
  return GCounter.value(counter.P) - GCounter.value(counter.N)
}

function merge(a: PNCounter, b: PNCounter): PNCounter {
  return {
    P: GCounter.merge(a.P, b.P),
    N: GCounter.merge(a.N, b.N),
  }
}

Register CRDTs

LWW-Register (Last-Writer-Wins): Associates a timestamp with each update. Highest timestamp wins.

interface LWWRegister<T> {
  value: T
  timestamp: number // Lamport timestamp, NOT wall clock
  nodeId: string // Tie-breaker for equal timestamps
}

function merge<T>(a: LWWRegister<T>, b: LWWRegister<T>): LWWRegister<T> {
  if (a.timestamp > b.timestamp) return a
  if (b.timestamp > a.timestamp) return b
  // Equal timestamps: deterministic tie-breaker
  return a.nodeId > b.nodeId ? a : b
}

Critical: Use Lamport timestamps, not wall clocks. Wall clock skew causes non-deterministic behavior.

MV-Register (Multi-Value): Preserves all concurrent values instead of picking winner.

interface MVRegister<T> {
  values: Map<VectorClock, T> // All concurrent values, keyed by the vector clock that wrote them
}

function write<T>(reg: MVRegister<T>, value: T, clock: VectorClock): MVRegister<T> {
  // Remove all values that this write supersedes (dominates = strictly
  // greater in the vector-clock partial order; helper sketched below)
  const newValues = new Map<VectorClock, T>()
  for (const [vc, v] of reg.values) {
    if (!dominates(clock, vc)) {
      newValues.set(vc, v) // Keep concurrent values
    }
  }
  newValues.set(clock, value)
  return { values: newValues }
}

function read<T>(reg: MVRegister<T>): T[] {
  return [...reg.values.values()] // May return multiple values
}

Design decision: MV-Register pushes conflict resolution to the application. Use when automatic resolution (LWW) loses important data.

Set CRDTs

G-Set (Grow-Only): Elements can only be added, never removed.

2P-Set (Two-Phase): Add-set and remove-set. Element present if in add-set but not remove-set.

  • Limitation: Once removed, element can never be re-added.

OR-Set (Observed-Remove): Most practical set CRDT. Each element tagged with unique ID. Remove only affects observed tags.

type Tag = string // Globally unique (e.g., UUID or nodeId + sequence)

interface ORSet<T> {
  elements: Map<T, Set<Tag>> // Element -> set of live tags
  tombstones: Set<Tag> // Removed tags - without these, merge would resurrect removed elements
}

function add<T>(set: ORSet<T>, element: T, tag: Tag): ORSet<T> {
  const tags = new Set(set.elements.get(element) ?? [])
  tags.add(tag)
  const newElements = new Map(set.elements)
  newElements.set(element, tags)
  return { ...set, elements: newElements }
}

function remove<T>(set: ORSet<T>, element: T): ORSet<T> {
  // Tombstone only the tags we've "observed" - concurrent adds carry unseen tags and survive
  const observed = set.elements.get(element) ?? new Set<Tag>()
  const newElements = new Map(set.elements)
  newElements.delete(element)
  return { elements: newElements, tombstones: new Set([...set.tombstones, ...observed]) }
}

function merge<T>(a: ORSet<T>, b: ORSet<T>): ORSet<T> {
  const tombstones = new Set([...a.tombstones, ...b.tombstones])
  const merged = new Map<T, Set<Tag>>()
  const allElements = new Set([...a.elements.keys(), ...b.elements.keys()])
  for (const element of allElements) {
    const tagsA = a.elements.get(element) ?? new Set<Tag>()
    const tagsB = b.elements.get(element) ?? new Set<Tag>()
    // Union of both sides' tags, minus anything either side has removed
    const liveTags = new Set([...tagsA, ...tagsB].filter((t) => !tombstones.has(t)))
    if (liveTags.size > 0) {
      merged.set(element, liveTags)
    }
  }
  return { elements: merged, tombstones }
}

function has<T>(set: ORSet<T>, element: T): boolean {
  return (set.elements.get(element)?.size ?? 0) > 0
}

Add-wins semantics: Concurrent add and remove → element present. This matches user intuition in most applications.
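
A toy trace of add-wins semantics using the OR-Set functions above:

let replicaA: ORSet<string> = { elements: new Map(), tombstones: new Set() }
replicaA = add(replicaA, "milk", "tagA1")
let replicaB = replicaA // both replicas have observed {milk -> {tagA1}}

// Concurrently: A removes milk, B adds it again with a fresh tag
replicaA = remove(replicaA, "milk") // tombstones tagA1 - the only tag A observed
replicaB = add(replicaB, "milk", "tagB1") // milk -> {tagA1, tagB1}

const merged = merge(replicaA, replicaB)
console.log(has(merged, "milk")) // true - the remove never observed tagB1, so the add wins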

Sequence CRDTs

Sequence CRDTs enable collaborative text editing. Each character gets a unique, ordered identifier that persists across all replicas.

The challenge: Users A and B both type at position 5. A types “foo”, B types “bar”. Naive merge produces “fboaor” instead of “foobar” or “barfoo”.

Initial: "Hello|World" (both cursors at position 5)
User A: "Hello|foo|World"
User B: "Hello|bar|World"
Naive: "Hellofboaor World" ← Characters interleaved!
Correct: "HellofoobarWorld" or "HellobarfooWorld"
| Algorithm | Approach | Interleaving | ID Growth | Notes |
|---|---|---|---|---|
| RGA | Linked list + timestamps | Can interleave | Linear | Good general performance |
| Logoot/LSEQ | Fractional positions | Can interleave | Unbounded | Simple but IDs grow |
| Fugue | Designed for non-interleaving | Minimal | Bounded | Proven maximal non-interleaving |
| Eg-walker | Event graph replay | Minimal | Bounded | State-of-the-art performance |

Fugue algorithm (2023): Specifically designed to satisfy “maximal non-interleaving”—concurrent inserts at the same position are never interleaved.

“We prove that Fugue satisfies a maximally strong non-interleaving property.” — Weidner et al., “The Art of the Fugue” (2023)

Rich-Text Formatting: Peritext

Peritext handles inline formatting (bold, italic, links) in CRDTs.

Key insight: Formatting spans are linked to stable character identifiers, not positions. Formatting marks accumulate—remove marks don’t delete add marks, they add counter-marks.

Design decisions:

  • Expand on edges: When typing at the end of bold text, new characters inherit bold formatting
  • Anchor to characters: Formatting boundaries attached to character IDs, not positions
  • Never delete marks: Add “remove bold” marks rather than deleting “add bold” marks (see the sketch below)
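
A sketch of what a formatting mark might look like under these rules. The types here are illustrative assumptions—Peritext's actual encoding differs:

// Marks reference stable character IDs, never integer positions
interface FormatMark {
  type: "addBold" | "removeBold" | "addLink" // counter-marks, never deletions
  start: { charId: string; side: "before" | "after" } // anchors survive concurrent inserts
  end: { charId: string; side: "before" | "after" }
  opId: string // e.g. Lamport timestamp + node ID, for deterministic ordering
  attrs?: Record<string, string> // e.g. { url: "..." } for links
}

// A character is bold if the highest-opId bold-related mark covering it is
// "addBold" rather than "removeBold" - every replica computes the same answer
// because opId ordering is deterministic.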

Case Study: Figma

Context: Real-time collaborative design tool. Millions of concurrent users editing complex documents.

Implementation choices:

  • Pattern variant: Operation-based with server ordering
  • Architecture: WebSocket to multiplayer servers; server is authoritative
  • Conflict resolution: LWW per-property-per-object. Two users changing different properties don’t conflict (toy sketch below).
  • Persistence: State in-memory, checkpointed to S3 every 30-60 seconds. Transaction log in DynamoDB.

[Architecture diagram: Clients 1–3 send operations to the Multiplayer Server, which broadcasts ordered ops back to every client and checkpoints to DynamoDB + S3.]
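
A toy model of per-property LWW keyed by server-assigned sequence numbers (names and schema are illustrative assumptions, not Figma's actual implementation):

type PropertyKey = string // e.g. "node:42/fill" - one entry per (object, property) pair

interface PropertyWrite {
  value: unknown
  seq: number // assigned by the server's single ordering stream
}

const doc = new Map<PropertyKey, PropertyWrite>()

function applyOrderedOp(key: PropertyKey, value: unknown, seq: number): void {
  const current = doc.get(key)
  if (!current || seq > current.seq) {
    doc.set(key, { value, seq }) // last writer in server order wins - per property
  }
  // Writes to *different* properties of the same object touch different keys,
  // so they never conflict with each other
}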

What worked:

  • Simplified conflict resolution by making server authoritative for ordering
  • LWW per-property avoids complex merge logic for most operations

What was hard:

  • Text editing required more sophisticated merging—adopted Eg-walker algorithm for code layers (2024)
  • Handling network partitions while maintaining responsiveness

Source: Figma Engineering Blog - How Figma’s Multiplayer Technology Works

Case Study: Yjs

Context: Most popular CRDT library. 900k+ weekly npm downloads. Powers many collaborative editors.

Implementation choices:

  • Pattern variant: Delta-state with sophisticated encoding
  • Architecture: Network-agnostic. Works P2P, client-server, or hybrid.
  • Data model: Shared types (Y.Map, Y.Array, Y.Text) that mirror JavaScript types

Internal architecture:

Encoding

V1: Variable integers

V2: Run-length encoding

Yjs Document

Shared Types

Struct Store

Block 1: clientId=A, seq=0

Block 2: clientId=A, seq=1

Block 3: clientId=B, seq=0

Key optimizations:

  • Block merging: Consecutive operations from same client merged into single block
  • Efficient encoding: V2 encoding influenced by Automerge research—run-length encoding for sequences
  • Deleted content removal: Can delete content from deleted structs (tombstones kept, content discarded)

What worked:

  • Network-agnostic design enables diverse deployment scenarios
  • Provider architecture separates sync logic from data logic

What was hard:

  • Garbage collection—tombstones accumulate over time
  • Large documents can have significant memory overhead

Source: Yjs Documentation, Yrs Architecture Deep Dive

Case Study: Automerge

Context: Designed for local-first software where the user’s device is primary. “PostgreSQL for your local-first app.”

Implementation choices:

  • Pattern variant: State-based with optimized sync
  • Architecture: Rust core, bindings for JS/WASM, C, Swift
  • Philosophy: Automatic merging—no git-style merge conflicts exposed to users

Key differentiators:

  • Rigorous proofs: Convergence verified with Isabelle theorem prover
  • Compact storage: Sophisticated compression format
  • Deterministic conflict resolution: Lamport timestamps + actor IDs ensure same result everywhere

What worked:

  • Clean API hides CRDT complexity from application developers
  • Excellent offline support out of the box

What was hard:

  • Performance at scale—led to complete Rust rewrite
  • Sync protocol complexity

Source: Automerge Documentation, Local-First Software

Case Study: Riak

Context: Distributed key-value database. First production database to adopt CRDTs (2012).

Implementation choices:

  • Pattern variant: State-based with delta optimization
  • Types supported: Counters, Sets, Maps, HyperLogLogs (bucket-level); Flags, Registers (embedded)
  • Consistency: Vector clocks for causality, optional causal consistency mode

Production scale (League of Legends):

  • 7.5 million concurrent users
  • 11,000 messages per second
  • In-game chat powered by Riak CRDTs

Challenges encountered:

  • Sets perform poorly for writes as cardinality grows
  • Sets >500KB have issues—addressed with delta-replication and decomposition
  • Recommended: decompose large sets into multiple entries

Source: Riak Data Types Documentation

Implementation Comparison

| Aspect | Figma | Yjs | Automerge | Riak |
|---|---|---|---|---|
| Variant | Op-based (server) | Delta-state | State-based | State + delta |
| Architecture | Centralized | Any | P2P/local-first | Distributed DB |
| Offline support | Limited | Excellent | Excellent | N/A (server) |
| Rich text | Eg-walker | Native | Peritext | N/A |
| Maturity | Production | Production | Production | Production |
| Best for | Real-time SaaS | Editor libraries | Local-first apps | Key-value stores |

Garbage Collection and Tombstones

The tombstone problem: Deleted elements leave markers (tombstones) that must persist until all replicas have seen them. These accumulate over time.

Why tombstones exist: If replica A has tombstone for X, and replica B still has X, A must keep tombstone so X gets deleted when B syncs. Otherwise, X “resurrects.”

Strategies:

| Strategy | Mechanism | Trade-off |
|---|---|---|
| Epoch-based | Split into live/compacted portions at version vector boundary | Requires version vector tracking |
| Stability-based | Remove when update known to all replicas | Requires global knowledge |
| Time-based | Remove after gc_grace_seconds (Cassandra) | May resurrect if replica rejoins late |
| Consensus-based | Paxos/2PC to agree on removal | Defeats purpose of coordination-free |

Production approach (Cassandra): gc_grace_seconds defaults to 10 days. Tombstones older than this are removed during compaction. Set based on expected node recovery time.
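
A minimal sketch of the stability-based strategy from the table above: a tombstone can be dropped once every replica's version vector is known to cover the delete. Gathering those vectors is the "global knowledge" cost; the types here are illustrative:

type VersionVector = Map<string, number>

interface SetTombstone {
  nodeId: string // replica that issued the delete
  seq: number // that replica's sequence number for the delete
}

function isStable(t: SetTombstone, allReplicaVectors: VersionVector[]): boolean {
  // Every replica has seen the delete iff every known vector covers (nodeId, seq).
  // Collecting a current vector from *every* replica is the hard part in practice.
  return allReplicaVectors.every((vv) => (vv.get(t.nodeId) ?? 0) >= t.seq)
}

function gcTombstones(tombstones: SetTombstone[], allReplicaVectors: VersionVector[]): SetTombstone[] {
  return tombstones.filter((t) => !isStable(t, allReplicaVectors))
}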

Logical Clocks

Lamport timestamps: Simple logical clock. Provides partial ordering but cannot distinguish concurrent events.

// Lamport timestamp: increment on every event, take max+1 on receive
let clock = 0

function localEvent(): number {
  return ++clock
}

function receiveEvent(remoteTimestamp: number): number {
  clock = Math.max(clock, remoteTimestamp) + 1
  return clock
}

Vector clocks: Full causality tracking. Can determine happened-before, happened-after, or concurrent.

type VectorClock = Map<NodeId, number>

function increment(vc: VectorClock, nodeId: NodeId): VectorClock {
  const newVc = new Map(vc)
  newVc.set(nodeId, (vc.get(nodeId) ?? 0) + 1)
  return newVc
}

function merge(a: VectorClock, b: VectorClock): VectorClock {
  const merged = new Map<NodeId, number>()
  const allNodes = new Set([...a.keys(), ...b.keys()])
  for (const nodeId of allNodes) {
    merged.set(nodeId, Math.max(a.get(nodeId) ?? 0, b.get(nodeId) ?? 0))
  }
  return merged
}

function happenedBefore(a: VectorClock, b: VectorClock): boolean {
  // a < b iff every entry in a <= the corresponding entry in b, and at least one is strictly <
  let hasLess = false
  for (const [nodeId, aTime] of a) {
    const bTime = b.get(nodeId) ?? 0
    if (aTime > bTime) return false
    if (aTime < bTime) hasLess = true
  }
  for (const [nodeId, bTime] of b) {
    if (!a.has(nodeId) && bTime > 0) hasLess = true
  }
  return hasLess
}

function concurrent(a: VectorClock, b: VectorClock): boolean {
  return !happenedBefore(a, b) && !happenedBefore(b, a)
}

Cost: O(N) space per operation where N is number of nodes. For large systems, consider hybrid logical clocks or bounded vector clocks.

Handling Late Joiners

Challenge: A new replica needs the full state. For large documents with long histories, this is expensive.

Strategies:

  1. Full state transfer: Simple but costly. Works for state-based CRDTs.
  2. Checkpoint + recent ops: Periodically snapshot state; new nodes get snapshot + ops since.
  3. Delta sync: Track version vectors; send deltas since last known state.

Eg-walker approach: Store event graph on disk, document state as plain text. New joiners get current text + subset of event graph needed for future merges.

CRDTs vs. Operational Transformation

  • OT: Invented in the late 1980s. Powers Google Docs. Requires a central server for transformation.
  • CRDT: First proposed in 2006 (WOOT). Designed for decentralized/offline systems.

OT: Transform operations against each other to preserve intent.

User A: insert("X", 5) → transform against B's op → insert("X", 6)
User B: insert("Y", 3) → transform against A's op → insert("Y", 3)

CRDT: Design operations that commute naturally.

User A: insert("X", id_a) → apply directly
User B: insert("Y", id_b) → apply directly
Result: determined by ID ordering, not transformation
| Aspect | OT | CRDT |
|---|---|---|
| Architecture | Requires central server | Works P2P or centralized |
| Offline support | Poor (needs server) | Excellent |
| Intent preservation | Better (transforms designed for it) | Varies by algorithm |
| Implementation complexity | High (transform functions error-prone) | Moderate |
| Proven correctness | Difficult (many flawed algorithms published) | Mathematical proofs possible |

Choose OT when:

  • Always-online application with reliable connectivity
  • Centralized architecture already exists
  • Intent preservation is critical (e.g., cursor positions in concurrent edits)
  • Using existing OT infrastructure (Google Docs API, ShareDB)

Choose CRDT when:

  • Offline-first is required
  • P2P or decentralized architecture
  • Multi-device sync with unreliable networks
  • Edge computing scenarios
  • Want mathematical guarantees of convergence

Recent research (Eg-walker, 2024) combines benefits of both:

“Eg-walker achieves order of magnitude less memory than existing CRDTs, orders of magnitude faster document loading, and orders of magnitude faster branch merging than OT—all while working P2P without a central server.” — Gentle & Kleppmann, “Collaborative Text Editing with Eg-walker” (EuroSys 2025)

Key insight: Store the event graph (DAG of edits) on disk. Keep document state as plain text in memory. Build CRDT structure temporarily during merge, then discard.

Common Mistakes

The mistake: Not planning for tombstone/history cleanup.

Example: Notion-like app storing all operations. After 1 year, loading a page requires replaying 100K operations taking 30+ seconds.

Solutions:

  • Periodic state snapshots with operation truncation
  • Tombstone GC with grace period
  • Compaction strategies (Eg-walker approach)

The mistake: Building features that assume immediate consistency.

Example: “Undo” feature that undoes “last operation”—fails with concurrent edits because “last” is ambiguous.

Solutions:

  • Design for concurrent operations from the start
  • Use causal consistency, not wall-clock ordering
  • Undo should undo “my last operation,” not “the last operation”

The mistake: Using wall clocks for LWW timestamps.

Example: Server with clock 5 seconds ahead always wins conflicts, even against more recent actual edits.

Solutions:

  • Use Lamport timestamps or vector clocks
  • If wall clocks are required, use hybrid logical clocks (HLC)—sketched below
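
A minimal hybrid logical clock sketch (following Kulkarni et al.'s HLC algorithm, simplified with no bounded-drift check): timestamps stay close to wall-clock time but never violate causality, even when a machine's clock is skewed or steps backwards:

// HLC timestamp = (physical, logical): compares like a Lamport clock
// but stays close to real time
let lastPhysical = 0
let logical = 0

function hlcLocalEvent(wallClockMs: number): { physical: number; logical: number } {
  if (wallClockMs > lastPhysical) {
    lastPhysical = wallClockMs // wall clock advanced - reset the counter
    logical = 0
  } else {
    logical++ // wall clock stalled or went backwards - the counter preserves ordering
  }
  return { physical: lastPhysical, logical }
}

function hlcReceive(wallClockMs: number, remote: { physical: number; logical: number }) {
  const maxPhysical = Math.max(wallClockMs, lastPhysical, remote.physical)
  if (maxPhysical === lastPhysical && maxPhysical === remote.physical) {
    logical = Math.max(logical, remote.logical) + 1
  } else if (maxPhysical === remote.physical) {
    logical = remote.logical + 1 // remote clock is ahead - stay causally after it
  } else if (maxPhysical === lastPhysical) {
    logical++
  } else {
    logical = 0 // local wall clock is strictly ahead of everything seen
  }
  lastPhysical = maxPhysical
  return { physical: lastPhysical, logical }
}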

The mistake: Using OR-Set for large, frequently-modified collections.

Example (Riak): OR-Sets >500KB have significant performance issues—each element carries metadata.

Solutions:

  • Decompose into multiple smaller sets
  • Use the right data structure for the job—e.g., a counter CRDT rather than a set when you only need to count
  • Consider application-level sharding

The mistake: Assuming merge is always fast.

Example: State-based CRDT with O(n²) merge. Works fine with 100 elements, falls over with 10,000.

Solutions:

  • Analyze merge complexity during design
  • Use delta-state to reduce merge frequency
  • Profile with realistic data sizes

Choosing a Library

[Decision flowchart: Need a collaborative data structure → Existing library fits? Yes → pick by runtime: JavaScript → Yjs or Automerge; Rust → Loro or Diamond Types; JVM → Akka Distributed Data; Erlang → riak_dt. No → Team CRDT experience? Low → use a library anyway—CRDTs are subtle. High → Performance critical? Yes → Custom implementation.]

| Library | Language | Best For | Maturity |
|---|---|---|---|
| Yjs | JS/TS | Collaborative editing | Production (900k+ weekly downloads) |
| Automerge | JS/Rust | Local-first apps | Production |
| Loro | Rust/JS | Rich text, moveable trees | Production |
| Diamond Types | Rust | High-performance text | Production |
| Akka Distributed Data | JVM | Actor-based systems | Production |
| riak_dt | Erlang | Key-value stores | Production |

Only build custom when existing libraries genuinely don’t fit. If you do build your own:

  • Define merge semantics precisely before coding
  • Prove commutativity/associativity/idempotence mathematically
  • Handle clock skew (use logical clocks)
  • Plan garbage collection strategy upfront
  • Test with network partition simulation (e.g., Jepsen)
  • Test with artificial latency injection
  • Benchmark with realistic data sizes
  • Consider formal verification (TLA+, Isabelle)

CRDTs provide a mathematically rigorous solution to distributed collaboration. The semilattice properties—commutative, associative, idempotent merge—guarantee convergence without coordination.

Key decisions:

  1. State vs operation vs delta: Trade delivery complexity against message size
  2. Data structure selection: Match CRDT type to application semantics (counters, sets, sequences)
  3. Build vs buy: Use existing libraries unless you have exceptional requirements and CRDT expertise

The ecosystem is mature: Yjs, Automerge, and Loro are production-ready. Recent algorithms (Fugue, Eg-walker) solve longstanding problems like interleaving and state growth.

Start simple: LWW-Register and OR-Set cover most use cases. Graduate to sequence CRDTs only when building collaborative text editing.

Prerequisites

  • Distributed systems fundamentals (network partitions, consistency models)
  • CAP theorem and eventual consistency
  • Basic understanding of partial orders and lattice theory helpful but not required
Glossary

| Term | Definition |
|---|---|
| CvRDT | Convergent (state-based) CRDT |
| CmRDT | Commutative (operation-based) CRDT |
| Tombstone | Marker indicating a deleted element; must persist for convergence |
| Vector clock | Logical clock tracking causality across multiple nodes |
| Lamport timestamp | Simple logical clock; partial ordering only |
| Join-semilattice | Set with a join (merge) operation that is commutative, associative, idempotent |
| SEC | Strong Eventual Consistency—replicas converge to identical state after receiving the same updates |
| OT | Operational Transformation—alternative approach requiring a central server |
Key Takeaways

  • CRDTs guarantee convergence through join-semilattice properties: merge must be commutative, associative, and idempotent
  • State-based CRDTs tolerate unreliable networks but transfer full state; operation-based require reliable causal delivery but send small messages; delta-state is the practical hybrid
  • OR-Set is the most practical set CRDT; Fugue and Eg-walker are state-of-the-art for text editing
  • Production systems (Figma, Yjs, Automerge, Riak) all use hybrid approaches optimized for their use cases
  • Garbage collection is the primary operational challenge—plan tombstone strategy from the start
  • Use existing libraries (Yjs, Automerge, Loro) unless requirements are exceptional

References

Foundational Papers:

  • Shapiro et al., “A Comprehensive Study of Convergent and Commutative Replicated Data Types” (2011)
  • Almeida et al., “Delta State Replicated Data Types” (2018)

Text Editing:

  • Weidner et al., “The Art of the Fugue” (2023)
  • Gentle & Kleppmann, “Collaborative Text Editing with Eg-walker” (EuroSys 2025)
  • Litt et al., “Peritext: A CRDT for Rich-Text Collaboration” (2022)

Production Implementations:

  • Figma Engineering Blog — How Figma’s Multiplayer Technology Works
  • Yjs Documentation; Yrs Architecture Deep Dive
  • Automerge Documentation; Local-First Software
  • Riak Data Types Documentation
