Core Distributed Patterns

Operational Transformation

Deep-dive into Operational Transformation (OT): the algorithm powering Google Docs, with its design variants, correctness properties, and production trade-offs.

OT enables real-time collaborative editing by transforming concurrent operations so that all clients converge to the same document state. Despite being the foundation of nearly every production collaborative editor since 1995, OT has a troubled academic history—most published algorithms were later proven incorrect. This article covers why OT is hard, which approaches actually work, and how production systems sidestep the theoretical pitfalls.

[Diagram: Client A issues local Op Ins(0,'x') and Client B issues local Op Del(2); the server imposes a canonical order, transforms each operation against the other (producing Transformed Del(3) and Transformed Ins(0,'x')), and broadcasts the results.]

OT core flow: concurrent operations are transformed against each other to preserve intent while achieving convergence.

OT solves the problem of concurrent edits in collaborative systems through a deceptively simple idea: when operations conflict, transform them so they can both apply correctly. The core mental model:

  • Operations (insert, delete) carry position information that becomes stale when concurrent edits occur
  • Transformation functions adjust an operation’s position based on another operation that happened concurrently
  • TP1 (convergence): applying O1 then T(O2,O1) must equal applying O2 then T(O1,O2)—both paths reach the same state
  • TP2 (commutativity): transforming O3 against different orderings of O1,O2 must yield the same result—this property is notoriously hard to satisfy and most published algorithms fail it
  • Production systems avoid TP2 by using a central server to impose a canonical operation order, making the math tractable at the cost of requiring server round-trips

The key insight: practical OT is not distributed. Google Docs, Google Wave, and nearly all production systems use client-server architecture where the server decides operation order, eliminating the need for TP2 satisfaction.
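The transform-and-converge loop from the diagram can be shown end to end. A minimal sketch with simplified single-character operations (tie-breaking and the delete-delete overlap case are omitted here; they are covered later):

```typescript
// Character-wise operations on a plain string document
type Op = { kind: "ins"; pos: number; ch: string } | { kind: "del"; pos: number }

function apply(doc: string, op: Op): string {
  return op.kind === "ins"
    ? doc.slice(0, op.pos) + op.ch + doc.slice(op.pos)
    : doc.slice(0, op.pos) + doc.slice(op.pos + 1)
}

// Transform `a` so it can run after `b`
function transform(a: Op, b: Op): Op {
  if (b.kind === "ins") {
    // b inserted at or before a's position: shift a right
    return a.pos >= b.pos ? { ...a, pos: a.pos + 1 } : a
  } else {
    // b deleted before a's position: shift a left
    return a.pos > b.pos ? { ...a, pos: a.pos - 1 } : a
  }
}

// TP1 check for the diagram's scenario: Ins(0,'x') concurrent with Del(2)
const doc = "abc"
const o1: Op = { kind: "ins", pos: 0, ch: "x" }
const o2: Op = { kind: "del", pos: 2 }

const path1 = apply(apply(doc, o1), transform(o2, o1)) // o1, then transformed o2
const path2 = apply(apply(doc, o2), transform(o1, o2)) // o2, then transformed o1
// Both paths yield "xab": the intended character 'c' is deleted on each
```

Note how `transform(o2, o1)` shifts the delete to position 3, matching the "Transformed Del(3)" in the diagram.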

Approach 1: Last-write-wins

Two users edit “hello” concurrently:

  • User A inserts “x” at position 0 → “xhello”
  • User B deletes character at position 4 → “hell”

With last-write-wins, one edit is lost. Collaborative editors cannot discard user intent.

Approach 2: Locking

Lock the document (or region) during edits. Users see “Document locked by User A” and must wait.

Fails because: latency makes locking unusable. With 100ms round-trip, a user typing 5 characters/second would spend most of their time waiting for locks. Google Docs reports 40+ concurrent editors on popular documents—locking would serialize all edits.

Approach 3: Apply operations as-received

User A and B both start with “abc”:

  • A: Insert ‘x’ at position 0 → sends Ins(0,‘x’)
  • B: Delete at position 2 → sends Del(2)

If B applies its own Del(2) first, B’s document becomes “ab”; applying A’s Ins(0,‘x’) then yields “xab”. Meanwhile A applies its own insert (“xabc”) and then B’s Del(2), which now removes ‘b’ instead of the intended ‘c’, yielding “xac”. Result: “xab” for B, “xac” for A. Divergence.

The fundamental tension: preserving user intent while achieving convergence across all replicas.

Each operation encodes intent relative to a specific document state. When operations are concurrent (neither causally depends on the other), their positions become meaningless without transformation. OT exists to adjust operations so they execute their intended effect regardless of the order they’re applied.

OT works by defining transformation functions that take two concurrent operations and produce adjusted versions:

T(O1, O2) → O1' // O1 transformed against O2
T(O2, O1) → O2' // O2 transformed against O1

The transformed operations are designed so that:

  • Starting from state S
  • Applying O1 then O2’ yields the same state as
  • Applying O2 then O1’

For string operations, transformation typically adjusts positions:

// Transform insert against insert
// (each op carries a siteId so ties can be broken deterministically)
function transformInsertInsert(
  op1: { pos: number; char: string; siteId: number },
  op2: { pos: number; char: string; siteId: number },
): { pos: number; char: string; siteId: number } {
  if (op1.pos < op2.pos) {
    return op1 // No adjustment needed
  } else if (op1.pos > op2.pos) {
    return { ...op1, pos: op1.pos + 1 } // Shift right past op2's insert
  } else {
    // Both insert at the same position: break the tie by site ID so
    // every site agrees on which insert comes first
    return op1.siteId < op2.siteId ? op1 : { ...op1, pos: op1.pos + 1 }
  }
}

// Transform delete against insert
function transformDeleteInsert(
  del: { pos: number },
  ins: { pos: number; char: string },
): { pos: number } {
  if (del.pos < ins.pos) {
    return del // Deletion target unaffected
  } else {
    return { pos: del.pos + 1 } // Insert shifted the target right
  }
}
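The cases above cover insert-insert and delete-insert. For completeness, here is a sketch of the remaining pair; the delete-delete tie convention (dropping the second delete) is one common choice, not something mandated by the article:

```typescript
// Transform insert against delete: a deletion before the insertion
// point shifts the insert left
function transformInsertDelete(
  ins: { pos: number; char: string },
  del: { pos: number },
): { pos: number; char: string } {
  return ins.pos > del.pos ? { ...ins, pos: ins.pos - 1 } : ins
}

// Transform delete against delete: shift left if the other delete was
// earlier; if both target the same character, drop this delete so the
// character is not removed twice
function transformDeleteDelete(
  del1: { pos: number },
  del2: { pos: number },
): { pos: number } | null {
  if (del1.pos > del2.pos) return { pos: del1.pos - 1 }
  if (del1.pos < del2.pos) return del1
  return null // Same target: already deleted, becomes a no-op
}
```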
A correct OT system must preserve three properties:

  1. Convergence (TP1): All sites reach the same document state after applying all operations, regardless of arrival order
  2. Causality preservation: Operations are applied respecting their causal dependencies (Lamport’s happened-before)
  3. Intention preservation: The effect of an operation matches what the user intended, even after transformation
| Failure | Impact | Mitigation |
| --- | --- | --- |
| TP1 violation | Documents diverge permanently | Formal verification of transformation functions |
| TP2 violation | Divergence with 3+ concurrent operations | Use server-ordered architecture (avoid TP2 requirement) |
| Causality violation | Operations apply out of order | Vector clocks or server-assigned sequence numbers |
| Transformation bugs | Silent corruption, divergence | Checksums, periodic reconciliation |

Variant 1: Client-server OT (Jupiter model)

When to choose this path:

  • Network connectivity is reliable
  • Latency tolerance allows server round-trips
  • Simplicity is prioritized over offline capability
  • Team wants proven, production-tested approach

How it works:

The server maintains a single canonical operation history. Clients send operations to the server, which:

  1. Transforms incoming operations against any operations the client hasn’t seen
  2. Assigns a sequence number
  3. Broadcasts the transformed operation to all clients
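The three server steps can be sketched as a small loop. This is an illustrative model with an assumed minimal op and transform definition, not any production server’s code:

```typescript
// Minimal op/transform model for illustration
type Operation = { kind: "ins"; pos: number; ch: string } | { kind: "del"; pos: number }

function transform(a: Operation, b: Operation): Operation {
  if (b.kind === "ins") return a.pos >= b.pos ? { ...a, pos: a.pos + 1 } : a
  return a.pos > b.pos ? { ...a, pos: a.pos - 1 } : a
}

// Canonical history; index i holds the op that produced revision i+1
const history: Operation[] = []
const broadcast = (msg: unknown): void => { /* deliver to all connected clients (stub) */ }

// A client sends (op, baseRevision): the op was created against
// revision `baseRevision` of the document
function receive(op: Operation, baseRevision: number): { op: Operation; revision: number } {
  // 1. Transform against every op the client had not yet seen
  let transformed = op
  for (const concurrent of history.slice(baseRevision)) {
    transformed = transform(transformed, concurrent)
  }
  // 2. Append to the canonical history; its length is the sequence number
  history.push(transformed)
  const assigned = { op: transformed, revision: history.length }
  // 3. Broadcast the transformed op (the sender treats it as the ACK)
  broadcast(assigned)
  return assigned
}
```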

Clients maintain three states:

  • Server state: Last acknowledged server revision
  • Pending operations: Sent to server, awaiting acknowledgment
  • Local operations: Not yet sent (buffered during server round-trip)
[Sequence: Client sends Op1 (based on rev 5); the server transforms Op1 against revs 6–7, sends an Ack (rev 8 = transformed Op1), and broadcasts Op1' (rev 8) to the other clients; the client, which buffered local ops and shifted pending ops during the round-trip, then sends Op2 (based on rev 8).]

Client-server OT: server assigns canonical ordering, clients transform against operations they haven't seen.

Critical design choice: Clients must wait for acknowledgment before sending the next operation batch. Google Wave enforced this strictly—during the wait, clients buffer local operations and send them in bulk after ACK.

Why this avoids TP2:

TP2 is required when operations can be transformed along different paths (different orderings of concurrent operations). With a central server:

  • The server decides the canonical order
  • All transformations follow a single path
  • TP2 never comes into play

Trade-offs vs other paths:

| Aspect | Client-Server | Peer-to-Peer |
| --- | --- | --- |
| TP2 requirement | Not needed | Required |
| Correctness proofs | Straightforward | Notoriously difficult |
| Server dependency | Single point of failure | No central dependency |
| Latency | Round-trip per operation batch | Direct client-to-client |
| Offline support | Limited (buffer operations) | Native |
| Implementation complexity | Moderate | Very high |

Real-world example:

Google Docs uses client-server OT derived from the Jupiter/Wave codebase. According to Google’s engineering blog, every character change is saved as an event in a revision log. The document is rendered by replaying the log from the start.

Joseph Gentle (ShareJS author, ex-Google Wave): “Whatever the bugs are, I don’t think they were ever fixed in the opensource [Wave] version. It’s all just too complicated… You’re right about OT—it gets crazy complicated if you implement it in a distributed fashion. But implementing it in a centralized fashion is actually not so bad.”

Variant 2: Peer-to-peer (distributed) OT

When to choose this path:

  • Offline-first requirement is critical
  • No server infrastructure available
  • Academic research context
  • Willing to accept significant complexity

How it works:

Each site maintains its own operation history and transforms incoming operations against local history. Without a central server, sites can receive operations in different orders, requiring TP2 satisfaction.

The adOPTed algorithm (Ressel et al. 1996) introduced an N-dimensional interaction graph to track all valid transformation paths. Each operation is placed in this graph based on state vectors, and transformations follow edges to find equivalent operations.

The TP2 problem:

For three concurrent operations O1, O2, O3, TP2 requires:

T(O3, O1 ∘ T(O2, O1)) ≡ T(O3, O2 ∘ T(O1, O2))

Meaning: transforming O3 past O1-then-O2’ must equal transforming O3 past O2-then-O1’.

Why it fails in practice:

Randolph et al. (2013) proved formally: “There is no IT function, based on classical parameters of delete and insert operations, which satisfies both TP1 and TP2.”

Using controller synthesis techniques, they showed that position and character parameters are necessary but not sufficient. Every published TP2-claiming algorithm was subsequently shown to have counterexamples:

| Algorithm | Year | TP2 Claim | Status |
| --- | --- | --- | --- |
| dOPT | 1989 | Implied | Disproven |
| adOPTed | 1996 | Asserted | Disproven |
| SOCT2 | 1998 | “Proven” | Disproven |
| SDT | 2002 | Claimed | Disproven |
| IMOR | 2003 | Machine-verified | Invalid assumptions |

Raph Levien: “For a decade or so, TP2 was something of a philosopher’s stone, with several alchemists of OT claiming that they had found a set of transforms satisfying TP2, only for counterexamples to emerge later.”

Trade-offs:

  Pros:
  • No server dependency
  • Offline operation is native
  • Lower latency (direct peer communication)

  Cons:
  • No production-proven correct implementation exists
  • Extreme implementation complexity
  • Debugging distributed state divergence is brutal

Real-world example:

Google Wave attempted federated (peer-to-peer between servers) OT. From Joseph Gentle: “We got it working, kinda, but it was complex and buggy. We ended up with a scheme where every wave would arrange a tree of wave servers… But it never really worked.”

Wave was eventually shut down; its federation protocol remains unproven in production.

Variant 3: Tombstone Transformation Functions (TTF)

When to choose this path:

  • Need provably correct transformation functions
  • Can tolerate memory overhead
  • Building a new implementation from scratch

How it works:

TTF (Oster et al. 2006) sidesteps the TP2 impossibility by changing the data model. Instead of a string, the document is a sequence of (character, visible) pairs. Deletes set visible=false but retain the character as a tombstone.

interface TombstoneChar {
  char: string
  visible: boolean
  id: UniqueId // For ordering
}

type Document = TombstoneChar[]

// Delete the pos-th *visible* character by hiding it in place;
// the tombstone keeps its slot, so other operations' positions stay stable
function deleteAt(doc: Document, pos: number): Document {
  let visibleCount = 0
  for (let i = 0; i < doc.length; i++) {
    if (doc[i].visible) {
      if (visibleCount === pos) {
        doc[i].visible = false
        return doc
      }
      visibleCount++
    }
  }
  return doc
}
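A companion insert routine illustrates how visible positions map onto the tombstone array. A sketch; the types mirror the ones above, with `UniqueId` assumed to be a string here:

```typescript
type UniqueId = string

interface TombstoneChar {
  char: string
  visible: boolean
  id: UniqueId
}
type Document = TombstoneChar[]

// Insert `char` at visible position `pos`: walk the array, counting
// only visible characters, and splice in before the pos-th visible one
function insertAt(doc: Document, pos: number, char: string, id: UniqueId): Document {
  let visibleCount = 0
  for (let i = 0; i < doc.length; i++) {
    if (doc[i].visible) {
      if (visibleCount === pos) {
        return [...doc.slice(0, i), { char, visible: true, id }, ...doc.slice(i)]
      }
      visibleCount++
    }
  }
  // pos is at (or past) the end of the visible text: append
  return [...doc, { char, visible: true, id }]
}
```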

Why this satisfies TP2:

With tombstones, positions are stable. A deleted character still occupies its logical position, so concurrent operations don’t shift each other’s targets unpredictably. The paper includes a theorem prover verification that TTF satisfies both TP1 and TP2.

Trade-offs:

  Pros:
  • Provably correct (machine-verified)
  • Simpler transformation logic

  Cons:
  • Unbounded memory growth (tombstones accumulate)
  • Need garbage collection strategy
  • Operations may reference expired tombstones

Garbage collection challenge:

Tombstones can be removed only when all sites have seen all operations that reference them. This requires distributed garbage collection—itself a hard problem. Production systems typically use periodic snapshots with tombstone truncation, accepting that very late operations may fail.
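A minimal version of the “all sites have seen it” rule, assuming each site periodically reports the highest revision it has applied (the field names here are illustrative):

```typescript
// Hypothetical bookkeeping: the revision at which each tombstone was
// created, and the latest revision each site has acknowledged
interface Tombstone {
  id: string
  deletedAtRevision: number
}

function prunable(tombstones: Tombstone[], ackedRevisions: Map<string, number>): Tombstone[] {
  // The fleet-wide floor: no site is behind this revision
  const floor = Math.min(...ackedRevisions.values())
  // Safe to drop: no in-flight operation can still reference these
  return tombstones.filter((t) => t.deletedAtRevision <= floor)
}
```

A site that goes offline indefinitely holds the floor down forever, which is why production systems fall back to snapshots with tombstone truncation.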

Variant 4: Context-based OT (COT)

When to choose this path:

  • Need undo/redo support
  • Want to avoid TP2 at the function level
  • Building a distributed system with known operation context

How it works:

COT (Sun & Sun 2009) replaces history buffers with context vectors that capture the exact document state where an operation was created. Instead of transforming operations pairwise, COT transforms based on context differences.

The key insight: TP2 is required because operations can be transformed along different paths. If you track the exact context (which operations have been applied), you can always reconstruct the correct transformation path.

Advantages:

  • Supports undo/redo of any operation at any time
  • Does NOT require TP2 at the transformation function level
  • Transformation functions only need TP1
  • Proven correct for specific operation sets

Real-world implementations:

COT powers several academic and commercial systems: CoMaya (3D modeling), CoWord (Microsoft Word), CodoxWord, and IBM OpenCoWeb.

Trade-offs:

  Pros:
  • Avoids TP2 requirement through context tracking
  • Native undo/redo support
  • Formally verified for specific operations

  Cons:
  • Context vectors grow with operation count
  • More complex than the client-server approach
  • Less production exposure than Jupiter-derived systems

[Decision tree condensed: branch on “Offline-first required?”, “Memory constraints?”, and “Need provable correctness?”; the outcomes are Client-Server OT (the Jupiter/Wave model, waiting for ACK between batches), TTF or COT, Client-Server + sync queue, or “consider CRDTs instead”.]

Decision tree for OT variant selection. Client-server dominates production use.

Case study: Google Docs

Context: World’s most widely used collaborative editor, 40+ concurrent editors on popular documents.

Implementation choices:

  • Architecture: Client-server (Jupiter-derived)
  • Transport: WebSocket with operation batching
  • Persistence: Event-sourced revision log

Architecture:

Each document stores a revision log of operations. The visible document is the result of replaying all operations from the initial state. This enables:

  • Revision history (“See previous versions”)
  • Conflict-free persistence (append-only log)
  • Recovery from any point in time

Specific details:

  • Operations use a streaming format: Retain(n), Insert(chars), Delete(n)
  • Attributes (bold, font) are separate operations on ranges
  • Checkpoints created periodically to avoid replaying full history
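Applying an operation in this streaming format is a single pass over the document. A sketch (the component shapes are illustrative, not Google's actual wire format):

```typescript
type Component =
  | { retain: number } // Keep the next n characters
  | { insert: string } // Insert characters at the cursor
  | { delete: number } // Drop the next n characters

function applyOp(doc: string, op: Component[]): string {
  let cursor = 0
  let out = ""
  for (const c of op) {
    if ("retain" in c) {
      out += doc.slice(cursor, cursor + c.retain)
      cursor += c.retain
    } else if ("insert" in c) {
      out += c.insert
    } else {
      cursor += c.delete // Skip over deleted characters
    }
  }
  return out + doc.slice(cursor) // Implicit trailing retain
}

// applyOp("hello world", [{ retain: 6 }, { delete: 5 }, { insert: "docs" }])
// keeps "hello ", drops "world", and appends "docs" -> "hello docs"
```

The same Retain/Insert/Delete shape is what makes operations easy to compose and to transform position-by-position.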

What worked:

  • Simplicity of client-server model enabled rapid iteration
  • Operation batching amortizes round-trip cost
  • Event sourcing simplified persistence and history

What was hard:

  • Rich text requires tree-structured operations (paragraphs, lists, tables)
  • Cursor/selection state must be transformed alongside content
  • Presence indicators (who’s editing where) add additional sync complexity

Source: Google Drive Blog (2010), Google Wave Whitepapers

Case study: Apache Wave

Context: Open-sourced Google Wave codebase, reference implementation for the Wave protocol.

Implementation choices:

  • Architecture: Client-server with federation attempt
  • Protocol: Wave Federation Protocol (XMPP-based)
  • Operation types: Document mutations, wavelet operations

Architecture:

Wave operations are hierarchical:

  • Wavelet: Container for participants and documents
  • Document: Rich text with elements and annotations
  • Operations: Retain, InsertCharacters, DeleteCharacters, InsertElement, etc.

Specific details from whitepaper:

  • Every character, start tag, or end tag is an “item”
  • Gaps between items are “positions”
  • Operations compose: B∙A satisfies (B∙A)(d) = B(A(d))
  • Clients must wait for ACK before sending next batch

What worked:

  • Document model handles rich text well
  • Composition of operations simplifies some transformations
  • Open protocol enabled third-party clients

What was hard:

  • Federation never achieved production stability
  • Complexity overwhelmed both users and developers
  • Performance issues with large documents

Source: Apache Wave Whitepapers, Joseph Gentle retrospectives

Case study: CKEditor 5

Context: Commercial rich-text editor with real-time collaboration, 4+ years of R&D.

Implementation choices:

  • Architecture: Client-server
  • Key innovation: OT for tree-structured documents
  • Transport: WebSocket

The tree challenge:

Rich text isn’t a string—it’s a tree (paragraphs containing spans containing text). Operations like “split paragraph” or “merge cells” don’t map to simple insert/delete. CKEditor claims to be the first production system to solve OT for trees.

Specific details:

  • Operations can “break” during transformation (one op becomes multiple)
  • Selective undo: users only undo their own changes
  • Custom transformation functions for 50+ operation types

What worked:

  • Tree model handles complex formatting (tables, lists, images)
  • Selective undo improves UX in multi-user scenarios
  • Years of investment produced stable implementation

What was hard:

  • Operation breaking adds significant complexity
  • Testing all transformation combinations is combinatorial
  • Performance optimization for large documents

Source: CKEditor Blog (2020), “Lessons Learned from Creating a Rich-Text Editor with Real-Time Collaboration”

| Aspect | Google Docs | Apache Wave | CKEditor 5 |
| --- | --- | --- | --- |
| Architecture | Client-server | Client-server + federation | Client-server |
| Document model | Streaming ops | Hierarchical wavelets | Tree-structured |
| Offline | Limited buffer | Limited buffer | Planned sync queue |
| Rich text | Yes (ops on ranges) | Yes (elements + annotations) | Yes (native tree ops) |
| Undo model | Global history | Global history | Selective (user’s ops only) |
| Scale | Proven at massive scale | Abandoned | Production-proven |

[Decision tree condensed: branch on “Team experience with OT?”, “Document complexity?”, and “Document type?”; the outcomes are an existing library (ShareDB/ot.js for JSON or plain text, CKEditor 5 or Quill for rich text, Monaco + custom sync for code) or “consider CRDTs instead” for custom document structures.]

Implementation decision tree. Building OT from scratch is rarely justified.
| Library | Language | Architecture | Maturity | Best For |
| --- | --- | --- | --- | --- |
| ShareDB | Node.js | Client-server | Production | JSON documents |
| ot.js | JavaScript | Client-server | Production | Plain text |
| Yjs | JavaScript | P2P (CRDT) | Production | Offline-first |
| CKEditor 5 | JavaScript | Client-server | Production | Rich text |
| Quill + y-quill | JavaScript | P2P (CRDT) | Production | Rich text + offline |

When to build custom:

  • Document model doesn’t fit existing libraries
  • Need specific consistency guarantees
  • Performance requirements exceed library capabilities
  • Regulatory requirements prevent third-party code

Implementation checklist:

  • Choose architecture: client-server strongly recommended
  • Define operation types (keep minimal)
  • Implement TP1-satisfying transformation functions
  • Add server sequencing to avoid TP2
  • Implement client state machine (pending, sent, acknowledged)
  • Add conflict detection with checksums
  • Build reconciliation for detected divergence
  • Test with artificial latency and packet loss
  • Fuzz test transformation functions exhaustively
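The checksum items in the checklist can be as simple as hashing the document at each revision and comparing client and server results. A sketch using FNV-1a (an arbitrary choice; any deterministic hash shared by both sides works):

```typescript
// FNV-1a: a cheap, deterministic string hash -- client and server must
// use the same function so checksums are comparable
function checksum(doc: string): number {
  let h = 0x811c9dc5
  for (let i = 0; i < doc.length; i++) {
    h ^= doc.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h
}

// The client attaches checksum(localDoc) when acknowledging a revision;
// the server compares it against its own state at that revision
function verifyRevision(serverDoc: string, clientChecksum: number): boolean {
  return checksum(serverDoc) === clientChecksum
}

// On mismatch: stop applying ops, fetch a fresh snapshot, and rebase
// any unacknowledged local operations onto it (reconciliation)
```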

Joseph Gentle’s estimate: “Wave took 2 years to write and if we rewrote it today, it would take almost as long.”

Pitfall 1: Building peer-to-peer OT

The mistake: Implementing distributed OT because it seems “cleaner” than server dependency.

Why it fails: Every published P2P OT algorithm has been proven incorrect. The TP2 property that P2P requires is likely impossible to satisfy with standard string operations.

Example: A startup builds P2P OT for “serverless collaboration.” After 6 months, users report documents diverging. Debugging reveals the transformation functions fail with 3+ concurrent editors—a case the test suite never covered.

Solution: Use client-server architecture. The server round-trip cost is far lower than the correctness risk of P2P.

Pitfall 2: Ignoring operation composition

The mistake: Treating operations as independent units instead of composable.

Why it matters: Client sends Op1, then Op2 before receiving Op1’s ACK. Server receives both, but Op2 was created assuming Op1 applied. If server processes them as independent, transformation fails.

Solution: The Wave protocol requires clients to wait for an ACK before sending the next operation; alternatively, clients buffer local operations while one is in flight and, when server operations arrive, transform the pending and buffered operations against them before resending.

// Client state machine
interface ClientState {
  serverRevision: number
  pending: Operation | null // Sent, awaiting ACK
  buffer: Operation[] // Created locally, not yet sent
}

function onLocalEdit(state: ClientState, op: Operation): ClientState {
  if (state.pending) {
    // A round-trip is in flight: buffer the op. It was created against
    // the current local state, so no transformation is needed here --
    // transformation happens when *remote* operations arrive
    return { ...state, buffer: [...state.buffer, op] }
  } else {
    // Nothing in flight: send immediately
    send(op)
    return { ...state, pending: op }
  }
}

Pitfall 3: Unbounded operation history

The mistake: Storing all operations forever for a “history” feature.

Why it fails: Document with 1 year of active editing accumulates millions of operations. Loading requires replaying all of them.

Example: Notion-like app stores every keystroke. After 6 months, loading a busy document takes 30 seconds as the client replays 500K operations.

Solutions:

  • Periodic snapshots: Store document state at intervals, only replay recent operations
  • Checkpoint compaction: Combine operations into single state updates
  • Lazy loading: Load recent ops, fetch history on demand
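The first two solutions combine naturally: snapshot every N revisions and load by replaying only the tail. A sketch with assumed persistence callbacks:

```typescript
interface Snapshot {
  revision: number
  doc: string
}

const SNAPSHOT_EVERY = 1000 // Revisions between snapshots

// Persist a snapshot whenever the revision count crosses the interval
function maybeSnapshot(revision: number, doc: string, save: (s: Snapshot) => void): void {
  if (revision % SNAPSHOT_EVERY === 0) save({ revision, doc })
}

// Load = latest snapshot + replay of the short tail, instead of
// replaying the full history from revision 0
function load(snapshot: Snapshot, tail: string[], applyOp: (doc: string, op: string) => string): string {
  return tail.reduce(applyOp, snapshot.doc)
}
```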

Pitfall 4: Testing only the happy path

The mistake: Testing with low latency, no packet loss, and 2 concurrent users.

Why it fails: OT bugs emerge with high concurrency, reordered packets, and edge-case timing. The “dOPT puzzle” only manifests with specific 3-user concurrent patterns.

Solution:

  • Fuzz testing: Random operation sequences with random timing
  • Chaos engineering: Inject latency, reordering, duplication
  • Invariant checking: Assert convergence after every operation sequence
  • Property-based testing: Generate operation sequences, verify TP1 holds
// Property-based test for TP1 (using the fast-check library)
import fc from "fast-check"

test("TP1: transformation paths converge", () => {
  fc.assert(
    fc.property(arbitraryOperation(), arbitraryOperation(), arbitraryDocument(), (op1, op2, doc) => {
      const path1 = apply(apply(doc, op1), transform(op2, op1))
      const path2 = apply(apply(doc, op2), transform(op1, op2))
      expect(path1).toEqual(path2)
    }),
  )
})

Pitfall 5: Asymmetric tie-breaking

The mistake: Assuming T(op1, op2) and T(op2, op1) are computed the same way.

Why it fails: Transformation depends on which operation is being adjusted. Insert-vs-insert with same position needs tie-breaking; the “winner” doesn’t transform, the “loser” shifts.

Example: Two users insert at position 5. Without consistent tie-breaking, one client computes T(A,B)=A (A wins) while the other computes T(A,B)=shift(A) (B wins). Divergence.

Solution: Always use consistent tie-breaking: site IDs, timestamps, or operation IDs. The rule must be deterministic and known to all clients.

OT solves collaborative editing through operation transformation, but its apparent simplicity hides decades of failed correctness proofs. The practical takeaway: use client-server architecture. It sidesteps TP2, enables straightforward implementation, and is the only approach proven at scale.

For new projects, prefer existing libraries (ShareDB, CKEditor, or CRDT-based alternatives like Yjs) over custom implementations. If building custom, invest heavily in property-based testing and invariant checking—the bugs that matter only emerge with specific concurrent operation sequences that manual testing will miss.

OT remains the foundation of Google Docs and most production collaborative editors, but its dominance is being challenged by CRDTs for offline-first applications. The choice depends on your consistency requirements, offline needs, and tolerance for implementation complexity.

Prerequisites:

  • Understanding of distributed systems consistency models
  • Familiarity with event sourcing concepts
  • Basic knowledge of Lamport timestamps and causal ordering
  • For CRDTs comparison: see Martin Kleppmann’s “CRDTs: The Hard Parts”
Glossary:

  • OT (Operational Transformation): Algorithm that transforms concurrent operations to achieve convergence
  • TP1 (Transformation Property 1): Convergence property—different transformation paths yield same result
  • TP2 (Transformation Property 2): Commutativity property—transformation is independent of operation ordering
  • Convergence: All replicas reach identical state after applying all operations
  • Intent preservation: Transformed operation achieves user’s original goal
  • Tombstone: Marker for deleted content, retained for transformation stability
  • COT (Context-based OT): Variant tracking operation context for transformation
  • TTF (Tombstone Transformation Functions): Provably correct transformation using tombstones
Key takeaways:

  • OT transforms concurrent operations by adjusting positions based on what other operations have done
  • TP1 is mandatory: transformation paths must converge to the same state
  • TP2 is impractical: no standard string operations satisfy it; use client-server to avoid requiring it
  • Production systems (Google Docs, CKEditor) all use client-server architecture for this reason
  • Libraries (ShareDB, ot.js) are strongly preferred over custom implementations
  • Testing must include fuzz testing and property-based tests—manual testing misses the bugs that matter
