Event Sourcing

Event sourcing stores application state as an append-only log of domain events; current state is a derived view, not the source of truth. This article is for senior engineers deciding whether to adopt the pattern, or operating a system that already uses it. By the end you should know when event sourcing is the right answer, how to choose between inline and async projections, how to evolve event schemas without rewriting history, when to add snapshots, why Apache Kafka is a streaming bus and not an event store, and how teams like LMAX, Netflix, and Uber actually run it in production.

Mental model

Three loops:

Write path. A command hits a handler, the handler loads the aggregate by folding its event stream, runs the decision, and appends new events to the stream with an expected version (optimistic concurrency).
Read path. Subscribers consume the event log and project it into one or more read-optimised models. There is no general “query the event store”; every read is served by a projection.
Recovery. State is rebuilt by replaying events. Snapshots are an optimisation for that replay, never a source of truth.

The core invariant is a deterministic left fold:

    currentState = events.reduce(applyEvent, initialState)

So long as applyEvent is pure and events are totally ordered within each stream, replay is reproducible.
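As a concrete instance of the fold, here is a minimal sketch for a hypothetical bank-account aggregate. The event names, `AccountState`, and `applyAccountEvent` are assumptions made up for illustration, not part of any particular store or library.

```typescript
// Hypothetical account events, invented for this example.
type AccountEvent =
  | { type: 'Deposited'; amount: number }
  | { type: 'Withdrawn'; amount: number }

interface AccountState {
  balance: number
  version: number // number of events folded so far
}

const initialAccount: AccountState = { balance: 0, version: 0 }

// Pure: no clocks, no I/O, and the input state is never mutated.
function applyAccountEvent(state: AccountState, event: AccountEvent): AccountState {
  switch (event.type) {
    case 'Deposited':
      return { balance: state.balance + event.amount, version: state.version + 1 }
    case 'Withdrawn':
      return { balance: state.balance - event.amount, version: state.version + 1 }
  }
}

const accountEvents: AccountEvent[] = [
  { type: 'Deposited', amount: 100 },
  { type: 'Withdrawn', amount: 30 },
  { type: 'Deposited', amount: 5 },
]

// Replaying the same events in the same order always yields the same state.
const currentAccount = accountEvents.reduce(applyAccountEvent, initialAccount)
// currentAccount is { balance: 75, version: 3 }
```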

Important

Event sourcing implies CQRS — reads come from projections, never from the event store directly. CQRS does not imply event sourcing — you can split commands and queries while still using a CRUD database for both sides. Martin Fowler distinguishes the two explicitly and warns that combining them adds significant complexity.

Vocabulary you need before reading the rest

Term	Working definition
Event	Immutable fact: something that already happened in the domain. Past tense (`OrderPlaced`, not `PlaceOrder`).
Stream	Ordered sequence of events for one aggregate (one order, one account).
Aggregate	DDD consistency boundary; the unit you load, decide on, and append against.
Projection	Read model produced by folding events from one or more streams.
Snapshot	Materialised state at a stream position; an optimisation, not the source of truth.
Upcasting	Transforming an old event payload to the current schema on read.
Optimistic concurrency	Append succeeds only if the stream’s current version equals the version the handler read.

Why CRUD and bolt-on audit tables fall short

Event sourcing exists because the obvious alternatives quietly fail the very requirement that led someone to consider it.

Approach	What it gives up	Concrete failure
Mutable state in a row (CRUD)	History: every write overwrites the previous value	“What was the balance at 15:00 yesterday?” needs a separate audit pipeline that may or may not exist
Audit tables alongside CRUD	Single source of truth — current state and audit can diverge	A code path forgets to write the audit row; reconciliation jobs become a permanent tax
Database triggers for audit	Domain semantics — the trigger sees `status: pending → cancelled` but not why	Debugging an order cancellation reveals the diff but not the customer’s reason or the actor

Event sourcing reframes this: the event log is the database. Current state and the audit trail are no longer two artefacts that have to agree — they come from the same fold.
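To make the "balance at 15:00 yesterday" question concrete, the sketch below answers it by folding only the events recorded up to the requested instant. `LedgerEntry`, `balanceAt`, and the sample data are hypothetical; a real store would filter by stream position or an indexed timestamp rather than scanning an array.

```typescript
interface LedgerEntry {
  occurredAt: string // ISO-8601 UTC timestamp stored with the event
  delta: number      // positive for deposits, negative for withdrawals
}

// Fold only the events recorded at or before the requested instant.
function balanceAt(entries: LedgerEntry[], asOf: string): number {
  const cutoff = Date.parse(asOf)
  return entries
    .filter((entry) => Date.parse(entry.occurredAt) <= cutoff)
    .reduce((balance, entry) => balance + entry.delta, 0)
}

const ledgerEntries: LedgerEntry[] = [
  { occurredAt: '2026-01-10T09:00:00Z', delta: 200 },
  { occurredAt: '2026-01-10T14:30:00Z', delta: -50 },
  { occurredAt: '2026-01-10T16:45:00Z', delta: -120 },
]

const balanceAtThree = balanceAt(ledgerEntries, '2026-01-10T15:00:00Z') // 150
const balanceAtMidnight = balanceAt(ledgerEntries, '2026-01-11T00:00:00Z') // 30
```

The current balance and the historical one come from the same function and the same data, which is the point of the reframing.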

The pattern in code

Events carry domain meaning plus metadata for ordering and tracing:

    interface DomainEvent<TPayload = unknown> {
      eventId: string          // unique, used for idempotency
      streamId: string         // aggregate id
      eventType: string        // e.g. "OrderPlaced"
      data: TPayload           // domain payload
      metadata: {
        timestamp: Date
        version: number        // position within the stream
        causationId?: string   // event/command that caused this one
        correlationId?: string // saga / request id
      }
    }

A command handler loads the aggregate, decides, and appends with optimistic concurrency:

    async function handle<TState, TEvent>(
      command: Command,
      deps: {
        eventStore: EventStore
        evolve: (state: TState, event: TEvent) => TState
        decide: (state: TState, command: Command) => TEvent[]
        initialState: TState
      },
    ): Promise<void> {
      const events = await deps.eventStore.readStream<TEvent>(command.aggregateId)
      const state = events.reduce(deps.evolve, deps.initialState)
      const newEvents = deps.decide(state, command)
      await deps.eventStore.appendEvents(
        command.aggregateId,
        /* expectedVersion */ events.length,
        newEvents,
      )
    }

If another writer wins the race, the store rejects the append on the version check and the handler retries — decide runs again on the now-newer state.
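The retry loop can be sketched as follows. `InMemoryStreamStore`, `tryAppend`, and `appendWithRetry` are hypothetical names; the store is an in-memory stand-in that reports a version conflict by returning `false`, whereas a real event store performs the check atomically and usually signals conflicts with an error.

```typescript
// Hypothetical in-memory store; the version check here is not atomic across processes.
class InMemoryStreamStore<E> {
  private streams = new Map<string, E[]>()

  read(streamId: string): E[] {
    return [...(this.streams.get(streamId) ?? [])]
  }

  // Appends only if the stream is still at the version the caller read.
  tryAppend(streamId: string, expectedVersion: number, newEvents: E[]): boolean {
    const current = this.streams.get(streamId) ?? []
    if (current.length !== expectedVersion) return false
    this.streams.set(streamId, [...current, ...newEvents])
    return true
  }
}

// Reload, decide again on the newer state, and re-append until the write wins.
function appendWithRetry<E>(
  store: InMemoryStreamStore<E>,
  streamId: string,
  decide: (events: E[]) => E[],
  maxAttempts = 3,
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const events = store.read(streamId)
    const newEvents = decide(events)
    if (store.tryAppend(streamId, events.length, newEvents)) return attempt
  }
  throw new Error(`gave up after ${maxAttempts} attempts`)
}
```

Bounding the attempts matters: a stream that conflicts on every try is usually a sign the aggregate is too large, not that the handler should spin.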

Invariants that the rest of the system relies on

Append-only. No update, no delete on the event log; everything else assumes this.
Deterministic replay. Same events, same order, same final state — no clocks, no random IDs, no implicit time-of-day reads inside evolve.
Total order within a stream. Across streams, only causal order matters; pretending you have a global clock is a footgun.
Idempotent projections. Subscribers will see duplicates and replays — handlers must tolerate them.
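One way to keep replay deterministic is to capture non-deterministic inputs when the decision is made and store them in the event. The sketch below assumes a hypothetical `SessionStarted` event; the function names are illustrative only.

```typescript
interface SessionStarted {
  type: 'SessionStarted'
  startedAt: string // captured once, at decision time, and stored in the event
}

interface SessionState {
  startedAt: string | null
}

// decide runs once per command, so it may read the clock (injected here for testing).
function decideStartSession(now: () => Date): SessionStarted {
  return { type: 'SessionStarted', startedAt: now().toISOString() }
}

// evolve runs on every replay, so it only reads what the event recorded.
function evolveSession(_state: SessionState, event: SessionStarted): SessionState {
  return { startedAt: event.startedAt }
}
```

The same applies to random ids: generate them in the decision and carry them in the event payload.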

Failure modes worth designing for

Failure	Impact	Mitigation
Event store unavailable	Writes block; reads from existing projections may continue	Multi-AZ replication, read replicas, write retries
Projection lag	Stale read models	Lag SLOs, alert on staleness, circuit-break stale-sensitive features
Schema mismatch on replay	Projection crashes or produces nonsense	Schema registry, upcasters, versioned projections
Unbounded event growth	Replay too slow to cold-start	Snapshots, archival to cold storage, time-segmented streams
Concurrent writes to the same stream	Optimistic concurrency violation	Smaller aggregates, retry on conflict

Design paths

The four shapes below trade consistency, throughput, and operational complexity in different ways. They are not mutually exclusive — large systems usually mix them per bounded context.

Path 1: Pure event sourcing

Choose when the audit trail is regulatory (finance, healthcare), temporal queries are first-class, the domain naturally expresses itself as state changes, or you expect to grow new read models from the same events for years.

Shape. Events are the only persisted state. Every read goes through a projection. Aggregates are loaded by replay. There is no shadow CRUD database.

Real-world. LMAX Exchange demonstrated the upper bound of this style: a single-threaded, in-memory Business Logic Processor (BLP) handling around 6 million orders per second on a 3 GHz Nehalem-class server, with the LMAX Disruptor ring buffer (20 M input slots, 4 M output slots) decoupling I/O from the BLP. The BLP recovers by replaying events from the most recent nightly snapshot; replication keeps two BLPs in the primary data centre and one in DR so a restart causes no downtime.

The lesson is not “you’ll hit 6 M ops/s,” but that when state lives in memory and the only durable artefact is the event log, the database stops being the bottleneck. You pay for that with complete event-driven discipline.

Path 2: Hybrid event sourcing (ES + CRUD)

Choose when only some bounded contexts need history (transactions yes, product catalog no), the team has uneven event-driven experience, or you are migrating from CRUD incrementally.

Shape. The event-sourced contexts own their streams and projections. CRUD contexts publish events for integration but their own state is the row in the database, not the event log. Boundaries are explicit.

Real-world. Walmart’s Inventory & Availability system uses this hybrid: inventory is event-sourced (stock movements, reservations, business-rule application), partitioned in Cosmos DB by <product, node> id; the Change Feed streams events into materialised views that power real-time availability queries. The product catalog (descriptions, imagery) stays in conventional storage.

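    class OrderAggregate {
      status: 'new' | 'placed' | 'shipped' = 'new'
      items: OrderItem[] = []

      apply(event: OrderEvent): void {
        switch (event.type) {
          case 'OrderPlaced':  this.status = 'placed';  this.items = event.data.items; break
          case 'OrderShipped': this.status = 'shipped'; break
        }
      }
    }

    class ProductService {
      constructor(private readonly db: Database) {} // Database: placeholder data-access type

      async updatePrice(productId: string, newPrice: number): Promise<void> {
        // The row update and the outbox insert share one transaction, so the
        // integration event is recorded if and only if the price change commits.
        await this.db.transaction(async (tx) => {
          await tx.products.update(productId, { price: newPrice })
          await tx.outbox.enqueue({ type: 'PriceUpdated', productId, newPrice })
        })
      }
    }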

Note the outbox on the CRUD side — even when the row is the source of truth, the integration event still needs the transactional outbox to avoid the dual-write trap (see “Pitfalls” below).

Path 3: Inline (synchronous) projections

Choose when read-after-write must be strong, projections are cheap, throughput requirements are moderate, and you want a single failure domain for the write.

Shape. Append the events and apply the projections in the same database transaction. When the command returns, the read model is already current.

Real-world. Marten (a Postgres-backed event store for .NET) supports exactly this: inline projections “are running as part of the same transaction as the events being captured” via IDocumentSession.SaveChangesAsync(). Append + projection update commit together or roll back together.

    async function handlePlaceOrder(command: PlaceOrderCommand): Promise<void> {
      await db.transaction(async (tx) => {
        const order = await loadFromEvents(tx, command.orderId)
        const events = order.place(command.items)
        await appendEvents(tx, command.orderId, events)
        for (const event of events) {
          await updateOrderListProjection(tx, event)
          await updateInventoryProjection(tx, event)
        }
      })
    }

The price is throughput: every projection blocks the write, every projection’s failure fails the command, and you cannot scale projections independently of writes.

Path 4: Async projections

Choose when write throughput matters more than read freshness, projections are expensive (aggregations, fan-out), or you want to scale reads independently from writes.

Shape. Commands return as soon as events commit. Subscribers tail the log, project asynchronously, and checkpoint their position. Projections must be idempotent because subscribers will replay on restart and the bus may deliver out of order on rebalance.

Real-world. Netflix Downloads (launched November 2016, built in roughly six months) is event-sourced on Cassandra with three components — Event Store, Aggregate Repository, Aggregate Service — and async projections rebuilt on demand for debugging. During development, full re-runs over historical events took days; snapshots and event archival were the path out.

    interface ProjectionState {
      lastProcessedPosition: number
    }

    async function runProjection(
      subscription: EventSubscription,
      project: (state: ProjectionState, event: DomainEvent) => ProjectionState,
      checkpoint: (position: number) => Promise<void>,
    ): Promise<void> {
      // Resume from the last durable checkpoint.
      let state = await loadProjectionState()
      for await (const event of subscription.fromPosition(state.lastProcessedPosition)) {
        state = project(state, event)
        // Checkpoint every 100 positions. After a crash, events since the last
        // checkpoint are delivered again, which is why project must be idempotent.
        if (event.position % 100 === 0) {
          await checkpoint(event.position)
        }
      }
    }

    function projectOrderPlaced(state: OrderListState, event: OrderPlacedEvent): OrderListState {
      if (state.processedEventIds.has(event.eventId)) return state // idempotency
      return {
        ...state,
        orders: [...state.orders, { id: event.data.orderId, status: 'placed' }],
        // Copy the set so the previous state object is never mutated.
        processedEventIds: new Set(state.processedEventIds).add(event.eventId),
      }
    }

The diagram below contrasts the two projection lifecycles end-to-end:

Figure: Inline vs async projection lifecycle, with optimistic concurrency on the write path and idempotent upserts on the async read path. Inline projections commit with the events; async projections checkpoint after the fact and must tolerate duplicates and reordering.

Picking a path

Figure: Decision framework. Audit/temporal need → bounded context shape → consistency requirement → throughput, leading to CRUD, hybrid ES, inline projections, or async projections. Work from audit need down to consistency and throughput, and pick the cheapest combination that still answers the questions you need to answer.

Snapshots

Replaying every event for every command is fine until your hot aggregates carry tens of thousands of events. Snapshots store materialised state at a stream position so replay only has to fold the tail.

Tip

Do not snapshot eagerly. The EventStoreDB / Kurrent guidance and most practitioner posts agree: aggregates with hundreds of events do not need them. Wait until measured cold-start or load times push past your SLO.

When to snapshot

Add snapshots when at least one of these holds:

A hot aggregate routinely carries more events than your replay budget allows (measure before guessing).
Cold-start latency on rebalancing or restart is unacceptable to upstream callers.
A new projection rebuild would otherwise take days.

Trigger strategies

Strategy	Trigger	Strengths	Weaknesses
Event-count	Every N events (e.g. 100)	Predictable replay cost	Snapshots a quiet aggregate unnecessarily
Time-based	Every N hours / days	Operationally simple	Variable event count between snapshots
State-triggered	On natural transitions (`draft → published`)	Snapshots at semantic boundaries	Requires domain knowledge per aggregate
On-demand	First load that exceeds a threshold	Only when needed	First slow load before snapshot exists
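The four strategies in the table can be expressed as one decision function. This is a sketch under assumptions: `SnapshotTrigger`, `SnapshotContext`, and `shouldSnapshot` are hypothetical names, and the context fields are whatever your load path can cheaply measure.

```typescript
type SnapshotTrigger =
  | { kind: 'event-count'; everyN: number }
  | { kind: 'time-based'; everyMs: number }
  | { kind: 'state-triggered'; isBoundary: (lastEventType: string) => boolean }
  | { kind: 'on-demand'; maxReplayMs: number }

interface SnapshotContext {
  eventsSinceSnapshot: number
  msSinceSnapshot: number
  lastEventType: string
  lastReplayMs: number // how long the most recent load took
}

function shouldSnapshot(trigger: SnapshotTrigger, ctx: SnapshotContext): boolean {
  switch (trigger.kind) {
    case 'event-count':
      return ctx.eventsSinceSnapshot >= trigger.everyN
    case 'time-based':
      return ctx.msSinceSnapshot >= trigger.everyMs
    case 'state-triggered':
      return trigger.isBoundary(ctx.lastEventType)
    case 'on-demand':
      return ctx.lastReplayMs > trigger.maxReplayMs
  }
}
```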

Implementation and load path

    interface Snapshot<TState> {
      state: TState
      version: number        // stream version the snapshot was taken at
      schemaVersion: number  // shape of `state` when it was written
    }

    async function loadWithSnapshot<TState, TEvent>(
      streamId: string,
      eventStore: EventStore,
      snapshotStore: SnapshotStore,
      evolve: (state: TState, event: TEvent) => TState,
      initialState: TState,
    ): Promise<{ state: TState; version: number }> {
      const snapshot = await snapshotStore.load<TState>(streamId)
      const startVersion = snapshot?.version ?? 0
      const startState = snapshot?.state ?? initialState
      // Only the tail after the snapshot position is read and folded.
      const events = await eventStore.readStream<TEvent>(streamId, { fromVersion: startVersion + 1 })
      const state = events.reduce(evolve, startState)
      return { state, version: startVersion + events.length }
    }

    async function maybeSnapshot<TState>(
      streamId: string,
      state: TState,
      version: number,
      snapshotStore: SnapshotStore,
      threshold = 100,
    ): Promise<void> {
      // Treat "no snapshot yet" as version 0 so the first snapshot can be taken.
      const lastSnapshot = (await snapshotStore.getVersion(streamId)) ?? 0
      if (version - lastSnapshot > threshold) {
        await snapshotStore.save(streamId, { state, version, schemaVersion: 1 })
      }
    }

Figure: Snapshot-aware aggregate load: snapshot lookup, optional schema check, tail replay, decide, append, and a conditional snapshot save. Snapshots short-circuit the replay; the events still own correctness.

Snapshots and schema evolution

Snapshots are derived state, so when the snapshot shape changes:

Bump the schemaVersion on new snapshots.
On load, compare the stored schemaVersion to the current one.
If outdated, discard the snapshot and rebuild from events.

This is much simpler than migrating snapshots in place. It only works because events are still the source of truth.
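The three steps above can be sketched as a guard in front of the load path. `CURRENT_SNAPSHOT_SCHEMA`, `StoredSnapshot`, `usableSnapshot`, and `loadCounter` are hypothetical; the aggregate is reduced to a running sum so the example stays self-contained.

```typescript
const CURRENT_SNAPSHOT_SCHEMA = 2 // hypothetical: bumped when the state shape changed

interface StoredSnapshot<S> {
  state: S
  version: number       // stream position the snapshot covers
  schemaVersion: number // shape of `state` when it was written
}

// Returns the snapshot only if its shape is current; otherwise signals a full replay.
function usableSnapshot<S>(stored: StoredSnapshot<S> | undefined): StoredSnapshot<S> | undefined {
  if (!stored) return undefined
  return stored.schemaVersion === CURRENT_SNAPSHOT_SCHEMA ? stored : undefined
}

// Toy aggregate: the state is the sum of all event values.
function loadCounter(
  events: number[],
  stored?: StoredSnapshot<number>,
): { state: number; version: number } {
  const snapshot = usableSnapshot(stored)
  const start = snapshot ? snapshot.version : 0
  const tail = events.slice(start) // events after the snapshot position
  return {
    state: tail.reduce((sum, value) => sum + value, snapshot ? snapshot.state : 0),
    version: start + tail.length,
  }
}
```

An outdated snapshot costs one slow load and nothing else, because the next snapshot save overwrites it with the current shape.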

Event schema evolution

Events are stored forever. Schema changes have to be backward-compatible or applied through a transformation step.

Important

Greg Young’s rule, from Versioning in an Event Sourced System: “A new version of an event must be convertible from the old version of the event. If not, it is not a new version of the event but rather a new event.”

Strategy 1 — additive only (default)

Add optional fields. Never remove or rename. Projections must accept both shapes.

    interface OrderPlacedV1 {
      orderId: string
      customerId: string
      items: OrderItem[]
    }

    interface OrderPlacedV2 extends OrderPlacedV1 {
      discountCode?: string
    }

    function projectOrder(event: OrderPlacedV1 | OrderPlacedV2): Order {
      return {
        id: event.orderId,
        customerId: event.customerId,
        items: event.items,
        discountCode: 'discountCode' in event ? event.discountCode : undefined,
      }
    }

Use this until you genuinely cannot.

Strategy 2 — upcasting (transform on read)

Old payloads pass through a chain of upcasters that lift them to the current schema before they reach domain logic. The fold and projections only ever see the current shape.

    type Upcaster<TIn, TOut> = (old: TIn) => TOut

    // Keyed by the schema version the payload was stored with; each entry lifts straight to V3.
    const orderPlacedUpcasters = new Map<number, Upcaster<unknown, OrderPlacedV3>>([
      [1, (old) => ({ ...(old as OrderPlacedV1), discountCode: undefined, source: 'unknown' })],
      [2, (old) => ({ ...(old as OrderPlacedV2), source: 'unknown' })],
    ])

    // Returns the payload in the current (V3) shape; payloads already at V3 pass through.
    function upcast(event: StoredEvent): OrderPlacedV3 {
      const upcaster = orderPlacedUpcasters.get(event.schemaVersion)
      return upcaster ? upcaster(event.data) : (event.data as OrderPlacedV3)
    }

Use when the schema churns and you want every consumer to think in the present tense. Trade-off: a small CPU cost on every read and a chain that grows over time; eventually you will either snapshot to compress it or graduate to Strategy 3.
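A chained variant, where each upcaster only knows the next version, might look like the sketch below. The `PlacedV1` to `PlacedV3` shapes, `upcastSteps`, and `upcastToLatest` are hypothetical and trimmed to the fields needed for the example.

```typescript
interface PlacedV1 { orderId: string; customerId: string }
interface PlacedV2 extends PlacedV1 { discountCode?: string }
interface PlacedV3 extends PlacedV2 { source: string }

interface StoredPayload {
  schemaVersion: number
  data: unknown
}

const LATEST_VERSION = 3

// One step per version bump; each step only knows about its neighbour.
const upcastSteps: { [fromVersion: number]: (old: any) => unknown } = {
  1: (v1: PlacedV1): PlacedV2 => ({ ...v1 }), // V2 only added an optional field
  2: (v2: PlacedV2): PlacedV3 => ({ ...v2, source: 'unknown' }),
}

function upcastToLatest(stored: StoredPayload): PlacedV3 {
  let version = stored.schemaVersion
  let data = stored.data
  while (version < LATEST_VERSION) {
    const step = upcastSteps[version]
    if (!step) throw new Error(`no upcaster from version ${version}`)
    data = step(data)
    version++
  }
  return data as PlacedV3
}
```

Adding V4 then means writing one new step from V3, at the cost of older payloads walking a longer chain on every read.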

Strategy 3 — stream transformation (rewrite history)

Copy the stream into a new stream with transformed events, then point readers at the new stream during a release window. Preserve event ids, timestamps, and order.

Use when the change is genuinely breaking (units, semantics, identity), or when the upcaster chain has become technical debt.

Warning

Never modify events in place — even small edits propagate to every projection that already saw the original. Always rewrite into a new stream and switch over atomically.
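A copy-and-transform pass could be sketched like this. Everything here is an assumption for illustration: the `RecordedEvent` shape, the `ParcelWeighed` event, and the pounds-to-kilograms change stand in for whatever breaking change you face, and the reader switch-over is not shown.

```typescript
interface RecordedEvent {
  eventId: string
  timestamp: string
  type: string
  data: { [key: string]: unknown }
}
type EventData = RecordedEvent['data']

// Builds a new stream; the source array and its events are never mutated.
function transformStream(
  source: ReadonlyArray<RecordedEvent>,
  transform: (type: string, data: EventData) => EventData,
): RecordedEvent[] {
  return source.map((e) => ({
    eventId: e.eventId,     // preserved
    timestamp: e.timestamp, // preserved
    type: e.type,
    data: transform(e.type, e.data),
  }))
}

// Hypothetical breaking change: weights were recorded in pounds, now in kilograms.
function poundsToKilograms(type: string, data: EventData): EventData {
  const pounds = data['weightLb']
  if (type !== 'ParcelWeighed' || typeof pounds !== 'number') return data
  return { weightKg: pounds * 0.45359237 }
}

const shipmentsV1: RecordedEvent[] = [
  { eventId: 'e1', timestamp: '2026-01-01T00:00:00Z', type: 'ParcelWeighed', data: { weightLb: 10 } },
  { eventId: 'e2', timestamp: '2026-01-02T00:00:00Z', type: 'ParcelShipped', data: {} },
]
const shipmentsV2 = transformStream(shipmentsV1, poundsToKilograms)
```

Because `map` preserves order and the ids and timestamps are copied verbatim, consumers that deduplicate by event id see the new stream as the same history in a new shape.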

Schema registry

For anything multi-team, register event schemas (Avro, Protobuf, JSON Schema) and validate on write. The registry gives you:

Compile-time types generated from the schema.
A reject-on-write barrier so producers cannot quietly diverge.
A history of versions per event type that upcasters can reference.
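The reject-on-write barrier can be illustrated with a toy in-process registry. `EventSchemaRegistry` and its methods are hypothetical; a real deployment would hold Avro, Protobuf, or JSON Schema definitions in a shared service and use that format's own validator rather than hand-written predicates.

```typescript
type PayloadValidator = (data: unknown) => boolean

// Hypothetical in-process registry keyed by event type and schema version.
class EventSchemaRegistry {
  private validators = new Map<string, PayloadValidator>()

  register(eventType: string, schemaVersion: number, validator: PayloadValidator): void {
    this.validators.set(`${eventType}@${schemaVersion}`, validator)
  }

  // Called by the write path before an event is appended.
  assertValid(eventType: string, schemaVersion: number, data: unknown): void {
    const validator = this.validators.get(`${eventType}@${schemaVersion}`)
    if (!validator) throw new Error(`unregistered schema ${eventType}@${schemaVersion}`)
    if (!validator(data)) throw new Error(`payload does not match ${eventType}@${schemaVersion}`)
  }
}

const isRecord = (value: unknown): value is { [key: string]: unknown } =>
  typeof value === 'object' && value !== null

const schemaRegistry = new EventSchemaRegistry()
schemaRegistry.register('OrderPlaced', 1, (d) => isRecord(d) && typeof d['orderId'] === 'string')
```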

Projections and read models

Projections turn the event log into the shapes the read side actually needs. The lifecycle (using Marten’s vocabulary, which most stores follow):

Type	Execution	Consistency	Use case
Inline	Same transaction as events	Strong	Critical reads with cheap projections
Async	Background subscriber	Eventual	Complex aggregations, high throughput, fan-out
Live	Computed on demand, not persisted	Computed at query time	Ad-hoc analytics, debugging

Building a projection

    interface Projection<TState> {
      initialState: TState
      apply: (state: TState, event: DomainEvent) => TState
    }

    const orderListProjection: Projection<OrderListState> = {
      initialState: { orders: [], totalRevenue: 0 },

      apply(state, event) {
        // Branch on eventType, the discriminator defined on DomainEvent.
        switch (event.eventType) {
          case 'OrderPlaced':
            return {
              ...state,
              orders: [...state.orders, { id: event.data.orderId, status: 'placed', total: event.data.total }],
              totalRevenue: state.totalRevenue + event.data.total,
            }
          case 'OrderShipped':
            return {
              ...state,
              orders: state.orders.map((o) => (o.id === event.data.orderId ? { ...o, status: 'shipped' } : o)),
            }
          case 'OrderRefunded':
            return {
              ...state,
              orders: state.orders.map((o) => (o.id === event.data.orderId ? { ...o, status: 'refunded' } : o)),
              totalRevenue: state.totalRevenue - event.data.refundAmount,
            }
          default:
            return state
        }
      },
    }

Rebuilding projections

Because events are the source of truth, projections are disposable. Rebuild whenever:

A bug in projection logic produced wrong state.
A new query pattern needs a new shape.
The read database schema changes.

The safe rebuild shape is blue/green, not in-place replay. Stand up the new projection from position 0 in a parallel read-model store, let it catch up to the live tail, then atomically swap the reader pointer when its lag drops under your SLO. In-place replay is tempting and almost always wrong: it serves partially-rebuilt state to live readers, and any failure mid-replay leaves the read model in a state nobody can describe.

Figure: Blue/green projection rebuild. A v2 projection folds from position 0 in parallel with v1; readers swap to v2 once it has caught up. Never replay in place.
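The swap step can be sketched as a pointer that only moves once the candidate has caught up. `ReadModel`, `rebuildFrom`, and `ReaderPointer` are hypothetical names, and the log is an in-memory array; in practice the candidate lives in its own store and keeps tailing the log until the swap.

```typescript
interface ReadModel {
  position: number   // last log position folded into this model
  orderCount: number
}
type Fold = (model: ReadModel, event: string) => ReadModel

// Fold the log from position 0 into a fresh model, independent of the live one.
function rebuildFrom(log: string[], fold: Fold): ReadModel {
  return log.reduce(fold, { position: 0, orderCount: 0 })
}

class ReaderPointer {
  private live: ReadModel

  constructor(initial: ReadModel) {
    this.live = initial
  }

  current(): ReadModel {
    return this.live
  }

  // Swap only when the candidate's lag behind the log head is within budget.
  swapIfCaughtUp(candidate: ReadModel, logHead: number, maxLag: number): boolean {
    if (logHead - candidate.position > maxLag) return false
    this.live = candidate
    return true
  }
}
```

Readers call `current()` throughout, so they see either the complete old model or the complete new one, never a partially rebuilt state.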

Rebuild cost grows with the event log; this is exactly what snapshots and archival exist to bound. Netflix’s experience — re-runs taking days — is a good warning that “we can always rebuild” only holds if you also designed for it.

Cross-projection dependencies

Caution

Projections that read other projections are a footgun during rebuilds — the dependency may sit at a different position in the log than its consumer. Either denormalise so each projection stands alone, declare dependencies so the rebuild engine orders them, or have dependents wait until their dependencies’ positions catch up. Dennis Doomen’s “The Ugly of Event Sourcing” has a long catalogue of variations on this failure.
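The "wait until dependencies catch up" option reduces to a position check before a dependent projection processes each event. This is a minimal sketch; `ProjectionProgress` and `canAdvance` are hypothetical, and a rebuild engine would also need cycle detection and a retry schedule.

```typescript
interface ProjectionProgress {
  name: string
  position: number    // last log position this projection has processed
  dependsOn: string[] // names of projections it reads from
}

// A dependent may process the event at `target` only once every dependency has reached it.
function canAdvance(
  projection: ProjectionProgress,
  target: number,
  all: ProjectionProgress[],
): boolean {
  return projection.dependsOn.every((dependencyName) => {
    const dependency = all.find((p) => p.name === dependencyName)
    return dependency !== undefined && dependency.position >= target
  })
}
```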

Choosing an event store

Purpose-built: KurrentDB (formerly EventStoreDB)

KurrentDB is what was previously called EventStoreDB; the project rebranded under the company Kurrent in 2024. Whichever name you encounter, it is the same engine — purpose-built for streams, subscriptions, and event-sourced workloads.

Around 15 k writes/sec and 50 k reads/sec in Kurrent’s published benchmarks, configuration- and disk-bound — see Kurrent’s own “What if you need better performance than 15k writes per second?” and product page.
Native primitives: streams, optimistic concurrency per stream, persistent subscriptions with checkpoints, JavaScript projections.
Vertical replica-set scaling; sharding is on you. Performance is dominated by disk I/O.

Apache Kafka — streaming, not sourcing

Kafka is excellent for event streaming (transport, fan-out, integration). It is the wrong default for event sourcing. The widely cited Serialized.io article “Apache Kafka is not for Event Sourcing” lays this out:

Topic granularity. Aggregates can easily reach millions; Kafka is not designed for millions of topics, and a topic per entity type forces consumers to scan the whole partition to load one aggregate.
No cheap “load events for entity X.” The streaming API gives you an offset-ordered scan, not a stream-by-id read.
No optimistic concurrency. “Append only if version is still N” is not a Kafka primitive; you bolt that on with a coordinating database.
Log compaction destroys history. Compaction keeps only the latest value per key — the exact opposite of what an event store needs.

Use Kafka as the bus that ships events from your real event store to downstream consumers; do not let it own the source of truth.
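The compaction point is easy to see in a simplified model. The function below is not Kafka code; it only mimics the documented end result of compaction (the latest record per key is retained) on an in-memory array, with hypothetical record shapes.

```typescript
interface KeyedRecord {
  key: string
  value: string
}

// Simplified model of compaction: keep only the most recent record for each key.
function compact(log: KeyedRecord[]): KeyedRecord[] {
  const latestIndex = new Map<string, number>()
  log.forEach((record, index) => latestIndex.set(record.key, index))
  return log.filter((record, index) => latestIndex.get(record.key) === index)
}

const orderTopic: KeyedRecord[] = [
  { key: 'order-1', value: 'OrderPlaced' },
  { key: 'order-2', value: 'OrderPlaced' },
  { key: 'order-1', value: 'OrderShipped' },
]
const compactedTopic = compact(orderTopic)
// order-1 now has a single record, 'OrderShipped'; its 'OrderPlaced' event is gone.
```

Folding `compactedTopic` for `order-1` can no longer reproduce the aggregate, because the first fact in its history has been discarded.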

PostgreSQL-backed stores

A relational engine works as long as you accept the patterns:

Transactional outbox to publish events together with state changes.
LISTEN/NOTIFY for low-latency subscription wake-ups.
Advisory locks for projection coordination.

Libraries: Marten (.NET), pg-event-store (Node), Eventide (Ruby).

Aspect	PostgreSQL-backed	KurrentDB / EventStoreDB
Operational familiarity	High — same DB you already run	New engine and ops surface
ES primitives	Library-provided	Native
Transactionality with app data	Full ACID	Event store separate from app DB
Scaling	Conventional Postgres patterns	Purpose-built but limited sharding

Cloud message-grid services (and what they are not)

AWS EventBridge — routing and orchestration across 90+ AWS services. Useful as a serverless event bus; not an event store.
Azure Event Hubs — Kafka-compatible high-volume ingestion with geo-DR. Same caveats as Kafka for sourcing.

Production patterns from real systems

LMAX — pure ES, in-memory, single-thread

LMAX’s architecture is the textbook upper bound of pure event sourcing.

Business Logic Processor: single-threaded, in-memory, event-sourced; ~6 M orders/sec on commodity hardware.
Disruptor: lock-free ring buffer for I/O — input ring of 20 M slots, output rings of 4 M slots; benchmarks claim 25 M+ messages/sec and sub-50 ns latency.
Replication: three BLPs running the same input — two in the primary DC, one in DR.
Snapshots: nightly; restart cycles every night with no downtime.

What worked: in-memory state with the log as the durability story made the database irrelevant to throughput. What was hard: enforcing determinism in business logic and debugging replay-driven incidents.

Netflix Downloads — Cassandra-backed, async projections, 6-month build

Netflix’s downloads service launched in November 2016 with a tight deadline (“a 6 am global press release”). They built it in roughly six months on a Cassandra-backed event store with three components — Event Store, Aggregate Repository, Aggregate Service — and async projections.

What worked: requirements churn during the build was absorbable because new questions only needed new projections, not migrations. What was hard: rebuild times during development took days, which forced snapshotting and event archival earlier than they would have liked.

Uber Cadence — event sourcing for durable execution

Cadence (and its fork Temporal) is event sourcing applied to workflow state, not just data. Each workflow’s history is the event stream; the worker reconstructs state by deterministic replay before continuing execution.

Multi-tenant clusters host hundreds of domains.
A single Cadence service runs more than a hundred applications at Uber.
The host-level priority task processor reduced worker goroutines on each history host from 16 000 to about 100 (a reduction of roughly 99 %) at the same load.

The lesson is that “event sourcing” generalises beyond database design — durable execution platforms reuse the same deterministic-replay primitive.
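The replay primitive can be shown in miniature. The sketch below is a toy and not the Cadence or Temporal API: `ReplayContext`, `run`, and `checkoutWorkflow` are hypothetical, and activities are reduced to synchronous functions returning numbers.

```typescript
// Hypothetical miniature of history-based replay; not the Cadence or Temporal API.
interface RecordedStep {
  activity: string
  result: number
}

class ReplayContext {
  private cursor = 0
  private readonly recorded: RecordedStep[]
  readonly executed: string[] = [] // activities that actually ran in this pass

  constructor(recorded: RecordedStep[]) {
    this.recorded = recorded
  }

  run(activity: string, execute: () => number): number {
    const step = this.recorded[this.cursor]
    this.cursor++
    if (step) {
      // Replaying: the code must request the same activity the history recorded.
      if (step.activity !== activity) throw new Error(`non-deterministic: expected ${step.activity}`)
      return step.result
    }
    // Past the end of the history: execute for real and record the result.
    const result = execute()
    this.recorded.push({ activity, result })
    this.executed.push(activity)
    return result
  }
}

function checkoutWorkflow(ctx: ReplayContext): number {
  const price = ctx.run('quote', () => 40)
  const tax = ctx.run('tax', () => price * 0.25)
  return price + tax
}
```

Resuming with a history that already contains the `quote` step re-runs the workflow code from the top but only executes `tax`, which is why workflow code has to be deterministic.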

Comparing the three

Aspect	LMAX	Netflix Downloads	Uber Cadence
Domain	Financial exchange	Media licensing	Workflow orchestration
Throughput	~6 M orders/sec (BLP)	Not disclosed	Hundreds of domains, multi-tenant
Consistency	Single-threaded, deterministic	Eventual	Per-workflow, deterministic replay
Storage	In-memory + on-disk log	Cassandra	Cassandra
Snapshots	Nightly	On demand	Per workflow history
Team shape	Small, specialised	6-month feature team	Platform team

Pitfalls in production

1. Unbounded event growth

Storing every tick or every micro-event without an archival plan eventually breaks rebuilds. Mitigate with snapshots, time-segmented streams (orders-2026-q1), and tiered storage (recent in hot store, older in S3 / Glacier).
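Time-segmented stream names such as `orders-2026-q1` can be derived from the event date. The helper below is a hypothetical sketch that assumes quarterly segments and UTC boundaries; pick the granularity that matches your archival schedule.

```typescript
// Derives a quarterly segment name such as "orders-2026-q1" from a UTC date.
function segmentedStreamId(prefix: string, occurredAt: Date): string {
  const quarter = Math.floor(occurredAt.getUTCMonth() / 3) + 1
  return `${prefix}-${occurredAt.getUTCFullYear()}-q${quarter}`
}
```

Once a quarter closes, its segment is immutable in practice and can move to cold storage as a unit.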

2. Assuming event order across streams

Within a stream, events are totally ordered. Across streams, the bus may deliver duplicates, reorder, or replay. Build projections that treat each event as an independent input keyed by eventId, and never rely on arrival order for correctness.
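A projection that tolerates duplicates and late arrivals might look like the sketch below. The names are hypothetical, and the guard shown (deduplicate by `eventId`, then ignore events older than the version already applied for that stream) suits last-state read models; projections that accumulate totals need a different strategy, such as buffering until the gap is filled.

```typescript
interface OrderStatusEvent {
  eventId: string
  streamId: string
  version: number // position within its own stream
  status: string
}

interface OrderStatusView {
  seen: Set<string> // eventIds already handled
  byOrder: Map<string, { version: number; status: string }>
}

function applyStatus(view: OrderStatusView, e: OrderStatusEvent): void {
  if (view.seen.has(e.eventId)) return // duplicate delivery
  view.seen.add(e.eventId)
  const current = view.byOrder.get(e.streamId)
  // A late-arriving older event must not overwrite newer state.
  if (current && current.version >= e.version) return
  view.byOrder.set(e.streamId, { version: e.version, status: e.status })
}
```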

3. Dual-write to event store and message bus

Writing the event then publishing in two operations is the classic distributed bug — append succeeds, publish fails, downstream silently misses the event. Use the transactional outbox pattern, Change Data Capture from the event table, or an event store with built-in subscriptions (KurrentDB).
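The outbox variant can be sketched with an in-memory stand-in for the database. `OrdersDatabase`, `appendWithOutbox`, and `relayOutbox` are hypothetical; the essential assumptions are that the event row and the outbox row commit in one transaction, and that the relay marks a row as published only after the bus accepts it.

```typescript
interface OutboxRow {
  payload: string
  published: boolean
}

// Stand-in for one database holding both the event table and the outbox table.
class OrdersDatabase {
  readonly events: string[] = []
  readonly outbox: OutboxRow[] = []

  // In a real database these two inserts share a single transaction.
  appendWithOutbox(event: string): void {
    this.events.push(event)
    this.outbox.push({ payload: event, published: false })
  }
}

// Relay: publish pending rows in order, marking each only after the bus accepts it.
function relayOutbox(db: OrdersDatabase, publish: (payload: string) => void): number {
  let sent = 0
  for (const row of db.outbox) {
    if (row.published) continue
    publish(row.payload) // if this throws, the row stays pending and is retried later
    row.published = true
    sent++
  }
  return sent
}
```

A crash between `publish` and the mark produces a duplicate on the next run, so this gives at-least-once delivery and downstream consumers still need to be idempotent.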

4. Projection complexity sprawl

Projections that join across streams, aggregate across long windows, or reach into other projections become rebuild monsters. Denormalise aggressively, accept duplication in read models, and prefer many small focused projections over one universal one. If the question is genuinely analytical (“revenue by region by category by month”), push it into the warehouse instead.

5. Schema drift without a strategy

Adding a field “just for new events” and assuming projections will cope works exactly until you replay history through new logic. Make additive changes the default, mark new fields optional, and embed the producing code version in the event metadata so projections can branch on intent rather than presence.

6. Time-travel debugging is oversold

Event sourcing gives you an audit trail; it does not magically reduce debugging effort. Chris Kiehl’s first-hand take in “Don’t Let the Internet Dupe You, Event Sourcing is Hard” argues that “99% of the time ‘bad states’ were bad events caused by your standard run-of-the-mill human error” — the ledger added little over a normal database for those cases. Practical questions to answer before you rely on time travel: how do you fix the bad event for already-affected users (a compensating event, almost always), how do you inspect intermediate state if events are binary, and how do you scope a replay to production data without polluting projections?

Invest in the tooling instead — event visualisation, projection-state-at-event inspection, and a documented compensating-event playbook.

GDPR and the right to erasure

Event sourcing’s immutability collides head-on with the GDPR right to erasure. There are two practical workarounds:

Crypto-shredding

Mathias Verraes’ crypto-shredding pattern is the most common answer:

Encrypt personal data with a per-subject key before writing it into the event.
Store the keys in a separate, mutable key store keyed by subject.
On erasure, delete the key. The encrypted payload becomes permanently unreadable.

    interface OrderPlacedEvent {
      orderId: string
      customerRef: string
      encryptedCustomerData: string
      items: OrderItem[]
    }

    interface CustomerKeyStore {
      getKey(customerRef: string): Promise<CryptoKey | null>
      deleteKey(customerRef: string): Promise<void>
    }

    async function projectOrder(event: OrderPlacedEvent, keyStore: CustomerKeyStore): Promise<OrderReadModel> {
      const key = await keyStore.getKey(event.customerRef)
      const customerData = key
        ? await decrypt(event.encryptedCustomerData, key)
        : { name: '[deleted]', address: '[deleted]' }
      return {
        orderId: event.orderId,
        customerName: customerData.name,
      }
    }

Figure: Crypto-shredding flow. Write encrypts PII with a per-subject key; erasure deletes the key; subsequent reads render tombstones. The immutable event log is left untouched: erasure is a key-store delete that turns ciphertext into a tombstone on the next read.

Warning

Encrypted personal data is still personal data under GDPR. Verraes himself is explicit that crypto-shredding renders data unreadable but may not satisfy a formal erasure request, and recommends pairing it with retention policies, key-management hygiene, and legal review.

Forgettable payloads

Verraes’ “Forgettable Payloads” takes the opposite trade-off: store personal data in a separate, mutable store and reference it from the event by id. Erasure becomes a normal DELETE. The cost is a join at projection time and another datastore to operate.

The two patterns compose — keep ids and immutable facts in the event, push PII into a forgettable side store, and use crypto-shredding for the few PII fields you really do need to keep alongside the event for replay.
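For comparison with the crypto-shredding listing, a forgettable-payload projection could be sketched as below. `OrderPlacedRef`, `PiiStore`, and `projectOrderRow` are hypothetical, and the side store is an in-memory map standing in for a mutable table.

```typescript
interface OrderPlacedRef {
  orderId: string
  customerRef: string // id into the mutable PII store; no personal data in the event
}

// Hypothetical mutable side store for personal data.
class PiiStore {
  private rows = new Map<string, { name: string }>()

  put(ref: string, data: { name: string }): void {
    this.rows.set(ref, data)
  }

  get(ref: string): { name: string } | undefined {
    return this.rows.get(ref)
  }

  // Erasure is an ordinary delete; the event log is not touched.
  erase(ref: string): void {
    this.rows.delete(ref)
  }
}

// The join at projection time: a missing row renders as a tombstone.
function projectOrderRow(
  event: OrderPlacedRef,
  pii: PiiStore,
): { orderId: string; customerName: string } {
  const person = pii.get(event.customerRef)
  return { orderId: event.orderId, customerName: person ? person.name : '[deleted]' }
}
```

Read models that already copied the name must be rebuilt or scrubbed after an erasure, since deleting the side-store row does not reach into existing projections.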

Practical takeaways

Default to CRUD until the audit/temporal need is real. Event sourcing carries genuine cost and “we might need history later” rarely justifies it up front.
Pick the smallest path that solves the problem. Hybrid > pure for most teams; inline projections > async until throughput forces the trade.
Hold off on snapshots, schema registries, and stream transformations. Add them when measurements demand it, not before.
Treat events as the source of truth and everything else as derived. Projections, snapshots, and read databases are all rebuildable; events are not.
Use Kafka as transport, not as the store. Pair it with a purpose-built or relational event store.
Plan for erasure on day one. Decide between crypto-shredding and forgettable payloads before personal data lands in the log.

Appendix

Prerequisites

Working knowledge of domain-driven design (aggregates, bounded contexts).
Familiarity with consistency models (strong vs eventual; optimistic vs pessimistic concurrency).
Comfort with CQRS as a separate concept from event sourcing.