Distributed Locking
Distributed locks coordinate access to shared resources across multiple processes or nodes. Unlike single-process mutexes, they must handle network partitions, clock drift, process pauses, and partial failures—all while providing mutual exclusion guarantees that range from “best effort” to “correctness critical.”
This article covers lock implementations (Redis, ZooKeeper, etcd, Chubby), the Redlock controversy, fencing tokens, lease-based expiration, and when to avoid locks entirely.
Abstract
Distributed locking is fundamentally harder than it appears. The safety property—at most one client holds the lock at any time—requires either consensus protocols (ZooKeeper, etcd) or careful timing assumptions that can fail under realistic conditions (Redlock).
Core mental model:
- Efficiency locks: Prevent duplicate work. Occasional double-execution is tolerable. Redis single-node or Redlock works.
- Correctness locks: Protect invariants. Double-execution corrupts data. Requires consensus + fencing tokens.
Key insight: Most lock implementations provide leases (auto-expiring locks) rather than indefinite locks. Leases prevent deadlock from crashed clients but introduce the fundamental problem: what if the lease expires while the client is still working?
Fencing tokens solve this: the lock service issues a monotonically increasing token with each lock grant. The protected resource rejects operations with tokens lower than the highest it has seen. This transforms lease expiration from a safety violation into a detected-and-rejected stale operation.
Decision framework:
| Requirement | Implementation | Trade-off |
|---|---|---|
| Best-effort deduplication | Redis single-node | Single point of failure |
| Efficiency with fault tolerance | Redlock (5 nodes) | No fencing, timing assumptions |
| Correctness critical | ZooKeeper/etcd + fencing | Operational complexity |
| Already using PostgreSQL | Advisory locks | Limited to single database |
The Problem
Why Naive Solutions Fail
Approach 1: File-based locks across NFS
```typescript
import { promises as fs } from "fs"

// Naive NFS lock - seems simple
async function acquireLock(path: string): Promise<boolean> {
  try {
    await fs.writeFile(path, String(process.pid), { flag: "wx" }) // exclusive create
    return true
  } catch {
    return false // file exists
  }
}
```
Fails because:
- NFS semantics vary: `O_EXCL` isn't atomic on all NFS implementations
- No expiration: If the process crashes, the lock file persists forever
- No fencing: Stale lock holders can still access the resource
Approach 2: Database row locks
```sql
-- Lock by inserting a row
INSERT INTO locks (resource_id, holder, acquired_at)
VALUES ('resource-1', 'client-a', NOW())
ON CONFLICT DO NOTHING;
```
Fails because:
- No automatic expiration: Crashed clients leave orphan locks
- Clock drift: `acquired_at` timestamps are unreliable across nodes
- Single point of failure: The database becomes a bottleneck
Approach 3: Redis SETNX without TTL
```
SETNX resource:lock client-id
```
Fails because:
- No expiration: Crashed client locks resource forever
- Race on release: Client must check-then-delete atomically
The Core Challenge
The fundamental tension: distributed systems are asynchronous—there are no bounded delays on message delivery, no bounded process pauses, and no bounded clock drift.
Distributed locks exist to provide mutual exclusion across this asynchronous environment. The challenge: you cannot distinguish a slow client from a crashed client, and you cannot trust clocks.
“Distributed locks are not just a scaling challenge—they’re a correctness challenge. The algorithm must be correct even when clocks are wrong, networks are partitioned, and processes pause unexpectedly.” — Martin Kleppmann, “How to do distributed locking” (2016)
Lease-Based Locking
All practical distributed locks use leases—time-bounded locks that expire automatically. This prevents indefinite lock holding by crashed clients.
Core Mechanism
A client acquires the lock together with a TTL. If it finishes in time, it releases the lock explicitly; if it crashes or stalls, the lease expires and the lock becomes available to other clients. The price of this liveness guarantee is that a lease can expire while its holder is still working.
TTL Selection Formula
```
MIN_VALIDITY = TTL - (T_acquire - T_start) - CLOCK_DRIFT
```
Where:
- TTL: Initial lease duration
- T_acquire - T_start: Time elapsed acquiring the lock
- CLOCK_DRIFT: Maximum expected clock skew between client and server
Practical guidance:
- JVM applications: TTL ≥ 60s (stop-the-world GC can pause for seconds)
- Go/Rust applications: TTL ≥ 30s (less GC concern, but network issues)
- General rule: TTL should be 10x your expected operation duration
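As a concrete illustration of the formula, here is a minimal TypeScript sketch that measures acquisition time with a monotonic clock (`process.hrtime.bigint()`) and computes the remaining validity. The `tryAcquire` callback and the drift constant are assumptions for illustration, not part of any particular lock client.

```typescript
const CLOCK_DRIFT_MS = 100 // assumed maximum client/server clock skew

async function acquireWithValidity(
  tryAcquire: (ttlMs: number) => Promise<boolean>, // hypothetical lock request
  ttlMs: number
): Promise<number | null> {
  const start = process.hrtime.bigint() // monotonic: immune to NTP adjustments
  const ok = await tryAcquire(ttlMs)
  const elapsedMs = Number((process.hrtime.bigint() - start) / 1_000_000n)

  // MIN_VALIDITY = TTL - (T_acquire - T_start) - CLOCK_DRIFT
  const validity = ttlMs - elapsedMs - CLOCK_DRIFT_MS
  return ok && validity > 0 ? validity : null // unusable if no validity remains
}
```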
Clock Skew Issues
Wall-clock danger: Redis uses wall-clock time for TTL expiration. If the server’s clock jumps forward (NTP adjustment, manual change), leases expire prematurely.
Example failure scenario:
- Client acquires lock with TTL=30s at server time T
- NTP adjusts server clock forward by 20s
- Lock expires at “T+30s” = actual T+10s
- Client still working; another client acquires lock
- Two clients now hold the “same” lock
Mitigation: Use monotonic clocks where possible. Linux clock_gettime(CLOCK_MONOTONIC) measures elapsed time without wall-clock adjustments.
Prior to Redis 7.0: TTL expiration relied entirely on wall-clock time. Redis 7.0+ uses monotonic clocks internally for some operations, but the fundamental issue remains for distributed Redlock scenarios where multiple independent clocks are involved.
Design Paths
Path 1: Redis Single-Node Lock
When to choose:
- Lock is for efficiency (prevent duplicate work), not correctness
- Single point of failure is acceptable
- Lowest latency requirement
Implementation:
```typescript
import { Redis } from "ioredis"

async function acquireLock(redis: Redis, resource: string, clientId: string, ttlMs: number): Promise<boolean> {
  // SET with NX (only if not exists) and PX (millisecond expiry)
  const result = await redis.set(resource, clientId, "PX", ttlMs, "NX")
  return result === "OK"
}

async function releaseLock(redis: Redis, resource: string, clientId: string): Promise<boolean> {
  // Lua script: atomic check-and-delete
  // Only delete if we still own the lock
  const script = `
    if redis.call("get", KEYS[1]) == ARGV[1] then
      return redis.call("del", KEYS[1])
    else
      return 0
    end
  `
  const result = await redis.eval(script, 1, resource, clientId)
  return result === 1
}
```
Why the Lua script for release: Without atomic check-and-delete, this race exists:
- Client A's lock expires
- Client B acquires the lock
- Client A (still thinking it has the lock) calls `DEL`
- Client A deletes Client B's lock
Trade-offs:
| Advantage | Disadvantage |
|---|---|
| Simple implementation | Single point of failure |
| Low latency (~1ms) | No automatic failover |
| Well-understood semantics | Lost locks on master crash |
Real-world: This approach works well for rate limiting, cache stampede prevention, and other scenarios where occasional double-execution is tolerable.
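For example, cache stampede prevention can reuse the `acquireLock`/`releaseLock` helpers above. The sketch below is illustrative, assuming ioredis and a caller-supplied `recompute` function; a missed lock simply means another worker is already rebuilding the value.

```typescript
async function getWithStampedeProtection(
  redis: Redis,
  key: string,
  workerId: string,
  recompute: (key: string) => Promise<string> // assumed expensive rebuild
): Promise<string | null> {
  const cached = await redis.get(key)
  if (cached !== null) return cached

  // Cache miss: let exactly one worker rebuild the value
  if (await acquireLock(redis, `lock:${key}`, workerId, 10_000)) {
    try {
      const fresh = await recompute(key)
      await redis.set(key, fresh, "EX", 300)
      return fresh
    } finally {
      await releaseLock(redis, `lock:${key}`, workerId)
    }
  }

  // Another worker holds the rebuild lock; callers can retry shortly or serve stale data
  return null
}
```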
Path 2: Redlock (Multi-Node Redis)
When to choose:
- Need fault tolerance for efficiency locks
- Can tolerate timing assumptions
- Want Redis ecosystem (Lua scripts, familiar API)
Algorithm (N=5 independent Redis instances):
- Get current time in milliseconds
- Try to acquire lock on ALL N instances sequentially, with small timeout per instance
- Lock is acquired if: majority (N/2 + 1) succeeded AND total elapsed time < TTL
- Validity time = TTL - elapsed time
- If failed, release lock on ALL instances (even those that succeeded)
```typescript
import { Redis } from "ioredis"
import { randomBytes } from "crypto"

interface RedlockResult {
  acquired: boolean
  validity: number
  value: string
}

async function redlockAcquire(instances: Redis[], resource: string, ttlMs: number): Promise<RedlockResult> {
  const value = randomBytes(20).toString("hex")
  const startTime = Date.now()
  const quorum = Math.floor(instances.length / 2) + 1

  let acquired = 0
  for (const redis of instances) {
    try {
      const result = await redis.set(resource, value, "PX", ttlMs, "NX")
      if (result === "OK") acquired++
    } catch {
      // Instance unavailable, continue
    }
  }

  const elapsed = Date.now() - startTime
  const validity = ttlMs - elapsed

  if (acquired >= quorum && validity > 0) {
    return { acquired: true, validity, value }
  }

  // Failed - release all locks
  await Promise.all(instances.map((r) => releaseLock(r, resource, value)))
  return { acquired: false, validity: 0, value }
}
```
Critical limitation: Redlock generates random values (20 bytes from /dev/urandom), not monotonically increasing tokens. You cannot use Redlock values for fencing because resources cannot determine which token is "newer."
Trade-offs vs single-node:
| Aspect | Single-Node | Redlock (N=5) |
|---|---|---|
| Fault tolerance | None | Survives N/2 failures |
| Latency | ~1ms | ~5ms (sequential attempts) |
| Complexity | Low | Medium |
| Fencing support | No | No |
| Clock assumptions | Server only | All N servers + client |
Path 3: ZooKeeper
When to choose:
- Correctness-critical locks (fencing required)
- Already running ZooKeeper (Kafka, HBase ecosystem)
- Can tolerate higher latency for stronger guarantees
Ephemeral sequential node recipe:
Algorithm:
1. Client creates an ephemeral sequential node under `/locks/resource`
2. Client lists all children and sorts them by sequence number
3. If the client's node has the lowest sequence number: lock acquired
4. Otherwise: set a watch on the node with the next-lowest sequence number
5. When the watch fires: repeat step 2
Why watch predecessor, not parent:
- Watching parent causes thundering herd: all N clients wake when lock releases
- Watching predecessor: only next client wakes
Fencing via zxid: ZooKeeper’s transaction ID (zxid) is a monotonically increasing 64-bit number. Use the zxid of your lock node as a fencing token.
```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ZkLock {
    private final ZooKeeper zk;
    private String myNode;

    public ZkLock(ZooKeeper zk) {
        this.zk = zk;
    }

    public long acquireLock(String resource) throws Exception {
        // Create ephemeral sequential node
        myNode = zk.create(
            "/locks/" + resource + "/lock-",
            new byte[0],
            ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.EPHEMERAL_SEQUENTIAL
        );

        while (true) {
            List<String> children = zk.getChildren("/locks/" + resource, false);
            Collections.sort(children);

            String smallest = children.get(0);
            if (myNode.endsWith(smallest)) {
                // We have the lock - return the node's creation zxid as fencing token
                Stat stat = zk.exists(myNode, false);
                return stat.getCzxid();
            }

            // Find the predecessor node and watch it
            int myIndex = children.indexOf(myNode.substring(myNode.lastIndexOf('/') + 1));
            String predecessor = children.get(myIndex - 1);

            // Register a watch that fires when the predecessor changes (typically: is deleted)
            CountDownLatch latch = new CountDownLatch(1);
            Stat stat = zk.exists("/locks/" + resource + "/" + predecessor, event -> latch.countDown());
            if (stat != null) {
                // Block until the watch fires, then re-check the children
                latch.await();
            }
        }
    }
}
```
Trade-offs:
| Advantage | Disadvantage |
|---|---|
| Strong consistency (Zab consensus) | Higher latency (2+ RTTs) |
| Automatic cleanup (ephemeral nodes) | Operational complexity |
| Fencing tokens (zxid) | Session management overhead |
| No clock assumptions | Quorum unavailable = no locks |
Path 4: etcd
When to choose:
- Kubernetes-native environment
- Prefer gRPC over custom protocols
- Need distributed KV store beyond just locking
Lease-based locking:
etcd provides first-class lease primitives. A lease is a token with a TTL; keys can be attached to leases and are automatically deleted when the lease expires.
```go
package main

import (
	"context"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func acquireLock(client *clientv3.Client, resource string) (*concurrency.Mutex, error) {
	// Create session with 30s TTL
	session, err := concurrency.NewSession(client, concurrency.WithTTL(30))
	if err != nil {
		return nil, err
	}

	// Create mutex and acquire
	mutex := concurrency.NewMutex(session, "/locks/"+resource)
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	if err := mutex.Lock(ctx); err != nil {
		return nil, err
	}

	// Use mutex.Header().Revision as fencing token
	return mutex, nil
}
```
Fencing via revision: etcd assigns a globally unique, monotonically increasing revision to every modification. Use `mutex.Header().Revision` as your fencing token.
Critical limitation (Jepsen finding): Under network partitions and process pauses, etcd locks can fail to provide mutual exclusion. Jepsen testing observed roughly 18% of acknowledged updates lost when locks guarded concurrent modifications. The root cause is inherent to lease-based locking: leases must expire to preserve liveness, so a paused or partitioned client can outlive its lease and keep acting on a lock it no longer holds.
“etcd’s lock is not safe. It is possible for two processes to simultaneously hold the same lock, even in healthy clusters.” — Kyle Kingsbury, Jepsen analysis of etcd 3.4.3 (2020)
Trade-offs:
| Advantage | Disadvantage |
|---|---|
| Raft consensus (strong consistency) | Jepsen found safety violations |
| Native lease support | Higher latency than Redis |
| Kubernetes integration | Operational complexity |
| Revision-based fencing | Quorum unavailable = no locks |
Path 5: Database Advisory Locks (PostgreSQL)
When to choose:
- Already using PostgreSQL
- Lock scope is single database
- Don’t want external dependencies
Session-level advisory locks:
```sql
-- Acquire lock (blocks until available)
SELECT pg_advisory_lock(hashtext('resource-1'));

-- Try acquire (returns immediately)
SELECT pg_try_advisory_lock(hashtext('resource-1'));

-- Release
SELECT pg_advisory_unlock(hashtext('resource-1'));
```
Transaction-level advisory locks:
```sql
-- Automatically released at transaction end
SELECT pg_advisory_xact_lock(hashtext('resource-1'));

-- Then do your work within the transaction
UPDATE resources SET ... WHERE id = 'resource-1';
```
Lock ID generation: Advisory locks take a 64-bit integer key (or two 32-bit keys). Use `hashtext()` for string-based resource IDs (it returns a 32-bit hash, so collisions are possible), or encode your own scheme.
Connection pooling danger: Session-level locks are tied to the database connection. With connection pooling (PgBouncer), your “session” may be reused by another client, leaking locks. Use transaction-level locks with connection pooling.
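A sketch of the pool-safe pattern from Node.js using the `pg` client: the lock is transaction-scoped, so it cannot leak when the connection returns to the pool. The `withResourceLock` wrapper and the class ID `42` are illustrative, not a standard API.

```typescript
import { Client } from "pg"

async function withResourceLock(
  client: Client,
  resourceId: string,
  work: () => Promise<void>
): Promise<boolean> {
  await client.query("BEGIN")
  try {
    // Two-int key form: a fixed "class" id plus hashtext() of the resource name
    const { rows } = await client.query(
      "SELECT pg_try_advisory_xact_lock(42, hashtext($1)) AS locked",
      [resourceId]
    )
    if (!rows[0].locked) {
      await client.query("ROLLBACK")
      return false // someone else holds the lock
    }
    await work()
    await client.query("COMMIT") // lock released automatically with the transaction
    return true
  } catch (err) {
    await client.query("ROLLBACK")
    throw err
  }
}
```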
Trade-offs:
| Advantage | Disadvantage |
|---|---|
| No external dependencies | Single database scope |
| ACID guarantees | Connection pooling issues |
| Already have PostgreSQL | Not for multi-database |
| Automatic transaction cleanup | Lock ID collisions possible |
Comparison Matrix
| Factor | Redis Single | Redlock | ZooKeeper | etcd | PostgreSQL |
|---|---|---|---|---|---|
| Fault tolerance | None | N/2 failures | N/2 failures | N/2 failures | Database HA |
| Fencing tokens | No | No | Yes (zxid) | Yes (revision) | No |
| Latency (acquire) | ~1ms | ~5-10ms | ~10-50ms | ~10-50ms | ~1-5ms |
| Clock assumptions | Yes | Yes (all nodes) | No | No | No |
| Correctness guarantee | No | No | Yes | Partial (Jepsen) | Yes (single DB) |
| Operational complexity | Low | Medium | High | Medium | Low |
Decision Framework
- Ask first whether the lock is for efficiency or correctness. Efficiency locks tolerate occasional double-execution; correctness locks do not.
- Efficiency, single point of failure acceptable: Redis single-node.
- Efficiency with fault tolerance: Redlock, accepting its timing assumptions.
- Correctness critical: ZooKeeper or etcd, with fencing tokens enforced by the protected resource.
- Lock scope limited to one PostgreSQL database: advisory locks.
Fencing Tokens
The Problem They Solve
Leases expire. When they do, a “stale” lock holder may still be executing its critical section. Without fencing, this corrupts the protected resource.
Example failure scenario:
- Client 1 acquires the lock with a 30s lease and begins writing
- Client 1 hits a long stop-the-world GC pause; the lease expires
- Client 2 acquires the now-free lock and writes
- Client 1 resumes, unaware its lease expired, and writes again, overwriting Client 2's update
How Fencing Tokens Work
- Lock service issues monotonically increasing token with each grant
- Client includes token with every operation on protected resource
- Resource tracks highest token ever seen
- Resource rejects operations with token < highest seen
Implementation Pattern
Lock service side:
```typescript
interface LockGrant {
  token: bigint // Monotonically increasing
  expiresAt: number
}

class FencingLockService {
  private nextToken: bigint = 1n
  private locks: Map<string, { holder: string; token: bigint; expiresAt: number }> = new Map()

  acquire(resource: string, clientId: string, ttlMs: number): LockGrant | null {
    const existing = this.locks.get(resource)
    if (existing && existing.expiresAt > Date.now()) {
      return null // Lock held
    }

    const token = this.nextToken++
    const expiresAt = Date.now() + ttlMs
    this.locks.set(resource, { holder: clientId, token, expiresAt })

    return { token, expiresAt }
  }
}
```
Resource side:
```typescript
interface FencedWrite {
  token: bigint
  data: unknown
}

class FencedStorage {
  private highestToken: Map<string, bigint> = new Map()
  private data: Map<string, unknown> = new Map()

  write(resource: string, write: FencedWrite): boolean {
    const highest = this.highestToken.get(resource) ?? 0n

    if (write.token < highest) {
      // Stale token - reject
      return false
    }

    // Accept write, update highest seen
    this.highestToken.set(resource, write.token)
    this.data.set(resource, write.data)
    return true
  }
}
```
Why Random Values Don’t Work
Redlock uses random values (20 bytes), not ordered tokens. A resource cannot determine if abc123 is “newer” than xyz789. This is why Redlock cannot provide fencing—the values lack the ordering property required to reject stale operations.
“To make the lock safe with fencing, you need not just a random token, but a monotonically increasing token. And the only way to generate a monotonically increasing token is to use a consensus protocol.” — Martin Kleppmann
ZooKeeper zxid as Fencing Token
ZooKeeper’s transaction ID (zxid) is perfect for fencing:
- Monotonically increasing: Every ZK transaction increments it
- Globally ordered: All clients see same ordering
- Available at lock time: `Stat.getCzxid()` returns the creation zxid
```java
// When acquiring lock
Stat stat = zk.exists(myLockNode, false);
long fencingToken = stat.getCzxid();

// When accessing resource
resource.write(data, fencingToken);
```
The Redlock Controversy
Kleppmann’s Critique (2016)
Martin Kleppmann identified fundamental problems with Redlock:
1. Timing assumptions violated by real systems:
Redlock assumes bounded network delay, bounded process pauses, and bounded clock drift. Real systems violate all three:
- Network packets can be delayed arbitrarily (TCP retransmits, routing changes)
- GC pauses can exceed lease TTL (observed: 1+ minutes in production JVMs)
- Clock skew can be seconds under adversarial NTP conditions
2. No fencing capability:
Even if Redlock worked perfectly, it generates random values, not monotonic tokens. Resources cannot reject stale operations.
3. Clock jump scenario:
- Client acquires lock on 3 of 5 Redis instances
- Clock on one instance jumps forward (NTP sync)
- Lock expires prematurely on that instance
- Another client acquires on 3 instances (the jumped one + 2 others)
- Two clients now hold majority
Antirez’s Response
Salvatore Sanfilippo (Redis creator) responded:
1. Random values + CAS = sufficient:
“The token is a random string. If you use check-and-set (CAS), you can use the random string to ensure that only the lock owner can modify the resource.”
2. Post-acquisition time check:
Redlock spec includes checking elapsed time after acquisition. If elapsed > TTL, the lock is considered invalid. This allegedly handles delayed responses.
3. Monotonic clocks:
Proposed using CLOCK_MONOTONIC instead of wall clocks to eliminate clock jump issues.
The Verdict
Neither argument is fully satisfying:
| Kleppmann’s points | Antirez’s counterpoints | Reality |
|---|---|---|
| GC pauses violate timing | Post-acquisition check helps | Pauses can happen during resource access, not just during acquire |
| No fencing possible | Random + CAS works | CAS requires resource to store lock value; not always feasible |
| Clock jumps break safety | Use monotonic clocks | Cross-machine monotonic clocks don’t exist |
Practical guidance:
- Efficiency locks: Redlock is acceptable. Double-execution is annoying but not catastrophic.
- Correctness locks: Use consensus-based systems (ZooKeeper) with fencing tokens. Redlock’s random values cannot fence.
Production Implementations
Google Chubby: The Original
Context: Internal distributed lock service powering GFS, BigTable, and other Google infrastructure. Open-sourced concept inspired ZooKeeper.
Architecture:
- 5-replica Paxos cluster
- Replicas elect master using Paxos; master lease is several seconds
- Client sessions with grace periods (45s default)
- Files + locks (locks are files with special semantics)
Key design decisions:
- Coarse-grained locks: Designed for locks held minutes to hours, not milliseconds
- Advisory locks by default: Files don’t prevent access without explicit lock checks
- Master lease renewal: Master doesn’t lose leadership on brief network blips
- Client grace period: On leader change, clients have 45s to reconnect before the session (and its locks) is invalidated
Fencing mechanism: Chubby supports sequencers (fencing tokens). The lock service hands out sequencers; resources can verify them with Chubby before accepting writes.
“If a process’s lease has expired, the lock server will refuse to validate the sequencer.” — Mike Burrows, “The Chubby Lock Service” (2006)
Scale: Chubby is not designed for high-frequency locking. It’s optimized for reliability of infrequent operations, not throughput.
Uber: Driver Assignment
Context: When a rider requests a cab, multiple nearby drivers could be assigned. Exactly one driver must be assigned per ride.
Problem:
- Multiple matching service instances receive the same request
- Race condition: both try to assign the same driver
- Result: driver assigned to multiple rides, customer experience failure
Solution:
- Distributed lock on `driver:{driver_id}` before assignment
- Lock held only during the assignment operation (~10-100ms)
- Redis-based (likely Redlock or single-node with replication)
Why it works: This is an efficiency lock. If two services somehow both assign the same driver (lock failure), the booking system downstream rejects the duplicate. Occasional failures are detected and handled.
Netflix: Job Deduplication
Context: Millions of distributed jobs, some triggered by events that may arrive multiple times.
Problem:
- Event arrives at multiple consumer instances
- Same job should execute exactly once
- Idempotency alone doesn’t help if job has side effects
Solution approach:
- Acquire lock before processing event
- Lock key: `job:{event_id}:{job_type}`
- TTL: Expected job duration + buffer
- Combined with idempotency keys in downstream services
Insight: Netflix uses a layered approach—locks provide first-line deduplication, idempotent operations provide safety net, and monitoring detects drift.
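A rough sketch of that layering (illustrative only; the article describes the pattern, not Netflix's actual code). It reuses the Redis `acquireLock`/`releaseLock` helpers from earlier and assumes a hypothetical `submitJob` downstream call that deduplicates on an idempotency key.

```typescript
async function processEventOnce(
  redis: Redis,
  event: { id: string; type: string },
  workerId: string,
  submitJob: (event: { id: string; type: string }, opts: { idempotencyKey: string }) => Promise<void>
) {
  const lockKey = `job:${event.id}:${event.type}`
  const ttlMs = 60_000 // expected job duration plus buffer

  // First line of defense: only one consumer instance processes this event
  if (!(await acquireLock(redis, lockKey, workerId, ttlMs))) return

  try {
    // Safety net: downstream deduplicates on the idempotency key if the lock ever fails
    await submitJob(event, { idempotencyKey: lockKey })
  } finally {
    await releaseLock(redis, lockKey, workerId)
  }
}
```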
Implementation Comparison
| Aspect | Google Chubby | Uber | Netflix |
|---|---|---|---|
| Lock type | Correctness | Efficiency | Efficiency |
| Duration | Minutes-hours | Milliseconds | Seconds |
| Backend | Paxos (custom) | Redis | Redis/ZK hybrid |
| Fencing | Sequencers | Downstream checks | Idempotency keys |
| Scale | Low freq, high reliability | High freq, acceptable loss | High freq, acceptable loss |
Lock-Free Alternatives
When to Avoid Locks Entirely
Distributed locks add complexity and failure modes. Before reaching for a lock, consider:
1. Idempotent operations:
If your operation can safely execute multiple times with the same result, you don’t need a lock.
```typescript
// Bad: non-idempotent
async function incrementCounter(id: string) {
  const current = await db.get(id)
  await db.set(id, current + 1)
}

// Good: idempotent with versioning
async function setCounterIfMatch(id: string, expectedVersion: number, newValue: number) {
  await db
    .update(id)
    .where("version", expectedVersion)
    .set({ value: newValue, version: expectedVersion + 1 })
}
```
2. Compare-and-Swap (CAS):
Many databases support atomic CAS. Use it instead of external locks.
```sql
-- CAS-based update
UPDATE resources
SET value = 'new-value', version = version + 1
WHERE id = 'resource-1' AND version = 42;

-- Check rows affected - if 0, retry with fresh version
```
3. Optimistic concurrency:
Assume no conflicts; detect and retry on collision.
```typescript
interface VersionedResource {
  data: unknown
  version: number
}

async function optimisticUpdate(id: string, transform: (data: unknown) => unknown) {
  while (true) {
    const resource = await db.get(id)
    const newData = transform(resource.data)

    const updated = await db.update(id, {
      data: newData,
      version: resource.version + 1,
      _where: { version: resource.version },
    })

    if (updated) return // Success
    // Version conflict - retry
  }
}
```
4. Queue-based serialization:
Route all operations for a resource to a single queue/partition.
This eliminates concurrent access by design.
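A minimal sketch of the routing idea: hash the resource ID to a fixed partition so every operation on that resource lands on the same single-threaded consumer. The queue itself (Kafka partition, worker shard, etc.) is left abstract.

```typescript
import { createHash } from "crypto"

// All operations for a given resource map to the same partition,
// so one consumer applies them sequentially - no lock needed.
function partitionFor(resourceId: string, numPartitions: number): number {
  const digest = createHash("sha1").update(resourceId).digest()
  return digest.readUInt32BE(0) % numPartitions
}

// e.g. enqueue(partitionFor("resource-1", 16), operation)
```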
Decision: Lock vs Lock-Free
| Factor | Use Distributed Lock | Use Lock-Free |
|---|---|---|
| Operation complexity | Multi-step, non-decomposable | Single atomic operation |
| Conflict frequency | Rare | Frequent (CAS retries expensive) |
| Side effects | External (can’t retry) | Local (can retry) |
| Existing infrastructure | Lock service available | Database has CAS |
| Team expertise | Lock patterns understood | Lock-free patterns understood |
Common Pitfalls
1. Holding Locks Across Async Boundaries
The mistake: Acquiring lock, then making RPC calls or doing I/O while holding it.
```typescript
// Dangerous: lock held during external call
const lock = await acquireLock(resource)
const data = await externalService.fetch() // Network call!
await db.update(resource, data)
await releaseLock(lock)
```
What goes wrong:
- External call takes 10s; lock TTL is 5s
- Lock expires while you’re still working
- Another client acquires and corrupts data
Solution: Minimize lock scope. Fetch data first, then lock-update-unlock quickly.
```typescript
// Better: minimize lock duration
const data = await externalService.fetch()

const lock = await acquireLock(resource)
await db.update(resource, data)
await releaseLock(lock)
```
2. Ignoring Lock Acquisition Failure
The mistake: Assuming lock acquisition always succeeds.
```typescript
// Dangerous: no failure handling
await acquireLock(resource)
await criticalOperation()
await releaseLock(resource)
```
What goes wrong:
- Lock service unavailable → operation proceeds without lock
- Lock contention → silent failure, concurrent access
Solution: Always check acquisition result and handle failure.
```typescript
const acquired = await acquireLock(resource)
if (!acquired) {
  throw new Error("Failed to acquire lock - cannot proceed")
}
try {
  await criticalOperation()
} finally {
  await releaseLock(resource)
}
```
3. Lock-Release Race with TTL
The mistake: Releasing a lock you no longer own (it expired and was re-acquired).
```typescript
// Dangerous: release without ownership check
await lock.release() // May delete another client's lock!
```
What goes wrong:
- Your lock expires due to a slow operation
- Another client acquires the lock
- Your `release()` deletes their lock
- A third client acquires; now two clients think they hold the lock
Solution: Atomic release that checks ownership (shown in Redis Lua script earlier).
4. Thundering Herd on Lock Release
The mistake: All waiting clients wake simultaneously when lock releases.
What goes wrong with ZooKeeper naive implementation:
- 1000 clients watch the `/locks/resource` parent node
- The lock releases; all 1000 receive the watch notification
- All 1000 call `getChildren()` simultaneously
- ZooKeeper is overloaded and lock acquisition stalls
Solution: Watch predecessor only (shown in ZooKeeper recipe earlier). Only one client wakes per release.
5. Missing Fencing on Correctness-Critical Locks
The mistake: Using Redlock (or any lease-based lock) without fencing for correctness-critical operations.
```typescript
// Dangerous: no fencing
const lock = await redlock.acquire(resource)
await storage.write(data) // Stale lock holder can overwrite!
await redlock.release(lock)
```
Solution: Either use a lock service with fencing tokens (ZooKeeper) or accept that this lock is efficiency-only.
6. Session-Level Locks with Connection Pooling
The mistake: Using PostgreSQL session-level advisory locks with PgBouncer.
```sql
-- Acquired by connection in pool
SELECT pg_advisory_lock(12345);
-- Connection returned to pool
-- Other client reuses connection
-- Lock is still held by "other" client!
```
Solution: Use transaction-level locks with pooling.
```sql
BEGIN;
SELECT pg_advisory_xact_lock(12345);
-- Do work
COMMIT; -- Lock automatically released
```
Conclusion
Distributed locking is a coordination primitive that requires careful consideration of failure modes, timing assumptions, and fencing requirements.
Key decisions:
- Efficiency vs correctness: Most locks are for efficiency (preventing duplicate work). These can use simpler implementations with known failure modes. Correctness-critical locks require consensus protocols and fencing.
- Fencing is non-negotiable for correctness: Without fencing tokens, lease expiration during long operations corrupts data. Random lock values (Redlock) cannot fence.
- Timing assumptions are dangerous: Redlock’s safety depends on bounded network delays, process pauses, and clock drift. Real systems violate all three.
- Consider lock-free alternatives: Idempotent operations, CAS, optimistic concurrency, and queue-based serialization often work better than distributed locks.
Start simple: Single-node Redis locks work for most efficiency scenarios. Graduate to ZooKeeper with fencing only when correctness is critical and you understand the operational cost.
Appendix
Prerequisites
- Distributed systems fundamentals (network partitions, consensus)
- CAP theorem and consistency models
- Basic understanding of lease-based coordination
Terminology
| Term | Definition |
|---|---|
| Lease | Time-bounded lock that expires automatically |
| Fencing token | Monotonically increasing identifier that resources use to reject stale operations |
| TTL | Time-To-Live; duration before lease expires |
| Quorum | Majority of nodes (N/2 + 1) required for consensus |
| Split-brain | Network partition where multiple partitions believe they are authoritative |
| zxid | ZooKeeper transaction ID; monotonically increasing, usable as fencing token |
| Advisory lock | Lock that doesn’t prevent access—just signals intention |
| Ephemeral node | ZooKeeper node that is automatically deleted when the client session ends |
Summary
- Distributed locks are harder than they appear—network partitions, clock drift, and process pauses all cause multiple clients to believe they hold the same lock
- Leases (auto-expiring locks) prevent deadlock but introduce the lease-expiration-during-work problem
- Fencing tokens solve this by having the resource reject operations from stale lock holders
- Redlock provides fault-tolerant efficiency locks but cannot fence (random values lack ordering)
- ZooKeeper/etcd provide fencing tokens (zxid/revision) but add operational complexity
- Lock-free alternatives (CAS, idempotency, queues) often work better than distributed locks
- For correctness-critical locks: use consensus + fencing; for efficiency locks: Redis single-node is often sufficient
References
Foundational:
- How to do distributed locking - Martin Kleppmann’s analysis of Redlock.
- Is Redlock safe? - Salvatore Sanfilippo’s response.
- The Chubby Lock Service for Loosely-Coupled Distributed Systems - Mike Burrows, OSDI 2006.
Implementation Documentation:
- Redis Distributed Locks - Official Redis distributed lock documentation.
- ZooKeeper Recipes and Solutions - Official ZooKeeper lock recipe.
- etcd Concurrency API - etcd lease and lock APIs.
- PostgreSQL Advisory Locks - PostgreSQL documentation.
Testing and Analysis:
- Jepsen: etcd 3.4.3 - Kyle Kingsbury’s analysis finding safety violations in etcd locks.
- Designing Data-Intensive Applications - Martin Kleppmann. Chapter 8 covers distributed coordination.
Libraries:
- Redisson - Redis Java client with distributed locks.
- node-redlock - Redlock implementation for Node.js.
- Curator - ZooKeeper recipes including distributed locks.