9 min read
Part of Series: Design Patterns & Distributed Systems

Architecting Asynchronous Workloads in Node.js: From In-Process Queues to Distributed Systems

A comprehensive guide to building resilient, scalable asynchronous task processing systems in Node.js, covering everything from basic in-memory queues to advanced distributed patterns.

graph LR
    %% Task Queue
    subgraph "Task Queue"
        T1[Task 1]
        T2[Task 2]
        T3[Task 3]
        T4[Task 4]
        T5[Task 5]
    end

    %% Executors
    E1[Executor 1]
    E2[Executor 2]
    E3[Executor 3]

    %% Connections
    T1 --> E1
    T2 --> E2
    T3 --> E3
    T4 --> E1
    T5 --> E2

    %% Styling
    classDef taskClass fill:#ffcc00,stroke:#000,stroke-width:2px
    classDef executorClass fill:#00ccff,stroke:#000,stroke-width:2px
    classDef queueClass fill:#e0e0e0,stroke:#000,stroke-width:2px

    class T1,T2,T3,T4,T5 taskClass
    class E1,E2,E3 executorClass

At the core of Node.js is a single-threaded, event-driven architecture. This model is highly efficient for I/O-bound operations but presents a challenge for long-running or CPU-intensive tasks, which can block the main thread and render an application unresponsive.

graph TD
    subgraph "Event Loop Architecture"
        CS[Call Stack]
        EL[Event Loop]
        MQ[Microtask Queue]
        TQ[Task Queue]
        WEB[Web APIs]
    end

    CS --> EL
    EL --> MQ
    EL --> TQ
    WEB --> TQ
    WEB --> MQ

    classDef stackClass fill:#ff9999,stroke:#000,stroke-width:2px
    classDef queueClass fill:#99ccff,stroke:#000,stroke-width:2px
    classDef loopClass fill:#99ff99,stroke:#000,stroke-width:2px

    class CS stackClass
    class MQ,TQ queueClass
    class EL loopClass

The Event Loop orchestrates execution between the Call Stack, where synchronous code runs, and various queues that hold callbacks for asynchronous operations. When an async operation completes, its callback is placed in a queue. The Event Loop monitors the Call Stack and processes tasks from these queues whenever the stack is empty.

Queue Types:

  • Task Queue (Macrotask Queue): Holds callbacks from I/O operations, setTimeout, and setInterval
  • Microtask Queue: Holds callbacks from Promises (.then(), .catch()). This queue has higher priority: all queued microtasks run to completion before the Event Loop picks up the next task from the macrotask queue. In Node.js, process.nextTick() callbacks go into a separate queue that drains even before the Promise microtasks.
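
A quick way to see these priorities in action is to schedule one callback of each kind; the numbers in the log messages reflect the order Node.js actually runs them:

event-loop-order.ts
console.log("1: synchronous code runs to completion first")

setTimeout(() => console.log("4: macrotask (setTimeout)"), 0)

Promise.resolve().then(() => console.log("3: microtask (Promise)"))

process.nextTick(() => console.log("2: process.nextTick (drains before Promise microtasks)"))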

For many applications, the first step beyond simple callbacks is an in-memory task queue. The goal is to manage and throttle the execution of asynchronous tasks within a single process, such as controlling concurrent requests to a third-party API to avoid rate limiting.

code-sample.ts
type Task<T = void> = () => Promise<T>

export class AsyncTaskQueue {
  private queue = new TaskQueue<Task>()
  private activeCount = 0
  private concurrencyLimit: number

  constructor(concurrencyLimit: number) {
    this.concurrencyLimit = concurrencyLimit
  }

  addTask<T>(promiseFactory: Task<T>): Promise<T> {
    // Promise.withResolvers() is ES2024; available in Node.js 22+
    const { promise, resolve, reject } = Promise.withResolvers<T>()
    const task: Task = async () => {
      try {
        const result = await promiseFactory()
        resolve(result)
      } catch (error) {
        reject(error)
      }
    }
    this.queue.enqueue(task)
    this.processQueue()
    return promise
  }

  private async processQueue(): Promise<void> {
    if (this.activeCount >= this.concurrencyLimit || this.queue.isEmpty()) {
      return
    }
    const task = this.queue.dequeue()
    this.activeCount++
    try {
      await task()
    } finally {
      this.activeCount--
      this.processQueue()
    }
  }
}

// Queue implementation using a linked list
class TaskNode<T> {
  value: T
  next: TaskNode<T> | null

  constructor(value: T) {
    this.value = value
    this.next = null
  }
}

export class TaskQueue<T> {
  head: TaskNode<T> | null
  tail: TaskNode<T> | null
  size: number

  constructor() {
    this.head = null
    this.tail = null
    this.size = 0
  }

  // Enqueue: add an element to the end of the queue
  enqueue(value: T) {
    const newNode = new TaskNode(value)
    if (this.tail) {
      this.tail.next = newNode
    }
    this.tail = newNode
    if (!this.head) {
      this.head = newNode
    }
    this.size++
  }

  // Dequeue: remove an element from the front of the queue
  dequeue(): T {
    if (!this.head) {
      throw new Error("Queue is empty")
    }
    const value = this.head.value
    this.head = this.head.next
    if (!this.head) {
      this.tail = null
    }
    this.size--
    return value
  }

  isEmpty() {
    return this.size === 0
  }
}

// Example usage
const queue = new AsyncTaskQueue(3)

const createTask = (id: number, delay: number) => () =>
  new Promise<void>((resolve) => {
    console.log(`Task ${id} started`)
    setTimeout(() => {
      console.log(`Task ${id} completed`)
      resolve()
    }, delay)
  })

queue.addTask(createTask(1, 1000))
queue.addTask(createTask(2, 500))
queue.addTask(createTask(3, 1500))
queue.addTask(createTask(4, 200))
queue.addTask(createTask(5, 300))
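
With a concurrency limit of 3, tasks 1-3 start immediately. Task 4 only starts once task 2 (the shortest) completes around the 500 ms mark, and task 5 follows when task 4 finishes at roughly 700 ms, so no more than three timers are ever in flight at once.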

This implementation provides basic control over local asynchronous operations. However, it has critical limitations for production systems:

  • No Persistence: Jobs are lost if the process crashes
  • No Distribution: Cannot be shared across multiple processes or servers
  • Limited Features: Lacks advanced features like retries, prioritization, or detailed monitoring

To build scalable and reliable Node.js applications, especially in a microservices architecture, tasks must be offloaded from the main application thread and managed by a system that is both persistent and distributed.

graph LR
    subgraph "Producer"
        P1[API Server]
        P2[Background Job]
        P3[Event Handler]
    end

    subgraph "Message Broker"
        MB[(Redis/Database)]
    end

    subgraph "Consumers"
        W1[Worker 1]
        W2[Worker 2]
        W3[Worker 3]
    end

    P1 --> MB
    P2 --> MB
    P3 --> MB
    MB --> W1
    MB --> W2
    MB --> W3

    classDef producerClass fill:#ffcc99,stroke:#000,stroke-width:2px
    classDef brokerClass fill:#cc99ff,stroke:#000,stroke-width:2px
    classDef workerClass fill:#99ffcc,stroke:#000,stroke-width:2px

    class P1,P2,P3 producerClass
    class MB brokerClass
    class W1,W2,W3 workerClass

A distributed task queue system consists of three main components:

  1. Producers: Application components that create jobs and add them to a queue
  2. Message Broker: A central, persistent data store (like Redis or a database) that holds the queue of jobs
  3. Consumers (Workers): Separate processes that pull jobs from the queue and execute them

Key Benefits:

  • Decoupling: Producers and consumers operate independently
  • Reliability: Jobs are persisted in the message broker
  • Scalability: Multiple worker processes can handle increased load (Competing Consumers pattern)
Several mature Node.js libraries implement this architecture on top of Redis or MongoDB:

| Library | Backend | Core Philosophy & Strengths | Key Features |
| --- | --- | --- | --- |
| BullMQ | Redis | Modern, robust, high-performance queue system | Job dependencies (flows), rate limiting, repeatable jobs, priority queues, sandboxed processors |
| Bee-Queue | Redis | Simple, fast, lightweight for real-time, short-lived jobs | Atomic operations, job timeouts, configurable retry strategies, scheduled jobs |
| Agenda | MongoDB | Flexible job scheduling with cron-based intervals | Cron scheduling, concurrency control per job, job priorities, web UI (Agendash) |

Producer: Adding a Job to the Queue

producer.ts
import { Queue } from "bullmq"

// Recent BullMQ versions require an explicit Redis connection;
// this targets a local instance
const emailQueue = new Queue("email-processing", {
  connection: { host: "localhost", port: 6379 },
})

async function queueEmailJob(userId: number, template: string) {
  await emailQueue.add("send-email", { userId, template })
  console.log(`Job queued for user ${userId}`)
}

queueEmailJob(123, "welcome-email")

Worker: Processing the Job

worker.ts
import { Worker } from "bullmq"

const emailWorker = new Worker(
  "email-processing",
  async (job) => {
    const { userId, template } = job.data
    console.log(`Processing email for user ${userId} with template ${template}`)
    // Simulate sending an email
    await new Promise((resolve) => setTimeout(resolve, 2000))
    console.log(`Email sent to user ${userId}`)
  },
  {
    connection: { host: "localhost", port: 6379 },
    // Concurrency defines how many jobs this worker can process in parallel
    concurrency: 5,
  },
)

console.log("Email worker started...")

In any distributed system, failures are not an exception but an expected part of operations. A resilient system must anticipate and gracefully handle these failures.

When a task fails due to a transient issue, the simplest solution is to retry it. However, naive immediate retries can create a “thundering herd” problem that worsens the situation.

graph LR
    subgraph "Exponential Backoff with Jitter"
        T1[1s + random]
        T2[2s + random]
        T3[4s + random]
        T4[8s + random]
    end

    T1 --> T2
    T2 --> T3
    T3 --> T4

    classDef timeClass fill:#ffcc00,stroke:#000,stroke-width:2px
    class T1,T2,T3,T4 timeClass

Exponential Backoff Strategy:

  • Delay increases exponentially: 1s, 2s, 4s, 8s
  • Retries quickly for brief disruptions
  • Gives overwhelmed systems meaningful recovery periods

Jitter Implementation:

  • Adds random time to backoff delay
  • Desynchronizes retry attempts from different clients
  • Smooths load on downstream services

// producer.ts - adding a job with a backoff strategy
await apiCallQueue.add(
  "call-flaky-api",
  { some: "data" },
  {
    attempts: 5, // Retry up to 4 times (5 attempts total)
    backoff: {
      type: "exponential",
      delay: 1000, // 1000ms, 2000ms, 4000ms, 8000ms
    },
  },
)
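
One way to combine exponential backoff with full jitter is BullMQ's custom backoff strategy: the worker registers a delay function in its settings, and jobs opt in with type: "custom". A minimal sketch, with an illustrative queue name:

worker-with-jitter.ts
import { Worker } from "bullmq"

const worker = new Worker(
  "api-calls",
  async (job) => {
    // ... call the flaky API ...
  },
  {
    connection: { host: "localhost", port: 6379 },
    settings: {
      // attemptsMade is 1 on the first retry
      backoffStrategy: (attemptsMade: number) => {
        const cap = 1000 * 2 ** (attemptsMade - 1) // 1s, 2s, 4s, 8s...
        return Math.round(Math.random() * cap) // "full jitter": uniform in [0, cap)
      },
    },
  },
)

// Producer side: opt the job into the custom strategy
// await apiCallQueue.add("call-flaky-api", { some: "data" },
//   { attempts: 5, backoff: { type: "custom" } })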

Some messages are inherently unprocessable due to malformed data or persistent bugs in consumer logic. These “poison messages” can get stuck in infinite retry loops.

graph LR
    subgraph "Main Queue"
        MQ[(Main Queue)]
    end

    subgraph "Processing"
        W[Worker]
    end

    subgraph "Dead Letter Queue"
        DLQ[(DLQ)]
    end

    MQ --> W
    W -->|Success| MQ
    W -->|Failed > Max Attempts| DLQ

    classDef queueClass fill:#e0e0e0,stroke:#000,stroke-width:2px
    classDef workerClass fill:#00ccff,stroke:#000,stroke-width:2px
    classDef dlqClass fill:#ff6666,stroke:#000,stroke-width:2px

    class MQ queueClass
    class W workerClass
    class DLQ dlqClass

The Dead Letter Queue (DLQ) pattern moves messages to a separate queue after a configured number of processing attempts have failed. This isolates problematic messages, allowing the main queue to continue functioning.
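
BullMQ does not ship a first-class DLQ, but the pattern is straightforward to approximate: listen for jobs whose configured attempts are exhausted and re-enqueue them on a dedicated queue. A minimal sketch, with illustrative queue names:

dlq.ts
import { Queue, Worker } from "bullmq"

const connection = { host: "localhost", port: 6379 }
const deadLetterQueue = new Queue("email-processing-dlq", { connection })

const worker = new Worker(
  "email-processing",
  async (job) => {
    // ... normal processing that may throw ...
  },
  { connection },
)

// "failed" fires on every failed attempt; only dead-letter the job
// once its configured attempts are exhausted
worker.on("failed", async (job, err) => {
  if (!job) return
  if (job.attemptsMade >= (job.opts.attempts ?? 1)) {
    await deadLetterQueue.add("dead-letter", {
      original: { name: job.name, data: job.data },
      failedReason: err.message,
    })
  }
})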

Most distributed messaging systems offer at-least-once delivery guarantees, meaning messages might be delivered more than once under certain failure conditions. Consumers must therefore be idempotent: processing the same message twice must leave the system in the same state as processing it once.

idempotent-consumer.ts
import { Worker } from "bullmq"
import { db } from "./database"

const idempotentWorker = new Worker(
  "user-registration",
  async (job) => {
    const { userId, userData } = job.data

    // Generic duplicate check: has this job ID already been processed?
    const alreadyProcessed = await db.processedJobs.findByPk(job.id)
    if (alreadyProcessed) {
      console.log(`Job ${job.id} already processed, skipping`)
      return
    }

    // Natural-key check: does the user already exist?
    const existingUser = await db.users.findByPk(userId)
    if (existingUser) {
      console.log(`User ${userId} already exists, skipping`)
      return
    }

    // Process and record the job ID in one transaction to ensure atomicity
    await db.transaction(async (t) => {
      await db.users.create(userData, { transaction: t })
      await db.processedJobs.create(
        {
          jobId: job.id,
          processedAt: new Date(),
        },
        { transaction: t },
      )
    })

    console.log(`User ${userId} registered successfully`)
  },
  { connection: { host: "localhost", port: 6379 } },
)

A common challenge in event-driven architectures is ensuring that database updates and event publishing happen atomically.

graph TD
    subgraph "Application"
        A[Application Service]
        DB[(Database)]
        OT[Outbox Table]
    end

    subgraph "Message Relay"
        MR[Message Relay Process]
        MB[Message Broker]
    end

    A -->|1. Business Transaction| DB
    DB -->|2. Write Event| OT
    MR -->|3. Read Events| OT
    MR -->|4. Publish Events| MB

    classDef appClass fill:#ffcc99,stroke:#000,stroke-width:2px
    classDef dbClass fill:#cc99ff,stroke:#000,stroke-width:2px
    classDef relayClass fill:#99ffcc,stroke:#000,stroke-width:2px

    class A appClass
    class DB,OT dbClass
    class MR,MB relayClass

The Transactional Outbox pattern writes events to an “outbox” table within the same database transaction as business data. A separate message relay process then reads from this table and publishes events to the message broker.

transactional-outbox.ts
async function createUserWithEvent(userData: UserData) {
  return await db.transaction(async (t) => {
    // 1. Create user
    const user = await db.users.create(userData, { transaction: t })

    // 2. Write event to outbox in same transaction
    await db.outbox.create(
      {
        eventType: "USER_CREATED",
        eventData: { userId: user.id, ...userData },
        status: "PENDING",
      },
      { transaction: t },
    )

    return user
  })
}
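
The other half of the pattern is the relay. A minimal polling sketch, reusing this article's illustrative db handle and assuming the outbox table has a createdAt column:

message-relay.ts
import { Queue } from "bullmq"
import { db } from "./database"

const eventQueue = new Queue("domain-events", {
  connection: { host: "localhost", port: 6379 },
})

async function relayPendingEvents() {
  // Fetch unpublished events in insertion order
  const pending = await db.outbox.findAll({
    where: { status: "PENDING" },
    order: [["createdAt", "ASC"]],
    limit: 100,
  })

  for (const event of pending) {
    // Publish first, then mark as sent. A crash between the two steps
    // re-publishes the event, which is why consumers must be idempotent.
    await eventQueue.add(event.eventType, event.eventData)
    await event.update({ status: "SENT" })
  }
}

// Poll the outbox table on a fixed interval
setInterval(() => relayPendingEvents().catch(console.error), 1000)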

In a microservices architecture, a single business transaction often spans services that cannot share one ACID transaction. The Saga pattern coordinates such updates as a sequence of local transactions, each paired with a compensating action that undoes it if a later step fails.

graph LR
    subgraph "Choreography Saga"
        S1[Service 1]
        S2[Service 2]
        S3[Service 3]
        S4[Service 4]
    end

    S1 -->|Event| S2
    S2 -->|Event| S3
    S3 -->|Event| S4
    S4 -->|Compensation Event| S3
    S3 -->|Compensation Event| S2
    S2 -->|Compensation Event| S1

    classDef serviceClass fill:#ffcc99,stroke:#000,stroke-width:2px
    class S1,S2,S3,S4 serviceClass

Saga Implementation Types:

  1. Choreography: Services communicate via events without central controller

    • Highly decoupled
    • Harder to debug (workflow logic distributed)
  2. Orchestration: Central orchestrator manages workflow

    • Centralized logic, easier to monitor
    • Potential single point of failure

saga-orchestrator.ts
class OrderSagaOrchestrator {
  async executeOrderSaga(orderData: OrderData) {
    // Track compensations as steps succeed, so a failure only
    // reverses work that actually happened
    const compensations: Array<() => Promise<void>> = []
    try {
      // Step 1: Reserve inventory
      await this.reserveInventory(orderData.items)
      compensations.push(() => this.releaseInventory(orderData.items))

      // Step 2: Process payment
      await this.processPayment(orderData.payment)
      compensations.push(() => this.refundPayment(orderData.payment))

      // Step 3: Create shipping label
      await this.createShippingLabel(orderData.shipping)
      compensations.push(() => this.cancelShippingLabel(orderData.shipping))

      // Step 4: Confirm order
      await this.confirmOrder(orderData.id)
    } catch (error) {
      // Execute compensating transactions in reverse order
      for (const compensate of compensations.reverse()) {
        await compensate()
      }
      throw error
    }
  }
}

For applications requiring a full audit history, Event Sourcing persists state as an immutable, append-only sequence of state-changing events rather than overwriting records in place.

graph TD
    subgraph "Write Side"
        C[Command Handler]
        ES[Event Store]
        W[Write Model]
    end

    subgraph "Read Side"
        Q[Query Handler]
        MV[Materialized Views]
        R[Read Model]
    end

    C --> ES
    ES --> W
    ES --> MV
    MV --> R
    Q --> R

    classDef writeClass fill:#ffcc99,stroke:#000,stroke-width:2px
    classDef readClass fill:#99ffcc,stroke:#000,stroke-width:2px
    classDef storeClass fill:#cc99ff,stroke:#000,stroke-width:2px

    class C,W writeClass
    class Q,R readClass
    class ES,MV storeClass

Apache Kafka’s durable, replayable log is a natural fit for an event store. Kafka also offers log compaction, which retains only the most recent record for each message key; this suits snapshot-style topics, though a full-history event topic should be left uncompacted so older events are preserved.

event-sourcing-example.ts
import { v4 as uuid } from "uuid"

// `kafka` here is a simplified client wrapper for illustration,
// not the exact API of a specific library such as kafkajs
class UserEventStore {
  async appendEvent(userId: string, event: UserEvent) {
    await kafka.produce({
      topic: "user-events",
      key: userId, // keying by user keeps each user's events ordered
      value: JSON.stringify({
        eventId: uuid(),
        userId,
        eventType: event.type,
        eventData: event.data,
        timestamp: new Date().toISOString(),
      }),
    })
  }

  async getUserEvents(userId: string): Promise<UserEvent[]> {
    const events = await kafka.consume({
      topic: "user-events",
      key: userId,
    })
    return events.map((event) => JSON.parse(event.value))
  }

  async getUserState(userId: string): Promise<UserState> {
    // Rebuild current state by replaying all events through a reducer
    // (applyEvent is sketched below)
    const events = await this.getUserEvents(userId)
    return events.reduce(applyEvent, {} as UserState)
  }
}
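
getUserState relies on an applyEvent reducer that the example leaves out. Here is a minimal sketch, assuming illustrative event types and a simple UserState shape:

apply-event.ts
// Illustrative shapes; real event and state types depend on the domain
interface UserEvent {
  type: "USER_CREATED" | "EMAIL_CHANGED" | "USER_DEACTIVATED"
  data: Record<string, unknown>
}

interface UserState {
  id?: string
  email?: string
  deactivated?: boolean
}

// A pure reducer: current state + one event => next state
function applyEvent(state: UserState, event: UserEvent): UserState {
  switch (event.type) {
    case "USER_CREATED":
      return { ...state, ...event.data }
    case "EMAIL_CHANGED":
      return { ...state, email: event.data.email as string }
    case "USER_DEACTIVATED":
      return { ...state, deactivated: true }
    default:
      // Ignore unknown event types so old streams replay under newer code
      return state
  }
}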
