
Critical Rendering Path: Rendering Pipeline Overview

The browser’s rendering pipeline transforms HTML, CSS, and JavaScript into visual pixels through a series of discrete, highly optimized stages. Chromium uses the RenderingNG architecture (a next-generation rendering system developed between 2014 and 2021), which decouples the main thread from the compositor and GPU processes to sustain 60fps or higher and minimize interaction latency.

[Diagram: pipeline stages grouped under Network & Parser, Main Thread (Renderer Process), Compositor Thread, and GPU / Viz Process. Nodes: HTML Bytes, CSS Bytes, JS Bytes, DOM Tree, CSSOM Tree, Style Recalc, Layout, Prepaint, Property Trees, Paint, Display Lists, Commit, Tiling, Rasterize, Composite, Draw to Screen. One edge is labeled "Blocks".]

The RenderingNG pipeline: Main thread stages (DOM → Paint) produce immutable outputs committed to the compositor thread, which handles rasterization and compositing independently.

The rendering pipeline is fundamentally a producer-consumer architecture split across threads and processes:

  • Main Thread: Produces structured data (DOM, CSSOM, computed styles, layout geometry, property trees, display lists). Each stage’s output is immutable once complete.
  • Compositor Thread: Consumes committed data to handle scrolling, animations (transform/opacity), and frame assembly without blocking the main thread.
  • Viz Process: Aggregates compositor frames from all sources and issues GPU draw calls.

The key insight: Property Trees (transform, clip, effect, scroll) replaced monolithic layer trees, reducing animation updates from O(layers) to O(affected nodes). This enables compositor-driven animations that bypass the main thread entirely—the architectural foundation for responsive scrolling and 60fps animations even when JavaScript is busy.

Performance impact flows from this split: Interaction to Next Paint (INP) measures how quickly the pipeline can present a frame after user input. Each pipeline stage that runs on the main thread directly contributes to input delay and processing time.

Each stage has well-defined inputs and outputs. Understanding this data flow is essential for debugging performance issues.

| Stage | Input | Output | Consumed By |
|---|---|---|---|
| DOM Construction | HTML bytes | DOM Tree | Style Recalc |
| CSSOM Construction | CSS bytes | CSSOM Tree | Style Recalc |
| Style Recalc | DOM + CSSOM | ComputedStyle (per node) + LayoutObject Tree | Layout |
| Layout | LayoutObject Tree + ComputedStyle | Fragment Tree (immutable geometry) | Prepaint |
| Prepaint | LayoutObject Tree + Fragment Tree | Property Trees (transform, clip, effect, scroll) + paint invalidations | Paint |
| Paint | LayoutObject Tree + Property Trees | Display Lists (drawing commands) | Commit |
| Commit | Property Trees + Display Lists | Copied data on compositor thread | Layerize, Raster |
| Layerize | Display Lists | Composited layer list | Raster |
| Raster | Display Lists + Tiles | GPU texture tiles (bitmaps) | Composite |
| Composite | Texture tiles + Property Trees | Compositor Frame (DrawQuads) | Draw |
| Draw | Compositor Frame | Pixels on screen | Display |

Key distinction: The LayoutObject Tree is created during Style Recalc, not Layout. Layout annotates the LayoutObject tree and produces the immutable Fragment Tree as its output. Prepaint then traverses the LayoutObject tree (using Fragment Tree data) to build Property Trees.

The Critical Rendering Path (CRP) is the sequence of steps the browser undergoes to convert code into a visual frame. While traditionally viewed as a linear flow (DOM → CSSOM → Render Tree → Layout → Paint), modern engines employ a granular multi-threaded architecture designed around a core constraint: the main thread handles both JavaScript execution and rendering pipeline stages, so any work on the main thread delays both script responsiveness and visual updates.

Prior to RenderingNG (pre-2021): Rendering was deeply coupled. Scrolling could trigger expensive style recalculations. Animation of transforms required full layer tree walks. The single-threaded assumption inherited from the KHTML/WebKit lineage (KHTML dates to 1998) meant rendering work blocked JavaScript and vice versa.

RenderingNG’s design addresses this by:

  1. Separating concerns: Each pipeline stage produces well-defined, immutable outputs
  2. Enabling skip logic: Stages that aren’t needed can be bypassed (e.g., transform animations skip layout and paint)
  3. Offloading work: Compositor-driven operations don’t require main thread involvement

The pipeline comprises 12 stages, though several can be skipped when unnecessary. The first six run on the main thread; the remainder run on the compositor thread and Viz process.

The browser parses HTML bytes into the Document Object Model (DOM) tree. This process is incremental—the browser starts building the tree before the entire document downloads.

Blocking Behavior:

  • JS is Parser Blocking: Synchronous <script> tags halt the HTML parser because scripts can call document.write(), which modifies the input stream.
  • JS Delays Rendering Indirectly: A parser-blocking script stops the parser from producing the DOM that rendering depends on, so content after the script cannot be rendered until the script has been fetched and executed.
  • Preload Scanner: A secondary parser scans ahead for external resources (JS, CSS, fonts, images) to start downloads early, mitigating parser-blocking delays.

Design Trade-off: The parser-blocking behavior exists because JavaScript can modify the document structure mid-parse via document.write(). This legacy API forces sequential processing, though defer and async attributes provide escape hatches for scripts that don’t need synchronous document access.
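The loading modes described above can be summarized as a small decision function. This is a minimal sketch rather than browser code: `classifyScript` and its input shape are hypothetical, and the file names are placeholders.

```javascript
// Sketch: how a <script>'s attributes decide when it runs relative to parsing.
// `classifyScript` and the object shape are hypothetical, for illustration only.
function classifyScript({ src = null, type = "classic", async = false, defer = false }) {
  if (type === "module") {
    // Module scripts do not block the parser; async opts out of document order.
    return async ? "async" : "deferred";
  }
  if (!src) {
    // Inline classic scripts ignore async/defer and run when the parser reaches them.
    return "parser-blocking";
  }
  if (async) return "async";     // runs once fetched, order not guaranteed
  if (defer) return "deferred";  // runs after parsing, in document order
  return "parser-blocking";      // parser waits for fetch and execution
}

const scripts = [
  { src: "/analytics.js", async: true },
  { src: "/app.js", defer: true },
  { src: "/legacy.js" },
  { type: "module", src: "/main.mjs" },
];
const modes = scripts.map(classifyScript);
console.log(modes);
```

Only the third entry halts the parser; the others let DOM construction continue while the script downloads.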

The browser parses CSS into the CSS Object Model (CSSOM) tree. Unlike DOM construction, CSSOM must be built in its entirety before rendering can occur.

Blocking Behavior:

  • CSS is Render Blocking: The browser won’t render content until CSSOM completes, avoiding Flash of Unstyled Content (FOUC).
  • CSS is JS Execution Blocking: Scripts can query styles via getComputedStyle(), so browsers block JS execution until CSSOM is ready.

Design Trade-off: The all-or-nothing CSSOM requirement exists because CSS rules can override each other in complex ways (cascade, specificity, !important). Partial rendering would show incorrect styles as later rules load. The browser trades initial latency for visual correctness.
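To make the trade-off concrete, here is a simplified cascade resolver. It is a sketch under stated assumptions: it models only !important, specificity, and source order for rules that already match one element, and ignores origins and cascade layers. The rule objects and function names are hypothetical.

```javascript
// Sketch: why a partially loaded stylesheet can produce the wrong style.
// Each rule is simplified to { specificity: [ids, classes, types], important, color }.
function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Returns the winning color among rules that all match the same element.
function resolveColor(rules) {
  let winner = null;
  for (const rule of rules) {
    if (winner === null) {
      winner = rule;
      continue;
    }
    const byImportance = Number(rule.important) - Number(winner.important);
    const bySpecificity = compareSpecificity(rule.specificity, winner.specificity);
    // Later rules win ties, so ">= 0" lets source order break the tie.
    if (byImportance > 0 || (byImportance === 0 && bySpecificity >= 0)) {
      winner = rule;
    }
  }
  return winner ? winner.color : undefined;
}

const matchingRules = [
  { specificity: [0, 1, 0], important: false, color: "blue" }, // .title
  { specificity: [0, 0, 1], important: false, color: "gray" }, // h1
  { specificity: [1, 0, 0], important: false, color: "red" },  // #hero, arrives last
];

const partial = resolveColor(matchingRules.slice(0, 2));
const complete = resolveColor(matchingRules);
console.log(partial, complete);
```

With only the first two rules available the element would render blue, then switch to red once the last rule arrived, which is the flash the render-blocking behavior avoids.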

The engine combines DOM and CSSOM to determine final computed styles for every element. This stage produces two outputs:

  1. ComputedStyle for each node—the resolved CSS property values after cascade, inheritance, and calc() resolution
  2. LayoutObject Tree—the tree structure that establishes the order of operations for the layout phase

Computed Style + LayoutObject Tree vs. Render Tree:

Legacy architecture: Older browser documentation refers to a Render Tree—a tree structure built by combining DOM and CSSOM that contained only visible elements (excluding display: none, <head>, etc.). Each render tree node stored both the DOM reference and its computed styles.

Modern engines (RenderingNG, BlinkNG) decouple these concerns:

  • Computed styles are stored as a map attached to DOM nodes, not in a separate tree
  • Visibility filtering happens later during layout (the Fragment Tree excludes non-rendered elements)
  • Style calculation is now a discrete phase that can run independently and be cached

This separation enables better incremental updates—changing an element’s display from none to block only requires style recalc and layout, not rebuilding a monolithic tree structure.

Key Details:

  • Dirty Bit System: Only elements marked “dirty” (style-invalidated) are recalculated, enabling O(dirty nodes) instead of O(all nodes).
  • Containment Optimization: contain: style scopes counters and quotes to the element’s subtree, so changes to them inside the subtree cannot affect styles outside it.

Edge Case: Style recalculation can cascade unexpectedly. A change to a parent’s font-size invalidates all descendants using relative units (em, %). Container queries introduced additional invalidation paths where ancestor size changes can trigger descendant style recalc.
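The dirty bit idea can be sketched with a toy tree. The node shape and the `childDirty` flag are assumptions made for illustration; they are only loosely analogous to what a real engine tracks.

```javascript
// Sketch of dirty-bit style recalc on a hypothetical node shape.
function makeNode(name, children = []) {
  const node = { name, children, parent: null, dirty: false, childDirty: false };
  for (const child of children) child.parent = node;
  return node;
}

// Mark a node dirty and flag its ancestors so the walk can find it.
function markDirty(node) {
  node.dirty = true;
  for (let p = node.parent; p && !p.childDirty; p = p.parent) p.childDirty = true;
}

// Visit only dirty nodes and the paths leading to them; returns recalculated names.
function recalcStyle(node, recalculated = []) {
  if (node.dirty) {
    recalculated.push(node.name); // a real engine would recompute ComputedStyle here
    node.dirty = false;
  }
  if (node.childDirty) {
    for (const child of node.children) recalcStyle(child, recalculated);
    node.childDirty = false;
  }
  return recalculated;
}

const item = makeNode("li");
const list = makeNode("ul", [item, makeNode("li2")]);
const root = makeNode("body", [makeNode("header"), list]);

markDirty(item);
const firstPass = recalcStyle(root);
const secondPass = recalcStyle(root);
console.log(firstPass, secondPass);
```

The first pass touches only the invalidated node; the second pass finds nothing to do because all flags were cleared.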

Layout receives the LayoutObject Tree (from Style Recalc) and calculates geometry (width, height, x, y) of every visible element. The output is the Fragment Tree—an immutable representation of laid-out boxes with resolved physical coordinates.

LayoutObject Tree vs. Fragment Tree:

  • LayoutObject Tree (input): Mutable tree created during style recalc, points to DOM nodes, receives layout annotations
  • Fragment Tree (output): Immutable tree of PhysicalFragment objects with final positions, sizes, and physical coordinates (left/top, not logical)

The Fragment Tree is the “primary, read-only output of layout.” Its immutability enables caching and prevents later stages from accidentally modifying geometry.

Key Details:

  • Dirty Bit System: Layout uses dirty flags to recalculate only affected subtrees. The engine distinguishes between “needs layout” and “needs full layout” states.
  • Layout Containment: Elements with contain: layout become layout roots—their descendants’ layout changes don’t propagate to ancestors.

Forced Synchronous Layout (Layout Thrashing): Reading geometric properties from JavaScript while layout is dirty forces an immediate, synchronous layout:

```javascript
// Hypothetical selectors, for illustration only
const container = document.querySelector(".container")
const elements = document.querySelectorAll(".item")

// ❌ Layout thrashing: each iteration forces layout
for (const el of elements) {
  el.style.width = container.offsetWidth + "px" // read forces layout, write invalidates it
}

// ✅ Batch reads, then batch writes
const width = container.offsetWidth // single read
for (const el of elements) {
  el.style.width = width + "px" // writes only
}
```

Properties That Force Layout: offsetLeft/Top/Width/Height, clientLeft/Top/Width/Height, scrollLeft/Top/Width/Height, getClientRects(), getBoundingClientRect(), getComputedStyle() (for layout-dependent properties), innerText, focus(), scrollIntoView().
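The cost difference between interleaved and batched access can be counted with a toy model that runs outside the browser. `FakeDocument` is hypothetical: it assumes every write dirties layout and every read of a geometric value forces a layout pass when layout is dirty.

```javascript
// Toy model: writes mark layout dirty, reads force a layout pass if it is dirty.
class FakeDocument {
  constructor() {
    this.layoutDirty = false;
    this.layoutCount = 0;
    this.containerWidth = 300;
  }
  readContainerWidth() {
    if (this.layoutDirty) {
      this.layoutCount++; // forced synchronous layout
      this.layoutDirty = false;
    }
    return this.containerWidth;
  }
  writeWidth(el, px) {
    el.width = px;
    this.layoutDirty = true; // invalidates layout
  }
}

// Read and write inside the same loop iteration.
function interleaved(doc, elements) {
  for (const el of elements) doc.writeWidth(el, doc.readContainerWidth());
  return doc.layoutCount;
}

// Read once, then perform all writes.
function batched(doc, elements) {
  const width = doc.readContainerWidth();
  for (const el of elements) doc.writeWidth(el, width);
  return doc.layoutCount;
}

const makeElements = () => Array.from({ length: 100 }, () => ({ width: 0 }));
const thrashCount = interleaved(new FakeDocument(), makeElements());
const batchCount = batched(new FakeDocument(), makeElements());
console.log(thrashCount, batchCount);
```

In this model the interleaved loop forces a layout on every iteration after the first, while the batched version forces none; the regular layout at the end of the frame is not counted in either case.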

Introduced in RenderingNG, Prepaint performs an in-order traversal of the LayoutObject tree (not the Fragment Tree) to build Property Trees—separate tree structures for transform, clip, effect (opacity/filters/masks), and scroll. The traversal order matters: it enables efficient computation of DOM-order hierarchy like parent containing blocks.

Why Property Trees Exist:

Legacy architecture: Browsers used a monolithic Layer Tree where each layer stored its own transform, clip, and effect values. Updating any property required walking the entire tree—O(N) where N is layers.

Property trees decouple these concerns:

  • Transform Tree: Spatial positioning (translation, rotation, scale)
  • Clip Tree: Visibility boundaries (overflow, clip-path)
  • Effect Tree: Visual effects (opacity, filters, blend modes, masks)
  • Scroll Tree: Scroll offset relationships

Each layer references nodes in these trees by ID. The compositor can update an element’s position by applying a different matrix from the Transform Tree without re-walking layout or style.

Design Trade-off: Property trees add complexity (four separate trees instead of one) but enable O(1) property updates and compositor-only animations.
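A rough sketch of the ID-based indirection, reduced to 2D translation. The array layout, field names, and `screenOffset` helper are assumptions for illustration and do not mirror Chromium's data structures.

```javascript
// Sketch: layers reference transform nodes by ID (simplified to 2D translation).
// Array index doubles as the node ID; parent -1 marks the root.
const transformTree = [
  { id: 0, parent: -1, x: 0, y: 0 },  // root
  { id: 1, parent: 0, x: 0, y: 100 }, // scroller content
  { id: 2, parent: 1, x: 20, y: 0 },  // card inside the scroller
];

const layers = [
  { name: "background", transformId: 0 },
  { name: "list", transformId: 1 },
  { name: "card", transformId: 2 },
];

// Accumulate offsets from a node up to the root.
function screenOffset(tree, nodeId) {
  let x = 0;
  let y = 0;
  for (let id = nodeId; id !== -1; id = tree[id].parent) {
    x += tree[id].x;
    y += tree[id].y;
  }
  return { x, y };
}

const before = screenOffset(transformTree, layers[2].transformId);
transformTree[1].y = 40; // one node changes; no layer object is touched
const after = screenOffset(transformTree, layers[2].transformId);
console.log(before, after);
```

Changing a single transform node moves every layer that references it or its descendants, without rewriting the layers themselves.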

The browser records drawing commands into Display Lists—a sequence of paint operations like “draw rectangle at (0,0) with blue fill.”

Key Details:

  • Not Pixels: Paint produces display lists, not actual pixels. Rasterization happens later on the compositor thread.
  • Caching: The PaintController caches display items. Identical items are reused rather than repainted.
  • Subsequence Recording: Related display items are grouped and cached together. Unchanged subtrees skip painting entirely.
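Display item caching can be sketched as a map keyed by a client ID with a version stamp. `DisplayItemCache`, the `version` field, and the command strings are hypothetical; the real PaintController's matching logic is more involved.

```javascript
// Sketch of display-item caching keyed by a hypothetical (id, version) pair.
class DisplayItemCache {
  constructor() {
    this.items = new Map();
    this.recorded = 0;
    this.reused = 0;
  }
  paint(client) {
    const cached = this.items.get(client.id);
    if (cached && cached.version === client.version) {
      this.reused++; // unchanged: reuse the recorded commands
      return cached.commands;
    }
    this.recorded++; // new or changed: record drawing commands again
    const commands = client.record();
    this.items.set(client.id, { version: client.version, commands });
    return commands;
  }
}

const cache = new DisplayItemCache();
const box = { id: "box", version: 1, record: () => ["drawRect 0 0 100 50 blue"] };
const label = { id: "label", version: 1, record: () => ["drawText 10 30 'Hi'"] };

// Frame 1: nothing is cached yet, so both items are recorded.
cache.paint(box);
cache.paint(label);

// Frame 2: only the label changed.
label.version = 2;
const frame2 = [...cache.paint(box), ...cache.paint(label)];
console.log(cache.recorded, cache.reused, frame2.length);
```

Note that the output is still a list of commands, not pixels; rasterization consumes it later.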

Layerization: Based on CSS properties (will-change, transform, opacity, position: fixed), the engine determines which elements get their own composited layers. In RenderingNG this decision is made after paint, in the Layerize stage, rather than before paint as in the legacy pipeline.

Edge Case: Creating too many layers (e.g., hundreds of will-change: transform elements) consumes GPU memory and can degrade performance. The browser uses heuristics to limit layer count.

The main thread commits updated property trees and display lists to the Compositor thread.

Key Details:

  • Synchronous Handoff: The main thread blocks while the compositor copies data. This is the handoff point between threads.
  • Atomic Update: All changes from one frame are committed together, ensuring consistency.

Implementation Detail: The ProxyImpl class enforces thread safety—it only accesses main-thread data structures when the main thread is blocked, verified via DCHECKs in debug builds.

Frame Boundaries: A commit marks the point where the main thread’s work for a frame is “done.” The main thread can begin work on the next frame while the compositor processes the current one.

The Compositor thread breaks display lists into composited layer lists for independent rasterization.

Why Separate From Paint: Layerization decisions depend on runtime factors (memory pressure, GPU capabilities, overlap analysis) that the main thread shouldn’t wait for. Moving this to the compositor enables faster commits.

The Compositor thread converts display lists into bitmapped textures (pixels).

Key Details:

  • Tiling: Each layer is divided into tiles (typically 256×256 or 512×512 pixels). Only visible and near-visible tiles are rasterized.
  • GPU Acceleration: Most modern browsers use GPU rasterization via Skia. Software rasterization is the fallback.
  • Modes: ZeroCopyRasterBufferProvider (direct GPU memory), OneCopyRasterBufferProvider (CPU to GPU upload), GpuRasterBufferProvider (GPU command buffer).
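Tile selection comes down to rectangle arithmetic. The sketch below assumes a fixed square tile size and a uniform prefetch margin around the viewport; `tilesToRaster` and its parameters are hypothetical, and real prioritization is more nuanced.

```javascript
// Sketch: which tiles of a layer intersect the viewport plus a prefetch margin.
function tilesToRaster({ layerWidth, layerHeight, tileSize, viewport, margin }) {
  // Expand the viewport by the margin, clamped to the layer bounds.
  const left = Math.max(0, viewport.x - margin);
  const top = Math.max(0, viewport.y - margin);
  const right = Math.min(layerWidth, viewport.x + viewport.width + margin);
  const bottom = Math.min(layerHeight, viewport.y + viewport.height + margin);

  const tiles = [];
  for (let row = Math.floor(top / tileSize); row * tileSize < bottom; row++) {
    for (let col = Math.floor(left / tileSize); col * tileSize < right; col++) {
      tiles.push({ col, row });
    }
  }
  return tiles;
}

const selected = tilesToRaster({
  layerWidth: 1024,
  layerHeight: 10000,
  tileSize: 256,
  viewport: { x: 0, y: 1000, width: 1024, height: 768 },
  margin: 256,
});
const totalTiles = Math.ceil(1024 / 256) * Math.ceil(10000 / 256);
console.log(selected.length, totalTiles);
```

For this tall layer only a small fraction of the tiles needs rasterizing at any scroll position, which is what keeps GPU memory bounded.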

Image Decoding: Image decode is often the most expensive raster operation. It runs on separate decode threads with dedicated caches for software vs. GPU paths.

Limitation: GPU rasterization is single-threaded due to GPU context locks. Image decoding parallelizes, but pixel generation is sequential per tile.

The pending tree (staging rasterization results) becomes the active tree (ready for drawing).

Three Tree States:

| Tree | Purpose |
|---|---|
| Main | Source of truth for layers (main thread) |
| Pending | Staging area for rasterization work |
| Active | Ready for drawing |

Why Two Compositor Trees: The pending/active split enables atomic visual updates. The active tree remains drawable (scrollable, animatable) while the pending tree completes rasterization. Once ready, activation is instant.
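The pending/active split behaves like double buffering. The following is a minimal sketch with illustrative names (`CompositorTrees`, `requiredTiles`), not Chromium's API.

```javascript
// Sketch of pending -> active activation (names are illustrative).
class CompositorTrees {
  constructor(initial) {
    this.active = initial; // always drawable
    this.pending = null;   // still being rasterized
  }
  commit(tree) {
    this.pending = { ...tree, rasterizedTiles: 0 };
  }
  rasterizeOneTile() {
    if (this.pending) this.pending.rasterizedTiles++;
  }
  tryActivate() {
    const p = this.pending;
    if (!p || p.rasterizedTiles < p.requiredTiles) return false;
    this.active = p; // single reference swap, so the update is atomic
    this.pending = null;
    return true;
  }
  draw() {
    return this.active.frame;
  }
}

const trees = new CompositorTrees({ frame: 1, requiredTiles: 0, rasterizedTiles: 0 });
trees.commit({ frame: 2, requiredTiles: 2 });
trees.rasterizeOneTile();
const drawnWhileRastering = trees.draw();
const activatedEarly = trees.tryActivate();
trees.rasterizeOneTile();
const activated = trees.tryActivate();
const drawnAfter = trees.draw();
console.log(drawnWhileRastering, activatedEarly, activated, drawnAfter);
```

While frame 2 is only partly rasterized, drawing keeps using frame 1; the switch happens in one step once all required tiles are ready.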

The Compositor thread assembles rasterized tiles into a single frame based on Property Trees.

Key Details:

  • Off-Main-Thread: Scrolling and compositor-driven animations (transform, opacity) happen here without main thread involvement.
  • Draw Quads: The compositor produces DrawQuad objects describing how to render each tile, ordered back-to-front.
  • RenderPasses: Complex effects (masks, filters, clips on rotated content) use intermediate render passes.

Compositor-Only Animations: Animations affecting only transform or opacity can run at 60fps even with a busy main thread because the compositor has everything needed in the property trees. This is why will-change: transform exists—it hints the browser to promote the element to its own layer.

The Viz process issues the final GPU commands to render the composited frame.

Key Details:

  • Viz Process: A separate process that receives compositor frames from all sources (multiple tabs, browser UI) and aggregates them.
  • VSync Synchronization: Drawing synchronizes with the display’s refresh rate (60Hz, 120Hz, 144Hz) to avoid tearing.
  • Final Aggregation: SurfaceAggregator combines frames; DirectRenderer executes the actual GL/Vulkan/Metal commands.

Process Isolation: Viz runs in a separate process so GPU driver crashes don’t take down the entire browser.

Interaction to Next Paint (INP) is a Core Web Vital measuring responsiveness. It captures the latency from user interaction to the next visual update, comprising three phases:

| Phase | Description | Pipeline Impact |
|---|---|---|
| Input Delay | Time before event handlers run | Blocked by main thread work (long tasks, style, layout) |
| Processing Time | Event handler execution | JavaScript in event listeners |
| Presentation Delay | Handler completion → frame displayed | Style → Layout → Paint → Commit → Raster → Composite → Draw |

Thresholds (75th percentile):

  • Good: ≤200ms
  • Needs Improvement: 201–500ms
  • Poor: >500ms
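The thresholds and the three-phase breakdown translate directly into code. This sketch uses hypothetical sample values and a nearest-rank percentile; field tooling may compute percentiles differently.

```javascript
// Sketch: rate an INP value against the thresholds above.
function rateInp(ms) {
  if (ms <= 200) return "good";
  if (ms <= 500) return "needs-improvement";
  return "poor";
}

// Nearest-rank percentile; real tooling may interpolate differently.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// One interaction, split into its three phases (hypothetical numbers, in ms).
const interaction = { inputDelay: 90, processingTime: 120, presentationDelay: 60 };
const latency =
  interaction.inputDelay + interaction.processingTime + interaction.presentationDelay;

// Hypothetical INP values from eight page visits.
const samples = [80, 120, 150, 180, 190, 210, 450, 640];
const p75 = percentile(samples, 75);
console.log(latency, rateInp(latency), p75, rateInp(p75));
```

Even though most of the sampled visits are under 200ms, the 75th percentile lands just above the threshold, so the page as a whole would be rated as needing improvement.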

Why Architecture Matters: RenderingNG’s thread isolation means compositor-driven work (scrolling, transform/opacity animations) doesn’t contribute to input delay. The main thread stays available for event processing because the compositor handles visual updates independently.

Optimization Strategies:

  1. Minimize main thread work: Long tasks (>50ms) delay both input handling and rendering
  2. Use compositor-friendly properties: transform and opacity animations don’t require layout or paint
  3. Avoid forced synchronous layout: Batch DOM reads before writes
  4. Use content-visibility: auto: Defers rendering of off-screen content

| Process | Responsibility | Count |
|---|---|---|
| Browser | UI chrome, navigation, input routing | 1 |
| Renderer | Page rendering, JS execution | 1 per site (site isolation) |
| Viz | GPU operations, final composition | 1 |

The renderer process is sandboxed with minimal OS access. The Viz process handles all GPU communication, isolating graphics driver instability from the rest of the browser.

  • Tiling: GPU memory is managed via tiles. Only visible tiles consume GPU resources.
  • Tile Priority: Tiles are prioritized (visible > soon-visible > prefetch). Under memory pressure, low-priority tiles are discarded.
  • Layer Squashing: The browser merges layers when possible to reduce memory overhead.

The main thread uses priority-based scheduling:

  1. Discrete input events (click, keydown): Highest priority
  2. Continuous input events (scroll, mousemove): High priority
  3. Rendering updates (rAF, style, layout, paint): Normal priority
  4. Background work (timers, idle callbacks): Lower priority
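Priority-based selection can be sketched with a small queue. The priority names and `MainThreadQueue` class are illustrative assumptions; a production scheduler also has to prevent starvation of low-priority work, which this sketch ignores.

```javascript
// Sketch of priority-ordered task selection (priority names are illustrative).
const PRIORITY = { discreteInput: 0, continuousInput: 1, rendering: 2, background: 3 };

class MainThreadQueue {
  constructor() {
    this.tasks = [];
    this.seq = 0;
  }
  post(kind, label) {
    this.tasks.push({ priority: PRIORITY[kind], seq: this.seq++, label });
  }
  // Pick the highest-priority task; first-in, first-out among equal priorities.
  next() {
    if (this.tasks.length === 0) return null;
    this.tasks.sort((a, b) => a.priority - b.priority || a.seq - b.seq);
    return this.tasks.shift().label;
  }
}

const queue = new MainThreadQueue();
queue.post("background", "setTimeout callback");
queue.post("rendering", "requestAnimationFrame");
queue.post("continuousInput", "mousemove");
queue.post("discreteInput", "click");

const order = [];
for (let label = queue.next(); label !== null; label = queue.next()) order.push(label);
console.log(order);
```

The click handler runs first even though it was posted last, which is the behavior that keeps input delay low when the queue is busy.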

High Latency Mode: When the main thread can’t meet frame deadlines, the scheduler increases pipelining—trading latency for throughput by allowing more frames in flight.


Prerequisites:

  • Familiarity with the single-threaded event loop model of JavaScript
  • Understanding of GPU vs. CPU execution and memory models
  • Basic knowledge of CSS cascade, specificity, and inheritance

Key Takeaways:

  • RenderingNG (2014–2021) replaced monolithic rendering with a pipelined, multi-threaded architecture
  • Property Trees enable O(1) compositor-driven animations by decoupling transform, clip, effect, and scroll from the layer hierarchy
  • The main thread produces immutable outputs (DOM, styles, layout, paint); the compositor consumes them independently
  • Forced synchronous layout occurs when reading geometry properties (e.g., offsetWidth) after style/DOM changes
  • INP measures the full pipeline latency from interaction to visual update; architecture directly impacts all three phases

Glossary:

  • CRP (Critical Rendering Path): The sequence of stages from HTML/CSS/JS to pixels on screen
  • DOM (Document Object Model): Tree representation of HTML structure
  • CSSOM (CSS Object Model): Tree representation of parsed CSS rules
  • ComputedStyle: Resolved CSS property values for a node after cascade, inheritance, and calculation
  • LayoutObject Tree: Mutable tree created during style recalc; establishes layout order; points to DOM nodes
  • Fragment Tree: Immutable tree of PhysicalFragment objects with resolved geometry; output of layout
  • Property Trees: Separate tree structures for transform, clip, effect, and scroll properties
  • Display Lists: Recorded drawing commands (not pixels) produced by the paint stage
  • Reflow: Synonym for layout recalculation
  • Repaint: Re-recording paint commands when visual styles change without geometry changes
  • Viz: Chromium’s GPU process that aggregates compositor frames and issues draw calls
  • FOUC (Flash of Unstyled Content): Visual artifact when content renders before CSS loads
  • INP (Interaction to Next Paint): Core Web Vital measuring responsiveness from input to visual update
