Critical Rendering Path: Rasterization
Rasterization is the process where the browser converts recorded display lists into actual pixels—bitmaps for software raster or GPU textures for hardware-accelerated paths. This stage marks the transition from abstract paint commands to concrete visual data. In Chromium, rasterization is managed by the compositor thread and executed by worker threads, ensuring smooth interactions even when the main thread is saturated with JavaScript or layout work.
Abstract
Rasterization converts display lists into pixels through a tiled, prioritized, multi-threaded architecture designed around three constraints: GPU memory is finite, users scroll faster than pixels can be drawn, and the main thread must remain free for JavaScript.
The Core Model:
- Tiling: Layers are divided into fixed-size tiles (256×256 px for software, viewport-width × ¼ viewport-height for GPU raster) to bound memory usage and enable incremental rasterization
- Dual-Tree Architecture: Pending tree rasterizes new content while active tree continues drawing—activation only occurs when tiles are ready, preventing checkerboard artifacts
- Prioritization: Tiles are binned by urgency (NOW, SOON, EVENTUALLY, NEVER) based on viewport distance and scroll velocity, with GPU memory budget distributed in priority order
- Out-of-Process Execution: The Viz process owns GPU resources; renderer processes serialize paint commands over IPC (Inter-Process Communication) for security isolation
The Evolution: Chromium is transitioning from Ganesh (OpenGL-centric, single-threaded) to Graphite (Vulkan/Metal/D3D12, multi-threaded with depth testing). Graphite achieves ~15% performance gains through reduced overdraw and pipeline pre-compilation.
The Mechanics of Rasterization
Rasterization does not happen all at once for the entire page. The browser employs a tiling system, dual-tree architecture, and priority scheduling to handle documents that may be orders of magnitude larger than available GPU memory.
From Display Lists to Tiles
Once the Paint stage records drawing commands into cc::PaintRecord (a serializable sequence of Skia operations), the compositor thread takes ownership. Rather than rasterizing entire layers, it decomposes them into tiles.
Why Tiling? A long-scrolling page might be 50,000 pixels tall. Rasterizing this into a single texture would:
- Exceed GPU memory limits (a 4K layer consumes ~33MB of VRAM (Video Random Access Memory))
- Create massive latency before any pixels appear
- Waste resources on off-screen content the user may never see
Tile Sizes vary by rasterization mode:
| Mode | Tile Dimensions | Rationale |
|---|---|---|
| Software Raster | 256×256 px | Small tiles allow fine-grained priority; worker threads can complete tiles quickly |
| GPU Raster | Viewport width × ¼ viewport height | Larger tiles reduce draw call overhead; GPU handles large textures efficiently |
Tiles are managed by PictureLayerImpl, which maintains multiple PictureLayerTiling objects at different scale factors. A 512×512 layer at 1:1 scale produces four 256×256 tiles, but the same layer might have a single 256×256 tile at 1:2 scale for lower-resolution fallbacks during fast scrolling.
The Dual-Tree Architecture
A critical design decision in Chromium’s compositor is the separation between pending and active layer trees:
- Pending Tree: Receives new content from commits; tiles are rasterized here
- Active Tree: Currently being displayed; animations and scrolling read from this tree
- Recycle Tree: Cached pending tree for allocation reuse
Why Two Trees? Without this separation, activating a new commit would immediately expose partially-rasterized content as “checkerboard” artifacts (gray or white rectangles where tiles haven’t finished). The dual-tree design ensures activation only occurs when the pending tree’s tiles are sufficiently rasterized.
The TileManager controls activation timing. It tracks which tiles are required for the current viewport, which are desirable for smooth scrolling, and which can be deferred. Activation proceeds only when “NOW” priority tiles are ready.
Tile Prioritization
Tiles are assigned to priority bins based on heuristics:
| Priority Bin | Criteria | Treatment |
|---|---|---|
| NOW | Visible in viewport | Must complete before activation |
| SOON | Within ~1 viewport distance | High priority; prevents checkerboard on scroll |
| EVENTUALLY | Further from viewport | Rasterized when workers are idle |
| NEVER | Off-screen, no scroll path | Skipped entirely; memory reclaimed |
The TileManager distributes the GPU memory budget in priority order. If memory is constrained, lower-priority tiles are evicted or never rasterized. Scroll velocity affects binning—fast scrolls increase the “SOON” radius.
Rasterization Paths: Software vs. GPU
Chromium supports two rasterization paths, selected based on device capabilities, content complexity, and compositor settings.
Software Rasterization
Worker threads execute paint commands using Skia’s CPU rasterizer, producing bitmaps in shared memory. The SoftwareImageDecodeCache handles image decode, scaling, and color correction as prerequisite tasks.
Buffer Providers:
- ZeroCopyRasterBufferProvider: Maps GPU memory directly; zero CPU-to-GPU copy
- OneCopyRasterBufferProvider: Rasterizes to shared memory, then uploads to GPU; required when direct mapping isn’t available
Software raster remains the fallback when GPU acceleration fails (driver bugs, unsupported content) and is sometimes faster for very simple content where GPU overhead dominates.
GPU Rasterization (OOP-R)
Modern Chromium uses Out-of-Process Rasterization (OOP-R) (Out-of-Process Rasterization). The renderer process doesn’t execute GPU commands directly—it serializes paint operations into a command buffer that the Viz process (GPU process) executes.
Why OOP-R?
- Security: Renderer processes are sandboxed; they cannot access platform 3D APIs directly
- Stability: GPU driver crashes don’t take down the renderer
- Parallelism: CPU and GPU work can overlap across process boundaries
The Viz process uses Skia to execute commands against the actual GPU. Resources are shared via mailboxes (opaque identifiers) and synchronized through sync tokens.
Ganesh vs. Graphite: The Backend Evolution
Skia’s GPU backend has evolved significantly:
Ganesh (Legacy):
- Designed around OpenGL semantics
- Single-threaded command submission
- No depth testing—overdraw handled by painter’s algorithm
- Specialized shader pipelines created performance cliffs (unpredictable hitches when new pipelines compile during animation)
Graphite (Current/Future):
- Built for Vulkan, Metal, and D3D12
- Multi-threaded by default: independent
Recorderobjects produceRecordinginstances on worker threads - Depth testing for 2D: each draw receives a z-value, allowing opaque objects to be reordered while the depth buffer maintains correctness—reduces overdraw
- Consolidated pipeline compilation at startup, not during animation
As of Chrome 125+ (mid-2024), Graphite is enabled by default on Apple Silicon Macs. Windows support (via Dawn’s D3D11/D3D12 backends) is in progress. The flag
--skia-graphite-backendenables it on other platforms.
Performance Impact: Graphite achieves ~15% improvement on MotionMark 1.3 benchmarks on M3 MacBooks, with measurable improvements in INP (Interaction to Next Paint), LCP (Largest Contentful Paint), and dropped frame rates.
Layerization and Promotion
The compositor decides which parts of the DOM (Document Object Model) should live on their own composited layers. This decision, called layer promotion, trades memory for animation performance.
Promotion Criteria
Elements are promoted when they meet criteria suggesting frequent changes or GPU-native content:
| Trigger | Example | Why |
|---|---|---|
| Explicit hint | will-change: transform | Developer signals intent to animate |
| 3D transforms | translate3d(), perspective | Already GPU-native operations |
| Hardware content | <video>, <canvas>, <iframe> | Content rendered by GPU or separate process |
| Opacity animation | opacity in CSS animation | Opacity is a compositor-only property |
| Overlap correction | Element above a promoted layer | Prevents incorrect z-ordering artifacts |
The Overlap Problem
When a promoted element sits above a non-promoted element in z-order, the browser may force-promote the lower element to maintain correctness. Without this, the compositor would blend layers incorrectly since it doesn’t have z-buffer information across layers.
This can cascade: one promoted element causes neighbors to promote, which causes their neighbors to promote—resulting in layer explosion.
Layer Squashing
To mitigate layer explosion, the compositor uses squashing: multiple overlapping elements that would be promoted for overlap reasons are merged into a single composited layer when possible. This bounds memory growth while preserving correctness.
When Squashing Fails:
- Different blend modes between elements
- Different opacity values
- Transform animations that would require re-rasterization
- Elements requiring different scroll containers
Why Graphics Layers Enable 60fps
The primary benefit of layerization is bypassing the main thread for common interactions.
Compositor-Only Properties
When an element is on its own layer and you animate compositor-only properties (transform, opacity), the pipeline shortcuts to:
- Main Thread: Idle (or busy with JS (JavaScript))
- Compositor Thread: Receives animation tick, updates transform/opacity values in property trees
- GPU: Re-composites existing textures with new transformation matrix
No re-layout, no re-paint, no re-raster. The textures already exist; only the final blend changes.
/* ❌ Triggers Layout → Paint → Raster → Composite */.animate-position { transition: top 0.2s;}
/* ✅ Triggers only Composite */.animate-transform { transition: transform 0.2s;}Real-World Example: Fixed Headers
A position: fixed header is typically promoted to its own layer. During scroll:
- The content layer’s texture is shifted by the scroll offset
- The header layer’s texture remains static at viewport position
- The display compositor blends these textures at different offsets
The GPU performs a simple texture blend—no main thread involvement, no rasterization.
Operational Trade-offs and Failure Modes
Layer promotion and GPU rasterization are not free. Understanding the costs prevents performance regressions.
Memory Pressure
Each composited layer consumes GPU memory proportional to its pixel area:
Formula: Width × Height × 4 bytes (RGBA (Red, Green, Blue, Alpha))
| Layer Size | Memory (RGBA) |
|---|---|
| 1920×1080 (Full HD) | ~8 MB |
| 2560×1440 (QHD) | ~14 MB |
| 3840×2160 (4K) | ~33 MB |
Mobile devices with 2-4 GB total RAM and shared GPU memory can exhaust resources quickly. Symptoms: compositor falls back to software raster, animations stutter, browser process crashes.
Texture Upload Latency
When content changes (e.g., background-color), the browser must:
- Re-rasterize affected tiles
- Upload new texture data from CPU to GPU memory
On memory-bandwidth-constrained devices, uploading a single 512×512 texture can take 3-5ms—enough to drop a frame. The compositor throttles uploads to avoid this, but rapid content changes can still cause jank.
The Checkerboard Effect
When scroll velocity exceeds rasterization throughput, the compositor displays the active tree’s textures while the pending tree hasn’t finished rasterizing new viewport content. Users see empty rectangles (historically gray checkerboard patterns).
Mitigations:
- Pre-rasterize tiles outside the viewport (configurable radius)
- Use lower-resolution fallback tilings during fast scroll
- Async image decode prevents blocking raster on image data
When Mitigations Fail:
- Extremely fast flings (flick-scroll)
- Complex content (heavy SVG, many layers)
- Memory pressure evicting pre-rasterized tiles
GPU Process Crashes
Since rasterization runs in the Viz process with real GPU access, driver bugs can crash it. Chromium handles this:
- Viz process restarts automatically
- Renderer processes reconnect
- Visible as brief black flash or texture loss
Frequent crashes may trigger fallback to software rendering for stability.
Conclusion
Rasterization is the heavy-lifting stage where abstract paint commands become concrete pixels. The tiled, dual-tree, priority-based architecture exists to solve fundamental constraints: GPU memory is limited, users scroll unpredictably fast, and the main thread must remain available for JavaScript.
For production optimization:
- Minimize layer count: Each layer costs memory; use
will-changesparingly - Prefer compositor-only animations:
transformandopacityskip the entire main-thread pipeline - Avoid unnecessary repaints: Color and background changes trigger texture re-upload
- Test on constrained devices: Desktop GPU memory hides problems that manifest on mobile
The transition from Ganesh to Graphite represents Chromium’s investment in modern GPU APIs, multi-threaded rasterization, and reduced overdraw—improvements that compound as web content grows more complex.
Appendix
Prerequisites
- Paint Stage: Understanding how display lists (paint records) are generated
- Compositing: How layers are assembled into the final frame
- GPU Architecture Basics: Distinction between system RAM and VRAM; texture upload costs
Terminology
| Term | Definition |
|---|---|
| OOP-R | Out-of-Process Rasterization; Skia executes in the Viz (GPU) process, not the renderer |
| Viz Process | Chromium’s GPU process; owns GPU resources, runs display compositor |
| VRAM | Video RAM; dedicated GPU memory for textures |
| Skia | Open-source 2D graphics library used by Chrome, Android, and Flutter |
| Ganesh | Skia’s legacy OpenGL-based GPU backend |
| Graphite | Skia’s modern GPU backend for Vulkan/Metal/D3D12; multi-threaded with depth testing |
| Tiling | Dividing large surfaces into fixed-size rectangles for independent rasterization |
| Checkerboarding | Visual artifact when tiles haven’t been rasterized before becoming visible |
| Layer Squashing | Merging multiple overlapping promoted elements into a single layer to reduce memory |
| Compositor-Only Property | CSS properties (transform, opacity) that can be animated without main thread involvement |
Summary
- Rasterization converts paint records into GPU textures (or CPU bitmaps)
- Tiling bounds memory usage and enables incremental rasterization (256×256 px software, viewport-based GPU)
- Dual-tree architecture (pending/active) prevents checkerboard artifacts during commits
- Tile prioritization (NOW/SOON/EVENTUALLY/NEVER) ensures viewport content renders first
- OOP-R isolates GPU operations in the Viz process for security and stability
- Graphite replaces Ganesh with multi-threaded, depth-tested rasterization (~15% gains)
- Layer promotion enables 60fps animations but costs memory; squashing mitigates explosion
- Trade-offs: VRAM pressure, texture upload latency, checkerboarding on fast scroll
References
- Chromium: How cc Works — Definitive guide to Chrome’s compositor architecture
- Chromium: Impl-Side Painting — Dual-tree architecture and tile prioritization
- Chromium: RenderingNG Architecture — Process model and Viz integration
- Chromium: GPU Accelerated Compositing — Layer promotion criteria and texture management
- Chromium Blog: Introducing Skia Graphite — Graphite architecture and performance improvements
- W3C: CSS Compositing and Blending Level 1 — Specification for layer compositing operations
- web.dev: Stick to Compositor-Only Properties — Practical guidance on layer management
- Chrome Developers: Inside look at modern web browser (Part 3) — Compositing and rasterization overview
Read more
-
Previous
Critical Rendering Path: Layerize
Browser & Runtime Internals / Critical Rendering Path 9 min readThe Layerize stage converts paint chunks into composited layers (cc::Layer objects), determining how display items should be grouped for independent rasterization and animation. This process happens after Paint produces the paint artifact and before Rasterization converts those layers into GPU textures.
-
Next
Critical Rendering Path: Compositing
Browser & Runtime Internals / Critical Rendering Path 13 min readHow the compositor thread assembles rasterized layers into compositor frames and coordinates with the Viz process to display the final pixels on screen.