
Web Performance Optimization: Overview and Playbook

A playbook-style entry point to the Web Performance series. The goal is a small set of mental models (a CWV scorecard, a layered dependency stack, and a “which metric is failing → which layer to fix” decision tree) so you can pick the right deep-dive article instead of optimising blindly.

Web performance optimization layers and their impact on Core Web Vitals
Each layer drives one or more Core Web Vitals; a weak upstream layer caps what every downstream optimisation can deliver.

Abstract

Web performance optimisation reduces to three user-centric questions: how fast does the main content appear? (Largest Contentful Paint, LCP), how quickly does the page respond to interaction? (Interaction to Next Paint, INP), and does the layout stay stable while loading? (Cumulative Layout Shift, CLS). These three Core Web Vitals are Google’s user-experience ranking signals, measured at the 75th percentile (p75) of real user visits and assessed separately for mobile and desktop. A page or origin passes the assessment for a device class only when its p75 is “good” on all three metrics.

The optimisation stack is a layered dependency chain — each layer caps what the next can achieve:

  1. Infrastructure (DNS, HTTP/3, CDN, caching) determines the floor for every metric. You cannot optimise your way out of a slow origin or a missing edge cache.
  2. JavaScript (bundle size, long tasks, workers) directly gates INP. The 200 ms INP budget at p75 forces you to break work into chunks and move computation off the main thread.
  3. CSS & Typography (critical path, containment, font loading) affect both LCP and CLS. Inlined critical CSS unblocks first paint; font-metric overrides eliminate text-swap layout shift.
  4. Images (formats, responsive sizing, priority hints) usually dominate LCP on content-heavy pages. AVIF/WebP and fetchpriority="high" target the actual LCP element.

Each layer has diminishing returns without the previous one being sound. A perfectly code-split bundle does not help if TTFB is 2 seconds. Zero-CLS font loading does not help if JavaScript blocks the main thread for 500 ms during the user’s first tap.

Core Web Vitals Thresholds

As of March 12, 2024, the Core Web Vitals are LCP, INP, and CLS — capturing loading, interactivity, and visual stability. INP replaced FID on that date and the deprecation period for FID-based APIs ran to September 9, 2024.

| Metric | Good | Needs Improvement | Poor | What It Measures |
| --- | --- | --- | --- | --- |
| LCP (Largest Contentful Paint) | ≤ 2.5 s | 2.5–4.0 s | > 4.0 s | When the main content element finishes painting |
| INP (Interaction to Next Paint) | ≤ 200 ms | 200–500 ms | > 500 ms | The slowest interaction’s input → next paint duration |
| CLS (Cumulative Layout Shift) | ≤ 0.1 | 0.1–0.25 | > 0.25 | Largest windowed sum of unexpected layout shifts |
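
As a rough illustration of how the thresholds above turn a p75 value into a rating, here is a minimal sketch. The names `rate` and `passesAssessment` are hypothetical helpers for this example, not part of any library, and LCP and INP are assumed to be expressed in milliseconds.

```typescript
// Thresholds copied from the table above: [upper bound of "good", upper bound of "needs improvement"].
type Metric = "LCP" | "INP" | "CLS";
type Rating = "good" | "needs-improvement" | "poor";

const THRESHOLDS: Record<Metric, [number, number]> = {
  LCP: [2500, 4000], // milliseconds
  INP: [200, 500],   // milliseconds
  CLS: [0.1, 0.25],  // unitless score
};

// Rate a single p75 value against the thresholds for its metric.
function rate(metric: Metric, p75: number): Rating {
  const [good, needsImprovement] = THRESHOLDS[metric];
  if (p75 <= good) return "good";
  return p75 <= needsImprovement ? "needs-improvement" : "poor";
}

// A slice of field data passes only when every metric's p75 rates "good".
function passesAssessment(p75: Record<Metric, number>): boolean {
  return (Object.keys(THRESHOLDS) as Metric[]).every(
    (metric) => rate(metric, p75[metric]) === "good",
  );
}
```

For example, a slice with an LCP of 2 400 ms, an INP of 220 ms, and a CLS of 0.05 fails the assessment because INP lands in "needs improvement", even though the other two metrics are good.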

Supporting metrics (not Core Web Vitals):

| Metric | Good | Needs Improvement | Poor | Why It Matters |
| --- | --- | --- | --- | --- |
| TTFB (Time to First Byte) | ≤ 800 ms | 800–1 800 ms | > 1 800 ms | Sets the floor for FCP and LCP |
| FCP (First Contentful Paint) | ≤ 1.8 s | 1.8–3.0 s | > 3.0 s | First paint of any DOM content; LCP can’t be earlier |

Note

The web.dev “good” TTFB threshold (≤ 800 ms) is deliberately loose so that users at the 75th percentile can still get a “good” FCP under realistic network conditions. As an engineering target for a CDN-fronted, cached origin, treat ≤ 200 ms as the aspirational bar and 800 ms as the don’t-regress red line.

TTFB is excluded from the Core Web Vitals because fast server response does not guarantee good user experience — a page can have 100 ms TTFB but 5 s of JavaScript blocking LCP. However, slow TTFB makes good LCP nearly impossible: at p75, every millisecond of TTFB is a millisecond unavailable for the rest of the LCP budget.

Why These Metrics?

Google’s threshold methodology targets p75 with two competing constraints: achievability (a meaningful fraction of sites can reach “good” with focused effort) and meaningfulness (users perceive the difference). 2.5 s LCP corresponds to lab and field research on perceived load; 200 ms INP is at the upper end of the 100–200 ms window where interactions still feel instant; 0.1 CLS is the largest shift a typical user does not consciously notice.

Important

INP is harder than FID was. First Input Delay measured only the input delay of the first interaction — the time before the handler started. INP measures the full input → processing → presentation duration of every click, tap, and keyboard interaction, then reports the longest one — except on pages with 50 or more interactions, where the 98th-percentile interaction is reported instead (one outlier dropped per 50 interactions). Most sites that comfortably passed FID failed INP at first because FID ignored both subsequent interactions and processing time entirely.
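
The selection rule described above can be sketched as a small function. This is an approximation for illustration only: `estimateInp` is a hypothetical name, it assumes you already have the full latency (input delay + processing + presentation) of each interaction in milliseconds, and browsers compute the real value incrementally rather than by sorting every interaction.

```typescript
// Pick the interaction latency that would be reported as INP under the
// "drop one outlier per 50 interactions" rule described in the text.
function estimateInp(durationsMs: number[]): number | undefined {
  if (durationsMs.length === 0) return undefined; // no interactions, no INP value

  // Sort a copy from slowest to fastest so index 0 is the worst interaction.
  const slowestFirst = [...durationsMs].sort((a, b) => b - a);

  // Skip one of the slowest interactions for every 50 recorded.
  const skipped = Math.floor(slowestFirst.length / 50);
  return slowestFirst[Math.min(skipped, slowestFirst.length - 1)];
}
```

With three interactions of 40, 320, and 80 ms the result is 320 ms. With 60 interactions the single slowest one is ignored and the second slowest is reported.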

Which metric is failing? Start with the bottleneck

Before diving into a layer, identify which metric is failing on field data and which layer is the most likely cause. The decision tree below routes from a failing CWV to the deep-dive article that covers the relevant fix.

From failing Core Web Vital to the right deep-dive article in the series
Use field (CrUX/RUM) data to pick the failing metric, then route to the layer that most often causes it. Skip to the article in the series that covers that layer.

1. Infrastructure & Architecture

Infrastructure determines the performance floor. Network round trips, TLS handshakes, server processing, and cache misses create latency that no amount of frontend optimisation can overcome. A 2 s TTFB leaves only 500 ms for everything else against a 2.5 s LCP target.
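
The budget arithmetic in that last sentence can be written down directly. The helper name `remainingLcpBudget` is hypothetical, and the sketch assumes both values are p75 figures in milliseconds.

```typescript
const LCP_GOOD_MS = 2500; // "good" LCP threshold from the table earlier in this article

// How much of the LCP budget is left once the first byte has arrived.
function remainingLcpBudget(ttfbMs: number, targetMs: number = LCP_GOOD_MS) {
  return {
    remainingMs: Math.max(0, targetMs - ttfbMs),     // time left for discovery, download, render
    consumedShare: Math.min(1, ttfbMs / targetMs),   // fraction of the budget spent waiting on the server
  };
}
```

A 2 000 ms TTFB leaves 500 ms and consumes 80 % of the budget; a 200 ms TTFB leaves 2 300 ms for everything else.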

Detailed coverage: Infrastructure Optimization for Web Performance

Quick reference

| Layer | Key technologies | Target metrics |
| --- | --- | --- |
| DNS | SVCB/HTTPS records (RFC 9460) | < 50 ms resolution |
| Protocol | HTTP/3 (RFC 9114) over QUIC (RFC 9000), TLS 1.3 | < 100 ms connection |
| Edge | CDN, edge functions | > 80 % origin offload |
| Origin | Load balancing, Redis, connection pooling | < 200 ms TTFB |
| Architecture | Islands, BFF, resumability | 50–80 % JS reduction (workload-dependent) |

Key techniques

  • DNS protocol discovery. HTTPS resource records (RFC 9460, published November 2023) advertise alpn=h3 so browsers attempt HTTP/3 on the first request rather than upgrading via Alt-Svc, saving 100–300 ms (Cloudflare).
  • HTTP/3 and QUIC. Eliminates TCP head-of-line blocking and merges the crypto and transport handshakes. On stable networks the page-load gain over HTTP/2 is modest (single-digit percent); the real wins come in lossy or high-RTT conditions, where Cloudpanel’s synthetic benchmarks at 15 % packet loss show ~55 % faster page load vs. HTTP/2. Cloudflare’s own production data showed 12.4 % faster TTFB and small page-load gains on average.
  • Edge computing. Runs personalisation, A/B routing, and auth at the CDN edge, removing an origin round-trip for the request paths that touch them.
  • BFF pattern. A backend-for-frontend collapses chatty client-to-service calls into one tailored response. Concrete savings vary by workload — typically a meaningful drop in request count and payload, especially for mobile and waterfall-heavy SPAs.
  • Multi-layer caching. RFC 9111-compliant edge cache + service worker + IndexedDB + origin object cache. Each layer answers a different request class (cold vs. warm vs. revisit).
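
To make the "different request class" idea from the caching bullet concrete, here is a minimal sketch that maps a request class to a `Cache-Control` value. The class names, the function name `cacheControlFor`, and the specific lifetimes (60 s, 300 s) are assumptions chosen for illustration; the right numbers depend on how often your content changes.

```typescript
// Hypothetical request classes for this example.
type RequestClass = "hashed-asset" | "html" | "api-personalised";

function cacheControlFor(kind: RequestClass): string {
  switch (kind) {
    case "hashed-asset":
      // Fingerprinted file names change whenever content changes, so cache for a year.
      return "public, max-age=31536000, immutable";
    case "html":
      // Browsers revalidate every time; shared caches may serve for 60 s
      // and refresh in the background for a further 300 s.
      return "public, max-age=0, s-maxage=60, stale-while-revalidate=300";
    case "api-personalised":
      // Per-user responses: shared caches must not store them,
      // and the browser revalidates before reuse.
      return "private, no-cache";
  }
}
```

The design choice here is to let the edge absorb HTML traffic with a short shared lifetime while long-lived, immutable assets never touch the origin on repeat visits.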

2. JavaScript Optimization

JavaScript directly gates INP. Every millisecond of main-thread blocking is a millisecond added to interaction latency. The 200 ms INP budget means: receive input, run handlers, let the browser paint — all within 200 ms. Tasks longer than 50 ms are flagged as Long Tasks and risk exceeding the budget on mid-tier devices.

Detailed coverage: JavaScript Performance Optimization

Quick reference

| Technique | Use case | Impact |
| --- | --- | --- |
| async / defer | Script loading | Unblock HTML parsing |
| Code splitting | Large bundles | Defer non-critical code from initial payload |
| scheduler.yield() | Long tasks > 50 ms | Yield to high-priority work without losing the queue position (Chrome 129+, Firefox 142+) |
| Web Workers | Heavy computation | Move work off the main thread (parsing, image decode, crypto) |
| React.memo / useMemo | Re-render cost | Skip unnecessary subtree work |

Key techniques

  • Script loading. defer for app code (parallel fetch, in-document-order execution after parse). async for independent scripts (analytics, third-party widgets).
  • Code splitting. Route-based with React.lazy() + Suspense; component-level for heavy widgets (rich text, charts).
  • Task scheduling. Use scheduler.yield() (Chrome 129+, Firefox 142+) over setTimeout(0) so the continuation jumps to the front of the queue after the browser handles input/render. Yield every ~5 ms or every N items in long loops.
  • Worker pools. A pool with bounded concurrency and a task queue beats spawning a worker per call. Use transferable objects to avoid copying large buffers.
  • Tree shaking. Mark packages "sideEffects": false, use ES modules end-to-end so bundlers can statically eliminate unused exports (CommonJS is runtime-resolved and resists this).
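
The task-scheduling bullet above can be sketched as a chunked loop that yields once a time slice is used up. `yieldToMain` and `mapInChunks` are hypothetical helper names; the sketch feature-detects `scheduler.yield()` and falls back to a zero-delay timeout where it is unavailable, and the 5 ms default slice mirrors the figure mentioned in the list.

```typescript
// Yield control back to the event loop. Prefer scheduler.yield() when the
// runtime provides it; otherwise fall back to a zero-delay timeout.
function yieldToMain(): Promise<void> {
  const scheduler = (globalThis as any).scheduler;
  if (scheduler && typeof scheduler.yield === "function") {
    return scheduler.yield();
  }
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

// Apply fn to every item, yielding whenever the current slice has run
// for at least budgetMs so input and rendering can be handled in between.
async function mapInChunks<T, R>(
  items: readonly T[],
  fn: (item: T) => R,
  budgetMs: number = 5,
): Promise<R[]> {
  const results: R[] = [];
  let sliceStart = Date.now();
  for (const item of items) {
    results.push(fn(item));
    if (Date.now() - sliceStart >= budgetMs) {
      await yieldToMain();
      sliceStart = Date.now(); // start a fresh slice after resuming
    }
  }
  return results;
}
```

Checking the clock after each item keeps the code simple; for very cheap per-item work you might check only every N items to avoid the overhead of reading the time on each iteration.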

3. CSS & Typography

CSS is render-blocking by default; web fonts cause layout shift when they swap from a fallback. The critical rendering path needs CSS before first paint, but only the CSS for above-the-fold content. Font loading is the most common preventable CLS source — fallback fonts have different metrics than web fonts, so text reflows on swap.

Detailed coverage: CSS and Typography Optimization

Quick reference

| Technique | Use case | Impact |
| --- | --- | --- |
| Critical CSS | Above-the-fold styles | Eliminate render-blocking on first paint |
| CSS containment | Layout isolation | Bound layout/paint to subtrees (CSS Containment Module Level 2) |
| WOFF2 + subset | Font delivery | 30–50 % smaller than WOFF; subset removes unused glyphs |
| font-display | Loading strategy | Control FOIT/FOUT trade-off |
| Metric overrides | Fallback matching | Zero-CLS font swap via size-adjust / ascent-override / descent-override |

Key techniques

  • Critical CSS. Inline above-the-fold styles within the ~14 KB initial congestion window so first paint is unblocked in a single round-trip. Defer the rest with the media="print" onload swap pattern.
  • CSS containment. contain: layout paint style isolates reflows and repaints to the contained subtree, which keeps a misbehaving widget from re-laying-out the entire page.
  • Compositor-only animation. Only animate transform and opacity to stay on the compositor thread (60 fps without main-thread work).
  • Font subsetting. Strip unused glyphs with pyftsubset (e.g., Latin-only). Single language subsets often save 60–90 % of file size.
  • Variable fonts. A single file for all weights/widths is usually smaller than three or more static files combined.
  • Font metric overrides. size-adjust, ascent-override, descent-override, line-gap-override on @font-face for the local fallback to match the web font’s box, eliminating font-swap CLS.
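
For the font metric overrides in the last bullet, one common way to derive the values is to scale the fallback so its average glyph width matches the web font, then express the web font's vertical metrics relative to that scaled em square. The sketch below follows that approach under stated assumptions: the metric values must come from the actual font files, and `fallbackOverrides` and `fallbackFontFace` are hypothetical helper names.

```typescript
interface FontMetrics {
  unitsPerEm: number;
  ascent: number;
  descent: number;   // usually negative in font files
  lineGap: number;
  xWidthAvg: number; // average glyph advance width, in font units
}

type FallbackMetrics = Pick<FontMetrics, "unitsPerEm" | "xWidthAvg">;

// Compute override ratios (1.0 === 100%) for a local fallback font.
function fallbackOverrides(web: FontMetrics, fallback: FallbackMetrics) {
  // Scale the fallback so its average character width matches the web font.
  const sizeAdjust =
    (web.xWidthAvg / web.unitsPerEm) / (fallback.xWidthAvg / fallback.unitsPerEm);
  // Vertical metrics are expressed against the scaled em square.
  const scaledEm = web.unitsPerEm * sizeAdjust;
  return {
    sizeAdjust,
    ascentOverride: web.ascent / scaledEm,
    descentOverride: Math.abs(web.descent) / scaledEm,
    lineGapOverride: web.lineGap / scaledEm,
  };
}

// Render the overrides as an @font-face rule for the adjusted fallback.
function fallbackFontFace(
  family: string,
  localFont: string,
  web: FontMetrics,
  fallback: FallbackMetrics,
): string {
  const o = fallbackOverrides(web, fallback);
  const pct = (ratio: number): string => `${(ratio * 100).toFixed(2)}%`;
  return [
    "@font-face {",
    `  font-family: "${family}";`,
    `  src: local("${localFont}");`,
    `  size-adjust: ${pct(o.sizeAdjust)};`,
    `  ascent-override: ${pct(o.ascentOverride)};`,
    `  descent-override: ${pct(o.descentOverride)};`,
    `  line-gap-override: ${pct(o.lineGapOverride)};`,
    "}",
  ].join("\n");
}
```

The generated family is then listed right after the web font in the `font-family` stack, so text rendered before the web font arrives occupies roughly the same box as it will after the swap.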

4. Image Optimization

Images are typically the LCP element on content-heavy pages. A 2 MB hero served as JPEG when an equivalent-quality AVIF would be 400 KB directly delays LCP. Beyond format, the browser must discover, request, download, and decode the image before it can paint — so loading strategy matters as much as bytes.

Detailed coverage: Image Optimization for Web Performance

Quick reference

| Technique | Use case | Impact |
| --- | --- | --- |
| AVIF / WebP | Modern browsers | ~30–50 % smaller than JPEG (AVIF), ~25–34 % (WebP) |
| srcset + sizes | Responsive images | Right resolution per viewport / DPR |
| loading="lazy" | Below-fold images | Defer fetch until near viewport |
| fetchpriority="high" | LCP images | Promote LCP candidate above other resources (web.dev) |
| decoding="async" | Non-blocking decode | Move decode off the main thread |

Format selection

| Format | Size vs JPEG (same quality) | Browser support (2026) | Best use case |
| --- | --- | --- | --- |
| AVIF | ~30–50 % smaller | ~94 % (caniuse) | HDR photos, rich media |
| WebP | ~25–34 % smaller | ~97 % (caniuse) | General photos & UI |
| JPEG | baseline | 100 % | Universal fallback |
| PNG | n/a (lossless) | 100 % | Graphics, transparency |

Key techniques

  • Picture element. Negotiate AVIF → WebP → JPEG via <picture><source type="image/avif"><source type="image/webp"><img></picture>. Pagesmith generates this automatically for local raster images in this repo.
  • Responsive images. srcset with width descriptors (image-480w.jpg 480w), sizes for layout hints ((min-width: 768px) 50vw, 100vw).
  • LCP images. loading="eager" + fetchpriority="high" + decoding="async" on the actual LCP element.
  • Below-the-fold images. loading="lazy". Browsers typically pre-fetch a viewport or two ahead.
  • Network-aware loading. Adjust quality / format based on navigator.connection.effectiveType where available.
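
The picture, responsive, and loading-attribute bullets above combine into one piece of markup. The sketch below builds it as a string; `pictureMarkup` is a hypothetical helper, and the file naming scheme (`<basePath>-<width>w.<ext>`) is an assumption modelled on the `image-480w.jpg` example in the list.

```typescript
interface ResponsiveImage {
  basePath: string;  // assumed path without width suffix or extension, e.g. "/img/hero"
  widths: number[];  // available rendition widths in pixels
  sizes: string;     // layout hint for the browser
  width: number;     // intrinsic dimensions, set explicitly to reserve space
  height: number;
  alt: string;
  isLcp: boolean;    // true only for the actual LCP element
}

// Escape characters that would break out of a double-quoted attribute.
const escapeAttr = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");

function srcset(img: ResponsiveImage, ext: string): string {
  return img.widths.map((w) => `${img.basePath}-${w}w.${ext} ${w}w`).join(", ");
}

function pictureMarkup(img: ResponsiveImage): string {
  const sizes = escapeAttr(img.sizes);
  // LCP images load eagerly at high priority; everything else is lazy.
  const priority = img.isLcp
    ? 'loading="eager" fetchpriority="high"'
    : 'loading="lazy"';
  const smallest = Math.min(...img.widths);
  return [
    "<picture>",
    `  <source type="image/avif" srcset="${srcset(img, "avif")}" sizes="${sizes}">`,
    `  <source type="image/webp" srcset="${srcset(img, "webp")}" sizes="${sizes}">`,
    `  <img src="${img.basePath}-${smallest}w.jpg" srcset="${srcset(img, "jpg")}" sizes="${sizes}"`,
    `       width="${img.width}" height="${img.height}" alt="${escapeAttr(img.alt)}"`,
    `       ${priority} decoding="async">`,
    "</picture>",
  ].join("\n");
}
```

Source order matters: the browser picks the first `<source>` whose type it supports, so AVIF comes before WebP and the `<img>` carries the JPEG fallback.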

5. Performance Monitoring

Continuous monitoring keeps the optimisations effective and catches regressions before they reach the field. The CWV thresholds are p75 metrics — without field data you cannot prove you pass them.

Detailed coverage: Core Web Vitals Measurement: Lab vs Field Data

Key signals to track

  • Core Web Vitals (field). LCP, INP, CLS via the web-vitals library or PerformanceObserver, sliced by device class and route. Field data — not Lighthouse — is what Search uses.
  • TTFB and origin health. Server response time and cache hit ratio at the edge and origin.
  • Bundle sizes. Track JS/CSS bytes per route in CI; fail the build on regression with size-limit or your bundler’s stats.
  • Resource budgets. Use Lighthouse performance budgets (budget.json) or your CDN’s analytics to enforce per-page byte counts.
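
Slicing field data "by device class and route" comes down to grouping samples and taking the 75th percentile of each group. The sketch below shows one way to do that on the collection side; the `Sample` shape and the function names are assumptions, and the nearest-rank percentile used here may differ slightly from how CrUX or a given RUM vendor interpolates.

```typescript
interface Sample {
  route: string;
  device: "mobile" | "desktop";
  metric: "LCP" | "INP" | "CLS";
  value: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Group samples by route, device class, and metric, then report p75 per group.
function p75BySlice(samples: Sample[]): Map<string, number> {
  const groups = new Map<string, number[]>();
  for (const s of samples) {
    const key = `${s.route}|${s.device}|${s.metric}`;
    const bucket = groups.get(key) ?? [];
    bucket.push(s.value);
    groups.set(key, bucket);
  }
  const result = new Map<string, number>();
  groups.forEach((values, key) => result.set(key, percentile(values, 75)));
  return result;
}
```

Keeping mobile and desktop in separate groups matters because a healthy desktop p75 can otherwise hide a failing mobile slice.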

Monitoring tools

| Tool | Purpose | Where it fits |
| --- | --- | --- |
| Lighthouse CI | Synthetic monitoring per PR | GitHub Actions / your CI |
| web-vitals + RUM | Field metrics from real users | Your analytics pipeline |
| CrUX (Chrome UX Report) | Public p75 field data per origin | PageSpeed Insights, CrUX dashboard |
| size-limit | Bundle size budgets | CI/CD pipeline |
| WebPageTest | Detailed waterfall + filmstrip | Manual diagnosis |

Implementation Checklist

Infrastructure

  • Enable HTTP/3 with HTTPS DNS records
  • Configure TLS 1.3 with 0-RTT resumption
  • Set up CDN with edge computing capabilities
  • Implement multi-layer caching (CDN + Service Worker + Redis)
  • Configure Brotli compression (level 11 static, level 4-5 dynamic)

JavaScript

  • Implement route-based code splitting
  • Use scheduler.yield() for tasks >50ms
  • Offload heavy computation to Web Workers
  • Configure tree shaking with ES modules
  • Set up bundle size budgets in CI

CSS & Typography

  • Extract and inline critical CSS (≤14KB)
  • Apply CSS containment to independent sections
  • Use WOFF2 format with subsetting
  • Implement font metric overrides for zero-CLS
  • Preload critical fonts with crossorigin attribute

Images

  • Serve AVIF/WebP with JPEG fallback via <picture>
  • Implement responsive images with srcset and sizes
  • Use fetchpriority="high" for LCP images
  • Apply loading="lazy" to below-fold images
  • Set explicit width/height to prevent CLS

Monitoring

  • Set up Lighthouse CI in GitHub Actions
  • Configure bundle size budgets with size-limit
  • Implement RUM with PerformanceObserver
  • Create performance dashboards and alerts

Performance Budget Reference

```json
{
  "resourceSizes": {
    "total": "500KB",
    "javascript": "150KB",
    "css": "50KB",
    "images": "200KB",
    "fonts": "75KB"
  },
  "metrics": {
    "lcp": "2.5s",
    "fcp": "1.8s",
    "ttfb": "200ms",
    "inp": "200ms",
    "cls": "0.1"
  }
}
```
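
A budget like the one above is only useful if something checks it. The sketch below treats the `resourceSizes` object as a plain key/value map and compares it with measured byte counts; `parseSize` and `overBudget` are hypothetical helpers, 1 KB is assumed to mean 1 024 bytes, and where the measured numbers come from (bundler stats, a CI script) is left open.

```typescript
// Multipliers for the size suffixes used in the budget (assumption: binary units).
const UNIT_BYTES: Record<string, number> = { B: 1, KB: 1024, MB: 1024 * 1024 };

// Convert a string such as "150KB" into a byte count.
function parseSize(text: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB)$/i.exec(text.trim());
  if (!match) throw new Error(`Unrecognised size: ${text}`);
  return Number(match[1]) * UNIT_BYTES[match[2].toUpperCase()];
}

// Return the budget keys whose measured size exceeds the allowed size.
function overBudget(
  budget: Record<string, string>,
  measuredBytes: Record<string, number>,
): string[] {
  return Object.keys(budget).filter(
    (key) => (measuredBytes[key] ?? 0) > parseSize(budget[key]),
  );
}
```

In a CI step, a non-empty result from `overBudget` would be the signal to fail the build and print which resource types regressed.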

Series Articles

| Article | Focus area | Key topics |
| --- | --- | --- |
| Infrastructure Optimization | Network & architecture | DNS, HTTP/3, CDN, edge, BFF, caching |
| JavaScript Optimization | Client-side performance | Code splitting, workers, React, scheduling |
| CSS & Typography | Rendering & fonts | Critical CSS, containment, font loading |
| Image Optimization | Media delivery | Formats, responsive, lazy loading |
| Core Web Vitals Measurement | Measurement | Lab vs. field, web-vitals library, RUM |

Practical takeaways

  • Measure before you optimise. Use field data (CrUX or your own RUM) to identify the failing metric at p75. Synthetic Lighthouse scores are a debugging aid, not a pass/fail signal.
  • Fix the layer that owns the metric. A site with good infrastructure but poor INP needs JavaScript work, not more CDN tuning. A bad LCP on a content-heavy page is almost always images or TTFB, not CSS.
  • Respect the dependency chain. Each layer caps what the next can deliver. A 2 s TTFB cannot be hidden by clever code splitting.
  • Track regressions in CI, not just dashboards. Bundle-size budgets and Lighthouse CI catch regressions before they reach users; CrUX aggregates a rolling 28-day window, so a regression can take weeks to show up fully in its p75.
  • Pick one metric per quarter. Most teams improve more by spending a quarter focused on INP than by spreading effort across all three CWV at once.

Appendix

Prerequisites

  • Working knowledge of the browser rendering pipeline (parse, layout, paint, composite)
  • Familiarity with HTTP/1.1, HTTP/2, and the basics of TLS
  • Mental model of the JavaScript execution model, the event loop, and microtasks

Summary

  • Core Web Vitals (as of March 12, 2024): LCP ≤ 2.5 s, INP ≤ 200 ms, CLS ≤ 0.1 — measured at the 75th percentile of real user visits.
  • Optimisation layers form a chain: Infrastructure → JavaScript → CSS/Typography → Images. Each layer caps the next.
  • Infrastructure: HTTP/3, edge CDN, multi-layer caching aim for TTFB ≤ 800 ms (web.dev “good”), with ≤ 200 ms as the engineering target on cached origins.
  • JavaScript: code splitting, scheduler.yield(), and Web Workers keep INP under 200 ms.
  • CSS & fonts: critical CSS inlining within the 14 KB initial congestion window; font metric overrides eliminate font-swap CLS.
  • Images: AVIF/WebP via <picture>, responsive srcset, fetchpriority="high" on the LCP element.
  • Measurement: field data (CrUX/RUM) is the source of truth; lab tools are for diagnosis.
