
Client Performance Monitoring

Measuring frontend performance in production requires capturing real user experience data—not just synthetic benchmarks. Lab tools like Lighthouse measure performance under controlled conditions, but users experience your application on varied devices, networks, and contexts. Real User Monitoring (RUM) bridges this gap by collecting performance metrics from actual browser sessions, enabling data-driven optimization where it matters most: in the field.

[Architecture diagram. Collection: Browser (Performance APIs → Metrics Collection → Beacon/sendBeacon) → Ingestion Endpoint. Analysis: Processing Pipeline → Aggregation → Dashboards and Alerting. Captured data: Core Web Vitals, custom metrics, errors.]

RUM architecture: browser APIs capture metrics, beacons transmit reliably on page unload, backend pipelines aggregate and surface insights.

Client performance monitoring centers on three measurement categories:

  1. Core Web Vitals: Google’s standardized metrics—LCP (Largest Contentful Paint), INP (Interaction to Next Paint, replaced FID in March 2024), and CLS (Cumulative Layout Shift). These measure loading, interactivity, and visual stability respectively, evaluated at the 75th percentile.

  2. Performance APIs: Browser-native interfaces (PerformanceObserver, Navigation Timing, Resource Timing, Event Timing) that expose timing data with sub-millisecond precision. The web-vitals library abstracts edge cases these raw APIs miss.

  3. Data transmission: navigator.sendBeacon() and fetch with keepalive enable reliable transmission during page unload—critical because metrics like CLS and INP finalize only when the user leaves.

Key architectural decisions:

  • Sampling strategy: 100% capture for errors, 1-10% for performance metrics to control costs
  • Attribution data: Include element selectors and interaction targets to make metrics actionable
  • Session windowing: CLS uses session windows (max 5s, 1s gap); INP reports the worst interaction minus outliers

The distinction between lab and field data is fundamental: lab tools provide reproducible debugging; field data reveals what users actually experience.

LCP measures perceived load speed by tracking when the largest visible content element renders. Per the web.dev specification, qualifying elements include:

  • <img> elements (first frame for animated images)
  • <image> elements within <svg>
  • <video> elements (poster image or first displayed frame)
  • Elements with CSS background-image via url()
  • Block-level elements containing text nodes

Size calculation rules:

  • Only the visible portion within the viewport counts
  • For resized images, the smaller of visible size or intrinsic size is used
  • Margins, padding, and borders are excluded
  • Low-entropy placeholders and fully transparent elements are filtered out

When LCP stops reporting:

The browser stops dispatching LCP entries when the user interacts (tap, scroll, keypress) because interaction typically changes the visible content. This means LCP captures the initial loading experience, not ongoing rendering.

lcp-measurement.ts
// Using the raw Performance API
new PerformanceObserver((entryList) => {
  const entries = entryList.getEntries()
  // The last entry is the current LCP candidate
  const lastEntry = entries[entries.length - 1] as LargestContentfulPaint
  console.log("LCP:", lastEntry.startTime)
  console.log("Element:", lastEntry.element)
  console.log("Size:", lastEntry.size)
  console.log("URL:", lastEntry.url) // For images
}).observe({ type: "largest-contentful-paint", buffered: true })

Thresholds:

Rating | Value
Good | ≤ 2.5s
Needs Improvement | 2.5s – 4s
Poor | > 4s

INP replaced FID (First Input Delay) as a Core Web Vital in March 2024. The key difference: FID measured only the first interaction’s input delay; INP measures the worst interaction latency across the entire page lifecycle.

Why INP is more comprehensive:

Chrome usage data shows 90% of user time on a page occurs after initial load. FID captured only first impressions; INP captures the full responsiveness experience.

Three phases of interaction latency:

User Input → [Input Delay] → [Processing Time] → [Presentation Delay] → Next Frame
  1. Input Delay: Time before event handlers execute (main thread blocked)
  2. Processing Time: Time spent executing all event handlers
  3. Presentation Delay: Time from handler completion to next frame paint
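
These phases can be read directly off raw Event Timing entries. A minimal sketch (the 16ms durationThreshold and the derived variable names are illustrative; the web-vitals library does this bookkeeping, including interaction grouping, for you):

event-timing-phases.ts
// Split each interaction into its three latency phases
new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as PerformanceEventTiming[]) {
    const inputDelay = entry.processingStart - entry.startTime
    const processingTime = entry.processingEnd - entry.processingStart
    // entry.duration spans startTime to the next paint (rounded to 8ms)
    const presentationDelay = entry.startTime + entry.duration - entry.processingEnd
    console.log(entry.name, { inputDelay, processingTime, presentationDelay })
  }
}).observe({ type: "event", durationThreshold: 16, buffered: true })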

Tracked interactions:

  • Mouse clicks
  • Touchscreen taps
  • Keyboard presses (physical and on-screen)

Excluded: Scrolling, hovering, zooming (these don’t trigger event handlers in the same way).

Final value calculation:

INP reports the worst interaction latency, with one outlier ignored per 50 interactions. This prevents a single anomalous interaction from skewing the metric while still capturing genuinely slow interactions.

inp-measurement.ts
// Using web-vitals library (recommended)
import { onINP } from "web-vitals/attribution"

onINP((metric) => {
  console.log("INP:", metric.value)
  console.log("Rating:", metric.rating)
  // Attribution data for debugging (web-vitals v4 field names)
  const { interactionTarget, interactionType, loadState } = metric.attribution
  console.log("Element:", interactionTarget) // CSS selector for the interaction target
  console.log("Type:", interactionType) // 'pointer' | 'keyboard'
  console.log("Load state:", loadState) // 'loading', 'dom-interactive', 'dom-content-loaded', 'complete'
})

Thresholds:

Rating | Value
Good | ≤ 200ms
Needs Improvement | 200ms – 500ms
Poor | > 500ms

CLS measures visual stability—how much visible content unexpectedly shifts during the page lifecycle. Unexpected shifts frustrate users, cause misclicks, and degrade perceived quality.

Layout shift score formula:

Layout Shift Score = Impact Fraction × Distance Fraction
  • Impact Fraction: Combined visible area of shifted elements (before and after positions) as a fraction of viewport area
  • Distance Fraction: Greatest distance any element moved, divided by viewport’s largest dimension
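
Worked example (illustrative numbers): in a 400×1000px viewport, a full-width element 600px tall shifts down by 100px. Its before and after positions together cover 700px of viewport height, so the impact fraction is 0.7. The largest move is 100px against the viewport's largest dimension (1000px), so the distance fraction is 0.1. The layout shift score is 0.7 × 0.1 = 0.07.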

Session windowing:

CLS doesn’t sum all shifts. It groups shifts into “session windows” with these constraints:

  • Maximum 1 second gap between shifts within a window
  • Maximum 5 second window duration

CLS reports the highest-scoring session window, not the total. This prevents long-lived SPAs (Single Page Applications) from accumulating artificially high scores.

Expected vs. unexpected shifts:

Shifts within 500ms of user interaction (click, tap, keypress) are considered expected and excluded via the hadRecentInput flag. Animations using transform: translate() or transform: scale() don’t trigger layout shifts because they don’t affect element geometry.

cls-measurement.ts
// Raw API approach (simplified)
let clsValue = 0
let clsEntries: LayoutShift[] = []

new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries() as LayoutShift[]) {
    // Only count unexpected shifts
    if (!entry.hadRecentInput) {
      clsValue += entry.value
      clsEntries.push(entry)
    }
  }
}).observe({ type: "layout-shift", buffered: true })

// Note: This simplified version doesn't implement session windowing
// Use web-vitals library for correct CLS calculation
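
A sketch of the session-windowing logic the note above defers to web-vitals (variable names are illustrative; this mirrors the library's published approach):

cls-session-windows.ts
// CLS = highest-scoring session window (max 5s long, max 1s gap between shifts)
let clsValue = 0
let windowValue = 0
let windowEntries: LayoutShift[] = []

new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries() as LayoutShift[]) {
    if (entry.hadRecentInput) continue
    const first = windowEntries[0]
    const last = windowEntries[windowEntries.length - 1]
    // Start a new window after a 1s gap or once the window spans 5s
    if (
      windowEntries.length > 0 &&
      (entry.startTime - last.startTime > 1000 || entry.startTime - first.startTime > 5000)
    ) {
      windowValue = 0
      windowEntries = []
    }
    windowEntries.push(entry)
    windowValue += entry.value
    clsValue = Math.max(clsValue, windowValue) // report the worst window
  }
}).observe({ type: "layout-shift", buffered: true })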

Common causes:

  • Images without dimensions (width/height attributes)
  • Ads, embeds, iframes that resize
  • Dynamically injected content
  • Web fonts causing FOIT/FOUT (Flash of Invisible/Unstyled Text)

Thresholds:

Rating | Value
Good | ≤ 0.1
Needs Improvement | 0.1 – 0.25
Poor | > 0.25

All Core Web Vitals are evaluated at the 75th percentile of page loads. A page passes if 75% or more of visits meet the “Good” threshold. This approach:

  • Accounts for real-world variance in devices and networks
  • Balances between median (too lenient) and 95th percentile (too sensitive to outliers)
  • Provides a consistent benchmark across sites
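
In practice this means the aggregation backend computes a 75th percentile per page or route rather than an average. A minimal nearest-rank sketch (the sample values are hypothetical; production pipelines typically use approximate quantile structures such as t-digest over much larger sample sets):

percentile.ts
// Nearest-rank percentile over collected samples
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN
  const sorted = [...values].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length) - 1
  return sorted[Math.max(0, rank)]
}

// Hypothetical LCP samples (ms) gathered by the ingestion pipeline
const lcpSamples = [1800, 2100, 2400, 2600, 3900]
console.log("LCP p75:", percentile(lcpSamples, 75)) // 2600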

PerformanceObserver is the modern interface for accessing performance timeline entries. Unlike performance.getEntries(), it provides asynchronous notification as entries are recorded.

performance-observer-pattern.ts
interface PerformanceMetricCallback {
  (entries: PerformanceEntryList): void
}

function observePerformance(
  entryType: string,
  callback: PerformanceMetricCallback,
  options: { buffered?: boolean } = {},
): () => void {
  // Check if entry type is supported
  if (!PerformanceObserver.supportedEntryTypes.includes(entryType)) {
    console.warn(`Entry type "${entryType}" not supported`)
    return () => {}
  }
  const observer = new PerformanceObserver((list) => {
    callback(list.getEntries())
  })
  observer.observe({
    type: entryType,
    buffered: options.buffered ?? true,
  })
  return () => observer.disconnect()
}

// Usage
const disconnect = observePerformance("largest-contentful-paint", (entries) => {
  const lcp = entries[entries.length - 1]
  console.log("LCP:", lcp.startTime)
})

The buffered flag:

When buffered: true, the observer receives entries recorded before observe() was called. This is critical for metrics like LCP and FCP that often fire before your monitoring code loads. Each entry type has buffer limits—when full, new entries aren’t buffered.

Supported entry types (2024):

Entry Type | Interface | Purpose
navigation | PerformanceNavigationTiming | Page load timing
resource | PerformanceResourceTiming | Resource fetch timing
paint | PerformancePaintTiming | FP, FCP
largest-contentful-paint | LargestContentfulPaint | LCP
layout-shift | LayoutShift | CLS
event | PerformanceEventTiming | Interaction timing (INP)
first-input | PerformanceEventTiming | First input delay
longtask | PerformanceLongTaskTiming | Tasks > 50ms
long-animation-frame | PerformanceLongAnimationFrameTiming | LoAF (replaces longtask)
mark | PerformanceMark | Custom marks
measure | PerformanceMeasure | Custom measures
element | PerformanceElementTiming | Specific element timing

Navigation Timing provides detailed timing for the document load process.

navigation-timing.ts
function getNavigationMetrics(): Record<string, number> {
  const [nav] = performance.getEntriesByType("navigation") as PerformanceNavigationTiming[]
  if (!nav) return {}
  return {
    // DNS
    dnsLookup: nav.domainLookupEnd - nav.domainLookupStart,
    // TCP connection
    tcpConnect: nav.connectEnd - nav.connectStart,
    // TLS handshake (if HTTPS)
    tlsHandshake: nav.secureConnectionStart > 0 ? nav.connectEnd - nav.secureConnectionStart : 0,
    // Time to First Byte (from navigation start, per the standard definition)
    ttfb: nav.responseStart - nav.startTime,
    // Download time
    downloadTime: nav.responseEnd - nav.responseStart,
    // DOM processing
    domProcessing: nav.domContentLoadedEventEnd - nav.responseEnd,
    // Total page load
    pageLoad: nav.loadEventEnd - nav.startTime,
    // Transfer size
    transferSize: nav.transferSize,
    encodedBodySize: nav.encodedBodySize,
    decodedBodySize: nav.decodedBodySize,
  }
}

Key timing points:

startTime (0, formerly navigationStart)
→ redirectStart/End
→ fetchStart
→ domainLookupStart/End (DNS)
→ connectStart/End (TCP)
→ secureConnectionStart (TLS)
→ requestStart
→ responseStart (TTFB)
→ responseEnd
→ domInteractive
→ domContentLoadedEventStart/End
→ domComplete
→ loadEventStart/End

Resource Timing exposes network timing for fetched resources (scripts, stylesheets, images, XHR/fetch requests).

resource-timing.ts
interface ResourceMetrics {
  name: string
  initiatorType: string
  duration: number
  transferSize: number
  cached: boolean
}

function getSlowResources(threshold = 1000): ResourceMetrics[] {
  const resources = performance.getEntriesByType("resource") as PerformanceResourceTiming[]
  return resources
    .filter((r) => r.duration > threshold)
    .map((r) => ({
      name: r.name,
      initiatorType: r.initiatorType,
      duration: r.duration,
      transferSize: r.transferSize,
      cached: r.transferSize === 0 && r.decodedBodySize > 0,
    }))
    .sort((a, b) => b.duration - a.duration)
}

// Monitor resources as they load
new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as PerformanceResourceTiming[]) {
    if (entry.duration > 2000) {
      console.warn("Slow resource:", entry.name, entry.duration)
    }
  }
}).observe({ type: "resource", buffered: true })

Cross-origin timing:

By default, cross-origin resources expose only startTime, duration, and responseEnd with other timing values zeroed for privacy. The Timing-Allow-Origin header enables full timing:

Timing-Allow-Origin: *
Timing-Allow-Origin: https://example.com
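
A quick heuristic for spotting affected resources (an entry with nonzero duration but zeroed requestStart/responseStart is typically cross-origin without Timing-Allow-Origin):

detect-opaque-timing.ts
// List cross-origin resources whose detailed timing is hidden
const opaqueResources = (performance.getEntriesByType("resource") as PerformanceResourceTiming[])
  .filter((r) => r.duration > 0 && r.requestStart === 0 && r.responseStart === 0)
  .map((r) => r.name)
console.log("Resources without Timing-Allow-Origin:", opaqueResources)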

Long Animation Frames API replaces the Long Tasks API with better attribution. A “long animation frame” is one that takes more than 50ms from start to presentation, several times the ~16.7ms frame budget required for smooth 60fps rendering.

loaf-monitoring.ts
// Detect long animation frames with attribution
new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as PerformanceLongAnimationFrameTiming[]) {
    console.log("Long frame:", entry.duration, "ms")
    // Scripts that contributed to the long frame
    for (const script of entry.scripts) {
      console.log("  Script:", script.sourceURL)
      console.log("  Function:", script.sourceFunctionName)
      console.log("  Duration:", script.duration, "ms")
      console.log("  Invoker type:", script.invokerType) // 'user-callback', 'event-listener', etc.
    }
  }
}).observe({ type: "long-animation-frame", buffered: true })

Why LoAF over Long Tasks:

Long Tasks only reported that a task exceeded 50ms. LoAF provides:

  • Which scripts contributed and their individual durations
  • Source URLs and function names
  • Invoker type (event listener, user callback, etc.)
  • Better correlation with INP issues

User Timing API enables custom performance marks and measures for application-specific metrics.

user-timing.ts
// Mark a point in time
performance.mark("feature-start")
// ... feature code executes ...
performance.mark("feature-end")

// Measure between marks
performance.measure("feature-duration", "feature-start", "feature-end")

// Measure from navigation start
performance.measure("time-to-feature", {
  start: 0, // navigation start
  end: "feature-start",
})

// Retrieve measures
const measures = performance.getEntriesByType("measure")
for (const measure of measures) {
  console.log(`${measure.name}: ${measure.duration}ms`)
}

// Include custom data (User Timing Level 3)
performance.mark("api-call-complete", {
  detail: {
    endpoint: "/api/users",
    status: 200,
    cached: false,
  },
})

Real-world custom metrics:

Metric | What It Measures
Time to Interactive Feature | When a specific feature becomes usable
Search Results Render | Time from query to results display
Checkout Flow Duration | Time through purchase funnel
API Response Time | Backend latency as experienced by client
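
As an illustration, the “Search Results Render” row from the table could be instrumented like this (function names and mark names are hypothetical):

search-timing.ts
// Hypothetical instrumentation of a search flow with User Timing
function onSearchSubmit(): void {
  performance.mark("search:submitted")
}

function onResultsRendered(resultCount: number): void {
  performance.mark("search:rendered", { detail: { resultCount } })
  const measure = performance.measure("search-results-render", "search:submitted", "search:rendered")
  console.log("Search render took", measure.duration, "ms")
}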

navigator.sendBeacon() is designed for reliable analytics transmission during page unload. Unlike XHR/fetch, it’s queued by the browser and sent even after the page closes.

beacon-transmission.ts
interface PerformancePayload {
  url: string
  sessionId: string
  timestamp: number
  metrics: Record<string, number>
  attribution?: Record<string, unknown>
}

class MetricsCollector {
  private buffer: PerformancePayload[] = []
  private endpoint: string
  private maxBufferSize = 10

  constructor(endpoint: string) {
    this.endpoint = endpoint
    this.setupUnloadHandler()
  }

  record(metrics: Record<string, number>, attribution?: Record<string, unknown>): void {
    this.buffer.push({
      url: location.href,
      sessionId: this.getSessionId(),
      timestamp: Date.now(),
      metrics,
      attribution,
    })
    // Flush if buffer is full
    if (this.buffer.length >= this.maxBufferSize) {
      this.flush()
    }
  }

  private flush(): void {
    if (this.buffer.length === 0) return
    const payload = JSON.stringify(this.buffer)
    this.buffer = []
    // Try sendBeacon first
    const sent = navigator.sendBeacon(this.endpoint, payload)
    // Fallback to fetch with keepalive
    if (!sent) {
      fetch(this.endpoint, {
        method: "POST",
        body: payload,
        keepalive: true,
        headers: { "Content-Type": "application/json" },
      }).catch(() => {
        // Silently fail - analytics shouldn't break the page
      })
    }
  }

  private setupUnloadHandler(): void {
    // visibilitychange is more reliable than unload/beforeunload
    document.addEventListener("visibilitychange", () => {
      if (document.visibilityState === "hidden") {
        this.flush()
      }
    })
    // Fallback for browsers that don't fire visibilitychange
    window.addEventListener("pagehide", () => this.flush())
  }

  private getSessionId(): string {
    let id = sessionStorage.getItem("perf_session_id")
    if (!id) {
      id = crypto.randomUUID()
      sessionStorage.setItem("perf_session_id", id)
    }
    return id
  }
}

Why visibilitychange over unload:

  • unload doesn’t fire reliably on mobile when switching apps
  • unload and beforeunload prevent bfcache (back/forward cache) in many browsers
  • visibilitychange fires when the page is hidden, backgrounded, or navigated away

Payload size limits:

sendBeacon() typically enforces a 64KB payload quota. For larger payloads, batch and compress, or use fetch with keepalive, which shares a similar in-flight quota but offers more flexibility (custom headers, response handling).
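
A defensive transmission sketch under these constraints (the 60KB cutoff and helper name are illustrative):

send-reliably.ts
// Prefer sendBeacon, fall back to keepalive fetch for large or rejected payloads
function sendReliably(url: string, data: string): void {
  const BEACON_SAFE_LIMIT = 60 * 1024 // stay under the ~64KB quota
  // string length approximates bytes for mostly-ASCII JSON payloads
  if (data.length < BEACON_SAFE_LIMIT && navigator.sendBeacon(url, data)) return
  fetch(url, { method: "POST", body: data, keepalive: true }).catch(() => {
    // Analytics failures must never break the page
  })
}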

RUM generates substantial data volume. Sampling reduces costs while maintaining statistical validity.

sampling-strategy.ts
type SamplingDecision = "always" | "sampled" | "never"

interface SamplingConfig {
  performanceRate: number // 0-1, e.g., 0.1 = 10%
  errorRate: number // Usually 1.0 (100%)
  sessionBased: boolean // Decide once per session
}

class Sampler {
  private config: SamplingConfig
  private sessionDecision: boolean | null = null

  constructor(config: SamplingConfig) {
    this.config = config
    if (config.sessionBased) {
      this.sessionDecision = this.makeDecision(config.performanceRate)
    }
  }

  shouldSample(type: "performance" | "error"): boolean {
    // Always capture errors
    if (type === "error") {
      return this.makeDecision(this.config.errorRate)
    }
    // Use session decision if configured
    if (this.config.sessionBased && this.sessionDecision !== null) {
      return this.sessionDecision
    }
    return this.makeDecision(this.config.performanceRate)
  }

  private makeDecision(rate: number): boolean {
    return Math.random() < rate
  }
}

// Usage
const sampler = new Sampler({
  performanceRate: 0.1, // 10% of sessions
  errorRate: 1.0, // 100% of errors
  sessionBased: true, // Consistent within session
})

if (sampler.shouldSample("performance")) {
  collector.record(metrics)
}

Sampling considerations:

Approach | Pros | Cons
Head-based (session start) | Consistent within session, simpler analysis | May miss rare interactions
Tail-based (after event) | Can prioritize errors/slow requests | More complex, higher initial capture
Rate-based (percentage) | Simple, predictable volume | May split sessions
Adaptive (dynamic rate) | Handles traffic spikes | Complex to implement correctly

Typical rates:

  • Errors: 100% (always capture)
  • Performance metrics: 1-10% depending on traffic
  • Session replay: 0.1-1% (high data volume)

Raw metric values (LCP = 3.2s) are insufficient for debugging. Attribution data identifies what caused the value.

attribution-collection.ts
import { onLCP, onINP, onCLS } from "web-vitals/attribution"

// sendMetric is the app's transport helper (not shown)
function collectWithAttribution(): void {
  onLCP((metric) => {
    const { element, url, timeToFirstByte, resourceLoadDelay, resourceLoadDuration, elementRenderDelay } =
      metric.attribution
    sendMetric({
      name: "LCP",
      value: metric.value,
      attribution: {
        element, // CSS selector string for the LCP element
        url,
        ttfb: timeToFirstByte,
        resourceLoadDelay,
        resourceLoadDuration,
        elementRenderDelay,
      },
    })
  })

  onINP((metric) => {
    const { interactionTarget, interactionType, loadState, longAnimationFrameEntries } = metric.attribution
    sendMetric({
      name: "INP",
      value: metric.value,
      attribution: {
        element: interactionTarget, // CSS selector string
        type: interactionType, // 'pointer' | 'keyboard'
        loadState,
        longFrames: longAnimationFrameEntries.length,
      },
    })
  })

  onCLS((metric) => {
    const { largestShiftTarget, largestShiftTime, largestShiftValue, loadState } = metric.attribution
    sendMetric({
      name: "CLS",
      value: metric.value,
      attribution: {
        shiftTarget: largestShiftTarget, // CSS selector string
        shiftTime: largestShiftTime,
        shiftValue: largestShiftValue,
        loadState,
      },
    })
  })
}

Attribution bundle size trade-off:

The standard web-vitals build is ~2KB (brotli). The attribution build is ~3.5KB. The extra 1.5KB provides debugging data that makes metrics actionable—worth it for production monitoring.

Comprehensive error tracking requires multiple handlers for different error types.

error-tracking.ts
interface ErrorReport {
  type: "runtime" | "resource" | "promise" | "network"
  message: string
  stack?: string
  source?: string
  line?: number
  column?: number
  timestamp: number
  url: string
  userAgent: string
}

class ErrorTracker {
  private endpoint: string
  private buffer: ErrorReport[] = []

  constructor(endpoint: string) {
    this.endpoint = endpoint
    this.setupHandlers()
  }

  private setupHandlers(): void {
    // Runtime errors (synchronous)
    window.onerror = (message, source, line, column, error) => {
      this.report({
        type: "runtime",
        message: String(message),
        stack: error?.stack,
        source,
        line: line ?? undefined,
        column: column ?? undefined,
      })
      return false // Don't suppress default handling
    }

    // Unhandled promise rejections
    window.addEventListener("unhandledrejection", (event) => {
      this.report({
        type: "promise",
        message: event.reason?.message || String(event.reason),
        stack: event.reason?.stack,
      })
    })

    // Resource loading errors (images, scripts, stylesheets)
    window.addEventListener(
      "error",
      (event) => {
        // Only handle resource errors, not runtime errors
        if (event.target !== window && event.target instanceof HTMLElement) {
          const target = event.target as HTMLImageElement | HTMLScriptElement | HTMLLinkElement
          this.report({
            type: "resource",
            message: `Failed to load ${target.tagName.toLowerCase()}`,
            source: (target as HTMLImageElement).src || (target as HTMLLinkElement).href,
          })
        }
      },
      true, // Capture phase to catch resource errors
    )
  }

  private report(error: Omit<ErrorReport, "timestamp" | "url" | "userAgent">): void {
    const fullError: ErrorReport = {
      ...error,
      timestamp: Date.now(),
      url: location.href,
      userAgent: navigator.userAgent,
    }
    this.buffer.push(fullError)
    // Send immediately for errors (don't batch)
    this.flush()
  }

  private flush(): void {
    if (this.buffer.length === 0) return
    const payload = JSON.stringify(this.buffer)
    this.buffer = []
    navigator.sendBeacon(this.endpoint, payload)
  }
}

Production JavaScript is minified, making raw stack traces unreadable. Source maps restore original file/line information.

stack-parsing.ts
interface ParsedFrame {
  function: string
  file: string
  line: number
  column: number
}

function parseStackTrace(stack: string): ParsedFrame[] {
  if (!stack) return []
  const lines = stack.split("\n")
  const frames: ParsedFrame[] = []
  // Common stack trace formats
  const chromeRegex = /at\s+(.+?)\s+\((.+?):(\d+):(\d+)\)/
  const firefoxRegex = /(.*)@(.+?):(\d+):(\d+)/
  for (const line of lines) {
    const match = chromeRegex.exec(line) || firefoxRegex.exec(line)
    if (match) {
      frames.push({
        function: match[1] || "<anonymous>",
        file: match[2],
        line: parseInt(match[3], 10),
        column: parseInt(match[4], 10),
      })
    }
  }
  return frames
}

// Server-side: Use the source-map library to resolve original locations
// Libraries like Sentry and Datadog do this automatically
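
A minimal server-side sketch of that resolution step, assuming Mozilla's source-map package (0.7+) and a source map stored at build time (the helper name and inputs are illustrative):

resolve-frame.ts
import { SourceMapConsumer } from "source-map"

// Map a minified frame back to its original source location
async function resolveFrame(rawSourceMap: string, line: number, column: number) {
  const consumer = await new SourceMapConsumer(JSON.parse(rawSourceMap))
  try {
    // e.g. { source: 'src/app.ts', line: 42, column: 10, name: 'handleClick' }
    return consumer.originalPositionFor({ line, column })
  } finally {
    consumer.destroy()
  }
}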

Without grouping, each error instance creates a separate alert. Grouping consolidates identical errors.

error-grouping.ts
function generateErrorFingerprint(error: ErrorReport): string {
  // Group by: type + message pattern + top stack frame
  const parts = [error.type, normalizeMessage(error.message), error.stack ? getTopFrame(error.stack) : "no-stack"]
  return hashString(parts.join("|"))
}

function normalizeMessage(message: string): string {
  // Remove dynamic values that would create unique fingerprints
  return message
    .replace(/\d+/g, "<N>") // Numbers
    .replace(/'[^']+'/g, "'<S>'") // Single-quoted strings
    .replace(/"[^"]+"/g, '"<S>"') // Double-quoted strings
    .replace(/\b[a-f0-9]{8,}\b/gi, "<ID>") // Hex IDs
}

function getTopFrame(stack: string): string {
  const frames = parseStackTrace(stack)
  if (frames.length === 0) return "unknown"
  const top = frames[0]
  // Use file and line, not column (column varies with minification)
  return `${top.file}:${top.line}`
}

function hashString(str: string): string {
  // Simple 32-bit hash for fingerprinting
  let hash = 0
  for (let i = 0; i < str.length; i++) {
    hash = (hash << 5) - hash + str.charCodeAt(i)
    hash |= 0
  }
  return hash.toString(16)
}

Lab (synthetic) and field (RUM) data differ along several axes:

Aspect | Lab (Synthetic) | Field (RUM)
Environment | Controlled (specific device, network) | Variable (real user conditions)
Reproducibility | High | Low
Metrics | All measurable | User-experienced only
Use case | Development, CI/CD gates | Production monitoring
Data volume | One measurement | Aggregated from many
Attribution | Full stack traces | Limited (privacy, performance)

Lab data (Lighthouse, WebPageTest):

  • Pre-deployment validation
  • Regression testing in CI
  • Debugging specific issues
  • Comparing configurations

Field data (RUM):

  • Understanding real user experience
  • Identifying issues lab doesn’t catch
  • Monitoring production performance
  • Correlating performance with business metrics

Lab measurements often differ from field measurements because:

  1. Device diversity: Lab uses consistent hardware; users have varied devices
  2. Network conditions: Lab uses throttled but stable connections; real networks are unpredictable
  3. User behavior: Lab follows scripted paths; users interact unpredictably
  4. Third-party content: Ads, widgets, and embeds behave differently in production
  5. Cache state: Lab often tests cold cache; users may have warm caches

As web.dev states: “lab measurement is not a substitute for field measurement.”

Google’s web-vitals library is the recommended approach for measuring Core Web Vitals. It handles edge cases that raw Performance APIs miss.

The library handles:

  • Background tab detection (metrics shouldn’t include time page was hidden)
  • bfcache (back/forward cache) restoration (resets metrics appropriately)
  • Iframe considerations
  • Prerendered page handling
  • Mobile-specific timing issues

web-vitals-setup.ts
import { onCLS, onINP, onLCP, onFCP, onTTFB } from "web-vitals"

function sendToAnalytics(metric: { name: string; value: number; delta: number; id: string; rating: string }): void {
  const body = JSON.stringify({
    name: metric.name,
    value: metric.value,
    delta: metric.delta,
    id: metric.id,
    rating: metric.rating,
    page: location.pathname,
  })
  // Use sendBeacon for reliable delivery
  navigator.sendBeacon("/api/analytics", body)
}

// Register handlers - call each only once per page
onCLS(sendToAnalytics)
onINP(sendToAnalytics)
onLCP(sendToAnalytics)
onFCP(sendToAnalytics)
onTTFB(sendToAnalytics)

web-vitals-attribution.ts
import { onLCP, onINP, onCLS } from "web-vitals/attribution"

// LCP attribution
onLCP((metric) => {
  console.log("LCP value:", metric.value)
  console.log("LCP element:", metric.attribution.element)
  console.log("Resource URL:", metric.attribution.url)
  console.log("TTFB:", metric.attribution.timeToFirstByte)
  console.log("Resource load delay:", metric.attribution.resourceLoadDelay)
  console.log("Element render delay:", metric.attribution.elementRenderDelay)
})

// INP attribution (web-vitals v4 field names)
onINP((metric) => {
  console.log("INP value:", metric.value)
  console.log("Interaction type:", metric.attribution.interactionType)
  console.log("Interaction target:", metric.attribution.interactionTarget)
  console.log("Load state:", metric.attribution.loadState)
  console.log("Long frames:", metric.attribution.longAnimationFrameEntries)
})

// CLS attribution
onCLS((metric) => {
  console.log("CLS value:", metric.value)
  console.log("Largest shift target:", metric.attribution.largestShiftTarget)
  console.log("Largest shift value:", metric.attribution.largestShiftValue)
  console.log("Largest shift time:", metric.attribution.largestShiftTime)
})

The delta property:

Metrics like CLS can update multiple times as new layout shifts occur. The delta property contains only the change since the last report. For analytics platforms that don’t support metric updates, sum the deltas.

The id property:

A unique identifier for the metric instance. Use this to aggregate multiple reports for the same page view (e.g., when CLS updates).
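
A sketch combining both properties (the Map-based rollup and reportAllChanges usage are illustrative):

delta-aggregation.ts
import { onCLS } from "web-vitals"

// Sum deltas per metric id so repeated reports reconstruct the final value
const totals = new Map<string, number>()

onCLS((metric) => {
  totals.set(metric.id, (totals.get(metric.id) ?? 0) + metric.delta)
}, { reportAllChanges: true })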

Single call rule:

Call each metric function only once per page load. Multiple calls create multiple PerformanceObserver instances, wasting memory and causing duplicate reports.

Commercial RUM platforms package these collection and analysis patterns. Three representative examples:

Sentry

Architecture:

  • JavaScript SDK instruments fetch/XHR, framework components
  • Traces capture transaction spans from browser to backend
  • Web Vitals automatically captured via web-vitals library integration

Key features:

  • Automatic performance instrumentation for React, Vue, Angular
  • Distributed tracing connecting frontend spans to backend
  • Release tracking for performance regression detection

Datadog RUM

Architecture:

  • Lightweight SDK (~30KB) loaded asynchronously
  • Session-based collection with configurable sampling
  • Automatic Core Web Vitals, resource timing, long tasks

Key features:

  • Replay integration for debugging sessions
  • Synthetic monitoring comparison
  • Custom actions and timing

Vercel Speed Insights

Architecture:

  • Minimal script injection in Next.js builds
  • Real user data collected to Vercel’s analytics backend
  • Core Web Vitals with Next.js-specific attribution

Key features:

  • Route-level performance breakdown
  • Comparison across deployments
  • Integration with Vercel’s deployment workflow

Self-hosted considerations:

  • Data storage for high-cardinality metrics (ClickHouse, TimescaleDB)
  • Aggregation pipelines for percentile calculation
  • Visualization (Grafana dashboards)

Client performance monitoring requires three interconnected systems:

  1. Metrics collection: Core Web Vitals (LCP, INP, CLS) via the web-vitals library, supplemented by Navigation Timing, Resource Timing, and custom User Timing marks.

  2. Reliable transmission: navigator.sendBeacon() on visibilitychange ensures data reaches your servers even during page unload. Batch secondary metrics; send errors immediately.

  3. Actionable analysis: Raw numbers (LCP = 3.2s) aren’t useful without attribution (LCP element = hero image, resource load delay = 1.8s). Capture debugging data to make metrics actionable.

The gap between lab and field data is fundamental. Lighthouse tells you what’s possible; RUM tells you what users actually experience. Both are necessary for comprehensive performance management.

Prerequisites:

  • Browser Performance APIs (PerformanceObserver, timing interfaces)
  • HTTP basics (request/response timing, headers)
  • JavaScript event handling

Glossary:

Term | Definition
RUM | Real User Monitoring—collecting performance data from actual user sessions
CrUX | Chrome User Experience Report—Google’s public dataset of field performance data
TTFB | Time to First Byte—time until first byte of response received
FCP | First Contentful Paint—time until first content renders
LCP | Largest Contentful Paint—time until largest visible content renders
INP | Interaction to Next Paint—worst interaction latency (replaced FID)
CLS | Cumulative Layout Shift—measure of visual stability
LoAF | Long Animation Frame—frame taking >50ms, blocking smooth rendering

Key takeaways:

  • Core Web Vitals (LCP, INP, CLS) are evaluated at the 75th percentile with “Good” thresholds of 2.5s, 200ms, and 0.1 respectively
  • INP replaced FID in March 2024, measuring all interactions rather than just the first
  • PerformanceObserver with buffered: true captures metrics recorded before your code loads
  • navigator.sendBeacon() on visibilitychange is the most reliable transmission pattern
  • The web-vitals library handles edge cases (background tabs, bfcache) that raw APIs miss
  • Attribution data transforms numbers into actionable debugging information
  • Lab data (Lighthouse) and field data (RUM) serve different purposes—both are necessary
