Cheatsheet
Optimization Deep Dive — Staff-Level Performance Handbook
Condensed for last-minute review — every key takeaway, decision table, and recall card.
01 — DIAGNOSE
Measurement & Profiling
- Senior Signal: Always Measure First: A junior engineer says 'I think the bottleneck is the hero image.' A staff engineer says 'Let me check the field data. If p75 LCP is above 2.5s (at least a quarter of users get a slow load), then we profile the main thread to see if it's network or rendering.' Never guess; always measure with real data.
- Avoid Premature Optimization: Optimizing before measuring is the #1 performance mistake. A junior engineer might preload every image or inline all CSS, bloating the page. A staff engineer measures first, finds the real bottleneck (e.g., a 300ms server response), and optimizes that. Premature optimization adds complexity without guaranteed benefit.
| Aspect | Lab Data | Field Data |
|---|---|---|
| Environment | Controlled (e.g., Moto G4, Slow 3G) | Real user devices, networks, locations |
| Metrics | Synthetic scores (Lighthouse performance) | Real user percentiles (p75, p95) |
| Use Case | Debugging, CI gates, regression detection | Prioritization, user impact, long-term trends |
| Limitation | May not reflect real-world variability | Noisy, requires sufficient sample size |
- What is the threshold for a long task?
- — 50ms — tasks longer than this block the main thread and degrade INP.
- What is the difference between lab and field data?
- — Lab data is synthetic and controlled; field data is from real users and captures variability.
- What is the measure → optimize → verify loop?
- — Measure current performance, optimize one bottleneck, then verify the improvement with the same metrics.
- What is a vanity metric?
- — A metric that doesn't correlate with user experience, e.g., DOMContentLoaded or total page weight.
- What is a performance budget?
- — A limit on metrics (e.g., bundle size <200kB, LCP <2.5s) enforced in CI to prevent regressions.
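A minimal sketch of checking field data against a budget, assuming LCP samples have already been collected by some RUM pipeline. The sample values and the `lcpNeedsWork` helper are hypothetical; the percentile uses the nearest-rank method, which is one of several common definitions.

```typescript
// Nearest-rank percentile over field samples (values in milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1] as number;
}

// Hypothetical budget check: compares p75 LCP with the 2.5s threshold.
function lcpNeedsWork(lcpSamplesMs: number[], budgetMs = 2500): boolean {
  return percentile(lcpSamplesMs, 75) > budgetMs;
}

// Made-up samples for illustration only, not real measurements.
const lcpSamples = [900, 1200, 1500, 1800, 2100, 2300, 2600, 3400];
console.log(percentile(lcpSamples, 75)); // 2300
console.log(lcpNeedsWork(lcpSamples)); // false
```

The same function can gate CI on lab runs, but the prioritization decision should come from field percentiles.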
02 — LOADING
Bundle Size & Code Splitting
- When NOT to use tree shaking: Tree shaking is ineffective for CommonJS modules (`require()`) or packages without `sideEffects: false`. Also, if your code uses dynamic imports with computed paths (e.g., import(`./${name}`)), the bundler cannot statically analyze it. Don't waste time micro-optimizing tree shaking on a 10kB utility; focus on the 200kB+ libraries first.
- Senior Signal: Measure First, Then Optimize: A junior engineer will say 'let's split everything into lazy chunks.' A staff engineer says: 'Let's run a bundle analysis, identify the top 3 bloat sources, and then decide if splitting or tree shaking gives the best ROI. For example, if a 200kB chart library is only used on one route, route-based splitting saves 200kB. But if it's used on 5 routes, a shared chunk might be better. Always measure the actual impact on LCP and INP before and after.'
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| Route-based splitting | Large SPAs with many pages | Adds network round-trip per route | Routes are <10kB each |
| Component-based splitting | Heavy, rarely-used components | Loading state + potential CLS | Component is <50kB or used on every page |
| Vendor chunking | Stable third-party deps | Cache invalidation complexity | Vendor is <100kB or rarely changes |
| Shared chunks | Common modules across routes | Increased initial bundle size | Shared code is <5kB per route |
- What is the minimum bundle size that typically causes performance regressions?
- — ~200kB uncompressed; above this, LCP and TBT degrade noticeably.
- What flag should a package set in package.json to make tree shaking most effective?
- — "sideEffects": false, which tells the bundler it may drop unused modules entirely.
- What is the main trade-off of route-based code splitting?
- — Reduces initial bundle size but adds a network round-trip per route transition, potentially hurting INP.
- Why do barrel files (index.ts re-exports) break tree shaking?
- — Bundlers cannot statically determine which exports are used, so they include all re-exported modules.
- What is the recommended minSize for shared chunks in webpack splitChunks?
- — 20kB (20000 bytes) to avoid creating too many tiny chunks.
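The chunking guidance above could be expressed roughly as follows. This is a sketch of a webpack 5 `optimization.splitChunks` section written as a plain object; the cache group names and the `some-chart-lib` package are hypothetical, and the priorities are arbitrary illustrative values.

```typescript
// Plain object shaped like the `optimization` section of a webpack 5 config.
// "some-chart-lib" and the group names are placeholders, not real project values.
const optimization = {
  splitChunks: {
    chunks: "all" as const,
    minSize: 20000, // bytes; avoids emitting many tiny chunks
    cacheGroups: {
      // Heavy library isolated in its own chunk so only routes that use it pay for it.
      charts: {
        test: /[\\/]node_modules[\\/]some-chart-lib[\\/]/,
        name: "charts",
        priority: 20,
      },
      // Remaining third-party code, which tends to change less often than app code.
      vendor: {
        test: /[\\/]node_modules[\\/]/,
        name: "vendor",
        priority: 10,
        reuseExistingChunk: true,
      },
    },
  },
};
```

Whether the chart library deserves its own group depends on the bundle analysis; the higher priority only ensures it wins over the generic vendor group when both match.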
03 — LOADING
Build Tooling & Chunking
- Senior Signal: 'Always measure before optimizing': A junior says 'Vite is faster' and switches. A staff engineer says: 'I measured our cold build at 45s and HMR at 200ms. The bottleneck was our custom webpack plugin, not webpack itself. We profiled, found the plugin was doing O(n²) lookups, fixed it, and got cold builds to 12s without changing bundlers. Only then did we evaluate Vite for the next project.' Always profile your actual build pipeline before blaming the tool.
- When NOT to use aggressive chunk splitting: If your app has fewer than 10 routes and total bundle size is under 200kB, splitting into 10+ chunks adds HTTP overhead and complexity without benefit. For small apps, a single bundle with code-splitting only for heavy third-party libraries (e.g., charting) is better. Measure: if your total JS is <100kB, don't split at all—the 16ms parsing time is negligible. Only split when you see route-level bundles exceeding 50kB or when vendor code is >30% of total.
| Bundler | Dev Speed | Production Output Size | Plugin Ecosystem | Best For |
|---|---|---|---|---|
| webpack 5 | Slow (10-60s cold) | Good (baseline) | Excellent | Complex SPAs, legacy migration |
| Vite | Fast (<2s cold, <50ms HMR) | Excellent (Rollup) | Good (growing) | New SPAs, SSR, modern stacks |
| esbuild | Very fast (<1s cold) | Good (5-10% larger) | Limited | Dev builds, simple apps, transpilation |
| Rollup | Moderate (no HMR) | Excellent (smallest) | Good | Libraries, component packages |
| Minifier | Speed (500kB bundle) | Output Size Reduction | Best For |
|---|---|---|---|
| Terser | 2-5s | 10-15% | Production builds where size is critical |
| esbuild | 0.1-0.3s | 5-10% | Dev builds, CI, fast iterations |
| SWC | 0.2-0.5s | 5-10% | Webpack projects needing speed |
- What is the primary trade-off between webpack and Vite for production builds?
- — Vite uses Rollup for production, offering better tree-shaking and smaller output than webpack, but webpack has a richer plugin ecosystem for complex scenarios (e.g., Module Federation).
- What is the recommended chunk size range for HTTP/2?
- — 100-200kB per chunk. Smaller chunks increase HTTP overhead; larger chunks delay parsing. Aim for 3-6 core chunks plus route-level async chunks.
- How does content-hash filename improve caching?
- — It generates a unique filename based on file content. Unchanged chunks keep the same URL, so browsers reuse cached versions. Only changed chunks get new URLs, avoiding full cache invalidation.
- When should you NOT use Module Federation?
- — When teams cannot coordinate shared dependency versions, or when the app has fewer than 3 independent teams. The overhead of runtime sharing and version negotiation outweighs benefits for small projects.
- What is the key metric for evaluating minifier choice?
- — Build time vs output size. Terser is 2-5x slower but produces 5-10% smaller bundles than esbuild. Choose based on whether your bottleneck is build speed or bundle size.
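To make the content-hash idea concrete, here is a small sketch. It uses FNV-1a as a stand-in hash (bundlers use stronger hashes over the emitted bytes), and `hashedFilename` is a hypothetical helper, not a bundler API.

```typescript
// FNV-1a 32-bit over UTF-16 code units: a simple stand-in for a bundler's content hash.
function fnv1a(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8); // fixed-width hex
}

// Hypothetical helper: ("app", "js", source) -> "app.<hash>.js"
function hashedFilename(base: string, ext: string, content: string): string {
  return `${base}.${fnv1a(content)}.${ext}`;
}

const before = hashedFilename("app", "js", "console.log(1)");
const unchanged = hashedFilename("app", "js", "console.log(1)");
const after = hashedFilename("app", "js", "console.log(2)");
console.log(before === unchanged); // true: same content, same URL, cache reused
console.log(before === after); // false: changed content gets a new URL
```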
04 — LOADING
Image & Font Optimization
- When NOT to use AVIF: Avoid AVIF for images that are critical to LCP (e.g., hero banners) on low-end devices. Decoding a 2MB AVIF can take 80ms on a Moto G4, blowing the 50ms long-task threshold. Use WebP for LCP images and reserve AVIF for non-critical, large background images.
- Senior Signal: Always measure first: A junior engineer adds lazy loading to every image. A staff engineer measures: 'Our LCP image is 1.8s on 4G — lazy loading it would push LCP to 3.2s. Instead, I'll eager-load the hero and lazy-load the 12 gallery images below the fold.' Always profile with Lighthouse and real-user monitoring (RUM) before applying optimizations.
- Common Mistake: Over-subsetting: Subsetting to only ASCII characters breaks internationalization. If your site supports Japanese or Cyrillic, include those ranges. Always test with a full character set in staging. A subset that's too aggressive can cause missing glyphs (tofu boxes) that hurt UX more than the 50KB savings.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| srcset + sizes | Resolution switching | Low: one img tag | Art direction needed |
| <picture> | Format switching, art direction | Medium: multiple source tags | Simple resolution only |
| Client Hints (DPR, Viewport-Width) | Automatic selection | Low: HTTP header | Privacy restrictions, legacy browsers |
| CDN image transformation | Dynamic resizing | Variable: per-request cost | Static assets, low traffic |
| Strategy | FOUT/FOIT | CLS risk | Best for |
|---|---|---|---|
| font-display: swap | FOUT (visible fallback) | Low if fallback metrics match | Brand fonts, headings |
| font-display: optional | FOIT (invisible text up to 100ms) | None if font fails | Body text, non-critical |
| font-display: block | FOIT (up to 3s) | High | Avoid in production |
| Variable fonts | Single file for multiple weights | Low (one download) | Sites with many font weights |
- What is the primary trade-off between AVIF and WebP?
- — AVIF is 50% smaller but 2-3x slower to decode on older CPUs. WebP is 30% smaller with fast decode. Use WebP for LCP images, AVIF for non-critical.
- How do you prevent CLS from images without fixed dimensions?
- — Use CSS `aspect-ratio` with `max-width: 100%` and `height: auto`, or set explicit `width` and `height` on `<img>` tags.
- What is the difference between font-display: swap and optional?
- — `swap` shows fallback text immediately (FOUT) and swaps when font loads. `optional` may skip the font if it takes >100ms, showing fallback permanently — zero CLS but no custom font.
- When should you use the `<picture>` element instead of srcset?
- — Use `<picture>` for format switching (AVIF/WebP/JPEG) or art direction (different crops). Use `srcset` for simple resolution switching.
- What is the recommended rootMargin for IntersectionObserver lazy loading?
- — 200px (or 1250px for native `loading='lazy'`). This preloads images before they enter the viewport, reducing perceived latency.
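The format-switching and CLS advice above can be sketched as a small markup generator. The file naming scheme (`<base>-<width>.<ext>`), the `PictureOptions` shape, and the example paths are assumptions for illustration; values are assumed to be trusted and already HTML-escaped.

```typescript
interface PictureOptions {
  basePath: string; // hypothetical path without extension, e.g. "/img/hero"
  widths: number[];
  sizes: string;
  alt: string;
  width: number; // intrinsic dimensions reserve space and prevent CLS
  height: number;
  formats: Array<"avif" | "webp">; // modern formats to offer before the JPEG fallback
  lazy: boolean; // keep false for the LCP image
}

function srcsetFor(basePath: string, widths: number[], ext: string): string {
  return widths.map((w) => `${basePath}-${w}.${ext} ${w}w`).join(", ");
}

function pictureMarkup(o: PictureOptions): string {
  const sources = o.formats.map(
    (ext) => `<source type="image/${ext}" srcset="${srcsetFor(o.basePath, o.widths, ext)}" sizes="${o.sizes}">`,
  );
  const img =
    `<img src="${o.basePath}-${o.widths[0]}.jpg" srcset="${srcsetFor(o.basePath, o.widths, "jpg")}"` +
    ` sizes="${o.sizes}" alt="${o.alt}" width="${o.width}" height="${o.height}"` +
    ` loading="${o.lazy ? "lazy" : "eager"}">`;
  return ["<picture>", ...sources, img, "</picture>"].join("\n");
}

// Hero image: WebP only (following the LCP guidance above), eager-loaded.
const heroMarkup = pictureMarkup({
  basePath: "/img/hero", widths: [640, 1280], sizes: "100vw",
  alt: "Hero", width: 1280, height: 720, formats: ["webp"], lazy: false,
});
console.log(heroMarkup);
```

Gallery images below the fold would pass `formats: ["avif", "webp"]` and `lazy: true`.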
05 — LOADING
Critical Rendering Path
- Senior Signal: Always measure the cost of inlining: A junior engineer inlines all CSS. A staff engineer measures: if the full CSS is <14kB gzipped, inlining everything may be faster than splitting. Use Lighthouse and WebPageTest to compare FCP with and without extraction. The decision depends on your CSS size, cache hit rate, and HTML delivery latency.
- Common Mistake: Preloading everything: Preloading too many resources (e.g., all images) can delay the LCP resource by competing for bandwidth. Preloads share the same connections and bandwidth as every other request, so keep them to a handful (roughly 3-6). Only preload resources that are discovered late in the HTML or are critical for above-the-fold rendering. Validate with the DevTools Network panel (Priority column) and Lighthouse.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| defer | Scripts that depend on DOM or other deferred scripts | Delays execution until after parsing | Scripts that must run during parsing, before first paint |
| async | Independent scripts (analytics, ads) | No order guarantee; execution can interrupt HTML parsing | Scripts with dependencies or DOM manipulation |
| sync (no attr) | Legacy or critical inline scripts | Blocks parsing and painting entirely | Any script >1kB that can be deferred |
- What is the critical rendering path?
- — The sequence of steps the browser takes to convert HTML, CSS, and JS into pixels. Blocking any step delays paint.
- How does defer differ from async?
- — defer preserves execution order and runs after HTML parsing; async runs as soon as downloaded, no order guarantee.
- When should you use preload vs prefetch?
- — preload for critical current-page resources; prefetch for likely-next-page resources. Overuse wastes bandwidth.
- What is the 14kB rule for critical CSS?
- — Inline only the CSS needed for above-the-fold content, keeping it under ~14kB compressed so it fits in the first TCP round trip (an initial congestion window of about 10 packets).
- What does fetchpriority='high' do?
- — Hints the browser to prioritize that resource (e.g., LCP image) over others. Overuse can cause priority inversion.
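The ~14kB figure is usually derived from TCP's initial congestion window. A rough worked calculation, assuming an initial window of 10 segments (RFC 6928) and a typical 1460-byte segment payload; real connections vary, and TLS records and response headers also consume part of that window.

```typescript
const INITIAL_CWND_SEGMENTS = 10; // initial window from RFC 6928
const TYPICAL_MSS_BYTES = 1460; // 1500-byte MTU minus 40 bytes of IPv4 and TCP headers

// Upper bound on payload the server can send before waiting for the first ACK.
const firstRoundTripBytes = INITIAL_CWND_SEGMENTS * TYPICAL_MSS_BYTES;

function fitsInFirstRoundTrip(compressedBytes: number): boolean {
  return compressedBytes <= firstRoundTripBytes;
}

console.log(firstRoundTripBytes); // 14600
console.log(fitsInFirstRoundTrip(12_000)); // true
console.log(fitsInFirstRoundTrip(20_000)); // false
```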
06 — LOADING
Third-Party Scripts
- Senior Signal: Always measure first, then optimize: A junior says 'we should async all scripts.' A staff engineer says 'let's measure the real cost with DevTools Performance tab, then decide which scripts are critical and which can be deferred or removed.' The key is data-driven triage.
- When NOT to use Partytown: Partytown is not a silver bullet. Avoid it for scripts that manipulate the DOM (e.g., chat widgets, A/B testing tools) because they require synchronous DOM access. Also, the `postMessage` overhead can add 10-50ms per call, which may negate benefits for high-frequency interactions. Always measure TBT and INP before and after.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| Self-hosting | Critical scripts (e.g., analytics, auth) | Maintenance overhead, no CDN caching | Scripts that update frequently (e.g., tag managers) |
| CDN with preconnect | Non-critical scripts (e.g., fonts, widgets) | Extra DNS lookup (mitigated by preconnect) | Scripts that need low latency on first load |
| Facade pattern | Heavy embeds (chat, video, maps) | Delayed interactivity | Scripts needed for initial UX (e.g., login) |
| Partytown | Analytics, tag managers | Setup complexity, postMessage latency | DOM-manipulating scripts |
- What is the facade pattern for third-party scripts?
- — Replace heavy widget with a lightweight placeholder that loads the real script only on user interaction. Saves 200-500 kB JS and improves LCP by 1-2s.
- When should you use Partytown vs async/defer?
- — Partytown: analytics/tag managers that don't need DOM access. Async/defer: scripts that need DOM but not immediate execution. Avoid Partytown for DOM-manipulating widgets.
- What metric best captures third-party script impact on interactivity?
- — Total Blocking Time (TBT) and INP. Look for long tasks >50ms attributed to third-party origins in DevTools.
- What is the trade-off of self-hosting third-party scripts?
- — Reduces DNS/network latency by 100-300ms but loses CDN caching and requires manual updates. Best for critical, infrequently updated scripts.
- How do you measure third-party script impact in DevTools?
- — Performance tab > Bottom-Up view > filter by script origin. Network tab > blocking time. Lighthouse CI for regression tracking.
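The core of the facade pattern is a loader that runs at most once, on first interaction. A minimal sketch follows; `createFacade` and the chat loader are hypothetical names, and the loader body stands in for injecting the vendor script.

```typescript
// Wraps an expensive loader so it runs at most once, on first activation.
function createFacade<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) pending = load();
    return pending;
  };
}

// Hypothetical chat widget loader; in a browser this would inject the vendor script.
let loadCount = 0;
const openChat = createFacade(async () => {
  loadCount += 1;
  return "chat-ready";
});

// In a browser, wire this to the lightweight placeholder, for example:
// placeholder.addEventListener("click", () => { void openChat(); });
```

Repeated clicks reuse the same promise, so the heavy script is requested only once.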
07 — RUNTIME
Rendering & Paint Performance
- When NOT to Use will-change: Do not apply `will-change` to every animated element. On mobile, 20+ layers can cause GPU memory pressure and actually increase jank. Only promote elements that are actively animating and where you've measured a benefit. For simple transitions, the browser's own heuristics are often sufficient.
- Senior Signal: Always Measure First: A junior says 'I'll add will-change to fix jank.' A staff engineer says 'Let me profile the frame budget first. If the bottleneck is layout, I'll batch reads; if it's paint, I'll promote to compositor; if it's JavaScript, I'll defer work.' Never optimize without a DevTools Performance recording. The 16ms budget is a target, not a guarantee; measure the actual cost of your change.
| Technique | Best For | Cost | Avoid When |
|---|---|---|---|
| transform/opacity animation | Position, size, visibility | Zero layout/paint; ~0.1ms composite | Need to change actual layout (e.g., reflow siblings) |
| will-change: transform | Pre-promote to compositor layer | Memory: ~1-2MB per layer on mobile | More than 10 elements; causes layer explosion |
| content-visibility: auto | Off-screen content in long lists | Initial layout cost; ~0.5ms per element | Above-the-fold content; can delay LCP |
| CSS containment | Isolate subtrees from layout | Minimal; ~0.01ms per container | Small components with no layout impact |
- What is the 16ms frame budget and how is it typically split?
- — 16.67ms per frame at 60fps. Typical split: JS ~5ms, Style/Layout ~3ms, Paint ~5ms, Composite ~3ms. Exceeding any slice drops the frame.
- Which CSS properties only trigger composite (no layout or paint)?
- — transform and opacity. All other properties (width, height, left, top, color, etc.) trigger layout or paint.
- What is layout thrashing and how do you fix it?
- — Interleaving reads (e.g., offsetHeight) and writes (e.g., style.left) forces synchronous layout flushes. Fix by batching all reads first, then writes.
- What is the risk of overusing will-change?
- — Layer explosion: each promoted layer costs ~1-2MB GPU memory. On mobile, 20+ layers can cause memory pressure and increase jank.
- When should you NOT use content-visibility: auto?
- — On above-the-fold content. It delays rendering and can increase LCP beyond 2.5s. Only use on off-screen sections.
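The "batch all reads first, then writes" fix for layout thrashing can be sketched as a tiny scheduler. This is a simplified illustration with hypothetical names; in a browser, `flush()` would typically run inside `requestAnimationFrame`, and the tasks would touch real DOM properties.

```typescript
type Task = () => void;

// Queues reads and writes separately, then runs every read before any write.
function createBatcher() {
  const reads: Task[] = [];
  const writes: Task[] = [];
  return {
    read(task: Task): void {
      reads.push(task);
    },
    write(task: Task): void {
      writes.push(task);
    },
    flush(): void {
      for (const task of reads.splice(0)) task(); // e.g. el.offsetHeight
      for (const task of writes.splice(0)) task(); // e.g. el.style.height = ...
    },
  };
}

// Calls arrive interleaved, as they often do across independent components.
const order: string[] = [];
const batcher = createBatcher();
batcher.write(() => order.push("write A"));
batcher.read(() => order.push("read A"));
batcher.write(() => order.push("write B"));
batcher.read(() => order.push("read B"));
batcher.flush();
console.log(order); // ["read A", "read B", "write A", "write B"]
```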
08 — RUNTIME
CSS Performance
- Senior Signal: Always measure before optimizing selectors: Junior engineers rewrite all selectors to BEM. Staff engineers profile with Performance panel first. If style recalculation is under 1ms for a 5000-element tree, selector optimization is noise. Focus on layout thrashing and paint complexity instead.
- When NOT to use containment: Overusing `contain: strict` on every element can increase memory usage because the browser creates separate rendering contexts. Only apply to components that are truly independent (e.g., widgets, list items, modals). For static text, it's unnecessary overhead. Profile with Chrome DevTools 'Rendering' > 'Layer borders' to see if you're creating too many layers.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| Runtime CSS-in-JS | Highly dynamic theming, server-rendered apps with small component trees | 0.5–2ms per mount, 10–50kB JS bundle overhead | Pages with >100 components or strict INP budgets (<200ms) |
| Zero-runtime (vanilla-extract) | Static or token-based theming, large component libraries | Build-time only, <1kB runtime JS | Need runtime color swapping without rebuild |
| Tailwind CSS | Utility-first design, rapid prototyping, small bundles | JIT scanning adds ~200ms to dev build | Custom design systems with complex component APIs |
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| Container queries | Reusable components, widget libraries | ~0.1ms per container per resize | Nested containers or >100 containers on a page |
| Media queries | Page-level layouts, responsive breakpoints | ~0.01ms per query, no per-element tracking | Components that need to respond to parent size |
- What is the key selector in CSS selector matching, and why does it matter?
- — The rightmost part of a selector. Browsers match right-to-left, so a tag or universal key selector checks every element; a class or ID key selector is O(1).
- What is the primary performance cost of runtime CSS-in-JS?
- — JavaScript execution time during component mount (0.5–2ms per component), which can create long tasks (>50ms) on pages with many components.
- What does contain: layout do, and when should you use it?
- — It isolates an element's layout from the rest of the page. Use on independent widgets or list items to prevent style changes from triggering full-page layout recalculations.
- What is the risk of promoting too many elements to GPU layers?
- — GPU memory explosion (100–500kB per layer), leading to jank on mobile. Only promote elements that animate frequently.
- How does PurgeCSS reduce CSS bundle size, and what is its main trade-off?
- — It removes unused CSS by scanning templates. Trade-off: dynamic class construction (e.g., `btn-${variant}`) requires safelisting to avoid broken styles.
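One way to sidestep the safelisting trade-off is to avoid constructing class names dynamically, so template scanners see every class as a literal. A sketch with hypothetical variant and class names:

```typescript
type Variant = "primary" | "secondary" | "danger";

// Full class names appear as string literals, so a static scanner can find them.
const BUTTON_CLASS: Record<Variant, string> = {
  primary: "btn-primary",
  secondary: "btn-secondary",
  danger: "btn-danger",
};

function buttonClass(variant: Variant): string {
  return `btn ${BUTTON_CLASS[variant]}`;
}

console.log(buttonClass("danger")); // "btn btn-danger"
```

The lookup table costs a few bytes of JS but removes the need to keep a safelist in sync with the variants.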
09 — RUNTIME
Large Lists & Heavy DOM
- Senior Signal: 'Always measure first': A junior reaches for virtualization at 100 rows. A staff engineer measures: if the list takes < 50ms to render (under the long-task threshold), virtualization adds complexity without benefit. Profile with `performance.measure()` or React DevTools profiler. Only virtualize when the raw DOM render exceeds 16ms per frame or INP > 200ms.
- When NOT to Virtualize: Virtualization breaks `Ctrl+F` find-in-page, printing, and accessibility tree navigation because non-visible rows are not in the DOM. If your users rely on browser search or need to select all items, use pagination or a static list with `content-visibility: auto` instead. Also avoid virtualization for lists under 500 items, where the overhead of scroll listeners and measurements outweighs the benefit.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| Windowing (virtualized) | Real-time scroll, 1000+ items, same view | ~2-5ms per scroll frame, DOM < 100 nodes | Items need full height for print/SEO |
| Pagination (page-based) | Search results, table UIs, data export | Server round-trip per page, ~200ms INP | Continuous browsing, mobile swipe |
| Infinite scroll (append) | Social feeds, activity logs | DOM grows unbounded, memory leak risk | User needs to find old items quickly |
- What is the primary performance bottleneck when rendering 10,000 DOM nodes?
- — Layout and paint time exceeding the 16ms frame budget, causing jank and INP > 200ms.
- What is the recommended DOM node budget for a virtualized list?
- — Under 100 visible nodes (including overscan) to keep layout under 3ms.
- When should you choose pagination over virtualization?
- — When users need browser find-in-page, printing, or SEO; or when the list is under 500 items.
- What CSS property defers rendering of off-screen elements?
- — content-visibility: auto, paired with contain-intrinsic-size to prevent CLS.
- Why does DOM node recycling reduce GC pauses?
- — It reuses a fixed pool of nodes instead of creating/destroying them, avoiding garbage collection that can block the main thread for >50ms.
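The arithmetic behind windowing is small enough to show directly. This sketch assumes fixed-height rows; `visibleRange` and its parameter names are illustrative, not taken from any particular virtualization library.

```typescript
interface RowRange {
  start: number; // first row index to render
  end: number; // exclusive end index
}

// Computes which rows to render for a fixed-row-height list, plus overscan on each side.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan = 5,
): RowRange {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(totalRows, last + overscan),
  };
}

const rows = visibleRange(1000, 600, 40, 10_000);
console.log(rows); // { start: 20, end: 45 }
console.log(rows.end - rows.start); // 25 rows in the DOM instead of 10,000
```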
10 — RUNTIME
Main Thread & Concurrency
- When NOT to use Web Workers: Avoid workers for trivial tasks like formatting a date or updating a single DOM element. The cost of spawning a worker (1–5ms) and serializing data (especially large strings) can exceed the computation time. Also, workers cannot access `window`, `document`, or `localStorage`; if your task needs those, you must restructure.
- Senior Signal: Always Measure First: A junior engineer reaches for Web Workers or `requestIdleCallback` at the first sign of slowness. A staff engineer profiles with Chrome DevTools Performance panel, identifies tasks exceeding the 50ms long-task threshold, and measures the actual impact on INP (target <200ms). If the task takes 30ms and runs once per page load, the optimization is premature. Only invest when the bottleneck is confirmed.
- SharedArrayBuffer Security Overhead: Using `SharedArrayBuffer` requires your site to be cross-origin isolated. This blocks loading cross-origin resources (e.g., CDN scripts, iframes) unless they opt in via `Cross-Origin-Resource-Policy`. For many apps, the isolation cost outweighs the performance gain. Only reach for it when you need sub-millisecond shared state between workers; otherwise, stick with `postMessage` and `Transferable` objects.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| Debounce | Autocomplete, resize, save-on-stop | Delayed response; may never fire if events keep coming | Real-time feedback (e.g., drawing, animation) |
| Throttle | Scroll, mousemove, progress updates | May skip trailing events; fixed rate can feel choppy | One-shot actions (e.g., button click) |
| requestAnimationFrame | Visual updates tied to paint cycle | Only fires ~60fps; not for non-visual work | Non-visual computation or background tasks |
- What is the 50ms long-task threshold?
- — Any task on the main thread exceeding 50ms is considered a long task, blocking user input and causing jank. Break work into chunks <50ms to stay responsive.
- When should you use debounce vs throttle?
- — Debounce: wait for a pause (autocomplete, resize). Throttle: ensure max rate (scroll, mousemove). Debounce can delay indefinitely; throttle may miss trailing events.
- What is the main cost of using Web Workers?
- — Serialization overhead for data passed via postMessage. Use Transferable objects (ArrayBuffer) to avoid copying. Workers cannot access DOM.
- What does scheduler.postTask provide that setTimeout doesn't?
- — Explicit priority levels (user-blocking, user-visible, background) and cancellation via AbortSignal. Better integration with browser scheduling.
- What headers are required for SharedArrayBuffer?
- — Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp. This isolates the site and blocks cross-origin resources unless they opt in.
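A minimal leading-edge throttle, matching the "ensure max rate, may skip trailing events" behaviour described above. The injectable `now` clock is an assumption added to make the behaviour easy to demonstrate; production code would usually rely on the default.

```typescript
// Leading-edge throttle: runs fn at most once per interval and drops calls in between.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = Date.now,
): (...args: A) => void {
  let lastRun = -Infinity;
  return (...args: A) => {
    const t = now();
    if (t - lastRun >= intervalMs) {
      lastRun = t;
      fn(...args);
    }
  };
}

// Simulated scroll events at t = 0, 50, 100, 250 ms using a fake clock.
let fakeTime = 0;
const seen: number[] = [];
const onScroll = throttle((y: number) => seen.push(y), 100, () => fakeTime);
for (const t of [0, 50, 100, 250]) {
  fakeTime = t;
  onScroll(t);
}
console.log(seen); // [0, 100, 250]
```

The event at 50ms is dropped rather than delayed, which is the trailing-event gap noted in the table; a debounce would instead wait for a pause before firing.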
11 — REACT
React Re-render Optimization
- When NOT to use React.memo: Do not wrap every component in React.memo. The shallow comparison itself costs ~0.01ms per prop. For a component that renders in <0.1ms, memoization is a net loss. Only memoize if the component re-renders frequently with the same props AND its render cost is >1ms (e.g., a chart, a large list, a complex form).
- Senior Signal: Always Measure First: A staff engineer never optimizes without profiling. Use the React DevTools Profiler to identify actual re-render hotspots. Look for components that re-render more than expected or take >1ms to render. The 80/20 rule applies: 80% of performance gains come from fixing 20% of re-renders. Don't guess — measure with the Profiler's flamegraph and ranked timeline.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| React.memo | Expensive leaf components with stable props | Shallow prop compare (~0.01ms/prop) | Cheap renders (<0.1ms), always-new props |
| useMemo | Expensive computations (>1ms), referential stability | Memory for cached value, dep array compare | Trivial calculations (<0.1ms) |
| useCallback | Stable function references for memoized children | Memory for cached function, dep array compare | Functions not passed to memoized children |
| State colocation | Local UI state (toggles, inputs) | None (it's the default) | Global state that needs sharing |
- What is the 16ms frame budget?
- — The maximum time available for rendering a single frame at 60fps. Exceeding it causes jank. React re-renders should complete within this budget.
- When does React.memo hurt performance?
- — When the component is cheap to render (<0.1ms) or props are always new objects/functions, making the shallow comparison a net loss.
- What is state colocation?
- — Keeping state as close as possible to the component that uses it, preventing unnecessary re-renders of sibling subtrees.
- How does context splitting improve performance?
- — By separating unrelated state into different providers, so a change in one context only re-renders its consumers, not all consumers of a monolithic context.
- What is the React Compiler?
- — A build-time tool that automatically memoizes components and hooks, reducing the need for manual React.memo, useMemo, and useCallback.
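To see why "always-new props" defeat memoization, here is a simplified illustration of a shallow prop comparison using `Object.is` per key, the kind of check React.memo performs by default. This is a sketch for intuition, not React's actual source.

```typescript
type Props = Record<string, unknown>;

// Compares top-level props by identity; nested objects are not inspected.
function shallowEqual(prev: Props, next: Props): boolean {
  const prevKeys = Object.keys(prev);
  const nextKeys = Object.keys(next);
  if (prevKeys.length !== nextKeys.length) return false;
  return prevKeys.every(
    (key) => Object.prototype.hasOwnProperty.call(next, key) && Object.is(prev[key], next[key]),
  );
}

const stableStyle = { color: "red" };

// Primitive props compare equal, so a memoized component would skip the render.
console.log(shallowEqual({ label: "Save" }, { label: "Save" })); // true
// A fresh object literal on every render never compares equal.
console.log(shallowEqual({ style: { color: "red" } }, { style: { color: "red" } })); // false
// A stable reference (hoisted or memoized) restores the benefit.
console.log(shallowEqual({ style: stableStyle }, { style: stableStyle })); // true
```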
12 — REACT
React Concurrent & Loading
- Senior Signal: Always measure first: Don't reach for useTransition or useDeferredValue until you've measured a long task (>50ms) or INP >200ms. Premature concurrency adds complexity and can cause visual jank if the deferred value lags too much. Profile with React DevTools or Chrome Performance panel before optimizing.
- When NOT to use this optimization: Don't wrap every state update in useTransition. If the update is small (< 1ms of work), the overhead of the transition mechanism (extra re-render, pending state) can actually increase latency. Also, avoid useDeferredValue for values that change rapidly (e.g., every keystroke) because the deferred value may never catch up, causing a stale UI.
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| useTransition | Marking state updates as low priority | Extra re-render for pending state | Updates that must be synchronous (e.g., form validation) |
| useDeferredValue | Deferring derived values from props/state | Memory overhead for keeping old value | Simple computations (< 1ms) |
| React.lazy + Suspense | Code splitting large components | Network latency for chunk load | Tiny components (< 5kB) or SSR without streaming |
| startTransition | Non-urgent updates outside hooks | No built-in pending indicator | Inside event handlers that need immediate feedback |
- What is the main benefit of useTransition?
- — Marks a state update as low priority so it can be interrupted by urgent updates, keeping INP under 200ms.
- When should you use useDeferredValue instead of useTransition?
- — When you want to defer a derived value (e.g., filtered list) without wrapping the state update itself.
- What problem does useSyncExternalStore solve?
- — Prevents tearing by ensuring consistent snapshots of external state during concurrent renders.
- What is the cost of using React.lazy?
- — Adds network latency for chunk loading; avoid for components under 5kB or in SSR without streaming.
- What metric should you measure before using concurrent features?
- — INP >200ms or long tasks >50ms in the Performance panel.
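For the tearing point above, an external store only needs a `subscribe` function and a `getSnapshot` function that returns a stable value between changes. A minimal sketch with hypothetical names; in a component it would be consumed as `useSyncExternalStore(store.subscribe, store.getSnapshot)`.

```typescript
type Listener = () => void;

// Minimal external store exposing the two functions useSyncExternalStore expects.
function createStore<S>(initial: S) {
  let state = initial;
  const listeners = new Set<Listener>();
  return {
    // Returns the same value until setState changes it, so snapshots stay consistent.
    getSnapshot: (): S => state,
    subscribe(listener: Listener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    setState(next: S): void {
      if (Object.is(next, state)) return; // no change, no notification
      state = next;
      listeners.forEach((notify) => notify());
    },
  };
}

const counterStore = createStore(0);
let notifications = 0;
const unsubscribe = counterStore.subscribe(() => {
  notifications += 1;
});
counterStore.setState(1);
unsubscribe();
counterStore.setState(2);
console.log(counterStore.getSnapshot(), notifications); // 2 1
```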
13 — REACT
SSR & Hydration Cost
- Senior Signal: 'Always measure the hydration cost before optimizing': A junior might immediately reach for islands or streaming. A staff engineer first instruments Long Tasks and input delay (FID, or INP, which replaced it as a Core Web Vital) to quantify the actual hydration cost. If the total hydration time is under 200ms on a Moto G4, the optimization may be premature. Use `performance.measure` around the hydration root to get concrete numbers.
- When NOT to Use These Optimizations: Avoid islands or resumability if your app has fewer than 5 interactive components or a total bundle size under 30kB. The overhead of splitting into islands (build tooling, state management) can outweigh the benefits. Similarly, don't use streaming SSR if your server response time is already under 200ms, because the complexity of handling streaming errors and mismatches isn't worth it.
| Technique | Best For | Cost | Avoid When |
|---|---|---|---|
| Full SSR + Hydration | Simple pages, small bundles (<50kB) | High TTI, blocks main thread | Complex apps with large JS bundles |
| Streaming SSR + Progressive Hydration | Content-heavy pages, news sites | Moderate complexity, risk of mismatches | Real-time apps needing instant interactivity |
| Islands Architecture | Marketing sites, dashboards with few interactive widgets | Lower hydration cost, but state management overhead | Highly interactive apps (e.g., Figma, Google Docs) |
| React Server Components | Data-heavy pages, e-commerce product lists | Zero client JS for server components, but RSC payload size | Apps requiring client-side interactivity on every component |
- What is the hydration tax?
- — The cost of re-running component logic on the client to attach event listeners, measured as the delay between FCP and TTI. Typically 200-500ms on mid-range devices.
- How does streaming SSR improve TTI?
- — It sends HTML in chunks, allowing the browser to paint content early. Progressive hydration then hydrates visible components first, reducing time to first interaction.
- What is the key trade-off of islands architecture?
- — Lower hydration cost (only interactive components hydrate) but increased developer complexity and potential state management fragmentation.
- How do React Server Components reduce hydration?
- — Server components run on the server and send a serialized payload (not HTML) to the client, requiring zero client JavaScript for those components. Only 'use client' components hydrate.
- What is resumability in Qwik?
- — Instead of hydrating, Qwik serializes application state in HTML and resumes execution on interaction. This eliminates hydration entirely, achieving sub-50ms TTI on slow devices.
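A sketch of timing a hydration span against the 200ms figure mentioned above. `startSpan` and `overBudget` are hypothetical helpers; where to call the returned `end()` function depends on how your framework signals that hydration finished, which is an assumption left to the reader. In a browser you would pass `() => performance.now()` as the clock.

```typescript
interface Span {
  label: string;
  durationMs: number;
}

// Starts a timer and returns a function that, when called, reports the elapsed time.
function startSpan(label: string, now: () => number = Date.now): () => Span {
  const startedAt = now();
  return () => ({ label, durationMs: now() - startedAt });
}

// Budget taken from the text: hydration under 200ms is likely not worth optimizing.
function overBudget(span: Span, budgetMs = 200): boolean {
  return span.durationMs > budgetMs;
}

// Simulated clock for illustration only; these are not real measurements.
let simulatedNow = 0;
const endHydration = startSpan("hydrate-root", () => simulatedNow);
simulatedNow = 340; // pretend hydration finished 340ms later
const hydrationSpan = endHydration();
console.log(hydrationSpan.durationMs, overBudget(hydrationSpan)); // 340 true
```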
14 — NETWORK & DATA
Network & Caching
- Senior Signal: Measure Before You Cache: A junior engineer sets max-age to 3600 on everything. A staff engineer measures the cache hit rate, the cost of a miss (e.g., 200ms vs 50ms), and the frequency of content changes. They know that
immutableis only safe when the URL changes on every deploy — otherwise you break updates. Always validate with Lighthouse or WebPageTest before deciding on a caching strategy. - When NOT to Use CDN Caching: Avoid caching authenticated or user-specific data at the edge unless you use a CDN that supports key-based cache invalidation (e.g., Varnish with Vary: Cookie). Caching a user's private dashboard for all visitors is a data leak. Also, don't cache POST responses — they are rarely idempotent. Stick to GET and HEAD.
- Key Metrics for Network Optimization: Target: LCP < 2.5s, INP < 200ms, CLS < 0.1. Each round-trip adds ~50ms on 4G, ~300ms on 3G. A 100kB Brotli-compressed JS file (vs 130kB gzip) saves ~30ms on 4G. Use `Server-Timing` headers to measure cache hit/miss at each layer.
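
The advice to report cache hit/miss per layer through Server-Timing can be sketched as a small header builder. This is a minimal illustration, not a prescribed format: the `LayerTiming` shape and the layer names `cdn` and `app-cache` are assumptions; only the `name;desc="...";dur=...` entry syntax comes from the Server-Timing header itself.

```typescript
// Hypothetical record for one cache layer (shape assumed for illustration).
interface LayerTiming {
  layer: string;      // metric name, e.g. "cdn" (illustrative)
  hit: boolean;       // true if this layer answered from cache
  durationMs: number; // time spent in this layer, in milliseconds
}

// Builds a Server-Timing header value with one entry per layer.
// Each entry has the form: name;desc="HIT|MISS";dur=<milliseconds>
function serverTimingHeader(layers: LayerTiming[]): string {
  return layers
    .map((l) => `${l.layer};desc="${l.hit ? "HIT" : "MISS"}";dur=${l.durationMs}`)
    .join(", ");
}

const timingHeader = serverTimingHeader([
  { layer: "cdn", hit: true, durationMs: 2 },
  { layer: "app-cache", hit: false, durationMs: 180 },
]);
```

The resulting value appears in the DevTools Network timing panel and is readable from `PerformanceResourceTiming.serverTiming`, so hit rates can be aggregated in RUM.
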
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| max-age + immutable | Versioned static assets (JS, CSS, fonts) | Low; no revalidation | Unversioned or frequently changing resources |
| ETag + 304 | Dynamic API responses, user-specific data | One round-trip per validation | High-frequency polling; use stale-while-revalidate instead |
| stale-while-revalidate | News feeds, product listings | Background fetch; no user wait | Real-time data (chat, stock prices) |
| no-cache | Always-fresh content (e.g., CSRF tokens) | One revalidation round-trip on every use | Static assets; adds a round-trip per request |
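
The table above can be read as a mapping from resource kind to a Cache-Control value. The sketch below is one possible encoding of it. The `ResourceKind` names and the specific durations (one year, 60s, 300s) are illustrative assumptions and should be tuned against measured hit rates.

```typescript
// Hypothetical categories mirroring the rows of the table.
type ResourceKind = "versioned-asset" | "api-response" | "listing" | "always-fresh";

function cacheControlFor(kind: ResourceKind): string {
  switch (kind) {
    case "versioned-asset":
      // Safe only because the URL changes on every deploy (hashed filename).
      return "public, max-age=31536000, immutable";
    case "api-response":
      // Pair with an ETag so the client can revalidate and receive a 304.
      return "private, max-age=0, must-revalidate";
    case "listing":
      // Serve the cached copy for 60s, then serve stale while refetching.
      return "public, max-age=60, stale-while-revalidate=300";
    case "always-fresh":
      // The response may be stored but must be revalidated before each use.
      return "no-cache";
  }
}
```

Keeping the policy in one function makes it easy to audit in review and to unit-test alongside the performance budget.
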
- What is the difference between max-age and s-maxage?
- — max-age applies to browsers and intermediate caches; s-maxage overrides max-age for shared caches (CDNs, proxies) but not browsers.
- Why was HTTP/2 server push deprecated?
- — It often pushed resources already in the browser cache, wasting bandwidth. Use 103 Early Hints or preload links instead.
- When should you use Brotli over gzip?
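- — Brotli for text assets (HTML, CSS, JS, SVG): 20-30% better compression. Fall back to gzip for clients that do not advertise br in Accept-Encoding. Skip both for formats that are already compressed, such as WOFF/WOFF2 fonts and most images.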
- What is the cache hierarchy from fastest to slowest?
- — Browser memory cache → browser disk cache → service worker cache → CDN edge cache → origin server.
- How does HTTP/3 reduce latency compared to HTTP/2?
- — HTTP/3 runs over QUIC on UDP, which removes TCP head-of-line blocking between streams and cuts connection setup from 2-3 RTTs (TCP plus TLS) to 1, or 0 on a resumed connection.
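
The Brotli-versus-gzip card can be illustrated as server-side negotiation on the Accept-Encoding request header. This is a simplified sketch: the function name `pickEncoding` is hypothetical, and it treats q-values only as "accepted or refused" rather than ranking by weight.

```typescript
type ContentCoding = "br" | "gzip" | "identity";

// Picks the best supported coding the client accepts.
// Simplification: any coding with q > 0 counts as accepted; Brotli is preferred.
function pickEncoding(acceptEncoding: string): ContentCoding {
  const accepted = new Set<string>();
  for (const part of acceptEncoding.split(",")) {
    const pieces = part.split(";").map((s) => s.trim().toLowerCase());
    const token = pieces[0] ?? "";
    const qParam = pieces.slice(1).find((p) => p.startsWith("q="));
    const weight = qParam === undefined ? 1 : Number(qParam.slice(2));
    // q=0 means "explicitly refused"; NaN fails the comparison and is skipped.
    if (token !== "" && weight > 0) accepted.add(token);
  }
  if (accepted.has("br")) return "br";
  if (accepted.has("gzip")) return "gzip";
  return "identity";
}
```

For example, `pickEncoding("gzip, deflate, br")` selects Brotli, while a legacy client sending only `gzip` falls back to gzip.
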
15 — NETWORK & DATA
Data Fetching Optimization
- Senior Signal: Always measure cache hit rate first: A junior adds caching blindly. A staff engineer instruments cache hit/miss ratios and measures time-to-interactive before and after. If your cache hit rate is below 80%, your staleTime or query key design is wrong. Use browser DevTools or RUM to confirm.
- When NOT to deduplicate: If your API returns user-specific data (e.g., /api/orders?userId=X), deduplication across users is dangerous. Ensure query keys include all parameters that affect the response. Also, avoid deduplication for real-time streams (WebSocket) — use a different cache layer.
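
The deduplication warning above comes down to how the key is built. The sketch below, under assumed names (`serializeKey`, `dedupedFetch`, and the `fetcher` callback are all hypothetical), shares one in-flight promise per key and includes every parameter in that key, so requests for different users never collapse into one.

```typescript
type QueryParams = Record<string, string | number>;

// Stable key: parameter names are sorted so {a, b} and {b, a} match.
function serializeKey(endpoint: string, params: QueryParams): string {
  const pairs = Object.keys(params)
    .sort()
    .map((k) => `${k}=${String(params[k])}`);
  return `${endpoint}?${pairs.join("&")}`;
}

// In-flight requests only; this is not a response cache.
const inFlight = new Map<string, Promise<unknown>>();

function dedupedFetch<T>(
  endpoint: string,
  params: QueryParams,
  fetcher: () => Promise<T>,
): Promise<T> {
  const key = serializeKey(endpoint, params);
  const existing = inFlight.get(key);
  if (existing !== undefined) return existing as Promise<T>;

  const request = fetcher();
  inFlight.set(key, request);
  // Remove the entry once the request settles, whether it succeeded or failed.
  const clear = (): void => {
    inFlight.delete(key);
  };
  request.then(clear, clear);
  return request;
}
```

Two simultaneous calls for `/api/orders` with `userId: 1` share one network request, while a call with `userId: 2` issues its own.
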
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| GraphQL field selection | Complex UIs with varying data needs | Schema maintenance, resolver complexity | Simple CRUD with fixed views |
| REST sparse fieldsets | Legacy APIs, simple clients | Backend parsing overhead, caching granularity | Payloads < 2kB already |
| Over-fetching (no optimization) | Prototypes, low-traffic pages | Bandwidth waste, slower parse time | Any page with > 10 requests or > 50kB total |
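
As a rough illustration of the REST sparse fieldsets row, a server can project each record onto the fields named in a query parameter such as `?fields=id,name`. The parameter name and the helper `selectFields` are assumptions for this sketch, not part of any particular API.

```typescript
type JsonRecord = Record<string, unknown>;

// Returns only the requested fields; with no parameter, returns a full copy.
function selectFields(record: JsonRecord, fieldsParam?: string): JsonRecord {
  if (fieldsParam === undefined || fieldsParam.trim() === "") {
    return { ...record };
  }
  const out: JsonRecord = {};
  for (const field of fieldsParam.split(",").map((f) => f.trim())) {
    // Ignore unknown names and anything inherited from the prototype chain.
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      out[field] = record[field];
    }
  }
  return out;
}
```

Note that each distinct `fields` value produces a distinct URL, which is the caching-granularity cost listed in the table.
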
- What is the key trade-off when setting staleTime in React Query?
- — Too short: unnecessary refetches on mount. Too long: stale data visible. Set based on data volatility (user profile: 5-30 min, feed: 30s-2min).
- When should you NOT use optimistic updates?
- — When mutation success rate < 95% or rollback takes > 200ms. Also avoid for irreversible actions (e.g., payments) without server confirmation.
- What metric indicates a waterfall problem?
- — Waterfall depth > 3 requests in Chrome DevTools Network tab, or total time > 2x the slowest single request due to sequential RTTs.
- Cursor vs offset pagination: when is offset acceptable?
- — When data is static (e.g., historical logs) or paginated tables with page numbers. For infinite scroll with live data, always use cursor.
- What is the cost of request deduplication?
- — ~0.1ms CPU overhead per key comparison. Negligible, but dangerous if query keys don't include user-specific parameters.
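
The cursor-versus-offset card is easier to see with a feed that changes between page loads. The sketch below uses an in-memory array sorted newest first; `FeedItem`, `pageByCursor`, and `pageByOffset` are hypothetical names, and a real backend would express the same filter as a query (for example `WHERE id < cursor`).

```typescript
interface FeedItem {
  id: number; // assumed to increase with creation time
}

interface FeedPage {
  items: FeedItem[];
  nextCursor: number | null; // id of the last item returned, or null at the end
}

// Cursor pagination: "give me items older than the last one I saw".
function pageByCursor(feed: FeedItem[], cursor: number | null, limit: number): FeedPage {
  const rest = cursor === null ? feed : feed.filter((it) => it.id < cursor);
  const items = rest.slice(0, limit);
  const last = items[items.length - 1];
  const nextCursor = rest.length > limit && last !== undefined ? last.id : null;
  return { items, nextCursor };
}

// Offset pagination: "skip N rows", which shifts when new rows arrive on top.
function pageByOffset(feed: FeedItem[], offset: number, limit: number): FeedItem[] {
  return feed.slice(offset, offset + limit);
}
```

If a new item is prepended after the first page loads, the offset version returns an already-seen item on page two, while the cursor version continues from where the user stopped.
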
16 — PERCEIVED & MEMORY
Perceived Performance
- Senior Signal: 'Always measure the occupied time, not just the metric': A junior says 'we use skeletons.' A staff engineer says 'we measured that skeletons reduced perceived wait by 40% in our A/B test, but we also tracked that they increased CLS by 0.02. We accepted that trade-off because the conversion lift was 3%.' Always quantify the perception gap with real user monitoring (RUM) data.
- When NOT to use optimistic UI: Optimistic UI is dangerous for non-idempotent actions (e.g., charging a credit card, sending an email). If the server fails, you've shown a success state that's false. Always pair optimistic updates with a clear error state and rollback. Also avoid it when the action has side effects that are hard to reverse (e.g., deleting a user).
- Common Mistake: Over-prefetching: Prefetching every link on hover can waste bandwidth and CPU, especially on mobile. A 100ms hover threshold is too aggressive — use 200ms minimum. Also avoid prefetching for links that are likely to be cancelled (e.g., dropdown menus). Measure the cache hit rate: if <20% of prefetched resources are used, you're wasting resources.
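
The rule to pair optimistic updates with a clear error state and rollback can be sketched for an idempotent "like" toggle. Everything named here (`LikeState`, `toggleLikeOptimistically`, the `render` and `saveToServer` callbacks, the error text) is a hypothetical stand-in for whatever the UI framework and API client provide.

```typescript
interface LikeState {
  liked: boolean;
  count: number;
  error: string | null;
}

async function toggleLikeOptimistically(
  current: LikeState,
  render: (next: LikeState) => void,
  saveToServer: (liked: boolean) => Promise<void>,
): Promise<LikeState> {
  const optimistic: LikeState = {
    liked: !current.liked,
    count: current.count + (current.liked ? -1 : 1),
    error: null,
  };
  render(optimistic); // instant feedback, before the network round-trip

  try {
    await saveToServer(optimistic.liked);
    return optimistic;
  } catch {
    // Roll back to the pre-click state and surface a visible error.
    const rolledBack: LikeState = { ...current, error: "Could not save. Please retry." };
    render(rolledBack);
    return rolledBack;
  }
}
```

The same structure is a poor fit for payments or deletes: the rollback path would have to undo something the user already believes has happened.
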
| Technique | Best for | Cost | Avoid when |
|---|---|---|---|
| Skeleton screens | Predictable layout, >1s load | CLS risk if no fixed dimensions | Dynamic content (e.g., search results) |
| Spinners | Unpredictable layout, <1s load | No layout shift, but feels slower | Long loads (>3s) without progress |
| Optimistic UI | Idempotent actions (like, save) | State rollback complexity | Non-idempotent or error-prone actions |
- What is the 100ms threshold for perceived instantaneity?
- — Under 100ms, users perceive the system as reacting instantly. Visual feedback (e.g., button press) should appear within 50ms.
- What is 'occupied time' in perceived performance?
- — The time the user's brain is busy processing visual changes (e.g., skeleton screens, animations). Occupied time feels shorter than unoccupied time (blank screen).
- When should you use skeleton screens vs spinners?
- — Skeletons for predictable layout and >1s loads; spinners for unpredictable layout or <1s loads. Skeletons reduce perceived latency but risk CLS.
- What is the trade-off of optimistic UI?
- — Instant feedback but requires rollback on failure. Only use for idempotent actions (likes, toggles). Avoid for destructive or non-idempotent operations.
- What is the PRPL pattern?
- — Push (or preload) critical resources for the initial route, Render the initial route, Pre-cache remaining routes, Lazy-load remaining routes on demand. Prioritizes perceived speed over full load.
17 — PERCEIVED & MEMORY
Memory Optimization
- Senior Signal: Always Measure First: Before optimizing, run a heap snapshot in DevTools. If heap size is stable after 5 minutes of use, don't touch it. Premature memory optimization adds complexity. A staff engineer says: 'Show me the allocation timeline, then we talk.'
- When NOT to Use WeakRef: WeakRef is non-deterministic — the GC may clear it at any time. Never use it for critical data that must be available synchronously. Also, creating many WeakRefs can itself increase GC overhead. Stick to WeakMap for 99% of cases.
| Technique | Best For | Cost | Avoid When |
|---|---|---|---|
| WeakMap | DOM node metadata, private data | Slightly slower lookup than Map | Need to iterate keys (not possible) |
| WeakSet | Marking objects without preventing GC | Minimal | Need to store primitives |
| WeakRef | Large caches that can be recreated | GC may clear ref at any time | Data must always be available |
| Map/Set | General purpose, iterable | Prevents GC of keys | Memory-sensitive contexts |
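
The WeakMap row above can be sketched as per-node metadata that does not keep the node alive. Plain objects stand in for DOM nodes here so the snippet runs outside a browser; `NodeMeta` and `recordClick` are hypothetical names.

```typescript
interface NodeMeta {
  clicks: number;
}

// Keys are held weakly: once a node has no other references,
// both the node and its metadata become eligible for garbage collection.
const nodeMeta = new WeakMap<object, NodeMeta>();

function recordClick(node: object): number {
  const entry = nodeMeta.get(node) ?? { clicks: 0 };
  entry.clicks += 1;
  nodeMeta.set(node, entry);
  return entry.clicks;
}
```

With a regular Map, the same code would pin every node ever clicked, which is a common source of detached DOM node leaks.
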
- What is a detached DOM node?
- — A DOM node removed from the document tree but still referenced by JavaScript, preventing GC.
- When to use WeakMap vs Map?
- — WeakMap when keys are DOM nodes or objects that should be GC'd when no other references exist. Map when you need to iterate keys or store primitives.
- What is a generational GC?
- — V8 divides heap into young (new space, collected frequently) and old (old space, collected less often). Most objects die young.
- How to detect a memory leak in DevTools?
- — Take heap snapshots before and after repeating an action and compare them. In the allocation timeline a sawtooth is normal; a leak shows up as a baseline that keeps rising after each GC. Filter snapshots for 'Detached' nodes.
- What is the cost of object pooling?
- — Increased code complexity, risk of stale data, and potential for bugs if objects are not properly reset. Only use when allocation rate >10MB/s or GC pauses >50ms.
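
The object pooling trade-off can be made concrete with a minimal pool. This is a sketch under assumptions: `Particle` and `ParticlePool` are hypothetical, and the reset in `release` is the step whose omission causes the stale-data bugs mentioned above.

```typescript
interface Particle {
  x: number;
  y: number;
  alive: boolean;
}

class ParticlePool {
  private readonly free: Particle[] = [];
  created = 0; // number of real allocations, useful for verifying reuse

  acquire(x: number, y: number): Particle {
    const particle = this.free.pop() ?? this.allocate();
    particle.x = x;
    particle.y = y;
    particle.alive = true;
    return particle;
  }

  release(particle: Particle): void {
    // Reset every field so the next user never sees stale data.
    particle.x = 0;
    particle.y = 0;
    particle.alive = false;
    this.free.push(particle);
  }

  private allocate(): Particle {
    this.created += 1;
    return { x: 0, y: 0, alive: false };
  }
}
```

Callers must not keep using a particle after releasing it, which is exactly the added complexity the card warns about; measure allocation rate and GC pauses before adopting this.
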