Cheatsheet

Optimization Deep Dive — Staff-Level Performance Handbook

Condensed for last-minute review — every key takeaway, decision table, and recall card.

01 — DIAGNOSE

Measurement & Profiling

Senior Signal: Always Measure First: A junior engineer says 'I think the bottleneck is the hero image.' A staff engineer says 'Let me check the field data. If p75 LCP is above 2.5s, we profile the page load to see whether the time goes to network or rendering.' Never guess; always measure with real data.
Avoid Premature Optimization: Optimizing before measuring is the #1 performance mistake. A junior engineer might preload every image or inline all CSS, bloating the page. A staff engineer measures first, finds the real bottleneck (e.g., a 300ms server response), and optimizes that. Premature optimization adds complexity without guaranteed benefit.

Aspect	Lab Data	Field Data
Environment	Controlled (e.g., Moto G4, Slow 3G)	Real user devices, networks, locations
Metrics	Synthetic scores (Lighthouse performance)	Real user percentiles (p75, p95)
Use Case	Debugging, CI gates, regression detection	Prioritization, user impact, long-term trends
Limitation	May not reflect real-world variability	Noisy, requires sufficient sample size

What is the threshold for a long task?: — 50ms — tasks longer than this block the main thread and degrade INP.
What is the difference between lab and field data?: — Lab data is synthetic and controlled; field data is from real users and captures variability.
What is the measure → optimize → verify loop?: — Measure current performance, optimize one bottleneck, then verify the improvement with the same metrics.
What is a vanity metric?: — A metric that doesn't correlate with user experience, e.g., DOMContentLoaded or total page weight.
What is a performance budget?: — A limit on metrics (e.g., bundle size <200kB, LCP <2.5s) enforced in CI to prevent regressions.
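
The p75 figures and budget checks above can be sketched as a small calculation. This is a minimal illustration, not a RUM library: the function names, the nearest-rank percentile method, and the sample values are assumptions made for the example.

```typescript
// Nearest-rank percentile over field samples (e.g., LCP in milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical budget check: passes when p75 is within the limit.
function withinBudget(samples: number[], limitMs: number): boolean {
  return percentile(samples, 75) <= limitMs;
}

// Made-up LCP samples, for illustration only.
const lcpSamplesMs = [1200, 1800, 2100, 2400, 2600, 3100, 1900, 2200];
console.log("p75 LCP (ms):", percentile(lcpSamplesMs, 75));
console.log("within 2500ms budget:", withinBudget(lcpSamplesMs, 2500));
```

A CI gate could fail the build when `withinBudget` returns false, which is one way to enforce the budget described in the last card.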

02 — LOADING

Bundle Size & Code Splitting

When NOT to use tree shaking: Tree shaking is ineffective for CommonJS modules (require()) and much less effective for packages that do not declare "sideEffects": false. Dynamic access also defeats static analysis: with computed property access (e.g., utils[name]) or a dynamic import path (e.g., import(`./${name}`)), the bundler cannot tell which exports or modules are actually used. Don't waste time micro-optimizing tree shaking on a 10kB utility; focus on the 200kB+ libraries first.
Senior Signal: Measure First, Then Optimize: A junior engineer will say 'let's split everything into lazy chunks.' A staff engineer says: 'Let's run a bundle analysis, identify the top 3 bloat sources, and then decide if splitting or tree shaking gives the best ROI. For example, if a 200kB chart library is only used on one route, route-based splitting saves 200kB. But if it's used on 5 routes, a shared chunk might be better. Always measure the actual impact on LCP and INP before and after.'

Technique	Best for	Cost	Avoid when
Route-based splitting	Large SPAs with many pages	Adds network round-trip per route	Routes are <10kB each
Component-based splitting	Heavy, rarely-used components	Loading state + potential CLS	Component is <50kB or used on every page
Vendor chunking	Stable third-party deps	Cache invalidation complexity	Vendor is <100kB or changes frequently
Shared chunks	Common modules across routes	Increased initial bundle size	Shared code is <5kB per route

What is the minimum bundle size that typically causes performance regressions?: — ~200kB uncompressed; above this, LCP and TBT degrade noticeably.
What flag must a package set in package.json so bundlers can safely skip its unused modules?: — "sideEffects": false
What is the main trade-off of route-based code splitting?: — Reduces initial bundle size but adds a network round-trip per route transition, potentially hurting INP.
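Why do barrel files (index.ts re-exports) hurt tree shaking?: — Importing one name from a barrel pulls every re-exported module into the module graph. Unless the package is marked side-effect free, the bundler must keep any module that might run code at import time, so unused modules still ship.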
What is the recommended minSize for shared chunks in webpack splitChunks?: — 20kB (20000 bytes) to avoid creating too many tiny chunks.
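
The splitChunks settings mentioned above might look like the following. This is a sketch of the `optimization.splitChunks` section written as a plain object; the group names (`vendors`, `shared`) and the priorities are illustrative assumptions, and the values should be tuned against a real bundle analysis.

```typescript
// Sketch of a webpack 5 `optimization.splitChunks` section as a plain object.
// Group names and priorities are illustrative choices, not recommendations.
const splitChunksSketch = {
  chunks: "all", // consider both initial and async chunks
  minSize: 20000, // bytes; skip creating chunks smaller than ~20kB
  cacheGroups: {
    vendors: {
      test: /[\\/]node_modules[\\/]/, // matches third-party modules
      name: "vendors",
      priority: 10,
    },
    shared: {
      minChunks: 2, // module must be used by at least two chunks
      priority: 5,
      reuseExistingChunk: true,
    },
  },
};

console.log(JSON.stringify(splitChunksSketch.cacheGroups.shared));
```

In a real project this object would sit under `optimization.splitChunks` in the webpack config.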

03 — LOADING

Build Tooling & Chunking

Senior Signal: 'Always measure before optimizing': A junior says 'Vite is faster' and switches. A staff engineer says: 'I measured our cold build at 45s and HMR at 200ms. The bottleneck was our custom webpack plugin, not webpack itself. We profiled, found the plugin was doing O(n²) lookups, fixed it, and got cold builds to 12s without changing bundlers. Only then did we evaluate Vite for the next project.' Always profile your actual build pipeline before blaming the tool.
When NOT to use aggressive chunk splitting: If your app has fewer than 10 routes and total bundle size is under 200kB, splitting into 10+ chunks adds HTTP overhead and complexity without benefit. For small apps, a single bundle with code-splitting only for heavy third-party libraries (e.g., charting) is better. Measure: if your total JS is <100kB, don't split at all—the 16ms parsing time is negligible. Only split when you see route-level bundles exceeding 50kB or when vendor code is >30% of total.

Bundler	Dev Speed	Production Output Size	Plugin Ecosystem	Best For
webpack 5	Slow (10-60s cold)	Good (baseline)	Excellent	Complex SPAs, legacy migration
Vite	Fast (<2s cold, <50ms HMR)	Excellent (Rollup)	Good (growing)	New SPAs, SSR, modern stacks
esbuild	Very fast (<1s cold)	Good (5-10% larger)	Limited	Dev builds, simple apps, transpilation
Rollup	Moderate (no HMR)	Excellent (smallest)	Good	Libraries, component packages

Minifier	Speed (500kB bundle)	Output Size Reduction	Best For
Terser	2-5s	10-15%	Production builds where size is critical
esbuild	0.1-0.3s	5-10%	Dev builds, CI, fast iterations
SWC	0.2-0.5s	5-10%	Webpack projects needing speed

What is the primary trade-off between webpack and Vite for production builds?: — Vite uses Rollup for production, offering better tree-shaking and smaller output than webpack, but webpack has a richer plugin ecosystem for complex scenarios (e.g., Module Federation).
What is the recommended chunk size range for HTTP/2?: — 100-200kB per chunk. Smaller chunks increase HTTP overhead; larger chunks delay parsing. Aim for 3-6 core chunks plus route-level async chunks.
How does content-hash filename improve caching?: — It generates a unique filename based on file content. Unchanged chunks keep the same URL, so browsers reuse cached versions. Only changed chunks get new URLs, avoiding full cache invalidation.
When should you NOT use Module Federation?: — When teams cannot coordinate shared dependency versions, or when the app has fewer than 3 independent teams. The overhead of runtime sharing and version negotiation outweighs benefits for small projects.
What is the key metric for evaluating minifier choice?: — Build time vs output size. On a 500kB bundle, Terser takes 2-5s versus 0.1-0.3s for esbuild (about an order of magnitude slower) but achieves roughly 5 percentage points more size reduction (10-15% vs 5-10%). Choose based on whether your bottleneck is build speed or bundle size.
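
The content-hash caching idea above can be illustrated with a toy naming helper. This is only a sketch: FNV-1a stands in for whatever hash a bundler actually uses, and `hashedFilename` is a hypothetical helper, not a bundler API.

```typescript
// FNV-1a 32-bit hash, used here only as a small stand-in for a bundler's content hash.
function fnv1a(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical naming helper producing "<name>.<hash>.js".
function hashedFilename(chunkName: string, content: string): string {
  return `${chunkName}.${fnv1a(content)}.js`;
}

const before = hashedFilename("vendor", "export const a = 1;");
const unchanged = hashedFilename("vendor", "export const a = 1;");
const edited = hashedFilename("vendor", "export const a = 2;");
console.log(before === unchanged); // same content keeps the same URL
console.log(before === edited); // changed content gets a new URL
```

Because the URL changes only when the bytes change, unchanged chunks can be served with a long max-age.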

04 — LOADING

Image & Font Optimization

When NOT to use AVIF: Avoid AVIF for images that are critical to LCP (e.g., hero banners) on low-end devices. Decoding a 2MB AVIF can take 80ms on a Moto G4, blowing the 50ms long-task threshold. Use WebP for LCP images and reserve AVIF for non-critical, large background images.
Senior Signal: Always measure first: A junior engineer adds lazy loading to every image. A staff engineer measures: 'Our LCP image is 1.8s on 4G — lazy loading it would push LCP to 3.2s. Instead, I'll eager-load the hero and lazy-load the 12 gallery images below the fold.' Always profile with Lighthouse and real-user monitoring (RUM) before applying optimizations.
Common Mistake: Over-subsetting: Subsetting to only ASCII characters breaks internationalization. If your site supports Japanese or Cyrillic, include those ranges. Always test with a full character set in staging. A subset that's too aggressive can cause missing glyphs (tofu boxes) that hurt UX more than the 50KB savings.

Technique	Best for	Cost	Avoid when
srcset + sizes	Resolution switching	Low: one img tag	Art direction needed
<picture>	Format switching, art direction	Medium: multiple source tags	Simple resolution only
Client Hints (DPR, Viewport-Width)	Automatic selection	Low: HTTP header	Privacy restrictions, legacy browsers
CDN image transformation	Dynamic resizing	Variable: per-request cost	Static assets, low traffic

Strategy	FOUT/FOIT	CLS risk	Best for
font-display: swap	FOUT (visible fallback)	Low if fallback metrics match	Brand fonts, headings
font-display: optional	FOIT (invisible text up to 100ms)	None if font fails	Body text, non-critical
font-display: block	FOIT (up to 3s)	High	Avoid in production
Variable fonts	Single file for multiple weights	Low (one download)	Sites with many font weights

What is the primary trade-off between AVIF and WebP?: — AVIF is 50% smaller but 2-3x slower to decode on older CPUs. WebP is 30% smaller with fast decode. Use WebP for LCP images, AVIF for non-critical.
How do you prevent CLS from images without fixed dimensions?: — Use CSS `aspect-ratio` with `max-width: 100%` and `height: auto`, or set explicit `width` and `height` on `<img>` tags.
What is the difference between font-display: swap and optional?: — `swap` shows fallback text immediately (FOUT) and swaps when font loads. `optional` may skip the font if it takes >100ms, showing fallback permanently — zero CLS but no custom font.
When should you use the `<picture>` element instead of srcset?: — Use `<picture>` for format switching (AVIF/WebP/JPEG) or art direction (different crops). Use `srcset` for simple resolution switching.
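What is the recommended rootMargin for IntersectionObserver lazy loading?: — 200px is a common choice. Native `loading='lazy'` uses browser-defined distance thresholds instead (about 1250px in Chrome on fast connections). Either way, images start loading before they enter the viewport, reducing perceived latency.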
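
The srcset and reserved-dimension cards above can be sketched as two small helpers. The `?w=` query parameter is a hypothetical image-CDN convention and both function names are assumptions made for this example.

```typescript
// Builds a `srcset` value for resolution switching.
// The `?w=` query parameter is a hypothetical image-CDN convention.
function buildSrcset(baseUrl: string, widths: number[]): string {
  return widths
    .slice()
    .sort((a, b) => a - b)
    .map((w) => `${baseUrl}?w=${w} ${w}w`)
    .join(", ");
}

// Computes width/height attributes that reserve layout space before the
// image loads, which is what prevents the layout shift.
function reservedBox(intrinsicWidth: number, intrinsicHeight: number, renderedWidth: number) {
  const height = Math.round((renderedWidth * intrinsicHeight) / intrinsicWidth);
  return { width: renderedWidth, height };
}

console.log(buildSrcset("/img/hero.jpg", [960, 480, 1440]));
console.log(reservedBox(1600, 900, 800));
```

The resulting string goes into the `srcset` attribute, paired with a `sizes` attribute that describes the rendered width.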

05 — LOADING

Critical Rendering Path

Senior Signal: Always measure the cost of inlining: A junior engineer inlines all CSS. A staff engineer measures: if the full CSS is <14kB gzipped, inlining everything may be faster than splitting. Use Lighthouse and WebPageTest to compare FCP with and without extraction. The decision depends on your CSS size, cache hit rate, and HTML delivery latency.
Common Mistake: Preloading everything: Preloading too many resources (e.g., all images) can delay the LCP resource by competing for bandwidth and connections. Every extra preload takes capacity away from the resources that matter, so keep the list short; a handful (3-6) is a reasonable rule of thumb. Only preload resources that are discovered late in the HTML or are critical for above-the-fold rendering. Validate with the Priority column in the DevTools Network panel and with Lighthouse.

Technique	Best for	Cost	Avoid when
defer	Scripts that depend on DOM or other deferred scripts	Delays execution until after parsing	Scripts that must run during parsing (e.g., before first paint)
async	Independent scripts (analytics, ads)	No order guarantee; execution can interrupt HTML parsing	Scripts with dependencies or DOM manipulation
sync (no attr)	Legacy or critical inline scripts	Blocks parsing and painting entirely	Any script >1kB that can be deferred

What is the critical rendering path?: — The sequence of steps the browser takes to convert HTML, CSS, and JS into pixels. Blocking any step delays paint.
How does defer differ from async?: — defer preserves execution order and runs after HTML parsing; async runs as soon as downloaded, no order guarantee.
When should you use preload vs prefetch?: — preload for critical current-page resources; prefetch for likely-next-page resources. Overuse wastes bandwidth.
What is the 14kB rule for critical CSS?: — Inline only the CSS needed for above-the-fold content, keeping the HTML plus inlined CSS under ~14kB compressed so it fits in the first TCP round trip (an initial congestion window of 10 packets, about 14.6kB).
What does fetchpriority='high' do?: — Hints the browser to prioritize that resource (e.g., LCP image) over others. Overuse can cause priority inversion.
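
The ~14kB figure comes from TCP slow start, and the arithmetic can be written out as a quick check. The sketch below assumes an initial congestion window of 10 segments (RFC 6928) and a typical 1460-byte segment size; it ignores TLS and HTTP header overhead, and `fitsInFirstFlight` is a hypothetical helper.

```typescript
// Assumptions: initial congestion window of 10 segments (RFC 6928)
// and a typical 1460-byte maximum segment size.
const INITIAL_CWND_SEGMENTS = 10;
const MSS_BYTES = 1460;
const FIRST_FLIGHT_BYTES = INITIAL_CWND_SEGMENTS * MSS_BYTES; // 14600

// Hypothetical helper: does HTML plus inlined critical CSS fit in the first flight?
// Ignores TLS records and response headers, so treat the result as optimistic.
function fitsInFirstFlight(compressedHtmlBytes: number, compressedCssBytes: number): boolean {
  return compressedHtmlBytes + compressedCssBytes <= FIRST_FLIGHT_BYTES;
}

console.log(FIRST_FLIGHT_BYTES);
console.log(fitsInFirstFlight(6000, 8000));
console.log(fitsInFirstFlight(9000, 8000));
```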

06 — LOADING

Third-Party Scripts

Senior Signal: Always measure first, then optimize: A junior says 'we should async all scripts.' A staff engineer says 'let's measure the real cost with DevTools Performance tab, then decide which scripts are critical and which can be deferred or removed.' The key is data-driven triage.
When NOT to use Partytown: Partytown is not a silver bullet. Avoid it for scripts that manipulate the DOM (e.g., chat widgets, A/B testing tools) because they require synchronous DOM access. Also, the postMessage overhead can add 10-50ms per call, which may negate benefits for high-frequency interactions. Always measure TBT and INP before and after.

Technique	Best for	Cost	Avoid when
Self-hosting	Critical scripts (e.g., analytics, auth)	Maintenance overhead, no CDN caching	Scripts that update frequently (e.g., tag managers)
CDN with preconnect	Non-critical scripts (e.g., fonts, widgets)	Extra DNS lookup (mitigated by preconnect)	Scripts that need low latency on first load
Facade pattern	Heavy embeds (chat, video, maps)	Delayed interactivity	Scripts needed for initial UX (e.g., login)
Partytown	Analytics, tag managers	Setup complexity, postMessage latency	DOM-manipulating scripts

What is the facade pattern for third-party scripts?: — Replace heavy widget with a lightweight placeholder that loads the real script only on user interaction. Saves 200-500 kB JS and improves LCP by 1-2s.
When should you use Partytown vs async/defer?: — Partytown: analytics/tag managers that don't need DOM access. Async/defer: scripts that need DOM but not immediate execution. Avoid Partytown for DOM-manipulating widgets.
What metric best captures third-party script impact on interactivity?: — Total Blocking Time (TBT) and INP. Look for long tasks >50ms attributed to third-party origins in DevTools.
What is the trade-off of self-hosting third-party scripts?: — Reduces DNS/network latency by 100-300ms but loses CDN caching and requires manual updates. Best for critical, infrequently updated scripts.
How do you measure third-party script impact in DevTools?: — Performance tab > Bottom-Up view > filter by script origin. Network tab > blocking time. Lighthouse CI for regression tracking.
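
The facade pattern described above reduces to "load the real thing once, on first interaction". The sketch below shows only that core with a stand-in loader so it runs outside a browser; `createFacade` and the loader are hypothetical names.

```typescript
// Wraps a loader so the heavy script is requested at most once,
// and only when the user first interacts with the placeholder.
function createFacade<T>(loadWidget: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return function activate(): Promise<T> {
    if (pending === null) {
      pending = loadWidget(); // first interaction triggers the real load
    }
    return pending; // later interactions reuse the same promise
  };
}

// Usage sketch with a stand-in loader. In a browser the loader could be a
// dynamic import wired to the placeholder's click handler.
let loadCount = 0;
const activateChat = createFacade(async () => {
  loadCount += 1;
  return "chat-ready";
});
activateChat();
activateChat();
console.log("loader calls:", loadCount);
```

Until `activate` is called, none of the widget's JavaScript is downloaded or executed.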

07 — RUNTIME

Rendering & Paint Performance

When NOT to Use will-change: Do not apply will-change to every animated element. On mobile, 20+ layers can cause GPU memory pressure and actually increase jank. Only promote elements that are actively animating and where you've measured a benefit. For simple transitions, the browser's own heuristics are often sufficient.
Senior Signal: Always Measure First: A junior says 'I'll add will-change to fix jank.' A staff engineer says 'Let me profile the frame budget first — if the bottleneck is layout, I'll batch reads; if it's paint, I'll promote to compositor; if it's JavaScript, I'll defer work.' Never optimize without a DevTools Performance recording. The 16ms budget is a target, not a guarantee — measure the actual cost of your change.

Technique	Best For	Cost	Avoid When
transform/opacity animation	Position, size, visibility	Zero layout/paint; ~0.1ms composite	Need to change actual layout (e.g., reflow siblings)
will-change: transform	Pre-promote to compositor layer	Memory: ~1-2MB per layer on mobile	More than 10 elements; causes layer explosion
content-visibility: auto	Off-screen content in long lists	Initial layout cost; ~0.5ms per element	Above-the-fold content; can delay LCP
CSS containment	Isolate subtrees from layout	Minimal; ~0.01ms per container	Small components with no layout impact

What is the 16ms frame budget and how is it typically split?: — 16.67ms per frame at 60fps. Typical split: JS ~5ms, Style/Layout ~3ms, Paint ~5ms, Composite ~3ms. The slices are guidelines; the frame drops when the total exceeds the budget.
Which CSS properties only trigger composite (no layout or paint)?: — transform and opacity. Most other properties (width, height, left, top, color, etc.) trigger layout or paint.
What is layout thrashing and how do you fix it?: — Interleaving reads (e.g., offsetHeight) and writes (e.g., style.left) forces synchronous layout flushes. Fix by batching all reads first, then writes.
What is the risk of overusing will-change?: — Layer explosion: each promoted layer costs ~1-2MB GPU memory. On mobile, 20+ layers can cause memory pressure and increase jank.
When should you NOT use content-visibility: auto?: — On above-the-fold content. It delays rendering and can increase LCP beyond 2.5s. Only use on off-screen sections.
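
The layout-thrashing fix above (batch reads, then writes) can be sketched as a tiny scheduler. This is an illustration in the spirit of read/write batching libraries, not a specific library's API; `FrameBatcher` is a hypothetical class and the jobs here only log their order.

```typescript
type Job = () => void;

// Minimal read/write scheduler: all queued reads run before any queued write.
class FrameBatcher {
  private reads: Job[] = [];
  private writes: Job[] = [];

  measure(job: Job): void {
    this.reads.push(job);
  }

  mutate(job: Job): void {
    this.writes.push(job);
  }

  // In a browser, this would be called from a requestAnimationFrame callback.
  flush(): void {
    const reads = this.reads;
    const writes = this.writes;
    this.reads = [];
    this.writes = [];
    reads.forEach((job) => job()); // layout reads first
    writes.forEach((job) => job()); // style writes afterwards
  }
}

// Usage sketch: jobs are queued interleaved, but run as reads, then writes.
const order: string[] = [];
const batcher = new FrameBatcher();
batcher.measure(() => order.push("read A"));
batcher.mutate(() => order.push("write A"));
batcher.measure(() => order.push("read B"));
batcher.mutate(() => order.push("write B"));
batcher.flush();
console.log(order.join(", "));
```

With real DOM work, the reads would be calls like `offsetHeight` and the writes would be style changes, so the browser computes layout once per flush instead of once per interleaved pair.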

08 — RUNTIME

CSS Performance

Senior Signal: Always measure before optimizing selectors: Junior engineers rewrite all selectors to BEM. Staff engineers profile with Performance panel first. If style recalculation is under 1ms for a 5000-element tree, selector optimization is noise. Focus on layout thrashing and paint complexity instead.
When NOT to use containment: Overusing contain: strict on every element can increase memory usage because the browser creates separate rendering contexts. Only apply to components that are truly independent (e.g., widgets, list items, modals). For static text, it's unnecessary overhead. Profile with Chrome DevTools 'Rendering' > 'Layer borders' to see if you're creating too many layers.

Technique	Best for	Cost	Avoid when
Runtime CSS-in-JS	Highly dynamic theming, server-rendered apps with small component trees	0.5–2ms per mount, 10–50kB JS bundle overhead	Pages with >100 components or strict INP budgets (<200ms)
Zero-runtime (vanilla-extract)	Static or token-based theming, large component libraries	Build-time only, <1kB runtime JS	Need runtime color swapping without rebuild
Tailwind CSS	Utility-first design, rapid prototyping, small bundles	JIT scanning adds ~200ms to dev build	Custom design systems with complex component APIs

Technique	Best for	Cost	Avoid when
Container queries	Reusable components, widget libraries	~0.1ms per container per resize	Nested containers or >100 containers on a page
Media queries	Page-level layouts, responsive breakpoints	~0.01ms per query, no per-element tracking	Components that need to respond to parent size

What is the key selector in CSS selector matching, and why does it matter?: — The rightmost part of a selector. Browsers match right-to-left, so a tag or universal key selector checks every element; a class or ID key selector is O(1).
What is the primary performance cost of runtime CSS-in-JS?: — JavaScript execution time during component mount (0.5–2ms per component), which can create long tasks (>50ms) on pages with many components.
What does contain: layout do, and when should you use it?: — It isolates an element's layout from the rest of the page. Use on independent widgets or list items to prevent style changes from triggering full-page layout recalculations.
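What is the risk of promoting too many elements to GPU layers?: — GPU memory explosion. Layer memory scales with pixel area (roughly width × height × 4 bytes), from a few hundred kB for small elements to several MB for full-screen layers, leading to jank on mobile. Only promote elements that animate frequently.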
How does PurgeCSS reduce CSS bundle size, and what is its main trade-off?: — It removes unused CSS by scanning templates. Trade-off: dynamic class construction (e.g., `btn-${variant}`) requires safelisting to avoid broken styles.
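
One alternative to safelisting dynamically built class names is to keep complete class names as literals, so a template scanner can see them. The sketch below assumes hypothetical `btn-*` classes and a `buttonClass` helper.

```typescript
type ButtonVariant = "primary" | "secondary" | "danger";

// Full class names appear as string literals, so a scanner that looks for
// class names in source files can find every one of them.
// The class names themselves are hypothetical.
const BUTTON_CLASSES: Record<ButtonVariant, string> = {
  primary: "btn btn-primary",
  secondary: "btn btn-secondary",
  danger: "btn btn-danger",
};

function buttonClass(variant: ButtonVariant): string {
  return BUTTON_CLASSES[variant];
}

console.log(buttonClass("danger"));
```

Compared with building the name from a template string, the lookup table also gives a compile-time error when a variant has no matching class.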

09 — RUNTIME

Large Lists & Heavy DOM

Senior Signal: 'Always measure first': A junior reaches for virtualization at 100 rows. A staff engineer measures: if the list takes < 50ms to render (under the long-task threshold), virtualization adds complexity without benefit. Profile with performance.measure() or React DevTools profiler. Only virtualize when the raw DOM render exceeds 16ms per frame or INP > 200ms.
When NOT to Virtualize: Virtualization breaks Ctrl+F find-in-page, printing, and accessibility tree navigation because non-visible rows are not in the DOM. If your users rely on browser search or need to select all items, use pagination or a static list with content-visibility: auto instead. Also avoid virtualization for lists under 500 items — the overhead of scroll listeners and measurements outweighs the benefit.

Technique	Best for	Cost	Avoid when
Windowing (virtualized)	Real-time scroll, 1000+ items, same view	~2-5ms per scroll frame, DOM < 100 nodes	Items need full height for print/SEO
Pagination (page-based)	Search results, table UIs, data export	Server round-trip per page, ~200ms INP	Continuous browsing, mobile swipe
Infinite scroll (append)	Social feeds, activity logs	DOM grows unbounded, memory leak risk	User needs to find old items quickly

What is the primary performance bottleneck when rendering 10,000 DOM nodes?: — Layout and paint time exceeding the 16ms frame budget, causing jank and INP > 200ms.
What is the recommended DOM node budget for a virtualized list?: — Under 100 visible nodes (including overscan) to keep layout under 3ms.
When should you choose pagination over virtualization?: — When users need browser find-in-page, printing, or SEO; or when the list is under 500 items.
What CSS property defers rendering of off-screen elements?: — content-visibility: auto, paired with contain-intrinsic-size to prevent CLS.
Why does DOM node recycling reduce GC pauses?: — It reuses a fixed pool of nodes instead of creating/destroying them, avoiding garbage collection that can block the main thread for >50ms.
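
The core of windowing is the calculation of which rows belong in the DOM for a given scroll position. The sketch below assumes fixed-height rows; `visibleRange` and its default overscan of 3 are assumptions for illustration, not any library's API.

```typescript
interface VisibleRange {
  start: number; // first index to render
  end: number; // one past the last index to render
}

// Fixed-height windowing: which item indices should be in the DOM?
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  itemHeight: number,
  itemCount: number,
  overscan: number = 3,
): VisibleRange {
  const first = Math.floor(scrollTop / itemHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / itemHeight); // exclusive
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(itemCount, last + overscan),
  };
}

const windowRange = visibleRange(4000, 600, 40, 10000);
console.log(windowRange, "rendered rows:", windowRange.end - windowRange.start);
```

With 40px rows in a 600px viewport, about 21 of the 10,000 rows are rendered, which is how the node count stays under the budget in the card above.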

10 — RUNTIME

Main Thread & Concurrency

When NOT to use Web Workers: Avoid workers for trivial tasks like formatting a date or updating a single DOM element. The cost of spawning a worker (1–5ms) and serializing data (especially large strings) can exceed the computation time. Also, workers cannot access window, document, or localStorage — if your task needs those, you must restructure.
Senior Signal: Always Measure First: A junior engineer reaches for Web Workers or requestIdleCallback at the first sign of slowness. A staff engineer profiles with Chrome DevTools Performance panel, identifies tasks exceeding the 50ms long-task threshold, and measures the actual impact on INP (target <200ms). If the task takes 30ms and runs once per page load, the optimization is premature. Only invest when the bottleneck is confirmed.
SharedArrayBuffer Security Overhead: Using SharedArrayBuffer requires your site to be cross-origin isolated. This blocks loading cross-origin resources (e.g., CDN scripts, iframes) unless they opt in via Cross-Origin-Resource-Policy. For many apps, the isolation cost outweighs the performance gain. Only reach for it when you need sub-millisecond shared state between workers — otherwise, stick with postMessage and Transferable objects.

Technique	Best for	Cost	Avoid when
Debounce	Autocomplete, resize, save-on-stop	Delayed response; may never fire if events keep coming	Real-time feedback (e.g., drawing, animation)
Throttle	Scroll, mousemove, progress updates	May skip trailing events; fixed rate can feel choppy	One-shot actions (e.g., button click)
requestAnimationFrame	Visual updates tied to paint cycle	Fires once per frame (~60fps on most displays); not for non-visual work	Non-visual computation or background tasks

What is the 50ms long-task threshold?: — Any task on the main thread exceeding 50ms is considered a long task, blocking user input and causing jank. Break work into chunks <50ms to stay responsive.
When should you use debounce vs throttle?: — Debounce: wait for a pause (autocomplete, resize). Throttle: ensure max rate (scroll, mousemove). Debounce can delay indefinitely; throttle may miss trailing events.
What is the main cost of using Web Workers?: — Serialization overhead for data passed via postMessage. Use Transferable objects (ArrayBuffer) to avoid copying. Workers cannot access DOM.
What does scheduler.postTask provide that setTimeout doesn't?: — Explicit priority levels (user-blocking, user-visible, background) and cancellation via AbortSignal. Better integration with browser scheduling.
What headers are required for SharedArrayBuffer?: — Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp. This isolates the site and blocks cross-origin resources unless they opt in.
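
The throttle behaviour from the table above (fixed maximum rate, trailing events may be dropped) can be sketched as follows. This is a leading-edge throttle with an injectable clock so it can be checked without real timers; the `now` parameter is an assumption added for testability.

```typescript
// Leading-edge throttle: runs `fn` at most once per `intervalMs`.
// `now` is injectable so the behaviour can be checked without real timers.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = Date.now,
): (...args: A) => void {
  let lastRun = -Infinity;
  return (...args: A) => {
    const t = now();
    if (t - lastRun >= intervalMs) {
      lastRun = t;
      fn(...args);
    }
    // Calls inside the interval are dropped; this sketch has no trailing call.
  };
}

// Usage sketch with a fake clock.
let fakeTime = 0;
const seen: number[] = [];
const handleScroll = throttle((y: number) => seen.push(y), 100, () => fakeTime);
fakeTime = 0;
handleScroll(10); // runs
fakeTime = 50;
handleScroll(20); // dropped: only 50ms since the last run
fakeTime = 120;
handleScroll(30); // runs
console.log(seen);
```

The dropped middle call is the "may skip trailing events" cost listed in the table; a production throttle often adds a trailing invocation to cover it.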

11 — REACT

React Re-render Optimization

When NOT to use React.memo: Do not wrap every component in React.memo. The shallow comparison itself costs ~0.01ms per prop. For a component that renders in <0.1ms, memoization is a net loss. Only memoize if the component re-renders frequently with the same props AND its render cost is >1ms (e.g., a chart, a large list, a complex form).
Senior Signal: Always Measure First: A staff engineer never optimizes without profiling. Use the React DevTools Profiler to identify actual re-render hotspots. Look for components that re-render more than expected or take >1ms to render. The 80/20 rule applies: 80% of performance gains come from fixing 20% of re-renders. Don't guess — measure with the Profiler's flamegraph and ranked timeline.

Technique	Best for	Cost	Avoid when
React.memo	Expensive leaf components with stable props	Shallow prop compare (~0.01ms/prop)	Cheap renders (<0.1ms), always-new props
useMemo	Expensive computations (>1ms), referential stability	Memory for cached value, dep array compare	Trivial calculations (<0.1ms)
useCallback	Stable function references for memoized children	Memory for cached function, dep array compare	Functions not passed to memoized children
State colocation	Local UI state (toggles, inputs)	None (it's the default)	Global state that needs sharing

What is the 16ms frame budget?: — The maximum time available for rendering a single frame at 60fps. Exceeding it causes jank. React re-renders should complete within this budget.
When does React.memo hurt performance?: — When the component is cheap to render (<0.1ms) or props are always new objects/functions, making the shallow comparison a net loss.
What is state colocation?: — Keeping state as close as possible to the component that uses it, preventing unnecessary re-renders of sibling subtrees.
How does context splitting improve performance?: — By separating unrelated state into different providers, so a change in one context only re-renders its consumers, not all consumers of a monolithic context.
What is the React Compiler?: — A build-time tool that automatically memoizes components and hooks, reducing the need for manual React.memo, useMemo, and useCallback.
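
Why always-new props defeat React.memo can be seen from a shallow comparison. The function below is written in the spirit of React.memo's default check (same keys, values identical by `Object.is`); it is an illustration, not React's source code.

```typescript
type Props = Record<string, unknown>;

// Shallow prop comparison: same keys, and each value identical by Object.is.
function shallowEqual(prev: Props, next: Props): boolean {
  const prevKeys = Object.keys(prev);
  const nextKeys = Object.keys(next);
  if (prevKeys.length !== nextKeys.length) return false;
  return prevKeys.every(
    (key) => Object.prototype.hasOwnProperty.call(next, key) && Object.is(prev[key], next[key]),
  );
}

const stableStyle = { color: "red" };
// Same reference on both renders: the check passes, so a memoized child can skip rendering.
console.log(shallowEqual({ style: stableStyle }, { style: stableStyle }));
// New object literal on each render: the check fails even though the contents match.
console.log(shallowEqual({ style: { color: "red" } }, { style: { color: "red" } }));
```

This is the case where useMemo or useCallback earns its cost: it keeps the reference stable so the comparison can succeed.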

12 — REACT

React Concurrent & Loading

Senior Signal: Always measure first: Don't reach for useTransition or useDeferredValue until you've measured a long task (>50ms) or INP >200ms. Premature concurrency adds complexity and can cause visual jank if the deferred value lags too much. Profile with React DevTools or Chrome Performance panel before optimizing.
When NOT to use this optimization: Don't wrap every state update in useTransition. If the update is small (< 1ms of work), the overhead of the transition mechanism (extra re-render, pending state) can actually increase latency. Also, be careful with useDeferredValue for values that change rapidly (e.g., every keystroke): each change restarts the deferred render, so if that render is slow the deferred UI stays stale until input pauses.

Technique	Best for	Cost	Avoid when
useTransition	Marking state updates as low priority	Extra re-render for pending state	Updates that must be synchronous (e.g., form validation)
useDeferredValue	Deferring derived values from props/state	Memory overhead for keeping old value	Simple computations (< 1ms)
React.lazy + Suspense	Code splitting large components	Network latency for chunk load	Tiny components (< 5kB) or SSR without streaming
startTransition	Non-urgent updates outside hooks	No built-in pending indicator	Inside event handlers that need immediate feedback

What is the main benefit of useTransition?: — Marks a state update as low priority so it can be interrupted by urgent updates, keeping INP under 200ms.
When should you use useDeferredValue instead of useTransition?: — When you want to defer a derived value (e.g., filtered list) without wrapping the state update itself.
What problem does useSyncExternalStore solve?: — Prevents tearing by ensuring consistent snapshots of external state during concurrent renders.
What is the cost of using React.lazy?: — Adds network latency for chunk loading; avoid for components under 5kB or in SSR without streaming.
What metric should you measure before using concurrent features?: — INP >200ms or long tasks >50ms in the Performance panel.

13 — REACT

SSR & Hydration Cost

Senior Signal: 'Always measure the hydration cost before optimizing': A junior might immediately reach for islands or streaming. A staff engineer first instruments Long Tasks and INP (which replaced First Input Delay as a Core Web Vital) to quantify the actual hydration cost. If the total hydration time is under 200ms on a Moto G4, the optimization may be premature. Use performance.measure around the hydration root to get concrete numbers.
When NOT to Use These Optimizations: Avoid islands or resumability if your app has fewer than 5 interactive components or a total bundle size under 30kB. The overhead of splitting into islands (build tooling, state management) can outweigh the benefits. Similarly, don't use streaming SSR if your server response time is already under 200ms — the complexity of handling streaming errors and mismatches isn't worth it.

Technique	Best For	Cost	Avoid When
Full SSR + Hydration	Simple pages, small bundles (<50kB)	High TTI, blocks main thread	Complex apps with large JS bundles
Streaming SSR + Progressive Hydration	Content-heavy pages, news sites	Moderate complexity, risk of mismatches	Real-time apps needing instant interactivity
Islands Architecture	Marketing sites, dashboards with few interactive widgets	Lower hydration cost, but state management overhead	Highly interactive apps (e.g., Figma, Google Docs)
React Server Components	Data-heavy pages, e-commerce product lists	Zero client JS for server components, but RSC payload size	Apps requiring client-side interactivity on every component

What is the hydration tax?: — The cost of re-running component logic on the client to attach event listeners, measured as the delay between FCP and TTI. Typically 200-500ms on mid-range devices.
How does streaming SSR improve TTI?: — It sends HTML in chunks, allowing the browser to paint content early. Progressive hydration then hydrates visible components first, reducing time to first interaction.
What is the key trade-off of islands architecture?: — Lower hydration cost (only interactive components hydrate) but increased developer complexity and potential state management fragmentation.
How do React Server Components reduce hydration?: — Server components run on the server and send a serialized payload (not HTML) to the client, requiring zero client JavaScript for those components. Only 'use client' components hydrate.
What is resumability in Qwik?: — Instead of hydrating, Qwik serializes application state in HTML and resumes execution on interaction. This eliminates hydration entirely, achieving sub-50ms TTI on slow devices.

14 — NETWORK & DATA

Network & Caching

Senior Signal: Measure Before You Cache: A junior engineer sets max-age to 3600 on everything. A staff engineer measures the cache hit rate, the cost of a miss (e.g., 200ms vs 50ms), and the frequency of content changes. They know that immutable is only safe when the URL changes on every deploy — otherwise you break updates. Always validate with Lighthouse or WebPageTest before deciding on a caching strategy.
When NOT to Use CDN Caching: Avoid caching authenticated or user-specific data at the edge unless the cache key varies per user (e.g., Vary: Cookie or a custom cache key that includes the session). Caching a user's private dashboard for all visitors is a data leak. Also, don't cache POST responses; they are rarely idempotent. Stick to GET and HEAD.
Key Metrics for Network Optimization: Target: LCP < 2.5s, INP < 200ms, CLS < 0.1. Each round-trip adds ~50ms on 4G, ~300ms on 3G. A 100kB Brotli-compressed JS file (vs 130kB gzip) saves ~30ms on 4G. Use Server-Timing headers to measure cache hit/miss at each layer.

Technique	Best for	Cost	Avoid when
max-age + immutable	Versioned static assets (JS, CSS, fonts)	Low; no revalidation	Unversioned or frequently changing resources
ETag + 304	Dynamic API responses, user-specific data	One round-trip per validation	High-frequency polling; use stale-while-revalidate instead
stale-while-revalidate	News feeds, product listings	Background fetch; no user wait	Real-time data (chat, stock prices)
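no-cache	Content that must be revalidated before every reuse (e.g., HTML documents; use no-store instead for secrets such as CSRF tokens)	One revalidation round-trip per request (304 if unchanged)	Static assets; wastes round-trips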
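
The table above can be sketched as a small lookup from resource kind to Cache-Control value. The ResourceKind names and the specific durations (one year, 60s, 300s) are illustrative assumptions, not recommendations from this handbook; tune them against measured hit rates.

```typescript
// Hypothetical resource categories mirroring the rows of the table.
type ResourceKind =
  | "versioned-asset"    // hashed JS, CSS, fonts
  | "validated-api"      // dynamic responses served with an ETag
  | "listing"            // news feeds, product listings
  | "always-revalidate"; // content that must be checked on every use

function cacheControlFor(kind: ResourceKind): string {
  switch (kind) {
    case "versioned-asset":
      // Safe only because the URL changes whenever the content changes.
      return "public, max-age=31536000, immutable";
    case "validated-api":
      // Forces a conditional request; the server answers 304 if the ETag matches.
      return "private, max-age=0, must-revalidate";
    case "listing":
      // Serve the cached copy immediately and refresh it in the background.
      return "public, max-age=60, stale-while-revalidate=300";
    case "always-revalidate":
      return "no-cache";
  }
}
```

Keeping the policy in one function makes it easy to audit in code review and to assert on in CI.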

What is the difference between max-age and s-maxage?: — max-age applies to browsers and intermediate caches; s-maxage overrides max-age for shared caches (CDNs, proxies) but not browsers.
Why was HTTP/2 server push deprecated?: — It often pushed resources already in the browser cache, wasting bandwidth. Use 103 Early Hints or preload links instead.
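When should you use Brotli over gzip?: — Brotli for text assets (HTML, CSS, JS, SVG), where output is typically 15-25% smaller than gzip. Keep gzip as the fallback for legacy clients. Compress neither for formats that are already compressed, such as WOFF/WOFF2 fonts and JPEG or PNG images.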
What is the cache hierarchy from fastest to slowest?: — Browser memory cache → browser disk cache → service worker cache → CDN edge cache → origin server. Note that in lookup order the service worker is consulted before the HTTP disk cache.
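How does HTTP/3 reduce latency compared to HTTP/2?: — HTTP/3 runs over QUIC on UDP, so a lost packet stalls only its own stream instead of every stream on the connection (TCP head-of-line blocking). Connection setup drops from 2-3 RTTs (TCP handshake plus TLS 1.3 or TLS 1.2) to 1 RTT, or 0 RTT when resuming a previous session.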

15 — NETWORK & DATA

Data Fetching Optimization

Senior Signal: Always measure cache hit rate first: A junior adds caching blindly. A staff engineer instruments cache hit/miss ratios and measures time-to-interactive before and after. As a rule of thumb, a cache hit rate below 80% suggests that your staleTime or query key design needs another look. Use browser DevTools or RUM to confirm.
When NOT to deduplicate: If your API returns user-specific data (e.g., /api/orders?userId=X), deduplication across users is dangerous. Ensure query keys include all parameters that affect the response. Also, avoid deduplication for real-time streams (WebSocket) — use a different cache layer.
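
A minimal sketch of in-flight request deduplication where the key includes every parameter that affects the response, as described above. The names dedupe, serializeKey, and QueryKey are hypothetical and not taken from any library; data-fetching libraries implement their own variants of this idea.

```typescript
type QueryKey = Record<string, string | number | boolean>;

// Stable string form: keys are sorted so {a, b} and {b, a} map to the same entry.
function serializeKey(endpoint: string, params: QueryKey): string {
  const sorted = Object.keys(params)
    .sort()
    .map((k) => [k, params[k]]);
  return endpoint + "?" + JSON.stringify(sorted);
}

// Requests currently in flight, keyed by endpoint plus all parameters.
const inFlight = new Map<string, Promise<unknown>>();

function dedupe<T>(
  endpoint: string,
  params: QueryKey,
  fetcher: () => Promise<T>,
): Promise<T> {
  const key = serializeKey(endpoint, params);
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>; // share the pending request

  const request = fetcher();
  inFlight.set(key, request);
  // Drop the entry once the request settles, whether it succeeded or failed.
  const clear = () => {
    if (inFlight.get(key) === request) inFlight.delete(key);
  };
  request.then(clear, clear);
  return request;
}
```

Two concurrent calls with { userId: 1 } share one request, while a call with { userId: 2 } gets its own, which is the property that prevents one user's response from being handed to another.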

Technique	Best for	Cost	Avoid when
GraphQL field selection	Complex UIs with varying data needs	Schema maintenance, resolver complexity	Simple CRUD with fixed views
REST sparse fieldsets	Legacy APIs, simple clients	Backend parsing overhead, caching granularity	Payloads < 2kB already
Over-fetching (no optimization)	Prototypes, low-traffic pages	Bandwidth waste, slower parse time	Any page with > 10 requests or > 50kB total

What is the key trade-off when setting staleTime in React Query?: — Too short: unnecessary refetches on mount. Too long: stale data visible. Set based on data volatility (user profile: 5-30 min, feed: 30s-2min).
When should you NOT use optimistic updates?: — When mutation success rate < 95% or rollback takes > 200ms. Also avoid for irreversible actions (e.g., payments) without server confirmation.
What metric indicates a waterfall problem?: — Waterfall depth > 3 requests in Chrome DevTools Network tab, or total time > 2x the slowest single request due to sequential RTTs.
Cursor vs offset pagination: when is offset acceptable?: — When data is static (e.g., historical logs) or paginated tables with page numbers. For infinite scroll with live data, always use cursor.
What is the cost of request deduplication?: — ~0.1ms CPU overhead per key comparison. Negligible, but dangerous if query keys don't include user-specific parameters.
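
To illustrate why cursor pagination suits live data, here is a minimal in-memory sketch. It assumes rows are sorted by ascending id and uses the last seen id as the cursor; Item, Page, and pageAfter are hypothetical names, and a real API would run the equivalent filter in the database.

```typescript
interface Item {
  id: number;
  title: string;
}

interface Page {
  items: Item[];
  nextCursor: number | null; // null means there are no more pages
}

// Returns up to `limit` items whose id is greater than `cursor`.
// Assumes `rows` is sorted by ascending id.
function pageAfter(rows: Item[], cursor: number | null, limit: number): Page {
  const start = cursor === null ? 0 : rows.findIndex((r) => r.id > cursor);
  if (start === -1) return { items: [], nextCursor: null };
  const items = rows.slice(start, start + limit);
  const hasMore = start + limit < rows.length;
  const last = items[items.length - 1];
  return { items, nextCursor: hasMore && last ? last.id : null };
}

const rows: Item[] = [1, 2, 3, 4].map((id) => ({ id, title: "item " + id }));
const firstPage = pageAfter(rows, null, 2); // ids 1 and 2, nextCursor 2

// Row 1 is deleted before the second page is requested.
const afterDelete = rows.filter((r) => r.id !== 1);
const secondPage = pageAfter(afterDelete, firstPage.nextCursor, 2); // ids 3 and 4
const offsetPage = afterDelete.slice(2, 4); // offset 2 returns only id 4, skipping 3
```

The cursor version keeps its place relative to the data, whereas the offset version shifts whenever rows are inserted or removed ahead of the current position.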

16 — PERCEIVED & MEMORY

Perceived Performance

Senior Signal: 'Always measure the occupied time, not just the metric': A junior says 'we use skeletons.' A staff engineer says 'we measured that skeletons reduced perceived wait by 40% in our A/B test, but we also tracked that they increased CLS by 0.02. We accepted that trade-off because the conversion lift was 3%.' Always quantify the perception gap with real user monitoring (RUM) data.
When NOT to use optimistic UI: Optimistic UI is dangerous for non-idempotent actions (e.g., charging a credit card, sending an email). If the server fails, you've shown a success state that's false. Always pair optimistic updates with a clear error state and rollback. Also avoid it when the action has side effects that are hard to reverse (e.g., deleting a user).
Common Mistake: Over-prefetching: Prefetching every link on hover can waste bandwidth and CPU, especially on mobile. A 100ms hover threshold is too aggressive — use 200ms minimum. Also avoid prefetching for links that are likely to be cancelled (e.g., dropdown menus). Measure the cache hit rate: if <20% of prefetched resources are used, you're wasting resources.

Technique	Best for	Cost	Avoid when
Skeleton screens	Predictable layout, >1s load	CLS risk if no fixed dimensions	Dynamic content (e.g., search results)
Spinners	Unpredictable layout, <1s load	No layout shift, but feels slower	Long loads (>3s) without progress
Optimistic UI	Idempotent actions (like, save)	State rollback complexity	Non-idempotent or error-prone actions
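
The rollback cost in the Optimistic UI row can be sketched as follows: apply the new state immediately, then restore the snapshot and surface an error if the server rejects the change. LikeState, render, and saveOnServer are hypothetical stand-ins for your own state shape, view layer, and API call.

```typescript
interface LikeState {
  liked: boolean;
  count: number;
  error: string | null;
}

async function toggleLikeOptimistically(
  state: LikeState,
  render: (next: LikeState) => void,
  saveOnServer: (liked: boolean) => Promise<void>,
): Promise<LikeState> {
  const previous = state; // snapshot used for rollback
  const optimistic: LikeState = {
    liked: !state.liked,
    count: state.count + (state.liked ? -1 : 1),
    error: null,
  };
  render(optimistic); // instant feedback, before the network round-trip

  try {
    await saveOnServer(optimistic.liked);
    return optimistic;
  } catch {
    // Server rejected the change: restore the snapshot and show a clear error.
    const rolledBack: LikeState = {
      ...previous,
      error: "Could not save. Please try again.",
    };
    render(rolledBack);
    return rolledBack;
  }
}
```

A toggle like this is a reasonable fit because the previous state is cheap to restore; for payments or deletions, wait for server confirmation instead.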

What is the 100ms threshold for perceived instantaneity?: — Under 100ms, users perceive the system as reacting instantly. Visual feedback (e.g., button press) should appear within 50ms.
What is 'occupied time' in perceived performance?: — The time the user's brain is busy processing visual changes (e.g., skeleton screens, animations). Occupied time feels shorter than unoccupied time (blank screen).
When should you use skeleton screens vs spinners?: — Skeletons for predictable layout and >1s loads; spinners for unpredictable layout or <1s loads. Skeletons reduce perceived latency but risk CLS.
What is the trade-off of optimistic UI?: — Instant feedback but requires rollback on failure. Only use for idempotent actions (likes, toggles). Avoid for destructive or non-idempotent operations.
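What is the PRPL pattern?: — Push (or preload) the critical resources for the initial route, Render the initial route as soon as possible, Pre-cache the remaining assets (typically with a service worker), Lazy-load other routes and non-critical assets on demand. Prioritizes perceived speed over full load.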

17 — PERCEIVED & MEMORY

Memory Optimization

Senior Signal: Always Measure First: Before optimizing, run a heap snapshot in DevTools. If heap size is stable after 5 minutes of use, don't touch it. Premature memory optimization adds complexity. A staff engineer says: 'Show me the allocation timeline, then we talk.'
When NOT to Use WeakRef: WeakRef is non-deterministic — the GC may clear it at any time. Never use it for critical data that must be available synchronously. Also, creating many WeakRefs can itself increase GC overhead. Stick to WeakMap for 99% of cases.

Technique	Best For	Cost	Avoid When
WeakMap	DOM node metadata, private data	Slightly slower lookup than Map	Need to iterate keys (not possible)
WeakSet	Marking objects without preventing GC	Minimal	Need to store primitives
WeakRef	Large caches that can be recreated	GC may clear ref at any time	Data must always be available
Map/Set	General purpose, iterable	Prevents GC of keys	Memory-sensitive contexts
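
A minimal sketch of the WeakMap row: attaching metadata to objects without keeping them alive. Plain objects stand in for DOM nodes so the example runs outside a browser; NodeMeta and recordClick are hypothetical names.

```typescript
interface NodeMeta {
  clicks: number;
}

// Entries disappear automatically once the key object is garbage collected.
const metaByNode = new WeakMap<object, NodeMeta>();

function recordClick(node: object): number {
  const meta = metaByNode.get(node) ?? { clicks: 0 };
  meta.clicks += 1;
  metaByNode.set(node, meta);
  return meta.clicks;
}

// Stand-ins for DOM nodes; in a browser these would be Element instances.
const buttonA = {};
const buttonB = {};
recordClick(buttonA);
recordClick(buttonA);
recordClick(buttonB);
```

With a Map, the same code would keep every node alive until its entry was deleted by hand, which is a common source of detached DOM node leaks.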

What is a detached DOM node?: — A DOM node removed from the document tree but still referenced by JavaScript, preventing GC.
When to use WeakMap vs Map?: — WeakMap when keys are DOM nodes or objects that should be GC'd when no other references exist. Map when you need to iterate keys or store primitives.
What is a generational GC?: — A collector that splits the heap by object age. V8 divides the heap into a young generation (new space, collected frequently) and an old generation (old space, collected less often), because most objects die young.
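How to detect a memory leak in DevTools?: — Take heap snapshots before and after repeating an action and compare them; objects whose retained count keeps growing point to a leak. In the allocation timeline a sawtooth is normal GC behavior, so look for a baseline that keeps rising instead of returning to its starting level. Filter snapshots for 'Detached' nodes.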
What is the cost of object pooling?: — Increased code complexity, risk of stale data, and potential for bugs if objects are not properly reset. Only use when allocation rate >10MB/s or GC pauses >50ms.
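
The pooling trade-off above can be sketched with a small pool that resets objects on release. Vec2 and Vec2Pool are hypothetical names, and the created counter exists only to make reuse observable; this is an illustration of the technique under those assumptions, not a tuned implementation.

```typescript
interface Vec2 {
  x: number;
  y: number;
}

class Vec2Pool {
  private free: Vec2[] = [];
  created = 0; // number of objects actually allocated

  acquire(x: number, y: number): Vec2 {
    const reused = this.free.pop();
    if (reused) {
      reused.x = x;
      reused.y = y;
      return reused;
    }
    this.created += 1;
    return { x, y };
  }

  release(v: Vec2): void {
    // Reset before returning to the pool so stale data cannot leak to the next user.
    v.x = 0;
    v.y = 0;
    this.free.push(v);
  }
}

const pool = new Vec2Pool();
const first = pool.acquire(1, 2);
pool.release(first);
const second = pool.acquire(3, 4); // reuses the same object instead of allocating
```

The caller must not touch an object after releasing it, which is exactly the class of bug the recall card warns about.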