Before memorising phases and APIs, lock in this mental model. Node.js is not magic — it's one very efficient decision: don't wait, move on. While one request waits for a database response, serve ten others.
Imagine a café with one highly efficient waiter (the event loop) and a kitchen full of chefs (OS + libuv threads).
- The waiter takes your order and immediately moves to the next table — never stands watching the kitchen.
- When food is ready, the kitchen rings a bell → the waiter picks it up and delivers it.
- A traditional multi-threaded server hires one waiter per table — expensive, and most of them just stand waiting.
Node.js doesn't make I/O faster — it makes the time spent waiting productive.
Node.js shines at:

- Handling thousands of concurrent connections
- HTTP APIs, database calls, file reads (I/O-bound)
- Real-time apps — chat, live dashboards, notifications
- Streaming large files without loading all into memory

It struggles with:

- CPU-heavy work: video encoding, ML inference, crypto
- Workloads needing true parallel CPU threads
- Long synchronous blocking operations on the main thread
Under the hood, the operating system provides the async network I/O: epoll (Linux), kqueue (macOS), IOCP (Windows).

The event loop is Node's task scheduler. Before the phases table, understand the big picture: the event loop is a loop that runs forever, checking queues in a fixed order and running callbacks.
Each "tick" of the event loop is one lap around a racetrack. The track has fixed pit stops in a fixed order. Your callbacks sit at different pit stops waiting to be picked up:
- Timers pit stop — `setTimeout`/`setInterval` callbacks whose time has come
- I/O pit stop (poll) — file / network callbacks that just completed in the background
- Check pit stop — `setImmediate` callbacks
- Microtask express lane — `process.nextTick` and Promises cut the queue between every pit stop
The Event Loop is the mechanism that allows Node.js to perform non-blocking I/O operations despite JavaScript being single-threaded. It continuously checks whether there are tasks to execute and in what order.
Node.js uses libuv (a C library) under the hood to implement the event loop. Each "tick" of the loop passes through these 6 phases in order:
| # | Phase | What runs here | Key API |
|---|---|---|---|
| 1 | timers | Callbacks from `setTimeout` and `setInterval` whose delay has elapsed | `setTimeout` |
| 2 | pending callbacks | I/O callbacks deferred from the previous loop iteration (e.g. some TCP errors) | — |
| 3 | idle / prepare | Internal libuv use only — you cannot schedule work here directly | — |
| 4 | poll | Fetch new I/O events; execute I/O-related callbacks (file reads, network). The loop waits here if nothing else is pending. | `fs.readFile` |
| 5 | check | Callbacks scheduled with `setImmediate` | `setImmediate` |
| 6 | close callbacks | Close events — e.g. `socket.on('close', ...)` | `.on('close')` |
Between every phase transition, Node.js drains two microtask queues before moving to the next phase:
- process.nextTick queue — always drains first (highest priority)
- Promise microtask queue — drains after nextTick
Beware: recursive `process.nextTick` calls will starve I/O — the loop never advances past the microtask queues.
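A minimal sketch of that starvation in action — the timer below never fires because the nextTick queue never empties:

```js
// A self-scheduling nextTick never lets the loop advance past microtasks
function spin() {
  process.nextTick(spin); // re-queues itself before the queue can empty
}
spin();

setTimeout(() => console.log('timers phase reached'), 0); // never prints
```

A recursive `setImmediate`, by contrast, yields back to the loop on every iteration, so it does not starve I/O.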
```js
// What is the output order?
setTimeout(() => console.log('1. setTimeout'), 0);
setImmediate(() => console.log('2. setImmediate'));
process.nextTick(() => console.log('3. nextTick'));
Promise.resolve().then(() => console.log('4. Promise.then'));
console.log('5. synchronous');

/* Output:
   5. synchronous   ← sync code runs first (call stack)
   3. nextTick      ← nextTick queue (before promises)
   4. Promise.then  ← promise microtask queue
   1. setTimeout    ← timers phase (may swap with setImmediate
   2. setImmediate     depending on when the loop starts)
*/
```
"The event loop is Node's scheduler. Sync code runs first. Then microtasks — nextTick before Promises. Then the loop cycles through its 6 phases: timers → pending callbacks → idle → poll → check → close. The poll phase is where Node waits for I/O. setImmediate fires in the check phase, always after I/O callbacks — that's its guarantee."
### What's the difference between `process.nextTick`, `setImmediate`, and `setTimeout(fn, 0)`?

| API | When it runs | Use case |
|---|---|---|
| process.nextTick | After current operation, before any I/O or timers — highest priority microtask | Ensure a callback fires asynchronously but before anything else |
| Promise.then | After nextTick queue empties — second priority microtask | Standard async control flow |
| setImmediate | Check phase — guaranteed after I/O callbacks in the same loop iteration | Do something after I/O handlers finish |
| setTimeout(fn, 0) | Timers phase — minimum 1ms delay (OS-dependent), can slip past setImmediate outside I/O | Delay by at least one tick — less predictable than setImmediate |
```js
const fs = require('fs');

// OUTSIDE an I/O callback — the order of setTimeout vs setImmediate is UNDEFINED
setTimeout(() => console.log('timeout'), 0);
setImmediate(() => console.log('immediate'));
// Could print in either order — depends on OS timer resolution

// INSIDE an I/O callback — setImmediate ALWAYS wins
fs.readFile(__filename, () => {
  setTimeout(() => console.log('timeout'), 0);
  setImmediate(() => console.log('immediate')); // always first
});
```
Prefer `setImmediate` over `setTimeout(fn, 0)` when you need post-I/O ordering — it's deterministic.
Use process.nextTick sparingly and only when you truly need the highest priority.
"nextTick fires before any I/O in the current microtask checkpoint — it's the highest priority. setImmediate fires in the check phase, after I/O. setTimeout(0) fires in the timers phase and has at least a 1ms minimum delay, so its ordering versus setImmediate is non-deterministic outside I/O callbacks."
The event loop runs on a single thread. Every callback runs to completion before the next one starts. If a callback takes a long time — a CPU-heavy computation, a massive JSON.parse, a synchronous file read — no other request is served during that time.
- Synchronous I/O: `fs.readFileSync`, `fs.writeFileSync`
- Heavy computation: sorting millions of items, crypto without a worker
- Large JSON: `JSON.parse` of a 50 MB response body
- Regex with catastrophic backtracking (ReDoS)
- Infinite or deep synchronous recursion
```js
const fs = require('fs');
const { Worker } = require('worker_threads');

// ❌ BLOCKS the event loop — no request is served during this
function badRoute(req, res) {
  const data = fs.readFileSync('/huge-file.json');
  const parsed = JSON.parse(data); // still on the main thread
  res.send(parsed);
}

// ✅ Non-blocking — the event loop is free while the file is being read
async function goodRoute(req, res) {
  const data = await fs.promises.readFile('/huge-file.json');
  const parsed = JSON.parse(data); // JSON.parse still blocks! See below
  res.send(parsed);
}

// ✅ Offload CPU work to a worker thread
function parseInWorker(rawJson) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(`
      const { workerData, parentPort } = require('worker_threads');
      parentPort.postMessage(JSON.parse(workerData));
    `, { eval: true, workerData: rawJson });
    worker.on('message', resolve);
    worker.on('error', reject);
  });
}
```
- clinic doctor — visualizes event loop lag over time
- perf_hooks: `monitorEventLoopDelay()` — measures lag in ms
- `--inspect` + Chrome DevTools — CPU profile to find hot synchronous code
```js
const { monitorEventLoopDelay } = require('perf_hooks');

const h = monitorEventLoopDelay({ resolution: 20 });
h.enable();

setInterval(() => {
  console.log(`Event loop lag: ${h.mean / 1e6}ms mean`); // histogram reports nanoseconds
}, 5000);
```
JavaScript runs on a single thread — but the operating system and libuv's thread pool do the actual I/O work in the background. Node.js is the middleman that hands off work and gets notified when it's done.
```
┌─────────────────────────────────────┐
│ Your JavaScript Code (V8) │ ← Single thread
├─────────────────────────────────────┤
│ Node.js Core APIs (C++ bindings) │
├─────────────────────────────────────┤
│ libuv │
│ ┌─────────────┐ ┌───────────────┐ │
│ │ Event Loop │ │ Thread Pool │ │ ← 4 threads default
│ │ (1 thread) │ │ (UV_THREADPOOL│ │ (max 128)
│ └─────────────┘ │ _SIZE) │ │
│ └───────────────┘ │
├─────────────────────────────────────┤
│ Operating System │
│ (epoll/kqueue/IOCP - async I/O) │
└─────────────────────────────────────┘
```
- Network I/O (TCP/UDP, HTTP) — uses OS-level async APIs (`epoll` on Linux, `kqueue` on macOS, IOCP on Windows). The OS tells libuv when a socket is readable/writable. No thread pool needed.
- File system & DNS lookups — most OS file APIs are blocking, so libuv uses its thread pool (4 threads by default). A worker thread does the blocking call; when done, it signals the event loop.

You can grow the pool with an environment variable:

```bash
UV_THREADPOOL_SIZE=16 node app.js
```

This helps when many concurrent disk I/O or crypto operations queue up. The max is 128 threads.
1. `fs.readFile('data.txt', callback)` → registers the callback, hands the work to the libuv thread pool
2. The event loop continues processing other events (non-blocking)
3. A libuv thread performs the blocking OS `read()` call
4. The thread completes → posts the result to the event loop's poll queue
5. The poll phase picks it up → calls your callback on the main thread
"JavaScript is single-threaded but Node.js uses libuv to delegate I/O work — network I/O goes through OS-level async interfaces like epoll, while file system work uses libuv's thread pool. When work completes, a callback is queued on the event loop and your JS code picks it up — never blocking the main thread."
Your JavaScript code runs on one thread — that's the V8 thread. But the Node.js process as a whole uses multiple threads:
- 1 main thread — V8 + event loop (your JS)
- 4+ libuv threads — file system, DNS, crypto, zlib
- V8 internal threads — GC, JIT compilation (background)
- Worker threads (optional) — you can spawn via `worker_threads`
```js
const crypto = require('crypto');

// These 4 hashes run CONCURRENTLY on 4 libuv threads,
// despite JavaScript initiating them "one after another"
for (let i = 0; i < 4; i++) {
  crypto.pbkdf2('password', 'salt', 100000, 64, 'sha512', (err, key) => {
    console.log(`Hash ${i} done`);
  });
}
// All 4 finish at roughly the same time — parallel on the thread pool

// A 5th one has to wait for a free thread (default pool = 4)
crypto.pbkdf2('password', 'salt', 100000, 64, 'sha512', (err, key) => {
  console.log('Hash 4 done — waited for a free thread');
});
```
In traditional multi-threaded servers (like Java), each request gets its own thread. With 10,000 concurrent connections you need 10,000 threads — enormous memory overhead and context-switching cost.
Node.js uses a single thread + event loop. While waiting for I/O (database, file, network), the thread serves other requests. 10,000 concurrent connections can be handled by a handful of threads if most time is spent waiting for I/O.
If a request requires heavy computation — image resizing, video encoding, complex cryptography, ML inference — that work occupies the main thread. All other requests wait.
```js
// Simulating CPU work — blocks the event loop for ~2 seconds
function fibonacci(n) {
  if (n <= 1) return n;
  return fibonacci(n - 1) + fibonacci(n - 2);
}

app.get('/slow', (req, res) => {
  const result = fibonacci(44); // ← blocks ALL requests for ~2s
  res.send({ result });
});

// Fix: offload to a worker thread
app.get('/fast', async (req, res) => {
  const result = await runInWorker(44); // non-blocking
  res.send({ result });
});
```
- Worker Threads — `worker_threads` module for parallel JS execution (sketched below)
- Child Processes — `child_process.fork` for separate Node processes
- Native Addons — offload to C/C++ via N-API
- Microservices — delegate to a Python/Go service better suited for compute
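For illustration, here is one possible shape for the `runInWorker` helper used in the `/fast` route above — a sketch that inlines the computation into a one-off worker (in production you would point the `Worker` at a separate file or reuse a worker pool):

```js
const { Worker } = require('worker_threads');

// Spawns a one-off worker that computes fibonacci(n) off the main thread
function runInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(`
      const { workerData, parentPort } = require('worker_threads');
      function fib(n) { return n <= 1 ? n : fib(n - 1) + fib(n - 2); }
      parentPort.postMessage(fib(workerData));
    `, { eval: true, workerData: n });
    worker.on('message', resolve);
    worker.on('error', reject);
  });
}
```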
Callbacks were the only async pattern in Node.js from v0.1 (2009) until Promises in ES6 (2015). Understanding them is still essential because:
- Many Node.js core APIs still use callbacks (`fs`, `crypto`, `dns`)
- EventEmitters are callbacks under the hood
- `util.promisify` is your bridge between the callback world and async/await

The error-first convention means the callback's first argument is always the error (`null` if none). This ensures you can't accidentally ignore errors.

```js
const { promisify } = require('util');
const fs = require('fs');

// Old callback-style API
fs.readFile('data.txt', 'utf8', (err, data) => {
  if (err) throw err;
  console.log(data);
});

// Promisified — now usable with async/await
const readFileAsync = promisify(fs.readFile);
const data = await readFileAsync('data.txt', 'utf8');

// Note: fs.promises already provides async versions natively:
const data2 = await fs.promises.readFile('data.txt', 'utf8');
```
A callback is a function passed as an argument to another function, to be called when an async operation completes. Node.js follows the error-first callback convention: (err, result) => {}
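For instance, a minimal sketch of implementing your own error-first API (`divide` is illustrative):

```js
// The error is always the FIRST argument; the result comes second
function divide(a, b, callback) {
  if (b === 0) {
    return callback(new Error('Division by zero')); // error path
  }
  callback(null, a / b); // success path — the error slot is null
}

divide(10, 2, (err, result) => {
  if (err) return console.error(err.message);
  console.log(result); // → 5
});
```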
```js
// ❌ Hard to read, error-prone, hard to maintain
fs.readFile('user.json', (err, userData) => {
  if (err) return handleError(err);
  db.getUser(userData.id, (err, user) => {
    if (err) return handleError(err);
    db.getOrders(user.id, (err, orders) => {
      if (err) return handleError(err);
      emailService.send(user.email, orders, (err) => {
        if (err) return handleError(err);
        console.log('Done!'); // deeply nested — the "pyramid of doom"
      });
    });
  });
});
```
```js
// ✅ Same logic — flat, readable, errors handled in one place
async function processUser() {
  try {
    const userData = JSON.parse(await fs.promises.readFile('user.json'));
    const user = await db.getUser(userData.id);
    const orders = await db.getOrders(user.id);
    await emailService.send(user.email, orders);
    console.log('Done!');
  } catch (err) {
    handleError(err); // one catch handles all failures
  }
}
```
| Feature | CommonJS (CJS) | ES Modules (ESM) |
|---|---|---|
| Syntax | require() / module.exports | import / export |
| Loading | Synchronous — blocks | Asynchronous — non-blocking |
| Evaluation | Dynamic — can require() inside a function | Static — imports resolved at parse time |
| Tree-shaking | Not possible | Bundlers can eliminate dead code |
| Top-level await | Not supported | Supported |
| File extension | .js (with type:commonjs) | .mjs or .js (with type:module) |
| `__dirname` / `__filename` | Available | Not available — use `import.meta.url` |
```js
// math.js
function add(a, b) { return a + b; }
module.exports = { add };

// app.js
const { add } = require('./math');

// Can also require dynamically:
if (condition) {
  const utils = require('./utils'); // dynamic require — valid in CJS
}
```
```js
// math.mjs
export function add(a, b) { return a + b; }

// app.mjs
import { add } from './math.mjs'; // must include the extension

// Dynamic import — works in ESM (returns a Promise)
// (renamed to avoid clashing with the static import above)
const { add: dynamicAdd } = await import('./math.mjs');

// __dirname equivalent in ESM
import { fileURLToPath } from 'url';
import { dirname } from 'path';
const __dirname = dirname(fileURLToPath(import.meta.url));
```
- Use CJS for existing Node.js projects and when publishing to npm (better ecosystem compatibility)
- Use ESM for new projects, browser-shared code, and when you need tree-shaking or top-level await
- Set `"type": "module"` in package.json to make all `.js` files ESM
The first time you require('./foo'), Node.js loads, compiles, and executes it, then caches the result in require.cache (keyed by resolved filename). Every subsequent require('./foo') returns the cached exports object — the module is not re-executed.
```js
// counter.js
let count = 0;
module.exports = {
  increment: () => ++count,
  get: () => count
};

// app.js
const a = require('./counter');
const b = require('./counter'); // same cached object
a.increment();
console.log(b.get()); // → 1 (a and b are the SAME object)

// Force a fresh load (rare — testing, hot reload)
delete require.cache[require.resolve('./counter')];
const c = require('./counter'); // fresh copy, count = 0
```
A circular dependency is when module A requires B, and B requires A. Node.js handles this by returning an incomplete (partial) exports object for the module currently being loaded.
```js
// a.js
console.log('a.js loading');
const b = require('./b');         // triggers b.js to load
console.log('b.done =>', b.done); // → true
module.exports = { done: true };

// b.js
console.log('b.js loading');
const a = require('./a');         // a is mid-load → gets {} (empty!)
console.log('a.done =>', a.done); // → undefined (partial export)
module.exports = { done: true };
```
When you call require('X'), Node.js resolves it using this algorithm:
1. Is `X` a core module? (`fs`, `path`, `http`…) → return it immediately
2. Does `X` start with `./`, `../`, or `/`? → it's a file path:
   - Try `X` exactly
   - Try `X.js`, `X.json`, `X.node`
   - Try `X/index.js`, `X/index.json`, `X/index.node`
3. Otherwise → look in `node_modules` folders, walking up the directory tree:
   - `./node_modules/X`
   - `../node_modules/X`
   - `../../node_modules/X` … up to the filesystem root
```js
// See exactly where a module was loaded from:
console.log(require.resolve('express'));
// → /project/node_modules/express/index.js

// Inspect the full module cache:
console.log(Object.keys(require.cache));

// package.json "main" field controls which file is the entry point
// package.json "exports" field (Node 12+) controls subpath exports
// package.json "type": "module" makes .js files use ESM
```
The `exports` field in package.json (introduced in Node 12) overrides the old resolution and lets package authors control exactly which files are exposed — preventing internal paths from being imported directly.
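A minimal sketch of what such an `exports` map might look like (the package name and paths are illustrative):

```json
{
  "name": "my-lib",
  "main": "./dist/index.js",
  "exports": {
    ".": "./dist/index.js",
    "./utils": "./dist/utils.js"
  }
}
```

With this map, `require('my-lib/dist/internal.js')` fails — only `.` and `./utils` resolve.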
Async patterns evolved to solve one compounding problem: how do you write code that waits for things without becoming unreadable or unsafe? Each generation fixed the previous generation's main pain point.
- Callbacks — the original pattern: `fs.readFile('f', (err, data) => { if (err) ... })`
- Promises — `.then()` chains replace nesting; one `.catch()` handles all upstream errors; still verbose for complex flows: `readFile('f').then(process).catch(handleError)`
- async/await — synchronous-looking code: `const data = await readFile('f');`
- Async iteration — streaming consumption: `for await (const chunk of stream) { ... }`
When you await a Promise, the async function is suspended and the event loop is freed to run other callbacks. When the Promise resolves, the function is re-queued as a microtask and resumes. No thread blocking — ever.
async/await is compiled to Promises. Promises use microtasks. Microtasks run between every event loop phase. Master the Promise model first — everything else is syntax on top.
await db.query() does NOT block the thread. It suspends only that async function's execution context. The event loop continues serving other requests while your query runs. This is the entire value of async in Node.js.
Understand the three states and how `.then` / `.catch` / `.finally` chaining works. When you order at a fast-food counter, they hand you a ticket stub. You don't stand blocking the counter. Instead:
- You hold the stub (pending) and do other things while food is prepared
- Your number is called — the stub redeems for real food (fulfilled)
- They ran out of ingredients — the stub gets cancelled with a reason (rejected)
- Once your number is called (or cancelled), it will never be called again — Promises settle exactly once
The stub works even if you show up late — attaching handlers via `.then()` to an already-settled Promise is valid; the handler fires immediately as a microtask.
A Promise is an object representing the eventual completion or failure of an async operation. It acts as a placeholder for a value that is not yet available.
| State | Meaning | Can transition to |
|---|---|---|
| pending | Initial state — neither fulfilled nor rejected | fulfilled or rejected |
| fulfilled | Operation completed successfully, has a value | — (terminal) |
| rejected | Operation failed, has a reason (error) | — (terminal) |
Calling `resolve()` after `reject()` has already been called is silently ignored — a Promise settles exactly once.
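A quick sketch of the settle-once rule in action:

```js
// A Promise settles exactly once — later resolve/reject calls are ignored
const p = new Promise((resolve, reject) => {
  reject(new Error('first — this wins'));
  resolve(42);                 // ignored — already settled
  reject(new Error('second')); // also ignored
});

p.catch(err => console.log(err.message)); // → 'first — this wins'
```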
```js
const p = new Promise((resolve, reject) => {
  // the executor runs SYNCHRONOUSLY
  setTimeout(() => resolve(42), 1000);
});

p
  .then(val => {
    console.log('Got:', val); // Got: 42
    return val * 2;           // value passed to the next .then
  })
  .then(val => console.log('Doubled:', val))  // Doubled: 84
  .catch(err => console.error('Error:', err)) // catches any error above
  .finally(() => console.log('Done'));        // always runs, passes the value through
```
```js
console.log('1 — sync');

Promise.resolve('hello')
  .then(v => console.log('3 —', v)); // microtask — runs AFTER the sync code

console.log('2 — sync');

// Output: 1 — sync, 2 — sync, 3 — hello
// Even though the Promise is already resolved, .then fires asynchronously
```
"A Promise has three states: pending, fulfilled, rejected — and once settled, it never changes. Each .then() returns a new Promise so you can chain. .catch() is shorthand for .then(null, handler). .finally() runs regardless of outcome and passes the value through — it's used for cleanup like hiding a loading spinner."
### What happens if you forget to `return` inside `.then()`?

Each `.then()` must return a value for the next handler to receive it. If you forget to return, the next `.then()` receives `undefined` — a very common silent bug.
```js
// ❌ Missing return — the next .then gets undefined
fetch('/api/user')
  .then(res => {
    res.json(); // ← no return!
  })
  .then(user => console.log(user)); // → undefined

// ✅ Correct — return the promise so it chains
fetch('/api/user')
  .then(res => res.json())          // returns the json() Promise
  .then(user => console.log(user)); // gets the actual user object
```
```js
// ❌ Nested .then — defeats the purpose of chaining
fetch('/api/user')
  .then(res => res.json()
    .then(user => fetch(`/api/orders/${user.id}`)
      .then(r => r.json())));

// ✅ Flat chain — return the inner promise to the outer chain
fetch('/api/user')
  .then(res => res.json())
  .then(user => fetch(`/api/orders/${user.id}`))
  .then(res => res.json())
  .then(orders => console.log(orders))
  .catch(err => console.error(err)); // one handler for all
```
Think of the chain as an assembly line: each `.then()` is a station that receives, transforms, and passes on the item. If a station doesn't hand the item back to the belt (no return), the next station gets nothing.
### How do `async`/`await` work under the hood? What does an `async` function actually return?

Think of an async function as a function that can pause its own execution at any `await` point — without blocking the thread. It's like a bookmark in a book.
You're watching live TV (your async function is running). Something comes up — you pause (await fires). You handle the interruption. When it's resolved, you resume from exactly where you paused. The TV station (event loop) never stopped broadcasting for everyone else while your TV was paused.
An async function can pause at `await` points without blocking the JS thread. Before the syntax existed, developers used `function*` (generator functions) + a Promise runner to achieve the same pause/resume behaviour. `async`/`await` is built on exactly that mechanism — V8 compiles it to generator-style code internally. `yield` in a generator = `await` in an async function.
- An `async` function always returns a Promise — even if you return a plain value, it gets wrapped in `Promise.resolve()`
- `await` suspends the function until the awaited Promise settles, then resumes with the resolved value (or throws on rejection)
```js
// These two are functionally equivalent:
async function fetchUser(id) {
  const res = await fetch(`/api/users/${id}`);
  const user = await res.json();
  return user;
}

function fetchUser(id) {
  return fetch(`/api/users/${id}`)
    .then(res => res.json());
}

// async always returns a Promise — even plain values get wrapped:
async function getValue() { return 42; }
getValue().then(console.log); // logs 42 — it IS a Promise

// await works on any thenable — even plain values:
const a = await 42;                       // valid — wraps in Promise.resolve(42)
const b = await Promise.resolve('hello');
```
```js
async function demo() {
  console.log('A');
  await Promise.resolve(); // suspends here — schedules the resume as a microtask
  console.log('C');        // runs AFTER the current sync code finishes
}
demo();
console.log('B');

// Output: A → B → C
// "B" runs before "C" because await yields to the event loop
```
async/await is transpiled to generator functions (function*) + a Promise-based runner by Babel/TypeScript. The yield in a generator suspends execution just like await does — the async/await syntax is just cleaner sugar on top.
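A minimal sketch of that generator-based mechanism (the `run` helper is illustrative — libraries like `co` follow this shape; for brevity, rejections here reject the outer Promise instead of being thrown back into the generator):

```js
// run() drives a generator: every yielded Promise is awaited, and its
// resolved value is fed back in — exactly what await does.
function run(genFn) {
  const gen = genFn();
  return new Promise((resolve, reject) => {
    function step(input) {
      let result;
      try {
        result = gen.next(input); // resume the generator with the last value
      } catch (err) {
        return reject(err);       // a throw inside the generator rejects run()
      }
      if (result.done) return resolve(result.value);
      Promise.resolve(result.value).then(step, reject);
    }
    step(undefined);
  });
}

// yield plays the role of await:
run(function* () {
  const a = yield Promise.resolve(1);
  const b = yield Promise.resolve(2);
  console.log(a + b); // → 3
});
```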
### What are the most common `async`/`await` mistakes? How do you spot and fix them?

```js
// ❌ user is a Promise object, not a user!
async function bad() {
  const user = getUser(1); // forgot await
  console.log(user.name);  // → undefined (no crash!)
}

// ✅
async function good() {
  const user = await getUser(1);
  console.log(user.name); // works
}
```
```js
// ❌ forEach ignores returned Promises — items process uncontrolled
async function processAll(items) {
  items.forEach(async (item) => {
    await processItem(item);
  });
  console.log('Done?'); // prints BEFORE the items finish!
}

// ✅ Option A: for...of (sequential)
async function processSequential(items) {
  for (const item of items) {
    await processItem(item); // truly waits for each
  }
  console.log('Done');
}

// ✅ Option B: Promise.all (parallel)
async function processParallel(items) {
  await Promise.all(items.map(item => processItem(item)));
  console.log('Done');
}
```
```js
// ❌ Total time: 300ms (100 + 200) — these are independent!
async function slow() {
  const user = await getUser(1);     // waits 100ms
  const orders = await getOrders(1); // then waits 200ms
}

// ✅ Total time: ~200ms (run together)
async function fast() {
  const [user, orders] = await Promise.all([
    getUser(1), // both start at the same time
    getOrders(1)
  ]);
}
```
If your `await` calls are not dependent on each other's result, run them in parallel with `Promise.all`. Sequential `await` is only correct when the second call needs the first's result.
In synchronous code, throw unwinds the call stack upward. In async code, an error inside an async function becomes a rejected Promise — it travels down the Promise chain instead, jumping to the nearest .catch().
Imagine a Promise chain as a series of water pipes. A crack (thrown error) at any point causes water (data) to stop flowing and instead activates a drain valve (the nearest .catch() downstream). All pipes between the crack and the drain are bypassed — their .then() handlers are skipped. If there's no drain at all, the water floods the floor — an unhandled rejection.
Inside an async function, a throw or a runtime error (like null.property) automatically rejects the returned Promise — no extra code needed.
If an awaited Promise rejects and there's no try/catch around it, the rejection propagates out of the async function as its own rejected Promise — bubbling up to the caller.
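A minimal sketch of that bubbling (the function names are illustrative):

```js
async function inner() {
  throw new Error('boom'); // rejects inner()'s Promise
}
async function middle() {
  return await inner();    // no try/catch → middle()'s Promise rejects too
}
async function outer() {
  try {
    await middle();        // the rejection surfaces here
  } catch (err) {
    console.log('caught at the top:', err.message); // → 'boom'
  }
}
outer();
```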
| Node version | Behaviour on unhandled rejection |
|---|---|
| Node 10–14 | Warning printed, process continues (dangerous!) |
| Node 15+ | Process crashes with exit code 1 (same as uncaught exception) |
```js
// ✅ try/catch — standard for async functions
async function fetchData(url) {
  try {
    const res = await fetch(url);
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return await res.json();
  } catch (err) {
    console.error('Fetch failed:', err.message);
    throw err; // re-throw so the caller can decide
  }
}

// ✅ Global safety net — last resort, NOT a substitute for local handling
process.on('unhandledRejection', (reason, promise) => {
  console.error('Unhandled rejection:', reason);
  process.exit(1); // an explicit crash is safer than corrupt state
});

// ✅ Express: async route errors are NOT caught by default
const asyncHandler = fn => (req, res, next) =>
  Promise.resolve(fn(req, res, next)).catch(next); // forwards to error middleware

app.get('/user/:id', asyncHandler(async (req, res) => {
  const user = await getUser(req.params.id);
  res.json(user);
}));
```
"In Node 15+ an unhandled rejection crashes the process — which is actually safer than continuing with corrupted state. The fix is always local try/catch in async functions and re-throwing when the caller needs to handle it. For Express, async errors aren't caught by default so you need a wrapper or the express-async-errors package."
### What are the `try/catch` scope gotchas with async code? When does it not catch what you expect?

```js
// ❌ The catch block never runs
async function tricky() {
  try {
    setTimeout(async () => {
      await mightFail(); // error thrown here...
    }, 1000);
    // ...but the try block has already exited by then!
  } catch (err) {
    console.log('never runs');
  }
}

// ✅ Move the try/catch inside the callback
setTimeout(async () => {
  try {
    await mightFail();
  } catch (err) {
    handleError(err);
  }
}, 1000);
```
```js
// ❌ If task2 rejects, you lose task1 and task3 results entirely
try {
  const [a, b, c] = await Promise.all([task1(), task2(), task3()]);
} catch (err) {
  // you only know SOMETHING failed, not which one
}

// ✅ allSettled when you need every result regardless of failures
const results = await Promise.allSettled([task1(), task2(), task3()]);
results.forEach((r, i) => {
  if (r.status === 'fulfilled') console.log(`task${i + 1}:`, r.value);
  else console.error(`task${i + 1} failed:`, r.reason);
});
```
```js
// ❌ This outer try/catch does NOT catch errors thrown inside .then()
try {
  somePromise().then(result => {
    throw new Error('oops'); // becomes a rejected Promise
  });
} catch (e) {
  console.log('never runs');
}

// ✅ Errors in .then() are caught by the next .catch() in the chain
somePromise()
  .then(result => { throw new Error('oops'); })
  .catch(e => console.log('caught:', e.message)); // works
```
### `Promise.all`, `Promise.allSettled`, `Promise.race`, and `Promise.any` — when do you use each?

| Method | Resolves when | Rejects when | Best for |
|---|---|---|---|
| Promise.all | ALL fulfill | ANY rejects (fail-fast) | Parallel calls that ALL must succeed |
| Promise.allSettled | ALL settle (any state) | Never | Bulk ops — want every result regardless |
| Promise.race | FIRST settles (resolve OR reject) | First settled Promise is a rejection | Timeout pattern, fastest resource wins |
| Promise.any | FIRST fulfills | ALL reject → AggregateError | Redundant sources, use fastest success |
```js
const p1 = new Promise(r => setTimeout(() => r('A'), 100));
const p2 = new Promise(r => setTimeout(() => r('B'), 200));
const p3 = new Promise((_, rej) => setTimeout(() => rej(new Error('C failed')), 150));

// .all — all or nothing
await Promise.all([p1, p2]);     // → ['A', 'B'] after 200ms
await Promise.all([p1, p2, p3]); // throws at 150ms, B still pending

// .allSettled — always waits, gives a status for each
await Promise.allSettled([p1, p2, p3]);
// → [{status:'fulfilled', value:'A'},
//    {status:'fulfilled', value:'B'},
//    {status:'rejected',  reason: Error}]

// .race — first to settle wins (resolve OR reject)
await Promise.race([p1, p2, p3]); // → 'A' at 100ms

// .any — first to FULFILL (skips rejections)
await Promise.any([p3, p1, p2]); // → 'A' at 100ms (p3's rejection ignored)
await Promise.any([p3]);         // throws AggregateError (all rejected)
```
```js
function withTimeout(promise, ms) {
  const timer = new Promise((_, reject) =>
    setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms)
  );
  return Promise.race([promise, timer]);
}

const data = await withTimeout(fetchData(), 5000);
```
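One caveat with the race pattern: the losing Promise keeps running in the background. When the operation supports cancellation — like Node 18+'s global `fetch` — an `AbortController` can actually stop it. A sketch:

```js
// A timeout that cancels the underlying request instead of just ignoring it
async function fetchWithTimeout(url, ms) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    const res = await fetch(url, { signal: controller.signal }); // throws on abort
    return await res.json();
  } finally {
    clearTimeout(timer); // don't leave a stray timer on success
  }
}
```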
You cannot Promise.all(items.map(processItem)) when items has thousands of entries — you'd fire thousands of requests simultaneously, overwhelming the server, exhausting connection pools, or triggering rate limits.
```js
// ❌ Fires ALL 1000 requests simultaneously
const results = await Promise.all(urls.map(url => fetch(url)));

// ✅ Process at most 5 at a time using a worker-pool pattern
async function pLimit(taskFns, concurrency) {
  const out = new Array(taskFns.length);
  let cursor = 0;
  async function worker() {
    while (cursor < taskFns.length) {
      const i = cursor++;
      out[i] = await taskFns[i](); // each worker picks the next available task
    }
  }
  // Start `concurrency` workers — they race to consume the tasks
  await Promise.all(Array.from({ length: concurrency }, worker));
  return out;
}

const tasks = urls.map(url => () => fetch(url).then(r => r.json()));
const limited = await pLimit(tasks, 5); // max 5 in flight at any time

// Or use the battle-tested p-limit package (same idea, more robust):
// (renamed here to avoid clashing with the local pLimit above)
import pLimitPkg from 'p-limit';
const limit = pLimitPkg(5);
const out2 = await Promise.all(
  urls.map(url => limit(() => fetch(url).then(r => r.json())))
);
```
### What is `for await...of`? When do you use it instead of `Promise.all`?

`for await...of` iterates over an async iterable — any object implementing `[Symbol.asyncIterator]()`. It awaits each item before the loop body runs and before requesting the next item. This gives you natural backpressure.
```js
const { createReadStream } = require('fs');

// Async generator — produces pages lazily, on demand
async function* paginate(url) {
  let page = 1;
  while (true) {
    const data = await fetch(`${url}?page=${page}`).then(r => r.json());
    if (!data.length) break;
    yield data; // pause until the consumer asks for the next page
    page++;
  }
}

// The consumer controls the pace — the next page is only fetched
// when the loop body finishes
for await (const page of paginate('/api/users')) {
  await processPage(page); // slow processing doesn't flood the API
}

// Node.js Readable streams are async iterables (Node 10+):
const stream = createReadStream('large-file.txt', { encoding: 'utf8' });
for await (const chunk of stream) {
  process(chunk); // backpressure handled automatically
}
```
| | for await...of | Promise.all |
|---|---|---|
| Parallelism | Sequential (one at a time) | All run concurrently |
| Source | Stream / generator / unknown size | Fixed array of known tasks |
| Backpressure | Natural — consumer sets pace | None — all start immediately |
| Memory | Low — one item at a time | All results held in memory |
| Use when | Pagination, file streams, lazy data | Parallel API calls, fixed batch |
| | Promise | EventEmitter | Async Generator |
|---|---|---|---|
| Values | Single, once | Multiple, unbounded | Multiple, lazy |
| Model | Pull (one-shot) | Push | Pull |
| Backpressure | N/A | None built-in | Natural |
| Error handling | .catch / try-catch | 'error' event | try-catch in loop |
| Cancel | AbortController | removeListener | .return() |
```js
// Promise — single value, resolves once
const data = await fetchUser(1); // done, can't receive more

// EventEmitter — the producer pushes multiple values (push model)
const stream = getDataStream();
stream.on('data', chunk => process(chunk)); // producer controls the pace
stream.on('end', () => console.log('done'));
stream.on('error', err => handleError(err));

// Async Generator — the consumer pulls values (pull model)
async function* getDataStream() {
  yield await fetch('/chunk/1').then(r => r.json());
  yield await fetch('/chunk/2').then(r => r.json());
}
for await (const chunk of getDataStream()) {
  await process(chunk); // the next fetch only starts when I'm ready
}

// A Node Readable stream can be consumed BOTH ways:
const rs = fs.createReadStream('file.txt');
rs.on('data', chunk => { /* push style */ });
for await (const chunk of rs) { /* pull style (same stream) */ }
```
"Promises are for a single async value. EventEmitters are for multiple values pushed by the producer — no built-in backpressure. Async generators are for multiple values where the consumer controls the pace, naturally handling backpressure. Node streams are EventEmitters that also expose an async iterator interface."
Before diving into the questions, build this mental model. Every concept in this segment connects back to one core problem: how do you move large amounts of data efficiently without running out of memory?
Imagine you want to read a 4 GB video file. If you load the whole thing into RAM first, your server needs at least 4 GB free — just for one request. Ten concurrent users? 40 GB. This is why servers crash.
Instead of loading everything at once, a stream reads a small chunk (e.g. 64 KB), hands it to you, then reads the next chunk. Memory usage stays constant — 64 KB — no matter how big the file is.
When you watch Netflix, the video streams to you — a few seconds at a time. You don't wait for the full 2 GB movie to download before you can watch it. Node.js streams work exactly the same way: data flows piece by piece rather than all at once.
### What is a `Buffer` in Node.js? How does it differ from a `Uint8Array` and a regular JavaScript array?

Computers only understand numbers. Every piece of data — text, images, video, a JSON file — is ultimately stored as a sequence of numbers from 0 to 255. Each of these numbers is called a byte (8 bits). A byte can represent 256 different values (2⁸).
Computers agreed on a standard called ASCII (later UTF-8) that maps letters to numbers. The letter 'H' = 72, 'e' = 101, 'l' = 108, 'o' = 111. So the word "Hello" is stored in memory as the five bytes: 72 101 108 108 111. A Buffer lets you see and manipulate these raw bytes directly.
```
Character:   H        e        l        l        o
Decimal:     72       101      108      108      111
Hex:         0x48     0x65     0x6c     0x6c     0x6f
Binary:      01001000 01100101 01101100 01101100 01101111

Buffer: <Buffer 48 65 6c 6c 6f>   ← Node.js shows hex by default
```
JavaScript was designed for web pages — it only had strings and numbers. When Node.js arrived and needed to work with files, TCP sockets, and image data, strings were not enough — they add encoding overhead and can't represent arbitrary bytes. Buffer was created to fill this gap.
A string always applies an encoding (UTF-16 internally). You cannot efficiently represent raw image pixels, encrypted data, or network protocol headers as a string.
A Buffer holds arbitrary bytes — no encoding applied. You can read a PNG file byte by byte, or build a TCP packet header with exact byte values. Full control.
V8 (the JS engine) manages its own memory region called the heap — this is where your JavaScript objects, arrays, and strings live. The Garbage Collector (GC) watches the heap and frees unused objects automatically.
Buffers are allocated outside this heap, directly in C++ memory managed by Node.js. This means:
- Buffers don't add GC pressure — the GC doesn't need to scan them
- You can allocate large Buffers without hitting V8's heap size limit
- The downside: you must be careful about when they get freed
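You can see this split with `process.memoryUsage()` — a minimal sketch (exact numbers vary by platform and Node version):

```js
const before = process.memoryUsage();
const big = Buffer.alloc(100 * 1024 * 1024); // 100 MB, outside the V8 heap
const after = process.memoryUsage();

console.log('heapUsed growth (MB):',
  ((after.heapUsed - before.heapUsed) / 1e6).toFixed(1)); // ≈ 0 — tiny
console.log('external growth (MB):',
  ((after.external - before.external) / 1e6).toFixed(1)); // ≈ 100
```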
| | Buffer | Uint8Array | Regular Array |
|---|---|---|---|
| Memory location | Outside V8 heap (C++) | V8 heap (typed) | V8 heap (dynamic) |
| Values stored | Integers 0–255 (bytes) | Integers 0–255 | Any JS value |
| Size after creation | Fixed — cannot grow | Fixed — cannot grow | Dynamic — can push/pop |
| Encoding helpers | Yes — .toString(), .write()… | No | No |
| Performance for binary | Fastest | Fast | Slow |
| Works in browser | No — Node.js only | Yes | Yes |
Buffer is a subclass of Uint8Array. Every Buffer is a Uint8Array, but not every Uint8Array is a Buffer. Buffer adds Node-specific encoding/decoding methods on top.
```js
const buf = Buffer.from('Hello');
console.log(buf);        // <Buffer 48 65 6c 6c 6f> (hex values)
console.log(buf[0]);     // 72 — decimal value of 'H'
console.log(buf.length); // 5 — bytes, not characters

// ⚠ bytes ≠ characters for non-ASCII text
const euro = Buffer.from('€');
console.log(euro.length); // 3 — '€' needs 3 bytes in UTF-8
console.log('€'.length);  // 1 — JS strings measure UTF-16 code units

// Buffer IS a Uint8Array (subclass check)
console.log(buf instanceof Uint8Array);          // true
console.log(Buffer.isBuffer(buf));               // true
console.log(Buffer.isBuffer(new Uint8Array(5))); // false
```
"Buffer is Node's mechanism for working with raw binary data — the actual bytes flowing through files and networks. It's allocated outside V8's heap so it doesn't impact garbage collection, and since Node 4 it's a subclass of Uint8Array for browser compatibility. The key value over a plain Uint8Array is the encoding/decoding helpers like toString('base64') that are essential for I/O work."
### What's the difference between `Buffer.alloc`, `Buffer.allocUnsafe`, and `Buffer.from`? When does it matter?

When you create a Buffer, Node.js asks the operating system for a chunk of RAM. Think of RAM like a whiteboard. There are two ways to get a section of whiteboard:
Someone cleans the section before handing it to you — no traces of previous writing. Slower, but you know exactly what's on it: zeros.
You get the section immediately — but someone else's writing might still be there. Faster, but you might see their old data (passwords, private keys).
`Buffer.alloc(size)` fills every byte with `0x00` (zero). Safe — no leftover data. Use this when you'll write into it gradually.

```js
// Buffer.from — create from existing data
Buffer.from('hello', 'utf8');    // from a string, with an encoding
Buffer.from([0x48, 0x65, 0x6c]); // from an array of byte values
Buffer.from(anotherBuf);         // copy of another buffer

// Buffer.alloc — safe, zeroed memory
const safe = Buffer.alloc(10);
console.log(safe); // <Buffer 00 00 00 00 00 00 00 00 00 00> — all zeros

// Buffer.allocUnsafe — fast but raw memory (may contain old bytes!)
const unsafe = Buffer.allocUnsafe(10);
// May print: <Buffer a3 00 f2 11 ...> — leftover bytes from earlier allocations
unsafe.fill(0); // you MUST overwrite it yourself before exposing it
```
When memory inside your Node.js process is freed (from a previous Buffer, for example), the bytes are not erased — they're just marked "available." `allocUnsafe` hands you that raw memory immediately. If it previously held a user's password or auth token from elsewhere in your application, and you send your Buffer over the network before overwriting it, you've just leaked sensitive data.
Buffer.allocUnsafe only when you will immediately overwrite the entire buffer (e.g. a stream fills it directly from disk). For anything you might expose or send externally, use Buffer.alloc.
An encoding is the rule for converting between bytes and text. The same bytes mean different things under different encodings.
| Encoding | What it does | Use it for |
|---|---|---|
utf8 | Standard text — 1–4 bytes per character. Supports every language. | Reading/writing text files, JSON, most web data |
hex | Each byte shown as 2 hex digits (0–9, a–f). Not compressed. | Debugging, checksums, crypto hashes |
base64 | Converts binary to printable ASCII (A–Z, a–z, 0–9, +, /). ~33% larger. | Embedding images in HTML, JWT tokens, email attachments |
ascii | 7-bit ASCII only — 1 byte per character, English only. | Legacy systems, simple protocols |
binary | Latin-1 — one byte per character, 256 characters max. | Binary protocol headers, legacy data |
```js
const buf = Buffer.from('hello');
buf.toString('utf8');   // → 'hello'      (human-readable text)
buf.toString('hex');    // → '68656c6c6f' (good for debugging)
buf.toString('base64'); // → 'aGVsbG8='   (good for HTTP/JSON)

// Round-trip: base64 → Buffer → utf8
const encoded = buf.toString('base64');                    // 'aGVsbG8='
const decoded = Buffer.from(encoded, 'base64').toString(); // 'hello'

// Utility operations
const combined = Buffer.concat([buf1, buf2, buf3]); // join multiple buffers
const isEqual = buf1.equals(buf2); // byte-by-byte compare
// (for secrets, use crypto.timingSafeEqual — .equals is NOT constant-time)
```
A stream is not a new idea — it is the same concept as a factory assembly line. Raw material arrives at one end, workers transform it piece by piece, and finished products come out the other end. No worker waits for every piece in the world to arrive before starting work.
Without streams (load everything first):

```
┌────────────────────────────────────────────────────┐
│  Read ENTIRE 2 GB file into RAM → process → write  │
│  Memory needed: 2 GB                               │
└────────────────────────────────────────────────────┘
```

With streams (process chunk by chunk):

```
┌──────────┐  chunk   ┌───────────┐  chunk   ┌──────────┐
│   READ   │ ───────► │  PROCESS  │ ───────► │  WRITE   │
│ (source) │  64 KB   │(transform)│  64 KB   │  (sink)  │
└──────────┘          └───────────┘          └──────────┘
Memory needed: ~64 KB (just one chunk at a time)
```
Every Node.js stream is an EventEmitter — meaning it communicates by firing named events ('data', 'end', 'error') rather than returning values directly. You listen for events to know when data arrives or when the stream is done.
A Readable stream is like a newspaper being printed: pages come off the press one at a time (chunks). A Writable stream is the delivery van — it accepts pages and eventually delivers the complete paper. A Transform stream is the editor in between, reviewing each page before it goes to the van. You don't wait for the entire edition to be printed before delivery begins.
The internal buffer is a small waiting area between two stages. If the van is full, the press operator doesn't just throw papers on the floor — the press pauses. That "van is full, press pauses" signal is backpressure (covered in Topic C).
Data can flow in different directions depending on the use case. Node.js has four stream types to cover every direction and combination. All four extend EventEmitter, so they communicate through events ('data', 'end', 'drain', 'error', etc.).
| Type | Direction | Analogy | Key events | Real examples |
|---|---|---|---|---|
| Readable | Produces data — flows out to you | A tap you open to get water | `data`, `end`, `error` | `fs.createReadStream`, HTTP request body, `process.stdin` |
| Writable | Consumes data — you pour data in | A drain that accepts water | `drain`, `finish`, `error` | `fs.createWriteStream`, HTTP response, `process.stdout` |
| Duplex | Both — independently in each direction | A telephone — you send and receive separately | Both sets | TCP socket (`net.Socket`), WebSocket |
| Transform | Both — but input becomes the output | A blender — what goes in comes out changed | Both sets | `zlib.createGzip()`, crypto cipher, CSV parser |
A regular function returns a value when called. A stream cannot do that — data arrives later and in pieces. Instead, a stream fires events. Your code listens for those events:
- When a chunk is ready, the stream emits `'data'` with that chunk as the argument.
- Your `'data'` listener runs with the chunk — process it, write it, transform it.
- When the source is exhausted, the stream emits `'end'`. Your cleanup code runs here.
- If anything goes wrong, the stream emits `'error'`. Always handle this or the process crashes.

Every stream has a small internal queue called its internal buffer. Data accumulates there between the time it is produced and the time the consumer reads it. The `highWaterMark` setting controls the maximum size of that queue before the stream signals "I'm full, slow down."
```js
const fs = require('fs');
const zlib = require('zlib');

// Readable — produces data from disk chunk by chunk
const readable = fs.createReadStream('input.txt', {
  highWaterMark: 64 * 1024 // 64 KB internal buffer (default)
});

// Transform — compresses each chunk as it flows through
const gzip = zlib.createGzip();

// Writable — writes compressed chunks to a file
const writable = fs.createWriteStream('output.txt.gz');

// pipe() connects them: readable feeds gzip, which feeds writable
readable.pipe(gzip).pipe(writable);

// Listening to events from each stream
readable.on('data', chunk => console.log('read chunk:', chunk.length));
readable.on('end', () => console.log('reading done'));
writable.on('finish', () => console.log('file fully written'));
readable.on('error', err => console.error('read error:', err));
```
A Readable stream can deliver data in two ways. The question is: does the stream push data to you automatically, or do you ask for it explicitly?
Paused mode is like a tap: water only flows when you turn the handle. You are in control. Flowing mode is like an automatic sprinkler: water comes out on its own schedule. If your bucket is not ready when the water arrives, it spills on the floor (data loss).
```
┌──────────────────────────────────┐
│ Created — starts in PAUSED mode │
└──────────────┬───────────────────┘
│
┌────────────────────────▼──────────────────────────┐
│ PAUSED MODE │
│ Data sits in internal buffer waiting to be pulled │
│ You call stream.read() to get a chunk │
└──────┬──────────────────────────────┬─────────────┘
│ │
┌────────────▼──────────┐ ┌─────────────▼────────────┐
│ stream.on('data', …) │ │ stream.pipe() │
│ stream.resume() │ │ (also switches to flow) │
└────────────┬──────────┘ └─────────────┬────────────┘
│ │
┌──────▼──────────────────────────────▼─────────────┐
│ FLOWING MODE │
│ Stream pushes data to you as fast as it arrives │
│ 'data' event fires for every chunk │
└────────────────────────┬───────────────────────────┘
│
stream.pause() / stream.unpipe()
│
back to PAUSED
```
The 'readable' event fires when there is data available in the internal buffer. You call stream.read(size) to pull a chunk. This gives you fine-grained control over exactly how much you consume at once.
```js
const stream = fs.createReadStream('file.txt');

// The stream starts paused — nothing is flowing yet
stream.on('readable', () => {
  // Fires whenever data becomes available in the internal buffer
  let chunk;
  while ((chunk = stream.read(1024)) !== null) {
    // read(1024) pulls 1024 bytes at a time;
    // returns null when the internal buffer is empty
    processChunk(chunk);
  }
});

stream.on('end', () => console.log('All data consumed'));
```
Attaching a 'data' listener switches the stream to flowing mode immediately. From that point, 'data' fires for each chunk as fast as the source can produce it. You can call stream.pause() to stop the flow and stream.resume() to restart it.
```js
const stream = fs.createReadStream('file.txt');

// Attaching 'data' immediately switches to FLOWING mode
stream.on('data', chunk => {
  console.log('Got chunk of', chunk.length, 'bytes');
  // If processing this chunk is slow, pause the stream:
  stream.pause(); // stop the flow
  doSlowWork(chunk).then(() => {
    stream.resume(); // resume when ready
  });
});

stream.on('end', () => console.log('stream finished'));
stream.on('error', err => console.error('error:', err));

// In practice: pipe() does the pause/resume automatically
stream.pipe(writable); // you almost never manage modes manually
```
If the stream switches to flowing mode (via `.resume()` or an early `pipe()`) before you attach your `'data'` listener, chunks emitted in that gap are permanently lost — no error, no warning. Always attach listeners before any flow-triggering call.
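A minimal sketch of the trap (`file.txt` is illustrative):

```js
const fs = require('fs');

const stream = fs.createReadStream('file.txt');
stream.resume(); // flowing mode with no consumer — chunks are discarded

setTimeout(() => {
  // Attached too late: everything emitted during the gap is gone
  stream.on('data', chunk => console.log('got', chunk.length, 'bytes'));
}, 1000);
```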
### What is backpressure, and how does `pipe()` handle it automatically?

Before we touch code, understand the real-world problem. A producer is anything that generates data (reading a file from disk, receiving bytes over a network). A consumer is anything that processes that data (writing to another disk, inserting into a database). Producers and consumers rarely run at the same speed.
A fire hose pumps water at 500 litres per minute. Your bucket drains at 10 litres per minute. Within seconds the bucket overflows — water goes everywhere.
In Node.js: the "fire hose" is your Readable stream (disk reads at 500 MB/s). The "bucket" is your Writable stream (database writes at 10 MB/s). The unprocessed data doesn't vanish — it piles up in RAM. Eventually the server runs out of memory and crashes.
Backpressure is the signal that lets the bucket shout "STOP!" to the hose when full, and "GO!" when it has room again. In Node this signal is the return value of write().
1. The producer calls `write(chunk)` on the Writable.
2. When the Writable's internal buffer is full, `write()` returns `false` — meaning "I'm full, stop sending."
3. A naive producer ignores the `false` and keeps pushing. The buffer grows 64 KB → 1 MB → 500 MB → OOM crash (process killed).
4. A well-behaved producer calls `pause()` and stops reading. The Writable drains its buffer and emits `'drain'`.
5. The producer hears `'drain'` and calls `resume()`. Data flows again. Memory stays constant — always around 64 KB.

Backpressure is the mechanism that lets a slow consumer signal a fast producer to slow down. Without it, unread data accumulates in the stream's internal buffer, consuming unbounded memory — and eventually crashing the process.
```
Producer (network, disk)  →  [internal buffer]  →  Consumer (disk, DB)
   reads: 500 MB/s              fills up!            writes: 10 MB/s
```

- Without backpressure: the buffer grows until an OOM crash
- With backpressure: the producer pauses when the buffer is full and resumes on `'drain'`
```js
// pipe() is roughly equivalent to this:
readable.on('data', (chunk) => {
  const ok = writable.write(chunk);
  // write() returns false when the internal buffer is full (highWaterMark reached)
  if (!ok) {
    readable.pause(); // stop asking for more data
    writable.once('drain', () => {
      readable.resume(); // the writable has space again — keep going
    });
  }
});
readable.on('end', () => writable.end());

// pipe() does all of this for you in one line:
readable.pipe(writable);
```
```js
const fs = require('fs');
const { Readable } = require('stream');

// Default highWaterMark = 16 KB for generic byte streams
// (fs.createReadStream defaults to 64 KB)
const stream = fs.createReadStream('file.txt', {
  highWaterMark: 64 * 1024 // 64 KB chunks — larger = fewer I/O ops, more memory
});

// For object streams (highWaterMark = 16 objects by default)
const objectStream = new Readable({
  objectMode: true,
  highWaterMark: 100 // buffer up to 100 objects
});
```
"Backpressure is the feedback signal from a slow consumer to a fast producer saying 'pause until I'm ready.' pipe() implements it by checking the return value of writable.write() — when it returns false the buffer is full and pipe() pauses the readable. When the writable emits 'drain', pipe() resumes. Without this, data piles up in memory and you OOM."
OOM = Out Of Memory. Every Node.js process has a memory limit (roughly 1.5 GB on 64-bit systems by default, though this can be increased). When your code allocates more memory than that limit, Node.js does not gracefully handle it — the OS kills the process with a fatal error:
```
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: 0xb7c9e0 node::Abort() [node]
...
Aborted (core dumped)
```

Your entire server is dead. Every active request is dropped. Users get connection refused. Logs stop. Alerts fire.
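The heap cap can be raised with a V8 flag — though for I/O-heavy workloads, fixing backpressure is the real cure, not a bigger heap:

```bash
# Raise the old-space heap limit to 4 GB (value in MB)
node --max-old-space-size=4096 app.js
```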
Imagine 50 concurrent users uploading 100 MB files. Each upload reads a file and writes to a slow database. Without backpressure, each upload holds its entire 100 MB in the Writable's buffer while waiting for the DB. That is 50 × 100 MB = 5 GB of buffered data — well above the 1.5 GB limit. Server crashes, taking all 50 users' uploads with it.
With backpressure: each stream pauses when the DB is busy. Actual memory usage: 50 × 64 KB ≈ 3 MB. Server never breaks a sweat.
```js
const src = fs.createReadStream('/dev/urandom'); // fast source
const dest = fs.createWriteStream('output.bin'); // slow sink

// ❌ Ignores backpressure — the writable's buffer grows without bound
src.on('data', (chunk) => {
  dest.write(chunk); // return value ignored — the buffer fills up!
});
// Result: dest's internal buffer balloons → GC pressure → eventual OOM
```
```js
// ✅ Manual backpressure handling
src.on('data', (chunk) => {
  const canContinue = dest.write(chunk); // false = buffer full
  if (!canContinue) {
    src.pause();                            // stop producing
    dest.once('drain', () => src.resume()); // resume when drained
  }
});
src.on('end', () => dest.end());
```
```js
// ✅ pipe() — handles backpressure, but errors don't propagate
src.pipe(dest);

// ✅ pipeline() — handles backpressure AND errors AND cleanup
const { pipeline } = require('stream/promises');
await pipeline(
  fs.createReadStream('huge.txt'),
  zlib.createGzip(),
  fs.createWriteStream('huge.txt.gz')
);
// If any stream errors: all streams are destroyed, file descriptors closed
```
### How does a Transform stream work — what do `_transform` and `_flush` do?

Picture a factory with two conveyor belts — one bringing in raw parts, one carrying out finished goods. A machine in the middle takes each raw part, modifies it (drills a hole, paints it, stamps it), and places the finished piece on the outgoing belt.
That machine is a Transform stream. The incoming belt is the writable side (you write data in). The outgoing belt is the readable side (transformed data comes out). The _transform() method is the work the machine does to each piece.
1. Data arrives on the writable side — via `write(chunk)` or via `pipe()` from an upstream source.
2. Your `_transform(chunk, encoding, callback)` method is called. This is where you do the work: uppercase it, compress it, parse it, encrypt it.
3. Call `this.push(result)` to put the transformed data on the readable (outgoing) side. You can push zero, one, or many chunks per input chunk.
4. Call `callback()` to tell the stream "I'm done with this chunk — send me the next one." The stream blocks new input until you call `callback`.
5. When the input ends, `_flush(callback)` is called once. Push any remaining buffered output here, then call `callback` to end the stream.

`_flush` is called exactly once at the end — it is your last chance to push anything remaining.
A Transform stream is a Duplex where data written to the writable side is transformed and appears on the readable side. You implement _transform(chunk, encoding, callback) to define the transformation. _flush(callback) is called once when the writable side ends — your chance to push any remaining buffered output.
```js
const fs = require('fs');
const { Transform } = require('stream');

class UpperCaseTransform extends Transform {
  _transform(chunk, encoding, callback) {
    // chunk is a Buffer (or a string if decodeStrings: false)
    const upper = chunk.toString().toUpperCase();
    this.push(upper); // push to the readable side
    callback();       // signal the chunk is fully processed (ready for the next)
    // shorthand: callback(null, upper) — push + signal in one call
  }

  _flush(callback) {
    // Input ended — push anything remaining (e.g. buffered partial lines)
    this.push('\n--- END OF DOCUMENT ---\n');
    callback();
  }
}

// Usage
fs.createReadStream('input.txt')
  .pipe(new UpperCaseTransform())
  .pipe(fs.createWriteStream('output.txt'));
```
```js
const { Transform } = require('stream');

class CSVLineParser extends Transform {
  constructor() {
    super({ objectMode: true }); // output objects, not Buffers
    this._buffer = '';
    this._headers = null;
  }

  _rowFromLine(line) {
    const vals = line.split(',');
    return Object.fromEntries(this._headers.map((h, i) => [h, vals[i]]));
  }

  _transform(chunk, _, cb) {
    this._buffer += chunk.toString();
    const lines = this._buffer.split('\n');
    this._buffer = lines.pop(); // keep the incomplete last line for later

    for (const line of lines) {
      if (!this._headers) {
        this._headers = line.split(',');
        continue;
      }
      this.push(this._rowFromLine(line)); // push the parsed object downstream
    }
    cb();
  }

  _flush(cb) {
    // Parse whatever is left (the final line has no trailing newline)
    if (this._buffer.trim() && this._headers) this.push(this._rowFromLine(this._buffer));
    cb();
  }
}
```
Both Duplex and Transform streams have a writable side (data goes in) and a readable side (data comes out). The crucial difference is whether those two sides are connected to each other.
The writable and readable sides do not know about each other. Data written in does NOT become the data that comes out. They are separate channels that happen to share one object.
Real example: a TCP socket — you write a request to the server, and the server writes a response back. Those are two separate data flows.
What you write in IS what comes out — but modified. The writable side feeds directly into the readable side through your _transform() logic.
Real example: gzip — you write plain text in, compressed bytes come out. Same data, transformed.
Duplex:
[write side] ──────────────────────── you write data in
[read side] ──────────────────────── data comes out (independently produced)
↑ Two unconnected pipes sharing one object
Transform:
[write side] → [_transform()] → [read side]
↑ Data written in flows through your logic and emerges out the other end
| | Duplex | Transform |
|---|---|---|
| Relationship | Read/write sides are independent | Written data becomes readable data (possibly modified) |
| Analogy | Telephone — you talk and listen separately | Blender — input becomes transformed output |
| Internal buffer | Two separate buffers | Two separate buffers, but data flows from write to read |
| Example | TCP socket, WebSocket | gzip, crypto cipher, serializer |
const { Duplex, Transform } = require('stream'); // Duplex — read and write sides are completely independent class EchoDuplex extends Duplex { _read(size) { // Produce data for the readable side (independently) this.push('data from readable side'); this.push(null); // end the readable side } _write(chunk, enc, cb) { // Handle incoming writes (independently of _read) console.log('received:', chunk.toString()); cb(); } } // Transform — writable input IS the readable output (transformed) class ReverseTransform extends Transform { _transform(chunk, enc, cb) { // What comes in, goes out reversed const reversed = chunk.toString().split('').reverse().join(''); cb(null, reversed); // push reversed chunk to readable side } }
Node.js runs on the V8 JavaScript engine (the same engine inside Google Chrome). V8 manages memory using a garbage collector, and by default it limits your JavaScript heap to roughly 1.5 GB on 64-bit systems. If your code tries to allocate more than that, Node.js does not gracefully degrade — the process aborts with a fatal "JavaScript heap out of memory" error.
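You can check the limit on your own machine — a minimal sketch using the built-in v8 module (the flag value at the end is an example, not a recommendation):

const v8 = require('v8');

// heap_size_limit = the ceiling V8 will allow before aborting
const limitMB = v8.getHeapStatistics().heap_size_limit / 1024 / 1024;
console.log(`V8 heap limit: ${limitMB.toFixed(0)} MB`);

// Raise the old-space ceiling at startup if you genuinely need more:
//   node --max-old-space-size=4096 server.js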
Here is the critical misconception beginners have: "I used fs.readFile which is async, so I'm safe." Wrong. fs.readFile is non-blocking, meaning it does not freeze your event loop while reading — but it still loads the entire file into a single Buffer in memory before calling your callback. Async only means "don't block the event loop." It says nothing about memory.
fs.readFile is like photocopying an entire 2000-page book before you read a single word. You need a warehouse of space for the copies before you can start.
fs.createReadStream is like reading the book normally: open to page 1, read it, move to page 2. You only ever need a desk — regardless of whether the book has 200 pages or 20,000.
Picture 100 concurrent requests, each calling fs.readFile on a 50 MB file: that holds 100 × 50 MB = 5 GB in memory simultaneously. With streaming, the same 100 requests use 100 × 64 KB ≈ 6.4 MB. The difference is not academic — one crashes, one doesn't.
| Approach | RAM for a 2 GB file | When it fails |
|---|---|---|
| fs.readFileSync | ~2 GB allocated at once | Any file larger than available heap |
| fs.readFile (async) | ~2 GB — still all in memory | Same problem, just non-blocking |
| createReadStream | ~64 KB (default highWaterMark) | Never (constant regardless of file size) |
| readline interface | ~1 line at a time | Never |
const fs = require('fs');
const readline = require('readline');

// ❌ Loads entire 2 GB log file into memory
const content = fs.readFileSync('huge.log', 'utf8');
const errorLines = content.split('\n').filter(l => l.includes('ERROR'));

// ✅ Streams line by line — constant ~64 KB memory
// (top-level await requires an ES module or an async wrapper)
const rl = readline.createInterface({
  input: fs.createReadStream('huge.log'),
  crlfDelay: Infinity
});
const errors = [];
for await (const line of rl) {
  if (line.includes('ERROR')) errors.push(line);
}

// ✅ Stream a file download directly to disk without buffering in memory
const { pipeline } = require('stream/promises');
const https = require('https');
https.get('https://example.com/huge-file.zip', async (response) => {
  await pipeline(response, fs.createWriteStream('huge-file.zip'));
  // the HTTP response IS a Readable stream — no temp buffer needed
});
What is stream.pipeline()? How does it differ from pipe() in error handling and resource cleanup?

When your program opens a file, the operating system assigns it a small integer called a file descriptor (FD). Think of it as a locker key at a train station — the OS keeps a table of who is using which file, and the FD is the key number. Your Node process has a limited supply of these keys — typically 1024 by default on Linux/macOS.
When you are done with a file, you must close it — return the key. If your code errors out mid-stream and forgets to close the file, that FD is never returned. Each leaked FD silently occupies a slot in the OS table. This is a resource leak.
- With pipe(): an error is NOT forwarded to the WriteStream. FD #6 (the output file) stays open. Key #6 is never returned.
- Leaked FDs accumulate until the process hits EMFILE: too many open files. No new files, sockets, or connections can be opened. The server is broken.
- With pipeline(): any error destroys all streams in the chain automatically. FDs are returned. No leak. A single catch handles everything.

pipe() has been in Node.js since the very beginning. It is widely used in tutorials and examples. But it has a fundamental flaw: errors on one stream do not automatically destroy the other streams. This is a long-standing design mistake that pipeline() was built to fix. Always prefer pipeline() in production code.
| | pipe() | pipeline() |
|---|---|---|
| Errors propagate | No — each stream must handle its own 'error' event | Yes — one error destroys all streams |
| Stream cleanup on error | No — streams left open (file descriptor leak) | Yes — all streams automatically destroyed |
| Promise / callback | Synchronous return (the destination stream) | Callback or Promise (via stream/promises) |
| Introduced | Original Node.js | Node.js 10 (callback), Node 15 (promise version) |
const r = fs.createReadStream('in.txt'); const z = zlib.createGzip(); const w = fs.createWriteStream('out.gz'); r.pipe(z).pipe(w); // ❌ If r errors (file not found): // - Error event emitted on r // - z and w are NOT destroyed // - w's file descriptor stays open → resource leak! // - Unhandled 'error' event crashes the process // To do it properly with pipe() you need this boilerplate: [r, z, w].forEach(s => s.on('error', err => { [r, z, w].forEach(s => s.destroy()); console.error(err); }));
const { pipeline } = require('stream/promises'); const fs = require('fs'); const zlib = require('zlib'); async function compress(input, output) { try { await pipeline( fs.createReadStream(input), zlib.createGzip(), fs.createWriteStream(output) ); console.log('Compressed successfully'); } catch (err) { console.error('Compression failed:', err.message); // All three streams are already destroyed — no cleanup needed } } // pipeline also accepts async generators as stages: await pipeline( fs.createReadStream('data.csv'), async function* (source) { for await (const chunk of source) { yield chunk.toString().toUpperCase(); // transform inline } }, fs.createWriteStream('data-upper.csv') );
"Always prefer pipeline() over pipe(). pipe() doesn't propagate errors — if any stream in the chain errors, the others are left open, leaking file descriptors. pipeline() handles it automatically: one error destroys all streams in the chain. Use the promise version from stream/promises for clean async/await syntax."
Everything in Segments 1–3 relied on one fact: Node.js never blocks its event loop on I/O. That works brilliantly for waiting on files, databases, and networks. But there is a completely different category of work — CPU-bound computation — where no I/O is involved and the event loop cannot help.
I/O-bound work: reading a file, querying a DB, making an HTTP request. The CPU sits idle while waiting for the response. Node offloads the wait to libuv and handles thousands of these concurrently.
CPU-bound work: image resizing, video encoding, heavy JSON parsing, cryptographic hashing, ML inference. The CPU itself is busy — 100% — with no I/O to yield on. The event loop is frozen until it finishes.
Your Node.js process is like a single chef running a busy restaurant. The chef is great at multitasking — starting a dish, putting it in the oven, then starting another while the first bakes. That's I/O-bound work: the oven waits, the chef keeps moving.
But imagine the chef stops to solve a 10,000-piece jigsaw puzzle. While solving it, the chef cannot plate food, take orders, or respond to anyone. Every customer waits. That is CPU-bound work on the main thread — one calculation blocks every other request until it finishes.
Why doesn't async/await fix CPU-bound work?

This is one of the most common Node.js misconceptions. async/await and Promises only help with waiting — they free up the event loop while your code waits for something external (a file, a network response, a timer). They do not split computation across multiple CPU cores.
Imagine you need to add up a billion numbers. Making this function async does nothing — the CPU still has to do every single addition, one by one, on the same thread. await only yields control to the event loop at an await point. If there is no I/O to wait for, there is no yield point, and the loop stays frozen.
// ❌ This FREEZES the event loop for ~2 seconds on a modern CPU // Every other request waits. Health checks time out. Metrics drop. app.get('/hash', (req, res) => { const result = expensiveHash(req.body.data); // runs for 2s on main thread res.json({ result }); }); // ❌ Making it async does NOT fix the problem // "await" only helps if there's I/O to yield on app.get('/hash', async (req, res) => { const result = await expensiveHash(req.body.data); // still 2s on main thread! res.json({ result }); });
A Worker Thread is a completely separate OS thread with its own V8 instance and event loop. When you offload CPU work to a worker, the main thread's event loop is free to handle new requests while the worker runs the computation in parallel.
The offload pattern: the main thread creates a Worker and hands it the input via workerData or postMessage(). The worker runs the computation and sends the result back via postMessage(). The main thread receives it and responds.

const { Worker } = require('worker_threads');

app.get('/hash', (req, res) => {
  const worker = new Worker('./hash-worker.js', {
    workerData: { data: req.body.data }
  });
  worker.once('message', result => res.json({ result }));
  worker.once('error', err => res.status(500).json({ error: err.message }));
  // Main thread event loop is FREE while the worker runs
});
"async/await only helps with I/O — it yields the event loop while waiting for external resources. CPU-bound work has nothing to yield on, so it freezes the loop. Worker Threads run computation on a separate OS thread with its own V8 instance, leaving the main event loop free to handle new requests."
How do worker_threads, child_process, and cluster differ? When do you use each?

Node.js has three different mechanisms for doing work outside the main event loop. Beginners often confuse them because they all "run code somewhere else." The key is understanding what they share, how they communicate, and what they cost.
worker_threads: Your colleague sits at the same desk, shares your files and whiteboard, and whispers results to you. Fast, low overhead, same process memory.
child_process: You hire a contractor who works in a different office. They get their own computer, can run a totally different program (Python, Bash), and send you reports by email. Isolated, flexible, but slower to set up.
cluster: You clone yourself. Multiple copies of the exact same program run simultaneously, all answering calls on the same phone number. Each clone is independent — a crash in one doesn't affect the others.
| | worker_threads | child_process | cluster |
|---|---|---|---|
| Process | Same process | Separate process | Separate process (fork of main) |
| Memory | Shared heap possible (SharedArrayBuffer) | Completely separate | Completely separate |
| Communication | postMessage (fast) | IPC / stdin-stdout (slower) | IPC message passing |
| Language | JavaScript only | Any (Python, Ruby, shell) | JavaScript only |
| Use case | CPU-bound JS computation | Run external programs, shell commands | Scale HTTP servers across CPU cores |
| Overhead | Low (shared process) | High (spawn a process) | High (fork a process) |
When using worker_threads, you write two separate pieces of code: the main thread that creates the worker and listens for results, and the worker script that does the computation and posts results back. Both communicate via postMessage() and the 'message' event.
const { Worker, isMainThread, workerData } = require('worker_threads'); // isMainThread is true when this code runs on the main thread if (isMainThread) { const worker = new Worker(__filename, { // run THIS file as worker too workerData: { n: 42 } // data passed to the worker }); worker.on('message', (result) => { console.log('Fibonacci result:', result); // 267914296 }); worker.on('error', (err) => { console.error('Worker error:', err); }); worker.on('exit', (code) => { if (code !== 0) console.error('Worker stopped with exit code', code); }); console.log('Main thread continues — not blocked!'); } else { // This branch runs inside the worker thread const { parentPort, workerData } = require('worker_threads'); function fib(n) { return n <= 1 ? n : fib(n - 1) + fib(n - 2); // expensive! } const result = fib(workerData.n); // workerData = { n: 42 } parentPort.postMessage(result); // send result to main thread }
// Main thread sends progress requests, worker reports back const worker = new Worker('./worker.js'); worker.on('message', ({ type, data }) => { if (type === 'progress') console.log('Progress:', data + '%'); if (type === 'result') console.log('Done:', data); }); worker.postMessage({ cmd: 'start', payload: 'large dataset' }); // worker.js parentPort.on('message', ({ cmd, payload }) => { if (cmd === 'start') { for (let i = 0; i <= 100; i += 10) { doChunk(payload, i); parentPort.postMessage({ type: 'progress', data: i }); } parentPort.postMessage({ type: 'result', data: 'processed' }); } });
When you call postMessage(data), Node does not give the receiving thread a reference to the same object. It creates a full deep copy using the Structured Clone Algorithm. Both threads then hold their own independent copy. A change on one side does not affect the other.
Sending data via postMessage is like photocopying a document and handing the copy to a colleague. You both have the information, but you now have two separate documents. Writing on your copy doesn't change theirs. This is safe — but copying a 100 MB Buffer takes time and memory.
// ✅ Structured Clone handles all of these: worker.postMessage('hello'); // string worker.postMessage({ a: 1, b: [2, 3] }); // plain objects/arrays worker.postMessage(Buffer.from('data')); // Buffer (copied) worker.postMessage(new Map([['key', 'val']])); // Map, Set, Date, RegExp worker.postMessage(new Error('oops')); // Error objects // ❌ Structured Clone CANNOT handle these: worker.postMessage(() => {}); // Functions — throws DataCloneError worker.postMessage(Promise.resolve()); // Promises — throws DataCloneError worker.postMessage(myClassInstance); // Custom class methods are lost
For large binary data, copying is expensive. Transferable objects (like ArrayBuffer) can be transferred instead of copied. The original thread loses access to the data — ownership is moved to the receiving thread. This is instantaneous, regardless of size.
// ❌ Copy — 100 MB Buffer is duplicated (slow, doubles memory usage) const buf = Buffer.allocUnsafe(100 * 1024 * 1024); worker.postMessage({ buf }); // copies all 100 MB // ✅ Transfer — zero-copy, instantaneous, buf is now empty here const ab = new ArrayBuffer(100 * 1024 * 1024); worker.postMessage({ ab }, [ab]); // second arg = transferList console.log(ab.byteLength); // 0 — ownership transferred, can't use here // Transferable types: ArrayBuffer, MessagePort, ImageBitmap, OffscreenCanvas
"By default, postMessage uses the Structured Clone Algorithm to deep-copy the data — both threads have independent copies. For large binary data, you can use Transferable objects: ownership of the ArrayBuffer is moved to the receiving thread instantly with zero copying. The sending thread can no longer access the data."
What is SharedArrayBuffer? How does it differ from a regular ArrayBuffer sent via postMessage?

A regular ArrayBuffer has exactly one owner at a time. When you postMessage it (with transfer), ownership moves to the receiving thread — the sender loses it. SharedArrayBuffer has no single owner: multiple threads can read and write to it simultaneously, with no copying at all.
A regular ArrayBuffer transferred via postMessage is like handing a document to a colleague — they now own it, you don't.
A SharedArrayBuffer is like a whiteboard in a shared office. Everyone can see it and write on it at any time. This is extremely fast — but dangerous. If two people write different things at the same moment, the result is garbled. That's why you need Atomics.
// Main thread creates the shared buffer const sharedBuffer = new SharedArrayBuffer(4); // 4 bytes = one Int32 const view = new Int32Array(sharedBuffer); const worker = new Worker('./worker.js', { workerData: { sharedBuffer } // buffer is SHARED, not copied }); view[0] = 0; // initial value worker.on('message', () => { console.log('Value written by worker:', view[0]); // sees worker's write! }); // worker.js const { workerData, parentPort } = require('worker_threads'); const view = new Int32Array(workerData.sharedBuffer); view[0] = 42; // written directly to shared memory parentPort.postMessage('done'); // notify main thread
| Method | Mechanism | Speed | Safe? |
|---|---|---|---|
| postMessage (copy) | Deep clone via Structured Clone | Slow for large data | Yes — independent copies |
| Transfer ArrayBuffer | Move ownership, zero-copy | Instant | Yes — one owner at a time |
| SharedArrayBuffer | Both threads access same memory | Fastest (no copying ever) | Only with Atomics |
In browser environments, SharedArrayBuffer requires special HTTP response headers (Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy) due to Spectre vulnerability mitigations. In Node.js (no browser sandbox), it is available without restrictions.
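For completeness, if you ever serve a browser page that uses SharedArrayBuffer from a Node server, these are the two headers browsers require (standard values; not needed for worker_threads):

// Inside any Node request handler serving the page:
res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');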
What are Atomics? Why do you need them when using SharedArrayBuffer?

When two threads share memory and both try to write at the same time, you get a race condition. The result depends on which thread gets there first — and that is unpredictable. Even a simple increment (value++) is actually three operations: read, add, write. Two threads doing this simultaneously can overwrite each other's work.
Thread A: reads value = 0
Thread B: reads value = 0 ← both read before either writes
Thread A: adds 1 → result = 1
Thread B: adds 1 → result = 1
Thread A: writes 1
Thread B: writes 1 ← overwrites Thread A's write!
Final value: 1 (expected: 2) ← one increment was lost!
Atomics provides operations that the CPU executes as a single, uninterruptible unit. No other thread can read or write the same memory location while an atomic operation is in progress. The read-add-write is guaranteed to complete as one step.
const sab = new SharedArrayBuffer(4); const view = new Int32Array(sab); // ❌ Not safe — two threads can race view[0]++; // ✅ Safe — atomic add: read+add+write as one uninterruptible step Atomics.add(view, 0, 1); // add 1 to index 0 Atomics.sub(view, 0, 1); // subtract 1 Atomics.store(view, 0, 99); // write 99 Atomics.load(view, 0); // read safely Atomics.compareExchange(view, 0, 99, 100); // if 99, swap to 100 // Atomics.wait — blocks the thread until value changes (like a mutex) Atomics.wait(view, 0, 0); // sleep until view[0] != 0 Atomics.notify(view, 0, 1); // wake one waiting thread
"SharedArrayBuffer gives multiple threads direct access to the same memory — no copying. But operations like value++ are three steps (read, add, write) that can interleave between threads, causing race conditions. Atomics provides operations that execute as a single indivisible CPU instruction, making shared-memory manipulation safe."
Creating a Worker Thread is expensive — Node must spin up a new V8 engine, load and parse your script, and initialise the Node environment. Doing this for every HTTP request adds ~100–200 ms overhead per request and wastes memory. The solution is a pool: create a fixed number of workers upfront, reuse them for every task, and queue tasks when all workers are busy.
Instead of hiring a new specialist for every task (expensive, slow), you hire a team of 4 specialists who are always available. When a task arrives, you hand it to any available specialist. If all 4 are busy, you queue the task and hand it to the first one who finishes. This is a thread pool.
const { Worker } = require('worker_threads'); const os = require('os'); class WorkerPool { constructor(workerFile, size = os.cpus().length) { this.queue = []; // pending tasks this.workers = []; // all worker instances this.free = []; // idle workers for (let i = 0; i < size; i++) { const worker = new Worker(workerFile); this.workers.push(worker); this.free.push(worker); } } run(data) { return new Promise((resolve, reject) => { const task = { data, resolve, reject }; if (this.free.length) { this._dispatch(this.free.pop(), task); } else { this.queue.push(task); // all busy — wait in queue } }); } _dispatch(worker, { data, resolve, reject }) { worker.once('message', (result) => { resolve(result); if (this.queue.length) { this._dispatch(worker, this.queue.shift()); // next task } else { this.free.push(worker); // return to idle pool } }); worker.once('error', reject); worker.postMessage(data); } } // Usage const pool = new WorkerPool('./hash-worker.js'); // one worker per core const result = await pool.run({ input: 'data to hash' });
piscina npm package is the de-facto production worker pool. It handles error recovery, worker crashes, task timeouts, and concurrency limits. Use it instead of rolling your own unless you have a specific reason.
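A minimal sketch of what using piscina looks like — the worker file name and the expensiveHash function are hypothetical:

const path = require('path');
const Piscina = require('piscina');

const pool = new Piscina({
  filename: path.resolve(__dirname, 'hash-worker.js'), // worker module
  maxThreads: 4
});

// Inside an async function:
const result = await pool.run({ input: 'data to hash' });

// hash-worker.js — a piscina worker just exports the function to run per task:
// module.exports = ({ input }) => expensiveHash(input); // expensiveHash: hypothetical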
An unhandled exception inside a Worker Thread does not crash the main process — it emits an 'error' event on the Worker object and then an 'exit' event with a non-zero code. If you do not listen for 'error', Node.js throws an uncaught exception on the main thread (which could crash it). Always attach an error handler.
const worker = new Worker('./worker.js', { workerData: { n: 10 } }); worker.on('message', (result) => { console.log('Result:', result); }); worker.on('error', (err) => { // Worker threw an unhandled exception console.error('Worker error:', err.message); // Spawn a replacement worker if needed }); worker.on('exit', (code) => { if (code !== 0) { console.error('Worker exited with code', code); // code 0 = clean exit, non-zero = crash or process.exit(N) } }); // Inside the worker — propagate errors explicitly: parentPort.on('message', async (data) => { try { const result = await doWork(data); parentPort.postMessage({ ok: true, result }); } catch (err) { // Don't throw — postMessage the error so main thread can handle it parentPort.postMessage({ ok: false, error: err.message }); } });
// Terminate a worker that is taking too long (timeout pattern) const timeout = setTimeout(() => { worker.terminate(); // forcefully ends the worker thread reject(new Error('Worker timed out')); }, 5000); worker.once('message', (result) => { clearTimeout(timeout); resolve(result); });
Node.js actually uses threads internally already — but they are hidden from JavaScript. libuv maintains a thread pool (default: 4 threads) for certain async operations like DNS lookups, fs.readFile, and crypto. These threads are invisible to your JavaScript — you never interact with them directly. Worker Threads are something entirely different and user-controlled.
| | libuv Thread Pool | worker_threads |
|---|---|---|
| Purpose | Offload I/O and blocking system calls for Node internals | Run JavaScript code in parallel |
| Controlled by | Node.js / libuv (automatic) | Your code |
| Runs | C/C++ code (file system, crypto) | JavaScript (your own scripts) |
| Size | 4 by default (UV_THREADPOOL_SIZE) | As many as you create |
| JavaScript accessible | No — transparent | Yes — full JS environment |
# Increase before starting Node — must be set before any I/O operations UV_THREADPOOL_SIZE=16 node server.js // Or in code (must be set VERY early, before any I/O): process.env.UV_THREADPOOL_SIZE = '16'; // Useful when: many concurrent crypto operations, many simultaneous // fs calls, or many DNS lookups — these all compete for the 4 libuv threads. // Unlike Worker Threads, increasing this does NOT help pure JS computation.
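You can watch the pool limit in action with a small experiment (timings vary by machine): fire eight slow async crypto calls at once and they complete in batches of four.

const crypto = require('crypto');

const start = Date.now();
for (let i = 1; i <= 8; i++) {
  // Async pbkdf2 runs on the libuv thread pool, not the event loop
  crypto.pbkdf2('password', 'salt', 1_000_000, 64, 'sha512', () => {
    console.log(`#${i} finished after ${Date.now() - start} ms`);
  });
}
// With the default UV_THREADPOOL_SIZE=4, results arrive in two waves:
// four at roughly T ms, then the remaining four at roughly 2T ms.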
"libuv's thread pool and worker_threads are completely separate. libuv's pool (default 4 threads) is used internally by Node for blocking C-level operations like file I/O and crypto — your JS never sees these. worker_threads is your API for running JavaScript in parallel across cores. They do not share the same pool."
Worker Threads are not free. Spawning a new Worker has real overhead: starting a V8 engine, loading Node.js internals, parsing your script. For tasks that take less than the startup cost, adding a worker makes things slower.
Good candidates — tasks that take >10 ms of pure CPU time: image/video processing, ML inference, cryptographic key generation, heavy data parsing (large JSON/CSV), mathematical simulations, compression.
Bad candidates — any I/O-bound work (file reads, DB queries, HTTP calls); short calculations (<1 ms); tasks where serialization cost exceeds computation time; per-request workers without a pool.
| Cost | Rough number | Implication |
|---|---|---|
| Worker startup | ~50–150 ms | Always pool workers — never spawn per-request |
| Worker memory | ~10–30 MB per thread | Don't create more workers than CPU cores |
| postMessage serialization | ~1 ms per MB | Use Transferable for large data, SharedArrayBuffer for hot paths |
| Context switch | ~1–10 µs | Negligible unless switching thousands of times per second |
1. Is this work CPU-bound (not waiting on I/O)? → if no: async/await is enough
2. Does it take > a few milliseconds? → if no: overhead > benefit
3. Does it happen frequently enough to warrant a pool?
4. Is the data small enough that serialization cost is acceptable?
(or can I use SharedArrayBuffer / Transferable?)
5. Can I structure the task as a standalone script?
(workers can't share closures, class instances, or open DB connections)
"Worker Threads are the right tool for long-running CPU-bound JavaScript. The main pitfalls are: spawning a new worker per request (use a pool), sending large data via postMessage without Transferables (copies are slow), using workers for I/O (async handles that better), and not accounting for the ~10–30 MB memory overhead per thread. More workers than CPU cores just adds context-switching overhead without extra parallelism."
HTTP is the language your browser and server use to talk to each other. But under the hood it is just text sent over a TCP connection. Understanding this layered model is the key to understanding everything in this segment.
TCP is the postal system — it guarantees the letter arrives, in order, without damage.
TLS is sealing the letter in a tamper-evident envelope so only the recipient can read it.
HTTP is the agreed format of the letter: "Dear server, please send me /index.html. Signed, Browser."
How does http.createServer work internally? Walk through the full lifecycle from TCP connection to HTTP response.

1. A TCP connection is accepted and Node's HTTP parser reads bytes as they arrive, parsing the request line (GET /path HTTP/1.1) and headers. This is streaming — it does not wait for the body.
2. Once headers are complete, Node creates an IncomingMessage (the request object) and a ServerResponse (the response object) and calls your requestListener(req, res).
3. Body chunks arrive via the 'data' event on req. You must consume it — it is a Readable stream.
4. You reply with res.writeHead() for status + headers, then res.write()/res.end() for the body. Calling res.end() signals that the response is complete.
5. With keep-alive, the socket is reused for the next request; on Connection: close, the socket is destroyed.

── REQUEST (browser → server) ──────────────────────────────────
GET /users/42 HTTP/1.1\r\n
Host: api.example.com\r\n
Accept: application/json\r\n
Connection: keep-alive\r\n
\r\n ← blank line = end of headers
(no body for GET)
── RESPONSE (server → browser) ─────────────────────────────────
HTTP/1.1 200 OK\r\n
Content-Type: application/json\r\n
Content-Length: 27\r\n
\r\n
{"id":42,"name":"Alice"} ← body
const http = require('http'); const server = http.createServer((req, res) => { // req = IncomingMessage (Readable stream) // res = ServerResponse (Writable stream) console.log(req.method, req.url); // "GET /users/42" console.log(req.headers['accept']); // "application/json" console.log(req.httpVersion); // "1.1" // Route matching (what Express does under the hood) if (req.method === 'GET' && req.url === '/health') { res.writeHead(200, { 'Content-Type': 'application/json' }); res.end(JSON.stringify({ status: 'ok' })); return; } // Default 404 res.writeHead(404, { 'Content-Type': 'text/plain' }); res.end('Not Found'); }); server.listen(3000, () => console.log('Server on :3000')); // server.listen() calls the OS bind() + listen() syscalls // Node then polls for new connections via libuv
Express is just a function with the requestListener(req, res) signature. It wraps the bare req/res objects with convenience methods (req.params, res.json()), runs middleware in sequence, and provides a routing table. Under the hood it still calls http.createServer(app).
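To make that concrete — a short sketch showing that an Express app is just a requestListener:

const express = require('express');
const http = require('http');

const app = express(); // app is literally a (req, res) function
app.get('/health', (req, res) => res.json({ status: 'ok' }));

// These two are equivalent ways to start the server:
http.createServer(app).listen(3000);
// app.listen(3000); — internally calls http.createServer(this).listen(...)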
Unlike headers (which are fully parsed before your callback fires), the request body arrives in chunks after the headers. You must listen to the 'data' event, collect the chunks, and reassemble them on 'end'. Forgetting to consume the body can cause the connection to stall or memory to grow.
const http = require('http');
const { URL } = require('url');

http.createServer(async (req, res) => {
  // ── 1. Parse URL and query parameters ───────────────────────────
  const url = new URL(req.url, 'http://localhost');
  const path = url.pathname;                 // "/users/42"
  const page = url.searchParams.get('page'); // "?page=2" → "2"

  // ── 2. Route parameter extraction (manual regex) ─────────────────
  const match = path.match(/^\/users\/(\d+)$/);
  const userId = match?.[1];                 // "42" (string — remember to parseInt)

  // ── 3. Read headers ──────────────────────────────────────────────
  const contentType = req.headers['content-type'];  // always lowercase
  const auth = req.headers['authorization'];

  // ── 4. Parse the request BODY (streaming) ────────────────────────
  const parseBody = (req) => new Promise((resolve, reject) => {
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => resolve(Buffer.concat(chunks).toString()));
    req.on('error', reject);
  });

  if (req.method === 'POST') {
    const rawBody = await parseBody(req);
    if (contentType?.includes('application/json')) {
      const body = JSON.parse(rawBody);            // now you have your object
    } else if (contentType?.includes('application/x-www-form-urlencoded')) {
      const params = new URLSearchParams(rawBody); // form fields
    }
  }
  res.end('ok');
}).listen(3000);
const parseBody = (req, maxBytes = 1 * 1024 * 1024) => new Promise((resolve, reject) => { const chunks = []; let total = 0; req.on('data', (chunk) => { total += chunk.length; if (total > maxBytes) { req.destroy(); // abort the connection immediately reject(new Error('Request body too large')); return; } chunks.push(chunk); }); req.on('end', () => resolve(Buffer.concat(chunks).toString())); req.on('error', reject); });
Node offers several ways to make outgoing HTTP requests: the built-in http.request / http.get, the native fetch, and third-party clients.

| Method | Available since | Pros | Cons |
|---|---|---|---|
| http.request / http.get | Node.js v0.1 | No dependencies, full control | Very verbose, no JSON helpers, manual stream handling |
| Native fetch | Node.js v18 (stable v21) | No deps, browser-compatible API, Promise-based | No built-in request timeout (needs AbortController) |
| axios | npm package | Interceptors, auto JSON, nice API, wide adoption | Extra dependency |
| got / node-fetch | npm package | Retry logic, hooks, streams support | Extra dependency |
const https = require('https'); const req = https.request({ hostname: 'api.github.com', path: '/users/torvalds', method: 'GET', headers: { 'User-Agent': 'Node.js' } }, (res) => { // res is a Readable stream of the response body console.log('Status:', res.statusCode); const chunks = []; res.on('data', chunk => chunks.push(chunk)); res.on('end', () => { const body = JSON.parse(Buffer.concat(chunks).toString()); console.log(body.name); // "Linus Torvalds" }); }); req.on('error', err => console.error(err)); req.end(); // must call end() to signal no body (even for GET)
// Native fetch — available globally in Node 18+ const getUser = async (id) => { const controller = new AbortController(); const timeout = setTimeout(() => controller.abort(), 5000); // 5s timeout try { const res = await fetch(`https://api.example.com/users/${id}`, { signal: controller.signal, headers: { 'Authorization': `Bearer ${process.env.TOKEN}` } }); if (!res.ok) throw new Error(`HTTP ${res.status}`); return await res.json(); } finally { clearTimeout(timeout); } };
How does http.Agent manage connection pooling, and why does it matter?

Every new TCP connection requires a 3-way handshake before any data can flow. For HTTPS, add the TLS handshake on top. Together these cost 1–3 network round trips before your actual request even starts. If you make 100 requests to the same server and create a new connection for each one, you pay this overhead 100 times.
Opening a new TCP connection per request is like taking a taxi every time you need to go somewhere — you pay the "starting fee" every time. Keep-alive is like having a regular driver: the connection stays open between trips, so you only pay the startup cost once and then just travel.
Without keep-alive (HTTP/1.0 default):
Request 1: SYN → SYN-ACK → ACK → GET /a → 200 → FIN (connection closed)
Request 2: SYN → SYN-ACK → ACK → GET /b → 200 → FIN (new connection!)
Request 3: SYN → SYN-ACK → ACK → GET /c → 200 → FIN (new connection!)
With keep-alive (HTTP/1.1 default):
SYN → SYN-ACK → ACK
→ GET /a → 200
→ GET /b → 200 ← reusing same socket, no new handshake
→ GET /c → 200
→ FIN (one close for all three)
When you make outgoing requests with http.request(), Node uses a global http.globalAgent behind the scenes. The Agent manages a pool of open TCP sockets per host:port, reusing them for multiple requests. Key settings:
const https = require('https');

const agent = new https.Agent({
  keepAlive: true,      // reuse sockets (default: false in Node <19)
  maxSockets: 50,       // max concurrent connections per host:port
  maxFreeSockets: 10,   // max idle sockets to keep in pool
  timeout: 60000        // socket timeout in ms
});

// Pass the agent to each request:
https.get('https://api.example.com/data', { agent }, (res) => { /* ... */ });

// Or enable keep-alive globally for all https requests:
https.globalAgent.keepAlive = true;

// ⚠ agent: false means NO pooling — new socket per request.
// Use this only for one-off requests or when testing.
https.get('https://api.example.com/one-off', { agent: false }, (res) => { /* ... */ });

// Note: the native fetch does NOT use http.Agent — it is built on undici,
// which manages its own connection pool.
maxSockets is Infinity in Node.js. Under load this means your server can open thousands of outbound connections simultaneously — exhausting OS file descriptor limits or overwhelming the target service. Always set a sensible maxSockets in production.
Encryption: all data between client and server is encrypted using symmetric keys derived during the handshake. An eavesdropper on the network sees only garbled bytes.
Authentication: the server's certificate proves it is who it claims to be. The client verifies the certificate was signed by a trusted Certificate Authority (CA). This prevents man-in-the-middle attacks.
const https = require('https'); const fs = require('fs'); const options = { key: fs.readFileSync('server.key'), // private key (keep secret!) cert: fs.readFileSync('server.cert'), // public certificate (share freely) // Modern TLS hardening: minVersion: 'TLSv1.2', // reject TLS 1.0 / 1.1 (deprecated, insecure) ciphers: 'TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256' // TLS 1.3 only }; https.createServer(options, (req, res) => { res.end('Secure!'); }).listen(443); // In production: use nginx/Caddy as TLS terminator in front of Node. // Node handles HTTP on port 3000; nginx handles TLS on port 443 and proxies.
// Server requires a client certificate (used for service-to-service auth) https.createServer({ key: fs.readFileSync('server.key'), cert: fs.readFileSync('server.cert'), ca: fs.readFileSync('ca.cert'), // trusted CA for clients requestCert: true, // ask for client cert rejectUnauthorized: true // reject if cert invalid }, (req, res) => { const cert = req.socket.getPeerCertificate(); console.log('Client CN:', cert.subject.CN); res.end('Hello trusted client'); }).listen(443);
What problems does HTTP/2 solve, and how do you use the http2 module in Node.js?

Head-of-line blocking: HTTP/1.1 is serial — on one connection, you send request 1, wait for response 1, then send request 2. If response 1 is slow (large file, slow DB), everything queues behind it. Workaround: open 6–8 parallel TCP connections per host — wasteful.
Header bloat: HTTP headers are sent as plain text with every single request — even identical Cookie, User-Agent, and Accept headers. A typical header block is 500–2000 bytes, repeated thousands of times per session.
HTTP/2 fixes both: multiplexing runs many requests concurrently on one connection, and HPACK header compression means that even if you send the same Authorization header 100 times, subsequent sends cost almost nothing — the receiver already has it in a shared table.

HTTP/1.1 (one connection, serial):
──[GET /api/users]──[response]──[GET /api/posts]──[response]──
HTTP/2 (one connection, multiplexed streams):
Stream 1: ──[GET /api/users]─────────[response]──
Stream 2: ──[GET /api/posts]──[response]──
Stream 3: ──[GET /static/app.js]─────[response]──
All running in parallel on the same TCP connection
const http2 = require('http2'); const fs = require('fs'); // HTTP/2 requires TLS in browsers (h2c = cleartext only for testing) const server = http2.createSecureServer({ key: fs.readFileSync('server.key'), cert: fs.readFileSync('server.cert') }); server.on('stream', (stream, headers) => { // Each request is a "stream" with an ID (1, 3, 5, 7...) const path = headers[':path']; // pseudo-header (HTTP/2 specific) const method = headers[':method']; if (path === '/') { // Server push — send CSS before browser requests it stream.pushStream({ ':path': '/style.css' }, (err, push) => { if (!err) { push.respond({ ':status': 200, 'content-type': 'text/css' }); push.end('body { color: red }'); } }); } stream.respond({ ':status': 200, 'content-type': 'text/html' }); stream.end('<html>Hello HTTP/2!</html>'); }); server.listen(443);
"HTTP/1.1 suffers from head-of-line blocking — one slow response blocks the connection — and header bloat. HTTP/2 solves both: multiplexing lets multiple requests share one TCP connection concurrently, and HPACK compresses repeated headers. In Node.js, the http2 module exposes a 'stream' event instead of a request event, since each H2 stream maps to one request/response pair."
HTTP was designed for a simple pattern: client asks, server answers, connection closes (or is reused for the next ask). The server can never spontaneously send the client data — it can only respond. This makes real-time features (chat, live scores, collaborative editing) awkward to implement over plain HTTP.
Long-polling: Client sends request. Server holds it open until data is ready, then responds. Client immediately sends a new request. Like repeatedly asking "are we there yet?" — works, but wasteful.
Server-Sent Events (SSE): Client makes one HTTP request. Server keeps the response open and streams data one-way (server → client only). Like a radio broadcast — great for notifications, useless for chat. (A minimal SSE sketch follows this list.)
WebSocket: Client upgrades the HTTP connection to a persistent full-duplex channel. Both sides can send messages any time, with no request-response overhead. Like a phone call.
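WebSocket gets a full code example later in this segment; SSE needs nothing beyond core http, so here is a minimal sketch of an SSE endpoint (URL and payload are illustrative):

const http = require('http');

http.createServer((req, res) => {
  if (req.url === '/events') {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream', // the SSE content type
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    });
    // One long-lived response; each "data: ...\n\n" block is one event
    const timer = setInterval(() => {
      res.write(`data: ${JSON.stringify({ ts: Date.now() })}\n\n`);
    }, 1000);
    req.on('close', () => clearInterval(timer)); // client went away
    return;
  }
  res.end('ok');
}).listen(3000);

// Browser side: new EventSource('/events').onmessage = e => console.log(e.data);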
1. The client sends an HTTP GET with Upgrade: websocket and a randomly generated Sec-WebSocket-Key.
2. The server responds with 101 Switching Protocols. It computes a Sec-WebSocket-Accept value (SHA-1 of the key + a magic string) to prove it understands WebSocket — verified in the snippet after the trace below.

── CLIENT REQUEST ──────────────────────────────────────────────
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
── SERVER RESPONSE (101 = switching protocols) ─────────────────
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
── NOW SPEAKING WEBSOCKET FRAMES (not HTTP anymore) ────────────
[frame] [frame] [frame] ... (bidirectional, any time)
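The accept value is easy to verify yourself — SHA-1 of the client key concatenated with the fixed GUID from RFC 6455, base64-encoded (the key below is the RFC's own example):

const crypto = require('crypto');

const GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'; // fixed by the spec
const key = 'dGhlIHNhbXBsZSBub25jZQ==';               // Sec-WebSocket-Key from the client

const accept = crypto.createHash('sha1').update(key + GUID).digest('base64');
console.log(accept); // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=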
| | Long-polling | SSE | WebSocket |
|---|---|---|---|
| Direction | Server → Client (one shot) | Server → Client (stream) | Full-duplex (both ways) |
| Protocol | HTTP | HTTP (text/event-stream) | WebSocket (ws://) |
| Overhead per message | Full HTTP headers each time | Tiny (after initial request) | 2–10 byte frame header |
| Browser support | All | All (except IE) | All modern browsers |
| Use case | Legacy fallback | Notifications, live feeds | Chat, gaming, collaboration |
Node's core has no built-in WebSocket server — the de-facto standard is the ws library approach.

const { WebSocketServer } = require('ws');
const http = require('http');

const httpServer = http.createServer((req, res) => {
  res.end('HTTP server running'); // handle regular HTTP requests
});

// Attach WebSocket server to the same HTTP server
const wss = new WebSocketServer({ server: httpServer });

wss.on('connection', (ws, req) => {
  const ip = req.socket.remoteAddress;
  console.log('New connection from', ip);

  ws.on('message', (data, isBinary) => {
    const msg = isBinary ? data : data.toString();
    console.log('Received:', msg);
    // Broadcast to all connected clients
    wss.clients.forEach(client => {
      if (client.readyState === client.OPEN) {
        client.send(msg);
      }
    });
  });

  ws.on('close', (code, reason) => {
    console.log('Disconnected:', code, reason.toString());
  });
  ws.on('error', err => console.error('WS error:', err));

  ws.send('Welcome!'); // server sends first
});

httpServer.listen(3000);
TCP does not detect dead connections immediately. A client can disappear (laptop lid closed, network cut) and the server won't know for minutes. A ping/pong heartbeat actively checks that the client is still alive.
function heartbeat() { this.isAlive = true; } wss.on('connection', (ws) => { ws.isAlive = true; ws.on('pong', heartbeat); // client responds to our ping }); // Every 30 seconds, ping all clients. Kill ones that didn't pong. setInterval(() => { wss.clients.forEach(ws => { if (ws.isAlive === false) { ws.terminate(); return; } ws.isAlive = false; ws.ping(); // ws library sends WS ping frame }); }, 30_000);
What is the net module? How do you build a raw TCP server, and when would you use it instead of HTTP?

The net module is Node's interface to raw TCP sockets. http is built on top of net — an HTTP server is just a TCP server that understands the HTTP text format. Using net directly gives you a raw byte stream with no protocol overhead. You design the protocol yourself.
Use net when: you are implementing a custom binary protocol — building a database driver, game server, message broker (like Redis), proxy, or IoT gateway. Any case where HTTP's text overhead is too much.
Use http when: you are building a REST API, a web server, a webhook receiver, or anything that talks to browsers or standard HTTP clients. HTTP gives you routing, headers, status codes, and caching for free.
const net = require('net'); // ── TCP Server ─────────────────────────────────────────────────── const server = net.createServer((socket) => { // socket is a Duplex stream — you can read and write bytes console.log('Client connected:', socket.remoteAddress); socket.on('data', (data) => { console.log('Received:', data.toString()); // raw bytes socket.write('Echo: ' + data); // write back }); socket.on('end', () => console.log('Client disconnected')); socket.on('error', (err) => console.error('Socket error:', err)); }); server.listen(8080, () => console.log('TCP server on :8080')); // ── TCP Client ─────────────────────────────────────────────────── const client = net.connect({ port: 8080 }, () => { client.write('Hello server!'); }); client.on('data', data => console.log(data.toString())); // "Echo: Hello server!"
// Server: listen on a file path instead of a port net.createServer(handler).listen('/tmp/myapp.sock'); // Client: connect via file path net.connect('/tmp/myapp.sock'); // Used by: nginx ↔ Node.js, PM2 process manager, Redis on localhost // ~30% faster than loopback TCP (127.0.0.1) for local IPC
How do you implement graceful shutdown, and what do ECONNRESET / ECONNREFUSED mean?

When you deploy a new version, your process manager (PM2, Kubernetes) sends SIGTERM. Calling process.exit() immediately kills all in-flight requests mid-response — users see broken pages. Graceful shutdown stops accepting new connections but lets existing ones finish.
1. Listen for SIGTERM. Stop accepting new connections: call server.close().
2. Wait for in-flight requests to finish — server.close(callback) fires when the last connection closes.
3. Close external resources (DB pool, Redis), then call process.exit(0).
4. Add a safety net: force exit if shutdown hangs, e.g. setTimeout(() => process.exit(1), 30000).

const server = http.createServer(app).listen(3000);

async function shutdown(signal) {
  console.log(`${signal} received — graceful shutdown`);
  await new Promise(resolve => server.close(resolve)); // stop new connections
  await db.close();         // close DB pool
  await redisClient.quit(); // close Redis connection
  console.log('Shutdown complete');
  process.exit(0);
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT')); // Ctrl+C

// Safety net — force exit if graceful shutdown hangs
setTimeout(() => {
  console.error('Forced exit');
  process.exit(1);
}, 30_000).unref();
const server = http.createServer(handler); // 1. Keep-alive timeout — how long to hold an idle persistent connection server.keepAliveTimeout = 65_000; // must be > load balancer's idle timeout // 2. Headers timeout — how long to wait for request headers to arrive server.headersTimeout = 60_000; // 3. Request timeout — total time allowed for the full request cycle server.requestTimeout = 300_000; // Node 14+ // 4. Socket timeout — per-socket idle timeout (no data for N ms) server.setTimeout(120_000, (socket) => { socket.destroy(); // socket went idle — destroy it });
| Error code | Meaning | Common cause | Fix |
|---|---|---|---|
| ECONNREFUSED | Nothing listening on that port | Server not started, wrong port, DB down | Check the target service is running |
| ECONNRESET | Remote end forcibly closed the connection | Load balancer timeout, server crash, firewall | Retry with backoff; check keepAliveTimeout |
| ETIMEDOUT | No response within timeout | Slow network, overloaded server, firewall drop | Set a request timeout; add circuit breaker |
| ENOTFOUND | DNS lookup failed | Wrong hostname, DNS misconfiguration | Verify hostname; check /etc/resolv.conf |
| EMFILE | Too many open file descriptors | Connection pool not limiting sockets | Set maxSockets on Agent; increase OS fd limit |
"Graceful shutdown means stopping new connections via server.close(), waiting for in-flight requests to finish, closing DB/Redis connections, then calling process.exit(0) — with a hard timeout fallback. ECONNRESET means the remote side forcibly closed the socket — usually a load balancer idle timeout. Fix it by setting server.keepAliveTimeout slightly above the load balancer's value. ECONNREFUSED means nothing is listening on that port."
Never optimise what you have not measured. Most developers instinctively "feel" what is slow and optimise the wrong thing. Studies consistently show that programmers guess the bottleneck correctly less than 20% of the time. This segment teaches you to measure first, then fix with precision.
Latency = how long one request takes (milliseconds). Throughput = how many requests per second the system handles. Optimising one can hurt the other. A batch that waits to group 100 requests improves throughput but increases individual latency.
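Both numbers come out of a single load-test run — here with autocannon, which this segment uses later (the figures below are illustrative, not a benchmark):

npx autocannon -c 100 -d 10 http://localhost:3000
# Latency  p50: 12 ms   p99: 87 ms   ← how long one request takes
# Req/Sec  avg: ~8000                ← how many the system handles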
1. Code — algorithms, V8 optimisations, avoid GC pressure.
2. Process — event loop health, memory usage, concurrency limits.
3. Architecture — caching, clustering, load balancing, CDN.
You wouldn't operate on a patient without first diagnosing the problem. Profiling is the diagnosis. A flame graph is the X-ray. Only after you see exactly where time is being spent do you start fixing things. Guessing and optimising blindly is like operating on the wrong organ.
How do you profile a Node.js application? The main tools are --inspect, Chrome DevTools, and clinic.js.

| Tool | Best for | How to start |
|---|---|---|
| --inspect + Chrome DevTools | Deep CPU + memory analysis on any Node app | node --inspect server.js → open chrome://inspect |
| clinic.js (clinic doctor) | Automated diagnosis — tells you what kind of problem you have | npx clinic doctor -- node server.js |
| 0x (flame graphs) | Identifying hot functions — which code path eats CPU | npx 0x server.js |
1. Start your app with node --inspect server.js. Node opens a WebSocket debugger on port 9229 and prints the inspector URL.
2. In Chrome, open chrome://inspect. Click "inspect" under your Node process. The DevTools window opens connected to your live process.
3. In the Performance/Profiler tab, click Record, generate load (with autocannon or your browser), then click Stop.

clinic.js wraps your app, runs it under load, collects data, and then generates an HTML report with a diagnosis: "CPU bottleneck", "Event loop blocked", "I/O bottleneck", or "Memory leak". It is the fastest way to understand what kind of problem you have before going deeper.
# Install once npm install -g clinic # Run your server under clinic's harness clinic doctor -- node server.js # In another terminal, generate load npx autocannon -c 100 -d 30 http://localhost:3000 # When done, Ctrl+C — clinic opens HTML report automatically # It shows: event loop delay, CPU usage, memory over time # For detailed flame graphs: clinic flame -- node server.js # For async/await timing issues: clinic bubbleprof -- node server.js
node --inspect=0.0.0.0:9229 with an SSH tunnel to profile a live production server without restarting it. Never expose the debug port publicly — it gives full remote code execution access.
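A sketch of that tunnel (hostnames are placeholders; binding to 127.0.0.1 keeps the port off the network entirely):

# On the server — bind the inspector to localhost only
node --inspect=127.0.0.1:9229 server.js
# (or activate the inspector on an already-running process: kill -USR1 <pid>)

# On your laptop — forward local port 9229 through SSH
ssh -L 9229:127.0.0.1:9229 user@prod-host

# Now chrome://inspect on your laptop reaches the remote process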
A profiler samples the call stack thousands of times per second. Each sample is one snapshot of "what functions are currently running, and which functions called them." A flame graph stacks all these samples and merges identical sequences.
┌─────────────────────────────────────────────────────────┐
│ JSON.parse [████████████████] ← WIDE = SLOW │
│ parseBody [████████████████████] │
│ routeHandler [██████████████████████████] │
│ http.Server.emit [████████████████████████████████] │
└─────────────────────────────────────────────────────────┘
X-axis → NOT time order — it is sorted alphabetically
(so adjacent bars are NOT related in time)
Y-axis → call stack depth (bottom = root, top = leaf)
Width → proportion of CPU time spent in that function
(wider = more samples = more CPU consumed)
🔴 Hot path: wide bars near the TOP of the stack
These are the actual bottlenecks — the functions doing real work.
🟡 Wide bars in the MIDDLE: a function that calls slow children
(fix the children, not the parent)
🟣 "Plateau" bars: a function running flat across the graph
= blocking the event loop — nothing else ran during that time
JSON.stringify / JSON.parse → large objects, run in worker or use faster-json
RegExp → catastrophic backtracking — rewrite regex or use re2
crypto.pbkdf2Sync → blocking crypto — use async version or worker
fs.readFileSync → blocking I/O — replace with createReadStream
Array.sort on large arrays → move to worker or optimise comparator
Deep object cloning → avoid lodash.cloneDeep on hot paths
V8 does not interpret JavaScript line by line. It compiles it — smartly, in two phases: Ignition, a bytecode interpreter that starts executing your code immediately, and TurboFan, an optimising compiler that recompiles hot functions to machine code based on the types it has observed.
In a statically-typed language like C++, the compiler knows the exact memory layout of every struct at compile time — accessing a field is a single memory read at a fixed offset. JavaScript is dynamic — objects can have any properties added at any time. V8 solves this with hidden classes.
When students all sit in the same seats every day, the teacher can say "row 2, seat 3" to find Alice instantly. That's a hidden class — a known layout. If students sit randomly every day, the teacher has to search the whole room. That's a dictionary-style property lookup — much slower.
// ✅ GOOD — both objects have same property order → share hidden class C2 // V8 can use fast fixed-offset property access (like C struct) const a = { x: 1, y: 2 }; // hidden class: C0 → C1 (add x) → C2 (add y) const b = { x: 3, y: 4 }; // same shape → shares C2 → fast! // ❌ BAD — different property order → different hidden classes → slower const c = { x: 1, y: 2 }; // hidden class C2 const d = { y: 4, x: 3 }; // different class C4 — two separate layouts! // ❌ BAD — adding properties after creation creates new hidden classes const e = {}; e.x = 1; // C0 → C1 e.y = 2; // C1 → C2 (transition OK if always in this order) // ❌ WORST — deleting properties destroys hidden class sharing delete e.x; // forces dictionary mode — all property access becomes slow
Initialise all object properties in the constructor. Keep property order consistent. Use typed arrays for numeric data. Avoid mixing types in arrays ([1, 'two', 3]). Use factory functions that always produce the same shape.
Adding properties to objects after creation in different orders. Using delete on object properties. Storing different types in the same variable across calls. Arrays with holes ([1,,3]). Megamorphic call sites (calling same function with 4+ different object shapes).
TurboFan compiles hot functions to machine code based on assumptions about the types of values seen so far. If those assumptions are violated, V8 has to discard the native code and revert to the slower Ignition interpreter. This is a deoptimisation. It can cause a sudden spike in latency on an otherwise fast path.
Imagine a factory line optimised to make blue widgets. It runs fast because every machine is tuned for blue widgets. Suddenly you send a red widget through. The line has to stop, reconfigure for red, process it slowly, then decide whether to retool back. That retooling cost is deoptimisation.
// 1. Type change mid-function function add(a, b) { return a + b; } add(1, 2); // V8 optimises assuming a,b are always Numbers add('hi', 'yo'); // ❌ type changed → DEOPT → back to interpreter // 2. try/catch around hot code (historically; mostly fixed in modern V8) function hotPath() { try { /* hot work */ } catch(e) {} // can prevent optimisation } // 3. arguments object (old pattern — use rest params instead) function old() { return arguments[0]; } // ❌ arguments prevents opt function modern(...args) { return args[0]; } // ✅ rest params are fine // 4. for...in on objects with prototype properties (use Object.keys instead) // 5. eval() — prevents all optimisation of the enclosing function // 6. with statement — same issue
# Print every deopt to stdout with reason and source location
node --trace-deopt server.js

# Example output:
# [deoptimize] reason: wrong type at [doWork] /app/server.js:42
# [deoptimize] reason: lost precision at [parseAmount] /app/parser.js:17

# See all V8 optimisation decisions:
node --trace-opt --trace-deopt server.js 2>&1 | grep -E "(optimiz|deoptim)"

# Or use the --allow-natives-syntax flag + %GetOptimizationStatus() in tests
node --allow-natives-syntax -e "
  function add(a,b){ return a+b; }
  add(1,2); add(1,2);
  %OptimizeFunctionOnNextCall(add);
  add(1,2);
  console.log(%GetOptimizationStatus(add)); // bitmask — optimized bit set once TurboFan code is active
"
┌─────────────────────────────────────────────────────────────┐
│ V8 HEAP │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌───────────────────┐ │
│ │ Young Gen │ │ Old Gen │ │ Large Objects │ │
│ │ (Nursery) │ │ (Tenured) │ │ Space │ │
│ │ │ │ │ │ │ │
│ │ New objects │ │ Survived 2+ │ │ Objects > 512 KB │ │
│ │ live here │ │ minor GCs │ │ live here │ │
│ │ ~1–8 MB │ │ ~1.4 GB max │ │ allocated once │ │
│ │ GC: fast │ │ GC: slow │ │ GC: full heap │ │
│ └──────────────┘ └──────────────┘ └───────────────────┘ │
│ │
│ OUTSIDE HEAP (not managed by V8 GC): │
│ Buffer data, ArrayBuffer data — allocated via C++ malloc │
└─────────────────────────────────────────────────────────────┘
Minor GC (Scavenge) — runs very frequently (every few hundred KB of allocation). Copies surviving objects to the other half of the young space. Objects that survive two minor GCs are "promoted" to Old Gen. Very fast (~1 ms). Most objects die young — this is the "generational hypothesis."
Major GC (Mark-Sweep-Compact) — runs infrequently but can take 50–200 ms. Marks all reachable objects from roots (globals, stack). Sweeps unreachable objects. Optionally compacts to reduce fragmentation. This is the GC pause you feel in latency spikes.
// 1. Using the perf_hooks PerformanceObserver const { PerformanceObserver } = require('perf_hooks'); const obs = new PerformanceObserver((list) => { for (const entry of list.getEntries()) { console.log('GC type:', entry.detail.kind, 'duration:', entry.duration.toFixed(2), 'ms'); } }); obs.observe({ type: 'gc' }); // 2. Heap statistics via process.memoryUsage() const mem = process.memoryUsage(); console.log({ heapUsed: (mem.heapUsed / 1024 / 1024).toFixed(1) + ' MB', // JS objects heapTotal: (mem.heapTotal / 1024 / 1024).toFixed(1) + ' MB', // V8 heap capacity external: (mem.external / 1024 / 1024).toFixed(1) + ' MB', // Buffer / C++ memory rss: (mem.rss / 1024 / 1024).toFixed(1) + ' MB' // total process RSS }); // 3. Expose GC logs at runtime node --expose-gc server.js // enables global.gc() to trigger GC manually node --trace-gc server.js // prints every GC event with type + duration
A memory leak happens when your code holds references to objects that are no longer needed, preventing the GC from collecting them. Leaked objects accumulate in Old Gen. Eventually heap usage reaches the V8 limit (~1.5 GB by default on 64-bit builds; configurable with --max-old-space-size) and the process crashes — or GC pauses become so frequent that response times degrade badly.
Memory leaks are insidious: the server starts fast, runs fine for hours, then slows down and eventually dies. The only fix is a restart — until you find and fix the root cause.
```js
// 1. Unbounded caches / collections (most common leak)
const cache = {}; // grows forever — never evicts
app.get('/user/:id', (req, res) => {
  cache[req.params.id] = fetchUser(req.params.id); // every userId ever hit → leak
});
// Fix: use a Map with a max size (LRU cache) or WeakMap for object keys

// 2. Event emitter listener accumulation
function addHandler() {
  emitter.on('data', (d) => process(d)); // adds a new listener on every call!
}
// Fix: use .once() or remove listeners in cleanup; check emitter.listenerCount()

// 3. Closure retaining large objects
function buildHandler(largeData) { // 50 MB object
  return (req, res) => {
    const id = largeData.id; // closure keeps largeData alive in memory
    res.send(id);
  };
}
// Fix: only close over what you need — const id = largeData.id; (then largeData can GC)

// 4. Circular references with external resources
class Connection {
  constructor() {
    this.socket = createSocket();
    this.socket.conn = this; // circular — neither can GC while the socket is open
  }
}
```
To find a leak with heap snapshots:

1. Start the app with `node --inspect server.js` and open Chrome DevTools → Memory tab.
2. Take a heap snapshot as a baseline.
3. Exercise the suspected leak (replay traffic, hit the endpoint repeatedly).
4. Force a GC (the bin icon in DevTools, or `global.gc()` with `--expose-gc`). Take a second snapshot.
5. Use the Comparison view to see which object types grew between snapshots — the leak is whatever keeps them referenced.

Use `WeakMap` when you need to associate extra data with an object but don't want to prevent it from being GC'd. If the key object is collected, the WeakMap entry disappears automatically. `WeakRef` holds a reference that does not prevent GC — call `deref()` and check for `undefined` before using.
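A minimal sketch of both in action — the names (`sessionData`, `bigObject`) are illustrative:

```js
// WeakMap: metadata keyed by an object, collected together with the key
const sessionData = new WeakMap();
function attach(socket) {
  sessionData.set(socket, { connectedAt: Date.now() });
}
// When `socket` becomes unreachable, its entry vanishes — no manual cleanup.

// WeakRef: hold a reference without keeping the target alive
let bigObject = { payload: 'x'.repeat(1e6) };
const ref = new WeakRef(bigObject);
bigObject = null;          // eligible for GC now
const maybe = ref.deref(); // the object, or undefined if already collected
if (maybe !== undefined) console.log('still alive');
```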
You schedule a callback with setTimeout(fn, 0) — meaning "run this as soon as possible, on the next pass through the timers phase." But the event loop is busy processing a long-running task, so your callback waits. The delay between when the callback was scheduled and when it actually ran is the event loop lag.
The event loop is a single cashier. Each customer is a callback. If one customer has a cart of 500 items (a long synchronous operation), everyone behind them waits the full duration. The waiting time is event loop lag. A 200 ms synchronous operation causes 200 ms of lag for every other request waiting in the queue.
```js
// Simple measurement using a setTimeout baseline
function measureLag() {
  const start = Date.now();
  setTimeout(() => {
    const lag = Date.now() - start; // should be ~0; anything > 50 ms is a problem
    if (lag > 50) console.warn('Event loop lag:', lag, 'ms');
    measureLag(); // schedule next measurement
  }, 0);
}
measureLag();

// Production-grade: use Node's built-in monitorEventLoopDelay (Node 11+)
const { monitorEventLoopDelay } = require('perf_hooks');
const h = monitorEventLoopDelay({ resolution: 20 }); // sample every 20 ms
h.enable();
setInterval(() => {
  console.log({
    mean: (h.mean / 1e6).toFixed(2),           // nanoseconds → ms
    p99:  (h.percentile(99) / 1e6).toFixed(2),
    max:  (h.max / 1e6).toFixed(2)
  });
  h.reset();
}, 5000);
```
| Cause | Symptom | Fix |
|---|---|---|
| Synchronous CPU-bound work | All requests delayed during computation | Move to Worker Thread or child_process |
| Large JSON.parse / JSON.stringify | Periodic lag spikes on large payloads | Use streaming JSON parser (stream-json), or do in worker |
| GC major collection pause | Periodic spikes every few seconds/minutes | Reduce allocation rate; tune heap size; use object pools |
| Synchronous crypto (pbkdf2Sync, scryptSync) | Lag during login/auth requests | Always use async versions: crypto.pbkdf2() |
| Blocking fs calls (readFileSync) | Lag during file access | Replace with async streams or fs.promises |
| Dense in-memory computation in hot path | Constant baseline lag | Profile with flame graph, optimise algorithm, or use worker |
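The first fix in the table — moving CPU-bound work off the main thread — looks roughly like this. A minimal sketch; `heavyCompute` and the file name are placeholders:

```js
// worker.js — runs on its own thread, so the event loop stays free
const { parentPort, workerData } = require('worker_threads');
const result = heavyCompute(workerData); // placeholder for the CPU-bound function
parentPort.postMessage(result);

// main.js — offload and await the result without blocking other requests
const { Worker } = require('worker_threads');
function runInWorker(input) {
  return new Promise((resolve, reject) => {
    const w = new Worker('./worker.js', { workerData: input });
    w.on('message', resolve);
    w.on('error', reject);
    w.on('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}
```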
What is the cluster module? How does it scale across CPU cores and how does it differ from Worker Threads?

A Node.js process runs your JavaScript on a single thread — one process uses at most one CPU core for JS. A 16-core server running one Node process leaves 15 cores idle. The cluster module forks multiple copies of your server process, one per CPU core, all sharing the same port. Incoming connections are distributed across the workers (round-robin by the primary on most platforms).
A single Node process is a bank with one teller. Cluster is the same bank with 16 tellers all serving customers from the same queue. Each teller is an independent copy of your entire application — they don't share memory, but they all answer the same phone number (port).
How it works:

- The primary process calls cluster.fork() once per CPU core. Each fork is a full child process running your server code.
- All workers listen() on the same port; the primary owns the listening socket and hands connections to workers.
- The primary listens for each worker's 'exit' event and forks a replacement automatically.

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');
const numCPUs = os.cpus().length;

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} forking ${numCPUs} workers`);
  for (let i = 0; i < numCPUs; i++) cluster.fork();
  cluster.on('exit', (worker, code) => {
    console.warn(`Worker ${worker.process.pid} died (code ${code}) — respawning`);
    cluster.fork(); // auto-restart crashed workers
  });
} else {
  // Each worker runs its own HTTP server on the same port
  http.createServer((req, res) => {
    res.end(`Worker ${process.pid} handled this`);
  }).listen(3000, () => {
    console.log(`Worker ${process.pid} listening`);
  });
}
```
| cluster | worker_threads | |
|---|---|---|
| Unit of work | Full HTTP server (I/O-bound scaling) | A specific CPU-bound task |
| Memory | Separate process — no shared memory | Same process — can share memory |
| Communication | IPC messages between processes | postMessage (fast) |
| Crash isolation | Full isolation — one crash doesn't kill others | Worker crash emits event on main thread |
| Use case | Maximise HTTP throughput on multi-core servers | Offload one expensive computation |
In practice: Use PM2 (pm2 start app.js -i max) instead of writing cluster code manually. PM2 manages forking, restart on crash, zero-downtime reload, and metrics out of the box.
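A few of the everyday PM2 commands behind that recommendation — a sketch of typical usage:

```bash
pm2 start app.js -i max     # fork one worker per CPU core (cluster mode)
pm2 reload app              # zero-downtime rolling restart of all workers
pm2 logs app                # tail aggregated logs from every worker
pm2 monit                   # live CPU / memory dashboard per process
pm2 startup && pm2 save     # resurrect the process list after a reboot
```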
```js
// An LRU cache built on a bounded Map (Map preserves insertion order)
class LRUCache {
  constructor(maxSize = 500) {
    this.cache = new Map();
    this.maxSize = maxSize;
  }
  get(key) {
    if (!this.cache.has(key)) return undefined;
    const val = this.cache.get(key);
    this.cache.delete(key);
    this.cache.set(key, val); // move to end (most recently used)
    return val;
  }
  set(key, val) {
    if (this.cache.has(key)) this.cache.delete(key);
    else if (this.cache.size >= this.maxSize)
      this.cache.delete(this.cache.keys().next().value); // evict LRU (first entry)
    this.cache.set(key, val);
  }
}
```
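Usage mirrors a plain Map; the route and `fetchUser` helper below are illustrative:

```js
const userCache = new LRUCache(1000);

app.get('/user/:id', async (req, res) => {
  let user = userCache.get(req.params.id);
  if (!user) {
    user = await fetchUser(req.params.id); // assumed DB helper
    userCache.set(req.params.id, user);    // bounded — old entries evicted, no leak
  }
  res.json(user);
});
```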
```js
const crypto = require('crypto');

// 1. Cache-Control: max-age — browser caches for N seconds, no server hit
res.setHeader('Cache-Control', 'public, max-age=3600'); // 1 hour

// 2. ETag — fingerprint of the response; conditional GET saves bandwidth
const etag = crypto.createHash('sha1').update(body).digest('hex');
res.setHeader('ETag', etag);
if (req.headers['if-none-match'] === etag) {
  res.writeHead(304); res.end(); // Not Modified — no body sent!
  return;
}

// 3. stale-while-revalidate — serve stale immediately, refresh in background
res.setHeader('Cache-Control', 'public, max-age=60, stale-while-revalidate=300');
// Browser serves cached version instantly; background request updates cache

// 4. Cache-Control: private — per-user data, only browser caches (not CDN)
res.setHeader('Cache-Control', 'private, max-age=0, must-revalidate');
```
| Metric | What it measures | Good target (typical API) |
|---|---|---|
| Throughput (RPS) | Requests per second at saturation | Depends on workload — measure your baseline |
| p50 latency | Median response time — what most users experience | < 20 ms for simple API |
| p99 latency | Worst 1% of requests — tail latency | < 200 ms — often 10–50× higher than p50 |
| p99.9 latency | Worst 0.1% — the "long tail" | Watch GC pauses here |
| Error rate | % of non-2xx responses under load | 0% — any errors under load indicate a bug |
| Memory growth | Heap used after N minutes of load | Stable (flat line) — any growth = potential leak |
```bash
# Install
npm install -g autocannon

# Basic benchmark: 100 connections, 30 seconds
autocannon -c 100 -d 30 http://localhost:3000/api/users
# Output shows:
# Stat    | 2.5% | 50%  | 97.5% | 99%  | Avg     | Stdev | Max
# Latency | 4 ms | 7 ms | 18 ms | 45ms | 8.2 ms  | 3.1   | 234ms
# Req/Sec | 9200 |10500 | 11400 | ...  | 10213.5 | ...

# POST with JSON body
autocannon -c 50 -d 20 \
  -m POST \
  -H 'Content-Type: application/json' \
  -b '{"name":"test"}' \
  http://localhost:3000/api/users
```

```js
// Programmatic use in Node — autocannon() returns a promise
const autocannon = require('autocannon');
const result = await autocannon({ url: 'http://localhost:3000', connections: 100 });
console.log(autocannon.printResult(result));
```
Benchmark against a production-like setup: warm up first — e.g. -w 5 (warmup connections) in autocannon — and always run with NODE_ENV=production.

"My approach: measure first with autocannon to establish a baseline and understand if the problem is latency or throughput. Then run clinic doctor to identify the category — CPU, event loop, or memory. If CPU: use 0x for a flame graph to find the hot function. If memory: use heap snapshots in DevTools to find what's growing. If event loop lag: monitorEventLoopDelay + look for synchronous blocking. Only then do I change code — and always re-benchmark after to verify the fix actually helped."
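The tools named in that answer are all CLI-driven; a sketch of how they are typically invoked (flags vary by version):

```bash
# Categorise the problem: CPU, GC, event loop, or I/O
npx clinic doctor -- node server.js   # drive load in another terminal, then Ctrl+C

# CPU flame graph — find the hot function
npx 0x -- node server.js

# Drill into async / event loop behaviour
npx clinic bubbleprof -- node server.js
```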
To defend a system you must understand how it is attacked. Almost every vulnerability in web applications comes down to one root cause: trusting data that comes from outside your process. User input, HTTP headers, environment variables, database responses, third-party package code — any of these can be malicious. Your job is to treat every external input as potentially hostile until you have validated it.
A good bank teller does not hand over money just because someone claims to be the account owner. They verify ID, check the signature, confirm the amount is available, log the transaction, and alert the manager if anything is unusual. Every step is a security control. Good security code works the same way: verify, validate, sanitise, log, and limit.
Injection happens when you mix untrusted data with a command or query in a way that the data can change the structure of that command. Instead of being treated as a value, the attacker's input is interpreted as code. The root cause is always the same: string concatenation where parameterisation should be used.
Parameterised queries are like a structured form: there is a "name" box and a "surname" box. The form processor knows exactly which part is data. String concatenation is like telling someone what to write: "Write: hello, [whatever the user typed]." If the user types "; DROP TABLE users;--" the person writes that verbatim — and your database executes it.
```js
// ❌ VULNERABLE — string concatenation builds the query
// Attacker sends: username = "admin'--"
// Query becomes: SELECT * FROM users WHERE name='admin'--' AND pass='x'
// The -- comments out the password check → login bypassed!
const query = `SELECT * FROM users
  WHERE name='${req.body.username}' AND pass='${req.body.password}'`;
await db.query(query);

// ✅ FIXED — parameterised query: data is NEVER part of the SQL string
const row = await db.query(
  'SELECT * FROM users WHERE name = $1 AND pass = $2',
  [req.body.username, req.body.password] // values sent separately, never interpolated
);
// The DB driver sends query structure and values in separate packets.
// The DB can NEVER interpret a value as SQL syntax.
```
```js
// ❌ VULNERABLE — attacker sends JSON body: { "username": {"$gt": ""} }
// req.body.username is now the object { $gt: "" }
// MongoDB query becomes: { username: { $gt: "" } } → matches ALL users!
const user = await User.findOne({
  username: req.body.username, // object passed directly → operator injection
  password: req.body.password
});

// ✅ FIXED — enforce string type before passing to the query
const sanitise = (val) => typeof val === 'string' ? val : '';
const safeUser = await User.findOne({
  username: sanitise(req.body.username),
  password: sanitise(req.body.password)
});
// Better: use a validation library like Zod or Joi to enforce schema at the boundary
```
```js
const { exec, execFile } = require('child_process');

// ❌ VULNERABLE — attacker sends: filename = "report.pdf; cat /etc/passwd"
// Shell executes: convert report.pdf; cat /etc/passwd → reads /etc/passwd!
exec(`convert ${req.query.filename} output.png`, (err, out) => { ... });

// ✅ FIX 1 — use execFile instead of exec (no shell, args are separate)
execFile('convert', [req.query.filename, 'output.png'], ...);
// execFile does not invoke a shell — the filename cannot inject commands

// ✅ FIX 2 — validate input strictly (allowlist, not blocklist)
if (!/^[a-zA-Z0-9_\-]+\.pdf$/.test(req.query.filename)) {
  return res.status(400).send('Invalid filename');
}
```
Every plain JavaScript object inherits from Object.prototype. If an attacker can set a property on Object.prototype, that property will appear on every object in your application — because all objects inherit from it. This can change control flow, bypass security checks, or enable remote code execution.
Imagine every object in your program drinks from a shared water supply (Object.prototype). If an attacker adds a poison (a property) to the supply, every object that drinks from it gets affected — even objects created after the poisoning. That is prototype pollution.
```js
// How pollution happens — a naive "deep merge" function
function merge(target, source) {
  for (const key of Object.keys(source)) {
    if (typeof source[key] === 'object') {
      target[key] = merge(target[key] || {}, source[key]);
    } else {
      target[key] = source[key]; // ❌ allows writing to __proto__!
    }
  }
  return target;
}

// Attacker sends: { "__proto__": { "isAdmin": true } }
// After merge: Object.prototype.isAdmin === true
merge({}, JSON.parse('{"__proto__": {"isAdmin": true}}'));

// Now EVERY object inherits isAdmin: true
const user = { name: 'alice' }; // no isAdmin property of its own
console.log(user.isAdmin);      // true ← inherited from the polluted prototype!

// If your auth check is: if (user.isAdmin) grantAdminAccess();
// → every user is now an admin
```
```js
// 1. Block dangerous keys in merge/deep-clone functions
const BLOCKED = new Set(['__proto__', 'constructor', 'prototype']);
function safeMerge(target, source) {
  for (const key of Object.keys(source)) {
    if (BLOCKED.has(key)) continue; // skip dangerous keys
    target[key] = source[key];
  }
  return target;
}

// 2. Use null-prototype objects for untrusted data (no prototype chain)
const safe = Object.create(null); // no __proto__, no prototype chain
safe['__proto__'] = 'attack';     // just sets a normal property, harmless

// 3. Freeze Object.prototype (prevents any additions)
Object.freeze(Object.prototype);  // pollution attempts throw in strict mode

// 4. Use structured validation (Zod/Joi) to only allow known keys
// Unknown keys are stripped before they ever reach merge logic
const schema = z.object({ name: z.string(), age: z.number() }).strict();
// .strict() rejects any extra keys not in the schema
```
Run npm audit regularly — many advisories are specifically for prototype pollution in deep-merge utilities.
Why are bcrypt / argon2 used instead of SHA-256 or MD5?

SHA-256 and MD5 are designed to be fast — that is their purpose for checksums and signatures. But for passwords, speed is the enemy. A modern GPU can compute 10 billion SHA-256 hashes per second. If an attacker steals your hashed password database, they can try every possible 8-character password in seconds. A fast hash provides almost no protection.
A fast hash (SHA-256) is a screen door. It looks like a door, but an attacker can get through it in moments. bcrypt is a bank vault door — it is designed to be slow (deliberately). Even if an attacker gets the hash, cracking it takes years instead of seconds. The slowness is the feature.
Algorithm | Time to crack (GPU farm) | Why
─────────────────────────────────────────────────────────────
MD5 | < 1 second | 50 billion hashes/sec
SHA-256 | ~1 second | 10 billion hashes/sec
bcrypt(10) | ~10-100 years | ~200 hashes/sec (cost factor 10)
argon2id | > 100 years | Tunable; also memory-hard
─────────────────────────────────────────────────────────────
"memory-hard" = requires large RAM → can't parallelise on GPU cheaply
bcrypt has two key properties: a cost factor (work factor) that doubles the computation time for every increment, and a built-in random salt that is stored alongside the hash. The salt ensures that two users with the same password produce completely different hashes, defeating rainbow table attacks.
```js
const bcrypt = require('bcrypt');

// REGISTRATION — hash the password before storing
const register = async (plainPassword) => {
  const saltRounds = 12; // 2^12 iterations — ~250 ms on a modern CPU
  const hash = await bcrypt.hash(plainPassword, saltRounds);
  // Store 'hash' in the DB. NEVER store the plaintext password.
  await db.save({ passwordHash: hash });
};

// LOGIN — compare against the stored hash
const login = async (plainPassword, storedHash) => {
  const match = await bcrypt.compare(plainPassword, storedHash);
  // bcrypt.compare is timing-safe — no timing oracle vulnerability
  if (!match) throw new Error('Invalid credentials');
};

// argon2 — recommended for new projects (winner of the Password Hashing Competition)
const argon2 = require('argon2');
const hash = await argon2.hash(password, { type: argon2.argon2id });
const valid = await argon2.verify(hash, password);
```
A JSON Web Token is a self-contained credential. The server signs it with a secret key, so it can verify later that it created the token — without needing to look anything up in a database. A JWT has three base64-encoded parts separated by dots:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9 ← Header (algorithm + type)
.
eyJ1c2VySWQiOiI0MiIsInJvbGUiOiJ1c2VyIn0 ← Payload (your claims — NOT encrypted!)
.
SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c ← Signature (HMAC or RSA of header+payload)
Header: { "alg": "HS256", "typ": "JWT" }
Payload: { "userId": "42", "role": "user", "exp": 1700000000 }
Signature: HMAC-SHA256(base64(header) + "." + base64(payload), SECRET_KEY)
⚠ The payload is base64-encoded, NOT encrypted.
Anyone can decode it. NEVER put passwords or sensitive data in a JWT.
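To make that point concrete — decoding needs no secret at all. A short sketch using the token above (Node's Buffer supports base64url since v15.7):

```js
// No secret, no library — the payload is plain base64url
const token = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VySWQiOiI0MiIsInJvbGUiOiJ1c2VyIn0.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c';
const [header, payload] = token.split('.');
console.log(JSON.parse(Buffer.from(header, 'base64url').toString()));
// { alg: 'HS256', typ: 'JWT' }
console.log(JSON.parse(Buffer.from(payload, 'base64url').toString()));
// { userId: '42', role: 'user' }
```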
```js
const jwt = require('jsonwebtoken');

// ─── Vulnerability 1: Algorithm "none" attack ───────────────────────────────
// Attacker modifies header: { "alg": "none" } → signs with an empty signature
// If your code trusts 'alg' from the token header → anyone can forge a token!

// ❌ VULNERABLE — accepts the algorithm from the token
jwt.verify(token, secret); // old versions trusted the header's alg field

// ✅ FIXED — explicitly specify the allowed algorithm
jwt.verify(token, secret, { algorithms: ['HS256'] });

// ─── Vulnerability 2: Weak/missing secret ───────────────────────────────────
// ❌ VULNERABLE — short or predictable secret is brute-forceable
jwt.sign(payload, 'secret');      // brute-forced in milliseconds
jwt.sign(payload, 'mysecret123'); // still trivially weak

// ✅ FIXED — 256+ bit cryptographically random secret
// Generate once: node -e "console.log(require('crypto').randomBytes(64).toString('hex'))"
jwt.sign(payload, process.env.JWT_SECRET); // 128-hex-char secret from env

// ─── Vulnerability 3: No expiry ─────────────────────────────────────────────
// ❌ VULNERABLE — token never expires; a stolen token works forever
jwt.sign({ userId: 42 }, secret);

// ✅ FIXED — always set expiry
jwt.sign({ userId: 42 }, secret, { expiresIn: '15m' }); // short-lived access token
// Pair with a long-lived refresh token stored in an httpOnly cookie

// ─── Vulnerability 4: JWT in localStorage ───────────────────────────────────
// ❌ VULNERABLE — XSS can steal localStorage tokens
// ✅ FIXED — store in an httpOnly, secure, sameSite cookie
// httpOnly = JS cannot read it → XSS cannot steal it
```
"JWTs have three common attack vectors: the 'alg: none' attack (always specify allowed algorithms explicitly), weak secrets (use 256+ bit random), and missing expiry (always set expiresIn and use refresh tokens). Additionally, store JWTs in httpOnly cookies — not localStorage — so XSS cannot steal them. The payload is only base64, not encrypted, so never put sensitive data there."
HTTP security headers are instructions from your server to the browser: "here is how you are allowed to use this page." The browser enforces these rules, providing a second layer of defence even if your application code has vulnerabilities. They cost nothing and should be set on every response.
| Header | Protects against | Recommended value |
|---|---|---|
| `Content-Security-Policy` | XSS — restricts which scripts, styles, and resources can load | `default-src 'self'; script-src 'self'` |
| `Strict-Transport-Security` | HTTPS downgrade / SSL stripping attacks | `max-age=31536000; includeSubDomains` |
| `X-Content-Type-Options` | MIME sniffing — browser misinterpreting file type | `nosniff` |
| `X-Frame-Options` | Clickjacking — embedding your page in a malicious iframe | `DENY` or `SAMEORIGIN` |
| `Referrer-Policy` | Leaking sensitive URLs in the Referer header to third parties | `no-referrer-when-downgrade` |
| `Permissions-Policy` | Restricts browser features (camera, microphone, geolocation) | `camera=(), microphone=(), geolocation=()` |
| `X-Powered-By` | Fingerprinting — remove it to hide that you use Express/Node | Remove entirely (`app.disable('x-powered-by')`) |
```js
const express = require('express');
const helmet = require('helmet');
const app = express();

// Sets ~12 security headers automatically (use this on every Express app)
app.use(helmet());

// Customise CSP for your specific needs
app.use(helmet.contentSecurityPolicy({
  directives: {
    defaultSrc: ["'self'"],
    scriptSrc: ["'self'", 'https://cdn.example.com'],
    styleSrc: ["'self'", "'unsafe-inline'"], // avoid unsafe-inline in prod
    imgSrc: ["'self'", 'data:', 'https:'],
    connectSrc: ["'self'", 'https://api.example.com'],
    upgradeInsecureRequests: []
  }
}));

// HSTS — tell browsers to ALWAYS use HTTPS for this domain (1 year)
app.use(helmet.strictTransportSecurity({
  maxAge: 31_536_000,
  includeSubDomains: true,
  preload: true
}));
```
What is CSRF, and what role does the SameSite cookie attribute play?

CSRF exploits the fact that browsers automatically include cookies with every request to a domain — even if the request originated from a different website. If a user is logged into your site (session cookie stored in the browser) and visits a malicious site, that site can make the user's browser silently send a request to your API — with the user's real cookies attached.
Classic example: a malicious page embeds a hidden auto-submitting form or image tag pointing at bank.example.com/transfer?amount=1000&to=attacker — the victim's browser attaches their real bank session cookie automatically.

```js
// ── Layer 1: SameSite cookie attribute (modern, most effective) ──────────────
// SameSite=Strict: cookie NOT sent for ANY cross-site request (breaks OAuth flows)
// SameSite=Lax:    cookie NOT sent for cross-site POST/PUT/DELETE (good default)
// SameSite=None:   always sent (requires Secure; needed for third-party embeds)
res.cookie('session', token, {
  httpOnly: true,  // JS cannot read (XSS protection)
  secure: true,    // HTTPS only
  sameSite: 'lax', // not sent on cross-site POST → CSRF mitigated
  maxAge: 900_000  // 15 min in ms
});

// ── Layer 2: CSRF token (double-submit cookie pattern) ───────────────────────
// Server generates a random token, stored in a cookie AND a hidden form field.
// On POST: verify the form field matches the cookie value.
// An attacker's page cannot read your cookies (same-origin policy) → cannot forge.
const Tokens = require('csrf');
const tokens = new Tokens();

// On form render:
const secret = await tokens.secret();
const token = tokens.create(secret);
req.session.csrfSecret = secret;
// Include <input type="hidden" name="_csrf" value="${token}"> in your form

// On form submit — verify the token:
if (!tokens.verify(req.session.csrfSecret, req.body._csrf)) {
  return res.status(403).send('Invalid CSRF token');
}

// ── Layer 3: Check the Origin/Referer header ─────────────────────────────────
// Cross-site requests have Origin set to the attacker's domain
const origin = req.headers['origin'] || req.headers['referer'];
if (origin && !origin.startsWith('https://yourdomain.com')) {
  return res.status(403).send('Forbidden');
}
```
Validation: reject input that does not match the expected shape/type/range. Returns an error to the caller; never modifies the data. Examples: "Is this a valid email?", "Is this age between 0 and 120?", "Is this UUID format correct?"
Sanitisation: transform input to remove potentially dangerous content before using it. Examples: strip HTML tags before storing a user bio, trim whitespace, normalise unicode, encode special characters for SQL/HTML output.
```js
const { z } = require('zod');

// Define the shape you expect — anything outside this is rejected
const CreateUserSchema = z.object({
  username: z.string().min(3).max(30).regex(/^[a-zA-Z0-9_]+$/),
  email: z.string().email(),
  age: z.number().int().min(13).max(120),
  role: z.enum(['user', 'moderator']) // explicit allowlist
}).strict(); // .strict() rejects any unknown keys (prevents mass assignment)

app.post('/users', async (req, res) => {
  const result = CreateUserSchema.safeParse(req.body);
  if (!result.success) {
    return res.status(400).json({ errors: result.error.flatten().fieldErrors });
  }
  // result.data is fully typed and validated — safe to use
  await createUser(result.data);
  res.status(201).json({ ok: true });
});
```
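And a minimal sanitisation counterpart — the helpers here are illustrative sketches, not from a specific library:

```js
// Normalise and defuse input before storing/rendering — transform, don't reject
function sanitiseBio(raw) {
  return String(raw)
    .normalize('NFC')        // normalise unicode forms
    .trim()                  // strip leading/trailing whitespace
    .replace(/<[^>]*>/g, '') // strip HTML tags (naive — use a real sanitiser
                             // like DOMPurify for rich text)
    .slice(0, 500);          // enforce a max length
}

// Encode for safe HTML output (prevents stored XSS when rendering)
const escapeHtml = (s) => s
  .replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
  .replace(/"/g, '&quot;').replace(/'/g, '&#39;');
```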
```js
const path = require('path');

// ❌ VULNERABLE — attacker requests: /files?name=../../etc/passwd
app.get('/files', (req, res) => {
  res.sendFile('/uploads/' + req.query.name); // reads ANY file on the system!
});

// ✅ FIXED — resolve and verify the path stays within the allowed directory
app.get('/files', (req, res) => {
  const base = path.resolve('/uploads');
  const filePath = path.resolve(base, req.query.name);
  if (!filePath.startsWith(base + path.sep)) {
    return res.status(403).send('Forbidden'); // outside allowed dir
  }
  res.sendFile(filePath);
});
```
Attacker tries thousands of password combinations against /login. Without rate limiting: 10,000 attempts per second is trivial. With rate limiting: 5 attempts per 15 minutes — an attacker cannot crack a reasonable password in a lifetime.
Attacker uses a list of leaked usernames/passwords from other breaches and tries them against your login. Also prevents scraping, API abuse, and denial-of-service via expensive endpoints (search, reports).
```js
const rateLimit = require('express-rate-limit');
const RedisStore = require('rate-limit-redis').default;

// General API rate limit — 100 requests per 15 minutes per IP
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,
  standardHeaders: true, // sets RateLimit-* headers in the response
  legacyHeaders: false,
  store: new RedisStore({ sendCommand: (...args) => redisClient.sendCommand(args) })
  // Redis store: counters survive server restarts and work across cluster workers
});

// Strict limit for auth endpoints — 5 attempts per 15 minutes
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  message: 'Too many login attempts. Try again in 15 minutes.',
  skipSuccessfulRequests: true // don't count successful logins
});

app.use('/api', apiLimiter);
app.post('/login', authLimiter, loginHandler);

// ⚠ Important: rate limit by user ID AFTER authentication for logged-in endpoints.
// IP-based limits can be bypassed with multiple IPs and can affect legitimate
// shared networks (NAT, VPN, office). For auth endpoints, IP is acceptable.
```
Server-Side Request Forgery (SSRF) happens when your server makes an HTTP request to a URL that an attacker controls — and the attacker uses this to reach systems that are only accessible from within your internal network. Your server becomes a proxy for the attacker.
Imagine a company receptionist who, when asked, will fetch any document from any internal room and read it to you. An outsider cannot enter the building — but they can ask the receptionist to fetch sensitive documents from the CEO's office. The receptionist is trusted internally, so doors open for them. Your server is that receptionist — it has access to internal services that external attackers cannot reach directly.
Classic SSRF target: http://169.254.169.254/latest/meta-data/iam/security-credentials/ — the AWS EC2 metadata endpoint, only accessible from within the cloud instance.

```js
const dns = require('dns/promises');
const { URL } = require('url');

// Private/internal IP ranges to block (prefix match is a simplification —
// e.g. RFC1918 172.16.0.0/12 actually spans 172.16.* through 172.31.*)
const BLOCKED_PREFIXES = [
  '10.', '172.16.', '192.168.', // RFC1918 private
  '127.', '::1', 'localhost',   // loopback
  '169.254.', 'fd',             // link-local + IPv6 private
  '0.'                          // 0.0.0.0
];

async function isSafeUrl(rawUrl) {
  let parsed;
  try { parsed = new URL(rawUrl); } catch { return false; }

  // 1. Only allow https (never file://, ftp://, gopher://)
  if (parsed.protocol !== 'https:') return false;

  // 2. Resolve DNS and check the actual IP (prevents DNS rebinding)
  const addresses = await dns.lookup(parsed.hostname, { all: true });
  for (const { address } of addresses) {
    if (BLOCKED_PREFIXES.some(p => address.startsWith(p))) return false;
  }

  // 3. Allowlist approach (even better): only allow specific hostnames
  const ALLOWED_HOSTS = new Set(['api.partner.com', 'cdn.example.com']);
  return ALLOWED_HOSTS.has(parsed.hostname);
}
```
In development, keep secrets in a .env file (loaded by dotenv). Add .env to .gitignore immediately. Provide a .env.example with dummy values for team onboarding.

```js
// ❌ NEVER — hardcoded secret in source
const secret = 'sk_live_abc123xyz';

// ✅ Environment variable (development)
require('dotenv').config(); // reads the .env file (never commit .env!)
const secret = process.env.STRIPE_SECRET_KEY;
if (!secret) throw new Error('STRIPE_SECRET_KEY is required'); // fail fast

// ✅ AWS Secrets Manager (production)
const {
  SecretsManagerClient,
  GetSecretValueCommand
} = require('@aws-sdk/client-secrets-manager');

const client = new SecretsManagerClient({ region: 'us-east-1' });
const { SecretString } = await client.send(
  new GetSecretValueCommand({ SecretId: 'prod/myapp/db' })
);
const { password } = JSON.parse(SecretString);
```
Your node_modules folder contains thousands of packages you did not write. Any of them could have a vulnerability — or be malicious (supply chain attack). This is now one of the most common attack vectors against Node.js applications.
```bash
# 1. Audit for known vulnerabilities (run after every npm install)
npm audit

# 2. Auto-fix safe upgrades
npm audit fix

# 3. Snyk — deeper analysis including transitive deps and license issues
npx snyk test

# 4. Lock your dependency versions (always commit package-lock.json)
#    Use exact versions for critical deps: "express": "4.18.2" not "^4.18.2"

# 5. Check for typosquatting before installing
#    "lodash" is safe. "1odash" (number one) is a malicious package.
#    Always verify the package name exactly.

# 6. Limit what packages can do — Node's experimental permission model
#    (filesystem-focused; exact flags and coverage vary by Node version)
node --experimental-permission --allow-fs-read=/data server.js
# Node throws if the app tries to read/write outside the allowed paths
```
"My security checklist: parameterised queries for all DB access (no string concatenation), Zod/Joi validation at every API boundary, Helmet.js for security headers, rate limiting on auth endpoints, httpOnly+SameSite=Lax cookies for sessions, bcrypt/argon2 for passwords, short-lived JWTs with explicit algorithm, secrets in environment variables (never in code), npm audit in CI, and SSRF protection on any endpoint that fetches user-supplied URLs. Defence in depth — each layer assumes the previous one can fail."
A test proves that your code does what you think it does — right now. More importantly, it proves it still does that six months later after fifty other changes. Tests are a time machine: you write them once and they keep checking correctness forever, automatically, in milliseconds.
A pilot does not skip the pre-flight checklist just because the plane flew fine yesterday. Conditions change, things break, humans make mistakes. The checklist runs every time, automatically catching problems before they become disasters. Your test suite is that checklist — it runs on every commit, every deploy, catching regressions before users see them.
- Confidence to refactor — change internals without fear.
- Living documentation — tests show how code is meant to be used.
- Faster debugging — a failing test pinpoints exactly what broke.
- Design pressure — hard-to-test code is usually badly designed.
Tests cannot prove the absence of bugs — only the presence of the behaviours you tested. 100% code coverage does not mean 100% correct. Tests are only as good as the scenarios you imagined. Always pair tests with code review and monitoring.
▲
/ \
/ E2E \ Few, slow, brittle, expensive
/─────────\ Tests the whole system via UI/API
/ \
/ Integration \ Some — test module boundaries
/───────────────\ Real DB, HTTP, file system
/ \
/ Unit Tests \ Many, fast, isolated, cheap
/─────────────────────\ One function, all deps mocked
───────────────────────────
Rule of thumb: 70% unit / 20% integration / 10% E2E
Why the shape? Unit tests are fast (ms), E2E tests are slow (seconds)
Fast tests run on every save; slow ones run in CI
| Unit | Integration | End-to-End (E2E) | |
|---|---|---|---|
| What it tests | One function / class in isolation | Multiple components together (routes, DB, cache) | The whole system via UI or real HTTP |
| Speed | ~1–5 ms per test | ~50–500 ms per test | ~1–30 s per test |
| Dependencies | All mocked | Some real (DB), some mocked | All real (deployed env) |
| Catches | Logic bugs in one unit | Contract bugs between units | User-visible workflow failures |
| Misses | Integration bugs | UI-level issues | Edge cases (covered by unit tests) |
| Tools | Jest, Vitest, Mocha | Jest + supertest + testcontainers | Playwright, Cypress, k6 |
Many teams write almost no unit tests but have hundreds of E2E tests ("ice cream cone" shape). E2E tests are 100× slower, break on unrelated UI changes, are hard to debug, and cannot isolate exactly what failed. A suite of 200 E2E tests that takes 30 minutes to run gives you less confidence than 2000 unit tests that run in 5 seconds. The pyramid shape is intentional.
Explain Jest's building blocks — describe, it, expect, matchers, and beforeEach / afterEach. What are the setup / teardown lifecycle hooks?

```js
// ── Structure ──────────────────────────────────────────────────────
describe('UserService', () => { // groups related tests
  let db, service;

  // ── Lifecycle hooks ──────────────────────────────────────────────
  beforeAll(async () => {  // runs ONCE before all tests in this describe
    db = await createTestDb();
  });
  afterAll(async () => {   // runs ONCE after all tests
    await db.close();
  });
  beforeEach(async () => { // runs before EACH test — reset state
    await db.clear();
    service = new UserService(db);
  });
  afterEach(() => {        // runs after EACH test — cleanup
    jest.clearAllMocks();
  });

  // ── Tests ────────────────────────────────────────────────────────
  it('creates a user with hashed password', async () => {
    const user = await service.create({ name: 'Alice', password: 'secret' });
    expect(user.id).toBeDefined();
    expect(user.name).toBe('Alice');
    expect(user.password).not.toBe('secret');     // must be hashed
    expect(user.password).toMatch(/^\$2[aby]\$/); // bcrypt format
  });

  it('throws when email already exists', async () => {
    await service.create({ email: 'a@b.com' });
    await expect(service.create({ email: 'a@b.com' }))
      .rejects.toThrow('Email already exists');
  });
});
```
```js
// Equality
expect(val).toBe(42);               // strict equality (===) — use for primitives
expect(obj).toEqual({ a: 1 });      // deep equality — use for objects/arrays
expect(obj).toStrictEqual({});      // like toEqual but also checks undefined props

// Truthiness
expect(val).toBeTruthy();           // not null/undefined/0/false/''
expect(val).toBeFalsy();            // null/undefined/0/false/''
expect(val).toBeNull();             // exactly null
expect(val).toBeUndefined();        // exactly undefined

// Numbers
expect(0.1 + 0.2).toBeCloseTo(0.3); // float comparison (avoids 0.30000000004)
expect(5).toBeGreaterThan(3);

// Strings / arrays
expect('hello').toContain('ell');
expect([1, 2, 3]).toContain(2);
expect('hello').toMatch(/^hel/);

// Partial object matching
expect(user).toMatchObject({ name: 'Alice', role: 'admin' });
// passes even if user has extra properties — great for API responses

// Async errors
await expect(asyncFn()).rejects.toThrow('message');
await expect(asyncFn()).rejects.toMatchObject({ code: 'NOT_FOUND' });
```
If you forget to return a Promise or await it in a test, Jest considers the test finished as soon as the synchronous code completes — before the async assertions run. The test passes even if the assertions would have failed. This is a false positive: your tests lie to you.
```js
// ── Pattern 1: async/await (recommended) ─────────────────────────
it('fetches a user', async () => {
  const user = await getUser(1);
  expect(user.name).toBe('Alice');
});

// ── Pattern 2: return a Promise ──────────────────────────────────
it('fetches a user', () => {
  return getUser(1).then(user => { // ← MUST return. Omitting = silent pass.
    expect(user.name).toBe('Alice');
  });
});

// ── Pattern 3: callbacks — use done() ────────────────────────────
it('reads a file', (done) => {
  fs.readFile('test.txt', (err, data) => {
    expect(err).toBeNull();
    expect(data.toString()).toBe('hello');
    done(); // signal Jest the test is complete. Forgetting = timeout.
  });
});

// ── Common pitfall: not awaiting rejections ──────────────────────
// ❌ WRONG — test passes even if the promise never rejects
it('throws on bad input', () => {
  expect(badFn()).rejects.toThrow(); // forgot await/return — always passes!
});

// ✅ CORRECT — await the assertion
it('throws on bad input', async () => {
  await expect(badFn()).rejects.toThrow('Invalid input');
});

// ── expect.assertions(n) — fail the test if fewer assertions ran ─
it('should reach the catch block', async () => {
  expect.assertions(1); // test FAILS if this assertion was never reached
  try {
    await badFn();
  } catch (err) {
    expect(err.message).toBe('Invalid input');
  }
});
```
A unit test should test one thing. If your function calls a database, sends an email, and calls an external API, a test failure could be caused by any of those systems — not your code. Test doubles replace real dependencies with controlled substitutes so your test only verifies your logic.
A stub is a cardboard prop — it looks like a sword but doesn't do anything real. A fake is a rubber sword that can actually be swung. A spy is a hidden camera on the real sword. A mock is a prop with instructions: "record every time someone draws it, and if it's drawn before the fight scene, fail the take."
```js
// ── STUB — returns a fixed value. You don't care HOW it was called.
const getUser = jest.fn().mockResolvedValue({ id: 1, name: 'Alice' });
// Call it — it always returns that object. No assertions on how it was called.

// ── SPY — wraps a REAL function but records calls
const consoleSpy = jest.spyOn(console, 'log');
myFunction(); // console.log still runs, but calls are recorded
expect(consoleSpy).toHaveBeenCalledWith('expected message');
consoleSpy.mockRestore(); // restore the original after the test

// ── MOCK — like a spy but with pre-programmed expectations
const emailSender = { send: jest.fn() };
sendWelcomeEmail(emailSender, 'alice@example.com');
// Now verify HOW it was called:
expect(emailSender.send).toHaveBeenCalledTimes(1);
expect(emailSender.send).toHaveBeenCalledWith({
  to: 'alice@example.com',
  subject: expect.stringContaining('Welcome')
});

// ── FAKE — a working lightweight implementation (no jest.fn needed)
class FakeUserRepository {
  constructor() { this.users = new Map(); }
  async save(user) { this.users.set(user.id, user); return user; }
  async findById(id) { return this.users.get(id) ?? null; }
  async clear() { this.users.clear(); }
}
// Tests use this instead of a real DB — fast, isolated, but real logic runs
```
| Double | Use when |
|---|---|
| Stub | You need to control what a dependency returns but don't care about the call details |
| Spy | You want real behaviour to run but also need to assert how the function was called |
| Mock | You want to replace a dependency AND assert exactly how it was called |
| Fake | The real dependency is too complex/slow but you want real logic (e.g. in-memory DB) |
How does jest.mock() work? What are the rules and gotchas?

When you call jest.mock('moduleName'), Jest intercepts all require() calls to that module for the duration of the test file and replaces them with auto-generated mocks. Critically, Jest hoists jest.mock() calls to the top of the file at compile time — before any imports — so the mock is in place before your production code loads the module.
```js
// ── Auto-mock an entire module ───────────────────────────────────
jest.mock('nodemailer'); // all exports become jest.fn() automatically
const nodemailer = require('nodemailer');
nodemailer.createTransport.mockReturnValue({
  sendMail: jest.fn().mockResolvedValue({ messageId: 'test-id' })
});

// ── Mock with a factory function (full control) ──────────────────
jest.mock('../db', () => ({
  query: jest.fn().mockResolvedValue([]),
  close: jest.fn()
}));

// ── Partial mock — keep some real, mock some ─────────────────────
jest.mock('../utils', () => ({
  ...jest.requireActual('../utils'),                // keep real implementations
  generateId: jest.fn().mockReturnValue('fixed-id') // replace only this one
}));

// ── Mock per-test (override for one test) ────────────────────────
beforeEach(() => jest.clearAllMocks()); // reset call counts between tests

it('handles DB error', async () => {
  db.query.mockRejectedValueOnce(new Error('Connection lost'));
  // Only this test sees the error; the next test gets the default mock
  await expect(getUsers()).rejects.toThrow('Connection lost');
});
```
```js
// ❌ BROKEN — variable declared before jest.mock(), but jest.mock() is hoisted above it
const mockFn = jest.fn();
jest.mock('../service', () => ({ doWork: mockFn }));
// The factory can run BEFORE `const mockFn` is initialised → undefined/ReferenceError

// ✅ FIXED — create jest.fn() inside the factory, capture the reference after
jest.mock('../service', () => ({ doWork: jest.fn() }));
const { doWork } = require('../service'); // capture after jest.mock runs

// OR use jest.mocked() for TypeScript-aware mock typing
import { doWork } from '../service';
const mockDoWork = jest.mocked(doWork); // typed mock with full autocomplete
```
How do you mock timers, Date.now(), and randomness in tests to make time-dependent code deterministic?

Tests that depend on real time have three problems: they are slow (a 5-second timeout test takes 5 seconds every run), they are flaky (a test that passes in 100 ms might fail under CI load at 101 ms), and they are non-deterministic (results change based on when the test runs). Jest's fake timers solve all three by giving you full control over the clock.
```js
// ── Fake timers — control setTimeout, setInterval, Date ──────────
beforeEach(() => {
  jest.useFakeTimers();
  jest.setSystemTime(new Date('2024-01-15T10:00:00Z')); // fixed clock
});
afterEach(() => jest.useRealTimers()); // ALWAYS restore after the test

it('debounces search after 300ms', () => {
  const handler = jest.fn();
  const debounced = debounce(handler, 300);
  debounced('a'); debounced('ab'); debounced('abc');
  expect(handler).not.toHaveBeenCalled(); // not fired yet

  jest.advanceTimersByTime(300); // jump the clock forward 300 ms instantly
  expect(handler).toHaveBeenCalledTimes(1);
  expect(handler).toHaveBeenCalledWith('abc'); // only the last call fired
});

it('generates a timestamp-based ID', () => {
  // Date.now() returns our fixed time — deterministic!
  expect(generateId()).toBe('id-1705312800000');
});

// ── Mocking Math.random ──────────────────────────────────────────
it('generates predictable tokens', () => {
  const spy = jest.spyOn(Math, 'random').mockReturnValue(0.5);
  expect(generateToken()).toBe('expected-value-for-0.5');
  spy.mockRestore();
});

// ── Running all pending timers ───────────────────────────────────
jest.runAllTimers();            // flush ALL pending timers (careful with recursion)
jest.runOnlyPendingTimers();    // safer — only currently queued timers
await jest.runAllTimersAsync(); // for async timers (Promises inside timeouts)
```
How do you test HTTP endpoints with supertest? Show a complete setup including authentication and database.

supertest wraps your Express app object and spins it up on an ephemeral port for the duration of each request. You make real HTTP requests against it without managing ports yourself, without the server running separately, and without any external network. The full middleware chain runs — routing, validation, auth, body parsing — just like production.
```js
const request = require('supertest');
const app = require('../app'); // import the app without .listen()
const { db } = require('../db');
const { sign } = require('jsonwebtoken');

describe('POST /api/users', () => {
  beforeAll(async () => { await db.migrate.latest(); });
  afterAll(async () => { await db.destroy(); });
  beforeEach(async () => { await db('users').truncate(); });

  it('returns 201 and the new user', async () => {
    const res = await request(app)
      .post('/api/users')
      .set('Content-Type', 'application/json')
      .send({ name: 'Alice', email: 'alice@example.com' });

    expect(res.status).toBe(201);
    expect(res.body).toMatchObject({ name: 'Alice', email: 'alice@example.com' });
    expect(res.body.id).toBeDefined();
    expect(res.body.password).toBeUndefined(); // must not leak hash
  });

  it('returns 400 for invalid email', async () => {
    const res = await request(app)
      .post('/api/users')
      .send({ name: 'Bob', email: 'not-an-email' });
    expect(res.status).toBe(400);
    expect(res.body.errors.email).toBeDefined();
  });

  it('returns 401 for unauthenticated admin route', async () => {
    const res = await request(app).delete('/api/users/1');
    expect(res.status).toBe(401);
  });

  it('deletes user when admin token provided', async () => {
    const token = sign({ role: 'admin' }, process.env.JWT_SECRET);
    const res = await request(app)
      .delete('/api/users/1')
      .set('Authorization', `Bearer ${token}`);
    expect(res.status).toBe(204);
  });
});
```
Key pattern: export the app object from app.js without calling listen(). Call listen() only in server.js. This lets supertest (and tests) import the app without starting a real server.
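A minimal sketch of that split — file names follow the convention just described:

```js
// app.js — defines the app; importing it has no side effects
const express = require('express');
const app = express();
app.get('/health', (req, res) => res.json({ ok: true }));
module.exports = app;

// server.js — the only place that binds a port
const app = require('./app');
app.listen(3000, () => console.log('listening on 3000'));

// health.test.js — tests import the app directly
const request = require('supertest');
const app = require('./app');
it('responds to health check', async () => {
  await request(app).get('/health').expect(200, { ok: true });
});
```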
How do you mock outgoing HTTP calls? Tools: nock, msw, and jest-fetch-mock.

If your code calls an external API (Stripe, Twilio, GitHub) and your tests actually make those calls, you have: slow tests (network latency), flaky tests (rate limits, downtime), billed API calls, and tests that break when the API changes independently of your code. You need to intercept outgoing HTTP at the network level.
```js
const nock = require('nock');

afterEach(() => nock.cleanAll()); // remove all interceptors after each test

it('creates a Stripe charge', async () => {
  // Intercept the real HTTP call at the socket level — no network
  nock('https://api.stripe.com')
    .post('/v1/charges')
    .reply(200, { id: 'ch_test123', status: 'succeeded' });

  const charge = await createCharge({ amount: 1000, currency: 'usd' });
  expect(charge.id).toBe('ch_test123');
});

it('handles Stripe errors gracefully', async () => {
  nock('https://api.stripe.com')
    .post('/v1/charges')
    .reply(402, { error: { message: 'Your card was declined.' } });

  await expect(createCharge({ amount: 1000 }))
    .rejects.toThrow('Your card was declined.');
});

// nock also supports: query params, request body matching, delays, repeat counts
nock('https://api.example.com')
  .get('/users')
  .query({ page: '2' }) // match a specific query string
  .delayConnection(200) // simulate a slow network
  .times(3)             // match up to 3 requests (set before .reply)
  .reply(200, [{ id: 2 }]);
```
```js
const { setupServer } = require('msw/node');
const { http, HttpResponse } = require('msw');

const server = setupServer(
  http.get('https://api.github.com/users/:username', ({ params }) => {
    return HttpResponse.json({ login: params.username, public_repos: 42 });
  })
);

beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());

it('fetches GitHub profile', async () => {
  const profile = await getGithubProfile('torvalds');
  expect(profile.login).toBe('torvalds');
  expect(profile.public_repos).toBe(42);
});
```
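The third tool from the question, jest-fetch-mock, replaces the global fetch rather than intercepting sockets. A minimal sketch — `getUserViaFetch` and the URL are assumed fetch-based client code:

```js
// jest.setup.js — installs the mock once for the test run
require('jest-fetch-mock').enableMocks();

// In a test file — global fetch is now a jest mock function
beforeEach(() => fetch.resetMocks());

it('fetches a user via global fetch', async () => {
  fetch.mockResponseOnce(JSON.stringify({ id: 1, name: 'Alice' }));

  const user = await getUserViaFetch(1); // assumed helper using fetch
  expect(user.name).toBe('Alice');
  expect(fetch).toHaveBeenCalledWith('https://api.example.com/users/1');
});
```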
| Metric | Measures | Example |
|---|---|---|
| Statement | % of individual statements executed | const x = 1; was run at least once |
| Branch | % of if/else/ternary paths taken | Both the if path AND the else path were executed |
| Function | % of functions called at least once | Every function was invoked by some test |
| Line | % of source lines executed | Similar to statement but per source line |
```js
// jest.config.js
module.exports = {
  collectCoverage: true,
  coverageProvider: 'v8', // use V8's built-in coverage (faster)
  collectCoverageFrom: [
    'src/**/*.js',
    '!src/**/*.test.js', // exclude test files themselves
    '!src/migrations/**' // exclude generated/migration files
  ],
  coverageThreshold: {   // note: singular "Threshold"
    global: {
      statements: 80,    // CI fails if below 80%
      branches: 75,      // branches are harder to cover — set lower
      functions: 80,
      lines: 80
    },
    // Per-directory thresholds for critical business logic
    'src/payments/': { statements: 95, branches: 95 }
  },
  coverageReporters: ['text', 'html', 'lcov'] // lcov for CI integration (Codecov, SonarQube)
};
```
Code that is hard to test has the same root cause: it reaches out to external systems (databases, files, APIs, clocks) directly, without giving tests a way to substitute those dependencies. The fix is always the same: inject dependencies instead of instantiating them inside the function.
A chef who grows, harvests, and cooks all in one process is impossible to test — you cannot check just the cooking without the farm. A chef who accepts ingredients as inputs can be tested by handing them different ingredients. Dependency injection is handing the chef the ingredients.
```js
// ── 1. Hard-coded dependencies ───────────────────────────────────
// ❌ Hard to test — db is always the real database
const db = require('../db');
async function getUser(id) { return db.query('SELECT ...'); }

// ✅ Inject the dependency — tests can pass a fake
async function getUser(id, db = require('../db')) { ... }

// Or better — constructor injection:
class UserService {
  constructor(db) { this.db = db; } // injected from outside
  async getUser(id) { return this.db.query(...); }
}
// Test: new UserService(fakeDb) — no real DB needed

// ── 2. Global state ──────────────────────────────────────────────
// ❌ Hard to test — global state bleeds between tests
let counter = 0;
module.exports = { increment: () => ++counter, get: () => counter };

// ✅ Return state-holding closures or classes — each test gets a fresh instance
function createCounter() {
  let count = 0;
  return { increment: () => ++count, get: () => count };
}

// ── 3. Side effects in module scope (runs on require) ───────────
// ❌ Hard to test — this runs when you import the module
const server = http.createServer().listen(3000); // runs immediately!

// ✅ Export a factory — the caller decides when to start
function createServer() { return http.createServer(...); }
module.exports = { createServer }; // import is side-effect free

// ── 4. No separation of concerns (God function) ─────────────────
// ❌ One function validates + queries + transforms + logs + emails
//    To test validation you must deal with all the other concerns

// ✅ Split into small single-purpose functions — each testable alone
const validate = (input) => ...;      // pure — trivial to test
const transform = (raw) => ...;       // pure — trivial to test
const persist = (data, db) => ...;    // injectable db
const notify = (data, mailer) => ...; // injectable mailer
```
"Testable code has three properties: pure functions where possible (same input → same output, no side effects), dependency injection instead of hard-coded instantiation (pass db/mailer/clock as parameters so tests can substitute fakes), and separation of concerns (small functions with one job are trivially testable in isolation). The rule of thumb: if you find yourself wrestling with jest.mock() constantly, it means the production code has too many hidden dependencies — refactor to inject them instead."