Segment 1 — Core Concepts & Document Model
The foundation of every MongoDB interview. Understanding the document model, BSON, ObjectId, and why MongoDB exists separates engineers who use MongoDB from those who understand it. These questions appear in virtually every MongoDB interview at any level.
Topics: Document Model · BSON · ObjectId · Collections · Schema Flexibility · WiredTiger
10 questions · 5 topics covered · ~40m study time
🧠 Mental Model — What MongoDB Really Is

Before memorising APIs, lock in this mental model: MongoDB stores data the way your application thinks about it — as objects, not rows. The core insight is that your domain model and your storage model are the same shape.

📦 The Filing Cabinet Analogy

A SQL database is like a spreadsheet — every row must have the same columns. MongoDB is like a filing cabinet of folders. Each folder (document) can contain whatever papers (fields) it needs. Some folders are thick, some thin. You don't have to pre-define what goes inside each folder — you just put it in.

✅ MongoDB excels at
  • Rapidly evolving data shapes (product catalogs, CMS)
  • Hierarchical or nested data (orders with line items)
  • Horizontal scale-out across many servers
  • High write throughput applications
  • Real-time analytics on operational data
❌ MongoDB struggles with
  • Complex multi-table JOINs (use RDBMS instead)
  • Strict ACID across many unrelated collections
  • Highly relational data with many foreign keys
  • Reporting-heavy workloads (data warehouse is better)
Topic A — The Document Model vs SQL
1
What is MongoDB? How does it fundamentally differ from a relational database?
Easy Document Model NoSQL

MongoDB is a document-oriented NoSQL database that stores data as flexible JSON-like documents (technically BSON). Unlike relational databases which organise data into rigid tables with rows and columns, MongoDB stores each record as a self-contained document that can have its own structure.

SQL Concept | MongoDB Equivalent | Key Difference
Database | Database | Same concept
Table | Collection | Collections have no enforced schema
Row | Document | Documents can differ in structure
Column | Field | Fields can be nested objects or arrays
Primary Key | _id field | Auto-generated ObjectId by default
JOIN | $lookup (aggregation) | Prefer embedding over joining
Foreign Key | Manual reference (DBRef) | Not enforced at DB level
Index | Index | Same concept, more types available

Relational databases normalise data — splitting it across many tables to avoid duplication, then using JOINs to reconstruct it. MongoDB denormalises by design — related data lives together in one document, avoiding the JOIN cost at read time.

SQL vs MongoDB — same data
-- SQL: data split across 3 tables, need JOIN to reconstruct
SELECT o.id, u.name, p.title, oi.quantity
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON oi.product_id = p.id;

// MongoDB: one read, all data together
db.orders.findOne({ _id: orderId });
// Returns: { _id, user: { name, email }, items: [{ title, qty, price }] }
Why This Matters in Interviews
Interviewers want to know you understand the trade-off: embedding avoids JOINs but can lead to data duplication. The right answer shows you know MongoDB is a deliberate design choice, not just "a faster MySQL."
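A minimal in-memory sketch of that trade-off, using plain JavaScript arrays in place of real tables and collections. The data and the helper embedOrder are hypothetical and not part of any MongoDB API. It performs the join once and produces the single embedded document shape that the findOne call above would return.

```javascript
// Normalised "tables": plain arrays standing in for SQL tables.
const users = [{ id: 1, name: "Alice", email: "alice@example.com" }];
const products = [
  { id: 10, title: "Keyboard", price: 40 },
  { id: 11, title: "Mouse", price: 15 },
];
const orders = [{ id: 100, userId: 1 }];
const orderItems = [
  { orderId: 100, productId: 10, quantity: 1 },
  { orderId: 100, productId: 11, quantity: 2 },
];

// Hypothetical helper: does the JOIN work once, at write time,
// and returns one self-contained document.
function embedOrder(orderId) {
  const order = orders.find((o) => o.id === orderId);
  if (!order) return null;
  const user = users.find((u) => u.id === order.userId);
  const items = orderItems
    .filter((oi) => oi.orderId === orderId)
    .map((oi) => {
      const product = products.find((p) => p.id === oi.productId);
      // title and price are copied into the order: this is the duplication
      return { title: product.title, qty: oi.quantity, price: product.price };
    });
  return { _id: order.id, user: { name: user.name, email: user.email }, items };
}

const orderDoc = embedOrder(100);
console.log(JSON.stringify(orderDoc, null, 2));
```

The copied title and price will not follow later changes to the products array. That is the duplication cost an interviewer expects you to name.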
How to answer in an interview

"MongoDB is a document-oriented database that stores data as BSON documents — essentially flexible JSON objects. The fundamental difference from relational databases is that MongoDB avoids the table-column-row model in favour of documents that can contain nested data and arrays. This means you design schemas around your access patterns — the data your app reads together lives together — trading JOIN-heavy reads for potential data duplication."

2
What is BSON? How does it differ from JSON, and why does MongoDB use it?
Easy BSON

BSON (Binary JSON) is a binary-encoded serialisation of JSON-like documents. MongoDB stores documents in BSON format internally and transmits them over the wire in BSON. Your application driver converts between BSON and native language objects transparently.

Property | JSON | BSON
Format | Text (UTF-8 string) | Binary encoded
Human-readable | Yes | No
Parse speed | Slower (text parsing) | Faster (binary, length-prefixed)
Data types | String, Number, Boolean, Array, Object, null | All JSON types + Date, Binary, ObjectId, Int32, Int64, Decimal128, Regex, Timestamp
Size | Smaller for simple data | Can be larger due to type metadata
Traversal | Must parse entire string | Skip fields using length prefix (fast)
MongoDB Shell / BSON types
{
  _id:        ObjectId("507f1f77bcf86cd799439011"), // 12-byte unique ID
  createdAt:  new Date("2026-01-15"),          // ISODate — NOT a string
  price:      Decimal128("19.99"),             // Precise decimal (no float errors)
  views:      NumberLong(9999999999),           // 64-bit integer
  data:       BinData(0, "base64string"),       // Binary blob (images, files)
  pattern:   RegExp(/^mongo/i),               // Stored regex
  isActive:  true                               // Boolean (same as JSON)
}
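To make the length-prefix idea concrete, here is a hand-rolled sketch of a BSON encoder for the simplest possible case. It is an illustration only: the function name encodeTinyBson is hypothetical, it handles just flat documents with string and int32 values, and real applications rely on the driver's BSON library instead.

```javascript
// Minimal BSON encoder for flat documents with string and int32 values only.
// Layout: int32 total length, then elements, then a 0x00 terminator.
function encodeTinyBson(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8"); // field name as a C string
    if (typeof value === "string") {
      const text = Buffer.from(value + "\0", "utf8");
      const textLength = Buffer.alloc(4);
      textLength.writeInt32LE(text.length, 0); // prefix counts the trailing NUL
      parts.push(Buffer.from([0x02]), name, textLength, text); // 0x02 = string
    } else if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
      const number = Buffer.alloc(4);
      number.writeInt32LE(value, 0);
      parts.push(Buffer.from([0x10]), name, number); // 0x10 = int32
    } else {
      throw new TypeError(`unsupported value for field "${key}"`);
    }
  }
  const body = Buffer.concat(parts);
  const totalLength = Buffer.alloc(4);
  totalLength.writeInt32LE(4 + body.length + 1, 0);
  return Buffer.concat([totalLength, body, Buffer.from([0x00])]);
}

const bytes = encodeTinyBson({ hello: "world" });
console.log(bytes.length);         // 22
console.log(bytes.readInt32LE(0)); // 22: the document's own length prefix
console.log(bytes.toString("hex"));
```

Because the document and the string both announce their length up front, a reader can jump past them without scanning for a closing brace or quote, which is what the "Traversal" row in the table refers to.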
⚠ Common Gotcha — Dates
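Always store dates as BSON Date objects, not as strings. A BSON Date compares by timestamp, so range queries ($gt, $lt) behave correctly and the aggregation pipeline's date operators work on it. A string is compared character by character: that only happens to sort correctly for consistently formatted ISO 8601 values such as "2026-01-15", and it breaks for formats like "15/01/2026" or for values with mixed time zones.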
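A small plain-JavaScript illustration of why string dates are fragile. The two sample values are made up, and the day/month/year format is chosen deliberately because it is a common source of this bug.

```javascript
// The same two dates written in a day/month/year string format.
const earlier = "15/01/2026"; // 15 January 2026
const later = "02/02/2026";   // 2 February 2026

// Strings compare character by character: "1" sorts after "0",
// so the January date wrongly appears to come after the February date.
const stringSaysEarlierIsFirst = earlier < later;

// Date objects compare by their underlying timestamp.
const dateSaysEarlierIsFirst =
  new Date("2026-01-15T00:00:00Z") < new Date("2026-02-02T00:00:00Z");

console.log(stringSaysEarlierIsFirst); // false
console.log(dateSaysEarlierIsFirst);   // true
```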
How to answer in an interview

"BSON is MongoDB's binary serialisation format: a binary encoding of JSON-like documents with an extended type system. The two key advantages over plain JSON are: first, it supports richer types like Date, Int64, Decimal128, and Binary that JSON doesn't have, which prevents precision issues with numbers and enables proper date queries. Second, BSON is length-prefixed, so a reader can skip over strings, arrays, and embedded documents without parsing them, making traversal faster."

3
What is a MongoDB document? What are its constraints and supported field types?
Easy Documents

A document is a set of key-value pairs stored in BSON format. It is the basic unit of data in MongoDB — analogous to a row in SQL but far more flexible. Documents can contain nested documents (subdocuments), arrays, and arrays of subdocuments.

  • Maximum size: 16 MB per document (hard limit)
  • _id field: Every document must have an _id that is unique within its collection; it is auto-generated as an ObjectId if not provided
  • Field names: Cannot contain the null character. Names that start with $ or contain . (dot) clash with operator and dot-notation syntax; MongoDB 5.0+ can store them, but they are awkward to query and best avoided
  • Nesting depth: Up to 100 levels; deeply nested documents are also hard to query and index
  • Field order: Field order is preserved in documents (JSON itself makes no ordering guarantee)
MongoDB Document
{
  _id: ObjectId("507f1f77bcf86cd799439011"),   // auto-generated unique key
  name: "MongoDB Mastery",                       // String
  price: Decimal128("49.99"),                    // Precise decimal
  inStock: true,                                 // Boolean
  tags: ["mongodb", "nosql", "database"],        // Array
  publishedAt: new Date("2026-01-01"),           // BSON Date
  author: {                                       // Embedded subdocument
    name: "Yogesh Tiwari",
    email: "yogesh@example.com"
  },
  reviews: [                                      // Array of subdocuments
    { user: "alice", rating: 5, comment: "Great!" },
    { user: "bob",   rating: 4, comment: "Good"   }
  ],
  metadata: null                                 // null value
}
16MB Limit — What to Do
If a document might exceed 16MB (e.g., storing file content), use GridFS — MongoDB's specification for storing large files by splitting them into 255KB chunks stored as separate documents.
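A quick arithmetic sketch of what that chunking means in practice. The helper gridfsChunkCount is hypothetical; it assumes the 255KB chunk size mentioned above and simply counts how many chunk documents a file of a given size would need.

```javascript
// Chunk size taken from the text above: 255KB = 261,120 bytes.
const GRIDFS_CHUNK_BYTES = 255 * 1024;

// Hypothetical helper: number of chunk documents for a file of fileBytes.
function gridfsChunkCount(fileBytes, chunkBytes = GRIDFS_CHUNK_BYTES) {
  if (!Number.isInteger(fileBytes) || fileBytes < 0) {
    throw new RangeError("fileBytes must be a non-negative integer");
  }
  return Math.ceil(fileBytes / chunkBytes);
}

console.log(gridfsChunkCount(20 * 1024 * 1024)); // 81 chunks for a 20MB file
```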
How to answer in an interview

"A MongoDB document is a BSON key-value structure: think of it as a rich JSON object. Key constraints: maximum 16MB, every document needs an _id that is unique within its collection, and field names should avoid dots and leading dollar signs because they collide with query syntax. What makes documents powerful is that field values can be strings, numbers, booleans, dates, arrays, or even other embedded documents, so you can represent an entire entity with all its relationships in one place."

4
What is a Collection in MongoDB? How does it compare to a SQL table?
Easy Collections

A collection is a grouping of MongoDB documents — the equivalent of a SQL table, but without a fixed schema. Collections are created automatically when you first insert a document into them. No pre-definition (CREATE TABLE) required.

Feature | SQL Table | MongoDB Collection
Schema | Fixed: every row has the same columns | Dynamic: each document can differ
Data types per field | Enforced by column definition | Any BSON type per field per document
Adding new fields | ALTER TABLE (can lock table) | Just insert a document with the new field
Creation | Explicit CREATE TABLE | Auto-created on first insert
Joins | Native JOIN across tables | $lookup in aggregation (not native)
Validation | Column constraints, triggers | Optional JSON Schema validation

Capped collections are a special collection type with a fixed size: when full, they overwrite the oldest documents. They are useful for logs, event streams, and audit trails where you only care about recent data.

MongoDB Shell
// Create a capped collection — max 1000 docs or 5MB
db.createCollection("auditLog", {
  capped: true,
  size: 5242880,   // 5MB in bytes
  max:  1000        // optional: max document count
});
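The overwrite-oldest behaviour can be pictured with a toy in-memory model. This is only an analogy under simplifying assumptions: the class CappedBuffer is hypothetical, it limits by document count alone, and it says nothing about how MongoDB implements capped collections internally.

```javascript
// Toy model of "when full, overwrite the oldest documents".
class CappedBuffer {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // drop the oldest entry to make room
    }
  }

  find() {
    return [...this.docs]; // insertion order, oldest first
  }
}

const auditLog = new CappedBuffer(3);
for (let i = 1; i <= 5; i++) {
  auditLog.insert({ event: i });
}
console.log(auditLog.find().map((d) => d.event)); // [ 3, 4, 5 ]
```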
How to answer in an interview

"A collection is like a SQL table but schema-free — it groups related documents without enforcing a structure. The key differences: collections are created implicitly on first insert, documents within the same collection can have completely different fields, and adding new fields to documents doesn't require any schema migration. You can optionally add JSON Schema validation to a collection if you need structure enforcement."

Topic B — ObjectId & Identity
5
What is an ObjectId? How is it structured, and why is it better than SQL auto-increment?
Easy ObjectId
An ObjectId is 12 bytes, laid out as follows:
  • Bytes 0–3 (4 bytes): Unix timestamp, in seconds since epoch. This makes ObjectIds sortable by creation time without a separate createdAt field.
  • Bytes 4–8 (5 bytes): Random value, generated once per process startup. Unique per machine and process.
  • Bytes 9–11 (3 bytes): Incrementing counter, initialised to a random value, then incremented. Handles multiple inserts within the same second on the same process.
JavaScript (Node.js Driver)
const { ObjectId } = require('mongodb');

const id = new ObjectId();

console.log(id.toString());      // 24-character hex string, e.g. "507f1f77bcf86cd799439011"
console.log(id.getTimestamp()); // Date decoded from the first 4 bytes (the creation time)

// IMPORTANT: always query with ObjectId, not a string
// (assumes `db` is a connected Db instance and this runs inside an async function)
await db.collection('users').findOne({ _id: new ObjectId("507f1f77bcf86cd799439011") });
// NOT: { _id: "507f1f77bcf86cd799439011" } ← this won't match!
Feature | SQL AUTO_INCREMENT | MongoDB ObjectId
Distributed generation | ❌ Requires central DB coordination | ✅ Generated client-side, no round-trip
Predictability | ❌ Sequential, easy to enumerate | ✅ Harder to enumerate, but the timestamp and counter are guessable, so don't treat it as a secret
Embedded metadata | ❌ Just a number | ✅ Contains creation timestamp
Sharding-safe | ❌ Conflicts across shards | ✅ Globally unique across shards
Sort by creation | Yes (by value) | Approximately (sorting by _id follows creation time at one-second resolution)
⚠ Common Mistake — String vs ObjectId
Always convert string IDs to ObjectId before querying. { _id: "507f1f..." } will never match a document whose _id is an ObjectId. This is the #1 ObjectId bug beginners hit.
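The 4 + 5 + 3 byte layout can be checked by slicing the hex string by hand. The helper parseObjectIdHex below is a hypothetical illustration written without the driver; in real code the driver's ObjectId class already exposes getTimestamp().

```javascript
// Split a 24-character ObjectId hex string into its three parts.
function parseObjectIdHex(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // bytes 0-3: Unix timestamp
  return {
    createdAt: new Date(seconds * 1000),
    random: hex.slice(8, 18).toLowerCase(),      // bytes 4-8: per-process random value
    counter: parseInt(hex.slice(18, 24), 16),    // bytes 9-11: counter
  };
}

const parts = parseObjectIdHex("507f1f77bcf86cd799439011");
console.log(parts.createdAt.toISOString()); // 2012-10-17T21:13:27.000Z
console.log(parts.random);                  // bcf86cd799
console.log(parts.counter);                 // 4427793
```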
How to answer in an interview

"An ObjectId is a 12-byte unique identifier: 4 bytes for a Unix timestamp, 5 random bytes unique per process, and 3 bytes for an incrementing counter. This design means ObjectIds are globally unique without coordination between servers, which is critical for sharding. A bonus: sorting by _id gives you documents in roughly creation order (the timestamp has one-second resolution and comes from whichever client generated the ID), and you can extract the creation timestamp from any ObjectId without a separate field."

Topic C — Schema Flexibility
6
What does "schema-less" mean in MongoDB? Is it truly schema-less?
Medium Schema Flexibility

MongoDB is often called "schema-less" but that's misleading. A more accurate term is schema-flexible or dynamic schema. MongoDB doesn't enforce a schema at the database level by default — but your application absolutely has a schema. It's just implicit in your code rather than enforced by the database engine.

📝 The Analogy

SQL is like a government form — every field is printed and you must fill them all in. MongoDB is like a blank notebook — you can write whatever you want, but your team still has conventions about what goes on each page. "Schema-less" means the notebook doesn't yell at you if you skip a section — it doesn't mean you don't have rules.

MongoDB Shell — valid but messy
// All 3 can coexist in the same collection — MongoDB won't complain
db.users.insertMany([
  { name: "Alice", email: "alice@example.com", age: 30 },
  { name: "Bob",   phone: "+91-9999"                      },  // no email or age
  { fullName: "Charlie D", age: "twenty-five"           }   // age as string!
]);

MongoDB supports optional JSON Schema validation — you can enforce that all documents in a collection meet certain rules, while still allowing fields to be added freely.

MongoDB Shell — Schema Validation
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email"],
      properties: {
        name:  { bsonType: "string",  description: "must be a string" },
        email: { bsonType: "string",  pattern: "^.+@.+$"           },
        age:   { bsonType: "int",     minimum: 0, maximum: 150        }
      }
    }
  },
  validationAction: "error"  // "warn" just logs, "error" rejects invalid inserts
});
Best Practice
In production, always use JSON Schema validation or an ODM like Mongoose for schema enforcement. Pure "no schema" is fine for prototyping but leads to data quality issues at scale.
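To show the kind of checks schema enforcement performs, here is a hand-written validator in plain JavaScript. It is a rough stand-in, not MongoDB's $jsonSchema engine: the rules object and the validate function are hypothetical, and only "required" and a basic type check are covered.

```javascript
// Hypothetical, simplified rules mirroring the validator above.
const rules = {
  required: ["name", "email"],
  types: { name: "string", email: "string", age: "number" },
};

// Returns a list of problems; an empty list means the document passes.
function validate(doc) {
  const errors = [];
  for (const field of rules.required) {
    if (!(field in doc)) errors.push(`missing required field: ${field}`);
  }
  for (const [field, type] of Object.entries(rules.types)) {
    if (field in doc && typeof doc[field] !== type) {
      errors.push(`${field} must be a ${type}`);
    }
  }
  return errors;
}

// The three "valid but messy" documents from the earlier example.
const alice = { name: "Alice", email: "alice@example.com", age: 30 };
const bob = { name: "Bob", phone: "+91-9999" };
const charlie = { fullName: "Charlie D", age: "twenty-five" };

console.log(validate(alice));   // []
console.log(validate(bob));     // [ 'missing required field: email' ]
console.log(validate(charlie)); // three errors: name, email, age type
```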
How to answer in an interview

"MongoDB is schema-flexible, not truly schema-less. The database doesn't enforce a schema by default — different documents in the same collection can have different fields. But in practice, your application has an implicit schema baked into your code. MongoDB lets you add JSON Schema validation rules to enforce required fields and data types when you need that discipline. The flexibility shines during rapid development and schema evolution — adding a new field is just inserting it, no ALTER TABLE required."

7
What are embedded documents vs references? When do you use each?
Medium Data Modeling
Embedding (Denormalised)

Nest related data inside the parent document as a subdocument or array. Everything is in one place — one read gets everything.

Referencing (Normalised)

Store a reference ID to another collection. Related data lives in a separate document — requires a second read or $lookup.

MongoDB — Embedded vs Referenced
// ── EMBEDDING: address inside user ──
{
  _id: ObjectId("..."),
  name: "Alice",
  address: {                    // embedded — always fetched with user
    street: "123 Main St",
    city: "Chandigarh",
    zip: "160001"
  }
}

// ── REFERENCING: author as a separate document ──
// posts collection
{
  _id: ObjectId("post1"),
  title: "MongoDB Guide",
  authorId: ObjectId("user1")  // reference — lookup separately
}
// users collection
{ _id: ObjectId("user1"), name: "Alice", email: "alice@example.com" }
Use Embedding When… | Use References When…
Data is always read together with the parent | Data is read independently of the parent
The child data belongs to one parent only (1:1 or 1:few) | The child is shared by many parents (many-to-many)
The child data doesn't change frequently | The child is updated often and independently
The array won't grow unboundedly | The array could grow very large (thousands+)
Atomic updates to parent + child are needed | You need to query/filter the child independently
⚠ The Unbounded Array Problem
Never embed a relationship that could grow without limit (e.g., all comments on a viral post). An unbounded embedded array will eventually push the document past the 16MB limit and degrade performance as the array grows. Use references for high-cardinality relationships.
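A back-of-the-envelope estimate shows how quickly an embedded array can reach the ceiling. The helper maxEmbeddedItems and the 200-byte average comment size are assumptions for illustration; the estimate ignores per-element BSON overhead, so the real limit is somewhat lower.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // the 16MB document limit

// Hypothetical helper: rough upper bound on embedded items of a given size.
function maxEmbeddedItems(avgItemBytes, baseDocBytes = 0) {
  if (avgItemBytes <= 0) throw new RangeError("avgItemBytes must be positive");
  return Math.floor((MAX_DOC_BYTES - baseDocBytes) / avgItemBytes);
}

console.log(maxEmbeddedItems(200));       // 83886
console.log(maxEmbeddedItems(200, 1024)); // 83880, leaving 1KB for the post itself
```

Roughly eighty thousand short comments sounds like a lot, but performance degrades long before the hard limit, because every update rewrites an ever larger document.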
How to answer in an interview

"The rule is: embed when you read together, reference when you read independently. Address in a user document is a classic embed: you always want it with the user. But a user's blog posts should be referenced, because you might want posts without the user and there could be thousands of them. The key constraints are: never embed arrays that could grow without bounds, and remember that embedding is what gives you atomic updates to parent and child in a single write."

Topic D — MongoDB in Context
8
What are the main advantages and disadvantages of MongoDB?
Medium Trade-offs
✅ Advantages
  • Flexible schema: Rapid iteration without ALTER TABLE migrations. New fields just appear in new documents.
  • Developer-friendly data model: Documents map directly to objects in code, with little ORM impedance mismatch.
  • Horizontal scaling (sharding): Distribute data across many servers natively, which is harder to do with SQL.
  • Rich query language: Supports ad hoc queries, aggregation pipeline, full-text search, geospatial queries, and time-series.
  • High write throughput: WiredTiger's document-level concurrency control allows far more concurrent writes than table-level locking.
  • Native JSON ecosystem: A natural fit for Node.js and REST APIs, since data keeps the same shape end-to-end.
  • Built-in replication: Replica sets provide automatic failover and read scaling out of the box.
❌ Disadvantages
  • Weak JOIN support: $lookup exists but is generally slower than SQL JOINs, so it is not ideal for heavily relational data.
  • 16MB document limit: Can't store large files directly (requires GridFS workaround).
  • Data duplication: Embedding denormalises data, so updates to shared data require updating multiple documents.
  • Memory-hungry: The WiredTiger cache defaults to the larger of 50% of (RAM − 1GB) or 256MB, so it needs generous memory allocation.
  • Multi-document transactions add overhead: Available since 4.0 (replica sets) and 4.2 (sharded clusters), but they're slower than single-document operations.
  • No declarative referential integrity: No foreign key constraints, so your application must maintain consistency.
How to answer in an interview

"MongoDB's main advantages are flexible schema evolution, natural JSON data model for JavaScript apps, built-in horizontal scaling via sharding, and high write throughput from document-level locking. The downsides: it doesn't enforce referential integrity, multi-document JOINs are expensive, and denormalization means update anomalies require updating multiple documents. The right choice depends on whether your data is document-shaped and your access patterns favour reads-by-document over complex joins."

9
When should you choose MongoDB over a relational database (and vice versa)?
Medium Architecture
✅ Choose MongoDB when
  • Schema evolves rapidly: SaaS products early-stage, product catalogs with variable attributes
  • Data is hierarchical: Orders with line items, blog posts with comments, user profiles with nested settings
  • High write throughput: IoT telemetry, event logging, activity feeds, clickstream data
  • Horizontal scale is a requirement: Multi-region, very large datasets
  • JavaScript / Node.js stack: JSON-shaped data end-to-end, with little object-mapping friction
  • Geospatial queries: Location-based apps, maps, proximity searches
❌ Choose a relational database when
  • Data is highly relational: Financial systems, ERP, complex many-to-many relationships
  • Strict ACID across many entities: Banking, payments, inventory management
  • Complex reporting and analytics: Ad hoc SQL queries, BI tools, data warehouses
  • Referential integrity is non-negotiable: Foreign keys enforced at DB level
  • Legacy system integration: Most enterprise software speaks SQL
Real-World Hybrid Approach
Most large-scale applications use both. MongoDB for user profiles, content, and logs; PostgreSQL for financial transactions and billing; Redis for caching. Don't treat it as MongoDB vs SQL — treat it as the right tool for each workload.
How to answer in an interview

"I choose MongoDB when the data is document-shaped, the schema evolves frequently, or the access pattern is 'give me this whole entity.' I choose PostgreSQL or MySQL when data is highly relational with many foreign keys, when I need strict ACID guarantees across multiple entities, or when complex reporting queries are the primary workload. In practice most production systems use both — MongoDB for operational workloads and a relational DB for transactional data."

Topic E — Storage Engine
10
What is the WiredTiger storage engine? How does MongoDB store and manage data physically?
Hard Internals WiredTiger

WiredTiger is MongoDB's default (and recommended) storage engine. It replaced the older MMAPv1 engine and brought massive improvements in concurrency, compression, and performance.

Feature | Detail | Why It Matters
Document-level concurrency | MVCC (Multi-Version Concurrency Control): each write gets its own version | Multiple writers can work simultaneously without blocking each other, unless they modify the same document
Compression | Snappy (default), zlib, or zstd for data; prefix compression for indexes | Can reduce disk usage by 50–80% depending on the data; also reduces I/O
Cache | Internal cache: the larger of 50% of (RAM − 1GB) or 256MB by default | Keeps hot data in memory for fast reads
Write-ahead log (journal) | Writes are recorded in the journal, which is synced to disk at short intervals (100 ms by default) or immediately for j: true writes | Crash recovery: journaled writes are replayed after a sudden shutdown
Checkpoints | Every 60 seconds, WiredTiger flushes a consistent snapshot to disk | Limits how far back journal replay needs to go after a crash
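The default cache size works out as a one-line formula. The function name defaultWiredTigerCacheGB is hypothetical; the calculation uses the 50% of (RAM − 1GB) figure with the 256MB floor from the table above.

```javascript
// Default WiredTiger cache size in GB for a host with ramGB of memory:
// the larger of 50% of (RAM - 1GB) and 256MB (0.25GB).
function defaultWiredTigerCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

console.log(defaultWiredTigerCacheGB(16)); // 7.5
console.log(defaultWiredTigerCacheGB(4));  // 1.5
console.log(defaultWiredTigerCacheGB(1));  // 0.25 (the 256MB floor applies)
```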
1. Write accepted: the driver sends the BSON document over the network.
2. Journal write: the operation is recorded in the WiredTiger journal (WAL). Once the journal record is synced to disk, the write survives a crash.
3. Cache update: the document enters WiredTiger's in-memory cache as a "dirty page."
4. Index updates: all relevant B-tree indexes are updated in the cache as part of the same write.
5. Checkpoint: every 60s, dirty pages (data and indexes) are flushed to the data files on disk. Under cache pressure, eviction may write dirty pages out sooner.
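The journal, cache, and checkpoint interplay can be sketched as a toy model. Everything here is a simplification under stated assumptions: the class ToyEngine is hypothetical, every journal entry is treated as already durable, indexes are left out, and none of this reflects WiredTiger's actual data structures.

```javascript
// Toy write path: journal first, then cache, then checkpoint to "disk".
class ToyEngine {
  constructor() {
    this.journal = [];          // durable: survives a crash
    this.dataFiles = new Map(); // durable: state as of the last checkpoint
    this.cache = new Map();     // volatile: lost on crash
  }

  write(key, value) {
    this.journal.push({ key, value }); // record the operation first
    this.cache.set(key, value);        // then dirty the in-memory copy
  }

  read(key) {
    return this.cache.has(key) ? this.cache.get(key) : this.dataFiles.get(key);
  }

  checkpoint() {
    for (const [key, value] of this.cache) this.dataFiles.set(key, value);
    this.journal = []; // entries before the checkpoint are no longer needed
  }

  crashAndRecover() {
    this.cache = new Map(); // memory is gone
    for (const { key, value } of this.journal) {
      this.cache.set(key, value); // replay only what came after the checkpoint
    }
  }
}

const engine = new ToyEngine();
engine.write("a", 1);
engine.checkpoint();
engine.write("b", 2); // not yet checkpointed
engine.crashAndRecover();
console.log(engine.read("a"), engine.read("b")); // 1 2
```

The write to "b" survives only because of the journal, and recovery replays a single entry rather than the full history, which is the role checkpoints play.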
MongoDB Shell — Storage Engine Info
// Check the storage engine in use
db.serverStatus().storageEngine
// { name: "wiredTiger", ... }

// Check WiredTiger cache statistics
db.serverStatus().wiredTiger.cache

// Check compression (collection stats)
db.runCommand({ collStats: "users" })
// Shows: storageSize (compressed), totalIndexSize, wiredTiger.creationString
MVCC — Why Document-Level Locking Is Fast
MVCC means readers never block writers and writers never block readers. When a write happens, WiredTiger creates a new version of the document. Readers see the old version until the write commits. This is the same pattern PostgreSQL uses — far superior to MMAPv1's collection-level locking.
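A toy versioned store makes the snapshot behaviour visible. This is a conceptual sketch only: the class VersionedStore is hypothetical, it uses a simple counter as its clock, and it ignores write conflicts and old-version cleanup, both of which a real engine must handle.

```javascript
// Toy MVCC: every write appends a new version; readers pin a snapshot time.
class VersionedStore {
  constructor() {
    this.versions = new Map(); // key -> [{ ts, value }, ...] in ascending ts
    this.clock = 0;
  }

  write(key, value) {
    this.clock += 1;
    const history = this.versions.get(key) ?? [];
    history.push({ ts: this.clock, value });
    this.versions.set(key, history);
    return this.clock;
  }

  snapshot() {
    const snapshotTs = this.clock; // the reader only sees versions up to here
    return {
      read: (key) => {
        const history = this.versions.get(key) ?? [];
        for (let i = history.length - 1; i >= 0; i--) {
          if (history[i].ts <= snapshotTs) return history[i].value;
        }
        return undefined;
      },
    };
  }
}

const store = new VersionedStore();
store.write("stock", 10);
const reader = store.snapshot(); // reader starts here
store.write("stock", 9);         // a later write does not block or disturb it

console.log(reader.read("stock"));           // 10
console.log(store.snapshot().read("stock")); // 9
```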
How to answer in an interview

"WiredTiger has been MongoDB's default storage engine since 3.2. Its key advantages are document-level concurrency via MVCC, where readers and writers don't block each other, and compression, which cuts disk usage significantly. Data flows from insert → journal (for durability) → in-memory cache (dirty page) → disk (at checkpoint every 60s). The internal cache defaults to roughly 50% of (RAM − 1GB), so MongoDB is memory-hungry by design. If you're memory-constrained, lower storage.wiredTiger.engineConfig.cacheSizeGB in the config file (or pass --wiredTigerCacheSizeGB on the command line)."