MongoDB Performance Optimization: Indexes, Aggregations, and Beyond

MongoDB is one of the most flexible document databases available, but flexibility can lead to disastrous performance if you do not apply intentional indexing and query design. This guide covers the optimization strategies I rely on in every production deployment.

Understanding Query Execution Plans

Before optimizing, you need to see how MongoDB resolves queries. The explain() method is your best friend:

javascript Code Block

db.orders.find({ userId: ObjectId('...'), status: 'active' }).explain('executionStats');

Key fields to inspect:

totalDocsExamined — how many documents MongoDB scanned.
totalKeysExamined — how many index keys were walked.
executionTimeMillis — wall-clock time for the query.

The ideal query has totalDocsExamined === nReturned — meaning every document scanned was returned to the caller with zero waste.

Compound Indexes

Single-field indexes rarely cover production queries. A typical query filters by userId, then sorts by createdAt:

javascript Code Block

db.orders.createIndex({ userId: 1, createdAt: -1 });

MongoDB reads this index left-to-right. The order matters: put equality filters first, sort fields next, and range filters last — the ESR (Equality, Sort, Range) rule.

javascript Code Block

// Suppose we query:
db.orders.find({
  userId: ObjectId('...'),
  total: { $gte: 100 }
}).sort({ createdAt: -1 });

// Optimal index following ESR:
db.orders.createIndex({ userId: 1, createdAt: -1, total: 1 });

Covered Queries

A covered query is one where every field the client requests exists inside the index itself, so MongoDB never touches the document storage. This is the fastest possible read path:

javascript Code Block

// Create an index that covers the projection
db.orders.createIndex({ userId: 1, status: 1, total: 1 });

// Query with matching projection (exclude _id)
db.orders.find(
  { userId: ObjectId('...') },
  { _id: 0, status: 1, total: 1 }
);

Check the explain output for "stage": "IXSCAN" with no "FETCH" stage — that confirms coverage.

Aggregation Pipeline Optimization

Aggregation pipelines are powerful, but an unoptimized pipeline can bring a replica set to its knees. Follow these rules:

1. Filter early with `$match`

Always place $match as the first stage so MongoDB can use indexes to reduce the working set:

javascript Code Block

db.orders.aggregate([
  { $match: { status: 'completed', createdAt: { $gte: thirtyDaysAgo } } },
  { $group: { _id: '$userId', totalSpent: { $sum: '$total' } } },
  { $sort: { totalSpent: -1 } },
  { $limit: 10 },
]);

2. Use `$project` to reduce document size

Strip away fields you do not need before expensive stages like $group or $lookup:

javascript Code Block

{ $project: { userId: 1, total: 1, createdAt: 1 } }

3. Avoid `$unwind` when possible

$unwind explodes arrays into individual documents, multiplying the pipeline cardinality. Use array operators like $filter or $reduce inside a $project stage when you only need a subset.

Connection Pooling with Mongoose

Every mongoose.connect() call opens a pool of connections (default is 100 in the Node.js driver). For a containerized deployment where dozens of pods connect to the same cluster, tune the pool size:

typescript Code Block

await mongoose.connect(MONGODB_URI, {
  maxPoolSize: 20,       // max connections per pod
  minPoolSize: 5,        // keep 5 connections warm
  socketTimeoutMS: 45000,
  serverSelectionTimeoutMS: 5000,
});

Monitor pool utilization via the db.serverStatus().connections command to find the sweet spot between throughput and cluster memory.

Lean Queries for Read-Heavy Endpoints

By default, Mongoose wraps every document in a full model instance with change tracking. For read-only endpoints, skip the overhead:

typescript Code Block

const orders = await Order.find({ userId }).lean();

.lean() returns plain JavaScript objects and can reduce response times by 30-50 % for large result sets.

Schema Design Checklist

Embed data that is always read together (e.g., order items inside orders).
Reference data that grows unboundedly or is shared across collections.
Set { timestamps: true } on every schema for free createdAt / updatedAt tracking.
Add a TTL index on transient data like sessions:

javascript Code Block

db.sessions.createIndex({ expiresAt: 1 }, { expireAfterSeconds: 0 });

Key Takeaways

Always run explain() before and after adding indexes.
Follow the ESR rule for compound indexes.
Strive for covered queries on hot paths.
Filter and project early in aggregation pipelines.
Use .lean() for read-only Mongoose queries.
Tune connection pool sizes per container, not per cluster.

These techniques have saved me from slow dashboards, timeout errors, and midnight pager alerts in production.

Understanding Query Execution Plans

Compound Indexes

Covered Queries

Aggregation Pipeline Optimization

1. Filter early with `$match`

2. Use `$project` to reduce document size

3. Avoid `$unwind` when possible

Connection Pooling with Mongoose

Lean Queries for Read-Heavy Endpoints

Schema Design Checklist

Key Takeaways

Bablu Kumar Singh

You May Also Like

Optimizing PostgreSQL Query Performance: Indexing and Connection Pooling

Building a Multi-Tenant SaaS: Database Architecture Strategies

Designing a Role-Based Access Control (RBAC) System from Scratch

MongoDB Performance Optimization: Indexes, Aggregations, and Beyond

Understanding Query Execution Plans

Compound Indexes

Covered Queries

Aggregation Pipeline Optimization

1. Filter early with $match

2. Use $project to reduce document size

3. Avoid $unwind when possible

Connection Pooling with Mongoose

Lean Queries for Read-Heavy Endpoints

Schema Design Checklist

Key Takeaways

Bablu Kumar Singh

You May Also Like

Optimizing PostgreSQL Query Performance: Indexing and Connection Pooling

Building a Multi-Tenant SaaS: Database Architecture Strategies

Designing a Role-Based Access Control (RBAC) System from Scratch

1. Filter early with `$match`

2. Use `$project` to reduce document size

3. Avoid `$unwind` when possible