MongoDB is one of the most flexible document databases available, but flexibility can lead to disastrous performance if you do not apply intentional indexing and query design. This guide covers the optimization strategies I rely on in every production deployment.
Understanding Query Execution Plans
Before optimizing, you need to see how MongoDB resolves queries. The explain() method is your best friend:
db.orders.find({ userId: ObjectId('...'), status: 'active' }).explain('executionStats');Key fields to inspect:
totalDocsExamined— how many documents MongoDB scanned.totalKeysExamined— how many index keys were walked.executionTimeMillis— wall-clock time for the query.
The ideal query has totalDocsExamined === nReturned — meaning every document scanned was returned to the caller with zero waste.
Compound Indexes
Single-field indexes rarely cover production queries. A typical query filters by userId, then sorts by createdAt:
db.orders.createIndex({ userId: 1, createdAt: -1 });MongoDB reads this index left-to-right. The order matters: put equality filters first, sort fields next, and range filters last — the ESR (Equality, Sort, Range) rule.
// Suppose we query:
db.orders.find({
userId: ObjectId('...'),
total: { $gte: 100 }
}).sort({ createdAt: -1 });
// Optimal index following ESR:
db.orders.createIndex({ userId: 1, createdAt: -1, total: 1 });Covered Queries
A covered query is one where every field the client requests exists inside the index itself, so MongoDB never touches the document storage. This is the fastest possible read path:
// Create an index that covers the projection
db.orders.createIndex({ userId: 1, status: 1, total: 1 });
// Query with matching projection (exclude _id)
db.orders.find(
{ userId: ObjectId('...') },
{ _id: 0, status: 1, total: 1 }
);Check the explain output for "stage": "IXSCAN" with no "FETCH" stage — that confirms coverage.
Aggregation Pipeline Optimization
Aggregation pipelines are powerful, but an unoptimized pipeline can bring a replica set to its knees. Follow these rules:
1. Filter early with $match
Always place $match as the first stage so MongoDB can use indexes to reduce the working set:
db.orders.aggregate([
{ $match: { status: 'completed', createdAt: { $gte: thirtyDaysAgo } } },
{ $group: { _id: '$userId', totalSpent: { $sum: '$total' } } },
{ $sort: { totalSpent: -1 } },
{ $limit: 10 },
]);2. Use $project to reduce document size
Strip away fields you do not need before expensive stages like $group or $lookup:
{ $project: { userId: 1, total: 1, createdAt: 1 } }3. Avoid $unwind when possible
$unwind explodes arrays into individual documents, multiplying the pipeline cardinality. Use array operators like $filter or $reduce inside a $project stage when you only need a subset.
Connection Pooling with Mongoose
Every mongoose.connect() call opens a pool of connections (default is 100 in the Node.js driver). For a containerized deployment where dozens of pods connect to the same cluster, tune the pool size:
await mongoose.connect(MONGODB_URI, {
maxPoolSize: 20, // max connections per pod
minPoolSize: 5, // keep 5 connections warm
socketTimeoutMS: 45000,
serverSelectionTimeoutMS: 5000,
});Monitor pool utilization via the db.serverStatus().connections command to find the sweet spot between throughput and cluster memory.
Lean Queries for Read-Heavy Endpoints
By default, Mongoose wraps every document in a full model instance with change tracking. For read-only endpoints, skip the overhead:
const orders = await Order.find({ userId }).lean();.lean() returns plain JavaScript objects and can reduce response times by 30-50 % for large result sets.
Schema Design Checklist
- Embed data that is always read together (e.g., order items inside orders).
- Reference data that grows unboundedly or is shared across collections.
- Set
{ timestamps: true }on every schema for freecreatedAt/updatedAttracking. - Add a TTL index on transient data like sessions:
db.sessions.createIndex({ expiresAt: 1 }, { expireAfterSeconds: 0 });Key Takeaways
- Always run
explain()before and after adding indexes. - Follow the ESR rule for compound indexes.
- Strive for covered queries on hot paths.
- Filter and project early in aggregation pipelines.
- Use
.lean()for read-only Mongoose queries. - Tune connection pool sizes per container, not per cluster.
These techniques have saved me from slow dashboards, timeout errors, and midnight pager alerts in production.
