The Aggregation Pipeline is MongoDB's framework for data processing and transformation. It is modeled as a conveyor belt: documents enter the pipeline, pass through a series of sequential stages, and emerge as transformed output.
Each stage takes documents as input, applies one specific operation (filter, reshape, group, sort, join, etc.), and passes the result to the next stage.
Basic Syntax
db.collection.aggregate([
  { $stage1: { /* options */ } },
  { $stage2: { /* options */ } },
  { $stage3: { /* options */ } }
])
// The argument is an array of stage objects, executed top to bottom.
// The output of each stage is the input of the next.
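To make the placeholder syntax concrete, here is a minimal sketch of a three-stage pipeline. The orders collection and its fields (customerId, status, total) are hypothetical and chosen only for illustration. The guard around db is there so the snippet also loads outside mongosh.

```javascript
// Assumed (hypothetical) document shape in "orders":
// { customerId: "c1", status: "completed", total: 40 }
const topCustomersPipeline = [
  // Stage 1: filter, keep only completed orders
  { $match: { status: "completed" } },
  // Stage 2: group, one output document per customer
  {
    $group: {
      _id: "$customerId",
      orderCount: { $sum: 1 },      // number of orders per customer
      revenue: { $sum: "$total" }   // sum of the "total" field
    }
  },
  // Stage 3: sort, highest revenue first
  { $sort: { revenue: -1 } }
];

// `db` and `printjson` are mongosh globals; skip the call when run elsewhere.
if (typeof db !== "undefined") {
  db.orders.aggregate(topCustomersPipeline).forEach(doc => printjson(doc));
}
```

Because $match comes first, the $group stage only ever sees completed orders. Reordering the stages would change the result, which is the practical meaning of "executed top to bottom".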
Why Use the Aggregation Pipeline?
- Group documents and compute summaries (GROUP BY equivalent)
- Join multiple collections (LEFT JOIN equivalent)
- Compute derived fields with expressions
- Reshape documents — include, exclude, rename, compute fields
- Deconstruct arrays into individual documents
- Date/time extraction and formatting
- Statistical operations (standard deviation, etc.)
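Several of the capabilities listed above can be combined in a single pipeline. The sketch below is illustrative only: the orders and products collections, and the fields items, productId, qty, price, category, and createdAt, are assumptions rather than part of any real schema.

```javascript
const monthlyCategoryReport = [
  // Deconstruct the "items" array: one output document per array element
  { $unwind: "$items" },
  // Left outer join against the hypothetical "products" collection;
  // matches are placed in an array field named "product"
  {
    $lookup: {
      from: "products",
      localField: "items.productId",
      foreignField: "_id",
      as: "product"
    }
  },
  // Derived fields computed with expressions, including date formatting
  {
    $addFields: {
      lineTotal: { $multiply: ["$items.qty", "$items.price"] },
      orderMonth: { $dateToString: { format: "%Y-%m", date: "$createdAt" } },
      category: { $arrayElemAt: ["$product.category", 0] }
    }
  },
  // Group and summarize, including a statistical accumulator
  {
    $group: {
      _id: { month: "$orderMonth", category: "$category" },
      revenue: { $sum: "$lineTotal" },
      lineTotalStdDev: { $stdDevPop: "$lineTotal" }
    }
  },
  // Reshape: drop _id, rename its parts, keep the computed fields
  {
    $project: {
      _id: 0,
      month: "$_id.month",
      category: "$_id.category",
      revenue: 1,
      lineTotalStdDev: 1
    }
  }
];

// `db` and `printjson` are mongosh globals; skip the call when run elsewhere.
if (typeof db !== "undefined") {
  db.orders.aggregate(monthlyCategoryReport).forEach(doc => printjson(doc));
}
```

Each stage maps to one bullet: $unwind for arrays, $lookup for joins, $addFields for derived and date fields, $group with $stdDevPop for summaries and statistics, and $project for reshaping.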
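aggregate() returns a cursor, just like find(). Results are delivered to the client in batches as the cursor is iterated, rather than being loaded into memory all at once. Inside the pipeline, streaming stages such as $match and $project pass documents along as they arrive, while blocking stages such as $group and $sort must read all of their input before they can produce any output.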
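The cursor behaviour can be seen by pulling documents one at a time instead of collecting everything. The helper below is a hypothetical sketch written for mongosh, where hasNext() and next() can be called like synchronous functions; in the Node.js driver the same methods return promises and would need await. The orders collection is again an assumption.

```javascript
// Pull at most `limit` documents from a cursor, one at a time.
// Works with any object exposing hasNext() and next(), as mongosh cursors do.
function takeFromCursor(cursor, limit) {
  const docs = [];
  while (docs.length < limit && cursor.hasNext()) {
    docs.push(cursor.next());
  }
  return docs;
}

// `db` and `printjson` are mongosh globals; skip the call when run elsewhere.
if (typeof db !== "undefined") {
  const cursor = db.orders.aggregate([{ $match: { status: "completed" } }]);
  printjson(takeFromCursor(cursor, 5)); // stops asking after five documents
}
```

The helper stops calling next() once the limit is reached, so it never requests more documents from the cursor than it needs.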