$group Stage
The $group stage is like organizing your clothes into piles โ all shirts together, all pants together, all socks together. In MongoDB, it groups documents by a specified field (or fields) and performs operations on each group.
This is where the real power of aggregation comes in. Need to calculate totals, averages, or find extremes? The $group stage does it all. It's like having a built-in spreadsheet pivot table right in your database.
Think of it as the heart of most aggregation pipelines. Whether you're analyzing sales, user behavior, or any other data, you'll probably use $group to summarize and aggregate your information.
Grouping Documents
To group documents, you specify the _id field in the $group stage. This defines what you're grouping by. It can be a single field, multiple fields, or even a calculated expression.
Here's a simple example. Let's say you have an orders collection and you want to group orders by customer:
db.orders.aggregate([
{ $group: {
_id: "$customerId",
orderCount: { $sum: 1 }
}}
])
This groups all orders by customer and counts how many orders each customer has made. The _id field becomes the key for each group, and all documents with the same _id value are grouped together.
You can group by multiple fields by using an object as the _id value:
db.orders.aggregate([
{ $group: {
_id: { customer: "$customerId", product: "$productName" },
totalQuantity: { $sum: "$quantity" }
}}
])
This groups orders by both customer AND product, giving you a breakdown of how much of each product each customer ordered. It's like creating a cross-tabulation in Excel.
Accumulator Operators
Accumulators are the magic behind $group. They take multiple documents and combine them into a single result. Think of them as mathematical operations that work across groups of documents.
The most common accumulators are:
db.sales.aggregate([
{ $group: {
_id: "$category",
totalAmount: { $sum: "$amount" },
averageAmount: { $avg: "$amount" },
minSale: { $min: "$amount" },
maxSale: { $max: "$amount" },
saleCount: { $sum: 1 }
}}
])
$sum adds up values, $avg calculates the average, $min and $max find the extremes, and $sum: 1 counts documents. It's like having a calculator that works on groups instead of single numbers.
You can also use $first and $last to get the first or last value in each group (based on the order documents appear). And $push creates an array of all values in each group โ super useful when you need to see all individual values alongside your aggregates.
Custom Group Expressions
Sometimes you need more than simple aggregations. That's where custom expressions come in. You can use expression operators to create calculated fields within your groups.
For example, let's say you want to calculate the profit margin for each product category:
db.sales.aggregate([
{ $group: {
_id: "$category",
totalRevenue: { $sum: "$revenue" },
totalCost: { $sum: "$cost" },
profitMargin: {
$avg: {
$multiply: [{ $divide: ["$revenue", "$cost"] }, 100]
}
}
}}
])
This groups sales by category, calculates total revenue and cost, and computes the average profit margin. You can nest expressions to create complex calculations โ it's like building a formula in Excel, but way more powerful.
Custom expressions let you create any calculation you need directly in the database. No more exporting data to calculate complex metrics โ do it all where your data lives!
Try it Yourself โ