What is Referencing?
Referencing is the opposite of embedding. Instead of putting related data inside one document, you store it in separate collections and link the documents using IDs. It's like a contact list that stores a phone number for each person rather than everything you know about them.
In MongoDB, you reference a document by storing its _id value in another document. This is similar to a foreign key in a relational database, with one important difference: MongoDB does not enforce the relationship, so your application is responsible for keeping references valid.
Referencing keeps your documents small and focused. Each collection has a single responsibility, which can make maintenance easier.
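// Generate the author's _id up front so the book can reference it.
const authorId = ObjectId()

db.authors.insertOne({
  _id: authorId,
  name: "James Wilson",
  email: "james@example.com",
  bio: "Tech writer and developer"
})

// The book stores only the author's _id, not a copy of the author's details.
db.books.insertOne({
  title: "MongoDB in Action",
  authorId: authorId,
  isbn: "978-0123456789",
  price: 39.99,
  publishedYear: 2024
})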
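Reading the data back means following the reference: fetch the book, then fetch the author whose _id matches. The sketch below models that two-step read in plain JavaScript, with arrays standing in for collections and strings standing in for ObjectId values. The helper `findAuthorOfBook` is hypothetical and written for this example; it is not a MongoDB API.

```javascript
// In-memory stand-ins for the authors and books collections (illustrative data).
const authors = [
  { _id: "a1", name: "James Wilson", email: "james@example.com" }
];
const books = [
  { title: "MongoDB in Action", authorId: "a1", price: 39.99 }
];

// Hypothetical helper: follow a book's authorId to the author document.
function findAuthorOfBook(title) {
  // Step 1: find the book.
  const book = books.find((b) => b.title === title);
  if (!book) return null;
  // Step 2: find the author the book points at.
  const author = authors.find((a) => a._id === book.authorId);
  return author ?? null;
}

const author = findAuthorOfBook("MongoDB in Action");
console.log(author.name);
```

In mongosh, the same two steps would be two findOne calls, one against each collection.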
Benefits of Referencing
Referencing prevents data duplication. Instead of storing the same author information in every book document, you store it once and reference it. This saves space and keeps your data consistent.
Updates are cleaner too. If an author changes their email, you update it in one place instead of finding and updating every book by that author.
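To make the update argument concrete, here is a small in-memory comparison in plain JavaScript. The data and the names `embeddedBooks`, `referencedBooks`, and `authorsById` are made up for illustration; the only point is how many writes an email change needs in each model.

```javascript
// Embedded model: every book carries its own copy of the author.
const embeddedBooks = [
  { title: "MongoDB in Action", author: { name: "James Wilson", email: "james@example.com" } },
  { title: "NoSQL Fundamentals", author: { name: "James Wilson", email: "james@example.com" } }
];

// Referenced model: books hold an ID (a string here), and the author is stored once.
const authorsById = new Map([
  ["a1", { name: "James Wilson", email: "james@example.com" }]
]);
const referencedBooks = [
  { title: "MongoDB in Action", authorId: "a1" },
  { title: "NoSQL Fundamentals", authorId: "a1" }
];

const newEmail = "j.wilson@example.com";

// Embedded: one write for every book that contains a copy.
let embeddedWrites = 0;
for (const book of embeddedBooks) {
  if (book.author.name === "James Wilson") {
    book.author.email = newEmail;
    embeddedWrites += 1;
  }
}

// Referenced: a single write, visible through every book that points at "a1".
authorsById.get("a1").email = newEmail;
const referencedWrites = 1;

console.log({ embeddedWrites, referencedWrites });
```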
Referencing also works well for many-to-many relationships. A book can have multiple authors, and an author can write multiple books. Modeling this with embedding would mean copying each author's details into every book they contributed to.
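// Placeholder _id for a second author; in real code, use the _id of an existing author document.
const coAuthorId = ObjectId()

db.books.insertMany([
  {
    title: "MongoDB Advanced Topics",
    authorIds: [authorId, coAuthorId],
    price: 44.99
  },
  {
    title: "NoSQL Fundamentals",
    authorIds: [authorId],
    price: 34.99
  }
])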
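With an array of IDs on each book, the relationship can be traversed from either side. The following is a minimal in-memory sketch in plain JavaScript: string IDs stand in for ObjectId values, the second author is a placeholder, and `booksByAuthor` and `authorsOfBook` are hypothetical helpers written for this example.

```javascript
// Illustrative data; "Second Author" is a placeholder name.
const authors = [
  { _id: "a1", name: "James Wilson" },
  { _id: "a2", name: "Second Author" }
];
const books = [
  { title: "MongoDB Advanced Topics", authorIds: ["a1", "a2"] },
  { title: "NoSQL Fundamentals", authorIds: ["a1"] }
];

// Author -> books: keep every book whose authorIds array contains the ID.
function booksByAuthor(authorId) {
  return books
    .filter((b) => b.authorIds.includes(authorId))
    .map((b) => b.title);
}

// Book -> authors: keep every author whose _id appears in the book's array.
function authorsOfBook(title) {
  const book = books.find((b) => b.title === title);
  if (!book) return [];
  return authors
    .filter((a) => book.authorIds.includes(a._id))
    .map((a) => a.name);
}

console.log(booksByAuthor("a1"));
console.log(authorsOfBook("MongoDB Advanced Topics"));
```

In mongosh, the first direction corresponds to a query such as db.books.find({ authorIds: authorId }), which matches documents whose array contains that value. The second corresponds to db.authors.find({ _id: { $in: book.authorIds } }).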
When to Reference
Reference when the related data can grow without bound. Comments on a viral post, log entries, or transaction history can grow indefinitely, and an embedded array of them would eventually run into MongoDB's 16 MB document size limit. These belong in their own collections.
Reference when you need to access the related data on its own. If users frequently browse authors without caring about their books, separate collections make sense.
Reference when you have many-to-many relationships or when data is shared across multiple parent documents.
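// Categories are shared by many books, so they live in their own collection.
db.categories.insertMany([
  { _id: ObjectId(), name: "Fiction", slug: "fiction" },
  { _id: ObjectId(), name: "Non-Fiction", slug: "non-fiction" }
])

// Look up the category's _id by its slug, then store that _id on the book.
const fictionId = db.categories.findOne({ slug: "fiction" })._id

db.books.insertOne({
  title: "The Great Adventure",
  categoryId: fictionId,
  price: 19.99
})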
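For the unbounded case, a common arrangement is for each child document to point at its parent, so the parent never grows. Here is a rough in-memory sketch in plain JavaScript; the `post` and `comments` data and the helpers `addComment` and `commentsForPost` are assumptions made for illustration, with string IDs standing in for ObjectId values.

```javascript
// The parent document holds no comments array at all.
const post = { _id: "p1", title: "A viral post" };
// Stand-in for a separate comments collection.
const comments = [];

// Each comment stores its parent's ID; the post itself is never modified.
function addComment(postId, text) {
  comments.push({ _id: `c${comments.length + 1}`, postId, text });
}

// Fetch one page of a post's comments.
function commentsForPost(postId, skip = 0, limit = 10) {
  return comments
    .filter((c) => c.postId === postId)
    .slice(skip, skip + limit);
}

const sizeBefore = JSON.stringify(post).length;
for (let i = 1; i <= 1000; i++) {
  addComment("p1", `comment ${i}`);
}
const sizeAfter = JSON.stringify(post).length;

// The post is the same size after 1000 comments as it was before.
console.log(sizeBefore === sizeAfter);
console.log(commentsForPost("p1", 0, 3).map((c) => c.text));
```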
Population (Joins)
When you reference documents, you need a way to fetch the related data. MongoDB provides the $lookup aggregation stage, which performs a left outer join, much like LEFT OUTER JOIN in SQL.
Population is the process of replacing a referenced ID with the actual document. It's like looking up a contact in your phone after getting their number from a business card.
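Keep in mind that $lookup adds overhead, because the server has to search the foreign collection for every input document. Use it thoughtfully, and consider whether embedding might be a better fit for relationships that are read together on most requests.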
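db.books.aggregate([
  // Join each book with its author; matches are placed in an "author" array.
  {
    $lookup: {
      from: "authors",
      localField: "authorId",
      foreignField: "_id",
      as: "author"
    }
  },
  // Turn the one-element array into a single embedded document.
  // Books with no matching author are dropped at this stage.
  {
    $unwind: "$author"
  },
  // Return only the fields the caller needs (_id is included by default).
  {
    $project: {
      title: 1,
      price: 1,
      "author.name": 1,
      "author.email": 1
    }
  }
])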
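To see what such a pipeline produces, and why $unwind can silently drop documents, the sketch below imitates the $lookup and $unwind stages over in-memory arrays in plain JavaScript. The `lookup` and `unwind` functions are simplified stand-ins written for this example, not MongoDB APIs, and they cover only the single-field equality match used here. The "Orphaned Book" entry is invented to show a dangling reference.

```javascript
const authors = [
  { _id: "a1", name: "James Wilson", email: "james@example.com" }
];
const books = [
  { title: "MongoDB in Action", authorId: "a1", price: 39.99 },
  { title: "Orphaned Book", authorId: "missing", price: 9.99 } // dangling reference
];

// Imitates $lookup: attach every matching foreign document as an array.
// A document with no match gets an empty array (left outer join).
function lookup(docs, foreignDocs, localField, foreignField, as) {
  return docs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField])
  }));
}

// Imitates $unwind without preserveNullAndEmptyArrays: one output document
// per array element, so an empty array yields no output at all.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((item) => ({ ...doc, [field]: item }))
  );
}

const joined = lookup(books, authors, "authorId", "_id", "author");
const flattened = unwind(joined, "author");

console.log(joined.map((b) => [b.title, b.author.length]));
console.log(flattened.map((b) => `${b.title} by ${b.author.name}`));
```

Here joined still contains both books, but flattened contains only the one whose author exists. In a real pipeline, writing the stage as { $unwind: { path: "$author", preserveNullAndEmptyArrays: true } } keeps books whose author is missing.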