Embedded Data Versus References - MongoDB Manual v7.0

The Nugget

  • The choice between embedded data and references in MongoDB schema design affects read performance, data redundancy, and query complexity. Embedding works best for related data that is frequently read together, while references work better for complex, frequently changing, or large data relationships.

Make it stick

  • 🔗 Embedded data models are often denormalized, improving read performance at the cost of possible redundancy.
  • 📊 References store links between separate documents, which promotes normalization and makes updates easier but may require more complex queries.
  • 🚀 A single query retrieves embedded data, whereas references may require multiple queries.
  • ⚖️ Choose based on relationship type: one-to-one and one-to-many relationships favor embedding, while many-to-many relationships benefit from references.

Key insights

Embedded Data Models

  1. Use cases: "Contains" relationships (e.g., contacts containing addresses) and one-to-many relationships where the child documents are always viewed in the context of the parent document.
  2. Benefits:
    • Better performance for read operations.
    • Single database operation retrieves related data.
    • Related data can be updated in a single atomic write operation.
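
To make the "contains" relationship concrete, here is a minimal sketch of an embedded document using plain Python dictionaries. The field names (name, addresses, city) and the in-memory contacts store are hypothetical stand-ins for a real collection, not part of the MongoDB manual.

```python
# Hypothetical contact document with its addresses embedded (illustrative names).
contact = {
    "_id": 1,
    "name": "Ada Example",
    "addresses": [
        {"type": "home", "city": "Springfield"},
        {"type": "work", "city": "Shelbyville"},
    ],
}

# In-memory stand-in for a collection, keyed by _id (an assumption for this sketch).
contacts = {contact["_id"]: contact}


def find_contact(contact_id):
    """Return the whole document, parent fields and embedded addresses together."""
    return contacts.get(contact_id)


# One lookup is enough: the addresses arrive with the parent document.
doc = find_contact(1)
cities = [address["city"] for address in doc["addresses"]]
print(cities)
```

Because the addresses live inside the contact, no second read is needed to assemble the full record.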

Querying Embedded Data

  1. Method: Use dot notation to access data in embedded documents.
  2. Limitations: Documents must be smaller than 16 megabytes; consider GridFS for large binary data.
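
As a rough illustration of dot notation, the sketch below resolves a dotted path against nested dictionaries and applies an equality filter of the same shape MongoDB accepts. The helpers get_path and matches are hypothetical and simplified: they handle only nested documents and equality, not arrays or query operators.

```python
def get_path(doc, path):
    """Resolve a dotted path such as 'address.city' against nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(doc, query):
    """Equality-only match: every dotted path must equal its expected value."""
    return all(get_path(doc, path) == expected for path, expected in query.items())


# Hypothetical sample documents with an embedded address.
people = [
    {"name": "Ada", "address": {"city": "Springfield", "zip": "00001"}},
    {"name": "Lin", "address": {"city": "Shelbyville", "zip": "00002"}},
]

# The filter has the same shape as a MongoDB query document using dot notation.
query = {"address.city": "Springfield"}
found = [person["name"] for person in people if matches(person, query)]
print(found)
```

With a real driver, the same query dictionary would be passed to a find call on the collection.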

References

  1. Use cases: For situations where embedding leads to data duplication with minimal read performance gain or when dealing with many-to-many relationships.
  2. Advantages:
    • Data is normalized with less duplication.
    • Easier to manage frequent changes and complex relationships.
  3. Querying: Use aggregation stages such as $lookup and $graphLookup to query normalized data across collections.
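
The following sketch shows what a reference looks like and what a $lookup stage does with it. The collections (books, publishers) and field names are hypothetical, and emulate_lookup is a plain-Python imitation of the stage's equality join, not MongoDB's implementation.

```python
# Hypothetical normalized data: each book stores only a reference to its publisher.
publishers = [{"_id": "p1", "name": "Example Press"}]
books = [
    {"_id": 1, "title": "Book A", "publisher_id": "p1"},
    {"_id": 2, "title": "Book B", "publisher_id": "p1"},
]

# Shape of the aggregation stage that would perform this join on the server.
lookup_stage = {
    "$lookup": {
        "from": "publishers",
        "localField": "publisher_id",
        "foreignField": "_id",
        "as": "publisher",
    }
}


def emulate_lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Attach to each local document a list of the foreign documents it references."""
    joined = []
    for doc in local_docs:
        related = [
            other for other in foreign_docs
            if other.get(foreign_field) == doc.get(local_field)
        ]
        joined.append({**doc, as_field: related})
    return joined


spec = lookup_stage["$lookup"]
joined = emulate_lookup(
    books, publishers, spec["localField"], spec["foreignField"], spec["as"]
)
print(joined[0]["publisher"][0]["name"])
```

The publisher's name is stored once, so changing it touches a single document, at the cost of a join at read time.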

Key quotes

  • "Embedded data models are often denormalized, because frequently-accessed data is duplicated in multiple collections."
  • "Use references to link related data when embedding would result in duplication of data."
  • "Query embedded documents using dot notation."
  • "Documents in MongoDB must be smaller than 16 megabytes."
  • "For large binary data, consider GridFS."