MongoDB and
Schema Design
Solutions Architect, MongoDB Inc.
Matias Cascallares
matias@mongodb.com
Who am I?
• Originally from Buenos Aires,
Argentina
• Solutions Architect at MongoDB
Inc., based in Singapore
• Software Engineer, most of my
experience in web environments
• In my toolbox I have Java, Python
and Node.js
Why do we need to look for new databases?
... and not so long ago
Hardware nowadays
HTTP POST
https://ec2.amazonaws.com/?
Action=StartInstances
&InstanceId.1=i-10a64379
&AUTHPARAMS
MONGODB IS A DOCUMENT-ORIENTED DATABASE
Document Databases
• General purpose data storage
• Dynamic schema / unstructured data
• Flexible query and indexing capabilities
• Consistent writes
• Aggregation capabilities
Show me a document
{
  "name" : "Matias Cascallares",
  "title" : "Solutions Architect",
  "email" : "matias@mongodb.com",
  "birth_year" : 1981,
  "location" : [ "Singapore", "Asia" ],
  "phone" : {
    "type" : "mobile",
    "number" : "+65 8591 3870"
  }
}
Document Model
• A MongoDB database is made up of collections
• Collections are composed of documents
• Each document is a set of key-value pairs
• No predefined schema
• Keys are always strings
• Values can be any (supported) data type
• Values can also be an array
• Values can also be a document
Benefits of the document model?
Flexibility
• Each document can have different fields
• No need for long migrations; easier to be agile
• Common structure enforced at application level
Arrays
• Documents can have fields with array values
• Ability to query and index array elements
• We can model relationships without separate tables or collections
Embedded documents
• Documents can have fields with document values
• Ability to query and index nested documents
• Semantics closer to object-oriented programming
Indexing an array of documents
How should I store my information?
SCHEMA DESIGN IS AN ART
https://www.flickr.com/photos/76377775@N05/11098637655/
Relational Schema Design
Focus on data storage
Document Schema Design
Focus on data usage
Implementing Relations
https://www.flickr.com/photos/ravages/2831688538
A task tracking app
Requirement #1
"We need to store user information like name, email
and their addresses… yes they can have more than
one."
— Bill, a project manager, contemporary
Relational

id  name         email                       title
1   Kate Powell  kate.powell@somedomain.com  Regional Manager

id  street                city      user_id
1   123 Sesame Street     Boston    1
2   123 Evergreen Street  New York  1
Let’s use the document model
> db.user.findOne( { email: "kate.powell@somedomain.com" } )
{
  _id: 1,
  name: "Kate Powell",
  email: "kate.powell@somedomain.com",
  title: "Regional Manager",
  addresses: [
    { street: "123 Sesame St", city: "Boston" },
    { street: "123 Evergreen St", city: "New York" }
  ]
}
Requirement #2
"We have to be able to store tasks, assign them to
users and track their progress…"
— Bill, a project manager, contemporary
Embedding tasks
> db.user.findOne( { email: "kate.powell@somedomain.com" } )
{
  name: "Kate Powell",
  // ... previous fields
  tasks: [
    {
      summary: "Contact sellers",
      description: "Contact agents to specify our needs and time constraints",
      due_date: ISODate("2014-08-25T08:37:50.465Z"),
      status: "NOT_STARTED"
    },
    { /* another task */ }
  ]
}
Embedding tasks
• Tasks are unbounded items: initially we do not know how many tasks we are going to have
• Over time, a user can end up with thousands of tasks
• Maximum document size in MongoDB: 16 MB!
• It is harder to access task information without a user context
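To get a feel for how an unbounded embedded array relates to the 16 MB limit, here is a rough back-of-the-envelope sketch in plain JavaScript (Node.js). It assumes the JSON length of a task is a usable stand-in for its stored size, which is only an approximation because MongoDB stores BSON, not JSON. The names `MAX_DOCUMENT_BYTES`, `sampleTask`, `approxTaskBytes` and `approxTasksBeforeLimit` are made up for this illustration.

```javascript
// 16 MB document size limit mentioned on the slide.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024;

// A task shaped like the embedded example above (illustrative values).
const sampleTask = {
  summary: "Contact sellers",
  description: "Contact agents to specify our needs and time constraints",
  due_date: new Date("2014-08-25T08:37:50.465Z"),
  status: "NOT_STARTED",
};

// Assumption: JSON byte length approximates the stored size.
// The real BSON size will differ, so treat the result as an order of magnitude.
const approxTaskBytes = Buffer.byteLength(JSON.stringify(sampleTask), "utf8");
const approxTasksBeforeLimit = Math.floor(MAX_DOCUMENT_BYTES / approxTaskBytes);

console.log(
  `~${approxTaskBytes} bytes per task, roughly ${approxTasksBeforeLimit} tasks before the limit`
);
```

The exact number matters less than the fact that it is finite: a document that keeps growing will eventually hit the ceiling, and it becomes slower to read and rewrite long before that.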
Referencing tasks
> db.user.findOne({_id: 1})
{
  _id: 1,
  name: "Kate Powell",
  email: "kate.powell@...",
  title: "Regional Manager",
  addresses: [
    { /* address 1 */ },
    { /* address 2 */ }
  ]
}
> db.task.findOne({user_id: 1})
{
  _id: 5,
  summary: "Contact sellers",
  description: "Contact agents to specify our ...",
  due_date: ISODate(),
  status: "NOT_STARTED",
  user_id: 1
}
Referencing tasks
• Tasks are unbounded items and our schema supports
that
• Requires application-level joins
• Remember to create proper indexes (e.g. user_id)
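An application-level join could look roughly like the sketch below. It uses in-memory arrays as stand-ins for the `user` and `task` collections so it runs without a database; `findUserWithTasks` and the extra sample tasks are hypothetical and not part of the original deck. Against a real deployment, each lookup would be a separate query, and the `user_id` lookup is the one that benefits from an index.

```javascript
// In-memory stand-ins for the user and task collections (illustrative data).
const users = [{ _id: 1, name: "Kate Powell", title: "Regional Manager" }];
const tasks = [
  { _id: 5, summary: "Contact sellers", status: "NOT_STARTED", user_id: 1 },
  { _id: 6, summary: "Review offers", status: "NOT_STARTED", user_id: 1 },
  { _id: 7, summary: "Unrelated task", status: "DONE", user_id: 2 },
];

// Hypothetical helper: one lookup for the user, a second lookup for the
// tasks that reference it, then the application stitches the results together.
function findUserWithTasks(userId) {
  const user = users.find((u) => u._id === userId);
  if (!user) return null;
  const userTasks = tasks.filter((t) => t.user_id === userId);
  return { ...user, tasks: userTasks };
}

const kate = findUserWithTasks(1);
console.log(kate.tasks.map((t) => t.summary));
```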
Embedding vs Referencing
One-to-many relations
• Embed when there are only a few items on the 'many' side
• Embed when you have some control over the number of items on the 'many' side
• Reference when you cannot control the number of items on the 'many' side
• Reference when you need to access the 'many' side items outside the scope of the parent entity
Many-to-many relations
• These can be implemented with two one-to-many
relations with the same considerations
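One way to picture a many-to-many relation built from one-to-many pieces is sketched below, again with in-memory arrays instead of real collections. It assumes a task can have several assignees, stored as a small `user_ids` array on the task; that field, the second user, and both helper functions are hypothetical and only serve the illustration.

```javascript
// In-memory stand-ins for two collections (illustrative data).
const users = [
  { _id: 1, name: "Kate Powell" },
  { _id: 2, name: "Second User" }, // hypothetical extra user
];
const tasks = [
  { _id: 5, summary: "Contact sellers", user_ids: [1, 2] },
  { _id: 6, summary: "Review offers", user_ids: [1] },
];

// User -> tasks: unbounded side, resolved by querying the task collection.
function tasksForUser(userId) {
  return tasks.filter((t) => t.user_ids.includes(userId));
}

// Task -> users: small, bounded list of references kept on the task itself.
function usersForTask(taskId) {
  const task = tasks.find((t) => t._id === taskId);
  if (!task) return [];
  return users.filter((u) => task.user_ids.includes(u._id));
}

console.log(tasksForUser(1).map((t) => t.summary));
console.log(usersForTask(5).map((u) => u.name));
```

The same embed-or-reference questions apply to each direction separately: the few assignees per task fit in an array, while the potentially large set of tasks per user is reached through a query.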
RECIPE #1
USE EMBEDDING FOR ONE-TO-FEW RELATIONS
RECIPE #2
USE REFERENCING FOR ONE-TO-MANY RELATIONS
Working with arrays
https://www.flickr.com/photos/kishjar/10747531785
Arrays are great!
List of sorted elements
> db.numbers.insert({
    _id: "even",
    values: [0, 2, 4, 6, 8]
  });
> db.numbers.insert({
    _id: "odd",
    values: [1, 3, 5, 7, 9]
  });
Access based on position
> db.numbers.find({_id: "even"}, {values: {$slice: [2, 3]}})
{
  _id: "even",
  values: [4, 6, 8]
}
> db.numbers.find({_id: "odd"}, {values: {$slice: -2}})
{
  _id: "odd",
  values: [7, 9]
}
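The two `$slice` forms above can be read as "skip 2, then take 3" and "take the last 2". A plain-JavaScript emulation of that behavior is sketched below; `sliceProjection` is a hypothetical helper written for this illustration, not a MongoDB API, and it only mirrors how the projection picks elements from the stored array.

```javascript
// Hypothetical helper that mimics the $slice projection on a plain array.
function sliceProjection(values, spec) {
  if (Array.isArray(spec)) {
    // $slice: [skip, limit] -> skip elements, then return up to `limit` of them.
    const [skip, limit] = spec;
    // A negative skip counts from the end of the array.
    const start = skip < 0 ? Math.max(values.length + skip, 0) : skip;
    return values.slice(start, start + limit);
  }
  // $slice: n -> first n elements; $slice: -n -> last n elements.
  return spec >= 0 ? values.slice(0, spec) : values.slice(spec);
}

const even = [0, 2, 4, 6, 8];
const odd = [1, 3, 5, 7, 9];

console.log(sliceProjection(even, [2, 3])); // [ 4, 6, 8 ]
console.log(sliceProjection(odd, -2));      // [ 7, 9 ]
```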
Access based on values
// is number 2 even or odd?
> db.numbers.find( { values : 2 } )
{
  _id: "even",
  values: [0, 2, 4, 6, 8]
}
Like sorted sets
> db.numbers.find( { _id: "even" } )
{
  _id: "even",
  values: [0, 2, 4, 6, 8]
}
> db.numbers.update(
    { _id: "even" },
    { $addToSet: { values: 10 } }
  );
Several times…!
> db.numbers.find( { _id: "even" } )
{
  _id: "even",
  values: [0, 2, 4, 6, 8, 10]
}
Array update operators
• $pop
• $push
• $pull
• $pullAll
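As a quick reminder of what these operators do to an array, the sketch below emulates them on a plain JavaScript array, together with `$addToSet` from the previous slide. The `arrayOps` object and its functions are hypothetical helpers for illustration only; in MongoDB the operators are applied server-side as part of an update.

```javascript
// Remove, in place, every element for which `shouldRemove` returns true.
function removeWhere(values, shouldRemove) {
  for (let i = values.length - 1; i >= 0; i--) {
    if (shouldRemove(values[i])) values.splice(i, 1);
  }
}

// Hypothetical helpers mimicking the array update operators on a plain array.
const arrayOps = {
  // $push: always appends, duplicates allowed.
  push(values, item) { values.push(item); },
  // $addToSet: appends only when the value is not already present.
  addToSet(values, item) { if (!values.includes(item)) values.push(item); },
  // $pop: 1 removes the last element, -1 removes the first.
  pop(values, direction) { if (direction === 1) values.pop(); else values.shift(); },
  // $pull: removes every element equal to the given value.
  pull(values, item) { removeWhere(values, (v) => v === item); },
  // $pullAll: removes every element that appears in the given list.
  pullAll(values, items) { removeWhere(values, (v) => items.includes(v)); },
};

const values = [0, 2, 4, 6, 8];
arrayOps.addToSet(values, 10);
arrayOps.addToSet(values, 10); // second call is a no-op
const afterAddToSet = [...values];
arrayOps.push(values, 10);     // $push does not check for duplicates
const afterPush = [...values];
arrayOps.pull(values, 10);     // removes both copies of 10
arrayOps.pop(values, -1);      // removes the first element
arrayOps.pullAll(values, [2, 8]);
console.log(afterAddToSet, afterPush, values);
```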
But…
Storage
{
  _id: 1,
  name: "Nike Pump Air 180",
  tags: ["sports", "running"]
}
db.inventory.update(
  { _id: 1 },
  { $push: { tags: "shoes" } }
)
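[Diagram: storage blocks labeled DocA, DocB, DocC and Empty]
Storage
[Diagram: storage blocks labeled DocA, DocB, DocC and a second DocB; index entries labeled IDX; annotation: 86 bytes]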
Why is it expensive to move a document?
1. We need to write the document in another location ($$)
2. We need to mark the original position as free for new
documents ($)
3. We need to update all those index entries pointing to the
moved document to the new location ($$$)
Considerations with arrays
• Limited number of items
• Avoid document movements
• Document movements can be delayed with a padding factor
• Document movements can be mitigated with pre-allocation
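Pre-allocation means creating the document at its final shape up front, so later updates overwrite existing slots instead of appending new elements. A minimal sketch of the idea follows, using a made-up example of one document per day with 24 hourly counters; the document shape and the `newDailyCounters` and `recordHit` helpers are assumptions for illustration and do not come from the original slides.

```javascript
// Hypothetical example: one document per day, with all 24 hourly slots
// created at insert time so the array never has to grow afterwards.
function newDailyCounters(day) {
  return { _id: day, hours: new Array(24).fill(0) };
}

// Later updates only overwrite an existing slot.
function recordHit(doc, hour) {
  doc.hours[hour] += 1;
}

const doc = newDailyCounters("2014-08-25");
const slotsBefore = doc.hours.length;

recordHit(doc, 8);
recordHit(doc, 8);
recordHit(doc, 17);

// The number of array elements is unchanged by the updates.
console.log(doc.hours.length === slotsBefore); // true
```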
RECIPE #3
AVOID EMBEDDING LARGE ARRAYS
RECIPE #4
USE DATA MODELS THAT MINIMIZE THE NEED FOR DOCUMENT GROWTH
Denormalization
https://www.flickr.com/photos/ross_strachan/5146307757
Denormalization
"…is the process of attempting to optimise the
read performance of a database by adding
redundant data …"
— Wikipedia
Products and comments
> db.product.find( { _id: 1 } )
{
  _id: 1,
  name: "Nike Pump Air Force 180",
  tags: ["sports", "running"]
}
> db.comment.find( { product_id: 1 } )
{ score: 5, user: "user1", text: "Awesome shoes" }
{ score: 2, user: "user2", text: "Not for me.." }
Denormalizing
> db.product.find({_id: 1})
{
  _id: 1,
  name: "Nike Pump Air Force 180",
  tags: ["sports", "running"],
  comments: [
    { user: "user1", text: "Awesome shoes" },
    { user: "user2", text: "Not for me.." }
  ]
}
> db.comment.find({product_id: 1})
{ score: 5, user: "user1", text: "Awesome shoes" }
{ score: 2, user: "user2", text: "Not for me.." }
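The price of the denormalized read is that every new comment now has to be written in two places. The sketch below shows that write path with in-memory arrays standing in for the `product` and `comment` collections; the `addComment` helper is hypothetical and a real application would issue two separate writes to the database.

```javascript
// In-memory stand-ins for the product and comment collections.
const products = [
  { _id: 1, name: "Nike Pump Air Force 180", tags: ["sports", "running"], comments: [] },
];
const comments = [];

// Hypothetical helper: store the full comment and a reduced embedded copy.
function addComment(productId, comment) {
  const product = products.find((p) => p._id === productId);
  if (!product) throw new Error(`unknown product ${productId}`);
  // Write 1: the full comment goes to the comment collection.
  comments.push({ product_id: productId, ...comment });
  // Write 2: a redundant copy (user and text only) is embedded in the product,
  // so the product page can be rendered from a single document.
  product.comments.push({ user: comment.user, text: comment.text });
}

addComment(1, { score: 5, user: "user1", text: "Awesome shoes" });
addComment(1, { score: 2, user: "user2", text: "Not for me.." });

const productPage = products.find((p) => p._id === 1);
console.log(productPage.comments);
```

Reads of the product page become a single lookup, while each write costs twice as much and the two copies have to be kept in sync.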
RECIPE #5
DENORMALIZE TO AVOID APP-LEVEL JOINS
RECIPE #6
DENORMALIZE ONLY WHEN YOU HAVE A HIGH READ TO WRITE RATIO
Bucketing
https://www.flickr.com/photos/97608671@N02/13558864555/
What’s the idea?
• Reduce number of documents to be retrieved
• Fewer documents to retrieve means fewer disk seeks
• Using arrays we can store more than one entity per
document
• We group things that are accessed together
An example
Comments are shown in buckets of 2 comments
A ‘read more’ button loads the next 2 comments
Bucketing comments
> db.comments.find({post_id: 123})
    .sort({sequence: -1})
    .limit(1)
{
  _id: 1,
  post_id: 123,
  sequence: 8, // this acts as a page number
  comments: [
    {user: "user1@somedomain.com", text: "Awesome shoes.."},
    {user: "user2@somedomain.com", text: "Not for me.."}
  ] // we store two comments per doc, fixed size bucket
}
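The write side of this pattern is not shown on the slide. One possible shape is sketched below with an in-memory array standing in for the `comments` collection: a new comment goes into the latest bucket for the post, and a new bucket with the next `sequence` is started once the current one is full. The `BUCKET_SIZE` constant and the `latestBucket` and `addComment` helpers are assumptions made for this illustration.

```javascript
const BUCKET_SIZE = 2; // two comments per document, as in the example above
const buckets = [];    // in-memory stand-in for the comments collection

// Mirrors find({post_id}).sort({sequence: -1}).limit(1).
function latestBucket(postId) {
  const forPost = buckets.filter((b) => b.post_id === postId);
  forPost.sort((a, b) => b.sequence - a.sequence);
  return forPost[0] || null;
}

// Hypothetical helper: append to the latest bucket, or open a new one when full.
function addComment(postId, comment) {
  let bucket = latestBucket(postId);
  if (!bucket || bucket.comments.length >= BUCKET_SIZE) {
    bucket = {
      post_id: postId,
      sequence: bucket ? bucket.sequence + 1 : 1, // acts as a page number
      comments: [],
    };
    buckets.push(bucket);
  }
  bucket.comments.push(comment);
}

for (let i = 1; i <= 5; i++) {
  addComment(123, { user: `user${i}@somedomain.com`, text: `comment ${i}` });
}

// Five comments with a bucket size of two end up in three buckets.
console.log(buckets.length, latestBucket(123).sequence);
```

Loading the next page for the ‘read more’ button then means fetching the bucket with the next lower `sequence`, which is one document per click.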
RECIPE #7
USE BUCKETING TO STORE THINGS THAT ARE GOING TO BE ACCESSED AS A GROUP