MongoDB and
Schema Design
Solutions Architect, MongoDB Inc.
Matias Cascallares
matias@mongodb.com
Who am I?
• Originally from Buenos Aires,
Argentina
• Solutions Architect at MongoDB
Inc based in Singapore
• Software Engineer, most of my
experience in web environments
• In my toolbox I have Java, Python
and Node.js
Why do we need to look for new databases?
.. and not so long ago
Hardware nowadays
HTTP POST
https://ec2.amazonaws.com/?
Action=StartInstances
&InstanceId.1=i-10a64379
&AUTHPARAMS
MONGODB IS A DOCUMENT ORIENTED DATABASE
Document Databases
• General purpose data storage
• Dynamic schema / unstructured data
• Flexible query and indexing capabilities
• Consistent writes
• Aggregation capabilities
Show me a document
{
"name" : "Matias Cascallares",
"title" : "Solutions Architect",
"email" : "matias@mongodb.com",
"birth_year" : 1981,
"location" : [ "Singapore", "Asia"],
"phone" : {
"type" : "mobile",
"number" : "+65 8591 3870"
}
}
Document Model
• MongoDB is made up of collections
• Collections are composed of documents
• Each document is a set of key-value pairs
• No predefined schema
• Keys are always strings
• Values can be any (supported) data type
• Values can also be an array
• Values can also be a document
Benefits of
document
model ..?
Flexibility
• Each document can have different fields
• No need for long migrations, easier to be agile
• Common structure enforced at application level
Arrays
• Documents can have fields with array values
• Ability to query and index array elements
• We can model relationships with no need of different
tables or collections
Embedded documents
• Documents can have fields with document values
• Ability to query and index nested documents
• Semantics closer to object-oriented programming
Indexing an array of documents
How should I
store my
information?
SCHEMA DESIGN IS AN ART
https://www.flickr.com/photos/76377775@N05/11098637655/
Relational
Schema Design
Focus on
data
storage
Document
Schema Design
Focus on
data
usage
Implementing Relations
https://www.flickr.com/photos/ravages/2831688538
A task tracking app
Requirement #1
"We need to store user information like name, email
and their addresses… yes they can have more than
one."
— Bill, a project manager, contemporary
Relational
id  name         email                       title
1   Kate Powell  kate.powell@somedomain.com  Regional Manager

id  street                city      user_id
1   123 Sesame Street     Boston    1
2   123 Evergreen Street  New York  1
Let’s use the document model
> db.user.findOne( { email: "kate.powell@somedomain.com"} )
{
_id: 1,
name: "Kate Powell",
email: "kate.powell@somedomain.com",
title: "Regional Manager",
addresses: [
{ street: "123 Sesame St", city: "Boston" },
{ street: "123 Evergreen St", city: "New York" }
]
}
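A minimal sketch of how the two relational tables collapse into the single document shown above. The row arrays and the `toUserDocuments` helper are hypothetical, written in plain JavaScript so the sketch runs without a database.

```javascript
// Hypothetical rows, copied from the relational example.
const userRows = [
  { id: 1, name: "Kate Powell", email: "kate.powell@somedomain.com", title: "Regional Manager" }
];
const addressRows = [
  { id: 1, street: "123 Sesame Street", city: "Boston", user_id: 1 },
  { id: 2, street: "123 Evergreen Street", city: "New York", user_id: 1 }
];

// Build one document per user, nesting that user's addresses as an array.
// The foreign key (user_id) disappears: the nesting itself carries the relation.
function toUserDocuments(users, addresses) {
  return users.map((user) => ({
    _id: user.id,
    name: user.name,
    email: user.email,
    title: user.title,
    addresses: addresses
      .filter((address) => address.user_id === user.id)
      .map(({ street, city }) => ({ street, city }))
  }));
}

const userDocs = toUserDocuments(userRows, addressRows);
console.log(JSON.stringify(userDocs[0], null, 2));
```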
Requirement #2
"We have to be able to store tasks, assign them to
users and track their progress…"
— Bill, a project manager, contemporary
Embedding tasks
> db.user.findOne( { email: "kate.powell@somedomain.com"} )
{
name: "Kate Powell",
// ... previous fields
tasks: [
{
summary: "Contact sellers",
description: "Contact agents to specify our needs and time constraints",
due_date: ISODate("2014-08-25T08:37:50.465Z"),
status: "NOT_STARTED"
},
{ /* another task */ }
]
}
Embedding tasks
• Tasks are unbounded items: initially we do not know
how many tasks we are going to have
• Over time a user can end up with thousands of tasks
• Maximum document size in MongoDB: 16 MB!
• It is harder to access task information without a user
context
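A rough back-of-the-envelope sketch of what the 16 MB limit means for an embedded array. The base document size (1 KB) and average task size (300 bytes) are assumptions for illustration only, and per-element array overhead is ignored.

```javascript
// MongoDB's maximum document size: 16 MB.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Rough upper bound on the number of embedded items that fit in one document.
function maxEmbeddedItems(baseDocumentBytes, averageItemBytes) {
  return Math.floor((MAX_DOCUMENT_BYTES - baseDocumentBytes) / averageItemBytes);
}

// Assumed sizes: 1 KB of user fields, 300 bytes per task.
const estimatedMaxTasks = maxEmbeddedItems(1024, 300);
console.log(estimatedMaxTasks);
```

The bound is finite however the sizes are chosen, which is the point of the slide: an unbounded list cannot live inside a single document forever.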
Referencing tasks
> db.user.findOne({_id: 1})
{
_id: 1,
name: "Kate Powell",
email: "kate.powell@...",
title: "Regional Manager",
addresses: [
{ /* address 1 */ },
{ /* address 2 */ }
]
}
> db.task.findOne({user_id: 1})
{
_id: 5,
summary: "Contact sellers",
description: "Contact agents to specify our ...",
due_date: ISODate(),
status: "NOT_STARTED",
user_id: 1
}
Referencing tasks
• Tasks are unbounded items and our schema supports
that
• Application level joins
• Remember to create proper indexes (e.g. user_id)
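A minimal sketch of an application-level join, using in-memory arrays as stand-ins for the `user` and `task` collections. The second user, the second task and the helper names are hypothetical. Grouping tasks by `user_id` plays the role the index on `user_id` would play in the database.

```javascript
// In-memory stand-ins for the two collections (illustrative data).
const users = [
  { _id: 1, name: "Kate Powell" },
  { _id: 2, name: "Another User" }
];
const tasks = [
  { _id: 5, summary: "Contact sellers", status: "NOT_STARTED", user_id: 1 },
  { _id: 6, summary: "Review offers", status: "NOT_STARTED", user_id: 1 }
];

// Group tasks by user_id once, so each lookup is cheap.
function groupByUserId(taskList) {
  const byUser = new Map();
  for (const task of taskList) {
    if (!byUser.has(task.user_id)) byUser.set(task.user_id, []);
    byUser.get(task.user_id).push(task);
  }
  return byUser;
}

// The "join" happens in application code, not in the database.
function joinUsersWithTasks(userList, taskList) {
  const byUser = groupByUserId(taskList);
  return userList.map((user) => ({ ...user, tasks: byUser.get(user._id) || [] }));
}

const joined = joinUsersWithTasks(users, tasks);
console.log(joined.map((u) => `${u.name}: ${u.tasks.length} task(s)`));
```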
Embedding vs Referencing
One-to-many relations
• Embed when you have a small number of items on the 'many'
side
• Embed when you have some level of control over the
number of items on the 'many' side
• Reference when you cannot control the number of items
on the 'many' side
• Reference when you need to access 'many' side items
outside the scope of the parent entity
Many-to-many relations
• These can be implemented with two one-to-many
relations with the same considerations
RECIPE #1
USE EMBEDDING FOR ONE-TO-FEW RELATIONS
RECIPE #2
USE REFERENCING FOR ONE-TO-MANY RELATIONS
Working with arrays
https://www.flickr.com/photos/kishjar/10747531785
Arrays are
great!
List of sorted elements
> db.numbers.insert({
_id: "even",
values: [0, 2, 4, 6, 8]
});
> db.numbers.insert({
_id: "odd",
values: [1, 3, 5, 7, 9]
});
Access based on position
> db.numbers.find({_id: "even"}, {values: {$slice: [2, 3]}})
{
_id: "even",
values: [4, 6, 8]
}
> db.numbers.find({_id: "odd"}, {values: {$slice: -2}})
{
_id: "odd",
values: [7, 9]
}
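To make the two `$slice` forms above concrete, here is a small plain-JavaScript emulation. `sliceProjection` is a hypothetical helper, not a MongoDB API; it only mirrors the semantics used in the queries: a single number takes the first n (or, if negative, the last n) elements, and a `[skip, limit]` pair skips elements before taking `limit` of them.

```javascript
// Emulates the $slice projection on a plain array (illustrative helper).
function sliceProjection(values, spec) {
  if (Array.isArray(spec)) {
    const [skip, limit] = spec;
    // A negative skip counts from the end of the array.
    const start = skip < 0 ? Math.max(values.length + skip, 0) : skip;
    return values.slice(start, start + limit);
  }
  // Positive n: first n elements. Negative n: last |n| elements.
  return spec >= 0 ? values.slice(0, spec) : values.slice(spec);
}

const evenSlice = sliceProjection([0, 2, 4, 6, 8], [2, 3]);
const oddSlice = sliceProjection([1, 3, 5, 7, 9], -2);
console.log(evenSlice, oddSlice);
```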
Access based on values
// is number 2 even or odd?
> db.numbers.find( { values : 2 } )
{
_id: "even",
values: [0, 2, 4, 6, 8]
}
Like sorted sets
> db.numbers.find( { _id: "even" } )
{
_id: "even",
values: [0, 2, 4, 6, 8]
}
> db.numbers.update(
{ _id: "even"},
{ $addToSet: { values: 10 } }
);
Several times…!
> db.numbers.find( { _id: "even" } )
{
_id: "even",
values: [0, 2, 4, 6, 8, 10]
}
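The "several times" point is that `$addToSet` is idempotent: repeating the same update leaves a single copy of the value. A minimal in-memory sketch of that behaviour follows; `addToSet` here is a hypothetical plain-JavaScript helper, not the database operator. Note that it appends when the value is absent and does not re-sort the array.

```javascript
// Append the candidate only if it is not already present (illustrative helper).
function addToSet(values, candidate) {
  return values.includes(candidate) ? values : [...values, candidate];
}

let evenValues = [0, 2, 4, 6, 8];
// Apply the same update three times; only the first one changes the array.
for (let i = 0; i < 3; i++) {
  evenValues = addToSet(evenValues, 10);
}
console.log(evenValues);
```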
Array update operators
• $pop
• $push
• $pull
• $pullAll
But…
Storage
{
_id: 1,
name: "Nike Pump Air 180",
tags: ["sports", "running"]
}
> db.inventory.update(
{ _id: 1 },
{ $push: { tags: "shoes" } }
)
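[Diagram: DocA, DocB and DocC stored one after another, followed by empty space]
Storage
[Diagram: DocA, DocB and DocC plus a second DocB placed in the previously empty space; IDX entries shown under the documents; label "86 bytes"]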
Why is it expensive to move a doc?
1. We need to write the document in another location ($$)
2. We need to mark the original position as free for new
documents ($)
3. We need to update all those index entries pointing to the
moved document to the new location ($$$)
Considerations with arrays
• Keep the number of items limited
• Avoid document movements
• Document movements can be delayed with a padding factor
• Document movements can be mitigated with pre-allocation
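A toy model of why padding delays document movement. This is not MongoDB's storage engine: `ToyStore`, its slot sizing rule and the growth pattern are assumptions made purely for illustration. Each document gets a slot of `size * paddingFactor` bytes, and a move is counted whenever a document outgrows its slot.

```javascript
// Toy model only: tracks slot capacity per document and counts moves.
class ToyStore {
  constructor(paddingFactor) {
    this.paddingFactor = paddingFactor;
    this.slots = new Map(); // id -> { size, capacity }
    this.moves = 0;
  }

  insert(id, size) {
    this.slots.set(id, { size, capacity: Math.ceil(size * this.paddingFactor) });
  }

  // Grow a document; if it no longer fits, count a move and allocate a new slot.
  grow(id, extraBytes) {
    const slot = this.slots.get(id);
    slot.size += extraBytes;
    if (slot.size > slot.capacity) {
      this.moves += 1;
      slot.capacity = Math.ceil(slot.size * this.paddingFactor);
    }
  }
}

function countMoves(paddingFactor) {
  const store = new ToyStore(paddingFactor);
  store.insert("DocB", 86); // starting size borrowed from the 86-byte label on the storage slide
  for (let i = 0; i < 10; i++) {
    store.grow("DocB", 10); // ten small growths, like repeated $push calls
  }
  return store.moves;
}

const movesWithoutPadding = countMoves(1.0);
const movesWithPadding = countMoves(1.5);
console.log({ movesWithoutPadding, movesWithPadding });
```

With no padding every growth forces a move in this model; with spare room in the slot most growths fit in place, at the cost of extra allocated space.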
RECIPE #3
AVOID EMBEDDING LARGE ARRAYS
RECIPE #4
USE DATA MODELS THAT MINIMIZE THE NEED FOR DOCUMENT GROWTH
Denormalization
https://www.flickr.com/photos/ross_strachan/5146307757
Denormalization
"…is the process of attempting to optimise the
read performance of a database by adding
redundant data …"
— Wikipedia
Products and comments
> db.product.find( { _id: 1 } )
{
_id: 1,
name: "Nike Pump Air Force 180",
tags: ["sports", "running"]
}
> db.comment.find( { product_id: 1 } )
{ score: 5, user: "user1", text: "Awesome shoes" }
{ score: 2, user: "user2", text: "Not for me.." }
Denormalizing
> db.product.find({_id: 1})
{
_id: 1,
name: "Nike Pump Air Force 180",
tags: ["sports", "running"],
comments: [
{ user: "user1", text: "Awesome shoes" },
{ user: "user2", text: "Not for me.." }
]
}
> db.comment.find({product_id: 1})
{ score: 5, user: "user1", text: "Awesome shoes" }
{ score: 2, user: "user2", text: "Not for me.."}
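A minimal sketch of what denormalizing costs on the write path, using in-memory stand-ins for the `product` and `comment` collections. The `addComment` helper is hypothetical. Every new comment is written twice: the full record to the comments, and a trimmed copy (user and text only, as in the listing above) onto the product.

```javascript
// In-memory stand-ins for the two collections.
const products = new Map([
  [1, { _id: 1, name: "Nike Pump Air Force 180", tags: ["sports", "running"], comments: [] }]
]);
const comments = [];

// Two writes per comment: that redundancy is the price of the faster read.
function addComment(productId, comment) {
  comments.push({ product_id: productId, ...comment });
  const { user, text } = comment;
  products.get(productId).comments.push({ user, text });
}

addComment(1, { score: 5, user: "user1", text: "Awesome shoes" });
addComment(1, { score: 2, user: "user2", text: "Not for me.." });

// Rendering the product page now needs one lookup and no join.
const productPage = products.get(1);
console.log(JSON.stringify(productPage, null, 2));
```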
RECIPE #5
DENORMALIZE TO AVOID APP-LEVEL JOINS
RECIPE #6
DENORMALIZE ONLY WHEN YOU HAVE A HIGH READ TO WRITE RATIO
Bucketing
https://www.flickr.com/photos/97608671@N02/13558864555/
What’s the idea?
• Reduce number of documents to be retrieved
• Fewer documents to retrieve means fewer disk seeks
• Using arrays we can store more than one entity per
document
• We group things that are accessed together
An example
Comments are shown in
buckets of 2 comments
A 'read more' button
loads the next 2 comments
Bucketing comments
> db.comments.find({post_id: 123})
.sort({sequence: -1})
.limit(1)
{
_id: 1,
post_id: 123,
sequence: 8, // this acts as a page number
comments: [
{user: "user1@somedomain.com", text: "Awesome shoes.."},
{user: "user2@somedomain.com", text: "Not for me.."}
] // we store two comments per doc, fixed size bucket
}
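A minimal sketch of the write side of bucketing, using an in-memory array as a stand-in for the `comments` collection. The helper names and the sample comments are hypothetical. New comments go into the bucket with the highest `sequence`, and a new bucket is opened once that one holds two comments.

```javascript
const BUCKET_SIZE = 2; // two comments per document, as in the example above

// In-memory stand-in for the comments collection.
const buckets = [];

// Equivalent of find({post_id}).sort({sequence: -1}).limit(1).
function latestBucket(postId) {
  return buckets
    .filter((bucket) => bucket.post_id === postId)
    .sort((a, b) => b.sequence - a.sequence)[0];
}

// Append to the newest bucket, or open the next one when it is full.
function addBucketedComment(postId, comment) {
  let bucket = latestBucket(postId);
  if (!bucket || bucket.comments.length >= BUCKET_SIZE) {
    bucket = { post_id: postId, sequence: bucket ? bucket.sequence + 1 : 1, comments: [] };
    buckets.push(bucket);
  }
  bucket.comments.push(comment);
}

for (let i = 1; i <= 5; i++) {
  addBucketedComment(123, { user: `user${i}@somedomain.com`, text: `Comment ${i}` });
}
console.log(latestBucket(123));
```

Five comments end up in three documents, and each 'read more' click maps to fetching exactly one of them.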
RECIPE #7
USE BUCKETING TO STORE THINGS THAT ARE GOING TO BE ACCESSED AS A GROUP