Dbms Unit5 Notes

mongodb.org
Content

Part 1: Introduction & Basics
Part 2: CRUD
Part 3: Schema Design
Part 4: Indexes
Part 5: Aggregation
Part 6: Replication & Sharding
History
mongoDB = “Humongous DB”
Open-source
Document-based
“High performance, high availability”
Automatic scaling
C-P on CAP

-blog.mongodb.org/post/475279604/on-distributed-consistency-part-1
-mongodb.org/manual
Other NoSQL Types

Key/value (Dynamo)

Columnar/tabular (HBase)

Document (mongoDB)

http://www.aaronstannard.com/post/2011/06/30/MongoDB-vs-SQL-Server.aspx
Motivations
Problems with SQL

Rigid schema
Not easily scalable (designed for 1990s technology or worse)
Requires unintuitive joins

Perks of mongoDB
Easy interface with common languages (Java, JavaScript, PHP, etc.)
DB tech should run anywhere (VMs, cloud, etc.)
Keeps essential features of RDBMSs while learning from key-value noSQL systems

http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
Company Using mongoDB

“MongoDB powers Under Armour’s online store, and was chosen for its dynamic schema, ability to scale horizontally and perform multi-data center replication.”

http://www.mongodb.org/about/production-deployments/
-Steve Francia, http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
Data Model
Document-Based (max 16 MB)
Documents are in BSON format, consisting of
field-value pairs
Each document stored in a collection
Collections
Share a common set of indexes
Are like tables in relational databases
Documents in a collection do not need a uniform structure

-docs.mongodb.org/manual/
JSON

“JavaScript Object Notation”

Easy for humans to write/read, easy for computers to parse/generate
Objects can be nested
Built on two structures:
name/value pairs
ordered lists of values

http://json.org/
BSON
• “Binary JSON”
• Binary-encoded serialization of JSON-like docs
• Also allows “referencing”
• Embedded structure reduces need for joins
• Goals
– Lightweight
– Traversable
– Efficient (decoding and encoding)

http://bsonspec.org/
BSON Example
{
  "_id" : "37010",
  "city" : "ADAMS",
  "pop" : 2660,
  "state" : "TN",
  "councilman" : {
    "name" : "John Smith",
    "address" : "13 Scenic Way"
  }
}
BSON Types
Type                      Number
Double                    1
String                    2
Object                    3
Array                     4
Binary data               5
Object id                 7
Boolean                   8
Date                      9
Null                      10
Regular Expression        11
JavaScript                13
Symbol                    14
JavaScript (with scope)   15
32-bit integer            16
Timestamp                 17
64-bit integer            18
Min key                   255 (query with -1)
Max key                   127

The number can be used with the $type operator to query by type!

http://docs.mongodb.org/manual/reference/bson-types/
The _id Field

• By default, each document contains an _id field. This field has a number of special characteristics:
– Value serves as primary key for collection.
– Value is unique, immutable, and may be any non-array type.
– Default data type is ObjectId, which is “small, likely unique, fast to generate, and ordered.” Sorting on an ObjectId value is roughly equivalent to sorting on creation time.

http://docs.mongodb.org/manual/reference/bson-types/
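Why does sorting on an ObjectId roughly match creation order? The first 4 bytes of the 12-byte value hold the creation time in seconds since the Unix epoch. The sketch below is plain JavaScript, not shell code. It assumes the ObjectId is available as its 24-character hex string; objectIdToDate is a hypothetical helper and the two sample ids are made up for illustration.

```javascript
// Hypothetical helper: read the creation time embedded in an ObjectId.
// Assumes `hex` is the 24-character hexadecimal form of the ObjectId.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // The first 4 bytes (8 hex digits) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Two illustrative ids; the first one carries the later timestamp prefix.
const ids = ["507f1f78bcf86cd799439012", "507f1f77bcf86cd799439011"];

// Lowercase hex strings of equal length sort in the same order as their bytes,
// so a plain string sort puts the earlier-created id first.
const sorted = [...ids].sort();
console.log(sorted.map((id) => objectIdToDate(id).toISOString()));
```

In the mongo shell itself, an ObjectId value offers getTimestamp() for the same purpose.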
mongoDB vs. SQL

mongoDB                    SQL
Document                   Tuple
Collection                 Table/View
PK: _id Field              PK: Any Attribute(s)
Uniformity not Required    Uniform Relation Schema
Index                      Index
Embedded Structure         Joins
Shard                      Partition
CRUD
Create, Read, Update, Delete
Getting Started with mongoDB

To install mongoDB, go to this link and click on the appropriate OS and architecture:
http://www.mongodb.org/downloads

First, extract the files (preferably to the C drive).

Finally, create a data directory on C:\ for mongoDB to use, i.e. “md data” followed by “md data\db”.

http://docs.mongodb.org/manual/tutorial/install-mongodb-on-windows/
Getting Started with mongoDB (cont.)

Open your mongodb/bin directory and run mongod.exe to start the database server.

To establish a connection to the server, open another command prompt window, go to the same directory, and enter mongo.exe. This starts the mongodb shell. It’s that easy!

http://docs.mongodb.org/manual/tutorial/getting-started/
CRUD: Using the Shell

To check which db you’re using:    db
Show all databases:                show dbs
Switch db’s/make a new one:        use <name>
See what collections exist:        show collections

Note: db’s are not actually created until you insert data!
CRUD: Using the Shell (cont.)

To insert documents into a collection/make a new collection:

db.<collection>.insert(<document>)
<=>
INSERT INTO <table>
VALUES(<attributevalues>);
CRUD: Inserting Data

Insert one document

db.<collection>.insert({<field>:<value>})

Inserting a document with a field name new to the collection is inherently supported by the BSON model.

To insert multiple documents, use an array:

db.<collection>.insert([<document1>, <document2>])


CRUD: Querying

Done on collections.
Get all docs: db.<collection>.find()
Returns a cursor; the shell iterates over it and displays the first 20 results.
Add .limit(<number>) to limit results
SQL equivalent of find(): SELECT * FROM <table>;

Get one doc: db.<collection>.findOne()


CRUD: Querying

To match a specific value:

db.<collection>.find({<field>:<value>})

“AND”
db.<collection>.find({<field1>:<value1>,
<field2>:<value2>
})
SELECT *
FROM <table>
WHERE <field1> = <value1> AND <field2> = <value2>;
CRUD: Querying

OR
db.<collection>.find({ $or: [
{<field>:<value1>},
{<field>:<value2>} ]
})

SELECT *
FROM <table>
WHERE <field> = <value1> OR <field> = <value2>;

Checking for multiple values of same field

db.<collection>.find({<field>: {$in: [<value>, <value>]}})
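To make the meaning of these filter documents concrete, here is a toy in-memory matcher in plain JavaScript. It is only a sketch of the semantics, not how the server evaluates queries. It covers equality on top-level fields, the implicit AND, $or and $in, and nothing else. The functions matches and find, and the third sample document, are assumptions made up for illustration.

```javascript
// Toy model of find() filters; top-level fields only.
function matches(doc, filter) {
  // Every key in the filter must hold: this is the implicit AND.
  return Object.entries(filter).every(([key, cond]) => {
    if (key === "$or") {
      // $or takes an array of sub-filters; at least one must match.
      return cond.some((sub) => matches(doc, sub));
    }
    if (cond !== null && typeof cond === "object" && "$in" in cond) {
      // $in matches when the field equals any of the listed values.
      return cond.$in.includes(doc[key]);
    }
    return doc[key] === cond; // plain equality
  });
}

function find(collection, filter = {}) {
  return collection.filter((doc) => matches(doc, filter));
}

// Sample data in the style of the zip-code documents; the NY entry is invented.
const zipDocs = [
  { _id: "37010", city: "ADAMS", state: "TN", pop: 2660 },
  { _id: "35004", city: "ACMAR", state: "AL", pop: 6065 },
  { _id: "99999", city: "ADAMS", state: "NY", pop: 1200 },
];

console.log(find(zipDocs, { city: "ADAMS", state: "TN" }));             // AND
console.log(find(zipDocs, { $or: [{ state: "TN" }, { state: "AL" }] })); // OR
console.log(find(zipDocs, { state: { $in: ["AL", "NY"] } }));            // $in
```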
CRUD: Querying
Including/excluding document fields

Exclude <field2> from the results:
db.<collection>.find({<field1>:<value>}, {<field2>: 0})

Return only <field2> (plus _id, which is included by default):
db.<collection>.find({<field1>:<value>}, {<field2>: 1})

SELECT <field2>
FROM <table>
WHERE <field1> = <value>;

Find documents with or w/o field

db.<collection>.find({<field>: { $exists: true}})
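The difference between a 0 and a 1 in the second argument is easy to mix up, so here is a small plain-JavaScript model of it, together with $exists. This is a sketch under simplifying assumptions (top-level fields only, values limited to 0 and 1); project and fieldExists are hypothetical helpers, not shell functions.

```javascript
// Toy model of a projection document.
// {field: 1} keeps only the listed fields (plus _id unless _id is set to 0);
// {field: 0} keeps everything except the listed fields.
function project(doc, projection) {
  const including = Object.values(projection).some((v) => v === 1);
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    let keep;
    if (key === "_id") keep = projection._id !== 0;
    else if (including) keep = projection[key] === 1;
    else keep = projection[key] !== 0;
    if (keep) out[key] = value;
  }
  return out;
}

// Toy model of {field: {$exists: <bool>}}.
function fieldExists(doc, field, expected) {
  return Object.prototype.hasOwnProperty.call(doc, field) === expected;
}

const zipDoc = { _id: "37010", city: "ADAMS", pop: 2660, state: "TN" };

console.log(project(zipDoc, { pop: 0 })); // everything except pop
console.log(project(zipDoc, { pop: 1 })); // only _id and pop
console.log(fieldExists(zipDoc, "councilman", true));
```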


CRUD: Updating

db.<collection>.update(
{<field1>:<value1>}, //all docs in which field = value
{$set: {<field2>:<value2>}}, //set field to value
{multi:true} ) //update multiple docs

upsert: if true, creates a new doc when none matches search criteria.

UPDATE <table>
SET <field2> = <value2>
WHERE <field1> = <value1>;
CRUD: Updating

To remove a field

db.<collection>.update({<field>:<value>},
{ $unset: { <field>: 1}})

Replace all field-value pairs

db.<collection>.update({<field>:<value>},
{ <field>:<value>,
<field>:<value>})
*NOTE: This overwrites ALL the contents of a document,
even removing fields.
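The three update forms above ($set, $unset, full replacement) behave quite differently, which a small in-memory model makes visible. The sketch below is plain JavaScript, not the server's implementation. It assumes equality filters on top-level fields, ignores upsert, and uses invented sample data; the update function here is a hypothetical stand-in for the shell method.

```javascript
// Toy in-memory model of update(); returns the number of documents changed.
function update(collection, filter, change, options = {}) {
  let modified = 0;
  for (const doc of collection) {
    const isMatch = Object.entries(filter).every(([k, v]) => doc[k] === v);
    if (!isMatch) continue;
    if ("$set" in change || "$unset" in change) {
      // Operator form: touch only the named fields.
      Object.assign(doc, change.$set || {});
      for (const field of Object.keys(change.$unset || {})) delete doc[field];
    } else {
      // Replacement form: every field except _id is discarded first.
      for (const field of Object.keys(doc)) {
        if (field !== "_id") delete doc[field];
      }
      Object.assign(doc, change);
    }
    modified += 1;
    if (!options.multi) break; // without multi, only the first match changes
  }
  return modified;
}

// Invented sample data.
const people = [
  { _id: 1, name: "Ann", city: "Austin", age: 30 },
  { _id: 2, name: "Bob", city: "Austin", age: 41 },
];

update(people, { city: "Austin" }, { $set: { state: "TX" } }, { multi: true });
update(people, { name: "Ann" }, { $unset: { age: 1 } });
update(people, { name: "Bob" }, { name: "Robert" }); // drops city, age and state
console.log(people);
```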
CRUD: Removal

Remove all records where field = value

db.<collection>.remove({<field>:<value>})

DELETE FROM <table>
WHERE <field> = <value>;

As above, but only remove first document

db.<collection>.remove({<field>:<value>}, true)
CRUD: Isolation

• By default, all writes are atomic only on the level of a single document.
• This means that, by default, a write that affects multiple documents can be interleaved with other operations.
• You can isolate writes on an unsharded collection by adding $isolated:1 in the query area:
db.<collection>.remove({<field>:<value>,
$isolated: 1})
Schema Design
RDBMS          MongoDB
Database     ➜ Database
Table        ➜ Collection
Row          ➜ Document
Index        ➜ Index
Join         ➜ Embedded Document
Foreign Key  ➜ Reference
Intuition – why do databases exist in the first place?

Why can’t we just write programs that operate on objects?
Memory limit
We cannot rely on the OS’s page-based memory management alone to swap objects to and from disk

Why can’t we have the database operating on the same data structure as in the program?
That is where mongoDB comes in
Mongo is basically schema-free

The purpose of a schema in SQL is to meet the requirements of tables and quirky SQL implementations

Every “row” in a database “table” is a data structure, much like a “struct” in C, or a “class” in Java. A table is then an array (or list) of such data structures

So what we design in mongoDB is basically the same thing we design when we define a compound data type in JSON
There are some patterns

Embedding

Linking
Embedding & Linking
One to One relationship

Linking:
zip = {
  _id: 35004,
  city: "ACMAR",
  loc: [-86, 33],
  pop: 6065,
  State: "AL"
}

Council_person = {
  zip_id: 35004,
  name: "John Doe",
  address: "123 Fake St.",
  Phone: 123456
}

Embedding:
zip = {
  _id: 35004,
  city: "ACMAR",
  loc: [-86, 33],
  pop: 6065,
  State: "AL",
  council_person: {
    name: "John Doe",
    address: "123 Fake St.",
    Phone: 123456
  }
}
Example 2

MongoDB: The Definitive Guide,
By Kristina Chodorow and Mike Dirolf
Published: 9/24/2010
Pages: 216
Language: English
Publisher: O’Reilly Media, CA


One to many relationship - Embedding
book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}
One to many relationship – Linking
publisher = {
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}
book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly"
}
Linking vs. Embedding

Embedding is a bit like pre-joining data
Document level operations are easy for the server to handle
Embed when the “many” objects always appear with (are viewed in the context of) their parents.
Link when you need more flexibility
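The trade-off can be seen in how an application reads the publisher of a book under each design. The sketch below is plain JavaScript over in-memory objects, reusing the book and publisher example. The Map standing in for a publishers collection, both helper functions, and the changed location value are assumptions for illustration only.

```javascript
// In-memory stand-in for a publishers collection, keyed by _id.
const publishers = new Map([
  ["oreilly", { _id: "oreilly", name: "O'Reilly Media", location: "CA" }],
]);

const embeddedBook = {
  title: "MongoDB: The Definitive Guide",
  publisher: { name: "O'Reilly Media", location: "CA" },
};

const linkedBook = {
  title: "MongoDB: The Definitive Guide",
  publisher_id: "oreilly",
};

// Embedding: the publisher arrives with the book, so one read is enough.
function publisherNameEmbedded(book) {
  return book.publisher.name;
}

// Linking: the application follows the reference with a second read.
function publisherNameLinked(book, publisherCollection) {
  const publisher = publisherCollection.get(book.publisher_id);
  return publisher ? publisher.name : null;
}

console.log(publisherNameEmbedded(embeddedBook));
console.log(publisherNameLinked(linkedBook, publishers));

// Flexibility of linking: one edit to the publisher is seen by every linked book,
// while each embedded copy would have to be updated separately.
publishers.get("oreilly").location = "Sebastopol, CA";
console.log(publishers.get(linkedBook.publisher_id).location);
```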
Many to many relationship

Can put the relation in either one of the documents (embedding in one of the documents)

Focus on how the data is accessed and queried


Example
book = {
  title: "MongoDB: The Definitive Guide",
  authors : [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

db.books.find( { "authors.name" : "Kristina Chodorow" } )
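The query above reaches into an array of sub-documents using dot notation. A rough plain-JavaScript model of that lookup follows. It is a sketch of the idea, not the server's matching code; valuesAt and findByPath are hypothetical helpers and the second book is invented.

```javascript
// Toy resolver for dot notation such as "authors.name".
// When the walk reaches an array, every element is explored.
function valuesAt(value, path) {
  if (Array.isArray(value)) return value.flatMap((v) => valuesAt(v, path));
  if (path.length === 0) return [value];
  if (value === null || typeof value !== "object") return [];
  return valuesAt(value[path[0]], path.slice(1));
}

// A document matches when any value reached by the path equals the target.
function findByPath(collection, dottedPath, target) {
  const path = dottedPath.split(".");
  return collection.filter((doc) => valuesAt(doc, path).includes(target));
}

const books = [
  {
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "Kristina Chodorow" },
      { _id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
  // Invented second entry so that the filter has something to exclude.
  { title: "Another Book", authors: [{ _id: "someone", name: "Someone Else" }] },
];

const hits = findByPath(books, "authors.name", "Kristina Chodorow");
console.log(hits.map((book) => book.title));
```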


What is bad about SQL (semantically)
“Primary keys” of a database table are in essence persistent memory addresses for the object. The in-memory address may not be the same when the object is reloaded into memory. This is why we need primary keys.

A foreign key functions just like a pointer in C, persistently pointing to the primary key.

Whenever we need to dereference a pointer, we do a JOIN

This is not intuitive for programming, and JOINs are also time-consuming


Example 3

• Book can be checked out by one student at a time
• Student can check out many books
Modeling Checkouts

student = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... }
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  ...
}
Modeling Checkouts

student = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    { _id: "123456789", checked_out: "2012-10-15" },
    { _id: "987654321", checked_out: "2012-09-12" },
    ...
  ]
}
What is good about mongoDB?

find() is more semantically clear for programming

(map (lambda (b) b.title)
     (filter (lambda (p) (> p.pages 100)) Book))

De-normalization provides data locality, and
data locality provides speed


Part 4: Index in MongoDB
Before Index
What does the database normally do when we query?
Without an index, MongoDB must scan every document.
This is inefficient because it has to process a large volume of data
db.users.find( { score: { "$lt" : 30} } )
Definition of Index
Definition
Indexes are special data structures that
store a small portion of the collection’s
data set in an easy to traverse form.
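The cost difference for a query like score < 30 can be sketched in plain JavaScript by counting how many entries each approach has to examine. This is only an illustration: MongoDB indexes are B-trees, and a sorted array with binary search stands in for one here. The users data, the scoreIndex array and all function names are made up for the example.

```javascript
// Invented users collection: 1000 documents with scores from 0 to 99.
const users = Array.from({ length: 1000 }, (_, i) => ({
  _id: i,
  score: (i * 37) % 100,
}));

// Collection scan: every document is examined.
function scanLessThan(collection, limit) {
  let examined = 0;
  const result = collection.filter((doc) => {
    examined += 1;
    return doc.score < limit;
  });
  return { result, examined };
}

// Toy "index" on score: (key, position) pairs kept sorted by key.
const scoreIndex = users
  .map((doc, position) => ({ key: doc.score, position }))
  .sort((a, b) => a.key - b.key);

// Binary search for the first index entry whose key is >= limit.
function lowerBound(index, limit) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < limit) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Index scan: only the entries with key < limit are touched.
function indexedLessThan(collection, index, limit) {
  const end = lowerBound(index, limit);
  const result = index.slice(0, end).map((entry) => collection[entry.position]);
  return { result, examined: end };
}

const viaScan = scanLessThan(users, 30);
const viaIndex = indexedLessThan(users, scoreIndex, 30);
console.log(viaScan.examined, viaIndex.examined, viaIndex.result.length);
```

Both approaches return the same documents; the indexed version simply looks at far fewer entries to find them.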

[Figure: Diagram of a query that uses an index to select]

Index in MongoDB
Operations
Create index
db.users.ensureIndex( { score: 1 } )
(newer MongoDB versions use createIndex instead)
Show existing indexes
db.users.getIndexes()
Drop index
db.users.dropIndex( {score: 1} )
Explain
db.users.find().explain()
Returns a document that describes the process and indexes used for the query
Hint
db.users.find().hint({score: 1})
Override MongoDB’s default index selection
Index in MongoDB
Types
• Single Field Indexes
– db.users.ensureIndex( { score: 1 } )
• Compound Field Indexes
– db.users.ensureIndex( { userid: 1, score: -1 } )
• Multikey Indexes
– db.users.ensureIndex( { "addr.zip": 1 } )
Demo of indexes in MongoDB

Import Data
Create Index
  Single Field Index
  Compound Field Indexes
  Multikey Indexes
Show Existing Index
Hint
  Single Field Index
  Compound Field Indexes
  Multikey Indexes
Explain
Compare with data without indexes
Demo of indexes in MongoDB (cont.)
[Figure: comparison of a query on data without indexes and with indexes; panels “Without Index” and “With Index”]
Aggregation

Operations that process data records and return computed results.
MongoDB provides aggregation operations
Running data aggregation on the mongod instance simplifies application code and limits resource requirements.
Pipelines
Modeled on the concept of data processing pipelines.
Provides:
  filters that operate like queries
  document transformations that modify the form of the output document.
Provides tools for:
  grouping and sorting by field
  aggregating the contents of arrays, including arrays of documents
Can use operators for tasks such as calculating the average or concatenating a string.
Pipelines

$limit
$skip
$sort
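The pipeline idea (each stage takes documents in and passes documents on) can be sketched with ordinary array functions. The code below is plain JavaScript and only mimics the behaviour of stages such as $match, $group, $sort, $skip and $limit. The stage helpers, runPipeline and the sample data are all assumptions invented for this illustration.

```javascript
// Each stage is a function from an array of documents to a new array.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));
const skip = (n) => (docs) => docs.slice(n);
const limit = (n) => (docs) => docs.slice(0, n);

// Group by a key field and sum a numeric field into `total`.
const groupSum = (keyField, valueField) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc[keyField], (totals.get(doc[keyField]) || 0) + doc[valueField]);
  }
  return [...totals].map(([key, total]) => ({ _id: key, total }));
};

// Run the stages in order, feeding each stage's output to the next.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Invented sample data.
const cityDocs = [
  { city: "ADAMS", state: "TN", pop: 2660 },
  { city: "ACMAR", state: "AL", pop: 6065 },
  { city: "MEMPHIS", state: "TN", pop: 5000 },
  { city: "TINY", state: "AL", pop: 40 },
];

const biggestState = runPipeline(cityDocs, [
  match((doc) => doc.pop >= 100), // like $match
  groupSum("state", "pop"),       // like $group with $sum
  sortBy("total", -1),            // like $sort, descending
  limit(1),                       // like $limit
]);
console.log(biggestState);
```

Swapping limit(1) for skip(1) would return every group except the largest, which is how $skip and $limit are usually combined for paging.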
Map-Reduce
Has two phases:
  A map stage that processes each document and emits one or more objects for each input document
  A reduce phase that combines the output of the map operation.
An optional finalize stage for final modifications to the result
Uses custom JavaScript functions
Provides greater flexibility but is less efficient and more complex than the aggregation pipeline
Can have output sets that exceed the 16 megabyte output limitation of the aggregation pipeline.
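The map, reduce and finalize stages can be sketched in a few lines of plain JavaScript. This is an in-memory illustration, not MongoDB's mapReduce command: there the map function reads the document through this and calls a global emit, whereas here emit is passed in as a parameter. The mapReduce function and the orders data are invented for the example.

```javascript
// Toy in-memory map-reduce with an optional finalize step.
function mapReduce(docs, mapFn, reduceFn, finalizeFn = (key, value) => value) {
  const emitted = new Map();
  // Map stage: each document may emit any number of (key, value) pairs.
  for (const doc of docs) {
    mapFn(doc, (key, value) => {
      if (!emitted.has(key)) emitted.set(key, []);
      emitted.get(key).push(value);
    });
  }
  // Reduce stage: combine all values emitted for the same key,
  // then let finalize adjust the reduced value.
  const output = [];
  for (const [key, values] of emitted) {
    output.push({ _id: key, value: finalizeFn(key, reduceFn(key, values)) });
  }
  return output;
}

// Invented sample data.
const orders = [
  { customer: "ann", amount: 20 },
  { customer: "bob", amount: 5 },
  { customer: "ann", amount: 15 },
];

const totals = mapReduce(
  orders,
  (doc, emit) => emit(doc.customer, doc.amount),          // map
  (key, values) => values.reduce((sum, v) => sum + v, 0), // reduce
  (key, total) => ({ total, large: total > 30 })          // finalize
);
console.log(totals);
```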
Single Purpose Aggregation Operations
Special purpose database commands:
returning a count of matching documents
returning the distinct values for a field
grouping data based on the values of a field.
Aggregate documents from a single collection.
Lack the flexibility and capabilities of the
aggregation pipeline and map-reduce.
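What the three single purpose operations compute can be shown with short plain-JavaScript equivalents. These are in-memory sketches only; the helper names count, distinct and groupCount, and the inventory data, are assumptions made up for illustration and are not the database commands themselves.

```javascript
// Invented sample data.
const inventory = [
  { item: "pen", color: "blue", qty: 10 },
  { item: "pen", color: "red", qty: 4 },
  { item: "ink", color: "blue", qty: 7 },
];

// Count of matching documents.
const count = (docs, predicate = () => true) => docs.filter(predicate).length;

// Distinct values of one field, in order of first appearance.
const distinct = (docs, field) => [...new Set(docs.map((doc) => doc[field]))];

// Number of documents per value of one field.
const groupCount = (docs, field) => {
  const groups = {};
  for (const doc of docs) groups[doc[field]] = (groups[doc[field]] || 0) + 1;
  return groups;
};

console.log(count(inventory, (doc) => doc.color === "blue"));
console.log(distinct(inventory, "item"));
console.log(groupCount(inventory, "color"));
```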
Replication & Sharding

Image source: http://mongodb.in.th


Replication

What is replication?
Purpose of replication/redundancy
Fault tolerance
Availability
Increase read capacity
Replication in MongoDB

Replica Set Members
Primary
  Read, Write operations
Secondary
  Asynchronous Replication
  Can be primary
Arbiter
  Voting
  Can’t be primary
Delayed Secondary
  Can’t be primary
Replication in MongoDB

Automatic Failover
  Heartbeats
  Elections
The Standard Replica Set Deployment
Deploy an Odd Number of Members
Rollback
Security
  SSL/TLS
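The advice to deploy an odd number of members follows from how elections work: a primary needs votes from a strict majority of the voting members. The arithmetic can be sketched in plain JavaScript; majority and faultTolerance are hypothetical helper names used only for this illustration.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be unavailable while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(`members=${n} majority=${majority(n)} tolerated failures=${faultTolerance(n)}`);
}
```

Going from 3 to 4 members raises the majority but not the number of failures the set can survive, so the extra even member adds cost without adding fault tolerance.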
Demo for Replication
Sharding

What is sharding?
Purpose of sharding
  Horizontal scaling out
Query Routers
  mongos
Shard keys
  Range based sharding
  Cardinality
  Avoid hotspotting
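Range based sharding and hotspotting can be illustrated with a tiny routing table in plain JavaScript. This is a conceptual sketch of what a query router does with a shard key, not mongos code. The chunk boundaries, shard names and both routing functions are invented for the example.

```javascript
// Invented chunk ranges on a numeric shard key; min inclusive, max exclusive.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" },
];

// Route a single shard key value to the shard that owns its chunk.
function routeByKey(value) {
  const chunk = chunks.find((c) => value >= c.min && value < c.max);
  return chunk.shard;
}

// A range query [low, high) only needs the shards whose chunks overlap it.
function routeRange(low, high) {
  return chunks.filter((c) => c.min < high && c.max > low).map((c) => c.shard);
}

console.log(routeByKey(150));
console.log(routeRange(50, 120));

// Hotspot illustration: an ever-increasing key sends every new insert
// to the shard holding the last chunk.
const insertTargets = [201, 202, 203, 204].map(routeByKey);
console.log(insertTargets);
```

A shard key with low cardinality has a related problem: with only a few distinct values there are only a few possible chunks, so data cannot be spread over many shards.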
Demo for Sharding
Thanks
