Mongodb-Unit 5
NOSQL
History
● mongoDB = “Humongous DB”
○ Open-source
○ Document-based
○ “High performance, high availability”
○ Automatic scaling
○ C-P on CAP
-blog.mongodb.org/post/475279604/on-distributed-consistency-part-1
-mongodb.org/manual
Other NoSQL Types
Key/value (Dynamo)
Columnar/tabular (HBase)
Document (mongoDB)
http://www.aaronstannard.com/post/2011/06/30/MongoDB-vs-SQL-Server.aspx
Motivations
🠶 Problems with SQL
🠶 Rigid schema
🠶 Not easily scalable (designed for 90’s technology or worse)
🠶 Requires unintuitive joins
🠶 Perks of mongoDB
http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
Companies Using mongoDB
http://www.mongodb.org/about/production-deployments/
-Steve Francia, http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
NoSQL Distinguishing Characteristics
Large data volumes
Schema-less
CAP Theorem
Consistency, Availability, Partition tolerance
Consistency
• all nodes see the same data at the same time – Wikipedia
Availability
• node failures do not prevent survivors from continuing to operate
Partition Tolerance
• the system continues to operate despite arbitrary message loss
Outline
Difference Between SQL and NoSQL
Study of Open Source NoSQL Database
MongoDB Installation, Execution
Open Source
Key-Value Store
– Hash table of keys
Document Store
– stores documents made up of tagged elements
Other Non-SQL Databases
• XML Databases
NoSQL Example: Column Store
Each storage block contains data from only one column
Example: Hadoop/HBase
• http://hadoop.apache.org/
• Yahoo, Facebook
NoSQL Examples: Key-Value Store
• Hash tables of Keys
MongoDB
What is MongoDB ?
• Scalable, high-performance, open-source, document-oriented database.
• Web 2.0, Media, SAAS, Gaming
• Caching and High Scalability
• HealthCare, Finance, Telecom, Government
Not great for?
• Highly Transactional Applications.
No complex joins
-docs.mongodb.org/manual/
JSON
http://json.org/
BSON
• “Binary JSON”
• Binary-encoded serialization of JSON-like docs
• Also allows “referencing”
• Embedded structure reduces need for joins
• Goals
– Lightweight
– Traversable
– Efficient (decoding and encoding)
http://bsonspec.org/
BSON Example
{
"_id" : "37010",
"city" : "Pune",
"pop" : 2660,
"state" : "MH",
"councilman" : {
name: "Smith",
address: "Pune-12"
}
}
BSON Types
Type                      Number
Double                    1
String                    2
Object                    3
Array                     4
Binary data               5
Object id                 7
Boolean                   8
Date                      9
Null                      10
Regular Expression        11
JavaScript                13
Symbol                    14
JavaScript (with scope)   15
32-bit integer            16
Timestamp                 17
64-bit integer            18
Min key                   255
Max key                   127
The number can be used with the $type operator to query by type!
http://docs.mongodb.org/manual/reference/bson-types/
The _id Field
• By default, each document contains an _id field. This field has a number
of special characteristics:
– Value serves as primary key for collection.
– Value is unique, immutable, and may be any non-array type.
– Default data type is ObjectId, which is “small, likely unique, fast to generate, and
ordered.” Sorting on an ObjectId value is roughly equivalent to sorting on creation
time.
• Architecturally, by default the _id field is an ObjectId, one of
MongoDB's BSON types. The ObjectId is the primary key for the stored
document and is automatically generated when creating a new document
in a collection. The following values make up the full 12-byte
combination of every _id (quoted from MongoDB's documentation):
– "a 4-byte value representing the seconds since the Unix epoch,
– a 3-byte machine identifier,
– a 2-byte process id, and
– a 3-byte counter, starting with a random value."
http://docs.mongodb.org/manual/reference/bson-types/
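The 12-byte layout quoted above can be illustrated by splitting the 24-character hex form of an ObjectId into its parts. This is only a sketch in plain JavaScript: parseObjectId is a hypothetical helper (not a MongoDB driver function), and it assumes the byte layout exactly as quoted on this slide.

```javascript
// Hypothetical helper: split a 24-character hex ObjectId string into the
// four parts quoted above. Not part of any MongoDB driver.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters (12 bytes)");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // 4-byte timestamp
  return {
    seconds,                                  // seconds since the Unix epoch
    createdAt: new Date(seconds * 1000),      // same value as a Date
    machine: hex.slice(8, 14),                // 3-byte machine identifier
    process: hex.slice(14, 18),               // 2-byte process id
    counter: parseInt(hex.slice(18, 24), 16)  // 3-byte counter
  };
}

const parts = parseObjectId("507c7f79bcf86cd7994f6c0e");
console.log(parts.createdAt.toISOString(), parts.machine, parts.process);
```

Because the timestamp occupies the leading bytes, comparing two ObjectId values roughly compares their creation times, which is why sorting on _id approximates sorting by creation time.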
MongoDB Terminologies for RDBMS concepts
RDBMS          MongoDB
Database       Database
Table, View    Collection
Row            Document (JSON, BSON)
Column         Field
Index          Index
Join           Embedded Document
Foreign Key    Reference
Partition      Shard
JSON
“JavaScript Object Notation”
Built on
• name/value pairs
• Ordered list of values
http://json.org/
BSON Example
{
"_id" : "37010",
"City" : "Nashik",
"Pin" : 423201,
"state" : "MH",
"Postman" : {
name: "Ramesh Jadhav",
address: "Panchavati"
}
}
MongoDB
CRUD Operations
Data Types of MongoDB
• Integer
• Date
• Boolean
• Object ID
• String
• Null
• Arrays
Data Types
• String : This is the most commonly used datatype to store data. Strings in
MongoDB must be valid UTF-8.
• Integer : This type is used to store a numerical value. Integer can be 32 bit
or 64 bit depending upon your server.
• Boolean : This type is used to store a boolean (true/false) value.
• Double : This type is used to store floating point values.
• Min/Max keys : This type is used to compare a value against the lowest
and highest BSON elements.
• Arrays : This type is used to store arrays, lists, or multiple values in one
key.
• Timestamp : This type is used to store a timestamp. This can be handy for
recording when a document has been modified or added.
• Object : This datatype is used for embedded documents.
• Null : This type is used to store a Null value.
• Symbol : This datatype is used identically to a string; however, it is
generally reserved for languages that use a specific symbol type.
• Date : This datatype is used to store the current date or time in
UNIX time format. You can specify your own date and time by creating
a Date object and passing the day, month, and year to it.
• Object ID : This datatype is used to store the document’s ID.
• Binary data : This datatype is used to store binary data.
• Code : This datatype is used to store JavaScript code in a
document.
• Regular expression : This datatype is used to store a regular
expression.
Basic Database Operations
Database
collection
Basic Database Operations- Database
use <database name>
• Switches to the database named in the command
CRUD Operations - Insert
• The insert() Method:- To insert data into MongoDB collection,
you need to use MongoDB's insert() or save()method.
• Syntax
>db.COLLECTION_NAME.insert(document)
• Example
>db.stud.insert({name: "Jiya", age: 15})
CRUD Operations - Insert
• _id Field
• If the document does not specify an _id field, then MongoDB
will add the _id field and assign a unique ObjectId for the
document before inserting.
• The _id value must be unique within the collection to avoid
duplicate key error.
_id field
3 bytes - machine id
db.stud.insert({ Name: "Ankit", Rno: 1, Address: "Pune" })
CRUD Operations - Insert
• Insert Multiple Documents
db.stud.insert([
{ Name: "Ankit", Rno: 1, Address: "Pune" },
{ Name: "Sagar", Rno: 2 },
{ Name: "Neha", Rno: 3 }
])
CRUD Operations - Insert
• Insert Multicolumn attribute (embedded document)
db.stud.insert(
{
Name: "Ritu",
Address: { City: "Pune",
State: "MH" },
Rno: 6
}
)
CRUD Operations - Insert
• Insert Multivalued attribute (array)
db.stud.insert(
{
Name: "Sneha",
Hobbies: ["Singing", "Dancing", "Cricket"],
Rno: 8
}
)
CRUD Operations - Insert
• Insert Multivalued with Multicolumn attribute (array of embedded documents)
db.stud.insert(
{
Name: "Sneha",
Awards: [ { Award: "Dancing", Rank: "1st", Year: 2008 },
{ Award: "Drawing", Rank: "3rd", Year: 2010 },
{ Award: "Singing", Rank: "1st", Year: 2015 } ],
Rno: 9
}
)
CRUD Operations - Insert
db.source.copyTo(target)
CRUD Operations - Find
• The find() Method- To display data from MongoDB collection.
Displays all the documents in a non structured way.
• Syntax
>db.COLLECTION_NAME.find()
• The pretty() Method- To display the results in a formatted way,
you can use pretty() method.
• Syntax
>db.COLLECTION_NAME.find().pretty()
CRUD Operations - Find
$lte Matches values that are less than or equal to a specified value.
$ne Matches all values that are not equal to a specified value.
db.stud.find({name: "Jiya"}, {Rno: 1})
Shows the roll number of the student whose name is Jiya
(by default _id is also shown)
db.stud.find({name: "Jiya"}, {_id: 0, Rno: 1})
Shows the roll number of the student whose name is Jiya
(_id is not shown)
CRUD Operations – Find Examples for
Sort function
db.stud.find().sort( { Rno: 1 } )
Sort on the Rno field in ascending order (1)
db.stud.find().sort( { Rno: -1 } )
Sort on the Rno field in descending order (-1)
CRUD Operations – Find Examples of
Count functions
db.stud.find().count()
Returns the number of documents in the collection
db.stud.find({Rno:2}).count()
Returns the number of documents in the collection
that satisfy the given condition Rno = 2
CRUD Operations – Find Examples of
limit and skip
db.stud.find().limit(2)
Returns only first 2 documents
db.stud.find().skip(5)
Returns all documents except first 5
documents
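The effect of sort, count, limit, and skip can be pictured on an ordinary array. The sketch below is plain JavaScript, not shell code: the stud array holds made-up sample values, and sortBy is a hypothetical helper that only handles one numeric field.

```javascript
// In-memory stand-in for the stud collection (sample values are made up).
const stud = [
  { Name: "Ankit", Rno: 1 },
  { Name: "Sagar", Rno: 2 },
  { Name: "Neha", Rno: 3 },
  { Name: "Ritu", Rno: 6 },
  { Name: "Sneha", Rno: 8 }
];

// Hypothetical helper: mimics sort({ field: 1 }) or sort({ field: -1 })
// for a single numeric field, without modifying the input array.
function sortBy(docs, field, direction) {
  return [...docs].sort((a, b) => direction * (a[field] - b[field]));
}

const descending = sortBy(stud, "Rno", -1);          // like find().sort({ Rno: -1 })
const firstTwo = stud.slice(0, 2);                   // like find().limit(2)
const allButFirstTwo = stud.slice(2);                // like find().skip(2)
const page2 = sortBy(stud, "Rno", 1).slice(2, 4);    // like sort({ Rno: 1 }).skip(2).limit(2)
const count = stud.filter(d => d.Rno === 2).length;  // like find({ Rno: 2 }).count()

console.log(descending.map(d => d.Rno), page2.map(d => d.Name), count);
```

Combining skip and limit after a sort, as in page2, is the usual way to page through results.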
CRUD Operations – Find Examples on
embedded fields and multiple conditions
db.stud.find({"Address.City": "Pune"})
Finding in a Multicolumn attribute (field names are case-sensitive)
db.stud.find({name: "Riya", age: 20})
Find documents whose name is Riya and age is 20
CRUD Operations – Find Examples with in
and not in operator
db.stud.find({name: {$in: ["riya", "jiya"]}})
Find documents whose name is riya or jiya
db.stud.find({Rno: {$nin: [20, 25]}})
Find documents whose roll number is neither 20 nor 25
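The matching rule behind $in and $nin can be sketched with array filters in plain JavaScript. The students array and the matchIn / matchNin helpers are assumptions made for illustration only; the sketch also reflects that $nin matches documents that do not have the field at all.

```javascript
// Made-up sample documents; the last one has no Rno field.
const students = [
  { name: "riya", Rno: 20 },
  { name: "jiya", Rno: 21 },
  { name: "neha", Rno: 25 },
  { name: "amit" }
];

// $in: the field value must equal one of the listed values.
const matchIn = (field, values) => doc => values.includes(doc[field]);
// $nin: the field value is none of the listed values (or the field is missing).
const matchNin = (field, values) => doc => !values.includes(doc[field]);

const riyaOrJiya = students.filter(matchIn("name", ["riya", "jiya"]));
const not20or25 = students.filter(matchNin("Rno", [20, 25]));

console.log(riyaOrJiya.map(d => d.name), not20or25.map(d => d.name));
```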
CRUD Operations – Find Examples for
Distinct clause
db.stud.distinct("Address")
Find the different cities students come from
CRUD Operations – Find Examples similar
to like operator
db.stud.find({name:/^n/})
Find students whose name starts with n
db.stud.find({name:/n/})
Find students whose name contains n letter
db.stud.find({name:/n$/})
Find students whose name ends with n
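The three patterns above are ordinary JavaScript regular expressions, so their behaviour can be checked on a plain array of names. The names below are made-up sample values. Note that the patterns are case-sensitive: /^n/ does not match "Nitin" unless the i flag is added.

```javascript
const names = ["neha", "Nitin", "ankit", "sagar", "rohan"];

const startsWithN = names.filter(n => /^n/.test(n));    // like find({ name: /^n/ })
const containsN = names.filter(n => /n/.test(n));       // like find({ name: /n/ })
const endsWithN = names.filter(n => /n$/.test(n));      // like find({ name: /n$/ })
const startsWithAnyN = names.filter(n => /^n/i.test(n)); // i flag ignores case

console.log(startsWithN, containsN, endsWithN, startsWithAnyN);
```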
CRUD Operations – Find Examples
db.collection.stats()
db.collection.explain().find()
db.collection.explain().find().help()
CRUD Operations
Insert
Find
Update
Delete
CRUD Operations – Update
• Syntax
db.CollectionName.update(
<query/Condition>,
<update with $set or $unset>,
{
upsert: <boolean>,
multi: <boolean>,
}
)
CRUD Operations – Update
db.stud.update(
{ _id: 100 },
{ $unset: { age: 1 } })
• To remove the age field from a single document where _id = 100
CRUD Operations – Update
Examples
db.stud.update(
{ _id: 100 },
{ $set: { "marks.DBMS": 50 } })
• Set marks for the DBMS subject to 50 where _id = 100 (only one document is
updated)
db.stud.update(
{ class: "TE" },
{ $set: { "marks.DBMS": 50 } },
{ multi: true } )
• Set marks for the DBMS subject to 50 where class is TE (all documents that
match the condition are updated)
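The difference between a single update and a multi update, and the way $set follows a dotted path such as "marks.DBMS", can be sketched on an in-memory array. Everything here is an illustration in plain JavaScript: setPath and update are hypothetical helpers, the sample documents are made up, and only equality conditions and top-level $unset fields are handled.

```javascript
// In-memory stand-in for the stud collection (sample values are made up).
const stud = [
  { _id: 100, class: "TE", age: 20 },
  { _id: 101, class: "TE" },
  { _id: 102, class: "BE" }
];

// $set on a dotted path such as "marks.DBMS": create missing parents, then assign.
function setPath(doc, path, value) {
  const keys = path.split(".");
  const last = keys.pop();
  let parent = doc;
  for (const key of keys) {
    if (typeof parent[key] !== "object" || parent[key] === null) parent[key] = {};
    parent = parent[key];
  }
  parent[last] = value;
}

// Hypothetical helper mimicking update(query, change, { multi }).
// Returns the number of documents that matched and were processed.
function update(docs, query, change, options = {}) {
  let matched = 0;
  for (const doc of docs) {
    const matches = Object.entries(query).every(([k, v]) => doc[k] === v);
    if (!matches) continue;
    for (const [path, value] of Object.entries(change.$set || {})) setPath(doc, path, value);
    for (const field of Object.keys(change.$unset || {})) delete doc[field];
    matched += 1;
    if (!options.multi) break; // without multi, stop after the first match
  }
  return matched;
}

const unsetCount = update(stud, { _id: 100 }, { $unset: { age: 1 } });
const singleCount = update(stud, { class: "TE" }, { $set: { "marks.DBMS": 50 } });
const multiCount = update(stud, { class: "TE" }, { $set: { "marks.DBMS": 50 } }, { multi: true });
console.log(unsetCount, singleCount, multiCount, JSON.stringify(stud));
```

Without multi: true only the first matching document is touched, which is why singleCount is smaller than multiCount for the same condition.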
Find
Update
Delete
CRUD Operations – Remove
Indexing
Aggregation
Indexing
Indexes support the efficient execution of queries in MongoDB.
Indexing Types
Single Field Indexes
• A single field index only includes data from a single
field of the documents in a collection.
Using ensureIndex (older shell name for createIndex)
• Syntax: db.CollectionName.ensureIndex({KeyName: 1 or -1})
• Single: db.stud.ensureIndex({"name": 1})
• Compound: db.stud.ensureIndex({"address": 1, "name": -1})
Index Display
db.collection.getIndexes()
• Returns an array that holds a list of
documents that identify and describe the
existing indexes on the collection.
db.collection.getIndexStats()
• Displays a human-readable summary of aggregated
statistics about an index’s B-tree data structure.
• db.<collection>.getIndexStats( { index : "<index name>" } )
Index Drop
Syntax
• db.collection.dropIndexes() (drops all indexes except the one on _id)
• db.collection.dropIndex(index)
Example
• db.stud.dropIndexes()
• db.stud.dropIndex( { "name" : 1 } )
Indexing and Querying
• Create an ascending index on the field name for a collection
records:
db.records.createIndex( { name: 1 } )
• This index can support an ascending sort on name:
db.records.find().sort( { name: 1 } )
• The index can also support a descending sort on name:
db.records.find().sort( { name: -1 } )
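Why a single ascending index serves both sort directions can be pictured with a toy index: a list of keys kept in ascending order that is read forwards for an ascending sort and backwards for a descending one. This is a simplified sketch in plain JavaScript with made-up records; nameIndex is a hypothetical structure and says nothing about how MongoDB actually stores its B-tree.

```javascript
// Made-up sample documents for a records collection.
const records = [{ name: "Sagar" }, { name: "Ankit" }, { name: "Neha" }];

// Toy "index": each key with the position of its document, ordered by ascending key.
const nameIndex = records
  .map((doc, pos) => ({ key: doc.name, pos }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// Ascending sort: walk the index front to back.
const ascending = nameIndex.map(entry => records[entry.pos].name);
// Descending sort: walk the same index back to front.
const descending = [...nameIndex].reverse().map(entry => records[entry.pos].name);

console.log(ascending, descending);
```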
Indexing and
Querying
db.stud.findOne( {rno:2} ), using index {rno:1}
db.c.find().sort( {name:1,age:1} ),
using index {name:1,age:1}
Indexing
Aggregation
Aggregation
Aggregation operations process data records and return computed
results.
Syntax:
• >db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
Aggregation
• MongoDB’s aggregation framework is modeled on the concept
of data processing pipelines.
• Documents enter a multi-stage pipeline that transforms the
documents into an aggregated result.
• Other pipeline operations provide tools for grouping and
sorting documents by specific field or fields.
• In addition, pipeline stages can use operators for tasks such
as calculating the average or concatenating a string.
aggregate() method
Expression   Description
$sum         Sums up the defined value from all documents in the collection.
$avg         Calculates the average of all given values from all documents in the collection.
$min         Gets the minimum of the corresponding values from all documents in the collection.
$max         Gets the maximum of the corresponding values from all documents in the collection.
$first       Gets the first document from the source documents according to the grouping.
$last        Gets the last document from the source documents according to the grouping.
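What the accumulators in the table compute per group can be sketched in plain JavaScript. The student array below is made-up sample data and groupBy is a hypothetical helper; it groups on one field and tracks the sum, average, minimum, maximum, first, and last value of another field, in the order the documents are read.

```javascript
// Made-up sample documents for a student collection.
const student = [
  { name: "A", subject: "OSD", marks: 80 },
  { name: "B", subject: "DBMS", marks: 60 },
  { name: "C", subject: "OSD", marks: 50 },
  { name: "D", subject: "DBMS", marks: 90 }
];

// Hypothetical helper: one pass over the documents, one accumulator object per key.
function groupBy(docs, keyField, valueField) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    const v = doc[valueField];
    const g = groups.get(key);
    if (!g) {
      groups.set(key, { _id: key, sum: v, min: v, max: v, first: v, last: v, count: 1 });
    } else {
      g.sum += v;                    // $sum
      g.min = Math.min(g.min, v);    // $min
      g.max = Math.max(g.max, v);    // $max
      g.last = v;                    // $last (first stays as initially set)
      g.count += 1;
    }
  }
  // $avg is derived from the running sum and count.
  return [...groups.values()].map(g => ({ ...g, avg: g.sum / g.count }));
}

const result = groupBy(student, "subject", "marks");
console.log(result);
```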
Pipeline
Concept
FIRST()
db.student.aggregate([
{ $group: { _id: "$subject", marks: { $first: "$marks" } } }
]);
LAST()
db.student.aggregate([
{ $group: { _id: "$subject", marks: { $last: "$marks" } } }
]);
SUM() - Example 1
db.student.aggregate([
{ $group: { _id: "$subject", marks: { $sum: "$marks" } } }
]);
SUM() - Example 3
db.student.aggregate([
{ $match: { subject: "OSD" } },
{ $group: { _id: null, count: { $sum: 1 } } }
]);
Skip()
db.student.aggregate([
{ $match: { subject: "OSD" } },
{ $skip: 1 }
]);
Sort()
db.student.aggregate
([{ $match: {subject:"OSD"}},
{$sort:{marks:-1}}]);
db.student.aggregate
([{ $match: {subject:"OSD"}},
{$sort:{marks:1}}]);
Unwind()
• db.student.insert({rollno:9,name:"Anavi",marks:[80,30,50]});
• db.student.aggregate([{$unwind:"$marks"}])
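The effect of $unwind on the document inserted above can be sketched in plain JavaScript: each array element produces its own copy of the document. The unwind function is a hypothetical helper; its handling of missing and non-array fields follows what is understood to be the default behaviour in recent MongoDB versions and should be read as an assumption.

```javascript
// Hypothetical helper imitating the $unwind stage on one field.
function unwind(docs, field) {
  return docs.flatMap(doc => {
    const value = doc[field];
    if (value === undefined || value === null) return []; // no output document
    if (!Array.isArray(value)) return [doc];              // treated as a single element
    // One output document per array element (an empty array yields none).
    return value.map(element => ({ ...doc, [field]: element }));
  });
}

const unwound = unwind(
  [{ rollno: 9, name: "Anavi", marks: [80, 30, 50] }],
  "marks"
);
console.log(unwound);
```

The result holds three documents for Anavi, one per mark, which later stages such as $group can then aggregate.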
Map-Reduce
• Map-reduce is a data processing paradigm for
condensing large volumes of data into useful aggregated
results. For map-reduce operations, MongoDB provides the
mapReduce database command.
• Consider the following map-reduce operation:
Map-Reduce
In very simple terms, the mapReduce command takes 2
primary inputs, the mapper function and the reducer
function.
The mapper runs once per input document and emits key-value pairs.
Each key, together with the values emitted for it, is then fed
into the reducer, which processes the values.
Map-Reduce
Syntax
db.collection.mapReduce(
function() { emit(key, value); },  // Define map function
function(key, values) { return reduceFunction; },  // Define reduce function
{
out: collection,
query: document,
sort: document,
limit: number
}
)
Map-Reduce Syntax
Explanation
• The above map-reduce function will query the collection, and then map
the output documents to the emit key-value pairs. After this, it is
reduced based on the keys that have multiple values. Here, we have
used the following functions and parameters.
• > db.author.save({
"book_title" : "MongoDB Tutorial", "author_name" : "aparajita",
"status" : "active", "publish_year": "2016" })
• > db.author.save({
"book_title" : "Software Testing Tutorial", "author_name" :
"aparajita", "status" : "active", "publish_year": "2015" })
• > db.author.save({
"book_title" : "Node.js Tutorial", "author_name" : "Kritika",
"status" : "active", "publish_year": "2016" })
• > db.author.save({
"book_title" : "PHP7 Tutorial", "author_name" : "aparajita",
"status" : "passive", "publish_year": "2016" })
Example 1: MapReduce function
db.author.find()
Code:
db.author.mapReduce(
function() { emit(this.author_name, 1); },
function(key, values) { return Array.sum(values); },
{
query: { status: "active" },
out: "author_total"
})
• Output
{ "_id" : "aparajita", "value" : 2 }
{ "_id" : "Kritika", "value" : 1 }
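The flow of Example 1 (filter by query, map each document to a key-value pair, reduce the values per key) can be traced with an in-memory imitation in plain JavaScript. This is a sketch, not the MongoDB implementation: the mapReduce function here is hypothetical, emit is passed to the map function as an argument instead of being a global, and a plain array reduce stands in for the shell's Array.sum helper.

```javascript
// The four author documents saved above.
const author = [
  { book_title: "MongoDB Tutorial", author_name: "aparajita", status: "active", publish_year: "2016" },
  { book_title: "Software Testing Tutorial", author_name: "aparajita", status: "active", publish_year: "2015" },
  { book_title: "Node.js Tutorial", author_name: "Kritika", status: "active", publish_year: "2016" },
  { book_title: "PHP7 Tutorial", author_name: "aparajita", status: "passive", publish_year: "2016" }
];

// Hypothetical in-memory imitation of mapReduce with an equality-only query.
function mapReduce(docs, map, reduce, options = {}) {
  const grouped = new Map();
  const emit = (key, value) => {
    if (!grouped.has(key)) grouped.set(key, []);
    grouped.get(key).push(value);
  };
  const query = options.query || {};
  for (const doc of docs) {
    const matches = Object.entries(query).every(([k, v]) => doc[k] === v);
    if (matches) map.call(doc, emit); // the document is exposed as `this`
  }
  // reduce is only needed for keys that collected more than one value
  return [...grouped].map(([key, values]) => ({
    _id: key,
    value: values.length > 1 ? reduce(key, values) : values[0]
  }));
}

const authorTotal = mapReduce(
  author,
  function (emit) { emit(this.author_name, 1); },
  (key, values) => values.reduce((a, b) => a + b, 0),
  { query: { status: "active" } }
);
console.log(authorTotal);
```

The passive PHP7 document is filtered out by the query before mapping, so aparajita is counted twice and Kritika once, matching the output shown on the slide.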
Example 2: MapReduce function
• Consider the following document structure of Students
• > db.stud.mapReduce(
function() { emit(this.Name, 1); },
function(key, values) { return Array.sum(values); },
{ out: "Name_Total" }).find()
• > db.stud.mapReduce(
function() { emit(this.Name, this.Marks); },
function(key, values) { return Array.sum(values); },
{ out: "Total_Marks" }).find()
Example 2: MapReduce function
• > db.stud.mapReduce(
function() { emit(this.Name, this.Marks); },
function(key, values) { return Array.sum(values); },
{ out: "Name_Total" }).find().sort({ value: 1 })
• > db.stud.mapReduce(
function() { emit(this.Name, this.Marks); },
function(key, values) { return Array.sum(values); },
{ out: "Name_Total" }).find().limit(1)
• > db.stud.mapReduce(
function() { emit(this.Name, 1); },
function(key, values) { return Array.sum(values); },
{ query: { Marks: { $gt: 70 } }, out: "Name_Total" }).find()