Mongodb-Unit 5


Unit- 5

NOSQL
History
● mongoDB = “Humongous DB”
○ Open-source
○ Document-based
○ “High performance, high availability”
○ Automatic scaling
○ C-P on CAP

-blog.mongodb.org/post/475279604/on-distributed-consistency-part-1
-mongodb.org/manual
Other NoSQL Types

Key/value (Dynamo)

Columnar/tabular (HBase)

Document (mongoDB)

http://www.aaronstannard.com/post/2011/06/30/MongoDB-vs-SQL-Server.aspx
Motivations
🠶 Problems with SQL
  🠶 Rigid schema
  🠶 Not easily scalable (designed for 90's technology or worse)
  🠶 Requires unintuitive joins

🠶 Perks of mongoDB
  🠶 Easy interface with common languages (Java, JavaScript, PHP, etc.)
  🠶 DB tech should run anywhere (VMs, cloud, etc.)
  🠶 Keeps essential features of RDBMSs while learning from key-value NoSQL systems

http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
Company Using mongoDB

“MongoDB powers Under Armour’s online store, and was chosen for its dynamic schema, ability to scale horizontally and perform multi-data center replication.”

http://www.mongodb.org/about/production-deployments/
-Steve Francia, http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
NoSQL Distinguishing Characteristics
Large data volumes

August 14, 2024


• Google’s “big data”

Scalable replication and distribution
• Potentially thousands of machines
• Potentially distributed around the world

Queries need to return answers quickly

Schema-less

ACID transaction properties are not needed – BASE

CAP Theorem

Open source development
BASE Transactions
• Acronym contrived to be the opposite of ACID
  • Basically Available,
  • Soft state,
  • Eventually Consistent
• Characteristics
  • Weak consistency – stale data OK
  • Availability first
  • Best effort
  • Approximate answers OK
  • Aggressive (optimistic)
  • Simpler and faster
Brewer’s CAP Theorem

[Figure: the CAP theorem shown as three properties: Consistency, Availability, Partition tolerance]
Consistency
• All nodes see the same data at the same time – Wikipedia
• Client perceives that a set of operations has occurred all at once – Pritchett
• More like Atomic in ACID transaction properties
Availability
• Node failures do not prevent survivors from continuing to operate – Wikipedia
• Every operation must terminate in an intended response – Pritchett
Partition Tolerance
• The system continues to operate despite arbitrary message loss – Wikipedia
• Operations will complete, even if individual components are unavailable – Pritchett
Outline
Difference Between SQL and NoSQL
Study of Open Source NoSQL Database
MongoDB Installation
Basic CRUD operations
Execution
Open Source

Small upfront software costs

Suitable for large scale distribution on commodity hardware
NoSQL Database Types

Column Store – Each storage block contains data from only one column

Document Store – Stores documents made up of tagged elements

Key-Value Store – Hash table of keys
Other Non-SQL Databases
• XML Databases
• Graph Databases
• CODASYL Databases
• Object Oriented Databases
• Etc…
NoSQL Example: Column Store
Each storage block contains data from only one column

Example: Hadoop/HBase
• http://hadoop.apache.org/
• Yahoo, Facebook

Example: Ingres VectorWise
• Column Store integrated with an SQL database
• http://www.ingres.com/products/vectorwise
Column Store Comments
• More efficient than row (or document) store if:
  • Multiple rows/records/documents are inserted at the same time so updates of column blocks can be aggregated
  • Retrievals access only some of the columns in a row/record/document
NoSQL Examples: Key-Value Store
• Hash tables of Keys
• Values stored with Keys
• Fast access to small data values
• Example – Project-Voldemort
  • http://www.project-voldemort.com/
  • LinkedIn
• Example – MemcacheDB
  • http://memcachedb.org/
  • Backend storage is Berkeley DB
MongoDB
What is MongoDB ?
• Scalable, High-Performance, Open-source, Document-oriented database.

• Built for Speed

• Rich Document-based queries for Easy readability.

• Full Index Support for High Performance.

• Replication and Failover for High Availability.

• Auto Sharding for Easy Scalability.


Why use MongoDB?
• SQL was invented in the 70’s to store data.

• MongoDB stores documents (or) objects.

• Nowadays, everyone works with objects (Python/Ruby/Java/etc.)

• And we need databases to persist our objects. Then why not store objects directly?

• Embedded documents and arrays reduce the need for joins. No joins and no multi-document transactions. (Later MongoDB releases added $lookup and multi-document transactions.)
What is MongoDB great for?
• RDBMS replacement for Web Applications.

• Semi-structured Content Management.

• Real-time Analytics & High-Speed Logging.

• Caching and High Scalability.

Web 2.0, Media, SaaS, Gaming
HealthCare, Finance, Telecom, Government
Not great for?
• Highly Transactional Applications.

• Problems requiring SQL.

Some Companies using MongoDB in Production
Advantages of MongoDB

Schema-less: The number of fields, content and size of the document can differ from one document to another.

No complex joins

Data is stored as JSON-style documents

Index on any attribute

Replication and high availability


Data Model
🠶 Document-Based (max 16 MB)
🠶 Documents are in BSON format, consisting of field-value pairs
🠶 Each document stored in a collection
🠶 Collections
🠶 Have index set in common
🠶 Like tables of relational db’s.
🠶 Documents do not have to have uniform structure

-docs.mongodb.org/manual/
JSON

🠶 “JavaScript Object Notation”
🠶 Easy for humans to write/read, easy for computers to parse/generate
🠶 Objects can be nested
🠶 Built on
  🠶 name/value pairs
  🠶 Ordered list of values

http://json.org/
BSON
• “Binary JSON”
• Binary-encoded serialization of JSON-like docs
• Also allows “referencing”
• Embedded structure reduces need for joins
• Goals
– Lightweight
– Traversable
– Efficient (decoding and encoding)

http://bsonspec.org/
BSON Example
{
  "_id" : "37010",
  "city" : "Pune",
  "pop" : 2660,
  "state" : "MH",
  "councilman" : {
    "name" : "Smith",
    "address" : "Pune-12"
  }
}
BSON Types
Type                      Number
Double                    1
String                    2
Object                    3
Array                     4
Binary data               5
Object id                 7
Boolean                   8
Date                      9
Null                      10
Regular Expression        11
JavaScript                13
Symbol                    14
JavaScript (with scope)   15
32-bit integer            16
Timestamp                 17
64-bit integer            18
Min key                   255 (queried as -1 with $type)
Max key                   127

The number can be used with the $type operator to query by type!

http://docs.mongodb.org/manual/reference/bson-types/
The _id Field
• By default, each document contains an _id field. This field has a number
of special characteristics:
– Value serves as primary key for collection.
– Value is unique, immutable, and may be any non-array type.
– Default data type is ObjectId, which is “small, likely unique, fast to generate, and
ordered.” Sorting on an ObjectId value is roughly equivalent to sorting on creation
time.
• Architecturally, by default the _id field is an ObjectId, one of MongoDB's BSON types. The ObjectId is the primary key for the stored document and is automatically generated when creating a new document in a collection. The following values make up the full 12-byte combination of every _id (quoted from MongoDB's documentation): "a 4-byte value representing the seconds since the Unix epoch, a 3-byte machine identifier, a 2-byte process id, and a 3-byte counter, starting with a random value."
http://docs.mongodb.org/manual/reference/bson-types/
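The 12-byte layout quoted above can be illustrated by splitting an ObjectId's 24-character hex string by hand. This is only a sketch in plain JavaScript (no MongoDB server or driver involved): parseObjectId is a hypothetical helper, the sample value is one of the ObjectIds shown later in this deck, and the field split follows the legacy layout quoted on this slide (newer MongoDB versions replace the machine and process ids with a 5-byte random value).

```javascript
// Hypothetical helper (not a MongoDB API): split a 24-hex-character ObjectId
// string into the parts described by the legacy 12-byte layout.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // 4 bytes: seconds since the Unix epoch
  return {
    seconds,
    createdAt: new Date(seconds * 1000),         // Date expects milliseconds
    machineId: hex.slice(8, 14),                 // 3 bytes: machine identifier
    processId: parseInt(hex.slice(14, 18), 16),  // 2 bytes: process id
    counter: parseInt(hex.slice(18, 24), 16),    // 3 bytes: counter
  };
}

const parts = parseObjectId("59333022523476d644344db9");
console.log(parts.createdAt.toISOString(), parts.machineId, parts.processId, parts.counter);
```

Because the timestamp occupies the leading bytes, comparing two ObjectIds generated this way roughly compares their creation times, which is why sorting on _id approximates sorting by creation time.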
MongoDB Terminologies for
RDBMS concepts
RDBMS          MongoDB
Database       Database
Table, View    Collection
Row            Document (JSON, BSON)
Column         Field
Index          Index
Join           Embedded Document
Foreign Key    Reference
Partition      Shard
BSON Example
{
  "_id" : "37010",
  "City" : "Nashik",
  "Pin" : 423201,
  "state" : "MH",
  "Postman" : {
    "name" : "Ramesh Jadhav",
    "address" : "Panchavati"
  }
}
MongoDB
CRUD Operations
Data Types of MongoDB
• Integer
• Boolean
• Double
• String
• Arrays
• Date
• Binary data
• Object ID
• Null
Data Types
• String : This is the most commonly used datatype to store data. Strings in MongoDB must be valid UTF-8.
• Integer : This type is used to store a numerical value. Integers can be 32-bit or 64-bit depending upon your server.
• Boolean : This type is used to store a boolean (true/false) value.
• Double : This type is used to store floating point values.
• Min/Max keys : This type is used to compare a value against the lowest and highest BSON elements.
• Arrays : This type is used to store arrays, lists, or multiple values in one key.
• Timestamp : This type is used to store a timestamp. This can be handy for recording when a document has been modified or added.
• Object : This datatype is used for embedded documents.
Data Types
• Null : This type is used to store a Null value.
• Symbol : This datatype is used identically to a string; however, it's generally reserved for languages that use a specific symbol type.
• Date : This datatype is used to store the current date or time in UNIX time format. You can specify your own date and time by creating a Date object and passing the day, month, and year into it.
• Object ID : This datatype is used to store the document’s ID.
• Binary data : This datatype is used to store binary data.
• Code : This datatype is used to store JavaScript code in a document.
• Regular expression : This datatype is used to store a regular expression.
Basic Database Operations

Database

collection
Basic Database Operations- Database
use <database name>
• Switches to the database named in the command

db
• To check the currently selected database, use the command db

show dbs
• Displays the list of databases

db.dropDatabase()
• To drop the database
Basic Database Operations- Collection

db.createCollection(name)
• To create a collection
Ex:- db.createCollection("Stud")

show collections
• Lists the names of all collections in the current database

db.collectionname.insert({Key : Value})
• In MongoDB you don't need to create a collection explicitly. MongoDB creates the collection automatically when you insert a document.
Ex:- db.Stud.insert({Name: "Jiya"})

db.collection.drop()
• MongoDB's db.collection.drop() is used to drop a collection from the database.
Example:- db.Stud.drop()
CRUD Operations
Insert

Find

Update

Delete
CRUD Operations - Insert
• The insert() Method:- To insert data into a MongoDB collection, you need to use MongoDB's insert() or save() method.

• Syntax
>db.COLLECTION_NAME.insert(document)

• Example
>db.stud.insert({name: "Jiya", age: 15})
CRUD Operations - Insert
• _id Field
• If the document does not specify an _id field, then MongoDB
will add the _id field and assign a unique ObjectId for the
document before inserting.
• The _id value must be unique within the collection to avoid
duplicate key error.
_id Field

_id is a 12-byte field:
4 bytes – Current timestamp
3 bytes – Machine id
2 bytes – Process id of the MongoDB server
3 bytes – Incremental value


CRUD Operations - Insert
• Insert a Document without Specifying an _id Field
• db.stud.insert( { Name : "Reena", Rno: 15 } )
• db.stud.find()
{ "_id" : ObjectId("5063114bd386d8fadbd6b004"), "Name" : "Reena", "Rno" : 15 }

• Insert a Document Specifying an _id Field
• db.stud.insert( { _id: 10, Name : "Reena", Rno: 15 } )
• db.stud.find()
{ "_id" : 10, "Name" : "Reena", "Rno" : 15 }
CRUD Operations - Insert
• Insert a Single Document

db.stud.insert(
  { Name: "Ankit", Rno: 1, Address: "Pune" }
)
CRUD Operations - Insert
• Insert Multiple Documents

db.stud.insert([
  { Name: "Ankit", Rno: 1, Address: "Pune" },
  { Name: "Sagar", Rno: 2 },
  { Name: "Neha", Rno: 3 }
])
CRUD Operations - Insert
• Insert Multicolumn attribute (an embedded document)
db.stud.insert(
  {
    Name: "Ritu",
    Address: { City: "Pune", State: "MH" },
    Rno: 6
  }
)
CRUD Operations - Insert
• Insert Multivalued attribute (an array)
db.stud.insert(
  {
    Name: "Sneha",
    Hobbies: ["Singing", "Dancing", "Cricket"],
    Rno: 8
  }
)
CRUD Operations - Insert
• Insert Multivalued with Multicolumn attribute (an array of embedded documents)
db.stud.insert(
  {
    Name: "Sneha",
    Awards: [ { Award: "Dancing", Rank: "1st", Year: 2008 },
              { Award: "Drawing", Rank: "3rd", Year: 2010 },
              { Award: "Singing", Rank: "1st", Year: 2015 } ],
    Rno: 9
  }
)
CRUD Operations - Insert

db.source.copyTo(target)

Copies all documents from the source collection into the target collection.
If the target collection does not exist, MongoDB creates it.
(copyTo() is deprecated in later MongoDB versions.)
CRUD Operations - Find
• The find() Method- To display data from a MongoDB collection. Displays all the documents in a non-structured way.

• Syntax
>db.COLLECTION_NAME.find()
• The pretty() Method- To display the results in a formatted way, you can use the pretty() method.

• Syntax
>db.COLLECTION_NAME.find().pretty()
CRUD Operations - Find

db.stud.find()
• Select all documents in a collection in unstructured form

db.stud.find().pretty()
• Select all documents in a collection in structured form
CRUD Operations - Find

Specify Equality Condition
• Use the query document { <field>: <value> }
• Examples:
  • db.stud.find( { name: "Jiya" } )
  • db.stud.find( { _id: 5 } )
CRUD Operations – Find Comparison
Operators
Operator   Description
$eq        Matches values that are equal to a specified value.
$gt        Matches values that are greater than a specified value.
$gte       Matches values that are greater than or equal to a specified value.
$lt        Matches values that are less than a specified value.
$lte       Matches values that are less than or equal to a specified value.
$ne        Matches all values that are not equal to a specified value.
$in        Matches any of the values specified in an array.
$nin       Matches none of the values specified in an array.
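To make the operator semantics in the table above concrete, here is a rough plain-JavaScript sketch that evaluates a condition document such as { $gt: 0, $lt: 5 } against a single value. The names comparisonOps and matchesCondition are hypothetical helpers invented for this illustration; this is not how MongoDB implements queries, and it ignores details such as type ordering and array fields.

```javascript
// Hypothetical evaluator for a small subset of MongoDB-style comparison operators.
const comparisonOps = {
  $eq:  (value, arg) => value === arg,
  $ne:  (value, arg) => value !== arg,
  $gt:  (value, arg) => value > arg,
  $gte: (value, arg) => value >= arg,
  $lt:  (value, arg) => value < arg,
  $lte: (value, arg) => value <= arg,
  $in:  (value, arg) => arg.includes(value),
  $nin: (value, arg) => !arg.includes(value),
};

// A condition with several operators matches only if every operator holds.
function matchesCondition(value, condition) {
  return Object.entries(condition).every(([op, arg]) => {
    if (!(op in comparisonOps)) throw new Error("unsupported operator: " + op);
    return comparisonOps[op](value, arg);
  });
}

const rollNumbers = [1, 3, 5, 8];
const between = rollNumbers.filter((rno) => matchesCondition(rno, { $gt: 0, $lt: 5 }));
const notListed = rollNumbers.filter((rno) => matchesCondition(rno, { $nin: [3, 8] }));
console.log(between, notListed);
```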


CRUD Operations – Find Examples with
comparison operators

db.stud.find( { rno: { $gt: 5 } } )
Shows all documents whose rno > 5

db.stud.find( { rno: { $gt: 0, $lt: 5 } } )
Shows all documents whose rno is greater than 0 and less than 5
CRUD Operations – Find Examples to show
only particular columns

db.stud.find({name: "Jiya"}, {Rno: 1})
Shows the roll number of the student whose name is equal to Jiya (by default _id is also shown)

db.stud.find({name: "Jiya"}, {_id: 0, Rno: 1})
Shows the roll number of the student whose name is equal to Jiya (_id is not shown)
CRUD Operations – Find Examples for
Sort function

db.stud.find().sort( { Rno: 1 } )
Sort on the Rno field in ascending order (1)

db.stud.find().sort( { Rno: -1 } )
Sort on the Rno field in descending order (-1)
CRUD Operations – Find Examples of
Count functions

db.stud.find().count()
Returns the number of documents in the collection

db.stud.find({Rno: 2}).count()
Returns the number of documents in the collection that satisfy the given condition Rno = 2
CRUD Operations – Find Examples of
limit and skip

db.stud.find().limit(2)
Returns only first 2 documents

db.stud.find().skip(5)
Returns all documents except first 5
documents
CRUD Operations – Find Examples of
limit and skip

db.stud.find( { rno: { $gt: 5 } } ).limit(2)
Returns only the first 2 documents whose rno is greater than 5

db.stud.find( { rno: { $gt: 5 } } ).skip(5)
Returns all documents whose rno is greater than 5, except the first 5
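One way to picture limit() and skip() is as slicing an already-ordered result list, which is also how they combine into simple paging. The sketch below uses a plain JavaScript array standing in for query results; pageOf is a hypothetical helper, and with a real collection the corresponding shell calls would be find().skip(...).limit(...).

```javascript
// Plain array standing in for the rno values of documents with rno > 5.
const rnos = [6, 7, 8, 9, 10, 11, 12];

const firstTwo = rnos.slice(0, 2); // comparable to .limit(2)
const afterFive = rnos.slice(5);   // comparable to .skip(5)

// Hypothetical paging helper: skip the earlier pages, then limit to one page.
function pageOf(results, pageNumber, pageSize) {
  const skipCount = (pageNumber - 1) * pageSize;
  return results.slice(skipCount, skipCount + pageSize);
}

const secondPage = pageOf(rnos, 2, 3);
console.log(firstTwo, afterFive, secondPage);
```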
CRUD Operations – Find Examples

db.stud.findOne() - Find first document only

db.stud.find({"Address.City": "Pune"}) -
Finding in Multicolumned attribute (field names are case-sensitive)

db.stud.find({name: "Riya", age: 20})
Find documents whose name is Riya and age is 20
CRUD Operations – Find Examples with in
and not in operator

db.stud.find({name: {$in: ["riya", "jiya"]}})
Find documents whose name is riya or jiya

db.stud.find({Rno: {$nin: [20, 25]}})
Find documents whose roll number is neither 20 nor 25
CRUD Operations – Find Examples for
Distinct clause

db.stud.distinct("Address")
Find from which different cities students are coming
CRUD Operations – Find Examples similar
to like operator

db.stud.find({name:/^n/})
Find students whose name starts with n

db.stud.find({name:/n/})
Find students whose name contains the letter n

db.stud.find({name:/n$/})
Find students whose name ends with n
CRUD Operations – Find Examples

db.collection.stats()

db.collection.explain().find()

db.collection.explain().find().help()
CRUD Operations – Update
• Syntax
db.CollectionName.update(
<query/Condition>,
<update with $set or $unset>,
{
upsert: <boolean>,
multi: <boolean>,
}
)
CRUD Operations – Update

upsert
• If set to true, creates a new document if no matches are found.

multi
• If set to true, updates multiple documents that match the query criteria.
CRUD Operations – Update
Examples
db.stud.update(
  { _id: 100 },
  { age: 25 })
• Set age = 25 where _id is 100
• The whole document is replaced where the condition is matched, so the only field that remains (besides _id) is age: 25

db.stud.update(
  { _id: 100 },
  { $set: { age: 25 } })
• Set age = 25 where _id is 100
• Only the age field of the one matching document is updated

db.stud.update(
  { _id: 100 },
  { $unset: { age: 1 } })
• Removes the age field from the single document where _id = 100
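The difference between a replacement-style update and the $set / $unset operators is easy to miss, so here is a small sketch on a plain JavaScript object. The three functions are hypothetical stand-ins written for this illustration; they only imitate the effect the examples above describe and do not call MongoDB.

```javascript
const original = { _id: 100, name: "Jiya", age: 20 };

// Replacement-style update: every field except _id is replaced.
function replaceDocument(doc, replacement) {
  return { _id: doc._id, ...replacement };
}

// $set-style update: only the listed fields change, the rest are kept.
function setFields(doc, fields) {
  return { ...doc, ...fields };
}

// $unset-style update: the listed fields are removed.
function unsetFields(doc, fieldNames) {
  const copy = { ...doc };
  for (const fieldName of fieldNames) delete copy[fieldName];
  return copy;
}

const replaced = replaceDocument(original, { age: 25 }); // name is lost
const updated = setFields(original, { age: 25 });        // name is kept
const removed = unsetFields(original, ["age"]);          // age is gone
console.log(replaced, updated, removed);
```

In practice this is why update examples that should change a single field use $set: omitting it silently discards the other fields.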
CRUD Operations – Update
Examples
db.stud.update(
  { _id: 100 },
  { $set: { "marks.DBMS": 50 } })
• Set marks for the DBMS subject to 50 where _id = 100 (only one document is updated)

db.stud.update(
  { class: "TE" },
  { $set: { "marks.DBMS": 50 } },
  { multi: true })
• Set marks for the DBMS subject to 50 where class is TE (all documents that match the condition are updated)

db.stud.update(
  { class: "TE" },
  { $set: { "marks.DBMS": 50 } },
  { upsert: true })
• Set marks for the DBMS subject to 50 where class is TE (without multi: true, only the first matching document is updated)
• If no document matches the condition, a new document is inserted.
CRUD Operations – Update
Examples
db.stud.update({ }, { $inc: { age: 5 } })

db.stud.update({ }, { $set: { cadd: "Pune" } }, { multi: true })

db.stud.update({ }, { $rename: { "age": "Age" } }, { multi: true })
CRUD Operations – Remove

Remove All Documents
• db.inventory.remove({})

Remove All Documents that Match a Condition
• db.inventory.remove( { type : "food" } )

Remove a Single Document that Matches a Condition
• db.inventory.remove( { type : "food" }, 1 )
Outline

Indexing

Aggregation
Indexing
Indexes support the efficient execution of queries in MongoDB.
Indexing Types
Single Field Indexes
• A single field index only includes data from a single field of the documents in a collection.

Compound Indexes
• A compound index includes more than one field of the documents in a collection.

Multikey Indexes
• A multikey index is an index on an array field, adding an index key for each value in the array.

Geospatial Indexes and Queries
• Geospatial indexes support location-based searches.

Text Indexes
• Text indexes support search of string content in documents.

Hashed Index
• Hashed indexes maintain entries with hashes of the values of the indexed field and are used with sharded clusters to support hashed shard keys.
Index Properties
The properties you can specify when building indexes.

TTL Indexes
• The TTL index is used for TTL collections, which expire data after a period of time.

Unique Indexes
• A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field.

Sparse Indexes
• A sparse index does not index documents that do not have the indexed field.
Index Creation
Using createIndex
• db.CollectionName.createIndex( { KeyName: 1 or -1 } )

Using ensureIndex (an older alias of createIndex, deprecated in later MongoDB versions)
• db.CollectionName.ensureIndex( { KeyName: 1 or -1 } )

1 for ascending order
-1 for descending order
Index Creation
Using createIndex
• Single: db.stud.createIndex( { zipcode: 1 } )
• Compound: db.stud.createIndex( { dob: 1, zipcode: -1 } )
• Unique: db.stud.createIndex( { rollno: 1 }, { unique: true } )
• Sparse: db.stud.createIndex( { age: 1 }, { sparse: true } )

Using ensureIndex
• Single: db.stud.ensureIndex( { "name": 1 } )
• Compound: db.stud.ensureIndex( { "address": 1, "name": -1 } )
Index Display
db.collection.getIndexes()
• Returns an array that holds a list of documents that identify and describe the existing indexes on the collection.

db.collection.getIndexStats()
• Displays a human-readable summary of aggregated statistics about an index’s B-tree data structure.
• db.<collection>.getIndexStats( { index : "<index name>" } )
Index Drop
Syntax
• db.collection.dropIndexes()
• db.collection.dropIndex(index)

Example
• db.stud.dropIndexes() (drops all indexes except the one on _id)
• db.stud.dropIndex( { "name" : 1 } )
Indexing and Querying
• Create an ascending index on the field name for a collection records:
db.records.createIndex( { name: 1 } )
• This index can support an ascending sort on name:
db.records.find().sort( { name: 1 } )
• The index can also support a descending sort on name:
db.records.find().sort( { name: -1 } )
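Why a single ascending index can serve both sort directions can be sketched with an ordered list of keys: once the keys are kept in sorted order, a descending sort is just the same list read backwards. The code below is a simplified in-memory illustration in plain JavaScript (the nameIndex structure is hypothetical; a real MongoDB index is a B-tree, not an array).

```javascript
const records = [{ name: "Neha" }, { name: "Ankit" }, { name: "Sagar" }];

// Hypothetical "index": entries sorted by key, each pointing back to a document position.
const nameIndex = records
  .map((doc, position) => ({ key: doc.name, position }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// Ascending sort: walk the index forward.
const ascending = nameIndex.map((entry) => records[entry.position].name);

// Descending sort: walk the same index backward, no second index needed.
const descending = [...nameIndex].reverse().map((entry) => records[entry.position].name);

console.log(ascending, descending);
```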
Indexing and Querying
db.stud.findOne( {rno: 2} ), using index {rno: 1}

db.stud.find( {rno: 5} ), using index {rno: 1}

db.stud.find( {rno: {$in: [2, 3]}} ), using index {rno: 1}

db.stud.find( {age: {$gt: 15}} ), using index {age: 1}

db.stud.find( {age: {$gt: 2, $lt: 5}} ), using index {age: 1}

db.stud.count( {age: 19} ), using index {age: 1}

db.stud.distinct( "branch" ), using index {branch: 1}
Indexing and Querying
db.stud.find( {}, {name: 1, age: 1} ), using index {name: 1, age: 1}

db.c.find().sort( {name: 1, age: 1} ), using index {name: 1, age: 1}

db.stud.update( {age: 20}, {age: 19} ), using index {age: 1}

db.stud.remove( {name: "Jiya"} ), using index {name: 1}
Aggregation
Aggregation operations process data records and return computed results.

Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data.

For aggregation in MongoDB, use the aggregate() method.

Syntax:
• >db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
Aggregation
• MongoDB’s aggregation framework is modeled on the concept
of data processing pipelines.
• Documents enter a multi-stage pipeline that transforms the
documents into an aggregated result.
• Other pipeline operations provide tools for grouping and
sorting documents by specific field or fields.
• In addition, pipeline stages can use operators for tasks such
as calculating the average or concatenating a string.
aggregate() method
Expression   Description
$sum         Sums up the defined value from all documents in the collection.
$avg         Calculates the average of all given values from all documents in the collection.
$min         Gets the minimum of the corresponding values from all documents in the collection.
$max         Gets the maximum of the corresponding values from all documents in the collection.
$first       Gets the first document from the source documents according to the grouping.
$last        Gets the last document from the source documents according to the grouping.
Pipeline Concept

There is a set of possible stages, and each of them takes a set of documents as input and produces a resulting set of documents.
Possible stages in aggregation
• $project − Used to select some specific fields from a collection.
• $match − This is a filtering operation, so it can reduce the number of documents that are given as input to the next stage.
• $group − This does the actual aggregation as discussed above.
• $sort − Sorts the documents.
• $skip − With this, it is possible to skip forward in the list of documents by a given number of documents.
• $limit − This limits the number of documents to look at, by the given number starting from the current position.
• $unwind − This is used to unwind documents that use arrays. When using an array, the data is kind of pre-joined, and this operation undoes that to produce individual documents again. Thus this stage increases the number of documents for the next stage.
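The pipeline idea described above (each stage takes a set of documents and hands a new set to the next stage) can be sketched in plain JavaScript by treating every stage as a function from an array to an array. All names below (match, sortBy, skip, limit, runPipeline) are hypothetical helpers for this illustration and only loosely mirror $match, $sort, $skip and $limit; the sample documents are assumed data in the style of this deck's student collection.

```javascript
// Each stage is a function: array of documents in, array of documents out.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field])); // numeric fields only
const skip = (count) => (docs) => docs.slice(count);
const limit = (count) => (docs) => docs.slice(0, count);

// Run the stages in order, feeding each stage's output into the next one.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

const sample = [
  { name: "anusha", subject: "OSD", marks: 75 },
  { name: "ravi", subject: "WT", marks: 69 },
  { name: "Pravini", subject: "OSD", marks: 80 },
  { name: "Geeta", subject: "CN", marks: 90 },
];

// Highest-scoring OSD student: filter, sort descending, keep one document.
const topOsd = runPipeline(sample, [
  match((doc) => doc.subject === "OSD"),
  sortBy("marks", -1),
  limit(1),
]);

// Everything after the three lowest marks.
const afterThreeLowest = runPipeline(sample, [sortBy("marks", 1), skip(3)]);
console.log(topOsd, afterThreeLowest);
```

Placing the filtering stage first keeps the later stages working on fewer documents, which matches the remark above that $match reduces the input to the next stage.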
Collection creation to run practical
db.student.insert({Rollno: 1, name: 'Navin', subject: 'DBMS', marks: 78});
db.student.insert({Rollno: 2, name: 'anusha', subject: 'OSD', marks: 75});
db.student.insert({Rollno: 3, name: 'ravi', subject: 'WT', marks: 69});
db.student.insert({Rollno: 4, name: 'veena', subject: 'WT', marks: 70});
db.student.insert({Rollno: 5, name: 'Pravini', subject: 'OSD', marks: 80});
db.student.insert({Rollno: 6, name: 'Reena', subject: 'DBMS', marks: 50});
db.student.insert({Rollno: 7, name: 'Geeta', subject: 'CN', marks: 90});
db.student.insert({Rollno: 8, name: 'Akash', subject: 'CN', marks: 85});
MIN()
db.student.aggregate([
  { $group : { _id : "$subject", marks : { $min : "$marks" } } }
]);

SQL Equivalent Query
Select subject, min(marks) from student group by subject
MAX()
db.student.aggregate([
  { $group : { _id : "$subject", marks : { $max : "$marks" } } }
]);

SQL Equivalent Query
Select subject, max(marks) from student group by subject
AVG()
db.student.aggregate([
  { $group : { _id : "$subject", marks : { $avg : "$marks" } } }
]);

SQL Equivalent Query
Select subject, avg(marks) from student group by subject
FIRST()
db.student.aggregate([
  { $group : { _id : "$subject", marks : { $first : "$marks" } } }
]);
LAST()
db.student.aggregate([
  { $group : { _id : "$subject", marks : { $last : "$marks" } } }
]);
SUM() - Example 1
db.student.aggregate([
  { $group : { _id : "$subject", marks : { $sum : "$marks" } } }
]);

SQL Equivalent Query
Select subject, sum(marks) from student group by subject
SUM() - Example 2
db.student.aggregate([
  { $group : { _id : "$subject", Count : { $sum : 1 } } }
]);

SQL Equivalent Query
Select subject, count(*) from student group by subject
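To check by hand what the $group examples above compute, the same grouping can be reproduced on a plain JavaScript array. This is only a sketch: groupBySubject is a hypothetical helper, the data mirrors the student documents inserted earlier in the deck, and the field names min, max, sum, avg, count, first and last are chosen for this illustration.

```javascript
const students = [
  { Rollno: 1, name: "Navin", subject: "DBMS", marks: 78 },
  { Rollno: 2, name: "anusha", subject: "OSD", marks: 75 },
  { Rollno: 3, name: "ravi", subject: "WT", marks: 69 },
  { Rollno: 4, name: "veena", subject: "WT", marks: 70 },
  { Rollno: 5, name: "Pravini", subject: "OSD", marks: 80 },
  { Rollno: 6, name: "Reena", subject: "DBMS", marks: 50 },
  { Rollno: 7, name: "Geeta", subject: "CN", marks: 90 },
  { Rollno: 8, name: "Akash", subject: "CN", marks: 85 },
];

// Hypothetical helper: collect marks per subject, then apply each accumulator.
function groupBySubject(docs) {
  const groups = new Map(); // subject -> marks in input order
  for (const doc of docs) {
    if (!groups.has(doc.subject)) groups.set(doc.subject, []);
    groups.get(doc.subject).push(doc.marks);
  }
  return [...groups].map(([subject, marks]) => {
    const sum = marks.reduce((total, value) => total + value, 0);
    return {
      _id: subject,
      min: Math.min(...marks),        // like $min
      max: Math.max(...marks),        // like $max
      sum,                            // like $sum : "$marks"
      avg: sum / marks.length,        // like $avg
      count: marks.length,            // like $sum : 1
      first: marks[0],                // like $first
      last: marks[marks.length - 1],  // like $last
    };
  });
}

const bySubject = groupBySubject(students);
console.log(bySubject);
```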
$match
db.student.aggregate([
  { $match : { subject : "OSD" } }
]);

db.student.aggregate([
  { $match : { subject : "OSD" } },
  { $group : { _id : null, count : { $sum : 1 } } }
]);
SUM() - Example 3
db.student.aggregate([
  { $match : { subject : "OSD" } },
  { $group : { _id : null, count : { $sum : 1 } } }
]);

SQL Equivalent Query
Select count(*) from student where subject = 'OSD'
Limit() & Skip()
db.student.aggregate([
  { $match : { subject : "OSD" } },
  { $limit : 1 }
]);

db.student.aggregate([
  { $match : { subject : "OSD" } },
  { $skip : 1 }
]);
Sort()
db.student.aggregate([
  { $match : { subject : "OSD" } },
  { $sort : { marks : -1 } }
]);

db.student.aggregate([
  { $match : { subject : "OSD" } },
  { $sort : { marks : 1 } }
]);
Unwind()

If the following document, which has an array field, is in the collection:

• db.student.insert({rollno: 9, name: "Anavi", marks: [80, 30, 50]});

Using $unwind, the above document will be unwound into 3 different documents:

• db.student.aggregate([{$unwind: "$marks"}])
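What the unwound result looks like can be sketched in plain JavaScript: each array element produces one copy of the document with the array replaced by that element. The unwind function below is a hypothetical helper that assumes the field is always an array; it does not model how MongoDB treats missing or non-array fields.

```javascript
// Hypothetical helper: one output document per element of doc[field].
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((element) => ({ ...doc, [field]: element }))
  );
}

const withArray = [{ rollno: 9, name: "Anavi", marks: [80, 30, 50] }];
const unwound = unwind(withArray, "marks");
console.log(unwound);
```

Each of the three resulting documents keeps rollno and name and carries a single marks value, which is the shape later stages such as $group can then aggregate.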
Map-Reduce
• Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For map-reduce operations, MongoDB provides the mapReduce database command.
• Consider the following map-reduce operation:
Map-Reduce
In very simple terms, the mapReduce command takes 2 primary inputs, the mapper function and the reducer function.

A Mapper will start off by reading a collection of data and building a Map with only the required fields we wish to process and group them into one array based on the key.

And then this key value pair is fed into a Reducer, which will process the values.
Map-Reduce Syntax
db.collection.mapReduce(
  function() { emit(key, value); },                   // Define map function
  function(key, values) { return reduceFunction; },   // Define reduce function
  {
    out: collection,
    query: document,
    sort: document,
    limit: number
  }
)
Map-Reduce Syntax Explanation
• The above map-reduce function will query the collection, and then map the output documents to the emitted key-value pairs. After this, it is reduced based on the keys that have multiple values. Here, we have used the following functions and parameters.

• Map: – It is a JavaScript function. It is used to map a value with a key and produces a key-value pair.
• Reduce: – It is a JavaScript function. It is used to reduce or group together all the documents which have the same key.
• Out: – It is used to specify the location of the map-reduce query output.
• Query: – It is used to specify the optional selection criteria for selecting documents.
• Sort: – It is used to specify the optional sort criteria.
• Limit: – It is used to specify the optional maximum number of documents which are desired to be returned.
Example1 MapReduce function
• Consider the following document structure that stores book details author wise. The document stores author_name of the book author and the status of book.

• > db.author.save({
    "book_title" : "MongoDB Tutorial", "author_name" : "aparajita",
    "status" : "active", "publish_year" : "2016" })
• > db.author.save({
    "book_title" : "Software Testing Tutorial", "author_name" : "aparajita",
    "status" : "active", "publish_year" : "2015" })
• > db.author.save({
    "book_title" : "Node.js Tutorial", "author_name" : "Kritika",
    "status" : "active", "publish_year" : "2016" })
• > db.author.save({
    "book_title" : "PHP7 Tutorial", "author_name" : "aparajita",
    "status" : "passive", "publish_year" : "2016" })
Example1 MapReduce function
db.author.find()

{ "_id" : ObjectId("59333022523476d644344db9"), "book_title" : "MongoDB Tutorial", "author_name" : "aparajita", "status" : "active", "publish_year" : "2016" }

{ "_id" : ObjectId("59333031523476d644344dba"), "book_title" : "Software Testing Tutorial", "author_name" : "aparajita", "status" : "active", "publish_year" : "2015" }

{ "_id" : ObjectId("5933303e523476d644344dbb"), "book_title" : "Node.js Tutorial", "author_name" : "Kritika", "status" : "active", "publish_year" : "2016" }

{ "_id" : ObjectId("5933304b523476d644344dbc"), "book_title" : "PHP7 Tutorial", "author_name" : "aparajita", "status" : "passive", "publish_year" : "2016" }
Example1 MapReduce function
• Now, use the mapReduce function
  • To select all the active books,
  • Group them together on the basis of author_name, and
  • Then count the number of books by each author by using the following code in MongoDB.
Example1 MapReduce function
• Code:
db.author.mapReduce(
  function() { emit(this.author_name, 1); },
  function(key, values) { return Array.sum(values); },
  { query: { status: "active" }, out: "author_total" } ).find()

• Output
{ "_id" : "aparajita", "value" : 2 }
{ "_id" : "Kritika", "value" : 1 }

Code:
db.author.mapReduce(
  function() { emit(this.author_name, 1); },
  function(key, values) { return Array.sum(values); },
  { query: { status: "active" }, out: "author_total" })
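The flow of this example (filter by query, emit key-value pairs in the map step, group by key, reduce each group) can be traced without a database using a small simulation. The mapReduce function below is a hypothetical stand-in written for this illustration, not MongoDB's command: emit is passed in as an argument instead of being a global, the query is a plain predicate function, and Array.prototype.reduce replaces the legacy shell's Array.sum helper. It also assumes, as MongoDB documents, that the reduce function is not called for keys with a single value.

```javascript
// Hypothetical in-memory simulation of the map, group, and reduce steps.
function mapReduce(docs, mapFn, reduceFn, options = {}) {
  const query = options.query || (() => true);
  const emitted = new Map(); // key -> list of emitted values

  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };

  // Map step: the current document is available as `this`, as in the shell.
  for (const doc of docs.filter(query)) {
    mapFn.call(doc, emit);
  }

  // Reduce step: keys with one value are passed through unchanged.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduceFn(key, values),
  }));
}

const books = [
  { book_title: "MongoDB Tutorial", author_name: "aparajita", status: "active" },
  { book_title: "Software Testing Tutorial", author_name: "aparajita", status: "active" },
  { book_title: "Node.js Tutorial", author_name: "Kritika", status: "active" },
  { book_title: "PHP7 Tutorial", author_name: "aparajita", status: "passive" },
];

const authorTotal = mapReduce(
  books,
  function (emit) { emit(this.author_name, 1); },
  (key, values) => values.reduce((total, value) => total + value, 0),
  { query: (doc) => doc.status === "active" }
);
console.log(authorTotal);
```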
Example2 MapReduce function
• Consider the following document structure of Students

• > db.stud.insert({ Name: 'Amit', Marks: 80 })
• > db.stud.insert({ Name: 'Amit', Marks: 90 })
• > db.stud.insert({ Name: 'Shreya', Marks: 40 })
• > db.stud.insert({ Name: 'Neha', Marks: 80 })
• > db.stud.insert({ Name: 'Neha', Marks: 35 })
Example2 MapReduce function
• > db.stud.mapReduce(
    function() { emit(this.Name, 1); },
    function(key, values) { return Array.sum(values); },
    { out: "Name_Total" })

• > db.stud.mapReduce(
    function() { emit(this.Name, 1); },
    function(key, values) { return Array.sum(values); },
    { out: "Name_Total" }).find()

• > db.stud.mapReduce(
    function() { emit(this.Name, this.Marks); },
    function(key, values) { return Array.sum(values); },
    { out: "Total_Marks" }).find()
Example2 MapReduce function
• > db.stud.mapReduce(
    function() { emit(this.Name, this.Marks); },
    function(key, values) { return Array.sum(values); },
    { out: 'Name_Total' }).find().sort({ value: 1 })

• > db.stud.mapReduce(
    function() { emit(this.Name, this.Marks); },
    function(key, values) { return Array.sum(values); },
    { out: 'Name_Total' }).find().limit(1)

• > db.stud.mapReduce(
    function() { emit(this.Name, 1); },
    function(key, values) { return Array.sum(values); },
    { query: { Marks: { $gt: 70 } }, out: 'Name_Total' }).find()
