Mongodb-Unit 5

Uploaded by

vedantsalunke2021


Unit- 5

NOSQL
History
● mongoDB = “Humongous DB”
○ Open-source
○ Document-based
○ “High performance, high availability”
○ Automatic scaling
○ C-P on CAP

-blog.mongodb.org/post/475279604/on-distributed-consistency-part-1
-mongodb.org/manual
Other NoSQL Types

Key/value (Dynamo)

Columnar/tabular (HBase)

Document (mongoDB)

http://www.aaronstannard.com/post/2011/06/30/MongoDB-vs-SQL-Server.aspx
Motivations
🠶 Problems with SQL
  🠶 Rigid schema
  🠶 Not easily scalable (designed for 90’s technology or worse)
  🠶 Requires unintuitive joins

🠶 Perks of mongoDB
  🠶 Easy interface with common languages (Java, Javascript, PHP, etc.)
  🠶 DB tech should run anywhere (VM’s, cloud, etc.)
  🠶 Keeps essential features of RDBMS’s while learning from key-value noSQL systems

http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
Company Using mongoDB

“MongoDB powers Under Armour’s online store, and was chosen for its dynamic schema, ability to scale horizontally and perform multi-data center replication.”

http://www.mongodb.org/about/production-deployments/
-Steve Francia, http://www.slideshare.net/spf13/mongodb-9794741?v=qf1&b=&from_search=13
NoSQL Distinguishing Characteristics
Large data volumes

August 14, 2024

• Google’s “big data”

Scalable replication and distribution

• Potentially thousands of machines
• Potentially distributed around the world

Queries need to return answers quickly

Schema-less

ACID transaction properties are not needed – BASE

CAP Theorem

Open source development

BASE Transactions
• Acronym contrived to be the opposite of ACID
  • Basically Available,
  • Soft state,
  • Eventually Consistent
• Characteristics
  • Weak consistency – stale data OK
  • Availability first
  • Best effort
  • Approximate answers OK
  • Aggressive (optimistic)
  • Simpler and faster
Brewer’s CAP Theorem

[Figure: the CAP Theorem, relating Consistency, Availability and Partition tolerance]
Consistency
• all nodes see the same data at the same time – Wikipedia
• client perceives that a set of operations has occurred all at once – Pritchett
• More like Atomic in ACID transaction properties
Availability
• node failures do not prevent survivors from continuing to operate – Wikipedia
• Every operation must terminate in an intended response – Pritchett
Partition Tolerance
• the system continues to operate despite arbitrary message loss – Wikipedia
• Operations will complete, even if individual components are unavailable – Pritchett
Outline
Difference Between SQL and NoSQL
Study of Open Source NOSQL Database
MongoDB Installation
Basic CRUD operations
Execution
Open Source

Small upfront software costs

Suitable for large scale distribution on commodity hardware
NoSQL Database Types

Column Store – Each storage block contains data from only one column

Document Store – stores documents made up of tagged elements

Key-Value Store – Hash table of keys
Other Non-SQL Databases
• XML Databases
• Graph Databases
• CODASYL Databases
• Object Oriented Databases
• Etc…
NoSQL Example: Column Store
Each storage block contains data from only one column

Example: Hadoop/HBase
• http://hadoop.apache.org/
• Yahoo, Facebook

Example: Ingres VectorWise
• Column Store integrated with an SQL database
• http://www.ingres.com/products/vectorwise
Column Store Comments
• More efficient than row (or document) store if:
  • Multiple rows/records/documents are inserted at the same time, so updates of column blocks can be aggregated
  • Retrievals access only some of the columns in a row/record/document
NoSQL Examples: Key-Value Store
• Hash tables of Keys
• Values stored with Keys
• Fast access to small data values
• Example – Project-Voldemort
  • http://www.project-voldemort.com/
  • LinkedIn
• Example – MemcacheDB
  • http://memcachedb.org/
  • Backend storage is Berkeley DB
MongoDB
What is MongoDB ?
• Scalable, High-Performance, Open-source, Document-oriented database.

• Built for Speed

• Rich Document-based queries for Easy readability.

• Full Index Support for High Performance.

• Replication and Failover for High Availability.

• Auto Sharding for Easy Scalability.

Why use MongoDB?
• SQL was invented in the 70’s to store data.

• MongoDB stores documents (or) objects.

• Nowadays, everyone works with objects (Python/Ruby/Java/etc.)

• And we need databases to persist our objects. Then why not store objects directly?

• Embedded documents and arrays reduce the need for joins. No joins and no multi-document transactions (multi-document transactions were only added later, in MongoDB 4.0).
What is MongoDB great for?
• RDBMS replacement for Web Applications.

• Semi-structured Content Management.

• Real-time Analytics & High-Speed Logging.

• Caching and High Scalability.

Web 2.0, Media, SAAS, Gaming
HealthCare, Finance, Telecom, Government
Not great for?
• Highly Transactional Applications.

• Problems requiring SQL.

Some Companies using MongoDB in Production
Advantages of MongoDB

Schema-less: the number of fields, content and size of a document can differ from one document to another.

No complex joins

Data is stored in a JSON-style format

Index on any attribute

Replication and high availability

Data Model
🠶 Document-Based (max 16 MB)
🠶 Documents are in BSON format, consisting of field-value pairs
🠶 Each document is stored in a collection
🠶 Collections
  🠶 Have index set in common
  🠶 Like tables of relational db’s.
  🠶 Documents do not have to have uniform structure

-docs.mongodb.org/manual/
JSON

🠶 “JavaScript Object Notation”
🠶 Easy for humans to write/read, easy for computers to parse/generate
🠶 Objects can be nested
🠶 Built on
  🠶 name/value pairs
  🠶 Ordered list of values

http://json.org/
BSON
• “Binary JSON”
• Binary-encoded serialization of JSON-like docs
• Also allows “referencing”
• Embedded structure reduces need for joins
• Goals
– Lightweight
– Traversable
– Efficient (decoding and encoding)

http://bsonspec.org/
BSON Example
{
  "_id" : "37010",
  "city" : "Pune",
  "pop" : 2660,
  "state" : "MH",
  "councilman" : {
    "name" : "Smith",
    "address" : "Pune-12"
  }
}
BSON Types
Type                      Number
Double                    1
String                    2
Object                    3
Array                     4
Binary data               5
Object id                 7
Boolean                   8
Date                      9
Null                      10
Regular Expression        11
JavaScript                13
Symbol                    14
JavaScript (with scope)   15
32-bit integer            16
Timestamp                 17
64-bit integer            18
Min key                   255 (query with -1)
Max key                   127

The number can be used with the $type operator to query by type!

http://docs.mongodb.org/manual/reference/bson-types/
The _id Field
• By default, each document contains an _id field. This field has a number
of special characteristics:
– Value serves as primary key for collection.
– Value is unique, immutable, and may be any non-array type.
– Default data type is ObjectId, which is “small, likely unique, fast to generate, and
ordered.” Sorting on an ObjectId value is roughly equivalent to sorting on creation
time.
• Architecturally, by default the _id field is an ObjectId, one of MongoDB's BSON types. The ObjectId is the primary key for the stored document and is automatically generated when creating a new document in a collection. The following values make up the full 12-byte combination of every ObjectId (quoted from MongoDB's documentation):
  • "a 4-byte value representing the seconds since the Unix epoch,
  • a 3-byte machine identifier,
  • a 2-byte process id, and
  • a 3-byte counter, starting with a random value."
http://docs.mongodb.org/manual/reference/bson-types/
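The 12-byte layout quoted above can be checked by hand on any ObjectId written as 24 hex characters. The following is a minimal sketch in plain JavaScript (no MongoDB connection); parseObjectId is a hypothetical helper written for this illustration, not a MongoDB API, and the sample id is only an example value. The middle 5 bytes are returned undivided: the quote above describes them as machine id plus process id, while newer MongoDB releases describe them as a random value.

```javascript
// Hypothetical helper (not part of MongoDB): split a 24-character hex
// ObjectId string into the parts of the 12-byte layout.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  // Bytes 0-3: seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    seconds,
    createdAt: new Date(seconds * 1000),
    // Bytes 4-8: machine id + process id in the layout quoted above.
    middle: hex.slice(8, 18),
    // Bytes 9-11: counter.
    counter: parseInt(hex.slice(18), 16),
  };
}

const parsed = parseObjectId("5063114bd386d8fadbd6b004");
console.log(parsed.createdAt.toISOString(), parsed.middle, parsed.counter);
```

Because the timestamp occupies the leading bytes, sorting ObjectId values sorts roughly by creation time, which is the property mentioned above.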
MongoDB Terminologies for RDBMS concepts
RDBMS          MongoDB
Database       Database
Table, View    Collection
Row            Document (JSON, BSON)
Column         Field
Index          Index
Join           Embedded Document
Foreign Key    Reference
Partition      Shard
BSON Example
{
  "_id" : "37010",
  "City" : "Nashik",
  "Pin" : 423201,
  "state" : "MH",
  "Postman" : {
    "name" : "Ramesh Jadhav",
    "address" : "Panchavati"
  }
}
MongoDB
CRUD Operations
Data Types of MongoDB
• Integer
• Boolean
• Double
• String
• Arrays
• Null
• Object ID
• Binary data
• Date
Data Types
• String : This is the most commonly used datatype to store data. Strings in MongoDB must be valid UTF-8.
• Integer : This type is used to store a numerical value. An integer can be 32 bit or 64 bit depending upon your server.
• Boolean : This type is used to store a boolean (true/false) value.
• Double : This type is used to store floating point values.
• Min/Max keys : This type is used to compare a value against the lowest and highest BSON elements.
• Arrays : This type is used to store arrays, lists or multiple values in one key.
• Timestamp : This type is used to store a timestamp. This can be handy for recording when a document has been modified or added.
• Object : This datatype is used for embedded documents.
Data Types
• Null : This type is used to store a Null value.
• Symbol : This datatype is used identically to a string; however, it is generally reserved for languages that use a specific symbol type.
• Date : This datatype is used to store the current date or time in UNIX time format. You can specify your own date and time by creating a Date object and passing the day, month and year into it.
• Object ID : This datatype is used to store the document’s ID.
• Binary data : This datatype is used to store binary data.
• Code : This datatype is used to store JavaScript code in a document.
• Regular expression : This datatype is used to store a regular expression.
Basic Database Operations

Database

collection
Basic Database Operations- Database
use <database name>
• Switches to the database named in the command

db
• Shows the currently selected database

show dbs
• Displays the list of databases

db.dropDatabase()
• Drops the current database
Basic Database Operations- Collection

db.createCollection(name)
• To create a collection
Ex:- db.createCollection("Stud")

show collections
• Lists the names of all collections in the current database

db.collectionname.insert({Key : Value})
• In MongoDB you don't need to create a collection first. MongoDB creates the collection automatically when you insert a document.
Ex:- db.Stud.insert({Name: "Jiya"})

db.collection.drop()
• Used to drop a collection from the database.
Example:- db.Stud.drop()
CRUD Operations
Insert

Find

Update

Delete
CRUD Operations - Insert
• The insert() Method:- To insert data into MongoDB collection,
you need to use MongoDB's insert() or save()method.

• Syntax
>db.COLLECTION_NAME.insert(document)

• Example
>db.stud.insert({name: "Jiya", age: 15})
CRUD Operations - Insert
• _id Field
• If the document does not specify an _id field, then MongoDB
will add the _id field and assign a unique ObjectId for the
document before inserting.
• The _id value must be unique within the collection to avoid
duplicate key error.
_id field

A default _id value (an ObjectId) is 12 bytes:

4 Bytes – Current time stamp

3 Bytes – Machine Id

2 Bytes – Process id of MongoDB Server

3 Bytes – Incremental Value

CRUD Operations - Insert
• Insert a Document without Specifying an _id Field
• db.stud.insert( { Name : "Reena", Rno: 15 } )
• db.stud.find()
{ "_id" : ObjectId("5063114bd386d8fadbd6b004"), "Name" : "Reena", "Rno" : 15 }

• Insert a Document Specifying an _id Field
• db.stud.insert( { _id: 10, Name : "Reena", Rno: 15 } )
• db.stud.find()
{ "_id" : 10, "Name" : "Reena", "Rno" : 15 }
CRUD Operations - Insert
• Insert Single Documents

db.stud.insert( { Name: "Ankit", Rno: 1, Address: "Pune" } )
CRUD Operations - Insert
• Insert Multiple Documents

db.stud.insert([
  { Name: "Ankit", Rno: 1, Address: "Pune" },
  { Name: "Sagar", Rno: 2 },
  { Name: "Neha", Rno: 3 }
])
CRUD Operations - Insert
• Insert Multicolumn attribute (an embedded document)
db.stud.insert(
  {
    Name: "Ritu",
    Address: { City: "Pune", State: "MH" },
    Rno: 6
  }
)
CRUD Operations - Insert
• Insert Multivalued attribute (an array)
db.stud.insert(
  {
    Name: "Sneha",
    Hobbies: ["Singing", "Dancing", "Cricket"],
    Rno: 8
  }
)
CRUD Operations - Insert
• Insert Multivalued with Multicolumn attribute (an array of embedded documents)
db.stud.insert(
  {
    Name: "Sneha",
    Awards: [ { Award: "Dancing", Rank: "1st", Year: 2008 },
              { Award: "Drawing", Rank: "3rd", Year: 2010 },
              { Award: "Singing", Rank: "1st", Year: 2015 } ],
    Rno: 9
  }
)
CRUD Operations - Insert

db.source.copyTo(target)

Copies all documents from the source collection into the target collection. If the target collection does not exist, MongoDB creates it. Note that copyTo() has been deprecated since MongoDB 3.0.
CRUD Operations
Insert

Find

Update

Delete
CRUD Operations - Find
• The find() Method- To display data from MongoDB collection.
Displays all the documents in a non structured way.

• Syntax
>db.COLLECTION_NAME.find()
• The pretty() Method- To display the results in a formatted way,
you can use pretty() method.

• Syntax
>db.COLLECTION_NAME.find().pretty()
CRUD Operations - Find

db.stud.find()
• Select All Documents in a Collection in unstructured form

db.stud.find().pretty()
• Select All Documents in a Collection in structured form
CRUD Operations - Find

Specify Equality Condition

• use the query document
{ <field>: <value> }
• Examples:
• db.stud.find( { name: "Jiya" } )
• db.stud.find( { _id: 5 } )
CRUD Operations – Find Comparison Operators
Operator   Description
$eq        Matches values that are equal to a specified value.
$gt        Matches values that are greater than a specified value.
$gte       Matches values that are greater than or equal to a specified value.
$lt        Matches values that are less than a specified value.
$lte       Matches values that are less than or equal to a specified value.
$ne        Matches all values that are not equal to a specified value.
$in        Matches any of the values specified in an array.
$nin       Matches none of the values specified in an array.

CRUD Operations – Find Examples with
comparison operators

db.stud.find( { rno: { $gt:5} } )

Shows all documents whose rno>5

db.stud.find( { rno: { $gt: 0, $lt: 5} } )

Shows all documents whose rno
greater than 0 and less than 5
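To make the operator semantics concrete, here is a rough in-memory sketch in plain JavaScript. It is not how MongoDB evaluates queries; matches and the operators table are hypothetical names introduced only for this illustration, they cover just the eight comparison operators listed above, and the sample documents are made up.

```javascript
// Hypothetical in-memory matcher covering only the comparison operators
// from the table above. Real MongoDB matching handles many more cases.
const operators = {
  $eq: (v, x) => v === x,
  $ne: (v, x) => v !== x,
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
  $in: (v, xs) => xs.includes(v),
  $nin: (v, xs) => !xs.includes(v),
};

function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) => {
    const value = doc[field];
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      // Operator document such as { $gt: 0, $lt: 5 }: every operator must hold.
      return Object.entries(cond).every(([op, arg]) => operators[op](value, arg));
    }
    // Plain { field: value } means equality.
    return value === cond;
  });
}

// Made-up sample documents.
const studs = [
  { rno: 1, name: "Jiya" },
  { rno: 4, name: "Riya" },
  { rno: 8, name: "Neha" },
];

const between = studs.filter((d) => matches(d, { rno: { $gt: 0, $lt: 5 } }));
console.log(between.map((d) => d.name));
```

The point of the sketch is that several operators on one field are combined with AND, which is why { $gt: 0, $lt: 5 } selects a range.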
CRUD Operations – Find Examples to show only particular columns

db.stud.find({name: "Jiya"}, {Rno: 1})
Shows the roll number of the student whose name is Jiya (by default _id is also shown)

db.stud.find({name: "Jiya"}, {_id: 0, Rno: 1})
Shows the roll number of the student whose name is Jiya (_id is not shown)
CRUD Operations – Find Examples for Sort function

db.stud.find().sort( { Rno: 1 } )
Sort on the Rno field in ascending order (1)

db.stud.find().sort( { Rno: -1 } )
Sort on the Rno field in descending order (-1)
CRUD Operations – Find Examples of
Count functions

db.stud.find().count()
Returns the number of documents in the collection

db.stud.find({Rno:2}).count()
Returns the number of documents in the collection that satisfy the given condition Rno=2
CRUD Operations – Find Examples of
limit and skip

db.stud.find().limit(2)
Returns only first 2 documents

db.stud.find().skip(5)
Returns all documents except first 5
documents
CRUD Operations – Find Examples of
limit and skip

db.stud.find({ rno: { $gt:5} } ).limit(2)

Returns only first 2 documents whose rno
is greater than 5

db.stud.find({ rno: { $gt:5} } ).skip(5)

Returns all documents except first 5
documents whose rno is greater than 5
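skip() and limit() are commonly combined to page through results. The sketch below shows the arithmetic on a plain array; page is a hypothetical helper and the roll numbers are made-up values. Under those assumptions, the shell form of the second page would be something like db.stud.find({ rno: { $gt: 5 } }).sort({ rno: 1 }).skip(3).limit(3).

```javascript
// Hypothetical helper: page numbers start at 1.
function page(docs, pageNumber, pageSize) {
  const skip = (pageNumber - 1) * pageSize; // documents to pass over, like skip()
  return docs.slice(skip, skip + pageSize); // keep at most pageSize, like limit()
}

// Made-up roll numbers, already sorted ascending.
const rollNumbers = [6, 7, 8, 9, 10, 11, 12];

const secondPage = page(rollNumbers, 2, 3);
console.log(secondPage);
```

Sorting before skipping matters: without a sort, the order of returned documents is not guaranteed, so pages may overlap.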
CRUD Operations – Find Examples

db.stud.findOne() - Find first document only

db.stud.find({"Address.City": "Pune"}) -
Finding in Multicolumned attribute (field names are case-sensitive)

db.stud.find({name: "Riya", age: 20})
Find documents whose name is Riya and age is 20
CRUD Operations – Find Examples with in and not in operator

db.stud.find({name: {$in: ["riya", "jiya"]}})
Find documents whose name is riya or jiya

db.stud.find({Rno: {$nin: [20, 25]}})
Find documents whose roll number is neither 20 nor 25
CRUD Operations – Find Examples for
Distinct clause

db.stud.distinct("Address")
Find the different cities students are coming from
CRUD Operations – Find Examples similar
to like operator

db.stud.find({name:/^n/})
Find students whose name starts with n

db.stud.find({name:/n/})
Find students whose name contains the letter n

db.stud.find({name:/n$/})
Find students whose name ends with n
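The three patterns above are ordinary JavaScript regular expressions, so their behaviour can be tried on a plain array first. This is a small sketch with made-up names; note that the patterns are case-sensitive unless the i flag is added.

```javascript
// Made-up names; mixed case is deliberate to show case sensitivity.
const names = ["neha", "Ankit", "sagar", "Navin", "ritu"];

const startsWithN = names.filter((n) => /^n/.test(n)); // like name: /^n/
const containsN = names.filter((n) => /n/.test(n));    // like name: /n/
const endsWithN = names.filter((n) => /n$/.test(n));   // like name: /n$/

// Adding the i flag ignores case, so "Navin" now also matches /^n/i.
const startsWithNAnyCase = names.filter((n) => /^n/i.test(n));

console.log(startsWithN, containsN, endsWithN, startsWithNAnyCase);
```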
CRUD Operations – Find Examples

db.collection.stats()

db.collection.explain().find()

db.collection.explain().find().help()
CRUD Operations
Insert

Find

Update

Delete
CRUD Operations – Update
• Syntax
db.CollectionName.update(
<query/Condition>,
<update with $set or $unset>,
{
upsert: <boolean>,
multi: <boolean>,
}
)
CRUD Operations – Update

upsert
• If set to true, creates a new document if no matches are found.

multi
• If set to true, updates all documents that match the query criteria.
CRUD Operations – Update
Examples
db.stud.update(
{ _id: 100 },
{ age: 25 })
• Set age = 25 where _id is 100
• The whole document is replaced where the condition is matched, so age: 25 is the only field that remains (besides _id)

db.stud.update(
{ _id: 100 },
{ $set: { age: 25 } })
• Set age = 25 where _id is 100
• Only the age field of one document is updated where the condition is matched

db.stud.update(
{ _id: 100 },
{ $unset: { age: 1 } })
• To remove the age field from the single document where _id = 100
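The difference between a replacement document, $set and $unset is easy to get wrong, so here is a small in-memory sketch of the three cases described above. applyUpdate is a hypothetical helper that works on one plain object; it is not a MongoDB API and ignores every other update operator. The sample document is made up.

```javascript
// Hypothetical helper that mimics three kinds of update on one plain object.
function applyUpdate(doc, update) {
  if ("$set" in update) {
    // $set: keep every existing field, overwrite or add the listed ones.
    return { ...doc, ...update.$set };
  }
  if ("$unset" in update) {
    // $unset: drop the listed fields, keep the rest.
    const copy = { ...doc };
    for (const field of Object.keys(update.$unset)) delete copy[field];
    return copy;
  }
  // No operator: the whole document is replaced, only _id is preserved.
  return { _id: doc._id, ...update };
}

const original = { _id: 100, name: "Jiya", age: 20 }; // made-up document

const replaced = applyUpdate(original, { age: 25 });
const setAge = applyUpdate(original, { $set: { age: 25 } });
const unsetAge = applyUpdate(original, { $unset: { age: 1 } });

console.log(replaced, setAge, unsetAge);
```

In the replacement case the name field is lost, which is usually not what was intended; $set is the safer default.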
CRUD Operations – Update
Examples
db.stud.update(
{ _id: 100 },
{ $set: { "marks.DBMS": 50 } })
• Set marks for the DBMS subject to 50 where _id = 100 (only one row is updated)

db.stud.update(
{ class: "TE" },
{ $set: { "marks.DBMS": 50 } },
{ multi: true } )
• Set marks for the DBMS subject to 50 where class is TE (all rows that match the condition are updated)

db.stud.update(
{ class: "TE" },
{ $set: { "marks.DBMS": 50 } },
{ upsert: true } )
• Set marks for the DBMS subject to 50 where class is TE (only the first matching row is updated, because multi is not set)
• If no row matches the condition, a new row is inserted.
CRUD Operations – Update
Examples
db.stud.update({ }, { $inc: { age: 5 } })

db.stud.update({ }, { $set: { cadd: "Pune" } }, { multi: true })

db.stud.update({ }, { $rename: { "age": "Age" } }, { multi: true })
CRUD Operations
Insert

Find

Update

Delete
CRUD Operations – Remove

Remove All Documents
• db.inventory.remove({})

Remove All Documents that Match a Condition
• db.inventory.remove( { type : "food" } )

Remove a Single Document that Matches a Condition
• db.inventory.remove( { type : "food" }, 1 )
Outline

Indexing

Aggregation
Indexing
Indexes support the efficient execution of queries in MongoDB.
Indexing Types
Single Field Indexes
• A single field index only includes data from a single field of the documents in a collection.

Compound Indexes
• A compound index includes more than one field of the documents in a collection.

Multikey Indexes
• A multikey index is an index on an array field, adding an index key for each value in the array.

Geospatial Indexes and Queries
• Geospatial indexes support location-based searches.

Text Indexes
• Text indexes support search of string content in documents.

Hashed Index
• Hashed indexes maintain entries with hashes of the values of the indexed field and are used with sharded clusters to support hashed shard keys.
Index Properties
The properties you can specify when building indexes.

TTL Indexes
The TTL index is used for TTL collections, which expire data after a period of time.

Unique Indexes
A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field.

Sparse Indexes
A sparse index does not index documents that do not have the indexed field.
Index Creation
Using createIndex
• db.CollectionName.createIndex( { KeyName: 1 or -1 } )

Using ensureIndex (a deprecated alias of createIndex since MongoDB 3.0)
• db.CollectionName.ensureIndex( { KeyName: 1 or -1 } )

1 for Ascending Sorting

-1 for Descending Sorting
Index Creation
Using createIndex
• Single: db.stud.createIndex( { zipcode: 1 } )
• Compound: db.stud.createIndex( { dob: 1, zipcode: -1 } )
• Unique: db.stud.createIndex( { rollno: 1 }, { unique: true } )
• Sparse: db.stud.createIndex( { age: 1 }, { sparse: true } )

Using ensureIndex
• Single: db.stud.ensureIndex( { "name": 1 } )
• Compound: db.stud.ensureIndex( { "address": 1, "name": -1 } )
Index Display
db.collection.getIndexes()
• Returns an array that holds a list of documents that identify and describe the existing indexes on the collection.

db.collection.getIndexStats()
• Displays a human-readable summary of aggregated statistics about an index’s B-tree data structure.
• db.<collection>.getIndexStats( { index : "<index name>" } )
Index Drop
Syntax
• db.collection.dropIndex()
• db.collection.dropIndex(index)

Example
• db.stud.dropIndex()
• db.stud.dropIndex( { "name" : 1 } )
Indexing and Querying
• Create an ascending index on the field name for a collection records:
db.records.createIndex( { name: 1 } )
• This index can support an ascending sort on name:
db.records.find().sort( { name: 1 } )
• The index can also support a descending sort on name:
db.records.find().sort( { name: -1 } )
Indexing and Querying
db.stud.findOne( {rno:2} ), using index {rno:1}

db.stud.find( {rno:5} ), using index {rno:1}

db.stud.find( {rno:{$in:[2,3]}} ), using index {rno:1}

db.stud.find( {age:{$gt:15}} ), using index {age:1}

db.stud.find( {age:{$gt:2,$lt:5}} ), using index {age:1}

db.stud.count( {age:19} ), using index {age:1}

db.stud.distinct( "branch" ), using index {branch:1}
Indexing and Querying
db.stud.find({}, {name:1,age:1}),
using index {name:1,age:1}

db.stud.find().sort( {name:1,age:1} ),
using index {name:1,age:1}

db.stud.update( {age:20}, {age:19} ),
using index {age:1}

db.stud.remove( {name: "Jiya"} ),
using index {name:1}
Outline

Indexing

Aggregation
Aggregation
Aggregation operations process data records and return computed results.

Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data.

For aggregation in MongoDB, use the aggregate() method.

Syntax:
• >db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
Aggregation
• MongoDB’s aggregation framework is modeled on the concept
of data processing pipelines.
• Documents enter a multi-stage pipeline that transforms the
documents into an aggregated result.
• Other pipeline operations provide tools for grouping and
sorting documents by specific field or fields.
• In addition, pipeline stages can use operators for tasks such
as calculating the average or concatenating a string.
aggregate() method
Expression   Description
$sum         Sums up the defined value from all documents in the collection.
$avg         Calculates the average of all given values from all documents in the collection.
$min         Gets the minimum of the corresponding values from all documents in the collection.
$max         Gets the maximum of the corresponding values from all documents in the collection.
$first       Gets the first document from the source documents according to the grouping.
$last        Gets the last document from the source documents according to the grouping.
Pipeline Concept

There is a set of possible stages; each stage takes a set of documents as input and produces a resulting set of documents, which is passed on to the next stage.
Possible stages in
aggregation
• $project − Used to select some specific fields from a collection.
• $match − This is a filtering operation and thus this can reduce the
amount of documents that are given as input to the next stage.
• $group − This does the actual aggregation as discussed above.
• $sort − Sorts the documents.
• $skip − With this, it is possible to skip forward in the list of
documents for a given amount of documents.
• $limit − This limits the amount of documents to look at, by the given
number starting from the current positions.
• $unwind − This is used to unwind documents that contain arrays. When using an array, the data is kind of pre-joined, and this stage undoes that so that each array element gets its own document again. Thus this stage increases the number of documents for the next stage.
Collection creation to run practical
• db.student.insert({Rollno:1,name:'Navin',subject:'DBMS',marks:78});
db.student.insert({Rollno:2,name:'anusha',subject:'OSD',marks:75});
db.student.insert({Rollno:3,name:'ravi',subject:'WT',marks:69});
db.student.insert({Rollno:4,name:'veena',subject:'WT',marks:70});
db.student.insert({Rollno:5,name:'Pravini',subject:'OSD',marks:80});
db.student.insert({Rollno:6,name:'Reena',subject:'DBMS',marks:50});
db.student.insert({Rollno:7,name:'Geeta',subject:'CN',marks:90});
db.student.insert({Rollno:8,name:'Akash',subject:'CN',marks:85});
MIN()
db.student.aggregate(
[{$group : {_id : "$subject", marks : {$min : "$marks"}}}]);

SQL Equivalent Query

Select subject, min(marks) from student
group by subject
MAX()
db.student.aggregate(
[{$group : {_id : "$subject", marks : {$max : "$marks"}}}]);

SQL Equivalent Query

Select subject, max(marks) from student
group by subject
AVG()
db.student.aggregate(
[{$group : {_id : "$subject", marks : {$avg : "$marks"}}}]);

SQL Equivalent Query

Select subject, avg(marks) from student
group by subject
FIRST()

db.student.aggregate(
[{$group : {_id : "$subject", marks : {$first : "$marks"}}}]);
LAST()

db.student.aggregate(
[{$group : {_id : "$subject", marks : {$last : "$marks"}}}]);
SUM() - Example 1
db.student.aggregate(
[{$group : {_id : "$subject", marks : {$sum : "$marks"}}}]);

SQL Equivalent Query

Select subject, sum(marks) from student
group by subject
SUM() - Example 2
db.student.aggregate(
[{$group : {_id : "$subject", Count : {$sum : 1}}}]);

SQL Equivalent Query

Select subject, count(*) from student
group by subject
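To see what $group computes for the student collection created earlier, the accumulators can be worked out by hand on plain objects. The following is only a sketch: groupBySubject is a hypothetical helper, not a MongoDB API, and it computes several accumulators in one pass, whereas each example above uses one. The field names sum, count, min, max and avg are chosen for this illustration.

```javascript
// The same eight documents as in the student collection created earlier,
// held as plain objects (no database involved).
const students = [
  { Rollno: 1, name: "Navin", subject: "DBMS", marks: 78 },
  { Rollno: 2, name: "anusha", subject: "OSD", marks: 75 },
  { Rollno: 3, name: "ravi", subject: "WT", marks: 69 },
  { Rollno: 4, name: "veena", subject: "WT", marks: 70 },
  { Rollno: 5, name: "Pravini", subject: "OSD", marks: 80 },
  { Rollno: 6, name: "Reena", subject: "DBMS", marks: 50 },
  { Rollno: 7, name: "Geeta", subject: "CN", marks: 90 },
  { Rollno: 8, name: "Akash", subject: "CN", marks: 85 },
];

// Hypothetical helper mimicking {$group: {_id: "$subject", ...}}.
function groupBySubject(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.subject)) {
      groups.set(doc.subject, {
        _id: doc.subject, sum: 0, count: 0, min: Infinity, max: -Infinity,
      });
    }
    const g = groups.get(doc.subject);
    g.sum += doc.marks;                  // like {$sum: "$marks"}
    g.count += 1;                        // like {$sum: 1}
    g.min = Math.min(g.min, doc.marks);  // like {$min: "$marks"}
    g.max = Math.max(g.max, doc.marks);  // like {$max: "$marks"}
  }
  // The average is derived from sum and count, like {$avg: "$marks"}.
  return [...groups.values()].map((g) => ({ ...g, avg: g.sum / g.count }));
}

const bySubject = groupBySubject(students);
console.log(bySubject);
```

Each subject appears exactly once in the result, with _id holding the grouping key, which mirrors the shape of the documents returned by the $group examples above.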
$match
db.student.aggregate(
[{ $match: {subject:"OSD"}}])

db.student.aggregate(
[{ $match: {subject:"OSD"}},
{$group:{_id:null,count:{$sum:1}}}]);
SUM() - Example 3
db.student.aggregate(
[{ $match: {subject:"OSD"}},
{$group:{_id:null,count:{$sum:1}}}]);

SQL Equivalent Query

Select count(*) from student
where subject = 'OSD'
Limit() & Skip()
db.student.aggregate
([{ $match: {subject:"OSD"}},
{$limit:1}]);

db.student.aggregate
([{ $match: {subject:"OSD"}},
{$skip:1}]);
Sort()

db.student.aggregate
([{ $match: {subject:"OSD"}},
{$sort:{marks:-1}}]);

db.student.aggregate
([{ $match: {subject:"OSD"}},
{$sort:{marks:1}}]);
Unwind()

If the following document (with an array field) is there in the collection:

• db.student.insert({rollno:9,name:"Anavi",marks:[80,30,50]});

Using $unwind, the above document will be unwound into 3 different documents:

• db.student.aggregate([{$unwind:"$marks"}])
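The effect of $unwind on the document above can be reproduced with Array.prototype.flatMap. This is a sketch of the default behaviour only; unwind is a hypothetical helper, and it assumes the default rule that documents whose field is missing, null or an empty array produce no output.

```javascript
// Hypothetical helper mimicking the default behaviour of {$unwind: "$field"}.
function unwind(docs, field) {
  return docs.flatMap((doc) => {
    const value = doc[field];
    if (value === undefined || value === null) return []; // dropped by default
    if (!Array.isArray(value)) return [doc];              // treated as a single element
    // One output document per array element; an empty array yields none.
    return value.map((element) => ({ ...doc, [field]: element }));
  });
}

const unwound = unwind(
  [{ rollno: 9, name: "Anavi", marks: [80, 30, 50] }],
  "marks"
);
console.log(unwound);
```

Every output document keeps the other fields (rollno, name) and carries exactly one of the original marks.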
Map-Reduce
• Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For map-reduce operations, MongoDB provides the mapReduce database command.
• Consider the following map-reduce operation:
[Figure: map-reduce operation diagram]
Map-Reduce
In very simple terms, the mapReduce command takes 2 primary inputs, the mapper function and the reducer function.

A Mapper will start off by reading a collection of data and building a Map with only the required fields we wish to process and group them into one array based on the key.

And then this key value pair is fed into a Reducer, which will process the values.
Map-Reduce Syntax
db.collection.mapReduce(
function() { emit(key, value); },               // Define map function
function(key, values) { return reduceFunction }, // Define reduce function
{
out: collection,
query: document,
sort: document,
limit: number
}
)
Map-Reduce Syntax Explanation
• The above map-reduce function will query the collection, and then map the output documents to the emit key-value pairs. After this, it is reduced based on the keys that have multiple values. Here, we have used the following functions and parameters.

• Map: – It is a JavaScript function. It is used to map a value with a key and produces a key-value pair.
• Reduce: – It is a JavaScript function. It is used to reduce or group together all the documents which have the same key.
• Out: – It is used to specify the location of the map-reduce query output.
• Query: – It is used to specify the optional selection criteria for selecting documents.
• Sort: – It is used to specify the optional sort criteria.
• Limit: – It is used to specify the optional maximum number of documents which are desired to be returned.
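The query, map and reduce steps described above can be imitated on a plain array to see how the pieces fit together. This is a rough sketch, not MongoDB's implementation: the local mapReduce function is hypothetical, emit is passed to the map function as an argument (in MongoDB it is a global available inside map), the query option is a predicate function rather than a query document, and the four book documents are sample data. As in MongoDB, reduce is not called for a key that has only one emitted value.

```javascript
// Hypothetical in-memory imitation of the query -> map -> reduce flow.
function mapReduce(docs, map, reduce, options = {}) {
  // "query": select the input documents (a predicate function in this sketch).
  const selected = options.query ? docs.filter(options.query) : docs;

  // "map": each document emits key-value pairs; values are collected per key.
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  for (const doc of selected) map.call(doc, emit); // "this" is the document

  // "reduce": only keys with more than one value are passed to reduce.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

// Sample book documents.
const books = [
  { book_title: "MongoDB Tutorial", author_name: "aparajita", status: "active" },
  { book_title: "Software Testing Tutorial", author_name: "aparajita", status: "active" },
  { book_title: "Node.js Tutorial", author_name: "Kritika", status: "active" },
  { book_title: "PHP7 Tutorial", author_name: "aparajita", status: "passive" },
];

// Count active books per author.
const authorTotal = mapReduce(
  books,
  function (emit) { emit(this.author_name, 1); },
  (key, values) => values.reduce((a, b) => a + b, 0),
  { query: (doc) => doc.status === "active" }
);
console.log(authorTotal);
```

The sketch uses values.reduce to add the values because Array.sum is a helper provided by the mongo shell, not standard JavaScript.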
Example1 MapReduce function
• Consider the following document structure that stores book details author-wise. Each document stores the author_name of the book's author and the status of the book.

• > db.author.save({
"book_title" : "MongoDB Tutorial", "author_name" : "aparajita",
"status" : "active", "publish_year": "2016" })
• > db.author.save({
"book_title" : "Software Testing Tutorial", "author_name" : "aparajita",
"status" : "active", "publish_year": "2015" })
• > db.author.save({
"book_title" : "Node.js Tutorial", "author_name" : "Kritika",
"status" : "active", "publish_year": "2016" })
• > db.author.save({
"book_title" : "PHP7 Tutorial", "author_name" : "aparajita",
"status" : "passive", "publish_year": "2016" })
Example1 MapReduce function
db.author.find()

{ "_id" : ObjectId("59333022523476d644344db9"), "book_title" : "MongoDB Tutorial", "author_name" : "aparajita", "status" : "active", "publish_year" : "2016" }

{ "_id" : ObjectId("59333031523476d644344dba"), "book_title" : "Software Testing Tutorial", "author_name" : "aparajita", "status" : "active", "publish_year" : "2015" }

{ "_id" : ObjectId("5933303e523476d644344dbb"), "book_title" : "Node.js Tutorial", "author_name" : "Kritika", "status" : "active", "publish_year" : "2016" }

{ "_id" : ObjectId("5933304b523476d644344dbc"), "book_title" : "PHP7 Tutorial", "author_name" : "aparajita", "status" : "passive", "publish_year" : "2016" }
Example1 MapReduce
function
• Now, use the mapReduce function
• To select all the active books,
• Group them together on the basis of author_name and
• Then count the number of books by each author by using the
following code in MongoDB.
Example1 MapReduce function
• Code:
db.author.mapReduce(
function() { emit(this.author_name, 1) },
function(key, values) { return Array.sum(values) },
{ query: { status: "active" }, out: "author_total" } ).find()

• Output
{ "_id" : "aparajita", "value" : 2 }
{ "_id" : "Kritika", "value" : 1 }

Code (the same operation without .find(), which stores the result in author_total without displaying it):
db.author.mapReduce(
function() { emit(this.author_name, 1); },
function(key, values) { return Array.sum(values) },
{ query: { status: "active" }, out: "author_total" })
Example2 MapReduce
function
• Consider the following document structure of Students

• > db.stud.insert({ Name: 'Amit', Marks: 80 })
• > db.stud.insert({ Name: 'Amit', Marks: 90 })
• > db.stud.insert({ Name: 'Shreya', Marks: 40 })
• > db.stud.insert({ Name: 'Neha', Marks: 80 })
• > db.stud.insert({ Name: 'Neha', Marks: 35 })
Example2 MapReduce function
• > db.stud.mapReduce(
function() { emit(this.Name, 1) },
function(key, values) { return Array.sum(values) },
{ out: "Name_Total" })

• > db.stud.mapReduce(
function() { emit(this.Name, 1) },
function(key, values) { return Array.sum(values) },
{ out: "Name_Total" }).find()

• > db.stud.mapReduce(
function() { emit(this.Name, this.Marks) },
function(key, values) { return Array.sum(values) },
{ out: "Total_Marks" }).find()
Example 2 MapReduce function
• Total marks per student, sorted in ascending order of the total:
> db.stud.mapReduce(
function(){ emit(this.Name,this.Marks)},
function(key, values) {return Array.sum(values)},
{out: 'Name_Total'}).find().sort({value:1})

• Total marks per student, returning only the first result document:
> db.stud.mapReduce(
function(){ emit(this.Name,this.Marks)},
function(key, values) {return Array.sum(values)},
{out: 'Name_Total'}).find().limit(1)

• Number of marks above 70 per student (the query filter is applied before the map step):
> db.stud.mapReduce(
function(){ emit(this.Name,1)},
function(key, values) {return Array.sum(values)},
{query:{Marks:{$gt:70}}, out: 'Name_Total'}).find()
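• What sort, limit, and query each do to the result can be checked on the same five student documents in plain JavaScript. This is an in-memory sketch under stated assumptions: sumByName is a hypothetical helper that folds grouping and summing into one pass, and the unsorted order here is simply insertion order, which is not a statement about the order MongoDB returns.

```javascript
// In-memory sketch of the sort, limit, and query variants.
// NOT MongoDB's implementation; sumByName is a hypothetical helper.
const students = [
  { Name: "Amit", Marks: 80 },
  { Name: "Amit", Marks: 90 },
  { Name: "Shreya", Marks: 40 },
  { Name: "Neha", Marks: 80 },
  { Name: "Neha", Marks: 35 },
];

// Group by Name and add up valueOf(doc) for each group.
function sumByName(docs, valueOf) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.Name, (totals.get(doc.Name) || 0) + valueOf(doc));
  }
  return [...totals].map(([name, value]) => ({ _id: name, value }));
}

const totals = sumByName(students, (doc) => doc.Marks);

// Like .find().sort({value: 1}): totals in ascending order.
const sortedTotals = [...totals].sort((a, b) => a.value - b.value);

// Like .find().limit(1): keep only the first result document.
// Which one is "first" depends on the stored order (insertion order here).
const firstOnly = totals.slice(0, 1);

// Like query: {Marks: {$gt: 70}}: filter BEFORE mapping, then count.
const highScoreCounts = sumByName(
  students.filter((doc) => doc.Marks > 70),
  () => 1
);

console.log(sortedTotals, firstOnly, highScoreCounts);
```

• Shreya has no mark above 70, so she does not appear at all in the filtered count rather than appearing with a value of 0.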
