BIG DATA ANALYTICS

big data notes

Uploaded by

Gopika Gopika

22CSA62 BIG DATA ANALYTICS

UNIT – 4 : MONGODB
Introduction to MongoDB : Definition of MongoDB - Need of MongoDB - Terms used in
RDBMS and MongoDB - Data Types in MongoDB - MongoDB Query Language.

1 Introduction to MongoDB
1.1 Definition of MongoDB
MongoDB is
1. Cross-platform.
2. Open source.
3. Non-relational.
4. Distributed.
5. NoSQL.
6. Document-oriented data store.

1.2 Need for MongoDB


A few of the major challenges with traditional RDBMS are dealing with large
volumes of data, handling a rich variety of data (particularly unstructured data),
and meeting the scale needs of enterprise data. The need is for a database that can
scale out (scale horizontally) to meet these requirements, is flexible with respect
to schema, is fault tolerant, is consistent and partition tolerant, and can be easily
distributed over a multitude of nodes in a cluster.
2 Key Concepts
2.1 Terms Used in RDBMS and MongoDB
RDBMS          MongoDB
Database       Database
Table          Collection
Record         Document
Columns        Fields / Key-Value Pairs
Index          Index
Joins          Embedded Documents
Primary Key    Primary Key (_id is an identifier)

                   MySQL     Oracle      MongoDB
Database Server    mysqld    oracle      mongod
Database Client    mysql     SQL*Plus    mongo
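
The "Joins / Embedded Documents" row can be made concrete with a small sketch. The plain JavaScript below (runnable in Node.js, no database needed) holds the same data first as two relational-style row sets linked by a key, and then as one document with the related data embedded. The names studentId, Address and City, and the pairing of this student with "Newark", are illustrative assumptions and do not come from the notes.

```javascript
// Relational style: two "tables" linked by a key (hypothetical data).
const studentRows = [{ studentId: 1, StudName: "Michelle Jacintha", Grade: "VII" }];
const addressRows = [{ studentId: 1, City: "Newark" }];

// A join combines the matching rows at query time.
function joinStudentWithAddress(id) {
  const student = studentRows.find((s) => s.studentId === id);
  const address = addressRows.find((a) => a.studentId === id);
  return { ...student, City: address.City };
}

// Document style: the related data is embedded, so one lookup returns everything.
const studentDocument = {
  _id: 1,
  StudName: "Michelle Jacintha",
  Grade: "VII",
  Address: { City: "Newark" }, // embedded document
};

console.log(joinStudentWithAddress(1).City);
console.log(studentDocument.Address.City);
```

Both lookups reach the same city; the difference is that the embedded form needs no second data set and no matching step.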
2.1.1 Create Database
In MongoDB, databases are created dynamically. Unlike traditional SQL databases,
you don’t need an explicit CREATE DATABASE command. Instead, MongoDB
creates the database when you insert data into it.

1. Create or Switch to a Database


Use the use command to create or switch to a database:
> use myDB
switched to db myDB

• The use myDB command switches to myDB.


• If myDB does not already exist, MongoDB will create it only when data is
inserted.

2. Verify the Current Database


Check which database is currently in use:
> db
myDB

• The db command returns the current database name.


• This confirms that we are working inside myDB.

3. List All Databases


To see all databases in MongoDB:
> show dbs
admin (empty)
local 0.078GB
test 0.078GB

• The show dbs command lists all available databases.


• Notice that myDB is not listed yet because MongoDB only shows databases that
contain data.
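
The lazy-creation behaviour described above can be modelled in a few lines. This is a toy in-memory model in plain JavaScript, not MongoDB code and not how the server is implemented; the class ToyServer and its methods are hypothetical names, and collections are left out to keep the sketch short.

```javascript
// Toy model of lazy database creation (teaching aid only).
class ToyServer {
  constructor() {
    this.databases = new Map(); // database name -> array of documents
    this.current = "test";      // the shell starts in "test" by default
  }

  // Switching only records the name; nothing is created yet.
  use(name) {
    this.current = name;
    return `switched to db ${name}`;
  }

  // The database comes into existence on the first write.
  insert(doc) {
    if (!this.databases.has(this.current)) {
      this.databases.set(this.current, []);
    }
    this.databases.get(this.current).push(doc);
  }

  // Only databases that hold data are listed.
  showDbs() {
    return [...this.databases.keys()];
  }
}

const server = new ToyServer();
server.use("myDB");
const listedBeforeInsert = server.showDbs(); // myDB is absent: no data yet
server.insert({ _id: 1 });
const listedAfterInsert = server.showDbs();  // myDB is now listed
console.log(listedBeforeInsert, listedAfterInsert);
```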

2.1.2 Drop Database


In MongoDB, databases can be deleted using the db.dropDatabase() command.
This command removes the currently active database and all its associated data.

1. Syntax to Drop a Database


> db.dropDatabase()
• This command deletes the currently selected database.
• It removes all collections and documents inside the database.

2. Steps to Drop a Specific Database


Step 1: Switch to the Database You Want to Drop
> use myDB
switched to db myDB

• The use myDB command ensures that we are working inside the myDB database.
• Only the active database can be dropped.

Step 2: Drop the Database


> db.dropDatabase()
{ "dropped" : "myDB", "ok" : 1 }

• The db.dropDatabase() command deletes the currently selected database (myDB).


• The output confirms that the database was successfully dropped.

3. Confirming That the Database is Deleted


Step 3: Check the List of Databases
> show dbs
admin (empty)
local 0.078GB
test 0.078GB

• The show dbs command lists all available databases.


• If myDB is no longer listed, it means the database has been successfully deleted.

2.2 Data Types in MongoDB


Data Type             Description
String                Must be valid UTF-8. The most commonly used data type.
Integer               Can be 32-bit or 64-bit (depends on the server).
Boolean               Stores a true/false value.
Double                Stores floating-point (real) values.
Min/Max Keys          Compare a value against the lowest or highest BSON elements.
Arrays                Store arrays, lists, or multiple values in one key.
Timestamp             Records when a document has been modified or added.
NULL                  Stores a NULL value, representing a missing or unknown value.
Date                  Stores the current date/time in Unix time format. Can be created with day, month, and year values.
Object ID             Stores the document's unique ID.
Binary Data           Stores binary data such as images, binaries, etc.
Code                  Stores JavaScript code within the document.
Regular Expression    Stores a regular expression for pattern matching.
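
Several of the types in the table have direct JavaScript counterparts, which is what the shell accepts when a document is typed in. The sketch below is plain JavaScript (runnable in Node.js) with a hypothetical sample document; the field names IsActive, Score, EnrolledOn and NamePattern are assumptions for illustration. Note that the legacy mongo shell treats a bare number as a double unless it is wrapped (for example with NumberInt), so the comments name the closest type rather than the exact stored one.

```javascript
// A hypothetical document built from JavaScript-native values.
const sampleDocument = {
  StudName: "Michelle Jacintha",                 // String
  IsActive: true,                                // Boolean
  Score: 91.5,                                   // Double
  Hobbies: ["Chess", "Skating"],                 // Array
  Location: null,                                // NULL
  EnrolledOn: new Date("2024-06-01T00:00:00Z"),  // Date
  NamePattern: /^Mi/,                            // Regular Expression
};

// Classify a value roughly along the lines of the table above.
function describeType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  if (value instanceof RegExp) return "regex";
  return typeof value; // "string", "number", "boolean", "object"
}

for (const [field, value] of Object.entries(sampleDocument)) {
  console.log(field, describeType(value));
}
```

The order of the checks matters: null, arrays, dates and regular expressions all report "object" under typeof, so they are tested first.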

4 MongoDB Query Language


MongoDB Query Language (CRUD Operations)
CRUD operations in MongoDB allow us to Create, Read, Update, and Delete
documents from a collection.

1. CRUD Operations in MongoDB


Operation    Method Used                                            Description
Create       insert(), update() (with upsert set to true), save()   Adds new documents to a collection.
Read         find()                                                 Retrieves documents from a collection.
Update       update() (with upsert set to false)                    Modifies existing documents in a collection.
Delete       remove()                                               Deletes documents from a collection.

2. Creating a Collection (Example: "Person")


Objective:
Create a collection named "Person" and check if it exists.
Step 1: Show Existing Collections
> show collections
Example Output (Before Creation)
students
food
system.indexes
system.js

Step 2: Create the "Person" Collection


> db.createCollection("Person")
Step 3: Verify the Collection is Created
> show collections
Example Output (After Creation)
Person
students
food
system.indexes
system.js

3. Dropping a Collection (Example: "food")


Objective:
Delete the collection named "food" from the database.
Step 1: Show Existing Collections
> show collections
Example Output (Before Deletion)
Person
students
food
system.indexes
system.js

Step 2: Drop the "food" Collection


> db.food.drop()
Output:
true

Step 3: Verify the Collection is Deleted


> show collections
Example Output (After Deletion)
Person
students
system.indexes
system.js

4.1 Insert Method


MongoDB Insert Method
1. Creating a Collection: "Students"
Objective: Create a new collection named "Students" and verify its creation.
Steps:
1. Check existing collections:
> show collections
Output:
system.indexes
system.js

2. Insert an initial document to create the collection:


> db.Students.insert({_id: 1, StudName: "Michelle Jacintha", Grade: "VII",
Hobbies: "Internet Surfing"})

3. Verify the collection is created:


> show collections
Output:
Students
system.indexes
system.js

2. Inserting Another Document into the Collection


Objective: Insert a new document for "Mabel Mathews" into the "Students" collection.
Steps:
1. Insert the document:
> db.Students.insert({_id: 2, StudName: "Mabel Mathews", Grade: "VII",
Hobbies: "Baseball"})
Output:
WriteResult({ "nInserted" : 1 })

2. Verify the inserted document:


> db.Students.find().pretty()
Output:
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
3. Upsert: Insert or Update Document (Aryan David)
Objective:
• Insert "Aryan David" into the "Students" collection if he does not already exist.
• If the document exists, update Hobbies from "Skating" to "Chess".
Steps:
1. Check if Aryan David exists:
> db.Students.find().pretty()
Output (Before inserting Aryan David):
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}

2. Insert or Update (Upsert) the document:


> db.Students.update({_id: 3, StudName: "Aryan David", Grade: "VII"}, {$set:
{Hobbies: "Skating"}}, {upsert: true})
Output:
WriteResult({ "nMatched" : 0, "nUpserted" : 1, "nModified": 0 })

3. Verify the inserted document:


> db.Students.find().pretty()
Output:
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Skating"
}

4. Update hobbies from "Skating" to "Chess" using Upsert:


> db.Students.update({_id: 3}, {$set: {Hobbies: "Chess"}}, {upsert: true})
Output:
WriteResult({ "nMatched" : 1, "nUpserted": 0, "nModified": 1 })

5. Verify the update:


> db.Students.find({_id: 3}).pretty()
Output:
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}

Testing Upsert Behavior:

• upsert: true inserts the document if it is missing; otherwise it updates the existing document.
• upsert: false never inserts; it updates only if the document exists.
> db.Students.update({_id: 3}, {$set: {Hobbies: "Chess"}}, {upsert: false})
Output:
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 0 })
• "nModified" : 0 because Hobbies is already "Chess", so the matched document is left unchanged.
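
The three counters in WriteResult can be reproduced with a small in-memory model. The function toyUpdate below is a hypothetical teaching aid in plain JavaScript, not a MongoDB API: it matches on _id only, supports only setting top-level fields, and compares primitive values. It also shows that an update which writes an identical value matches the document but does not count as a modification.

```javascript
// In-memory sketch of update() with the upsert option (teaching model only).
function toyUpdate(collection, id, setFields, { upsert = false } = {}) {
  const doc = collection.find((d) => d._id === id);

  if (!doc) {
    if (upsert) {
      collection.push({ _id: id, ...setFields }); // nothing matched: insert
      return { nMatched: 0, nUpserted: 1, nModified: 0 };
    }
    return { nMatched: 0, nUpserted: 0, nModified: 0 }; // no match, no insert
  }

  // A matched document counts as modified only if some value actually changes.
  const changed = Object.keys(setFields).some((key) => doc[key] !== setFields[key]);
  Object.assign(doc, setFields);
  return { nMatched: 1, nUpserted: 0, nModified: changed ? 1 : 0 };
}

const upsertDemo = [];
const firstCall = toyUpdate(upsertDemo, 3, { Hobbies: "Skating" }, { upsert: true });
const secondCall = toyUpdate(upsertDemo, 3, { Hobbies: "Chess" }, { upsert: true });
const thirdCall = toyUpdate(upsertDemo, 3, { Hobbies: "Chess" }, { upsert: false });
console.log(firstCall, secondCall, thirdCall);
```

The first call inserts, the second modifies, and the third finds the document already in the requested state.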

4. Using the save() Method to Insert a Document


Objective: Insert a document for "Vamsi Bapat" into the "Students" collection without
specifying _id.
Steps:
1. Insert the document using save():
> db.Students.save({StudName: "Vamsi Bapat", Grade: "VI"})
Output:
WriteResult({ "nInserted" : 1 })
2. Verify the inserted document:
> db.Students.find().pretty()
Output:
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}
{
"_id" : ObjectId("5f4c8e3c2a3aeb5f6b89e4a3"),
"StudName" : "Vamsi Bapat",
"Grade" : "VI"
}
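
The decision the legacy save() method makes can be sketched as follows: with no _id it behaves like an insert, and with an _id it either inserts the document or replaces the existing one as a whole. The helper toySave and the "generated-N" identifiers are hypothetical stand-ins (a real shell generates an ObjectId); this is plain JavaScript for illustration, not MongoDB code.

```javascript
// Sketch of the legacy save() decision: insert vs. replace (teaching model only).
let nextGeneratedId = 1; // stand-in for ObjectId generation

function toySave(collection, doc) {
  if (doc._id === undefined) {
    // No _id supplied: generate one and insert.
    collection.push({ _id: `generated-${nextGeneratedId++}`, ...doc });
    return "inserted";
  }
  const index = collection.findIndex((d) => d._id === doc._id);
  if (index === -1) {
    collection.push({ ...doc }); // _id supplied but not present: insert
    return "upserted";
  }
  collection[index] = { ...doc }; // _id present: the whole document is replaced
  return "replaced";
}

const saveDemo = [
  { _id: 1, StudName: "Michelle Jacintha", Grade: "VII", Hobbies: "Internet Surfing" },
];
const withoutId = toySave(saveDemo, { StudName: "Vamsi Bapat", Grade: "VI" });
const withKnownId = toySave(saveDemo, { _id: 1, StudName: "Michelle Jacintha", Grade: "VIII" });
console.log(withoutId, withKnownId, saveDemo);
```

Because the second call replaces the document rather than merging into it, the Hobbies field of _id 1 is gone afterwards. That is the practical difference between save() on an existing _id and update() with $set.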

4.2 Save() Method


Save() Method & Update with Upsert in MongoDB
Objective:
1. Insert "Hersch Gibbs" into the "Students" collection using the update() method.
2. First, set upsert: false (no insertion if not found).
3. Then, set upsert: true (insert if not found, otherwise update).
4. Confirm that the document was inserted or updated.

Step 1: Check Existing Documents


Before making any updates, check the existing data.
Command:
> db.Students.find().pretty()
Output (Before inserting Hersch Gibbs):
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}
{
"_id" : ObjectId("5f4c8e3c2a3aeb5f6b89e4a3"),
"StudName" : "Vamsi Bapat",
"Grade" : "VI"
}

Step 2: Try to Update with upsert: false (Document Does Not Exist Yet)
Since Hersch Gibbs does not exist yet, this will not insert a new document.
Command:
> db.Students.update(
{_id: 4, StudName: "Hersch Gibbs", Grade: "VII"},
{$set: {Hobbies: "Graffiti"}},
{upsert: false}
)
Output:
WriteResult({ "nMatched" : 0, "nModified" : 0, "nUpserted" : 0 })
• Since upsert: false, no new document is inserted.
• "nMatched" : 0 → No matching document was found.
• "nUpserted" : 0 → No document was inserted.
Check the Collection Again:
> db.Students.find().pretty()
Output (No change in the collection):
{ "_id" : 1, "StudName" : "Michelle Jacintha", "Grade" : "VII", "Hobbies" : "Internet Surfing" }
{ "_id" : 2, "StudName" : "Mabel Mathews", "Grade" : "VII", "Hobbies" : "Baseball" }
{ "_id" : 3, "StudName" : "Aryan David", "Grade" : "VII", "Hobbies" : "Chess" }
{ "_id" : ObjectId("5f4c8e3c2a3aeb5f6b89e4a3"), "StudName" : "Vamsi Bapat", "Grade" : "VI" }
• No new document for Hersch Gibbs is inserted.

Step 3: Update with upsert: true (Insert if Not Found)


Now, we try the update operation with upsert: true, meaning:
• If the document exists, it will be updated.
• If it does not exist, it will be inserted.
Command:
> db.Students.update(
{_id: 4, StudName: "Hersch Gibbs", Grade: "VII"},
{$set: {Hobbies: "Graffiti"}},
{upsert: true}
)
Output:
WriteResult({ "nMatched" : 0, "nModified" : 0, "nUpserted" : 1 })
• "nUpserted" : 1 → A new document has been inserted.
Check the Collection Again:
> db.Students.find().pretty()
Output (Now Hersch Gibbs Exists in the Collection):
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}
{
"_id" : ObjectId("5f4c8e3c2a3aeb5f6b89e4a3"),
"StudName" : "Vamsi Bapat",
"Grade" : "VI"
}
{
"_id" : 4,
"StudName" : "Hersch Gibbs",
"Grade" : "VII",
"Hobbies" : "Graffiti"
}
• Hersch Gibbs has been successfully inserted.
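
The inserted document contains StudName and Grade even though the update only sets Hobbies. When an upsert finds no match, the new document is assembled from the equality conditions of the query plus the fields the update sets. The sketch below models just that assembly step in plain JavaScript; buildUpsertDocument is a hypothetical helper that handles only simple equality queries and $set.

```javascript
// Sketch: how an upsert assembles the new document when nothing matches.
// Simplification: equality-only query, $set-only update.
function buildUpsertDocument(query, update) {
  return { ...query, ...(update.$set ?? {}) };
}

const upsertedDocument = buildUpsertDocument(
  { _id: 4, StudName: "Hersch Gibbs", Grade: "VII" }, // query part
  { $set: { Hobbies: "Graffiti" } }                   // update part
);
console.log(upsertedDocument);
```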

4.3 Updating Information In-Place (Update Method)


Adding a New Field to an Existing Document - Update Method
Objective:
• Add a new field "Location" with the value "Newark" to the document where _id: 4
in the Students collection.

Step 1: Check Existing Document (_id: 4)


Before updating, we check if the document with _id: 4 exists and what its current fields
are.
Command:
> db.Students.find({_id: 4}).pretty()
Output (Before Update):
{
"_id": 4,
"StudName": "Hersch Gibbs",
"Grade": "VII",
"Hobbies": "Graffiti"
}
• The "Location" field is not present yet.

Step 2: Add the "Location" Field Using $set


Now, we add a new field "Location" with the value "Newark".
Command:
> db.Students.update(
{ _id: 4 },
{ $set: { Location: "Newark" } }
)
Output:
WriteResult({ "nMatched": 1, "nModified": 1 })
• "nMatched": 1 → Found 1 matching document.
• "nModified": 1 → Successfully updated the document.

Step 3: Confirm the Update


Check the document again to verify that the "Location" field has been added.
Command:
> db.Students.find({_id: 4}).pretty()
Output (After Update):
{
"_id": 4,
"StudName": "Hersch Gibbs",
"Grade": "VII",
"Hobbies": "Graffiti",
"Location": "Newark"
}
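
The effect of $set on a single document can be modelled in plain JavaScript: a field that is absent is added, and a field that is present is overwritten. The helper applySet is a hypothetical name, and unlike the database (which updates the stored document) this sketch returns a new object so the before and after states can be compared.

```javascript
// Sketch of $set on one document: add the field if absent, overwrite if present.
function applySet(doc, fields) {
  return { ...doc, ...fields }; // new object; the input is left untouched
}

const beforeSet = { _id: 4, StudName: "Hersch Gibbs", Grade: "VII", Hobbies: "Graffiti" };
const afterSet = applySet(beforeSet, { Location: "Newark" });
console.log(afterSet);
```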

4.4 Removing an Existing Field (Remove Method)

An existing field is removed from a document with the $unset operator, passed to
update(). The remove() method, by contrast, deletes whole documents.


Objective:
• Remove the "Location" field from the document with _id: 4 in the Students
collection.

Step 1: Check Existing Document (_id: 4)


Before removing the field, let's check if the document contains "Location": "Newark".
Command:
> db.Students.find({_id: 4}).pretty()
Output (Before Removal):
{
"_id": 4,
"StudName": "Hersch Gibbs",
"Grade": "VII",
"Hobbies": "Graffiti",
"Location": "Newark"
}
• The "Location" field is present.
Step 2: Remove the "Location" Field Using $unset
Now, we remove the "Location" field from this document.
Command:
> db.Students.update(
{ _id: 4 },
{ $unset: { Location: "" } }
)
Output:
WriteResult({ "nMatched": 1, "nModified": 1 })
• "nMatched": 1 → Found one matching document.
• "nModified": 1 → Successfully removed the field.

Step 3: Confirm the Update


Check the document again to verify that the "Location" field has been removed.
Command:
> db.Students.find({_id: 4}).pretty()
Output (After Removal):
{
"_id": 4,
"StudName": "Hersch Gibbs",
"Grade": "VII",
"Hobbies": "Graffiti"
}
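
Two details of $unset are easy to miss: the value given for the field (the empty string in { Location: "" }) plays no role, and unsetting a field that does not exist changes nothing. The plain JavaScript sketch below models both; applyUnset is a hypothetical helper that works on a copy of one document and reports a modification count in the style of WriteResult.

```javascript
// Sketch of $unset on one document (teaching model only).
function applyUnset(doc, fields) {
  const result = { ...doc };
  let nModified = 0;
  // Only the field names matter; the values in `fields` are ignored.
  for (const name of Object.keys(fields)) {
    if (name in result) {
      delete result[name];
      nModified = 1;
    }
  }
  return { doc: result, nModified };
}

const beforeUnset = {
  _id: 4, StudName: "Hersch Gibbs", Grade: "VII", Hobbies: "Graffiti", Location: "Newark",
};
const firstUnset = applyUnset(beforeUnset, { Location: "" });
const secondUnset = applyUnset(firstUnset.doc, { Location: "" }); // field already gone
console.log(firstUnset, secondUnset);
```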

4.5 Finding Documents Based on Search Criteria (Find Method)

4.6 Dealing with NULL Values


4.7 Count, Limit, Sort, and Skip
4.8 Arrays
4.9 Aggregate Function
4.10 MapReduce Function
4.11 JavaScript Programming
4.12 Cursors in MongoDB
4.13 Indexes
4.14 MongoImport
4.15 MongoExport
4.16 Automatic Generation of Unique Numbers for the "_id" Field
