NoSQL 24 Mongo P1
MongoDB
“A true NoSQL document-oriented database system”
Vinu Venugopal
ScaDS.ai Lab, IIIT Bangalore
NoSQL Systems
MongoDB
• Short form of “Humongous DB”
• Open source, but not an Apache project; MongoDB is today developed by a company of the same name (MongoDB, Inc.)
• Document oriented database: Stores all its data in the form of documents
• The data model is based on the JSON (JavaScript Object Notation) format
• Has its own replication scheme (instead of relying on HDFS) and its own MapReduce
integration (instead of relying on Hadoop).
MongoDB Access Mode
• Command-line shell
• We first need to start the server with “mongod”, a daemon process that handles
requests from clients.
• Then, running the “mongo” command opens an interactive shell.
MongoDB shell – basic commands
• To see the list of databases on the server:
> show dbs
• To see the database currently in use:
> db
test (default database)
MongoDB shell
• The db.stats() method returns a document with statistics about the database
system's state:
> db.stats()
{
  "db" : "test",
  "collections" : 1,
  "views" : 0,
  "objects" : 49,
  "avgObjSize" : 33,
  "dataSize" : 1617,
  "storageSize" : 36864,
  "numExtents" : 0,
  "indexes" : 1,
  "indexSize" : 36864,
  "scaleFactor" : 1,
  "fsUsedSize" : 85887053824,
  "fsTotalSize" : 499963174912,
  "ok" : 1
}
• objects: Number of objects (specifically, documents) in the database across all collections.
• dataSize: Total size of the uncompressed data held in the database. The dataSize decreases when you remove documents.
• storageSize: Sum of the space allocated to all collections in the database for document storage, including free space.
• indexes: Total number of indexes across all collections in the database.
Refer: https://docs.mongodb.com/manual/reference/command/dbStats/
MongoDB shell
SQL Terms/Concepts      MongoDB Terms/Concepts
database                database
table                   collection
row                     document
column                  field
primary key             primary key
index                   index
• To list the collections in the current database:
> show collections
• To insert a single document into the collection ‘testCollection’:
> db.testCollection.insert({x:1})
• If the collection does not exist, it is created on the fly
MongoDB shell
• To insert multiple documents using a for loop:
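The loop itself is not preserved on this slide. A minimal sketch of what such a loop could look like, reusing the testCollection collection from the previous slide; the field name x and the count of 10 are illustrative assumptions, not taken from the slides. The guard on db only exists so that the snippet also runs outside the shell.

```javascript
// Build ten small documents in a loop (field name and count are illustrative).
const docs = [];
for (let i = 0; i < 10; i++) {
  docs.push({ x: i });
}

// Inside the mongo shell, `db` refers to the current database.
// Outside the shell (e.g. plain Node.js) `db` is undefined, so the insert is skipped.
if (typeof db !== "undefined") {
  db.testCollection.insertMany(docs);
}
```

Collecting the documents first and inserting them with one insertMany call keeps the loop free of database round trips; calling insert once per iteration would also work in the shell.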
Scripting
• Can write mongo scripts in JavaScript
• Opening a new connection:
conn = new Mongo();
db = conn.getDB("myDatabase");
• If not on default port:
db = connect("localhost:27020/myDatabase");
MongoDB Data Model
Flexible Schema
• “Collections” (the MongoDB counterpart of tables) do not enforce a schema by default
• “Documents” within a collection can have different fields
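As a small illustration of the flexible schema, the sketch below models a collection as a plain JavaScript array holding two documents with different fields. The names and values are made up for this example and do not come from the slides.

```javascript
// Two documents that could live in the same collection: they share `name`
// but otherwise carry different fields (all values here are made up).
const people = [
  { name: "Asha", email: "asha@example.com" },
  { name: "Ravi", phone: "555-0100", tags: ["student"] },
];

// Collect the distinct field names used across the documents.
const fieldNames = [...new Set(people.flatMap((doc) => Object.keys(doc)))].sort();
console.log(fieldNames);
```

In a relational table both rows would need the same columns; here each document simply carries the fields it needs.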
MongoDB Data Model
References among Documents
• For many use cases in MongoDB, the denormalized data model is optimal.
MongoDB Data Model
References among Documents
• Stores the relationships between data by including links or references from one
document to another
{
title: "50 Tips and Tricks for the MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher: { name: "O'Reilly Media", founded: 1980, location: "CA" }
}
MongoDB Data Model
Modeling One-to-Many Relationships: several options exist.
Option 1: Embedded Document. (The publisher is repeated in every book it publishes.)
{
title: "50 Tips and Tricks for the MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher: { name: "O'Reilly Media", founded: 1980, location: "CA" }
}
Option 2: Separate Documents with a Mutable Array. (This can avoid redundancies.)
{ _id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216, language: "English" }
{ _id: 234567890,
title: "50 Tips and Tricks for the MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English" }
MongoDB Data Model
Modeling One-to-Many Relationships
Keep a reference to the publisher in each of its book documents:
{ _id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher_id: "oreilly" }
{ _id: 234567890,
title: "50 Tips and Tricks for the MongoDB Developer",
author: "Kristina Chodorow",
publisher_id: "oreilly",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English" }
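A reference such as publisher_id is just a stored value, so the application has to resolve it. The sketch below does this by hand over in-memory arrays; the publisher document is an assumption assembled from the publisher fields shown on the earlier slide plus the "oreilly" id. Against a real database, the same lookup would be a second query (for example a findOne on a publishers collection, a collection name that is also an assumption).

```javascript
// Assumed publisher document: the slides show only its _id value "oreilly";
// the other fields are copied from the embedded publisher on the earlier slide.
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" },
];

const books = [
  { _id: 123456789, title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: 234567890, title: "50 Tips and Tricks for the MongoDB Developer", publisher_id: "oreilly" },
];

// Follow the reference from a book to its publisher (null if it does not resolve).
function publisherOf(book) {
  return publishers.find((p) => p._id === book.publisher_id) || null;
}

// Go the other way: all books that reference a given publisher.
function booksOf(publisherId) {
  return books.filter((b) => b.publisher_id === publisherId);
}

console.log(publisherOf(books[0]).name);
console.log(booksOf("oreilly").map((b) => b.title));
```

The publisher data is stored once, so changing it touches a single document, at the cost of an extra lookup when reading a book together with its publisher.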
MongoDB Data Model
Comparison to Foreign Keys in SQL
MongoDB Data Model
Many-To-Many (N:M)
Book:
{ _id: 1,
title: "A tale of two people",
categories: ["drama"],
authors: ["A", "B"] }
Author:
{ _id: "A",
name: "Peter Standford",
books: [1, 2] }
Store the _id of each author in the book document, and the _id of each book in the author document.
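With ids stored on both sides, the relationship can be followed in either direction. A minimal in-memory sketch using only the two documents shown on the slide; book 2 and author "B" are referenced but not shown, so they are deliberately left out here and simply do not resolve.

```javascript
// The two documents from the slide, held in plain arrays for illustration.
const books = [
  { _id: 1, title: "A tale of two people", categories: ["drama"], authors: ["A", "B"] },
];
const authors = [
  { _id: "A", name: "Peter Standford", books: [1, 2] },
];

// Book -> authors: look up every author id listed in the book.
function authorsOfBook(bookId) {
  const book = books.find((b) => b._id === bookId);
  if (!book) return [];
  return authors.filter((a) => book.authors.includes(a._id));
}

// Author -> books: look up every book id listed in the author.
function booksOfAuthor(authorId) {
  const author = authors.find((a) => a._id === authorId);
  if (!author) return [];
  return books.filter((b) => author.books.includes(b._id));
}

console.log(authorsOfBook(1).map((a) => a.name));
console.log(booksOfAuthor("A").map((b) => b.title));
```

Keeping ids on both sides makes both lookups cheap, but every change to the relationship has to update two documents.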
MongoDB Data Model
Many-To-Many (N:M)
• There is no silver bullet: always choose the data model that best fits how your
data will be queried.
Data Storage & Replication
Sharding
Sharded and Non-Sharded Collections
[Figure: a sharded cluster with two replica sets, Replica-set 1 and Replica-set 2]
• There are multiple replica sets, each responsible for a particular shard.
• One mongod instance in each replica set acts as the primary node.
• Here, Replica-set 1 acts as the primary shard, which holds the non-sharded collections.
Refer: https://docs.mongodb.com/manual/sharding/
Connecting to a Sharded Cluster
Clients connect to a mongos instance, which routes each request to the primary mongod of
the shard on which the requested data resides.
Range- vs. Hash-Partitioning
MongoDB supports two main sharding strategies.
Option 1: Range-Partitioning
• Corresponds to computing x div m, where x is the shard key value and m is the desired
range (width) of a partition.
Option 2: Hash-Partitioning
• Corresponds to computing x mod n, where x is the shard key value and n is the desired
number of partitions.
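The two formulas can be tried out directly. The sketch below follows the simplified model on this slide, with integer keys and the key itself used as its hash; the function names and sample values are illustrative assumptions. MongoDB's actual hashed sharding first applies a hash function to the shard key value and then distributes the hashed values, so this is a model of the idea rather than of the implementation.

```javascript
// Range partitioning: x div m. Keys 0..m-1 land in partition 0, m..2m-1 in partition 1, ...
function rangePartition(x, m) {
  return Math.floor(x / m);
}

// Hash partitioning: x mod n (the key is used as its own hash for simplicity).
// The double modulus keeps the result in 0..n-1 even for negative keys.
function hashPartition(x, n) {
  return ((x % n) + n) % n;
}

const keys = [3, 17, 42, 99, 100, 101];

console.log(keys.map((x) => rangePartition(x, 50))); // [0, 0, 0, 1, 2, 2]
console.log(keys.map((x) => hashPartition(x, 4)));   // [3, 1, 2, 3, 0, 1]
```

Range partitioning keeps neighbouring keys together, which suits range queries but can overload one partition when keys grow monotonically; hash partitioning spreads neighbouring keys across partitions, which balances writes but scatters range queries.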
MongoDB CRUD Operations
CRUD Operations:
• CREATE
• READ
• UPDATE
• DELETE
• BULK WRITE
CREATE
• If the collection does not currently exist, insert operations will create the collection
Methods:
db.collection.insertOne()
db.collection.insertMany()
const doc1 = { "name": "basketball", "category": "sports", "quantity": 20, "reviews": [] };
const doc2 = { "name": "football", "category": "sports", "quantity": 30, "reviews": [] };
db.itemsCollection.insertMany([doc1, doc2])
READ
Methods:
db.collection.find()
UPDATE
Methods:
db.collection.updateOne()
db.collection.updateMany()
db.collection.replaceOne()
DELETE
Methods:
db.collection.deleteOne()
db.collection.deleteMany()
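The read, update and delete methods listed above can be sketched together against the itemsCollection used in the insertMany example. The filter values, the new quantity and the function name crudSketch are illustrative assumptions; the function takes the database handle as a parameter and only runs when the shell's db object is available.

```javascript
// Filters and update documents are plain JavaScript objects.
const byName = { name: "basketball" };
const setQuantity = { $set: { quantity: 25 } };

// `db` is expected to behave like the shell's database handle.
function crudSketch(db) {
  const items = db.itemsCollection;

  // READ: fetch all documents in the "sports" category.
  const found = items.find({ category: "sports" }).toArray();

  // UPDATE: change one field of the first document matching the filter.
  items.updateOne(byName, setQuantity);

  // DELETE: remove the first document matching the filter.
  items.deleteOne({ name: "football" });

  return found.length;
}

// Only run against a real database when executed inside the shell.
if (typeof db !== "undefined") {
  crudSketch(db);
}
```

updateOne changes only the fields named under $set, whereas replaceOne would swap the whole document for a new one.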
BULK WRITE
• Provides the ability to perform bulk insert, update and remove operations
Methods:
db.collection.bulkWrite()
Supports the following write operations: insertOne, updateOne, updateMany, replaceOne,
deleteOne, deleteMany
db.characters.bulkWrite(
  [
    { insertOne :
      { "document" :
        { "_id" : 4, "char" : "Dithras", "class" : "barbarian", "lvl" : 4 }
      }
    },
    { insertOne :
      { "document" :
        { "_id" : 5, "char" : "Taeln", "class" : "fighter", "lvl" : 3 }
      }
    },
    { updateOne :
      { "filter" : { "char" : "Eldon" },
        "update" : { $set : { "status" : "Critical Injury" } }
      }
    },
    { deleteOne :
      { "filter" : { "char" : "Brisbane" } }
    },
    { replaceOne :
      { "filter" : { "char" : "Meldane" },
        "replacement" : { "char" : "Tanys", "class" : "oracle", "lvl" : 4 }
      }
    }
  ]
)
Bulkloading Large Files: mongoimport
• mongoimport bulk-loads large data volumes into a MongoDB collection; it supports the
JSON, CSV and TSV formats (all of which are text-based)