MongoDB
Session 1
Big Data?
Big data is a term that describes the large volume of data – both
structured and unstructured.
Remember:
• Horizontal scaling means that you scale by adding more machines into your pool of resources.
• Vertical scaling means that you scale by adding more power (CPU, RAM) to an existing machine.
Why NoSQL
When to use NoSQL:
• When huge amounts of data need to be stored and retrieved.
• When the relationships between the data you store are not that important.
• When the data changes over time and is not structured.
• When support for constraints and joins is not required at the database level.
• When the data grows continuously and you need to scale the database regularly to handle it.
Remember:
• Data Persistence on Server-Side via NoSQL.
• Does not use SQL-like query language.
• Longer persistence
• Store massive amounts of data.
• Systems can be scaled.
• High availability.
• Semi-structured data.
• Support for numerous concurrent connections.
• Indexing of records for faster retrieval
NoSQL Categories
There are 4 basic types of NoSQL databases.
Key-value stores, or key-value databases, implement a simple data model that pairs
a unique key with an associated value.
Key-value stores e.g.
• Redis
Document databases, also called document stores, store semi-structured data and
descriptions of that data in document format.
Document oriented e.g.
• MongoDB, CouchDB
Column-oriented stores e.g.
• Cassandra
Graph databases
An object is an unordered set of name/value pairs.
Relational databases are commonly referred to as SQL databases because they use SQL
(structured query language) as a way of storing and querying the data.
Difference:
• SQL databases are table based, whereas NoSQL databases are document based, key-value pairs, or
wide-column stores. This means that SQL databases represent data in the form of tables, which consist
of n rows of data, whereas NoSQL databases are collections of key-value pairs, documents, or
wide-column stores, which do not have standard schema definitions.
• SQL databases have a predefined schema, whereas NoSQL databases have a dynamic schema for
unstructured data.
• SQL databases are typically scaled vertically, whereas NoSQL databases are typically scaled horizontally.
• SQL databases use SQL (structured query language) for defining and manipulating the data. In
a NoSQL database, queries are focused on collections of documents.
Types of Data
Structured, Semi-Structured, Unstructured
• Semi-Structured
Semi-structured data does not have the formal structure of a data model, such as a table
definition in a relational DBMS. XML files and JSON documents are examples of semi-structured data.
• Unstructured
Data that has no known form, cannot be stored in an RDBMS, and cannot be analyzed unless it is
transformed into a structured format is called unstructured data. Text files and multimedia content such as
images, audio, and video are examples of unstructured data.
MongoDB
MongoDB is a cross-platform, document-oriented database program, classified as a NoSQL database.
Remember:
• MongoDB documents are similar to JSON objects (field and value pairs).
• The values of fields may include other documents, arrays, or arrays of documents.
Core MongoDB operations (CRUD): CRUD stands for create, read, update, and delete.
SQL/MongoDB Terms:
MongoDB stores data as BSON documents. BSON is a binary representation of JSON
documents.
document
MongoDB documents are composed of field-and-value pairs. The value of a field can be
any of the BSON data types, including other documents, arrays, and arrays of
documents.
The field name _id is reserved for use as a primary key; its value must be unique in the
collection, is immutable, and may be of any type other than an array.
Note:
• MongoDB does not support duplicate field names.
db
In the mongo shell, db is the variable that references the current database. The
variable is automatically set to the default database test, or is set when you run
use <db_name> to switch the current database.
                  MongoDB   Redis          MySQL    Oracle
Database Server   mongod    redis-server   mysqld   oracle
Database Client   mongo     redis-cli      mysql    sqlplus
Start the server and client
To start the MongoDB server, execute mongod.exe.
Note: Always enclose the --dbpath value in double quotes ("").
• The --dbpath option points to your database directory.
• The --bind_ip_all option binds to all IP addresses.
• The --bind_ip <arg> option takes a comma-separated list of IP addresses to listen on (localhost by default).
Comparison operators
$gte  Matches values that are greater than or equal to a specified value.
$lte  Matches values that are less than or equal to a specified value.
$ne   Matches all values that are not equal to a specified value.
Syntax:
$gt   { field: { $gt: value } }
$gte  { field: { $gte: value } }
$lt   { field: { $lt: value } }
$lte  { field: { $lte: value } }
$in   { field: { $in: [ <value1>, <value2>, ..., <valueN> ] } }
$nin  { field: { $nin: [ <value1>, <value2>, ..., <valueN> ] } }
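The comparison operators can be pictured with a small plain-JavaScript model. This is only a sketch for intuition, not how the server evaluates queries; the helper matches() and the sample emp documents are hypothetical and not part of MongoDB.

```javascript
// Simplified in-memory model of the comparison operators (illustration only).
const comparison = {
  $gt:  (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt:  (v, x) => v < x,
  $lte: (v, x) => v <= x,
  $ne:  (v, x) => v !== x,
  $in:  (v, list) => list.includes(v),
  $nin: (v, list) => !list.includes(v),
};

// True when doc[field] satisfies every operator in the condition,
// e.g. matches(doc, "sal", { $gte: 1000, $lte: 3000 }).
function matches(doc, field, condition) {
  return Object.entries(condition).every(([op, arg]) => comparison[op](doc[field], arg));
}

// Hypothetical sample documents.
const emp = [
  { ename: "ram", job: "manager", sal: 5000 },
  { ename: "sham", job: "clerk", sal: 1200 },
  { ename: "sita", job: "salesman", sal: 2500 },
];

const midRange = emp.filter((d) => matches(d, "sal", { $gte: 1000, $lte: 3000 }));
const notClerks = emp.filter((d) => matches(d, "job", { $nin: ["clerk"] }));
console.log(midRange.map((d) => d.ename));  // sham, sita
console.log(notClerks.map((d) => d.ename)); // ram, sita
```

In the mongo shell the same conditions would be passed as the query document, e.g. db.emp.find({ sal: { $gte: 1000, $lte: 3000 } }).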
Logical operators
$and  Joins query clauses with a logical AND; returns all documents that match the conditions of both clauses.
Syntax:
$and  { $and: [ { <expr1> }, { <expr2> }, ... , { <exprN> } ] }
$not  { field: { $not: { <operator-expression> } } }
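A rough plain-JavaScript model of how $and and $not combine conditions. It is a sketch under simplifying assumptions: only $gt and $lt are supported as operator expressions, and matchField() and matchDoc() are hypothetical helper names, not MongoDB functions.

```javascript
// Simplified in-memory model of $and and $not (illustration only).
const ops = { $gt: (v, x) => v > x, $lt: (v, x) => v < x };

// Evaluate an operator expression such as { $gt: 1000 } or { $not: { $gt: 1000 } }.
function matchField(value, condition) {
  return Object.entries(condition).every(([op, arg]) =>
    op === "$not" ? !matchField(value, arg) : ops[op](value, arg));
}

// Evaluate a query document; $and requires every sub-query to match.
function matchDoc(doc, query) {
  return Object.entries(query).every(([key, cond]) =>
    key === "$and" ? cond.every((q) => matchDoc(doc, q)) : matchField(doc[key], cond));
}

const between = { $and: [{ sal: { $gt: 1000 } }, { sal: { $lt: 3000 } }] };
const notHigh = { sal: { $not: { $gt: 2000 } } };
console.log(matchDoc({ sal: 2500 }, between)); // true
console.log(matchDoc({ sal: 2500 }, notHigh)); // false
console.log(matchDoc({ sal: 1500 }, notHigh)); // true
```

The query documents between and notHigh have the same shape that db.emp.find() accepts in the shell.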
The _id field must have a unique value. You can think of the _id field as the
document’s primary key.
ObjectId()
MongoDB uses ObjectIds as the default value of the _id field of each document; the value is
generated automatically when a document is created.
• x = ObjectId()
show databases
Print a list of all available databases.
db.getName()
• db
• db.getName() // returns the current database name
To access an element of an array by its zero-based index position, concatenate the
array name with a dot (.) and the zero-based index position, and enclose the result in quotes.
use database
Switch current database to <db>. The mongo shell variable db is set to
the current database.
use <db>
• use db1
db.dropDatabase()
Removes the current database, deleting the associated data files.
db.dropDatabase()
• use db1
• db.dropDatabase()
Infoway Technologies, 3rd Floor Commerce Centre, Rambaug Colony, Paud Road Pune 411038
If the MySQL export to a file does not work, change the following setting in the my.ini file:
secure_file_priv = ""
• SELECT * FROM emp INTO OUTFILE "d:/emp.csv" FIELDS TERMINATED BY ',';
mongoimport
The mongoimport tool imports content from an Extended JSON, CSV, or TSV
export created by mongoexport, or another third-party export tool.
mongoimport - JSON
mongoimport <--host> <--port> <--db> <--collection> <--type> <--file> <--fields "Field-List">
<--mode {insert | upsert | merge}> <--jsonArray> <--drop>
mongoimport <--host> <--port> <--db> <--collection> <--type> <--file>
<--fields "<field1>[,<field2>]*" | --headerline> <--useArrayIndexFields>
• C:\> mongoimport --host 192.168.100.20 --port 27017 --db db1 --collection emp --type csv --file d:\emp.csv --headerline
• C:\> mongoimport --host 192.168.100.20 --port 27017 --db db1 --collection emp --type csv --file d:\emp.csv --fields "EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO,BONUSID,USERNAME,PWD"
• C:\> mongoimport --db db1 --collection o --type csv --file d:\emp.csv --columnsHaveTypes --fields "EMPNO.int32(),ENAME.string(),JOB.string(),MGR.int32(),HIREDATE.date(2006-01-02),SAL.int32(),COMM.int32(),DEPTNO.int32(),BONUSID.int32(),USERNAME.string(),PWD.string()"
Note:
• There should be no blank space in the field list.
e.g.
_id, ename, salary #this is an error
mongoimport - CSV
With the --useArrayIndexFields option, numeric suffixes in field names such as modules.0 are
treated as array indexes. Sample CSV file:
_id,course,duration,modules.0,modules.1,modules.2,modules.3
1,course1,6 months,c++,database,java,.net
2,course2,6 months,c++,database,python,R
3,course3,6 months,c++,database,awp,.net
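The effect of array-index field names on the imported documents can be sketched in plain JavaScript. This is an illustration, not mongoimport's implementation: rowToDocument() is a hypothetical helper, it handles only one level of nesting, and it keeps every value as a string, whereas the real tool also infers types.

```javascript
// Sketch: turn one CSV row into a document, treating a numeric suffix
// (modules.0, modules.1, ...) as an array index. Illustration only.
function rowToDocument(header, row) {
  const doc = {};
  header.forEach((name, i) => {
    const [field, index] = name.split(".");
    if (index === undefined) {
      doc[field] = row[i];               // plain field
    } else {
      doc[field] = doc[field] || [];     // create the array on first use
      doc[field][Number(index)] = row[i];
    }
  });
  return doc;
}

const header = "_id,course,duration,modules.0,modules.1,modules.2,modules.3".split(",");
const row = "1,course1,6 months,c++,database,java,.net".split(",");
const course = rowToDocument(header, row);
console.log(JSON.stringify(course));
```

The four modules.N columns end up in a single modules array instead of four separate fields.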
mongoexport
mongoexport <--host> <--port> <--db> <--collection> <--type> <--fields> <--out>
• C:\> mongoexport --host 192.168.0.6 --port 27017 --db db1 --collection emp --type json --out "d:\emp.json"
• C:\> mongoexport --host 192.168.0.6 --port 27017 --db db1 --collection emp --type json --out "d:\emp.json" --fields "empno,ename,job"
• C:\> mongoexport --host 192.168.0.6 --port 27017 --db db1 --collection emp --type csv --out "d:\emp.csv" --fields "empno,ename,job"
Note:
• There should be no blank space in the field list.
e.g.
_id, ename, salary #this is an error
new Date()
new Date() returns the current date and time as a Date object. Calling Date() without
new returns the current date as a string.
• x = new Date()
db.getCollectionNames()
Returns an array containing the names of all collections and views in
the current database.
• show collections
• db.getCollectionNames();
db.createCollection()
Creates a new collection or view.
Capped collections have maximum size or document counts that prevent them from
growing beyond maximum thresholds. All capped collections must specify a maximum
size and may also specify a maximum document count. MongoDB removes older
documents if a collection reaches the maximum size limit before it reaches the
maximum document count.
• db.createCollection("log");
• db.createCollection("log", { capped: true, size: 1, max: 2 });
// This command creates a collection named log with a maximum size of 1 byte and a maximum
// of 2 documents.
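The "oldest documents are removed first" behaviour of a capped collection can be modelled in a few lines of plain JavaScript. This is only a sketch: the CappedCollection class is hypothetical and enforces the document-count limit (max) only, not the size limit in bytes.

```javascript
// Simplified model of a capped collection limited by document count only.
class CappedCollection {
  constructor(max) {
    this.max = max;   // maximum number of documents
    this.docs = [];   // documents in insertion order
  }
  insertOne(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) {
      this.docs.shift(); // drop the oldest document to make room
    }
  }
}

const log = new CappedCollection(2);
["a", "b", "c"].forEach((msg) => log.insertOne({ msg }));
console.log(log.docs.map((d) => d.msg)); // b, c
```

After the third insert only the two most recent documents remain.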
db.collection.isCapped()
Returns true if the collection is a capped collection, otherwise
returns false.
• db.log.isCapped();
db.getCollection()
Returns a collection or a view object that is in the DB.
db.getCollection('name')
• db.getCollection('emp').find();
auth.insertOne( doc )
db.getSiblingDB()
Returns another database without modifying the db variable in the shell environment, so you
can access another database without switching databases.
db.getSiblingDB(<database>)
• db.getSiblingDB('db1').getCollectionNames();
db.collection.renameCollection()
Renames a collection.
db.collection.renameCollection(target, dropTarget)
• db.emp.renameCollection('employee', false);
dropTarget : If true, mongod drops the target of renameCollection prior to renaming the
collection. The default value is false.
db.collection.drop()
Removes a collection or view from the database. The method also removes
any indexes associated with the dropped collection.
db.collection.drop(<options>)
• db.emp.drop();
Method
.pretty()
Embedded Field Specification
For fields in embedded documents, you can specify the field using either:
• dot notation; e.g. "field.nestedfield": <value>
• nested form; e.g. { field: { nestedfield: <value> } }
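How a dot-notation path such as "meta.browser" or "tags.1" reaches into a document can be sketched in plain JavaScript. The helper getByPath() and the sample book document are hypothetical; this is a model for intuition, not the server's path resolution.

```javascript
// Sketch: resolve a dot-notation path against a nested document.
// Numeric segments work as zero-based array indexes.
function getByPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Hypothetical sample document.
const book = {
  title: "MongoDB Tutorial",
  tags: ["tutorial", "noSQL"],
  meta: { browser: "Google Chrome", os: "Microsoft Windows7" },
};

console.log(getByPath(book, "meta.browser"));    // Google Chrome
console.log(getByPath(book, "tags.1"));          // noSQL
console.log(getByPath(book, "meta.os.version")); // undefined
```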
db.collection.find()
The find() method always returns the _id field unless you specify _id: 0/false
to suppress the field.
By default, the mongo shell prints the first 20 documents and prompts the user to
type "it" to continue iterating over the next 20 results.
query: Specifies selection filter using query operators. To return all documents in a collection, omit this parameter or pass
an empty document ({}).
{ "<Field Name>": { "<Comparison Operator>": <Comparison Value> } }
projection: Specifies the fields to return in the documents that match the query filter. To return all fields in the matching
documents, omit this parameter.
{ "<Field Name>": <Boolean Value> }
• 1 or true to include the field in the return documents. Non-zero integers are also treated as true.
• 0 or false to exclude the field.
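The projection rules above, including the special handling of _id, can be modelled roughly in plain JavaScript. The project() helper is hypothetical and covers only top-level fields; it is a sketch of the rules, not the server's implementation.

```javascript
// Sketch of projection: include-mode when any non-_id field is truthy,
// otherwise exclude-mode. _id is returned unless explicitly set to 0/false.
function project(doc, projection) {
  const including = Object.entries(projection).some(([f, v]) => f !== "_id" && v);
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    let keep;
    if (key === "_id") {
      keep = !("_id" in projection) || Boolean(projection._id);
    } else if (including) {
      keep = Boolean(projection[key]);   // only listed fields are returned
    } else {
      keep = !(key in projection);       // listed fields are removed
    }
    if (keep) out[key] = value;
  }
  return out;
}

const employee = { _id: 1, ename: "ram", job: "manager", sal: 5000 };
console.log(project(employee, { ename: true, job: true }));             // _id, ename, job
console.log(project(employee, { _id: false, ename: true, job: true })); // ename, job
console.log(project(employee, { sal: 0 }));                             // _id, ename, job
```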
db['collection'].find({ query }, { projection })
db.collection.find({ query }, { projection })
db.getCollection('name').find({ query }, { projection })
• db.emp.find();
• db['emp'].find();
• db.getCollection('emp').find();
• db.getSiblingDB('db1').getCollection('emp').find();
• db.emp.find({job: 'manager'})
• db.emp.find({}, {ename: true, job: true});
• db.emp.find({sal:{ $gt: 4 }})
• db.emp.find({job: 'manager'}, {ename: true, job: true})
• db.emp.find({job: 'manager'}, {_id: false, ename: true, job: true})
• db.emp.find()[0];
• db.emp.find()[0].ename;
• db.getCollection('emp').find()[0];
• db.emp.find()[db.emp.find().count()-1]
cursor with db.collection.find()
In the mongo shell, if the returned cursor is not assigned to a variable using the var keyword,
the cursor is automatically iterated to access up to the first 20 documents that match the query.
cursor.limit(<number>)
db['collection'].find({ query }, { projection }).limit(<number>)
db.collection.find({ query }, { projection }).limit(<number>)
cursor.skip(<offset_number>)
db['collection'].find({ query }, { projection }).skip(<offset_number>)
db.collection.find({ query }, { projection }).skip(<offset_number>)
• db.emp.find().skip(4);
• db.emp.find().skip(db.emp.countDocuments({}) - 1);
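skip() and limit() are often combined for paging. The arithmetic can be sketched on a plain array; the page() helper and the sample names are hypothetical, and the array stands in for the cursor.

```javascript
// Sketch: page N of a result set is skip((N - 1) * pageSize).limit(pageSize).
function page(docs, pageNumber, pageSize) {
  const offset = (pageNumber - 1) * pageSize;   // documents to skip
  return docs.slice(offset, offset + pageSize); // then take at most pageSize
}

const names = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
console.log(page(names, 2, 4));              // e, f, g, h
console.log(page(names, 3, 4));              // i, j (last, partial page)
console.log(names.slice(names.length - 1));  // j, like skip(count - 1)
```

In the shell the equivalent of page 2 with 4 documents per page would be db.emp.find().skip(4).limit(4).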
count
Counts the number of documents referenced by a cursor. Append the
count() method to a find() query to return the number of matching
documents. The operation does not perform the query but instead counts
the results that would be returned by the query.
cursor.count()
db['collection'].find({ query }).count()
db.collection.find({ query }).count()
• db.emp.find().count();
• db.emp.find({job: 'manager'}).count();
db.collection.distinct()
Finds the distinct values for a specified field across a single
collection or view and returns the results in an array.
• db.emp.distinct("job")
• db.emp.distinct("job", { sal: { $gt: 5000 } } )
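What distinct() computes, with and without a query, can be modelled on a plain array. The distinct() function below is a hypothetical stand-in that takes a predicate instead of a query document, and the order of values returned by the real method may differ.

```javascript
// Sketch: collect the unique values of one field, optionally after filtering.
function distinct(docs, field, predicate = () => true) {
  return [...new Set(docs.filter(predicate).map((d) => d[field]))];
}

// Hypothetical sample documents.
const staff = [
  { ename: "ram", job: "manager", sal: 6000 },
  { ename: "sham", job: "clerk", sal: 1200 },
  { ename: "sita", job: "manager", sal: 5500 },
  { ename: "gita", job: "salesman", sal: 7000 },
];

console.log(distinct(staff, "job"));                       // manager, clerk, salesman
console.log(distinct(staff, "job", (d) => d.sal > 5000));  // manager, salesman
```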
var x = db.emp.find()[10]; // the 11th document returned by the query
for (var i in x) {
    print(i); // prints each field name of the document
}
db.collection.count() and db.collection.countDocuments()
countDocuments() returns the count of documents that match the query for a
collection.
• db.emp.count({});
• db.emp.countDocuments({});
• db.emp.countDocuments({job: 'manager'});
• db.emp.countDocuments({job: 'salesman'}, {skip: 1, limit: 3});
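The skip and limit options change which matching documents are counted. A plain-JavaScript model of that arithmetic follows; this countDocuments() is a hypothetical stand-in that works on an array and takes a predicate instead of a query document.

```javascript
// Sketch: count matching documents after skipping some and capping the total.
function countDocuments(docs, predicate, { skip = 0, limit = Infinity } = {}) {
  return docs.filter(predicate).slice(skip, skip + limit).length;
}

// Hypothetical sample: five salesmen and one manager.
const people = [
  { job: "salesman" }, { job: "salesman" }, { job: "manager" },
  { job: "salesman" }, { job: "salesman" }, { job: "salesman" },
];
const isSalesman = (d) => d.job === "salesman";

console.log(countDocuments(people, isSalesman));                       // 5
console.log(countDocuments(people, isSalesman, { skip: 1, limit: 3 })); // 3
console.log(countDocuments(people, isSalesman, { skip: 4, limit: 3 })); // 1
```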
findOne
db.collection.findOne()
findOne() returns one document that satisfies the specified query criteria on the collection. If
multiple documents satisfy the query, this method returns the first document according to the
order in which the documents are stored on disk. If no document satisfies the query, the
method returns null.
• db.emp.findOne();
• db.emp.findOne({ job: 'manager' });
db.collection.save()
save() updates an existing document or inserts a new document, depending on its
document parameter.
• If the document does not contain an _id field, then the save() method calls the insert()
method. During the operation, the mongo shell will create an ObjectId and assign it to
the _id field.
• If the document contains an _id field, then the save() method is equivalent to an update
with the upsert option set to true and the query predicate on the _id field.
db.collection.save({ document })
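The two branches of save() can be modelled with a Map keyed by _id. This is a sketch only: the save() function below is a hypothetical stand-in, and a simple counter replaces ObjectId generation.

```javascript
// Sketch of save(): insert when _id is missing, otherwise replace-or-insert by _id.
let nextId = 1; // hypothetical stand-in for ObjectId generation

function save(collection, doc) {
  if (doc._id === undefined) {
    doc._id = nextId++;          // insert path: assign a new _id
  }
  collection.set(doc._id, doc);  // with an _id: replace the match, or insert if none (upsert)
  return doc._id;
}

const coll = new Map();
const id = save(coll, { ename: "ram" });                // inserted with a generated _id
save(coll, { _id: id, ename: "ram", job: "manager" });  // same _id: replaces the document
save(coll, { _id: 100, ename: "sham" });                // unknown _id: inserted
console.log(coll.size);        // 2
console.log(coll.get(id).job); // manager
```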
db.collection.insert({<document>})
db.collection.insert([{<document 1>} , {<document 2>}, ... ])
• db.x.insert({})
• db.x.insert({ ename: 'ram', job: 'programmer', salary: 42000 })
• db.x.insert([ { ename: 'sham' }, { ename: 'y' } ]) // for multiple documents
db.collection.insertOne({<document>})
db.collection.insertMany([{<document 1>} , {<document 2>}, ... ])
var obj = {}
> var doc = {}; // JavaScript object
> doc.title = "MongoDB Tutorial"
> doc.url = "http://mongodb.org"
> doc.comment = "Good tutorial video"
> doc.tags = ['tutorial', 'noSQL']
> doc.saveondate = new Date()
> doc.meta = {} // object within the doc object
> doc.meta.browser = 'Google Chrome'
> doc.meta.os = 'Microsoft Windows7'
> doc.meta.mongodbversion = '2.4.0.0'
> doc
> db.book.insert(doc);
load("app.js")
Loads and runs a JavaScript file in the current shell environment.
load(file)  Specifies the path of a JavaScript file to execute.
cat(file)   Returns the contents of the specified file.
• function app(x, y) {
      return (x + y);
  }
• function app1(x, y, z) {
      return (x + y + z);
  }
• load("scripts/app.js")
• cat("scripts/app.js")
JavaScript functions
• db.emp.find({$or: [ {job: 'manager'}, {job: 'salesman'} ]}, {}).forEach(function(doc) {
      print(doc.ename.padEnd(12, "-") + doc.job);
  });
• db.emp.find().forEach(function(doc) {
      if (doc.ename == 'saleel') {
          print(doc.ename, doc.job);
      } else {
          return; // skip documents that do not match
      }
  });
• db.emp.find().forEach(function(doc) {
      x = doc.job.split(" ");
      print(x[0]);
  });
• db.emp.find().forEach((doc) => {
      if (doc.ename.length >= 7) {
          print(doc.ename + ": " + doc.ename.length);
      }
  });
• db.emp.find().forEach(function(data) {
      print("user: " + data.ename.toUpperCase());
  });
• db.emp.find().forEach(function(doc) {
if(doc.job.split(' ')[1]=='Programmer' || doc.job=='programmer') {
print(doc.ename, doc.job);
}
});
• function findProductByID(_productID) {
return db.products.find({productID: _productID}, {_id:false,
productID:true, productname:true});
};
• function productValidation(_productID) {
var x = db.products.find({productID:_productID}).count();
if (x != 0) {
return db.products.find({productID: _productID}, {_id:false,
productID:true, productname:true});
} else {
return ("Document not found!");
};
};
db.collection.update()
Modifies an existing document or documents in a collection. The method
can modify specific fields of an existing document or documents or
replace an existing document entirely, depending on the update
parameter. By default, the update() method updates a single document.
Set the multi parameter to update all documents that match the query
criteria.
An upsert is an update that inserts a new document if no document matches the filter.
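The effect of the multi and upsert options can be modelled on a plain array. This is a simplified sketch: the update() function is a hypothetical stand-in that takes a predicate instead of a query document, and on upsert it inserts only the changed fields, whereas the real server also carries over equality fields from the filter.

```javascript
// Sketch of update(): first match by default, all matches with multi,
// insert a new document with upsert when nothing matches.
function update(docs, predicate, changes, { multi = false, upsert = false } = {}) {
  const matched = docs.filter(predicate);
  if (matched.length === 0) {
    if (upsert) docs.push({ ...changes }); // nothing matched: insert instead
    return 0;
  }
  const targets = multi ? matched : matched.slice(0, 1);
  targets.forEach((d) => Object.assign(d, changes));
  return targets.length; // number of documents modified
}

// Hypothetical sample documents.
const clerks = [
  { ename: "a", job: "clerk", sal: 1000 },
  { ename: "b", job: "clerk", sal: 1100 },
];
const isClerk = (d) => d.job === "clerk";

update(clerks, isClerk, { sal: 1500 });                   // only the first clerk changes
const firstPass = clerks.map((d) => d.sal);               // 1500, 1100
update(clerks, isClerk, { sal: 2000 }, { multi: true });  // every clerk changes
update(clerks, (d) => d.job === "analyst", { job: "analyst", sal: 3000 }, { upsert: true });
console.log(firstPass, clerks.length);                    // [1500, 1100] 3
```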
db.collection.deleteOne({ filter })
• db.emp.deleteOne({})
• db.emp.deleteOne({ job: 'manager' })
db.collection.deleteMany()
Removes all documents that match the filter from a collection.
db.collection.deleteMany({ filter })
• db.emp.deleteMany({});
• db.emp.deleteMany({ job: 'manager' });
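The difference between deleteOne() and deleteMany() can be modelled on a plain array. Both functions below are hypothetical stand-ins that take a predicate instead of a filter document; an always-true predicate plays the role of the empty filter {}.

```javascript
// Sketch: deleteOne removes the first match, deleteMany removes every match.
function deleteOne(docs, predicate) {
  const i = docs.findIndex(predicate);
  if (i === -1) return 0;
  docs.splice(i, 1);
  return 1; // number of documents deleted
}

function deleteMany(docs, predicate) {
  let deleted = 0;
  for (let i = docs.length - 1; i >= 0; i--) { // iterate backwards so splice is safe
    if (predicate(docs[i])) {
      docs.splice(i, 1);
      deleted++;
    }
  }
  return deleted;
}

// Hypothetical sample documents.
const team = [
  { ename: "ram", job: "manager" },
  { ename: "sham", job: "clerk" },
  { ename: "sita", job: "manager" },
  { ename: "gita", job: "salesman" },
];
const isManager = (d) => d.job === "manager";

const one = deleteOne(team, isManager);   // removes ram only
const many = deleteMany(team, isManager); // removes the remaining manager
const rest = deleteMany(team, () => true); // like an empty filter: removes everything
console.log(one, many, rest, team.length); // 1 1 2 0
```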
“Accept your past without regret,
handle your present with confidence
and face your future without fear.”
A.P.J. Abdul Kalam