ADO Lecture v 2023-25

Advance Data Organization

Lecture V
MBA(DSDA) 2024-26, SCIT
ADO
Session Plan
1. Introduction to ADO, Data, Big Data, Time-Series Data, Spatial
Data, Graph Data, Streaming Data, Session Plan, Cos (0.5 Session)
2. Features of Database. Structured/Semi-Structured/Unstructured,
SQL DBs, NoSQL DBs, NewSQL DBs, ACID- CAP-BASE Property,
Distributed Databases. (0.5 Session)
3. Journey from RDBMS to NoSQL: BigTable, DynamoDB, HBase,
Cassandra, VoltDB. (2 Sessions)
4. MongoDB (in Detail) (3-4 Sessions)
5. Neo4j (in Detail) (2 Sessions)
6. Time-Series DB (if time Permits) (1 Session)
7. Data Lakes and Data Quality Management (1 Session)
ADO
MongoDB
1. Document Data Concept/Model
2. MongoDB platform
3. Basic Commands
4. CRUD
5. Indexing, Data Types
6. File Import/Export
7. GridFS
8. Collection
9. Spatial Features, Time-series
10. Complex Queries
MongoDB
Basic commands
> show dbs
> show collections
> db.stats()
> db.numbers.stats()
MongoDB
Indexing
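• Indexes support efficient execution
of queries: without an index, MongoDB
must scan every document in a collection
• Adding an index has a negative
performance impact on write
operations, because each write must
also update the index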

https://www.mongodb.com/docs/manual/core/indexes/
MongoDB
Indexing
Single Field Index
• A human resources department often needs
to look up employees by employee ID. You
can create an index on the employee ID field
to improve query performance.

Compound Index
• A grocery store manager often needs to look
up inventory items by name and quantity to
determine which items are low stock. You can
create a single index on both the item and
quantity fields to improve query
performance.
https://www.mongodb.com/docs/manual/core/indexes/index-types/
MongoDB
Indexing
Single Field Index
• db.collection.createIndex( <keys>,
<options>, <commitQuorum> )

Compound Index
• db.<collection>.createIndex( {
<field1>: <sortOrder>,
<field2>: <sortOrder>,
...
<fieldN>: <sortOrder>
} )
You can specify up to 32 fields in a
single compound index.
https://www.mongodb.com/docs/manual/reference/method/db.collection.createIndex/
MongoDB
Indexing Types
Single Field Index

https://www.mongodb.com/docs/manual/core/indexes/index-types/
MongoDB
Indexing Types
Compound Index

https://www.mongodb.com/docs/manual/core/indexes/index-types/
MongoDB
Indexing Types
Compound Index

Will the index work for:

db.students.find({gpa:3.6})

https://www.mongodb.com/docs/manual/core/indexes/index-types/
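The prefix rule behind this question can be sketched in plain JavaScript. The index definition shown on the slide is not in the text, so the sketch assumes a hypothetical compound index { name: 1, gpa: -1 }. usablePrefixLength is an illustrative helper, not a MongoDB API, and the real query planner weighs more than this (sorts, ranges, covered queries).

```javascript
// Assumption: a hypothetical compound index { name: 1, gpa: -1 }.
// Only the order of the fields matters for the prefix rule.
const indexFields = ["name", "gpa"];

// Count how many leading index fields the query constrains.
// A result of 0 means the query matches no prefix of the index.
function usablePrefixLength(indexFields, queryFields) {
  let length = 0;
  for (const field of indexFields) {
    if (!queryFields.includes(field)) break;
    length++;
  }
  return length;
}

console.log(usablePrefixLength(indexFields, ["gpa"]));         // 0
console.log(usablePrefixLength(indexFields, ["name"]));        // 1
console.log(usablePrefixLength(indexFields, ["name", "gpa"])); // 2
```

Under that assumption, a query on gpa alone matches no prefix of the index, while a query on name, or on name and gpa together, does. In a real deployment, explain() on the query shows which plan MongoDB actually chose.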
MongoDB
Indexing Types
Multikey Index
• Multikey indexes collect and sort
data stored in arrays.
• You do not need to explicitly specify
the multikey type. When you create
an index on a field that contains an
array value, MongoDB automatically
sets the index to be a multikey index.

https://www.mongodb.com/docs/manual/core/indexes/index-types/
MongoDB
Indexing Types
Multikey Index
db.students.insertMany( [
{
"name": "Andre Robinson",
"test_scores": [ 88, 97 ]
},
{
"name": "Wei Zhang",
"test_scores": [ 62, 73 ]
},
{
"name": "Jacob Meyer",
"test_scores": [ 92, 89 ]
}
] )
https://www.mongodb.com/docs/manual/core/indexes/index-types/
MongoDB
Indexing Types
Multikey Index

db.students.createIndex( { test_scores: 1 } )

The index contains a key for each individual
value that appears in the test_scores field. The
index is ascending, meaning the keys are stored
in this order: [ 62, 73, 88, 89, 92, 97 ].
https://www.mongodb.com/docs/manual/core/indexes/index-types/
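The key order described above can be reproduced with a small in-memory model in plain JavaScript. This is only a sketch of the idea (one index entry per array element, sorted ascending); buildMultikeyIndex and findByKey are hypothetical helpers, and MongoDB's real index is a B-tree maintained by the server.

```javascript
// The three student documents from the insertMany example.
const students = [
  { name: "Andre Robinson", test_scores: [88, 97] },
  { name: "Wei Zhang", test_scores: [62, 73] },
  { name: "Jacob Meyer", test_scores: [92, 89] },
];

// Build one index entry per distinct array element of each document,
// then sort the entries by key in ascending order.
function buildMultikeyIndex(docs, field) {
  const entries = [];
  docs.forEach((doc, position) => {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of new Set(keys)) {
      entries.push({ key, position });
    }
  });
  entries.sort((a, b) => a.key - b.key);
  return entries;
}

// Look up the documents whose array contains the given key.
function findByKey(index, docs, key) {
  return index.filter((e) => e.key === key).map((e) => docs[e.position]);
}

const scoreIndex = buildMultikeyIndex(students, "test_scores");
const indexKeys = scoreIndex.map((e) => e.key);
console.log(indexKeys); // [ 62, 73, 88, 89, 92, 97 ]
console.log(findByKey(scoreIndex, students, 89)[0].name); // Jacob Meyer
```

Because every array element gets its own entry, a query such as db.students.find({ test_scores: 89 }) can be answered from the index without scanning each array.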
MongoDB
File Import
• mongoimport
• mongoexport

https://www.mongodb.com/docs/database-tools/
MongoDB
File Import
• mongoimport
Importing JSON files
***********
mongoimport --db dsda2325 --collection restaurants --type=json --file G:\primer-dataset.json

OR (JSON is the default type, so --type can be omitted)

mongoimport --db dsda2325 --collection restaurants --file G:\primer-dataset.json

OR (--drop drops the collection before importing)

mongoimport --db dsda2325 --collection restaurants --drop --file G:\primer-dataset.json
https://www.mongodb.com/docs/database-tools/
MongoDB
File Import

>db.restaurants.countDocuments()
25359
>db.restaurants.find()
https://www.mongodb.com/docs/database-tools/
MongoDB
File Import
• mongoimport
Importing CSV files with a header line
***********
(The file path contains spaces, so it must be quoted.)
mongoimport --db dsda2325 --collection trips --drop --type=csv --headerline --file "G:\2014-02-Citi Bike trip data.csv"

https://www.mongodb.com/docs/database-tools/
MongoDB
File Import
• mongoimport
Importing CSV files without a header line
***********
(Enter the command on a single line; it is split here for readability.)
mongoimport
--db dsda2325
--collection=trips
--drop
--file="G:\2014-02-Citi Bike trip data.csv"
--type=csv
--fields="tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender"
https://www.mongodb.com/docs/database-tools/
MongoDB
File Import
• mongoimport
Importing CSV files without a header line, with the field names in a text file
***********
(Enter the command on a single line. The field file lists one field name per line.)
mongoimport
--db dsda2325
--collection=trips
--drop
--file="G:\2014-02-Citi Bike trip data.csv"
--type=csv
--fieldFile=G:\field_file.txt
https://www.mongodb.com/docs/database-tools/
MongoDB
File Import
• mongoimport
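Importing CSV files without a header line, with the field names and
column types in a text file
***********
(Enter the command on a single line. With --columnsHaveTypes, each line of
the field file has the form <fieldName>.<type>(), e.g. tripduration.int32().)
mongoimport
--db dsda2325
--collection=trips
--drop
--file="G:\2014-02-Citi Bike trip data.csv"
--type=csv
--columnsHaveTypes
--fieldFile=G:\field_file_with_types.txt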
https://www.mongodb.com/docs/database-tools/
MongoDB
GridFS
GridFS is a specification for storing and
retrieving files that exceed the BSON-document
size limit of 16 MB.

https://www.mongodb.com/docs/manual/core/gridfs/
MongoDB
GridFS
• GridFS is the MongoDB specification for
storing and retrieving large files such as
images, audio files, and video files.
• It works like a file system, but its
data is stored within MongoDB collections.
• GridFS can store files that exceed the
BSON document size limit of 16 MB.
https://www.mongodb.com/docs/manual/core/gridfs/
MongoDB
GridFS
• GridFS divides a file into chunks and stores
each chunk in a separate document. By default,
each chunk is at most 255 kB.
• By default, GridFS uses two collections:
fs.files stores each file's metadata and
fs.chunks stores the chunks.

https://www.mongodb.com/docs/manual/core/gridfs/
MongoDB
GridFS
• chunkSize: the size of each chunk in bytes.
GridFS divides the file into chunks of size
chunkSize, except for the last, which is only as
large as needed. The default size is 255 kilobytes (kB).

https://www.mongodb.com/docs/manual/core/gridfs/
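The chunking arithmetic can be sketched in plain JavaScript. The sketch assumes that 255 kB means 255 × 1024 = 261120 bytes; chunkLayout is a hypothetical helper for illustration, not part of any MongoDB driver.

```javascript
// Assumption: the default chunk size of 255 kB is 255 * 1024 bytes.
const DEFAULT_CHUNK_SIZE = 255 * 1024; // 261120 bytes

// Return how many chunks a file needs and how large the last chunk is.
// Every chunk is chunkSize bytes except the last, which holds the remainder.
function chunkLayout(fileSizeBytes, chunkSize = DEFAULT_CHUNK_SIZE) {
  if (fileSizeBytes === 0) {
    return { chunkCount: 0, lastChunkSize: 0 };
  }
  const chunkCount = Math.ceil(fileSizeBytes / chunkSize);
  const lastChunkSize = fileSizeBytes - (chunkCount - 1) * chunkSize;
  return { chunkCount, lastChunkSize };
}

// A 1,000,000-byte file: three full chunks plus a shorter last chunk.
const layout = chunkLayout(1000000);
console.log(layout); // { chunkCount: 4, lastChunkSize: 216640 }
```

A file whose size is an exact multiple of chunkSize ends with a full-size chunk, for example chunkLayout(2 * DEFAULT_CHUNK_SIZE) gives two chunks of 261120 bytes each.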
MongoDB
GridFS
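• The GridFS chunk size is chosen per file at
upload time (drivers expose it as the
chunkSizeBytes option).
• The command below changes a different setting:
the default chunk (range) size, in MB, of a
sharded cluster. Run it against the config database.

>use config
>db.settings.updateOne( { _id: "chunksize" }, { $set: { value: <sizeInMB> } }, { upsert: true } )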

https://www.mongodb.com/docs/manual/core/gridfs/
MongoDB
GridFS
Go to the bin directory of the MongoDB Database Tools (Tools\100)

mongofiles -d video put <your_file_path>\Airtel_Zero_Complaint.mp4

mongofiles -d video put G:\Airtel_Zero_Complaint_O.mp4

mongofiles -d video list

https://www.mongodb.com/docs/manual/core/gridfs/
MongoDB
GridFS
In mongosh, check what was stored:

show databases
use video
show collections

From the bin directory of the MongoDB Database Tools (Tools\100):

mongofiles -d video get <your_file_path>\Airtel_Zero_Complaint.mp4

mongofiles -d video delete <your_file_path>\Airtel_Zero_Complaint.mp4

Back in mongosh, to clean up:

use video
show collections
db.dropDatabase()
https://www.mongodb.com/docs/manual/core/gridfs/
MongoDB
DataTypes
MongoDB supports many datatypes. Some of them are −
• String − The most commonly used datatype. Strings in MongoDB must be
valid UTF-8.
• Integer − Stores a whole number. BSON has a 32-bit integer type (Int32) and a
64-bit integer type (Long).
• Boolean − Stores a boolean (true/false) value.
• Double − Stores floating-point values.
• Min/Max keys − Compare lower than and higher than all other BSON
values, respectively.
• Arrays − Store an array (a list of multiple values) under one key.
• Timestamp − A BSON timestamp, mostly for internal MongoDB use. To record when
a document was added or modified, the Date type is usually the better choice.
• Object − Used for embedded documents.
https://www.mongodb.com/docs/manual/reference/bson-types/
MongoDB
DataTypes
MongoDB supports many datatypes. Some of them are − (contd…)
• Null − Stores a null value.
• Symbol − Used identically to a string; generally reserved for
languages that have a specific symbol type. Deprecated.
• Date − Stores a date and time as the number of milliseconds since the
Unix epoch. In the shell, new Date() returns the current date and time;
pass a date string, e.g. new Date("2024-01-15"), to specify your own.
• Object ID − Stores the document's ID.
• Binary data − Stores binary data.
• Code − Stores JavaScript code in the document.
• Regular expression − Stores a regular expression.
https://www.mongodb.com/docs/manual/reference/bson-types/
MongoDB
DataTypes
The $type operator supports using these values
to query fields by their BSON type. $type also
supports the number alias, which matches the
integer, decimal, double, and long BSON types.

https://www.mongodb.com/docs/manual/reference/bson-types/
MongoDB
DataTypes
https://www.prisma.io/dataguide/mongodb/mongodb-datatypes
// insertMany takes an array of documents
db.types.insertMany([
{ _id: 1, value: 1, expectedType: 'Int32' },
{ _id: 2, value: Long("1"), expectedType: 'Long' },
{ _id: 3, value: 1.01, expectedType: 'Double' },
{ _id: 4, value: Decimal128("1.01"), expectedType: 'Decimal128' },
{ _id: 5, value: 3200000001, expectedType: 'Double' }
])
// query by string alias
db.types.find({"value":{$type: "int"}})
db.types.find({"value":{$type: "long"}})
db.types.find({"value":{$type: "decimal"}})
db.types.find({"value":{$type: "double"}})
db.types.find({"value":{$type: "number"}})
// the alias "decimal" and the numeric code 19 are equivalent
db.types.find({"value":{ $type: "decimal" } } )
db.types.find({"value":{ $type: 19 }})
// query by value
db.types.find({"value": 1.01})
db.types.find({"value": 1})
// the alias "string" and the numeric code 2 are equivalent
db.mytestcoll.find( { "first_name": { $type: "string" } } )
db.mytestcoll.find( { "first_name": { $type: 2 } } )

https://www.mongodb.com/docs/mongodb-shell/reference/data-types/#std-label-
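Which documents each $type query should return can be traced with a small plain-JavaScript model. It is only a sketch: the five documents are reduced to their _id and BSON type as given in the expectedType field, and idsMatchingType is a hypothetical helper that mimics $type for the aliases and numeric codes used on this slide.

```javascript
// Numeric codes for the BSON types used on the slide.
const TYPE_CODES = { 1: "double", 2: "string", 16: "int", 18: "long", 19: "decimal" };

// The "number" alias matches all four numeric BSON types.
const NUMBER_TYPES = ["int", "long", "double", "decimal"];

// The db.types documents, reduced to _id and the BSON type of "value".
const typedDocs = [
  { _id: 1, bsonType: "int" },
  { _id: 2, bsonType: "long" },
  { _id: 3, bsonType: "double" },
  { _id: 4, bsonType: "decimal" },
  { _id: 5, bsonType: "double" }, // 3200000001 does not fit in 32 bits
];

// Return the _id values a { value: { $type: type } } filter would select.
function idsMatchingType(docs, type) {
  const alias = typeof type === "number" ? TYPE_CODES[type] : type;
  return docs
    .filter((d) =>
      alias === "number" ? NUMBER_TYPES.includes(d.bsonType) : d.bsonType === alias
    )
    .map((d) => d._id);
}

console.log(idsMatchingType(typedDocs, "int"));    // [ 1 ]
console.log(idsMatchingType(typedDocs, "double")); // [ 3, 5 ]
console.log(idsMatchingType(typedDocs, 19));       // [ 4 ]
console.log(idsMatchingType(typedDocs, "number")); // [ 1, 2, 3, 4, 5 ]
```

Running the real queries in the shell against db.types is the way to confirm these results.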
MongoDB
DataTypes- ObjectID
ObjectId(<value>)
Returns a new ObjectId. The 12-byte ObjectId consists of:
• A 4-byte timestamp, representing the ObjectId's creation, measured in seconds
since the Unix epoch.
• A 5-byte random value generated once per process. This random value is unique
to the machine and process.
• A 3-byte incrementing counter, initialized to a random value.
For timestamp and counter values, the most significant bytes appear first in the byte
sequence (big-endian). This is unlike other BSON values, where the least significant
bytes appear first (little-endian).

If an integer value is used to create an ObjectId, the integer replaces the timestamp.

https://www.mongodb.com/docs/manual/reference/method/ObjectId/#mongodb-
MongoDB
DataTypes- ObjectID

ObjectId("00000020a7a0ee98923c771e")
The example ObjectId consists of:
• A four byte time stamp, 00000020
• A five byte random element, a7a0ee9892
• A three byte counter, 3c771e
The first four bytes of the ObjectId are the number of seconds since the
Unix epoch. In this example, the ObjectId timestamp is 00000020 in
hexadecimal, which is 32 in decimal: 32 seconds after the epoch.

https://www.mongodb.com/docs/manual/reference/method/ObjectId/#mongodb-
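The breakdown above can be checked with a few lines of plain JavaScript. This is a sketch that works on the 24-character hex string only; parseObjectId is a hypothetical helper, not a MongoDB function (in the shell, ObjectId(...).getTimestamp() returns the creation time).

```javascript
// Split a 24-character ObjectId hex string into its three parts:
// 4-byte timestamp, 5-byte random value, 3-byte counter.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const timestampSeconds = parseInt(hex.slice(0, 8), 16);
  return {
    timestampSeconds,
    createdAt: new Date(timestampSeconds * 1000), // seconds to milliseconds
    random: hex.slice(8, 18),
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

const parts = parseObjectId("00000020a7a0ee98923c771e");
console.log(parts.timestampSeconds);        // 32
console.log(parts.createdAt.toISOString()); // 1970-01-01T00:00:32.000Z
console.log(parts.random);                  // a7a0ee9892
console.log(parts.counter);                 // 3962654
```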
MongoDB
DataTypes-
Date
Timestamp
String
Null
Undefined
….
MongoDB
DataTypes- Timestamp()
MongoDB
Hands-on

Operations: AND, OR, limit(), skip(),
Projection, Sorting and Aggregation
MongoDB
Aggregation
• An aggregation pipeline consists of one or
more stages that process documents:
• Each stage performs an operation on the input
documents. For example, a stage can filter
documents, group documents, and calculate
values.
• The documents that are output from a stage are
passed to the next stage.
• An aggregation pipeline can return results for
groups of documents. For example, return the
total, average, maximum, and minimum values.
https://www.mongodb.com/docs/manual/core/aggregation-pipeline/
MongoDB
Aggregation

$match stage – filters the documents, passing only those that meet the given conditions
to the next stage
$group stage – groups the documents by a key and computes the aggregate values
$sort stage – sorts the resulting documents as required (ascending or
descending)

https://www.mongodb.com/docs/manual/core/aggregation-pipeline/
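The three stages can be traced on an in-memory array in plain JavaScript. This is a sketch only: the orders collection, its fields, and the values are hypothetical sample data, and the equivalent shell pipeline is given in a comment for comparison.

```javascript
// Hypothetical sample documents (not from the lecture dataset).
const orders = [
  { item: "pen", status: "paid", amount: 10 },
  { item: "ink", status: "paid", amount: 25 },
  { item: "pen", status: "paid", amount: 5 },
  { item: "pad", status: "open", amount: 40 },
];

// Equivalent pipeline in the MongoDB shell:
// db.orders.aggregate([
//   { $match: { status: "paid" } },
//   { $group: { _id: "$item", total: { $sum: "$amount" } } },
//   { $sort: { total: -1 } }
// ])

// $match: keep only the documents that meet the condition.
const matched = orders.filter((o) => o.status === "paid");

// $group: one output document per item, summing the amounts.
const totals = new Map();
for (const o of matched) {
  totals.set(o.item, (totals.get(o.item) || 0) + o.amount);
}
const grouped = [...totals].map(([_id, total]) => ({ _id, total }));

// $sort: descending by total.
const sorted = grouped.sort((a, b) => b.total - a.total);

console.log(sorted); // [ { _id: 'ink', total: 25 }, { _id: 'pen', total: 15 } ]
```

Each step consumes the output of the previous one, which is how documents flow from stage to stage in a pipeline.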
MongoDB
Aggregation
The $group stage supports certain expressions (operators)
allowing users to perform arithmetic, array, boolean and other
operations as part of the aggregation pipeline.

https://www.mongodb.com/docs/manual/core/aggregation-pipeline/
