MongoDB - Course 4
Table of contents
1. Introduction
2. Querying MongoDB
a. Simple extraction and various simple commands
b. Aggregation
3. Data Modeling
Chapter 1
SQL and NoSQL evolution
● 1970: Dr. E. F. Codd published the paper “A Relational Model of Data for Large
Shared Data Banks”
● 1974: IBM developed SQL
● 1979: Oracle released the first commercially available RDBMS
● 1986: The first SQL standard was published
● 1998: Carlo Strozzi coined the term NoSQL for an open
source database that relaxes some SQL constraints (ACID) in favor of
availability and scalability
MongoDB key characteristics and use cases
Document-oriented database
Key characteristics
MongoDB criticism
● Schema-less nature
● Lack of proper ACID guarantees
MongoDB binaries
● mongod
○ starts the MongoDB server
● mongo
○ opens the MongoDB shell
● mongodump
○ creates a database dump (BSON + JSON files)
● mongorestore
○ restores a database dump
Chapter 2
CRUD using the shell
● Update a document
○ db.<collection_name>.update({...MATCHING_OBJECT...},{...UPDATING_OBJECT...})
○ db.books.update({isbn: '1331'},{price: 30})
=> Replaces the first matched document with the new object (only _id is kept)
● Update a document (2)
○ db.<collection_name>.update({...MATCHING_OBJECT...},{$set: {...UPDATING_OBJECT...}})
○ db.books.update({isbn: '1331'},{$set: {price: 30, title: 'Mongo'}})
=> Sets only the listed fields of the first matched document
● Find all documents
○ db.<collection_name>.find()
● Find documents and pretty print them in the console
○ db.<collection_name>.find({...MATCHING_OBJECT...}).pretty()
○ db.books.find().pretty()
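The difference between the two update forms can be sketched in plain JavaScript. This is only an in-memory illustration of the behaviour, not MongoDB code: the helper names replaceDocument and applySet and the sample book are hypothetical.

```javascript
// Hypothetical helpers that mimic, in memory, what the two update forms do
// to the matched document. They are not part of the MongoDB API.

// Replacement form: update(query, {price: 30})
// Every field except _id is replaced by the new object.
function replaceDocument(doc, replacement) {
  return Object.assign({ _id: doc._id }, replacement);
}

// $set form: update(query, {$set: {...}})
// Only the listed fields are written; the others are kept.
function applySet(doc, fields) {
  return Object.assign({}, doc, fields);
}

const book = { _id: 1, isbn: "1331", title: "Old title", price: 25 };

const replaced = replaceDocument(book, { price: 30 });
const modified = applySet(book, { price: 30, title: "Mongo" });

console.log(replaced); // isbn and title are gone
console.log(modified); // isbn is kept, price and title are updated
```

In practice the $set form is usually the one intended, since the replacement form silently drops every field that is not repeated in the new object.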
Scripting for the mongo shell
Main reason for using the shell: the mongo shell is a JavaScript shell
> var title = 'MongoDB in a nutshell'
> title
MongoDB in a nutshell
> db.books.insert({title: title, isbn: 102})
WriteResult({ "nInserted" : 1 })
> db.books.find()
{ "_id" : ObjectId("59203874141daf984112d080"), "title" : "MongoDB in a nutshell", "isbn" : 102 }
Batch inserts using the shell
authorMongoFactory = function() {
  // One insert command is sent to the server per iteration.
  for (var loop = 0; loop < 1000; loop++) {
    db.books.insert({name: "MongoDB factory book" + loop});
  }
}
Although this works, it performs 1000 separate database inserts, which is inefficient.
Prefer a bulk write:
fastAuthorMongoFactory = function() {
  var bulk = db.books.initializeUnorderedBulkOp();
  // Queue the inserts locally, then send them in batches.
  for (var loop = 0; loop < 1000; loop++) {
    bulk.insert({name: "MongoDB factory book" + loop});
  }
  bulk.execute();
}
If the operations must be executed in the order in which they were declared, use
db.books.initializeOrderedBulkOp();
Alternative
db.collection.bulkWrite(
[ <operation 1>, <operation 2>, ... ],
{
writeConcern : <document>,
ordered : <boolean>
}
)
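As a minimal sketch, the 1000 inserts of the factory example could be expressed with bulkWrite as follows. The helper name buildInsertOperations is hypothetical, and the guard on `db` is only there so the snippet can also be loaded outside the mongo shell.

```javascript
// Build the operations array for bulkWrite: one insertOne operation per book.
// buildInsertOperations is a hypothetical helper, not a MongoDB function.
function buildInsertOperations(count) {
  const operations = [];
  for (let i = 0; i < count; i++) {
    operations.push({
      insertOne: { document: { name: "MongoDB factory book" + i } }
    });
  }
  return operations;
}

const operations = buildInsertOperations(1000);

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  // ordered: false lets the server continue after an individual failure.
  db.books.bulkWrite(operations, { ordered: false });
}
```

Besides insertOne, the operations array can mix other write operations such as updateOne or deleteOne, which is the main difference with the insert-only loop.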
Administration
● db.dropDatabase()
● db.getCollectionNames()
● db.copyDatabase(fromDB, toDB)
Simple Extraction
find()
db.<collection>.find(
{...SELECTOR_OBJECT...},
{...PROJECTION_OBJECT...}
)
$exists
Filters documents on whether a field exists.
Projection
db.<collection>.find({},{<FIELD> : <BOOL>})
When <BOOL> is true (or 1) the field is included in the result; when it is false (or 0) the field is excluded.
$eq, $gt, $lt, $gte, $lte, $ne
$in, $nin
Match documents whose field value is ($in) or is not ($nin) in the given array of values.
db.<collection>.find({
<FIELD>: {<OPERATOR>:<VALUES_ARRAY>}
},{})
Example:
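A minimal sketch, assuming a hypothetical category field on the products collection. The matchesIn helper is not part of MongoDB; it only reproduces the behaviour in memory for scalar field values.

```javascript
// Hypothetical field name (category) and values, chosen for illustration.
const inQuery = { category: { $in: ["fruit", "vegetable"] } };
const ninQuery = { category: { $nin: ["fruit", "vegetable"] } };

// In-memory equivalent of $in, valid for scalar field values only.
function matchesIn(doc, field, values) {
  return values.includes(doc[field]);
}

const sample = [
  { name: "apple", category: "fruit" },
  { name: "salmon", category: "fish" }
];

// Documents selected by $in, and documents selected by $nin.
const matchedByIn = sample.filter(d => matchesIn(d, "category", inQuery.category.$in));
const matchedByNin = sample.filter(d => !matchesIn(d, "category", ninQuery.category.$nin));

// `db` only exists inside the mongo shell, so the calls are guarded.
if (typeof db !== "undefined") {
  db.products.find(inQuery, {});
  db.products.find(ninQuery, {});
}
```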
$and, $or, $nor, $not
$not
$type
Selects the documents where the value of the field is an instance of the specified BSON type(s).
db.<collection>.find({
<FIELD>: {$type : <TYPESTRING>}
},{})
Example:
db.products.find({$and: [
{Price: {$type:'double'}}
]},{})
$mod
Select documents where the value of a field divided by a divisor has the specified remainder
db.<collection>.find({
<FIELD>: {$mod : [<DIVISOR>,<REMAINDER>]}
},{})
Example:
db.products.find({$and: [
{Price: {$mod:[2,0]}}
]},{})
$regex
Provides regular expression capabilities for pattern matching strings in queries. (PCRE 8.41)
db.<collection>.find({
<FIELD>: {$regex : /REGEX/, $options : <STRING_OPTIONS>}
},{})
db.<collection>.find({
<FIELD>: /REGEX/OPTIONS
},{})
Options:
i : Case insensitivity to match upper and lower cases.
m : For patterns that include anchors (i.e. ^ for the start, $ for the end), match at the beginning or end of each line for strings with multiline values. Without this option, these anchors match at the beginning or end of the string.
Example:
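A minimal sketch of both syntaxes, assuming a hypothetical title field on the books collection. For these simple patterns the i and m options behave like the JavaScript flags of the same name, so their effect can be checked locally.

```javascript
// Hypothetical field name (title), chosen for illustration.
const withOperator = { title: { $regex: /^mongo/, $options: "i" } };
const shorthand = { title: /^mongo/i };

// i: the pattern matches regardless of case.
const startsWithMongo = /^mongo/i.test("MongoDB in a nutshell");

// m: ^ also matches at the start of each line of a multiline string.
const withoutM = /^second/.test("first line\nsecond line");
const withM = /^second/m.test("first line\nsecond line");

// `db` only exists inside the mongo shell, so the calls are guarded.
if (typeof db !== "undefined") {
  db.books.find(withOperator, {});
  db.books.find(shorthand, {});
}
```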
$text
Performs a text search on the content of the fields indexed with a text index.
REQUIRED: db.<collection>.createIndex({<FIELD>: "text"})
db.<collection>.find({
$text: {
$search: <TEXT_TO_SEARCH>,
$language: <LANGUAGE>,
$caseSensitive: <BOOL>,
$diacriticSensitive: <BOOL>
}
},{})
Example:
db.products.find({
$text: {
$search: 'fry',
$language: 'english',
$diacriticSensitive: true
}
})
$where
Use the $where operator to pass either a string containing a JavaScript expression or a full JavaScript
function to the query system. (From 3.6, $expr should be preferred.)
db.products.find({
$where: <JS_EXPRESSION>
})
Example:
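A minimal sketch, assuming a monthlyBudget collection whose documents have numeric spent and budget fields. Inside a $where function, `this` refers to the document being tested, so the predicate can be tried locally with call().

```javascript
// Predicate evaluated for each document: `this` is the current document.
const overBudget = function () {
  return this.spent > this.budget;
};
const whereQuery = { $where: overBudget };

// Because it is ordinary JavaScript, the predicate can be tested locally.
const hit = overBudget.call({ spent: 450, budget: 400 });
const miss = overBudget.call({ spent: 100, budget: 400 });

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  db.monthlyBudget.find(whereQuery);
}
```

Since $where runs JavaScript for every candidate document, it is generally slower than the native operators.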
$expr (3.6)
Allows the use of aggregation expressions within the query language. $expr can build query expressions
that compare fields from the same document in a $match stage
db.products.find({
$expr: <EXPRESSION>
})
Example:
db.monthlyBudget.find({$expr:{$gt:["$spent","$budget"]}})
.distinct()
Finds the distinct values for a specified field across a single collection or view and returns the results in an
array.
db.collection.distinct(
field,
query
)
.count()
Returns the count of documents that would match a find() query for the collection or view
db.collection.count(
query,
options
)
Exercises
Aggregation
Pipeline syntax
db.<collection>.aggregate([
{<EXPRESSION1>},
{<EXPRESSION2>},
{<EXPRESSION3>},
...
])
Aggregation Pipeline Stages
$match
Filters the documents to pass only the documents that match the specified
condition(s) to the next pipeline stage.
db.<collection>.aggregate([
{$match: <QUERY>},
...
])
Example:
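A minimal sketch, reusing the products collection and its Price field from the earlier examples; the threshold of 10 is an arbitrary, hypothetical value.

```javascript
// Keep only the products whose Price is at least 10 (hypothetical threshold).
// The $match stage accepts the same query syntax as find().
const matchPipeline = [
  { $match: { Price: { $gte: 10 } } }
];

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  db.products.aggregate(matchPipeline);
}
```

Placing $match as early as possible in a pipeline reduces the number of documents the later stages have to process.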
$group
Groups documents by some specified expression and outputs to the next stage a document for each
distinct grouping. The output documents contain an _id field which contains the distinct group by key.
Accumulator operators include $avg, $first, $stdDevPop, $stdDevSamp and $sum.
Syntax:
db.<collection>.aggregate([
{$group: {
_id: <EXPRESSION>,
<FIELD1>: { <ACC1> : <EXPRESSION1> },
<FIELD2>: { <ACC2> : <EXPRESSION2> },
...
}},
...
])
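What a $group stage computes can be sketched in plain JavaScript. The function below is an in-memory illustration, not MongoDB's implementation, of a stage such as { $group: { _id: "$category", total: { $sum: "$price" }, average: { $avg: "$price" } } }; the category and price field names and the helper name are hypothetical.

```javascript
// In-memory sketch of grouping by category with $sum and $avg accumulators.
// groupByCategory is a hypothetical helper, not a MongoDB function.
function groupByCategory(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.category)) {
      groups.set(doc.category, { _id: doc.category, total: 0, count: 0 });
    }
    const group = groups.get(doc.category);
    group.total += doc.price; // $sum accumulator
    group.count += 1;         // needed to derive $avg
  }
  // One output document per distinct key, with the key stored in _id.
  return Array.from(groups.values()).map(g => ({
    _id: g._id,
    total: g.total,
    average: g.total / g.count
  }));
}

const grouped = groupByCategory([
  { category: "fruit", price: 2 },
  { category: "fruit", price: 4 },
  { category: "fish", price: 10 }
]);
console.log(grouped);
```

Unlike this sketch, MongoDB does not guarantee the order of the documents produced by $group; add a $sort stage afterwards if the order matters.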
$unwind
Deconstructs an array field from the input documents to output a document for each element. Each output
document is the input document with the value of the array field replaced by the element.
db.<collection>.aggregate([
{
$unwind:
{
path: <FIELD_PATH>,
includeArrayIndex: <STRING>,
preserveNullAndEmptyArrays: <BOOL>
}
}
,
...
])
To specify a field path, prefix the field name with a dollar sign $ and enclose in quotes.
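The effect of a stage such as { $unwind: "$tags" } can be sketched in plain JavaScript. This is an in-memory illustration of the default behaviour only (the includeArrayIndex and preserveNullAndEmptyArrays options are not modelled); the unwind helper and the tags field are hypothetical.

```javascript
// In-memory sketch of $unwind with default options.
// unwind is a hypothetical helper, not a MongoDB function.
function unwind(docs, field) {
  const output = [];
  for (const doc of docs) {
    const value = doc[field];
    // Missing, null or empty-array fields produce no output document.
    if (value === undefined || value === null) continue;
    // A non-array value is treated as a single-element array.
    const elements = Array.isArray(value) ? value : [value];
    for (const element of elements) {
      // Copy the document, replacing the array by one of its elements.
      output.push(Object.assign({}, doc, { [field]: element }));
    }
  }
  return output;
}

const unwound = unwind(
  [
    { _id: 1, tags: ["nosql", "shell"] },
    { _id: 2, tags: [] }
  ],
  "tags"
);
console.log(unwound);
```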
$lookup
Performs a left outer join to an unsharded collection in the same database to filter in documents from the
“joined” collection for processing.
db.<collection>.aggregate([
{
$lookup:
{
from: <collection to join>,
localField: <field from the input documents>,
foreignField: <field from the documents of the "from" collection>,
as: <output array field>
}
},
...
])
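The left outer join performed by $lookup can be sketched in plain JavaScript. This is an in-memory illustration for scalar field values, not MongoDB's implementation; the lookup helper, the sample data and the publisherId field are hypothetical.

```javascript
// In-memory sketch of $lookup: every input document is kept and receives
// an array (named by `as`) of the matching documents of the other collection.
// lookup is a hypothetical helper, not a MongoDB function.
function lookup(inputDocs, fromDocs, localField, foreignField, as) {
  return inputDocs.map(doc => {
    const matches = fromDocs.filter(f => f[foreignField] === doc[localField]);
    return Object.assign({}, doc, { [as]: matches });
  });
}

const books = [
  { isbn: "1331", publisherId: 1 },
  { isbn: "2000", publisherId: 9 } // no publisher with _id 9
];
const publishers = [{ _id: 1, name: "Example Press" }];

const joined = lookup(books, publishers, "publisherId", "_id", "publisher");
console.log(JSON.stringify(joined, null, 2));
```

The second book has no matching publisher but still appears in the result with an empty publisher array, which is what makes the join a left outer join.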
$sort
Sorts all input documents and returns them to the pipeline in sorted order.
db.<collection>.aggregate([
{
$sort: {
<field1>: <sort order>,
<field2>: <sort order>,
...
}
},
...
])
$limit
Limits the number of documents passed to the next stage in the pipeline.
db.<collection>.aggregate([
{
$limit: <POSITIVE_NUMBER>
},
...
])
$out
Takes the documents returned by the aggregation pipeline and writes them to a specified collection. The
$out operator must be the last stage in the pipeline.
db.<collection>.aggregate([
...,
{
$out: <OUTPUT_COLLECTION_STRING>
}
])
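A minimal sketch of a complete pipeline combining the $match, $sort, $limit and $out stages, reusing the products collection and its Price field from the earlier examples. The output collection name topProducts is hypothetical.

```javascript
// Five most expensive products, written to another collection.
const topProductsPipeline = [
  { $match: { Price: { $type: "double" } } }, // keep documents with a numeric price
  { $sort: { Price: -1 } },                   // -1 = descending, 1 = ascending
  { $limit: 5 },
  { $out: "topProducts" }                     // must be the last stage
];

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  db.products.aggregate(topProductsPipeline);
}
```

Be careful when choosing the output name: if the target collection already exists, $out replaces its content.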
Exercises
MapReduce
Exercise
Chapter 3
Data Modeling
Flexible schema
Document structure
Embedded data
Use:
Use:
Rules of Thumb for MongoDB Schema Design
JSON Schema
db.createCollection(<collection_name>, {
validator: {
$jsonSchema: {
}}})
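A minimal sketch of a filled-in validator, assuming a hypothetical validatedBooks collection whose documents must have a string isbn and title and may have a non-negative price. The field names and constraints are illustrative only.

```javascript
// Hypothetical schema: isbn and title are required strings,
// price is optional but must be a non-negative double when present.
const bookValidator = {
  $jsonSchema: {
    bsonType: "object",
    required: ["isbn", "title"],
    properties: {
      isbn: { bsonType: "string", description: "must be a string and is required" },
      title: { bsonType: "string", description: "must be a string and is required" },
      price: { bsonType: "double", minimum: 0, description: "must be a double >= 0" }
    }
  }
};

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  db.createCollection("validatedBooks", { validator: bookValidator });
}
```

With this validator in place, an insert that omits isbn or stores it as a number would be rejected by the server.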
Documentation: https://docs.mongodb.com/v3.6/core/schema-validation/index.html#json-schema
Library System
Exercise
The library contains one or several copies of the same book. Every copy of a book has a copy number
and is located at a specific location in a shelf. A copy is identified by the copy number and the ISBN
number of the book. Every book has a unique ISBN, a publication year, a title, an author, and a number of
pages. Books are published by publishers. A publisher has a name as well as a location.
Within the library system, books are assigned to one or several categories. A category can be a
subcategory of exactly one other category. A category has a name and no further properties. Each reader
needs to provide his/her family name, his/her first name, his/her city, and his/her date of birth to register at
the library. Each reader gets a unique reader number. Readers borrow copies of books. Upon borrowing
the return date is stored.
Source: ETH Zurich - Data Modelling and Databases Lectures – Spring 2018
Exercise 2
Exercise 3
Customers are always identified by a first name and a last name, both strings.
Their address may also be known; it consists of a street, a zip code (an integer from
1000 to 9999) and a city.