SQL vs. NoSQL
An experiment (for dummies) with MongoDB
Marco Segato // v20170621
SUMMARY
★ What?
★ When?
★ Why (MongoDB)?
★ How?
★ :)
What?
{
The Big Data Landscape (2017),
Start from a definition,
Some NoSQL databases,
SQL vs. NoSQL differences
}
The Big Data Landscape (2017) http://mattturck.com/bigdata2017/
Start from a definition
A NoSQL (originally referring to "non SQL", "non
relational" or "not only SQL") database provides
a mechanism for storage and retrieval of data
which is modeled in means other than the
tabular relations used in relational databases.
[ https://en.wikipedia.org/wiki/NoSQL ]
Some NoSQL databases
SQL vs. NoSQL differences / 1
Types
SQL: One type (SQL database) with minor variations.
NoSQL: Different types, including key-value stores, document databases, wide-column stores, and graph databases.
History
SQL: Developed in the 1970s to deal with the first wave of data storage applications.
NoSQL: Developed in the 2000s to deal with the limitations of SQL databases concerning scale, replication, and unstructured data storage.
Examples
SQL: MySQL, Postgres, Oracle Database.
NoSQL: MongoDB, Cassandra, HBase, Neo4j.
Schemas
SQL: To store information about a new data item, the table schema must be altered first; depending on the database and the table size, this can lock the table or require downtime.
NoSQL: Records can add new information on the fly and, unlike SQL table rows, dissimilar data can be stored together as necessary.
SQL vs. NoSQL differences / 2
Development Model
SQL: Mix of open-source (e.g., Postgres, MySQL) and closed-source (e.g., Oracle Database).
NoSQL: Open-source.
Supports Transactions
SQL: Yes, updates can be configured to complete entirely or not at all.
NoSQL: In certain circumstances and at certain levels (e.g., document level vs. database level).
Data Manipulation
SQL: Specific language using Select, Insert, and Update statements.
NoSQL: Through object-oriented APIs.
When?
{
Size vs. Complexity,
Big Data,
Use cases,
NoSQL Pros and Cons
}
Size vs. Complexity
Big Data
One of the first reasons to use NoSQL is that you have a Big Data project to
tackle. A Big Data project is typically characterized by:
● High data velocity – lots of data coming in very quickly, possibly from different locations.
● Data variety – storage of data that is structured, semi-structured and unstructured.
● Data volume – data that involves many terabytes or petabytes in size.
● Data complexity – data that is stored and managed in different locations or data centers.
Use cases
LARGE DATA VOLUMES
EXTREME QUERY WORKLOAD
SCHEMA EVOLUTION
We are storing more data now than we ever
have before.
Connections between our data are growing all
the time.
We don’t make things knowing the structure from
day 1.
Server architecture is now at a stage where we
can take advantage of it.
NoSQL Pros and Cons
PROS
● MASSIVE SCALABILITY
● HIGH AVAILABILITY
● LOWER COST
● SCHEMA FLEXIBILITY
● SPARSE AND SEMI-STRUCTURED DATA
CONS
● LIMITED QUERY CAPABILITIES
● NOT STANDARDISED (PORTABILITY MAY BE AN ISSUE)
● STILL A DEVELOPING TECHNOLOGY
● INSTALLATION, MANAGEMENT AND TOOLSETS STILL MATURING
Why (MongoDB)?
{
Some notes,
The leading NoSQL Database,
Who’s using MongoDB,
Main features,
TCO Comparison MongoDB & Oracle,
MongoDB University
}
Some notes
History: The software company 10gen began developing MongoDB in 2007 as a
component of a planned platform-as-a-service product. In 2009, the company
shifted to an open-source development model, offering commercial support and
other services. In 2013, 10gen changed its name to MongoDB Inc.
Licensing: MongoDB is available at no cost under the
GNU Affero General Public License, version 3. The
language drivers are available under an Apache
License. In addition, MongoDB Inc. offers proprietary
licenses for MongoDB.
MongoDB – The Leading NoSQL Database
[Charts: NoSQL adoption (based on Google Trends) *, LinkedIn job skills *, Job trends (2015)]
* https://www.mongodb.com/leading-nosql-database
Who’s using MongoDB
Main features
Ad hoc queries - MongoDB supports field queries, range queries, and regular-expression searches.
Indexing - Fields in a MongoDB document can be indexed with primary and secondary indices.
Replication - MongoDB provides high availability with replica sets. A replica set consists of two or more copies of the data.
Load balancing - MongoDB scales horizontally using sharding. The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. MongoDB can run over multiple servers, balancing the load or duplicating data to keep the system up and running in case of hardware failure.
File storage - MongoDB can be used as a file system (GridFS), with load balancing and data replication features over multiple machines.
Aggregation - MapReduce can be used for batch processing of data and aggregation operations. The aggregation framework enables users to obtain the kind of results for which the SQL GROUP BY clause is used. The aggregation framework includes the $lookup operator, which can join documents from multiple collections, as well as statistical operators such as standard deviation.
Others - In-memory Storage Engine, Native Graph Processing, Optimized Connectors for BI & Spark, Database as a Cloud Service
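The range-based distribution described under "Load balancing" can be sketched in a few lines of plain JavaScript. This is only an illustration of the idea, not how MongoDB implements it: the chunk boundaries, the shard names, and the helpers shardFor and route are all hypothetical, and NUM is assumed to be the shard key.

```javascript
// Hypothetical chunk table: each chunk owns the half-open key range [min, max).
// Shard names and boundaries are invented for illustration.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" },
];

// Return the name of the shard whose range contains the given shard-key value.
function shardFor(key, chunkTable) {
  const chunk = chunkTable.find((c) => key >= c.min && key < c.max);
  if (!chunk) throw new Error("no chunk covers key " + key);
  return chunk.shard;
}

// Group a batch of documents by destination shard, using NUM as the shard key.
function route(docs, chunkTable) {
  const byShard = {};
  for (const doc of docs) {
    const shard = shardFor(doc.NUM, chunkTable);
    if (!byShard[shard]) byShard[shard] = [];
    byShard[shard].push(doc);
  }
  return byShard;
}

console.log(route([{ NUM: 10 }, { NUM: 1000 }, { NUM: 7000 }], chunks));
```

A key equal to a boundary value belongs to the chunk that starts there, so every key maps to exactly one shard.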
TCO Comparison of MongoDB & Oracle (August 2015)
Upfront costs in US dollars.
                               Small Enterprise Project    Large Enterprise Project
                               MongoDB      Oracle         MongoDB      Oracle
Initial Developer Effort       $120,000     $240,000       $360,000     $720,000
Initial Administrative Effort  $10,000      $20,000        $30,000      $60,000
Software Licenses              $0           $423,000       $0           $4,230,000
Server Hardware                $12,000      $12,000        $120,000     $120,000
Storage Hardware               $24,000      $125,000       $240,000     $500,000
Total Upfront Costs            $166,000     $820,000       $750,000     $5,630,000
https://www.mongodb.com/collateral/total-cost-ownership-comparison-mongodb-oracle
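As a quick sanity check, the totals in the table can be recomputed from the individual cost lines. The sketch below only copies the figures from the table; the names costs and totals are made up for this example, and the ratio is a derived number that does not appear in the source.

```javascript
// Upfront cost lines copied from the table, in US dollars, in this order:
// developer effort, administrative effort, licenses, server hardware, storage hardware.
const costs = {
  small: {
    MongoDB: [120000, 10000, 0, 12000, 24000],
    Oracle: [240000, 20000, 423000, 12000, 125000],
  },
  large: {
    MongoDB: [360000, 30000, 0, 120000, 240000],
    Oracle: [720000, 60000, 4230000, 120000, 500000],
  },
};

// Add up a list of numbers.
const sum = (xs) => xs.reduce((a, b) => a + b, 0);

// Total upfront cost per product, plus how many times larger the Oracle total is.
function totals(project) {
  const mongo = sum(costs[project].MongoDB);
  const oracle = sum(costs[project].Oracle);
  return { mongo, oracle, ratio: oracle / mongo };
}

console.log(totals("small")); // mongo: 166000, oracle: 820000
console.log(totals("large")); // mongo: 750000, oracle: 5630000
```

The recomputed sums match the "Total Upfront Costs" row; the Oracle total comes out roughly 4.9 times the MongoDB total for the small project and roughly 7.5 times for the large one.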
MongoDB University
MongoDB University offers free online courses to teach you how to build and
deploy apps on MongoDB. Over 400,000 of your peers have already signed up.
https://university.mongodb.com/
M101P: MongoDB for Developers
Learn everything you need to know to get
started building a MongoDB-based app
(7 weeks).
How?
{
Battlefield and opponents,
Install & run,
Contest,
A doubt,
Tools,
Comparison,
And the winner is...
}
Battlefield and opponents
Server: Red Hat Enterprise Linux Server v5.5, 8 GB RAM, 1 vCPU
Oracle Database 11g Enterprise Edition, 64 bit (current release: 12c)
MongoDB 2.6.3 Community Edition, 64 bit (current release: 3.4.6)
Install & run
Install MongoDB on Linux and start the database service:
# tar -zxvf mongodb-linux-x86_64-x.y.z.tgz
# mkdir -p /data/db
# cd mongodb-linux-x86_64-x.y.z
# ./bin/mongod --dbpath /data/db
JDBC connection string:
mongodb://[username:password@]host1[:port1][/[database][?options]]
Note: the port is optional; if it is not specified, the default is 27017.
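To make the optional parts of the connection string template concrete, here is a minimal sketch that assembles such a string from its pieces. buildMongoUri is a hypothetical helper written for this example, not part of any MongoDB driver, and the host names and credentials in the usage lines are invented.

```javascript
// Build a connection string following the template
// mongodb://[username:password@]host1[:port1][/[database][?options]]
// buildMongoUri is a hypothetical helper, not a MongoDB driver function.
function buildMongoUri(parts) {
  const { host, port, username, password, database, options } = parts;
  if (!host) throw new Error("host is required");

  // Credentials are percent-encoded because ':' and '@' act as delimiters.
  const auth = username
    ? encodeURIComponent(username) + ":" + encodeURIComponent(password || "") + "@"
    : "";

  // Leaving the port out means the server default (27017) applies.
  const portPart = port ? ":" + port : "";

  // Options become a key=value query string joined by '&'.
  const query = options
    ? Object.entries(options)
        .map(([k, v]) => encodeURIComponent(k) + "=" + encodeURIComponent(v))
        .join("&")
    : "";

  // The '/' separator is needed as soon as a database or any option is present.
  const dbPart = (database || query) ? "/" + (database || "") : "";

  return "mongodb://" + auth + host + portPart + dbPart + (query ? "?" + query : "");
}

console.log(buildMongoUri({ host: "localhost" }));
console.log(buildMongoUri({
  host: "db.example.com", port: 27018,
  username: "app", password: "p@ss", database: "mydb",
}));
```

The first call prints mongodb://localhost and the second prints mongodb://app:p%40ss@db.example.com:27018/mydb.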
Contest
Table of daily sales:
INDEXES (Oracle):
IDX1 C_PROD
IDX2 C_PROD, DATA
IDX3 C_ENTE, C_PROD
IDX4 DATA, C_TIPO_DOC, C_ENTE
IDX5 FLG_FIDELITY, C_PROD, C_ENTE, DATA
IDX6 TRIM("C_ENTE"), TRIM("C_PROD")
IDX7 NUM
IMPORT (MongoDB):
$ mongoimport -d mydb -c sales --type csv --file mydb_sales.csv --headerline
≈ 3,000,000 records
2 min 30 s to complete the import
No index defined
A doubt
Is it correct, or even useful, to compare them on a typical
RDBMS object (a relational table)?
● If you work in a standard legacy environment, you may not even be
interested in databases other than an RDBMS
● If you work in a futuristic start-up, you have surely already moved your data
aggregation to a new strategy
but… what if your company has to manage a transitional period in which the data
structure can’t be modified, but you need to move on anyway? (e.g., due to costs,
customer requirements, guarantees on data safety before the final migration, etc.)
Tools
mongo shell
Robo 3T (formerly Robomongo), a free, lightweight GUI for MongoDB.
https://robomongo.org/
DBeaver, a universal SQL client.
http://dbeaver.jkiss.org/
Comparison / COUNT
select count (*)
from mydb;
db.mydb.aggregate( [
{
$group: {
_id: null,
count: { $sum: 1 }
}
}
] );
Oracle: 19.000 s / MongoDB: 2.960 s
Comparison / WHERE
select data, c_prod
from mydb
where data =
to_date('26/09/2011','DD/MM/YYYY');
db.mydb.find({
"DATA": "26/09/2011"
}, {
"DATA": 1,
"C_PROD": 1
}).pretty();
Oracle: 0.116 s / MongoDB: 0.006 s
Comparison / COUNT + GROUP BY
select data, c_prod, count(c_prod)
from mydb
group by data, c_prod;
db.mydb.aggregate( [
{
$group: {
_id: {data: "$DATA", c_prod: "$C_PROD"},
count: { $sum: 1 }
}
}
],
{ allowDiskUse: true }
);
Oracle: 2 min 32 s / MongoDB: 15 s
Comparison / COUNT + GROUP BY + WHERE
select data, c_prod, count(c_prod)
from mydb
where data =
to_date('26/09/2011','DD/MM/YYYY')
group by data, c_prod;
db.mydb.aggregate( [
{ $match: { DATA: "26/09/2011" } },
{
$group: {
_id: {data: "$DATA", c_prod: "$C_PROD"},
count: { $sum: 1 }
}
}
],
{ allowDiskUse: true }
);
Oracle: 1 min 14 s / MongoDB: 1 s
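For readers new to the aggregation pipeline, the $match and $group stages used above can be imitated on an in-memory array. This is a plain JavaScript emulation for illustration only, not the MongoDB API; the sample rows and the helpers match and groupCount are invented.

```javascript
// In-memory stand-in for the collection; the rows are invented sample data.
const sales = [
  { DATA: "26/09/2011", C_PROD: "A1" },
  { DATA: "26/09/2011", C_PROD: "A1" },
  { DATA: "26/09/2011", C_PROD: "B2" },
  { DATA: "27/09/2011", C_PROD: "A1" },
];

// Emulates { $match: { DATA: date } }: keep only the documents with that date.
const match = (docs, date) => docs.filter((d) => d.DATA === date);

// Emulates { $group: { _id: { data, c_prod }, count: { $sum: 1 } } }:
// one output document per distinct (DATA, C_PROD) pair, with a running count.
function groupCount(docs) {
  const groups = new Map();
  for (const d of docs) {
    const key = JSON.stringify([d.DATA, d.C_PROD]);
    const g = groups.get(key) || { _id: { data: d.DATA, c_prod: d.C_PROD }, count: 0 };
    g.count += 1;
    groups.set(key, g);
  }
  return [...groups.values()];
}

// Filtering first means the grouping stage only sees the matching documents.
const result = groupCount(match(sales, "26/09/2011"));
console.log(result);
```

With this sample data the result holds two groups: product A1 with a count of 2 and product B2 with a count of 1.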
Comparison / DISTINCT
select distinct data
from mydb;
db.mydb.distinct("DATA");
Oracle: 37.000 s / MongoDB: 2.306 s
Comparison / INSERT
insert into mydb
(NUM, C_ENTE, C_TIPO_DOC, DATA...)
values
(-1, '67335 ', '12', TO_DATE('01/22/2015 00:00:00', 'MM/DD/YYYY HH24:MI:SS')...);
db.mydb.insert({
"NUM" : -1,
"C_ENTE" : "67335 ",
"C_TIPO_DOC" : "12",
"DATA" : "01/22/2015",
...
});
Oracle: 0.539 s / MongoDB: 0.003 s
Comparison / UPDATE
update mydb
set VALORE_01 = 5.5
where NUM = -1;
db.mydb.update(
{ "NUM" : -1 },
{ $set : { "VALORE_01" : 5.5 } }
);
Oracle: 0.063 s / MongoDB: 0.642 s
And the winner is...
Oracle 1 - 6 MongoDB
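The final score can be re-derived from the seven timings. The sketch below transcribes them in seconds, assuming that the first figure on each comparison slide is the Oracle time and the second is the MongoDB time (the slides list the SQL statement first); the names timings and tally are made up for this example.

```javascript
// Timings transcribed from the comparison slides, converted to seconds.
// Assumption: the first figure on each slide is Oracle, the second is MongoDB.
const timings = [
  { test: "COUNT", oracle: 19.0, mongodb: 2.96 },
  { test: "WHERE", oracle: 0.116, mongodb: 0.006 },
  { test: "COUNT + GROUP BY", oracle: 152, mongodb: 15 },
  { test: "COUNT + GROUP BY + WHERE", oracle: 74, mongodb: 1 },
  { test: "DISTINCT", oracle: 37.0, mongodb: 2.306 },
  { test: "INSERT", oracle: 0.539, mongodb: 0.003 },
  { test: "UPDATE", oracle: 0.063, mongodb: 0.642 },
];

// Count how many tests each database finished faster (no ties occur in this data).
function tally(rows) {
  const wins = { oracle: 0, mongodb: 0 };
  for (const r of rows) {
    wins[r.oracle < r.mongodb ? "oracle" : "mongodb"] += 1;
  }
  return wins;
}

console.log(tally(timings)); // { oracle: 1, mongodb: 6 }
```

Under that assumption, Oracle is faster only on UPDATE, which matches the 1 - 6 score.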
And now…
… it’s up to you!
:)
MongoDB official site: https://www.mongodb.com/
MongoDB Tools: http://mongodb-tools.com/
MongoDB Tutorial: http://www.w3resource.com/mongodb/introduction-mongodb.php
Marco Segato
Project Manager at TESISQUARE®
https://www.linkedin.com/in/marcosegato/
@machms
Passionate about #linux #opensource #innovation
My interests: #rock #reading #photo #cinema #theatre
