NoSQL Database
Akshay Mathur
Sarang Shravagi
@akshaymathu, @_sarangs
{name: 'mongo', type: 'db'}
Who uses MongoDB
Let’s Know Each Other
• Do you code?
• OS?
• Programming Language?
• Why are you attending?
Akshay Mathur
• Managed development, testing and
release teams over the last 14+ years
– Currently Principal Architect at ShopSocially
• Founding Team Member of
– ShopSocially (Enabling “social” for retailers)
– AirTight Networks (Global leader of WIPS)
Sarang Shravagi
• 10gen Certified Developer and DBA
• CS graduate from PICT Pune
• 3+ years in Software Product industry
• Currently Senior Full-stack Developer at
ShopSocially
How we use MongoDB
Python MongoDB
MongoEngine
Where MongoDB Fits
Program Outline: Understanding NoSQL
• Data Landscape
• Different Storage Needs
• Design Paradigm Shift from SQL to
NoSQL
• Different Datastores
• Closer look at Document Storage
• Drawing parallels with RDBMS
Program Outline: Hands on Lab
• Installation and basic configuration
• Mongo Shell
• Creating and Changing Schema
• Create, Read, Update and Delete of Data
• Analyzing Performance
• Improving performance by creating Indices
• Assignment
• Problem solving for the assignment
Program Outline: Advanced Topics
• Handling Big Data
– Introduction to Map/Reduce
– Introduction to Data Partitioning (Sharding)
• Disaster Recovery
– Introduction to Replica Sets and High
Availability
Ground Rules
• Disturb Everyone
– Not by phone rings
– Not by local talks
– By more information
and questions
Data Patterns & Storage Needs
Data at an Online Store
• Product Information
• User Information
• Purchase Information
• Product Reviews
• Site Interactions
• Social Graph
• Search Index
SQL to NoSQL
Design Paradigm Shift
SQL Storage
• Was designed when
– Storage and data transfer were costly
– Processing was slow
– Applications were oriented more towards data
collection
• Initial adopters were financial institutions
SQL Storage
• Structured
– schema
• Relational
– foreign keys, constraints
• Transactional
– Atomicity, Consistency, Isolation, Durability
• High Availability through robustness
– Minimize failures
• Optimized for Writes
• Typically Scale Up
NoSQL Storage
• Is designed when
– Storage is cheap
– Data transfer is fast
– Much more processing power is available
• Clustering of machines is also possible
– Applications are oriented towards
consumption of User Generated Content
– Better on-screen user experience is in
demand
NoSQL Storage
• Semi-structured
– Schemaless
• Consistency, Availability, Partition
Tolerance
• High Availability through clustering
– expect failures
• Optimized for Reads
• Typically Scale Out
Different Datastores
Half Level Deep
SQL: RDBMS
• MySQL, PostgreSQL, Oracle etc.
• Stores data in tables having columns
– Basic (number, text) data types
• Strong query language
• Transparent values
– Query language can read and filter on them
– Relationship between tables based on values
• Suited for user info and transactions
NoSQL: Key/Value
• Redis, DynamoDB etc.
• Stores a value against a key
– Strings
• Values are opaque
– Cannot be part of a query
• Suited for site interactions
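A minimal sketch of what "opaque values" means in practice, using a plain Python dict as a stand-in for a key/value store (no Redis or DynamoDB involved). The `store`, `put` and `get` names and the session data are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical key/value store: a dict mapping keys to opaque byte strings.
store = {}


def put(key, value):
    """Serialize the value; the store only ever sees bytes."""
    store[key] = json.dumps(value).encode("utf-8")


def get(key):
    """Fetch by key and decode on the application side."""
    return json.loads(store[key].decode("utf-8"))


put("session:42", {"user": "asha", "last_page": "/cart"})
put("session:43", {"user": "ravi", "last_page": "/home"})

# Lookup by key is direct.
print(get("session:42")["last_page"])

# The store cannot filter on what is inside a value; the application has to
# fetch and decode every value itself.
on_cart = [key for key in store if get(key)["last_page"] == "/cart"]
print(on_cart)
```

Lookups by key stay cheap, which is why this model tends to suit high-volume, simple data such as site interactions.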
NoSQL: Key/Value
NoSQL: Document
• MongoDB, CouchDB etc.
• Object Oriented data models
– Stores data in document objects having fields
– Basic and compound (list, dict) data types
• SQL like queries
• Transparent values
– Can be part of query
• Suited for product info and its reviews
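As a rough illustration of the document model, the sketch below uses plain Python dicts to stand in for documents; no MongoDB server is involved. The `products` data, its field names and the `find` helper are hypothetical and only mimic how a query can look inside fields, including compound ones.

```python
# Plain-Python stand-in for a document collection (hypothetical data).
products = [
    {"name": "Kettle", "price": 25, "tags": ["kitchen", "steel"],
     "reviews": [{"user": "asha", "stars": 5}, {"user": "ravi", "stars": 3}]},
    {"name": "Lamp", "price": 40, "tags": ["home"],
     "reviews": [{"user": "meera", "stars": 4}]},
]


def find(collection, predicate):
    """Return every document for which predicate(document) is true."""
    return [doc for doc in collection if predicate(doc)]


# Values are transparent: a query can look inside a list field ...
kitchen_items = find(products, lambda d: "kitchen" in d["tags"])

# ... and inside embedded sub-documents (the reviews live with the product).
five_star = find(products, lambda d: any(r["stars"] == 5 for r in d["reviews"]))

print([d["name"] for d in kitchen_items])
print([d["name"] for d in five_star])
```

Keeping the reviews inside the product document is what makes this model a natural fit for product info and its reviews.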
NoSQL: Document
NoSQL: Column Family
• Cassandra, Bigtable etc.
• Stores data in columns
• Transparent values
– Can be part of query
• SQL like queries
• Suited for search
NoSQL: Column Family
NoSQL: Graph
• Neo4j
• Stores data in form of nodes and
relationships
• Query is in form of traversal
• In-memory
• Suited for social graph
NoSQL: Graph
MongoDB
Document Storage: Closer Look
MongoDB
• Document database
• Powerful query language
• Docs, sub-docs, indexes
• Map/reduce
• Replicas, shards, replicated shards
• SDKs/drivers for so many languages
– C, C++, C#, Python, Erlang, PHP, Java, JavaScript, Node.js, Perl,
Ruby, Scala
RDBMS: DB Design
RDBMS: Query
RDBMS → MongoDB
RDBMS       MongoDB
Database    Database
Table       Collection
Row         Document
Column      Field
SQL:                   Select c1, c2 from Table where c1 = 'v1' order by c2 limit n
MongoDB (MongoEngine): Collection.objects(c1='v1').order_by('c2').limit(n)
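To make the parallel concrete, here is a small sketch that runs the Select statement against an in-memory SQLite table and then applies the same filter, order and limit to a list of dicts standing in for a collection. The table name `t`, the sample rows and `n = 2` are assumptions made up for the example; a real MongoDB query would go through a driver or MongoEngine rather than list operations.

```python
import sqlite3

rows = [("v1", 30), ("v2", 10), ("v1", 20), ("v1", 50)]  # hypothetical data
n = 2

# Relational side: the Select statement, run against an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 INTEGER)")
conn.executemany("INSERT INTO t (c1, c2) VALUES (?, ?)", rows)
sql_result = conn.execute(
    "SELECT c1, c2 FROM t WHERE c1 = ? ORDER BY c2 LIMIT ?", ("v1", n)
).fetchall()
conn.close()

# Document side: same filter, sort and limit over a list of dicts.
collection = [{"c1": c1, "c2": c2} for c1, c2 in rows]
matching = [doc for doc in collection if doc["c1"] == "v1"]
doc_result = sorted(matching, key=lambda doc: doc["c2"])[:n]

print(sql_result)
print(doc_result)
```

Each row comes back as a tuple on the SQL side and as a dict with named fields on the document side, which mirrors the Row/Document and Column/Field mapping.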
MongoDB: Design
MongoDB: Query
• Movies.objects()
Have you Installed?
http://www.mongodb.org/downloads
Hands-on
Dive-in with Sarang
MongoDB: Core Binaries
• mongod
– Database server
• mongo
– Database client shell
• mongos
– Router for Sharding
Getting Help
• For mongo shell
– mongo --help
• Shows options available for running the shell
• Inside mongo shell
– Object.help()
• Shows commands available on the object
Import Export Tools
• For objects
– mongodump
– mongorestore
– bsondump
– mongooplog
• For data items
– mongoimport
– mongoexport
Database Operations
• Database creation
• Creating/changing collection
• Data insertion
• Data read
• Data update
• Creating indices
• Data deletion
• Dropping collection
Diagnostic Tools
• mongostat
• mongoperf
• mongosniff
• mongotop
Assignment
• Go to http://www.velocitainc.com/mongo/
– Tasks
• assignments.txt
– Data
• students.json
Disaster Recovery
Introduction to Replica Sets and
High Availability
Disasters
• Physical Failure
– Hardware
– Network
• Solution
– Replica Sets
• Provide redundant storage for High Availability
– Real time data synchronization
• Automatic failover for zero down time
Replication
Multi Replication
• Data can be replicated to multiple places
simultaneously
• A replica set should have an odd number of
voting members, to avoid tied elections
Single Replication
• If you want only one secondary, or any odd
number of secondaries, you need to set up
an arbiter
Failover
• When the primary fails, the remaining machines
vote to elect a new primary
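A small sketch of the voting arithmetic behind failover, assuming the usual rule that a new primary needs votes from more than half of all voting members. The function names are hypothetical, and the code only models the counting, not MongoDB's actual election protocol.

```python
def majority(voting_members):
    """Votes needed to elect a primary: more than half of all voting members."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable_members):
    """True if the members that can still reach each other form a majority."""
    return reachable_members >= majority(voting_members)


# Three-member set (primary + two secondaries): one failure is survivable.
print(can_elect_primary(3, 2))

# Two-member set (primary + one secondary): the lone survivor is not a majority.
print(can_elect_primary(2, 1))

# With an arbiter added there are three voters, so survivor + arbiter can elect.
print(can_elect_primary(3, 2))
```

This is why a set with a single secondary needs an arbiter: the arbiter stores no data but supplies the extra vote.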
Handling Big Data
Introduction to Map/Reduce
and Sharding
Large Data Sets
• Problem 1
– Performance
• Queries run slowly
• Solution
– Map/Reduce
Map Reduce
• A way to divide large query computation
into smaller chunks
• May run in multiple processes across
multiple machines
• Think of it as GROUP BY of SQL
Map/Reduce Example
• The map function digs through the data and
emits the required values
Map/Reduce Example
• The reduce function takes the output of the map
function and generates an aggregated value
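The map and reduce steps can be sketched in plain Python, without a database. The purchase documents and the `map_fn`, `reduce_fn` and `map_reduce` names are hypothetical; the result is the same as a SQL GROUP BY on user with SUM(amount).

```python
from collections import defaultdict

# Hypothetical purchase documents.
purchases = [
    {"user": "asha", "amount": 120},
    {"user": "ravi", "amount": 80},
    {"user": "asha", "amount": 30},
    {"user": "meera", "amount": 50},
    {"user": "ravi", "amount": 20},
]


def map_fn(doc):
    """Map step: dig into one document and emit a (key, value) pair."""
    return doc["user"], doc["amount"]


def reduce_fn(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    grouped = defaultdict(list)
    for doc in docs:                 # map phase; each document is independent
        key, value = mapper(doc)
        grouped[key].append(value)   # collect emitted values per key
    return {key: reducer(key, values) for key, values in grouped.items()}


totals = map_reduce(purchases, map_fn, reduce_fn)
print(totals)
```

Because the map step handles each document independently, the work can be divided into chunks and spread over several processes or machines.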
Large Data Sets
• Problem 2
– Vertical Scaling of Hardware
• Can’t increase machine size beyond a limit
• Solution
– Sharding
Sharding
• A method for storing data across multiple
machines
• Data is partitioned using Shard Keys
Data Partitioning: Range Based
• A range of Shard Key values stays in one chunk
Data Partitioning: Hash Based
• A hash function on Shard Keys decides the chunk
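The difference between range based and hash based partitioning can be sketched as two functions that map a shard key to a chunk index. The chunk count, the range boundaries and the use of SHA-256 as a stable hash are assumptions for this example only, not a description of MongoDB's internals.

```python
import bisect
import hashlib

NUM_CHUNKS = 4

# Range based: chunk i holds keys below RANGE_BOUNDS[i]; the last chunk holds the rest.
RANGE_BOUNDS = [250, 500, 750]


def range_chunk(shard_key):
    """Chunk index for an integer shard key under range based partitioning."""
    return bisect.bisect_right(RANGE_BOUNDS, shard_key)


def hash_chunk(shard_key):
    """Chunk index under hash based partitioning (SHA-256 as a stable hash)."""
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_CHUNKS


keys = [100, 101, 102, 600, 601, 999]
print([range_chunk(k) for k in keys])  # neighbouring keys share a chunk
print([hash_chunk(k) for k in keys])   # placement depends only on the hash
```

Range based placement keeps neighbouring keys together, which helps range queries; hashing tends to spread writes more evenly at the cost of that locality.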
Sharded Cluster
Optimizing Shards: Splitting
• In a shard, when the size of a chunk grows
beyond a limit, the chunk is split into two
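A toy sketch of chunk splitting, assuming a chunk is just a list of shard key values and the size limit is counted in documents. `MAX_CHUNK_SIZE` and `split_if_needed` are hypothetical names; real chunk limits are measured in data size.

```python
MAX_CHUNK_SIZE = 4  # hypothetical limit, counted in documents for simplicity


def split_if_needed(chunk):
    """Return [chunk] unchanged, or two halves if it has grown past the limit."""
    if len(chunk) <= MAX_CHUNK_SIZE:
        return [chunk]
    ordered = sorted(chunk)
    middle = len(ordered) // 2
    # Each half covers a contiguous range of shard key values.
    return [ordered[:middle], ordered[middle:]]


small = [5, 1, 3]
large = [9, 2, 7, 4, 1, 8]

print(split_if_needed(small))
print(split_if_needed(large))
```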
Optimizing Shards: Balancing
• When the number of chunks in a shard
increases, a few chunks are migrated to
other shards
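Balancing can be sketched as repeatedly moving one chunk from the fullest shard to the emptiest until their chunk counts are close. The `balance` function, the `threshold` value and the shard names are hypothetical; the real balancer applies its own thresholds and migration rules.

```python
def balance(shards, threshold=2):
    """Move chunks from the fullest shard to the emptiest until they are close.

    `shards` maps a shard name to its list of chunk ids; `threshold` is a
    hypothetical allowed difference in chunk counts.
    """
    migrations = []
    while True:
        fullest = max(shards, key=lambda name: len(shards[name]))
        emptiest = min(shards, key=lambda name: len(shards[name]))
        if len(shards[fullest]) - len(shards[emptiest]) < threshold:
            return migrations
        chunk = shards[fullest].pop()
        shards[emptiest].append(chunk)
        migrations.append((chunk, fullest, emptiest))


cluster = {"shard_a": [1, 2, 3, 4, 5, 6], "shard_b": [7], "shard_c": [8, 9]}
moves = balance(cluster)
print(moves)
print({name: len(chunks) for name, chunks in cluster.items()})
```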
Summary
• MongoDB is good
– Stores objects as we use in programming
language
– Flexible semi-structured design
– Scales out to store big data
– Embedded documents eliminate the need for joins
• MongoDB is bad
– No multi-document query
– De-normalized storage
– No support for transactions
Thanks
@akshaymathu @_sarangs
Ad

More Related Content

What's hot (20)

An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQL
Olaf Hartig
 
Power bi
Power biPower bi
Power bi
jainema23
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
Max De Marzi
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
James Serra
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
NodeXperts
 
Apache Spark Data Validation
Apache Spark Data ValidationApache Spark Data Validation
Apache Spark Data Validation
Databricks
 
Introduction to Tableau
Introduction to Tableau Introduction to Tableau
Introduction to Tableau
Mithileysh Sathiyanarayanan
 
Data Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its EcosystemData Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its Ecosystem
Databricks
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101
MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Ravi Teja
 
Introduction to pig & pig latin
Introduction to pig & pig latinIntroduction to pig & pig latin
Introduction to pig & pig latin
knowbigdata
 
The PostgreSQL Query Planner
The PostgreSQL Query PlannerThe PostgreSQL Query Planner
The PostgreSQL Query Planner
Command Prompt., Inc
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashing
Jeet Poria
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Intro to GemStone/S
Intro to GemStone/SIntro to GemStone/S
Intro to GemStone/S
ESUG
 
Spark sql
Spark sqlSpark sql
Spark sql
Freeman Zhang
 
An overview of Neo4j Internals
An overview of Neo4j InternalsAn overview of Neo4j Internals
An overview of Neo4j Internals
Tobias Lindaaker
 
Real Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case StudyReal Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case Study
Nati Shalom
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
S.Shayan Daneshvar
 
Data Visualization With Tableau | Edureka
Data Visualization With Tableau | EdurekaData Visualization With Tableau | Edureka
Data Visualization With Tableau | Edureka
Edureka!
 
An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQL
Olaf Hartig
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
Max De Marzi
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
James Serra
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
NodeXperts
 
Apache Spark Data Validation
Apache Spark Data ValidationApache Spark Data Validation
Apache Spark Data Validation
Databricks
 
Data Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its EcosystemData Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its Ecosystem
Databricks
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101
MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Ravi Teja
 
Introduction to pig & pig latin
Introduction to pig & pig latinIntroduction to pig & pig latin
Introduction to pig & pig latin
knowbigdata
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashing
Jeet Poria
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Intro to GemStone/S
Intro to GemStone/SIntro to GemStone/S
Intro to GemStone/S
ESUG
 
An overview of Neo4j Internals
An overview of Neo4j InternalsAn overview of Neo4j Internals
An overview of Neo4j Internals
Tobias Lindaaker
 
Real Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case StudyReal Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case Study
Nati Shalom
 
Data Visualization With Tableau | Edureka
Data Visualization With Tableau | EdurekaData Visualization With Tableau | Edureka
Data Visualization With Tableau | Edureka
Edureka!
 

Viewers also liked (20)

MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for Beginners
Enoch Joshua
 
Mongo DB
Mongo DBMongo DB
Mongo DB
Karan Kukreja
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
Alex Sharp
 
Connecting NodeJS & MongoDB
Connecting NodeJS & MongoDBConnecting NodeJS & MongoDB
Connecting NodeJS & MongoDB
Enoch Joshua
 
Mongo DB
Mongo DBMongo DB
Mongo DB
Edureka!
 
Mongo DB
Mongo DB Mongo DB
Mongo DB
Tata Consultancy Services
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
Claudio Montoya
 
Pdf almas
Pdf almasPdf almas
Pdf almas
eventosculturales
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
Harischandra M K
 
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorialsMongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
SpringPeople
 
Mongo db
Mongo dbMongo db
Mongo db
Noman Ellahi
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDB
DATAVERSITY
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaper
Rajesh Kumar
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
Lee Theobald
 
Mongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg Solutions
Metatagg Solutions
 
Administrasi MongoDB
Administrasi MongoDBAdministrasi MongoDB
Administrasi MongoDB
Agus Kurniawan
 
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
iis dahlia
 
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
GIS in the Rockies
 
Nosql
NosqlNosql
Nosql
ericwilliammarshall
 
MapReduce and NoSQL
MapReduce and NoSQLMapReduce and NoSQL
MapReduce and NoSQL
Aaron Cordova
 
MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for Beginners
Enoch Joshua
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
Alex Sharp
 
Connecting NodeJS & MongoDB
Connecting NodeJS & MongoDBConnecting NodeJS & MongoDB
Connecting NodeJS & MongoDB
Enoch Joshua
 
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorialsMongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
SpringPeople
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDB
DATAVERSITY
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaper
Rajesh Kumar
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
Lee Theobald
 
Mongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg Solutions
Metatagg Solutions
 
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
Konsep oop pada php dan mvc pada php framework, 1200631047 1200631018 1200631028
iis dahlia
 
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
2013 Tips and Tricks Mashup, From ModelBuilder to Formal Python Code, Step-by...
GIS in the Rockies
 
Ad

Similar to Mongo db (20)

Scalable web architecture
Scalable web architectureScalable web architecture
Scalable web architecture
Kaushik Paranjape
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
Adi Challa
 
Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?
Milind Bhandarkar
 
Challenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopChallenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on Hadoop
DataWorks Summit
 
NoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessNoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-less
InfiniteGraph
 
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
Ashnikbiz
 
NoSQL-Overview
NoSQL-OverviewNoSQL-Overview
NoSQL-Overview
Ranjeet Jha - OCM-JEA
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
Sandeep Kumar
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup?
Dave Nielsen
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data Modeling
Adam Doyle
 
Couchbase 3.0.2 d1
Couchbase 3.0.2  d1Couchbase 3.0.2  d1
Couchbase 3.0.2 d1
Sachin Kumar Kansal
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
Bethmi Gunasekara
 
Module 2.2 Introduction to NoSQL Databases.pptx
Module 2.2 Introduction to NoSQL Databases.pptxModule 2.2 Introduction to NoSQL Databases.pptx
Module 2.2 Introduction to NoSQL Databases.pptx
NiramayKolalle
 
Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)
Minal Patil
 
Scalability Design Principles - Internal Session
Scalability Design Principles - Internal SessionScalability Design Principles - Internal Session
Scalability Design Principles - Internal Session
Sachin Sancheti - Microsoft Azure Architect
 
Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDB
Ahmed Farag
 
Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)
Tech in Asia ID
 
Datastore PPT.pptx
Datastore PPT.pptxDatastore PPT.pptx
Datastore PPT.pptx
Jatin Chuglani
 
Python Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB WorkshopPython Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB Workshop
Joe Drumgoole
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData Analysis
Kai Sasaki
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
Adi Challa
 
Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?Hadoop: The Default Machine Learning Platform ?
Hadoop: The Default Machine Learning Platform ?
Milind Bhandarkar
 
Challenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopChallenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on Hadoop
DataWorks Summit
 
NoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessNoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-less
InfiniteGraph
 
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
FOSSASIA 2015 - 10 Features your developers are missing when stuck with Propr...
Ashnikbiz
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup?
Dave Nielsen
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data Modeling
Adam Doyle
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
Bethmi Gunasekara
 
Module 2.2 Introduction to NoSQL Databases.pptx
Module 2.2 Introduction to NoSQL Databases.pptxModule 2.2 Introduction to NoSQL Databases.pptx
Module 2.2 Introduction to NoSQL Databases.pptx
NiramayKolalle
 
Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)Scalability designprinciples-v2-130718023602-phpapp02 (1)
Scalability designprinciples-v2-130718023602-phpapp02 (1)
Minal Patil
 
Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDB
Ahmed Farag
 
Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)Architecting Database by Jony Sugianto (Detik.com)
Architecting Database by Jony Sugianto (Detik.com)
Tech in Asia ID
 
Python Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB WorkshopPython Ireland Conference 2016 - Python and MongoDB Workshop
Python Ireland Conference 2016 - Python and MongoDB Workshop
Joe Drumgoole
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData Analysis
Kai Sasaki
 
Ad

More from Akshay Mathur (20)

Documentation with Sphinx
Documentation with SphinxDocumentation with Sphinx
Documentation with Sphinx
Akshay Mathur
 
Kubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTechKubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTech
Akshay Mathur
 
Security and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in KubernetesSecurity and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in Kubernetes
Akshay Mathur
 
Enhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices ApplicationsEnhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices Applications
Akshay Mathur
 
Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Akshay Mathur
 
Kubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerKubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning Controller
Akshay Mathur
 
Cloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADSCloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADS
Akshay Mathur
 
Shared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWSShared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWS
Akshay Mathur
 
Techniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloudTechniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloud
Akshay Mathur
 
Introduction to Node js
Introduction to Node jsIntroduction to Node js
Introduction to Node js
Akshay Mathur
 
Object Oriented Programing in JavaScript
Object Oriented Programing in JavaScriptObject Oriented Programing in JavaScript
Object Oriented Programing in JavaScript
Akshay Mathur
 
Getting Started with Angular JS
Getting Started with Angular JSGetting Started with Angular JS
Getting Started with Angular JS
Akshay Mathur
 
Releasing Software Without Testing Team
Releasing Software Without Testing TeamReleasing Software Without Testing Team
Releasing Software Without Testing Team
Akshay Mathur
 
Getting Started with jQuery
Getting Started with jQueryGetting Started with jQuery
Getting Started with jQuery
Akshay Mathur
 
CoffeeScript
CoffeeScriptCoffeeScript
CoffeeScript
Akshay Mathur
 
Creating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JSCreating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JS
Akshay Mathur
 
Getting Started with Web
Getting Started with WebGetting Started with Web
Getting Started with Web
Akshay Mathur
 
Getting Started with Javascript
Getting Started with JavascriptGetting Started with Javascript
Getting Started with Javascript
Akshay Mathur
 
Using Google App Engine Python
Using Google App Engine PythonUsing Google App Engine Python
Using Google App Engine Python
Akshay Mathur
 
Working with GIT
Working with GITWorking with GIT
Working with GIT
Akshay Mathur
 
Documentation with Sphinx
Documentation with SphinxDocumentation with Sphinx
Documentation with Sphinx
Akshay Mathur
 
Kubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTechKubernetes Journey of a Large FinTech
Kubernetes Journey of a Large FinTech
Akshay Mathur
 
Security and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in KubernetesSecurity and Observability of Application Traffic in Kubernetes
Security and Observability of Application Traffic in Kubernetes
Akshay Mathur
 
Enhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices ApplicationsEnhanced Security and Visibility for Microservices Applications
Enhanced Security and Visibility for Microservices Applications
Akshay Mathur
 
Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Considerations for East-West Traffic Security and Analytics for Kubernetes En...
Akshay Mathur
 
Kubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerKubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning Controller
Akshay Mathur
 
Cloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADSCloud Bursting with A10 Lightning ADS
Cloud Bursting with A10 Lightning ADS
Akshay Mathur
 
Shared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWSShared Security Responsibility Model of AWS
Shared Security Responsibility Model of AWS
Akshay Mathur
 
Techniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloudTechniques for scaling application with security and visibility in cloud
Techniques for scaling application with security and visibility in cloud
Akshay Mathur
 
Introduction to Node js
Introduction to Node jsIntroduction to Node js
Introduction to Node js
Akshay Mathur
 
Object Oriented Programing in JavaScript
Object Oriented Programing in JavaScriptObject Oriented Programing in JavaScript
Object Oriented Programing in JavaScript
Akshay Mathur
 
Getting Started with Angular JS
Getting Started with Angular JSGetting Started with Angular JS
Getting Started with Angular JS
Akshay Mathur
 
Releasing Software Without Testing Team
Releasing Software Without Testing TeamReleasing Software Without Testing Team
Releasing Software Without Testing Team
Akshay Mathur
 
Getting Started with jQuery
Getting Started with jQueryGetting Started with jQuery
Getting Started with jQuery
Akshay Mathur
 
Creating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JSCreating Single Page Web App using Backbone JS
Creating Single Page Web App using Backbone JS
Akshay Mathur
 
Getting Started with Web
Getting Started with WebGetting Started with Web
Getting Started with Web
Akshay Mathur
 
Getting Started with Javascript
Getting Started with JavascriptGetting Started with Javascript
Getting Started with Javascript
Akshay Mathur
 
Using Google App Engine Python
Using Google App Engine PythonUsing Google App Engine Python
Using Google App Engine Python
Akshay Mathur
 


Mongo db

  • 1. NoSQL Database Akshay Mathur Sarang Shravagi @akshaymathu, @_sarangs {name: ‘mongo’, type: ‘db’}
  • 3. Let’s Know Each Other • Do you code? • OS? • Programming Language? • Why are you attending?
  • 4. Akshay Mathur • Managed development, testing and release teams over the last 14+ years – Currently Principal Architect at ShopSocially • Founding Team Member of – ShopSocially (Enabling “social” for retailers) – AirTight Networks (Global leader of WIPS)
  • 5. Sarang Shravagi • 10gen Certified Developer and DBA • CS graduate from PICT Pune • 3+ years in Software Product industry • Currently Senior Full-stack Developer at ShopSocially
  • 6. How we use MongoDB • Python • MongoDB • MongoEngine
  • 8. Program Outline: Understanding NoSQL • Data Landscape • Different Storage Needs • Design Paradigm Shift from SQL to NoSQL • Different Datastores • Closer look at Document Storage • Drawing parallels with RDBMS
  • 9. Program Outline: Hands on Lab • Installation and basic configuration • Mongo Shell • Creating and Changing Schema • Create, Read, Update and Delete of Data • Analyzing Performance • Improving performance by creating Indices • Assignment • Problem solving for the assignment
  • 10. Program Outline: Advanced Topics • Handling Big Data – Introduction to Map/Reduce – Introduction to Data Partitioning (Sharding) • Disaster Recovery – Introduction to Replica Sets and High Availability
  • 11. Ground Rules • Disturb Everyone – Not by phone rings – Not by local talks – By more information and questions
  • 12. Data Patterns & Storage Needs
  • 13. Data at an Online Store • Product Information • User Information • Purchase Information • Product Reviews • Site Interactions • Social Graph • Search Index
  • 14. SQL to NoSQL Design Paradigm Shift
  • 15. SQL Storage • Was designed when – Storage and data transfer were costly – Processing was slow – Applications were oriented more towards data collection • Initial adopters were financial institutions
  • 16. SQL Storage • Structured – schema • Relational – foreign keys, constraints • Transactional – Atomicity, Consistency, Isolation, Durability • High Availability through robustness – Minimize failures • Optimized for Writes • Typically Scale Up
  • 17. NoSQL Storage • Is designed when – Storage is cheap – Data transfer is fast – Much more processing power is available • Clustering of machines is also possible – Applications are oriented towards consumption of User Generated Content – Better on-screen user experience is in demand
  • 18. NoSQL Storage • Semi-structured – Schemaless • Consistency, Availability, Partition Tolerance • High Availability through clustering – expect failures • Optimized for Reads • Typically Scale Out
  • 19. Different Datastores Half Level Deep
  • 20. SQL: RDBMS • MySQL, PostgreSQL, Oracle etc. • Stores data in tables having columns – Basic (number, text) data types • Strong query language • Transparent values – Query language can read and filter on them – Relationship between tables based on values • Suited for user info and transactions
  • 21. NoSQL: Key/Value • Redis, DynamoDB etc. • Stores a value against a key – Strings • Values are opaque – Cannot be part of a query • Suited for site interactions
  • 23. NoSQL: Document • MongoDB, CouchDB etc. • Object Oriented data models – Stores data in document objects having fields – Basic and compound (list, dict) data types • SQL like queries • Transparent values – Can be part of query • Suited for product info and its reviews
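
The difference between opaque and transparent values can be sketched in plain Python. This is only an in-memory illustration, not how Redis or MongoDB work internally; the review data, `kv_get` and `find` are hypothetical names chosen for the example.

```python
import json

# Key/value style: every value is an opaque string. The store can only
# hand it back by key; it never looks inside.
kv_store = {
    "review:1": json.dumps({"product": "phone", "rating": 5}),
    "review:2": json.dumps({"product": "phone", "rating": 2}),
    "review:3": json.dumps({"product": "laptop", "rating": 4}),
}


def kv_get(key):
    """Fetch by key only; filtering on the content is up to the caller."""
    return kv_store.get(key)


# Document style: fields (including lists) are visible to the store,
# so a query can filter on them.
documents = [
    {"_id": 1, "product": "phone", "rating": 5, "tags": ["camera", "battery"]},
    {"_id": 2, "product": "phone", "rating": 2, "tags": ["screen"]},
    {"_id": 3, "product": "laptop", "rating": 4, "tags": ["battery"]},
]


def find(collection, **criteria):
    """Return the documents whose fields equal all the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


phone_reviews = find(documents, product="phone")
laptop_rating = json.loads(kv_get("review:3"))["rating"]
```

With the key/value store, finding all phone reviews would mean fetching and decoding every value in application code, which is why the slide calls the values opaque.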
  • 25. NoSQL: Column Family • Cassandra, Bigtable etc. • Stores data in columns • Transparent values – Can be part of query • SQL like queries • Suited for search
  • 27. NoSQL: Graph • Neo4j • Stores data in the form of nodes and relationships • Query is in the form of a traversal • In-memory • Suited for social graph
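
A traversal query on a social graph can be sketched as a breadth-first walk. This is a plain-Python illustration, not Neo4j's query language; the `follows` data and the `within_hops` helper are assumptions made for the example.

```python
from collections import deque

# Hypothetical social graph: each person maps to the people they follow.
follows = {
    "asha": ["bala", "chitra"],
    "bala": ["dev"],
    "chitra": ["dev", "esha"],
    "dev": [],
    "esha": ["asha"],
}


def within_hops(graph, start, max_hops):
    """Breadth-first traversal: map each reachable person to its hop count."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not walk further than max_hops from the start
        for neighbour in graph.get(person, []):
            if neighbour not in distance:
                distance[neighbour] = distance[person] + 1
                queue.append(neighbour)
    del distance[start]  # the starting node is not part of the answer
    return distance


reachable = within_hops(follows, "asha", 2)
```

Here `reachable` holds the direct connections of `asha` (one hop) and their connections (two hops), which is the kind of "friends of friends" question a graph store is suited for.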
  • 30. Document Storage: Closer Look
  • 31. MongoDB • Document database • Powerful query language • Docs, sub-docs, indexes • Map/reduce • Replicas, shards, replicated shards • SDKs/drivers for many languages – C, C++, C#, Python, Erlang, PHP, Java, JavaScript, Node.js, Perl, Ruby, Scala
  • 34. RDBMS → MongoDB • Database → Database • Table → Collection • Row → Document • Column → Field • SQL: Select c1, c2 from Table where c1 = 'v1' order by c2 limit n • MongoEngine: Collection.objects(c1 = 'v1').order_by('c2').limit(n)
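
The filter, order and limit steps of the query on this slide can be sketched without a database. The snippet below is a plain-Python stand-in over a list of dictionaries; the `select` helper and the sample rows are hypothetical, and a real MongoEngine query would run against a MongoDB server instead.

```python
from operator import itemgetter

# Hypothetical collection: each dict plays the role of a row/document.
collection = [
    {"c1": "v1", "c2": 30},
    {"c1": "v2", "c2": 10},
    {"c1": "v1", "c2": 20},
    {"c1": "v1", "c2": 40},
]


def select(docs, where, order_by, limit):
    """Emulate: SELECT ... WHERE <field = value> ORDER BY <field> LIMIT n."""
    matching = [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in where.items())
    ]
    matching.sort(key=itemgetter(order_by))  # ascending, like ORDER BY c2
    return matching[:limit]


result = select(collection, where={"c1": "v1"}, order_by="c2", limit=2)
```

The three stages map one to one onto `where`, `order by` and `limit` in SQL, and onto `objects(...)`, `order_by(...)` and `limit(...)` in the MongoEngine form.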
  • 40. MongoDB: Core Binaries • mongod – Database server • mongo – Database client shell • mongos – Router for Sharding
  • 41. Getting Help • For mongo shell – mongo --help • Shows options available for running the shell • Inside mongo shell – Object.help() • Shows commands available on the object
  • 42. Import Export Tools • For objects – mongodump – mongorestore – bsondump – mongooplog • For data items – mongoimport – mongoexport
  • 43. Database Operations • Database creation • Creating/changing collection • Data insertion • Data read • Data update • Creating indices • Data deletion • Dropping collection
  • 44. Diagnostic Tools • mongostat • mongoperf • mongosniff • mongotop
  • 46. Assignment • Go to http://www.velocitainc.com/mongo/ – Tasks • assignments.txt – Data • students.json
  • 47. Disaster Recovery Introduction to Replica Sets and High Availability
  • 48. Disasters • Physical Failure – Hardware – Network • Solution – Replica Sets • Provide redundant storage for High Availability – Real time data synchronization • Automatic failover for zero downtime
  • 50. Multi Replication • Data can be replicated to multiple places simultaneously • An odd number of machines is always needed in a replica set
  • 51. Single Replication • If you want only one secondary, or any other odd number of secondaries, you need to set up an arbiter
  • 52. Failover • When the primary fails, the remaining machines vote to elect a new primary
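
Why an odd number of voters matters can be sketched with a majority check. This is a deliberately simplified model of an election, not MongoDB's actual protocol; the `Member` class, the `last_write` counter and `elect_primary` are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    last_write: int        # hypothetical counter: higher means more up to date
    arbiter: bool = False  # an arbiter votes but holds no data
    up: bool = True


def elect_primary(members):
    """Return the new primary's name, or None if no majority can vote."""
    votes_needed = len(members) // 2 + 1  # strict majority of all voters
    reachable = [m for m in members if m.up]
    if len(reachable) < votes_needed:
        return None  # too few voters left: no primary can be elected
    candidates = [m for m in reachable if not m.arbiter]
    if not candidates:
        return None
    # Simplification: the most up-to-date data-bearing member wins.
    return max(candidates, key=lambda m: m.last_write).name


# Primary "a" has failed in each scenario below.
two_members = [Member("a", 10, up=False), Member("b", 10)]
with_arbiter = two_members + [Member("arb", 0, arbiter=True)]
three_members = [Member("a", 10, up=False), Member("b", 9), Member("c", 10)]

no_majority = elect_primary(two_members)
arbiter_result = elect_primary(with_arbiter)
three_result = elect_primary(three_members)
```

With one primary and one secondary, the surviving machine holds only one of two votes and cannot win; adding an arbiter brings the voter count to three, so the survivor plus the arbiter form a majority.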
  • 53. Handling Big Data Introduction to Map/Reduce and Sharding
  • 54. Large Data Sets • Problem 1 – Performance • Queries run slowly • Solution – Map/Reduce
  • 55. Map Reduce • A way to divide large query computation into smaller chunks • May run in multiple processes across multiple machines • Think of it as GROUP BY of SQL
  • 56. Map/Reduce Example • Map function digs through the data and returns the required values
  • 57. Map/Reduce Example • Reduce function uses the output of the Map function and generates an aggregated value
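
The map and reduce steps described above can be sketched in a single Python process. The original slide examples are not reproduced in this transcript, so the `orders` data and the function names below are hypothetical; a real job would spread the map calls across processes or machines.

```python
from collections import defaultdict

# Hypothetical documents to aggregate.
orders = [
    {"customer": "asha", "amount": 250},
    {"customer": "bala", "amount": 100},
    {"customer": "asha", "amount": 50},
]


def map_order(doc):
    """Map: pull out the grouping key and the value needed for the result."""
    yield doc["customer"], doc["amount"]


def reduce_amounts(key, values):
    """Reduce: combine all values emitted for one key into one aggregate."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Run map over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


totals = map_reduce(orders, map_order, reduce_amounts)
```

This is the equivalent of `SELECT customer, SUM(amount) ... GROUP BY customer`: the map step chooses the grouping key, and the reduce step computes the aggregate per key.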
  • 58. Large Data Sets • Problem 2 – Vertical Scaling of Hardware • Can’t increase machine size beyond a limit • Solution – Sharding
  • 59. Sharding • A method for storing data across multiple machines • Data is partitioned using Shard Keys
  • 60. Data Partitioning: Range Based • A range of Shard Keys stays in a chunk
  • 61. Data Partitioning: Hash Based • A hash function on Shard Keys decides the chunk
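
The two partitioning schemes can be compared with a small Python sketch. The range boundaries, the chunk count and the use of MD5 here are assumptions for illustration only; they are not taken from the slides.

```python
import bisect
import hashlib

# Hypothetical boundaries: chunk 0 holds keys below 100, chunk 1 holds
# 100-199, chunk 2 holds 200-299, chunk 3 holds 300 and above.
RANGE_BOUNDS = [100, 200, 300]


def range_chunk(shard_key):
    """Range based: neighbouring keys land in the same chunk."""
    return bisect.bisect_right(RANGE_BOUNDS, shard_key)


def hash_chunk(shard_key, chunk_count=4):
    """Hash based: a hash of the key picks the chunk, scattering neighbours."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % chunk_count


range_placement = [range_chunk(key) for key in (50, 100, 250, 300)]
hash_placement = [hash_chunk(key) for key in (50, 100, 250, 300)]
```

Range-based placement keeps range queries on few chunks but can overload one chunk when keys grow monotonically; hash-based placement spreads writes more evenly at the cost of scattering range queries.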
  • 63. Optimizing Shards: Splitting • In a shard, when the size of a chunk increases, the chunk is divided into two
  • 64. Optimizing Shards: Balancing • When the number of chunks in a shard increases, a few chunks are migrated to another shard
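
Splitting and balancing can be sketched as two small functions. The size limit, the threshold and the data layout are hypothetical, and this toy version counts documents rather than bytes; MongoDB's real splitter and balancer use their own sizes and thresholds.

```python
MAX_CHUNK_SIZE = 4  # hypothetical limit: shard keys per chunk


def split_chunk(chunk):
    """Split a sorted list of shard keys into two halves once it is too big."""
    if len(chunk) <= MAX_CHUNK_SIZE:
        return [chunk]
    middle = len(chunk) // 2
    return [chunk[:middle], chunk[middle:]]


def balance(shards, threshold=2):
    """Migrate chunks from the fullest shard to the emptiest one until
    their chunk counts differ by less than the threshold."""
    while True:
        fullest = max(shards, key=lambda name: len(shards[name]))
        emptiest = min(shards, key=lambda name: len(shards[name]))
        if len(shards[fullest]) - len(shards[emptiest]) < threshold:
            return shards
        shards[emptiest].append(shards[fullest].pop())


halves = split_chunk([1, 2, 3, 4, 5, 6])
cluster = balance({"shard1": [[1], [2], [3], [4], [5]], "shard2": [[6]]})
chunk_counts = {name: len(chunks) for name, chunks in cluster.items()}
```

Splitting keeps individual chunks small enough to move cheaply, and balancing then evens out how many chunks each shard holds.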
  • 65. Summary • MongoDB is good – Stores objects the way we use them in a programming language – Flexible semi-structured design – Scales out to store big data – Embedded documents eliminate the need for joins • MongoDB is bad – No multi-document query – De-normalized storage – No support for transactions