NOSQL DATABASES
AND BIG DATA STORAGE SYSTEMS
Ateeq Ateeq
CONTENT
 1- Introduction to NOSQL Systems
 2- The CAP Theorem
 3- Document-Based NOSQL Systems and MongoDB
 4- NOSQL Key-Value Stores
 5- Column-Based or Wide Column NOSQL Systems
 6- NOSQL Graph Databases and Neo4j
INTRODUCTION TO NOSQL SYSTEMS
 1.1 Emergence of NOSQL Systems
 1.2 Characteristics of NOSQL Systems
 1.3 Categories of NOSQL Systems
1.1 EMERGENCE OF NOSQL SYSTEMS
 SQL systems may not be appropriate for some applications,
such as email.
 SQL systems offer too many services (powerful query
language, concurrency control, etc.), which such applications
may not need.
 A structured data model such as the traditional relational model
may be too restrictive.
 SQL systems require schemas, which many NOSQL systems
do not.
1.1 EMERGENCE OF NOSQL SYSTEMS
 Examples of NOSQL systems:
   Google – BigTable
   Amazon – DynamoDB
   Facebook – Cassandra
   MongoDB
   CouchDB
   Graph databases such as Neo4j and GraphBase
1.2 CHARACTERISTICS OF NOSQL SYSTEMS
 NOSQL characteristics related to distributed
databases and distributed systems.
 NOSQL characteristics related to data models and
query languages.
CHARACTERISTICS RELATED TO DISTRIBUTED
DATABASES AND DISTRIBUTED SYSTEMS
1- Scalability:
 Horizontal scalability: adding more nodes for data
storage and processing as the volume of data grows.
 Vertical scalability: expanding the storage and
computing power of existing nodes.
 NOSQL systems generally rely on horizontal scalability, applied
while the system is operational, so techniques for
distributing the existing data among new nodes without
interrupting system operation are necessary.
CHARACTERISTICS RELATED TO DISTRIBUTED
DATABASES AND DISTRIBUTED SYSTEMS
2- Availability, Replication and Eventual Consistency:
 Data is replicated over two or more nodes in a
transparent manner.
 Updates must be applied to every copy of a replicated
data item.
 Eventual consistency: a consistency model used in
distributed computing to achieve high availability. It
informally guarantees that, if no new updates are made
to a given data item, eventually all accesses to that item
will return the last updated value.
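The eventual-consistency guarantee above can be illustrated with a small Python sketch. The names (Replica, sync) and the newest-timestamp-wins merge rule are assumptions made for this illustration; real systems propagate updates and handle conflicts in more elaborate ways.

```python
# Toy model of eventual consistency: each replica keeps (timestamp, value)
# and replicas exchange state pairwise until they agree.

class Replica:
    def __init__(self, name):
        self.name = name
        self.timestamp = 0   # logical time of the newest write seen so far
        self.value = None

    def write(self, timestamp, value):
        # Accept the write only if it is newer than what this replica holds.
        if timestamp > self.timestamp:
            self.timestamp, self.value = timestamp, value


def sync(a, b):
    """Exchange step: both replicas adopt the newer of their two states."""
    newest = max((a, b), key=lambda r: r.timestamp)
    a.write(newest.timestamp, newest.value)
    b.write(newest.timestamp, newest.value)


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
r1.write(1, "draft")       # the update reaches only r1 at first
stale_read = r3.value      # r3 has not seen the update yet
sync(r1, r2)
sync(r2, r3)               # no new updates arrive, so the copies converge
converged = [r.value for r in (r1, r2, r3)]
```

A read at r3 before the exchange returns the old state; once no new updates arrive and the replicas have exchanged state, every read returns the last written value.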
CHARACTERISTICS RELATED TO DISTRIBUTED
DATABASES AND DISTRIBUTED SYSTEMS
3- Replication Models:
 Master-slave replication: requires one copy to be the
master copy.
   Write operations must be applied to the master copy and are then
propagated to the slave copies, usually using eventual consistency.
   Reads can either all go to the master copy, or be served by the
slave copies, which does not guarantee that the values read are the
latest writes.
 Master-master replication: allows reads and writes at
any of the replicas.
   The values of an item may be temporarily inconsistent across replicas.
   A reconciliation method to resolve conflicting write operations on
the same data item at different nodes must be implemented as
part of the master-master replication scheme.
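One possible reconciliation method is sketched below in Python. It assumes a last-writer-wins policy with node name as tie-breaker; this is only one of several policies a master-master scheme could use, and the node names and values are hypothetical.

```python
# Two masters accept conflicting writes to the same item; a reconciliation
# step picks one winner that every node then adopts.

def reconcile(versions):
    """Return the winning (timestamp, node, value) tuple.

    The newest timestamp wins; ties are broken by node name so that
    every node independently picks the same winner.
    """
    return max(versions, key=lambda v: (v[0], v[1]))


# Conflicting writes made at two different masters (illustrative data).
write_at_a = (105, "node-a", "Jenin")
write_at_b = (107, "node-b", "Nablus")

winner = reconcile([write_at_a, write_at_b])
replicas = {"node-a": winner[2], "node-b": winner[2]}  # both adopt the winner
```

Last-writer-wins is simple but silently discards the losing write, which is why some systems keep both versions and let the application merge them.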
CHARACTERISTICS RELATED TO DISTRIBUTED
DATABASES AND DISTRIBUTED SYSTEMS
 4- Sharding of Files:
 Files can have many millions of records accessed concurrently by
thousands of users.
 Sharding (also known as horizontal partitioning) serves to distribute the load
of accessing the file records to multiple nodes.
 Sharding the records and replicating the shards work in tandem to improve
load balancing as well as data availability.
CHARACTERISTICS RELATED TO DISTRIBUTED
DATABASES AND DISTRIBUTED SYSTEMS
 5- High-Performance Data Access:
 Hashing: a hash function h(K) is applied to the key K, and the location of
the value is given by the result of h(K).
 Range partitioning: the location is determined via a range of key values.
Example: location i would hold the objects whose key values K are in the
range K_i,min ≤ K ≤ K_i,max.
In applications that require range queries, where multiple objects within a range of
key values are retrieved, range partitioning is preferred.
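The two placement schemes can be compared with a short Python sketch. The node count, the range boundaries and the function names are assumptions chosen for illustration, and keys are assumed to be non-negative integers for the range case.

```python
import hashlib
from bisect import bisect_right

NUM_NODES = 4


def hash_location(key: str) -> int:
    """Hash partitioning: the node index is h(K) mod the number of nodes."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_NODES


# Range partitioning: node i holds keys with
# LOWER_BOUNDS[i] <= K < LOWER_BOUNDS[i + 1]; the last node is open-ended.
LOWER_BOUNDS = [0, 100, 200, 300]


def range_location(key: int) -> int:
    """Index of the node whose key range contains `key`."""
    return bisect_right(LOWER_BOUNDS, key) - 1


def nodes_for_range(low: int, high: int) -> list:
    """Nodes that a range query low <= K <= high has to visit."""
    return list(range(range_location(low), range_location(high) + 1))


single_node = nodes_for_range(120, 180)   # stays inside one range
two_nodes = nodes_for_range(150, 250)     # crosses one boundary
```

With range partitioning a narrow range query touches one or two nodes, whereas under hashing the keys of the same range are scattered, so the query would have to go to every node.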
CHARACTERISTICS RELATED TO DATA MODELS
AND QUERY LANGUAGES.
 1- Not Requiring a Schema:
 Semi-structured, self-describing data is allowed.
 In some systems users can specify a partial schema to improve storage
efficiency, but most NOSQL systems do not require a schema.
 Constraints on the data have to be programmed in the application
programs that access the data items.
 Languages for describing semi-structured data: JSON (JavaScript Object
Notation) and XML (Extensible Markup Language).
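A small Python sketch of what this means in practice: two self-describing JSON documents with different attributes, and a constraint check that lives in the application rather than in the database. The Budget field and the validation rules are hypothetical and serve only as an illustration.

```python
import json

# Two documents of the same kind with different attributes; each one
# carries its own field names, so no shared schema is needed.
docs = [
    json.loads('{"_id": "P1", "Pname": "ProjectX", "Plocation": "Jenin"}'),
    json.loads('{"_id": "P2", "Pname": "ProjectY", "Budget": 5000}'),
]


def validate_project(doc):
    """Application-level constraint check (hypothetical rules)."""
    errors = []
    if not isinstance(doc.get("Pname"), str) or not doc["Pname"]:
        errors.append("Pname must be a non-empty string")
    if "Budget" in doc and doc["Budget"] < 0:
        errors.append("Budget must be non-negative")
    return errors


# Map each document id to the list of constraint violations found.
results = {d["_id"]: validate_project(d) for d in docs}
```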
CHARACTERISTICS RELATED TO DATA MODELS
AND QUERY LANGUAGES.
 2- Less Powerful Query Languages:
 Many applications that use NOSQL systems do not require a powerful
query language such as SQL, because search (read) queries in these systems
often locate single objects in a single file based on their object keys.
 Reading and writing the data objects is accomplished by calling the
appropriate operations of a programming interface (API).
 These operations are known as CRUD (Create, Read, Update and Delete), or
SCRUD when Search is added.
 Some systems provide a high-level query language, but it may not have the
full power of SQL; for example, joins may need to be implemented in the
application programs.
CHARACTERISTICS RELATED TO DATA MODELS
AND QUERY LANGUAGES.
 3- Versioning:
 Some NOSQL systems store multiple versions of the data items, with the
timestamp of when each version was created.
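Versioning can be pictured with the following Python sketch. The class name, method names and integer timestamps are assumptions for illustration and do not correspond to any particular system's API.

```python
class VersionedStore:
    """Keeps every version of an item as (timestamp, value), oldest first."""

    def __init__(self):
        self._versions = {}

    def put(self, key, value, timestamp):
        # Append the new version and keep the list ordered by timestamp.
        self._versions.setdefault(key, []).append((timestamp, value))
        self._versions[key].sort(key=lambda tv: tv[0])

    def get(self, key, as_of=None):
        """Latest value, or the latest value written at or before `as_of`."""
        candidates = [
            (t, v) for t, v in self._versions.get(key, [])
            if as_of is None or t <= as_of
        ]
        return candidates[-1][1] if candidates else None


store = VersionedStore()
store.put("P1.Plocation", "Jenin", timestamp=10)
store.put("P1.Plocation", "AAUJ", timestamp=20)
latest = store.get("P1.Plocation")              # newest version
earlier = store.get("P1.Plocation", as_of=15)   # version valid at time 15
```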
1.3 CATEGORIES OF NOSQL SYSTEMS
The most common categories:
1. Document-based NOSQL systems:
 Store data in the form of documents using well-known formats such as JSON.
 Documents are accessible via their document id, but can also be accessed rapidly
using other indexes.
2. NOSQL key-value stores:
 Fast access by the key to the value associated with the key
 Value can be a record or an object or a document or even have a more complex
data structure.
3. Column-based or wide column NOSQL systems:
 Partition a table by column into column families
 Form of vertical partitioning.
4. Graph-based NOSQL systems:
 Data is represented as graphs
 Related nodes can be found by traversing the edges using path expressions.
1.3 CATEGORIES OF NOSQL SYSTEMS
Additional categories:
5. Hybrid NOSQL systems:
 These systems have characteristics from two or more of the common categories.
6. Object databases.
7. XML databases.
THE CAP THEOREM
 The CAP theorem: it is impossible to guarantee consistency, availability and
partition tolerance at the same time in a distributed system with data
replication.
 The system designer has to choose two of the three properties to guarantee.
 Weaker consistency levels are often used in NOSQL systems instead
of guaranteeing serializability.
 Eventual consistency is often used.
THE CAP THEOREM
 The CAP theorem is used to explain some of the
competing requirements in a distributed system with
replication.
 The three letters in CAP refer to:
   Consistency (among replicated copies):
    The nodes will have the same copies of a replicated data item
visible for various transactions.
   Availability (of the system for read and write operations):
    Each read or write will either be processed successfully or will
receive a message that the operation cannot be completed.
   Partition tolerance (in the face of the nodes in the system
being partitioned by a network fault):
    The system can continue operating if the network connecting the
nodes has a fault that results in two or more partitions.
    Nodes in each partition can only communicate among each other.
DOCUMENT-BASED NOSQL SYSTEMS AND MONGODB
1. Introduction
2. MongoDB Data Model
3. MongoDB CRUD Operations
4. MongoDB Distributed Systems Characteristics
3.1 INTRODUCTION
 Document-based NOSQL systems store data as
collections of similar documents.
 Documents resemble complex objects or XML
documents
 Documents in a collection should be similar, but
they can have different attributes.
 Document-based NOSQL systems: MongoDB and
CouchDB.
3.2 MONGODB DATA MODEL
 MongoDB is a free and open-source cross-platform
document-oriented database.
 It is classified as a NoSQL database.
3.2 MONGODB DATA MODEL
 MongoDB documents are stored in BSON (Binary
JSON) format.
 BSON is a variation of JSON with some additional data
types and is more efficient for storage than JSON.
 Individual documents are stored in a collection.
 The operation createCollection is used to create each
collection.
3.2 MONGODB DATA MODEL
 Example: create a collection called project to hold PROJECT
objects from the COMPANY database:
db.createCollection("project", { capped : true, size : 1310720,
max : 500 } )
 "project" is the name of the collection (mandatory).
 capped: a capped collection has upper limits on its storage
space (size, in bytes) and number of documents (max).
 The capping parameters help the system choose the storage options
for each collection.
3.2 MONGODB DATA MODEL
 Example: create a document collection called worker:
db.createCollection("worker", { capped : true, size : 5242880, max : 2000 } )
 Each document has a unique ObjectId field "_id".
 By default, the _id is:
   Automatically indexed in the collection.
   System-generated.
 System-generated ObjectIds have a specific format that combines the timestamp when the object is
created, the node id, the process id and a counter.
 User-generated ObjectIds can have any value specified by the user as long as it is unique within the collection.
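Because a system-generated ObjectId starts with its creation timestamp, the creation time can be read back from the id itself. The Python sketch below assumes the 12-byte ObjectId layout, whose first 4 bytes hold the seconds since the Unix epoch. The function name and the sample id are illustrative, and the remaining bytes are not interpreted here.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Decode the creation time from the first 4 bytes of an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, written as 24 hex characters")
    seconds = int(oid_hex[:8], 16)   # big-endian seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Sample id used only for illustration.
created = objectid_timestamp("507f1f77bcf86cd799439011")
```

This is also why sorting documents by a system-generated _id roughly orders them by creation time.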
3.2 MONGODB DATA MODEL
 A collection does not have a schema.
 The structure of the data fields in documents is chosen based on
how documents will be accessed and used, and the user can choose
a normalized design (similar to normalized relational tuples) or a
denormalized design (similar to XML documents or complex objects).
 Interdocument references can be specified by storing in one
document the ObjectId or ObjectIds of other related documents.
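The two design choices can be written down as plain Python dictionaries standing in for documents. The field names Ename, Hours and ProjectId, and all values, are hypothetical; the sketch only shows how the same question is answered under each design.

```python
# Denormalized design: worker info is embedded inside the project document.
project_embedded = {
    "_id": "P1",
    "Pname": "ProjectX",
    "Workers": [
        {"Ename": "Worker One", "Hours": 32.5},
        {"Ename": "Worker Two", "Hours": 20.0},
    ],
}

# Normalized design: separate worker documents reference the project by id.
project = {"_id": "P1", "Pname": "ProjectX"}
workers = [
    {"_id": "W1", "Ename": "Worker One", "ProjectId": "P1", "Hours": 32.5},
    {"_id": "W2", "Ename": "Worker Two", "ProjectId": "P1", "Hours": 20.0},
]


def total_hours_embedded(doc):
    """One document holds everything needed."""
    return sum(w["Hours"] for w in doc["Workers"])


def total_hours_referenced(project_id, worker_docs):
    """The application follows the references itself."""
    return sum(w["Hours"] for w in worker_docs if w["ProjectId"] == project_id)
```

The embedded form answers the question from a single document; the referenced form avoids duplicating worker data but leaves the matching to the application.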
3.2 MONGODB DATA MODEL
[Figures not reproduced: Company database example. (1) Project info with embedded workers info. (2) Project info with an embedded workers array, plus workers. (3) Project ID as an attribute.]
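3.3 MONGODB CRUD OPERATIONS
 Insert:
   db.<collection_name>.insert(<document(s)>)
   Example:
   db.project.insert({_id: "P1", Pname: "ProjectX", Plocation: "Jenin"})
 Delete: remove
   db.<collection_name>.remove(<condition>)
   Example (the _id was inserted as the string "P1", so it is matched as a string):
   db.project.remove({"_id": "P1"})
 In current MongoDB shells, insert and remove are deprecated in favor of
insertOne/insertMany and deleteOne/deleteMany.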
3.3 MONGODB CRUD OPERATIONS
 Read: find
   db.<collection_name>.find(<condition>)
   Example:
   db.project.find({"_id": "P1"})
 Update:
   db.<collection_name>.update(<selection_criteria>, <updated_data>)
   Example:
   db.project.update({"_id": "P1"}, {$set: {"Plocation": "AAUJ"}})
3.4 MONGODB DISTRIBUTED SYSTEMS
CHARACTERISTICS
 Replication in MongoDB
 Sharding in MongoDB
REPLICATION IN MONGODB
 MongoDB uses a variation of the master-slave approach called a replica set.
 All writes, and by default all reads, are done on the primary copy.
 Secondary copies can take over when the primary fails.
SHARDING IN MONGODB
 Sharding of the documents in the collection—also
known as horizontal partitioning— divides the
documents into disjoint partitions known as shards.
 Two ways:
 Range partitioning
 Hash partitioning
SHARDING IN MONGODB
 Range and hash partitioning require that the user
specify a particular document field to be used as
the basis for partitioning the documents into shards.
 The partitioning field, known as the "shard key",
must exist in every document in the collection, and
it must have an index.
 The values of the shard key are divided into
chunks, and the documents are partitioned based
on the chunks of shard key values.
SHARDING IN MONGODB
 With range partitioning, chunks are created by specifying a range
of key values, and each chunk contains the key values in one
range.
 If range queries are commonly applied to a
collection (for example, retrieving all documents
whose shard key value is between 200 and 400),
then range partitioning is preferred,
because each range query will typically be submitted to
a single node that contains all the required documents
in one shard.
 If most searches retrieve one document at a time,
hash partitioning may be preferable because it
randomizes the distribution of shard key values into
chunks.
SHARDING IN MONGODB
 MongoDB queries are submitted to a module called
the query router, which keeps track of which nodes
contain which shards based on the particular
partitioning method used on the shard keys.
 The query will be routed to the nodes that contain the
shards that hold the documents that the query is
requesting.
 If the system cannot determine which shards hold the
required documents, the query will be submitted to all
the nodes that hold shards of the collection.
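The routing decision described above can be sketched in Python. The chunk table, the shard names and the query format (a dict with an optional key_range entry) are assumptions for illustration and not MongoDB's actual router interface.

```python
# Chunk table: (low, high, shard) means documents with
# low <= shard_key < high live on that shard.
CHUNKS = [
    (0, 200, "shard-a"),
    (200, 400, "shard-b"),
    (400, 600, "shard-c"),
]


def route(query):
    """Return the shards a query must be sent to.

    A "key_range" entry (low, high) means low <= shard_key <= high.
    Without it the router cannot narrow the target, so it broadcasts.
    """
    if "key_range" not in query:
        return sorted({shard for _, _, shard in CHUNKS})
    low, high = query["key_range"]
    # Keep every chunk whose range overlaps the requested range.
    return [shard for c_low, c_high, shard in CHUNKS
            if c_low <= high and low < c_high]


targeted = route({"key_range": (210, 390)})   # falls inside one chunk
broadcast = route({"Pname": "ProjectX"})      # no shard key in the query
```

Queries that include the shard key are sent only to the shards that can hold matching documents; all other queries are sent to every shard.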
SHARDING IN MONGODB
 Sharding and replication are used together:
 Sharding focuses on improving performance via load
balancing and horizontal scalability.
 Replication focuses on ensuring system availability
when certain nodes fail in the distributed system.
WHY NOSQL?
 Document or table?
 SQL is restricted: to add Description, Rate and Reviews, alter the table, then fill in the data.
 NOSQL is flexible: no schema restrictions.
WHY NOSQL? - USE CASES WHERE NOSQL
WILL OUTPERFORM SQL
 Agile - Flexibility for Faster Development
 Agile - Simplicity for Easier Development
   Reading this profile (figure not reproduced) would require the application to
read six rows from three tables.
 Availability for Always-on
NOSQL CATEGORIES EXAMPLES -
DOCUMENT-BASED NOSQL SYSTEMS
 XML is stored in a native XML type.
 The query retrieves the <Features> child element of
the <ProductDescription> element.
[Query and result figures not reproduced.]
NOSQL CATEGORIES EXAMPLES - NOSQL
KEY-VALUE STORES
 Riak as an example
 The response to a query is an object containing
a list of the documents that match the query.
 The documents returned are Search documents (a
set of Solr field/value pairs).
NOSQL CATEGORIES EXAMPLES - COLUMN
NOSQL SYSTEMS
 Cassandra as an example
 A query returns a result set of rows, where each row
consists of a key and a collection of columns
corresponding to the query.
 LOCAL_QUORUM is a consistency level:
   Used in clusters that span multiple data centers.
   Used to maintain consistency locally (within a single data center).
NOSQL CATEGORIES EXAMPLES - GRAPH-
BASED NOSQL SYSTEMS
 Neo4j as an example
NOSQL CATEGORIES EXAMPLES - OBJECT
DATABASES
 LINQ as an example
NOSQL KEY-VALUE STORES
1. Introduction
2. DynamoDB Overview
3. Voldemort Key-Value Distributed Data Store
4. Examples of Other Key-Value Stores
4.1 INTRODUCTION
 No query language
 A set of operations that can be used by the
application programmers.
 Characteristics:
 Every value is associated with a unique key.
 Retrieving the value by supplying the key is very fast.
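A minimal in-memory Python sketch of such an operation set follows. The class and method names are illustrative and do not reproduce the API of any specific key-value store.

```python
class KeyValueStore:
    """Minimal in-memory key-value store with three basic operations."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Insert the value, or overwrite the one already stored under key.
        self._data[key] = value

    def get(self, key):
        # A single dictionary lookup; returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key):
        # Report whether there was something to delete.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1", {"name": "Worker One", "langs": ["ar", "en"]})
profile = store.get("user:1")
```

The value is opaque to the store: it can be a record, a document or any other structure, and the only access path is the key.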
4.2 DYNAMODB OVERVIEW
 An Amazon product, part of AWS.
 The data model uses the concepts of tables, items,
and attributes.
   A table does not have a schema.
   It holds a collection of self-describing items.
   Each item consists of a number of (attribute, value) pairs.
   Attribute values can be single-valued or multivalued.
4.2 DYNAMODB OVERVIEW
[Code figure not reproduced: uploads an item to the ProductCatalog table.]
4.3 VOLDEMORT KEY-VALUE DISTRIBUTED
DATA STORE
 Based on the design of Amazon's Dynamo.
 Used by LinkedIn.
 Simple and basic set of operations (put, delete
and get).
 Pluggable storage engines, such as MySQL.
 Nodes are independent.
 Automatic replication and partitioning.
4.4 EXAMPLES OF OTHER KEY-VALUE
STORES
1. Oracle key-value store.
2. Redis key-value cache and store.
3. Apache Cassandra
COLUMN-BASED OR WIDE COLUMN
NOSQL SYSTEMS
 Stores data tables as columns rather than as rows.
HBASE DATA MODEL AND VERSIONING
 Apache HBase is an open-source, distributed, versioned,
non-relational database.
 A column is identified by the combination (column family:column
qualifier).
 HBase stores multiple versions of a data item, with a timestamp associated
with each version.
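The addressing scheme (row key, column family:column qualifier, timestamp) can be modelled with nested dictionaries. This Python sketch is a simplification under assumed names (TinyTable, put, get) and is not the HBase client API; the column families and values are illustrative.

```python
from collections import defaultdict


class TinyTable:
    """row key -> "family:qualifier" -> list of (timestamp, value)."""

    def __init__(self, families):
        self.families = set(families)   # fixed when the table is created
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Qualifiers need no declaration; any name within a family works.
        self.rows[row][column].append((timestamp, value))

    def get(self, row, column, versions=1):
        """Newest `versions` cells of one column, newest first."""
        cells = sorted(self.rows[row][column], key=lambda c: c[0], reverse=True)
        return cells[:versions]


table = TinyTable(["Name", "Details"])
table.put("row1", "Name:Fname", "John", timestamp=1)
table.put("row1", "Name:Fname", "Jon", timestamp=2)
newest = table.get("row1", "Name:Fname")
history = table.get("row1", "Name:Fname", versions=2)
```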
HBASE DATA MODEL AND VERSIONING
 A table is divided into a number of regions.
 Each region holds a range of row keys (range partitioning).
 Apache Zookeeper and Apache HDFS (Hadoop Distributed
File System) are used for management.
NOSQL GRAPH DATABASES AND NEO4J
 The data is represented as a graph, which is a collection of vertices
(nodes) and edges.
 Nodes and edges can be labeled to indicate the types of entities and
relationships they represent.
 It is generally possible to store data associated with both individual
nodes and individual edges.
 Neo4j is an open-source NOSQL graph database implemented in Java.
NEO4J
 The data model in Neo4j organizes data using the concepts of nodes
and relationships.
 Nodes and relationships have properties which store the data items.
 Nodes can have labels.
 Nodes that have the same label are grouped into a collection that
identifies a subset of the nodes in the database graph for querying
purposes.
 A node can have zero, one, or several labels.
NEO4J
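The concepts of nodes, labels, relationships and properties can be mimicked in a few lines of Python. This is a toy in-memory model with assumed names (Graph, add_node, relate), not Neo4j's API or its Cypher query language, and the sample nodes are illustrative.

```python
class Graph:
    def __init__(self):
        self.nodes = {}           # node id -> {"labels": set, "props": dict}
        self.relationships = []   # (start id, type, end id, props)

    def add_node(self, node_id, labels=(), **props):
        # A node can carry zero, one, or several labels.
        self.nodes[node_id] = {"labels": set(labels), "props": props}

    def relate(self, start, rel_type, end, **props):
        # Relationships are directed and can carry their own properties.
        self.relationships.append((start, rel_type, end, props))

    def with_label(self, label):
        """The subset of nodes grouped under one label."""
        return sorted(n for n, d in self.nodes.items() if label in d["labels"])

    def neighbors(self, start, rel_type):
        """Follow outgoing relationships of one type from a node."""
        return [e for s, t, e, _ in self.relationships
                if s == start and t == rel_type]


g = Graph()
g.add_node("e1", labels=["EMPLOYEE"], Lname="Smith")
g.add_node("d5", labels=["DEPARTMENT"], Dname="Research")
g.add_node("tmp")                                # a node with no label
g.relate("e1", "WorksFor", "d5", Since=2017)
employees = g.with_label("EMPLOYEE")
departments_of_e1 = g.neighbors("e1", "WorksFor")
```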
Nosql databases
Ad

More Related Content

What's hot (20)

NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
Filip Ilievski
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
nehabsairam
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
Bishal Khanal
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDB
valuebound
 
No sqlpresentation
No sqlpresentationNo sqlpresentation
No sqlpresentation
Salma Gouia
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databases
Ashwani Kumar
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
James Serra
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
MongoDB
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
Fabio Fumarola
 
Introduction to MongoDB and CRUD operations
Introduction to MongoDB and CRUD operationsIntroduction to MongoDB and CRUD operations
Introduction to MongoDB and CRUD operations
Anand Kumar
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
Marin Dimitrov
 
Sql vs NoSQL-Presentation
 Sql vs NoSQL-Presentation Sql vs NoSQL-Presentation
Sql vs NoSQL-Presentation
Shubham Tomar
 
Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse Fundamentals
Rashmi Bhat
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB
HabileLabs
 
NoSql
NoSqlNoSql
NoSql
AnitaSenthilkumar
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Dineesha Suraweera
 
MongoDB
MongoDBMongoDB
MongoDB
nikhil2807
 
Introduction to MongoDB.pptx
Introduction to MongoDB.pptxIntroduction to MongoDB.pptx
Introduction to MongoDB.pptx
Surya937648
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 
Mongo db intro.pptx
Mongo db intro.pptxMongo db intro.pptx
Mongo db intro.pptx
JWORKS powered by Ordina
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
nehabsairam
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
Bishal Khanal
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDB
valuebound
 
No sqlpresentation
No sqlpresentationNo sqlpresentation
No sqlpresentation
Salma Gouia
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databases
Ashwani Kumar
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
James Serra
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
MongoDB
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
Fabio Fumarola
 
Introduction to MongoDB and CRUD operations
Introduction to MongoDB and CRUD operationsIntroduction to MongoDB and CRUD operations
Introduction to MongoDB and CRUD operations
Anand Kumar
 
Sql vs NoSQL-Presentation
 Sql vs NoSQL-Presentation Sql vs NoSQL-Presentation
Sql vs NoSQL-Presentation
Shubham Tomar
 
Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse Fundamentals
Rashmi Bhat
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB
HabileLabs
 
Introduction to MongoDB.pptx
Introduction to MongoDB.pptxIntroduction to MongoDB.pptx
Introduction to MongoDB.pptx
Surya937648
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 

Viewers also liked (20)

NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
Lorenzo Alberton
 
FoundationDB - NoSQL and ACID
FoundationDB - NoSQL and ACIDFoundationDB - NoSQL and ACID
FoundationDB - NoSQL and ACID
inside-BigData.com
 
Deterministic simulation testing
Deterministic simulation testingDeterministic simulation testing
Deterministic simulation testing
FoundationDB
 
NoSql Databases
NoSql DatabasesNoSql Databases
NoSql Databases
Nimat Khattak
 
Nosql databases for the .net developer
Nosql databases for the .net developerNosql databases for the .net developer
Nosql databases for the .net developer
Jesus Rodriguez
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache Cassandra
Folio3 Software
 
A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014
Anuj Sahni
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
Andrew Brust
 
NoSQL and ACID
NoSQL and ACIDNoSQL and ACID
NoSQL and ACID
FoundationDB
 
An Intro to NoSQL Databases
An Intro to NoSQL DatabasesAn Intro to NoSQL Databases
An Intro to NoSQL Databases
Rajith Pemabandu
 
Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)
Chris Richardson
 
NoSQL-Database-Concepts
NoSQL-Database-ConceptsNoSQL-Database-Concepts
NoSQL-Database-Concepts
Bhaskar Gunda
 
NOSQL Overview
NOSQL OverviewNOSQL Overview
NOSQL Overview
Tobias Lindaaker
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
Steven Francia
 
Distributed computing
Distributed computingDistributed computing
Distributed computing
Alokeparna Choudhury
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
Harri Kauhanen
 
NoSQL Slideshare Presentation
NoSQL Slideshare Presentation NoSQL Slideshare Presentation
NoSQL Slideshare Presentation
Ericsson Labs
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
Tony Tam
 
NoSQL Databases, Not just a Buzzword
NoSQL Databases, Not just a Buzzword NoSQL Databases, Not just a Buzzword
NoSQL Databases, Not just a Buzzword
Haitham El-Ghareeb
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
Eric Evans
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
Lorenzo Alberton
 
Deterministic simulation testing
Deterministic simulation testingDeterministic simulation testing
Deterministic simulation testing
FoundationDB
 
Nosql databases for the .net developer
Nosql databases for the .net developerNosql databases for the .net developer
Nosql databases for the .net developer
Jesus Rodriguez
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache Cassandra
Folio3 Software
 
A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014
Anuj Sahni
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
Andrew Brust
 
An Intro to NoSQL Databases
An Intro to NoSQL DatabasesAn Intro to NoSQL Databases
An Intro to NoSQL Databases
Rajith Pemabandu
 
Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)
Chris Richardson
 
NoSQL-Database-Concepts
NoSQL-Database-ConceptsNoSQL-Database-Concepts
NoSQL-Database-Concepts
Bhaskar Gunda
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
Steven Francia
 
NoSQL Slideshare Presentation
NoSQL Slideshare Presentation NoSQL Slideshare Presentation
NoSQL Slideshare Presentation
Ericsson Labs
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
Tony Tam
 
NoSQL Databases, Not just a Buzzword
NoSQL Databases, Not just a Buzzword NoSQL Databases, Not just a Buzzword
NoSQL Databases, Not just a Buzzword
Haitham El-Ghareeb
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
Eric Evans
 
Ad

Similar to Nosql databases (20)

NoSQL BIg Data Analytics Mongo DB and Cassandra .pdf
NoSQL BIg Data Analytics Mongo DB and Cassandra .pdfNoSQL BIg Data Analytics Mongo DB and Cassandra .pdf
NoSQL BIg Data Analytics Mongo DB and Cassandra .pdf
SharmilaChidaravalli
 
No sq lv2
No sq lv2No sq lv2
No sq lv2
Nusrat Sharmin
 
Softwae and database in data communication network
Softwae and database in data communication networkSoftwae and database in data communication network
Softwae and database in data communication network
AyoubSohiabMohammad
 
Datastores
DatastoresDatastores
Datastores
Mike02143
 
Presentation on NoSQL Database related RDBMS
Presentation on NoSQL Database related RDBMSPresentation on NoSQL Database related RDBMS
Presentation on NoSQL Database related RDBMS
abdurrobsoyon
 
Big Data Analytics Module-3 as per vtu syllabus.pptx
Big Data Analytics Module-3 as per vtu syllabus.pptxBig Data Analytics Module-3 as per vtu syllabus.pptx
Big Data Analytics Module-3 as per vtu syllabus.pptx
shilpabl1803
 
Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...
IJDMS
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
Fayez Shayeb
 
no sql presentation
no sql presentationno sql presentation
no sql presentation
chandanm2
 
Oracle DBA Tutorial for Beginners -Oracle training institute in bangalore
Oracle DBA Tutorial for Beginners -Oracle training institute in bangaloreOracle DBA Tutorial for Beginners -Oracle training institute in bangalore
Oracle DBA Tutorial for Beginners -Oracle training institute in bangalore
TIB Academy
 
Oracle archi ppt
Oracle archi pptOracle archi ppt
Oracle archi ppt
Hitesh Kumar Markam
 
NOSQL and MongoDB Database
NOSQL and MongoDB DatabaseNOSQL and MongoDB Database
NOSQL and MongoDB Database
Tariqul islam
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
Editor Jacotech
 
ORDBMS.pptx
ORDBMS.pptxORDBMS.pptx
ORDBMS.pptx
Anitta Antony
 
Introduction to Oracle Database
Introduction to Oracle DatabaseIntroduction to Oracle Database
Introduction to Oracle Database
puja_dhar
 
Datastores
DatastoresDatastores
Datastores
Raveen Vijayan
 
NoSQL Basics and MongDB
Nosql databases

  • 1. NOSQL DATABASES AND BIG DATA STORAGE SYSTEMS Ateeq Ateeq
  • 2. CONTENT  1- Introduction to NOSQL Systems  2- The CAP Theorem  3- Document-Based NOSQL Systems and MongoDB  4- NOSQL Key-Value Stores  5- Column-Based or Wide Column NOSQL Systems  6- NOSQL Graph Databases and Neo4j
  • 3. INTRODUCTION TO NOSQL SYSTEMS  1.1 Emergence of NOSQL Systems  1.2 Characteristics of NOSQL Systems  1.3 Categories of NOSQL Systems
  • 4. 1.1 EMERGENCE OF NOSQL SYSTEMS  SQL systems may not be appropriate for some applications, such as email.  SQL systems offer many services (a powerful query language, concurrency control, etc.) that such an application may not need.  A structured data model such as the traditional relational model may be too restrictive.  SQL systems require schemas, which many NOSQL systems do not.
  • 5. 1.1 EMERGENCE OF NOSQL SYSTEMS  Examples of NOSQL systems:  Google – BigTable  Amazon – DynamoDB  Facebook – Cassandra  MongoDB  CouchDB  Graph databases like Neo4J and GraphBase
  • 6. 1.2 CHARACTERISTICS OF NOSQL SYSTEMS  NOSQL characteristics related to distributed databases and distributed systems.  NOSQL characteristics related to data models and query languages.
  • 7. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS 1- Scalability:  horizontal scalability: adding more nodes for data storage and processing as the volume of data grows.  Vertical scalability: expanding the storage and computing power of existing nodes.  In NOSQL systems, horizontal scalability is employed while the system is operational, so techniques for distributing the existing data among new nodes without interrupting system operation are necessary.
  • 8. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS 2- Availability, Replication and Eventual Consistency:  Data is replicated over two or more nodes in a transparent manner.  An update must be applied to every copy of a replicated data item.  Eventual consistency: a consistency model used in distributed computing to achieve high availability; it informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.
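The eventual-consistency guarantee above can be illustrated with a small in-memory sketch. Everything here is hypothetical (the `Replica` class, the `pending` queue, the version numbers); it only models the idea that a write is acknowledged after one copy is updated and the other copies catch up later.

```javascript
// Hypothetical in-memory replicas; not the API of any real NOSQL system.
class Replica {
  constructor() {
    this.value = null;
    this.version = 0;
  }
  apply(update) {
    // Keep only the newest version, so replayed or reordered updates are harmless.
    if (update.version > this.version) {
      this.value = update.value;
      this.version = update.version;
    }
  }
}

const replicas = [new Replica(), new Replica(), new Replica()];
const pending = []; // updates not yet delivered: { target, update }

function write(value, version) {
  const update = { value, version };
  replicas[0].apply(update); // acknowledged once a single copy is updated
  for (let i = 1; i < replicas.length; i++) {
    pending.push({ target: i, update });
  }
}

function propagate() {
  // Deliver every outstanding update; with no new writes, the copies converge.
  while (pending.length > 0) {
    const { target, update } = pending.shift();
    replicas[target].apply(update);
  }
}

write("v1", 1);
write("v2", 2);
const staleRead = replicas[2].value; // read before propagation: not the latest write
propagate();
const converged = replicas.map((r) => r.value); // every copy now holds the last value
```

A read served by a lagging replica can return an old value (`staleRead`), but once propagation finishes and no new writes arrive, all replicas return the last updated value.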
  • 9. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS 3- Replication Models:  Master-slave replication: requires one copy to be the master copy.  Write operations must be applied to the master copy and are then propagated to the slave copies, usually using eventual consistency.  Reads either all go to the master copy, or may go to the slave copies, in which case the values read are not guaranteed to reflect the latest writes.  Master-master replication: allows reads and writes at any of the replicas.  The values of an item may be temporarily inconsistent across replicas.  A reconciliation method that resolves conflicting writes to the same data item at different nodes must be implemented as part of the master-master replication scheme.
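The slide does not say which reconciliation method to use. As one possible sketch, the code below assumes a last-write-wins rule with a node-id tie-break; the field names (`timestamp`, `nodeId`) and the sample values are made up for illustration, and real systems may use other schemes such as vector clocks.

```javascript
// Assumed rule: the entry with the later timestamp wins; ties go to the larger nodeId.
function reconcile(a, b) {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  return a.nodeId > b.nodeId ? a : b; // deterministic tie-break
}

// Merge two replicas (Map: key -> { value, timestamp, nodeId }) into one consistent view.
function mergeReplicas(left, right) {
  const merged = new Map(left);
  for (const [key, entry] of right) {
    const existing = merged.get(key);
    merged.set(key, existing ? reconcile(existing, entry) : entry);
  }
  return merged;
}

// Two masters accepted conflicting writes for the same key "P1".
const nodeA = new Map([["P1", { value: "Jenin", timestamp: 10, nodeId: "A" }]]);
const nodeB = new Map([
  ["P1", { value: "AAUJ", timestamp: 12, nodeId: "B" }],
  ["P2", { value: "Nablus", timestamp: 5, nodeId: "B" }],
]);

const mergedAB = mergeReplicas(nodeA, nodeB);
const mergedBA = mergeReplicas(nodeB, nodeA);
```

Because the rule is deterministic, both nodes reach the same result regardless of merge order, which is the property a reconciliation method needs.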
  • 10. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS
  • 11. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS
  • 12. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS  4- Sharding of Files:  Files can have many millions of records accessed concurrently by thousands of users.  Sharding (also known as horizontal partitioning) distributes the load of accessing the file records across multiple nodes.  Sharding and replication of the shards work in tandem to improve load balancing as well as data availability.
  • 13. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS
  • 14. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS  5- High-Performance Data Access:  Hashing: a hash function h is applied to the key k, and the location of the value is given by the result h(k).  Range partitioning: the location is determined by a range of key values. Example: location i holds the objects whose key values K are in the range K_i^min ≤ K ≤ K_i^max.  In applications that require range queries, where multiple objects within a range of key values are retrieved, range partitioning is preferred.
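The two placement strategies can be sketched side by side. The hash function, the node count, and the range boundaries below are assumptions chosen for illustration only; they are not taken from any particular system.

```javascript
const NODE_COUNT = 4; // assumed cluster size

// h(k): a simple string hash reduced to a node index (illustrative, not production-grade).
function hashLocation(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value in unsigned 32-bit range
  }
  return h % NODE_COUNT;
}

// Range partitioning: node i holds the keys K with min <= K <= max.
const ranges = [
  { node: 0, min: 0, max: 99 },
  { node: 1, min: 100, max: 199 },
  { node: 2, min: 200, max: 299 },
  { node: 3, min: 300, max: 399 },
];

function rangeLocation(key) {
  const match = ranges.find((r) => key >= r.min && key <= r.max);
  return match ? match.node : -1; // -1: no node is responsible for this key
}
```

With range partitioning, neighbouring keys such as 150 and 151 land on the same node, so a range query touches few nodes. With hashing, neighbouring keys are generally scattered, which spreads single-key lookups evenly but makes range queries visit many nodes.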
  • 15. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS
  • 16. CHARACTERISTICS RELATED TO DISTRIBUTED DATABASES AND DISTRIBUTED SYSTEMS
  • 17. CHARACTERISTICS RELATED TO DATA MODELS AND QUERY LANGUAGES.  1- Not Requiring a Schema:  Semi-structured, self-describing data is allowed.  In some systems users can specify a partial schema to improve storage efficiency, but most NOSQL systems do not require a schema.  Constraints on the data have to be programmed in the application programs that access the data items.  Languages for describing semi-structured data: JSON (JavaScript Object Notation) and XML (Extensible Markup Language).
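As a minimal sketch of what "constraints programmed in the application" can look like, the code below stores two self-describing documents with different fields and checks them with a hand-written validator. The field names, the sample values, and the `validateWorker` function are all hypothetical.

```javascript
// Two self-describing documents in the same collection; the second has an extra field.
const workers = [
  { _id: "W1", Ename: "Worker A", Hours: 32.5 },
  { _id: "W2", Ename: "Worker B", Hours: 20, Phone: "555-0100" },
];

// Hypothetical application-level check, since the store itself enforces no schema.
function validateWorker(doc) {
  const errors = [];
  if (typeof doc._id !== "string" || doc._id.length === 0) {
    errors.push("_id is required");
  }
  if (typeof doc.Ename !== "string") {
    errors.push("Ename must be a string");
  }
  if (typeof doc.Hours !== "number" || doc.Hours < 0) {
    errors.push("Hours must be a non-negative number");
  }
  return errors; // an empty array means the document passed every check
}

const problems = validateWorker({ _id: "W3", Ename: "Worker C", Hours: -1 });
```

Both stored documents pass even though their fields differ, while the document with negative hours is rejected by the application rather than by the database.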
  • 18. CHARACTERISTICS RELATED TO DATA MODELS AND QUERY LANGUAGES.  2- Less Powerful Query Languages:  Many applications that use NOSQL systems may not require a powerful query language such as SQL, because search (read) queries in these systems often locate single objects in a single file based on their object keys.  The programmer reads and writes data objects by calling the appropriate operations of a programming API.  SCRUD: Search, Create, Read, Update and Delete.  Some systems provide a high-level query language, but it may not have the full power of SQL; for example, joins may need to be implemented in the application programs.
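A join implemented in the application might look like the following sketch. The two `Map` objects stand in for key-based collections, and the field names (`ProjectId`, `Workers`) and sample values are assumptions made for the example.

```javascript
// Hypothetical key-based stores: one Map per collection.
const projects = new Map([
  ["P1", { _id: "P1", Pname: "ProjectX", Plocation: "Jenin" }],
]);
const workers = new Map([
  ["W1", { _id: "W1", Ename: "Worker A", ProjectId: "P1" }],
  ["W2", { _id: "W2", Ename: "Worker B", ProjectId: "P1" }],
  ["W3", { _id: "W3", Ename: "Worker C", ProjectId: "P2" }],
]);

// The "join" is done in application code: read one object by key, then match the others.
function projectWithWorkers(projectId) {
  const project = projects.get(projectId);
  if (!project) return null;
  const team = [...workers.values()]
    .filter((w) => w.ProjectId === projectId)
    .map((w) => w.Ename);
  return { ...project, Workers: team };
}

const joined = projectWithWorkers("P1");
```

The key lookup is cheap, but the matching step scans every worker; this is the work a SQL join would otherwise do inside the database.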
  • 19. CHARACTERISTICS RELATED TO DATA MODELS AND QUERY LANGUAGES.  3- Versioning:  Some systems store multiple versions of the data items, with a timestamp recording when each version was created.
  • 20. 1.3 CATEGORIES OF NOSQL SYSTEMS The most common categories: 1. Document-based NOSQL systems:  Store data in the form of documents using well-known formats such as JSON.  Documents are accessible via their document id, but can also be accessed rapidly using other indexes. 2. NOSQL key-value stores:  Fast access by the key to the value associated with the key  Value can be a record or an object or a document or even have a more complex data structure. 3. Column-based or wide column NOSQL systems:  Partition a table by column into column families  Form of vertical partitioning. 4. Graph-based NOSQL systems:  Data is represented as graphs  Related nodes can be found by traversing the edges using path expressions.
  • 21. 1.3 CATEGORIES OF NOSQL SYSTEMS Additional categories: 5. Hybrid NOSQL systems:  These systems have characteristics from two or more of the common categories. 6. Object databases. 7. XML databases.
  • 22. THE CAP THEOREM  The CAP theorem: it is impossible to guarantee consistency, availability and partition tolerance at the same time in a distributed system with data replication.  The system designer must choose two of the three properties to guarantee.  Weaker consistency levels are often used in NOSQL systems instead of guaranteeing serializability.  Eventual consistency is often used.
  • 23. THE CAP THEOREM  The CAP theorem is used to explain some of the competing requirements in a distributed system with replication.  The three letters in CAP refer to:  Consistency (among replicated copies):  The nodes will have the same copies of a replicated data item visible for various transactions.  Availability (of the system for read and write operations):  Each read or write will either be processed successfully or will receive a message that the operation cannot be completed.  Partition tolerance (in the face of the nodes in the system being partitioned by a network fault):  The system can continue operating if the network connecting the nodes has a fault that results in two or more partitions.  Nodes in each partition can only communicate with each other.
  • 25. DOCUMENT-BASED NOSQL SYSTEMS AND MONGODB 1. Introduction 2. MongoDB Data Model 3. MongoDB CRUD Operations 4. MongoDB Distributed Systems Characteristics
  • 26. 3.1 INTRODUCTION  Document-based NOSQL systems store data as collections of similar documents.  Documents resemble complex objects or XML documents.  Documents in a collection should be similar, but they can have different attributes.  Document-based NOSQL systems: MongoDB and CouchDB.
  • 27. 3.2 MONGODB DATA MODEL  MongoDB is a free and open-source cross-platform document-oriented database.  Classified as a NoSQL database.
  • 28. 3.2 MONGODB DATA MODEL  MongoDB documents are stored in BSON (Binary JSON) format.  BSON is a variation of JSON with some additional data types and is more efficient for storage than JSON.  Individual documents are stored in a collection.  The operation createCollection is used to create each collection.
  • 29. 3.2 MONGODB DATA MODEL  Example: create a collection called project to hold PROJECT objects from the COMPANY database: db.createCollection("project", { capped : true, size : 1310720, max : 500 } )  "project" is the name of the collection (mandatory).  capped: the collection has upper limits on its storage space (size) and number of documents (max).  Capping helps the system choose the storage options for each collection.
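The effect of the max limit can be approximated with a small in-memory model. This is only a sketch: the `CappedCollection` class is hypothetical, it ignores the byte limit (size) entirely, and it assumes that a full capped collection discards its oldest document to make room for a new one.

```javascript
// Hypothetical model of a capped collection; only the document-count limit is modelled.
class CappedCollection {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);
    // Assumed behaviour: once the limit is exceeded, the oldest document is dropped.
    if (this.docs.length > this.max) {
      this.docs.shift();
    }
  }
}

const recent = new CappedCollection(3);
for (let i = 1; i <= 5; i++) {
  recent.insert({ _id: i });
}
const ids = recent.docs.map((d) => d._id); // the three most recent ids remain
```

After five inserts into a collection capped at three documents, only the last three remain, in insertion order.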
  • 30. 3.2 MONGODB DATA MODEL  Example: create a document collection called worker: db.createCollection("worker", { capped : true, size : 5242880, max : 2000 } )  Each document has a unique ObjectId field "_id".  By default, the _id is:  Automatically indexed in the collection.  System-generated.  System-generated values have a specific format that "combines the timestamp when the object is created, the node id, the process id and a counter".  User-generated values can be anything the user specifies, as long as the value is unique within the collection.
  • 31. 3.2 MONGODB DATA MODEL  A collection does not have a schema.  The structure of the data fields in documents is chosen based on how documents will be accessed and used, and the user can choose a normalized design (similar to normalized relational tuples) or a denormalized design (similar to XML documents or complex objects).  Interdocument references can be specified by storing in one document the ObjectId or ObjectIds of other related documents.
  • 32. 3.2 MONGODB DATA MODEL Company database example
  • 33. 3.2 MONGODB DATA MODEL Project info Embedded workers info
  • 34. 3.2 MONGODB DATA MODEL Project info Embedded workers array Workers
  • 35. 3.2 MONGODB DATA MODEL Project ID as an attribute
  • 37. 3.3 MONGODB CRUD OPERATIONS  Insert:  db.<collection_name>.insert(<document(s)>)  Example:  db.project.insert({ _id: "P1", Pname: "ProjectX", Plocation: "Jenin" })  Delete: remove  db.<collection_name>.remove(<condition>)  Example:  db.project.remove({ _id: "P1" })
  • 38. 3.3 MONGODB CRUD OPERATIONS  Read: find  db.<collection_name>.find(<condition>)  Example:  db.project.find({ _id: "P1" })  Update:  db.<collection_name>.update(<selection_criteria>, <updated_data>)  Example:  db.project.update({ _id: "P1" }, { $set: { Plocation: "AAUJ" } })
  • 39. 3.4 MONGODB DISTRIBUTED SYSTEMS CHARACTERISTICS  Replication in MongoDB  Sharding in MongoDB
  • 40. REPLICATION IN MONGODB  A master-slave approach is used for replication.  By default, all reads and writes are done on the primary copy.  Secondary copies are used to recover when the primary fails.
  • 41. SHARDING IN MONGODB  Sharding of the documents in the collection (also known as horizontal partitioning) divides the documents into disjoint partitions known as shards.  Two ways:  Range partitioning  Hash partitioning
  • 42. SHARDING IN MONGODB  Range and hash partitioning require that the user specify a particular document field to be used as the basis for partitioning the documents into shards.  The partitioning field, known as the "shard key", must exist in every document in the collection, and it must have an index.  The values of the shard key are divided into chunks, and the documents are partitioned based on the chunks of shard key values.
  • 43. SHARDING IN MONGODB  Chunks are created by specifying ranges of key values, and each chunk contains the key values in one range.  If range queries are commonly applied to a collection (for example, retrieving all documents whose shard key value is between 200 and 400), then range partitioning is preferred, because each range query will typically be submitted to a single node that contains all the required documents in one shard.  If most searches retrieve one document at a time, hash partitioning may be preferable because it randomizes the distribution of shard key values into chunks.
  • 44. SHARDING IN MONGODB  MongoDB queries are submitted to a module called the query router, which keeps track of which nodes contain which shards based on the particular partitioning method used on the shard keys.  The query will be routed to the nodes that contain the shards that hold the documents that the query is requesting.  If the system cannot determine which shards hold the required documents, the query will be submitted to all the nodes that hold shards of the collection.
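The routing decision described above can be sketched as follows. The chunk table, the shard names, and the shape of the `query` object are all hypothetical; this is not MongoDB's internal format, only an illustration of targeted routing versus broadcasting.

```javascript
// Hypothetical chunk table: shard key ranges [min, max) mapped to shards.
const chunks = [
  { min: 0, max: 200, shard: "shardA" },
  { min: 200, max: 400, shard: "shardB" },
  { min: 400, max: 600, shard: "shardC" },
];
const allShards = [...new Set(chunks.map((c) => c.shard))];

// query.shardKeyRange is [lo, hi] (inclusive), or absent when the shard key is not constrained.
function route(query) {
  if (!query.shardKeyRange) {
    return allShards; // cannot narrow the target: submit to every shard
  }
  const [lo, hi] = query.shardKeyRange;
  const targets = chunks
    .filter((c) => c.min <= hi && lo < c.max) // chunk overlaps the requested range
    .map((c) => c.shard);
  return [...new Set(targets)];
}

const targeted = route({ shardKeyRange: [250, 300] });
const spanning = route({ shardKeyRange: [150, 450] });
const broadcast = route({});
```

A query that constrains the shard key reaches only the shards whose chunks overlap the range, while a query on some other field has to be sent to all shards.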
  • 45. SHARDING IN MONGODB  Sharding and replication are used together:  Sharding focuses on improving performance via load balancing and horizontal scalability.  Replication focuses on ensuring system availability when certain nodes fail in the distributed system.
  • 47. WHY NOSQL?  Alter the table and add Description, Rate and Reviews  NOSQL is Flexible No Schema restrictions
  • 48. WHY NOSQL?  SQL is Restricted ! Fill the data
  • 49. WHY NOSQL? - USE CASES WHERE NOSQL WILL OUTPERFORM SQL  Agile - Flexibility for Faster Development
  • 50. WHY NOSQL? - USE CASES WHERE NOSQL WILL OUTPERFORM SQL  Agile - Flexibility for Faster Development
  • 51. WHY NOSQL? - USE CASES WHERE NOSQL WILL OUTPERFORM SQL  Agile - Simplicity for Easier Development
  • 52. WHY NOSQL? - USE CASES WHERE NOSQL WILL OUTPERFORM SQL  Agile - Simplicity for Easier Development  Reading this profile would require the application to read six rows from three tables.
  • 53. WHY NOSQL? - USE CASES WHERE NOSQL WILL OUTPERFORM SQL  Agile - Simplicity for Easier Development
  • 54. WHY NOSQL? - USE CASES WHERE NOSQL WILL OUTPERFORM SQL  Availability for Always-on
  • 55. WHY NOSQL? - USE CASES WHERE NOSQL WILL OUTPERFORM SQL  Availability for Always-on
  • 56. NOSQL CATEGORIES EXAMPLES - DOCUMENT-BASED NOSQL SYSTEMS XML is stored into a native XML Type
  • 57. NOSQL CATEGORIES EXAMPLES - DOCUMENT-BASED NOSQL SYSTEMS  The query retrieves the <Features> child element of the <ProductDescription> element  Result:
  • 58. NOSQL CATEGORIES EXAMPLES - NOSQL KEY-VALUE STORES  RIAK as example
  • 59. NOSQL CATEGORIES EXAMPLES - NOSQL KEY-VALUE STORES  The response to a query is an object containing a list of the documents that match the query.  The documents returned are Search documents (a set of Solr field/values).
  • 60. NOSQL CATEGORIES EXAMPLES - COLUMN NOSQL SYSTEMS  Cassandra as an example  returns a result-set of rows, where each row consists of a key and a collection of columns corresponding to the query
  • 61. NOSQL CATEGORIES EXAMPLES - COLUMN NOSQL SYSTEMS  LOCAL_QUORUM: a consistency level.  Used in clusters that span multiple data centers.  Used to maintain consistency locally (within a single data center).
  • 62. NOSQL CATEGORIES EXAMPLES - GRAPH-BASED NOSQL SYSTEMS  Neo4j as an example
  • 63. NOSQL CATEGORIES EXAMPLES - GRAPH-BASED NOSQL SYSTEMS
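To make the "locally" part concrete, the sketch below computes how many replicas must respond under each level, assuming the usual majority rule of floor(replication factor / 2) + 1. The data center names and replication factors are made-up sample values.

```javascript
// Majority rule assumed for a quorum: floor(rf / 2) + 1.
function quorum(replicationFactor) {
  return Math.floor(replicationFactor / 2) + 1;
}

// Hypothetical cluster: replication factor per data center.
const replicationByDc = { dc1: 3, dc2: 3 };

// LOCAL_QUORUM counts only the replicas in the coordinator's own data center.
function localQuorum(dc) {
  return quorum(replicationByDc[dc]);
}

// A cluster-wide QUORUM counts the replicas in all data centers together.
function globalQuorum() {
  const total = Object.values(replicationByDc).reduce((a, b) => a + b, 0);
  return quorum(total);
}

const localNeeded = localQuorum("dc1"); // replicas required within dc1
const globalNeeded = globalQuorum();    // replicas required across dc1 and dc2
```

With three replicas in each of two data centers, a local quorum needs 2 responses from the local data center, while a cluster-wide quorum needs 4 and therefore has to wait on a remote data center.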
  • 64. NOSQL CATEGORIES EXAMPLES - OBJECT DATABASES  LINQ as an example
  • 65. NOSQL KEY-VALUE STORES 1. Introduction 2. DynamoDB Overview 3. Voldemort Key-Value Distributed Data Store 4. Examples of Other Key-Value Stores
  • 66. 4.1 INTRODUCTION  No query language  A set of operations that can be used by the application programmers.  Characteristics:  Every value is associated with a unique key.  Retrieving the value by supplying the key is very fast.
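The "set of operations" offered in place of a query language is typically very small. A minimal in-memory sketch follows; the `KeyValueStore` class and the sample keys are hypothetical, and the value is treated as opaque, so it can be a string, an object, or anything else.

```javascript
// Hypothetical in-memory key-value store exposing only put, get, and delete.
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }
  put(key, value) {
    this.data.set(key, value); // overwrites any previous value for the key
  }
  get(key) {
    return this.data.get(key); // undefined when the key is absent
  }
  delete(key) {
    return this.data.delete(key); // true if the key existed
  }
}

const store = new KeyValueStore();
store.put("user:42", { name: "Sample User", cart: ["book"] });
store.put("session:9", "opaque-token");

const user = store.get("user:42");
const removed = store.delete("session:9");
```

Every access goes through the key, which is why lookup is fast; there is no way to ask for "all users with a book in the cart" without the application scanning the values itself.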
  • 68. 4.2 DYNAMODB OVERVIEW  Amazon product, part of AWS.  The data model uses the concepts of tables, items, and attributes.  A table does not have a schema.  It holds a collection of self-describing items.  An item consists of a number of (attribute, value) pairs.  Attribute values can be single-valued or multivalued.
  • 69. 4.2 DYNAMODB OVERVIEW  Uploads an item to the ProductCatalog table
  • 70. 4.3 VOLDEMORT KEY-VALUE DISTRIBUTED DATA STORE  Based on Amazon's Dynamo design.  Used by LinkedIn.  Simple and basic set of operations (put, delete and get).  Pluggable with other storage engines such as MySQL.  Nodes are independent.  Automatic replication and partitioning.
  • 71. 4.3 VOLDEMORT KEY-VALUE DISTRIBUTED DATA STORE
  • 72. 4.4 EXAMPLES OF OTHER KEY-VALUE STORES 1. Oracle key-value store. 2. Redis key-value cache and store. 3. Apache Cassandra
  • 73. COLUMN-BASED OR WIDE COLUMN NOSQL SYSTEMS  Stores data tables as columns rather than as rows.
  • 74. HBASE DATA MODEL AND VERSIONING  Apache HBase is an open-source, distributed, versioned, non-relational database.  A column is identified by a combination of (column family:column qualifier).  Stores multiple versions of a data item, with a timestamp associated with each version.
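The addressing scheme above (row, column family:column qualifier, timestamp) can be modelled in a few lines. The `VersionedTable` class below is a hypothetical in-memory stand-in, not the HBase client API, and the row key, column names, and timestamps are sample values.

```javascript
// Hypothetical model: each cell keeps every version, newest first.
class VersionedTable {
  constructor() {
    this.cells = new Map(); // "row/family:qualifier" -> [{ timestamp, value }]
  }
  put(row, family, qualifier, value, timestamp) {
    const key = `${row}/${family}:${qualifier}`;
    const versions = this.cells.get(key) || [];
    versions.push({ timestamp, value });
    versions.sort((a, b) => b.timestamp - a.timestamp); // newest first
    this.cells.set(key, versions);
  }
  // Returns the newest version whose timestamp is at or before asOf.
  get(row, family, qualifier, asOf = Infinity) {
    const versions = this.cells.get(`${row}/${family}:${qualifier}`) || [];
    const hit = versions.find((v) => v.timestamp <= asOf);
    return hit ? hit.value : undefined;
  }
}

const employee = new VersionedTable();
employee.put("row1", "Details", "Job", "Engineer", 100);
employee.put("row1", "Details", "Job", "Manager", 200);

const current = employee.get("row1", "Details", "Job");
const earlier = employee.get("row1", "Details", "Job", 150);
```

A plain read returns the newest version, while a read with an earlier timestamp returns the value that was current at that time; the old value is never overwritten.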
  • 75. HBASE DATA MODEL AND VERSIONING
  • 76. HBASE DATA MODEL AND VERSIONING  Table is divided into a number of regions.  Range partitioning.  Apache Zookeeper and Apache HDFS (Hadoop Distributed File System) are used for management.
  • 77. NOSQL GRAPH DATABASES AND NEO4J  The data is represented as a graph, which is a collection of vertices (nodes) and edges.  Nodes and edges can be labeled to indicate the types of entities and relationships they represent.  It is generally possible to store data associated with both individual nodes and individual edges.  Neo4j is an open source NOSQL graph database implemented in Java.
  • 78. NEO4J  The data model in Neo4j organizes data using the concepts of nodes and relationships.  Nodes and relationships have properties which store the data items.  Nodes can have labels.  Nodes that have the same label are grouped into a collection that identifies a subset of the nodes in the database graph for querying purposes.  A node can have zero, one, or several labels.
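The node, label, relationship, and property concepts can be sketched with a small in-memory structure. The `Graph` class, the labels, and the sample data below are hypothetical and do not use Neo4j's own API or its query language.

```javascript
// Hypothetical property graph: nodes carry labels and properties, relationships carry a type and properties.
class Graph {
  constructor() {
    this.nodes = new Map();  // id -> { id, labels, properties }
    this.relationships = []; // { from, to, type, properties }
  }
  addNode(id, labels, properties = {}) {
    this.nodes.set(id, { id, labels: new Set(labels), properties });
  }
  relate(from, type, to, properties = {}) {
    this.relationships.push({ from, to, type, properties });
  }
  // Nodes sharing a label form a collection that can be queried as a group.
  withLabel(label) {
    return [...this.nodes.values()].filter((n) => n.labels.has(label));
  }
  // Follow outgoing relationships of one type from a node.
  neighbors(id, type) {
    return this.relationships
      .filter((r) => r.from === id && r.type === type)
      .map((r) => this.nodes.get(r.to));
  }
}

const graph = new Graph();
graph.addNode("e1", ["EMPLOYEE"], { Lname: "Smith" });
graph.addNode("e2", ["EMPLOYEE", "MANAGER"], { Lname: "Wong" }); // several labels
graph.addNode("d5", ["DEPARTMENT"], { Dname: "Research" });
graph.addNode("n0", []); // a node with zero labels
graph.relate("e1", "WorksFor", "d5");
graph.relate("e2", "WorksFor", "d5");
graph.relate("e2", "Manages", "d5", { role: "head" });

const employees = graph.withLabel("EMPLOYEE");
const managed = graph.neighbors("e2", "Manages");
```

The label lookup returns the subset of nodes tagged EMPLOYEE, and following the Manages relationship from e2 reaches the department node, which is the kind of traversal a graph database is built to do quickly.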
  • 79. NEO4J
  • 80. NEO4J