WK10 - Nosql Databases


DATABASE MANAGEMENT II

Dr. Bilkisu L. Muhammad-Bello

Week 10 Database Management II


INTRODUCTION TO NOSQL DATABASES

Defines the term NoSQL and the technology it references


Categories of NoSQL Databases, their Architecture and Use Cases

Objectives

After studying this module, you should be able to:
Define the term NoSQL and the technology it references
Describe the history of NoSQL in the database landscape.
Describe the concepts and characteristics of NoSQL databases
Explain the primary benefits of adopting a NoSQL database
Describe the four main categories of NoSQL database
Describe the architecture of each NoSQL database category
Explain the primary use cases for each NoSQL database category

Introduction
Relational databases store data in structured tables that have a predefined
schema.
To use relational databases, a data model must be designed and then
the data is transformed and loaded into the database.
When data is used in applications, the data then must be retrieved using
SQL, and adapted to the form used in the application. Then when the data
is written back, it must be transformed again back into the relational tables.
NoSQL databases allow the data to be stored in ways that are easier to
understand or closer to the way the data is used by applications. Fewer
transformations are required when the data is stored or retrieved for use.
With NoSQL Databases, many different types of data, whether structured,
unstructured, or semi-structured, can be stored and retrieved more easily.

What is NoSQL?

The NoSQL name was introduced during an open-source event held to discuss the
distributed databases that were coming onto the scene.
NoSQL stands for ‘Not Only SQL’ and not ‘NO
SQL’.
It refers to a family of databases that vary
widely in style and technology, but which all
share a common trait in that they are non-
relational in nature.
Therefore, a better name to describe these
databases would be ‘non-relational’.
What is a NoSQL database?

NoSQL databases provide new ways of storing and querying data that address a
number of issues for modern applications.
Most importantly, the majority of NoSQL databases are geared to
handle a different breed of scale problems that have arisen
associated with the “Big Data” movement.
By scale, we mean both the size of the data and the number of concurrent
users acting on that data.
They provide flexible schemas and scale easily with large amounts
of data and high user loads.
NoSQL databases are typically also more specialized in their use
cases, and it can be much simpler to develop application
functionality for them than for an RDBMS.
History of NoSQL

Enablers – 1. Types of Data to be stored

As storage costs rapidly decreased, the amount of data that applications
needed to store and query increased.
This data came in all shapes and sizes —
structured, semi-structured, and polymorphic — and defining the
schema in advance became nearly impossible.
NoSQL databases allow developers to store huge amounts of
unstructured data, giving them a lot of flexibility.

Enablers – 2. Agile Methodology

Additionally, the Agile Manifesto was rising in popularity, and software
engineers were rethinking the way they developed software.
They were recognizing the need to rapidly adapt to changing
requirements.
They needed the ability to iterate quickly and make changes
throughout their software stack — all the way down to the
database.
NoSQL databases gave them this flexibility.

Enablers – 3. Cloud Computing

Cloud computing also rose in popularity, and developers began using public
clouds to host their applications and data.
They wanted the ability to distribute data across multiple servers
and regions to make their applications resilient, to scale out
instead of scale up, and to intelligently geo-place their data.

Why use NoSQL Databases

The pace of development with NoSQL databases can be much faster than with a
SQL database.
The structure of many different forms of data is more easily
handled and evolved with a NoSQL database.
The amount of data in many applications cannot be served
affordably by a SQL database.
The scale of traffic and the need for zero downtime can be difficult to
handle with a SQL database.
New application paradigms can be more easily supported.

NoSQL Database Categories

Several efforts have been made to categorize NoSQL databases and, in the
marketplace, there is a general consensus that they fit into four types:
Key-Value,
Document,
Column-based and
Graph style NoSQL databases.
There is some overlap amongst these types,
so the definition isn’t always clear, but more
details on the different types and their use
cases will be discussed.
NoSQL Database Categories cont.
Document databases store data in documents similar to JSON
(JavaScript Object Notation) objects. Each document contains
pairs of fields and values. The values can typically be a variety of
types including things like strings, numbers, booleans, arrays, or
objects.
Key-value databases are a simpler type of database where
each item contains keys and values.
Wide-column stores store data in tables, rows, and dynamic
columns.
Graph databases store data in nodes and edges. Nodes typically
store information about people, places, and things, while edges
store information about the relationships between the nodes.
Key-Value NoSQL Databases
Its architecture and primary use cases

Key-Value NoSQL Database Architecture

In Key-Value databases, all data is stored with a key and an associated
value blob.
Architecturally, they:
Are the least complex of the NoSQL databases.
Are represented as a hashmap.
Are powerful for basic Create-Read-Update-Delete (CRUD) operations.
Scale quite well.
Shard easily across any number of nodes. Each shard contains a range of
keys and their associated values.
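
The hashmap and sharding ideas above can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the class name ShardedKeyValueStore, the boundaries parameter, and the sample keys are all assumptions made for the example. Each shard is a plain dict that owns one range of keys.

```python
from bisect import bisect_right


class ShardedKeyValueStore:
    """Toy key-value store: each shard is a dict that owns one key range."""

    def __init__(self, boundaries):
        # boundaries ["h", "p"] give three ranges:
        # keys < "h", "h" <= keys < "p", and keys >= "p".
        self.boundaries = sorted(boundaries)
        self.shards = [{} for _ in range(len(self.boundaries) + 1)]

    def shard_for(self, key):
        # Index of the shard whose key range contains `key`.
        return bisect_right(self.boundaries, key)

    def put(self, key, value):      # Create / Update
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):             # Read
        return self.shards[self.shard_for(key)].get(key)

    def delete(self, key):          # Delete
        self.shards[self.shard_for(key)].pop(key, None)


store = ShardedKeyValueStore(boundaries=["h", "p"])
store.put("alice", b"blob-1")
store.put("mike", b"blob-2")
store.put("zoe", b"blob-3")
```

Every operation touches exactly one shard, chosen from the key alone, which is why this model spreads across nodes so easily.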

Key-Value NoSQL Database Architecture

However, these databases are usually not meant for complex queries
attempting to connect multiple pieces of data, and are atomic for single
key operations only.
The value blob is opaque to the database, so there is typically less
flexibility for indexing and querying the data than in other database
types.

Key-Value NoSQL Database: Suitable Use Cases

For quick, basic Create-Read-Update-Delete operations on non-interconnected
data (quick performance).
For example, storing and retrieving session information for
a Web application.
Each user session would receive some sort of unique key and all data would
be stored together in the opaque value blob.
Plus, there would be no need to query based on the information in the value
blob. All transactions would be based on the unique key.
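
A minimal sketch of the session example, assuming a plain dict stands in for the key-value database. The function names create_session and load_session and the session fields are hypothetical. The store only ever sees bytes, so it cannot query inside the value; every access goes through the session key.

```python
import json
import uuid

sessions = {}  # stand-in for the key-value database


def create_session(data):
    """Store the session as an opaque blob under a new unique key."""
    session_id = uuid.uuid4().hex
    sessions[session_id] = json.dumps(data).encode("utf-8")
    return session_id


def load_session(session_id):
    """Fetch the blob by key and decode it on the application side."""
    blob = sessions.get(session_id)
    return None if blob is None else json.loads(blob.decode("utf-8"))


sid = create_session({"user": "ada", "cart_items": 3})
restored = load_session(sid)
```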

Key-Value NoSQL Database: Suitable Use Cases

Similar use cases would be for:
storing user profiles and preferences within an application
storing shopping cart data for online stores or marketplaces.
In these cases, complex queries or handling relationships between
different keys would be few and far between.

Key-Value NoSQL Database: Unsuitable Use Cases

Key-Value NoSQL databases are not suitable for use cases that require just
the opposite:
When your data is interconnected with a number of many-to-many
relationships, such as in social networking or recommendation engine
scenarios. A Key-Value NoSQL database is likely to exhibit poor performance
here.
When a high level of consistency is required for multi-operation
transactions involving multiple keys. Such cases need a database that
provides Atomicity, Consistency, Isolation, and Durability (ACID)
transactions.
When apps run queries based on the value rather than the key. In that case,
consider the ‘Document’ category of NoSQL databases, which we will cover
next.
Key-Value NoSQL Database Examples

Some examples of the more popular implementations of key-value NoSQL
databases are: Amazon DynamoDB, Oracle NoSQL Database, Redis, Aerospike,
Riak KV, MemcacheDB, and Project Voldemort.

Document-Based NoSQL Databases
Its architecture and primary use cases

Document-Based NoSQL Database Architecture

Document databases build off the Key-Value model by making the value visible
and able to be queried.
Each piece of data is considered a document and typically stored
in either JSON or XML format.
One of the benefits of document databases is that each document offers a
flexible schema: no two documents need to be the same or contain the same
information.

Document-Based NoSQL Database Architecture

Document databases typically offer the ability to index and query the
contents of the documents, offering:
key and value range lookups and searchability
analytical queries via paradigms like MapReduce.
Document databases are horizontally scalable:
They allow for sharding across multiple nodes, typically sharded by
some unique key in the document.
They typically only guarantee atomic transactions on single
document operations.
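
To make the contrast with opaque value blobs concrete, here is a small in-memory sketch of querying and indexing document contents. The documents, field names, and the helpers find and build_index are invented for illustration; real document databases expose their own query languages. Note that the three documents do not share the same fields.

```python
documents = {
    "u1": {"type": "user", "username": "ada", "age": 36},
    "p1": {"type": "post", "author": "ada", "tags": ["nosql", "intro"]},
    "u2": {"type": "user", "username": "linus", "city": "Helsinki"},
}


def find(collection, **criteria):
    """Return the ids of documents whose fields match every criterion."""
    return sorted(
        doc_id
        for doc_id, doc in collection.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    )


def build_index(collection, field):
    """Map each value of `field` to the ids of the documents containing it."""
    index = {}
    for doc_id, doc in collection.items():
        if field in doc:
            index.setdefault(doc[field], []).append(doc_id)
    return index


users = find(documents, type="user")
by_type = build_index(documents, "type")
```

The index trades extra storage for lookups that no longer scan every document, which is the same trade-off a secondary index makes in a real system.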

Document-Based NoSQL Database – Suitable Use Cases

Event logging for an application or process. Each instance would
constitute a new document or aggregate, containing all of
the information corresponding to the event.
Online blogging. Each user would be represented as a
document; each post a document; and each comment, like, or
action would be a document.
All documents would contain information pertaining to the type of data, such
as username, post content, or timestamp when the document was created.
Operational datasets for Web and Mobile applications.
Document databases were designed with the internet in mind: think
JSON, RESTful APIs, and unstructured data.

Document-Based NoSQL Database – Unsuitable Use Cases

When you require ACID transactions.
Many document databases cannot handle transactions that operate over
multiple documents.
A relational database may be a better choice in this instance.
When you are forcing your data into an aggregate-oriented design.
If data naturally falls into a normalized/tabular model, this would be another
time to research relational databases instead.

Document-Based NoSQL Database – Examples

Document databases are the most widespread of the NoSQL database categories
in use today. Some examples of the more popular implementations of document
NoSQL databases are: IBM Cloudant, MongoDB, Apache CouchDB, Terrastore,
OrientDB, Couchbase, and RavenDB.

Column-Based NoSQL Databases
Its architecture and primary use cases

Column-Based NoSQL Database Architecture

Column-based databases spawned from an architecture that Google created
called “Bigtable”. They are also known as Bigtable clones, columnar
databases, or wide-column databases.
They focus on columns and groups of columns when storing and accessing data.
Column ‘Families’ are several rows, each with a unique key or
identifier, that belong to one or more columns.
These columns are grouped together in families because they are often
accessed together.
Rows in a column family are not required to share any of the same
columns. They can share all, a subset, or none of the columns and
columns can be added to any number of rows and not to others.
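
A rough way to picture a column family is a mapping from row key to that row's own set of columns. The sketch below uses nested dicts; the family name user_profile, the row keys, and the helper read_columns are made up for the example.

```python
# One column family: row key -> {column name: value}.
# Rows are sparse: each stores only the columns it actually has.
user_profile = {
    "user:1": {"name": "Ada", "email": "ada@example.com"},
    "user:2": {"name": "Linus", "city": "Helsinki", "phone": "555-0100"},
    "user:3": {},
}

# Adding a column to one row does not touch any other row.
user_profile["user:3"]["nickname"] = "Tux"


def read_columns(family, row_key, columns):
    """Return only the requested columns that exist in the row."""
    row = family.get(row_key, {})
    return {name: row[name] for name in columns if name in row}


# Columns that rows 1 and 2 happen to have in common.
shared = set(user_profile["user:1"]) & set(user_profile["user:2"])
```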
Column-Based NoSQL Database – Suitable Use Cases

Great for when you're dealing with large amounts of sparse data.
When compared to row-oriented databases, Column-based databases can
better compress data and save storage space.
Similar to document databases, a Column-based NoSQL database
could be used for event logging and blogs, but the data would
be stored in a different fashion.
For enterprise logging, every application can write to its own set
of columns and have each row key formatted in such a way to
promote easy lookup based on application and timestamp.
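
One possible row-key layout for the logging case is "application#timestamp", so that all entries for one application sit next to each other in key order. The sketch below is an assumption-laden illustration: the "#" separator, the function names write_log and scan, and the sample log lines are invented, and a sorted Python list stands in for the store's ordered row keys.

```python
from bisect import bisect_left

rows = {}         # row key -> {column: value}
sorted_keys = []  # row keys kept in sorted order


def write_log(app, timestamp, message):
    key = f"{app}#{timestamp}"          # e.g. "billing#2024-05-01T10:00:00"
    if key not in rows:
        sorted_keys.insert(bisect_left(sorted_keys, key), key)
    rows[key] = {"message": message}


def scan(app, start, end):
    """Return messages for `app` with start <= timestamp < end."""
    first = bisect_left(sorted_keys, f"{app}#{start}")
    last = bisect_left(sorted_keys, f"{app}#{end}")
    return [rows[k]["message"] for k in sorted_keys[first:last]]


write_log("billing", "2024-05-01T10:00:00", "invoice created")
write_log("auth", "2024-05-01T10:05:00", "login ok")
write_log("billing", "2024-05-01T11:30:00", "payment failed")
write_log("billing", "2024-05-02T09:00:00", "refund issued")

may_first = scan("billing", "2024-05-01", "2024-05-02")
```

ISO-8601 timestamps sort correctly as plain strings, which is what makes the range scan a simple slice of the sorted keys.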

Column-Based NoSQL Database – Suitable Use Cases

Column databases continue the trend of horizontal scalability.
As with Key-Value and Document databases, Column-based
databases can handle being deployed across clusters of nodes.
Counters are a unique use case for Column-based databases.
You may come across applications that need an easy way to count or
increment as events occur.
Some Column-based databases, like Cassandra, have special column types
that allow for simple counters.
In addition, columns can have a time-to-live parameter, making them useful
for data with an expiration date or time, like trial periods or ad
timing.
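
Counters and time-to-live values can be illustrated with a toy row class. This is not Cassandra's API; the class ColumnRow, its methods, and the injected clock are assumptions chosen so the expiry behaviour can be shown without waiting in real time.

```python
import time


class ColumnRow:
    """Toy row supporting counter columns and columns with a time-to-live."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.columns = {}   # name -> (value, expires_at or None)

    def increment(self, name, amount=1):
        value, expires_at = self.columns.get(name, (0, None))
        self.columns[name] = (value + amount, expires_at)
        return value + amount

    def put(self, name, value, ttl=None):
        expires_at = None if ttl is None else self.clock() + ttl
        self.columns[name] = (value, expires_at)

    def get(self, name):
        value, expires_at = self.columns.get(name, (None, None))
        if expires_at is not None and self.clock() >= expires_at:
            del self.columns[name]      # expired: behave as if never written
            return None
        return value


now = [0.0]                              # fake clock, in seconds
row = ColumnRow(clock=lambda: now[0])
row.increment("page_views")
row.increment("page_views")
row.put("trial_active", True, ttl=30)    # expires 30 seconds after the write
before = row.get("trial_active")
now[0] = 31.0                            # move the fake clock past the TTL
after = row.get("trial_active")
```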

Column-Based NoSQL Database – Unsuitable Use Cases

When you require traditional ACID transactions provided by relational
databases.
Reads and writes are only atomic at the row level.
In early development, query patterns may change and require
numerous changes to the column-based designs. This can be
costly and slow down the production timeline.

Column-Based NoSQL Database – Examples

Some examples of the more popular implementations of Column-based NoSQL
databases are: Cassandra, HBase, Hypertable, and Accumulo.

Graph NoSQL Databases
Its architecture and primary use cases

Graph NoSQL Database Architecture
This database category stands apart from the previous three types covered because it
doesn't follow a few of the common traits previously seen.
Graph databases store information in entities (or nodes), and relationships (or
edges).
Graph databases are impressive when your dataset resembles a graph-like data
structure. Traversing all of the relationships is quick and efficient, but these
databases tend not to scale as well horizontally.
Sharding a graph database is not recommended since traversing a graph with nodes split across
multiple servers can become difficult and hurt performance.
Graph databases are also ACID transaction compliant. This prevents any
dangling relationships between nodes that don't exist.
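
The node-and-edge model, and the idea of never allowing a relationship to point at a missing node, can be sketched as follows. The Graph class, its method names, and the sample data are hypothetical; the write-time check only stands in for the consistency guarantee described above and is not a full transaction mechanism.

```python
class Graph:
    """Toy property graph: nodes plus directed, labelled edges."""

    def __init__(self):
        self.nodes = {}   # node id -> properties
        self.edges = {}   # node id -> list of (label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.edges.setdefault(node_id, [])

    def add_edge(self, source, label, target):
        # Refuse edges that would point at a node that does not exist.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges[source].append((label, target))

    def neighbours(self, node_id, label):
        return [
            target
            for (edge_label, target) in self.edges[node_id]
            if edge_label == label
        ]


g = Graph()
g.add_node("ada", kind="person")
g.add_node("lagos", kind="place")
g.add_edge("ada", "LIVES_IN", "lagos")

try:
    g.add_edge("ada", "KNOWS", "ghost")   # "ghost" was never added
    rejected = False
except KeyError:
    rejected = True
```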

Graph NoSQL database – Suitable Use Cases

Graph databases can be very powerful when your data is highly connected and
related in some way.
Social networking sites can benefit by quickly locating friends,
friends of friends, likes, and so on.
Routing, spatial, and map applications may use graphs to
easily model their data for finding close locations or building
shortest routes for directions.
Recommendation engines can leverage the close relationships
and links between products to easily provide other options to their
customers.
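
Two of these use cases, friends of friends and shortest routes, reduce to simple graph traversals. The sketch below uses an adjacency dict and breadth-first search; the names in the friends graph and both function names are invented for illustration, and a real graph database would run such traversals through its own query language.

```python
from collections import deque

friends = {
    "ada": {"bola", "chen"},
    "bola": {"ada", "dayo"},
    "chen": {"ada", "dayo", "efe"},
    "dayo": {"bola", "chen"},
    "efe": {"chen"},
}


def friends_of_friends(graph, person):
    """People two hops away who are not already direct friends."""
    direct = graph[person]
    reachable = set().union(*(graph[f] for f in direct))
    return reachable - direct - {person}


def shortest_path(graph, start, goal):
    """Breadth-first search: fewest hops between two nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph[path[-1]]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


suggestions = friends_of_friends(friends, "ada")
route = shortest_path(friends, "bola", "efe")
```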

Graph NoSQL database – Unsuitable Use Cases

Graph databases are not a good fit when you’re looking for some
of the advantages offered by the other NoSQL database
categories.
When an application needs to scale horizontally, you're going to
quickly reach the limitations associated with these types of data
stores.
Another general negative surfaces when trying to update all or a
subset of nodes with a given parameter.
These types of operations can prove to be difficult and non-trivial.

Graph NoSQL database – Examples

Some examples of the more popular implementations of Graph NoSQL databases
are: Neo4j, OrientDB, ArangoDB, Amazon Neptune (part of Amazon Web
Services), Apache Giraph, and JanusGraph.

NoSQL Database Characteristics

The majority of them have their roots in the open source community.
Many have been used and leveraged in an open source manner.
This has been fundamental for spring-boarding their growth in the industry.
You will often see companies that provide a commercial version of the
database, along with services and support for the technology, while also
sponsoring the open source counterpart.
Examples of this include IBM Cloudant for CouchDB, DataStax for Apache
Cassandra, and MongoDB, which has its own open source version of the
database too.

NoSQL Database Characteristics

Technically speaking they all differ quite a bit, but a few commonalities
do emerge.
Most NoSQL databases:
are built to scale horizontally
have flexible schemas and fast queries due to their data model
share their data more easily than RDBMS
use a globally unique key across the whole database to simplify
partitioning (or ‘sharding’)
are more specialized to certain use cases than RDBMS
allow more agile development via flexible schemas
are more developer friendly than RDBMS. Developers are drawn to NoSQL
databases for their ease of data modelling and use.
NoSQL Database Characteristics cont.

A flexible schema allows you to easily make changes to your database as
requirements change.
You can iterate quickly and continuously integrate new application features
to provide value to your users faster.
Horizontal scaling: most NoSQL databases allow you to scale-
out horizontally, meaning you can add cheaper, commodity servers
whenever you need to.
Fast queries: data in NoSQL databases is typically stored in a
way that is optimized for queries.
The rule of thumb when you use a NoSQL database such as MongoDB is: data
that is accessed together should be stored together.
Queries typically do not require joins, so the queries are very fast.
NoSQL Database Characteristics cont.

Easy for developers: Some NoSQL databases like MongoDB map their data
structures to those of popular programming languages.
This mapping allows developers to store their data in the same
way that they use it in their application code.
While it may seem like a trivial advantage, this mapping can allow
developers to write less code, leading to faster development time
and fewer bugs.
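
As a small illustration of this mapping, a Python object can be turned into a JSON-style document and back with no translation layer in between. The User class and its fields are invented for the example, and the JSON string merely stands in for whatever a document store would persist.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class User:
    name: str
    email: str
    hobbies: list = field(default_factory=list)


user = User(name="Ada", email="ada@example.com", hobbies=["chess", "hiking"])

document = asdict(user)                 # plain dict, ready to store
stored = json.dumps(document)           # what a JSON document store might keep
restored = User(**json.loads(stored))   # back to an application object
```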

Benefits of NoSQL Databases

Scalability: particularly the ability to horizontally scale across
clusters of servers, racks, and possibly even data centers.
The elasticity of scaling both up and down to meet the varying
demands of applications is key.
NoSQL databases are well suited to meet the large data size and
huge number of concurrent users that “Big Data” applications
exhibit.

Benefits of NoSQL Databases

Performance: goes hand-in-hand with scalability.
The need to deliver fast response times
even with large data sets and high
concurrency is a must for modern
applications, and the ability of NoSQL
databases to leverage the resources of large
clusters of servers makes them ideal for fast
performance in these circumstances.

Benefits of NoSQL Databases

High Availability: an obvious requirement for a database.
Having a database run on a cluster of
servers with multiple copies of the data
makes for a more resilient solution than a
single server solution.

Benefits of NoSQL Databases

Cloud Architecture: Historically, large databases have run on expensive
machines or mainframes.
Modern enterprises are employing cloud architectures to support their
applications, and the distributed data nature of NoSQL databases
means that they can be deployed and operated on clusters of servers in
cloud architectures, which can substantially reduce cost.

Benefits of NoSQL Databases

Cost is important for any technology venture, and it is common to hear of
NoSQL adopters cutting significant costs compared with their existing
databases while still getting the same or better performance and
functionality.

Benefits of NoSQL Databases

Flexible schema and intuitive data structures are key features that
developers love when wanting to build applications efficiently.
Most NoSQL databases allow for having flexible schemas, which
means that one can build new features into applications
quickly and without any database locking or downtime.

Benefits of NoSQL Databases

Varied Data Structures, which are often more elegant for solving
development needs than the rows and columns of relational datastores.
Examples include:
key-value stores for quick lookup,
document stores for storing de-normalized, intuitive information,
graph databases for associative data sets.

Benefits of NoSQL Databases

There are also various specialized capabilities that certain NoSQL
providers offer that attract end users.
Examples include:
specific indexing and querying capabilities such as geospatial search,
data replication robustness,
modern HTTP APIs.
With all these benefits you might well ask why anyone would ever
use anything but a NoSQL database.
You could say this is true for most cases these days, but there are definitely
still many requirements which are best met with an RDBMS.

Differences between SQL and NoSQL

Data Storage Model
  SQL databases: tables with fixed rows and columns
  NoSQL databases:
    Document: JSON documents
    Key-Value: key-value pairs
    Wide-Column: tables with rows and dynamic columns
    Graph: nodes and edges

Development History
  SQL databases: developed in the 1970s with a focus on reducing data
  duplication
  NoSQL databases: developed in the late 2000s with a focus on scaling and
  allowing for rapid application change driven by agile and DevOps practices

Examples
  SQL databases: Oracle, MySQL, Microsoft SQL Server, and PostgreSQL
  NoSQL databases:
    Document: MongoDB and CouchDB
    Key-Value: Redis and DynamoDB
    Wide-Column: Cassandra and HBase
    Graph: Neo4j and Amazon Neptune

Differences between SQL and NoSQL

Primary Purpose
  SQL databases: general purpose
  NoSQL databases:
    Document: general purpose
    Key-Value: large amounts of data with simple lookup queries
    Wide-Column: large amounts of data with predictable query patterns
    Graph: analyzing and traversing relationships between connected data

Schemas
  SQL databases: rigid
  NoSQL databases: flexible

Scaling
  SQL databases: vertical (scale-up with a larger server)
  NoSQL databases: horizontal (scale-out across commodity servers)

Differences between SQL and NoSQL

Multi-Record ACID Transactions
  SQL databases: supported
  NoSQL databases: most do not support multi-record ACID transactions.
  However, some, like MongoDB, do.

Joins
  SQL databases: typically required
  NoSQL databases: typically not required

Data to Object Mapping
  SQL databases: requires ORM (object-relational mapping)
  NoSQL databases: many do not require ORMs. For example, MongoDB documents
  map directly to data structures in most popular programming languages.

Difference between RDBMS and NoSQL databases

While a variety of differences exist between relational database management
systems (RDBMS) and NoSQL databases, one of the key differences is the way
the data is modeled in the database.
Let us work through an example of modeling the same data in a
relational database and a NoSQL database.
Then, we will highlight some of the other key differences between
relational databases and NoSQL databases.

RDBMS vs NoSQL: Data Modeling Example

RDBMS vs NoSQL: Data Modeling Example

In order to retrieve all of the information about a user and their hobbies,
information from the Users table and Hobbies table will need to be joined
together.
The data model we design for a NoSQL database will depend on
the type of NoSQL database we choose.
Let's consider how to store the same information about a user and
their hobbies in a document database like MongoDB.
In order to retrieve all of the information about a user and their
hobbies, a single document can be retrieved from the database.
No joins are required, resulting in faster queries.
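
The two models can be compared side by side in a short sketch. It uses Python's built-in sqlite3 module for the relational side and a plain dict as a stand-in for a document store. The column names, sample user, and hobbies are assumptions made for the example, since only the Users and Hobbies table names are given above.

```python
import sqlite3

# Relational model: two tables, joined at query time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE hobbies (user_id INTEGER REFERENCES users(id), hobby TEXT);
    INSERT INTO users VALUES (1, 'Ada');
    INSERT INTO hobbies VALUES (1, 'chess'), (1, 'hiking');
""")
rows = conn.execute("""
    SELECT u.name, h.hobby
    FROM users AS u JOIN hobbies AS h ON h.user_id = u.id
    WHERE u.id = ? ORDER BY h.hobby
""", (1,)).fetchall()
conn.close()

# Document model: the same information stored together in one document.
user_documents = {
    1: {"name": "Ada", "hobbies": ["chess", "hiking"]},
}
doc = user_documents[1]   # a single lookup by key, no join
```

The relational query returns one row per hobby, which the application must reassemble into a user object. The document already has the shape the application uses.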

RDBMS vs NoSQL: Data Modeling Example

When should NoSQL be used?

When deciding which database to use, decision-makers typically find one or
more of the following factors lead them to selecting a NoSQL database:
Fast-paced Agile development
Storage of structured and semi-structured data
Huge volumes of data
Requirements for scale-out architecture
Modern application paradigms like micro-services and real-time streaming

Drawbacks of NoSQL Databases

Most do not support ACID (atomicity, consistency, isolation, durability)
transactions across multiple documents. With appropriate schema design,
single-record atomicity is acceptable for lots of applications.
Since data models in NoSQL databases are typically optimized for
queries and not for reducing data duplication, NoSQL databases
can be larger than SQL databases.
Storage is currently so cheap that most consider this a minor drawback, and
some NoSQL databases also support compression to reduce the storage
footprint.

Summary

In this module, you learned that:
The name ‘NoSQL’ stands for Not only SQL.
NoSQL refers to a class of databases that are non-relational in
architecture.
Although implementations of NoSQL databases are technically different
from each other, they all share some common traits.
Since the year 2000 NoSQL databases have become more popular
in the database marketplace, due to the scale demands of Big
Data.
There are four categories of NoSQL database: Key-Value, Document,
Column-based, and Graph.
There are several benefits to adopting NoSQL databases.
Summary – Key Points

Key-Value NoSQL databases are the least complex architecturally speaking; the
data is stored with a key and corresponding value blob and is represented by a
hashmap.
The primary use cases for the Key-Value NoSQL database category are for
quick CRUD operations; for example, storing and retrieving session
information, storing in-app user profiles, and storing shopping cart data in
online stores.
Document-based NoSQL databases use documents to make values visible and
able to be queried. Each piece of data is considered a document and typically
stored in either JSON or XML format.
Each document offers a flexible schema.
Summary – Key Points

The primary use cases for document-based NoSQL databases are event logging
for apps and processes, online blogging, and operational datasets or
metadata for web and mobile apps.
Column-based databases spawned from the architecture of
Google’s Bigtable storage system. Column-based databases store
data in columns or groups of columns. Column ‘families’ are
several rows, with unique keys, belonging to one or more
columns.
The primary use cases for Column-based NoSQL databases are
event logging and blogs, counters, data with expiration values.

Summary – Key Points

Graph databases store information in entities (or nodes), and relationships
(or edges).
Graph databases are impressive when your dataset resembles a
graph-like data structure.
Graph databases do not shard well but are ACID transaction
compliant. The primary use cases for the Graph NoSQL database
category are for highly connected and related data, for social
networking sites, for routing, spatial and map applications, and for
recommendation engines.
