Nosql Databases Types

- Aggregation oriented databases store object aggregates rather than normalized tables, making data manipulation operations more efficient on distributed data. The query load dictates the database design.
- The document discusses key-value stores, document databases, column family/wide column databases, and graph databases as types of NoSQL databases. Key-value stores use a dictionary/map concept to store key-value pairs, document databases add structure to values, and column family databases group columns into flexible column families.
- Document databases like MongoDB and CouchDB allow flexible or full schemas and support querying value attributes. Column family databases like HBase and Cassandra were inspired by BigTable and DynamoDB and support sparse, multidimensional data

Uploaded by

Parv Agarwal
Types of No SQL Databases

pm jat @ daiict
Recap
• Aggregation Oriented Databases
– Object Aggregates are saved rather than normalized tables
– In other words, the data is “de-normalized” by embedding related objects
– “Partition strategies” can be defined better
– Storing together the data that are queried together makes data manipulation operations very efficient on distributed data
• The design of “Aggregation Oriented Databases” is dictated by the query load: some set of queries executes faster on a given design, while another set of queries performs poorly.
– Note that this is not the case with relational systems!
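
The contrast between normalized tables and an aggregate can be sketched roughly as below. The order data and the function names are invented for illustration; the sketch only shows that the aggregate form answers the "fetch one whole order" query with a single key lookup, while the normalized form has to reassemble it.

```python
# Hypothetical order data, invented for illustration.

# Normalized form: three "tables", related by keys.
customers = {1: {"name": "Asha"}}
orders = {500: {"customer_id": 1, "date": "2023-10-05"}}
order_lines = [
    {"order_id": 500, "item": "pen", "qty": 10},
    {"order_id": 500, "item": "ink", "qty": 2},
]

def load_order_normalized(order_no):
    """Reassemble one order by joining the three tables."""
    order = orders[order_no]
    return {
        "orderNo": order_no,
        "date": order["date"],
        "customer": customers[order["customer_id"]],
        "lines": [
            {"item": line["item"], "qty": line["qty"]}
            for line in order_lines
            if line["order_id"] == order_no
        ],
    }

# Aggregate form: the related objects are embedded in one value.
order_aggregates = {
    500: {
        "orderNo": 500,
        "date": "2023-10-05",
        "customer": {"name": "Asha"},
        "lines": [{"item": "pen", "qty": 10}, {"item": "ink", "qty": 2}],
    }
}

def load_order_aggregate(order_no):
    """One key lookup returns the whole order."""
    return order_aggregates[order_no]
```

A query such as "all orders containing ink" would favour a different aggregate design, which is the sense in which the query load dictates the design.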

05-10-2023 Types of No SQL Databases 2


What do we discuss here

• Types of No-SQL databases with some examples


– Key-Value
– Document Oriented
– “Column Family” or “Wide Column Databases”
– Graph Databases


Key Value Databases
• The key-value concept is not new in the programming world.
• Hopefully, you have used one of the following in your programming. Different languages give them different names: Maps, Dictionaries, Associative Arrays
• Operations are done only on the key, quite similar to a dictionary
– Example: Get, Put, and Remove
• For Example:
items.put( 313, item_x );
Item a = items.get( 123 );
• So forth
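
For comparison, the same three operations on a Python dictionary look roughly like this (a minimal sketch; the keys and the value "item_x" are placeholders):

```python
items = {}                     # an in-memory dictionary (map)
items[313] = "item_x"          # put
a = items.get(313)             # get: returns the stored value
missing = items.get(123)       # get on an absent key returns None
removed = items.pop(313)       # remove, returning the old value
```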


Key Value Databases
• A key-value database applies the concept of a “Dictionary” or “Map” to databases
• A key-value database could very well be defined as a “Persistent Dictionary” or “Persistent Map”, where
– value is an “Entity”
– key is the “primary key” of that entity
• Key-Value databases are also called “Key Value Stores”
• Make sure that you clearly understand the meaning of “Key” and “Value” in Key-Value databases. This is important.
• These are Aggregation Oriented Databases, where the key is the key of the aggregate and the value is the aggregate itself.


Key Value Databases Operations
• In a Key-Value database, the collection of key-value pairs may be called a “Table” or “Collection”
• Example: we have a database “empdb”; read/write operations are then performed as follows:
• Write: empdb.employees.put("10011", '{"empno":"10011", "name":"Michael", "salary":60000}')
• Read: emp1 = empdb.employees.get("10005")
• Remove: emp2 = empdb.employees.remove("10001")
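
A minimal in-memory sketch of such a table is given below. The class name KeyValueTable is an assumption made for illustration and is not the API of any particular product; a real key-value store would also persist the data. The point is that the store handles the value as an opaque chunk and only the client interprets it.

```python
import json

class KeyValueTable:
    """A toy key-value "table": the store never looks inside a value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value            # value is kept as an opaque chunk

    def get(self, key):
        return self._data.get(key)         # None if the key is absent

    def remove(self, key):
        return self._data.pop(key, None)   # returns the removed value, if any

employees = KeyValueTable()
employees.put("10011", json.dumps({"empno": "10011", "name": "Michael", "salary": 60000}))

raw = employees.get("10011")               # the store hands back the same string
emp = json.loads(raw)                      # only the client interprets it
removed = employees.remove("10011")
```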


Key Value Stores
Key “empno”: “1234”
Value
• In pure “Key Value” databases, content of value is not “visible” to DBMS
• Value is just read and written as a chunk
• DynamoDB, Redis, Riak are popular key-value stores.
• Since key-value systems always use primary-key access, they generally have greater performance and can be easily scaled.


Key Value Stores
Key “orderNo”: 1234
Value

• Value in a KV DB could be structured or unstructured: it could be stored as a byte array, string, or CSV, or
• it could be very much structured, represented as XML or JSON


Key Value Stores - Summarized
• Database here is a “Persistent Collection” of “Key-Value pairs”
• where the value is typically a “Data Object” or an “Aggregated Data Object”
• The key is typically the “Primary Key”
• The value part is opaque: the “DBMS does not see and does not process it”
• Values can be stored internally in any data representation format
– String, Tuple, CLOB, BLOB, XML, JSON.
• By definition KV databases are schema-less


KV database – Pros and Cons
• Pros
– Efficient queries (very predictable performance) as operations are always based
on Key.
– Easy to distribute across a cluster.
– No impedance (object-relational) mismatch
• Cons
– No complex query filters (WHERE(Predicate) part is limited to Keys only)
– All joins must be done manually through code
– No foreign key constraints
– No triggers
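
The first two cons can be illustrated with a rough sketch. The two "tables" and their contents are invented; plain dictionaries stand in for the store. Filtering on a value attribute forces a full scan with decoding in the client, and the join is a second key lookup written by hand.

```python
import json

# Two hypothetical key-value "tables"; values are opaque JSON strings.
employees = {
    "10001": json.dumps({"name": "Michael", "salary": 60000, "dept": "D1"}),
    "10005": json.dumps({"name": "Sara", "salary": 45000, "dept": "D2"}),
}
departments = {
    "D1": json.dumps({"dname": "Research"}),
    "D2": json.dumps({"dname": "Sales"}),
}

def well_paid_with_department(min_salary):
    """Filter on a value attribute and join, both done in client code."""
    result = []
    for key, raw in employees.items():       # full scan: no index on salary
        emp = json.loads(raw)                # the client decodes the value
        if emp["salary"] >= min_salary:
            dept = json.loads(departments[emp["dept"]])   # manual "join" by key
            result.append((key, emp["name"], dept["dname"]))
    return result
```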


Document Oriented Databases
• Document databases are also built on top of the “Key-Value” strategy
• A KV database does not know anything about the value part, and does not perform any operation on the value. It just puts and gets the value as a block.
• Whereas in document databases, we specify structural information (partial or full) of the value part.
• We can use attributes from the value part in the SELECT and predicate (WHERE) parts of a query
• Note that there is a thin boundary between KV and Document databases!
[Figure: Document DB and Key-Value DB]
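
What "using value attributes in the SELECT and predicate parts" means can be sketched in plain Python. The find function and the sample documents are hypothetical and do not reproduce the query API of MongoDB or any other product; they only show that the store can look inside the value, including embedded objects.

```python
# A hypothetical "collection" of documents, keyed by _id.
employees = {
    "10001": {"_id": "10001", "name": "Michael", "salary": 60000,
              "address": {"city": "Pune"}},
    "10005": {"_id": "10005", "name": "Sara", "salary": 45000,
              "address": {"city": "Surat"}},
}

def get_path(doc, path):
    """Read a possibly nested attribute such as "address.city"."""
    for part in path.split("."):
        if not isinstance(doc, dict) or part not in doc:
            return None
        doc = doc[part]
    return doc

def find(collection, predicate, fields):
    """Return the chosen fields of every document matching the predicate."""
    return [
        {f: get_path(doc, f) for f in fields}
        for doc in collection.values()
        if predicate(doc)
    ]

names = find(employees, lambda d: d["salary"] > 50000, ["name", "address.city"])
```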


Document Oriented Databases
• Here the term “Document” refers to a record as shown below.
• A document here is analogous to a row in an RDB
• Here is a snapshot from Mongo DB docs.
• MongoDB calls a collection of documents a “Collection”, which is analogous to a table in an RDB

Source: https://docs.mongodb.com/manual/core/databases-and-collections/

Document Oriented Databases
• Here are Mongo DB correspondences with RDB

RDB                 Doc DB
Database/Schema     Database
Table               “Collection” of Documents
Row/Tuple           “Document”
Row ID              _id


Document DB and Schema
• Flexible Schema
– Mongo DB allows the schema to range from “No Schema” to “Full Schema”, including type and cardinality constraints
– We may define a set of required attributes and some constraints on them.
– You can define an ID, or a Shard Key
– The schema remains flexible: a document may carry any extra set of attribute-values.
• DB design can be “Aggregated”, “Normalized”, or anything in between
• You can define index on attributes from “Value Part”
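
A flexible schema of this kind can be sketched as follows. The REQUIRED table and the validate function are assumptions made for illustration and are not MongoDB's validation mechanism; the sketch only shows required attributes with type constraints while any extra attributes pass through.

```python
# Hypothetical schema: required attributes and their expected types.
REQUIRED = {"_id": str, "name": str, "salary": int}

def validate(doc):
    """Return a list of violations; extra attributes are always allowed."""
    errors = []
    for attr, expected in REQUIRED.items():
        if attr not in doc:
            errors.append(f"missing required attribute: {attr}")
        elif not isinstance(doc[attr], expected):
            errors.append(f"{attr} should be of type {expected.__name__}")
    return errors

ok = validate({"_id": "1", "name": "Michael", "salary": 60000, "hobby": "chess"})
bad = validate({"_id": "2", "salary": "high"})
```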


Mongo DB “Sharded”

• Recall that modern distributed databases are “partitioned” and “replicated”
• “Sharding” is a special type of “partition” where data records are partitioned “horizontally”
• Each partition has the same schema.
• Each partition has a local DB engine: the “WiredTiger” storage engine!
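
Horizontal partitioning can be sketched with a hash of the shard key. This is one possible strategy under assumed names (NUM_SHARDS, shard_for) and is not a description of how MongoDB routes documents; plain dictionaries stand in for the per-shard storage engines.

```python
import hashlib

NUM_SHARDS = 3  # assumed cluster size, for illustration only

def shard_for(shard_key):
    """Map a shard key to a shard number with a stable hash."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Every shard holds whole records with the same structure.
shards = [dict() for _ in range(NUM_SHARDS)]

def insert(doc):
    shards[shard_for(doc["_id"])][doc["_id"]] = doc

def lookup(doc_id):
    # Only the one shard that owns the key is consulted.
    return shards[shard_for(doc_id)].get(doc_id)

for i in range(10):
    insert({"_id": f"emp{i}", "name": f"employee {i}"})
```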


Popular “Document Databases”
• Popular Document databases are
– Mongo DB (more towards “Consistency”)
– Couch DB (more towards “Availability”)
• Different Document databases would have different features in terms of
– Data Types, and other building blocks
– API
– Partitioning, replication, indexing
– “Consistency Models”


“Column Family” or “Wide Column” Databases

• Why do we call them wide-column databases?
– They can have a huge number of columns, making a row very wide!
– Note that the row here is not the same as the row in relational systems!
• Why do we call them “Column Family Databases”?
– We do not define the names of columns while creating tables. The only thing we do is define “column families” (we shall learn about them shortly)


“Column Family” or “Wide Column” Databases

• They do use the concepts of “Table”, “Row”, and “Column”, though slightly differently, and additionally “Column Family” [4]
• Column: an “attribute”; it may not be atomic
• Row: represents a “Data Record”
– typically an aggregate and certainly not a normalized row
• Table: A collection of “rows”
• Column Family: A group of columns. A table has a fixed set of column families but not a fixed set of columns. A column family can have any number of columns, and their names are not known upfront!


“Table”, “Row”, “Column Family”, “Column” [5]


“Column Family” or “Wide Column” Databases

• Google's BigTable was the first column family database
• Then “HBase” at Hadoop and “Cassandra” at Facebook followed!
– HBase is a Hadoop implementation of BigTable, whereas
– Cassandra is a “Column Family” adaptation of Amazon's “Dynamo” [3], a key-value store
• The BigTable article [2] defines “Big Table” as
– a sparse, distributed, persistent multidimensional sorted “map”.
– The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes


“Column Family” or “Wide Column” Databases [2]

• “sparse”:
– The total number of columns in a table could be very large; however, individual rows have values for only a few of the columns
– This is also the reason that these systems are called “Wide Column” databases
• “distributed”: partitioned
• “persistent”
• “multidimensional sorted map”.
– “Map” tells us that this is a key-value store
– Rows are kept “Sorted” on key


“Table”, “Row”, “Column Family”, “Column” [5]

Row as a multidimensional sorted map! [2]


Also a “key-value store” [5]
• Multi Level Key-Value store: [row key, column family, column qualifier, timestamp]
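
The multi-level key can be sketched as nested maps. The WideColumnTable class is a toy model under assumed names, not the HBase or BigTable API; integer timestamps stand in for real version timestamps, and values are stored as bytes.

```python
from collections import defaultdict

class WideColumnTable:
    """Toy model: row key -> column family -> qualifier -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(
            lambda: defaultdict(lambda: defaultdict(dict)))

    def put(self, row_key, family, qualifier, timestamp, value):
        self._rows[row_key][family][qualifier][timestamp] = value

    def get(self, row_key, family, qualifier, timestamp=None):
        """Return the value at a timestamp, or the latest version if omitted."""
        versions = self._rows[row_key][family][qualifier]
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

t = WideColumnTable()
t.put("emp#10011", "personal", "name", 1, b"Michael")
t.put("emp#10011", "office", "salary", 1, b"60000")
t.put("emp#10011", "office", "salary", 2, b"65000")
latest = t.get("emp#10011", "office", "salary")
first = t.get("emp#10011", "office", "salary", timestamp=1)
```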


“Column Family Databases” and Schema
• While defining a table
– Row Key is Required
– Column Family is Required
– There can be any number of columns in Column Family
• A row is required to have a “Row Key” and can have any number of columns in any column family
• A row need not have columns in all column families!
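
These schema rules can be sketched as a small class. ColumnFamilyTable and the family names are hypothetical, chosen only to show that column families are fixed when the table is defined while columns are not.

```python
class ColumnFamilyTable:
    """Toy table: column families are fixed at creation, columns are not."""

    def __init__(self, name, column_families):
        self.name = name
        self.column_families = frozenset(column_families)
        self.rows = {}   # row key -> {family: {column: value}}

    def put(self, row_key, family, column, value):
        if not row_key:
            raise ValueError("a row key is required")
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {}).setdefault(family, {})[column] = value

emp = ColumnFamilyTable("employees", ["personal", "office"])
emp.put("10011", "personal", "name", "Michael")
emp.put("10011", "personal", "nickname", "Mike")   # new column, no schema change
emp.put("10005", "office", "salary", 45000)        # no "personal" columns at all
```

Writing to an undeclared family, for example emp.put("10011", "payroll", "bonus", 1), raises a KeyError in this sketch.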


Why do we have three different types of “KV databases”?
• Key Value
– Highly Scalable, Highly Efficient on Key based Access
• Document
– More friendly with “OOP”, Aggregation Oriented Store.
– Have no Impedance mismatch problem
– Tries to have the same functionality of “traditional databases” at scale
– Access is based on many attributes of an object, even attributes of embedded objects
• Wide Column or Column Family
– Key value beyond mere “KEY”
– Multi-Dimensional Map
– We can have a large number of attributes (theoretically millions), yet “look up” is still amazingly fast, and that too at scale.

Graph Databases

• A node represents an Entity, whereas
• An edge represents a Relationship
• Operations are performed by Graph Manipulations
• Graph Databases are popularly used in RDF-based Linked Open Data, Social
networks, and other information networks.
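
A graph manipulation of the kind used in social networks can be sketched as a two-hop traversal. The Graph class, the FRIEND label, and the names are invented for illustration and do not correspond to the API of any graph database.

```python
from collections import defaultdict

class Graph:
    """Toy graph: labelled, directed edges between node ids."""

    def __init__(self):
        self._out = defaultdict(list)   # node -> [(label, target), ...]

    def add_edge(self, source, label, target):
        self._out[source].append((label, target))

    def neighbours(self, node, label):
        return [t for (l, t) in self._out[node] if l == label]

    def friends_of_friends(self, node):
        """Two-hop traversal over FRIEND edges, excluding direct friends."""
        direct = set(self.neighbours(node, "FRIEND"))
        result = set()
        for friend in direct:
            for fof in self.neighbours(friend, "FRIEND"):
                if fof != node and fof not in direct:
                    result.add(fof)
        return sorted(result)

g = Graph()
for a, b in [("asha", "ben"), ("ben", "chen"), ("asha", "dev"), ("dev", "ben")]:
    g.add_edge(a, "FRIEND", b)
    g.add_edge(b, "FRIEND", a)   # friendship stored in both directions
```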


Graph Databases [1]


“Works that made the impact”
• GFS
• Map Reduce
• Google Big Table [2]
• Amazon Dynamo [3]
• Column Oriented Storage


References/Further Readings
[1] Chapter 2 of book Sadalage, Pramod J., and Martin Fowler. NoSQL distilled: a brief guide to the
emerging world of polyglot persistence. Pearson Education, 2013.
[2] Chang, Fay, et al. "Bigtable: A distributed storage system for structured data." ACM Transactions on
Computer Systems (TOCS) 26.2 (2008): 1-26.
[3] DeCandia, Giuseppe, et al. "Dynamo: Amazon's highly available key-value store." ACM SIGOPS
operating systems review 41.6 (2007): 205-220.
[4] Lakshman, Avinash, and Prashant Malik. "Cassandra: a decentralized structured storage system."
ACM SIGOPS Operating Systems Review 44.2 (2010): 35-40.
[5] Khurana, Amandeep. "Introduction to HBase schema design." White Paper, Cloudera (2012).
[6] Sasaki, Bryce Merkl, Joy Chao, and Rachel Howard. "Graph databases for beginners." Neo4j (2018).
