
UNIT-3-NoSQL Storage Architecture

Working With Column-Oriented Databases, Hbase Distributed Storage Architecture,


Document Store Internals, Understanding Key/Value Stores In Memcached And Redis,
Eventually Consistent Non-Relational Databases

Working with Column-Oriented Databases

1. Introduction
A column-oriented database (also known as a column-family database) is a type of
NoSQL database that stores data in columns rather than rows. This design is optimized for
analytics, fast read performance, and scalability.
Key Features
Stores data by columns instead of rows.
Optimized for analytical queries (e.g., OLAP workloads).
Efficient compression and faster retrieval of specific columns.
Used in big data applications, data warehouses, and real-time analytics.

2. Row-Oriented vs Column-Oriented Databases


Feature        | Row-Oriented (RDBMS: MySQL, PostgreSQL)        | Column-Oriented (Cassandra, HBase, ClickHouse)
Storage Format | Stores entire rows together                    | Stores data column-wise
Read Speed     | Fast for retrieving entire rows                | Fast for retrieving specific columns
Write Speed    | Faster for transactional workloads (OLTP)      | Slower for individual writes but optimized for batch operations
Compression    | Low                                            | High (similar values in columns)
Best Use Case  | Transactional databases (banking, e-commerce)  | Analytical workloads (big data, time-series)

3. Architecture of Column-Oriented Databases


A. Column-Family Structure
Instead of tables with fixed rows and columns, column-oriented databases use a column-
family model:
Column Family → Like a table but stores related columns together.
Row Key → Unique identifier for each row.
Columns → Stored together within a column family.

Example (Cassandra Table Structure)


Row Key | Name | Age | City
---------------------------------
1001 | Alex | 25 | New York
1002 | Bob | 30 | Chicago
1003 | Carol | 28 | Boston
Stored as:
Column 1 (Name): {1001: "Alex", 1002: "Bob", 1003: "Carol"}
Column 2 (Age): {1001: 25, 1002: 30, 1003: 28}
Column 3 (City): {1001: "New York", 1002: "Chicago", 1003: "Boston"}
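The split into per-column maps shown above can be sketched in Python (a toy illustration of the layout, not any real engine's storage format). Note that an aggregate such as AVG(Age) touches only the one column it needs:

```python
# Row-oriented layout: one dict per row, keyed by row key.
rows = {
    1001: {"Name": "Alex", "Age": 25, "City": "New York"},
    1002: {"Name": "Bob", "Age": 30, "City": "Chicago"},
    1003: {"Name": "Carol", "Age": 28, "City": "Boston"},
}

# Column-oriented layout: one dict per column, keyed by row key.
columns = {}
for row_key, row in rows.items():
    for col, value in row.items():
        columns.setdefault(col, {})[row_key] = value

# An aggregate like AVG(Age) reads a single column, not every full row.
avg_age = sum(columns["Age"].values()) / len(columns["Age"])
print(columns["Name"])   # {1001: 'Alex', 1002: 'Bob', 1003: 'Carol'}
print(avg_age)
```

This is also why columnar compression works well: each column dict holds values of one type with many similar entries.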

4. Popular Column-Oriented Databases


Database Type Use Case
Apache Cassandra Wide Column Store Distributed, high-availability applications
HBase Wide Column Store Big data storage (Hadoop integration)
ClickHouse Analytical DB Real-time analytics, log processing
Amazon Redshift Data Warehouse Business Intelligence (BI)
Google BigQuery Data Warehouse Cloud-based big data analytics

5. Working with Column-Oriented Databases


A. Creating a Table in Cassandra
CREATE TABLE users (
user_id UUID PRIMARY KEY,
name TEXT,
age INT,
city TEXT
) WITH COMPACTION = { 'class': 'LeveledCompactionStrategy' };
B. Inserting Data
INSERT INTO users (user_id, name, age, city)
VALUES (uuid(), 'Alice', 28, 'Los Angeles');
C. Querying Specific Columns
Lookups go through the primary key; since user_id is a UUID, the literal must be a UUID rather than an integer:
SELECT name, age FROM users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
6. Advantages of Column-Oriented Databases
Fast Aggregations – SUM, AVG, COUNT queries run efficiently.
Better Compression – Similar values in columns allow high compression ratios.
Scalability – Distributed architecture, handles petabytes of data.
Optimized for Analytics – Works well with BI tools (Tableau, Power BI).
Example: Efficient Querying
 A query that scans millions of rows to compute total revenue per year can run an order of magnitude faster on a columnar database than on a row-based one, because only the revenue and date columns are read from disk.

7. When to Use Column-Oriented Databases?


Best for:
 Data Warehousing & Business Intelligence (BI)
 Real-time Analytics (e.g., user behavior tracking)
 Time-Series Data (e.g., IoT, log monitoring)
 Big Data Processing (e.g., Hadoop, Spark)
Not ideal for:
 Transactional Workloads (e.g., Banking, E-commerce)
 Frequent Single Record Updates

HBase Architecture and its Important Components


Introduction
HBase, a distributed, scalable, and NoSQL database built on top of the Hadoop Distributed
File System (HDFS), stands as a cornerstone in the realm of big data storage and processing.
With its column-oriented structure, automatic sharding, and strong consistency guarantees,
HBase offers a robust foundation for storing and managing vast amounts of structured data in
real-time.
What is HBase?
HBase is modeled after Google's Bigtable. It is an open-source, distributed database developed by the Apache Software Foundation and written in Java.

A table in HBase is split into regions, which are served by the region servers. Regions are vertically divided by column families into "stores." Stores are usually saved as files in HDFS. HBase runs on top of HDFS (Hadoop Distributed File System).
HBase architecture consists mainly of five components
 HMaster
 HRegionserver
 HRegions
 Zookeeper
 HDFS
Below is a detailed architecture of HBase with components:

HBase Architecture Diagram

HMaster

HMaster is the implementation of the Master server in the HBase architecture. It acts as a monitoring agent for all Region Server instances in the cluster and serves as the interface for all metadata changes. In a distributed cluster environment, the Master runs on the NameNode and maintains several background threads.

The following are important roles performed by HMaster in HBase:

 Plays a vital role in performance and in maintaining the nodes of the cluster.
 Provides administrative services and distributes work to the region servers.
 Assigns regions to region servers.
 Controls load balancing and failover to spread load over the nodes of the cluster.
 Handles schema changes and other metadata operations on behalf of clients.

The methods exposed by the HMaster interface are primarily metadata-oriented:
 Table (createTable, removeTable, enable, disable)
 ColumnFamily (addColumn, modifyColumn)
 Region (move, assign)

The client communicates bi-directionally with both HMaster and ZooKeeper, but for read and write operations it contacts the HRegion servers directly. HMaster assigns regions to region servers and, in turn, checks the health status of the region servers.

The architecture contains multiple region servers; each region server maintains an HLog, which stores all of its write-ahead log files.
HBase Region Servers
When an HBase Region Server receives a read or write request from a client, it routes the request to the specific region where the relevant column family resides. The client can contact HRegion servers directly; no permission from HMaster is required for this communication. The client needs HMaster only for operations involving metadata and schema changes.

HRegionServer is the Region Server implementation. It is responsible for serving and managing the regions (data) present in the distributed cluster. Region servers run on the DataNodes of the Hadoop cluster.

HMaster can contact multiple HRegion servers, which perform the following functions:
 Hosting and managing regions
 Splitting regions automatically
 Handling read and write requests
 Communicating with the client directly
HBase Regions
HRegions are the basic building blocks of an HBase cluster: they hold the distributed portions of tables and are composed of column families. Each region contains multiple stores, one per column family, and each store consists of two main components, the Memstore and HFiles.
ZooKeeper
In HBase, ZooKeeper is a centralized coordination service that maintains configuration information and provides distributed synchronization, i.e., coordination between the distributed applications running across the cluster. A client that wants to communicate with regions has to approach ZooKeeper first.

ZooKeeper is an open-source project that provides several important services.

Services provided by ZooKeeper
 Maintains configuration information
 Provides distributed synchronization
 Establishes client communication with region servers
 Provides ephemeral nodes that represent the individual region servers
 Lets the master use those ephemeral nodes to discover the available servers in the cluster
 Tracks server failures and network partitions

The master and the HBase slave nodes (region servers) register themselves with ZooKeeper. A client needs the ZooKeeper (ZK) quorum configuration to connect to the master and region servers.

When a node in the HBase cluster fails, the ZooKeeper quorum raises error notifications, and recovery of the failed node's regions is initiated.
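The role of ephemeral nodes in failure detection can be approximated with a small Python sketch (purely illustrative; the class name and timeout are invented for the example). A server's node "exists" only while its session keeps heartbeating, which is how the master notices failed region servers:

```python
class EphemeralRegistry:
    """Toy sketch of ZooKeeper-style ephemeral nodes: a node exists
    only while its owning session keeps heartbeating."""
    def __init__(self, session_timeout=2.0):
        self.session_timeout = session_timeout
        self.heartbeats = {}  # server name -> last heartbeat time

    def heartbeat(self, server, now):
        self.heartbeats[server] = now

    def live_servers(self, now):
        # A server whose session has timed out is treated as failed,
        # exactly as if its ephemeral node had vanished.
        return {s for s, t in self.heartbeats.items()
                if now - t < self.session_timeout}

reg = EphemeralRegistry(session_timeout=2.0)
reg.heartbeat("rs1", now=0.0)
reg.heartbeat("rs2", now=0.0)
reg.heartbeat("rs1", now=1.5)     # rs1 renews its session; rs2 goes silent
print(reg.live_servers(now=2.5))  # {'rs1'} -- rs2's "ephemeral node" is gone
```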

HDFS
HDFS (Hadoop Distributed File System) provides, as its name implies, a distributed storage environment; it is a file system designed to run on commodity hardware. It stores each file in multiple blocks, and to maintain fault tolerance the blocks are replicated across the Hadoop cluster.

HDFS provides a high degree of fault tolerance on cheap commodity hardware. The cluster can be grown simply by adding nodes, with both storage and processing handled by that commodity hardware.

By default, each block is replicated to three nodes, so if any node goes down there is no loss of data, and a proper backup and recovery mechanism is available. HDFS is the layer through which the HBase components store large amounts of data in a distributed manner.

HBase Data Model

The HBase data model is a set of components consisting of tables, rows, column families, cells, columns, and versions. HBase tables contain column families and rows, with the row key acting as the primary key. A column in an HBase table represents an attribute of the stored object.

The HBase data model consists of the following elements:
 A set of tables
 Each table with column families and rows
 Each table must have an element defined as the primary key; the row key acts as the primary key in HBase
 Any access to HBase tables uses this primary key
 Each column in HBase denotes an attribute of the corresponding object

HBase Use Cases
Following are examples of HBase use cases, with the solution it provides to various technical problems.

Problem Statement:
The telecom industry faces the following technical challenges:
 Storing billions of CDR (Call Detail Record) logs generated in the telecom domain
 Providing real-time access to CDR logs and customers' billing information
 Providing a cost-effective solution compared to traditional database systems

Solution:
HBase is used to store billions of rows of detailed call records. If 20 TB of data is added per month to an existing RDBMS, its performance deteriorates. To handle such volumes, HBase is a better fit: it performs fast queries and returns records quickly.

Problem Statement:
The banking industry generates millions of records daily and also needs an analytics solution that can detect fraud in money transactions.

Solution:
To store, process, and update vast volumes of data and perform analytics, an ideal solution is HBase integrated with several Hadoop ecosystem components.

Apart from that, HBase can be used:
 Whenever there is a need for write-heavy applications
 For online log analytics and generating compliance reports
Storage Mechanism in HBase

HBase is a column-oriented database, and data is stored in tables sorted by row key. A row key groups the several column families present in the table.

Each column family is a set of key-value pairs and can contain any number of columns. The column values are stored on disk, and every cell of the table carries its own metadata, such as a timestamp.



The following key terms describe an HBase table schema:
 Table: Collection of rows.
 Row: Collection of column families.
 Column Family: Collection of columns.
 Column: Collection of key-value pairs.
 Namespace: Logical grouping of tables.
 Cell: A {row, column, version} tuple that exactly specifies a cell in HBase.
Column-oriented vs Row-oriented storages
Column- and row-oriented storages differ in their storage mechanism. Traditional relational models store data in a row-based format, i.e., as rows of data, while column-oriented storages store data tables in terms of columns and column families.

The following table gives some key differences between the two storages:

Column-oriented Database                                         | Row-oriented Database
Used for processing and analytics, such as Online Analytical Processing (OLAP) and its applications. | Used for online transactional processing (OLTP), such as in the banking and finance domains.
Designed for huge volumes of data, on the order of petabytes.    | Designed for a small number of rows and columns.

HBase Read and Write Data


The read and write operations from the client into an HFile are shown in the diagram below.

Step 1) The client wants to write data; it first communicates with the Region server and then with the region.
Step 2) The region passes the write to the Memstore associated with the column family.
Step 3) The data is first stored in the Memstore, where it is kept sorted by row key, and is later flushed into an HFile. The main reason for using the Memstore is to sort data by row key before it is persisted to the distributed file system. The Memstore lives in the region server's main memory, while HFiles are written to HDFS.
Step 4) The client wants to read data from a region.
Step 5) The client can access the Memstore directly and request the data there.
Step 6) If the data is not in the Memstore, the request goes to the HFiles, from which the data is fetched and returned to the client.
The Memstore holds in-memory modifications to the store. The hierarchy of objects in HBase regions, from top to bottom, is:

Table     : HBase table present in the HBase cluster
Region    : HRegions for the presented tables
Store     : One store per column family, for each region of the table
Memstore  : One Memstore per store, for each region of the table; it sorts data before flushing into HFiles, which improves write and read performance
StoreFile : StoreFiles for each store, for each region of the table
Block     : Blocks present inside StoreFiles
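The Memstore-to-HFile path described above can be sketched in Python (a deliberately simplified toy; `ToyStore` and its flush threshold are invented for illustration). Writes land in a sorted in-memory map, flushes produce immutable sorted files, and reads check memory first:

```python
class ToyStore:
    """Minimal sketch of the Memstore -> HFile write path (illustration only)."""
    def __init__(self, flush_threshold=3):
        self.memstore = {}              # row key -> value, in memory
        self.hfiles = []                # each flush yields one immutable sorted "HFile"
        self.flush_threshold = flush_threshold

    def put(self, row_key, value):
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Memstore contents are sorted by row key before being written out.
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

    def get(self, row_key):
        # Reads check the memstore first, then fall back to HFiles (newest first).
        if row_key in self.memstore:
            return self.memstore[row_key]
        for hfile in reversed(self.hfiles):
            for k, v in hfile:
                if k == row_key:
                    return v
        return None

store = ToyStore(flush_threshold=2)
store.put("r2", "b")
store.put("r1", "a")   # reaches the threshold: flushes [("r1","a"), ("r2","b")]
store.put("r3", "c")   # still sitting in the memstore
print(store.get("r2"), store.get("r3"))
```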

HBase vs. HDFS


HBase runs on top of HDFS and Hadoop. Some key differences between HDFS and HBase concern data operations and processing.

HBASE                                                                   | HDFS
Low-latency operations                                                  | High-latency operations
Random reads and writes                                                 | Write once, read many times
Accessed through shell commands, client API in Java, REST, Avro or Thrift | Primarily accessed through MapReduce (MR) jobs
Supports both storage and processing                                    | Used only for storage

Typical industrial applications use HBase together with Hadoop: for workloads such as stock exchange data or online banking data operations and processing, HBase is a well-suited solution.

Summary
 HBase architecture components: HMaster, HRegion Server, HRegions, ZooKeeper,
HDFS
 HMaster in HBase is the implementation of a Master server in HBase architecture.
 When HBase Region Server receives writes and read requests from the client, it
assigns the request to a specific region, where the actual column family resides
 HRegions are the basic building blocks of an HBase cluster; they hold the distributed portions of tables and are composed of column families.
 HBase Zookeeper is a centralized monitoring server which maintains configuration
information and provides distributed synchronization.
 HDFS provides a high degree of fault–tolerance and runs on cheap commodity
hardware.
 HBase Data Model is a set of components that consists of Tables, Rows, Column
families, Cells, Columns, and Versions.
 Column and Row-oriented storages differ in their storage mechanism.

Document Store Internals


1. Introduction to Document Stores
 Document stores are NoSQL databases designed to store, retrieve, and manage semi-
structured data in document format (e.g., JSON, BSON, XML).
 Unlike relational databases, they do not require a fixed schema and can store nested
and dynamic structures.
 Commonly used for flexible, scalable, and high-performance applications.

2. Key Characteristics
Schema-less – No predefined schema, supports dynamic fields.
Hierarchical Data Model – Stores complex, nested structures in a single document.
Efficient Querying – Supports indexing, filtering, and full-text search.
Horizontal Scalability – Uses sharding and replication for distribution.

3. Internal Components of Document Stores

1. Documents
Core storage unit, typically in JSON, BSON, or XML format.
Stores key-value pairs, arrays, and nested objects.
Example (JSON Document in MongoDB):
{
  "_id": "12345",
  "name": "John Doe",
  "email": "[email protected]",
  "orders": [
    { "product": "Laptop", "price": 1000 },
    { "product": "Mouse", "price": 50 }
  ]
}
2. Collections
Group of related documents, similar to a table in relational databases.
Documents in a collection do not need to have the same structure.
3. Indexing Mechanism
Uses B-Tree and Hash Indexes for fast data retrieval.
Supports compound, geospatial, and text indexes.
4. Storage Engine
MongoDB (WiredTiger, MMAPv1) – Optimized for concurrent reads/writes.
CouchDB (Append-only B+ Tree) – Ensures durability using Multi-Version
Concurrency Control (MVCC).
5. Replication & Sharding
Replication – Copies data across multiple nodes for high availability.
Sharding – Distributes documents across servers based on a shard key.
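Shard-key routing can be sketched as a hash of the key taken modulo the shard count (a simplification; real document stores maintain chunk maps over hashed or ranged keys, but the invariant is the same: equal shard keys always route to the same shard):

```python
import hashlib

def shard_for(shard_key, num_shards):
    """Toy shard-key router: hash the key and take it modulo the
    number of shards. Illustrative only, not a real system's algorithm."""
    digest = hashlib.md5(shard_key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Documents with the same shard key always land on the same shard...
assert shard_for("user:12345", 4) == shard_for("user:12345", 4)

# ...while different keys spread out across the available shards.
shards = {shard_for(f"user:{i}", 4) for i in range(100)}
print(sorted(shards))
```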

4. Read & Write Operations in Document Stores


Write Operation
1. Data is written to the primary node.
2. Stored in memory and flushed to disk periodically.
3. Changes are replicated to secondary nodes for fault tolerance.
Read Operation
1. The query engine searches indexes first for faster lookups.
2. If no index exists, it performs a collection scan.
3. The document is fetched and returned in JSON/BSON/XML format.
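The index-first read path above can be sketched with a toy collection class (illustrative only; `ToyCollection` is invented for the example). A query on an indexed field is answered from a hash index, anything else falls back to a full collection scan:

```python
class ToyCollection:
    """Sketch of a document collection with optional hash indexes."""
    def __init__(self):
        self.docs = []
        self.indexes = {}  # field name -> {value: [positions]}

    def insert(self, doc):
        pos = len(self.docs)
        self.docs.append(doc)
        # Keep any existing indexes up to date on write.
        for field, index in self.indexes.items():
            if field in doc:
                index.setdefault(doc[field], []).append(pos)

    def create_index(self, field):
        index = {}
        for pos, doc in enumerate(self.docs):
            if field in doc:
                index.setdefault(doc[field], []).append(pos)
        self.indexes[field] = index

    def find(self, field, value):
        if field in self.indexes:   # fast path: index lookup
            return [self.docs[p] for p in self.indexes[field].get(value, [])]
        # slow path: collection scan
        return [d for d in self.docs if d.get(field) == value]

coll = ToyCollection()
coll.insert({"_id": 1, "city": "Boston"})
coll.insert({"_id": 2, "city": "Chicago"})
coll.create_index("city")
print(coll.find("city", "Boston"))  # served from the index
```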
5. Advantages of Document Stores
High Performance – Fast reads/writes due to indexing and in-memory caching.
Flexible Schema – Handles dynamic, evolving data structures.
Horizontal Scaling – Supports automatic sharding for big data.
Better for JSON-Based APIs – Ideal for modern web & mobile applications.

6. Limitations of Document Stores


Limited ACID Transactions – many document stores offer only per-document atomicity or eventual consistency.
Inefficient Joins – No built-in join support like relational databases.
Increased Storage Overhead – Due to document duplication and indexing.

7. Popular Document Stores


 MongoDB – Most widely used, supports JSON-like documents.
 CouchDB – Uses Multi-Version Concurrency Control (MVCC) for durability.
 Amazon DocumentDB – Fully managed NoSQL document store.
 RethinkDB – Real-time document store optimized for live applications.

Understanding Key/Value Stores in Memcached and Redis

1. Introduction to Key/Value Stores

Key/Value stores are a type of NoSQL database that store data as a collection of
key-value pairs.
They provide fast, scalable, and efficient access to data, making them ideal for
caching, real-time applications, and session storage.
Memcached and Redis are two of the most widely used in-memory key/value
stores.
2. Memcached vs Redis: Overview

Feature         | Memcached                  | Redis
Data Storage    | Pure key-value cache       | Key-value with advanced data structures
Persistence     | No (data lost on restart)  | Supports disk persistence
Data Structures | Strings only               | Strings, Lists, Sets, Hashes, Sorted Sets
Replication     | No replication             | Supports master-slave replication
Sharding        | Client-side sharding       | Built-in sharding
Transactions    | No transaction support     | Supports transactions
Pub/Sub         | Not supported              | Supported
Use Cases       | Simple caching             | Caching, real-time analytics, messaging
3. Memcached Internals
a) Architecture
In-memory key-value store designed for high-speed caching.
Distributed – scales horizontally across multiple servers.
Uses Least Recently Used (LRU) eviction policy to manage memory.
b) How Memcached Works
1. Stores data in key-value pairs in RAM.
2. Clients query Memcached before accessing the database.
3. If data is found (cache hit), it is returned immediately.
4. If data is not found (cache miss), it is retrieved from the database and stored in
Memcached for future access.
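The cache-aside flow above, together with LRU eviction, can be sketched in Python (a stand-in dictionary plays the part of Memcached; no real client or server is involved, and `ToyMemcached` is invented for the example):

```python
from collections import OrderedDict

class ToyMemcached:
    """In-memory LRU cache standing in for Memcached (sketch only)."""
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # cache miss
        self.data.move_to_end(key)         # mark as recently used
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

def fetch_user(cache, db, user_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    user = cache.get(user_id)
    if user is None:
        user = db[user_id]                 # slow path: hit the database
        cache.set(user_id, user)           # populate for future reads
    return user

db = {1: "Alice", 2: "Bob", 3: "Carol"}
cache = ToyMemcached(capacity=2)
fetch_user(cache, db, 1)   # miss -> loaded from db
fetch_user(cache, db, 2)   # miss
fetch_user(cache, db, 1)   # hit; key 1 becomes most recent
fetch_user(cache, db, 3)   # miss -> evicts key 2 (least recently used)
print(list(cache.data))    # [1, 3]
```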
c) Limitations of Memcached
No persistence (data is lost on restart).
Does not support complex data types.
No built-in replication (requires client-side sharding).

4. Redis Internals
a) Architecture
In-memory key-value store with support for complex data structures (Strings,
Lists, Sets, Hashes, Sorted Sets).
Supports persistence via RDB (snapshot) and AOF (append-only file).
Master-slave replication for high availability.
Pub/Sub messaging for real-time applications.
b) How Redis Works
1. Data is stored in RAM for fast access.
2. If persistence is enabled, Redis periodically saves snapshots to disk.
3. Redis supports automatic failover using Redis Sentinel.
4. Supports Lua scripting, transactions, and cluster mode for scaling.
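The append-only-file idea from step 2 can be sketched in Python (a toy model, not the real Redis AOF format): every write is appended to a command log, and "restarting" means replaying that log to rebuild the in-memory state:

```python
class ToyAOFStore:
    """Sketch of Redis-style append-only-file persistence."""
    def __init__(self, aof=None):
        self.aof = aof if aof is not None else []
        self.data = {}
        for cmd, key, value in self.aof:   # replay the log on startup
            if cmd == "SET":
                self.data[key] = value
            elif cmd == "DEL":
                self.data.pop(key, None)

    def set(self, key, value):
        self.aof.append(("SET", key, value))  # log first, then apply
        self.data[key] = value

    def delete(self, key):
        self.aof.append(("DEL", key, None))
        self.data.pop(key, None)

store = ToyAOFStore()
store.set("counter", 1)
store.set("name", "redis")
store.delete("counter")

restarted = ToyAOFStore(aof=store.aof)   # simulate a crash + restart
print(restarted.data)  # {'name': 'redis'} -- state survives the restart
```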
c) Advantages of Redis Over Memcached
Persistent storage (RDB, AOF).
Rich data types (Lists, Sets, Hashes, etc.).
Replication & clustering for high availability.
Atomic operations & transactions support.

Use Cases

Memcached Use Cases


Simple caching – Web session storage, database query caching.
CDN & API rate limiting – Reducing backend load.
Redis Use Cases
Real-time analytics – Leaderboards, counters.
Message queues & Pub/Sub – Chat apps, notifications.
Distributed locks – Prevent race conditions in applications.

Eventually Consistent Non-Relational Databases


1. Introduction
An eventually consistent non-relational database is a type of database that
prioritizes availability and partition tolerance over immediate consistency. It
follows the eventual consistency model, meaning data updates are asynchronously
propagated to all nodes in a distributed system, and after some time, all nodes
converge to the same state.
This model is commonly used in NoSQL databases, which are designed for
scalability, high availability, and fault tolerance.

2. Eventual Consistency: Definition


Eventual consistency is a relaxed consistency model that ensures:
All replicas will receive updates.
Given enough time (assuming no new updates), all nodes will eventually have
consistent data.
Unlike strong consistency, there is a time lag before all nodes reflect the latest
changes.

Example

A user updates a profile picture on Node A.
The change is asynchronously propagated to Node B and Node C.
Eventually, all nodes have the updated profile picture.

This model is suitable for systems where high availability is more critical than immediate consistency, such as social media feeds, shopping carts, and recommendation systems.
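The propagation in this example can be simulated in a few lines of Python (a toy model where an explicit `propagate` call stands in for background replication; node and key names are invented for the example):

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []   # replication events not yet applied

def write(primary, others, key, value):
    """Apply locally, then queue asynchronous replication to the other nodes."""
    primary.data[key] = value
    for node in others:
        node.pending.append((key, value))

def propagate(nodes):
    """Deliver queued updates -- in a real system this runs in the background."""
    for node in nodes:
        while node.pending:
            key, value = node.pending.pop(0)
            node.data[key] = value

a, b, c = Node("A"), Node("B"), Node("C")
write(a, [b, c], "avatar", "new.png")
stale_read = b.data.get("avatar")   # None: B has not converged yet
propagate([a, b, c])
converged = all(n.data.get("avatar") == "new.png" for n in (a, b, c))
print(stale_read, converged)  # None True
```

The window between the write and `propagate` is exactly the "time lag" that distinguishes eventual from strong consistency.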

3. CAP Theorem and Eventual Consistency


The CAP theorem states that a distributed system can guarantee only two out of
three properties:
C (Consistency) – Every read returns the most recent write.
A (Availability) – Every request receives a response.
P (Partition Tolerance) – The system continues functioning even if some nodes fail.
How Eventual Consistency Fits
It favors Availability (A) and Partition Tolerance (P).
It relaxes Consistency (C) by allowing temporary inconsistencies.
💡 Example: Amazon DynamoDB prioritizes AP, ensuring fast writes and reads, even
if data is slightly stale.

4. Characteristics of Eventually Consistent Databases


Asynchronous Replication – Updates are propagated to replicas without waiting for
immediate confirmation.
Conflict Resolution – Strategies like last-write-wins, vector clocks, or CRDTs
(Conflict-free Replicated Data Types) handle conflicts.
High Availability – Nodes remain operational even when some are temporarily
unreachable.
Partition Tolerance – Designed to handle network failures gracefully.
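The last-write-wins strategy mentioned above can be sketched as a merge over timestamped values (the timestamps and keys here are logical values invented for the example):

```python
def lww_merge(replica_a, replica_b):
    """Last-write-wins merge: for each key, keep the (timestamp, value)
    pair with the highest timestamp -- a common conflict-resolution
    strategy in eventually consistent stores."""
    merged = dict(replica_a)
    for key, (ts, value) in replica_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

# Each replica stores key -> (timestamp, value); the two diverged on "cart".
a = {"cart": (10, ["book"]), "name": (5, "alice")}
b = {"cart": (12, ["book", "pen"])}
merged = lww_merge(a, b)
print(merged["cart"])  # (12, ['book', 'pen']) -- the later write wins
```

Note the trade-off: last-write-wins silently discards the losing update, which is why systems needing to preserve concurrent writes use vector clocks or CRDTs instead.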

5. Examples of Eventually Consistent NoSQL Databases


Database          | Type              | Eventual Consistency Mechanism
Amazon DynamoDB   | Key-Value Store   | Uses quorum-based replication
Apache Cassandra  | Wide Column Store | Uses tunable consistency (eventual by default)
MongoDB           | Document Store    | Allows read preferences to enable eventual consistency
Riak              | Key-Value Store   | Uses CRDTs and last-write-wins strategy

6. Eventual Consistency vs Strong Consistency


Feature       | Eventual Consistency                  | Strong Consistency
Read Speed    | Fast                                  | Slower
Write Latency | Low                                   | High
Availability  | High                                  | Lower
Use Case      | Social media, caching, shopping carts | Banking, ticket booking, financial transactions

7. When to Use Eventual Consistency


Best suited for:
Social Media Feeds (Twitter, Facebook News Feed)
E-commerce Shopping Carts (Amazon, Flipkart)
IoT Data Storage (Sensor Data)
Content Delivery Networks (CDN)
Caching Systems (Redis, Memcached)
Not suitable for:
Banking Transactions
Stock Trading
Critical Real-Time Applications
