HBase

Uploaded by

mytempemail2023


HBase is a distributed data store, modeled on Google's Bigtable, designed to provide quick
random access to huge amounts of data.

Since the 1970s, the RDBMS has been the standard solution for data storage and maintenance
problems. After the advent of big data, companies realized the benefit of processing it and
started opting for solutions like Hadoop.

What is HBase?
HBase is a distributed, column-oriented database built on top of the Hadoop Distributed File
System (HDFS). It is an open-source project and is horizontally scalable, and it leverages the
fault tolerance provided by HDFS.
It is the part of the Hadoop ecosystem that provides random, real-time read/write access to
data stored in HDFS.
Data can be stored in HDFS either directly or through HBase. Data consumers read and access
the data in HDFS randomly through HBase, which sits on top of HDFS and provides read and
write access.

HBase and HDFS

| HDFS | HBase |
|---|---|
| HDFS is a distributed file system suitable for storing large files. | HBase is a database built on top of HDFS. |
| HDFS does not support fast individual record lookups. | HBase provides fast lookups for larger tables. |
| It provides high-latency batch processing. | It provides low-latency access to single rows from billions of records (random access). |
| It provides only sequential access to data. | HBase keeps data sorted by row key in indexed HDFS files (HFiles), which allows fast random-access lookups. |

Storage Mechanism in HBase

HBase is a column-oriented database, and its tables are sorted by row key. The table schema
defines only column families; within a family, data is stored as key-value pairs. A table can
have multiple column families, and each column family can have any number of columns.
Subsequent column values are stored contiguously on disk. Each cell value in the table has a
timestamp. In short, in HBase:

• A table is a collection of rows.
• A row is a collection of column families.
• A column family is a collection of columns.
• A column is a collection of key-value pairs.

Given below is an example schema of a table in HBase.

| Rowid | Column Family | Column Family | Column Family | Column Family |
|---|---|---|---|---|
| | col1, col2, col3 | col1, col2, col3 | col1, col2, col3 | col1, col2, col3 |
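The nesting described above (table, row, column family, column, timestamped value) can be sketched as nested dictionaries. This is only a toy in-memory model under stated assumptions: `ToyTable`, its method names, and the sample family names are hypothetical and are not part of the HBase API.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of an HBase-style table (not the HBase API)."""

    def __init__(self, families):
        # Only the column families are fixed up front; columns are free-form.
        self.families = set(families)
        # row key -> family -> column -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    def put(self, row, family, column, value, ts):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][family][column][ts] = value

    def get(self, row, family, column):
        """Return the value with the newest timestamp, or None if absent."""
        versions = self.rows.get(row, {}).get(family, {}).get(column, {})
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self):
        """Return row keys in sorted order, mirroring tables sorted by row."""
        return sorted(self.rows)


table = ToyTable(["personal", "professional"])
table.put("row2", "personal", "name", "Ravi", ts=1)
table.put("row1", "personal", "name", "Asha", ts=1)
table.put("row1", "personal", "name", "Asha K", ts=2)  # newer version of the same cell
```

Writing the same cell twice keeps both versions keyed by timestamp, and a read returns the newest one.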

Column Oriented and Row Oriented

Column-oriented databases store data tables as sections of columns of data, rather than as
rows of data. In short, they have column families.

| Row-Oriented Database | Column-Oriented Database |
|---|---|
| It is suitable for Online Transaction Processing (OLTP). | It is suitable for Online Analytical Processing (OLAP). |
| Such databases are designed for a small number of rows and columns. | Column-oriented databases are designed for huge tables. |
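To make the difference concrete, here is a minimal sketch that lays out the same records both ways. The sample records and function names are made up for illustration, and the code shows only the general idea of the two layouts, not HBase's on-disk format.

```python
# Hypothetical sample records; any tabular data would do.
records = [
    {"id": 1, "name": "Asha", "salary": 50000},
    {"id": 2, "name": "Ravi", "salary": 62000},
    {"id": 3, "name": "Mina", "salary": 58000},
]


def row_layout(rows):
    """Row-oriented: all fields of one record sit next to each other."""
    return [tuple(r.values()) for r in rows]


def column_layout(rows):
    """Column-oriented: all values of one column sit next to each other."""
    return {key: [r[key] for r in rows] for key in rows[0]}


rows = row_layout(records)
columns = column_layout(records)

# An analytical query such as an average reads one column in the column layout,
# whereas the row layout would have to walk through every full record.
average_salary = sum(columns["salary"]) / len(columns["salary"])
```

Fetching or updating one whole record is the natural operation on `rows`, which is why that layout suits transactional workloads.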

[Figure: column families in a column-oriented database.]

HBase and RDBMS

| HBase | RDBMS |
|---|---|
| HBase is schema-less: it has no concept of a fixed column schema and defines only column families. | An RDBMS is governed by its schema, which describes the whole structure of its tables. |
| It is built for wide tables. HBase is horizontally scalable. | It is thin and built for small tables. It is hard to scale. |
| There are no transactions in HBase. | An RDBMS is transactional. |
| It has de-normalized data. | It has normalized data. |
| It is good for unstructured and semi-structured as well as structured data. | It is good for structured data. |

Features of HBase
• HBase is linearly scalable.
• It has automatic failover support.
• It provides consistent reads and writes.
• It integrates with Hadoop, both as a source and as a destination.
• It has an easy-to-use Java API for clients.
• It provides data replication across clusters.

Where to Use HBase

• HBase is used to have random, real-time read/write access to Big Data.
• It hosts very large tables on top of clusters of commodity hardware.
• HBase is a non-relational database. HBase works on top of Hadoop and HDFS.

Applications of HBase
• It is used whenever there is a need for write-heavy applications.
• HBase is used whenever we need to provide fast random access to available data.
• Companies such as Facebook, Twitter, Yahoo, and Adobe use HBase internally.

HBase History
| Date | Event |
|---|---|
| Nov 2006 | Google released the paper on BigTable. |
| Feb 2007 | The initial HBase prototype was created as a Hadoop contribution. |
| Oct 2007 | The first usable HBase was released along with Hadoop 0.15.0. |
| Jan 2008 | HBase became a subproject of Hadoop. |
| Oct 2008 | HBase 0.18.1 was released. |
| Jan 2009 | HBase 0.19.0 was released. |
| Sept 2009 | HBase 0.20.0 was released. |
| May 2010 | HBase became an Apache top-level project. |

HBase Architecture
In HBase, tables are split into regions, which are served by the region servers. Each region
is divided vertically by column family into "stores", and stores are saved as files in HDFS.
[Figure: the architecture of HBase.]
Note: The term "store" is used for regions to explain the storage structure.

HBase has three major components: the client library, a master server, and region servers.
Region servers can be added or removed as per requirement.

Master Server
The master server:
• Assigns regions to the region servers, with the help of ZooKeeper.
• Handles load balancing of regions across region servers: it unloads busy servers and
shifts regions to less occupied ones.
• Maintains the state of the cluster by negotiating the load balancing.
• Is responsible for schema changes and other metadata operations, such as creating
tables and column families.
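The load-balancing duty above can be illustrated with a deliberately simple sketch that moves regions from the busiest server to the least busy one. It assumes that "load" means nothing more than the number of regions per server; the function name and data shapes are hypothetical, and this is not how the actual HBase balancer is implemented.

```python
def rebalance(assignments):
    """Even out region counts across servers.

    assignments: dict mapping server name -> list of region names
                 (assumed to contain at least one server).
    Regions are moved in place from the busiest server to the least busy one
    until the counts differ by at most one.
    Returns the list of (region, source, target) moves that were made.
    """
    moves = []
    while True:
        busiest = max(assignments, key=lambda s: len(assignments[s]))
        idlest = min(assignments, key=lambda s: len(assignments[s]))
        if len(assignments[busiest]) - len(assignments[idlest]) <= 1:
            return moves
        region = assignments[busiest].pop()
        assignments[idlest].append(region)
        moves.append((region, busiest, idlest))


cluster = {"rs1": ["r1", "r2", "r3", "r4", "r5"], "rs2": ["r6"], "rs3": []}
moves = rebalance(cluster)
```

After the call, each of the three servers in this example holds two regions.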

Regions
Regions are simply portions of a table, split up and spread across the region servers.
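Because rows are sorted by key, a table can be split into contiguous row-key ranges, and finding the region for a row becomes a binary search over the split points. The sketch below assumes each region covers the range from its start key (inclusive) to the next split key (exclusive); `RegionMap` and the server names are hypothetical.

```python
import bisect


class RegionMap:
    """Toy lookup from row key to the region (and server) holding it."""

    def __init__(self, split_keys, servers):
        # split_keys: sorted row keys where one region ends and the next begins.
        # N split keys produce N + 1 regions, so N + 1 server entries are needed.
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one server entry per region")
        self.split_keys = list(split_keys)
        self.servers = list(servers)

    def locate(self, row_key):
        """Return (region index, server) for the range containing row_key."""
        # bisect_right places a key equal to a split point in the later region.
        index = bisect.bisect_right(self.split_keys, row_key)
        return index, self.servers[index]


region_map = RegionMap(["g", "p"], ["rs1", "rs2", "rs1"])
```

Here keys below "g" land in region 0, keys from "g" up to but not including "p" in region 1, and everything else in region 2.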
Region Server
Each region server hosts regions and:

• Communicates with the client and handles data-related operations.
• Handles read and write requests for all the regions under it.
• Decides the size of the regions by following the region size thresholds.

A closer look at a region server shows that it contains regions and stores.
[Figure: regions and stores inside a region server.]

Each store contains a MemStore and HFiles. The MemStore works like a cache: anything
written to HBase is stored here first. Later, when the MemStore is flushed, the data is
transferred to HFiles and saved as blocks.
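The write path just described can be sketched as a buffer that is emptied into sorted, immutable files. This is a simplified illustration: `ToyStore` is a hypothetical name, the threshold counts entries rather than bytes, and real HBase details such as blocks inside HFiles are left out.

```python
class ToyStore:
    """Toy model of one store: an in-memory buffer plus flushed files."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # entry count (an assumed unit)
        self.memstore = {}   # recent writes, held in memory
        self.hfiles = []     # flushed files as sorted (key, value) lists, newest last

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the MemStore contents out as one sorted file and clear it."""
        if self.memstore:
            self.hfiles.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, key):
        """Check the MemStore first, then the files from newest to oldest."""
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            for k, v in hfile:
                if k == key:
                    return v
        return None


store = ToyStore(flush_threshold=2)
store.put("b", 1)
store.put("a", 2)   # second entry reaches the threshold and triggers a flush
store.put("b", 3)   # newer value for "b" stays in the MemStore for now
```

Reads consult the MemStore before the files, so the newest value for a key wins even when an older copy has already been flushed.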

ZooKeeper
• ZooKeeper is an open-source project that provides services such as maintaining
configuration information, naming, and distributed synchronization.
• ZooKeeper has ephemeral nodes representing the different region servers. Master servers
use these nodes to discover available servers.
• In addition to availability, the nodes are also used to track server failures and network
partitions.
• Clients use ZooKeeper to locate region servers and then communicate with them.
• In pseudo-distributed and standalone modes, HBase itself takes care of ZooKeeper.
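The role of ephemeral nodes can be illustrated with a small stand-in that does not talk to ZooKeeper at all. `ToyCoordinator`, the session ids, and the "/rs/" path prefix are all assumptions made for this sketch; it only models the idea that a server's node disappears when its session ends.

```python
class ToyCoordinator:
    """Stand-in for a coordination service with ephemeral nodes."""

    def __init__(self):
        self.ephemeral = {}  # node path -> id of the session that owns it

    def register(self, session_id, path):
        """A region server announces itself by creating an ephemeral node."""
        self.ephemeral[path] = session_id

    def expire_session(self, session_id):
        """Simulate a crash or network partition: the session's nodes vanish."""
        lost = [p for p, s in self.ephemeral.items() if s == session_id]
        for path in lost:
            del self.ephemeral[path]
        return lost

    def live_servers(self, prefix="/rs/"):
        """What a master would see when listing the region-server nodes."""
        return sorted(p[len(prefix):] for p in self.ephemeral if p.startswith(prefix))


zk = ToyCoordinator()
zk.register(1, "/rs/server-a")
zk.register(2, "/rs/server-b")
before = zk.live_servers()
lost = zk.expire_session(1)   # server-a fails
after = zk.live_servers()
```

A master watching the node list would notice that `server-a` is gone and could reassign its regions.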
