This document provides an overview of HBase, including its architecture and how it compares to relational databases and HDFS. Some key points:
- HBase is a non-relational, distributed, column-oriented database that runs on top of Hadoop and stores its data in HDFS. It uses a master-slave architecture in which one HMaster coordinates multiple HRegionServers.
- Unlike relational databases, HBase has no fixed column schema (only column families are declared up front), is column-oriented, and is designed for denormalized data in wide, sparsely populated tables.
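- Compared to HDFS, which is geared toward high-throughput batch processing of large files, HBase provides low-latency random reads and writes. Data is accessed through client APIs (for example get, put, and scan) rather than through MapReduce jobs.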
- HBase uses LSM