# apache/hbase

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/apache-hbase).**

5,540 stars · 3,397 forks · Java · Apache-2.0

## Links

- GitHub: https://github.com/apache/hbase
- Homepage: https://hbase.apache.org/
- awesome-repositories: https://awesome-repositories.com/repository/apache-hbase.md

## Description

HBase is a distributed wide-column NoSQL store designed for large, sparse datasets. It is built on top of the Hadoop Distributed File System (HDFS) and provides real-time read and write access to massive volumes of structured and unstructured data.
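To make the sparse, versioned data model concrete, here is a minimal in-memory sketch in plain Java. It is not the HBase client API. The class `SparseTable` and its methods are hypothetical names chosen for illustration, and row keys are simplified to strings (HBase itself works with byte arrays). The model is a sorted map from row key to column (`family:qualifier`) to timestamp to value, so columns that were never written take up no space.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.Optional;
import java.util.TreeMap;

/** Toy in-memory model of a sparse, versioned wide-column table (not the HBase API). */
final class SparseTable {
    // row key -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    /** Stores one version of a cell; only cells that are written occupy memory. */
    void put(String row, String family, String qualifier, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family + ":" + qualifier,
                    c -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns the newest version of a cell, or empty if the cell was never written. */
    Optional<String> get(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return Optional.empty();
        }
        NavigableMap<Long, String> versions = columns.get(family + ":" + qualifier);
        if (versions == null || versions.isEmpty()) {
            return Optional.empty();
        }
        // Timestamps are sorted in descending order, so the first entry is the newest.
        return Optional.of(versions.firstEntry().getValue());
    }

    /** Number of distinct columns stored for a row (versions are not counted). */
    int columnCount(String row) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        return columns == null ? 0 : columns.size();
    }

    public static void main(String[] args) {
        SparseTable table = new SparseTable();
        table.put("user#42", "info", "name", 1L, "Ada");
        table.put("user#42", "info", "name", 2L, "Ada L.");
        System.out.println(table.get("user#42", "info", "name").orElse("<none>"));
        System.out.println(table.get("user#42", "info", "email").orElse("<none>"));
    }
}
```

Nested sorted maps keep rows in key order, which is what makes range scans and the region-based partitioning listed under Tags possible.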

Clients reach the system through its native remote procedure call (RPC) interface or through REST and Thrift gateways, which makes it accessible from multiple programming languages. A master-worker coordination model, in which master nodes manage cluster metadata and region assignments for the region servers, enables horizontal scaling and fault tolerance across a cluster.

Capabilities include fine-grained access control via cell-level visibility labels, pluggable block compression, and server-side data aggregation. HBase also integrates with MapReduce for big data analytics workflows and can execute custom server-side logic.

For operational monitoring, it tracks system metrics and exports them through plugins.

## Tags

### Data & Databases

- [Columnar Databases](https://awesome-repositories.com/f/data-databases/data-warehousing/hadoop/columnar-databases.md) — Implements a distributed NoSQL wide-column store built on top of the Hadoop ecosystem for sparse datasets.
- [Column Family Management](https://awesome-repositories.com/f/data-databases/column-family-management.md) — Organizes sparse data into grouped column families for efficient distributed storage and retrieval.
- [Hadoop](https://awesome-repositories.com/f/data-databases/data-warehousing/hadoop.md) — Integrates with the Hadoop Distributed File System to provide a columnar store for large-scale data analysis.
- [Distributed File System Backends](https://awesome-repositories.com/f/data-databases/distributed-file-system-backends.md) — Relies on the Hadoop Distributed File System for durable, replicated persistent storage of data files.
- [Sparse Dataset Management](https://awesome-repositories.com/f/data-databases/large-scale-dataset-management/sparse-dataset-management.md) — Provides scalable storage and versioning for massive, sparse, column-oriented datasets across a cluster. ([source](https://github.com/apache/hbase#readme))
- [LSM-Tree Storage Engines](https://awesome-repositories.com/f/data-databases/storage-engines/key-value/log-structured-merge-trees/lsm-tree-key-value-stores/lsm-tree-storage-engines.md) — Utilizes an LSM-tree storage engine to provide high write throughput via in-memory buffering and sorted flushes.
- [Region-Based Partitioning](https://awesome-repositories.com/f/data-databases/volume-based-partitioning/logical-data-partitioning/region-based-partitioning.md) — Implements region-based partitioning by splitting the sorted keyspace into contiguous ranges for horizontal scaling.
- [Wide-Column Stores](https://awesome-repositories.com/f/data-databases/wide-column-stores.md) — Organizes data into column families to provide real-time read and write access to high-scale datasets.
- [Column-Oriented Disk Storage](https://awesome-repositories.com/f/data-databases/wide-column-stores/column-oriented-disk-storage.md) — Organizes sparse datasets into column-oriented disk storage for scalable, versioned data management. ([source](https://github.com/apache/hbase/blob/master/README.md))
- [Big Data Processing](https://awesome-repositories.com/f/data-databases/big-data-processing.md) — Supports big data processing workflows using map-reduce patterns for large-scale data transformation.
- [Cross-Language Data Interfaces](https://awesome-repositories.com/f/data-databases/cross-language-data-interfaces.md) — Provides consistent data interaction interfaces via native RPC, REST, and Thrift APIs for clients in multiple programming languages.
- [MapReduce Processing Engines](https://awesome-repositories.com/f/data-databases/mapreduce-processing-engines.md) — Integrates with MapReduce processing engines to transform and migrate large volumes of data between tables. ([source](https://github.com/apache/hbase/tree/master/hbase-examples))
- [Server-Side Aggregations](https://awesome-repositories.com/f/data-databases/server-side-aggregations.md) — Calculates summaries and statistics directly on the server to minimize data transfer to the client.
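
The LSM-tree entry above mentions in-memory buffering and sorted flushes. The following is a minimal sketch of that write path in plain Java, not HBase's implementation. The class `TinyLsmStore` and the `flushThreshold` parameter are hypothetical, flushed segments stay in memory rather than going to disk, and deletes and compaction are left out.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.NavigableMap;
import java.util.Optional;
import java.util.TreeMap;

/** Toy LSM-style store: a mutable sorted buffer plus immutable sorted segments. */
final class TinyLsmStore {
    private final int flushThreshold; // hypothetical tuning knob: entries per buffer
    private NavigableMap<String, String> memBuffer = new TreeMap<>();
    // Newest segment first, so reads see the most recent write for a key.
    private final Deque<NavigableMap<String, String>> segments = new ArrayDeque<>();

    TinyLsmStore(int flushThreshold) {
        if (flushThreshold < 1) {
            throw new IllegalArgumentException("flushThreshold must be >= 1");
        }
        this.flushThreshold = flushThreshold;
    }

    /** Writes go to the in-memory buffer only, which keeps them cheap. */
    void put(String key, String value) {
        memBuffer.put(key, value);
        if (memBuffer.size() >= flushThreshold) {
            flush();
        }
    }

    /** Freezes the buffer as a sorted segment; a real engine would write it to disk. */
    void flush() {
        if (memBuffer.isEmpty()) {
            return;
        }
        segments.addFirst(Collections.unmodifiableNavigableMap(memBuffer));
        memBuffer = new TreeMap<>();
    }

    /** Checks the buffer first, then segments from newest to oldest. */
    Optional<String> get(String key) {
        String value = memBuffer.get(key);
        if (value != null) {
            return Optional.of(value);
        }
        for (NavigableMap<String, String> segment : segments) {
            value = segment.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    int segmentCount() {
        return segments.size();
    }

    public static void main(String[] args) {
        TinyLsmStore store = new TinyLsmStore(2);
        store.put("a", "1");
        store.put("b", "2"); // reaches the threshold and triggers a flush
        store.put("a", "3"); // newer value shadows the flushed one
        System.out.println(store.get("a").orElse("<none>"));
        System.out.println(store.segmentCount());
    }
}
```

Because segments are never modified after a flush, an update is just a newer entry that shadows an older one. Real engines periodically merge segments to reclaim the shadowed entries.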

### Part of an Awesome List

- [Big Data Storage](https://awesome-repositories.com/f/awesome-lists/data/big-data-storage.md) — Functions as a distributed engine for storing and querying massive volumes of structured and unstructured data.
- [Database Systems](https://awesome-repositories.com/f/awesome-lists/data/database-systems.md) — Distributed big data store modeled after Bigtable.

### DevOps & Infrastructure

- [Distributed Data Stores](https://awesome-repositories.com/f/devops-infrastructure/single-node-deployment/single-process-servers/multi-model-server-architectures/peer-to-peer-storage-models/distributed-data-stores.md) — Provides a cluster-based storage system with horizontal scaling and fault tolerance for scalable data retrieval.
- [Application REST API Gateways](https://awesome-repositories.com/f/devops-infrastructure/rest-api-endpoint-management/application-rest-api-gateways.md) — Exposes database operations and cluster status through a standardized REST API gateway. ([source](https://github.com/apache/hbase/tree/master/hbase-rest))

### Security & Cryptography

- [Cell-Level Controls](https://awesome-repositories.com/f/security-cryptography/granular-access-controls/database-access-controls/table-level-access-controls/cell-level-controls.md) — Enforces fine-grained access control using visibility labels at the individual data cell level.

### Software Engineering & Architecture

- [Master-Worker Coordination](https://awesome-repositories.com/f/software-engineering-architecture/distributed-coordination-systems/task-coordinations/master-worker-coordination.md) — Employs a master-worker coordination model to manage cluster metadata and region assignments.
- [Distributed Storage Clusters](https://awesome-repositories.com/f/software-engineering-architecture/distributed-systems/distributed-data-management/distributed-storage-clusters.md) — Implements a scalable architecture that aggregates multiple nodes into a unified storage system for massive datasets.
- [Distributed File Systems](https://awesome-repositories.com/f/software-engineering-architecture/distributed-systems/distributed-data-management/distributed-storage-clusters/distributed-file-systems.md) — Relies on a distributed file system like HDFS for durable and replicated storage of underlying data files.
- [Remote Procedure Call Protocols](https://awesome-repositories.com/f/software-engineering-architecture/remote-procedure-call-protocols.md) — Implements structured messaging protocols for standardized communication between cluster nodes and clients. ([source](https://github.com/apache/hbase/tree/master/hbase-protocol-shaded))
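
The region assignments mentioned under Master-Worker Coordination can be pictured as a sorted routing table. The sketch below, in plain Java, assumes each region is identified by its inclusive start key and mapped to one server. `RegionDirectory` and the server names are hypothetical, and keys are compared as strings rather than bytes. It illustrates the lookup idea only, not how HBase stores or serves its metadata.

```java
import java.util.TreeMap;

/** Toy routing table: contiguous key ranges ("regions") keyed by their start key. */
final class RegionDirectory {
    // start key (inclusive) -> server name; the first region starts at the empty key.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    RegionDirectory(String initialServer) {
        regionsByStartKey.put("", initialServer);
    }

    /** Splits the region containing splitKey and assigns the upper half to a server. */
    void split(String splitKey, String serverForUpperHalf) {
        if (splitKey.isEmpty()) {
            throw new IllegalArgumentException("split key must be non-empty");
        }
        regionsByStartKey.put(splitKey, serverForUpperHalf);
    }

    /** Finds the server whose region covers rowKey: greatest start key <= rowKey. */
    String locate(String rowKey) {
        // The empty start key is always present, so floorEntry never returns null.
        return regionsByStartKey.floorEntry(rowKey).getValue();
    }

    int regionCount() {
        return regionsByStartKey.size();
    }

    public static void main(String[] args) {
        RegionDirectory directory = new RegionDirectory("server-1");
        directory.split("m", "server-2");
        System.out.println(directory.locate("apple"));
        System.out.println(directory.locate("zebra"));
    }
}
```

Since the keyspace is sorted, adding capacity amounts to splitting a range and pointing the new start key at another server. Lookups remain a single floor search.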

### Networking & Communication

- [Thrift RPC Servers](https://awesome-repositories.com/f/networking-communication/apache-thrift-integrations/thrift-rpc-servers.md) — Ships a dedicated Thrift server to enable cross-language connectivity for database operations. ([source](https://github.com/apache/hbase/tree/master/hbase-examples))
- [Cross-Language Service Gateways](https://awesome-repositories.com/f/networking-communication/cross-language-service-gateways.md) — Acts as an entry point that translates REST, Thrift, and RPC requests into internal database protocols.
- [Storage Block Compression](https://awesome-repositories.com/f/networking-communication/data-compression/storage-block-compression.md) — Applies pluggable block compression to reduce the physical storage footprint of datasets on disk. ([source](https://github.com/apache/hbase/blob/master/AGENTS.md))
- [Multi-Protocol Communication Bridges](https://awesome-repositories.com/f/networking-communication/multi-protocol-communication-bridges.md) — Provides a multi-protocol gateway allowing clients to connect via RPC, HTTP, and Thrift. ([source](https://github.com/apache/hbase/blob/master/AGENTS.md))
- [Remote Procedure Calls](https://awesome-repositories.com/f/networking-communication/remote-procedure-calls.md) — Uses remote procedure calls for low-latency communication between clients, master nodes, and region servers.
