Overview

Use HBase when you need random, realtime read/write access to your Big Data.
HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable, as described in the paper "Bigtable: A Distributed Storage System for Structured Data".

Features

Linear and modular scalability;
Strictly consistent reads and writes;
Automatic and configurable sharding of tables;
Automatic failover support between RegionServers;
Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables;
Easy-to-use Java API for client access;
Block cache and Bloom filters for real-time queries (high-volume query optimization);
Thrift gateway and a RESTful web service that supports XML, Protobuf, and binary data encoding options;
Extensible JRuby-based (JIRB) shell;
Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.
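
To make the Bloom filter item above more concrete, here is a minimal sketch in Python. It is a toy, not HBase's implementation: the class name `BloomFilter`, its parameters, and the salted SHA-256 hashing are all assumptions chosen for illustration. The idea is that a filter kept alongside a file can answer "this row key is definitely not here" without reading the file.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical usage: one filter per file, keyed by row key.
store_file_filter = BloomFilter()
for row_key in ["com.cnn.www", "com.example.www"]:
    store_file_filter.add(row_key)

print(store_file_filter.might_contain("com.cnn.www"))  # True: added keys always match
```

A lookup for a key that was never added usually returns False, which lets a reader skip the file entirely. It can occasionally return True (a false positive), in which case the file is read and the key is simply not found.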

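Conceptually, a table can be viewed as a sparse set of rows, but physically it is stored by column family. A new column qualifier (column_family:column_qualifier) can be added to an existing column family at any time.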

Table 5. ColumnFamily `anchor`
Row Key	Time Stamp	Column Family `anchor`
"com.cnn.www"	t9	`anchor:cnnsi.com = "CNN"`
"com.cnn.www"	t8	`anchor:my.look.ca = "CNN.com"`
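
The layout in the table above can be modeled with nested dictionaries. The following Python sketch is only an illustration of the idea (storage grouped by column family, sparse rows, versioned cells). `ToyTable` and its methods are hypothetical names and are not part of any HBase API. The integer timestamps 9 and 8 stand in for t9 and t8.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: cells are stored per column family, and only written cells exist."""

    def __init__(self, families):
        # family -> row key -> qualifier -> {timestamp: value}
        self.families = {f: defaultdict(lambda: defaultdict(dict)) for f in families}

    def put(self, row, column, timestamp, value):
        family, qualifier = column.split(":", 1)
        if family not in self.families:
            # Families are fixed up front; qualifiers are created on first write.
            raise KeyError(f"unknown column family: {family}")
        self.families[family][row][qualifier][timestamp] = value

    def get(self, row, column):
        """Return the newest value of a cell, or None if it was never written."""
        family, qualifier = column.split(":", 1)
        versions = self.families[family].get(row, {}).get(qualifier)
        if not versions:
            return None
        return versions[max(versions)]


table = ToyTable(["anchor"])
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
table.put("com.cnn.www", "anchor:my.look.ca", 8, "CNN.com")

print(table.get("com.cnn.www", "anchor:cnnsi.com"))  # CNN
print(table.get("com.cnn.www", "anchor:missing"))    # None
```

Note that writing to a new qualifier needs no schema change, while writing to an undeclared column family fails. A cell that was never written takes up no space at all, which is what "sparse" means here.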

"NoSQL" is a general term meaning that the database isn’t an RDBMS which supports SQL as its primary access language.
There are many types of NoSQL databases: BerkeleyDB is an example of a local NoSQL database, whereas HBase is very much a distributed database. Technically speaking, HBase is really more a "Data Store" than a "Data Base" because it lacks many of the features you find in an RDBMS, such as typed columns, secondary indexes, triggers, and advanced query languages.

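HBase is not suitable for every use case.
First, make sure you have enough data. Otherwise all of your data may end up on a single machine while the other nodes sit idle.
Second, make sure you can live without all the extra features that an RDBMS provides (e.g., typed columns, secondary indexes, transactions, advanced query languages, etc.).
Third, make sure you have enough hardware.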

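HDFS is well suited to storing large files, but it does not provide fast lookups of individual records.
HBase, on the other hand, is built on top of HDFS and provides fast record lookups and updates for large tables.
This can be confusing at first. In practice, HBase provides fast lookups by putting the data in indexed "StoreFiles" that are stored on HDFS. For a more detailed explanation, see the Data Model section of this introduction.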
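
Why an index makes single-record lookups fast can be shown with a small Python sketch. This is not the real StoreFile format. The block size, the helper names `build_blocks` and `lookup`, and the sample row keys are assumptions made up for the example. The point is only that sorted data plus a small index of first keys lets a reader jump to one block instead of scanning the whole file.

```python
import bisect


def build_blocks(sorted_rows, block_size=2):
    """Split sorted (row_key, value) pairs into blocks plus an index of first keys."""
    blocks = [sorted_rows[i:i + block_size] for i in range(0, len(sorted_rows), block_size)]
    index = [block[0][0] for block in blocks]
    return blocks, index


def lookup(blocks, index, row_key):
    """Find one row by reading a single block instead of the whole file."""
    # Binary search for the last block whose first key is <= row_key.
    pos = bisect.bisect_right(index, row_key) - 1
    if pos < 0:
        return None  # row_key sorts before the first block
    for key, value in blocks[pos]:
        if key == row_key:
            return value
    return None


rows = sorted({"row-a": 1, "row-c": 3, "row-e": 5, "row-g": 7, "row-i": 9}.items())
blocks, index = build_blocks(rows)

print(index)                           # ['row-a', 'row-e', 'row-i']
print(lookup(blocks, index, "row-g"))  # 7
print(lookup(blocks, index, "row-b"))  # None
```

The index stays small even when the data is large, so it can be searched in memory, and each lookup touches only one block of the underlying file.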

The catalog table `hbase:meta` exists as an HBase table. It is filtered out of the HBase shell's list command, but in essence it is a table like any other.
