The advantages of using HBase are as follows:
- HBase is built on top of HDFS, a distributed file system. This lets HBase store very large amounts of data and serve analytics workloads quickly. Because it runs on commodity hardware, HBase is a cost-effective option when the data size is in the petabyte range.
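- HBase has a flexible schema: only column families are declared when a table is created, and columns within a family can be added dynamically at write time. Column families themselves can also be added, altered or deleted later by modifying the table.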
- HBase provides consistent reads and writes with random access to individual rows, unlike HDFS, which is designed for sequential access to large files.
- Key-based lookups are fast, so reading and processing very large data sets can take less time in HBase than in traditional databases.
- Features such as Bloom filters (which test whether an element may be present in a set of data) and the block cache (which keeps frequently and recently read data in memory), both of which come from Google Bigtable, can be used for query optimisation.
- New nodes can be added as the application grows in size.
The disadvantages of using HBase are as follows:
- HBase is quite resource-intensive, as it provides fast random lookups on top of HDFS.
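- HBase does not enable authentication or authorisation by default, so out of the box it allows read and write access to everyone on every table. Security features such as Kerberos authentication and access control have to be configured explicitly.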
- HBase sorts and indexes each table by a single row key only, whereas RDBMSes allow multiple keys and secondary indexes on a table.
- HBase has a single point of failure: if the active HMaster fails, it may take a while for another HMaster to take over. For an always-available system, Cassandra, which has a masterless architecture, may be a better choice.
- HBase does not have its own query language for accessing data in the datastore. To support SQL-like querying, it needs to be integrated with other technologies, such as Hive, which can add latency. Cassandra avoids this limitation with its own query language (CQL), which resembles SQL and is easy to pick up.
- The query model of HBase is based on key-value pairs; it offers only limited built-in support for filtering, aggregate functions, comparisons and similar operations. In contrast, the expressive query language of MongoDB provides powerful query operators, which can handle advanced analytics workloads.
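
Several of the points above (a single sorted row key, columns added on the fly, Bloom filters, and the key-value query model) can be illustrated with a small in-memory sketch. This is a toy model written for illustration only, not HBase's client API: the names `BloomFilter`, `ToyTable`, `scan` and `find_by_column`, the sample row keys, and the sizing parameters are all assumptions.

```python
import bisect
import hashlib


class BloomFilter:
    """Probabilistic set: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class ToyTable:
    """Hypothetical table whose rows are sorted by a single row key."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> {column: value}
        self._bloom = BloomFilter()

    def put(self, row_key, columns):
        # Columns are not declared up front; each row can carry its own set.
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._bloom.add(row_key)
        self._rows.setdefault(row_key, {}).update(columns)

    def get(self, row_key):
        # The Bloom filter lets us skip the lookup for keys that are
        # definitely absent.
        if not self._bloom.might_contain(row_key):
            return None
        return self._rows.get(row_key)

    def scan(self, start_key, stop_key):
        # Range scan over [start_key, stop_key); cheap because keys are sorted.
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]

    def find_by_column(self, column, value):
        # No secondary index exists, so every row has to be examined.
        return [k for k in self._keys if self._rows[k].get(column) == value]


table = ToyTable()
table.put("user#001", {"info:city": "Pune"})
table.put("user#002", {"info:city": "Delhi"})
table.put("user#010", {"info:city": "Pune", "info:age": "31"})

by_key = table.get("user#002")
in_range = [k for k, _ in table.scan("user#001", "user#010")]
by_city = table.find_by_column("info:city", "Pune")
```

In this sketch, `get` and `scan` stay cheap because they follow the row key order, while `find_by_column` has to walk every row. That is the practical cost of having a single row key and no secondary indexes, and it is why row key design matters so much in stores of this kind.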