Introduction
HBase is a column-oriented database that runs on top of HDFS. It is modeled on Google’s Bigtable.
In this post, I’m going to install HBase in pseudo-distributed mode, where all of the daemons run on a single machine, so please use these instructions for setting up a developer’s workstation, not for a production cluster.
When should you use HBase
HBase should be used when you need random read/write access to the data in Hadoop. While HBase gives you random seeks, it does so at the expense of sequential read throughput compared with reading files directly from HDFS. Therefore, it is important to look at your workload and pick the correct solution for your specific requirements.
Install ZooKeeper
HBase depends on ZooKeeper, so install ZooKeeper before installing HBase.
Install Prerequisites
sudo apt-get install ntp libopts25
Installation
sudo apt-get install hbase
Let’s see what files were installed. I have written an HBase Files and Directories post that contains more information about what’s installed with the hbase package.
dpkg -L hbase | less
Now install the HBase Master package.
sudo apt-get install hbase-master
Next, we’ll stop the HBase Master so that it can be reconfigured.
sudo service hbase-master stop
Configure HBase to run in pseudo-distributed mode
Let’s check the hostname and port used by the HDFS Name Node.
grep -A 1 fs.default.name /etc/hadoop/conf.pseudo/core-site.xml | grep value
You should see the following output:
<value>hdfs://localhost:8020</value>
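If you'd rather capture that value in a variable than read it off the screen, here is a minimal sketch. The function name hadoop_conf_value is made up for this example, and it assumes the <value> tag sits on the line directly after the <name> tag, which is the same assumption the grep -A 1 command makes.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style XML file.
# hadoop_conf_value is a hypothetical helper, not part of Hadoop or HBase.
# Assumes <value> is on the line right after <name>.
hadoop_conf_value() {
    conf_file="$1"
    prop_name="$2"
    grep -A 1 "<name>${prop_name}</name>" "$conf_file" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Example:
#   hadoop_conf_value /etc/hadoop/conf.pseudo/core-site.xml fs.default.name
```

The printed URI is the one that hbase.rootdir needs to start with in the next step.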
Change to the HBase configuration directory and open hbase-site.xml for editing.
cd /etc/hbase/conf; ls -l
sudo vi hbase-site.xml
Paste the following into hbase-site.xml, between <configuration> and </configuration>.
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
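To see how the finished file fits together, the sketch below writes a complete hbase-site.xml to a scratch location so you can compare it with your edited copy. The function name write_hbase_site and the /tmp default path are assumptions for this example; only the two properties come from the steps above.

```shell
# Write a complete hbase-site.xml to a scratch path for review.
# write_hbase_site and the default /tmp path are hypothetical; copy the
# result into /etc/hbase/conf yourself once it looks right.
write_hbase_site() {
    target="${1:-/tmp/hbase-site.xml}"
    namenode_uri="${2:-hdfs://localhost:8020}"
    cat > "$target" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>${namenode_uri}/hbase</value>
  </property>
</configuration>
EOF
}

# Example:
#   write_hbase_site /tmp/hbase-site.xml hdfs://localhost:8020
#   diff /tmp/hbase-site.xml /etc/hbase/conf/hbase-site.xml
```

The second argument should match the Name Node URI you found with grep earlier.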
Add the /hbase directory to HDFS
Important
The following commands assume that you’ve followed the instructions in my post on how to Create a .bash_aliases file.
shmkdir /hbase
shchown hbase /hbase
Let’s check that the /hbase directory was created correctly in HDFS.
hls /
You should see output that includes a line for the /hbase directory.
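If you haven't set up the aliases, the sketch below shows roughly what I'd expect them to expand to. This is an assumption: the alias definitions live in the other post and aren't reproduced here, so treat the hadoop fs commands as stand-ins rather than the exact definitions. The run helper and the DRY_RUN variable are made up for this example so that you can preview the commands first.

```shell
# Hypothetical stand-ins for the shmkdir, shchown, and hls aliases.
# Set DRY_RUN=1 to print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_hbase_root() {
    # The write operations run as the hdfs user, assumed here to be the
    # HDFS superuser on this installation.
    run sudo -u hdfs hadoop fs -mkdir /hbase
    run sudo -u hdfs hadoop fs -chown hbase /hbase
    # List the HDFS root to confirm that /hbase is there.
    run hadoop fs -ls /
}

# Example (preview only):
#   DRY_RUN=1; create_hbase_root
```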
Start the HBase Master
sudo service hbase-master start
Install an HBase Region Server
The HBase Region Server is started automatically when you install it on Ubuntu.
sudo apt-get install hbase-regionserver
Check that HBase Is Set Up Correctly
sudo /usr/lib/jvm/jdk1.6.0_31/bin/jps
You should see output similar to the following (look for QuorumPeerMain, NameNode, DataNode, HRegionServer, and HMaster):
1942 SecondaryNameNode
12783 QuorumPeerMain
1747 NameNode
1171 DataNode
15034 HRegionServer
14755 HMaster
2396 NodeManager
2497 ResourceManager
2152 JobHistoryServer
15441 Jps
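Rather than scanning the list by eye, you can pipe the jps output through a small check. This is just a sketch; the function name check_hbase_daemons is made up for this example, and the daemon list is the one given above.

```shell
# Read jps output on stdin and report which expected daemons are missing.
# check_hbase_daemons is a hypothetical helper name.
# Returns 0 when all five daemons are present, 1 otherwise.
check_hbase_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in QuorumPeerMain NameNode DataNode HRegionServer HMaster; do
        # Compare against the whole second field, so that
        # SecondaryNameNode does not count as NameNode.
        if ! printf '%s\n' "$jps_output" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all expected daemons are running"
}

# Example:
#   sudo /usr/lib/jvm/jdk1.6.0_31/bin/jps | check_hbase_daemons
```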
Open http://localhost:60010 in a web browser to verify that the HBase Master was installed correctly.
If everything installed correctly then you should see the following:
- In the Region Servers section, there should be one line for localhost.
- In the Attributes section, you should see HBase Version = 0.92.1-cdh4.0.0.
Add the JDK 1.6.0 u31 Path to BigTop
This update is required because the BigTop script looks for JAVA_HOME by checking a fixed list of candidate directories, so the JDK path has to be added to that list.
sudo vi /usr/lib/bigtop-utils/bigtop-detect-javahome
Add the following line just below the for candidate in \ line:
/usr/lib/jvm/jdk1.6.0_31 \
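To illustrate why the extra line matters, here is a simplified sketch of first-match detection over a fixed candidate list. It is not the actual bigtop-detect-javahome script, which differs in detail; the function name detect_java_home and the second example path are assumptions.

```shell
# Simplified illustration of first-match JAVA_HOME detection.
# Not the real BigTop script; detect_java_home is a hypothetical name.
detect_java_home() {
    for candidate in "$@"; do
        # Accept the first directory that contains an executable bin/java.
        if [ -x "$candidate/bin/java" ]; then
            echo "$candidate"
            return 0
        fi
    done
    # Nothing on the list matched, so no JAVA_HOME can be reported.
    return 1
}

# Example (the second path is only a placeholder):
#   detect_java_home /usr/lib/jvm/jdk1.6.0_31 /usr/lib/jvm/default-java
```

A JDK installed in a directory that is not on the list is never found, which is why the path has to be added by hand.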
Update the hosts file
It’s likely that you’ll get an error due to the localhost loopback.
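The exact change isn't spelled out here, so the following is only a diagnostic sketch under an assumption: that the problem is the 127.0.1.1 entry that Ubuntu writes into /etc/hosts for the machine's own hostname, which HBase versions of this era are known to have trouble with. The function name find_loopback_alias is made up for this example, and it only reads the file.

```shell
# Print hosts-file lines that map a name to 127.0.1.1.
# Assumption: this entry is the loopback problem referred to above.
# find_loopback_alias is a hypothetical helper; it changes nothing.
find_loopback_alias() {
    hosts_file="${1:-/etc/hosts}"
    grep -E '^[[:space:]]*127\.0\.1\.1[[:space:]]' "$hosts_file"
}

# Example:
#   find_loopback_alias /etc/hosts
```

If a line is printed, that is the entry to look at when HBase complains about the loopback address.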
That’s it. You now have HBase installed and ready for use on a developer’s workstation/laptop.
Additional Reading
There are some additional configuration options for HBase, including:
