Monday 28 September 2015

Hadoop 2: Setting up a Hadoop cluster with High Availability (HA)

In this post, I will explain the steps to install and set up a Hadoop 2 cluster with HA.

In Hadoop version 1, one of the major drawbacks was that the NameNode was a single point of failure.

Each cluster had a single NameNode, and if that machine or process became unavailable, the entire cluster would be unavailable until the NameNode was either restarted or brought up on a separate machine.

This situation impacted the total availability of the HDFS cluster in two major ways:

1) In the case of an unplanned event such as a machine crash, the cluster would be unavailable until an operator restarted the NameNode.

2) Planned maintenance events such as software or hardware upgrades on the NameNode machine would result in downtime for the entire cluster.

The HDFS HA feature addresses these problems. It lets you run redundant NameNodes in the same cluster in an Active/Passive configuration with a hot standby, which facilitates either a fast failover to the standby NameNode in the event of a machine crash, or a graceful administrator-initiated failover during planned maintenance.

In an HA scenario, there will be one Active node and one Standby node. In order for the Standby node to keep its state synchronized with the Active node, both nodes communicate with a group of separate daemons called JournalNodes (JNs). When the Active node performs any namespace modification, it durably logs a record of the modification to a majority of these JNs. The Standby node continuously watches the JNs for changes to the edit log and, as it observes new edits, applies them to its own namespace. When using the Quorum Journal Manager (QJM), the JournalNodes act as the shared edit log storage. In a failover event, the Standby ensures that it has read all of the edits from the JournalNodes before promoting itself to the Active state. (This mechanism ensures that the namespace state is fully synchronized before a failover completes.)


SCENARIO:

In the lab setup, I have 3 machines. Their hostnames, IP addresses and the services that will run on each machine are listed below:

------
192.168.151.221-->hadoop-master-test ( Master node: NameNode, JournalNode, QuorumPeerMain, ResourceManager, DFSZKFailoverController )


192.168.151.222-->hadoop-secondary-test ( Standby Node: NameNode, JournalNode, QuorumPeerMain, DFSZKFailoverController )

192.168.151.223-->data-node-1-test( Data Node, JournalNode, QuorumPeerMain )
------

Software packages required:
----------
Hadoop 2.6.0, zookeeper-3.4.6 and Java 1.7.0_65
----------

PREREQUISITES:

Step 1: Install the JDK and configure passwordless SSH login between the master and all nodes.

Refer to the links below for details:

---------
http://www.maninmanoj.com/2015/03/installing-java-7-jdk-7u75-on-linux.html
http://www.maninmanoj.com/2013/08/how-to-perform-ssh-login-without.html
---------

Step 2: Make sure forward and reverse lookups are working fine. If not, add the details to the "/etc/hosts" file on all nodes so that local name resolution works, as shown below.
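
For example, if DNS is not available, the entries below (taken from the scenario section above) can be appended to /etc/hosts on every node; adjust them to your own IPs and hostnames:
---------
cat >> /etc/hosts <<'EOF'
192.168.151.221   hadoop-master-test
192.168.151.222   hadoop-secondary-test
192.168.151.223   data-node-1-test
EOF
---------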


IMPLEMENTATION:

Once the prerequisites are met, let us start the Hadoop installation.

STEP 1: Download the desired Hadoop package from the link below:
-----
http://apache.claz.org/hadoop/common/
-----

-----
cd /usr/local/
wget http://apache.claz.org/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar -xvzf hadoop-2.6.0.tar.gz
-----

Now, rename the extracted folder as below:
----
mv hadoop-2.6.0 hadoop
-----

Once done, edit ~/.bashrc and enter the below variables:
-----
export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
export HADOOP_PREFIX="/usr/local/hadoop"
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export PATH=$PATH:/usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
-----

Then run the command below to make the changes take effect.
-----
exec bash
-----
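
To confirm that the new environment variables are picked up, a quick sanity check (the exact version strings will depend on your build):
-----
hadoop version
java -version
-----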

After this step, we shall move to HA configuration.

STEP 2: Download and install ZooKeeper.

Download and extract the package, then rename the extracted folder to zookeeper.
--------
cd /usr/local/
wget http://www.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
tar -xvzf zookeeper-3.4.6.tar.gz
mv zookeeper-3.4.6 zookeeper
--------

STEP 3: Edit the ZooKeeper configuration file:

-----------
Configuration file:  /usr/local/zookeeper/conf/zoo.cfg
Binary executables: /usr/local/zookeeper/bin/
-----------

Edit "zoo.cfg" file as below:

--------
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/home/root/zookeeper/data/
dataLogDir=/home/root/zookeeper/logs/
server.1=hadoop-master-test:2888:3888
server.2=hadoop-secondary-test:2889:3889
server.3=data-node-1-test:2890:3890
--------

STEP 4: Create a "myid" file at "/home/root/zookeeper/data/myid" on each node and assign it that node's ID, matching the server.N entries in zoo.cfg.

In the master node:
-----------
[root@hadoop-master-test tmp]# cat /home/root/zookeeper/data/myid
1
-----------

In the secondary node:
-----------
[root@hadoop-secondary-test ~]# cat /home/root/zookeeper/data/myid
2
-----------

In the data node:
-----------
[root@data-node-1-test ~]# cat /home/root/zookeeper/data/myid
3
-----------
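
For example, the myid file can be created in one step on each node, substituting that node's ID (1, 2 or 3):
-----------
mkdir -p /home/root/zookeeper/data/
echo 1 > /home/root/zookeeper/data/myid
-----------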

STEP 5: Hadoop configuration and HA settings.

Edit the file "/usr/local/hadoop/etc/hadoop/hadoop-env.sh" and set the environment variables.
------------
export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
export HADOOP_COMMON_LIB_NATIVE_DIR="/usr/local/hadoop/lib/native/"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/hadoop/lib/native/"
-------------

STEP 6: Edit the "core-site.xml" file and add the following lines, within the <configuration> tag, to configure the default FS, the JournalNode edits directory and the Hadoop temp directory.

--------------
<configuration>

<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>

<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/root/journal/node/local/data</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/home/root/tmp</value>
</property>

</configuration>
--------------

fs.defaultFS:

The default path prefix used by the Hadoop FS client. Optionally, you may now configure the default path for Hadoop clients to use the new HA-enabled logical URI. For example, if "mycluster" is the nameservice ID, it will be used as the authority portion of all of your HDFS paths.

dfs.journalnode.edits.dir:

This is the absolute path on the JournalNode machines where the edits and other
local state (used by the JNs) will be stored. You may only use a single path for this
configuration. Redundancy for this data is provided by either running multiple
separate JournalNodes or by configuring this directory on a locally-attached RAID
array.

STEP 7:

Add the following lines to the hdfs-site.xml file, within the <configuration> tag, to configure the DFS nameservice, the NameNodes of the cluster, high availability, ZooKeeper and failover.

Adjust the hostnames, ports and local paths below to match your environment.
---------------
<configuration>

<property>
<name>dfs.data.dir</name>
<value>/data/hadoop/data</value>
<final>true</final>
</property>

 <property>
 <name>dfs.name.dir</name>
 <value>/data/hadoop/namenode</value>
 <final>true</final>
 </property>

<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
<final>true</final>

</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
<final>true</final>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop-master-test:8020</value>
</property>

<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop-secondary-test:8020</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop-master-test:50070</value>
</property>

<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop-secondary-test:50070</value>

</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-master-test:8485;data-node-1-test:8485;hadoop-secondary-test:8485/mycluster</value>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-master-test:2181,hadoop-secondary-test:2181,data-node-1-test:2181</value>
</property>

<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>

<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/root/.ssh/id_rsa</value>
</property>

<property>
<name>dfs.replication</name>
<value>3</value>
</property>

<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>3000</value>
</property>

<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

</configuration>
--------------

STEP 8: List the slave (data) node in the "slaves" file.

---------------
[root@hadoop-master-test namenode]# cat  /usr/local/hadoop/etc/hadoop/slaves
data-node-1-test
---------------

STEP 9: Edit the "yarn-site.xml" file and add the following lines, within the <configuration> tag, to enable the MapReduce shuffle service for YARN.

----------------
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

</configuration>
-------------------
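
If you also want MapReduce jobs to run on YARN, you would typically set mapreduce.framework.name in mapred-site.xml as well. This is not strictly required for the HA setup itself; a minimal sketch, written as a shell heredoc:
-------------------
cat > /usr/local/hadoop/etc/hadoop/mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
-------------------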

STEP 10:

10.1: Create the folder structure for the JournalNode data as defined in core-site.xml. Repeat the following step on all the cluster nodes.

--------
mkdir -p /home/root/journal/node/local/data
--------

10.2: Create the temp folder for the Hadoop cluster as defined in core-site.xml. Repeat the following step on all the cluster nodes.

-----------
mkdir -p /home/root/tmp
-----------

10.3: Create the folder structure for ZooKeeper data and logs as defined in zoo.cfg. Repeat the following step on all the nodes in the cluster.

-------------
mkdir -p /home/root/zookeeper/data/

mkdir -p /home/root/zookeeper/logs/
-------------

STEP 11: Copy these configuration files to all nodes in the cluster.

11.1: Start the ZooKeeper service on all the cluster nodes running ZooKeeper. Repeat the step below on each of those nodes.

-----------------
/usr/local/zookeeper/bin/zkServer.sh start
------------------
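
Once ZooKeeper is started on all three nodes, you can confirm the quorum has formed: one node should report Mode: leader and the others Mode: follower.
-----------------
/usr/local/zookeeper/bin/zkServer.sh status
------------------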

STEP 12: Start the JournalNode on all the cluster nodes.

Run the following command on all JournalNodes in the cluster:
----------------
$hadoop-daemon.sh start journalnode
-------------
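
The formatting steps below need the JournalNodes up, so it is worth confirming the daemon is running on each node (jps ships with the JDK):
----------------
jps | grep JournalNode
-------------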

STEP 13: Initialize the HA state in ZooKeeper from hadoop-master-test

---------
[root@hadoop-master-test~]$hdfs zkfc -formatZK
---------

STEP 14: Format the NameNode on hadoop-master-test

--------
[root@hadoop-master-test~]$hdfs namenode -format
--------

STEP 15: Start the NameNode service on the master node.

---------
$hadoop-daemon.sh start namenode
---------

STEP 16: Bootstrap the Standby NameNode with the metadata from the Active NameNode.

---------
[root@hadoop-secondary-test~]$hdfs namenode -bootstrapStandby
----------
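
After the bootstrap completes, the NameNode daemon on the standby can be started the same way as on the master; start-all.sh in the next step will also bring it up, so this is optional:
---------
$hadoop-daemon.sh start namenode
----------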

STEP 17: Stop and start the Hadoop services.

------------
$cd /usr/local/hadoop/sbin/

./stop-all.sh

and start again.

./start-all.sh
--------------
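
At this point, the daemons running on each node should match the scenario section. A quick way to verify is jps on every node, plus the NameNode web UIs on the port configured in hdfs-site.xml:
--------------
# Run on each node; the daemons should match the scenario section
jps

# NameNode web UIs; open these in a browser to see which NameNode is
# Active and which is Standby
curl -sI http://hadoop-master-test:50070
curl -sI http://hadoop-secondary-test:50070
--------------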

Screenshots (not reproduced here): the services running on the Master node, the Standby node and the DataNode, and the status of the Active and Standby NameNodes via the web UI.

I will explain the command line options available for HA in my next post.

Keep reading :)
