Tuesday 21 April 2015

Procedure to add a new datanode to a Hadoop cluster

In order to add a new datanode to a Hadoop cluster, first make sure the following prerequisites are met:


1)  Passwordless SSH login is enabled from the master to the new datanode.

2)  Name resolution (DNS) works for the hostname of the new datanode.
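
Both prerequisites can be checked from the master before touching the cluster. The snippet below is only a rough sketch: the function names and the example hostname are made up for illustration, and it assumes that `getent` and an OpenSSH client are available on the master.

```shell
# Succeeds only if the name resolves through the system resolver
# (DNS or /etc/hosts).
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Succeeds only if key-based login works. BatchMode makes ssh fail
# instead of prompting for a password.
passwordless_ok() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

# Example usage (datanode05.example.com is a hypothetical hostname):
#   resolves datanode05.example.com && passwordless_ok datanode05.example.com \
#       && echo "prerequisites OK"
```

If either check fails, fix SSH keys or name resolution first; otherwise the later steps tend to fail in less obvious ways.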


Step 1: First, add the hostname of the new datanode to the "$HADOOP_PREFIX/conf/slaves" file on the master.

Step 2: Copy the configuration from the Hadoop master to the new datanode. The simplest option is to rsync the "$HADOOP_PREFIX/conf/" directory from the master to the new slave.
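
Steps 1 and 2 could be scripted roughly as follows. This is a sketch, not a tested deployment script: `add_slave` and `push_conf` are hypothetical helper names, and it assumes that `$HADOOP_PREFIX` points to the same path on the master and on the new datanode.

```shell
# Append a hostname to a slaves file only if it is not already listed,
# so running the helper twice does not create duplicate entries.
add_slave() {
    slaves_file="$1"
    host="$2"
    grep -qxF "$host" "$slaves_file" 2>/dev/null \
        || printf '%s\n' "$host" >> "$slaves_file"
}

# Copy the master's conf directory to the new node.
# Assumption: HADOOP_PREFIX is the same path on both machines.
push_conf() {
    rsync -av "$HADOOP_PREFIX/conf/" "$1:$HADOOP_PREFIX/conf/"
}

# Example usage on the master (hostname is hypothetical):
#   add_slave "$HADOOP_PREFIX/conf/slaves" datanode05.example.com
#   push_conf datanode05.example.com
```

The trailing slashes on both rsync paths matter: they copy the contents of conf/ into conf/ rather than nesting one directory inside the other.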

Step 3: Now run the command below on the new datanode.

hadoop-daemon.sh start datanode

This will start the DataNode daemon.

Step 4: Now start the TaskTracker on the new datanode as below:
hadoop-daemon.sh start tasktracker

Now, go to the master node and refresh the node lists:
hadoop mradmin -refreshNodes   --> This makes the JobTracker re-read its hosts information.

hadoop dfsadmin -refreshNodes   --> This makes the NameNode re-read its hosts and exclude files.

Now we have added a new datanode to the cluster without any interruption.

Finally, we have to run the balancer so that existing data is redistributed across the cluster, including the new datanode.
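
The balancer command itself is not shown in this post. On a stock Hadoop 1.x install it is normally started with `hadoop balancer` (or the `start-balancer.sh` wrapper script). The helper below is just an illustrative sketch: `run_balancer` and the `DRY_RUN` switch are made-up names, and the threshold of 10 is Hadoop's documented default percentage, not a value taken from this cluster.

```shell
# Start the HDFS balancer with a given threshold (percent of disk capacity).
# With DRY_RUN=1 the command is only printed, not executed.
run_balancer() {
    threshold="${1:-10}"
    set -- hadoop balancer -threshold "$threshold"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example usage on the master:
#   DRY_RUN=1 run_balancer 5    # show the command only
#   run_balancer                # run with the default threshold of 10
```

A lower threshold spreads the data more evenly but makes the balancer run longer, so it is probably worth starting with the default.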

Kool :) 
