To add a new datanode to a running Hadoop cluster, follow the steps below:
Prerequisites:
1) Make sure passwordless SSH login is enabled from the master to the new datanode.
2) Name resolution (DNS) is working for the hostname of the new datanode.
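Both prerequisites can be checked from the master before touching the cluster. The sketch below is only one way to do it: the function names and the example hostname are made up for illustration, and it assumes `ssh` and `getent` are available on the master.

```shell
# Hypothetical helpers, not part of Hadoop.

# Succeeds only if key-based SSH works without asking for a password.
# BatchMode=yes makes ssh fail instead of prompting.
check_passwordless_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Succeeds only if the hostname resolves (via DNS or /etc/hosts).
check_name_resolution() {
    getent hosts "$1" > /dev/null
}

# Example, run on the master (placeholder hostname):
#   NEW_NODE=datanode5.example.com
#   check_passwordless_ssh "$NEW_NODE" && check_name_resolution "$NEW_NODE" \
#       && echo "prerequisites OK"
```

If either check fails, fix it first; the later steps rely on both.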
PROCEDURE:
Step 1: First add the hostname of the new datanode to the "$HADOOP_PREFIX/conf/slaves" file on the master.
Step 2: Copy the configuration from the Hadoop master to the new datanode. The best option is to rsync the "$HADOOP_PREFIX/conf/" directory from the master to the new slave.
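Steps 1 and 2 could be scripted roughly as follows. This is a sketch, not the only way: `add_slave` and `sync_conf` are hypothetical names, and `sync_conf` assumes that `$HADOOP_PREFIX` points to the same path on the master and on the new datanode.

```shell
# Append a hostname to a slaves file only if it is not already listed,
# so running the script twice does not create duplicate entries.
add_slave() {
    host="$1"
    slaves_file="$2"
    # -x: match the whole line, -F: treat the hostname as a literal string
    grep -qxF -- "$host" "$slaves_file" 2>/dev/null || echo "$host" >> "$slaves_file"
}

# Copy the master's conf directory to the same location on the new node.
# Assumption: HADOOP_PREFIX is identical on both machines.
sync_conf() {
    rsync -av "$HADOOP_PREFIX/conf/" "$1:$HADOOP_PREFIX/conf/"
}

# Example, run on the master (placeholder hostname):
#   add_slave datanode5.example.com "$HADOOP_PREFIX/conf/slaves"
#   sync_conf datanode5.example.com
```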
Step 3: Now run the command below on the new datanode.
-------
hadoop-daemon.sh start datanode
-------
This will start the DataNode daemon.
Step 4: Now start the TaskTracker on the new datanode as below:
------
hadoop-daemon.sh start tasktracker
------
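Before moving on, it may be worth confirming that both daemons actually came up. One possible check is to look for them in the output of `jps`, which prints one "pid ClassName" pair per line. The helper name `has_daemon` is invented for this sketch.

```shell
# Succeeds if the given daemon name appears as the class name
# (second column) in jps-style output read from stdin.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Example, run on the new datanode:
#   jps | has_daemon DataNode    && echo "DataNode is up"
#   jps | has_daemon TaskTracker && echo "TaskTracker is up"
```

If a daemon is missing, its log file under the Hadoop logs directory usually says why.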
Now, go to the master node and perform a refresh:
-------
hadoop mradmin -refreshNodes     # the JobTracker re-reads its list of allowed nodes
hadoop dfsadmin -refreshNodes    # the NameNode re-reads its list of allowed datanodes
----------
Now we have added a new datanode to the cluster without any interruption.
The new node starts out empty, so we have to run the balancer to redistribute the existing data across the cluster. Run the command below:
---------
start-balancer.sh
--------
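The balancer also accepts a `-threshold` option, a percentage of disk capacity (default 10) that controls how evenly the nodes must be filled before it stops. A small wrapper could look like the sketch below; `run_balancer` is a made-up name, and limiting the value to whole numbers is a simplification chosen for this example.

```shell
# Start the balancer with an explicit threshold (percent of disk capacity).
# Rejects anything that is not a whole number before calling start-balancer.sh.
run_balancer() {
    threshold="${1:-10}"   # fall back to the balancer's default of 10
    case "$threshold" in
        ''|*[!0-9]*)
            echo "threshold must be a whole number" >&2
            return 1
            ;;
    esac
    start-balancer.sh -threshold "$threshold"
}

# Example, run on the master:
#   run_balancer 5
#   hadoop dfsadmin -report    # shows per-datanode usage afterwards
```

A lower threshold gives a more even spread but makes the balancer run longer.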
Kool :)