Monday 20 April 2015

PROCEDURE: DECOMMISSIONING A DATANODE FROM A HADOOP CLUSTER.

DataNodes can be removed from a running cluster without data loss. However, if nodes are shut down "hard," data may be lost, since a node may hold the only remaining copy of one or more file blocks.

HDFS provides a decommissioning procedure which ensures that this process can be performed safely. To use it, follow the steps below:

Step 1: Configuration on the NameNode:

If it is expected that nodes may be decommissioned from the cluster, an excludes file must be configured before the cluster is started. Add the key dfs.hosts.exclude to the "/usr/local/hadoop/conf/core-site.xml" file.

The entry in core-site.xml is as follows:

------------------
<property>
  <name>dfs.hosts.exclude</name>
  <value>/usr/local/hadoop/excludes</value>
</property>
-------------------

Step 2: Determining hosts to be decommissioned: 

The machines to be decommissioned have to be added to the file "/usr/local/hadoop/excludes". This will prevent them from connecting to the NameNode.

vim /usr/local/hadoop/excludes
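
For example, the excludes file simply lists one hostname per line; the names below are only placeholders for whichever nodes are being removed:

------------------
datanode03.example.com
datanode04.example.com
------------------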

Step 3: Force configuration reload:
----------
hadoop dfsadmin -refreshNodes
----------

This will force the NameNode to reread its configuration, including the newly updated excludes file. The NameNode will then decommission the listed nodes over a period of time, re-replicating their blocks to other DataNodes so that no data is lost.
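
To watch the progress, the dfsadmin report can be filtered for the decommission status of each node; a node moves from "Decommission in progress" to "Decommissioned" once its blocks have been re-replicated elsewhere. A rough check (the exact status strings may differ between Hadoop versions):

----------
# status strings may vary slightly between Hadoop versions
hadoop dfsadmin -report | grep -i decommission
----------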

Step 4: Shut down nodes:

After the decommission process has completed, the decommissioned hardware can be safely shut down for maintenance, etc.

-------
hadoop dfsadmin -report
-------

The above command will describe which nodes are connected to the cluster.
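
Once a node is reported as decommissioned, the DataNode daemon on that machine can be stopped. A minimal sketch, assuming the standard Hadoop 1.x scripts and that it is run on the decommissioned node itself:

-------
# run on the decommissioned node; hadoop-daemon.sh ships in $HADOOP_HOME/bin in Hadoop 1.x
hadoop-daemon.sh stop datanode
-------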

Step 5: Run Balancer:

Run the balancer to even out the distribution of data across the remaining DataNodes in the cluster.

----------
start-balancer.sh
---------
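
By default the balancer treats the cluster as balanced when each DataNode's utilization is within 10% of the cluster average. A tighter threshold can be passed if needed; the 5% below is only an illustrative value:

----------
# 5 is an illustrative threshold (percent); the default is 10
start-balancer.sh -threshold 5
----------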


Step 6: Run a DFS filesystem checking utility:
-------------
hadoop fsck /
-------------

The above command will report the overall health of the filesystem.
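
For a more detailed report, fsck also accepts options that list the files, their blocks, and the DataNodes holding each replica, which is handy for confirming that nothing was left under-replicated after the decommission:

-------------
# verbose check: list files, blocks, and block locations
hadoop fsck / -files -blocks -locations
-------------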

Now, if you need to fix filesystem errors in HDFS, see my next post.

Kool :) 
