In this post, I will explain some of the commonly used HDFS shell commands. Hadoop is installed in the "/usr/local/hadoop" directory of this server.
Navigate to the folder "/usr/local/hadoop/bin". There you will find the hadoop binary, which is used in all the commands from here on.
Command 1: To check which version of Hadoop is installed.
--------
hadoop version
--------
Command 2: List the contents of the root directory in HDFS.
Sample output
--------
[root@MANINMANOJ]#hadoop fs -ls /
Found 3 items
drwxr-xr-x - root supergroup 2015-03-16 23:37 /data
drwxr-xr-x - root supergroup 2015-03-16 23:37 /user
drwxr-xr-x - root supergroup 2015-03-16 23:11 /usr
---------
Command 3: Count the number of directories, files, and bytes under the paths that match the specified file pattern
Sample Output:
------
[root@MANINMANOJ]#hadoop fs -count hdfs:/
11 1 4 hdfs://192.168.150.210:10001/
-------
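The four columns in the '-count' output are, in order, the directory count, the file count, the content size in bytes, and the path. As a small illustration (not part of Hadoop itself), an awk one-liner can label the columns of the sample line above:
--------
# Label the columns of a 'hadoop fs -count' output line.
# Column order: DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
line="11 1 4 hdfs://192.168.150.210:10001/"
echo "$line" | awk '{printf "dirs=%s files=%s bytes=%s path=%s\n", $1, $2, $3, $4}'
--------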
Command 4: Run a DFS filesystem checking utility
Sample Output:
-----
[root@MANINMANOJ]#hadoop fsck /
FSCK started by root from /192.168.150.210 for path / at Tue Mar 17 23:43:51 IST 2015
.
/usr/local/hadoop/tmp/mapred/system/jobtracker.info: Under replicated blk_-9110710984033906000_1001. Target Replicas is 3 but found 1 replica(s).
Status: HEALTHY
Total size: 4 B
Total dirs: 11
Total files: 1
Total blocks (validated): 1 (avg. block size 4 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 1 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 2 (200.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Tue Mar 17 23:43:51 IST 2015 in 3 milliseconds
The filesystem under path '/' is HEALTHY
[root@MANINMANOJ]#
-----------------
Command 5: Run a cluster balancing utility
------
[root@MANINMANOJ]#hadoop balancer
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
15/03/17 23:46:58 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.150.210:50010
15/03/17 23:46:58 INFO balancer.Balancer: 0 over utilized nodes:
15/03/17 23:46:58 INFO balancer.Balancer: 1 under utilized nodes: 192.168.150.210:50010
The cluster is balanced. Exiting...
Balancing took 1.136 seconds
-------
Command 6: Right now, I am logged in as user root, so my home directory in HDFS is "/user/root". I will create a new directory "new" in this location using the command below.
----------
[root@MANINMANOJ]#hadoop fs -mkdir /user/root/new
[root@MANINMANOJ]#hadoop fs -ls /user/root/
Found 1 item
drwxr-xr-x - root supergroup 2015-03-17 23:51 /user/root/new
[root@MANINMANOJ]#
-----------
Command 7: Add a sample text file named "test.txt" from the local filesystem to the new directory created in the previous step:
----------
[root@MANINMANOJ]#hadoop fs -put /sample/test.txt /user/root/new
[root@MANINMANOJ]#
[root@MANINMANOJ]#hadoop fs -ls /user/root/new
Found 1 items
-rw-r--r-- 3 root supergroup 2015-03-18 00:06 /user/root/new/test.txt
-----------
Command 8: Add a sample directory "sample" to the directory "/user/root/data/" in HDFS
----------
[root@MANINMANOJ]#hadoop fs -put /sample/ /user/root/data
[root@MANINMANOJ]#
[root@MANINMANOJ]#hadoop fs -ls /user/root/data/
Found 1 items
drwxr-xr-x - root supergroup 2015-03-18 00:13 /user/root/data/sample
-------
Command 9: Display the space utilized by the directory "/user/root/data/"
-----------
[root@MANINMANOJ]#hadoop fs -du /user/root/data/
Found 1 items
73 hdfs://192.168.150.210:10001/user/root/data/sample
------------
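'-du' reports sizes in raw bytes. Newer Hadoop releases accept 'hadoop fs -du -h' for a human-readable form; as a sketch of what that conversion does, a small awk filter applied to the output line above:
--------
# Convert the byte count in '-du' style output to a readable unit.
# (Illustrative filter only; recent Hadoop versions support '-du -h' natively.)
echo "73 hdfs://192.168.150.210:10001/user/root/data/sample" |
awk '{ s = $1; split("B KB MB GB", unit, " ")
       for (i = 1; s >= 1024 && i < 4; i++) s /= 1024
       printf "%.0f %s %s\n", s, unit[i], $2 }'
--------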
Command 10: Delete the file "test.txt" from the HDFS file system.
------
[root@MANINMANOJ]#hadoop fs -rm /user/root/data/sample/test.txt
Deleted hdfs://192.168.150.210:10001/user/root/data/sample/test.txt
[root@MANINMANOJ]#
-------
Command 11: Remove the entire sample directory and all of its contents in HDFS.
----------
[root@MANINMANOJ]#hadoop fs -rmr /user/root/data/sample/
Deleted hdfs://192.168.150.210:10001/user/root/data/sample
[root@MANINMANOJ]#
-----------
Command 12: Add the file "testing.txt" from the local path "/var/tmp/testing.txt" to the directory "/user/root/data/sample" in HDFS
-----------
[root@MANINMANOJ]#hadoop fs -copyFromLocal /var/tmp/testing.txt /user/root/data/sample
[root@MANINMANOJ]#
[root@MANINMANOJ]#hadoop fs -ls /user/root/data/sample
Found 1 items
-rw-r--r-- 3 root supergroup 409 2015-03-24 22:58 /user/root/data/sample/testing.txt
-----------
Command 13: View the contents of the text file "testing.txt" present in the "sample" directory in HDFS.
---------
hadoop fs -cat /user/root/data/sample/testing.txt
----------
Command 14: Copy the file "testing.txt" from the "sample" directory in HDFS to the directory "/home/manoj" on the local filesystem.
-------
hadoop fs -copyToLocal /user/root/data/sample/testing.txt /home/manoj
-------
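After pulling a file out of HDFS with '-copyToLocal' (or '-get'), it can be worth sanity-checking the copy by comparing checksums. A minimal sketch using two local files under /tmp as stand-ins for the HDFS source and the local destination:
--------
# Compare checksums of a source file and its copy.
# (The /tmp paths are illustrative stand-ins, not paths from this post.)
printf 'hello\n' > /tmp/src.txt
cp /tmp/src.txt /tmp/dst.txt
if [ "$(md5sum < /tmp/src.txt)" = "$(md5sum < /tmp/dst.txt)" ]; then
    echo "copies match"
fi
--------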
Command 15: '-cp' is used to copy files between directories present in HDFS
--------
hadoop fs -cp /home/manoj/*.txt /user/root/data/sample/
--------
Command 16: The '-get' command can be used as an alternative to '-copyToLocal'
------
hadoop fs -get /user/root/data/sample/testing.txt /home/manoj
-------
Command 17: Display the last kilobyte of the file "testing.txt" to stdout ('-tail' outputs the last 1 KB of a file, not the last 10 lines).
---------
hadoop fs -tail /user/root/data/sample/testing.txt
--------
Command 18: Files in HDFS are created with a base permission of 666, modified by the configured umask. Use the '-chmod' command to change the permissions of a file
------------
hadoop fs -chmod 600 /user/root/data/sample/testing.txt
------------
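The octal mode notation is the same one used by the local chmod: 600 grants the owner read and write and removes all access for group and others. A local-filesystem sketch (GNU coreutils 'stat' assumed; the file path is a throwaway example):
--------
# Octal 600 = rw for owner, nothing for group/others.
touch /tmp/perm_demo.txt
chmod 600 /tmp/perm_demo.txt
stat -c '%a' /tmp/perm_demo.txt    # prints the octal mode, here 600
--------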
Command 19: The owner and group of a file can be changed using the '-chown' command:
-----------
hadoop fs -chown root:root /user/root/data/sample/testing.txt
------------
Command 20: Move a file or directory from one location to another
------
hadoop fs -mv /user/root/data/sample1/testing.txt /user/root/data/sample2/testing2.txt
-------
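Note that the example above both moves the file and renames it (sample1/testing.txt becomes sample2/testing2.txt); HDFS '-mv' handles the rename as part of the move, just like a local mv. A local analogy using illustrative /tmp paths:
--------
# Move + rename in one step, mirroring the HDFS '-mv' example.
mkdir -p /tmp/sample1 /tmp/sample2
printf 'data\n' > /tmp/sample1/testing.txt
mv /tmp/sample1/testing.txt /tmp/sample2/testing2.txt
# testing.txt is gone from sample1; sample2 now holds testing2.txt
--------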
Command 21: Default replication factor to a file is 3. Use '-setrep' command to change replication factor of a file
--------
hadoop fs -setrep -w 2 /user/root/data/sample/testing.txt
---------
Command 22: Copy a directory from one cluster to another. Use
1) the 'distcp' command to copy,
2) the '-overwrite' option to overwrite existing files,
3) the '-update' option to synchronize both directories
--------
hadoop distcp hdfs://namenode1/apache_hadoop hdfs://namenode2/hadoop
--------
Command 23: List all the Hadoop file system shell commands
----------
hadoop fs
---------
Command 24: To get help
--------
hadoop fs -help
--------