If you work with Hadoop, you will find several shell commands for managing your Hadoop cluster.
This article provides a quick reference to the most commonly used Hadoop administration commands.
If you are new to big data, read the introduction to Hadoop article to understand the basics.
1. Hadoop Namenode Commands
| Command | Description |
| --- | --- |
| `hadoop namenode -format` | Format the HDFS filesystem from the NameNode |
| `hadoop namenode -upgrade` | Upgrade the NameNode |
| `start-dfs.sh` | Start HDFS daemons |
| `stop-dfs.sh` | Stop HDFS daemons |
| `start-mapred.sh` | Start MapReduce daemons |
| `stop-mapred.sh` | Stop MapReduce daemons |
| `hadoop namenode -recover -force` | Recover NameNode metadata after a cluster failure (may lose data) |
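The start and stop scripts above are usually run in a fixed order. The following is a minimal sketch of that order, not an official procedure: the function names `start_cluster` and `stop_cluster` are hypothetical, and the sketch assumes the four scripts are on your `PATH` (they normally live under the Hadoop installation's `bin` directory).

```shell
# Sketch: bring a cluster up and down in a dependency-friendly order.
# Assumes start-dfs.sh, start-mapred.sh, stop-mapred.sh and stop-dfs.sh
# are on PATH. The function names are illustrative only.

start_cluster() {
    # Start the storage layer first, then MapReduce on top of it.
    start-dfs.sh || return 1
    start-mapred.sh
}

stop_cluster() {
    # Reverse order: stop MapReduce before the HDFS daemons it relies on.
    stop-mapred.sh || return 1
    stop-dfs.sh
}
```

Calling `start_cluster` stops early if the HDFS daemons fail to launch, so MapReduce is never started against a filesystem that is not running.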
2. Hadoop fsck Commands
| Command | Description |
| --- | --- |
| `hadoop fsck /` | Run a filesystem check on HDFS |
| `hadoop fsck / -files` | Display files during the check |
| `hadoop fsck / -files -blocks` | Display files and blocks during the check |
| `hadoop fsck / -files -blocks -locations` | Display files, blocks, and their locations during the check |
| `hadoop fsck / -files -blocks -locations -racks` | Display the network topology for DataNode locations |
| `hadoop fsck / -delete` | Delete corrupted files |
| `hadoop fsck / -move` | Move corrupted files to the /lost+found directory |
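For scripting, it can help to reduce the fsck report to a pass/fail result. The sketch below assumes that the fsck summary ends with a verdict line such as "The filesystem under path '/' is HEALTHY" (or "... is CORRUPT"); the function name `hdfs_is_healthy` is hypothetical.

```shell
# Sketch: turn the fsck summary into a yes/no health check.
# Assumes `hadoop` is on PATH and that the report contains a verdict line
# ending in "is HEALTHY" when no problems were found.

hdfs_is_healthy() {
    # Check the given path, defaulting to the filesystem root.
    # grep -q exits 0 only if the HEALTHY verdict appears in the report.
    hadoop fsck "${1:-/}" 2>/dev/null | grep -q "is HEALTHY"
}

# Example use: print the detailed report only when the check fails.
# hdfs_is_healthy / || hadoop fsck / -files -blocks -locations
```

Matching on the verdict line rather than the exit status is a deliberate choice here, since the report text is what an administrator would read anyway.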
3. Hadoop Job Commands
| Command | Description |
| --- | --- |
| `hadoop job -submit <job-file>` | Submit the job |
| `hadoop job -status <job-id>` | Print the job's map and reduce completion percentage and its counters |
| `hadoop job -list all` | List all jobs |
| `hadoop job -list-active-trackers` | List all available TaskTrackers |
| `hadoop job -set-priority <job-id> <priority>` | Set the priority of a job. Valid priorities: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW |
| `hadoop job -kill-task <task-id>` | Kill a task |
| `hadoop job -history <job-output-dir>` | Display job history, including job details and failed and killed jobs |
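Because `-set-priority` accepts only the five values listed above, a small wrapper can reject typos before they reach the cluster. This is a sketch only; the function name `set_job_priority` and the job ID in the usage comment are hypothetical.

```shell
# Sketch: validate the priority locally before calling `hadoop job`.
# Assumes `hadoop` is on PATH. The function name is illustrative only.

set_job_priority() {
    job_id="$1"
    priority="$2"
    case "$priority" in
        VERY_HIGH|HIGH|NORMAL|LOW|VERY_LOW)
            hadoop job -set-priority "$job_id" "$priority"
            ;;
        *)
            # Anything else is rejected without contacting the cluster.
            echo "invalid priority: $priority" >&2
            return 2
            ;;
    esac
}

# Example use (the job ID is made up):
# set_job_priority job_201401011200_0001 HIGH
```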
4. Hadoop dfsadmin Commands
| Command | Description |
| --- | --- |
| `hadoop dfsadmin -report` | Report filesystem information and statistics |
| `hadoop dfsadmin -metasave file.txt` | Save the NameNode's primary data structures to file.txt |
| `hadoop dfsadmin -setQuota 10 /quotatest` | Set a name quota of 10 on /quotatest, limiting the number of file and directory names under it |
| `hadoop dfsadmin -clrQuota /quotatest` | Clear the name quota on /quotatest |
| `hadoop dfsadmin -refreshNodes` | Re-read the hosts and exclude files to update the DataNodes that are allowed to connect to the NameNode. Mostly used to commission or decommission nodes |
| `hadoop fs -count -q /mydir` | Show the quotas and remaining quota for the directory /mydir |
| `hadoop dfsadmin -setSpaceQuota 100M /mydir` | Set a space quota of 100 MB on the HDFS directory /mydir |
| `hadoop dfsadmin -clrSpaceQuota /mydir` | Clear the space quota on an HDFS directory |
| `hadoop dfsadmin -saveNamespace` | Back up metadata (fsimage and edits). Put the cluster in safe mode before running this command |
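The output of `hadoop fs -count -q` is a single line of unlabeled columns, which is easy to misread. The sketch below labels them, assuming the usual column order of QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE and path name; the function name `quota_report` is hypothetical.

```shell
# Sketch: label the columns printed by `hadoop fs -count -q <dir>`.
# Assumes `hadoop` and `awk` are on PATH and that the columns appear in
# the order: QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA
# DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.

quota_report() {
    hadoop fs -count -q "$1" | awk '{
        printf "name quota: %s (remaining %s)\n", $1, $2
        printf "space quota: %s bytes (remaining %s)\n", $3, $4
        printf "dirs: %s, files: %s, content size: %s bytes\n", $5, $6, $7
    }'
}

# Example use:
# hadoop dfsadmin -setQuota 10 /quotatest
# quota_report /quotatest
```

When no quota is set, the quota columns show `none` and `inf` instead of numbers, and the sketch passes those values through unchanged. Keep in mind that the space quota is charged for every replica of a block, so the content size and the consumed space quota can differ.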
5. Hadoop Safe Mode (Maintenance Mode) Commands
The following dfsadmin commands help the cluster enter or leave safe mode, which is also called maintenance mode. In this mode, the NameNode does not accept any changes to the namespace, and it does not replicate or delete blocks.
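Safe mode is controlled through `hadoop dfsadmin -safemode` with one of `enter`, `leave`, `get` (print the current state) or `wait` (block until safe mode ends). As a rough sketch of how these fit together with `-saveNamespace`, the hypothetical function below enters safe mode, saves the namespace, and then leaves safe mode again.

```shell
# Sketch: save the namespace while the NameNode is in safe mode.
# Assumes `hadoop` is on PATH and is run as the HDFS superuser.
# The function name is illustrative only.

checkpoint_namespace() {
    # Stop accepting namespace changes; give up if that fails.
    hadoop dfsadmin -safemode enter || return 1

    hadoop dfsadmin -saveNamespace
    status=$?

    # Always try to leave safe mode, even if the save failed.
    hadoop dfsadmin -safemode leave
    return $status
}

# To inspect the state without changing it:
# hadoop dfsadmin -safemode get
```

Leaving safe mode unconditionally is a design choice for this sketch; on a cluster with a known problem you may prefer to stay in safe mode and investigate first.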
Can you explain the fsck output to me? It shows BP, BLK, and other details, along with information for 3 DataNodes.
Many thanks! These details are very valuable.