Hadoop Datanode Decommission Tuning Process

Before starting the decommission of a Hadoop datanode, it is better to recheck some key points to avoid issues in the middle of the decommission. Here I am sharing the Hadoop datanode decommission tuning steps based on my experience.

What happens when you do the decommission?

Decommissioning is the process of copying the blocks from the decommissioning node to the other datanodes in the cluster in small batches. The more blocks the node holds, the longer it takes to move them off. During this process the Namenode ensures that every block of every file remains available across the cluster according to its replication factor.
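For reference, the decommission itself is usually triggered by listing the host in the NameNode's exclude file and then refreshing the node list. The sketch below is illustrative only: the file path and hostname are placeholders, and the real path comes from the dfs.hosts.exclude property in your cluster's hdfs-site.xml.

```shell
# Placeholder path for illustration -- check dfs.hosts.exclude in your
# hdfs-site.xml for the real exclude file location.
EXCLUDE_FILE=/tmp/dfs.exclude.demo
echo "datanode05.example.com" > "$EXCLUDE_FILE"   # placeholder hostname

# On a real cluster, make the NameNode re-read the exclude file:
# hdfs dfsadmin -refreshNodes

cat "$EXCLUDE_FILE"
```

After the refresh, the node shows up as "Decommission In Progress" in the Namenode UI until all of its blocks have been re-replicated.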

Before starting the decommission, check the following points to tune the process and avoid critical situations midway through.

Check for Open files / Corrupted blocks

The first task is to check for missing/under-replicated blocks in the cluster, and also for files that are open for writing.

hdfs fsck / -list-corruptfileblocks -openforwrite -files -blocks -locations > /tmp/files_report.log 2>&1

Fix any issues related to the corrupted files. If the files are not important, you can move or delete them:

hdfs fsck file_name -move   // it will move the data to lost+found directory
hdfs fsck file_name -delete
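Once the report is saved, a quick grep gives a triage summary of what needs fixing. The sample report lines below are fabricated for illustration so the snippet is self-contained; on a real cluster, point REPORT at the /tmp/files_report.log produced by the fsck run above.

```shell
# Quick triage of a saved fsck report: count corrupt, under-replicated and
# missing entries. The sample content here is fabricated for illustration.
REPORT=/tmp/files_report.demo.log
cat > "$REPORT" <<'EOF'
/data/app/part-0001: CORRUPT blockpool BP-1 block blk_1073741825
/data/app/part-0002:  Under replicated BP-1:blk_1073741826_1002. Target Replicas is 3 but found 2 live replica(s).
/data/logs/2020/part-0003: MISSING 1 blocks of total size 1048576 B
EOF

echo "corrupt:          $(grep -c 'CORRUPT' "$REPORT")"
echo "under-replicated: $(grep -c 'Under replicated' "$REPORT")"
echo "missing:          $(grep -c 'MISSING' "$REPORT")"
```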

Raise the heap size of the datanode you are planning to decommission. The default recommended value is 2 GB; a larger heap helps the datanode cope with the increased replication iterations and max streams.

Ambari UI -> HDFS Service -> Config -> HADOOP_DATANODE_OPTS in the hadoop-env.sh template.
You will see two values (-Xms and -Xmx), for example -Xms2048m -Xmx2048m
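As a concrete example, the line in hadoop-env.sh would look something like the fragment below. The 4 GB figure is illustrative only, not a universal recommendation; size it to the memory actually available on the node.

```shell
# hadoop-env.sh fragment (in Ambari this lives in the hadoop-env template).
# -Xms/-Xmx set the datanode JVM's initial and maximum heap; 4096m here is
# an illustrative value for the node being decommissioned.
export HADOOP_DATANODE_OPTS="-Xms4096m -Xmx4096m ${HADOOP_DATANODE_OPTS}"
```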

Increase the replication work multiplier per iteration to a larger number (if not configured, it defaults to 2; 10 is recommended). Note that higher values may impact overall cluster performance.

If you don't see these parameters in the cluster, the default values are in effect.

dfs.namenode.replication.work.multiplier.per.iteration=2   // increase to 10
dfs.namenode.replication.max-streams=2                     // increase to 20
dfs.namenode.replication.max-streams-hard-limit=4          // increase to 50
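To confirm which values are actually in force, you can grep hdfs-site.xml (or query a single property with `hdfs getconf -confKey <property>`). The sketch below writes an inline sample file so it is self-contained; on a real node, point SITE_XML at your cluster's hdfs-site.xml instead.

```shell
# Inline sample hdfs-site.xml for illustration; on a real node use the
# actual file, e.g. SITE_XML=/etc/hadoop/conf/hdfs-site.xml
SITE_XML=/tmp/hdfs-site.demo.xml
cat > "$SITE_XML" <<'EOF'
<configuration>
  <property>
    <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.namenode.replication.max-streams</name>
    <value>20</value>
  </property>
</configuration>
EOF

# Print the <value> line following each replication-related <name> line.
grep -A1 'replication' "$SITE_XML" | grep -o '<value>[0-9]*</value>'
```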

To apply the changes:

- Ambari UI -> HDFS Service -> Config -> Advanced -> Custom hdfs-site -> Add Property -> (add the parameters) -> Restart HDFS services

Note: Once the decommission process is completed, revert the above changes to their original values.

