When we start the datanode decommissioning process, the best choice is to decommission 1 or 2 servers at a time, depending on your number of servers and your rack configuration. Here I would like to share an issue I faced while decommissioning 3 datanodes at a time.
What to do when rack awareness is configured?
- Check the rack information of the servers to be decommissioned; a good choice is to pick servers from different racks for the decommission process.
- Check the storage details of the remaining servers in those racks; if enough space is available on the servers in each rack, we won't hit any issues while transferring blocks from one node to another.
Note: Coming to the block movement, HDFS first tries to move the blocks to other servers in the same rack, and only then attempts off-rack block movement. So check whether the other datanodes in the same rack have free space available, to avoid issues during the decommission block movement.
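The free-space pre-check above can be sketched as a small shell helper. This is a hypothetical example: in a real cluster you would feed it the output of `hdfs dfsadmin -report`; here a shortened, made-up sample of that output stands in so the parsing is visible.

```shell
# Hypothetical sketch: list free space per datanode before decommissioning.
# `report` holds a trimmed, illustrative sample of `hdfs dfsadmin -report`
# output; the node addresses and sizes are assumptions.
report='Name: 128.10.0.1:1019
DFS Remaining: 120 GB
Name: 128.10.0.2:1019
DFS Remaining: 3 GB'

echo "$report" | awk '
  /^Name:/          { node = $2 }
  /^DFS Remaining:/ { print node, "free:", $3, $4 }
'
```

A node reporting only a few GB free (like the second one above) is a poor target for replicated blocks during decommissioning.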
In my case, out of the 3 datanodes, 2 servers successfully changed to the Decommissioned state, but one server was still stuck in the Decommissioning state.
As an admin, my initial approach was to check for corrupted/missing/under-replicated blocks in the Ambari/NameNode Web UI. I found one corrupted block and one under-replicated block, so I validated the file from the CLI and deleted it (it was a log file).
hdfs fsck / -list-corruptfileblocks -files -blocks -locations > /tmp/corrupted_files_report.log 2>&1
Still, I didn’t see any change in the datanode status after deleting the corrupted block. So our next step was to check the datanode logs, searching for the keyword WARN to find errors.
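Filtering the datanode log for WARN lines can be done with a simple grep. This is a hypothetical sketch: `sample.log` stands in for the real datanode log (on an HDP cluster that log typically lives under /var/log/hadoop/hdfs/, though the exact path depends on your setup), and the two sample lines are abbreviated versions of the entries shown later in this post.

```shell
# Hypothetical sketch: filter a datanode log for WARN entries.
# sample.log is a stand-in for the real datanode log file.
cat > sample.log <<'EOF'
2020-06-20 05:58:54,527 INFO datanode.DataNode Starting thread to transfer blk_2621915838_1689288216
2020-06-20 05:58:54,557 WARN datanode.DataNode Failed to transfer blk_2621915838_1689288216 got java.net.SocketException
EOF

grep 'WARN' sample.log
```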
Here are some points we observed in the logs. The datanode decommission log contains the following steps:
1. Get the block info.
2. Register the datanode (mapping the source and destination datanodes).
3. Transfer the file.
Steps 1 and 2 went well, but we got an IOException while transferring the block in step 3.
2020-06-20 05:58:54,527 INFO datanode.DataNode (DataNode.java:transferBlock(2147)) - DatanodeRegistration(128.10.xxx.xxx:1019, datanodeUuid=7dc008ce-7f3b-459e-b915-2fe9350460fe, infoPort=1022, infoSecurePort=0, ipcPort=8010, storageInfo=lv=-56;cid=CID-26a39b76-bda6-4dab-bdbb-0062283ee9ba;nsid=1187796825;c=0) Starting thread to transfer BP-300373885-128.10.xx.xx-1445435308785:blk_2621915838_1689288216 to 128.10.xx.xx:1019
2020-06-20 05:58:54,556 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(122)) - Scheduling a check for /hadoop/hdfs/data/current
2020-06-20 05:58:54,557 WARN datanode.DataNode (DataNode.java:run(2384)) - DatanodeRegistration(128.10.xxx.xxx:1019, datanodeUuid=7dc008ce-7f3b-459e-b915-2fe9350460fe, infoPort=1022, infoSecurePort=0, ipcPort=8010, storageInfo=lv=-56;cid=CID-26a39b76-bda6-4dab-bdbb-0062283ee9ba;nsid=1187796825;c=0):Failed to transfer BP-300373885-128.10.xx.xx-1445435308785:blk_2621915838_1689288216 to 128.10.xx.xxx:1019 got java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
    at org.apache.hadoop.hdfs.server.datanode.FileIoProvider.transferToSocketFully(FileIoProvider.java:263)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:583)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:767)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:714)
    at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2352)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Connection reset by peer
Then we tried to find the file info for the respective block from its block ID.
hdfs fsck -blockId blk_2621915838
It lists the file details as well as the replication details.
hdfs fsck -blockId blk_2621915838
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Connecting to namenode via http://master.hadoop.com:50070/fsck?ugi=hadoop&blockId=blk_2621915838+&path=%2F
FSCK started by hdpadmin (auth:KERBEROS_SSL) from /128.10.X.X at Tue May 05 06:13:20 BST 2020
Block Id: blk_2621915838
Block belongs to: /user/hdfs/prod/data/logs/ADJ_13926.log
No. of Expected Replica: 3
No. of live Replica: 2
No. of excess Replica: 0
No. of stale Replica: 0
No. of decommissioned Replica: 0
No. of decommissioning Replica: 1
No. of corrupted Replica: 0
Block replica on datanode/rack: datanode01.hadoop.com/XXXX/YYYY/Row01/Cab02 is HEALTHY
Block replica on datanode/rack: datanode02.hadoop.com/XXXX/YYYY/Row03/Cab03 is DECOMMISSIONING
Block replica on datanode/rack: datanode03.hadoop.com/XXXX/YYYY/Row03/Cab05 is HEALTHY
From the above output, the decommission is stuck on one replica copy: HDFS is unable to move the block from the decommissioning server to another server in the same rack (one copy is already present on the only other server in that rack, since we have just 2 servers per rack), and it cannot move it to another rack either (all those servers were at 100% usage).
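The replica counts in the fsck output above are what tell you the block is stuck, and they are easy to extract with awk. This is a hypothetical sketch: `fsck_out` holds a trimmed sample of the `hdfs fsck -blockId` output shown earlier, rather than a live call.

```shell
# Hypothetical sketch: extract replica counts from `hdfs fsck -blockId`
# output. fsck_out is a trimmed sample of the output shown above.
fsck_out='No. of Expected Replica: 3
No. of live Replica: 2
No. of decommissioning Replica: 1'

expected=$(echo "$fsck_out" | awk -F': ' '/Expected Replica/ {print $2}')
live=$(echo "$fsck_out" | awk -F': ' '/live Replica/ {print $2}')

if [ "$live" -lt "$expected" ]; then
  echo "block is under-replicated: $live of $expected live"
fi
```

When live replicas stay below the expected count and one replica sits on a decommissioning node, the decommission cannot finish until that copy is re-replicated somewhere with free space.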
As an alternative approach, we copied the under-replicated block's file to another HDFS location temporarily, deleted the original, and finally moved the file from the temporary location back to its original path. (Copying the file creates new blocks on new server locations.)
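The copy, delete, and move-back workaround can be sketched as three `hdfs dfs` commands. This is a hypothetical sketch: the temporary path is an assumption, and the `DRY_RUN` wrapper only prints the commands instead of touching HDFS, so you can review them before running against a real cluster.

```shell
# Hypothetical sketch of the workaround: copy the file to a temporary HDFS
# location, remove the original, then move the copy back so new blocks get
# allocated on healthy datanodes. With DRY_RUN=1 the commands are only
# printed, never executed.
DRY_RUN=1
src=/user/hdfs/prod/data/logs/ADJ_13926.log   # file from the fsck output above
tmp=/tmp/decomm_workaround/ADJ_13926.log      # assumed temporary path

run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run hdfs dfs -cp "$src" "$tmp"
run hdfs dfs -rm -skipTrash "$src"
run hdfs dfs -mv "$tmp" "$src"
```

Note the brief window between the `-rm` and the `-mv` where the file is absent from its original path, so this is only safe for files no job is actively reading.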
Note: Another solution is to recommission the datanode; once that completes, run the balancer command to balance all the servers in the cluster, and then try to decommission the same server again, or choose a server from a different rack.