SOLVED: the mystery of BlockMissingException in the Hadoop file system (without data loss)
How to handle corrupt or missing blocks in HDFS without losing data.
Caused by: java.lang.RuntimeException: java.io.IOException: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block
Before we jump into the solution for the missing-blocks exception, let's briefly review how the Hadoop Distributed File System (HDFS) works.
Image source : https://data-flair.training/blogs/hadoop-hdfs-namenode-high-availability/
- Active NameNode — It handles all HDFS client operations in the HDFS cluster.
- Passive NameNode — It is a standby NameNode. It holds the same metadata as the active NameNode.
So, whenever the active NameNode fails, the passive NameNode takes over all the responsibilities of the active node, and the HDFS cluster continues to work.
To know more about “Implementation of NameNode High Availability Architecture”, please read here: https://data-flair.training/blogs/hadoop-hdfs-namenode-high-availability/
- The active and standby NameNodes should always be in sync with each other, i.e. they should have the same metadata. This allows the Hadoop cluster to be restored to the exact namespace state at which it crashed, and it gives us fast failover.
- The way HDFS works is by having a main "NameNode" and multiple "data nodes" on a commodity hardware cluster. The nodes are usually organized into racks in the data center. Data is then broken down into separate "blocks" that are distributed among the various data nodes for storage. Blocks are also replicated across nodes to reduce the likelihood of data loss when a node fails.
- The NameNode is the "smart" node in the cluster. It knows exactly which data node contains which blocks and where the data nodes are located within the machine cluster. The NameNode also manages access to the files, including reads, writes, creates, deletes and replication of data blocks across different data nodes.
- The NameNode operates in a “loosely coupled” way with the data nodes. This means the elements of the cluster can dynamically adapt to the real-time demand of server capacity by adding or subtracting nodes as the system sees fit.
- The data nodes constantly communicate with the NameNode to see if they need to complete a certain task. This constant communication ensures that the NameNode is aware of each data node's status at all times. Since the NameNode assigns tasks to the individual data nodes, if it realizes that a data node is not functioning properly, it can immediately re-assign that node's task to a different node containing the same data block. Data nodes also communicate with each other so they can cooperate during normal file operations. Clearly the NameNode is critical to the whole system and should be replicated to prevent system failure.
- Again, data blocks are replicated across multiple data nodes and access is managed by the NameNode. This means that when a data node no longer sends a "life signal" (heartbeat) to the NameNode, the NameNode unmaps that data node from the cluster and keeps operating with the other data nodes as if nothing had happened. When this data node comes back to life, or a different (new) data node is detected, that new data node is (re-)added to the system. That is what makes HDFS resilient and self-healing.
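As a concrete (illustrative) example, the replication factor that drives this self-healing behavior is set by the dfs.replication property in hdfs-site.xml and can be changed for an existing file with hdfs dfs -setrep; the value 3 and the path below are placeholders, not values from this article:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
# raise the replication of one file and wait until the new replicas exist
hdfs dfs -setrep -w 3 /user/example/data.txt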
Now that we understand the Hadoop ecosystem well enough, let's solve the missing-blocks problem.
When we suddenly start seeing the exception below, it means the overall health of the Hadoop cluster is not in a good state. This may be caused by a DataNode failure in the Hadoop cluster, or by a NameNode switch to an unhealthy standby node, i.e. the fsimage/metadata of the standby NameNode was not up to date (which in turn can be caused by unhealthy JournalNode communication with the standby NameNode).
Caused by: java.lang.RuntimeException: java.io.IOException: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block
Solution and debugging of the root cause:
Step 0: Check Hadoop File system
(https://hadoop.apache.org/docs/r1.2.1/commands_manual.html#fsck)
hadoop fsck [GENERIC_OPTIONS] <path>
The output of the command below will list the files with missing/corrupt blocks.
hdfs fsck / -list-corruptfileblocks
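Beyond listing the corrupt files, fsck can also show which blocks make up a given file and which DataNodes hold each replica. A small sketch (the path below is just a placeholder):
# print block IDs and the DataNode locations of every replica for one suspect file
hdfs fsck /user/example/part-00000 -files -blocks -locations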
Step 1: Make sure that each DataNode is reachable from the NameNode.
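One quick way to verify this (assuming you have HDFS admin access) is the dfsadmin report, which shows every DataNode the NameNode currently considers live or dead:
# cluster capacity summary plus the list of live and dead DataNodes
hdfs dfsadmin -report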
Step 2: Check Namenode and editLog/fsimage file status
* Important: Check the NameNode logs for any WARN or ERROR level exceptions.
A few known exceptions:
WARN ipc.Server (Server.java:checkDataLength(1728)) — Requested data length 69990449 is longer than maximum configured RPC length 67108864.
* Here the NameNode stops registering block metadata from the data nodes due to the above error (typically a DataNode's block report exceeds the configured maximum RPC length, so the NameNode rejects it and cannot register the block metadata).
Solution to this error:
<property>
<name>ipc.maximum.data.length</name>
<value>134217728</value>
</property>
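This property normally lives in core-site.xml (it is an IPC-level setting), and the NameNode and DataNodes need a restart to pick it up; the value 134217728 above is simply double the 64 MB default. You can confirm what the daemons actually see with getconf:
# verify the effective setting (default is 67108864, i.e. 64 MB)
hdfs getconf -confKey ipc.maximum.data.length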
Step 3: Try switching the NameNode (failover to the standby) (optional)
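Assuming an HA pair whose NameNode IDs are nn1 and nn2 in hdfs-site.xml (your IDs will likely differ), checking the current states and triggering a manual failover looks roughly like this:
# see which NameNode is active and which is standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# make nn1 the standby and nn2 the active NameNode
hdfs haadmin -failover nn1 nn2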
If this does not help in recovering from the block missing exception, go to Step 4.
Step 4: Restart Hadoop ecosystem components in the below order.
- Restart the NameNode & check if the error still exists.
- Restart the JournalNodes so that the active and standby NameNodes are in sync. (optional)
- Restart all data nodes one by one, and observe the NameNode logs to check whether the NameNode has started registering block information. (IMPORTANT STEP)
* Here every data node submits its full block report to the NameNode again; if any block metadata is missing from the NameNode's metadata, it will be mapped and recovered automatically.
Now wait some time for the NameNode metadata to finish updating; after this, the NameNode should come out of safe mode with no missing blocks.
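While waiting, you can keep an eye on safe mode and, on Hadoop 2.7 and later, ask a DataNode to resend its full block report immediately; the host and port below are placeholders:
# the NameNode leaves safe mode once enough blocks have been reported
hdfs dfsadmin -safemode get
# optionally force an immediate full block report from one DataNode (Hadoop 2.7+)
hdfs dfsadmin -triggerBlockReport datanode01.example.com:9867
# finally re-run fsck to confirm there are no more corrupt files
hdfs fsck / -list-corruptfileblocks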
NOTE: Do not rush into the HDFS delete operation to solve this issue.
hdfs fsck / -delete -> This can delete data permanently resulting in data loss.
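If some files really are unrecoverable and you must act on them, a less destructive option is -move, which relocates the affected files to /lost+found instead of deleting them outright:
# move corrupted files to /lost+found instead of deleting them
hdfs fsck / -move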