
Exception When Reading Data from HDFS

Posted: 2017-07-28 14:45:55


If one or more HDFS datanodes are shut down while data is still being written to HDFS, some blocks can be left with an unreleased HDFS lease. Until that lease expires or is recovered, other applications (MapReduce, Spark, Hive, and so on) cannot read those blocks and fail with an exception like the following:

17/07/28 14:13:40 WARN scheduler.TaskSetManager: Lost task 28.0 in stage 40.0 (TID 2777, dcnode5): java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1594711030-10.29.180.177-1497607441986:blk_1073842908_103050; getBlockSize()=24352; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.241.104.148:50010,DS-2a2e3731-7889-4572-ac03-1645cb9681f5,DISK], DatanodeInfoWithStorage[10.28.142.158:50010,DS-40c6a66e-4f6f-4061-8a54-ac1a8874e3e1,DISK], DatanodeInfoWithStorage[10.28.142.143:50010,DS-41399e02-856b-4761-af41-c916986bd400,DISK], DatanodeInfoWithStorage[10.28.142.37:50010,DS-4bb951a2-6963-4f24-ac80-4df64e0b5d99,DISK]]}
    at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:427)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:335)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:271)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:263)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1565)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:309)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:778)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

For a detailed explanation of HDFS leases, see: http://www.cnblogs.com/cssdongl/p/6699919.html


At this point the blocks are not actually corrupt (note corrupt=false in the message above); the lease simply has not been released, which keeps other programs from reading or writing the files. We can either recover the lease or, more bluntly, delete the affected files outright.
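To confirm that a file is merely stuck open-for-write rather than corrupt, you can run fsck against it directly; the path below is a placeholder. An affected file is reported as OPENFORWRITE instead of CORRUPT:

hadoop fsck /path/to/suspect/file -files -blocks -locations -openforwrite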

Proper lease recovery is more involved, so here I will only describe how to find the affected files and delete them:
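For reference, on Hadoop 2.7 and later there is a single-file debug command that asks the NameNode to recover the lease, which may be worth trying before deleting anything. A minimal sketch, where the path is a placeholder and -retries sets the number of attempts:

hdfs debug recoverLease -path /path/to/suspect/file -retries 3

The command reports whether recovery succeeded; once it does, the file becomes readable again.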

First, find which files under a given HDFS directory cannot be read or written because of lease problems. (Note: the example scans from the HDFS root "/"; substitute the actual path.)

hadoop fsck / -openforwrite | egrep -v '^\.+$' | egrep "MISSING|OPENFORWRITE" | grep -o "/[^ ]*" | sed -e "s/:$//"
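Stage by stage, the pipeline does the following (an equivalent, commented form of the one-liner above; note that the ^\.+$ pattern must be quoted, otherwise the shell strips the backslash and the filter discards every non-empty line):

hadoop fsck / -openforwrite |        # report files that are still open for write
    egrep -v '^\.+$' |               # drop fsck progress output (lines of dots)
    egrep "MISSING|OPENFORWRITE" |   # keep only the problematic entries
    grep -o "/[^ ]*" |               # extract the HDFS path from each entry
    sed -e "s/:$//"                  # strip the trailing colon fsck appends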

Then delete the affected files (note that this removes each entire file, not just the stuck block):

hadoop fsck / -openforwrite | egrep -v '^\.+$' | egrep "MISSING|OPENFORWRITE" | grep -o "/[^ ]*" | sed -e "s/:$//" | xargs -i hadoop fs -rmr {}
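Because this delete is irreversible, a safer pattern is to write the list to a file and review it before removing anything. A sketch, with the list file path as a placeholder (hadoop fs -rmr is deprecated on Hadoop 2.x; hadoop fs -rm -r is the current form):

# 1. Save the list of affected files for review:
hadoop fsck / -openforwrite | egrep -v '^\.+$' | egrep "MISSING|OPENFORWRITE" | grep -o "/[^ ]*" | sed -e "s/:$//" > /tmp/openforwrite_files.txt

# 2. Inspect the list, then delete:
xargs -i hadoop fs -rm -r {} < /tmp/openforwrite_files.txt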



Original post: http://www.cnblogs.com/xyliao/p/7250034.html
