
April 11, 2011

Common Hadoop HDFS exceptions with large files

Big data in HDFS means plenty of disk problems. First of all, make sure there is at least ~20-30% free space on each node (a quick way to check is shown right below). Beyond that, here are two other problems I faced recently:
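One way to check: df shows local disk usage on a node, and the dfsadmin report summarizes configured capacity and remaining space per DataNode. The data directory below is just an example; point df at your own dfs.data.dir:

# local disk usage on one node (use your own dfs.data.dir path)
df -h /data/hadoop/dfs/data

# cluster-wide view: capacity, used and remaining space per datanode
hadoop dfsadmin -report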

all datanodes are bad
This error can be caused by having too many open files; the per-process limit is 1024 by default on most Linux systems. To increase it (for example to 8192), use:
ulimit -n 8192
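Note that ulimit -n only changes the limit for the current shell. To make it permanent, the usual place is /etc/security/limits.conf; a minimal sketch, assuming the Hadoop daemons run as a user called hadoop:

# /etc/security/limits.conf -- raise the open-file limit for the hadoop user
hadoop  soft  nofile  16384
hadoop  hard  nofile  16384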

error in shuffle in fetcher#k 
This is another problem; here is the full error log:
2011-04-11 05:59:45,744 WARN org.apache.hadoop.mapred.Child: 
Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: 
error in shuffle in fetcher#2
 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:416)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
 at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.OutOfMemoryError: Java heap space
 at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
 at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
 at org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:104)
 at org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:267)
 at org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:257)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:305)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)

One way to work around this problem is to make sure there are not too many map tasks over small input files. If possible, you can cat the input files manually to create bigger chunks, or push Hadoop to combine multiple tiny input files for a single mapper (e.g. with CombineFileInputFormat); a sketch of the manual approach follows this paragraph. For more details, take a look here.
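A minimal sketch of the "cat into bigger chunks" idea directly on HDFS; the paths are made up for illustration, and hadoop fs -put reads from stdin when the source is "-":

# concatenate many small HDFS files into a single larger input file
# (example paths; adjust to your own input layout)
hadoop fs -cat /user/yasemin/input-small/part-* | \
    hadoop fs -put - /user/yasemin/input-merged/big-part-00000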

Also, in the Hadoop discussion groups it is mentioned that the default value of the dfs.datanode.max.xcievers parameter, the upper bound on the number of files an HDFS DataNode can serve, is too low and causes this ShuffleError. In hdfs-site.xml, I set this value to 2048 and it worked in my case:
<property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2048</value>
</property>
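For the new value to take effect, the DataNode processes need to be restarted; on a 0.20-era installation that looks roughly like this (the HADOOP_HOME path is whatever your install uses):

# run on each datanode after editing conf/hdfs-site.xml
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode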

Update: the default value of dfs.datanode.max.xcievers was later updated with this JIRA.

November 02, 2010

How to manipulate (copy/move/rename, etc.) the most recent n files under a directory

In order to list the n most recent files under the current directory (here n = 5):

ls --sort=time -r | tail -n 5

The 5 is the number of files to list; replace it with whatever n you need.

In order to move, copy, delete or otherwise act on the result, this line can be fed into the cp, mv or rm commands through command substitution. The quoting is important: the command has to be wrapped in backquotes, i.e. not the single quotes next to Enter on your keyboard ( ' ), but the backquotes under the Esc key ( ` ).
So here is the command line for moving the 5 most recent files from one directory to another:

mv `ls --sort=time -r | tail -n 5` /home/yasemin/hebelek/
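A side note: this breaks if a file name contains spaces, because the shell splits the substituted list on whitespace. A minimal, more robust sketch of the same move, using the example directory from above:

# Move the 5 most recently modified files, one by one.
# Reading line by line copes with file names that contain spaces.
ls --sort=time -r | tail -n 5 | while IFS= read -r f; do
    mv -- "$f" /home/yasemin/hebelek/
done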