The first one is
isSplitable, which determines whether a file is splittable or not. The next three variables,
mapred.min.split.size, mapred.max.split.size and dfs.block.size, determine the actual split size used if the input is splittable. By default, the min split size is 0, the max split size is Long.MAX_VALUE and the block size is 64MB. For the actual split size, the min split size sets the lower bound, and the block size and max split size together set the upper bound (unless the min split size is larger than both, in which case it wins). Here is the function used to calculate it: max(minSplitSize, min(maxSplitSize, blockSize)). Note: compressed input files (e.g. gzip) are not splittable, though there are patches available.
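
To see how the three values interact, here is a minimal Java sketch of the formula above. The class name SplitSizeSketch and the helper computeSplitSize are made up for illustration and are not taken from Hadoop's source; the sketch only evaluates max(minSplitSize, min(maxSplitSize, blockSize)) for a few settings, assuming a 64MB block.

```java
public class SplitSizeSketch {
    // Hypothetical helper: evaluates max(minSplitSize, min(maxSplitSize, blockSize)).
    static long computeSplitSize(long minSplitSize, long maxSplitSize, long blockSize) {
        return Math.max(minSplitSize, Math.min(maxSplitSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb; // assumed block size, matching the default quoted above

        // Defaults (min = 0, max = Long.MAX_VALUE): the split size equals the block size.
        System.out.println(computeSplitSize(0L, Long.MAX_VALUE, blockSize) / mb); // 64

        // Max split size below the block size: splits shrink to the max.
        System.out.println(computeSplitSize(0L, 32 * mb, blockSize) / mb); // 32

        // Min split size above the block size: splits grow to the min.
        System.out.println(computeSplitSize(128 * mb, Long.MAX_VALUE, blockSize) / mb); // 128
    }
}
```

So to get smaller splits (more mappers) you lower the max split size below the block size, and to get bigger splits (fewer mappers) you raise the min split size above it.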