Post by Admin on Mar 14, 2014 6:44:08 GMT
The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks in the input files. Generally it is around 10-100 maps per node. Task setup takes a while, so it is best if each map takes at least a minute to execute. For example, if you expect 10 TB of input data with a block size of 128 MB, you'll end up with about 82,000 maps (10 TB / 128 MB ≈ 81,920 blocks). To control the number of maps you can use the mapreduce.job.maps parameter, which only provides a hint to the framework. Ultimately, the number of map tasks is determined by the number of splits returned by the InputFormat.getSplits() method (which you can override).
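
As a minimal sketch of passing that hint when setting up a job (the job name below is just a placeholder), it might look like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
// Suggest a map count; this is only a hint -- the actual number of map
// tasks comes from the splits computed by InputFormat.getSplits().
conf.setInt("mapreduce.job.maps", 100);
Job job = Job.getInstance(conf, "example job"); // hypothetical job name

If you need to actually change how many splits are produced, you would adjust the block size or override getSplits() in a custom InputFormat rather than rely on this parameter.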