Introduction to MapReduce

Post by Admin on Mar 13, 2014 9:09:36 GMT

MapReduce was published by Google in 2004.
Hadoop can run MapReduce programs written in various languages, such as Java, Ruby, Python, and C++.
MapReduce is a parallel programming model for processing huge amounts of data.
Among other things, MapReduce is used to produce structured data out of unstructured data.
MapReduce provides automatic parallelization and distribution, fault tolerance, I/O scheduling, and monitoring and status updates.
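
To make the points above a bit more concrete, here is a minimal sketch in Python of a mapper and a reducer written in the style that Hadoop Streaming expects: text lines in, tab-separated key/value lines out. It turns unstructured log lines into structured status counts. The log format, the regular expression, and the names mapper and reducer are assumptions made up for this example and do not come from the original post.

```python
import re
from itertools import groupby

# Hypothetical log format, assumed for this sketch: <ip> "<request>" <status>
LOG_PATTERN = re.compile(r'^(\S+) "([^"]*)" (\d{3})$')

def mapper(lines):
    """Turn unstructured log lines into 'status<TAB>1' records."""
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:  # lines that do not fit the assumed format are skipped
            yield f"{match.group(3)}\t1"

def reducer(sorted_lines):
    """Sum the counts for each key; the input must already be sorted by key."""
    pairs = (line.split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"

sample = [
    '10.0.0.1 "GET /index.html" 200',
    '10.0.0.2 "GET /missing" 404',
    'garbage line',
    '10.0.0.3 "GET /about.html" 200',
]

# sorted() stands in for the sort the framework performs between map and reduce
result = list(reducer(sorted(mapper(sample))))
print(result)
```

Here everything runs in one process. On a cluster, a streaming mapper and reducer would typically read from standard input and write to standard output, while the framework takes care of distribution, sorting, and re-running failed tasks.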

MapReduce Overview:

a) Applications that process data on Hadoop are written using the MapReduce paradigm.
b) A MapReduce job usually splits the input data set into independent chunks, which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file system. The framework takes care of scheduling tasks, monitoring them, and re-executing the failed tasks.
c) MapReduce applications specify the input/output locations and supply map and reduce functions via implementations of appropriate Hadoop interfaces such as Mapper and Reducer. These and other parameters comprise the job configuration. The Hadoop job client then submits the job (jar, executable, etc.) and configuration to the JobTracker, which then assumes the responsibility of distributing the software and configuration to the slaves, scheduling tasks and monitoring them, and providing status and diagnostic information to the job client.
d) The MapReduce framework operates exclusively on <key, value> pairs; that is, the framework views the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as the output of the job, conceivably of different types.
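
The flow described in points b) and d) can be sketched in a few lines of plain Python. This is only an in-memory illustration, not Hadoop code: the names split_input, map_fn, reduce_fn, and run_job are hypothetical, the chunks are processed one after another rather than in parallel, and word counting is just a convenient example. Note that the input pairs are <line number, text> while the output pairs are <word, count>, so the types differ.

```python
from itertools import groupby
from operator import itemgetter

def map_fn(key, value):
    """Input pair: <line number (int), line text (str)>. Emits <word, 1> pairs."""
    for word in value.lower().split():
        yield word, 1

def reduce_fn(key, values):
    """Input: a word and all counts emitted for it. Emits one <word, total> pair."""
    yield key, sum(values)

def split_input(records, chunk_size):
    """Split the input into independent chunks, one per (simulated) map task."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]

def run_job(records, chunk_size=2):
    # Map phase: each chunk is independent, so a cluster could run these in parallel
    mapped = []
    for chunk in split_input(records, chunk_size):
        for key, value in chunk:
            mapped.extend(map_fn(key, value))

    # Sort phase: bring all pairs with the same key next to each other
    mapped.sort(key=itemgetter(0))

    # Reduce phase: one reduce call per distinct key
    output = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        output.extend(reduce_fn(key, (value for _, value in group)))
    return output

lines = ["the quick brown fox", "the lazy dog", "the fox"]
records = list(enumerate(lines))  # <line number, text> input pairs
counts = run_job(records)
print(counts)
```

Changing chunk_size changes how the input is split but not the result, which is the property that lets the map tasks run independently.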
