Tuesday, July 14, 2015

Hadoop Combiner

The Hadoop combiner reduces the amount of data transferred over the network between the map and reduce phases of a MapReduce job.

The combiner sits between the Mapper and the Reducer, so the workflow is Mapper -> Combiner -> Reducer.

The combiner acts as a mini-reducer: the output from the Mapper is sent to the Combiner, and the Combiner's output is sent to the Reducer.
 
Example: Suppose a Mapper emits the word count ("Hello",1) three times. Instead of passing the three pairs ("Hello",1), ("Hello",1), ("Hello",1) to the reducer, the combiner passes a single pair ("Hello",3). This reduces the overhead for the reducer.
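The example above can be sketched in a few lines of plain Python. This is only a simulation of the idea and does not use the Hadoop API; the function names map_words and combine are made up for illustration.

```python
from collections import defaultdict

def map_words(line):
    """Mapper (hypothetical helper): emit a (word, 1) pair for every word."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner (hypothetical helper): sum counts per key for ONE mapper's output."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return sorted(totals.items())

mapper_output = map_words("Hello Hello Hello")
combined = combine(mapper_output)

print(mapper_output)  # [('Hello', 1), ('Hello', 1), ('Hello', 1)]
print(combined)       # [('Hello', 3)]
```

Only one pair leaves the map side instead of three, which is where the saving comes from.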

Difference between Combiner and Reducer
  • Combiners can be used only with functions that are commutative (a.b == b.a) and associative ((a.b).c == a.(b.c)).
  • A reducer can receive input from multiple mappers, but a combiner receives input from only one mapper.
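Both points can be illustrated with a small plain-Python simulation (again not the Hadoop API; group_and_apply, mapper_a and mapper_b are hypothetical names). Each combiner sees one mapper's output, while the reducer sees the merged output of all mappers. Summing gives the same result with or without the combiner step, but averaging does not, because a mean of per-mapper means is not the overall mean.

```python
from collections import defaultdict

def group_and_apply(pairs, func):
    """Group values by key and apply func to each key's list of values."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return [(key, func(values)) for key, values in sorted(groups.items())]

def mean(values):
    return sum(values) / len(values)

# Output of two separate mappers (made-up data).
mapper_a = [("x", 1), ("x", 2), ("x", 3)]
mapper_b = [("x", 10)]

# Sum: combine per mapper, then reduce across mappers.
sum_direct = group_and_apply(mapper_a + mapper_b, sum)
sum_combined = group_and_apply(
    group_and_apply(mapper_a, sum) + group_and_apply(mapper_b, sum), sum)

# Mean: the same pipeline gives a different (wrong) answer.
mean_direct = group_and_apply(mapper_a + mapper_b, mean)
mean_combined = group_and_apply(
    group_and_apply(mapper_a, mean) + group_and_apply(mapper_b, mean), mean)

print(sum_direct, sum_combined)    # [('x', 16)] [('x', 16)]
print(mean_direct, mean_combined)  # [('x', 4.0)] [('x', 6.0)]
```

For an operation such as the mean, one common workaround is to have the combiner emit partial (sum, count) pairs and let the reducer do the final division.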
