org.apache.pig.backend.hadoop.executionengine.mapReduceLayer
Class CombinerOptimizer

java.lang.Object
  extended by org.apache.pig.impl.plan.PlanVisitor<MapReduceOper,MROperPlan>
      extended by org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.MROpPlanVisitor
          extended by org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer

public class CombinerOptimizer
extends MROpPlanVisitor

Optimizes map-reduce plans to use the combiner where possible. Algebraic functions and distinct in the nested plan of a foreach are partially computed in the map and combine phases. A new foreach statement carrying the initial and intermediate forms of the algebraic functions is added to the map and combine plans, respectively. If the bag portion of the group-by result is projected, or a non-algebraic expression or UDF takes the bag as input, the combiner is not used. In such cases the combiner is likely to degrade performance, because the combine stage does not reduce the data size enough to offset the cost of the additional (de)serialization.

Major areas for enhancement:
1. use of the combiner in cogroup
2. queries with order-by, limit or sort in a nested foreach after group-by
3. cases where group-by is followed by a filter that has an algebraic expression
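The split into initial, intermediate and final forms described above can be illustrated with a small standalone sketch. It does not use any Pig classes: the class AlgebraicAvgSketch and its methods are hypothetical names chosen for illustration, and the averaging function is only an assumed example of an algebraic function. It shows why partial computation is safe (the intermediate form merges partial results without changing the final answer) and why it helps (each map task forwards one small partial instead of the whole bag).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of Pig. An average split into three stages. */
public class AlgebraicAvgSketch {

    /** Partial result passed between stages: a running sum and a count. */
    static final class Partial {
        final double sum;
        final long count;

        Partial(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    /** Initial form (map phase): turn one input value into a partial. */
    static Partial initial(double value) {
        return new Partial(value, 1L);
    }

    /** Intermediate form (combine phase): merge many partials into one partial. */
    static Partial intermediate(List<Partial> partials) {
        double sum = 0.0;
        long count = 0L;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return new Partial(sum, count);
    }

    /** Final form (reduce phase): merge the remaining partials and emit the average. */
    static double finalValue(List<Partial> partials) {
        Partial total = intermediate(partials);
        return total.count == 0 ? Double.NaN : total.sum / total.count;
    }

    /** Simulates one map task followed by its combiner: many values in, one partial out. */
    static Partial mapAndCombine(double... values) {
        List<Partial> partials = new ArrayList<>();
        for (double v : values) {
            partials.add(initial(v));
        }
        return intermediate(partials);
    }

    public static void main(String[] args) {
        // Two simulated map tasks; each sends a single partial to the reducer.
        Partial fromTask1 = mapAndCombine(1.0, 2.0, 3.0);
        Partial fromTask2 = mapAndCombine(4.0, 5.0);
        System.out.println(finalValue(Arrays.asList(fromTask1, fromTask2)));
    }
}
```

By contrast, when the bag itself is projected or handed to a non-algebraic function, there is no such compact partial to forward, which is the case in which the optimizer leaves the combiner unused.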


Field Summary
 
Fields inherited from class org.apache.pig.impl.plan.PlanVisitor
mCurrentWalker, mPlan
 
Constructor Summary
CombinerOptimizer(MROperPlan plan, boolean doMapAgg)
           
CombinerOptimizer(MROperPlan plan, boolean doMapAgg, CompilationMessageCollector messageCollector)
           
 
Method Summary
 CompilationMessageCollector getMessageCollector()
           
 void visitMROp(MapReduceOper mr)
           
 
Methods inherited from class org.apache.pig.impl.plan.PlanVisitor
getPlan, popWalker, pushWalker, visit
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CombinerOptimizer

public CombinerOptimizer(MROperPlan plan,
                         boolean doMapAgg)

CombinerOptimizer

public CombinerOptimizer(MROperPlan plan,
                         boolean doMapAgg,
                         CompilationMessageCollector messageCollector)

Method Detail

getMessageCollector

public CompilationMessageCollector getMessageCollector()

visitMROp

public void visitMROp(MapReduceOper mr)
               throws VisitorException
Overrides:
visitMROp in class MROpPlanVisitor
Throws:
VisitorException


Copyright © 2007-2012 The Apache Software Foundation