org.apache.pig.backend.hadoop.executionengine.mapReduceLayer
Class InputSizeReducerEstimator

java.lang.Object
  extended by org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
All Implemented Interfaces:
PigReducerEstimator

public class InputSizeReducerEstimator
extends Object
implements PigReducerEstimator

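Estimates the number of reducers based on input size. The estimate depends on two properties:

pig.exec.reducers.bytes.per.reducer - the number of input bytes per reducer (default 1000*1000*1000; see BYTES_PER_REDUCER_PARAM)
pig.exec.reducers.max - the maximum number of reducers (default 999; see MAX_REDUCER_COUNT_PARAM)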

If the loader implements LoadMetadata, the input size it reports is used; otherwise, the estimator attempts to determine the size from the filesystem.

For example, consider the following Pig script:

 a = load '/data/a';
 b = load '/data/b';
 c = join a by $0, b by $0;
 store c into '/tmp';
 
If the size of /data/a is 1000*1000*1000 bytes and the size of /data/b is 2*1000*1000*1000 bytes, then with 1000*1000*1000 bytes per reducer the estimated number of reducers is (1000*1000*1000 + 2*1000*1000*1000) / (1000*1000*1000) = 3.
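
The arithmetic above can be sketched in plain Java. This is a minimal illustration, not the Pig implementation: the class ReducerEstimateSketch and its estimate method are hypothetical names, and the rounding up, the lower bound of one reducer, and the handling of a negative size as "unknown" are assumptions made for the sketch.

```java
// Hypothetical, standalone sketch of an input-size-based reducer estimate.
// It does not use any Pig or Hadoop classes.
class ReducerEstimateSketch {

    /**
     * @param inputSizes      size in bytes of each input; a negative value
     *                        means the size is unknown (assumption)
     * @param bytesPerReducer number of input bytes handled by one reducer
     * @param maxReducers     upper bound on the number of reducers
     * @return the estimated reducer count, or -1 if any input size is unknown
     */
    static int estimate(long[] inputSizes, long bytesPerReducer, int maxReducers) {
        long total = 0;
        for (long size : inputSizes) {
            if (size < 0) {
                return -1; // size could not be determined, so no estimate
            }
            total += size;
        }
        // Round up so that a partial chunk of input still gets a reducer (assumption).
        int reducers = (int) Math.ceil((double) total / bytesPerReducer);
        // Keep the result between 1 and maxReducers (assumption).
        reducers = Math.max(1, reducers);
        return Math.min(maxReducers, reducers);
    }

    public static void main(String[] args) {
        long[] sizes = { 1000L * 1000 * 1000, 2L * 1000 * 1000 * 1000 };
        // Same inputs as the example above: /data/a and /data/b.
        System.out.println(estimate(sizes, 1000L * 1000 * 1000, 999)); // prints 3
    }
}
```

With a smaller cap, for instance a maxReducers of 2, the same inputs would yield 2 in this sketch, which shows how the maximum reducer count constrains the size-based estimate.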


Field Summary
 
Fields inherited from interface org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigReducerEstimator
BYTES_PER_REDUCER_PARAM, DEFAULT_BYTES_PER_REDUCER, DEFAULT_MAX_REDUCER_COUNT_PARAM, MAX_REDUCER_COUNT_PARAM
 
Constructor Summary
InputSizeReducerEstimator()
           
 
Method Summary
 int estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job, MapReduceOper mapReduceOper)
          Determines the number of reducers to be used.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InputSizeReducerEstimator

public InputSizeReducerEstimator()
Method Detail

estimateNumberOfReducers

public int estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job,
                                    MapReduceOper mapReduceOper)
                             throws IOException
Determines the number of reducers to be used.

Specified by:
estimateNumberOfReducers in interface PigReducerEstimator
Parameters:
job - job instance
mapReduceOper -
Returns:
the number of reducers to use, or -1 if the count couldn't be estimated
Throws:
IOException


Copyright © 2007-2012 The Apache Software Foundation