public class InputSizeReducerEstimator extends Object implements PigReducerEstimator
For example, suppose this is your Pig script:
a = load '/data/a';
b = load '/data/b';
c = join a by $0, b by $0;
store c into '/tmp';
If the size of /data/a is 1000*1000*1000 bytes and the size of /data/b is 2*1000*1000*1000 bytes, then the estimated number of reducers is the total input size divided by the bytes allotted to each reducer: (1000*1000*1000 + 2*1000*1000*1000) / (1000*1000*1000) = 3.
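The arithmetic in the example can be sketched in plain Java. This is an illustration only, not the class's actual source: the class name `ReducerEstimateSketch`, the method `estimate`, and both constants are hypothetical, and rounding up, the floor of one reducer, and the cap of 999 are assumptions that the text above does not confirm.

```java
// Hypothetical sketch of an input-size-based reducer estimate.
// Not the Pig implementation; names and limits below are assumptions.
final class ReducerEstimateSketch {

    // Assumed bytes of input handled by one reducer (matches the divisor in the example).
    static final long ASSUMED_BYTES_PER_REDUCER = 1000L * 1000 * 1000;

    // Assumed upper bound on the reducer count (not stated in the text above).
    static final int ASSUMED_MAX_REDUCERS = 999;

    // Sums the input sizes, divides by bytesPerReducer (rounding up),
    // then clamps the result to the range [1, maxReducers].
    static int estimate(long[] inputSizes, long bytesPerReducer, int maxReducers) {
        long total = 0;
        for (long size : inputSizes) {
            total += size;
        }
        int reducers = (int) Math.ceil((double) total / bytesPerReducer);
        reducers = Math.max(1, reducers);
        return Math.min(maxReducers, reducers);
    }

    public static void main(String[] args) {
        // Sizes of /data/a and /data/b from the example.
        long[] sizes = {1000L * 1000 * 1000, 2L * 1000 * 1000 * 1000};
        System.out.println(estimate(sizes, ASSUMED_BYTES_PER_REDUCER, ASSUMED_MAX_REDUCERS)); // prints 3
    }
}
```

With the two input sizes from the example, the sketch returns 3, which agrees with the worked calculation.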
BYTES_PER_REDUCER_PARAM, DEFAULT_BYTES_PER_REDUCER, DEFAULT_MAX_REDUCER_COUNT_PARAM, MAX_REDUCER_COUNT_PARAM
Constructor and Description |
---|
InputSizeReducerEstimator() |
Modifier and Type | Method and Description |
---|---|
int | estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job, MapReduceOper mapReduceOper): Determines the number of reducers to be used. |
public int estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job, MapReduceOper mapReduceOper) throws IOException
Specified by: estimateNumberOfReducers in interface PigReducerEstimator
Parameters:
job - job instance
mapReduceOper -
Throws:
IOException
Copyright © 2007-2012 The Apache Software Foundation