public class InputSizeReducerEstimator extends Object implements PigReducerEstimator
For example, given the following Pig script:
a = load '/data/a';
b = load '/data/b';
c = join a by $0, b by $0;
store c into '/tmp';
If the size of /data/a is 1000*1000*1000 bytes and the size of /data/b is 2*1000*1000*1000 bytes, then with 1000*1000*1000 bytes per reducer the estimated number of reducers is (1000*1000*1000 + 2*1000*1000*1000) / (1000*1000*1000) = 3.
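The arithmetic in this example can be sketched in plain Java. This is only an illustration under stated assumptions, not Pig's actual implementation: the class `ReducerEstimateSketch`, the method `estimate`, and the cap value passed in `main` are hypothetical, and rounding up, the lower bound of one reducer, and the upper cap are assumed behaviors rather than anything this page documents.

```java
public class ReducerEstimateSketch {

    /**
     * Hypothetical helper (not part of Pig's API): divides the total input
     * size by the bytes-per-reducer setting, rounding up, then clamps the
     * result to the range [1, maxReducers]. Rounding and clamping are
     * assumptions made for this sketch.
     */
    static int estimate(long totalInputBytes, long bytesPerReducer, int maxReducers) {
        if (bytesPerReducer <= 0 || maxReducers <= 0) {
            throw new IllegalArgumentException("bytesPerReducer and maxReducers must be positive");
        }
        // Integer ceiling division.
        long reducers = (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer;
        // Always use at least one reducer.
        reducers = Math.max(1L, reducers);
        // Never exceed the configured cap.
        return (int) Math.min((long) maxReducers, reducers);
    }

    public static void main(String[] args) {
        long sizeA = 1000L * 1000 * 1000;           // size of /data/a
        long sizeB = 2L * 1000 * 1000 * 1000;       // size of /data/b
        long bytesPerReducer = 1000L * 1000 * 1000; // divisor used in the example
        int maxReducers = 999;                      // assumed cap, for illustration only

        // (1000*1000*1000 + 2*1000*1000*1000) / (1000*1000*1000) = 3
        System.out.println(estimate(sizeA + sizeB, bytesPerReducer, maxReducers));
    }
}
```

In a real job the two inputs to this calculation would presumably come from the configuration keys named by BYTES_PER_REDUCER_PARAM and MAX_REDUCER_COUNT_PARAM, with the total size supplied by getTotalInputFileSize.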
Fields inherited from interface PigReducerEstimator:
BYTES_PER_REDUCER_PARAM, DEFAULT_BYTES_PER_REDUCER, DEFAULT_MAX_REDUCER_COUNT_PARAM, MAX_REDUCER_COUNT_PARAM
Constructor and Description |
---|
InputSizeReducerEstimator() |
Modifier and Type | Method and Description |
---|---|
int | estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job, MapReduceOper mapReduceOper) Determines the number of reducers to be used. |
static long | getTotalInputFileSize(org.apache.hadoop.conf.Configuration conf, List<POLoad> lds, org.apache.hadoop.mapreduce.Job job) |
public int estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job, MapReduceOper mapReduceOper) throws IOException
Specified by: estimateNumberOfReducers in interface PigReducerEstimator
Parameters:
job - job instance
mapReduceOper -
Throws: IOException
public static long getTotalInputFileSize(org.apache.hadoop.conf.Configuration conf, List<POLoad> lds, org.apache.hadoop.mapreduce.Job job) throws IOException
Throws: IOException
Copyright © 2007-2017 The Apache Software Foundation