public class JobControlCompiler extends Object
A few words on how comparators are chosen. In almost all cases we use raw comparators (the one exception being when the user provides a comparison function for order by). For order by queries the PigTYPERawComparator functions are used, where TYPE is Int, Long, etc. These comparators are null aware and asc/desc aware. The first byte of each of the NullableTYPEWritable classes contains info on whether the value is null. Asc/desc is written as an array into the JobConf with the key pig.sortOrder so that it can be read by each of the comparators as part of their setConf call.
For non-order by queries, PigTYPEWritableComparator classes are used. These are all just type specific instances of WritableComparator.
Modifier and Type | Field and Description |
---|---|
static String |
BIG_JOB_LOG_MSG |
static String |
END_OF_INP_IN_MAP |
HashMap<String,ArrayList<Pair<String,Long>>> |
globalCounters |
static String |
LOG_DIR |
static String |
PIG_MAP_COUNTER |
static String |
PIG_MAP_RANK_NAME |
static String |
PIG_MAP_SEPARATOR |
static String |
PIG_MAP_STORES
We will serialize the POStore(s) present in map and reduce in lists in
the Hadoop Conf.
|
static String |
PIG_REDUCE_STORES |
static String |
SMALL_JOB_LOG_MSG |
Constructor and Description |
---|
JobControlCompiler(PigContext pigContext,
org.apache.hadoop.conf.Configuration conf) |
JobControlCompiler(PigContext pigContext,
org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.conf.Configuration defaultConf) |
Modifier and Type | Method and Description |
---|---|
void |
adjustNumReducers(MROperPlan plan,
MapReduceOper mro,
org.apache.hadoop.mapreduce.Job nwJob)
Adjust the number of reducers based on the default_parallel, requested parallel and estimated
parallel.
|
org.apache.hadoop.mapred.jobcontrol.JobControl |
compile(MROperPlan plan,
String grpName)
Compiles all jobs that have no dependencies removes them from
the plan and returns.
|
static void |
configureCompression(org.apache.hadoop.conf.Configuration conf) |
static int |
estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job,
MapReduceOper mapReducerOper)
Looks up the estimator from REDUCER_ESTIMATOR_KEY and invokes it to find the number of
reducers to use.
|
static org.apache.hadoop.fs.Path |
getCacheStagingDir(org.apache.hadoop.conf.Configuration conf) |
static org.apache.hadoop.fs.Path |
getFromCache(PigContext pigContext,
org.apache.hadoop.conf.Configuration conf,
URL url) |
Map<org.apache.hadoop.mapred.jobcontrol.Job,MapReduceOper> |
getJobMroMap()
Gets the map of Job and the MR Operator
|
List<POStore> |
getStores(org.apache.hadoop.mapred.jobcontrol.Job job)
Returns all store locations of a previously compiled job
|
void |
moveResults(List<org.apache.hadoop.mapred.jobcontrol.Job> completedJobs)
Moves all the results of a collection of MR jobs to the final
output directory.
|
void |
reset()
Resets the state
|
static void |
setOutputFormat(org.apache.hadoop.mapreduce.Job job) |
int |
updateMROpPlan(List<org.apache.hadoop.mapred.jobcontrol.Job> completeFailedJobs) |
public static final String LOG_DIR
public static final String END_OF_INP_IN_MAP
public static final String PIG_MAP_COUNTER
public static final String PIG_MAP_RANK_NAME
public static final String PIG_MAP_SEPARATOR
public static final String SMALL_JOB_LOG_MSG
public static final String BIG_JOB_LOG_MSG
public static final String PIG_MAP_STORES
public static final String PIG_REDUCE_STORES
public JobControlCompiler(PigContext pigContext, org.apache.hadoop.conf.Configuration conf)
public JobControlCompiler(PigContext pigContext, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.conf.Configuration defaultConf)
public List<POStore> getStores(org.apache.hadoop.mapred.jobcontrol.Job job)
public void reset()
public Map<org.apache.hadoop.mapred.jobcontrol.Job,MapReduceOper> getJobMroMap()
public void moveResults(List<org.apache.hadoop.mapred.jobcontrol.Job> completedJobs) throws IOException
IOException
public org.apache.hadoop.mapred.jobcontrol.JobControl compile(MROperPlan plan, String grpName) throws JobCreationException
plan
- - The MROperPlan to be compiledgrpName
- - The name given to the JobControlJobCreationException
public int updateMROpPlan(List<org.apache.hadoop.mapred.jobcontrol.Job> completeFailedJobs)
public static void configureCompression(org.apache.hadoop.conf.Configuration conf)
public void adjustNumReducers(MROperPlan plan, MapReduceOper mro, org.apache.hadoop.mapreduce.Job nwJob) throws IOException
plan
- the MR planmro
- the MR operatornwJob
- the current jobIOException
public static int estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job, MapReduceOper mapReducerOper) throws IOException
job
- mapReducerOper
- IOException
public static org.apache.hadoop.fs.Path getCacheStagingDir(org.apache.hadoop.conf.Configuration conf) throws IOException
IOException
public static org.apache.hadoop.fs.Path getFromCache(PigContext pigContext, org.apache.hadoop.conf.Configuration conf, URL url) throws IOException
IOException
public static void setOutputFormat(org.apache.hadoop.mapreduce.Job job)
Copyright © 2007-2012 The Apache Software Foundation