org.apache.pig.backend.hadoop.executionengine.mapReduceLayer
Class Launcher

java.lang.Object
  extended by org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher
Direct Known Subclasses:
MapReduceLauncher

public abstract class Launcher
extends Object


Constructor Summary
protected Launcher()
           
 
Method Summary
protected  double calculateProgress(org.apache.hadoop.mapred.jobcontrol.JobControl jc, org.apache.hadoop.mapred.JobClient jobClient)
          Computes the progress of the current job submitted through the JobControl object jc to the JobClient jobClient.
protected  long computeTimeSpent(org.apache.hadoop.mapred.TaskReport[] mapReports)
           
abstract  void explain(PhysicalPlan pp, PigContext pc, PrintStream ps, String format, boolean verbose)
          Explain how a pig job will be executed on the underlying infrastructure.
protected  void getErrorMessages(org.apache.hadoop.mapred.TaskReport[] reports, String type, boolean errNotDbg, PigContext pigContext)
           
protected  String getFirstLineFromMessage(String message)
           
 StackTraceElement getStackTraceElement(String line)
           
protected  void getStats(org.apache.hadoop.mapred.jobcontrol.Job job, org.apache.hadoop.mapred.JobClient jobClient, boolean errNotDbg, PigContext pigContext)
           
 long getTotalHadoopTimeSpent()
           
protected  boolean isComplete(double prog)
           
abstract  PigStats launchPig(PhysicalPlan php, String grpName, PigContext pc)
          Method to launch pig for hadoop either for a cluster's job tracker or for a local job runner.
protected  double progressOfRunningJob(org.apache.hadoop.mapred.jobcontrol.Job j, org.apache.hadoop.mapred.JobClient jobClient)
          Returns the progress of a Job j which is part of a submitted JobControl object.
 void reset()
          Resets the state after a launch
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Launcher

protected Launcher()
Method Detail

reset

public void reset()
Resets the state after a launch


launchPig

public abstract PigStats launchPig(PhysicalPlan php,
                                   String grpName,
                                   PigContext pc)
                            throws PlanException,
                                   VisitorException,
                                   IOException,
                                   ExecException,
                                   JobCreationException,
                                   Exception
Launches Pig on Hadoop, either against a cluster's job tracker or against a local job runner. The only difference between the two is the job client: depending on the Pig context, the job client is initialized to one of the two. Launchers for other frameworks can override these methods.

Given an input PhysicalPlan, the method compiles it into a MapReduce plan (MROperPlan). The plan contains multiple MapReduce operators, each of which has to be run as a MapReduce job, with the dependency information stored in the plan. The MROperPlan is then compiled into a JobControl object, which is obtained from the JobControlCompiler. Each MapReduce operator is converted into a Job and added to the JobControl object, and each Job also has a set of dependent Jobs that are created from the MROperPlan. A new thread is then spawned that submits these jobs while respecting the dependency information. The parent thread monitors the progress of the submitted jobs and, once they are complete, stops the JobControl thread.
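The submit-and-monitor pattern described above can be sketched in plain Java. This is only an illustration under stated assumptions: SketchJob and LaunchSketch are hypothetical stand-ins, not Hadoop's Job or JobControl classes, "running" a job is reduced to recording its name, and the dependency graph is assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the pattern; NOT Hadoop's JobControl.
public class LaunchSketch {

    // A job that may only run once all of the jobs it depends on are done.
    static final class SketchJob {
        final String name;
        final List<SketchJob> dependsOn;
        volatile boolean done = false;

        SketchJob(String name, SketchJob... deps) {
            this.name = name;
            this.dependsOn = Arrays.asList(deps);
        }

        boolean ready() {
            for (SketchJob d : dependsOn) {
                if (!d.done) return false;
            }
            return true;
        }
    }

    // Runs all jobs on a separate thread in dependency order and
    // returns the order in which they were run.
    static List<String> launch(List<SketchJob> jobs) {
        List<String> order = new CopyOnWriteArrayList<>();
        AtomicBoolean stop = new AtomicBoolean(false);

        // Worker thread: keeps submitting whatever is runnable until stopped.
        Thread submitter = new Thread(() -> {
            while (!stop.get()) {
                for (SketchJob j : jobs) {
                    if (!j.done && j.ready()) {
                        order.add(j.name); // "run" the job
                        j.done = true;
                    }
                }
                Thread.yield();
            }
        });
        submitter.start();

        // Parent thread: polls until every job is done, then stops the worker.
        try {
            while (jobs.stream().anyMatch(j -> !j.done)) {
                Thread.sleep(5);
            }
            stop.set(true);
            submitter.join();
        } catch (InterruptedException e) {
            stop.set(true);
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(order);
    }
}
```

Passing the jobs in any order still yields a dependency-respecting run order, because a job is only run once ready() reports that everything it depends on has finished.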

Parameters:
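php - the input PhysicalPlan to compile and launch
grpName -
pc - the PigContext that determines which job client is used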
Throws:
PlanException
VisitorException
IOException
ExecException
JobCreationException
Exception

explain

public abstract void explain(PhysicalPlan pp,
                             PigContext pc,
                             PrintStream ps,
                             String format,
                             boolean verbose)
                      throws PlanException,
                             VisitorException,
                             IOException
Explain how a pig job will be executed on the underlying infrastructure.

Parameters:
pp - PhysicalPlan to explain
pc - PigContext to use for configuration
ps - PrintStream to write output on.
format - Format to write in
verbose - Amount of information to print
Throws:
VisitorException
IOException
PlanException

isComplete

protected boolean isComplete(double prog)

getStats

protected void getStats(org.apache.hadoop.mapred.jobcontrol.Job job,
                        org.apache.hadoop.mapred.JobClient jobClient,
                        boolean errNotDbg,
                        PigContext pigContext)
                 throws Exception
Throws:
Exception

computeTimeSpent

protected long computeTimeSpent(org.apache.hadoop.mapred.TaskReport[] mapReports)

getErrorMessages

protected void getErrorMessages(org.apache.hadoop.mapred.TaskReport[] reports,
                                String type,
                                boolean errNotDbg,
                                PigContext pigContext)
                         throws Exception
Throws:
Exception

calculateProgress

protected double calculateProgress(org.apache.hadoop.mapred.jobcontrol.JobControl jc,
                                   org.apache.hadoop.mapred.JobClient jobClient)
                            throws IOException
Computes the progress of the current job submitted through the JobControl object jc to the JobClient jobClient.

Parameters:
jc - The JobControl object that has been submitted
jobClient - The JobClient to which it has been submitted
Returns:
The progress as a percentage in double format
Throws:
IOException
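
One plausible way to aggregate progress across a batch of jobs is sketched below. It assumes, without confirmation from this page, that each finished job counts as 1.0, that each running job contributes a fraction between 0 and 1, and that the sum is divided by the total number of jobs. ProgressSketch and overallProgress are hypothetical names, and plain numbers stand in for the JobControl and JobClient objects.

```java
// Hypothetical helper; the real method reads these values from JobControl/JobClient.
public class ProgressSketch {

    // successfulJobs:  number of jobs that have already finished successfully
    // runningProgress: progress of each running job, each in the range [0, 1]
    // totalJobs:       number of jobs in the whole batch
    static double overallProgress(int successfulJobs, double[] runningProgress, int totalJobs) {
        if (totalJobs <= 0) {
            throw new IllegalArgumentException("totalJobs must be positive");
        }
        double sum = successfulJobs; // each finished job counts as 1.0
        for (double p : runningProgress) {
            sum += p;
        }
        return sum / totalJobs; // fraction of the whole batch completed
    }
}
```

With two finished jobs, two running jobs at 0.5 and 0.25, and four jobs in total, the sketch reports (2 + 0.5 + 0.25) / 4 = 0.6875.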

progressOfRunningJob

protected double progressOfRunningJob(org.apache.hadoop.mapred.jobcontrol.Job j,
                                      org.apache.hadoop.mapred.JobClient jobClient)
                               throws IOException
Returns the progress of a Job j that is part of a submitted JobControl object. The progress is for this Job alone, so the caller has to scale it down by the number of jobs present in the JobControl.

Parameters:
j - The Job for which progress is required
jobClient - The JobClient to which it has been submitted
Returns:
Returns the percentage progress of this Job
Throws:
IOException
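
The scaling mentioned above can be illustrated with a small sketch. It assumes that a job's progress is the average of its map-phase and reduce-phase progress, which this page does not state. JobProgressSketch, jobProgress, and scaledContribution are hypothetical names.

```java
// Hypothetical helper illustrating per-job progress and its share of a batch.
public class JobProgressSketch {

    // Assumption: a job's progress is the mean of its map and reduce progress,
    // each given as a fraction in the range [0, 1].
    static double jobProgress(double mapProgress, double reduceProgress) {
        return (mapProgress + reduceProgress) / 2.0;
    }

    // Scales a single job's progress down by the number of jobs in the batch,
    // giving that job's contribution to the overall progress.
    static double scaledContribution(double jobProgress, int jobsInControl) {
        if (jobsInControl <= 0) {
            throw new IllegalArgumentException("jobsInControl must be positive");
        }
        return jobProgress / jobsInControl;
    }
}
```

A job whose map phase is complete and whose reduce phase is half done has progress 0.75 under this assumption; in a batch of three jobs it contributes 0.25 to the overall figure.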

getTotalHadoopTimeSpent

public long getTotalHadoopTimeSpent()

getStackTraceElement

public StackTraceElement getStackTraceElement(String line)
                                       throws Exception
Parameters:
line - the string representation of a stack trace returned by printStackTrace
Returns:
the StackTraceElement object representing the stack trace
Throws:
Exception
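
As an illustration of turning one line of printStackTrace output into a StackTraceElement, here is a minimal regex-based sketch. It is not the implementation behind this method: StackLineSketch and parse are hypothetical names, and the pattern only covers the common "at package.Class.method(File.java:123)" shape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for a single stack frame line.
public class StackLineSketch {

    // Groups: 1 = declaring class, 2 = method, 3 = file name, 4 = optional line number.
    private static final Pattern FRAME = Pattern.compile(
        "^\\s*at\\s+([\\w.$]+)\\.([\\w$<>]+)\\(([^:()]+)(?::(\\d+))?\\)\\s*$");

    static StackTraceElement parse(String line) {
        Matcher m = FRAME.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a stack frame line: " + line);
        }
        String declaringClass = m.group(1);
        String methodName = m.group(2);
        String fileName = m.group(3);
        // Frames without a line number get -1.
        int lineNumber = (m.group(4) != null) ? Integer.parseInt(m.group(4)) : -1;
        return new StackTraceElement(declaringClass, methodName, fileName, lineNumber);
    }
}
```

Lines that do not match the frame pattern, such as the exception message at the top of a trace, are rejected with an IllegalArgumentException rather than guessed at.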

getFirstLineFromMessage

protected String getFirstLineFromMessage(String message)


Copyright © ${year} The Apache Software Foundation