org.apache.pig.backend.hadoop.executionengine.physicalLayer
Class PhysicalOperator

java.lang.Object
  extended by org.apache.pig.impl.plan.Operator<PhyPlanVisitor>
      extended by org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
All Implemented Interfaces:
Serializable, Cloneable, Comparable<Operator>, Illustrable
Direct Known Subclasses:
ExpressionOperator, POCollectedGroup, POCounter, POCross, PODemux, PODistinct, POFilter, POForEach, POFRJoin, POGlobalRearrange, POLimit, POLoad, POLocalRearrange, POMergeCogroup, POMergeJoin, PONative, POPackage, POPartialAgg, POPreCombinerLocalRearrange, PORank, POSkewedJoin, POSort, POSplit, POStore, POStream, POUnion

public abstract class PhysicalOperator
extends Operator<PhyPlanVisitor>
implements Illustrable, Cloneable

This is the base class for all physical operators. It supports a generic way of processing inputs, which operators extending this class can override. The input model assumes that input either comes from a predecessor operator or is attached directly to this operator. Inputs to an operator are also assumed to always be in the form of a tuple.

The pipeline uses a pull-based model: the root operator calls getNext with the appropriate type, which initiates a cascade of getNext calls that unwind to create the input for the root operator to work on. Any operator that extends PhysicalOperator supports a getNext for each of the data types. A concrete implementation should use the result type of its input operator to decide which typed getNext to call, which avoids switch/case on the type as much as possible. By default, each typed getNext returns an erroneous Result corresponding to an unsupported operation on that type, so operators need to implement only the types they support.
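The pull-based model described above can be illustrated with a minimal sketch. It does not use the Pig classes: `PullModelSketch`, `Status`, `Result`, `Op`, `Source`, `Doubler` and `drain` are all hypothetical stand-ins, and only two typed getNext methods are modeled. It shows the root pulling values through a chain of operators, and the default typed method returning an error result for an unsupported type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins; none of these types are Pig classes.
class PullModelSketch {
    // Assumed status codes for this sketch only.
    enum Status { OK, EOP, ERR }

    static final class Result {
        final Status status;
        final Object value;
        Result(Status status, Object value) { this.status = status; this.value = value; }
    }

    // Base operator: every typed getNext defaults to an error Result,
    // so subclasses override only the types they support.
    static abstract class Op {
        protected final Op input;
        Op(Op input) { this.input = input; }
        Result getNextInteger() { return new Result(Status.ERR, null); }
        Result getNextString() { return new Result(Status.ERR, null); }
    }

    // Leaf operator that produces integers from an in-memory list.
    static final class Source extends Op {
        private final Iterator<Integer> it;
        Source(List<Integer> data) { super(null); this.it = data.iterator(); }
        @Override Result getNextInteger() {
            return it.hasNext() ? new Result(Status.OK, it.next()) : new Result(Status.EOP, null);
        }
    }

    // Pipeline operator: pulls one integer from its input per call and doubles it.
    static final class Doubler extends Op {
        Doubler(Op input) { super(input); }
        @Override Result getNextInteger() {
            Result r = input.getNextInteger();
            if (r.status != Status.OK) return r; // pass end-of-input or errors upward
            return new Result(Status.OK, (Integer) r.value * 2);
        }
    }

    // The root drives execution: each call cascades down to the leaf.
    static List<Integer> drain(Op root) {
        List<Integer> out = new ArrayList<>();
        for (Result r = root.getNextInteger(); r.status == Status.OK; r = root.getNextInteger()) {
            out.add((Integer) r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        Op root = new Doubler(new Source(Arrays.asList(1, 2, 3)));
        System.out.println(drain(root));                 // values pulled through the chain
        System.out.println(root.getNextString().status); // unsupported type in this sketch
    }
}
```

In this sketch `Doubler` never overrides `getNextString`, so a caller asking for a string gets the default error result rather than a compile-time failure, which mirrors the "implement only the supported types" convention described above.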

See Also:
Serialized Form

Nested Class Summary
static class PhysicalOperator.OriginalLocation
           
 
Field Summary
protected  String alias
           
protected  Illustrator illustrator
           
protected  Tuple input
           
protected  boolean inputAttached
           
protected  List<PhysicalOperator> inputs
           
protected  LineageTracer lineageTracer
           
protected  List<PhysicalOperator> outputs
           
protected  PhysicalPlan parentPlan
           
protected static PigLogger pigLogger
           
protected  int requestedParallelism
           
protected  Result res
           
protected  byte resultType
           
protected static long serialVersionUID
           
 
Fields inherited from class org.apache.pig.impl.plan.Operator
mKey
 
Constructor Summary
PhysicalOperator(OperatorKey k)
           
PhysicalOperator(OperatorKey k, int rp)
           
PhysicalOperator(OperatorKey k, int rp, List<PhysicalOperator> inp)
           
PhysicalOperator(OperatorKey k, List<PhysicalOperator> inp)
           
 
Method Summary
 void addOriginalLocation(String alias, List<PhysicalOperator.OriginalLocation> originalLocations)
           
 void addOriginalLocation(String alias, SourceLocation sourceLocation)
           
 void attachInput(Tuple t)
          Short-circuits the input path of this operator by providing the input tuple directly
 PhysicalOperator clone()
          Make a deep copy of this operator.
protected  void cloneHelper(PhysicalOperator op)
           
 void detachInput()
          Detaches any tuples that are attached
 String getAlias()
           
protected  String getAliasString()
           
 Illustrator getIllustrator()
           
 List<PhysicalOperator> getInputs()
           
 org.apache.commons.logging.Log getLogger()
           
 Result getNext(byte dataType)
          Implementations that call into the different versions of getNext are often identical, differing only in the signature of the getNext() call they make.
 Result getNextBigDecimal()
           
 Result getNextBigInteger()
           
 Result getNextBoolean()
           
 Result getNextDataBag()
           
 Result getNextDataByteArray()
           
 Result getNextDateTime()
           
 Result getNextDouble()
           
 Result getNextFloat()
           
 Result getNextInteger()
           
 Result getNextLong()
           
 Result getNextMap()
           
 Result getNextString()
           
 Result getNextTuple()
           
 List<PhysicalOperator.OriginalLocation> getOriginalLocations()
           
static PigLogger getPigLogger()
           
static PigProgressable getReporter()
           
 int getRequestedParallelism()
           
 byte getResultType()
           
 boolean isAccumStarted()
           
 boolean isAccumulative()
           
 boolean isBlocking()
          A blocking operator should override this to return true.
 boolean isInputAttached()
           
 Result processInput()
          A generic method for parsing input that either returns the attached input if it exists or fetches it from its predecessor.
 void reset()
          Reset internal state in an operator.
 void setAccumEnd()
           
 void setAccumStart()
           
 void setAccumulative()
           
 void setIllustrator(Illustrator illustrator)
           
 void setInputs(List<PhysicalOperator> inputs)
           
 void setParentPlan(PhysicalPlan physicalPlan)
           
static void setPigLogger(PigLogger logger)
           
static void setReporter(PigProgressable reporter)
           
 void setRequestedParallelism(int requestedParallelism)
           
 void setResultType(byte resultType)
           
abstract  void visit(PhyPlanVisitor v)
          Visit this node with the provided visitor.
 
Methods inherited from class org.apache.pig.impl.plan.Operator
compareTo, equals, getOperatorKey, getProjectionMap, hashCode, name, regenerateProjectionMap, rewire, supportsMultipleInputs, supportsMultipleOutputs, toString, unsetProjectionMap
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.pig.pen.Illustrable
illustratorMarkup
 

Field Detail

serialVersionUID

protected static final long serialVersionUID
See Also:
Constant Field Values

requestedParallelism

protected int requestedParallelism

inputs

protected List<PhysicalOperator> inputs

outputs

protected List<PhysicalOperator> outputs

resultType

protected byte resultType

parentPlan

protected PhysicalPlan parentPlan

inputAttached

protected boolean inputAttached

input

protected Tuple input

res

protected Result res

alias

protected String alias

pigLogger

protected static PigLogger pigLogger

lineageTracer

protected LineageTracer lineageTracer

illustrator

protected transient Illustrator illustrator
Constructor Detail

PhysicalOperator

public PhysicalOperator(OperatorKey k)

PhysicalOperator

public PhysicalOperator(OperatorKey k,
                        int rp)

PhysicalOperator

public PhysicalOperator(OperatorKey k,
                        List<PhysicalOperator> inp)

PhysicalOperator

public PhysicalOperator(OperatorKey k,
                        int rp,
                        List<PhysicalOperator> inp)
Method Detail

setIllustrator

public void setIllustrator(Illustrator illustrator)
Specified by:
setIllustrator in interface Illustrable

getIllustrator

public Illustrator getIllustrator()

getRequestedParallelism

public int getRequestedParallelism()

setRequestedParallelism

public void setRequestedParallelism(int requestedParallelism)

getResultType

public byte getResultType()

getAlias

public String getAlias()

getAliasString

protected String getAliasString()

addOriginalLocation

public void addOriginalLocation(String alias,
                                SourceLocation sourceLocation)

addOriginalLocation

public void addOriginalLocation(String alias,
                                List<PhysicalOperator.OriginalLocation> originalLocations)

getOriginalLocations

public List<PhysicalOperator.OriginalLocation> getOriginalLocations()

setAccumulative

public void setAccumulative()

isAccumulative

public boolean isAccumulative()

setAccumStart

public void setAccumStart()

isAccumStarted

public boolean isAccumStarted()

setAccumEnd

public void setAccumEnd()

setResultType

public void setResultType(byte resultType)

getInputs

public List<PhysicalOperator> getInputs()

setInputs

public void setInputs(List<PhysicalOperator> inputs)

isInputAttached

public boolean isInputAttached()

attachInput

public void attachInput(Tuple t)
Short-circuits the input path of this operator by providing the input tuple directly

Parameters:
t - The tuple that should be used as input

detachInput

public void detachInput()
Detaches any tuples that are attached


isBlocking

public boolean isBlocking()
A blocking operator should override this to return true. Blocking operators are those that need the full bag before they can operate on the tuples inside the bag. An example is the Global Rearrange. Non-blocking, or pipeline, operators are those that work on a tuple-by-tuple basis.

Returns:
true if blocking and false otherwise
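The difference between blocking and pipeline operators can be seen in a small sketch. Everything here is hypothetical and independent of Pig (`BlockingSketch`, `CountingSource`, `AddOne`, `Sort`); the sort operator is used only as a familiar example of an operator that needs all of its input first. The source counts how many values have been pulled before the first output appears.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins; none of these types are Pig classes.
class BlockingSketch {
    // Source that counts how many elements have been pulled so far.
    static final class CountingSource {
        private final Iterator<Integer> it;
        int pulled = 0;
        CountingSource(List<Integer> data) { this.it = data.iterator(); }
        Integer next() {
            if (!it.hasNext()) return null; // null marks end of input in this sketch
            pulled++;
            return it.next();
        }
    }

    interface Op {
        boolean isBlocking();
        Integer next();
    }

    // Pipeline operator: one input pulled per output produced.
    static final class AddOne implements Op {
        private final CountingSource src;
        AddOne(CountingSource src) { this.src = src; }
        public boolean isBlocking() { return false; }
        public Integer next() {
            Integer v = src.next();
            return v == null ? null : v + 1;
        }
    }

    // Blocking operator: drains the whole input before the first output.
    static final class Sort implements Op {
        private final CountingSource src;
        private Iterator<Integer> sorted;
        Sort(CountingSource src) { this.src = src; }
        public boolean isBlocking() { return true; }
        public Integer next() {
            if (sorted == null) {
                List<Integer> all = new ArrayList<>();
                for (Integer v = src.next(); v != null; v = src.next()) all.add(v);
                Collections.sort(all);
                sorted = all.iterator();
            }
            return sorted.hasNext() ? sorted.next() : null;
        }
    }

    public static void main(String[] args) {
        CountingSource a = new CountingSource(Arrays.asList(3, 1, 2));
        Op addOne = new AddOne(a);
        System.out.println(addOne.next() + " after pulling " + a.pulled);

        CountingSource b = new CountingSource(Arrays.asList(3, 1, 2));
        Op sort = new Sort(b);
        System.out.println(sort.next() + " after pulling " + b.pulled);
    }
}
```

The pipeline operator produces its first value after a single pull, while the blocking one has consumed all three inputs before it can return anything.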

processInput

public Result processInput()
                    throws ExecException
A generic method for parsing input that either returns the attached input if it exists or fetches it from its predecessor. If special processing is required, this method should be overridden.

Returns:
The Result object that results from processing the input
Throws:
ExecException
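The "attached input, else predecessor" rule can be sketched as follows. This is not the Pig implementation: `ProcessInputSketch` and its `Op` class are hypothetical, strings stand in for tuples, and the sketch assumes that an attached input is consumed once and then detached.

```java
// Hypothetical stand-ins; none of these types are Pig classes.
class ProcessInputSketch {
    static class Op {
        private final Op predecessor;  // null for a leaf operator
        private final String produces; // value this operator emits when pulled
        private String input;          // directly attached input, if any
        private boolean inputAttached;

        Op(Op predecessor, String produces) {
            this.predecessor = predecessor;
            this.produces = produces;
        }

        void attachInput(String t) { input = t; inputAttached = true; }
        void detachInput() { input = null; inputAttached = false; }
        boolean isInputAttached() { return inputAttached; }

        String getNext() { return produces; }

        // Prefer the attached input; otherwise pull from the predecessor.
        String processInput() {
            if (inputAttached) {
                String t = input;
                detachInput(); // assumption: attached input is used exactly once
                return t;
            }
            return predecessor == null ? null : predecessor.getNext();
        }
    }

    public static void main(String[] args) {
        Op upstream = new Op(null, "from-predecessor");
        Op op = new Op(upstream, "unused");
        System.out.println(op.processInput());    // pulled from the predecessor
        op.attachInput("attached");
        System.out.println(op.processInput());    // the attached value takes priority
        System.out.println(op.isInputAttached()); // the attached value was consumed
    }
}
```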

visit

public abstract void visit(PhyPlanVisitor v)
                    throws VisitorException
Description copied from class: Operator
Visit this node with the provided visitor. This should only be called by the visitor class itself, never directly.

Specified by:
visit in class Operator<PhyPlanVisitor>
Parameters:
v - Visitor to visit with.
Throws:
VisitorException - if the visitor has a problem.
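The rule that visit is called only by the visitor follows from the usual double-dispatch arrangement, sketched below with hypothetical types (`VisitSketch`, `PlanVisitor`, `Filter`, `Limit`) that are not the Pig classes. The visitor owns the traversal and calls visit on each operator, and each operator calls back the visitor method for its own kind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; none of these types are Pig classes.
class VisitSketch {
    // Visitor with one callback per operator kind.
    static class PlanVisitor {
        final List<String> seen = new ArrayList<>();
        void visitFilter(Filter f) { seen.add("filter"); }
        void visitLimit(Limit l) { seen.add("limit"); }

        // The visitor owns the traversal and is the only caller of Op.visit.
        void visitAll(List<Op> plan) {
            for (Op op : plan) op.visit(this);
        }
    }

    static abstract class Op {
        abstract void visit(PlanVisitor v);
    }

    // Each operator dispatches to the callback that matches its own type.
    static final class Filter extends Op {
        @Override void visit(PlanVisitor v) { v.visitFilter(this); }
    }

    static final class Limit extends Op {
        @Override void visit(PlanVisitor v) { v.visitLimit(this); }
    }

    public static void main(String[] args) {
        PlanVisitor visitor = new PlanVisitor();
        visitor.visitAll(Arrays.asList(new Filter(), new Limit()));
        System.out.println(visitor.seen);
    }
}
```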

getNext

public Result getNext(byte dataType)
               throws ExecException
Implementations that call into the different versions of getNext are often identical, differing only in the signature of the getNext() call they make. This method cuts down on some of that copy-and-paste.

Parameters:
dataType - The type of the value to return; a byte from DataType.
Returns:
The Result of applying this operator.
Throws:
ExecException
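A minimal sketch of this kind of type-code dispatch is shown below. The type codes, the `TypeDispatchSketch` class and the string return values are all hypothetical; the real codes are defined in Pig's DataType class and the real methods return Result objects.

```java
// Hypothetical stand-ins; none of these types are Pig classes.
class TypeDispatchSketch {
    // Assumed type codes for this sketch only.
    static final byte INTEGER = 1;
    static final byte STRING = 2;

    static abstract class Op {
        // Defaults report an unsupported operation for the type.
        String getNextInteger() { return "unsupported"; }
        String getNextString() { return "unsupported"; }

        // Single entry point: callers pass a type code instead of
        // repeating near-identical code for every typed method.
        String getNext(byte dataType) {
            switch (dataType) {
                case INTEGER: return getNextInteger();
                case STRING:  return getNextString();
                default:
                    throw new IllegalArgumentException("Unsupported type code: " + dataType);
            }
        }
    }

    // Operator that supports only the integer variant.
    static final class ConstInt extends Op {
        @Override String getNextInteger() { return "42"; }
    }

    public static void main(String[] args) {
        Op op = new ConstInt();
        System.out.println(op.getNext(INTEGER)); // handled by the override
        System.out.println(op.getNext(STRING));  // falls back to the default
    }
}
```

A caller that knows its input's result type only as a byte can then write one call, `input.getNext(type)`, rather than one branch per typed method.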

getNextInteger

public Result getNextInteger()
                      throws ExecException
Throws:
ExecException

getNextLong

public Result getNextLong()
                   throws ExecException
Throws:
ExecException

getNextDouble

public Result getNextDouble()
                     throws ExecException
Throws:
ExecException

getNextFloat

public Result getNextFloat()
                    throws ExecException
Throws:
ExecException

getNextDateTime

public Result getNextDateTime()
                       throws ExecException
Throws:
ExecException

getNextString

public Result getNextString()
                     throws ExecException
Throws:
ExecException

getNextDataByteArray

public Result getNextDataByteArray()
                            throws ExecException
Throws:
ExecException

getNextMap

public Result getNextMap()
                  throws ExecException
Throws:
ExecException

getNextBoolean

public Result getNextBoolean()
                      throws ExecException
Throws:
ExecException

getNextTuple

public Result getNextTuple()
                    throws ExecException
Throws:
ExecException

getNextDataBag

public Result getNextDataBag()
                      throws ExecException
Throws:
ExecException

getNextBigInteger

public Result getNextBigInteger()
                         throws ExecException
Throws:
ExecException

getNextBigDecimal

public Result getNextBigDecimal()
                         throws ExecException
Throws:
ExecException

reset

public void reset()
Reset internal state in an operator. For use in nested pipelines where operators like limit and sort may need to reset their state. Limit needs it because it needs to know it's seeing a fresh set of input. Blocking operators like sort and distinct need it because they may not have drained their previous input due to a limit and thus need to be told to drop their old input and start over.
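Why a limit needs reset in a nested pipeline can be shown with a small sketch. The `ResetSketch` and `Limit` classes and the `limitEachGroup` helper are hypothetical and are not the Pig operators; lists of integers stand in for the groups that a nested pipeline would process one after another.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; none of these types are Pig classes.
class ResetSketch {
    static final class Limit {
        private final int limit;
        private int seen = 0;
        Limit(int limit) { this.limit = limit; }

        // Returns the value while under the limit, otherwise null for "done".
        Integer accept(int v) {
            if (seen >= limit) return null;
            seen++;
            return v;
        }

        // Forget the previous input so the next group starts from zero.
        void reset() { seen = 0; }
    }

    // Applies the same Limit instance to each group in turn.
    static List<List<Integer>> limitEachGroup(List<List<Integer>> groups, int n) {
        Limit limit = new Limit(n);
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> group : groups) {
            limit.reset(); // a new group is a fresh set of input
            List<Integer> kept = new ArrayList<>();
            for (int v : group) {
                Integer r = limit.accept(v);
                if (r == null) break;
                kept.add(r);
            }
            out.add(kept);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> groups = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5, 6));
        System.out.println(limitEachGroup(groups, 2));
    }
}
```

Without the `reset()` call, the counter would carry over from the first group and the second group would yield nothing.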


getReporter

public static PigProgressable getReporter()
Returns:
PigProgressable stored in threadlocal

setReporter

public static void setReporter(PigProgressable reporter)
Parameters:
reporter - PigProgressable to be stored in threadlocal
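Storing the reporter in a thread-local means each thread sees only the reporter it set itself. The sketch below illustrates that behavior with `java.lang.ThreadLocal`; the `ReporterSketch` class and its `Progressable` interface are hypothetical stand-ins, not the Pig types.

```java
// Hypothetical stand-ins; none of these types are Pig classes.
class ReporterSketch {
    interface Progressable {
        void progress();
    }

    // One reporter slot per thread.
    private static final ThreadLocal<Progressable> reporter = new ThreadLocal<>();

    static void setReporter(Progressable r) { reporter.set(r); }
    static Progressable getReporter() { return reporter.get(); }

    // Checks from a second thread whether a reporter is visible there.
    static boolean otherThreadSeesReporter() {
        final boolean[] seen = new boolean[1];
        Thread t = new Thread(() -> seen[0] = getReporter() != null);
        t.start();
        try {
            t.join(); // join makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        setReporter(() -> { /* no-op progress callback */ });
        System.out.println(getReporter() != null);     // set on this thread
        System.out.println(otherThreadSeesReporter()); // not set on the other thread
    }
}
```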

clone

public PhysicalOperator clone()
                       throws CloneNotSupportedException
Make a deep copy of this operator. This implementation is blank; it is left as a placeholder so that subclasses can implement cloning.

Overrides:
clone in class Operator<PhyPlanVisitor>
Throws:
CloneNotSupportedException
See Also:
Do not use the clone method directly. Operators are cloned when logical plans are cloned using LogicalPlanCloner.

cloneHelper

protected void cloneHelper(PhysicalOperator op)

setParentPlan

public void setParentPlan(PhysicalPlan physicalPlan)
Parameters:
physicalPlan -

getLogger

public org.apache.commons.logging.Log getLogger()

setPigLogger

public static void setPigLogger(PigLogger logger)

getPigLogger

public static PigLogger getPigLogger()


Copyright © 2007-2012 The Apache Software Foundation