org.apache.pig.backend.hadoop.executionengine.physicalLayer
Class PhysicalOperator

java.lang.Object
  extended by org.apache.pig.impl.plan.Operator<PhyPlanVisitor>
      extended by org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
All Implemented Interfaces:
Serializable, Cloneable, Comparable<Operator>
Direct Known Subclasses:
ExpressionOperator, POCogroup, POCollectedGroup, POCounter, POCross, PODemux, PODistinct, POFilter, POForEach, POFRJoin, POGlobalRearrange, POLimit, POLoad, POLocalRearrange, POMergeCogroup, POMergeJoin, PONative, POPackage, POPreCombinerLocalRearrange, PORead, POSkewedJoin, POSort, POSplit, POSplit, POSplitOutput, POStore, POStream, POUnion

public abstract class PhysicalOperator
extends Operator<PhyPlanVisitor>
implements Cloneable

This is the base class for all operators. This supports a generic way of processing inputs which can be overridden by operators extending this class. The input model assumes that it can either be taken from an operator or can be attached directly to this operator. Also it is assumed that inputs to an operator are always in the form of a tuple. For this pipeline rework, we assume a pull based model, i.e, the root operator is going to call getNext with the appropriate type which initiates a cascade of getNext calls that unroll to create input for the root operator to work on. Any operator that extends the PhysicalOperator, supports a getNext with all the different types of parameter types. The concrete implementation should use the result type of its input operator to decide the type of getNext's parameter. This is done to avoid switch/case based on the type as much as possible. The default is assumed to return an erroneus Result corresponding to an unsupported operation on that type. So the operators need to implement only those types that are supported.

See Also:
Serialized Form

Field Summary
protected  String alias
           
protected static DataBag dummyBag
           
protected static Boolean dummyBool
           
protected static DataByteArray dummyDBA
           
protected static Double dummyDouble
           
protected static Float dummyFloat
           
protected static Integer dummyInt
           
protected static Long dummyLong
           
protected static Map dummyMap
           
protected static String dummyString
           
protected static Tuple dummyTuple
           
protected  Tuple input
           
protected  boolean inputAttached
           
protected  List<PhysicalOperator> inputs
           
protected  LineageTracer lineageTracer
           
protected  List<PhysicalOperator> outputs
           
protected  PhysicalPlan parentPlan
           
protected static PigLogger pigLogger
           
static PigProgressable reporter
           
protected  int requestedParallelism
           
protected  Result res
           
protected  byte resultType
           
protected static long serialVersionUID
           
 
Fields inherited from class org.apache.pig.impl.plan.Operator
mKey
 
Constructor Summary
PhysicalOperator(OperatorKey k)
           
PhysicalOperator(OperatorKey k, int rp)
           
PhysicalOperator(OperatorKey k, int rp, List<PhysicalOperator> inp)
           
PhysicalOperator(OperatorKey k, List<PhysicalOperator> inp)
           
 
Method Summary
 void attachInput(Tuple t)
          Shorts the input path of this operator by providing the input tuple directly
 PhysicalOperator clone()
          Make a deep copy of this operator.
protected  void cloneHelper(PhysicalOperator op)
           
 void detachInput()
          Detaches any tuples that are attached
 String getAlias()
           
protected  String getAliasString()
           
 List<PhysicalOperator> getInputs()
           
 org.apache.commons.logging.Log getLogger()
           
 Result getNext(Boolean b)
           
 Result getNext(DataBag db)
           
 Result getNext(DataByteArray ba)
           
 Result getNext(Double d)
           
 Result getNext(Float f)
           
 Result getNext(Integer i)
           
 Result getNext(Long l)
           
 Result getNext(Map m)
           
 Result getNext(String s)
           
 Result getNext(Tuple t)
           
static PigLogger getPigLogger()
           
 int getRequestedParallelism()
           
 byte getResultType()
           
 boolean isAccumStarted()
           
 boolean isAccumulative()
           
 boolean isBlocking()
          A blocking operator should override this to return true.
 boolean isInputAttached()
           
 Result processInput()
          A generic method for parsing input that either returns the attached input if it exists or fetches it from its predecessor.
 void reset()
          Reset internal state in an operator.
 void setAccumEnd()
           
 void setAccumStart()
           
 void setAccumulative()
           
 void setAlias(String alias)
           
 void setInputs(List<PhysicalOperator> inputs)
           
 void setLineageTracer(LineageTracer lineage)
           
 void setParentPlan(PhysicalPlan physicalPlan)
           
static void setPigLogger(PigLogger logger)
           
static void setReporter(PigProgressable reporter)
           
 void setRequestedParallelism(int requestedParallelism)
           
 void setResultType(byte resultType)
           
abstract  void visit(PhyPlanVisitor v)
          Visit this node with the provided visitor.
 
Methods inherited from class org.apache.pig.impl.plan.Operator
compareTo, equals, getOperatorKey, getProjectionMap, hashCode, name, regenerateProjectionMap, rewire, supportsMultipleInputs, supportsMultipleOutputs, toString, unsetProjectionMap
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

serialVersionUID

protected static final long serialVersionUID
See Also:
Constant Field Values

requestedParallelism

protected int requestedParallelism

inputs

protected List<PhysicalOperator> inputs

outputs

protected List<PhysicalOperator> outputs

resultType

protected byte resultType

parentPlan

protected PhysicalPlan parentPlan

inputAttached

protected boolean inputAttached

input

protected Tuple input

res

protected Result res

alias

protected String alias

reporter

public static PigProgressable reporter

pigLogger

protected static PigLogger pigLogger

dummyDBA

protected static final DataByteArray dummyDBA

dummyString

protected static final String dummyString

dummyDouble

protected static final Double dummyDouble

dummyFloat

protected static final Float dummyFloat

dummyInt

protected static final Integer dummyInt

dummyLong

protected static final Long dummyLong

dummyBool

protected static final Boolean dummyBool

dummyTuple

protected static final Tuple dummyTuple

dummyBag

protected static final DataBag dummyBag

dummyMap

protected static final Map dummyMap

lineageTracer

protected LineageTracer lineageTracer
Constructor Detail

PhysicalOperator

public PhysicalOperator(OperatorKey k)

PhysicalOperator

public PhysicalOperator(OperatorKey k,
                        int rp)

PhysicalOperator

public PhysicalOperator(OperatorKey k,
                        List<PhysicalOperator> inp)

PhysicalOperator

public PhysicalOperator(OperatorKey k,
                        int rp,
                        List<PhysicalOperator> inp)
Method Detail

setLineageTracer

public void setLineageTracer(LineageTracer lineage)

getRequestedParallelism

public int getRequestedParallelism()

setRequestedParallelism

public void setRequestedParallelism(int requestedParallelism)

getResultType

public byte getResultType()

getAlias

public String getAlias()

getAliasString

protected String getAliasString()

setAlias

public void setAlias(String alias)

setAccumulative

public void setAccumulative()

isAccumulative

public boolean isAccumulative()

setAccumStart

public void setAccumStart()

isAccumStarted

public boolean isAccumStarted()

setAccumEnd

public void setAccumEnd()

setResultType

public void setResultType(byte resultType)

getInputs

public List<PhysicalOperator> getInputs()

setInputs

public void setInputs(List<PhysicalOperator> inputs)

isInputAttached

public boolean isInputAttached()

attachInput

public void attachInput(Tuple t)
Shorts the input path of this operator by providing the input tuple directly

Parameters:
t - - The tuple that should be used as input

detachInput

public void detachInput()
Detaches any tuples that are attached


isBlocking

public boolean isBlocking()
A blocking operator should override this to return true. Blocking operators are those that need the full bag before operate on the tuples inside the bag. Example is the Global Rearrange. Non-blocking or pipeline operators are those that work on a tuple by tuple basis.

Returns:
true if blocking and false otherwise

processInput

public Result processInput()
                    throws ExecException
A generic method for parsing input that either returns the attached input if it exists or fetches it from its predecessor. If special processing is required, this method should be overridden.

Returns:
The Result object that results from processing the input
Throws:
ExecException

visit

public abstract void visit(PhyPlanVisitor v)
                    throws VisitorException
Description copied from class: Operator
Visit this node with the provided visitor. This should only be called by the visitor class itself, never directly.

Specified by:
visit in class Operator<PhyPlanVisitor>
Parameters:
v - Visitor to visit with.
Throws:
VisitorException - if the visitor has a problem.

getNext

public Result getNext(Integer i)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(Long l)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(Double d)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(Float f)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(String s)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(DataByteArray ba)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(Map m)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(Boolean b)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(Tuple t)
               throws ExecException
Throws:
ExecException

getNext

public Result getNext(DataBag db)
               throws ExecException
Throws:
ExecException

reset

public void reset()
Reset internal state in an operator. For use in nested pipelines where operators like limit and sort may need to reset their state. Limit needs it because it needs to know it's seeing a fresh set of input. Blocking operators like sort and distinct need it because they may not have drained their previous input due to a limit and thus need to be told to drop their old input and start over.


setReporter

public static void setReporter(PigProgressable reporter)

clone

public PhysicalOperator clone()
                       throws CloneNotSupportedException
Make a deep copy of this operator. This function is blank, however, we should leave a place holder so that the subclasses can clone

Overrides:
clone in class Operator<PhyPlanVisitor>
Throws:
CloneNotSupportedException
See Also:
Do not use the clone method directly. Operators are cloned when logical plans are cloned using {@link LogicalPlanCloner}

cloneHelper

protected void cloneHelper(PhysicalOperator op)

setParentPlan

public void setParentPlan(PhysicalPlan physicalPlan)
Parameters:
physicalPlan -

getLogger

public org.apache.commons.logging.Log getLogger()

setPigLogger

public static void setPigLogger(PigLogger logger)

getPigLogger

public static PigLogger getPigLogger()


Copyright © ${year} The Apache Software Foundation