org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators
Class POCounter

java.lang.Object
  extended by org.apache.pig.impl.plan.Operator<PhyPlanVisitor>
      extended by org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
          extended by org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POCounter
All Implemented Interfaces:
Serializable, Cloneable, Comparable<Operator>, Illustrable

public class POCounter
extends PhysicalOperator

This operator is part of the RANK operator implementation. It adds a local counter and a unique task id to each tuple. There are 2 modes of operations: regular and dense. The local counter is depends on the mode of operation. With regular rank is considered duplicate rows while assigning numbers to distinct values groups. With dense rank counts the number of distinct values, without considering duplicate rows. Depending on if it is considered. the entire tuple (row number) or a by a set of columns (rank by). This Physical Operator relies on some specific MR class, available at PigMapReduceCounter.

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
PhysicalOperator.OriginalLocation
 
Field Summary
protected static TupleFactory mTupleFactory
           
 
Fields inherited from class org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
alias, illustrator, input, inputAttached, inputs, lineageTracer, outputs, parentPlan, pigLogger, requestedParallelism, res, resultType
 
Fields inherited from class org.apache.pig.impl.plan.Operator
mKey
 
Constructor Summary
POCounter(OperatorKey k)
           
POCounter(OperatorKey k, int rp)
           
POCounter(OperatorKey k, int rp, List<PhysicalOperator> inputs)
           
POCounter(OperatorKey operatorKey, int requestedParallelism, List inp, List<PhysicalPlan> counterPlans, List<Boolean> ascendingCol)
           
POCounter(OperatorKey k, List<PhysicalOperator> inputs)
           
 
Method Summary
protected  Result addCounterValue(Result input)
          Add current task id and local counter value.
 void addToLocalCounter(Long sizeBag)
           
 List<Boolean> getAscendingColumns()
           
 List<PhysicalPlan> getCounterPlans()
           
 Long getLocalCounter()
           
 Result getNextTuple()
           
 String getOperationID()
           
 String getTaskId()
           
 Tuple illustratorMarkup(Object in, Object out, int eqClassIndex)
          input tuple mark up to be illustrate-able
 Long incrementLocalCounter()
          Sequential counter used at ROW NUMBER and RANK BY DENSE mode
 boolean isDenseRank()
           
 boolean isRowNumber()
           
 String name()
           
 void resetLocalCounter()
          Initialization step into the POCounter is to set up local counter to 1.
 void setAscendingColumns(List<Boolean> mAscCols)
           
 void setCounterPlans(List<PhysicalPlan> counterPlans)
           
 void setIsDenseRank(boolean isDenseRank)
          Dense Rank flag
 void setIsRowNumber(boolean isRowNumber)
          Row number flag
 void setLocalCounter(Long localCount)
           
 void setOperationID(String operationID)
          Operation ID: identifier shared within the corresponding PORank
 void setTaskId(String taskID)
          Task ID: identifier of the task (map or reducer)
 boolean supportsMultipleInputs()
          Indicates whether this operator supports multiple inputs.
 boolean supportsMultipleOutputs()
          Indicates whether this operator supports multiple outputs.
 void visit(PhyPlanVisitor v)
          Visit this node with the provided visitor.
 
Methods inherited from class org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
addOriginalLocation, addOriginalLocation, attachInput, clone, cloneHelper, detachInput, getAlias, getAliasString, getIllustrator, getInputs, getLogger, getNext, getNextBigDecimal, getNextBigInteger, getNextBoolean, getNextDataBag, getNextDataByteArray, getNextDateTime, getNextDouble, getNextFloat, getNextInteger, getNextLong, getNextMap, getNextString, getOriginalLocations, getPigLogger, getReporter, getRequestedParallelism, getResultType, isAccumStarted, isAccumulative, isBlocking, isInputAttached, processInput, reset, setAccumEnd, setAccumStart, setAccumulative, setIllustrator, setInputs, setParentPlan, setPigLogger, setReporter, setRequestedParallelism, setResultType
 
Methods inherited from class org.apache.pig.impl.plan.Operator
compareTo, equals, getOperatorKey, getProjectionMap, hashCode, regenerateProjectionMap, rewire, toString, unsetProjectionMap
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

mTupleFactory

protected static final TupleFactory mTupleFactory
Constructor Detail

POCounter

public POCounter(OperatorKey k)

POCounter

public POCounter(OperatorKey k,
                 int rp)

POCounter

public POCounter(OperatorKey k,
                 List<PhysicalOperator> inputs)

POCounter

public POCounter(OperatorKey k,
                 int rp,
                 List<PhysicalOperator> inputs)

POCounter

public POCounter(OperatorKey operatorKey,
                 int requestedParallelism,
                 List inp,
                 List<PhysicalPlan> counterPlans,
                 List<Boolean> ascendingCol)
Method Detail

illustratorMarkup

public Tuple illustratorMarkup(Object in,
                               Object out,
                               int eqClassIndex)
Description copied from interface: Illustrable
input tuple mark up to be illustrate-able

Parameters:
in - input tuple
out - output tuple before wrapped in ExampleTuple
eqClassIndex - index into equivalence classes in illustrator
Returns:
tuple

visit

public void visit(PhyPlanVisitor v)
           throws VisitorException
Description copied from class: Operator
Visit this node with the provided visitor. This should only be called by the visitor class itself, never directly.

Specified by:
visit in class PhysicalOperator
Parameters:
v - Visitor to visit with.
Throws:
VisitorException - if the visitor has a problem.

getNextTuple

public Result getNextTuple()
                    throws ExecException
Overrides:
getNextTuple in class PhysicalOperator
Throws:
ExecException

addCounterValue

protected Result addCounterValue(Result input)
                          throws ExecException
Add current task id and local counter value.

Parameters:
input - from the previous output
Returns:
a tuple within two values prepended to the tuple the task identifier and the local counter value. Local counter value could be incremented by one (is a row number or dense rank) or, could be incremented by the size of the bag on the previous tuple processed
Throws:
ExecException

supportsMultipleInputs

public boolean supportsMultipleInputs()
Description copied from class: Operator
Indicates whether this operator supports multiple inputs.

Specified by:
supportsMultipleInputs in class Operator<PhyPlanVisitor>
Returns:
true if it does, otherwise false.

supportsMultipleOutputs

public boolean supportsMultipleOutputs()
Description copied from class: Operator
Indicates whether this operator supports multiple outputs.

Specified by:
supportsMultipleOutputs in class Operator<PhyPlanVisitor>
Returns:
true if it does, otherwise false.

name

public String name()
Specified by:
name in class Operator<PhyPlanVisitor>

setCounterPlans

public void setCounterPlans(List<PhysicalPlan> counterPlans)

getCounterPlans

public List<PhysicalPlan> getCounterPlans()

setAscendingColumns

public void setAscendingColumns(List<Boolean> mAscCols)

getAscendingColumns

public List<Boolean> getAscendingColumns()

resetLocalCounter

public void resetLocalCounter()
Initialization step into the POCounter is to set up local counter to 1.


incrementLocalCounter

public Long incrementLocalCounter()
Sequential counter used at ROW NUMBER and RANK BY DENSE mode


setLocalCounter

public void setLocalCounter(Long localCount)

getLocalCounter

public Long getLocalCounter()

addToLocalCounter

public void addToLocalCounter(Long sizeBag)

setTaskId

public void setTaskId(String taskID)
Task ID: identifier of the task (map or reducer)


getTaskId

public String getTaskId()

setIsDenseRank

public void setIsDenseRank(boolean isDenseRank)
Dense Rank flag


isDenseRank

public boolean isDenseRank()

setIsRowNumber

public void setIsRowNumber(boolean isRowNumber)
Row number flag


isRowNumber

public boolean isRowNumber()

setOperationID

public void setOperationID(String operationID)
Operation ID: identifier shared within the corresponding PORank


getOperationID

public String getOperationID()


Copyright © 2007-2012 The Apache Software Foundation