org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators
Class PODistinct

java.lang.Object
  extended by org.apache.pig.impl.plan.Operator<PhyPlanVisitor>
      extended by org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
          extended by org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODistinct
All Implemented Interfaces:
Serializable, Cloneable, Comparable<Operator>, Illustrable
Direct Known Subclasses:
POSortedDistinct

public class PODistinct
extends PhysicalOperator
implements Cloneable

Finds the distinct set of tuples in a bag. This is a blocking operator: all of the input is placed in the hash set implemented by DistinctDataBag, which also provides the other DataBag interfaces.
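The blocking behavior described above can be sketched in plain Java. This is a minimal illustration, not the Pig implementation: the class BlockingDistinctSketch and its next() method are hypothetical, tuples are modeled as List<Object>, and a LinkedHashSet stands in for DistinctDataBag only so that the output order is predictable.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a blocking distinct operator; not Pig code. */
class BlockingDistinctSketch {
    private final Iterator<List<Object>> input; // upstream tuples
    private Iterator<List<Object>> results;     // null until the input is drained

    BlockingDistinctSketch(Iterator<List<Object>> input) {
        this.input = input;
    }

    /** Returns the next distinct tuple, or null when the output is exhausted. */
    List<Object> next() {
        if (results == null) {
            // Blocking step: consume all input before emitting anything.
            Set<List<Object>> seen = new LinkedHashSet<>();
            while (input.hasNext()) {
                seen.add(input.next()); // duplicates are dropped by the set
            }
            results = seen.iterator();
        }
        return results.hasNext() ? results.next() : null;
    }
}
```

The first call to next() pays the full cost of reading the input; later calls only walk the stored set.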

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
PhysicalOperator.OriginalLocation
 
Field Summary
protected  String customPartitioner
           
 
Fields inherited from class org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
alias, illustrator, input, inputAttached, inputs, lineageTracer, outputs, parentPlan, pigLogger, requestedParallelism, res, resultType
 
Fields inherited from class org.apache.pig.impl.plan.Operator
mKey
 
Constructor Summary
PODistinct(OperatorKey k)
           
PODistinct(OperatorKey k, int rp)
           
PODistinct(OperatorKey k, int rp, List<PhysicalOperator> inp)
           
PODistinct(OperatorKey k, List<PhysicalOperator> inp)
           
 
Method Summary
 PODistinct clone()
          Make a deep copy of this operator.
 String getCustomPartitioner()
           
 Result getNextTuple()
           
 Tuple illustratorMarkup(Object in, Object out, int eqClassIndex)
          Marks up the input tuple so that it can be illustrated.
 boolean isBlocking()
          A blocking operator should override this to return true.
 String name()
           
 void reset()
          Reset internal state in an operator.
 void setCustomPartitioner(String customPartitioner)
           
 boolean supportsMultipleInputs()
          Indicates whether this operator supports multiple inputs.
 boolean supportsMultipleOutputs()
          Indicates whether this operator supports multiple outputs.
 void visit(PhyPlanVisitor v)
          Visit this node with the provided visitor.
 
Methods inherited from class org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
addOriginalLocation, addOriginalLocation, attachInput, cloneHelper, detachInput, getAlias, getAliasString, getIllustrator, getInputs, getLogger, getNext, getNextBigDecimal, getNextBigInteger, getNextBoolean, getNextDataBag, getNextDataByteArray, getNextDateTime, getNextDouble, getNextFloat, getNextInteger, getNextLong, getNextMap, getNextString, getOriginalLocations, getPigLogger, getReporter, getRequestedParallelism, getResultType, isAccumStarted, isAccumulative, isInputAttached, processInput, setAccumEnd, setAccumStart, setAccumulative, setIllustrator, setInputs, setParentPlan, setPigLogger, setReporter, setRequestedParallelism, setResultType
 
Methods inherited from class org.apache.pig.impl.plan.Operator
compareTo, equals, getOperatorKey, getProjectionMap, hashCode, regenerateProjectionMap, rewire, toString, unsetProjectionMap
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

customPartitioner

protected String customPartitioner
Constructor Detail

PODistinct

public PODistinct(OperatorKey k,
                  int rp,
                  List<PhysicalOperator> inp)

PODistinct

public PODistinct(OperatorKey k,
                  int rp)

PODistinct

public PODistinct(OperatorKey k,
                  List<PhysicalOperator> inp)

PODistinct

public PODistinct(OperatorKey k)
Method Detail

getCustomPartitioner

public String getCustomPartitioner()

setCustomPartitioner

public void setCustomPartitioner(String customPartitioner)

isBlocking

public boolean isBlocking()
Description copied from class: PhysicalOperator
A blocking operator should override this to return true. Blocking operators are those that need the full bag before they can operate on the tuples inside it; Global Rearrange is an example. Non-blocking, or pipeline, operators are those that work on a tuple-by-tuple basis.

Overrides:
isBlocking in class PhysicalOperator
Returns:
true if blocking and false otherwise
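The override contract can be illustrated with a small sketch. All names here (SketchOperator, DoubleEach, SortAll, apply) are hypothetical and exist only for this example; they mirror the default-false, override-to-true pattern described above and are not part of the Pig API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical base class; only mirrors the isBlocking() contract. */
abstract class SketchOperator {
    /** Pipeline operators keep this default. */
    boolean isBlocking() {
        return false;
    }

    abstract List<Integer> apply(List<Integer> bag);
}

/** Pipeline style: each tuple is handled independently of the others. */
class DoubleEach extends SketchOperator {
    @Override
    List<Integer> apply(List<Integer> bag) {
        List<Integer> out = new ArrayList<>();
        for (Integer t : bag) {
            out.add(t * 2); // needs only the current tuple
        }
        return out;
    }
}

/** Blocking style: no output is correct until the full bag has been seen. */
class SortAll extends SketchOperator {
    @Override
    boolean isBlocking() {
        return true;
    }

    @Override
    List<Integer> apply(List<Integer> bag) {
        List<Integer> out = new ArrayList<>(bag);
        Collections.sort(out);
        return out;
    }
}
```

A distinct operator falls in the second group, since a tuple can only be known to be unique once every other tuple has been compared against it.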

getNextTuple

public Result getNextTuple()
                    throws ExecException
Overrides:
getNextTuple in class PhysicalOperator
Throws:
ExecException

name

public String name()
Specified by:
name in class Operator<PhyPlanVisitor>

supportsMultipleInputs

public boolean supportsMultipleInputs()
Description copied from class: Operator
Indicates whether this operator supports multiple inputs.

Specified by:
supportsMultipleInputs in class Operator<PhyPlanVisitor>
Returns:
true if it does, otherwise false.

supportsMultipleOutputs

public boolean supportsMultipleOutputs()
Description copied from class: Operator
Indicates whether this operator supports multiple outputs.

Specified by:
supportsMultipleOutputs in class Operator<PhyPlanVisitor>
Returns:
true if it does, otherwise false.

reset

public void reset()
Description copied from class: PhysicalOperator
Reset internal state in an operator. For use in nested pipelines where operators like limit and sort may need to reset their state. Limit needs it because it needs to know it's seeing a fresh set of input. Blocking operators like sort and distinct need it because they may not have drained their previous input due to a limit and thus need to be told to drop their old input and start over.

Overrides:
reset in class PhysicalOperator
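Why a blocking operator needs reset() can be shown with a sketch. ResettableDistinctSketch, attachInput and next are hypothetical names that model tuples as Integer values; the real operator's internal state is not documented here, so this only illustrates the idea of dropping undrained results.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical distinct operator with a reset hook; not the Pig implementation. */
class ResettableDistinctSketch {
    private Iterator<Integer> input;
    private Iterator<Integer> results; // non-null once the input has been consumed

    void attachInput(List<Integer> tuples) {
        this.input = tuples.iterator();
    }

    /** Returns the next distinct value, or null when the output is exhausted. */
    Integer next() {
        if (results == null) {
            Set<Integer> seen = new LinkedHashSet<>();
            while (input != null && input.hasNext()) {
                seen.add(input.next());
            }
            results = seen.iterator();
        }
        return results.hasNext() ? results.next() : null;
    }

    /** Drops any undrained results so the next call starts from fresh input. */
    void reset() {
        results = null;
    }
}
```

If a downstream limit reads one value from the input 1, 1, 2, 3 and stops, the values 2 and 3 remain buffered. Without reset(), the next nested invocation would return those stale values instead of reading its newly attached input.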

visit

public void visit(PhyPlanVisitor v)
           throws VisitorException
Description copied from class: Operator
Visit this node with the provided visitor. This should only be called by the visitor class itself, never directly.

Specified by:
visit in class PhysicalOperator
Parameters:
v - Visitor to visit with.
Throws:
VisitorException - if the visitor has a problem.
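The rule that visit should be called only by the visitor reflects the usual double-dispatch arrangement. The sketch below uses hypothetical types (SketchVisitor, SketchNode, DistinctNode, FilterNode) and a simple list in place of a plan; it is an assumption about the general pattern, not a description of PhyPlanVisitor.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical visitor; stands in for a plan visitor such as PhyPlanVisitor. */
class SketchVisitor {
    final List<String> visited = new ArrayList<>();

    /** Entry point: the visitor, not client code, calls node.visit(this). */
    void walk(List<SketchNode> plan) {
        for (SketchNode node : plan) {
            node.visit(this);
        }
    }

    void visitDistinct(DistinctNode node) {
        visited.add("distinct");
    }

    void visitFilter(FilterNode node) {
        visited.add("filter");
    }
}

interface SketchNode {
    void visit(SketchVisitor v);
}

class DistinctNode implements SketchNode {
    @Override
    public void visit(SketchVisitor v) {
        v.visitDistinct(this); // double dispatch to the type-specific callback
    }
}

class FilterNode implements SketchNode {
    @Override
    public void visit(SketchVisitor v) {
        v.visitFilter(this);
    }
}
```

Client code calls walk on the visitor, which keeps traversal order under the visitor's control; each node's visit method only routes the call to the matching callback.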

clone

public PODistinct clone()
                 throws CloneNotSupportedException
Description copied from class: PhysicalOperator
Make a deep copy of this operator. This function is blank; however, a placeholder is left so that subclasses can implement cloning.

Overrides:
clone in class PhysicalOperator
Throws:
CloneNotSupportedException
See Also:
Do not use the clone method directly. Operators are cloned when logical plans are cloned using LogicalPlanCloner.

illustratorMarkup

public Tuple illustratorMarkup(Object in,
                               Object out,
                               int eqClassIndex)
Description copied from interface: Illustrable
Marks up the input tuple so that it can be illustrated.

Specified by:
illustratorMarkup in interface Illustrable
Parameters:
in - input tuple
out - output tuple before it is wrapped in ExampleTuple
eqClassIndex - index into equivalence classes in illustrator
Returns:
tuple


Copyright © 2007-2012 The Apache Software Foundation