org.apache.pig.newplan.logical.relational
Class LOCube

java.lang.Object
  extended by org.apache.pig.newplan.Operator
      extended by org.apache.pig.newplan.logical.relational.LogicalRelationalOperator
          extended by org.apache.pig.newplan.logical.relational.LOCube

public class LOCube
extends LogicalRelationalOperator

CUBE operator implementation for data cube computation.

Cube operator syntax

 alias = CUBE rel BY { CUBE | ROLLUP }(col_ref) [, { CUBE | ROLLUP }(col_ref) ...];
 alias - output alias 
 CUBE - operator
 rel - input relation
 BY - operator
 CUBE | ROLLUP - cube or rollup operation
 col_ref - column references, *, or a column range in the schema of rel
 

A combined cube and rollup computation, which is carried out using the CubeDimensions and RollupDimensions UDFs, can be written as follows:

 events = LOAD '/logs/events' USING EventLoader() AS (lang, event, app_id, event_id, total);
 eventcube = CUBE events BY CUBE(lang, event), ROLLUP(app_id, event_id);
 result = FOREACH eventcube GENERATE FLATTEN(group) as (lang, event, app_id, event_id),
          COUNT_STAR(cube), SUM(cube.total);
 STORE result INTO 'cuberesult';
 
 
In the above example, CUBE(lang, event) generates every combination of aggregations: {(lang, event), (lang, ), ( , event), ( , )}. For n dimensions, CUBE generates 2^n combinations of aggregations.

Similarly, ROLLUP(app_id, event_id) generates aggregations in hierarchical order, from the most detailed level to the most general (grand total) level: {(app_id, event_id), (app_id, ), ( , )}. For n dimensions, ROLLUP generates n+1 combinations of aggregations.

The output of the above example query has the following combinations of aggregations:

 {(lang, event, app_id, event_id), (lang, , app_id, event_id), ( , event, app_id, event_id), ( , , app_id, event_id),
  (lang, event, app_id, ), (lang, , app_id, ), ( , event, app_id, ), ( , , app_id, ),
  (lang, event, , ), (lang, , , ), ( , event, , ), ( , , , )}

The total number of combinations is 2^n * (m+1), where n is the number of CUBE dimensions and m is the number of ROLLUP dimensions. For the example, that is 2^2 * (2+1) = 12.

Because the cube and rollup clauses use null to represent "all" values of a dimension, any null values in the dimension data are converted to "unknown" before the cube or rollup is computed.
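The counting argument can be checked with a small standalone sketch. It is not Pig's implementation and does not use the CubeDimensions or RollupDimensions UDFs; the class and method names (CubeCombinations, cube, rollup, combine, normalize) are hypothetical and exist only for this illustration. A null entry stands for "all" values of a dimension.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Apache Pig.
public class CubeCombinations {

    // CUBE: every subset of the dimensions (2^n results).
    // A dimension that is aggregated away is represented by null.
    static List<List<String>> cube(List<String> dims) {
        List<List<String>> out = new ArrayList<>();
        int n = dims.size();
        for (int mask = (1 << n) - 1; mask >= 0; mask--) {
            List<String> combo = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                // Bit (n-1-i) of the mask decides whether dimension i is kept.
                boolean keep = (mask & (1 << (n - 1 - i))) != 0;
                combo.add(keep ? dims.get(i) : null);
            }
            out.add(combo);
        }
        return out;
    }

    // ROLLUP: prefixes from most detailed to grand total (n+1 results).
    static List<List<String>> rollup(List<String> dims) {
        List<List<String>> out = new ArrayList<>();
        int n = dims.size();
        for (int kept = n; kept >= 0; kept--) {
            List<String> combo = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                combo.add(i < kept ? dims.get(i) : null);
            }
            out.add(combo);
        }
        return out;
    }

    // Cross product of the CUBE and ROLLUP results.
    static List<List<String>> combine(List<List<String>> cubed,
                                      List<List<String>> rolled) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> r : rolled) {
            for (List<String> c : cubed) {
                List<String> joined = new ArrayList<>(c);
                joined.addAll(r);
                out.add(joined);
            }
        }
        return out;
    }

    // Null dimension values in the data are replaced by "unknown" so that
    // they cannot be confused with the null that means "all".
    static String normalize(String dimensionValue) {
        return dimensionValue == null ? "unknown" : dimensionValue;
    }

    public static void main(String[] args) {
        List<List<String>> cubed = cube(Arrays.asList("lang", "event"));
        List<List<String>> rolled = rollup(Arrays.asList("app_id", "event_id"));
        List<List<String>> all = combine(cubed, rolled);
        System.out.println(cubed.size());   // 4  = 2^2
        System.out.println(rolled.size());  // 3  = 2 + 1
        System.out.println(all.size());     // 12 = 2^2 * (2 + 1)
    }
}
```

The rollup loop is the outer one in combine so that the results appear in the same order as the list above.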


Field Summary
 
Fields inherited from class org.apache.pig.newplan.logical.relational.LogicalRelationalOperator
alias, lineNum, mCustomPartitioner, mPinnedOptions, requestedParallelism, schema
 
Fields inherited from class org.apache.pig.newplan.Operator
annotations, hashPrime, location, name, plan
 
Constructor Summary
LOCube(LogicalPlan plan)
           
LOCube(OperatorPlan plan, MultiMap<Integer,LogicalExpressionPlan> expressionPlans)
           
 
Method Summary
 void accept(PlanVisitor v)
          Accept a visitor at this node in the graph.
 MultiMap<Integer,LogicalExpressionPlan> getExpressionPlans()
           
 List<Operator> getInputs(LogicalPlan plan)
           
 List<String> getOperations()
           
 LogicalSchema getSchema()
          Get the schema for the output of this relational operator.
 boolean isEqual(Operator other)
          This is like a shallow equals comparison.
 void resetUid()
          Erase all cached uids and regenerate them when the schema is regenerated.
 void setExpressionPlans(MultiMap<Integer,LogicalExpressionPlan> plans)
           
 void setOperations(List<String> operations)
           
 
Methods inherited from class org.apache.pig.newplan.logical.relational.LogicalRelationalOperator
checkEquality, getAlias, getCustomPartitioner, getLineNumber, getRequestedParallelism, isPinnedOption, neverUseForRealSetSchema, pinOption, resetSchema, setAlias, setCustomPartitioner, setRequestedParallelism, setSchema, toString
 
Methods inherited from class org.apache.pig.newplan.Operator
annotate, getAnnotation, getLocation, getName, getPlan, removeAnnotation, setLocation, setPlan
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

LOCube

public LOCube(LogicalPlan plan)

LOCube

public LOCube(OperatorPlan plan,
              MultiMap<Integer,LogicalExpressionPlan> expressionPlans)
Method Detail

getSchema

public LogicalSchema getSchema()
                        throws FrontendException
Description copied from class: LogicalRelationalOperator
Get the schema for the output of this relational operator. This does not merely return the schema variable. If schema is not yet set, this will attempt to construct it. Therefore it is abstract since each operator will need to construct its schema differently.

Specified by:
getSchema in class LogicalRelationalOperator
Returns:
the schema
Throws:
FrontendException

accept

public void accept(PlanVisitor v)
            throws FrontendException
Description copied from class: Operator
Accept a visitor at this node in the graph.

Specified by:
accept in class Operator
Parameters:
v - Visitor to accept.
Throws:
FrontendException
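The general shape of accepting a visitor at a node can be sketched without any Pig classes. Everything in this example (VisitorSketch, NodeVisitor, CubeNode, AliasCollector) is hypothetical and only illustrates the double-dispatch pattern; it does not show how LOCube or PlanVisitor are actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the visitor pattern; not Pig's classes.
public class VisitorSketch {

    // Stand-in for a plan visitor.
    interface NodeVisitor {
        void visit(CubeNode node);
    }

    // Stand-in for an operator node in a plan graph.
    static class CubeNode {
        final String alias;

        CubeNode(String alias) {
            this.alias = alias;
        }

        // The node hands itself to the visitor, which decides what to do.
        void accept(NodeVisitor v) {
            v.visit(this);
        }
    }

    // A visitor that records the alias of every node it visits.
    static class AliasCollector implements NodeVisitor {
        final List<String> seen = new ArrayList<>();

        @Override
        public void visit(CubeNode node) {
            seen.add(node.alias);
        }
    }

    public static void main(String[] args) {
        AliasCollector collector = new AliasCollector();
        new CubeNode("eventcube").accept(collector);
        System.out.println(collector.seen);
    }
}
```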

isEqual

public boolean isEqual(Operator other)
                throws FrontendException
Description copied from class: Operator
This is like a shallow equals comparison. It returns true if two operators have equivalent properties, even if they are different objects. Here, equivalent properties means an equivalent plan and an equivalent name.

Specified by:
isEqual in class Operator
Returns:
true if the two objects have equivalent properties, else false
Throws:
FrontendException

getExpressionPlans

public MultiMap<Integer,LogicalExpressionPlan> getExpressionPlans()

setExpressionPlans

public void setExpressionPlans(MultiMap<Integer,LogicalExpressionPlan> plans)

resetUid

public void resetUid()
Description copied from class: LogicalRelationalOperator
Erase all cached uids and regenerate them when the schema is regenerated. This process is currently used only in ImplicitSplitInsert, which inserts a split and invalidates some uids in the plan.

Overrides:
resetUid in class LogicalRelationalOperator

getInputs

public List<Operator> getInputs(LogicalPlan plan)

getOperations

public List<String> getOperations()

setOperations

public void setOperations(List<String> operations)


Copyright © 2007-2012 The Apache Software Foundation