org.apache.hadoop.zebra.pig.comparator
Class KeyGenerator

java.lang.Object
  extended by org.apache.hadoop.zebra.pig.comparator.KeyGenerator

public class KeyGenerator
extends Object

Generates binary keys for algorithmic comparators. A user may construct an algorithmic comparator by creating a ComparatorExpr object (through various static methods in this class). The user can then create a KeyGenerator object and use it to create binary keys for tuples. The KeyGenerator object can be reused for different tuples that conform to the same schema. Sorting the tuples by their binary keys yields the same ordering as sorting them with the algorithmic comparator. Basic idea (without optimization):

TODO Remove the strong dependency on Pig by adding a DatumExtractor interface that allows applications to extract a leaf datum from user objects, something like the following:
 interface DatumExtractor {
   Object extract(Object o);
 }
 
A user could then do something like this:
 class MyObject {
  int a;
  String b;
 }
 
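 // An anonymous class needs "()" after the interface name, and the
 // implemented interface method must be declared public.
 ComparatorExpr expr = KeyBuilder.createLeafExpr(new DatumExtractor() {
   public Object extract(Object o) {
     MyObject obj = (MyObject) o;
     return obj.b;
   }
 }, DataType.CHARARRAY);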
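
The proposal above can be tried out in plain Java. The following is a minimal sketch under stated assumptions: DatumExtractor is only proposed in the TODO and is not part of Zebra, and the comparatorOn helper is a hypothetical stand-in for a leaf expression, not KeyBuilder.createLeafExpr. It shows how an extractor could keep the comparison logic independent of the user's object type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Interface proposed in the TODO; not part of Zebra today. */
interface DatumExtractor {
    Object extract(Object o);
}

/** User-defined object, mirroring the example in the TODO. */
class MyObject {
    final int a;
    final String b;

    MyObject(int a, String b) {
        this.a = a;
        this.b = b;
    }
}

final class DatumExtractorSketch {

    /**
     * Hypothetical helper: compares arbitrary objects by the String datum
     * that the extractor pulls out of them.
     */
    static Comparator<Object> comparatorOn(DatumExtractor extractor) {
        return (x, y) -> {
            String dx = (String) extractor.extract(x);
            String dy = (String) extractor.extract(y);
            return dx.compareTo(dy);
        };
    }

    public static void main(String[] args) {
        // The extractor is the only code that knows about MyObject.
        DatumExtractor byB = new DatumExtractor() {
            @Override
            public Object extract(Object o) {
                return ((MyObject) o).b;
            }
        };

        List<Object> objects = new ArrayList<>(Arrays.<Object>asList(
            new MyObject(1, "pear"), new MyObject(2, "apple")));
        objects.sort(comparatorOn(byB));

        for (Object o : objects) {
            System.out.println(((MyObject) o).b);
        }
    }
}
```

The cast to String inside comparatorOn plays the role that the DataType.CHARARRAY argument plays in the TODO's example: the caller states which datum type the extractor is expected to return.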
 
TODO Change BagExpr to IteratorExpr, so that it may be used in a more general context (any Java collection).

TODO Add an ArrayExpr (for Java []).


Constructor Summary
KeyGenerator(ComparatorExpr expr)
          Create a key builder that can generate binary keys for the input key expression.
 
Method Summary
 org.apache.hadoop.io.BytesWritable generateKey(Tuple t)
          Generate the binary key for the input tuple.
 void illustrate(PrintStream ps)
          Illustrate how the key would be generated from source.
 void reset(ComparatorExpr expr)
          Reset the key builder for a new expression.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KeyGenerator

public KeyGenerator(ComparatorExpr expr)
Create a key builder that can generate binary keys for the input key expression.

Parameters:
expr - comparator expression
Method Detail

reset

public void reset(ComparatorExpr expr)
Reset the key builder for a new expression.

Parameters:
expr - comparator expression

generateKey

public org.apache.hadoop.io.BytesWritable generateKey(Tuple t)
                                               throws ExecException
Generate the binary key for the input tuple.

Parameters:
t - input tuple
Returns:
A BytesWritable containing the binary sorting key for the input tuple.
Throws:
ExecException
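
The ordering guarantee behind generateKey can be illustrated without Zebra or Pig. The following is a minimal, self-contained sketch and not Zebra's actual encoding: the Row class, the key layout (a sign-flipped big-endian int followed by ASCII bytes), and all method names are assumptions made for illustration. It shows that comparing keys as unsigned byte strings can reproduce the order of a field-by-field comparator.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; this is not Zebra's key encoding. */
final class BinaryKeySketch {

    /** Hypothetical stand-in for a two-field tuple (int, ASCII string). */
    static final class Row {
        final int a;
        final String b;

        Row(int a, String b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public String toString() {
            return "(" + a + ", " + b + ")";
        }
    }

    /** The "algorithmic comparator": order by a, then by b. */
    static int compareRows(Row x, Row y) {
        int c = Integer.compare(x.a, y.a);
        return c != 0 ? c : x.b.compareTo(y.b);
    }

    /** Builds a key whose unsigned byte order matches compareRows. */
    static byte[] key(Row r) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Flipping the sign bit makes negative ints sort before positive
        // ones when the big-endian bytes are compared as unsigned values.
        int flipped = r.a ^ 0x80000000;
        out.write(flipped >>> 24);
        out.write(flipped >>> 16);
        out.write(flipped >>> 8);
        out.write(flipped);
        // The string is the last field, so no terminator is needed here.
        // ASCII is assumed so that byte order equals String.compareTo order.
        byte[] s = r.b.getBytes(StandardCharsets.US_ASCII);
        out.write(s, 0, s.length);
        return out.toByteArray();
    }

    /** Lexicographic comparison of byte arrays, treating bytes as unsigned. */
    static int compareUnsigned(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int c = (x[i] & 0xff) - (y[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(3, "pear"), new Row(-7, "fig"),
            new Row(3, "apple"), new Row(0, "kiwi"));

        List<Row> byComparator = new ArrayList<>(rows);
        byComparator.sort(BinaryKeySketch::compareRows);

        List<Row> byKey = new ArrayList<>(rows);
        byKey.sort((x, y) -> compareUnsigned(key(x), key(y)));

        // Both lists should print in the same order.
        System.out.println(byComparator);
        System.out.println(byKey);
    }
}
```

A fixed-width field followed by a single trailing variable-length field keeps the sketch short. Keys with several variable-length fields would additionally need terminators or escaping so that one field's bytes cannot be mistaken for the next field's.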

illustrate

public void illustrate(PrintStream ps)
Illustrate how the key would be generated from source.

Parameters:
ps - The output print stream.


Copyright © The Apache Software Foundation