org.apache.pig.data
Class BagFactory

java.lang.Object
  extended by org.apache.pig.data.BagFactory
Direct Known Subclasses:
DefaultBagFactory

@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class BagFactory
extends Object

Factory for constructing the different types of bags. This class is abstract so that users can supply their own bag factory, one that returns their own implementation of a bag. If the property pig.data.bag.factory.name is set to a class name and pig.data.bag.factory.jar is set to a URL pointing to a jar that contains that class, then getInstance() creates an instance of the named class using the indicated jar. Otherwise, it creates an instance of DefaultBagFactory.
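
The lookup described above can be sketched with JDK classes only. This is a hedged illustration, not Pig's source: SketchBagFactory and SketchDefaultBagFactory are hypothetical stand-ins, reading the two properties as JVM system properties is an assumption, and this page does not say how a failed load is handled (the sketch throws).

```java
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

// Stand-in for illustration only; this is NOT org.apache.pig.data.BagFactory.
abstract class SketchBagFactory {
    private static SketchBagFactory self;

    // Returns the singleton, choosing the implementation on first use.
    static synchronized SketchBagFactory getInstance() {
        if (self == null) {
            // Assumption: both properties are JVM system properties.
            String name = System.getProperty("pig.data.bag.factory.name");
            String jar = System.getProperty("pig.data.bag.factory.jar");
            // Both must be set; otherwise fall back to the default factory.
            self = (name != null && jar != null)
                    ? load(name, jar)
                    : new SketchDefaultBagFactory();
        }
        return self;
    }

    // Loads the named class from the jar URL and instantiates it.
    private static SketchBagFactory load(String name, String jar) {
        try {
            ClassLoader loader = new URLClassLoader(
                    new URL[] { URI.create(jar).toURL() },
                    SketchBagFactory.class.getClassLoader());
            Object o = Class.forName(name, true, loader)
                    .getDeclaredConstructor().newInstance();
            return (SketchBagFactory) o;
        } catch (Exception e) {
            // Assumption: fail loudly rather than silently using the default.
            throw new IllegalStateException(
                    "Unable to instantiate bag factory " + name, e);
        }
    }

    // Mirrors resetSelf(): forces the lookup to run again (tests only).
    static synchronized void resetSelf() {
        self = null;
    }
}

// Hypothetical default implementation used when the properties are unset.
class SketchDefaultBagFactory extends SketchBagFactory { }
```

Because the choice is cached in a singleton, the properties only take effect if they are set before the first call to getInstance().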


Constructor Summary
protected BagFactory()
          Construct a new BagFactory
 
Method Summary
static BagFactory getInstance()
          Get a reference to the singleton factory.
abstract  DataBag newDefaultBag()
          Get a default (unordered, not distinct) data bag.
abstract  DataBag newDefaultBag(List<Tuple> listOfTuples)
          Get a default (unordered, not distinct) data bag with an existing list of tuples inserted into the bag.
abstract  DataBag newDistinctBag()
          Get a distinct data bag.
abstract  DataBag newSortedBag(Comparator<Tuple> comp)
          Get a sorted data bag.
protected  void registerBag(DataBag b)
          Register a bag with the SpillableMemoryManager.
static void resetSelf()
          Provided for testing purposes only.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BagFactory

protected BagFactory()
Construct a new BagFactory

Method Detail

getInstance

public static BagFactory getInstance()
Get a reference to the singleton factory.

Returns:
BagFactory

newDefaultBag

public abstract DataBag newDefaultBag()
Get a default (unordered, not distinct) data bag.

Returns:
default data bag.

newDefaultBag

public abstract DataBag newDefaultBag(List<Tuple> listOfTuples)
Get a default (unordered, not distinct) data bag with an existing list of tuples inserted into the bag.

Parameters:
listOfTuples - list of tuples to be placed in the bag. This list might not be copied; the created bag may use it directly.
Returns:
default data bag.
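
The note that the list may be used directly has a practical consequence: later changes to the list can show up in the bag. The sketch below illustrates that aliasing with a hypothetical stand-in class (AliasingBagSketch is not a Pig type, and tuples are modelled as List<Object>). It assumes the worst case, where the implementation keeps the caller's list as its backing storage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for illustration only; NOT a Pig DataBag.
class AliasingBagSketch {
    private final List<List<Object>> tuples;

    // Keeps the caller's list as backing storage, without copying it.
    AliasingBagSketch(List<List<Object>> listOfTuples) {
        this.tuples = listOfTuples;
    }

    long size() {
        return tuples.size();
    }

    // Returns {size of the bag sharing the list, size of the bag given a copy}.
    static long[] demo() {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(Arrays.<Object>asList("a", 1));

        AliasingBagSketch shared = new AliasingBagSketch(rows);
        // A defensive copy made by the caller isolates the bag from later edits.
        AliasingBagSketch isolated = new AliasingBagSketch(new ArrayList<>(rows));

        // Mutate the original list after both bags exist.
        rows.add(Arrays.<Object>asList("b", 2));

        return new long[] { shared.size(), isolated.size() };
    }
}
```

In this model the shared bag sees the tuple added afterwards and the isolated one does not. With the real API, the safe options are to stop using the list after handing it over or to pass in a copy.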

newSortedBag

public abstract DataBag newSortedBag(Comparator<Tuple> comp)
Get a sorted data bag. Sorted bags guarantee that when an iterator is opened on the bag the tuples will be returned in sorted order.

Parameters:
comp - Comparator that controls how the data is sorted. If null, the default comparator is used.
Returns:
a sorted data bag
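
As a rough model of this contract, the following JDK-only sketch returns elements in comparator order whenever an iterator is opened, and treats a null comparator as a request for a default ordering. SortedBagSketch is a hypothetical stand-in, not Pig's sorted bag. Using the natural ordering as the default is an assumption made for the sketch, as is sorting at iteration time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Stand-in for illustration only; NOT a Pig DataBag.
class SortedBagSketch<T extends Comparable<T>> implements Iterable<T> {
    private final List<T> items = new ArrayList<>();
    private final Comparator<T> comp;

    SortedBagSketch(Comparator<T> comp) {
        // A null comparator selects the default (here: natural ordering).
        this.comp = (comp != null) ? comp : Comparator.<T>naturalOrder();
    }

    void add(T item) {
        items.add(item);
    }

    // Sorts a snapshot so that insertion order never leaks to the caller.
    @Override
    public Iterator<T> iterator() {
        List<T> sorted = new ArrayList<>(items);
        sorted.sort(comp);
        return sorted.iterator();
    }
}
```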

newDistinctBag

public abstract DataBag newDistinctBag()
Get a distinct data bag. Distinct bags guarantee that when an iterator is opened on the bag, no two tuples returned from the iterator will be equal.

Returns:
distinct data bag
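
The distinctness guarantee can be modelled in a few lines. DistinctBagSketch is a hypothetical stand-in, not Pig's distinct bag. It removes duplicates at insertion time, which is an assumption: the contract above only constrains what the iterator returns, not when duplicates are dropped.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Stand-in for illustration only; NOT a Pig DataBag.
class DistinctBagSketch<T> implements Iterable<T> {
    // A set drops elements that are equal (per equals/hashCode) to one already held.
    private final Set<T> items = new LinkedHashSet<>();

    void add(T item) {
        items.add(item);
    }

    long size() {
        return items.size();
    }

    // No two elements returned by this iterator are equal to each other.
    @Override
    public Iterator<T> iterator() {
        return items.iterator();
    }
}
```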

registerBag

protected void registerBag(DataBag b)
Register a bag with the SpillableMemoryManager. If the bags created by an implementation of BagFactory are managed by the SpillableMemoryManager, this method should be called each time a new bag is created.

Parameters:
b - bag to be registered.
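
For a custom factory, the practical rule is to call registerBag from every creation method, once per new bag. The sketch below shows that pattern with hypothetical stand-ins: RegisteringFactorySketch and MyFactorySketch are not Pig classes, a plain list stands in for the SpillableMemoryManager, and bags are modelled as List<String>.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for illustration only; NOT org.apache.pig.data.BagFactory.
abstract class RegisteringFactorySketch {
    // Stand-in for the SpillableMemoryManager: records what was registered.
    final List<Object> registered = new ArrayList<>();

    // Counterpart of registerBag(DataBag b) in this model.
    protected void registerBag(Object bag) {
        registered.add(bag);
    }

    abstract List<String> newDefaultBag();
}

// Hypothetical user-supplied factory.
class MyFactorySketch extends RegisteringFactorySketch {
    @Override
    List<String> newDefaultBag() {
        List<String> bag = new ArrayList<>();
        registerBag(bag); // register every bag at creation time
        return bag;
    }
}
```

Registering inside the factory method, rather than leaving it to callers, means no bag can be created without the manager knowing about it.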

resetSelf

public static void resetSelf()
Provided for testing purposes only. This method should never be called by anything other than the unit tests.



Copyright © ${year} The Apache Software Foundation