org.apache.pig.data
Class ReadOnceBag

java.lang.Object
  extended by org.apache.pig.data.ReadOnceBag
All Implemented Interfaces:
Serializable, Comparable, Iterable<Tuple>, org.apache.hadoop.io.Writable, org.apache.hadoop.io.WritableComparable, DataBag, Spillable

public class ReadOnceBag
extends Object
implements DataBag

This bag does not store tuples in memory. Instead it reads them from an iterator, typically one provided by Hadoop. Use it when you already have an iterator over tuples and do not want to copy them into a new bag.
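The practical consequence of wrapping an iterator without copying is that the contents can be traversed only once. The following is a minimal sketch of that behaviour using a hypothetical stand-in class (SinglePassBagSketch, not part of Pig) and plain strings in place of tuples:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a read-once bag; this is not the Pig class.
final class SinglePassBagSketch<T> implements Iterable<T> {
    private final Iterator<T> source; // held by reference, never copied

    SinglePassBagSketch(Iterator<T> source) {
        this.source = source;
    }

    // Every call returns the same underlying iterator, so a second pass sees nothing.
    @Override
    public Iterator<T> iterator() {
        return source;
    }

    // Counts elements by walking the iterable once.
    static <T> int count(Iterable<T> items) {
        int n = 0;
        for (T ignored : items) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c");
        SinglePassBagSketch<String> bag = new SinglePassBagSketch<>(rows.iterator());
        System.out.println(count(bag)); // prints 3 (first pass)
        System.out.println(count(bag)); // prints 0 (iterator already exhausted)
    }
}
```

Code that needs to read the tuples more than once would have to copy them into a storing bag first.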

See Also:
Serialized Form

Nested Class Summary
protected  class ReadOnceBag.ReadOnceBagIterator
           
 
Field Summary
protected  PigNullableWritable keyWritable
           
protected  Packager pkgr
           
protected  Iterator<NullableTuple> tupIter
           
 
Constructor Summary
ReadOnceBag(Packager pkgr, Iterator<NullableTuple> tupIter, PigNullableWritable keyWritable)
          Creates a bag from an existing iterator of tuples, taking ownership of the iterator and NOT copying its elements.
 
Method Summary
 void add(Tuple t)
          Add a tuple to the bag.
 void addAll(DataBag b)
          Add contents of a bag to the bag.
 void clear()
          Clear out the contents of the bag, both on disk and in memory.
 int compareTo(Object o)
           
 boolean equals(Object other)
           
 long getMemorySize()
          Requests that an object return an estimate of its in-memory size.
 int hashCode()
           
 boolean isDistinct()
          Find out if the bag is distinct.
 boolean isSorted()
          Find out if the bag is sorted.
 Iterator<Tuple> iterator()
          Get an iterator to the bag.
 void markStale(boolean stale)
          This is used by FuncEvalSpec.FakeDataBag.
 void readFields(DataInput in)
           
 long size()
          Get the number of elements in the bag, both in memory and on disk.
 long spill()
          Instructs an object to spill whatever it can to disk and release references to any data structures it spills.
 void write(DataOutput out)
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

pkgr

protected Packager pkgr

tupIter

protected transient Iterator<NullableTuple> tupIter

keyWritable

protected PigNullableWritable keyWritable

Constructor Detail

ReadOnceBag

public ReadOnceBag(Packager pkgr,
                   Iterator<NullableTuple> tupIter,
                   PigNullableWritable keyWritable)
Creates a bag from an existing iterator of tuples, taking ownership of the iterator and NOT copying its elements.

Parameters:
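pkgr - Packager (described in the original Javadoc as POPackageLite)
tupIter - Iterator over NullableTuple elements
keyWritable - key, as a PigNullableWritable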
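"Taking ownership" means the caller should stop using the iterator once it has been passed in. The sketch below illustrates why, using a hypothetical Wrapper class and integers in place of Pig types. It makes no claim about ReadOnceBag internals beyond the fact that the iterator is not copied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class OwnershipSketch {
    // Hypothetical wrapper: keeps the reference it is given and copies nothing.
    static final class Wrapper<T> {
        private final Iterator<T> it;

        Wrapper(Iterator<T> it) {
            this.it = it;
        }

        // Drains whatever the shared iterator still holds.
        List<T> remaining() {
            List<T> out = new ArrayList<>();
            while (it.hasNext()) {
                out.add(it.next());
            }
            return out;
        }
    }

    static List<Integer> demo() {
        Iterator<Integer> source = Arrays.asList(1, 2, 3).iterator();
        Wrapper<Integer> bag = new Wrapper<>(source);
        source.next(); // caller keeps using the iterator after handing it over
        return bag.remaining(); // the element consumed above is gone
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [2, 3]
    }
}
```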

Method Detail

getMemorySize

public long getMemorySize()
Description copied from interface: Spillable
Requests that an object return an estimate of its in-memory size.

Specified by:
getMemorySize in interface Spillable
Returns:
estimated in-memory size.

spill

public long spill()
Description copied from interface: Spillable
Instructs an object to spill whatever it can to disk and release references to any data structures it spills.

Specified by:
spill in interface Spillable
Returns:
number of objects spilled.

add

public void add(Tuple t)
Description copied from interface: DataBag
Add a tuple to the bag.

Specified by:
add in interface DataBag
Parameters:
t - tuple to add.

addAll

public void addAll(DataBag b)
Description copied from interface: DataBag
Add contents of a bag to the bag.

Specified by:
addAll in interface DataBag
Parameters:
b - bag to add contents of.

clear

public void clear()
Description copied from interface: DataBag
Clear out the contents of the bag, both on disk and in memory. Any attempt to read after this is called will produce undefined results.

Specified by:
clear in interface DataBag

isDistinct

public boolean isDistinct()
Description copied from interface: DataBag
Find out if the bag is distinct.

Specified by:
isDistinct in interface DataBag
Returns:
true if the bag is a distinct bag, false otherwise.

isSorted

public boolean isSorted()
Description copied from interface: DataBag
Find out if the bag is sorted.

Specified by:
isSorted in interface DataBag
Returns:
true if this is a sorted data bag, false otherwise.

iterator

public Iterator<Tuple> iterator()
Description copied from interface: DataBag
Get an iterator to the bag. For default and distinct bags, no particular order is guaranteed. For sorted bags the order is guaranteed to be sorted according to the provided comparator.

Specified by:
iterator in interface Iterable<Tuple>
Specified by:
iterator in interface DataBag
Returns:
tuple iterator
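The field types (an Iterator<NullableTuple> plus a Packager and a key) suggest that the returned iterator converts each wrapped element to a Tuple as it is read. That is an inference from the signatures on this page, not documented behaviour. A minimal sketch of such lazy conversion, with a hypothetical Wrapped class standing in for NullableTuple and a BiFunction standing in for the packager:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;

final class LazyUnwrapSketch {
    // Hypothetical stand-in for a wrapped value that carries an input index.
    static final class Wrapped {
        final int index;
        final String value;

        Wrapped(int index, String value) {
            this.index = index;
            this.value = value;
        }
    }

    // Returns an iterator that converts each element only when next() is called.
    static Iterator<String> unwrapping(Iterator<Wrapped> raw,
                                       String key,
                                       BiFunction<String, Wrapped, String> unwrap) {
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return raw.hasNext();
            }

            @Override
            public String next() {
                return unwrap.apply(key, raw.next());
            }
        };
    }

    static List<String> demo() {
        Iterator<Wrapped> raw =
                Arrays.asList(new Wrapped(0, "x"), new Wrapped(1, "y")).iterator();
        Iterator<String> it =
                unwrapping(raw, "k", (key, w) -> key + ":" + w.index + ":" + w.value);
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [k:0:x, k:1:y]
    }
}
```

Because conversion happens inside next(), no converted elements are held beyond the one currently being returned.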

markStale

public void markStale(boolean stale)
Description copied from interface: DataBag
This is used by FuncEvalSpec.FakeDataBag.

Specified by:
markStale in interface DataBag
Parameters:
stale - Set stale state.

size

public long size()
Description copied from interface: DataBag
Get the number of elements in the bag, both in memory and on disk.

Specified by:
size in interface DataBag
Returns:
number of elements in the bag
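The description above is inherited from DataBag and assumes a bag that stores its contents. For a container backed only by a forward iterator, the generic way to count elements is to drain the iterator, which leaves nothing to read afterwards. The sketch below illustrates that trade-off with a hypothetical helper. It does not describe how ReadOnceBag implements size().

```java
import java.util.Arrays;
import java.util.Iterator;

final class CountingSketch {
    // Hypothetical helper: counts elements by consuming the iterator.
    static long countByDraining(Iterator<?> it) {
        long n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b").iterator();
        long n = countByDraining(it);
        System.out.println(n + " " + it.hasNext()); // prints 2 false
    }
}
```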

readFields

public void readFields(DataInput in)
                throws IOException
Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException

write

public void write(DataOutput out)
           throws IOException
Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException

compareTo

public int compareTo(Object o)
Specified by:
compareTo in interface Comparable

equals

public boolean equals(Object other)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object


Copyright © 2007-2012 The Apache Software Foundation