IndexableLoadFunc (Pig 0.18.0 API)

All Known Implementing Classes:

DefaultIndexableLoader, TezIndexableLoader
```
@InterfaceAudience.Public
@InterfaceStability.Evolving
public interface IndexableLoadFunc
```
This interface is intended for use by LoadFunc implementations that have an internal index for sorted data and can use the index to support merge join in Pig. Interaction with the index is abstracted away by the methods in this interface, which the Pig runtime calls in a particular sequence to get the records it needs to perform the merge-based join. The Pig runtime makes the calls in this order:
1. LoadFunc.setUDFContextSignature(String)
2. initialize(Configuration)
3. LoadFunc.setLocation(String, org.apache.hadoop.mapreduce.Job)
4. seekNear(Tuple)
5. LoadFunc.getNext() called multiple times to retrieve data and perform the join
6. close()
Since:

Pig 0.6
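
The call sequence above can be sketched as a small driver. This is a minimal illustration, not Pig's actual runtime code: `SketchLoader`, `RecordingLoader`, and `CallSequenceSketch` are hypothetical names, and plain Java types stand in for Pig's `Tuple` and Hadoop's `Configuration` and `Job` so that the block compiles without Pig on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the LoadFunc and IndexableLoadFunc methods listed above.
// Simplified types replace Tuple, Configuration, and Job (an assumption of this sketch).
interface SketchLoader {
    void setUDFContextSignature(String signature);
    void initialize(Map<String, String> conf) throws IOException;
    void setLocation(String location) throws IOException;
    void seekNear(List<Integer> keys) throws IOException;
    Integer getNext() throws IOException; // null signals end of input
    void close() throws IOException;
}

// Toy loader over an in-memory ascending array; it records every call it receives.
class RecordingLoader implements SketchLoader {
    final List<String> calls = new ArrayList<>();
    private final int[] sorted = {1, 3, 5, 7, 9};
    private int pos = 0;

    public void setUDFContextSignature(String signature) { calls.add("setUDFContextSignature"); }
    public void initialize(Map<String, String> conf) { calls.add("initialize"); }
    public void setLocation(String location) { calls.add("setLocation"); }

    public void seekNear(List<Integer> keys) {
        calls.add("seekNear");
        int key = keys.get(0);
        pos = 0;
        // Linear scan for brevity: stop at the first record >= key.
        while (pos < sorted.length && sorted[pos] < key) {
            pos++;
        }
    }

    public Integer getNext() {
        calls.add("getNext");
        return pos < sorted.length ? sorted[pos++] : null;
    }

    public void close() { calls.add("close"); }
}

class CallSequenceSketch {
    // Drives the loader in the documented order for a single join key.
    static void drive(SketchLoader loader, int joinKey) {
        try {
            loader.setUDFContextSignature("sketch-signature");
            loader.initialize(new HashMap<>());
            loader.setLocation("/hypothetical/path");
            loader.seekNear(Collections.singletonList(joinKey));
            Integer record;
            // Read until the records pass the join key, then stop early.
            while ((record = loader.getNext()) != null && record <= joinKey) {
                // A real runtime would join `record` against the other input here.
            }
            loader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `CallSequenceSketch.drive(new RecordingLoader(), 5)` leaves the loader's `calls` list holding the method names in the order they were invoked, which makes the ordering easy to inspect.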

Method Summary

Modifier and Type	Method and Description
`void`	`close()` A method called by the Pig runtime to give an opportunity for implementations to perform cleanup actions like closing the underlying input stream.
`void`	`initialize(org.apache.hadoop.conf.Configuration conf)` This method is called by the Pig runtime to allow the IndexableLoadFunc to perform any initialization actions.
`void`	`seekNear(Tuple keys)` This method is called by the Pig runtime to indicate to the LoadFunc to position its underlying input stream near the keys supplied as the argument.

- Method Detail
  - initialize
```
void initialize(org.apache.hadoop.conf.Configuration conf)
         throws java.io.IOException
```
    This method is called by the Pig runtime to allow the IndexableLoadFunc to perform any initialization actions.
    
    Parameters:
    
    conf - The job configuration object
    
    Throws:
    
    java.io.IOException
  - seekNear
```
void seekNear(Tuple keys)
       throws java.io.IOException
```
    This method is called by the Pig runtime to tell the LoadFunc to position its underlying input stream near the keys supplied as the argument. Specifically:
    
    1. If the key(s) are present in the input stream, the implementation should set its read position either to the record with the biggest key(s) less than the key(s) supplied, or to the record with the first occurrence of the key(s) supplied.
    2. If the key(s) are absent from the input stream, the implementation should set its read position either to the record with the biggest key(s) less than the key(s) supplied, or to the first record with the smallest key(s) greater than the key(s) supplied.
    
    The same description holds for data sorted in descending order, with "biggest" and "less than" replaced by "smallest" and "greater than" and vice versa.
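    
    One of the positions permitted above (the first occurrence of the key if it is present, otherwise the first record with the smallest greater key) can be sketched as a lower-bound binary search. This is only an illustration under simplifying assumptions: `SeekNearSketch` is a hypothetical name, an ascending `int[]` stands in for the indexed input, and a single integer stands in for the key Tuple. When every record is smaller than the key, the sketch returns the array length (end of input), a case the description does not address.

```java
class SeekNearSketch {
    // Returns the index of the first element >= key in an ascending array,
    // or sortedAsc.length when every element is smaller than key.
    static int seekNear(int[] sortedAsc, int key) {
        int lo = 0;
        int hi = sortedAsc.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1; // overflow-safe midpoint
            if (sortedAsc[mid] < key) {
                lo = mid + 1; // everything up to mid is too small
            } else {
                hi = mid;     // mid may be the first match; keep it in range
            }
        }
        return lo;
    }
}
```

    For the input `{1, 3, 3, 5, 9}`, seeking key 3 positions at index 1 (its first occurrence) and seeking key 4 positions at index 3 (the record holding 5, the smallest greater key).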
    
    Parameters:
    
    keys - Tuple with join keys (which are a prefix of the sort keys of the input data). For example, if the data is sorted on the columns in positions 2, 4, 5, any of the following Tuples is valid as an argument value:
    
    - (fieldAt(2))
    - (fieldAt(2), fieldAt(4))
    - (fieldAt(2), fieldAt(4), fieldAt(5))
    
    The following are some invalid cases:
    
    - (fieldAt(4))
    - (fieldAt(2), fieldAt(5))
    - (fieldAt(4), fieldAt(5))
    
    Throws:
    
    java.io.IOException - When the loadFunc is unable to position to the required point in its input stream
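    
    The prefix rule for the keys parameter can be expressed as a short check on column positions. This is a sketch, not part of the Pig API: `KeyPrefixSketch` and `isValidJoinKey` are hypothetical names, and treating an empty key list as invalid is an assumption made here.

```java
class KeyPrefixSketch {
    // True when keyColumns is a non-empty leading prefix of sortColumns.
    static boolean isValidJoinKey(int[] sortColumns, int[] keyColumns) {
        // Assumption: an empty key list is rejected.
        if (keyColumns.length == 0 || keyColumns.length > sortColumns.length) {
            return false;
        }
        for (int i = 0; i < keyColumns.length; i++) {
            // Each key column must match the sort column at the same position.
            if (keyColumns[i] != sortColumns[i]) {
                return false;
            }
        }
        return true;
    }
}
```

    With sort columns `{2, 4, 5}`, the key columns `{2}`, `{2, 4}`, and `{2, 4, 5}` pass the check, while `{4}`, `{2, 5}`, and `{4, 5}` fail it.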
  - close
```
void close()
    throws java.io.IOException
```
    A method called by the Pig runtime to give an opportunity for implementations to perform cleanup actions like closing the underlying input stream. This is necessary because, while performing a join, the Pig runtime may determine that no further join is possible with the remaining records and may tell the loader to clean up by calling this method.
    
    Throws:
    
    java.io.IOException - if the loadfunc is unable to perform its close actions.
