public class DefaultIndexableLoader extends LoadFunc implements IndexableLoadFunc
Constructor and Description |
---|
DefaultIndexableLoader(java.lang.String loaderFuncSpec, java.lang.String indexFile, java.lang.String indexFileLoadFuncSpec, java.lang.String scope, java.lang.String inputLocation) |
Modifier and Type | Method and Description |
---|---|
void | close() - A method called by the Pig runtime to give an opportunity for implementations to perform cleanup actions like closing the underlying input stream. |
org.apache.hadoop.mapreduce.InputFormat | getInputFormat() - This will be called during planning on the front end. |
LoadCaster | getLoadCaster() - This will be called on the front end during planning and not on the back end during execution. |
Tuple | getNext() - Retrieves the next tuple to be processed. |
void | initialize(org.apache.hadoop.conf.Configuration conf) - This method is called by the Pig runtime to allow the IndexableLoadFunc to perform any initialization actions. |
void | prepareToRead(org.apache.hadoop.mapreduce.RecordReader reader, PigSplit split) - Initializes LoadFunc for reading data. |
void | seekNear(Tuple keys) - This method is called by the Pig runtime to indicate to the LoadFunc to position its underlying input stream near the keys supplied as the argument. |
void | setIndexFile(java.lang.String indexFile) |
void | setLocation(java.lang.String location, org.apache.hadoop.mapreduce.Job job) - Communicate to the loader the location of the object(s) being loaded. |
Methods inherited from class LoadFunc: getAbsolutePath, getCacheFiles, getPathStrings, getShipFiles, join, relativeToAbsolutePath, setUDFContextSignature, warn
public DefaultIndexableLoader(java.lang.String loaderFuncSpec, java.lang.String indexFile, java.lang.String indexFileLoadFuncSpec, java.lang.String scope, java.lang.String inputLocation)
public void seekNear(Tuple keys) throws java.io.IOException
Description copied from interface: IndexableLoadFunc
This method is called by the Pig runtime to indicate to the LoadFunc to position its underlying input stream near the keys supplied as the argument.
Specified by: seekNear in interface IndexableLoadFunc
Parameters:
keys - Tuple with join keys (which are a prefix of the sort keys of the input data). For example, if the data is sorted on the columns at positions 2, 4, and 5, any of the following Tuples is a valid argument value:
(fieldAt(2))
(fieldAt(2), fieldAt(4))
(fieldAt(2), fieldAt(4), fieldAt(5))
The following are some invalid cases:
(fieldAt(4))
(fieldAt(2), fieldAt(5))
(fieldAt(4), fieldAt(5))
Throws:
java.io.IOException - when the loadFunc is unable to position to the required point in its input stream
public Tuple getNext() throws java.io.IOException
Description copied from class: LoadFunc
Retrieves the next tuple to be processed.
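The prefix rule for seekNear keys can be made concrete with a small check. The following is only a sketch: the class SeekKeyPrefix and its method isValidPrefix are hypothetical names, not part of Pig, and treating an empty key tuple as invalid is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: illustrates the prefix rule for seekNear keys.
final class SeekKeyPrefix {
    private SeekKeyPrefix() {}

    /**
     * Returns true when keyColumns is a non-empty leading prefix of sortColumns,
     * that is, the i-th key column equals the i-th sort column for every i.
     * Rejecting an empty key list is an assumption of this sketch.
     */
    static boolean isValidPrefix(List<Integer> sortColumns, List<Integer> keyColumns) {
        if (keyColumns.isEmpty() || keyColumns.size() > sortColumns.size()) {
            return false;
        }
        return sortColumns.subList(0, keyColumns.size()).equals(keyColumns);
    }

    public static void main(String[] args) {
        List<Integer> sortColumns = Arrays.asList(2, 4, 5);
        System.out.println(isValidPrefix(sortColumns, Arrays.asList(2)));       // true
        System.out.println(isValidPrefix(sortColumns, Arrays.asList(2, 4, 5))); // true
        System.out.println(isValidPrefix(sortColumns, Arrays.asList(4)));       // false
        System.out.println(isValidPrefix(sortColumns, Arrays.asList(2, 5)));    // false
    }
}
```

A key tuple is accepted only when it starts at the first sort column and skips none, which matches the valid and invalid cases listed for the keys parameter.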
public void close() throws java.io.IOException
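Description copied from interface: IndexableLoadFunc
A method called by the Pig runtime to give an opportunity for implementations to perform cleanup actions like closing the underlying input stream.
Specified by: close in interface IndexableLoadFunc
Throws:
java.io.IOException - if the loadfunc is unable to perform its close actions.
public void initialize(org.apache.hadoop.conf.Configuration conf) throws java.io.IOException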
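Description copied from interface: IndexableLoadFunc
This method is called by the Pig runtime to allow the IndexableLoadFunc to perform any initialization actions.
Specified by: initialize in interface IndexableLoadFunc
Parameters:
conf - The job configuration object
Throws:
java.io.IOException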
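The IndexableLoadFunc methods (initialize, seekNear, close) together with getNext suggest a call sequence of initialize, then seekNear, then repeated getNext, then close. The sketch below illustrates that sequence with an in-memory stand-in. SortedKeyReader is a hypothetical class that does not implement any Pig interface, and positioning at the first key greater than or equal to the requested key is an assumption about what "near" means here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in; it does not implement Pig's IndexableLoadFunc.
final class SortedKeyReader {
    private final int[] sortedKeys; // must already be sorted in ascending order
    private int position;
    private boolean open;

    SortedKeyReader(int[] sortedKeys) {
        this.sortedKeys = sortedKeys.clone();
    }

    void initialize() {
        position = 0;
        open = true;
    }

    // Assumption of this sketch: "near" means the first key >= the requested key.
    void seekNear(int key) {
        ensureOpen();
        int lo = 0;
        int hi = sortedKeys.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedKeys[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        position = lo;
    }

    // Returns the next key, or null once the input is exhausted.
    Integer getNext() {
        ensureOpen();
        return position < sortedKeys.length ? Integer.valueOf(sortedKeys[position++]) : null;
    }

    void close() {
        open = false;
    }

    private void ensureOpen() {
        if (!open) {
            throw new IllegalStateException("reader is not initialized or already closed");
        }
    }

    // Runs the whole sequence: initialize, seekNear, getNext until null, close.
    static List<Integer> readFrom(int[] sortedKeys, int seekKey) {
        SortedKeyReader reader = new SortedKeyReader(sortedKeys);
        List<Integer> result = new ArrayList<>();
        reader.initialize();
        try {
            reader.seekNear(seekKey);
            for (Integer key = reader.getNext(); key != null; key = reader.getNext()) {
                result.add(key);
            }
        } finally {
            reader.close();
        }
        return result;
    }
}
```

Calling close in a finally block mirrors the purpose stated for close(): cleanup should run even when positioning or reading fails.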
public org.apache.hadoop.mapreduce.InputFormat getInputFormat() throws java.io.IOException
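Description copied from class: LoadFunc
This will be called during planning on the front end.
Specified by: getInputFormat in class LoadFunc
Throws:
java.io.IOException - if there is an exception during InputFormat construction
public LoadCaster getLoadCaster() throws java.io.IOException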
Description copied from class: LoadFunc
This will be called on the front end during planning and not on the back end during execution.
Overrides: getLoadCaster in class LoadFunc
Returns:
the LoadCaster associated with this loader. Returning null indicates that casts from byte array are not supported for this loader.
Throws:
java.io.IOException - if there is an exception during LoadCaster construction
public void prepareToRead(org.apache.hadoop.mapreduce.RecordReader reader, PigSplit split)
Description copied from class: LoadFunc
Initializes LoadFunc for reading data.
Specified by: prepareToRead in class LoadFunc
Parameters:
reader - RecordReader to be used by this instance of the LoadFunc
split - The input PigSplit to process
public void setLocation(java.lang.String location, org.apache.hadoop.mapreduce.Job job) throws java.io.IOException
Description copied from class: LoadFunc
Communicate to the loader the location of the object(s) being loaded. The location string passed to the LoadFunc here is the return value of LoadFunc.relativeToAbsolutePath(String, Path). Implementations should use this method to communicate the location (and any other information) to its underlying InputFormat through the Job object. This method is called multiple times in both the frontend and the backend, so implementations should ensure that repeated calls have no inconsistent side effects.
Specified by: setLocation in class LoadFunc
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, Path)
job - the Job object; store or retrieve earlier stored information from the UDFContext
Throws:
java.io.IOException - if the location is not valid.
public void setIndexFile(java.lang.String indexFile)
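Because setLocation may be called several times, one way to avoid inconsistent side effects is to make each call overwrite state rather than accumulate it. The sketch below shows that idea only. IdempotentLocationHolder is a hypothetical class, a plain Map stands in for the Job's configuration, and the key "example.input.location" is invented for illustration and is not a real Pig or Hadoop property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a plain Map plays the role of the Job's configuration.
final class IdempotentLocationHolder {
    // Invented key for illustration; not a real Pig or Hadoop property name.
    static final String LOCATION_KEY = "example.input.location";

    private final Map<String, String> jobConf;
    private int calls;

    IdempotentLocationHolder(Map<String, String> jobConf) {
        this.jobConf = jobConf;
    }

    // Overwrites the stored location, so repeating the call leaves the same state.
    void setLocation(String location) {
        calls++;
        jobConf.put(LOCATION_KEY, location);
    }

    int callCount() {
        return calls;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        IdempotentLocationHolder holder = new IdempotentLocationHolder(conf);
        holder.setLocation("/data/sorted_input");
        holder.setLocation("/data/sorted_input"); // repeated call, same resulting state
        System.out.println(conf.size() + " entry after " + holder.callCount() + " calls");
    }
}
```

An implementation that appended the location to a list on every call would grow with each invocation, which is the kind of inconsistent side effect the description warns about.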
Copyright © 2007-2012 The Apache Software Foundation