org.apache.pig
Class LoadFuncWrapper

java.lang.Object
  extended by org.apache.pig.LoadFunc
      extended by org.apache.pig.LoadFuncWrapper
Direct Known Subclasses:
LoadFuncMetadataWrapper

public class LoadFuncWrapper
extends LoadFunc

Convenience class to extend when decorating a LoadFunc. Subclasses must call setLoadFunc with an instance of LoadFunc before any other method is called. Otherwise, the method that is called throws an IllegalArgumentException.
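
The decorator arrangement described here can be illustrated without a Pig dependency. The following is a minimal sketch that uses stand-in types: RecordSource, RecordSourceWrapper, UpperCaseSource and ListSource are hypothetical names, not part of the Pig API, and the null check in delegate() only mimics the documented behavior of failing with an IllegalArgumentException when no wrapped instance has been set.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for LoadFunc: yields records until it returns null.
abstract class RecordSource {
    abstract String next();
}

// Hypothetical stand-in for LoadFuncWrapper: forwards calls to a delegate.
class RecordSourceWrapper extends RecordSource {
    private RecordSource wrapped;

    protected RecordSourceWrapper() {
    }

    // Must be called before any forwarding method is used.
    protected void setWrapped(RecordSource source) {
        this.wrapped = source;
    }

    // Fails fast if the subclass never set the delegate.
    protected RecordSource delegate() {
        if (wrapped == null) {
            throw new IllegalArgumentException("wrapped source must be set before use");
        }
        return wrapped;
    }

    @Override
    String next() {
        return delegate().next();
    }
}

// A decorator that upper-cases whatever the wrapped source produces.
class UpperCaseSource extends RecordSourceWrapper {
    UpperCaseSource(RecordSource source) {
        setWrapped(source);
    }

    @Override
    String next() {
        String record = super.next();
        return record == null ? null : record.toUpperCase(Locale.ROOT);
    }
}

// Simple in-memory source used only for the demonstration.
class ListSource extends RecordSource {
    private final Iterator<String> it;

    ListSource(List<String> records) {
        this.it = records.iterator();
    }

    @Override
    String next() {
        return it.hasNext() ? it.next() : null;
    }
}

class WrapperDemo {
    public static void main(String[] args) {
        RecordSource source = new UpperCaseSource(new ListSource(Arrays.asList("a", "b")));
        for (String r = source.next(); r != null; r = source.next()) {
            System.out.println(r);
        }
    }
}
```

The decorator overrides only the method whose behavior it changes; every other call would fall through to the wrapped instance unchanged.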


Constructor Summary
protected LoadFuncWrapper()
           
 
Method Summary
 org.apache.hadoop.mapreduce.InputFormat getInputFormat()
          This will be called during planning on the front end.
 LoadCaster getLoadCaster()
          This will be called on the front end during planning and not on the back end during execution.
protected  String getMethodName(int depth)
          Returns a method in the call stack at the given depth.
 Tuple getNext()
          Retrieves the next tuple to be processed.
protected  LoadFunc loadFunc()
           
 void prepareToRead(org.apache.hadoop.mapreduce.RecordReader reader, PigSplit split)
          Initializes LoadFunc for reading data.
 String relativeToAbsolutePath(String location, org.apache.hadoop.fs.Path curDir)
          This method is called by the Pig runtime in the front end to convert the input location to an absolute path if the location is relative.
protected  void setLoadFunc(LoadFunc loadFunc)
          The wrapped LoadFunc object must be set before method calls are made on this object.
 void setLocation(String location, org.apache.hadoop.mapreduce.Job job)
          Communicate to the loader the location of the object(s) being loaded.
 void setUDFContextSignature(String signature)
          This method will be called by Pig both in the front end and back end to pass a unique signature to the LoadFunc.
 
Methods inherited from class org.apache.pig.LoadFunc
getAbsolutePath, getPathStrings, join, warn
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LoadFuncWrapper

protected LoadFuncWrapper()
Method Detail

setLoadFunc

protected void setLoadFunc(LoadFunc loadFunc)
The wrapped LoadFunc object must be set before method calls are made on this object. Typically this is done in the constructor, but often the wrapped object cannot be properly initialized until later in the lifecycle of the wrapper object.

Parameters:
loadFunc - the LoadFunc instance to wrap
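
As an illustration of setting the wrapped object after construction, the sketch below picks its delegate only once a location is known. It is a self-contained example under stated assumptions: LineSource, CsvSource, TextSource and LateBoundSource are hypothetical stand-ins rather than Pig classes, and choosing the delegate by file extension is an invented policy used only to show late binding.

```java
// Hypothetical stand-in for LoadFunc.
abstract class LineSource {
    void setLocation(String location) {
    }

    abstract String describe();
}

class CsvSource extends LineSource {
    @Override
    String describe() {
        return "csv";
    }
}

class TextSource extends LineSource {
    @Override
    String describe() {
        return "text";
    }
}

// Hypothetical wrapper whose delegate is not known at construction time.
class LateBoundSource extends LineSource {
    private LineSource wrapped;

    protected void setWrapped(LineSource source) {
        this.wrapped = source;
    }

    protected LineSource delegate() {
        if (wrapped == null) {
            throw new IllegalArgumentException("wrapped source must be set before use");
        }
        return wrapped;
    }

    @Override
    void setLocation(String location) {
        // The delegate can only be chosen once the location is known.
        setWrapped(location.endsWith(".csv") ? new CsvSource() : new TextSource());
        delegate().setLocation(location);
    }

    @Override
    String describe() {
        return delegate().describe();
    }
}

class LateBindingDemo {
    public static void main(String[] args) {
        LateBoundSource source = new LateBoundSource();
        source.setLocation("input.csv");
        System.out.println(source.describe());
    }
}
```

Calling describe() before setLocation in this sketch fails with an IllegalArgumentException, which is the situation the requirement above guards against.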

relativeToAbsolutePath

public String relativeToAbsolutePath(String location,
                                     org.apache.hadoop.fs.Path curDir)
                              throws IOException
Description copied from class: LoadFunc
This method is called by the Pig runtime in the front end to convert the input location to an absolute path if the location is relative. The LoadFunc implementation is free to choose how it converts a relative location to an absolute location, since this may depend on what the location string represents (an HDFS path or some other data source).

Overrides:
relativeToAbsolutePath in class LoadFunc
Parameters:
location - location as provided in the "load" statement of the script
curDir - the current working directory based on any "cd" statements in the script before the "load" statement. If there are no "cd" statements in the script, this would be the home directory -
/user/ 
Returns:
the absolute location based on the arguments passed
Throws:
IOException - if the conversion is not possible
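
One possible conversion policy is sketched below. It is not the implementation used by LoadFunc: PathResolver and toAbsolute are hypothetical names, plain strings stand in for org.apache.hadoop.fs.Path, and the rule that a location is absolute when it names a scheme or starts with "/" is an assumption made for the example. The directory /user/alice is likewise invented.

```java
class PathResolver {
    // Assumption: a location counts as absolute if it contains a scheme
    // separator ("://") or begins with "/". Anything else is resolved
    // against the current directory.
    static String toAbsolute(String location, String curDir) {
        if (location.contains("://") || location.startsWith("/")) {
            return location;
        }
        String base = curDir.endsWith("/") ? curDir : curDir + "/";
        return base + location;
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute("data/in.txt", "/user/alice"));
        System.out.println(toAbsolute("/tmp/in.txt", "/user/alice"));
    }
}
```

A loader that reads from something other than a file system (for example a table name) could return the location unchanged instead.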

setLocation

public void setLocation(String location,
                        org.apache.hadoop.mapreduce.Job job)
                 throws IOException
Description copied from class: LoadFunc
Communicate to the loader the location of the object(s) being loaded. The location string passed to the LoadFunc here is the return value of LoadFunc.relativeToAbsolutePath(String, Path). Implementations should use this method to communicate the location (and any other information) to their underlying InputFormat through the Job object. This method is called multiple times in both the front end and the back end, so implementations should ensure that repeated calls produce no inconsistent side effects.

Specified by:
setLocation in class LoadFunc
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, Path)
job - the Job object, used to store or retrieve earlier stored information from the UDFContext
Throws:
IOException - if the location is not valid.
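
The requirement that repeated calls be free of inconsistent side effects can be illustrated with a small sketch. A Map stands in for the configuration carried by the Job object; LocationConfig, INPUT_KEY and both methods are hypothetical and exist only to contrast an overwriting update with an accumulating one.

```java
import java.util.HashMap;
import java.util.Map;

class LocationConfig {
    // Hypothetical configuration key used only in this sketch.
    static final String INPUT_KEY = "example.input.paths";

    // Safe to call repeatedly: overwrites, so the state is the same
    // after one call or after many.
    static void setLocation(String location, Map<String, String> conf) {
        conf.put(INPUT_KEY, location);
    }

    // Not safe to call repeatedly: appends, so every extra call adds
    // a duplicate entry.
    static void setLocationAppending(String location, Map<String, String> conf) {
        conf.merge(INPUT_KEY, location, (old, added) -> old + "," + added);
    }

    public static void main(String[] args) {
        Map<String, String> good = new HashMap<>();
        setLocation("/data/in", good);
        setLocation("/data/in", good);

        Map<String, String> bad = new HashMap<>();
        setLocationAppending("/data/in", bad);
        setLocationAppending("/data/in", bad);

        System.out.println(good.get(INPUT_KEY));
        System.out.println(bad.get(INPUT_KEY));
    }
}
```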

getInputFormat

public org.apache.hadoop.mapreduce.InputFormat getInputFormat()
                                                       throws IOException
Description copied from class: LoadFunc
This will be called during planning on the front end. It returns an instance of InputFormat (rather than the class name) because the load function may need to instantiate the InputFormat in order to control how it is constructed.

Specified by:
getInputFormat in class LoadFunc
Returns:
the InputFormat associated with this loader.
Throws:
IOException - if there is an exception during InputFormat construction

getLoadCaster

public LoadCaster getLoadCaster()
                         throws IOException
Description copied from class: LoadFunc
This will be called on the front end during planning and not on the back end during execution.

Overrides:
getLoadCaster in class LoadFunc
Returns:
the LoadCaster associated with this loader. Returning null indicates that casts from byte array are not supported for this loader.
Throws:
IOException - if there is an exception during LoadCaster construction

prepareToRead

public void prepareToRead(org.apache.hadoop.mapreduce.RecordReader reader,
                          PigSplit split)
                   throws IOException
Description copied from class: LoadFunc
Initializes LoadFunc for reading data. This will be called during execution before any calls to getNext. The RecordReader needs to be passed here because it has been instantiated for a particular InputSplit.

Specified by:
prepareToRead in class LoadFunc
Parameters:
reader - RecordReader to be used by this instance of the LoadFunc
split - The input PigSplit to process
Throws:
IOException - if there is an exception during initialization

getNext

public Tuple getNext()
              throws IOException
Description copied from class: LoadFunc
Retrieves the next tuple to be processed. Implementations should NOT reuse tuple objects (or inner member objects) they return across calls and should return a different tuple object in each call.

Specified by:
getNext in class LoadFunc
Returns:
the next tuple to be processed or null if there are no more tuples to be processed.
Throws:
IOException - if there is an exception while retrieving the next tuple
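
The rule against reusing tuple objects can be sketched as follows. This is a self-contained illustration in which a List<Object> stands in for Tuple and FreshTupleReader is a hypothetical class; it shows only the allocation pattern, not how a real loader reads from its RecordReader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class FreshTupleReader {
    private final Iterator<String> lines;

    FreshTupleReader(List<String> lines) {
        this.lines = lines.iterator();
    }

    // Allocates a new list on every call, so a tuple handed out earlier
    // is never modified by a later call. Returns null when exhausted.
    List<Object> getNext() {
        if (!lines.hasNext()) {
            return null;
        }
        List<Object> tuple = new ArrayList<>();
        for (String field : lines.next().split(",")) {
            tuple.add(field);
        }
        return tuple;
    }

    public static void main(String[] args) {
        FreshTupleReader reader = new FreshTupleReader(Arrays.asList("a,1", "b,2"));
        List<Object> first = reader.getNext();
        List<Object> second = reader.getNext();
        // The first tuple still holds its own values after the second read.
        System.out.println(first + " " + second);
    }
}
```

Had the reader cleared and refilled a single shared list, a caller holding the first tuple would see it change when the second one was read.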

setUDFContextSignature

public void setUDFContextSignature(String signature)
Description copied from class: LoadFunc
This method will be called by Pig both in the front end and back end to pass a unique signature to the LoadFunc. The signature can be used to store into the UDFContext any information which the LoadFunc needs to store between various method invocations in the front end and back end. A use case is to store LoadPushDown.RequiredFieldList passed to it in LoadPushDown.pushProjection(RequiredFieldList) for use in the back end before returning tuples in LoadFunc.getNext(). This method will be called before other methods in LoadFunc.

Overrides:
setUDFContextSignature in class LoadFunc
Parameters:
signature - a unique signature to identify this LoadFunc
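
The role of the signature as a lookup key can be sketched with stand-in types. SignatureStore and ProjectingLoader are hypothetical, as is the property name "required.fields"; an in-process map takes the place of the UDFContext, whereas in Pig the front end and back end do not share memory, so this shows only the keying idea and not how state is actually transported.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for a context that keeps properties per signature.
class SignatureStore {
    private static final Map<String, Properties> STORE = new HashMap<>();

    static Properties propertiesFor(String signature) {
        return STORE.computeIfAbsent(signature, k -> new Properties());
    }
}

// Hypothetical loader that remembers which fields were requested.
class ProjectingLoader {
    private String signature;

    void setSignature(String signature) {
        this.signature = signature;
    }

    // Front-end side: record the projection under this loader's signature.
    void rememberRequiredFields(String fields) {
        SignatureStore.propertiesFor(signature).setProperty("required.fields", fields);
    }

    // Back-end side: look the projection up again by signature.
    String requiredFields() {
        return SignatureStore.propertiesFor(signature).getProperty("required.fields");
    }
}

class SignatureDemo {
    public static void main(String[] args) {
        ProjectingLoader frontEnd = new ProjectingLoader();
        frontEnd.setSignature("load-1");
        frontEnd.rememberRequiredFields("0,2");

        // A different instance with the same signature sees the stored value.
        ProjectingLoader backEnd = new ProjectingLoader();
        backEnd.setSignature("load-1");
        System.out.println(backEnd.requiredFields());
    }
}
```

Because the key is the signature rather than the instance, two load statements that use the same loader class keep their stored state apart.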

loadFunc

protected LoadFunc loadFunc()

getMethodName

protected String getMethodName(int depth)
Returns a method in the call stack at the given depth. Depth 0 returns the method that called getMethodName, depth 1 returns the caller of that method, and so on.

Parameters:
depth - the number of frames above the caller of getMethodName; 0 selects the caller itself
Returns:
method name as String
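
The depth convention can be demonstrated with a standalone sketch. CallStackNames is a hypothetical class and this is not necessarily how LoadFuncWrapper implements the method; it assumes that the first element of a new Throwable's stack trace is the method that created it, which is the typical behavior documented for Throwable.getStackTrace().

```java
class CallStackNames {
    // Index 0 of the trace is this method (assumed, see lead-in),
    // so the direct caller sits at index 1 and depth counts up from there.
    static String getMethodName(int depth) {
        StackTraceElement[] trace = new Throwable().getStackTrace();
        return trace[1 + depth].getMethodName();
    }

    // Depth 0: the method that called getMethodName.
    static String inner() {
        return getMethodName(0);
    }

    // Depth 1: the caller of the method that called getMethodName.
    static String helper() {
        return getMethodName(1);
    }

    static String viaHelper() {
        return helper();
    }

    public static void main(String[] args) {
        System.out.println(inner());
        System.out.println(viaHelper());
    }
}
```

A wrapper can use such a lookup to name the offending method in an error message without passing the name explicitly.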


Copyright © 2007-2012 The Apache Software Foundation