org.apache.pig.builtin
Class ParquetLoader

java.lang.Object
  extended by org.apache.pig.LoadFunc
      extended by org.apache.pig.LoadFuncWrapper
          extended by org.apache.pig.LoadFuncMetadataWrapper
              extended by org.apache.pig.builtin.ParquetLoader
All Implemented Interfaces:
LoadMetadata, LoadPushDown

public class ParquetLoader
extends LoadFuncMetadataWrapper
implements LoadPushDown

Wrapper class that delegates calls to parquet.pig.ParquetLoader.
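
As a rough illustration of the delegation idea described above, the following minimal sketch shows a wrapper forwarding a call to an underlying implementation. The Loader interface and both classes are hypothetical stand-ins invented for this example; they are not the Pig or Parquet types, and the real wrapper may resolve and invoke its delegate differently.

```java
public class DelegationSketch {
    // Hypothetical minimal loader contract; the real LoadFunc API is much larger.
    interface Loader {
        String describe(String location);
    }

    // Stand-in for the wrapped implementation.
    static class RealLoader implements Loader {
        public String describe(String location) {
            return "reading " + location;
        }
    }

    // The wrapper holds a delegate and forwards each call to it unchanged.
    static class WrapperLoader implements Loader {
        private final Loader delegate;

        WrapperLoader(Loader delegate) {
            this.delegate = delegate;
        }

        public String describe(String location) {
            return delegate.describe(location);
        }
    }

    public static void main(String[] args) {
        Loader loader = new WrapperLoader(new RealLoader());
        System.out.println(loader.describe("/data/events")); // prints "reading /data/events"
    }
}
```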


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.pig.LoadPushDown
LoadPushDown.OperatorSet, LoadPushDown.RequiredField, LoadPushDown.RequiredFieldList, LoadPushDown.RequiredFieldResponse
 
Constructor Summary
ParquetLoader()
           
ParquetLoader(String requestedSchemaStr)
           
 
Method Summary
 List<LoadPushDown.OperatorSet> getFeatures()
          Determine the operators that can be pushed to the loader.
 LoadPushDown.RequiredFieldResponse pushProjection(LoadPushDown.RequiredFieldList requiredFieldList)
          Indicate to the loader fields that will be needed.
 void setLocation(String location, org.apache.hadoop.mapreduce.Job job)
          Communicate to the loader the location of the object(s) being loaded.
 
Methods inherited from class org.apache.pig.LoadFuncMetadataWrapper
getPartitionKeys, getSchema, getStatistics, setLoadFunc, setPartitionFilter
 
Methods inherited from class org.apache.pig.LoadFuncWrapper
getInputFormat, getLoadCaster, getMethodName, getNext, loadFunc, prepareToRead, relativeToAbsolutePath, setLoadFunc, setUDFContextSignature
 
Methods inherited from class org.apache.pig.LoadFunc
getAbsolutePath, getPathStrings, join, warn
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ParquetLoader

public ParquetLoader()
              throws FrontendException
Throws:
FrontendException

ParquetLoader

public ParquetLoader(String requestedSchemaStr)
              throws FrontendException
Throws:
FrontendException
Method Detail

setLocation

public void setLocation(String location,
                        org.apache.hadoop.mapreduce.Job job)
                 throws IOException
Description copied from class: LoadFunc
Communicate to the loader the location of the object(s) being loaded. The location string passed to the LoadFunc here is the return value of LoadFunc.relativeToAbsolutePath(String, Path). Implementations should use this method to communicate the location (and any other information) to their underlying InputFormat through the Job object. This method is called multiple times in both the frontend and the backend, so implementations should ensure that repeated calls cause no inconsistent side effects.

Overrides:
setLocation in class LoadFuncWrapper
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, Path)
job - the Job object, used to store or retrieve earlier stored information from the UDFContext
Throws:
IOException - if the location is not valid.
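
One way to satisfy the "called multiple times" requirement is to make the method overwrite state rather than accumulate it. The sketch below is only an illustration under simplifying assumptions: a plain Map stands in for the Job configuration, and the key name is hypothetical, not a real Hadoop or Pig property.

```java
import java.util.HashMap;
import java.util.Map;

public class IdempotentLocationSketch {
    // Hypothetical configuration key; not a real Hadoop or Pig property name.
    static final String INPUT_KEY = "sketch.input.location";

    // Overwrites the value instead of appending to it, so calling this
    // method again with the same location leaves the configuration unchanged.
    static void setLocation(String location, Map<String, String> conf) {
        conf.put(INPUT_KEY, location);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setLocation("/data/events", conf);
        setLocation("/data/events", conf); // repeated call, same resulting state
        System.out.println(conf.get(INPUT_KEY)); // prints /data/events
    }
}
```

An implementation that appended the location to a list on every call would, by contrast, end up with duplicate input paths after the second invocation.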

getFeatures

public List<LoadPushDown.OperatorSet> getFeatures()
Description copied from interface: LoadPushDown
Determine the operators that can be pushed to the loader. Note that by indicating a loader can accept a certain operator (such as selection) the loader is not promising that it can handle all selections. When it is passed the actual operators to push down it will still have a chance to reject them.

Specified by:
getFeatures in interface LoadPushDown
Returns:
list of all features that the loader can support

pushProjection

public LoadPushDown.RequiredFieldResponse pushProjection(LoadPushDown.RequiredFieldList requiredFieldList)
                                                  throws FrontendException
Description copied from interface: LoadPushDown
Indicate to the loader the fields that will be needed. This can be useful for loaders that access data stored in a columnar format, where indicating the columns to be accessed ahead of time will save scans. The Pig runtime will not invoke this method if all fields are required, so implementations should assume that all fields from the input are required whenever this method is not invoked. If the loader function cannot make use of this information, it is free to ignore it by returning an appropriate response.

Specified by:
pushProjection in interface LoadPushDown
Parameters:
requiredFieldList - RequiredFieldList indicating which columns will be needed. This structure is read-only; implementations must not modify it inside pushProjection.
Returns:
Indicates which fields will be returned
Throws:
FrontendException
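
The projection contract described above can be sketched with simplified stand-in types. In this hypothetical example, a list of column indexes replaces RequiredFieldList and a boolean replaces RequiredFieldResponse; the class and method names are assumptions made for illustration and do not reflect how ParquetLoader is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProjectionSketch {
    private final List<String> allColumns;
    // null means pushProjection was never invoked, so every column is required.
    private List<Integer> requiredIndexes = null;

    public ProjectionSketch(List<String> allColumns) {
        this.allColumns = new ArrayList<>(allColumns);
    }

    // Stand-in for pushProjection: copies the request rather than modifying it,
    // and returns false to signal that the request was rejected.
    public boolean pushProjection(List<Integer> indexes) {
        for (int i : indexes) {
            if (i < 0 || i >= allColumns.size()) {
                return false; // reject: leave the current state unchanged
            }
        }
        requiredIndexes = new ArrayList<>(indexes);
        return true;
    }

    // Columns the loader would actually read from the columnar file.
    public List<String> columnsToRead() {
        if (requiredIndexes == null) {
            return new ArrayList<>(allColumns);
        }
        List<String> out = new ArrayList<>();
        for (int i : requiredIndexes) {
            out.add(allColumns.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        ProjectionSketch loader = new ProjectionSketch(Arrays.asList("id", "name", "ts"));
        System.out.println(loader.columnsToRead()); // prints [id, name, ts]
        loader.pushProjection(Arrays.asList(0, 2));
        System.out.println(loader.columnsToRead()); // prints [id, ts]
    }
}
```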


Copyright © 2007-2012 The Apache Software Foundation