Class ParquetLoader

  extended by org.apache.pig.LoadFunc
      extended by org.apache.pig.LoadFuncWrapper
          extended by org.apache.pig.LoadFuncMetadataWrapper
              extended by org.apache.pig.builtin.ParquetLoader
All Implemented Interfaces:
LoadMetadata, LoadPushDown

public class ParquetLoader
extends LoadFuncMetadataWrapper
implements LoadPushDown

Wrapper class that delegates calls to parquet.pig.ParquetLoader.
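The delegation described above can be illustrated with a minimal, self-contained sketch. It does not use the real Pig or Parquet classes: SimpleLoader, RecordingLoader, and DelegatingLoader are hypothetical stand-ins invented for illustration, and the actual wrapper forwards a larger set of methods.

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationSketch {
    // Hypothetical stand-in for a loader contract; not org.apache.pig.LoadFunc.
    interface SimpleLoader {
        void setLocation(String location);
        String getNext();
    }

    // Hypothetical delegate that records every call it receives.
    static final class RecordingLoader implements SimpleLoader {
        final List<String> calls = new ArrayList<>();

        @Override
        public void setLocation(String location) {
            calls.add("setLocation:" + location);
        }

        @Override
        public String getNext() {
            calls.add("getNext");
            return null; // this toy loader has no records
        }
    }

    // Wrapper that forwards each call unchanged to the wrapped loader.
    static final class DelegatingLoader implements SimpleLoader {
        private final SimpleLoader delegate;

        DelegatingLoader(SimpleLoader delegate) {
            this.delegate = delegate;
        }

        @Override
        public void setLocation(String location) {
            delegate.setLocation(location);
        }

        @Override
        public String getNext() {
            return delegate.getNext();
        }
    }
}
```

Callers interact only with the wrapper; the wrapped loader does the actual work.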

Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.pig.LoadPushDown
LoadPushDown.OperatorSet, LoadPushDown.RequiredField, LoadPushDown.RequiredFieldList, LoadPushDown.RequiredFieldResponse
Constructor Summary
ParquetLoader()
ParquetLoader(String requestedSchemaStr)
Method Summary
 List<LoadPushDown.OperatorSet> getFeatures()
          Determine the operators that can be pushed to the loader.
 LoadPushDown.RequiredFieldResponse pushProjection(LoadPushDown.RequiredFieldList requiredFieldList)
          Indicate to the loader fields that will be needed.
 void setLocation(String location, org.apache.hadoop.mapreduce.Job job)
          Communicate to the loader the location of the object(s) being loaded.
Methods inherited from class org.apache.pig.LoadFuncMetadataWrapper
getPartitionKeys, getSchema, getStatistics, setLoadFunc, setPartitionFilter
Methods inherited from class org.apache.pig.LoadFuncWrapper
getInputFormat, getLoadCaster, getMethodName, getNext, loadFunc, prepareToRead, relativeToAbsolutePath, setLoadFunc, setUDFContextSignature
Methods inherited from class org.apache.pig.LoadFunc
getAbsolutePath, getPathStrings, join, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public ParquetLoader()
              throws FrontendException


public ParquetLoader(String requestedSchemaStr)
              throws FrontendException
Method Detail


public void setLocation(String location,
                        org.apache.hadoop.mapreduce.Job job)
                 throws IOException
Description copied from class: LoadFunc
Communicate to the loader the location of the object(s) being loaded. The location string passed to the LoadFunc here is the return value of LoadFunc.relativeToAbsolutePath(String, Path). Implementations should use this method to communicate the location (and any other information) to their underlying InputFormat through the Job object. This method is called multiple times in both the frontend and the backend, so implementations should ensure that repeated calls produce no inconsistent side effects.

Overrides:
setLocation in class LoadFuncWrapper
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, Path)
job - the Job object, used to store or retrieve earlier stored information from the UDFContext
Throws:
IOException - if the location is not valid.
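Because setLocation may be invoked repeatedly, one way to keep it free of inconsistent side effects is to overwrite state rather than accumulate it. The sketch below assumes a hypothetical FakeJobConf in place of the Hadoop Job configuration and a made-up property key; it is not how ParquetLoader itself is implemented.

```java
import java.util.HashMap;
import java.util.Map;

final class SetLocationSketch {
    // Hypothetical stand-in for the configuration carried by a Hadoop Job.
    static final class FakeJobConf {
        private final Map<String, String> props = new HashMap<>();

        void set(String key, String value) {
            props.put(key, value);
        }

        String get(String key) {
            return props.get(key);
        }

        int size() {
            return props.size();
        }
    }

    // Hypothetical loader whose setLocation overwrites instead of appending,
    // so calling it any number of times leaves the same final state.
    static final class IdempotentLoader {
        static final String INPUT_DIR_KEY = "sketch.input.dir"; // assumed key name

        private String location;

        void setLocation(String location, FakeJobConf conf) {
            this.location = location;
            conf.set(INPUT_DIR_KEY, location);
        }

        String getLocation() {
            return location;
        }
    }
}
```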


public List<LoadPushDown.OperatorSet> getFeatures()
Description copied from interface: LoadPushDown
Determine the operators that can be pushed to the loader. Note that by indicating that a loader can accept a certain operator (such as selection), the loader is not promising that it can handle all selections. When it is passed the actual operators to push down, it will still have a chance to reject them.

Specified by:
getFeatures in interface LoadPushDown
Returns:
list of all features that the loader can support
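The distinction between advertising a feature and honoring a specific request can be sketched as follows. The Operator enum and PickyLoader class are hypothetical stand-ins for LoadPushDown.OperatorSet and a real loader, and the boolean return value is a simplification of the response object used by the actual interface.

```java
import java.util.Collections;
import java.util.List;

final class FeaturesSketch {
    // Hypothetical stand-in for LoadPushDown.OperatorSet.
    enum Operator { PROJECTION }

    // Hypothetical loader that advertises projection but may still reject a request.
    static final class PickyLoader {
        private final List<String> knownColumns;

        PickyLoader(List<String> knownColumns) {
            this.knownColumns = knownColumns;
        }

        // Advertises that projection push-down is supported in general.
        List<Operator> getFeatures() {
            return Collections.singletonList(Operator.PROJECTION);
        }

        // Returns true if the request is honored; rejects it when any
        // requested column is unknown to this loader.
        boolean pushProjection(List<String> requestedColumns) {
            return knownColumns.containsAll(requestedColumns);
        }
    }
}
```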


public LoadPushDown.RequiredFieldResponse pushProjection(LoadPushDown.RequiredFieldList requiredFieldList)
                                                  throws FrontendException
Description copied from interface: LoadPushDown
Indicate to the loader the fields that will be needed. This can be useful for loaders that access data stored in a columnar format, where indicating the columns to be accessed ahead of time will save scans. The Pig runtime does not invoke this method if all fields are required, so implementations should assume that if this method is not invoked, all fields from the input are required. If the loader function cannot make use of this information, it is free to ignore it by returning an appropriate Response.

Specified by:
pushProjection in interface LoadPushDown
Parameters:
requiredFieldList - RequiredFieldList indicating which columns will be needed. This structure is read-only; implementations must not modify it inside pushProjection.
Returns:
Indicates which fields will be returned
Throws:
FrontendException
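The contract above (read every field unless pushProjection was invoked, and never modify the list that is passed in) can be sketched with a small stand-alone class. ColumnarLoader and columnsToRead are hypothetical names, and plain column-name strings stand in for RequiredFieldList; the real ParquetLoader delegates this work to parquet.pig.ParquetLoader.

```java
import java.util.ArrayList;
import java.util.List;

final class ProjectionSketch {
    // Hypothetical columnar loader; not the real ParquetLoader.
    static final class ColumnarLoader {
        private final List<String> allColumns;
        private List<String> projected; // stays null until pushProjection succeeds

        ColumnarLoader(List<String> allColumns) {
            this.allColumns = new ArrayList<>(allColumns);
        }

        // Records the requested columns. The incoming list is treated as
        // read-only, so it is copied rather than stored or mutated.
        boolean pushProjection(List<String> required) {
            if (!allColumns.containsAll(required)) {
                return false; // request not honored; previous state is kept
            }
            projected = new ArrayList<>(required);
            return true;
        }

        // If pushProjection was never invoked, every column is required.
        List<String> columnsToRead() {
            return new ArrayList<>(projected == null ? allColumns : projected);
        }
    }
}
```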

Copyright © 2007-2012 The Apache Software Foundation