public class ParquetLoader extends LoadFuncMetadataWrapper implements LoadPushDown, LoadPredicatePushdown
Nested classes/interfaces inherited from interface LoadPushDown: LoadPushDown.OperatorSet, LoadPushDown.RequiredField, LoadPushDown.RequiredFieldList, LoadPushDown.RequiredFieldResponse
Constructor and Description |
---|
ParquetLoader() |
ParquetLoader(java.lang.String requestedSchemaStr) |
Modifier and Type | Method and Description |
---|---|
java.util.List<LoadPushDown.OperatorSet> | getFeatures() - Determine the operators that can be pushed to the loader. |
java.util.List<java.lang.String> | getPredicateFields(java.lang.String location, org.apache.hadoop.mapreduce.Job job) - Find what fields of the data can support predicate pushdown. |
java.util.List<Expression.OpType> | getSupportedExpressionTypes() - Indicate operations on fields supported by the loader for predicate pushdown. |
LoadPushDown.RequiredFieldResponse | pushProjection(LoadPushDown.RequiredFieldList requiredFieldList) - Indicate to the loader fields that will be needed. |
void | setLocation(java.lang.String location, org.apache.hadoop.mapreduce.Job job) - Communicate to the loader the location of the object(s) being loaded. |
void | setPushdownPredicate(Expression predicate) - Push down expression to the loader. |
Methods inherited from class LoadFuncMetadataWrapper: getPartitionKeys, getSchema, getStatistics, setLoadFunc, setPartitionFilter
Methods inherited from class LoadFuncWrapper: getInputFormat, getLoadCaster, getMethodName, getNext, loadFunc, prepareToRead, relativeToAbsolutePath, setLoadFunc, setUDFContextSignature
Methods inherited from class LoadFunc: addCredentials, getAbsolutePath, getCacheFiles, getGlobPaths, getPathStrings, getShipFiles, join, warn
public ParquetLoader() throws FrontendException
Throws: FrontendException
public ParquetLoader(java.lang.String requestedSchemaStr) throws FrontendException
Throws: FrontendException
public void setLocation(java.lang.String location, org.apache.hadoop.mapreduce.Job job) throws java.io.IOException
Description copied from class: LoadFunc
Communicate to the loader the location of the object(s) being loaded. The location passed in is the value returned by LoadFunc.relativeToAbsolutePath(String, Path). Implementations should use this method to communicate the location (and any other information) to their underlying InputFormat through the Job object.
This method is called multiple times in both the frontend and the backend, so implementations should ensure that repeated calls cause no inconsistent side effects.
Overrides: setLocation in class LoadFuncWrapper
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, Path)
job - the Job object; it can be used to store or retrieve earlier stored information from the UDFContext
Throws:
java.io.IOException - if the location is not valid.
public java.util.List<LoadPushDown.OperatorSet> getFeatures()
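Description copied from interface: LoadPushDown
Determine the operators that can be pushed to the loader.
Specified by: getFeatures in interface LoadPushDown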
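The setLocation description says the method may be called several times on the frontend and the backend, and that repeated calls must not leave inconsistent side effects. The sketch below illustrates one way to meet that contract: overwrite a single configuration key rather than appending to it. Everything in it is a hypothetical stand-in (the class IdempotentLocationSketch, the key sketch.input.path, and the map used in place of the Job configuration); it is not the Pig or Parquet implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a loader; it does not use the Pig or Parquet API.
final class IdempotentLocationSketch {
    // Stand-in for the configuration a real loader would reach through the Job object.
    private final Map<String, String> conf = new HashMap<>();
    // Counts the calls that actually changed the stored location.
    private int applied = 0;

    // Safe to call repeatedly: the same location always yields the same configuration.
    void setLocation(String location) {
        // Overwrite the key instead of appending, so a repeated call adds nothing.
        String previous = conf.put("sketch.input.path", location);
        if (!location.equals(previous)) {
            applied++;
        }
    }

    String inputPath() {
        return conf.get("sketch.input.path");
    }

    int appliedCount() {
        return applied;
    }

    public static void main(String[] args) {
        IdempotentLocationSketch loader = new IdempotentLocationSketch();
        loader.setLocation("/data/events");
        loader.setLocation("/data/events"); // repeated call with the same location
        System.out.println(loader.inputPath() + " applied=" + loader.appliedCount());
    }
}
```

The second call leaves both the stored path and the change counter untouched, which is the property the contract asks for.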
public LoadPushDown.RequiredFieldResponse pushProjection(LoadPushDown.RequiredFieldList requiredFieldList) throws FrontendException
Description copied from interface: LoadPushDown
Indicate to the loader fields that will be needed.
Specified by: pushProjection in interface LoadPushDown
Parameters:
requiredFieldList - RequiredFieldList indicating which columns will be needed. This structure is read-only; it must not be modified inside pushProjection.
Throws:
FrontendException
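As a rough illustration of what projection pushdown means, the following sketch keeps only the requested columns of a schema and leaves the request list unmodified, mirroring the read-only rule stated for requiredFieldList. The class ProjectionSketch, the project method, and the use of plain column indexes are assumptions made for this example; the real LoadPushDown.RequiredFieldList carries more information than a list of integers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; not the LoadPushDown API.
final class ProjectionSketch {
    // Returns the names of the required columns, in the order they were requested.
    // The requiredIndexes list is only read, never modified.
    static List<String> project(List<String> schema, List<Integer> requiredIndexes) {
        List<String> kept = new ArrayList<>();
        for (int index : requiredIndexes) {
            if (index < 0 || index >= schema.size()) {
                throw new IllegalArgumentException("no such column index: " + index);
            }
            kept.add(schema.get(index));
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> schema = List.of("id", "name", "price", "ts");
        // Ask for columns 0 and 2 only; the others need not be read at all.
        System.out.println(project(schema, List.of(0, 2))); // prints [id, price]
    }
}
```

For a columnar format, skipping unrequested columns is where the saving comes from, since those columns never have to be read from storage.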
public java.util.List<java.lang.String> getPredicateFields(java.lang.String location, org.apache.hadoop.mapreduce.Job job) throws java.io.IOException
Description copied from interface: LoadPredicatePushdown
Find what fields of the data can support predicate pushdown.
Specified by: getPredicateFields in interface LoadPredicatePushdown
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object. This should be used only to obtain cluster properties through JobContextImpl.getConfiguration(), and not to set or query any runtime job information.
Throws:
java.io.IOException - if an exception occurs while retrieving predicate fields
public java.util.List<Expression.OpType> getSupportedExpressionTypes()
Description copied from interface: LoadPredicatePushdown
Indicate operations on fields supported by the loader for predicate pushdown.
Specified by: getSupportedExpressionTypes in interface LoadPredicatePushdown
public void setPushdownPredicate(Expression predicate) throws java.io.IOException
Description copied from interface: LoadPredicatePushdown
Push down expression to the loader.
Specified by: setPushdownPredicate in interface LoadPredicatePushdown
Parameters:
predicate - expression to be filtered by the loader.
Throws:
java.io.IOException
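The three predicate methods fit together roughly as follows: the loader names the fields it can filter on, names the operations it supports, and then receives an expression to apply. The sketch below models that handshake with hypothetical stand-ins (the class PredicatePushdownSketch, its Op enum in place of Expression.OpType, the field names, and a single comparison against a long constant). It is an assumption-laden illustration and makes no claim about how ParquetLoader evaluates predicates.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in; not the LoadPredicatePushdown API.
final class PredicatePushdownSketch {
    // Stand-in for Expression.OpType, reduced to a few comparisons.
    enum Op { EQ, NE, LT, GT }

    // Plays the role of getPredicateFields: fields that can be filtered in the loader.
    static final List<String> PREDICATE_FIELDS = List.of("id", "price");
    // Plays the role of getSupportedExpressionTypes: operations the loader can evaluate.
    static final Set<Op> SUPPORTED = EnumSet.of(Op.EQ, Op.LT, Op.GT);

    // A caller would push a predicate down only when both the field and the operation qualify.
    static boolean canPushDown(String field, Op op) {
        return PREDICATE_FIELDS.contains(field) && SUPPORTED.contains(op);
    }

    // Plays the role of applying a pushed-down predicate to one column value.
    static boolean matches(long columnValue, Op op, long constant) {
        switch (op) {
            case EQ: return columnValue == constant;
            case LT: return columnValue < constant;
            case GT: return columnValue > constant;
            default: throw new IllegalArgumentException("unsupported operation: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(canPushDown("price", Op.GT)); // prints true
        System.out.println(canPushDown("name", Op.EQ));  // prints false, field not declared
        System.out.println(matches(5, Op.LT, 10));       // prints true
    }
}
```

Anything that fails the canPushDown check would have to be evaluated after loading instead, which is why a loader advertises both lists up front.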
Copyright © 2007-2025 The Apache Software Foundation