PathPartitionHelper (Pig 0.17.0 API)

java.lang.Object
- org.apache.pig.piggybank.storage.partition.PathPartitionHelper

```
public class PathPartitionHelper
extends Object
```
Implements the logic for:
- Listing the partition keys and values used in an HDFS path
- Filtering partitions based on a Pig filter operator expression

Restrictions

Function calls are not supported by this partition helper, and it can only handle String values.
This is normally not a problem, because partition values are part of the HDFS folder path and are given
fixed values that do not need parsing by any external process.
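
The String-only restriction can be illustrated with a minimal sketch. It is not the helper's implementation: the class `PartitionFilterSketch`, its methods, and the `logs/year=.../month=...` paths are hypothetical, and only conjunctions of String equality tests are handled, whereas the real helper evaluates a Pig filter expression.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the Pig API. */
final class PartitionFilterSketch {

    /** Extracts key=value folder names from a path, in path order. */
    static Map<String, String> keyValues(String path) {
        Map<String, String> partitions = new LinkedHashMap<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) {
                partitions.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return partitions;
    }

    /** Keeps the paths whose partition values equal every required value, compared as Strings. */
    static List<String> filter(List<String> paths, Map<String, String> required) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            Map<String, String> actual = keyValues(path);
            boolean matches = true;
            for (Map.Entry<String, String> condition : required.entrySet()) {
                // "2010" is compared as text; no numeric or date parsing happens here.
                if (!condition.getValue().equals(actual.get(condition.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> paths = List.of(
                "logs/year=2010/month=01/part-0",
                "logs/year=2011/month=01/part-0");
        System.out.println(filter(paths, Map.of("year", "2010")));
    }
}
```

Because the values are compared as text, a condition such as `year = 2010` only matches a folder named exactly `year=2010`.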

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`PARITITION_FILTER_EXPRESSION`
`static String`	`PARTITION_COLUMNS`

Constructor Summary

Constructors
Constructor and Description
`PathPartitionHelper()`

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`Set<String>`	`getPartitionKeys(String location, org.apache.hadoop.conf.Configuration conf)` Returns the partition keys for a location. The work is delegated to the PathPartitioner class.
`Map<String,String>`	`getPathPartitionKeyValues(String location)` Returns the partition keys and each key's value for a single location. The location must be something like mytable/partition1=a/partition2=b/myfile, for which this method returns a map with [partition1='a', partition2='b']. The work is delegated to the PathPartitioner class.
`List<org.apache.hadoop.fs.FileStatus>`	`listStatus(org.apache.hadoop.mapreduce.JobContext ctx, Class<? extends LoadFunc> loaderClass, String signature)` This method is called by the FileInputFormat to find the input paths for which splits should be calculated. If applyDateRanges == true, the HiveRCDateSplitter is used to filter the input files; otherwise the default FileInputFormat listStatus method is used.
`void`	`setPartitionFilterExpression(String partitionFilterExpression, Class<? extends LoadFunc> loaderClass, String signature)` Sets the PARITITION_FILTER_EXPRESSION property in the UDFContext identified by the loaderClass.
`void`	`setPartitionKeys(String location, org.apache.hadoop.conf.Configuration conf, Class<? extends LoadFunc> loaderClass, String signature)` Reads the partition keys from the location, i.e. the base directory.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

PARTITION_COLUMNS

public static final String PARTITION_COLUMNS

PARITITION_FILTER_EXPRESSION

public static final String PARITITION_FILTER_EXPRESSION

Constructor Detail
PathPartitionHelper
```
public PathPartitionHelper()
```

Method Detail

getPathPartitionKeyValues
```
public Map<String,String> getPathPartitionKeyValues(String location)
                                             throws IOException
```
Returns the partition keys and each key's value for a single location.
The location must be something like mytable/partition1=a/partition2=b/myfile,
for which this method returns a map with [partition1='a', partition2='b'].
The work is delegated to the PathPartitioner class.

Parameters:

location - a single partitioned path, such as mytable/partition1=a/partition2=b/myfile

Returns:

Map of String, String

Throws:

IOException
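
The mapping from a path to its key/value pairs can be sketched in plain Java. This is an illustrative stand-in, not the PathPartitioner code that the method delegates to; the class `PartitionPathSketch` and the method `parseKeyValues` are hypothetical names, and the sketch assumes that any path segment containing `=` after a non-empty key is a partition folder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not the Pig PathPartitioner implementation. */
final class PartitionPathSketch {

    /** Collects key=value path segments into a map, preserving path order. */
    static Map<String, String> parseKeyValues(String location) {
        Map<String, String> partitions = new LinkedHashMap<>();
        for (String segment : location.split("/")) {
            int eq = segment.indexOf('=');
            // Segments without '=' (table folder, file name) are not partitions.
            if (eq > 0) {
                partitions.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        // prints {partition1=a, partition2=b}
        System.out.println(parseKeyValues("mytable/partition1=a/partition2=b/myfile"));
    }
}
```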

getPartitionKeys
```
public Set<String> getPartitionKeys(String location,
                                    org.apache.hadoop.conf.Configuration conf)
                             throws IOException
```
Returns the partition keys for a location.
The work is delegated to the PathPartitioner class.

Parameters:

location - String must be the base directory for the partitions

conf -

Returns:

Set of String partition keys

Throws:

IOException
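
One way to picture how keys are discovered from a base directory is to descend through nested `key=value` folders. The sketch below is an assumption-laden stand-in: it uses the local file system through `java.nio.file` instead of HDFS, follows only the first partition folder at each level, and the class `PartitionKeysSketch` with its methods is hypothetical rather than the PathPartitioner logic.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Stream;

/** Hypothetical illustration only; uses the local file system, not HDFS. */
final class PartitionKeysSketch {

    /** Walks down the first key=value folder at each level and collects the keys. */
    static Set<String> partitionKeys(Path baseDir) throws IOException {
        Set<String> keys = new LinkedHashSet<>();
        Path current = baseDir;
        while (true) {
            Optional<Path> next;
            try (Stream<Path> children = Files.list(current)) {
                next = children
                        .filter(Files::isDirectory)
                        .filter(p -> p.getFileName().toString().indexOf('=') > 0)
                        .findFirst();
            }
            if (!next.isPresent()) {
                return keys; // no deeper partition level
            }
            String name = next.get().getFileName().toString();
            keys.add(name.substring(0, name.indexOf('=')));
            current = next.get();
        }
    }

    /** Builds a temporary mytable/partition1=a/partition2=b layout and lists its keys. */
    static Set<String> demo() {
        try {
            Path base = Files.createTempDirectory("mytable");
            Files.createDirectories(base.resolve("partition1=a").resolve("partition2=b"));
            return partitionKeys(base);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints [partition1, partition2]
        System.out.println(demo());
    }
}
```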

setPartitionFilterExpression

public void setPartitionFilterExpression(String partitionFilterExpression,
                                         Class<? extends LoadFunc> loaderClass,
                                         String signature)
                                  throws IOException

Sets the PARITITION_FILTER_EXPRESSION property in the UDFContext identified by the loaderClass.

Parameters:

partitionFilterExpression -

loaderClass -

Throws:

IOException
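
The role of the `loaderClass` and `signature` arguments can be sketched with a small in-memory store. This is only a stand-in for the UDFContext, under the assumption that properties are kept separately per loader class and signature; the class `FilterExpressionStoreSketch`, the key `sketch.partition.filter`, and the expression strings are hypothetical and do not reflect the real property name or expression syntax.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical stand-in for per-loader properties; not the Pig UDFContext. */
final class FilterExpressionStoreSketch {

    /** Hypothetical property key; the real constant's value is not shown in this page. */
    static final String FILTER_KEY = "sketch.partition.filter";

    private final Map<String, Properties> propertiesByLoader = new HashMap<>();

    /** Returns the properties bucket for one loader class and signature. */
    private Properties propertiesFor(Class<?> loaderClass, String signature) {
        String bucket = loaderClass.getName() + "#" + signature;
        return propertiesByLoader.computeIfAbsent(bucket, k -> new Properties());
    }

    void setPartitionFilterExpression(String expression, Class<?> loaderClass, String signature) {
        propertiesFor(loaderClass, signature).setProperty(FILTER_KEY, expression);
    }

    String getPartitionFilterExpression(Class<?> loaderClass, String signature) {
        return propertiesFor(loaderClass, signature).getProperty(FILTER_KEY);
    }

    public static void main(String[] args) {
        FilterExpressionStoreSketch store = new FilterExpressionStoreSketch();
        // Object.class stands in for a LoadFunc subclass; the signatures distinguish two loads.
        store.setPartitionFilterExpression("year == '2010'", Object.class, "load-1");
        store.setPartitionFilterExpression("year == '2011'", Object.class, "load-2");
        System.out.println(store.getPartitionFilterExpression(Object.class, "load-1"));
    }
}
```

Keying on both values means two uses of the same loader class keep independent filter expressions in this sketch.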

setPartitionKeys

public void setPartitionKeys(String location,
                             org.apache.hadoop.conf.Configuration conf,
                             Class<? extends LoadFunc> loaderClass,
                             String signature)
                      throws IOException

Reads the partition keys from the location, i.e. the base directory.

Parameters:

location - String must be the base directory for the partitions

conf -

loaderClass -

Throws:

IOException

listStatus

public List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext ctx,
                                                        Class<? extends LoadFunc> loaderClass,
                                                        String signature)
                                                 throws IOException

This method is called by the FileInputFormat to find the input paths for which splits should be calculated.
If applyDateRanges == true, the HiveRCDateSplitter is used to filter the input files;
otherwise the default FileInputFormat listStatus method is used.

Parameters:

ctx - JobContext

loaderClass - this is chosen to be a subclass of LoadFunc to maintain some consistency.

Throws:

IOException
