PathPartitionHelper (Pig 0.16.0 API)

java.lang.Object
- org.apache.pig.piggybank.storage.partition.PathPartitionHelper

```
public class PathPartitionHelper
extends Object
```
Implements the logic for:
- Listing the partition keys and values used in an HDFS path
- Filtering partitions based on a Pig filter operator expression

Restrictions
This partition helper does not support function calls and can only handle String values.
This is normally not a problem, because partition values are part of the HDFS folder path and are
fixed values that do not need parsing by any external process.

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`PARITITION_FILTER_EXPRESSION`
`static String`	`PARTITION_COLUMNS`

Constructor Summary

Constructors
Constructor and Description
`PathPartitionHelper()`

Method Summary

Methods
Modifier and Type	Method and Description
`Set<String>`	`getPartitionKeys(String location, org.apache.hadoop.conf.Configuration conf)` Returns the partition keys for a location. The work is delegated to the PathPartitioner class.
`Map<String,String>`	`getPathPartitionKeyValues(String location)` Returns the partition keys and each key's value for a single location. The location must be something like mytable/partition1=a/partition2=b/myfile, for which this method returns a map with [partition1='a', partition2='b']. The work is delegated to the PathPartitioner class.
`List<org.apache.hadoop.fs.FileStatus>`	`listStatus(org.apache.hadoop.mapreduce.JobContext ctx, Class<? extends LoadFunc> loaderClass, String signature)` This method is called by the FileInputFormat to find the input paths for which splits should be calculated. If applyDateRanges == true, the HiveRCDateSplitter is used to filter the input files; otherwise the default FileInputFormat listStatus method is used.
`void`	`setPartitionFilterExpression(String partitionFilterExpression, Class<? extends LoadFunc> loaderClass, String signature)` Sets the PARITITION_FILTER_EXPRESSION property in the UDFContext identified by the loaderClass.
`void`	`setPartitionKeys(String location, org.apache.hadoop.conf.Configuration conf, Class<? extends LoadFunc> loaderClass, String signature)` Reads the partition keys from the location, i.e. the base directory.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

PARTITION_COLUMNS

public static final String PARTITION_COLUMNS

PARITITION_FILTER_EXPRESSION

public static final String PARITITION_FILTER_EXPRESSION

Constructor Detail
- PathPartitionHelper
```
public PathPartitionHelper()
```

Method Detail
- getPathPartitionKeyValues
```
public Map<String,String> getPathPartitionKeyValues(String location)
                                             throws IOException
```
  Returns the partition keys and each key's value for a single location.
  The location must be something like mytable/partition1=a/partition2=b/myfile,
  for which this method returns a map with [partition1='a', partition2='b'].
  The work is delegated to the PathPartitioner class.
  
  Parameters:
  location - String; a single partitioned path such as mytable/partition1=a/partition2=b/myfile
  
  Returns:
  Map of String, String
  
  Throws:
  IOException
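  
  The key/value extraction described above can be illustrated without Pig or Hadoop on the classpath.
  The following is a minimal sketch, assuming that every path segment of the form key=value denotes a
  partition. The class PartitionPathSketch and its method parseKeyValues are hypothetical names used only
  for illustration; they are not the PathPartitioner implementation, which may treat edge cases differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; not the Pig PathPartitioner. */
final class PartitionPathSketch {

    /** Extracts key=value path segments, preserving their order in the path. */
    static Map<String, String> parseKeyValues(String location) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String segment : location.split("/")) {
            int eq = segment.indexOf('=');
            // Assumption: only segments with a non-empty key before '=' are partitions.
            if (eq > 0) {
                result.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {partition1=a, partition2=b}
        System.out.println(parseKeyValues("mytable/partition1=a/partition2=b/myfile"));
    }
}
```

  Values stay plain Strings in this sketch, which matches the restriction that the helper only handles String values.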
- getPartitionKeys
```
public Set<String> getPartitionKeys(String location,
                           org.apache.hadoop.conf.Configuration conf)
                             throws IOException
```
  Returns the partition keys for a location.
  The work is delegated to the PathPartitioner class.
  
  Parameters:
  location - String; must be the base directory for the partitions
  conf - the Hadoop Configuration
  
  Returns:
  Set of String containing the partition keys
  
  Throws:
  IOException
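  
  To show what "partition keys for a base directory" means, here is a minimal sketch that collects the
  keys from paths found beneath a base directory. The real method takes a Hadoop Configuration and reads
  the file system; this sketch substitutes a hypothetical in-memory listing of paths so that it runs
  standalone. The class PartitionKeysSketch and its method partitionKeys are illustrative names, not part
  of the Pig API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for illustration; not the Pig PathPartitioner. */
final class PartitionKeysSketch {

    /** Collects the distinct keys of key=value segments found below baseDir. */
    static Set<String> partitionKeys(String baseDir, List<String> listedPaths) {
        Set<String> keys = new LinkedHashSet<>();
        String prefix = baseDir.endsWith("/") ? baseDir : baseDir + "/";
        for (String path : listedPaths) {
            if (!path.startsWith(prefix)) {
                continue; // not under the base directory
            }
            for (String segment : path.substring(prefix.length()).split("/")) {
                int eq = segment.indexOf('=');
                if (eq > 0) {
                    keys.add(segment.substring(0, eq)); // keep the key, drop the value
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Assumption: this list plays the role of a file system listing.
        List<String> listing = Arrays.asList(
                "mytable/partition1=a/partition2=b/myfile",
                "mytable/partition1=c/partition2=d/myfile");
        // prints [partition1, partition2]
        System.out.println(partitionKeys("mytable", listing));
    }
}
```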
- setPartitionFilterExpression
```
public void setPartitionFilterExpression(String partitionFilterExpression,
                                Class<? extends LoadFunc> loaderClass,
                                String signature)
                                  throws IOException
```
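  Sets the PARITITION_FILTER_EXPRESSION property in the UDFContext identified by the loaderClass.
  
  Parameters:
  partitionFilterExpression - String; the partition filter expression to store
  loaderClass - a subclass of LoadFunc that identifies the UDFContext
  
  Throws:
  IOException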
- setPartitionKeys
```
public void setPartitionKeys(String location,
                    org.apache.hadoop.conf.Configuration conf,
                    Class<? extends LoadFunc> loaderClass,
                    String signature)
                      throws IOException
```
  Reads the partition keys from the location, i.e. the base directory.
  
  Parameters:
  location - String; must be the base directory for the partitions
  conf - the Hadoop Configuration
  loaderClass - a subclass of LoadFunc
  
  Throws:
  IOException
- listStatus
```
public List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext ctx,
                                               Class<? extends LoadFunc> loaderClass,
                                               String signature)
                                                 throws IOException
```
  This method is called by the FileInputFormat to find the input paths for which splits should be calculated.
  If applyDateRanges == true, the HiveRCDateSplitter is used to filter the input files;
  otherwise the default FileInputFormat listStatus method is used.
  
  Parameters:
  ctx - JobContext
  loaderClass - chosen to be a subclass of LoadFunc to maintain some consistency
  
  Throws:
  IOException
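  
  The idea of pruning input paths by their partition values can be sketched as follows. This is not how
  listStatus evaluates a Pig filter expression: as a deliberately simpler substitute, the sketch takes a
  map of required key/value pairs (String equality only) and keeps the paths whose partition values match
  all of them. The class PartitionFilterSketch and its methods are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration; not the Pig filtering logic. */
final class PartitionFilterSketch {

    /** Extracts key=value path segments as partition key/value pairs. */
    static Map<String, String> parseKeyValues(String location) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String segment : location.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) {
                result.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return result;
    }

    /** Keeps only the paths whose partition values equal every required pair. */
    static List<String> keepMatching(List<String> paths, Map<String, String> required) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            Map<String, String> actual = parseKeyValues(path);
            // containsAll on entry sets compares both key and value of each pair.
            if (actual.entrySet().containsAll(required.entrySet())) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "mytable/partition1=a/partition2=b/myfile",
                "mytable/partition1=c/partition2=b/myfile");
        Map<String, String> required = Collections.singletonMap("partition1", "a");
        // prints [mytable/partition1=a/partition2=b/myfile]
        System.out.println(keepMatching(paths, required));
    }
}
```

  Because the comparison is plain String equality, the sketch stays within the stated restriction that function calls and non-String values are not supported.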
