PathPartitionHelper (Pig 0.14.0 API)

java.lang.Object
- org.apache.pig.piggybank.storage.partition.PathPartitionHelper

```
public class PathPartitionHelper
extends java.lang.Object
```
Implements the logic for:
- Listing the partition keys and values used in an HDFS path
- Filtering partitions based on a Pig filter operator expression

Restrictions
Function calls are not supported by this partition helper, and it can only handle String values.
This is normally not a problem, because partition values are part of the HDFS folder path and are given
fixed values that do not need parsing by any external process.
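
To make the String-only restriction concrete, here is a minimal sketch of an equality filter over partition folders. It is not the helper's implementation: the class `PartitionFilterSketch` and its `filter` method are hypothetical names introduced for illustration, and the sketch assumes plain text comparison of `key=value` path segments.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of Pig or piggybank.
final class PartitionFilterSketch {

    // Keeps the paths that contain the exact folder segment "key=value".
    // Values are compared as text, so "07" and "7" are different values.
    static List<String> filter(List<String> paths, String key, String value) {
        String wanted = key + "=" + value;
        return paths.stream()
                .filter(p -> Arrays.asList(p.split("/")).contains(wanted))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "mytable/month=07/part-0",
                "mytable/month=7/part-0",
                "mytable/month=08/part-0");
        System.out.println(filter(paths, "month", "07"));
        // prints [mytable/month=07/part-0]
    }
}
```

Because the comparison is textual, a filter written against `07` does not match a folder named `month=7`.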

Field Summary

Fields
Modifier and Type	Field and Description
`static java.lang.String`	`PARITITION_FILTER_EXPRESSION`
`static java.lang.String`	`PARTITION_COLUMNS`

Constructor Summary

Constructors
Constructor and Description
`PathPartitionHelper()`

Method Summary

Methods
Modifier and Type	Method and Description
`java.util.Set<java.lang.String>`	`getPartitionKeys(java.lang.String location, org.apache.hadoop.conf.Configuration conf)` Returns the partition keys for a location. The work is delegated to the PathPartitioner class.
`java.util.Map<java.lang.String,java.lang.String>`	`getPathPartitionKeyValues(java.lang.String location)` Returns the partition keys and each key's value for a single location. The location must look like mytable/partition1=a/partition2=b/myfile, for which this method returns a map with [partition1='a', partition2='b']. The work is delegated to the PathPartitioner class.
`java.util.List<org.apache.hadoop.fs.FileStatus>`	`listStatus(org.apache.hadoop.mapreduce.JobContext ctx, java.lang.Class<? extends LoadFunc> loaderClass, java.lang.String signature)` This method is called by the FileInputFormat to find the input paths for which splits should be calculated. If applyDateRanges == true, the HiveRCDateSplitter is used to filter the input files; otherwise the default FileInputFormat listStatus method is used.
`void`	`setPartitionFilterExpression(java.lang.String partitionFilterExpression, java.lang.Class<? extends LoadFunc> loaderClass, java.lang.String signature)` Sets the PARITITION_FILTER_EXPRESSION property in the UDFContext identified by the loaderClass.
`void`	`setPartitionKeys(java.lang.String location, org.apache.hadoop.conf.Configuration conf, java.lang.Class<? extends LoadFunc> loaderClass, java.lang.String signature)` Reads the partition keys from the location, i.e. the base directory.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

PARTITION_COLUMNS

public static final java.lang.String PARTITION_COLUMNS

PARITITION_FILTER_EXPRESSION

public static final java.lang.String PARITITION_FILTER_EXPRESSION

Constructor Detail
PathPartitionHelper
```
public PathPartitionHelper()
```

Method Detail

getPathPartitionKeyValues
```
public java.util.Map<java.lang.String,java.lang.String> getPathPartitionKeyValues(java.lang.String location)
                                                                           throws java.io.IOException
```
Returns the partition keys and each key's value for a single location.
The location must look like mytable/partition1=a/partition2=b/myfile,
for which this method returns a map with [partition1='a', partition2='b'].
The work is delegated to the PathPartitioner class.

Parameters:
location -

Returns:
Map of String, String

Throws:

java.io.IOException
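
The mapping from a location to its key/value pairs can be sketched in plain Java. This is only an illustration of the `key=value` folder convention described above, not the PathPartitioner code; `PartitionPathSketch` and `parseKeyValues` are hypothetical names, and the sketch assumes that every path segment containing `=` is a partition folder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the Pig PathPartitioner.
final class PartitionPathSketch {

    // Collects every "key=value" path segment, preserving path order.
    static Map<String, String> parseKeyValues(String location) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String segment : location.split("/")) {
            int eq = segment.indexOf('=');
            // Skip segments without '=' (table name, file name) or with an empty key.
            if (eq > 0) {
                result.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseKeyValues("mytable/partition1=a/partition2=b/myfile"));
        // prints {partition1=a, partition2=b}
    }
}
```

Note that the values stay Strings throughout, which matches the restriction that the helper only handles String values.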

getPartitionKeys

public java.util.Set<java.lang.String> getPartitionKeys(java.lang.String location,
                                               org.apache.hadoop.conf.Configuration conf)
                                                 throws java.io.IOException

Returns the partition keys for a location.
The work is delegated to the PathPartitioner class.

Parameters:
location - String must be the base directory for the partitions
conf -

Returns:
Set of String

Throws:

java.io.IOException
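
As a rough illustration of discovering partition keys from a base directory, the sketch below walks nested `key=value` folders. It is an assumption-laden analogue, not the PathPartitioner logic: it uses the local filesystem through `java.nio.file` instead of HDFS and a Hadoop `Configuration`, follows only the first partition folder at each level, and the names `PartitionKeysSketch` and `partitionKeys` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical local-filesystem analogue; not the Pig PathPartitioner.
final class PartitionKeysSketch {

    // Descends from baseDir through the first "key=value" directory at each
    // level and records the key part of each directory name, in order.
    static Set<String> partitionKeys(Path baseDir) throws IOException {
        Set<String> keys = new LinkedHashSet<>();
        Path current = baseDir;
        while (true) {
            Optional<Path> next;
            try (Stream<Path> children = Files.list(current)) {
                next = children
                        .filter(Files::isDirectory)
                        .filter(p -> p.getFileName().toString().indexOf('=') > 0)
                        .findFirst();
            }
            if (!next.isPresent()) {
                return keys; // no deeper partition level
            }
            String name = next.get().getFileName().toString();
            keys.add(name.substring(0, name.indexOf('=')));
            current = next.get();
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("mytable");
        Files.createDirectories(base.resolve("partition1=a").resolve("partition2=b"));
        System.out.println(partitionKeys(base));
        // prints [partition1, partition2]
    }
}
```

Sampling one folder per level is enough here only because the sketch assumes every branch under the base directory uses the same partition keys.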

setPartitionFilterExpression

public void setPartitionFilterExpression(java.lang.String partitionFilterExpression,
                                java.lang.Class<? extends LoadFunc> loaderClass,
                                java.lang.String signature)
                                  throws java.io.IOException

Sets the PARITITION_FILTER_EXPRESSION property in the UDFContext identified by the loaderClass.

Parameters:
partitionFilterExpression -
loaderClass -

Throws:

java.io.IOException

setPartitionKeys

public void setPartitionKeys(java.lang.String location,
                    org.apache.hadoop.conf.Configuration conf,
                    java.lang.Class<? extends LoadFunc> loaderClass,
                    java.lang.String signature)
                      throws java.io.IOException

Reads the partition keys from the location, i.e. the base directory.

Parameters:
location - String must be the base directory for the partitions
conf -
loaderClass -

Throws:

java.io.IOException

listStatus
```
public java.util.List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext ctx,
                                                         java.lang.Class<? extends LoadFunc> loaderClass,
                                                         java.lang.String signature)
                                                           throws java.io.IOException
```
This method is called by the FileInputFormat to find the input paths for which splits should be calculated.
If applyDateRanges == true, the HiveRCDateSplitter is used to filter the input files;
otherwise the default FileInputFormat listStatus method is used.

Parameters:
ctx - JobContext
loaderClass - this is chosen to be a subclass of LoadFunc to maintain some consistency.

Throws:

java.io.IOException
