org.apache.pig.backend.hadoop.executionengine.util
Class MapRedUtil

java.lang.Object
  extended by org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil

public class MapRedUtil
extends Object

A collection of static utility methods for use in the Hadoop MapReduce backend.


Field Summary
static String FILE_SYSTEM_NAME
           
 
Constructor Summary
MapRedUtil()
           
 
Method Summary
static FileSpec checkLeafIsStore(PhysicalPlan plan, PigContext pigContext)
           
static List<org.apache.hadoop.fs.FileStatus> getAllFileRecursively(List<org.apache.hadoop.fs.FileStatus> files, org.apache.hadoop.conf.Configuration conf)
          Get all files recursively from the given list of files
static List<List<org.apache.hadoop.mapreduce.InputSplit>> getCombinePigSplits(List<org.apache.hadoop.mapreduce.InputSplit> oneInputSplits, long maxCombinedSplitSize, org.apache.hadoop.conf.Configuration conf)
           
 String inputSplitToString(org.apache.hadoop.mapreduce.InputSplit[] splits)
           
static
<E> Map<E,Pair<Integer,Integer>>
loadPartitionFileFromLocalCache(String keyDistFile, Integer[] totalReducers, byte keyType)
          Loads the key distribution sampler file
static void setupUDFContext(org.apache.hadoop.conf.Configuration job)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FILE_SYSTEM_NAME

public static final String FILE_SYSTEM_NAME
See Also:
Constant Field Values
Constructor Detail

MapRedUtil

public MapRedUtil()
Method Detail

loadPartitionFileFromLocalCache

public static <E> Map<E,Pair<Integer,Integer>> loadPartitionFileFromLocalCache(String keyDistFile,
                                                                               Integer[] totalReducers,
                                                                               byte keyType)
                                                                    throws IOException
Loads the key distribution sampler file

Parameters:
keyDistFile - the name of the key distribution file
totalReducers - output parameter; set to the total number of reducers as found in the distribution file
keyType - the type of the key stored in the returned map. Tuple is currently treated as a special case.
Throws:
IOException
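
The totalReducers argument is an array that the method writes into, so the caller has to allocate it before the call and read it afterwards. The stand-alone sketch below illustrates that calling convention only. OutParameterSketch, load, and the row layout are hypothetical names invented for illustration, Map.Entry stands in for Pair, and writing the result into slot 0 is an assumption rather than documented behavior of MapRedUtil.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.LinkedHashMap;
import java.util.Map;

class OutParameterSketch {

    // Hypothetical stand-in for a loader that returns a map and also reports
    // a second value through an array argument.
    // rows: each entry is {key, first, second}; the values are illustrative only.
    static Map<String, Map.Entry<Integer, Integer>> load(
            String[][] rows, int reducersInFile, Integer[] totalReducers) {
        Map<String, Map.Entry<Integer, Integer>> result = new LinkedHashMap<>();
        for (String[] row : rows) {
            result.put(row[0], new SimpleImmutableEntry<>(
                    Integer.valueOf(row[1]), Integer.valueOf(row[2])));
        }
        // Assumption: the callee writes its extra result into slot 0.
        totalReducers[0] = reducersInFile;
        return result;
    }
}
```

A caller would pass new Integer[1] and read element 0 once the call returns. A one-element array is used because Java passes object references by value, so reassigning a plain Integer parameter inside the method would not be visible to the caller.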

setupUDFContext

public static void setupUDFContext(org.apache.hadoop.conf.Configuration job)
                            throws IOException
Throws:
IOException

checkLeafIsStore

public static FileSpec checkLeafIsStore(PhysicalPlan plan,
                                        PigContext pigContext)
                                 throws ExecException
Throws:
ExecException

getAllFileRecursively

public static List<org.apache.hadoop.fs.FileStatus> getAllFileRecursively(List<org.apache.hadoop.fs.FileStatus> files,
                                                                          org.apache.hadoop.conf.Configuration conf)
                                                                   throws IOException
Get all files recursively from the given list of files

Parameters:
files - a list of FileStatus
conf - the configuration object
Returns:
a list of FileStatus containing every file in the given list and, recursively, every file inside the directories in the given list
Throws:
IOException
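
The recursive expansion described above can be sketched without a Hadoop dependency. The example below is an analogue, not the MapRedUtil implementation: it uses java.nio.file.Path in place of FileStatus and takes no Configuration, and the class name RecursiveListing is hypothetical. Any filtering the real method may apply is not modeled.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class RecursiveListing {

    // Returns every non-directory entry reachable from the given paths.
    // Plain files are kept as they are; directories are replaced by their contents.
    static List<Path> getAllFilesRecursively(List<Path> paths) throws IOException {
        List<Path> result = new ArrayList<>();
        for (Path p : paths) {
            collect(p, result);
        }
        return result;
    }

    // Depth-first walk: descend into directories, record everything else.
    private static void collect(Path p, List<Path> out) throws IOException {
        if (Files.isDirectory(p)) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(p)) {
                for (Path child : children) {
                    collect(child, out);
                }
            }
        } else {
            out.add(p);
        }
    }
}
```

Given a list holding one directory and one plain file, the result holds the plain file plus every file found under the directory, and no directory entries.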

getCombinePigSplits

public static List<List<org.apache.hadoop.mapreduce.InputSplit>> getCombinePigSplits(List<org.apache.hadoop.mapreduce.InputSplit> oneInputSplits,
                                                                                     long maxCombinedSplitSize,
                                                                                     org.apache.hadoop.conf.Configuration conf)
                                                                              throws IOException,
                                                                                     InterruptedException
Throws:
IOException
InterruptedException

inputSplitToString

public String inputSplitToString(org.apache.hadoop.mapreduce.InputSplit[] splits)
                          throws IOException,
                                 InterruptedException
Throws:
IOException
InterruptedException


Copyright © The Apache Software Foundation