public class MapRedUtil
extends java.lang.Object
| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | FILE_SYSTEM_NAME |
| Constructor and Description |
|---|
| MapRedUtil() |
| Modifier and Type | Method and Description |
|---|---|
| static FileSpec | checkLeafIsStore(PhysicalPlan plan, PigContext pigContext) |
| static void | copyTmpFileConfigurationValues(org.apache.hadoop.conf.Configuration fromConf, org.apache.hadoop.conf.Configuration toConf) |
| static java.util.List<org.apache.hadoop.fs.FileStatus> | getAllFileRecursively(java.util.List<org.apache.hadoop.fs.FileStatus> files, org.apache.hadoop.conf.Configuration conf)<br>Gets all files recursively from the given list of files. |
| static java.util.List<java.util.List<org.apache.hadoop.mapreduce.InputSplit>> | getCombinePigSplits(java.util.List<org.apache.hadoop.mapreduce.InputSplit> oneInputSplits, long maxCombinedSplitSize, org.apache.hadoop.conf.Configuration conf) |
| static long | getPathLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus status) |
| static long | getPathLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus status, long max)<br>Returns the total number of bytes for this file or, if a directory, for all files in the directory. |
| java.lang.String | inputSplitToString(org.apache.hadoop.mapreduce.InputSplit[] splits) |
| static <E> java.util.Map<E,Pair<java.lang.Integer,java.lang.Integer>> | loadPartitionFileFromLocalCache(java.lang.String keyDistFile, java.lang.Integer[] totalReducers, byte keyType, org.apache.hadoop.conf.Configuration mapConf)<br>Loads the key distribution sampler file. |
| static void | setupStreamingDirsConfMulti(PigContext pigContext, org.apache.hadoop.conf.Configuration conf)<br>Sets up output and log dir paths for a multi-store streaming job. |
| static void | setupStreamingDirsConfSingle(POStore st, PigContext pigContext, org.apache.hadoop.conf.Configuration conf)<br>Sets up output and log dir paths for a single-store streaming job. |
| static void | setupUDFContext(org.apache.hadoop.conf.Configuration job) |
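The summary above lists getCombinePigSplits, which packs small input splits into groups whose combined size stays within maxCombinedSplitSize. The real method operates on Hadoop InputSplit objects and takes data locality into account; the sketch below re-creates only the size-packing idea with plain long lengths standing in for splits, so the class and method names here are illustrative, not Pig's.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a first-fit sketch of the grouping behind
// getCombinePigSplits. Split lengths are accumulated into groups
// whose total stays at or under maxCombinedSplitSize; a split
// larger than the cap still gets a group of its own.
public class CombineSketch {
    public static List<List<Long>> combine(List<Long> splitSizes, long maxCombinedSplitSize) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long size : splitSizes) {
            // Close the current group once adding this split would exceed the cap.
            if (!current.isEmpty() && currentSize + size > maxCombinedSplitSize) {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(size);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Four splits, cap of 100 bytes: expect [[60, 30], [80], [50]].
        System.out.println(combine(List.of(60L, 30L, 80L, 50L), 100L));
    }
}
```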
public static final java.lang.String FILE_SYSTEM_NAME
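The three-argument getPathLength documented below sums file sizes recursively but exits early once the running total reaches max, since callers often only need to know whether the data exceeds some threshold. The sketch below re-creates that early-exit contract with java.io.File in place of Hadoop's FileSystem/FileStatus; it is an assumption-laden stand-in, not Pig's implementation.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Illustrative only: the early-exit semantics of
// getPathLength(fs, status, max), re-created over the local
// filesystem. Directories are summed recursively and the walk
// stops as soon as the running total reaches max.
public class PathLengthSketch {
    public static long pathLength(File path, long max) {
        if (path.isFile()) {
            return path.length();
        }
        File[] children = path.listFiles();
        if (children == null) {
            return 0; // not a file and not a listable directory
        }
        long total = 0;
        for (File child : children) {
            total += pathLength(child, max - total);
            if (total >= max) {
                return total; // early exit: the cap has been reached
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("len").toFile();
        Files.write(new File(dir, "a.bin").toPath(), new byte[10]);
        Files.write(new File(dir, "b.bin").toPath(), new byte[10]);
        System.out.println(pathLength(dir, Long.MAX_VALUE)); // 20
        System.out.println(pathLength(dir, 5) >= 5);         // true, stopped early
    }
}
```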
public static <E> java.util.Map<E,Pair<java.lang.Integer,java.lang.Integer>> loadPartitionFileFromLocalCache(java.lang.String keyDistFile, java.lang.Integer[] totalReducers, byte keyType, org.apache.hadoop.conf.Configuration mapConf) throws java.io.IOException

Loads the key distribution sampler file.

Parameters:
- keyDistFile - the name for the distribution file
- totalReducers - gets set to the total number of reducers as found in the dist file
- keyType - type of the key to be stored in the return map. It currently treats Tuple as a special case.

Throws:
- java.io.IOException

public static void copyTmpFileConfigurationValues(org.apache.hadoop.conf.Configuration fromConf, org.apache.hadoop.conf.Configuration toConf)
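The Map<E,Pair<Integer,Integer>> returned by loadPartitionFileFromLocalCache associates sampled keys with a pair of integers, which Pig's skewed-join machinery uses to spread heavy keys over several reducers. The sketch below shows one way such a key-to-reducer-span map can be built from sampled key weights; the pair semantics (first reducer index plus reducer count), the allocation policy, and all names here are assumptions for illustration, not Pig's actual file format or logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: building a key -> reducer-span map from
// sampled key weights. Each heavy key receives a contiguous run
// of reducer indices sized roughly in proportion to its weight.
public class KeyDistSketch {
    // Stand-in for Pair<Integer,Integer>: assumed to mean
    // (index of first reducer for the key, number of reducers assigned).
    public record Span(int first, int count) {}

    public static Map<String, Span> allocate(Map<String, Double> keyWeights, int totalReducers) {
        Map<String, Span> spans = new LinkedHashMap<>();
        int next = 0;
        for (Map.Entry<String, Double> e : keyWeights.entrySet()) {
            // At least one reducer per key, more for heavier keys,
            // never running past the last reducer index.
            int count = Math.max(1, (int) Math.round(e.getValue() * totalReducers));
            count = Math.min(count, totalReducers - next);
            spans.put(e.getKey(), new Span(next, count));
            next += count;
        }
        return spans;
    }

    public static void main(String[] args) {
        // A key holding 50% of sampled tuples across 8 reducers spans 4 of them.
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("heavy", 0.5);
        weights.put("light", 0.1);
        System.out.println(allocate(weights, 8));
    }
}
```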
public static void setupUDFContext(org.apache.hadoop.conf.Configuration job) throws java.io.IOException

Throws:
- java.io.IOException

public static void setupStreamingDirsConfSingle(POStore st, PigContext pigContext, org.apache.hadoop.conf.Configuration conf) throws java.io.IOException

Sets up output and log dir paths for a single-store streaming job.

Parameters:
- st - POStore of the current job
- pigContext
- conf

Throws:
- java.io.IOException

public static void setupStreamingDirsConfMulti(PigContext pigContext, org.apache.hadoop.conf.Configuration conf) throws java.io.IOException

Sets up output and log dir paths for a multi-store streaming job.

Parameters:
- pigContext
- conf

Throws:
- java.io.IOException

public static FileSpec checkLeafIsStore(PhysicalPlan plan, PigContext pigContext) throws ExecException

Throws:
- ExecException

public static java.util.List<org.apache.hadoop.fs.FileStatus> getAllFileRecursively(java.util.List<org.apache.hadoop.fs.FileStatus> files, org.apache.hadoop.conf.Configuration conf) throws java.io.IOException

Gets all files recursively from the given list of files.

Parameters:
- files - a list of FileStatus
- conf - the configuration object

Throws:
- java.io.IOException

public static long getPathLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus status) throws java.io.IOException

Throws:
- java.io.IOException

public static long getPathLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus status, long max) throws java.io.IOException

Returns the total number of bytes for this file or, if a directory, for all files in the directory.

Parameters:
- fs - FileSystem
- status - FileStatus
- max - maximum total length that will trigger an early exit. Often the caller only needs to know whether the total length of the files exceeds some value X; in that case the function can return as soon as max is reached.

Throws:
- java.io.IOException

public static java.util.List<java.util.List<org.apache.hadoop.mapreduce.InputSplit>> getCombinePigSplits(java.util.List<org.apache.hadoop.mapreduce.InputSplit> oneInputSplits, long maxCombinedSplitSize, org.apache.hadoop.conf.Configuration conf) throws java.io.IOException, java.lang.InterruptedException

Throws:
- java.io.IOException
- java.lang.InterruptedException

public java.lang.String inputSplitToString(org.apache.hadoop.mapreduce.InputSplit[] splits) throws java.io.IOException, java.lang.InterruptedException

Throws:
- java.io.IOException
- java.lang.InterruptedException

Copyright © 2007-2025 The Apache Software Foundation