java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<K,V>
    org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
      org.apache.pig.piggybank.storage.hiverc.HiveRCInputFormat
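public class HiveRCInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>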
HiveRCInputFormat is the InputFormat used by HiveColumnarLoader.
Reasons for implementing a new InputFormat subclass:
| Constructor Summary | |
|---|---|
| HiveRCInputFormat() | No date partitioning is applied. |
| HiveRCInputFormat(String dateRange) | Date partitioning will be applied to the input path. The path must be partitioned as input-path/daydate=yyyy-MM-dd. |
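The dateRange argument is documented further down as yyyy-MM-dd:yyyy-MM-dd, with the leftmost date being the start of the range. A minimal sketch of how such a string could be parsed and validated follows. The class `DateRangeSketch` and its methods are hypothetical and are not part of the piggybank API; treating both ends as inclusive and rejecting a reversed range are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

/** Hypothetical helper for illustration; not part of the piggybank API. */
final class DateRangeSketch {
    final LocalDate start;
    final LocalDate end;

    private DateRangeSketch(LocalDate start, LocalDate end) {
        this.start = start;
        this.end = end;
    }

    /** Parses "yyyy-MM-dd:yyyy-MM-dd"; the leftmost date is the start of the range. */
    static DateRangeSketch parse(String dateRange) {
        String[] parts = dateRange.split(":");
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                    "Expected yyyy-MM-dd:yyyy-MM-dd but got: " + dateRange);
        }
        LocalDate start;
        LocalDate end;
        try {
            // LocalDate.parse uses the ISO local date format, i.e. yyyy-MM-dd.
            start = LocalDate.parse(parts[0].trim());
            end = LocalDate.parse(parts[1].trim());
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Invalid date in range: " + dateRange, e);
        }
        // Assumption: a range whose end precedes its start is treated as an error.
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("Range end is before range start: " + dateRange);
        }
        return new DateRangeSketch(start, end);
    }

    /** Assumption: both ends of the range are inclusive. */
    boolean contains(LocalDate day) {
        return !day.isBefore(start) && !day.isAfter(end);
    }
}
```

With this sketch, `DateRangeSketch.parse("2011-01-01:2011-01-31")` yields a range that contains 2011-01-15 but not 2011-02-01.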
| Method Summary | |
|---|---|
| org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext ctx): Initialises an instance of HiveRCRecordReader. |
| protected long | getFormatMinSplitSize(): The input split size should never be smaller than RCFile.SYNC_INTERVAL. |
| protected List<org.apache.hadoop.fs.FileStatus> | listStatus(org.apache.hadoop.mapreduce.JobContext ctx): Called by FileInputFormat to find the input paths for which splits should be calculated. If applyDateRanges == true, the HiveRCDateSplitter is used to filter the input files; otherwise the default FileInputFormat listStatus method is used. |
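The date filtering described for listStatus can be sketched roughly as follows. This is not the HiveRCDateSplitter implementation, whose internals are not shown here. It is a simplified illustration that works on plain path strings instead of Hadoop FileStatus objects, and it assumes the input-path/daydate=yyyy-MM-dd layout from the constructor summary. The class `DayDateFilterSketch`, the inclusive range ends, and the choice to skip paths without a daydate partition are all assumptions.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of date-partition filtering; not the HiveRCDateSplitter code. */
final class DayDateFilterSketch {
    // Matches a path component of the form daydate=yyyy-MM-dd.
    private static final Pattern DAYDATE =
            Pattern.compile("(?:^|/)daydate=(\\d{4}-\\d{2}-\\d{2})(?:/|$)");

    /** Keeps only the paths whose daydate partition lies in [start, end]. */
    static List<String> filter(List<String> paths, LocalDate start, LocalDate end) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            Matcher m = DAYDATE.matcher(path);
            if (!m.find()) {
                continue; // Assumption: paths without a daydate partition are skipped.
            }
            // Throws DateTimeParseException for impossible dates such as 2011-13-01.
            LocalDate day = LocalDate.parse(m.group(1));
            if (!day.isBefore(start) && !day.isAfter(end)) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

In a real InputFormat the same decision would be made per FileStatus inside listStatus, so that splits are only computed for files in the requested partitions.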
| Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
|---|
| addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public HiveRCInputFormat()
public HiveRCInputFormat(String dateRange)
Parameters: dateRange - Must have the format yyyy-MM-dd:yyyy-MM-dd, with the leftmost date being the start of the range.

| Method Detail |
|---|
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
        org.apache.hadoop.mapreduce.TaskAttemptContext ctx)
        throws IOException,
               InterruptedException
Specified by: createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
Throws: IOException, InterruptedException

protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext ctx)
        throws IOException
Overrides: listStatus in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
Throws: IOException

protected long getFormatMinSplitSize()
Overrides: getFormatMinSplitSize in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
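To see why overriding getFormatMinSplitSize keeps splits from dropping below RCFile.SYNC_INTERVAL, it helps to look at the arithmetic FileInputFormat applies when sizing splits. The sketch below mirrors that arithmetic in plain Java without Hadoop on the classpath: the format minimum is combined with the configured minimum, and the result is the floor for the computed split size. The class `SplitSizeSketch` is hypothetical, and the constant `ASSUMED_SYNC_INTERVAL` is a placeholder value for illustration rather than a value taken from this page.

```java
/** Hypothetical, Hadoop-free illustration of split-size arithmetic. */
final class SplitSizeSketch {
    // Placeholder standing in for RCFile.SYNC_INTERVAL; the value is an assumption.
    static final long ASSUMED_SYNC_INTERVAL = 2000L;

    /** The effective minimum is the larger of the format minimum and the configured minimum. */
    static long effectiveMinSize(long formatMinSplitSize, long configuredMinSplitSize) {
        return Math.max(formatMinSplitSize, configuredMinSplitSize);
    }

    /** The split size is the block size capped by maxSize, but never below minSize. */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }
}
```

Under these assumptions, even if a job configures a maximum split size of 1000 bytes, `computeSplitSize(blockSize, effectiveMinSize(ASSUMED_SYNC_INTERVAL, 1), 1000)` still returns 2000, so the split never becomes smaller than the sync interval.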