org.apache.pig.piggybank.storage.hiverc
Class HiveRCInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
          extended by org.apache.pig.piggybank.storage.hiverc.HiveRCInputFormat

public class HiveRCInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>

HiveRCInputFormat is used by HiveColumnarLoader as its InputFormat.

Reasons for implementing a new InputFormat subclass:

- the minimum input split size must never be smaller than RCFile.SYNC_INTERVAL (see getFormatMinSplitSize()), and
- optional date-range filtering can be applied to the input paths (see listStatus(JobContext)).
Constructor Summary
HiveRCInputFormat()
          No date partitioning is applied.
HiveRCInputFormat(String dateRange)
          Date partitioning will be applied to the input path.
The path must be partitioned as input-path/daydate=yyyy-MM-dd.
 
Method Summary
 org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext ctx)
          Initialises an instance of HiveRCRecordReader.
protected  long getFormatMinSplitSize()
          The input split size should never be smaller than RCFile.SYNC_INTERVAL.
protected  List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext ctx)
          This method is called by the FileInputFormat to find the input paths for which splits should be calculated.
If applyDateRanges is true, the HiveRCDateSplitter is used to filter the input files by date range; otherwise the default FileInputFormat listStatus method is used.
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HiveRCInputFormat

public HiveRCInputFormat()
No date partitioning is applied.


HiveRCInputFormat

public HiveRCInputFormat(String dateRange)
Date partitioning will be applied to the input path.
The path must be partitioned as input-path/daydate=yyyy-MM-dd.

Parameters:
dateRange - Must have the format yyyy-MM-dd:yyyy-MM-dd, where the left-most date is the start of the range.
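
The expected shape of the dateRange string and the daydate=yyyy-MM-dd partition convention can be illustrated with a small self-contained sketch. The class and method names below are hypothetical helpers for illustration only, not part of HiveRCInputFormat's API; the actual filtering is performed internally via HiveRCDateSplitter.

```java
import java.time.LocalDate;

// Illustrative sketch only: how a "yyyy-MM-dd:yyyy-MM-dd" range string
// could be parsed, and how a daydate=yyyy-MM-dd partition directory
// could be tested against it. Hypothetical helpers, not the real API.
class DateRangeSketch {

    // Split the range on ':'; the left-most date is the start of the range.
    static LocalDate[] parseRange(String dateRange) {
        String[] parts = dateRange.split(":");
        return new LocalDate[] { LocalDate.parse(parts[0]), LocalDate.parse(parts[1]) };
    }

    // True if the daydate=yyyy-MM-dd partition falls inside the range (inclusive).
    static boolean inRange(String partitionDir, String dateRange) {
        LocalDate day = LocalDate.parse(partitionDir.substring("daydate=".length()));
        LocalDate[] range = parseRange(dateRange);
        return !day.isBefore(range[0]) && !day.isAfter(range[1]);
    }
}
```

For example, with the range "2010-02-01:2010-02-28", the partition directory daydate=2010-02-15 would be accepted and daydate=2010-03-01 would be skipped.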
Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                                                            org.apache.hadoop.mapreduce.TaskAttemptContext ctx)
                                                                                                                     throws IOException,
                                                                                                                            InterruptedException
Initialises an instance of HiveRCRecordReader.

Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
Throws:
IOException
InterruptedException

listStatus

protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext ctx)
                                                    throws IOException
This method is called by the FileInputFormat to find the input paths for which splits should be calculated.
If applyDateRanges is true, the HiveRCDateSplitter is used to filter the input files by date range; otherwise the default FileInputFormat listStatus method is used.

Overrides:
listStatus in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
Throws:
IOException

getFormatMinSplitSize

protected long getFormatMinSplitSize()
The input split size should never be smaller than RCFile.SYNC_INTERVAL.

Overrides:
getFormatMinSplitSize in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,BytesRefArrayWritable>
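
The effect of this override can be sketched in isolation. RCFile writes a sync marker roughly every SYNC_INTERVAL bytes, so a split smaller than that interval might contain no sync point for the reader to seek to. The constant below assumes the value used by Hive's RCFile implementation (100 * (4 + 16) bytes); the class and method names are hypothetical, for illustration only.

```java
// Illustrative sketch: FileInputFormat uses the format's minimum split
// size as a floor when computing splits, so returning SYNC_INTERVAL from
// getFormatMinSplitSize() guarantees no split is smaller than the
// RCFile sync interval. Constant value is an assumption mirroring
// RCFile.SYNC_INTERVAL; names here are not part of the real API.
class SplitSizeSketch {
    static final long SYNC_INTERVAL = 100 * (4 + 16); // assumed RCFile.SYNC_INTERVAL = 2000 bytes

    // The effective minimum split size is the larger of the format's
    // floor and the user-configured minimum.
    static long effectiveMinSplit(long configuredMin) {
        return Math.max(SYNC_INTERVAL, configuredMin);
    }
}
```

A user-configured minimum of 1 byte would thus be raised to the sync interval, while a larger configured minimum passes through unchanged.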


Copyright © The Apache Software Foundation