org.apache.pig.piggybank.storage.avro
Class PigAvroInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
          extended by org.apache.pig.piggybank.storage.avro.PigAvroInputFormat

public class PigAvroInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>

The InputFormat for Avro data.
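In normal use this class is instantiated by AvroStorage when a Pig script loads Avro data. The following sketch is not taken from the Pig sources; the class name, helper method, and schema argument are made up purely to illustrate how the four-argument constructor could be invoked directly for the simplest single-schema case.

    import java.util.Map;
    import org.apache.avro.Schema;
    import org.apache.hadoop.fs.Path;
    import org.apache.pig.piggybank.storage.avro.PigAvroInputFormat;

    public class PigAvroInputFormatSketch {

        // Hypothetical helper: builds the input format for one reader schema,
        // with no schema merging (null map, useMultipleSchemas = false) and
        // strict handling of corrupt files (ignoreBadFiles = false).
        static PigAvroInputFormat newSingleSchemaFormat(String schemaJson) {
            Schema readerSchema = new Schema.Parser().parse(schemaJson);
            Map<Path, Map<Integer, Integer>> schemaToMergedSchemaMap = null;
            return new PigAvroInputFormat(readerSchema,
                                          false,                    // ignoreBadFiles
                                          schemaToMergedSchemaMap,  // no field remapping
                                          false);                   // useMultipleSchemas
        }
    }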


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
 
Constructor Summary
PigAvroInputFormat()
          Empty constructor.
PigAvroInputFormat(org.apache.avro.Schema readerSchema, boolean ignoreBadFiles, Map<org.apache.hadoop.fs.Path,Map<Integer,Integer>> schemaToMergedSchemaMap, boolean useMultipleSchemas)
          Constructor called by AvroStorage to pass in the reader schema and the ignoreBadFiles flag.
 
Method Summary
 org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
          Create and return an Avro record reader.
protected  List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job)
           
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PigAvroInputFormat

public PigAvroInputFormat()
Empty constructor.


PigAvroInputFormat

public PigAvroInputFormat(org.apache.avro.Schema readerSchema,
                          boolean ignoreBadFiles,
                          Map<org.apache.hadoop.fs.Path,Map<Integer,Integer>> schemaToMergedSchemaMap,
                          boolean useMultipleSchemas)
Constructor called by AvroStorage to pass in the reader schema and the ignoreBadFiles flag.

Parameters:
readerSchema - the reader schema
ignoreBadFiles - whether to ignore corrupted files during load
schemaToMergedSchemaMap - map that associates each input record with a remapping of its fields relative to the merged schema
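The sketch below is a hedged illustration of the shape of the schemaToMergedSchemaMap argument: the map is keyed by input Path, and each value presumably maps a field position in that file's schema to the corresponding position in the merged schema. The path and positions are invented for illustration and do not come from the Pig sources.

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.fs.Path;

    public class SchemaRemapSketch {

        static Map<Path, Map<Integer, Integer>> exampleRemap() {
            // Field 0 of this particular file lands at position 2 of the merged
            // schema, field 1 at position 0 (positions made up for illustration).
            Map<Integer, Integer> fieldRemap = new HashMap<Integer, Integer>();
            fieldRemap.put(0, 2);
            fieldRemap.put(1, 0);

            Map<Path, Map<Integer, Integer>> schemaToMergedSchemaMap =
                    new HashMap<Path, Map<Integer, Integer>>();
            schemaToMergedSchemaMap.put(new Path("/data/events/part-00000.avro"), fieldRemap);
            return schemaToMergedSchemaMap;
        }
    }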
Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                                                                    org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                                                             throws IOException,
                                                                                                                                    InterruptedException
Create and return an Avro record reader. It uses the reader schema passed in to the constructor.

Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
Throws:
IOException
InterruptedException
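A minimal sketch of driving the returned reader by hand, assuming a split and TaskAttemptContext already obtained from the framework; inside a MapReduce job the framework performs this loop itself. The class and method names below are hypothetical.

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.pig.piggybank.storage.avro.PigAvroInputFormat;

    public class RecordReaderSketch {

        static void readSplit(PigAvroInputFormat format,
                              InputSplit split,
                              TaskAttemptContext context)
                throws IOException, InterruptedException {
            RecordReader<NullWritable, Writable> reader =
                    format.createRecordReader(split, context);
            try {
                // Standard mapreduce RecordReader protocol: initialize, then iterate.
                reader.initialize(split, context);
                while (reader.nextKeyValue()) {
                    Writable record = reader.getCurrentValue();  // one Avro record, decoded with the reader schema
                    // process record ...
                }
            } finally {
                reader.close();
            }
        }
    }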

listStatus

protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job)
                                                    throws IOException
Overrides:
listStatus in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
Throws:
IOException


Copyright © 2007-2012 The Apache Software Foundation