org.apache.pig.piggybank.storage.avro
Class PigAvroInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
          extended by org.apache.pig.piggybank.storage.avro.PigAvroInputFormat

public class PigAvroInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>

The InputFormat for Avro data.
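In normal use this class is instantiated by AvroStorage when a Pig script loads Avro data. The following sketch is not taken from the Pig sources; the class name, helper method, and schema argument are made up purely to illustrate how the four-argument constructor could be invoked directly for the simplest single-schema case.

    import java.util.Map;
    import org.apache.avro.Schema;
    import org.apache.hadoop.fs.Path;
    import org.apache.pig.piggybank.storage.avro.PigAvroInputFormat;

    public class PigAvroInputFormatSketch {

        // Hypothetical helper: builds the input format for one reader schema,
        // with no schema merging (null map, useMultipleSchemas = false) and
        // strict handling of corrupt files (ignoreBadFiles = false).
        static PigAvroInputFormat newSingleSchemaFormat(String schemaJson) {
            Schema readerSchema = new Schema.Parser().parse(schemaJson);
            Map<Path, Map<Integer, Integer>> schemaToMergedSchemaMap = null;
            return new PigAvroInputFormat(readerSchema,
                                          false,                    // ignoreBadFiles
                                          schemaToMergedSchemaMap,  // no field remapping
                                          false);                   // useMultipleSchemas
        }
    }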


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
 
Constructor Summary
PigAvroInputFormat()
          Empty constructor.
PigAvroInputFormat(org.apache.avro.Schema readerSchema, boolean ignoreBadFiles, Map<org.apache.hadoop.fs.Path,Map<Integer,Integer>> schemaToMergedSchemaMap, boolean useMultipleSchemas)
          Constructor called by AvroStorage to pass in the reader schema and the ignoreBadFiles flag.
 
Method Summary
 org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
          Create and return an Avro record reader.
protected  List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job)
           
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PigAvroInputFormat

public PigAvroInputFormat()
Empty constructor.


PigAvroInputFormat

public PigAvroInputFormat(org.apache.avro.Schema readerSchema,
                          boolean ignoreBadFiles,
                          Map<org.apache.hadoop.fs.Path,Map<Integer,Integer>> schemaToMergedSchemaMap,
                          boolean useMultipleSchemas)
Constructor called by AvroStorage to pass in the reader schema and the ignoreBadFiles flag.

Parameters:
readerSchema - the reader schema
ignoreBadFiles - whether to ignore corrupted files during load
schemaToMergedSchemaMap - map that associates each input record with a remapping of its fields relative to the merged schema
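The sketch below is a hedged illustration of the shape of the schemaToMergedSchemaMap argument: the map is keyed by input Path, and each value presumably maps a field position in that file's schema to the corresponding position in the merged schema. The path and positions are invented for illustration and do not come from the Pig sources.

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.fs.Path;

    public class SchemaRemapSketch {

        static Map<Path, Map<Integer, Integer>> exampleRemap() {
            // Field 0 of this particular file lands at position 2 of the merged
            // schema, field 1 at position 0 (positions made up for illustration).
            Map<Integer, Integer> fieldRemap = new HashMap<Integer, Integer>();
            fieldRemap.put(0, 2);
            fieldRemap.put(1, 0);

            Map<Path, Map<Integer, Integer>> schemaToMergedSchemaMap =
                    new HashMap<Path, Map<Integer, Integer>>();
            schemaToMergedSchemaMap.put(new Path("/data/events/part-00000.avro"), fieldRemap);
            return schemaToMergedSchemaMap;
        }
    }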
Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                                                                    org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                                                             throws IOException,
                                                                                                                                    InterruptedException
Create and return an Avro record reader. It uses the reader schema passed in to the constructor.

Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
Throws:
IOException
InterruptedException
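A minimal sketch of driving the returned reader by hand, assuming a split and TaskAttemptContext already obtained from the framework; inside a MapReduce job the framework performs this loop itself. The class and method names below are hypothetical.

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.pig.piggybank.storage.avro.PigAvroInputFormat;

    public class RecordReaderSketch {

        static void readSplit(PigAvroInputFormat format,
                              InputSplit split,
                              TaskAttemptContext context)
                throws IOException, InterruptedException {
            RecordReader<NullWritable, Writable> reader =
                    format.createRecordReader(split, context);
            try {
                // Standard mapreduce RecordReader protocol: initialize, then iterate.
                reader.initialize(split, context);
                while (reader.nextKeyValue()) {
                    Writable record = reader.getCurrentValue();  // one Avro record, decoded with the reader schema
                    // process record ...
                }
            } finally {
                reader.close();
            }
        }
    }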

listStatus

protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job)
                                                    throws IOException
Overrides:
listStatus in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
Throws:
IOException


Copyright © 2007-2012 The Apache Software Foundation