Class Summary
AllLoader The AllLoader provides the ability to point Pig at a folder that contains files in multiple formats e.g.
AllLoader.AllLoaderInputFormat InputFormat that encapsulates the correct input format based on the file type.
AllLoader.AllReader This is where the logic is for selecting the correct Loader.
CSVExcelStorage CSV loading and storing with support for multi-line fields, and escaping of delimiters and double quotes within fields; uses CSV conventions of Excel 2007.
CSVLoader A load function based on PigStorage that implements part of the CSV "standard". This loader properly supports double-quoted fields that contain commas and other double-quotes escaped with backslashes.
FixedWidthLoader A fixed-width file loader.
FixedWidthStorer Stores Pig records in a fixed-width file format.
HiveColumnarLoader Loader for Hive RC Columnar files.
Supports the following types:

Hive Type          Pig Type (from DataType)
string             CHARARRAY
int                INTEGER
bigint or long     LONG
float              FLOAT
double             DOUBLE
boolean            BOOLEAN
byte               BYTE
array              TUPLE
map                MAP

The input paths are scanned by the loader for [partition name]=[value] patterns in the subdirectories.
If detected, these partitions are appended to the table schema.
For example, if you have the directory structure:

IndexedStorage IndexedStorage is a form of PigStorage that supports a per record seek.
IndexedStorage.IndexedStorageInputFormat Internal InputFormat class
IndexedStorage.IndexedStorageInputFormat.IndexedStorageRecordReader Internal RecordReader class
IndexedStorage.IndexedStorageInputFormat.IndexedStorageRecordReader.IndexedStorageRecordReaderComparator Class to compare record readers using underlying indexes
IndexedStorage.IndexedStorageOutputFormat Internal OutputFormat class
IndexedStorage.IndexedStorageOutputFormat.IndexedStorageRecordWriter Internal class to do the actual record writing and index generation
IndexedStorage.IndexManager IndexManager manages the index file (both writing and reading). It keeps track of the last index read during reading.
JsonMetadata Deprecated.
MultiStorage This UDF splits the output data into multiple directories and files dynamically, based on a user-specified key field in the output tuple.
PigStorageSchema Deprecated. Use PigStorage with a -schema option instead.
RegExLoader RegExLoader is an abstract class used to parse logs based on a regular expression.
SequenceFileLoader A Loader for Hadoop-Standard SequenceFiles.
XMLLoader Parses an XML input file given a specified identifier of tags to be loaded.
XMLLoader.XMLRecordReader Use this record reader to read XML tags out of a text file.
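
The HiveColumnarLoader entry above says that input paths are scanned for [partition name]=[value] patterns. The following is a minimal, self-contained Java sketch of that idea only; it is not the loader's actual code. The class name PartitionPathSketch, the method partitionsOf, and the sample path are hypothetical and made up for illustration. As a simplification, the sketch inspects every path segment, including the file name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not part of Pig or piggybank.
public class PartitionPathSketch {

    /** Collects name=value path segments, preserving their order in the path. */
    public static Map<String, String> partitionsOf(String path) {
        Map<String, String> partitions = new LinkedHashMap<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            // Require a non-empty name before '=' and a non-empty value after it.
            if (eq > 0 && eq < segment.length() - 1) {
                partitions.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Sample path is an assumption, chosen only to show the pattern.
        String path = "/warehouse/logs/year=2011/month=07/part-00000.rc";
        System.out.println(partitionsOf(path)); // prints {year=2011, month=07}
    }
}
```

In this sketch each detected pair would correspond to one extra column appended to the schema, with the directory value as that column's content for every record under the directory.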

Enum Summary
HadoopJobHistoryLoader.JobKeys Job Keys

Copyright © 2007-2012 The Apache Software Foundation