Class | Description |
---|---|
AllLoader |
AllLoader lets you point Pig at a folder that contains
files in multiple formats, e.g.
|
AllLoader.AllLoaderInputFormat |
InputFormat that encapsulates the correct input format based on the file
type.
|
AllLoader.AllReader |
Contains the logic for selecting the correct Loader.
|
CSVExcelStorage |
CSV loading and storing with support for multi-line fields,
and escaping of delimiters and double quotes within fields;
uses CSV conventions of Excel 2007.
|
CSVLoader |
A load function based on PigStorage that implements part of the CSV "standard".
This loader properly supports double-quoted fields that contain commas and other
double quotes escaped with backslashes.
|
DBStorage | |
FixedWidthLoader |
A fixed-width file loader.
|
FixedWidthLoader.FixedWidthField | |
FixedWidthStorer |
Stores Pig records in a fixed-width file format.
|
HadoopJobHistoryLoader | |
HadoopJobHistoryLoader.HadoopJobHistoryInputFormat | |
HadoopJobHistoryLoader.HadoopJobHistoryReader | |
HadoopJobHistoryLoader.JobHistoryPathFilter | |
HadoopJobHistoryLoader.MRJobInfo | |
HiveColumnarLoader |
Loader for Hive RC Columnar files.
Supports the following types (Hive type → Pig type from DataType): string → CHARARRAY; int → INTEGER; bigint or long → LONG; float → float; double → DOUBLE; boolean → BOOLEAN; byte → BYTE; array → TUPLE; map → MAP.
Partitions: the input paths are scanned by the loader for [partition name]=[value] patterns in the subdirectories. If detected, these partitions are appended to the table schema. For example if you have the directory structure: |
HiveColumnarStorage | |
IndexedStorage |
IndexedStorage is a form of PigStorage that supports
per-record seeks. |
IndexedStorage.IndexedStorageInputFormat |
Internal InputFormat class
|
IndexedStorage.IndexedStorageInputFormat.IndexedStorageRecordReader |
Internal RecordReader class
|
IndexedStorage.IndexedStorageInputFormat.IndexedStorageRecordReader.IndexedStorageLineReader | |
IndexedStorage.IndexedStorageInputFormat.IndexedStorageRecordReader.IndexedStorageRecordReaderComparator |
Class to compare record readers using underlying indexes
|
IndexedStorage.IndexedStorageOutputFormat |
Internal OutputFormat class
|
IndexedStorage.IndexedStorageOutputFormat.IndexedStorageRecordWriter |
Internal class to do the actual record writing and index generation
|
IndexedStorage.IndexManager |
IndexManager manages the index file (both writing and reading).
It keeps track of the last index read during reading. |
JsonMetadata | Deprecated |
MultiStorage |
This UDF splits the output data into directories and files dynamically,
based on a user-specified key field in the output tuple.
|
MultiStorage.MultiStorageOutputFormat | |
MultiStorage.MultiStorageOutputFormat.MyLineRecordWriter | |
MyRegExLoader | |
PigStorageSchema | Deprecated.
Use PigStorage with the -schema option instead.
|
RegExLoader |
RegExLoader is an abstract class used to parse logs based on a regular expression.
|
SequenceFileLoader |
A loader for Hadoop-standard SequenceFiles.
|
XMLLoader |
Parses an XML input file given a specified identifier of tags to be loaded.
|
XMLLoader.XMLRecordReader |
Use this record reader to read XML tags out of a text file.
|
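
The HiveColumnarLoader entry above mentions scanning input paths for [partition name]=[value] patterns. The following is a minimal sketch of that idea only, not the loader's actual implementation: the class name `PartitionPathSketch`, the method `partitionsOf`, and the example path are all hypothetical and chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; this is not code from HiveColumnarLoader.
public class PartitionPathSketch {

    /**
     * Collects name=value pairs from every component of a slash-separated path,
     * preserving the order in which they appear.
     */
    public static Map<String, String> partitionsOf(String path) {
        Map<String, String> partitions = new LinkedHashMap<>();
        for (String component : path.split("/")) {
            int eq = component.indexOf('=');
            // Accept only components with a non-empty name and a non-empty value.
            if (eq > 0 && eq < component.length() - 1) {
                partitions.put(component.substring(0, eq), component.substring(eq + 1));
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Assumed example layout; the directory names are made up.
        System.out.println(partitionsOf("/warehouse/logs/year=2012/month=06/part-00000"));
    }
}
```

In a sketch like this, the discovered names (here `year` and `month`) would be the extra columns appended to the table schema.
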
Enum | Description |
---|---|
CSVExcelStorage.Headers | |
CSVExcelStorage.Linebreaks | |
CSVExcelStorage.Multiline | |
HadoopJobHistoryLoader.JobKeys |
Job Keys
|
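
CSVExcelStorage is described above as escaping delimiters and double quotes within fields using Excel's CSV conventions. As a rough sketch of what that convention looks like when writing a record (a field containing the delimiter, a double quote, or a line break is wrapped in double quotes, and embedded double quotes are doubled), consider the following. The class `ExcelCsvEscapeSketch` and its methods are hypothetical names for illustration and are not part of piggybank.

```java
// Hypothetical illustration of Excel-style CSV escaping; not CSVExcelStorage's code.
public class ExcelCsvEscapeSketch {

    /** Quotes a field only when needed and doubles any embedded double quotes. */
    public static String escapeField(String field, char delimiter) {
        boolean needsQuoting = field.indexOf(delimiter) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins string fields into one CSV record, without a trailing line break. */
    public static String joinRecord(String[] fields, char delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            sb.append(escapeField(fields[i], delimiter));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] record = {"plain", "a,b", "say \"hi\""};
        System.out.println(joinRecord(record, ','));
    }
}
```

This differs from the CSVLoader entry, which describes double quotes escaped with backslashes rather than doubled.
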
Copyright © 2007-2012 The Apache Software Foundation