Interface | Description |
---|---|
InvokerFunction |
Class | Description |
---|---|
ABS |
ABS implements a binding to the Java function
Math.abs(double) for computing the
absolute value of the argument. |
ACOS |
ACOS implements a binding to the Java function
Math.acos(double) for computing the
arc cosine of the argument. |
AddDuration |
AddDuration returns the result of a DateTime object plus a Duration object
|
AlgebraicBigDecimalMathBase |
Core logic for applying a SUM function to a
bag of BigDecimals.
|
AlgebraicBigDecimalMathBase.Final | |
AlgebraicBigDecimalMathBase.Intermediate | |
AlgebraicBigIntegerMathBase |
Core logic for applying a SUM function to a
bag of BigIntegers.
|
AlgebraicBigIntegerMathBase.Final | |
AlgebraicBigIntegerMathBase.Intermediate | |
AlgebraicByteArrayMathBase |
Core logic for applying an accumulative/algebraic math function to a
bag of doubles.
|
AlgebraicByteArrayMathBase.Final | |
AlgebraicByteArrayMathBase.Initial | |
AlgebraicByteArrayMathBase.Intermediate | |
AlgebraicDoubleMathBase |
Core logic for applying an accumulative/algebraic math function to a
bag of doubles.
|
AlgebraicDoubleMathBase.Final | |
AlgebraicDoubleMathBase.Intermediate | |
AlgebraicFloatMathBase |
Core logic for applying an accumulative/algebraic math function to a
bag of Floats.
|
AlgebraicFloatMathBase.Final | |
AlgebraicFloatMathBase.Intermediate | |
AlgebraicIntMathBase |
Core logic for applying an accumulative/algebraic math function to a
bag of Integers.
|
AlgebraicIntMathBase.Final | |
AlgebraicIntMathBase.Intermediate | |
AlgebraicLongMathBase |
Core logic for applying an accumulative/algebraic math function to a
bag of Longs.
|
AlgebraicLongMathBase.Final | |
AlgebraicLongMathBase.Intermediate | |
ARITY | Deprecated
Use
SIZE instead. |
ASIN |
ASIN implements a binding to the Java function
Math.asin(double) for computing the
arc sine of the argument. |
Assert | |
ATAN |
ATAN implements a binding to the Java function
Math.atan(double) for computing the
arc tangent of the argument. |
AVG |
Generates the average of a set of values.
|
AVG.Final | |
AVG.Initial | |
AVG.Intermediate | |
AvroStorage |
Pig UDF for reading and writing Avro data.
|
BagSize |
This method should never be used directly, use
SIZE . |
BagToString |
Flatten a bag into a string.
|
BagToTuple |
Flatten a bag into a tuple.
|
Base |
Base class for math UDFs.
|
BigDecimalAbs | |
BigDecimalAvg |
This method should never be used directly, use
AVG . |
BigDecimalAvg.Final | |
BigDecimalAvg.Initial | |
BigDecimalAvg.Intermediate | |
BigDecimalMax |
This method should never be used directly, use
MAX . |
BigDecimalMax.Final | |
BigDecimalMax.Intermediate | |
BigDecimalMin |
This method should never be used directly, use
MIN . |
BigDecimalMin.Final | |
BigDecimalMin.Intermediate | |
BigDecimalSum |
This method should never be used directly, use
SUM . |
BigDecimalSum.Final | |
BigDecimalSum.Intermediate | |
BigDecimalWrapper |
Max and min seeds cannot be defined for BigDecimal, as the value can grow as large as
the computer allows.
|
BigIntegerAbs | |
BigIntegerAvg |
This method should never be used directly, use
AVG . |
BigIntegerAvg.Final | |
BigIntegerAvg.Initial | |
BigIntegerAvg.Intermediate | |
BigIntegerMax |
This method should never be used directly, use
MAX . |
BigIntegerMax.Final | |
BigIntegerMax.Intermediate | |
BigIntegerMin |
This method should never be used directly, use
MIN . |
BigIntegerMin.Final | |
BigIntegerMin.Intermediate | |
BigIntegerSum |
This method should never be used directly, use
SUM . |
BigIntegerSum.Final | |
BigIntegerSum.Intermediate | |
BigIntegerWrapper |
Max and min seeds cannot be defined for BigInteger, as the value can grow as large as
the computer allows.
|
BinStorage |
Load and store data in a binary format.
|
Bloom |
Use a Bloom filter built previously by BuildBloom.
|
BuildBloom |
Build a bloom filter for use later in Bloom.
|
BuildBloom.Final | |
BuildBloom.Initial | |
BuildBloom.Intermediate | |
BuildBloomBase<T> |
A Base class for BuildBloom and its Algebraic implementations.
|
CBRT |
CBRT implements a binding to the Java function
Math.cbrt(double) for computing the
cube root of the argument. |
CEIL |
CEIL implements a binding to the Java function
Math.ceil(double) . |
CONCAT |
Generates the concatenation of two or more arguments.
|
ConstantSize |
This method should never be used directly, use
SIZE . |
COR |
Computes the correlation between sets of data.
|
COR.Final | |
COR.Initial | |
COR.Intermed | |
COS |
COS implements a binding to the Java function
Math.cos(double) . |
COSH |
COSH implements a binding to the Java function
Math.cosh(double) . |
COUNT |
Generates the count of the number of values in a bag.
|
COUNT_STAR |
Generates the count of the values of the first field of a tuple.
|
COUNT_STAR.Final | |
COUNT_STAR.Initial | |
COUNT_STAR.Intermediate | |
COUNT.Final | |
COUNT.Initial | |
COUNT.Intermediate | |
COV |
Computes the covariance between sets of data.
|
COV.Final | |
COV.Initial | |
COV.Intermed | |
CubeDimensions |
Produces a DataBag with all combinations of the argument tuple members
as in a data cube.
|
CurrentTime | |
DateTimeMax |
This method should never be used directly, use
MAX . |
DateTimeMax.Final | |
DateTimeMax.Initial | |
DateTimeMax.Intermediate | |
DateTimeMin |
This method should never be used directly, use
MIN . |
DateTimeMin.Final | |
DateTimeMin.Initial | |
DateTimeMin.Intermediate | |
DaysBetween |
DaysBetween returns the number of days between two DateTime objects
|
DIFF |
DIFF takes two bags as arguments and compares them.
|
Distinct |
Find the distinct set of tuples in a bag.
|
Distinct.Final | |
Distinct.Initial | |
Distinct.Intermediate | |
DoubleAbs | |
DoubleAvg |
This method should never be used directly, use
AVG . |
DoubleAvg.Final | |
DoubleAvg.Initial | |
DoubleAvg.Intermediate | |
DoubleBase |
Base class for math UDFs that return a Double value.
|
DoubleMax |
This method should never be used directly, use
MAX . |
DoubleMax.Final | |
DoubleMax.Intermediate | |
DoubleMin |
This method should never be used directly, use
MIN . |
DoubleMin.Final | |
DoubleMin.Intermediate | |
DoubleRound |
Given a single data atom, returns the closest long to the argument.
|
DoubleRoundTo |
ROUND_TO safely rounds a number to a given precision by using an intermediate
BigDecimal.
|
DoubleSum |
This method should never be used directly, use
SUM . |
DoubleSum.Final | |
DoubleSum.Intermediate | |
ENDSWITH |
Pig UDF to test input
tuple.get(0) against tuple.get(1)
to determine if the first argument ends with the string in the second. |
EqualsIgnoreCase |
Compares two Strings ignoring case considerations.
|
EXP |
Given a single data atom, returns Euler's number e raised to the power of the input.
|
FloatAbs | |
FloatAvg |
This method should never be used directly, use
AVG . |
FloatAvg.Final | |
FloatAvg.Initial | |
FloatAvg.Intermediate | |
FloatMax |
This method should never be used directly, use
MAX . |
FloatMax.Final | |
FloatMax.Intermediate | |
FloatMin |
This method should never be used directly, use
MIN . |
FloatMin.Final | |
FloatMin.Intermediate | |
FloatRound |
ROUND implements a binding to the Java function
Math.round(float) . |
FloatRoundTo |
ROUND_TO safely rounds a number to a given precision by using an intermediate
BigDecimal.
|
FloatSum |
This method should never be used directly, use
SUM . |
FLOOR |
FLOOR implements a binding to the Java function
Math.floor(double) . |
FunctionWrapperEvalFunc |
EvalFunc that wraps an implementation of the Function interface, which is passed as a String
in the constructor.
|
FuncUtils | |
GenericInvoker<T> |
The generic Invoker class does all the common grunt work of setting up an invoker.
|
GetDay |
GetDay extracts the day of a month from a DateTime object.
|
GetHour |
GetHour extracts the hour of a day from a DateTime object.
|
GetMilliSecond |
GetMilliSecond extracts the millisecond of a second from a DateTime object.
|
GetMinute |
GetMinute extracts the minute of an hour from a DateTime object.
|
GetMonth |
GetMonth extracts the month of a year from a DateTime object.
|
GetSecond |
GetSecond extracts the second of a minute from a DateTime object.
|
GetWeek |
GetWeek extracts the week of a week year from a DateTime object.
|
GetWeekYear |
GetWeekYear extracts the week year from a DateTime object.
|
GetYear |
GetYear extracts the year from a DateTime object.
|
HiveUDAF |
Use Hive UDAF or GenericUDAF.
|
HiveUDAF.Final | |
HiveUDAF.Initial | |
HiveUDAF.Intermediate | |
HiveUDF |
Use Hive UDF or GenericUDF.
|
HiveUDTF |
Use Hive GenericUDTF.
|
HoursBetween |
HoursBetween returns the number of hours between two DateTime objects
|
INDEXOF |
INDEXOF implements an eval function to search for a string.
Example:
A = load 'mydata' as (name);
B = foreach A generate INDEXOF(name, ",");
|
IntAbs |
ABS implements a binding to the Java function
Math.abs(int) for computing the
absolute value of the argument. |
IntAvg |
This method should never be used directly, use
AVG . |
IntAvg.Final | |
IntAvg.Initial | |
IntAvg.Intermediate | |
IntMax |
This method should never be used directly, use
MAX . |
IntMax.Final | |
IntMax.Intermediate | |
IntMin |
This method should never be used directly, use
MIN . |
IntMin.Final | |
IntMin.Intermediate | |
IntSum |
This method should never be used directly, use
SUM . |
INVERSEMAP |
This UDF accepts a Map as input with values of any primitive data type.
|
InvokeForDouble | |
InvokeForFloat | |
InvokeForInt | |
InvokeForLong | |
InvokeForString | |
Invoker<T> | |
InvokerGenerator | |
IsEmpty |
Determine whether a bag or map is empty.
|
JsonLoader |
A loader for data stored using
JsonStorage . |
JsonMetadata |
Reads and writes metadata using JSON in metafiles next to the data.
|
JsonStorage |
A JSON Pig store function.
|
KEYSET |
This UDF takes a Map and returns a Bag containing the keyset.
|
LAST_INDEX_OF |
LAST_INDEX_OF implements an eval function to search for the last occurrence of a string.
Returns null on error.
Example:
A = load 'mydata' as (name);
B = foreach A generate LAST_INDEX_OF(name, ",");
|
LCFIRST |
Lower-cases the first character of a string.
|
LOG |
LOG implements a binding to the Java function
Math.log(double) . |
LOG10 |
LOG10 implements a binding to the Java function
Math.log10(double) . |
LongAbs | |
LongAvg |
This method should never be used directly, use
AVG . |
LongAvg.Final | |
LongAvg.Initial | |
LongAvg.Intermediate | |
LongMax |
This method should never be used directly, use
MAX . |
LongMax.Final | |
LongMax.Intermediate | |
LongMin |
This method should never be used directly, use
MIN . |
LongMin.Final | |
LongMin.Intermediate | |
LongSum |
This method should never be used directly, use
SUM . |
LongSum.Final | |
LongSum.Intermediate | |
LOWER |
LOWER implements an eval function to convert a string to lower case.
Example:
A = load 'mydata' as (name);
B = foreach A generate LOWER(name);
|
LTRIM |
Returns a string, with only leading whitespace omitted.
|
MapSize |
This method should never be used directly, use
SIZE . |
MAX |
Generates the maximum of a set of values.
|
MAX.Final | |
MAX.Intermediate | |
MilliSecondsBetween |
MilliSecondsBetween returns the number of milliseconds between two DateTime objects
|
MIN |
Generates the minimum of a set of values.
|
MIN.Final | |
MIN.Intermediate | |
MinutesBetween |
MinutesBetween returns the number of minutes between two DateTime objects
|
MonthsBetween |
MonthsBetween returns the number of months between two DateTime objects
|
OrcStorage |
A load function and store function for ORC files.
|
OrcStorage.NonEmptyOrcFileFilter | |
ParquetLoader |
Wrapper class which will delegate calls to parquet.pig.ParquetLoader
|
ParquetStorer |
Wrapper class which will delegate calls to parquet.pig.ParquetStorer
|
PigStorage |
A load function that parses a line of input into fields using a character delimiter.
|
PigStreaming |
The default implementation of
PigStreamingBase . |
PluckTuple |
This is a UDF which allows the user to specify a string prefix, and then
filter for the columns in a relation that begin with that prefix.
|
RANDOM |
Return a random double value.
|
REGEX_EXTRACT |
Syntax:
String RegexExtract(String expression, String regex, int match_index) .
Input:
expression -source string .
regex -regular expression .
match_index -index of the group to extract .
Output:
extracted group, if fail, return null .
Matching strategy:
Try to only match the first sequence by using Matcher.find() instead of
Matcher.matches() (default useMatches=false).
DEFINE NON_GREEDY_EXTRACT REGEX_EXTRACT('true');
|
REGEX_EXTRACT_ALL |
Syntax:
String RegexExtractAll(String expression, String regex) .
Input:
expression -source string .
regex -regular expression .
Output:
A tuple of matched strings .
Matching strategy:
Trying to match the entire input by using Matcher.matches() instead of
Matcher.find() (default useMatches=true).
DEFINE GREEDY_EXTRACT REGEX_EXTRACT_ALL('false');
|
REGEX_SEARCH |
Search and find all matched characters in a string with a given
regular expression.
|
REPLACE |
REPLACE implements an eval function to replace part of a string.
|
RollupDimensions |
Produces a DataBag with hierarchy of values (from the most detailed level of
aggregation to most general level of aggregation) of the specified dimensions
For example, (a, b, c) will produce the following bag:
|
ROUND |
ROUND implements a binding to the Java function
Math.round(double) . |
ROUND_TO |
ROUND_TO safely rounds a number to a given precision by using an intermediate
BigDecimal.
|
RoundRobinPartitioner | Deprecated |
RTRIM |
Returns a string, with only trailing whitespace omitted.
|
SecondsBetween |
SecondsBetween returns the number of seconds between two DateTime objects
|
SIN |
SIN implements a binding to the Java function
Math.sin(double) . |
SINH |
SINH implements a binding to the Java function
Math.sinh(double) . |
SIZE |
Generates the size of the argument passed to it.
|
SPRINTF |
Formatted strings using java.util.Formatter
See http://docs.oracle.com/javase/7/docs/api/java/util/Formatter.html
ex:
SPRINTF('%2$10s %1$-17s %2$,10d %2$8x %3$10.3f %4$1TFT%
|
SQRT |
SQRT implements a binding to the Java function
Math.sqrt(double) . |
STARTSWITH |
Pig UDF to test input
tuple.get(0) against tuple.get(1)
to determine if the first argument starts with the string in the second. |
StringConcat |
This method should never be used directly, use
CONCAT . |
StringMax |
This method should never be used directly, use
MAX . |
StringMax.Final | |
StringMax.Initial | |
StringMax.Intermediate | |
StringMin |
This method should never be used directly, use
MIN . |
StringMin.Final | |
StringMin.Initial | |
StringMin.Intermediate | |
StringSize |
This method should never be used directly, use
SIZE . |
STRSPLIT |
Wrapper around Java's String.split
Input tuple: the first column is assumed to hold the string to split; the optional second column is assumed to hold the delimiter or regex to split on (if not provided, '\s' (space) is assumed); the optional third column may provide a limit on the number of results. If the limit is not provided, 0 is assumed, as per Java's split(). |
STRSPLITTOBAG |
Wrapper around Java's String.split
Input tuple: the first column is assumed to hold the string to split; the optional second column is assumed to hold the delimiter or regex to split on (if not provided, '\s' (space) is assumed); the optional third column may provide a limit on the number of results. If the limit is not provided, 0 is assumed, as per Java's split(). |
SUBSTRING |
SUBSTRING implements an eval function to get a part of a string.
|
SUBTRACT |
SUBTRACT takes two bags as arguments and returns a new bag composed of tuples of first bag not in the second bag.
If null, bag arguments are replaced by empty bags. |
SubtractDuration |
SubtractDuration returns the result of a DateTime object minus a Duration object
|
SUM |
Generates the sum of a set of values.
|
SUM.Final | |
SUM.Intermediate | |
TAN |
TAN implements a binding to the Java function
Math.tan(double) . |
TANH |
TANH implements a binding to the Java function
Math.tanh(double) . |
TextLoader |
This load function simply creates a tuple for each line of text that has a
single chararray field that
contains the line of text.
|
TOBAG |
This class takes a list of items and puts them into a bag
T = foreach U generate TOBAG($0, $1, $2);
It's like saying this:
T = foreach U generate {($0), ($1), ($2)}
All arguments that are not of tuple type are inserted into a tuple before
being added to the bag.
|
ToDate |
ToDate converts an ISO string, a custom-format string, or a Unix timestamp to a DateTime object.
|
ToDate2ARGS |
This method should never be used directly, use
ToDate . |
ToDate3ARGS |
This method should never be used directly, use
ToDate . |
ToDateISO |
This method should never be used directly, use
ToDate . |
TOKENIZE |
Given a chararray as an argument, this method will split the chararray and
return a bag with a tuple for each chararray that results from the split.
|
TOMAP |
This class makes a map out of the parameters passed to it
T = foreach U generate TOMAP($0, $1, $2, $3);
It generates a map $0->$1, $2->$3
This UDF also accepts a bag with 'pair' tuples (i.e.
|
ToMilliSeconds |
ToMilliSeconds converts the DateTime to the number of milliseconds that have passed
since January 1, 1970 00:00:00.000 GMT.
|
TOP |
Top UDF accepts a bag of tuples and returns top-n tuples depending upon the
tuple field value of type long.
|
TOP.Final | |
TOP.Initial | |
TOP.Intermed | |
ToString |
ToString converts the DateTime object to an ISO or custom-format string.
|
TOTUPLE |
This class makes a tuple out of the parameter
T = foreach U generate TOTUPLE($0, $1, $2);
It generates a tuple containing $0, $1, and $2
|
ToUnixTime |
ToUnixTime converts the DateTime to the Unix Time Long
|
TrevniStorage |
Pig Store/Load Function for Trevni.
|
TRIM |
Returns a string, with leading and trailing whitespace omitted.
|
TupleSize |
This method should never be used directly, use
SIZE . |
UCFIRST |
Upper-cases the first character of a string.
|
UniqueID |
UniqueID generates a unique id for each record in the job.
|
UPPER |
UPPER implements an eval function to convert a string to upper case.
Example:
A = load 'mydata' as (name);
B = foreach A generate UPPER(name);
|
Utf8StorageConverter |
This abstract class provides standard conversions between utf8 encoded data
and pig data types.
|
VALUELIST |
This UDF takes a Map and returns a Bag containing the values from map.
|
VALUESET |
This UDF takes a Map and returns a Tuple containing the value set.
|
WeeksBetween |
WeeksBetween returns the number of weeks between two DateTime objects
|
YearsBetween |
YearsBetween returns the number of years between two DateTime objects
|
Annotation Type | Description |
---|---|
MonitoredUDF |
Describes how the execution of a UDF should be monitored, and what
to do if it times out.
|
Nondeterministic |
A non-deterministic UDF is one that can produce different results when
invoked on the same input.
|
OutputSchema |
An EvalFunc can be annotated with an
OutputSchema to
tell Pig what the expected output is. |
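
The MonitoredUDF entry describes monitoring a UDF's execution and deciding what to do if it times out. The general idea of "run with a time limit, fall back to a default on timeout" can be sketched in plain Java as follows. This is an assumption-laden illustration built on java.util.concurrent, not Pig's monitoring code; the class TimeoutSketch, the method callWithTimeout, and its parameters are all hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration class; not part of Apache Pig.
final class TimeoutSketch {

    /**
     * Runs `task` on a worker thread and returns its result, or `fallback`
     * if the task does not finish within `timeoutMillis`.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow task
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return fallback;
        } catch (ExecutionException e) {
            // The task itself failed; surface its cause to the caller.
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Fast task: finishes well within the limit.
        System.out.println(callWithTimeout(() -> 42, 1000, -1));
        // Slow task: exceeds the limit, so the fallback value is returned.
        System.out.println(callWithTimeout(() -> { Thread.sleep(2000); return 1; }, 50, -1));
    }
}
```

Returning a default value is only one possible timeout policy; a monitor could equally log the event or rethrow, depending on what the caller needs.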
Copyright © 2007-2017 The Apache Software Foundation