DateExtractor has four different constructors which each allow for different functionality. The
incomingDateFormat ("dd/MMM/yyyy:HH:mm:ss Z" by default) is used to match the date string that gets passed in from the
log. The outgoingDateFormat ("yyyy-MM-dd" by default) is used to format the returned string.
Different constructors exist for each combination; please use the appropriate respective constructor.
Note that any data that exists in the SimpleDateFormat schema can be supported. For example, if you were
starting with the default incoming format and wanted to extract just the year, you would use the single
string constructor DateExtractor("yyyy").
From pig latin you will need to use aliases to use a non-default format, like
define MyDateExtractor org.apache.pig.piggybank.evaluation.util.apachelogparser.DateExtractor("yyyy-MM");
A = FOREACH row GENERATE DateExtractor(dayTime);
If a string cannot be parsed, null will be returned and an error message printed to stderr.
By default, the DateExtractor uses the GMT timezone. You can use the three-parameter constructor to override the
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.pig.EvalFunc
This callback method must be implemented by all subclasses. This
is the method that will be invoked on every Tuple of a given dataset.
Since the dataset may be divided up in a variety of ways the programmer
should not make assumptions about state that is maintained between
invocations of this method.
Allow a UDF to specify type specific implementations of itself. For example,
an implementation of arithmetic sum might have int and float implementations,
since integer arithmetic performs much better than floating point arithmetic. Pig's
typechecker will call this method and using the returned list plus the schema
of the function's input data, decide which implementation of the UDF to use.
A List containing FuncSpec objects representing the EvalFunc class
which can handle the inputs corresponding to the schema in the objects. Each
FuncSpec should be constructed with a schema that describes the input for that
implementation. For example, the sum function above would return two elements in its
FuncSpec(this.getClass().getName(), new Schema(new Schema.FieldSchema(null, DataType.DOUBLE)))
FuncSpec(IntSum.getClass().getName(), new Schema(new Schema.FieldSchema(null, DataType.INTEGER)))
This would indicate that the main implementation is used for doubles, and the special
implementation IntSum is used for ints.