Class Stitch

  extended by org.apache.pig.EvalFunc<DataBag>
      extended by org.apache.pig.piggybank.evaluation.Stitch

public class Stitch
extends EvalFunc<DataBag>

Given a set of bags, stitches them together tuple by tuple. That is, each bag is treated as if its tuples carried row numbers, and the bags are joined on row number. So given two bags

{(1, 2), (3, 4)} and

{(5, 6), (7, 8)} the result will be

{(1, 2, 5, 6), (3, 4, 7, 8)}

In general it is assumed that each bag has the same number of tuples. The implementation uses the first bag to determine the number of tuples placed in the output. If a bag after the first has fewer tuples, the output tuples for the rows it lacks will have correspondingly fewer fields. Nulls will not be filled in.

Any number of bags can be passed to this function.
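
The row-by-row behaviour described above can be illustrated with a minimal sketch in plain Java. This is not the Piggybank implementation: the class name StitchSketch and the method stitch are hypothetical, and bags and tuples are modeled as nested java.util.List values rather than Pig's DataBag and Tuple types. It only mirrors the rules stated in the description, under those assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Pig or Piggybank.
public class StitchSketch {

    /**
     * Stitches bags together tuple by tuple. A bag is modeled as a list of
     * tuples and a tuple as a list of fields (a simplification of Pig's
     * DataBag and Tuple).
     */
    static List<List<Object>> stitch(List<List<List<Object>>> bags) {
        List<List<Object>> out = new ArrayList<>();
        if (bags.isEmpty()) {
            return out;
        }
        // The first bag alone decides how many output tuples there are.
        int rows = bags.get(0).size();
        for (int row = 0; row < rows; row++) {
            List<Object> stitched = new ArrayList<>();
            for (List<List<Object>> bag : bags) {
                // A bag that has run out of tuples contributes nothing;
                // no nulls are appended in its place.
                if (row < bag.size()) {
                    stitched.addAll(bag.get(row));
                }
            }
            out.add(stitched);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> a = Arrays.asList(
                Arrays.<Object>asList(1, 2), Arrays.<Object>asList(3, 4));
        List<List<Object>> b = Arrays.asList(
                Arrays.<Object>asList(5, 6), Arrays.<Object>asList(7, 8));
        // prints [[1, 2, 5, 6], [3, 4, 7, 8]]
        System.out.println(stitch(Arrays.asList(a, b)));
    }
}
```

In this sketch, stitching {(1, 2), (3, 4)} with a shorter bag {(9)} gives {(1, 2, 9), (3, 4)}: the second output tuple simply has fewer fields, matching the "nulls will not be filled in" rule.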

Nested Class Summary
Nested classes/interfaces inherited from class org.apache.pig.EvalFunc
Field Summary
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
Constructor Summary
Stitch()
Method Summary
 DataBag exec(Tuple input)
          This callback method must be implemented by all subclasses.
 Schema outputSchema(Schema inputSch)
          Report the schema of the output of this UDF.
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public Stitch()
Method Detail


public DataBag exec(Tuple input)
             throws IOException
Description copied from class: EvalFunc
This callback method must be implemented by all subclasses. This is the method that will be invoked on every Tuple of a given dataset. Since the dataset may be divided up in a variety of ways the programmer should not make assumptions about state that is maintained between invocations of this method.

Specified by:
exec in class EvalFunc<DataBag>
Parameters:
input - the Tuple to be processed.
Returns:
result, of type T.


public Schema outputSchema(Schema inputSch)
Description copied from class: EvalFunc
Report the schema of the output of this UDF. Pig will make use of this in error checking, optimization, and planning. The schema of input data to this UDF is provided.

The default implementation interprets the OutputSchema annotation, if one is present. Otherwise, it returns null (no known output schema).

Overrides:
outputSchema in class EvalFunc<DataBag>
Parameters:
inputSch - Schema of the input
Returns:
Schema of the output

Copyright © 2007-2012 The Apache Software Foundation