org.apache.pig.builtin
Class DIFF

java.lang.Object
  extended by org.apache.pig.EvalFunc<DataBag>
      extended by org.apache.pig.builtin.DIFF

public class DIFF
extends EvalFunc<DataBag>

DIFF takes two bags as arguments and compares them, returning every tuple that appears in one bag but not in the other. If the two fields are not bags, they are compared directly: both values are returned if they do not match, and an empty bag is returned if they match.

The implementation assumes that both bags passed to this function fit entirely in memory at the same time. If they do not, the UDF will still work, but it will be very slow.
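
The bag comparison described above amounts to a symmetric difference. The following is a minimal sketch of that idea in plain Java, not the Pig implementation: the class DiffBagSketch and its diff method are hypothetical, a tuple is modeled as a List<Object> and a bag as a list of tuples, and collapsing duplicate tuples through a set is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types: a "tuple" is a List<Object>, a "bag" is a List of tuples.
class DiffBagSketch {

    // Returns the tuples that occur in exactly one of the two bags.
    static List<List<Object>> diff(List<List<Object>> bag1, List<List<Object>> bag2) {
        // Both bags are copied into in-memory sets, which mirrors the
        // "fits entirely into memory" assumption stated in the text.
        // Assumption of this sketch: duplicate tuples collapse to one.
        Set<List<Object>> s1 = new LinkedHashSet<>(bag1);
        Set<List<Object>> s2 = new LinkedHashSet<>(bag2);

        List<List<Object>> out = new ArrayList<>();
        for (List<Object> t : s1) {
            if (!s2.contains(t)) {
                out.add(t); // only in the first bag
            }
        }
        for (List<Object> t : s2) {
            if (!s1.contains(t)) {
                out.add(t); // only in the second bag
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> b1 = Arrays.asList(
                Arrays.<Object>asList(1, "a"), Arrays.<Object>asList(2, "b"));
        List<List<Object>> b2 = Arrays.asList(
                Arrays.<Object>asList(2, "b"), Arrays.<Object>asList(3, "c"));
        System.out.println(diff(b1, b2)); // prints [[1, a], [3, c]]
    }
}
```

The tuple (2, b) appears in both bags and is dropped; the other two tuples each appear in only one bag and are returned.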


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.pig.EvalFunc
EvalFunc.SchemaType
 
Field Summary
 
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
 
Constructor Summary
DIFF()
           
 
Method Summary
 DataBag exec(Tuple input)
          Compares the two fields of a tuple.
 
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, isAsynchronous, outputSchema, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DIFF

public DIFF()
Method Detail

exec

public DataBag exec(Tuple input)
             throws IOException
Compares the two fields of the input tuple and emits any differences.

Specified by:
exec in class EvalFunc<DataBag>
Parameters:
input - a tuple with exactly two fields.
Returns:
a DataBag holding the differences between the two fields; the bag is empty if the fields match.
Throws:
IOException - if the input tuple does not contain exactly two fields
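
The contract of exec for non-bag fields can be sketched as follows. This is an illustration in plain Java, not the Pig source: DiffExecSketch is a hypothetical class, the input tuple is modeled as a List<Object>, and two details are assumptions of the sketch, namely that each non-matching value is returned wrapped in its own single-field tuple and that null fields are compared with Objects.equals.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for exec(Tuple): the input tuple is a List<Object>,
// and the result "bag" is a list of single-field tuples.
class DiffExecSketch {

    static List<List<Object>> exec(List<Object> input) throws IOException {
        // The documented contract: exactly two fields, otherwise IOException.
        if (input.size() != 2) {
            throw new IOException(
                    "expected exactly two fields but received " + input.size());
        }
        Object left = input.get(0);
        Object right = input.get(1);

        List<List<Object>> out = new ArrayList<>();
        // Non-bag case only: matching fields give an empty result,
        // non-matching fields are both returned (assumed: one tuple each).
        if (!Objects.equals(left, right)) {
            out.add(Collections.singletonList(left));
            out.add(Collections.singletonList(right));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(exec(Arrays.<Object>asList("x", "y"))); // prints [[x], [y]]
        System.out.println(exec(Arrays.<Object>asList(7, 7)));     // prints []
    }
}
```

Passing a tuple with one field or with three fields to this sketch raises the IOException described under Throws.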


Copyright © 2007-2012 The Apache Software Foundation