Class XPath

  extended by org.apache.pig.EvalFunc<String>
      extended by org.apache.pig.piggybank.evaluation.xml.XPath

public class XPath
extends EvalFunc<String>

XPath is a Pig eval function (UDF) that extracts text from an XML document using an XPath expression.

Nested Class Summary
Nested classes/interfaces inherited from class org.apache.pig.EvalFunc
Field Summary
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
Constructor Summary
Method Summary
 String exec(Tuple input)
          Evaluates an XPath expression against an XML document. The input tuple should contain: 1) xml, 2) xpath, 3) an optional flag for caching the XML document. Usage: XPath(xml, xpath) or XPath(xml, xpath, false).
 List<FuncSpec> getArgToFuncMapping()
          Allow a UDF to specify type specific implementations of itself.
Methods inherited from class org.apache.pig.EvalFunc
finish, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, isAsynchronous, outputSchema, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public XPath()
Method Detail


public String exec(Tuple input)
            throws IOException
The input tuple should contain:
  1. xml
  2. xpath
  3. optional cache xml doc flag
Usage:
  1. XPath(xml, xpath)
  2. XPath(xml, xpath, false)

Specified by:
exec in class EvalFunc<String>
Parameters:
input - the 1st element should be the XML, the 2nd element should be the XPath expression, and the optional 3rd element is a boolean cache flag (default true). This UDF caches the last XML document, which helps when multiple consecutive XPath calls are made against the same XML document. Caching can be turned off to ensure that the UDF recreates the internal javax.xml.xpath.XPath for every call.
Returns:
chararray result, or null if there is no match
Throws:
IOException
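
The caching behavior described above can be sketched with the standard javax.xml APIs. This is a minimal illustration under stated assumptions, not the piggybank source: the class name CachedXPathSketch, the eval method, and the parseCount field are hypothetical and exist only to show when the XML is re-parsed. It also treats an empty XPath result as "no match", which may differ from the real UDF.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the behavior documented for exec(Tuple);
// it does not depend on Pig and is not the actual implementation.
public class CachedXPathSketch {
    private String lastXml;      // last XML string that was parsed with caching on
    private Document lastDoc;    // parsed form of lastXml
    public int parseCount = 0;   // how many times XML was actually parsed

    /** Returns the text selected by xpath, or null if nothing matches. */
    public String eval(String xml, String xpath, boolean cache) throws Exception {
        if (xml == null || xpath == null) {
            return null;
        }
        Document doc;
        if (cache && xml.equals(lastXml)) {
            // Same document as the previous call: reuse the parsed tree.
            doc = lastDoc;
        } else {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            parseCount++;
            if (cache) {
                lastXml = xml;
                lastDoc = doc;
            }
        }
        // evaluate(String, Object) returns "" when the expression selects nothing.
        String value = XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        return value.isEmpty() ? null : value;
    }
}
```

With caching on, two consecutive calls against the same XML string parse it once; passing false as the third argument forces a fresh parse on every call.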


public List<FuncSpec> getArgToFuncMapping()
                                   throws FrontendException
Description copied from class: EvalFunc
Allow a UDF to specify type-specific implementations of itself. For example, an implementation of arithmetic sum might have int and float implementations, since integer arithmetic performs much better than floating point arithmetic. Pig's typechecker will call this method and, using the returned list plus the schema of the function's input data, decide which implementation of the UDF to use.

Overrides:
getArgToFuncMapping in class EvalFunc<String>
Returns:
A List of FuncSpec objects, each representing an EvalFunc class that can handle inputs matching the schema held in that object. Each FuncSpec should be constructed with a schema that describes the input for that implementation. For example, the sum function above would return two elements in its list:
  1. FuncSpec(this.getClass().getName(), new Schema(new Schema.FieldSchema(null, DataType.DOUBLE)))
  2. FuncSpec(IntSum.class.getName(), new Schema(new Schema.FieldSchema(null, DataType.INTEGER)))
This would indicate that the main implementation is used for doubles, and the special implementation IntSum is used for ints.
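
The selection step described above can be pictured with a small stand-alone model. This is only a sketch: Spec and choose are hypothetical stand-ins for FuncSpec and for the typechecker's lookup, Java classes stand in for Pig schemas, and only exact type matches are handled, which is an assumption rather than a description of how Pig resolves types.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of type-specific UDF dispatch; no Pig classes are used.
public class FuncMappingSketch {

    // Stand-in for FuncSpec: an implementation name plus the input type it handles.
    public static final class Spec {
        public final String className;
        public final Class<?> inputType;

        public Spec(String className, Class<?> inputType) {
            this.className = className;
            this.inputType = inputType;
        }
    }

    // Mirrors the sum example: the main implementation handles doubles,
    // and the special implementation IntSum handles ints.
    public static List<Spec> getArgToFuncMapping() {
        return Arrays.asList(
                new Spec("Sum", Double.class),
                new Spec("IntSum", Integer.class));
    }

    // Returns the implementation registered for the given input type,
    // or null when no entry matches exactly.
    public static String choose(List<Spec> mapping, Class<?> actualInputType) {
        for (Spec spec : mapping) {
            if (spec.inputType.equals(actualInputType)) {
                return spec.className;
            }
        }
        return null;
    }
}
```

Here choose(getArgToFuncMapping(), Integer.class) selects IntSum, while a Double input falls through to the main Sum implementation.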

Copyright © 2007-2012 The Apache Software Foundation