org.apache.pig.scripting
Class Pig

java.lang.Object
  extended by org.apache.pig.scripting.Pig

public class Pig
extends Object

The class used in scripts to interact with Pig.


Constructor Summary
protected Pig(String script, ScriptPigContext scriptContext, String name)
           
 
Method Summary
 BoundScript bind()
          Bind a Pig object to variables in the host language (optional operation).
 BoundScript bind(List<Map<String,Object>> vars)
          Bind this to multiple sets of variables.
 BoundScript bind(Map<String,Object> vars)
          Bind this to a set of variables.
static Pig compile(String pl)
          Define a Pig pipeline.
static Pig compile(String name, String pl)
          Define a named portion of a Pig pipeline.
static Pig compileFromFile(String filename)
          Define a Pig pipeline based on Pig Latin in a separate file.
static Pig compileFromFile(String name, String filename)
          Define a named Pig pipeline based on Pig Latin in a separate file.
static void define(String alias, String definition)
          Define an alias for a UDF or a streaming command.
static int fs(String cmd)
          Run a filesystem command.
static void registerJar(String jarfile)
          Register a jar for use in Pig.
static void registerUDF(String udffile, String namespace)
          Register scripting UDFs for use in Pig.
static void set(String var, String value)
          Set a variable for use in Pig Latin.
static int sql(String cmd)
          Run a SQL command.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Pig

protected Pig(String script,
              ScriptPigContext scriptContext,
              String name)
Method Detail

fs

public static int fs(String cmd)
              throws IOException
Run a filesystem command. Any output from this command is written to stdout or stderr as appropriate.

Parameters:
cmd - Filesystem command to run along with its arguments as one string.
Throws:
IOException
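
A minimal sketch of calling fs from a Python (Jython) host script. Because cmd is a single string, the command and its arguments are joined before the call. The helper name fs_command, the rmr command, and the path output/wordcount are assumptions for illustration; the import guard is only there so the snippet also loads outside Pig's script runner.

```python
def fs_command(*parts):
    """Join a filesystem command and its arguments into one string."""
    return " ".join(str(p) for p in parts)

# Hypothetical example: remove an old output directory before a run.
cmd = fs_command("rmr", "output/wordcount")

try:
    # Only importable when the script is executed by Pig's script runner.
    from org.apache.pig.scripting import Pig
except ImportError:
    Pig = None

if Pig is not None:
    # Command output goes to stdout or stderr; fs returns an int.
    status = Pig.fs(cmd)
```

Building the string in one place makes it easy to log or inspect the exact command before it is handed to Pig.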

sql

public static int sql(String cmd)
               throws IOException
Run a SQL command. Any output from this command is written to stdout or stderr as appropriate.

Parameters:
cmd - SQL command to run, along with its arguments, as one string. Currently, only hcat is supported as a SQL backend.
Throws:
IOException

registerJar

public static void registerJar(String jarfile)
                        throws IOException
Register a jar for use in Pig. Once this is done, the jar is registered for all subsequent Pig pipelines in this script. To register it for only a single Pig pipeline, use a register statement within that pipeline's definition.

Parameters:
jarfile - Path of jar to include.
Throws:
IOException - if the indicated jarfile cannot be found.

registerUDF

public static void registerUDF(String udffile,
                               String namespace)
                        throws IOException
Register scripting UDFs for use in Pig. Once this is done, all UDFs defined in the file are available to all subsequent Pig pipelines in this script. To register UDFs for only a single Pig pipeline, use a register statement within that pipeline's definition.

Parameters:
udffile - Path of the script UDF file
namespace - namespace of the UDFs
Throws:
IOException

define

public static void define(String alias,
                          String definition)
                   throws IOException
Define an alias for a UDF or a streaming command. This definition is then present for all subsequent Pig pipelines defined in this script. To define it for only a single Pig pipeline, use a define statement within that pipeline's definition.

Parameters:
alias - name of the defined alias
definition - string this alias is defined as
Throws:
IOException

set

public static void set(String var,
                       String value)
                throws IOException
Set a variable for use in Pig Latin. This setting is then in effect for all subsequent Pig pipelines defined in this script. To set it for only a single Pig pipeline, use a set statement within that pipeline's definition.

Parameters:
var - variable to set
value - value to set the variable to
Throws:
IOException
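
The script-wide versus per-pipeline distinction shared by registerJar, registerUDF, define, and set can be sketched as follows in a Python (Jython) host script. The property name, its value, and the Pig Latin text are assumptions chosen for illustration; whether a particular key is honored is not stated on this page.

```python
# Per-pipeline form: the set statement is part of the Pig Latin text,
# so it applies only to the pipeline compiled from this string.
local_pl = (
    "set mapred.compress.map.output true;\n"
    "A = load '$input';\n"
    "store A into '$output';\n"
)

# Script-wide form: the same key and value as arguments for Pig.set,
# which affects pipelines defined after the call.
script_wide = ("mapred.compress.map.output", "true")

try:
    # Only importable when the script is executed by Pig's script runner.
    from org.apache.pig.scripting import Pig
except ImportError:
    Pig = None

if Pig is not None:
    Pig.set(script_wide[0], script_wide[1])
    # Compiled after the set call, so the script-wide value applies to it.
    later = Pig.compile("A = load '$input'; store A into '$output';")
```

In practice a script would pick one of the two forms per setting; both appear here only for comparison.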

compile

public static Pig compile(String pl)
                   throws IOException
Define a Pig pipeline.

Parameters:
pl - Pig Latin definition of the pipeline.
Returns:
Pig object representing this pipeline.
Throws:
IOException - if the Pig Latin does not compile.

compile

public static Pig compile(String name,
                          String pl)
                   throws IOException
Define a named portion of a Pig pipeline. This allows it to be imported into another pipeline.

Parameters:
name - Name that will be used to define this pipeline. The namespace is global.
pl - Pig Latin definition of the pipeline.
Returns:
Pig object representing this pipeline.
Throws:
IOException - if the Pig Latin does not compile.

compileFromFile

public static Pig compileFromFile(String filename)
                           throws IOException
Define a Pig pipeline based on Pig Latin in a separate file.

Parameters:
filename - File to read Pig Latin from. This must be a pure Pig Latin file; it cannot contain host language constructs.
Returns:
Pig object representing this pipeline.
Throws:
IOException - if the Pig Latin does not compile or the file cannot be found.
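
A small sketch of the file-based form from a Python (Jython) host script. The file name copy.pig and the pipeline text are hypothetical; a temporary directory is used only to keep the example self-contained.

```python
import os
import tempfile

# Pure Pig Latin only: no host-language code may appear in this file.
pig_latin = "A = load '$input';\nstore A into '$output';\n"

# Hypothetical file name inside a throwaway directory.
workdir = tempfile.mkdtemp()
filename = os.path.join(workdir, "copy.pig")
with open(filename, "w") as f:
    f.write(pig_latin)

try:
    # Only importable when the script is executed by Pig's script runner.
    from org.apache.pig.scripting import Pig
except ImportError:
    Pig = None

if Pig is not None:
    # Raises IOException if the file is missing or does not compile.
    p = Pig.compileFromFile(filename)
```

Keeping the Pig Latin in its own file lets the same pipeline be run directly by Pig or compiled from a host script.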

compileFromFile

public static Pig compileFromFile(String name,
                                  String filename)
                           throws IOException
Define a named Pig pipeline based on Pig Latin in a separate file. This allows it to be imported into another pipeline.

Parameters:
name - Name that will be used to define this pipeline. The namespace is global.
filename - File to read Pig Latin from. This must be a pure Pig Latin file; it cannot contain host language constructs.
Returns:
Pig object representing this pipeline.
Throws:
IOException - if the Pig Latin does not compile or the file cannot be found.

bind

public BoundScript bind(Map<String,Object> vars)
                 throws IOException
Bind this to a set of variables. Values must be provided for all Pig Latin parameters.

Parameters:
vars - map of variables to bind. Keys should be parameters defined in the Pig Latin. Values should be strings that provide values for those parameters. They can be either constants or variables from the host language. Host language variables must contain strings.
Returns:
a BoundScript object
Throws:
IOException - if there is not a key for each Pig Latin parameter or if they contain unsupported types.
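
Since bind fails unless every Pig Latin parameter has a string value, a host script may want to check the map first. The following Python sketch does that with a simplified regular expression; the helper missing_params, the paths, and the pattern itself are assumptions, and Pig's own parameter handling may differ (for example around escaped dollar signs or comments).

```python
import re

pl = "A = load '$input'; store A into '$output';"

def missing_params(pig_latin, variables):
    """Return the $name parameters that have no string value in variables."""
    names = set(re.findall(r"\$([A-Za-z_][A-Za-z0-9_]*)", pig_latin))
    return sorted(n for n in names if not isinstance(variables.get(n), str))

# Hypothetical paths; keys match the $input and $output parameters.
params = {"input": "data/in.txt", "output": "data/out"}
unbound = missing_params(pl, params)

try:
    # Only importable when the script is executed by Pig's script runner.
    from org.apache.pig.scripting import Pig
except ImportError:
    Pig = None

if Pig is not None and not unbound:
    # Returns a BoundScript for this one set of variables.
    bound = Pig.compile(pl).bind(params)
```

Checking up front turns a late IOException into a list of parameter names that still need values.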

bind

public BoundScript bind(List<Map<String,Object>> vars)
                 throws IOException
Bind this to multiple sets of variables. This will cause the Pig Latin script to be executed in parallel over these sets of variables.

Parameters:
vars - list of maps of variables to bind. Keys should be parameters defined in the Pig Latin. Values should be strings that provide values for those parameters. They can be either constants or variables from the host language. Host language variables must contain strings.
Returns:
a BoundScript object
Throws:
IOException - if there is not a key for each Pig Latin parameter or if they contain unsupported types.
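
As an illustration of the list form, the Python sketch below builds one map per run of the same pipeline. The directory layout and dates are hypothetical.

```python
pl = "A = load '$input'; store A into '$output';"

# Hypothetical inputs: one parameter map per day of logs.
days = ["2012-01-01", "2012-01-02", "2012-01-03"]
param_sets = [
    {"input": "logs/" + day, "output": "out/" + day}
    for day in days
]

try:
    # Only importable when the script is executed by Pig's script runner.
    from org.apache.pig.scripting import Pig
except ImportError:
    Pig = None

if Pig is not None:
    # A single BoundScript that covers all three sets of variables.
    bound = Pig.compile(pl).bind(param_sets)
```

Every map must supply all parameters; a map that omits one triggers the IOException described above.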

bind

public BoundScript bind()
                 throws IOException
Bind a Pig object to variables in the host language (optional operation). This implicitly maps variables in the host language to parameters in Pig Latin. For example, if the user compiles a pipeline with p = Pig.compile("A = load '$input';"); and then calls this method, it looks for a variable called input in the host language. The scoping rules of the host language determine which variable is bound. The bound variable must contain a string value. This method is optional because not all host languages support searching for in-scope variables.

Throws:
IOException - if host language variables are not found to resolve all Pig Latin parameters or if they contain unsupported types.
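
The implicit mapping can be pictured with a small Python emulation. This is not Pig's implementation: resolve_from_scope, the regular expression, and the paths are assumptions, and the lookup here uses only the caller's local variables.

```python
import re

pl = "A = load '$input'; store A into '$output';"

def resolve_from_scope(pig_latin, scope):
    """Emulate implicit binding: map each $name to a string found in scope."""
    resolved = {}
    for name in re.findall(r"\$([A-Za-z_][A-Za-z0-9_]*)", pig_latin):
        if not isinstance(scope.get(name), str):
            raise ValueError("no string variable named %r in scope" % name)
        resolved[name] = scope[name]
    return resolved

try:
    # Only importable when the script is executed by Pig's script runner.
    from org.apache.pig.scripting import Pig
except ImportError:
    Pig = None

def run():
    input = "data/in.txt"   # matches $input (shadows the builtin locally)
    output = "data/out"     # matches $output
    explicit = resolve_from_scope(pl, locals())
    if Pig is not None:
        # With no arguments, Pig looks up input and output itself.
        Pig.compile(pl).bind()
    return explicit
```

Calling run() outside Pig returns the map that an explicit bind(Map) call would have needed, which shows what the no-argument form saves the caller from writing.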


Copyright © 2007-2012 The Apache Software Foundation