org.apache.pig.impl.streaming
Class StreamingCommand

java.lang.Object
  extended by org.apache.pig.impl.streaming.StreamingCommand
All Implemented Interfaces:
Serializable, Cloneable

public class StreamingCommand
extends Object
implements Serializable, Cloneable

StreamingCommand represents the specification of an external command to be executed in a Pig Query. StreamingCommand encapsulates all relevant details of the command specified by the user either directly via the STREAM operator or indirectly via a DEFINE operator. It includes details such as input/output/error specifications and also files to be shipped to the cluster and files to be cached.

See Also:
Serialized Form

Nested Class Summary
static class StreamingCommand.Handle
          Handle to communicate with the external process.
static class StreamingCommand.HandleSpec
          Specification about the usage of the StreamingCommand.Handle to communicate with the external process.
 
Field Summary
static int MAX_TASKS
           
 
Constructor Summary
StreamingCommand(PigContext pigContext, String[] argv)
          Create a new StreamingCommand with the given command.
 
Method Summary
 void addHandleSpec(StreamingCommand.Handle handle, StreamingCommand.HandleSpec handleSpec)
          Attach a StreamingCommand.HandleSpec to a given StreamingCommand.Handle
 void addPathToCache(String path)
          Add a file to be cached on execute nodes on the cluster.
 void addPathToShip(String path)
          Add a file to be shipped to the cluster.
 Object clone()
           
 List<String> getCacheSpecs()
          Get the list of files which need to be cached on the execute nodes.
 String[] getCommandArgs()
          Get the parsed command arguments.
 String getExecutable()
          Get the command to be executed.
 List<StreamingCommand.HandleSpec> getHandleSpecs(StreamingCommand.Handle handle)
          Get specifications for the given Handle.
 StreamingCommand.HandleSpec getInputSpec()
          Get the input specification of the StreamingCommand.
 String getLogDir()
          Get the directory where the log-files of the command are persisted.
 int getLogFilesLimit()
          Get the maximum number of tasks whose stderr logs files are persisted.
 StreamingCommand.HandleSpec getOutputSpec()
          Get the specification of the primary output of the StreamingCommand.
 boolean getPersistStderr()
          Should the stderr of the managed process be persisted?
 boolean getShipFiles()
          Get whether files for this command should be shipped or not.
 List<String> getShipSpecs()
          Get the list of files which need to be shipped to the cluster.
 void setCommandArgs(String[] argv)
          Set the command line arguments for the StreamingCommand.
 void setExecutable(String executable)
          Set the executable for the StreamingCommand.
 void setInputSpec(StreamingCommand.HandleSpec spec)
          Set the input specification for the StreamingCommand.
 void setLogDir(String logDir)
          Set the directory where the log-files of the command are persisted.
 void setLogFilesLimit(int logFilesLimit)
          Set the maximum number of tasks whose stderr logs files are persisted.
 void setOutputSpec(StreamingCommand.HandleSpec spec)
          Set the specification for the primary output of the StreamingCommand.
 void setPersistStderr(boolean persistStderr)
          Specify if the stderr of the managed process should be persisted.
 void setShipFiles(boolean shipFiles)
          Set whether files should be shipped or not.
 String toString()
           
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

MAX_TASKS

public static final int MAX_TASKS
See Also:
Constant Field Values
Constructor Detail

StreamingCommand

public StreamingCommand(PigContext pigContext,
                        String[] argv)
Create a new StreamingCommand with the given command.

Parameters:
pigContext - PigContext structure
argv - parsed arguments of the command
Method Detail

getExecutable

public String getExecutable()
Get the command to be executed.

Returns:
the command to be executed

setExecutable

public void setExecutable(String executable)
Set the executable for the StreamingCommand.

Parameters:
executable - the executable for the StreamingCommand

setCommandArgs

public void setCommandArgs(String[] argv)
Set the command line arguments for the StreamingCommand.

Parameters:
argv - the command line arguments for the StreamingCommand

getCommandArgs

public String[] getCommandArgs()
Get the parsed command arguments.

Returns:
the parsed command arguments as String[]

getShipSpecs

public List<String> getShipSpecs()
Get the list of files which need to be shipped to the cluster.

Returns:
the list of files which need to be shipped to the cluster

getCacheSpecs

public List<String> getCacheSpecs()
Get the list of files which need to be cached on the execute nodes.

Returns:
the list of files which need to be cached on the execute nodes

addPathToShip

public void addPathToShip(String path)
                   throws IOException
Add a file to be shipped to the cluster. Users can use this to distribute executables and other necessary files to the clusters.

Parameters:
path - path of the file to be shipped to the cluster
Throws:
IOException

addPathToCache

public void addPathToCache(String path)
                    throws IOException
Add a file to be cached on execute nodes on the cluster. The file is assumed to be available at the shared filesystem.

Parameters:
path - path of the file to be cached on the execute nodes
Throws:
IOException

addHandleSpec

public void addHandleSpec(StreamingCommand.Handle handle,
                          StreamingCommand.HandleSpec handleSpec)
Attach a StreamingCommand.HandleSpec to a given StreamingCommand.Handle

Parameters:
handle - Handle to which the specification is to be attached.
handleSpec - HandleSpec for the given handle.

setInputSpec

public void setInputSpec(StreamingCommand.HandleSpec spec)
Set the input specification for the StreamingCommand.

Parameters:
spec - input specification

getInputSpec

public StreamingCommand.HandleSpec getInputSpec()
Get the input specification of the StreamingCommand.

Returns:
input specification of the StreamingCommand

setOutputSpec

public void setOutputSpec(StreamingCommand.HandleSpec spec)
Set the specification for the primary output of the StreamingCommand.

Parameters:
spec - specification for the primary output of the StreamingCommand

getOutputSpec

public StreamingCommand.HandleSpec getOutputSpec()
Get the specification of the primary output of the StreamingCommand.

Returns:
specification of the primary output of the StreamingCommand

getHandleSpecs

public List<StreamingCommand.HandleSpec> getHandleSpecs(StreamingCommand.Handle handle)
Get specifications for the given Handle.

Parameters:
handle - Handle of the stream
Returns:
specification for the given Handle

getPersistStderr

public boolean getPersistStderr()
Should the stderr of the managed process be persisted?

Returns:
true if the stderr of the managed process should be persisted, false otherwise.

setPersistStderr

public void setPersistStderr(boolean persistStderr)
Specify if the stderr of the managed process should be persisted.

Parameters:
persistStderr - true if the stderr of the managed process should be persisted, else false

getLogDir

public String getLogDir()
Get the directory where the log-files of the command are persisted.

Returns:
the directory where the log-files of the command are persisted

setLogDir

public void setLogDir(String logDir)
Set the directory where the log-files of the command are persisted.

Parameters:
logDir - the directory where the log-files of the command are persisted

getLogFilesLimit

public int getLogFilesLimit()
Get the maximum number of tasks whose stderr logs files are persisted.

Returns:
the maximum number of tasks whose stderr logs files are persisted

setLogFilesLimit

public void setLogFilesLimit(int logFilesLimit)
Set the maximum number of tasks whose stderr logs files are persisted.

Parameters:
logFilesLimit - the maximum number of tasks whose stderr logs files are persisted

setShipFiles

public void setShipFiles(boolean shipFiles)
Set whether files should be shipped or not.

Parameters:
shipFiles - true if files of this command should be shipped, false otherwise

getShipFiles

public boolean getShipFiles()
Get whether files for this command should be shipped or not.

Returns:
true if files of this command should be shipped, false otherwise

toString

public String toString()
Overrides:
toString in class Object

clone

public Object clone()
Overrides:
clone in class Object


Copyright © 2007-2012 The Apache Software Foundation