PigGenericMapBase (Pig 0.18.0 API)

java.lang.Object
- org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>
- - org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase

Direct Known Subclasses:

PigMapBase
```
public abstract class PigGenericMapBase
extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>
```
This class is the base class for PigMapBase, whose implementation differs slightly among versions of Hadoop. The PigMapBase implementation is located in $PIG_HOME/shims.

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
  org.apache.hadoop.mapreduce.Mapper.Context

Field Summary

Fields
Modifier and Type	Field and Description
`protected boolean`	`errorInMap`
`protected byte`	`keyType`
`protected PhysicalPlan`	`mp`
`protected java.util.List<POStore>`	`stores`
`protected TupleFactory`	`tf`

Constructor Summary

Constructors
Constructor and Description
`PigGenericMapBase()`

Method Summary

Modifier and Type	Method and Description
`void`	`cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)` Will be called when all the tuples in the input are done.
`abstract void`	`collect(org.apache.hadoop.mapreduce.Mapper.Context oc, Tuple tuple)`
`abstract org.apache.hadoop.mapreduce.Mapper.Context`	`getIllustratorContext(org.apache.hadoop.conf.Configuration conf, DataBag input, java.util.List<Pair<PigNullableWritable,org.apache.hadoop.io.Writable>> output, org.apache.hadoop.mapreduce.InputSplit split)`
`byte`	`getKeyType()`
`abstract boolean`	`inIllustrator(org.apache.hadoop.mapreduce.Mapper.Context context)`
`protected void`	`map(org.apache.hadoop.io.Text key, Tuple inpTuple, org.apache.hadoop.mapreduce.Mapper.Context context)` The map function that attaches the inpTuple appropriately and executes the map plan if it is not empty.
`protected void`	`runPipeline(PhysicalOperator leaf)`
`void`	`setKeyType(byte keyType)`
`void`	`setMapPlan(PhysicalPlan plan)` Sets the map plan; used for local map/reduce simulation.
`void`	`setup(org.apache.hadoop.mapreduce.Mapper.Context context)` Configures the mapper with the map plan and the reporter thread.

Methods inherited from class org.apache.hadoop.mapreduce.Mapper
run

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

keyType
```
protected byte keyType
```

mp
```
protected PhysicalPlan mp
```

stores

```
protected java.util.List<POStore> stores
```

tf
```
protected TupleFactory tf
```

errorInMap
```
protected boolean errorInMap
```

Constructor Detail
PigGenericMapBase
```
public PigGenericMapBase()
```

Method Detail

setMapPlan
```
public void setMapPlan(PhysicalPlan plan)
```
Sets the map plan; used for local map/reduce simulation.

Parameters:

plan - the map plan

cleanup
```
public void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
             throws java.io.IOException,
                    java.lang.InterruptedException
```
Called once all the tuples in the input have been processed, at which point the reporter thread should be closed.

Overrides:

cleanup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>

Throws:

java.io.IOException

java.lang.InterruptedException

setup
```
public void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
           throws java.io.IOException,
                  java.lang.InterruptedException
```
Configures the mapper with the map plan and the reporter thread.

Overrides:

setup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>

Throws:

java.io.IOException

java.lang.InterruptedException
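
The following is a minimal sketch of the callback order that the setup and cleanup descriptions imply: setup runs once before any input, map runs once per record, and cleanup runs once after the input is exhausted. It does not use Pig or Hadoop classes. `LifecycleSketch`, its `events` list, and the `reporterRunning` flag (a stand-in for the reporter thread) are all hypothetical names introduced for illustration, and the `run` loop is assumed to mirror how the inherited `run` method drives these callbacks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a mapper's lifecycle; not part of Pig or Hadoop.
class LifecycleSketch {
    final List<String> events = new ArrayList<>();
    // Assumption: a boolean flag stands in for the reporter thread.
    boolean reporterRunning = false;

    void setup() {
        reporterRunning = true;   // configured once, before any input
        events.add("setup");
    }

    void map(String record) {
        events.add("map:" + record);
    }

    void cleanup() {
        reporterRunning = false;  // closed once all input is done
        events.add("cleanup");
    }

    // Assumed driver: setup, then map per record, then cleanup.
    void run(List<String> records) {
        setup();
        try {
            for (String record : records) {
                map(record);
            }
        } finally {
            cleanup();
        }
    }

    public static void main(String[] args) {
        LifecycleSketch mapper = new LifecycleSketch();
        mapper.run(Arrays.asList("a", "b"));
        System.out.println(mapper.events); // [setup, map:a, map:b, cleanup]
    }
}
```

Placing cleanup in a `finally` block is a design choice of this sketch: it keeps the stand-in reporter from outliving a failed map call.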

map
```
protected void map(org.apache.hadoop.io.Text key,
                   Tuple inpTuple,
                   org.apache.hadoop.mapreduce.Mapper.Context context)
            throws java.io.IOException,
                   java.lang.InterruptedException
```
The map function attaches the inpTuple appropriately and executes the map plan if it is not empty. It collects the result of execution into the output context (named oc in collect), or collects the input directly if the map plan is empty. Collection is left abstract for the map-only or map-reduce job to implement: map-only collects the tuple as-is, whereas map-reduce collects it after extracting the key and indexed tuple.

Overrides:

map in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>

Throws:

java.io.IOException

java.lang.InterruptedException
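
The division of labor between map and collect described above can be sketched as a template method. This is a simplified illustration, not Pig code: `MapSketch`, `MapOnlySketch`, and `MapReduceSketch` are hypothetical names, a tuple is modeled as a `List<Object>`, the map plan as a list of transformation steps, and the map-reduce variant assumes the key is simply the first field of the tuple.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the base mapper; not part of Pig.
abstract class MapSketch {
    // Assumption: the map plan is modeled as a list of tuple transformations.
    protected final List<UnaryOperator<List<Object>>> plan = new ArrayList<>();
    protected final List<Object> output = new ArrayList<>();

    void map(List<Object> inpTuple) {
        if (plan.isEmpty()) {
            collect(inpTuple);          // empty plan: collect the input directly
            return;
        }
        List<Object> result = inpTuple;
        for (UnaryOperator<List<Object>> step : plan) {
            result = step.apply(result); // run the plan over the attached input
        }
        collect(result);
    }

    // Left abstract so each job type decides how results are emitted.
    abstract void collect(List<Object> tuple);
}

// Map-only: the tuple is collected as-is.
class MapOnlySketch extends MapSketch {
    @Override
    void collect(List<Object> tuple) {
        output.add(tuple);
    }
}

// Map-reduce: the key is extracted before collecting.
// Assumption: field 0 is the key and the remaining fields are the value.
class MapReduceSketch extends MapSketch {
    @Override
    void collect(List<Object> tuple) {
        Object key = tuple.get(0);
        List<Object> value = new ArrayList<>(tuple.subList(1, tuple.size()));
        output.add(new AbstractMap.SimpleEntry<>(key, value));
    }
}

class MapSketchDemo {
    public static void main(String[] args) {
        MapSketch mapOnly = new MapOnlySketch();
        mapOnly.map(Arrays.asList("k", 1, 2));
        System.out.println(mapOnly.output);   // [[k, 1, 2]]

        MapSketch mapReduce = new MapReduceSketch();
        mapReduce.map(Arrays.asList("k", 1, 2));
        System.out.println(mapReduce.output); // [k=[1, 2]]
    }
}
```

Because map never inspects how results are emitted, the same plan-execution logic serves both job types; only collect changes.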

runPipeline

```
protected void runPipeline(PhysicalOperator leaf)
                    throws java.io.IOException,
                           java.lang.InterruptedException
```
Throws:

java.io.IOException

java.lang.InterruptedException

collect

```
public abstract void collect(org.apache.hadoop.mapreduce.Mapper.Context oc,
                             Tuple tuple)
                      throws java.lang.InterruptedException,
                             java.io.IOException
```
Throws:

java.lang.InterruptedException

java.io.IOException

inIllustrator

```
public abstract boolean inIllustrator(org.apache.hadoop.mapreduce.Mapper.Context context)
```

getKeyType
```
public byte getKeyType()
```
Returns:

the keyType

setKeyType
```
public void setKeyType(byte keyType)
```
Parameters:

keyType - the keyType to set

getIllustratorContext

```
public abstract org.apache.hadoop.mapreduce.Mapper.Context getIllustratorContext(
        org.apache.hadoop.conf.Configuration conf,
        DataBag input,
        java.util.List<Pair<PigNullableWritable,org.apache.hadoop.io.Writable>> output,
        org.apache.hadoop.mapreduce.InputSplit split)
    throws java.io.IOException,
           java.lang.InterruptedException
```
Throws:

java.io.IOException

java.lang.InterruptedException
