Support for logging in using a Kerberos keytab file.
Kerberos is an authentication system that uses tickets with a limited validity time.
As a consequence, a Pig script running on a Kerberos-secured Hadoop cluster can run for at most
the remaining validity time of these Kerberos tickets. For really complex analytics this may become a
problem, as the job may need to run longer than the ticket lifetime allows.
A Kerberos keytab file is essentially a Kerberos-specific form of a user's password.
A Hadoop job can request new tickets when they expire if you create a keytab file and
make it part of the job that is running in the cluster.
This extends the maximum job duration beyond the maximum renewal time of the Kerberos tickets.
Usage:
- Create a keytab file for the required principal.
  Using the ktutil tool you can create a keytab by entering roughly these commands at its prompt:
    addent -password -p niels@EXAMPLE.NL -k 1 -e rc4-hmac
    addent -password -p niels@EXAMPLE.NL -k 1 -e aes256-cts
    wkt niels.keytab
- Set the following properties (either via the .pigrc file or on the command line via -P file):
  - java.security.krb5.conf
    The path to the local krb5.conf file.
    Usually this is "/etc/krb5.conf"
  - hadoop.security.krb5.principal
    The principal you want to log in with.
    Usually this looks like "niels@EXAMPLE.NL"
  - hadoop.security.krb5.keytab
    The path to the local keytab file to authenticate with.
    Usually this looks like "/home/niels/.krb/niels.keytab"
NOTE: All paths in these properties are local to the client system that starts the actual Pig script.
This can be run without any special access to the cluster nodes.
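
As an illustration, the three properties described above could be collected in a single properties file. This is only a sketch: the file name kerberos.properties is hypothetical, and the values are the example values used above, which you would replace with your own principal and paths.

```properties
# kerberos.properties (hypothetical file name, for illustration only)
# Values are written without the surrounding quotes used in the text above,
# because a properties file treats quote characters as part of the value.

# Path to the local krb5.conf file
java.security.krb5.conf=/etc/krb5.conf

# Principal to log in with (example value)
hadoop.security.krb5.principal=niels@EXAMPLE.NL

# Path to the local keytab file (example value)
hadoop.security.krb5.keytab=/home/niels/.krb/niels.keytab
```

Assuming a script named myscript.pig (also hypothetical), such a file would be passed on the command line as "pig -P kerberos.properties myscript.pig".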