public class LogFormatLoader
extends nl.basjes.pig.input.apachehttpdlog.Loader
    -- Specify any file here, as long as it exists.
    -- The loader will not read it when no fields are requested.
    Example = LOAD 'test.pig' USING org.apache.pig.piggybank.storage.apachelog.LogFormatLoader(
        '%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"'
    );
    DUMP Example;

The output of this command is a (huge) example, written in actual Pig code, that demonstrates how every possible field can be extracted. In normal use you trim this example down to request only the fields your application really needs. This loader implements pushdown projection, so there is no need to worry too much about the fields you leave in. The loader can also extract an individual cookie or query string parameter regardless of its position in the actual log line.

In addition to the logformat specification used in your custom config, this parser also understands the standard formats:

common, combined, combinedio, referer, agent

So this also works:

    Example = LOAD 'test.pig' USING org.apache.pig.piggybank.storage.apachelog.LogFormatLoader('common');
    DUMP Example;

This class is simply a wrapper around https://github.com/nielsbasjes/logparser, so more detailed documentation can be found there.
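To illustrate what extracting a cookie or query string parameter "regardless of position" means, here is a minimal standalone sketch in Java. It is not the loader's implementation and does not use the logparser API. The class `NamedValueLookup` and its methods are hypothetical names introduced only for this example, and the sketch skips URL decoding for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of LogFormatLoader or logparser.
final class NamedValueLookup {

    // Splits "a=1&b=2" (separator "&") or "a=1; b=2" (separator ";")
    // into a name -> value map. Values are kept as-is (no URL decoding).
    static Map<String, String> parse(String raw, String separator) {
        Map<String, String> values = new LinkedHashMap<>();
        if (raw == null || raw.isEmpty()) {
            return values;
        }
        for (String pair : raw.split(Pattern.quote(separator))) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            String name = eq < 0 ? trimmed : trimmed.substring(0, eq);
            String value = eq < 0 ? "" : trimmed.substring(eq + 1);
            values.put(name, value);
        }
        return values;
    }

    // Looks up one query string parameter by name; returns null when absent.
    static String queryParameter(String uri, String name) {
        int q = uri.indexOf('?');
        if (q < 0) {
            return null;
        }
        String query = uri.substring(q + 1);
        int hash = query.indexOf('#');
        if (hash >= 0) {
            query = query.substring(0, hash); // drop any fragment
        }
        return parse(query, "&").get(name);
    }

    // Looks up one cookie by name in a Cookie header value; returns null when absent.
    static String cookie(String cookieHeader, String name) {
        return parse(cookieHeader, ";").get(name);
    }

    public static void main(String[] args) {
        // The same parameter is found whether it comes first or last.
        System.out.println(queryParameter("/search?q=pig&page=2", "page")); // prints 2
        System.out.println(queryParameter("/search?page=2&q=pig", "page")); // prints 2
        System.out.println(cookie("theme=dark; session=abc123", "session")); // prints abc123
    }
}
```

Looking values up by name rather than by index is what makes the result independent of the order in which a client happened to send its parameters or cookies.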
Nested classes/interfaces inherited from interface LoadPushDown: LoadPushDown.OperatorSet, LoadPushDown.RequiredField, LoadPushDown.RequiredFieldList, LoadPushDown.RequiredFieldResponse
Constructor and Description |
---|
LogFormatLoader(String... parameters) |
Methods inherited from class nl.basjes.pig.input.apachehttpdlog.Loader: getAdditionalDissectors, getFeatures, getInputFormat, getLogformat, getNext, getPartitionKeys, getRequestedFields, getSchema, getStatistics, getTypeRemappings, prepareToRead, pushProjection, setLocation, setPartitionFilter, setUDFContextSignature
Methods inherited from class org.apache.pig.LoadFunc: getAbsolutePath, getCacheFiles, getLoadCaster, getPathStrings, getShipFiles, join, relativeToAbsolutePath, warn
public LogFormatLoader(String... parameters)
Copyright © 2007-2012 The Apache Software Foundation