Package org.apache.nutch.tools.arc
Class ArcInputFormat
- java.lang.Object
  - org.apache.hadoop.mapreduce.InputFormat<K,V>
    - org.apache.hadoop.mapreduce.lib.input.FileInputFormat<Text,BytesWritable>
      - org.apache.nutch.tools.arc.ArcInputFormat

public class ArcInputFormat extends FileInputFormat<Text,BytesWritable>

An input format that reads arc files.
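
As a rough illustration of how an input format of this type is usually plugged into a job, the sketch below configures a map-only job that reads arc files and writes each record's key with the byte length of its value. The driver class `ArcRecordSizes`, the mapper `SizeMapper`, the job name, and the two command-line paths are hypothetical names chosen for this example. What the key and value actually contain is not stated on this page, so the mapper treats them as opaque. Compiling and running it assumes Hadoop and Nutch are on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.nutch.tools.arc.ArcInputFormat;

// Hypothetical driver: class name, job name, and argument layout are illustrative only.
public class ArcRecordSizes {

  // Emits each record's key together with the number of valid bytes in its value.
  public static class SizeMapper
      extends Mapper<Text, BytesWritable, Text, IntWritable> {
    @Override
    protected void map(Text key, BytesWritable value, Context context)
        throws IOException, InterruptedException {
      // getLength() is the valid byte count; the backing array may be larger.
      context.write(key, new IntWritable(value.getLength()));
    }
  }

  public static void main(String[] args) throws Exception {
    // args[0]: directory or file with arc input, args[1]: output directory (assumed layout).
    Job job = Job.getInstance(new Configuration(), "arc-record-sizes");
    job.setJarByClass(ArcRecordSizes.class);

    // Use ArcInputFormat so that map() receives <Text, BytesWritable> pairs.
    job.setInputFormatClass(ArcInputFormat.class);
    job.setMapperClass(SizeMapper.class);
    job.setNumReduceTasks(0); // map-only job

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The mapper's input types have to match the `Text` and `BytesWritable` type parameters declared by `ArcInputFormat`. The output types and the map-only setup are choices made for this sketch.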

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat:
  FileInputFormat.Counter
 
Field Summary
- Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat:
  DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
 
Constructor Summary
- ArcInputFormat()

Method Summary
- RecordReader<Text,BytesWritable> createRecordReader(InputSplit split, TaskAttemptContext context)
- RecordReader<Text,BytesWritable> getRecordReader(InputSplit split, Job job, Mapper.Context context)
  Get the RecordReader for reading the arc file.
- Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat:
  addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, listStatus, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Method Detail

createRecordReader
public RecordReader<Text,BytesWritable> createRecordReader(InputSplit split, TaskAttemptContext context)
- Specified by:
  - createRecordReader in class InputFormat<Text,BytesWritable>
 
getRecordReader
public RecordReader<Text,BytesWritable> getRecordReader(InputSplit split, Job job, Mapper.Context context) throws IOException

Get the RecordReader for reading the arc file.
- Parameters:
  - split - The InputSplit of the arc file to process.
  - job - The job configuration.
  - context - The task context.
- Returns:
  - A configured ArcRecordReader
- Throws:
  - IOException - if there is a fatal I/O error reading the InputSplit