Package org.apache.nutch.fetcher
Class Fetcher.InputFormat
- java.lang.Object
  - org.apache.hadoop.mapreduce.InputFormat<K,V>
    - org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>
      - org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat<Text,CrawlDatum>
        - org.apache.nutch.fetcher.Fetcher.InputFormat

- Enclosing class:
- Fetcher
 
 public static class Fetcher.InputFormat extends SequenceFileInputFormat<Text,CrawlDatum> 
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat: FileInputFormat.Counter
 
Field Summary
Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat: DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
 
Constructor Summary
- Constructor: InputFormat()

Method Summary
- Modifier and Type: List<InputSplit>
- Method: getSplits(JobContext job)
- Description: Don't split inputs to keep things polite - a single fetch list must be processed in one fetcher task.
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat: createRecordReader, getFormatMinSplitSize, listStatus

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat: addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, isSplitable, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Method Detail
getSplits
public List<InputSplit> getSplits(JobContext job) throws IOException
Don't split inputs to keep things polite - a single fetch list must be processed in one fetcher task. Do not split a fetch list or assign its splits to multiple parallel tasks.
- Overrides:
- getSplits in class FileInputFormat<Text,CrawlDatum>
- Throws:
- IOException
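
The contrast between size-based splitting and the "one fetch list, one task" rule described for getSplits can be sketched without any Hadoop types. The following is a minimal illustration and not the Nutch source: the class WholeFileSplitSketch, its nested Split type, the method names, and the file names and sizes are all hypothetical, and sizeBasedSplits is a deliberately simplified stand-in for default split planning.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, Hadoop-free model of split planning; not the Nutch source. */
final class WholeFileSplitSketch {

  /** A byte range of one input file (a stand-in for a Hadoop file split). */
  static final class Split {
    final String path;
    final long start;
    final long length;

    Split(String path, long start, long length) {
      this.path = path;
      this.start = start;
      this.length = length;
    }

    @Override
    public String toString() {
      return path + "[" + start + "+" + length + "]";
    }
  }

  /** Simplified size-based planning: cut each file into chunks of at most splitSize bytes. */
  static List<Split> sizeBasedSplits(Map<String, Long> files, long splitSize) {
    if (splitSize <= 0) {
      throw new IllegalArgumentException("splitSize must be positive");
    }
    List<Split> splits = new ArrayList<>();
    for (Map.Entry<String, Long> file : files.entrySet()) {
      long len = file.getValue();
      for (long offset = 0; offset < len; offset += splitSize) {
        splits.add(new Split(file.getKey(), offset, Math.min(splitSize, len - offset)));
      }
    }
    return splits;
  }

  /** Polite planning: exactly one split per file, covering the file from byte 0 to its end. */
  static List<Split> wholeFileSplits(Map<String, Long> files) {
    List<Split> splits = new ArrayList<>();
    for (Map.Entry<String, Long> file : files.entrySet()) {
      splits.add(new Split(file.getKey(), 0L, file.getValue()));
    }
    return splits;
  }

  public static void main(String[] args) {
    // Assumed fetch list files and sizes in bytes, for illustration only.
    Map<String, Long> fetchLists = new LinkedHashMap<>();
    fetchLists.put("part-r-00000", 300L);
    fetchLists.put("part-r-00001", 120L);

    System.out.println("size-based: " + sizeBasedSplits(fetchLists, 128L));
    System.out.println("whole-file: " + wholeFileSplits(fetchLists));
  }
}
```

With whole-file planning the number of splits equals the number of fetch list files, so each fetch list ends up in a single task regardless of its size.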
 
 