Package org.apache.nutch.indexer
Class IndexerMapReduce
- java.lang.Object
  - org.apache.hadoop.conf.Configured
    - org.apache.nutch.indexer.IndexerMapReduce

All Implemented Interfaces:
- Configurable
 
public class IndexerMapReduce extends Configured

This class is typically invoked from within IndexingJob and handles all MapReduce functionality required when undertaking indexing. This is a consequence of one or more indexing plugins being invoked which extend IndexWriter. See initMRJob(Path, Path, Collection, Job, boolean) for details on the specific data structures and parameters required for indexing.

Nested Class Summary

Modifier and Type    Class                              Description
static class         IndexerMapReduce.IndexerMapper
static class         IndexerMapReduce.IndexerReducer

Field Summary

Modifier and Type    Field                            Description
static String        INDEXER_BINARY_AS_BASE64
static String        INDEXER_DELETE
static String        INDEXER_DELETE_ROBOTS_NOINDEX
static String        INDEXER_DELETE_SKIPPED
static String        INDEXER_NO_COMMIT
static String        INDEXER_PARAMS
static String        INDEXER_SKIP_NOTMODIFIED
static String        URL_FILTERING
static String        URL_NORMALIZING

Constructor Summary

Constructor           Description
IndexerMapReduce()

Method Summary

Modifier and Type    Method                                                                                                  Description
static void          initMRJob(Path crawlDb, Path linkDb, Collection<Path> segments, Job job, boolean addBinaryContent)

Methods inherited from class org.apache.hadoop.conf.Configured:
getConf, setConf

Field Detail

INDEXER_PARAMS
public static final String INDEXER_PARAMS
See Also: Constant Field Values

INDEXER_DELETE
public static final String INDEXER_DELETE
See Also: Constant Field Values

INDEXER_NO_COMMIT
public static final String INDEXER_NO_COMMIT
See Also: Constant Field Values

INDEXER_DELETE_ROBOTS_NOINDEX
public static final String INDEXER_DELETE_ROBOTS_NOINDEX
See Also: Constant Field Values

INDEXER_DELETE_SKIPPED
public static final String INDEXER_DELETE_SKIPPED
See Also: Constant Field Values

INDEXER_SKIP_NOTMODIFIED
public static final String INDEXER_SKIP_NOTMODIFIED
See Also: Constant Field Values

URL_FILTERING
public static final String URL_FILTERING
See Also: Constant Field Values

URL_NORMALIZING
public static final String URL_NORMALIZING
See Also: Constant Field Values

INDEXER_BINARY_AS_BASE64
public static final String INDEXER_BINARY_AS_BASE64
See Also: Constant Field Values

Method Detail

initMRJob
public static void initMRJob(Path crawlDb, Path linkDb, Collection<Path> segments, Job job, boolean addBinaryContent) throws IOException
Throws:
- IOException
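
The signature of initMRJob suggests a usage pattern along the following lines. This is a minimal sketch, not the code used by IndexingJob: the class name IndexerJobSketch, the job name, and the paths in main are hypothetical, and it assumes that the INDEXER_DELETE, URL_FILTERING and URL_NORMALIZING constants name boolean configuration properties and that Nutch and Hadoop are on the classpath. The job is only built here, not submitted.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.nutch.indexer.IndexerMapReduce;
import org.apache.nutch.util.NutchConfiguration;

public class IndexerJobSketch {

  /** Builds, but does not submit, an indexing job for the given crawl data. */
  public static Job buildJob(Path crawlDb, Path linkDb, List<Path> segments)
      throws IOException {
    Configuration conf = NutchConfiguration.create();

    // Assumption: these constants are keys of boolean properties.
    conf.setBoolean(IndexerMapReduce.INDEXER_DELETE, true);
    conf.setBoolean(IndexerMapReduce.URL_FILTERING, false);
    conf.setBoolean(IndexerMapReduce.URL_NORMALIZING, false);

    // Job.getInstance copies the configuration, so set properties first.
    Job job = Job.getInstance(conf, "indexer-sketch");

    // Wire the crawl data into the job; false = do not add binary content.
    IndexerMapReduce.initMRJob(crawlDb, linkDb, segments, job, false);
    return job;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical layout: <crawlDb> <linkDb> <segment> [<segment> ...]
    if (args.length < 3) {
      System.err.println("Usage: IndexerJobSketch <crawlDb> <linkDb> <segment>...");
      return;
    }
    List<Path> segments = new ArrayList<>();
    for (int i = 2; i < args.length; i++) {
      segments.add(new Path(args[i]));
    }
    Job job = buildJob(new Path(args[0]), new Path(args[1]), segments);
    System.out.println("Configured job: " + job.getJobName());
  }
}
```

A caller would still have to set an output location and submit the job (for example with job.waitForCompletion(true)) before any documents reach an index writer.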