Package org.apache.nutch.indexer
Class IndexingJob
- java.lang.Object
  - org.apache.hadoop.conf.Configured
    - org.apache.nutch.util.NutchTool
      - org.apache.nutch.indexer.IndexingJob

All Implemented Interfaces:
- Configurable, Tool
 
public class IndexingJob extends NutchTool implements Tool

Generic indexer which relies on the plugins implementing IndexWriter.

Field Summary

Fields inherited from class org.apache.nutch.util.NutchTool:
- currentJob, currentJobNum, numJobs, results, status

Constructor Summary

- IndexingJob()
- IndexingJob(Configuration conf)

Method Summary

- void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit)
- void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone)
- void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params)
- void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params, boolean filter, boolean normalize)
- void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params, boolean filter, boolean normalize, boolean addBinaryContent)
- void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params, boolean filter, boolean normalize, boolean addBinaryContent, boolean base64)
- static void main(String[] args)
- int run(String[] args)
- Map<String,Object> run(Map<String,Object> args, String crawlId): Runs the tool, using a map of arguments.

Methods inherited from class org.apache.nutch.util.NutchTool:
- getProgress, getStatus, killJob, setConf, stopJob

Methods inherited from class org.apache.hadoop.conf.Configured:
- getConf

Methods inherited from class java.lang.Object:
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.hadoop.conf.Configurable:
- getConf, setConf

Constructor Detail

IndexingJob

public IndexingJob()

IndexingJob

public IndexingJob(Configuration conf)

Method Detail

index

public void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit)
    throws IOException, InterruptedException, ClassNotFoundException

index

public void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone)
    throws IOException, InterruptedException, ClassNotFoundException

index

public void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params)
    throws IOException, InterruptedException, ClassNotFoundException

index

public void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params, boolean filter, boolean normalize)
    throws IOException, InterruptedException, ClassNotFoundException

index

public void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params, boolean filter, boolean normalize, boolean addBinaryContent)
    throws IOException, InterruptedException, ClassNotFoundException

index

public void index(Path crawlDb, Path linkDb, List<Path> segments, boolean noCommit, boolean deleteGone, String params, boolean filter, boolean normalize, boolean addBinaryContent, boolean base64)
    throws IOException, InterruptedException, ClassNotFoundException
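
The six index overloads share a common prefix and each adds trailing parameters. The page does not say how the shorter overloads behave, so the following is only a sketch of one common arrangement, in which each shorter overload forwards to the next longer one. IndexOverloadSketch is a hypothetical stand-in class, not IndexingJob itself. Paths are modelled as plain strings rather than Hadoop Path objects. The default values (false and null) and the example paths are assumptions made for illustration, not taken from the Nutch source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Illustration only: a stand-in, NOT org.apache.nutch.indexer.IndexingJob.
 * Paths are plain Strings here instead of org.apache.hadoop.fs.Path.
 */
final class IndexOverloadSketch {

    /** Fully expanded argument list of the most recent call, in parameter order. */
    final List<Object> lastCall = new ArrayList<>();

    void index(String crawlDb, String linkDb, List<String> segments, boolean noCommit) {
        // Assumed default: deleteGone = false
        index(crawlDb, linkDb, segments, noCommit, false);
    }

    void index(String crawlDb, String linkDb, List<String> segments, boolean noCommit,
               boolean deleteGone) {
        // Assumed default: params = null
        index(crawlDb, linkDb, segments, noCommit, deleteGone, null);
    }

    void index(String crawlDb, String linkDb, List<String> segments, boolean noCommit,
               boolean deleteGone, String params) {
        // Assumed defaults: filter = false, normalize = false
        index(crawlDb, linkDb, segments, noCommit, deleteGone, params, false, false);
    }

    void index(String crawlDb, String linkDb, List<String> segments, boolean noCommit,
               boolean deleteGone, String params, boolean filter, boolean normalize) {
        // Assumed default: addBinaryContent = false
        index(crawlDb, linkDb, segments, noCommit, deleteGone, params, filter, normalize, false);
    }

    void index(String crawlDb, String linkDb, List<String> segments, boolean noCommit,
               boolean deleteGone, String params, boolean filter, boolean normalize,
               boolean addBinaryContent) {
        // Assumed default: base64 = false
        index(crawlDb, linkDb, segments, noCommit, deleteGone, params, filter, normalize,
                addBinaryContent, false);
    }

    // Fullest signature: the only overload that does any work in this sketch.
    void index(String crawlDb, String linkDb, List<String> segments, boolean noCommit,
               boolean deleteGone, String params, boolean filter, boolean normalize,
               boolean addBinaryContent, boolean base64) {
        Object[] all = {crawlDb, linkDb, segments, noCommit, deleteGone, params,
                filter, normalize, addBinaryContent, base64};
        lastCall.clear();
        lastCall.addAll(Arrays.asList(all));
    }

    public static void main(String[] args) {
        IndexOverloadSketch job = new IndexOverloadSketch();
        // Hypothetical crawl layout; substitute real locations.
        job.index("crawl/crawldb", "crawl/linkdb",
                Collections.singletonList("crawl/segments/example"), true);
        System.out.println(job.lastCall);
    }
}
```

With this arrangement, a caller who only cares about noCommit can use the four-argument form and leave the remaining options at whatever defaults the implementation chooses. Check the actual defaults in the Nutch source before relying on them.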

run

public Map<String,Object> run(Map<String,Object> args, String crawlId) throws Exception

Description copied from class: NutchTool
Runs the tool, using a map of arguments. May return results, or null.

Specified by:
- run in class NutchTool
Parameters:
- args - a Map of arguments to be run with the tool
- crawlId - a crawl identifier to associate with the tool invocation
Returns:
- Map results object if the tool executes successfully, otherwise null
Throws:
- Exception - if there is an error during the tool execution
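
Because run(Map, String) may return null and is declared to throw Exception, callers need to handle both outcomes. The sketch below shows one way to do that. MapTool is a hypothetical interface that only mirrors the documented signature so the example compiles without Nutch on the classpath. The argument key and crawl identifier are made up for illustration and are not keys that IndexingJob is known to read.

```java
import java.util.HashMap;
import java.util.Map;

final class MapRunSketch {

    /** Hypothetical stand-in mirroring the signature of run(Map, String). */
    interface MapTool {
        Map<String, Object> run(Map<String, Object> args, String crawlId) throws Exception;
    }

    /**
     * Invokes the tool and reports whether it produced a result map.
     * A null return and a thrown Exception are both treated as "no results".
     */
    static boolean runAndCheck(MapTool tool, Map<String, Object> args, String crawlId) {
        try {
            Map<String, Object> results = tool.run(args, crawlId);
            return results != null;
        } catch (Exception e) {
            // The documented contract allows any Exception on failure.
            return false;
        }
    }

    public static void main(String[] unused) {
        Map<String, Object> args = new HashMap<>();
        args.put("exampleKey", "exampleValue"); // hypothetical key, for illustration only

        MapTool returnsResults = (a, id) -> new HashMap<>(a);
        MapTool returnsNull = (a, id) -> null;

        System.out.println(runAndCheck(returnsResults, args, "example-crawl"));
        System.out.println(runAndCheck(returnsNull, args, "example-crawl"));
    }
}
```

Collapsing both failure modes into a boolean keeps the sketch short. Real callers will usually want to log the exception or propagate it instead of discarding it.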
 
 