Package org.apache.nutch.segment
Class SegmentReader
- java.lang.Object
  - org.apache.hadoop.conf.Configured
    - org.apache.nutch.segment.SegmentReader

- All Implemented Interfaces: Configurable, Tool

public class SegmentReader extends Configured implements Tool

Dump the content of a segment.

Nested Class Summary
- static class SegmentReader.InputCompatMapper
- static class SegmentReader.InputCompatReducer
- static class SegmentReader.SegmentReaderStats
- static class SegmentReader.TextOutputFormat: Implements a text output format

Constructor Summary
- SegmentReader()

Method Summary
- void dump(Path segment, Path output)
- void get(Path segment, Text key, Writer writer, Map<String,List<Writable>> results)
- static Charset getCharset(Metadata parseMeta): Try to get HTML encoding from parse metadata.
- void getStats(Path segment, SegmentReader.SegmentReaderStats stats)
- void list(List<Path> dirs, Writer writer)
- static void main(String[] args)
- int run(String[] args)

Methods inherited from class org.apache.hadoop.conf.Configured: getConf, setConf

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.hadoop.conf.Configurable: getConf, setConf
Method Detail

dump
public void dump(Path segment, Path output) throws IOException, InterruptedException, ClassNotFoundException

get
public void get(Path segment, Text key, Writer writer, Map<String,List<Writable>> results) throws Exception
- Throws: Exception

getCharset
public static Charset getCharset(Metadata parseMeta)
Try to get HTML encoding from parse metadata. Try Nutch.CHAR_ENCODING_FOR_CONVERSION first, then HttpHeaders.CONTENT_ENCODING, and finally fall back to StandardCharsets.UTF_8.

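The lookup order described for getCharset can be illustrated with a minimal, standalone sketch. It is not the Nutch implementation: a plain Map<String,String> stands in for Metadata, the two key strings are hypothetical placeholders for the Nutch.CHAR_ENCODING_FOR_CONVERSION and HttpHeaders.CONTENT_ENCODING constants, and skipping an unknown charset name to try the next candidate is an assumption about reasonable behavior rather than something this page states.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class CharsetFallbackSketch {
    // Hypothetical stand-ins for the two metadata keys named in the
    // description; the real constant values are not reproduced here.
    static final String CONVERSION_KEY = "conversion-encoding";
    static final String CONTENT_ENCODING_KEY = "content-encoding";

    /** Returns the first usable charset in priority order, else UTF-8. */
    static Charset resolveCharset(Map<String, String> parseMeta) {
        for (String key : new String[] {CONVERSION_KEY, CONTENT_ENCODING_KEY}) {
            String name = parseMeta.get(key);
            if (name == null || name.trim().isEmpty()) {
                continue; // key absent or blank: try the next candidate
            }
            try {
                return Charset.forName(name.trim());
            } catch (IllegalArgumentException e) {
                // Assumption: an illegal or unsupported charset name is
                // skipped instead of failing the whole lookup.
            }
        }
        return StandardCharsets.UTF_8; // final fallback
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        System.out.println(resolveCharset(meta)); // no keys set
        meta.put(CONTENT_ENCODING_KEY, "ISO-8859-1");
        System.out.println(resolveCharset(meta)); // second candidate only
        meta.put(CONVERSION_KEY, "US-ASCII");
        System.out.println(resolveCharset(meta)); // first candidate wins
    }
}
```

Keeping the candidates in an ordered array makes the priority explicit, so adding or reordering a source of encoding information is a one-line change in this sketch.
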
getStats
public void getStats(Path segment, SegmentReader.SegmentReaderStats stats) throws Exception
- Throws: Exception