Class HostURLNormalizer
- java.lang.Object
  - org.apache.nutch.net.urlnormalizer.host.HostURLNormalizer

All Implemented Interfaces:
- Configurable, URLNormalizer
 
public class HostURLNormalizer extends Object implements URLNormalizer

URL normalizer for mapping hosts to their desired form. It takes a simple text file as its source, in the format:

    example.org www.example.org

which maps all URLs of example.org to the www sub-domain. Wildcards can also be used to map all sub-domains to another host:

    *.example.org www.example.org
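
The mapping rules described above can be illustrated with a small, self-contained sketch. This is not the Nutch implementation: the class name `HostMappingSketch`, the in-memory rule string, the `#` comment handling, and the way wildcards are matched (any depth of sub-domain, but not the bare domain itself) are all assumptions made for illustration. Only the two-column rule format and the `*.` wildcard prefix come from the description.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch of host mapping; hypothetical, not the Nutch class. */
class HostMappingSketch {
    // Source host (or "*.domain" wildcard) mapped to the desired target host.
    private final Map<String, String> hostMap = new HashMap<>();

    /** Parses rules in the "source target" format, one rule per line. */
    HostMappingSketch(String rules) {
        for (String line : rules.split("\n")) {
            String trimmed = line.trim();
            // Skipping blank lines and '#' comments is an assumption.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length == 2) {
                hostMap.put(parts[0].toLowerCase(), parts[1].toLowerCase());
            }
        }
    }

    /** Returns the URL with its host mapped, or the input if no rule applies. */
    String normalize(String urlString) throws MalformedURLException {
        String host = new URL(urlString).getHost().toLowerCase();
        String target = hostMap.get(host); // exact rule first
        // Then try wildcard rules by dropping leading labels one at a time.
        int dot = host.indexOf('.');
        while (target == null && dot != -1) {
            target = hostMap.get("*" + host.substring(dot));
            dot = host.indexOf('.', dot + 1);
        }
        if (target == null || target.equals(host)) {
            return urlString;
        }
        return replaceHost(urlString, host, target);
    }

    /** Replaces the first case-insensitive occurrence of host after "://". */
    static String replaceHost(String urlString, String host, String target) {
        int from = urlString.indexOf("://");
        from = (from == -1) ? 0 : from + 3;
        for (int i = from; i + host.length() <= urlString.length(); i++) {
            if (urlString.regionMatches(true, i, host, 0, host.length())) {
                return urlString.substring(0, i) + target
                        + urlString.substring(i + host.length());
            }
        }
        return urlString;
    }

    public static void main(String[] args) throws MalformedURLException {
        HostMappingSketch sketch = new HostMappingSketch(
                "example.org www.example.org\n*.example.net www.example.net");
        System.out.println(sketch.normalize("http://example.org/page?x=1"));
        System.out.println(sketch.normalize("https://blog.example.net/a"));
    }
}
```

In this sketch an exact rule takes precedence over a wildcard rule, and a URL whose host already equals the target is returned unchanged. Whether the real normalizer orders its lookups the same way is not stated in this page.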

Field Summary

Fields inherited from interface org.apache.nutch.net.URLNormalizer
- X_POINT_ID

Constructor Summary

Constructors
- HostURLNormalizer()

Method Summary

Modifier and Type    Method
Configuration        getConf()
String               normalize(String urlString, String scope)
protected String     replaceHost(String urlString, String host, String target)
void                 setConf(Configuration conf)

Method Detail

getConf
public Configuration getConf()
Specified by:
- getConf in interface Configurable

setConf
public void setConf(Configuration conf)
Specified by:
- setConf in interface Configurable

normalize
public String normalize(String urlString, String scope) throws MalformedURLException
Specified by:
- normalize in interface URLNormalizer
Throws:
- MalformedURLException