Class WikipediaTokenizerFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenizerFactory
org.apache.lucene.analysis.wikipedia.WikipediaTokenizerFactory
Factory for WikipediaTokenizer.
<fieldType name="text_wiki" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WikipediaTokenizerFactory"/>
  </analyzer>
</fieldType>
- Since:
- 3.1
- SPI Name (case-insensitive: for example, a service named 'htmlStrip' can also be looked up as 'htmlstrip'):
- "wikipedia"
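The case-insensitive lookup described above can be illustrated with a small, self-contained sketch. The SpiNameLookupSketch class and its register and lookup methods are hypothetical stand-ins written for this example; they are not Lucene's SPI loader and only show the idea of normalizing names before lookup.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical registry for illustration only; not part of Lucene. */
class SpiNameLookupSketch {
    // Maps a lower-cased SPI name to the name of the class registered under it.
    private final Map<String, String> services = new HashMap<>();

    /** Registers a class name under an SPI name, normalized to lower case. */
    void register(String spiName, String className) {
        services.put(spiName.toLowerCase(Locale.ROOT), className);
    }

    /** Looks up a class name by SPI name, ignoring case. */
    String lookup(String spiName) {
        String className = services.get(spiName.toLowerCase(Locale.ROOT));
        if (className == null) {
            throw new IllegalArgumentException("No service named '" + spiName + "'");
        }
        return className;
    }

    public static void main(String[] args) {
        SpiNameLookupSketch registry = new SpiNameLookupSketch();
        registry.register("wikipedia", "WikipediaTokenizerFactory");
        // Both spellings resolve to the same registered entry.
        System.out.println(registry.lookup("wikipedia"));
        System.out.println(registry.lookup("Wikipedia"));
    }
}
```

In Lucene itself, the equivalent lookup goes through the inherited forName and lookupClass methods of TokenizerFactory.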
Field Summary
Fields
Modifier and Type      Field                Description
static final String    NAME                 SPI name
static final String    TOKEN_OUTPUT
protected final int    tokenOutput
static final String    UNTOKENIZED_TYPES
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor                                            Description
WikipediaTokenizerFactory()                            Default ctor for compatibility with SPI
WikipediaTokenizerFactory(Map<String, String> args)    Creates a new WikipediaTokenizerFactory
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Details
- NAME
  SPI name
- TOKEN_OUTPUT
- UNTOKENIZED_TYPES
- tokenOutput
  protected final int tokenOutput
- untokenizedTypes
Constructor Details
- WikipediaTokenizerFactory
  public WikipediaTokenizerFactory(Map<String, String> args)
  Creates a new WikipediaTokenizerFactory
- WikipediaTokenizerFactory
  public WikipediaTokenizerFactory()
  Default ctor for compatibility with SPI
Method Details
- create
  Specified by: create in class TokenizerFactory