Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenizerFactory
org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
Factory for WhitespaceTokenizer.
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
  </analyzer>
</fieldType>
Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; must be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). It is rarely necessary to change this; when it is omitted, CharTokenizer.DEFAULT_MAX_TOKEN_LEN is used.
Since: 3.1
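The two rule values and the maxTokenLen option can be illustrated without Lucene on the classpath. The following is a plain-Java sketch, not Lucene's implementation: the class WhitespaceRuleSketch and its methods are hypothetical names, the "java" rule is approximated with Character.isWhitespace, the "unicode" rule with the Unicode White_Space property, and over-long tokens are assumed to be split into chunks of at most maxTokenLen characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.regex.Pattern;

// Hypothetical illustration class; it does not use or reproduce Lucene code.
public class WhitespaceRuleSketch {
    // Bound quoted in the option description above.
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;

    // Unicode White_Space binary property as exposed by java.util.regex.
    private static final Pattern UNICODE_WS = Pattern.compile("\\p{IsWhite_Space}");

    static boolean isUnicodeWhitespace(int codePoint) {
        return UNICODE_WS.matcher(new String(Character.toChars(codePoint))).matches();
    }

    static List<String> tokenize(String text, String rule, int maxTokenLen) {
        // Bounds as stated in the text: greater than 0 and less than the limit.
        if (maxTokenLen <= 0 || maxTokenLen >= MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException("maxTokenLen out of range: " + maxTokenLen);
        }
        IntPredicate isSeparator;
        switch (rule) {
            case "java":
                isSeparator = Character::isWhitespace;
                break;
            case "unicode":
                isSeparator = WhitespaceRuleSketch::isUnicodeWhitespace;
                break;
            default:
                throw new IllegalArgumentException("rule must be \"java\" or \"unicode\": " + rule);
        }

        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isSeparator.test(cp)) {
                // A separator ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
                // Assumption: a token that reaches maxTokenLen is emitted and a new one starts.
                if (current.length() >= maxTokenLen) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "foo\u00A0bar baz"; // U+00A0 is a no-break space
        System.out.println(tokenize(text, "java", 256).size());    // no-break space is not a separator here
        System.out.println(tokenize(text, "unicode", 256).size()); // no-break space is a separator here
        System.out.println(tokenize("abcdefgh", "java", 3));
    }
}
```

The no-break space shows the practical difference: Character.isWhitespace excludes it, while the Unicode White_Space property includes it, so the two rules can produce different token counts for the same input.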
Field Summary
Fields
Modifier and Type                        Field          Description
private final int                        maxTokenLen
static final String                      NAME           SPI name
private final String                     rule
static final String                      RULE_JAVA
private static final Collection<String>  RULE_NAMES
static final String                      RULE_UNICODE
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor                                           Description
WhitespaceTokenizerFactory()                          Default ctor for compatibility with SPI
WhitespaceTokenizerFactory(Map<String,String> args)   Creates a new WhitespaceTokenizerFactory
Method Summary
Modifier and Type  Method                            Description
Tokenizer          create(AttributeFactory factory)  Creates a TokenStream of the specified input using the given AttributeFactory
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
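Field Details

NAME
public static final String NAME
SPI name

RULE_JAVA
public static final String RULE_JAVA

RULE_UNICODE
public static final String RULE_UNICODE

RULE_NAMES
private static final Collection<String> RULE_NAMES

rule
private final String rule

maxTokenLen
private final int maxTokenLen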
Constructor Details

WhitespaceTokenizerFactory
public WhitespaceTokenizerFactory(Map<String,String> args)
Creates a new WhitespaceTokenizerFactory

WhitespaceTokenizerFactory
public WhitespaceTokenizerFactory()
Default ctor for compatibility with SPI
Method Details

create
public Tokenizer create(AttributeFactory factory)
Description copied from class: TokenizerFactory
Creates a TokenStream of the specified input using the given AttributeFactory
Specified by: create in class TokenizerFactory
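Outside Solr, the factory can also be used directly. The sketch below assumes a Lucene 9 or later analysis-common jar on the classpath; the class WhitespaceFactoryUsage and its tokenize helper are hypothetical names introduced only for illustration. It passes the same rule and maxTokenLen options as the XML configuration, then calls create with the default AttributeFactory.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizerFactory;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.AttributeFactory;

// Hypothetical usage class; only the Lucene types are part of the library.
public class WhitespaceFactoryUsage {

    static List<String> tokenize(String text) {
        // The factory removes the options it understands, so the map must be mutable.
        Map<String, String> args = new HashMap<>();
        args.put("rule", "unicode");
        args.put("maxTokenLen", "256");
        WhitespaceTokenizerFactory factory = new WhitespaceTokenizerFactory(args);

        List<String> terms = new ArrayList<>();
        try (Tokenizer tokenizer = factory.create(AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY)) {
            tokenizer.setReader(new StringReader(text));
            CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
            tokenizer.reset();                  // required before the first incrementToken()
            while (tokenizer.incrementToken()) {
                terms.add(term.toString());
            }
            tokenizer.end();                    // finalizes end-of-stream state
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello  big world"));
    }
}
```

The reset, incrementToken, end, close sequence follows the usual TokenStream consumer workflow; skipping reset typically results in an exception at the first incrementToken call.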