Class | Description |
---|---|
CommonPreprocessor | |
CustomStemmingPreprocessor | A stemming preprocessor that works with any stemmer defined as a Lucene/Tartarus SnowballProgram, including but not limited to RussianStemmer, DutchStemmer, and FrenchStemmer. Note: this preprocessor is thread-safe because it uses a synchronized method. |
EmbeddedStemmingPreprocessor | A tokenizer preprocessor that applies a given preprocessor and then performs English Porter stemming on the resulting tokens. |
EndingPreProcessor | Strips the endings `ed`, `ing`, `ly`, `s`, and `.` from tokens. |
LowCasePreProcessor | |
StemmingPreprocessor | A tokenizer preprocessor that applies the basic cleaning inherited from CommonPreprocessor and then performs English Porter stemming on tokens. Note: this preprocessor is thread-safe because it uses a synchronized method. |
StemmingPreprocessorTest | |
StringCleaning | Various string cleaning utilities. |
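
The descriptions above mention three recurring ideas: stripping fixed endings, wrapping one preprocessor inside another, and guarding the processing method with `synchronized`. The sketch below illustrates those ideas in plain Java. It is not the library's implementation: the class `PreprocessorSketch`, its methods, and the `Chained` wrapper are hypothetical names invented for this example, the order in which endings are checked is an assumption, and simple ending stripping stands in for real Porter stemming.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; none of these names come from the library.
final class PreprocessorSketch {

    // Assumed order of checks; the table only lists which endings are removed.
    private static final String[] ENDINGS = {"ing", "ed", "ly", "s"};

    /** Removes one trailing "." and then at most one of the listed endings. */
    static String stripEnding(String token) {
        if (token.endsWith(".")) {
            token = token.substring(0, token.length() - 1);
        }
        for (String ending : ENDINGS) {
            // Keep at least one character so very short tokens are not emptied.
            if (token.length() > ending.length() && token.endsWith(ending)) {
                return token.substring(0, token.length() - ending.length());
            }
        }
        return token;
    }

    /** Lower-cases a token with a fixed locale so results do not depend on the JVM default. */
    static String lowerCase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    /** Applies a base preprocessor first, then an extra step on its output. */
    static final class Chained implements UnaryOperator<String> {
        private final UnaryOperator<String> base;
        private final UnaryOperator<String> extra;

        Chained(UnaryOperator<String> base, UnaryOperator<String> extra) {
            this.base = base;
            this.extra = extra;
        }

        // synchronized: only one thread at a time runs this method on a given instance.
        @Override
        public synchronized String apply(String token) {
            return extra.apply(base.apply(token));
        }
    }
}
```

With this sketch, `new PreprocessorSketch.Chained(PreprocessorSketch::lowerCase, PreprocessorSketch::stripEnding).apply("Walked.")` lower-cases the token, drops the trailing period, and removes `ed`. Synchronizing the method is the simplest way to make a stateful step safe to share across threads, at the cost of serializing calls on that instance.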