Constructor and Description |
---|
PosUimaTokenizer(java.lang.String tokens, org.apache.uima.analysis_engine.AnalysisEngine engine, java.util.Collection<java.lang.String> allowedPosTags) |
PosUimaTokenizer(java.lang.String tokens, org.apache.uima.analysis_engine.AnalysisEngine engine, java.util.Collection<java.lang.String> allowedPosTags, boolean stripNones) |
Modifier and Type | Method and Description |
---|---|
int | countTokens(): The number of tokens in the tokenizer. |
static org.apache.uima.analysis_engine.AnalysisEngine | defaultAnalysisEngine() |
java.util.List<java.lang.String> | getTokens(): Returns a list of all the tokens. |
boolean | hasMoreTokens(): Returns whether more tokens are left in the tokenizer. |
java.lang.String | nextToken(): The next token (usually a word) in the string. |
void | setTokenPreProcessor(TokenPreProcess tokenPreProcessor): Sets the token pre-processor. |
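The methods summarized above follow the familiar pull-style tokenizer pattern: check `hasMoreTokens()`, then call `nextToken()`. The sketch below illustrates that loop with a hypothetical, self-contained stand-in (`SimpleTokenizer` and `WhitespaceTokenizer` are invented for this example and are not library classes). It assumes `countTokens()` reports the total number of tokens rather than the number remaining, which this page does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not the library's classes.
class TokenizerLoopSketch {

    /** Mirrors the method names from the summary table; an assumption, not the real interface. */
    interface SimpleTokenizer {
        boolean hasMoreTokens();
        int countTokens();
        String nextToken();
        List<String> getTokens();
    }

    /** Splits on whitespace instead of running a UIMA analysis engine. */
    static final class WhitespaceTokenizer implements SimpleTokenizer {
        private final List<String> tokens;
        private int index = 0;

        WhitespaceTokenizer(String text) {
            String trimmed = text.trim();
            this.tokens = trimmed.isEmpty()
                    ? new ArrayList<>()
                    : new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
        }

        public boolean hasMoreTokens() { return index < tokens.size(); }

        // Assumption: total token count, not the number still unread.
        public int countTokens() { return tokens.size(); }

        public String nextToken() { return tokens.get(index++); }

        // Returns a copy so callers cannot disturb the iteration state.
        public List<String> getTokens() { return new ArrayList<>(tokens); }
    }

    /** Reads every remaining token with the hasMoreTokens/nextToken loop. */
    static List<String> drain(SimpleTokenizer tokenizer) {
        List<String> out = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        SimpleTokenizer t = new WhitespaceTokenizer("the quick brown fox");
        System.out.println(t.countTokens()); // prints 4
        System.out.println(drain(t));        // prints [the, quick, brown, fox]
    }
}
```

When the whole token list is needed at once, `getTokens()` avoids the loop entirely; the loop form is mainly useful for streaming consumers.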
public PosUimaTokenizer(java.lang.String tokens, org.apache.uima.analysis_engine.AnalysisEngine engine, java.util.Collection<java.lang.String> allowedPosTags)
public PosUimaTokenizer(java.lang.String tokens, org.apache.uima.analysis_engine.AnalysisEngine engine, java.util.Collection<java.lang.String> allowedPosTags, boolean stripNones)
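This page does not document what `allowedPosTags` and `stripNones` do. The following sketch shows one plausible reading, stated as an assumption: tokens whose part-of-speech tag is not in `allowedPosTags` are replaced by a `"NONE"` placeholder, and `stripNones` drops those placeholders. It is a self-contained illustration, not the library's implementation; tags are supplied inline as `word/TAG` pairs instead of coming from a UIMA `AnalysisEngine`, and the `PosFilterSketch` class and `filter` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration of POS filtering; not the library's code.
class PosFilterSketch {

    // Assumption: rejected tokens are represented by this placeholder.
    static final String PLACEHOLDER = "NONE";

    /**
     * Filters "word/TAG" pairs by tag.
     * Allowed words are kept; others become PLACEHOLDER, or are dropped when stripNones is true.
     */
    static List<String> filter(String taggedText,
                               Collection<String> allowedPosTags,
                               boolean stripNones) {
        List<String> out = new ArrayList<>();
        for (String pair : taggedText.trim().split("\\s+")) {
            int slash = pair.lastIndexOf('/');
            if (slash < 0) {
                continue; // skip input without a tag (also covers empty text)
            }
            String word = pair.substring(0, slash);
            String tag = pair.substring(slash + 1);
            if (allowedPosTags.contains(tag)) {
                out.add(word);
            } else if (!stripNones) {
                out.add(PLACEHOLDER);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> allowed = Arrays.asList("NNS", "VBP");
        String text = "the/DT dogs/NNS bark/VBP";
        System.out.println(filter(text, allowed, false)); // prints [NONE, dogs, bark]
        System.out.println(filter(text, allowed, true));  // prints [dogs, bark]
    }
}
```

Under this reading, keeping the placeholders preserves token positions, while stripping them yields a compact list of only the allowed words.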
public boolean hasMoreTokens()
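Description copied from interface: Tokenizer. Returns whether more tokens are left in the tokenizer.
Specified by: hasMoreTokens in interface Tokenizer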
public int countTokens()
Description copied from interface: Tokenizer. The number of tokens in the tokenizer.
Specified by: countTokens in interface Tokenizer
public java.lang.String nextToken()
Description copied from interface: Tokenizer. The next token (usually a word) in the string.
public java.util.List<java.lang.String> getTokens()
Description copied from interface: Tokenizer. Returns a list of all the tokens.
public static org.apache.uima.analysis_engine.AnalysisEngine defaultAnalysisEngine()
public void setTokenPreProcessor(@NonNull TokenPreProcess tokenPreProcessor)
Description copied from interface: Tokenizer. Sets the token pre-processor.
Specified by: setTokenPreProcessor in interface Tokenizer
Parameters: tokenPreProcessor - the token pre-processor to set
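As a rough illustration of how a token pre-processor can plug into a tokenizer, the sketch below applies a transformation to each token as it is returned by `nextToken()`. Everything here is hypothetical: `ListTokenizer` is a stand-in class, `UnaryOperator<String>` takes the place of `TokenPreProcess`, and the point at which the real class applies its pre-processor is not stated on this page. The null check mirrors the `@NonNull` annotation on the parameter.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Objects;
import java.util.function.UnaryOperator;

// Hypothetical illustration of a pluggable token pre-processor; not the library's code.
class PreProcessSketch {

    static final class ListTokenizer {
        private final Iterator<String> iterator;
        // Default is a no-op, so tokens pass through unchanged until a pre-processor is set.
        private UnaryOperator<String> preProcessor = UnaryOperator.identity();

        ListTokenizer(List<String> tokens) {
            this.iterator = tokens.iterator();
        }

        // Rejects null, mirroring the @NonNull annotation in the signature above.
        void setTokenPreProcessor(UnaryOperator<String> tokenPreProcessor) {
            this.preProcessor = Objects.requireNonNull(tokenPreProcessor, "tokenPreProcessor");
        }

        boolean hasMoreTokens() {
            return iterator.hasNext();
        }

        // Assumption: the pre-processor runs lazily, once per returned token.
        String nextToken() {
            return preProcessor.apply(iterator.next());
        }
    }

    public static void main(String[] args) {
        ListTokenizer t = new ListTokenizer(Arrays.asList("Hello", "WORLD"));
        t.setTokenPreProcessor(s -> s.toLowerCase(Locale.ROOT));
        while (t.hasMoreTokens()) {
            System.out.println(t.nextToken()); // prints hello, then world
        }
    }
}
```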