public class Tokenizer extends TokenizerBase
See Token for details on the morphological features produced by this tokenizer.
The following code example demonstrates how to use the Kuromoji tokenizer:
package com.atilika.kuromoji.example;

import com.atilika.kuromoji.ipadic.Token;
import com.atilika.kuromoji.ipadic.Tokenizer;
import java.util.List;

public class KuromojiExample {
    public static void main(String[] args) {
        // Construct a default tokenizer
        Tokenizer tokenizer = new Tokenizer();
        List<Token> tokens = tokenizer.tokenize("お寿司が食べたい。");
        // Print each token's surface form and its features, separated by a tab
        for (Token token : tokens) {
            System.out.println(token.getSurface() + "\t" + token.getAllFeatures());
        }
    }
}
Modifier and Type | Class and Description |
---|---|
static class | Tokenizer.Builder: Builder class for creating a customized tokenizer instance |
Nested classes inherited from class TokenizerBase: TokenizerBase.Mode
Fields inherited from class TokenizerBase: dictionaryMap, tokenFactory
Constructor and Description |
---|
Tokenizer(): Construct a default tokenizer |
Modifier and Type | Method and Description |
---|---|
java.util.List<Token> | tokenize(java.lang.String text): Tokenizes the provided text and returns a list of tokens with various feature information |
Methods inherited from class TokenizerBase: configure, createTokenList, debugLattice, debugTokenize
public java.util.List<Token> tokenize(java.lang.String text)
This method is thread safe.
Overrides: tokenize in class TokenizerBase
Parameters: text - text to tokenize
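Because tokenize is documented as thread safe, one tokenizer instance can be shared across worker threads instead of constructing one per thread. The following is a minimal sketch of that pattern, not part of the Kuromoji API: the class name SharedTokenizerSketch and the helper tokenizeAll are hypothetical, and the helper is written against a generic tokenize function so that it depends only on the JDK. With Kuromoji on the classpath, the function passed in would presumably be tokenizer::tokenize; the main method below uses a whitespace splitter as a stand-in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class SharedTokenizerSketch {

    // Hypothetical helper: applies one shared, thread-safe tokenize function
    // to many texts concurrently. Results come back in input order.
    static <T> List<List<T>> tokenizeAll(Function<String, List<T>> tokenize,
                                         List<String> texts,
                                         int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per text; every task calls the same function.
            List<Future<List<T>>> futures = new ArrayList<>();
            for (String text : texts) {
                futures.add(pool.submit(() -> tokenize.apply(text)));
            }
            // Collect results in submission order.
            List<List<T>> results = new ArrayList<>();
            for (Future<List<T>> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real tokenizer (assumption): split on single spaces.
        Function<String, List<String>> standIn = s -> Arrays.asList(s.split(" "));
        List<String> texts = Arrays.asList("one two", "three", "four five six");
        for (List<String> tokens : tokenizeAll(standIn, texts, 2)) {
            System.out.println(tokens);
        }
    }
}
```

With a real tokenizer, the call would look like tokenizeAll(tokenizer::tokenize, texts, 4), where tokenizer is a single Tokenizer instance created once and reused.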