public class Tokenizer extends TokenizerBase
See Token for details on the morphological features produced by this tokenizer.
The following code example demonstrates how to use the Kuromoji tokenizer:

    package com.atilika.kuromoji.example;

    import com.atilika.kuromoji.ipadic.Token;
    import com.atilika.kuromoji.ipadic.Tokenizer;
    import java.util.List;

    public class KuromojiExample {
        public static void main(String[] args) {
            Tokenizer tokenizer = new Tokenizer();
            List<Token> tokens = tokenizer.tokenize("お寿司が食べたい。");
            // Print each token's surface form and its full feature string
            for (Token token : tokens) {
                System.out.println(token.getSurface() + "\t" + token.getAllFeatures());
            }
        }
    }

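Each line printed by the example holds the surface form, a tab, and then the feature string. As a rough sketch of how such a line could be post-processed, the snippet below splits it into the surface and a list of features. It assumes the feature string is comma-separated, as in IPADIC-style output. The class `FeatureLine` and its `parse` method are hypothetical helpers, not part of Kuromoji, and the sample line is illustrative rather than captured output.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Kuromoji.
class FeatureLine {
    final String surface;
    final List<String> features;

    FeatureLine(String surface, List<String> features) {
        this.surface = surface;
        this.features = features;
    }

    // Splits "surface<TAB>f1,f2,..." into the surface and its feature fields.
    // Assumption: features are comma-separated.
    static FeatureLine parse(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("expected surface<TAB>features: " + line);
        }
        String surface = line.substring(0, tab);
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        List<String> features = Arrays.asList(line.substring(tab + 1).split(",", -1));
        return new FeatureLine(surface, features);
    }

    public static void main(String[] args) {
        FeatureLine parsed = parse("寿司\t名詞,一般,*,*,*,*,寿司,スシ,スシ");
        System.out.println(parsed.surface + ": " + parsed.features);
    }
}
```

When a `Token` object is still at hand, reading the features documented on `Token` directly is likely preferable to re-parsing the printed string; this sketch only applies to output that has already been written out as text.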
| Modifier and Type | Class and Description |
|---|---|
| static class | Tokenizer.Builder: Builder class for creating a customized tokenizer instance |
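
Nested classes/interfaces inherited from class TokenizerBase: TokenizerBase.Mode

Fields inherited from class TokenizerBase: dictionaryMap, tokenFactory

| Constructor and Description |
|---|
| Tokenizer(): Construct a default tokenizer |
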
| Modifier and Type | Method and Description |
|---|---|
| java.util.List<Token> | tokenize(java.lang.String text): Tokenizes the provided text and returns a list of tokens with various feature information |

Methods inherited from class TokenizerBase: configure, createTokenList, debugLattice, debugTokenize

public java.util.List<Token> tokenize(java.lang.String text)

This method is thread safe.

Overrides: tokenize in class TokenizerBase

Parameters: text - text to tokenize
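
Because `tokenize` is documented as thread safe, one instance can in principle be shared across worker threads rather than constructing one per thread. The following is a minimal sketch of that pattern using only the Java standard library. The class `SharedTokenizerSketch`, the method `tokenizeAll`, and the pool size of 4 are assumptions made for illustration, and a whitespace splitter stands in for the real tokenizer so the snippet runs without Kuromoji on the classpath.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch, not part of Kuromoji.
class SharedTokenizerSketch {

    // Applies one shared tokenize function to every text from a small thread pool.
    static <T> Map<String, List<T>> tokenizeAll(
            Function<String, List<T>> tokenize, List<String> texts) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary
        try {
            // Submit one task per text; every task calls the same shared function.
            Map<String, Future<List<T>>> pending = new LinkedHashMap<>();
            for (String text : texts) {
                Callable<List<T>> task = () -> tokenize.apply(text);
                pending.put(text, pool.submit(task));
            }
            // Collect results in submission order.
            Map<String, List<T>> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<T>>> entry : pending.entrySet()) {
                results.put(entry.getKey(), entry.getValue().get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a shared tokenizer instance (assumption): splits on whitespace.
        Function<String, List<String>> standIn = text -> Arrays.asList(text.split("\\s+"));
        Map<String, List<String>> out = tokenizeAll(standIn, Arrays.asList("a b c", "d e"));
        out.forEach((text, tokens) -> System.out.println(text + " -> " + tokens));
    }
}
```

With Kuromoji available, the stand-in could be swapped for a method reference on a single shared instance, for example `tokenizeAll(tokenizer::tokenize, texts)` where `tokenizer` is created once with `new Tokenizer()`.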