Use of org.deeplearning4j.text.tokenization.tokenizerfactory.JapaneseTokenizerFactory in project deeplearning4j by deeplearning4j.
From the class JapaneseTokenizerTest, method testJapaneseTokenizer.
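// Imports assume JUnit 4; adjust if the project uses a different JUnit version.
import static org.junit.Assert.assertEquals;

import org.deeplearning4j.text.tokenization.tokenizer.Tokenizer;
import org.deeplearning4j.text.tokenization.tokenizerfactory.JapaneseTokenizerFactory;
import org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory;
import org.junit.Test;

@Test
public void testJapaneseTokenizer() throws Exception {
    String toTokenize = "黒い瞳の綺麗な女の子";
    TokenizerFactory t = new JapaneseTokenizerFactory();
    Tokenizer tokenizer = t.create(toTokenize);
    // Expected segmentation of the input, in order.
    String[] expect = { "黒い", "瞳", "の", "綺麗", "な", "女の子" };
    assertEquals(expect.length, tokenizer.countTokens());
    for (int i = 0; i < expect.length; ++i) {
        // JUnit convention: expected value first, actual value second.
        assertEquals(expect[i], tokenizer.nextToken());
    }
}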
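
The test above relies on three tokenizer methods: countTokens(), nextToken(), and (implicitly) the notion of remaining tokens. As a rough illustration of that iteration contract, here is a minimal sketch that runs without DL4J on the classpath. SimpleTokenizer, ListTokenizer, and drain are hypothetical names made up for this example; they are not part of deeplearning4j, and ListTokenizer does no real Japanese segmentation, it only serves tokens that were already split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TokenIterationSketch {

    /** Hypothetical stand-in for the tokenizer methods the test exercises. */
    interface SimpleTokenizer {
        int countTokens();
        boolean hasMoreTokens();
        String nextToken();
    }

    /** Serves pre-split tokens in order; performs no segmentation itself. */
    static final class ListTokenizer implements SimpleTokenizer {
        private final List<String> tokens;
        private int position = 0;

        ListTokenizer(List<String> tokens) {
            // Defensive copy so later changes to the caller's list do not leak in.
            this.tokens = new ArrayList<>(tokens);
        }

        @Override
        public int countTokens() {
            // Total number of tokens, independent of how many were consumed.
            return tokens.size();
        }

        @Override
        public boolean hasMoreTokens() {
            return position < tokens.size();
        }

        @Override
        public String nextToken() {
            return tokens.get(position++);
        }
    }

    /** Collects all remaining tokens using the hasMoreTokens()/nextToken() pattern. */
    static List<String> drain(SimpleTokenizer tokenizer) {
        List<String> out = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("黒い", "瞳", "の", "綺麗", "な", "女の子");
        SimpleTokenizer tokenizer = new ListTokenizer(expected);
        System.out.println("token count: " + tokenizer.countTokens());
        System.out.println("tokens: " + drain(tokenizer));
    }
}
```

Draining with hasMoreTokens() avoids depending on whether countTokens() reports the total or the remaining number of tokens, which is a detail that can differ between tokenizer implementations.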