
Example 1 with JapaneseTokenizerFactory

Use of org.deeplearning4j.text.tokenization.tokenizerfactory.JapaneseTokenizerFactory in the project deeplearning4j by deeplearning4j.

From the class JapaneseTokenizerTest, method testJapaneseTokenizer.

@Test
public void testJapaneseTokenizer() throws Exception {
    // Input phrase, roughly "a pretty girl with dark eyes".
    String toTokenize = "黒い瞳の綺麗な女の子";
    TokenizerFactory t = new JapaneseTokenizerFactory();
    Tokenizer tokenizer = t.create(toTokenize);
    // Expected segmentation into words and particles.
    String[] expect = { "黒い", "瞳", "の", "綺麗", "な", "女の子" };
    assertEquals(expect.length, tokenizer.countTokens());
    // Tokens must come back in order; JUnit's assertEquals takes (expected, actual).
    for (int i = 0; i < expect.length; ++i) {
        assertEquals(expect[i], tokenizer.nextToken());
    }
}
Also used : JapaneseTokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.JapaneseTokenizerFactory), TokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory), JapaneseTokenizer (org.deeplearning4j.text.tokenization.tokenizer.JapaneseTokenizer), Tokenizer (org.deeplearning4j.text.tokenization.tokenizer.Tokenizer), Test (org.junit.Test)
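
The test above exercises a factory/tokenizer contract: the factory creates one tokenizer per input string, and the tokenizer reports a token count and hands tokens back in order. The following is a minimal, self-contained sketch of that contract, not the deeplearning4j implementation. SketchTokenizer, SketchTokenizerFactory, WhitespaceTokenizer and drain are hypothetical names introduced only for illustration, and whitespace splitting is swapped in for the morphological analysis that Japanese text actually needs.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizerContractSketch {

    /** Hypothetical stand-in for the Tokenizer methods used in the test above. */
    interface SketchTokenizer {
        boolean hasMoreTokens();
        int countTokens();
        String nextToken();
    }

    /** Hypothetical stand-in for TokenizerFactory: one tokenizer per input string. */
    interface SketchTokenizerFactory {
        SketchTokenizer create(String toTokenize);
    }

    /** Splits on whitespace only; this is an assumption made to keep the sketch small. */
    static final class WhitespaceTokenizer implements SketchTokenizer {
        private final List<String> tokens = new ArrayList<>();
        private int position = 0;

        WhitespaceTokenizer(String text) {
            for (String piece : text.trim().split("\\s+")) {
                if (!piece.isEmpty()) {
                    tokens.add(piece);
                }
            }
        }

        @Override
        public boolean hasMoreTokens() {
            return position < tokens.size();
        }

        // Total token count; in this sketch it does not shrink as tokens are consumed.
        @Override
        public int countTokens() {
            return tokens.size();
        }

        @Override
        public String nextToken() {
            return tokens.get(position++);
        }
    }

    /** Drains a tokenizer into a list using hasMoreTokens()/nextToken(). */
    static List<String> drain(SketchTokenizer tokenizer) {
        List<String> out = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        SketchTokenizerFactory factory = WhitespaceTokenizer::new;
        SketchTokenizer tokenizer = factory.create("black eyes  pretty girl");
        System.out.println(tokenizer.countTokens());
        System.out.println(drain(tokenizer));
    }
}
```

Draining with hasMoreTokens() rather than an index loop avoids depending on whether countTokens() stays constant while tokens are consumed, which is one reason the loop bound in a test like the one above is safer when taken from the expected array.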

Aggregations

JapaneseTokenizer (org.deeplearning4j.text.tokenization.tokenizer.JapaneseTokenizer): 1
Tokenizer (org.deeplearning4j.text.tokenization.tokenizer.Tokenizer): 1
JapaneseTokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.JapaneseTokenizerFactory): 1
TokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory): 1
Test (org.junit.Test): 1