Example 1 with UimaTokenizerFactory

Use of org.deeplearning4j.text.tokenization.tokenizerfactory.UimaTokenizerFactory in the deeplearning4j project by deeplearning4j.

From the class Word2VecIteratorTest, method before().

@Before
public void before() throws Exception {
    // Build and train the model only if it has not been created yet.
    if (vec == null) {
        // Read sentences from the "/labeled/" directory on the classpath.
        ClassPathResource resource = new ClassPathResource("/labeled/");
        File file = resource.getFile();
        SentenceIterator iter = UimaSentenceIterator.createWithPath(file.getAbsolutePath());
        // Delete any existing cache.ser in the working directory.
        new File("cache.ser").delete();
        TokenizerFactory t = new UimaTokenizerFactory();
        vec = new Word2Vec.Builder()
                .minWordFrequency(1)
                .iterations(5)
                .layerSize(100)
                .stopWords(new ArrayList<String>())
                .useUnknown(true)
                .windowSize(5)
                .iterate(iter)
                .tokenizerFactory(t)
                .build();
        vec.fit();
    }
}
Also used: UimaTokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.UimaTokenizerFactory), TokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory), File (java.io.File), ClassPathResource (org.datavec.api.util.ClassPathResource), UimaSentenceIterator (org.deeplearning4j.text.sentenceiterator.UimaSentenceIterator), LabelAwareFileSentenceIterator (org.deeplearning4j.text.sentenceiterator.labelaware.LabelAwareFileSentenceIterator), SentenceIterator (org.deeplearning4j.text.sentenceiterator.SentenceIterator), Before (org.junit.Before)

Aggregations

File (java.io.File): 1
ClassPathResource (org.datavec.api.util.ClassPathResource): 1
SentenceIterator (org.deeplearning4j.text.sentenceiterator.SentenceIterator): 1
UimaSentenceIterator (org.deeplearning4j.text.sentenceiterator.UimaSentenceIterator): 1
LabelAwareFileSentenceIterator (org.deeplearning4j.text.sentenceiterator.labelaware.LabelAwareFileSentenceIterator): 1
TokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory): 1
UimaTokenizerFactory (org.deeplearning4j.text.tokenization.tokenizerfactory.UimaTokenizerFactory): 1
Before (org.junit.Before): 1