
Example 1 with HunspellStemFilter

Use of org.apache.lucene.analysis.hunspell.HunspellStemFilter in the project omegat by omegat-org.

From the class HunspellTokenizer, method getTokenStream.

@Override
protected TokenStream getTokenStream(final String strOrig, final boolean stemsAllowed, final boolean stopWordsAllowed) throws IOException {
    // Split the input text into tokens with Lucene's StandardTokenizer.
    StandardTokenizer tokenizer = new StandardTokenizer();
    tokenizer.setReader(new StringReader(strOrig));
    if (stemsAllowed) {
        Dictionary dictionary = getDict();
        if (dictionary == null) {
            // No Hunspell dictionary is available: fall back to unstemmed tokens.
            return tokenizer;
        }
        // TODO: implement stop words checks
        return new HunspellStemFilter(tokenizer, dictionary);
    } else {
        return tokenizer;
    }
}
Also used: Dictionary (org.apache.lucene.analysis.hunspell.Dictionary), HunspellStemFilter (org.apache.lucene.analysis.hunspell.HunspellStemFilter), StandardTokenizer (org.apache.lucene.analysis.standard.StandardTokenizer), StringReader (java.io.StringReader)
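
The control flow of getTokenStream (stem only when stemming is allowed and a dictionary is actually available, otherwise return the plain tokens) can be sketched without Lucene on the classpath. The following is a minimal, JDK-only illustration and not the OmegaT or Lucene implementation: the class StemFallbackSketch, the methods toyDictionary and tokens, and the two-entry word map are all hypothetical stand-ins. Whitespace splitting replaces StandardTokenizer here, and the sketch emits exactly one form per token, whereas a real Hunspell dictionary may yield several stems for a word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

public class StemFallbackSketch {

    // Hypothetical stand-in for a loaded Hunspell dictionary:
    // maps a surface form to a stem, or returns the word unchanged.
    static UnaryOperator<String> toyDictionary() {
        Map<String, String> stems = Map.of("cats", "cat", "running", "run");
        return word -> stems.getOrDefault(word, word);
    }

    // Mirrors the branching in getTokenStream: tokens are stemmed only when
    // stemming is allowed AND a dictionary is present; otherwise they pass through.
    static List<String> tokens(String text, boolean stemsAllowed, UnaryOperator<String> dictionary) {
        List<String> out = new ArrayList<>();
        for (String raw : text.split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // skip the empty piece produced by leading whitespace
            }
            if (stemsAllowed && dictionary != null) {
                out.add(dictionary.apply(raw));
            } else {
                out.add(raw);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stemming allowed and a dictionary is available.
        System.out.println(tokens("cats running fast", true, toyDictionary()));
        // Stemming allowed but no dictionary: falls back to the raw tokens.
        System.out.println(tokens("cats running fast", true, null));
        // Stemming not allowed: the dictionary is ignored.
        System.out.println(tokens("cats running fast", false, toyDictionary()));
    }
}
```

As in the original method, a missing dictionary is treated as a normal condition rather than an error, so callers always receive a usable token sequence.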
