Example 1 with PersianAnalyzer

Use of org.apache.lucene.analysis.fa.PersianAnalyzer in the omegat project by omegat-org.

Class LucenePersianTokenizer, method getTokenStream.

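// The analyzer is not closed here because the returned stream is still in use; hence the suppressed warning.
@SuppressWarnings("resource")
@Override
protected TokenStream getTokenStream(final String strOrig, final boolean stemsAllowed, final boolean stopWordsAllowed) throws IOException {
    if (stemsAllowed) {
        // Pass the default Persian stop set when the flag is set, otherwise an empty set.
        CharArraySet stopWords = stopWordsAllowed ? PersianAnalyzer.getDefaultStopSet() : CharArraySet.EMPTY_SET;
        PersianAnalyzer analyzer = new PersianAnalyzer(stopWords);
        // The field name is unused here, so an empty string is passed.
        return analyzer.tokenStream("", new StringReader(strOrig));
    } else {
        // Without stemming, fall back to the standard token stream.
        return getStandardTokenStream(strOrig);
    }
}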
Also used: CharArraySet (org.apache.lucene.analysis.util.CharArraySet), StringReader (java.io.StringReader), PersianAnalyzer (org.apache.lucene.analysis.fa.PersianAnalyzer)
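
The stop-word toggle in the method above can be illustrated without Lucene on the classpath. The following is a minimal standard-library sketch, not Lucene's analysis chain: the class StopWordToggleSketch, the method tokenize, and the sample stop list are all hypothetical, the English words are placeholders standing in for Persian text, and whitespace splitting replaces the real tokenizer. It only mirrors the choice between a populated stop set and an empty one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopWordToggleSketch {
    // Hypothetical sample stop list; NOT the set returned by PersianAnalyzer.getDefaultStopSet().
    static final Set<String> SAMPLE_STOP_WORDS = new HashSet<>(Arrays.asList("and", "in"));

    // Mirrors the ternary in getTokenStream: a populated stop set when the flag is true,
    // an empty set otherwise. Tokens found in the chosen set are dropped.
    static List<String> tokenize(String text, boolean stopWordsAllowed) {
        Set<String> stopWords = stopWordsAllowed ? SAMPLE_STOP_WORDS : Collections.<String>emptySet();
        List<String> tokens = new ArrayList<>();
        // Naive whitespace split; an assumption standing in for the analyzer's tokenizer.
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "book and pen in bag";
        System.out.println(tokenize(text, true));
        System.out.println(tokenize(text, false));
    }
}
```

As in the original method, passing true applies the stop set, so the listed words are filtered out, while passing false selects the empty set and every token is kept.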
