Example 1 with PolishAnalyzer

Use of org.apache.lucene.analysis.pl.PolishAnalyzer in project omegat by omegat-org.

Class LucenePolishTokenizer, method getTokenStream.

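// The analyzer is deliberately left open: the returned TokenStream still depends on it.
@SuppressWarnings("resource")
@Override
protected TokenStream getTokenStream(final String strOrig, final boolean stemsAllowed, final boolean stopWordsAllowed) throws IOException {
    if (stemsAllowed) {
        // Pass the default Polish stop set only when stopWordsAllowed is true; otherwise pass an empty set.
        CharArraySet stopWords = stopWordsAllowed ? PolishAnalyzer.getDefaultStopSet() : CharArraySet.EMPTY_SET;
        PolishAnalyzer analyzer = new PolishAnalyzer(stopWords);
        // The field name is unused here, so an empty string is passed.
        return analyzer.tokenStream("", new StringReader(strOrig));
    } else {
        // Without stemming, fall back to the tokenizer's standard (non-Polish-specific) stream.
        return getStandardTokenStream(strOrig);
    }
}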
Also used: CharArraySet (org.apache.lucene.analysis.util.CharArraySet), PolishAnalyzer (org.apache.lucene.analysis.pl.PolishAnalyzer), StringReader (java.io.StringReader)
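
The branching in getTokenStream can be hard to follow without Lucene on the classpath, so here is a minimal, JDK-only sketch of the same flag logic. It is not the OmegaT or Lucene implementation: the class StopWordFlagSketch, the method tokenize, and the set SAMPLE_STOP_WORDS are hypothetical names invented for illustration, the stop words are a tiny made-up sample rather than the real default set, and lowercasing merely stands in for stemming.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; not part of OmegaT or Lucene.
class StopWordFlagSketch {

    // Assumption: a tiny sample set standing in for PolishAnalyzer.getDefaultStopSet().
    static final Set<String> SAMPLE_STOP_WORDS = Set.of("i", "w", "na", "to");

    static List<String> tokenize(String text, boolean stemsAllowed, boolean stopWordsAllowed) {
        // As in the original method, the stop set is consulted only on the stemsAllowed path.
        Set<String> stopWords = (stemsAllowed && stopWordsAllowed) ? SAMPLE_STOP_WORDS : Set.<String>of();

        List<String> tokens = new ArrayList<>();
        // Split on runs of characters that are neither letters nor digits.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // a leading separator produces an empty first element
            }
            // Placeholder only: lowercasing is NOT stemming; a real analyzer reduces words to stems.
            String token = stemsAllowed ? raw.toLowerCase(Locale.ROOT) : raw;
            if (!stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print the result for each flag combination.
        System.out.println(tokenize("Ala i kot", true, true));
        System.out.println(tokenize("Ala i kot", true, false));
        System.out.println(tokenize("Ala i kot", false, true));
    }
}
```

Note that, as in the original snippet, stopWordsAllowed has no effect when stemsAllowed is false, because the fallback path never looks at the stop set.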

Aggregations

StringReader (java.io.StringReader): 1
PolishAnalyzer (org.apache.lucene.analysis.pl.PolishAnalyzer): 1
CharArraySet (org.apache.lucene.analysis.util.CharArraySet): 1