Example 1 with EnglishMinimalStemFilter

Use of org.apache.lucene.analysis.en.EnglishMinimalStemFilter in the Apache Nutch project.

In the class LuceneAnalyzerUtil, method createComponents.

@Override
protected TokenStreamComponents createComponents(String fieldName) {
    // Tokenize first, then lowercase every token.
    Tokenizer source = new ClassicTokenizer();
    TokenStream filter = new LowerCaseFilter(source);
    // Stop-word removal is optional: skipped when no stop set is configured.
    if (stopSet != null) {
        filter = new StopFilter(filter, stopSet);
    }
    // Apply at most one stemmer, chosen by stemFilterType.
    switch (stemFilterType) {
        case PORTERSTEM_FILTER:
            filter = new PorterStemFilter(filter);
            break;
        case ENGLISHMINIMALSTEM_FILTER:
            filter = new EnglishMinimalStemFilter(filter);
            break;
        default:
            // Any other value: no stemming.
            break;
    }
    return new TokenStreamComponents(source, filter);
}
Also used: TokenStream (org.apache.lucene.analysis.TokenStream), StopFilter (org.apache.lucene.analysis.core.StopFilter), PorterStemFilter (org.apache.lucene.analysis.en.PorterStemFilter), ClassicTokenizer (org.apache.lucene.analysis.standard.ClassicTokenizer), Tokenizer (org.apache.lucene.analysis.Tokenizer), LowerCaseFilter (org.apache.lucene.analysis.core.LowerCaseFilter), EnglishMinimalStemFilter (org.apache.lucene.analysis.en.EnglishMinimalStemFilter)
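
The filter order in createComponents (tokenize, lowercase, drop stop words, then stem) can be illustrated without Lucene on the classpath. The following is a minimal plain-Java sketch, not Lucene's or Nutch's code: the class MinimalStemSketch and its methods are hypothetical, the regex split is only a rough stand-in for ClassicTokenizer, and the stem rules are an assumed S-stemmer-style plural stripper of the kind a minimal English stemmer applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; it does not use or reproduce the Lucene API.
public class MinimalStemSketch {

    // Assumed S-stemmer-style rules: only plural endings are touched.
    static String stem(String term) {
        int len = term.length();
        if (len < 3 || term.charAt(len - 1) != 's') {
            return term; // too short, or does not end in 's'
        }
        char prev = term.charAt(len - 2);
        if (prev == 'u' || prev == 's') {
            return term; // e.g. "bus", "glass" are left alone
        }
        if (prev == 'e') {
            char third = term.charAt(len - 3);
            if (len > 3 && third == 'i') {
                char fourth = term.charAt(len - 4);
                if (fourth != 'a' && fourth != 'e') {
                    // "-ies" becomes "-y", e.g. "ponies" -> "pony"
                    return term.substring(0, len - 3) + "y";
                }
            }
            if (third == 'i' || third == 'a' || third == 'o' || third == 'e') {
                return term; // e.g. "goes", "trees" are left alone
            }
        }
        return term.substring(0, len - 1); // strip the trailing 's'
    }

    // Same stage order as the snippet above: tokenize, lowercase, stop, stem.
    static List<String> analyze(String text, Set<String> stopSet) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            String term = raw.toLowerCase(Locale.ROOT);
            if (stopSet != null && stopSet.contains(term)) {
                continue; // stop-word removal is optional, as in the original
            }
            terms.add(stem(term));
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [pony, chased, two, dog]
        System.out.println(analyze("The ponies chased two dogs", Set.of("the")));
    }
}
```

Note that "chased" passes through unchanged: a minimal stemmer of this kind only normalizes plurals, whereas a Porter-style stemmer would also strip suffixes such as "-ed". That difference is the practical reason to offer both options in the switch.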

Aggregations

TokenStream (org.apache.lucene.analysis.TokenStream): 1
Tokenizer (org.apache.lucene.analysis.Tokenizer): 1
LowerCaseFilter (org.apache.lucene.analysis.core.LowerCaseFilter): 1
StopFilter (org.apache.lucene.analysis.core.StopFilter): 1
EnglishMinimalStemFilter (org.apache.lucene.analysis.en.EnglishMinimalStemFilter): 1
PorterStemFilter (org.apache.lucene.analysis.en.PorterStemFilter): 1
ClassicTokenizer (org.apache.lucene.analysis.standard.ClassicTokenizer): 1