Example 1 with AnalysisStempelPlugin

Use of org.elasticsearch.plugin.analysis.stempel.AnalysisStempelPlugin in the elasticsearch project by elastic.

From the class PolishAnalysisTests, method testDefaultsPolishAnalysis.

public void testDefaultsPolishAnalysis() throws IOException {
    // Build a test analysis registry with empty settings and the Stempel plugin loaded.
    final TestAnalysis analysis = createTestAnalysis(new Index("test", "_na_"), Settings.EMPTY, new AnalysisStempelPlugin());
    // The plugin should register the "polish_stem" token filter ...
    TokenFilterFactory tokenFilterFactory = analysis.tokenFilter.get("polish_stem");
    MatcherAssert.assertThat(tokenFilterFactory, instanceOf(PolishStemTokenFilterFactory.class));
    // ... and the "polish" analyzer.
    Analyzer analyzer = analysis.indexAnalyzers.get("polish").analyzer();
    MatcherAssert.assertThat(analyzer, instanceOf(PolishAnalyzer.class));
}
Also used : PolishAnalyzer(org.apache.lucene.analysis.pl.PolishAnalyzer) Index(org.elasticsearch.index.Index) PolishStemTokenFilterFactory(org.elasticsearch.index.analysis.pl.PolishStemTokenFilterFactory) AnalysisStempelPlugin(org.elasticsearch.plugin.analysis.stempel.AnalysisStempelPlugin) Analyzer(org.apache.lucene.analysis.Analyzer)

Example 2 with AnalysisStempelPlugin

Use of org.elasticsearch.plugin.analysis.stempel.AnalysisStempelPlugin in the elasticsearch project by elastic.

From the class SimplePolishTokenFilterTests, method testToken.

private void testToken(String source, String expected) throws IOException {
    Index index = new Index("test", "_na_");
    // Register a custom filter named "myStemmer" backed by the plugin's "polish_stem" type.
    Settings settings = Settings.builder().put("index.analysis.filter.myStemmer.type", "polish_stem").build();
    TestAnalysis analysis = createTestAnalysis(index, settings, new AnalysisStempelPlugin());
    TokenFilterFactory filterFactory = analysis.tokenFilter.get("myStemmer");
    // KeywordTokenizer emits the whole input as a single token, so only the stemmer changes it.
    Tokenizer tokenizer = new KeywordTokenizer();
    tokenizer.setReader(new StringReader(source));
    TokenStream ts = filterFactory.create(tokenizer);
    CharTermAttribute term1 = ts.addAttribute(CharTermAttribute.class);
    // reset() must be called before the first incrementToken().
    ts.reset();
    assertThat(ts.incrementToken(), equalTo(true));
    assertThat(term1.toString(), equalTo(expected));
}
Also used : TokenStream(org.apache.lucene.analysis.TokenStream) CharTermAttribute(org.apache.lucene.analysis.tokenattributes.CharTermAttribute) StringReader(java.io.StringReader) Index(org.elasticsearch.index.Index) AnalysisStempelPlugin(org.elasticsearch.plugin.analysis.stempel.AnalysisStempelPlugin) KeywordTokenizer(org.apache.lucene.analysis.core.KeywordTokenizer) Tokenizer(org.apache.lucene.analysis.Tokenizer) Settings(org.elasticsearch.common.settings.Settings)

Example 3 with AnalysisStempelPlugin

Use of org.elasticsearch.plugin.analysis.stempel.AnalysisStempelPlugin in the elasticsearch project by elastic.

From the class SimplePolishTokenFilterTests, method testAnalyzer.

private void testAnalyzer(String source, String... expectedTerms) throws IOException {
    // Default settings are enough: the plugin registers the "polish" analyzer itself.
    TestAnalysis analysis = createTestAnalysis(new Index("test", "_na_"), Settings.EMPTY, new AnalysisStempelPlugin());
    Analyzer analyzer = analysis.indexAnalyzers.get("polish").analyzer();
    TokenStream ts = analyzer.tokenStream("test", source);
    CharTermAttribute term1 = ts.addAttribute(CharTermAttribute.class);
    // reset() must be called before the first incrementToken().
    ts.reset();
    // Each expected term must appear, in order, as the next token of the stream.
    for (String expected : expectedTerms) {
        assertThat(ts.incrementToken(), equalTo(true));
        assertThat(term1.toString(), equalTo(expected));
    }
}
Also used : TokenStream(org.apache.lucene.analysis.TokenStream) CharTermAttribute(org.apache.lucene.analysis.tokenattributes.CharTermAttribute) Index(org.elasticsearch.index.Index) AnalysisStempelPlugin(org.elasticsearch.plugin.analysis.stempel.AnalysisStempelPlugin) Analyzer(org.apache.lucene.analysis.Analyzer)

Aggregations

Index (org.elasticsearch.index.Index): 3
AnalysisStempelPlugin (org.elasticsearch.plugin.analysis.stempel.AnalysisStempelPlugin): 3
Analyzer (org.apache.lucene.analysis.Analyzer): 2
TokenStream (org.apache.lucene.analysis.TokenStream): 2
CharTermAttribute (org.apache.lucene.analysis.tokenattributes.CharTermAttribute): 2
StringReader (java.io.StringReader): 1
Tokenizer (org.apache.lucene.analysis.Tokenizer): 1
KeywordTokenizer (org.apache.lucene.analysis.core.KeywordTokenizer): 1
PolishAnalyzer (org.apache.lucene.analysis.pl.PolishAnalyzer): 1
Settings (org.elasticsearch.common.settings.Settings): 1
PolishStemTokenFilterFactory (org.elasticsearch.index.analysis.pl.PolishStemTokenFilterFactory): 1