Example 1 with ICUFoldingFilter

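Use of org.apache.lucene.analysis.icu.ICUFoldingFilter in the elasticsearch project by elastic.

The create method of the class IcuFoldingTokenFilterFactory. When a unicodeSetFilter is configured, the method builds a FilteredNormalizer2 over the utr30 normalization data and wraps it in an ICUNormalizer2Filter; otherwise it returns a plain ICUFoldingFilter.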

@Override
public TokenStream create(TokenStream tokenStream) {
    // ICUFoldingFilter has no constructor that accepts a UnicodeSet filter, so we implement the filtering here
    if (unicodeSetFilter != null) {
        Normalizer2 base = Normalizer2.getInstance(ICUFoldingFilter.class.getResourceAsStream("utr30.nrm"), "utr30", Normalizer2.Mode.COMPOSE);
        UnicodeSet unicodeSet = new UnicodeSet(unicodeSetFilter);
        unicodeSet.freeze();
        Normalizer2 filtered = new FilteredNormalizer2(base, unicodeSet);
        return new org.apache.lucene.analysis.icu.ICUNormalizer2Filter(tokenStream, filtered);
    } else {
        return new ICUFoldingFilter(tokenStream);
    }
}
Also used: FilteredNormalizer2 (com.ibm.icu.text.FilteredNormalizer2), Normalizer2 (com.ibm.icu.text.Normalizer2), UnicodeSet (com.ibm.icu.text.UnicodeSet), ICUFoldingFilter (org.apache.lucene.analysis.icu.ICUFoldingFilter)
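
The filtered branch above relies on the idea that only characters inside a given set are normalized, while everything else passes through unchanged. The following is a minimal, standard-library-only sketch of that idea, not ICU's implementation: java.text.Normalizer with NFKC stands in for the utr30 folding data, an IntPredicate stands in for the frozen UnicodeSet, and the class and method names (FilteredNormalizeSketch, normalizeFiltered) are hypothetical.

```java
import java.text.Normalizer;
import java.util.function.IntPredicate;

public class FilteredNormalizeSketch {

    // Hypothetical helper: normalize only maximal runs of code points accepted
    // by inFilter; runs outside the filter are copied through untouched.
    // Assumption: NFKC is used here purely as a stand-in for utr30 folding.
    public static String normalizeFiltered(String input, IntPredicate inFilter) {
        StringBuilder out = new StringBuilder();
        StringBuilder run = new StringBuilder();
        boolean runInFilter = false;
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            boolean accepted = inFilter.test(cp);
            // A change of filter membership closes the current run.
            if (run.length() > 0 && accepted != runInFilter) {
                flush(run, runInFilter, out);
            }
            runInFilter = accepted;
            run.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        flush(run, runInFilter, out);
        return out.toString();
    }

    // Append the pending run, normalized or verbatim, then reset it.
    private static void flush(StringBuilder run, boolean normalize, StringBuilder out) {
        if (run.length() == 0) {
            return;
        }
        String s = run.toString();
        out.append(normalize ? Normalizer.normalize(s, Normalizer.Form.NFKC) : s);
        run.setLength(0);
    }

    public static void main(String[] args) {
        // U+FB01 is the "fi" ligature, U+FF21 is fullwidth "A".
        String input = "\uFB01\uFF21";
        // Only the ligature is in the filter: it becomes "fi", U+FF21 is kept.
        System.out.println(normalizeFiltered(input, cp -> cp == 0xFB01));
        // Everything is in the filter: both characters are normalized.
        System.out.println(normalizeFiltered(input, cp -> true));
    }
}
```

Normalizing run by run rather than character by character keeps combining sequences inside the filter together, which is roughly the behavior one would expect from a filtered normalizer; the real FilteredNormalizer2 should be consulted for its exact span handling.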

Aggregations

FilteredNormalizer2 (com.ibm.icu.text.FilteredNormalizer2): 1
Normalizer2 (com.ibm.icu.text.Normalizer2): 1
UnicodeSet (com.ibm.icu.text.UnicodeSet): 1
ICUFoldingFilter (org.apache.lucene.analysis.icu.ICUFoldingFilter): 1