Example 1 with OffsetSource

Use of org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource in the OpenSearch project by opensearch-project.

From the class UnifiedHighlighter, method buildHighlighter.

CustomUnifiedHighlighter buildHighlighter(FieldHighlightContext fieldContext) throws IOException {
    Encoder encoder = fieldContext.field.fieldOptions().encoder().equals("html") ? HighlightUtils.Encoders.HTML : HighlightUtils.Encoders.DEFAULT;
    int maxAnalyzedOffset = fieldContext.context.getIndexSettings().getHighlightMaxAnalyzedOffset();
    int keywordIgnoreAbove = Integer.MAX_VALUE;
    if (fieldContext.fieldType instanceof KeywordFieldMapper.KeywordFieldType) {
        KeywordFieldMapper mapper = (KeywordFieldMapper) fieldContext.context.mapperService().documentMapper().mappers().getMapper(fieldContext.fieldName);
        keywordIgnoreAbove = mapper.ignoreAbove();
    }
    int numberOfFragments = fieldContext.field.fieldOptions().numberOfFragments();
    Analyzer analyzer = getAnalyzer(fieldContext.context.mapperService().documentMapper());
    PassageFormatter passageFormatter = getPassageFormatter(fieldContext.hitContext, fieldContext.field, encoder);
    IndexSearcher searcher = fieldContext.context.searcher();
    OffsetSource offsetSource = getOffsetSource(fieldContext.fieldType);
    BreakIterator breakIterator;
    int higlighterNumberOfFragments;
    // non-tokenized fields should not use any break iterator (ignore boundaryScannerType)
    if (numberOfFragments == 0 || fieldContext.fieldType.getTextSearchInfo().isTokenized() == false) {
        /*
         * We use a control char to separate values, which is the
         * only char that the custom break iterator breaks the text
         * on, so we don't lose the distinction between the different
         * values of a field and we get back a snippet per value
         */
        breakIterator = new CustomSeparatorBreakIterator(MULTIVAL_SEP_CHAR);
        higlighterNumberOfFragments = numberOfFragments == 0 ? Integer.MAX_VALUE - 1 : numberOfFragments;
    } else {
        // using paragraph separator we make sure that each field value holds a discrete passage for highlighting
        breakIterator = getBreakIterator(fieldContext.field);
        higlighterNumberOfFragments = numberOfFragments;
    }
    return new CustomUnifiedHighlighter(
        searcher,
        analyzer,
        offsetSource,
        passageFormatter,
        fieldContext.field.fieldOptions().boundaryScannerLocale(),
        breakIterator,
        fieldContext.context.getIndexName(),
        fieldContext.fieldName,
        fieldContext.query,
        fieldContext.field.fieldOptions().noMatchSize(),
        higlighterNumberOfFragments,
        fieldMatcher(fieldContext),
        keywordIgnoreAbove,
        maxAnalyzedOffset
    );
}
Also used: IndexSearcher (org.apache.lucene.search.IndexSearcher), KeywordFieldMapper (org.opensearch.index.mapper.KeywordFieldMapper), Encoder (org.apache.lucene.search.highlight.Encoder), CustomUnifiedHighlighter (org.apache.lucene.search.uhighlight.CustomUnifiedHighlighter), Analyzer (org.apache.lucene.analysis.Analyzer), CustomPassageFormatter (org.apache.lucene.search.uhighlight.CustomPassageFormatter), PassageFormatter (org.apache.lucene.search.uhighlight.PassageFormatter), OffsetSource (org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource), CustomSeparatorBreakIterator (org.apache.lucene.search.uhighlight.CustomSeparatorBreakIterator), BreakIterator (java.text.BreakIterator)
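
The branch in buildHighlighter that picks a separator-based break iterator and adjusts the fragment count can be hard to picture without running a cluster. The following is a minimal, dependency-free sketch of that idea, not the OpenSearch or Lucene implementation: the class SeparatorPassageSketch, its methods, and the SEP constant are hypothetical names introduced here for illustration, and the value chosen for SEP is an assumption standing in for MULTIVAL_SEP_CHAR, whose definition is not shown in the snippet.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a plain-Java sketch, not the OpenSearch or Lucene code.
final class SeparatorPassageSketch {

    // Hypothetical stand-in for MULTIVAL_SEP_CHAR (assumed value, not taken from the snippet).
    static final char SEP = '\u0000';

    // Join the values of a multi-valued field into one string separated by SEP.
    static String join(List<String> values) {
        return String.join(String.valueOf(SEP), values);
    }

    // Return the [start, end) offsets of each passage, breaking only on SEP,
    // so every field value becomes exactly one passage.
    static List<int[]> passages(String text) {
        List<int[]> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= text.length(); i++) {
            if (i == text.length() || text.charAt(i) == SEP) {
                out.add(new int[] { start, i });
                start = i + 1;
            }
        }
        return out;
    }

    // Mirrors the fragment-count rule in the snippet: a requested count of 0
    // is turned into a very large limit; any other count is passed through.
    static int effectiveFragments(int numberOfFragments, boolean tokenized) {
        if (numberOfFragments == 0 || !tokenized) {
            return numberOfFragments == 0 ? Integer.MAX_VALUE - 1 : numberOfFragments;
        }
        return numberOfFragments;
    }
}
```

Calling SeparatorPassageSketch.passages on the joined values "red apple" and "green pear" gives two offset pairs, one per value, which is the "snippet per value" behaviour the comment in the snippet describes. Note that the tokenized flag only changes which break iterator is chosen; the fragment count differs from the requested one only when it is 0.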
