Example 1 with RandomIndexWriter

Use of org.apache.lucene.tests.index.RandomIndexWriter in the OpenSearch project by opensearch-project.

Class ChildrenToParentAggregatorTests, method testTermsParentChildTerms:

public void testTermsParentChildTerms() throws IOException {
    Directory directory = newDirectory();
    RandomIndexWriter indexWriter = new RandomIndexWriter(random(), directory);
    final Map<String, Tuple<Integer, Integer>> expectedParentChildRelations = setupIndex(indexWriter);
    indexWriter.close();
    SortedMap<Integer, Long> sortedValues = new TreeMap<>();
    for (Tuple<Integer, Integer> value : expectedParentChildRelations.values()) {
        Long l = sortedValues.computeIfAbsent(value.v2(), integer -> 0L);
        sortedValues.put(value.v2(), l + 1);
    }
    IndexReader indexReader = OpenSearchDirectoryReader.wrap(DirectoryReader.open(directory), new ShardId(new Index("foo", "_na_"), 1));
    // TODO set "maybeWrap" to true for IndexSearcher once #23338 is resolved
    IndexSearcher indexSearcher = newSearcher(indexReader, false, true);
    // verify a terms-aggregation inside the parent-aggregation which itself is inside a
    // terms-aggregation on the child-documents
    testCaseTermsParentTerms(new MatchAllDocsQuery(), indexSearcher, longTerms -> {
        assertNotNull(longTerms);
        for (LongTerms.Bucket bucket : longTerms.getBuckets()) {
            assertNotNull(bucket);
            assertNotNull(bucket.getKeyAsString());
        }
    });
    indexReader.close();
    directory.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) Index(org.opensearch.index.Index) TreeMap(java.util.TreeMap) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) ShardId(org.opensearch.index.shard.ShardId) IndexReader(org.apache.lucene.index.IndexReader) LongTerms(org.opensearch.search.aggregations.bucket.terms.LongTerms) RandomIndexWriter(org.apache.lucene.tests.index.RandomIndexWriter) Tuple(org.opensearch.common.collect.Tuple) Directory(org.apache.lucene.store.Directory)
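
The loop in testTermsParentChildTerms that fills sortedValues counts how often each value occurs, with keys kept in ascending order. A minimal standalone sketch of that counting step, using only JDK types, might look like the following. The class name ValueHistogram and the method countValues are hypothetical, and Map.merge is swapped in for the computeIfAbsent-then-put pair used in the test.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper, not part of OpenSearch or Lucene.
final class ValueHistogram {

    // Counts occurrences of each value; TreeMap keeps the keys sorted ascending.
    static SortedMap<Integer, Long> countValues(List<Integer> values) {
        SortedMap<Integer, Long> counts = new TreeMap<>();
        for (Integer value : values) {
            // Inserts 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(value, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {1=1, 2=1, 3=3}
        System.out.println(countValues(List.of(3, 1, 3, 2, 3)));
    }
}
```

Map.merge does the lookup and update in one call, which avoids the separate put that the test performs after computeIfAbsent.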

Example 2 with RandomIndexWriter

Use of org.apache.lucene.tests.index.RandomIndexWriter in the OpenSearch project by opensearch-project.

Class ChildrenToParentAggregatorTests, method testNoDocs:

public void testNoDocs() throws IOException {
    Directory directory = newDirectory();
    RandomIndexWriter indexWriter = new RandomIndexWriter(random(), directory);
    // intentionally not writing any docs
    indexWriter.close();
    IndexReader indexReader = DirectoryReader.open(directory);
    testCase(new MatchAllDocsQuery(), newSearcher(indexReader, false, true), childrenToParent -> {
        assertEquals(0, childrenToParent.getDocCount());
        Aggregation parentAggregation = childrenToParent.getAggregations().get("in_parent");
        assertNotNull("Aggregations: " + childrenToParent.getAggregations().asMap(), parentAggregation);
        assertEquals(Double.POSITIVE_INFINITY, ((InternalMin) parentAggregation).getValue(), Double.MIN_VALUE);
        assertFalse(JoinAggregationInspectionHelper.hasValue(childrenToParent));
    });
    indexReader.close();
    directory.close();
}
Also used : Aggregation(org.opensearch.search.aggregations.Aggregation) IndexReader(org.apache.lucene.index.IndexReader) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) RandomIndexWriter(org.apache.lucene.tests.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)
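
testNoDocs asserts that the min metric over zero documents is Double.POSITIVE_INFINITY. That result is what you get if a minimum is computed as a reduction whose starting value is positive infinity, the identity element of min. The sketch below illustrates that idea with plain JDK streams. It is an assumption about why the value is infinite, not a copy of the OpenSearch implementation, and the class name EmptyMin is hypothetical.

```java
import java.util.stream.DoubleStream;

// Hypothetical illustration, not the OpenSearch min aggregator.
final class EmptyMin {

    // Reduces with +infinity as the identity, so an empty input yields +infinity.
    static double min(double... values) {
        return DoubleStream.of(values).reduce(Double.POSITIVE_INFINITY, Math::min);
    }

    public static void main(String[] args) {
        System.out.println(min());          // prints Infinity
        System.out.println(min(4.0, 2.5));  // prints 2.5
    }
}
```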

Example 3 with RandomIndexWriter

Use of org.apache.lucene.tests.index.RandomIndexWriter in the OpenSearch project by opensearch-project.

Class ParentToChildrenAggregatorTests, method testParentChild:

public void testParentChild() throws IOException {
    Directory directory = newDirectory();
    RandomIndexWriter indexWriter = new RandomIndexWriter(random(), directory);
    final Map<String, Tuple<Integer, Integer>> expectedParentChildRelations = setupIndex(indexWriter);
    indexWriter.close();
    IndexReader indexReader = OpenSearchDirectoryReader.wrap(DirectoryReader.open(directory), new ShardId(new Index("foo", "_na_"), 1));
    // TODO set "maybeWrap" to true for IndexSearcher once #23338 is resolved
    IndexSearcher indexSearcher = newSearcher(indexReader, false, true);
    testCase(new MatchAllDocsQuery(), indexSearcher, child -> {
        int expectedTotalChildren = 0;
        int expectedMinValue = Integer.MAX_VALUE;
        for (Tuple<Integer, Integer> expectedValues : expectedParentChildRelations.values()) {
            expectedTotalChildren += expectedValues.v1();
            expectedMinValue = Math.min(expectedMinValue, expectedValues.v2());
        }
        assertEquals(expectedTotalChildren, child.getDocCount());
        assertTrue(JoinAggregationInspectionHelper.hasValue(child));
        assertEquals(expectedMinValue, ((InternalMin) child.getAggregations().get("in_child")).getValue(), Double.MIN_VALUE);
    });
    for (String parent : expectedParentChildRelations.keySet()) {
        testCase(new TermInSetQuery(IdFieldMapper.NAME, Uid.encodeId(parent)), indexSearcher, child -> {
            assertEquals((long) expectedParentChildRelations.get(parent).v1(), child.getDocCount());
            assertEquals(expectedParentChildRelations.get(parent).v2(), ((InternalMin) child.getAggregations().get("in_child")).getValue(), Double.MIN_VALUE);
        });
    }
    indexReader.close();
    directory.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) Index(org.opensearch.index.Index) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) ShardId(org.opensearch.index.shard.ShardId) TermInSetQuery(org.apache.lucene.search.TermInSetQuery) IndexReader(org.apache.lucene.index.IndexReader) RandomIndexWriter(org.apache.lucene.tests.index.RandomIndexWriter) Tuple(org.opensearch.common.collect.Tuple) Directory(org.apache.lucene.store.Directory)
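
The first lambda in testParentChild derives its expected values by folding over every parent's tuple: it sums the first component and takes the minimum of the second. Judging by the variable names, the first component appears to be a child count and the second a minimum value, but that reading is an assumption. A standalone sketch of the fold, with Map.Entry standing in for OpenSearch's Tuple and a hypothetical class name ExpectedChildStats:

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper; Map.Entry stands in for org.opensearch.common.collect.Tuple.
final class ExpectedChildStats {

    // Returns {sum of keys, minimum of values}.
    // For an empty list the minimum stays at Integer.MAX_VALUE, as in the test.
    static int[] summarize(List<Map.Entry<Integer, Integer>> relations) {
        int total = 0;
        int min = Integer.MAX_VALUE;
        for (Map.Entry<Integer, Integer> relation : relations) {
            total += relation.getKey();
            min = Math.min(min, relation.getValue());
        }
        return new int[] { total, min };
    }

    public static void main(String[] args) {
        int[] stats = summarize(List.of(Map.entry(2, 7), Map.entry(3, 4)));
        System.out.println(stats[0] + " " + stats[1]); // prints 5 4
    }
}
```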

Example 4 with RandomIndexWriter

Use of org.apache.lucene.tests.index.RandomIndexWriter in the OpenSearch project by opensearch-project.

Class ParentToChildrenAggregatorTests, method testNoDocs:

public void testNoDocs() throws IOException {
    Directory directory = newDirectory();
    RandomIndexWriter indexWriter = new RandomIndexWriter(random(), directory);
    // intentionally not writing any docs
    indexWriter.close();
    IndexReader indexReader = DirectoryReader.open(directory);
    testCase(new MatchAllDocsQuery(), newSearcher(indexReader, false, true), parentToChild -> {
        assertEquals(0, parentToChild.getDocCount());
        assertEquals(Double.POSITIVE_INFINITY, ((InternalMin) parentToChild.getAggregations().get("in_child")).getValue(), Double.MIN_VALUE);
    });
    indexReader.close();
    directory.close();
}
Also used : IndexReader(org.apache.lucene.index.IndexReader) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) RandomIndexWriter(org.apache.lucene.tests.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)

Example 5 with RandomIndexWriter

Use of org.apache.lucene.tests.index.RandomIndexWriter in the OpenSearch project by opensearch-project.

Class AnnotatedTextHighlighterTests, method assertHighlightOneDoc:

private void assertHighlightOneDoc(String fieldName, String[] markedUpInputs, Query query, Locale locale, BreakIterator breakIterator, int noMatchSize, String[] expectedPassages) throws Exception {
    // Annotated fields wrap the usual analyzer with one that injects extra tokens
    Analyzer wrapperAnalyzer = new AnnotationAnalyzerWrapper(new StandardAnalyzer());
    Directory dir = newDirectory();
    IndexWriterConfig iwc = newIndexWriterConfig(wrapperAnalyzer);
    iwc.setMergePolicy(newTieredMergePolicy(random()));
    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwc);
    FieldType ft = new FieldType(TextField.TYPE_STORED);
    if (randomBoolean()) {
        ft.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
    } else {
        ft.setIndexOptions(IndexOptions.DOCS_AND_FREQS);
    }
    ft.freeze();
    Document doc = new Document();
    for (String input : markedUpInputs) {
        Field field = new Field(fieldName, "", ft);
        field.setStringValue(input);
        doc.add(field);
    }
    iw.addDocument(doc);
    DirectoryReader reader = iw.getReader();
    IndexSearcher searcher = newSearcher(reader);
    iw.close();
    AnnotatedText[] annotations = new AnnotatedText[markedUpInputs.length];
    for (int i = 0; i < markedUpInputs.length; i++) {
        annotations[i] = AnnotatedText.parse(markedUpInputs[i]);
    }
    AnnotatedHighlighterAnalyzer hiliteAnalyzer = new AnnotatedHighlighterAnalyzer(wrapperAnalyzer);
    hiliteAnalyzer.setAnnotations(annotations);
    AnnotatedPassageFormatter passageFormatter = new AnnotatedPassageFormatter(new DefaultEncoder());
    passageFormatter.setAnnotations(annotations);
    ArrayList<Object> plainTextForHighlighter = new ArrayList<>(annotations.length);
    for (int i = 0; i < annotations.length; i++) {
        plainTextForHighlighter.add(annotations[i].textMinusMarkup);
    }
    TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), 1, Sort.INDEXORDER);
    assertThat(topDocs.totalHits.value, equalTo(1L));
    String rawValue = Strings.collectionToDelimitedString(plainTextForHighlighter, String.valueOf(MULTIVAL_SEP_CHAR));
    CustomUnifiedHighlighter highlighter = new CustomUnifiedHighlighter(searcher, hiliteAnalyzer, null, passageFormatter, locale, breakIterator, "index", "text", query, noMatchSize, expectedPassages.length, name -> "text".equals(name), Integer.MAX_VALUE);
    highlighter.setFieldMatcher((name) -> "text".equals(name));
    final Snippet[] snippets = highlighter.highlightField(getOnlyLeafReader(reader), topDocs.scoreDocs[0].doc, () -> rawValue);
    assertEquals(expectedPassages.length, snippets.length);
    for (int i = 0; i < snippets.length; i++) {
        assertEquals(expectedPassages[i], snippets[i].getText());
    }
    reader.close();
    dir.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) AnnotatedHighlighterAnalyzer(org.opensearch.index.mapper.annotatedtext.AnnotatedTextFieldMapper.AnnotatedHighlighterAnalyzer) ArrayList(java.util.ArrayList) Analyzer(org.apache.lucene.analysis.Analyzer) StandardAnalyzer(org.apache.lucene.analysis.standard.StandardAnalyzer) Document(org.apache.lucene.document.Document) TopDocs(org.apache.lucene.search.TopDocs) Field(org.apache.lucene.document.Field) TextField(org.apache.lucene.document.TextField) DefaultEncoder(org.apache.lucene.search.highlight.DefaultEncoder) Directory(org.apache.lucene.store.Directory) DirectoryReader(org.apache.lucene.index.DirectoryReader) AnnotatedText(org.opensearch.index.mapper.annotatedtext.AnnotatedTextFieldMapper.AnnotatedText) CustomUnifiedHighlighter(org.apache.lucene.search.uhighlight.CustomUnifiedHighlighter) Snippet(org.apache.lucene.search.uhighlight.Snippet) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) FieldType(org.apache.lucene.document.FieldType) AnnotationAnalyzerWrapper(org.opensearch.index.mapper.annotatedtext.AnnotatedTextFieldMapper.AnnotationAnalyzerWrapper) RandomIndexWriter(org.apache.lucene.tests.index.RandomIndexWriter) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig)

Aggregations

RandomIndexWriter (org.apache.lucene.tests.index.RandomIndexWriter): 230
Directory (org.apache.lucene.store.Directory): 227
IndexReader (org.apache.lucene.index.IndexReader): 206
MatchAllDocsQuery (org.apache.lucene.search.MatchAllDocsQuery): 155
IndexSearcher (org.apache.lucene.search.IndexSearcher): 154
Document (org.apache.lucene.document.Document): 144
MappedFieldType (org.opensearch.index.mapper.MappedFieldType): 101
SortedNumericDocValuesField (org.apache.lucene.document.SortedNumericDocValuesField): 81
BytesRef (org.apache.lucene.util.BytesRef): 44
ArrayList (java.util.ArrayList): 43
NumericDocValuesField (org.apache.lucene.document.NumericDocValuesField): 38
LongPoint (org.apache.lucene.document.LongPoint): 36
StringField (org.apache.lucene.document.StringField): 35
IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig): 30
IOException (java.io.IOException): 26
Sort (org.apache.lucene.search.Sort): 26
ParsedQuery (org.opensearch.index.query.ParsedQuery): 26
DirectoryReader (org.apache.lucene.index.DirectoryReader): 25
SortField (org.apache.lucene.search.SortField): 24
SearchShardTask (org.opensearch.action.search.SearchShardTask): 24