
Example 6 with CollectorContext

Use of io.crate.expression.reference.doc.lucene.CollectorContext in the crate/crate project.

From the class LuceneBatchIteratorTest, method testLuceneBatchIterator:

@Test
public void testLuceneBatchIterator() throws Exception {
    BatchIteratorTester tester = new BatchIteratorTester(
        () -> new LuceneBatchIterator(
            indexSearcher,
            new MatchAllDocsQuery(),
            null,
            false,
            new CollectorContext(),
            columnRefs,
            columnRefs));
    tester.verifyResultAndEdgeCaseBehaviour(expectedResult);
}
Also used : BatchIteratorTester(io.crate.testing.BatchIteratorTester) CollectorContext(io.crate.expression.reference.doc.lucene.CollectorContext) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) Test(org.junit.Test)
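
For context, here is a minimal sketch (not part of the original test class) of how the indexSearcher fixture used above could be built with plain Lucene. The field name "x", the indexed values, and the helper name are illustrative assumptions; columnRefs and expectedResult are defined by the test class and are not reproduced here.

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.ByteBuffersDirectory;

// Assumed helper, shown only for illustration.
static IndexSearcher createSearcherWithNumericDocValues() throws IOException {
    ByteBuffersDirectory directory = new ByteBuffersDirectory();
    IndexWriter writer = new IndexWriter(directory, new IndexWriterConfig(new StandardAnalyzer()));
    for (long i = 0; i < 10; i++) {
        Document doc = new Document();
        // doc-values backed column that the doc-level expressions (columnRefs) would read
        doc.add(new NumericDocValuesField("x", i));
        writer.addDocument(doc);
    }
    writer.commit();
    // Near-real-time reader on the writer, matching DirectoryReader.open(writer, true, true) used elsewhere on this page
    return new IndexSearcher(DirectoryReader.open(writer, true, true));
}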

Example 7 with CollectorContext

Use of io.crate.expression.reference.doc.lucene.CollectorContext in the crate/crate project.

From the class DocLevelExpressionsTest, method prepare:

@Before
public void prepare() throws Exception {
    SQLExecutor e = SQLExecutor.builder(clusterService)
        .addTable(createTableStatement)
        .build();
    indexEnv = new IndexEnv(
        THREAD_POOL,
        (DocTableInfo) StreamSupport.stream(e.schemas().spliterator(), false)
            .filter(x -> x instanceof DocSchemaInfo)
            .map(x -> (DocSchemaInfo) x)
            .findFirst()
            .orElseThrow(() -> new IllegalStateException("No doc schema found"))
            .getTables()
            .iterator()
            .next(),
        clusterService.state(),
        Version.CURRENT,
        createTempDir());
    IndexWriter writer = indexEnv.writer();
    insertValues(writer);
    DirectoryReader directoryReader = DirectoryReader.open(writer, true, true);
    readerContext = directoryReader.leaves().get(0);
    ctx = new CollectorContext();
}
Also used : DocTableInfo(io.crate.metadata.doc.DocTableInfo) CollectorContext(io.crate.expression.reference.doc.lucene.CollectorContext) IndexEnv(io.crate.testing.IndexEnv) DirectoryReader(org.apache.lucene.index.DirectoryReader) CrateDummyClusterServiceUnitTest(io.crate.test.integration.CrateDummyClusterServiceUnitTest) IndexWriter(org.apache.lucene.index.IndexWriter) Version(org.elasticsearch.Version) DocSchemaInfo(io.crate.metadata.doc.DocSchemaInfo) After(org.junit.After) StreamSupport(java.util.stream.StreamSupport) LeafReaderContext(org.apache.lucene.index.LeafReaderContext) SQLExecutor(io.crate.testing.SQLExecutor) Before(org.junit.Before)
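
A hedged sketch of how the ctx and readerContext prepared above are typically consumed afterwards; the LuceneCollectorExpression lifecycle calls (startCollect, setNextReader, setNextDocId, value) and the column name are assumptions and should be verified against the version of the code base at hand.

// Illustration only; the method names and the "x" column are assumptions.
LongColumnReference x = new LongColumnReference("x");
x.startCollect(ctx);             // hand over the CollectorContext built in prepare()
x.setNextReader(readerContext);  // bind the single leaf of the freshly opened reader
x.setNextDocId(0);               // position the expression on the first document
Long value = x.value();          // read the doc-values value for that document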

Example 8 with CollectorContext

Use of io.crate.expression.reference.doc.lucene.CollectorContext in the crate/crate project.

From the class DocValuesGroupByOptimizedIteratorTest, method test_group_by_doc_values_optimized_iterator_for_single_numeric_key:

@Test
public void test_group_by_doc_values_optimized_iterator_for_single_numeric_key() throws Exception {
    SumAggregation<?> sumAggregation = (SumAggregation<?>) functions.getQualified(
        Signature.aggregate(
            SumAggregation.NAME,
            DataTypes.LONG.getTypeSignature(),
            DataTypes.LONG.getTypeSignature()),
        List.of(DataTypes.LONG),
        DataTypes.LONG);
    var sumDocValuesAggregator = sumAggregation.getDocValueAggregator(
        List.of(new Reference(
            new ReferenceIdent(RelationName.fromIndexName("test"), "z"),
            RowGranularity.DOC, DataTypes.LONG, ColumnPolicy.DYNAMIC,
            IndexType.PLAIN, true, true, 0, null)),
        mock(DocTableInfo.class),
        List.of());
    var keyExpressions = List.of(new LongColumnReference("y"));
    var it = DocValuesGroupByOptimizedIterator.GroupByIterator.forSingleKey(
        List.of(sumDocValuesAggregator),
        indexSearcher,
        new Reference(
            new ReferenceIdent(RelationName.fromIndexName("test"), "y"),
            RowGranularity.DOC, DataTypes.LONG, ColumnPolicy.DYNAMIC,
            IndexType.PLAIN, true, true, 0, null),
        keyExpressions,
        RamAccounting.NO_ACCOUNTING,
        null,
        null,
        new MatchAllDocsQuery(),
        new CollectorContext());
    var rowConsumer = new TestingRowConsumer();
    rowConsumer.accept(it, null);
    assertThat(rowConsumer.getResult(), containsInAnyOrder(new Object[] { 0L, 6L }, new Object[] { 1L, 4L }));
}
Also used : DocTableInfo(io.crate.metadata.doc.DocTableInfo) BytesRefColumnReference(io.crate.expression.reference.doc.lucene.BytesRefColumnReference) AtomicReference(java.util.concurrent.atomic.AtomicReference) LongColumnReference(io.crate.expression.reference.doc.lucene.LongColumnReference) Reference(io.crate.metadata.Reference) SumAggregation(io.crate.execution.engine.aggregation.impl.SumAggregation) CollectorContext(io.crate.expression.reference.doc.lucene.CollectorContext) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) ReferenceIdent(io.crate.metadata.ReferenceIdent) TestingRowConsumer(io.crate.testing.TestingRowConsumer) CrateDummyClusterServiceUnitTest(io.crate.test.integration.CrateDummyClusterServiceUnitTest) Test(org.junit.Test)

Example 9 with CollectorContext

Use of io.crate.expression.reference.doc.lucene.CollectorContext in the crate/crate project.

From the class GroupByOptimizedIterator, method tryOptimizeSingleStringKey:

@Nullable
static BatchIterator<Row> tryOptimizeSingleStringKey(IndexShard indexShard,
                                                     DocTableInfo table,
                                                     LuceneQueryBuilder luceneQueryBuilder,
                                                     FieldTypeLookup fieldTypeLookup,
                                                     BigArrays bigArrays,
                                                     InputFactory inputFactory,
                                                     DocInputFactory docInputFactory,
                                                     RoutedCollectPhase collectPhase,
                                                     CollectTask collectTask) {
    Collection<? extends Projection> shardProjections = shardProjections(collectPhase.projections());
    GroupProjection groupProjection = getSingleStringKeyGroupProjection(shardProjections);
    if (groupProjection == null) {
        return null;
    }
    assert groupProjection.keys().size() == 1 : "Must have 1 key if getSingleStringKeyGroupProjection returned a projection";
    Reference keyRef = getKeyRef(collectPhase.toCollect(), groupProjection.keys().get(0));
    if (keyRef == null) {
        // group by on non-reference
        return null;
    }
    keyRef = (Reference) DocReferences.inverseSourceLookup(keyRef);
    MappedFieldType keyFieldType = fieldTypeLookup.get(keyRef.column().fqn());
    if (keyFieldType == null || !keyFieldType.hasDocValues()) {
        return null;
    }
    if (Symbols.containsColumn(collectPhase.toCollect(), DocSysColumns.SCORE) || Symbols.containsColumn(collectPhase.where(), DocSysColumns.SCORE)) {
        // fall back when _score is selected or filtered on,
        // to keep the optimized implementation a bit simpler
        return null;
    }
    if (hasHighCardinalityRatio(() -> indexShard.acquireSearcher("group-by-cardinality-check"), keyFieldType.name())) {
        return null;
    }
    ShardId shardId = indexShard.shardId();
    SharedShardContext sharedShardContext = collectTask.sharedShardContexts().getOrCreateContext(shardId);
    var searcher = sharedShardContext.acquireSearcher("group-by-ordinals:" + formatSource(collectPhase));
    collectTask.addSearcher(sharedShardContext.readerId(), searcher);
    final QueryShardContext queryShardContext = sharedShardContext.indexService().newQueryShardContext();
    InputFactory.Context<? extends LuceneCollectorExpression<?>> docCtx = docInputFactory.getCtx(collectTask.txnCtx());
    docCtx.add(collectPhase.toCollect().stream()::iterator);
    InputFactory.Context<CollectExpression<Row, ?>> ctxForAggregations = inputFactory.ctxForAggregations(collectTask.txnCtx());
    ctxForAggregations.add(groupProjection.values());
    final List<CollectExpression<Row, ?>> aggExpressions = ctxForAggregations.expressions();
    List<AggregationContext> aggregations = ctxForAggregations.aggregations();
    List<? extends LuceneCollectorExpression<?>> expressions = docCtx.expressions();
    RamAccounting ramAccounting = collectTask.getRamAccounting();
    CollectorContext collectorContext = new CollectorContext(sharedShardContext.readerId());
    InputRow inputRow = new InputRow(docCtx.topLevelInputs());
    LuceneQueryBuilder.Context queryContext = luceneQueryBuilder.convert(
        collectPhase.where(),
        collectTask.txnCtx(),
        indexShard.mapperService(),
        indexShard.shardId().getIndexName(),
        queryShardContext,
        table,
        sharedShardContext.indexService().cache());
    return getIterator(
        bigArrays,
        searcher.item(),
        keyRef.column().fqn(),
        aggregations,
        expressions,
        aggExpressions,
        ramAccounting,
        collectTask.memoryManager(),
        collectTask.minNodeVersion(),
        inputRow,
        queryContext.query(),
        collectorContext,
        groupProjection.mode());
}
Also used : AggregationContext(io.crate.execution.engine.aggregation.AggregationContext) InputFactory(io.crate.expression.InputFactory) RamAccounting(io.crate.breaker.RamAccounting) AtomicReference(java.util.concurrent.atomic.AtomicReference) Reference(io.crate.metadata.Reference) ShardId(org.elasticsearch.index.shard.ShardId) LuceneQueryBuilder(io.crate.lucene.LuceneQueryBuilder) MappedFieldType(org.elasticsearch.index.mapper.MappedFieldType) InputRow(io.crate.expression.InputRow) QueryShardContext(org.elasticsearch.index.query.QueryShardContext) CollectorContext(io.crate.expression.reference.doc.lucene.CollectorContext) GroupProjection(io.crate.execution.dsl.projection.GroupProjection) SharedShardContext(io.crate.execution.jobs.SharedShardContext) Nullable(javax.annotation.Nullable)

Example 10 with CollectorContext

Use of io.crate.expression.reference.doc.lucene.CollectorContext in the crate/crate project.

From the class LuceneShardCollectorProvider, method getUnorderedIterator:

@Override
protected BatchIterator<Row> getUnorderedIterator(RoutedCollectPhase collectPhase, boolean requiresScroll, CollectTask collectTask) {
    ShardId shardId = indexShard.shardId();
    SharedShardContext sharedShardContext = collectTask.sharedShardContexts().getOrCreateContext(shardId);
    var searcher = sharedShardContext.acquireSearcher("unordered-iterator: " + formatSource(collectPhase));
    collectTask.addSearcher(sharedShardContext.readerId(), searcher);
    IndexShard sharedShardContextShard = sharedShardContext.indexShard();
    // A closed shard has no mapper service and cannot be queried with lucene,
    // therefore skip it
    boolean isClosed = sharedShardContextShard.mapperService() == null;
    if (isClosed) {
        return InMemoryBatchIterator.empty(SentinelRow.SENTINEL);
    }
    QueryShardContext queryShardContext = sharedShardContext.indexService().newQueryShardContext();
    LuceneQueryBuilder.Context queryContext = luceneQueryBuilder.convert(
        collectPhase.where(),
        collectTask.txnCtx(),
        sharedShardContextShard.mapperService(),
        sharedShardContextShard.shardId().getIndexName(),
        queryShardContext,
        table,
        sharedShardContext.indexService().cache());
    InputFactory.Context<? extends LuceneCollectorExpression<?>> docCtx = docInputFactory.extractImplementations(collectTask.txnCtx(), collectPhase);
    return new LuceneBatchIterator(
        searcher.item(),
        queryContext.query(),
        queryContext.minScore(),
        Symbols.containsColumn(collectPhase.toCollect(), DocSysColumns.SCORE),
        new CollectorContext(sharedShardContext.readerId()),
        docCtx.topLevelInputs(),
        docCtx.expressions());
}
Also used : ShardId(org.elasticsearch.index.shard.ShardId) InputFactory(io.crate.expression.InputFactory) IndexShard(org.elasticsearch.index.shard.IndexShard) LuceneQueryBuilder(io.crate.lucene.LuceneQueryBuilder) QueryShardContext(org.elasticsearch.index.query.QueryShardContext) CollectorContext(io.crate.expression.reference.doc.lucene.CollectorContext) SharedShardContext(io.crate.execution.jobs.SharedShardContext) LuceneBatchIterator(io.crate.execution.engine.collect.collectors.LuceneBatchIterator)
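
As a usage note, the returned BatchIterator<Row> can be drained the same way the tests above do with TestingRowConsumer; the it variable below is assumed to hold the iterator obtained from getUnorderedIterator.

// Collect all rows produced by the iterator (test-style consumption, as in Example 8 above).
TestingRowConsumer consumer = new TestingRowConsumer();
consumer.accept(it, null);
List<Object[]> rows = consumer.getResult();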

Aggregations

CollectorContext (io.crate.expression.reference.doc.lucene.CollectorContext): 13
Reference (io.crate.metadata.Reference): 6
InputFactory (io.crate.expression.InputFactory): 5
AtomicReference (java.util.concurrent.atomic.AtomicReference): 5
MatchAllDocsQuery (org.apache.lucene.search.MatchAllDocsQuery): 5
ShardId (org.elasticsearch.index.shard.ShardId): 4
SharedShardContext (io.crate.execution.jobs.SharedShardContext): 3
LuceneQueryBuilder (io.crate.lucene.LuceneQueryBuilder): 3
ReferenceIdent (io.crate.metadata.ReferenceIdent): 3
DocTableInfo (io.crate.metadata.doc.DocTableInfo): 3
CrateDummyClusterServiceUnitTest (io.crate.test.integration.CrateDummyClusterServiceUnitTest): 3
TestingRowConsumer (io.crate.testing.TestingRowConsumer): 3
StandardAnalyzer (org.apache.lucene.analysis.standard.StandardAnalyzer): 3
Document (org.apache.lucene.document.Document): 3
NumericDocValuesField (org.apache.lucene.document.NumericDocValuesField): 3
IndexWriter (org.apache.lucene.index.IndexWriter): 3
QueryShardContext (org.elasticsearch.index.query.QueryShardContext): 3
Test (org.junit.Test): 3
OrderBy (io.crate.analyze.OrderBy): 2
RamAccounting (io.crate.breaker.RamAccounting): 2