Example 41 with BatchIterator

use of io.crate.data.BatchIterator in project crate by crate.

the class RemoteCollectorFactory method createCollector.

/**
 * Creates a RemoteCollector.
 * The RemoteCollector will collect data from another node using a wormhole as if it were collecting on this node.
 * <p>
 * This should only be used if a shard is not available on the current node due to a relocation.
 */
public CompletableFuture<BatchIterator<Row>> createCollector(ShardId shardId, RoutedCollectPhase collectPhase, CollectTask collectTask, ShardCollectorProviderFactory shardCollectorProviderFactory) {
    ShardStateObserver shardStateObserver = new ShardStateObserver(clusterService);
    CompletableFuture<ShardRouting> shardBecameActive = shardStateObserver.waitForActiveShard(shardId);
    Runnable onClose = () -> {
    };
    Consumer<Throwable> kill = killReason -> {
        shardBecameActive.cancel(true);
        shardBecameActive.completeExceptionally(killReason);
    };
    return shardBecameActive.thenApply(activePrimaryRouting -> CollectingBatchIterator.newInstance(onClose, kill, () -> retrieveRows(activePrimaryRouting, collectPhase, collectTask, shardCollectorProviderFactory), true));
}
Also used : ShardRouting(org.elasticsearch.cluster.routing.ShardRouting) ShardId(org.elasticsearch.index.shard.ShardId) Projections(io.crate.execution.dsl.projection.Projections) ClusterService(org.elasticsearch.cluster.service.ClusterService) Buckets(io.crate.data.Buckets) BatchIterator(io.crate.data.BatchIterator) CompletableFuture(java.util.concurrent.CompletableFuture) CollectingRowConsumer(io.crate.data.CollectingRowConsumer) Inject(org.elasticsearch.common.inject.Inject) ArrayList(java.util.ArrayList) Routing(io.crate.metadata.Routing) IntArrayList(com.carrotsearch.hppc.IntArrayList) RemoteCollector(io.crate.execution.engine.collect.collectors.RemoteCollector) Map(java.util.Map) ShardStateObserver(io.crate.execution.engine.collect.collectors.ShardStateObserver) ThreadPool(org.elasticsearch.threadpool.ThreadPool) DistributionInfo(io.crate.planner.distribution.DistributionInfo) IndicesService(org.elasticsearch.indices.IndicesService) Collector(java.util.stream.Collector) ShardCollectorProviderFactory(io.crate.execution.engine.collect.sources.ShardCollectorProviderFactory) Executor(java.util.concurrent.Executor) UUIDs(org.elasticsearch.common.UUIDs) RoutedCollectPhase(io.crate.execution.dsl.phases.RoutedCollectPhase) UUID(java.util.UUID) TransportActionProvider(io.crate.execution.TransportActionProvider) Collectors(java.util.stream.Collectors) Lists2(io.crate.common.collections.Lists2) CollectingBatchIterator(io.crate.data.CollectingBatchIterator) TasksService(io.crate.execution.jobs.TasksService) Consumer(java.util.function.Consumer) Exceptions(io.crate.exceptions.Exceptions) List(java.util.List) Row(io.crate.data.Row) Singleton(org.elasticsearch.common.inject.Singleton)
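
The kill handler in createCollector first cancels the pending future and then tries to complete it exceptionally. A minimal sketch, using only java.util.concurrent, of what a dependent stage observes after that sequence. The class name KillBeforeActiveSketch, the String payload, and the "job killed" reason are hypothetical stand-ins and not part of the crate code base.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Consumer;

public class KillBeforeActiveSketch {

    /**
     * Mimics the cancel-then-completeExceptionally sequence on a future that
     * has not completed yet and returns the failure seen by a dependent stage.
     */
    static Throwable killWhilePending(Throwable killReason) {
        // Stand-in for the future that would complete once a shard becomes active.
        CompletableFuture<String> shardBecameActive = new CompletableFuture<>();

        Consumer<Throwable> kill = reason -> {
            // cancel() completes the future with a CancellationException ...
            shardBecameActive.cancel(true);
            // ... so this call finds the future already completed and has no effect.
            shardBecameActive.completeExceptionally(reason);
        };

        // Dependent stage, analogous to the thenApply(...) in the example above.
        CompletableFuture<String> collector =
            shardBecameActive.thenApply(shard -> "collector for " + shard);

        kill.accept(killReason);

        try {
            collector.join();
            return null; // not reached: the dependent stage failed
        } catch (CancellationException e) {
            return e;
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        Throwable cause = killWhilePending(new IllegalStateException("job killed"));
        System.out.println(cause.getClass().getSimpleName());
    }
}
```

In this sketch the dependent stage fails with a CancellationException, because a CompletableFuture keeps the first completion it receives.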

Example 42 with BatchIterator

use of io.crate.data.BatchIterator in project crate by crate.

the class SystemCollectSource method getIterator.

@Override
public CompletableFuture<BatchIterator<Row>> getIterator(TransactionContext txnCtx, CollectPhase phase, CollectTask collectTask, boolean supportMoveToStart) {
    RoutedCollectPhase collectPhase = (RoutedCollectPhase) phase;
    Map<String, Map<String, IntIndexedContainer>> locations = collectPhase.routing().locations();
    String table = Iterables.getOnlyElement(locations.get(clusterService.localNode().getId()).keySet());
    RelationName relationName = RelationName.fromIndexName(table);
    StaticTableDefinition<?> tableDefinition = tableDefinition(relationName);
    User user = requireNonNull(userLookup.findUser(txnCtx.sessionSettings().userName()), "User who invoked a statement must exist");
    return CompletableFuture.completedFuture(CollectingBatchIterator.newInstance(
        () -> {},
        // If data is already local, then `CollectingBatchIterator` takes care of kill handling.
        t -> {},
        () -> tableDefinition.retrieveRecords(txnCtx, user).thenApply(records -> recordsToRows(collectPhase, collectTask.txnCtx(), tableDefinition.getReferenceResolver(), supportMoveToStart, records)),
        tableDefinition.involvesIO()));
}
Also used : UserLookup(io.crate.user.UserLookup) TransactionContext(io.crate.metadata.TransactionContext) InformationSchemaTableDefinitions(io.crate.metadata.information.InformationSchemaTableDefinitions) RelationName(io.crate.metadata.RelationName) ClusterService(org.elasticsearch.cluster.service.ClusterService) BatchIterator(io.crate.data.BatchIterator) ReferenceResolver(io.crate.expression.reference.ReferenceResolver) CompletableFuture(java.util.concurrent.CompletableFuture) Function(java.util.function.Function) PgCatalogSchemaInfo(io.crate.metadata.pgcatalog.PgCatalogSchemaInfo) SysRowUpdater(io.crate.expression.reference.sys.SysRowUpdater) Inject(org.elasticsearch.common.inject.Inject) ArrayList(java.util.ArrayList) SysNodeChecks(io.crate.expression.reference.sys.check.node.SysNodeChecks) SchemaUnknownException(io.crate.exceptions.SchemaUnknownException) SysSchemaInfo(io.crate.metadata.sys.SysSchemaInfo) Map(java.util.Map) Objects.requireNonNull(java.util.Objects.requireNonNull) CollectPhase(io.crate.execution.dsl.phases.CollectPhase) PgCatalogTableDefinitions(io.crate.metadata.pgcatalog.PgCatalogTableDefinitions) InformationSchemaInfo(io.crate.metadata.information.InformationSchemaInfo) IntIndexedContainer(com.carrotsearch.hppc.IntIndexedContainer) NodeContext(io.crate.metadata.NodeContext) User(io.crate.user.User) RoutedCollectPhase(io.crate.execution.dsl.phases.RoutedCollectPhase) RowsTransformer(io.crate.execution.engine.collect.RowsTransformer) StaticTableDefinition(io.crate.expression.reference.StaticTableDefinition) Iterables(io.crate.common.collections.Iterables) CollectingBatchIterator(io.crate.data.CollectingBatchIterator) CollectTask(io.crate.execution.engine.collect.CollectTask) SysNodeChecksTableInfo(io.crate.metadata.sys.SysNodeChecksTableInfo) Row(io.crate.data.Row) SysTableDefinitions(io.crate.metadata.sys.SysTableDefinitions) UserManager(io.crate.user.UserManager) InputFactory(io.crate.expression.InputFactory) RelationUnknown(io.crate.exceptions.RelationUnknown)

Example 43 with BatchIterator

use of io.crate.data.BatchIterator in project crate by crate.

the class BatchIteratorBackpressureExecutorTest method testPauseOnFirstBatch.

@Test
public void testPauseOnFirstBatch() throws Exception {
    BatchIterator<Integer> numbersBi = InMemoryBatchIterator.of(() -> IntStream.range(0, 5).iterator(), -1, true);
    BatchSimulatingIterator<Integer> it = new BatchSimulatingIterator<>(numbersBi, 2, 5, executor);
    AtomicInteger numRows = new AtomicInteger(0);
    AtomicInteger numPauses = new AtomicInteger(0);
    Predicate<Integer> shouldPause = i -> {
        if (i == 0 && numPauses.get() == 0) {
            numPauses.incrementAndGet();
            return true;
        }
        return false;
    };
    BatchIteratorBackpressureExecutor<Integer, Integer> executor = new BatchIteratorBackpressureExecutor<>(UUID.randomUUID(), scheduler, this.executor, it, i -> CompletableFuture.supplyAsync(numRows::incrementAndGet, this.executor), (a, b) -> a + b, 0, shouldPause, null, null, ignored -> 1L);
    CompletableFuture<Integer> result = executor.consumeIteratorAndExecute();
    result.get(10, TimeUnit.SECONDS);
    assertThat(numPauses.get(), Matchers.is(1));
}
Also used : AtomicInteger(java.util.concurrent.atomic.AtomicInteger) IntStream(java.util.stream.IntStream) Predicate(java.util.function.Predicate) InMemoryBatchIterator(io.crate.data.InMemoryBatchIterator) Matchers(org.hamcrest.Matchers) CompletableFuture(java.util.concurrent.CompletableFuture) Test(org.junit.Test) BatchIterator(io.crate.data.BatchIterator) BatchSimulatingIterator(io.crate.testing.BatchSimulatingIterator) UUID(java.util.UUID) Executors(java.util.concurrent.Executors) TimeUnit(java.util.concurrent.TimeUnit) After(org.junit.After) ScheduledExecutorService(java.util.concurrent.ScheduledExecutorService) ESTestCase(org.elasticsearch.test.ESTestCase) ExecutorService(java.util.concurrent.ExecutorService) Before(org.junit.Before)
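
The shouldPause predicate in testPauseOnFirstBatch returns true exactly once, for the first item, and false afterwards. A minimal sketch of that behaviour in isolation, assuming a hypothetical consumer that simply re-offers an item after a pause. The class PauseOnceSketch and its retry loop are illustrative only and do not reproduce how BatchIteratorBackpressureExecutor schedules retries.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

public class PauseOnceSketch {

    /**
     * Feeds the items 0..numItems-1 through a "pause once on item 0" predicate
     * and returns how many pauses were requested in total.
     */
    static int countPauses(int numItems) {
        AtomicInteger numPauses = new AtomicInteger(0);

        // Same shape as the predicate in the test: pause only the first time item 0 is seen.
        Predicate<Integer> shouldPause = i -> {
            if (i == 0 && numPauses.get() == 0) {
                numPauses.incrementAndGet();
                return true;
            }
            return false;
        };

        for (int i = 0; i < numItems; i++) {
            // Hypothetical consumer: while a pause is requested, offer the same item again.
            while (shouldPause.test(i)) {
                // A real consumer would wait here before retrying.
            }
        }
        return numPauses.get();
    }

    public static void main(String[] args) {
        System.out.println(countPauses(5));
    }
}
```

Because the counter guards the predicate, re-offering item 0 after the pause does not trigger a second pause, which is what the test's final assertion on numPauses checks.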

Example 44 with BatchIterator

use of io.crate.data.BatchIterator in project crate by crate.

the class IndexWriterProjectorTest method testIndexWriter.

@Test
public void testIndexWriter() throws Throwable {
    execute("create table bulk_import (id int primary key, name string) with (number_of_replicas=0)");
    ensureGreen();
    InputCollectExpression sourceInput = new InputCollectExpression(1);
    List<CollectExpression<Row, ?>> collectExpressions = Collections.<CollectExpression<Row, ?>>singletonList(sourceInput);
    RelationName bulkImportIdent = new RelationName(sqlExecutor.getCurrentSchema(), "bulk_import");
    ClusterState state = clusterService().state();
    Settings tableSettings = TableSettingsResolver.get(state.getMetadata(), bulkImportIdent, false);
    ThreadPool threadPool = internalCluster().getInstance(ThreadPool.class);
    IndexWriterProjector writerProjector = new IndexWriterProjector(clusterService(), new NodeLimits(new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS)), new NoopCircuitBreaker("dummy"), RamAccounting.NO_ACCOUNTING, threadPool.scheduler(), threadPool.executor(ThreadPool.Names.SEARCH), CoordinatorTxnCtx.systemTransactionContext(), new NodeContext(internalCluster().getInstance(Functions.class)), Settings.EMPTY, IndexMetadata.INDEX_NUMBER_OF_SHARDS_SETTING.get(tableSettings), NumberOfReplicas.fromSettings(tableSettings, state.getNodes().getSize()), internalCluster().getInstance(TransportCreatePartitionsAction.class), internalCluster().getInstance(TransportShardUpsertAction.class)::execute, IndexNameResolver.forTable(bulkImportIdent), new Reference(new ReferenceIdent(bulkImportIdent, DocSysColumns.RAW), RowGranularity.DOC, DataTypes.STRING, 0, null), Collections.singletonList(ID_IDENT), Collections.<Symbol>singletonList(new InputColumn(0)), null, null, sourceInput, collectExpressions, 20, null, null, false, false, UUID.randomUUID(), UpsertResultContext.forRowCount(), false);
    BatchIterator rowsIterator = InMemoryBatchIterator.of(IntStream.range(0, 100).mapToObj(i -> new RowN(new Object[] { i, "{\"id\": " + i + ", \"name\": \"Arthur\"}" })).collect(Collectors.toList()), SENTINEL, true);
    TestingRowConsumer consumer = new TestingRowConsumer();
    consumer.accept(writerProjector.apply(rowsIterator), null);
    Bucket objects = consumer.getBucket();
    assertThat(objects, contains(isRow(100L)));
    execute("refresh table bulk_import");
    execute("select count(*) from bulk_import");
    assertThat(response.rowCount(), is(1L));
    assertThat(response.rows()[0][0], is(100L));
}
Also used : TransportCreatePartitionsAction(org.elasticsearch.action.admin.indices.create.TransportCreatePartitionsAction) ClusterState(org.elasticsearch.cluster.ClusterState) ClusterSettings(org.elasticsearch.common.settings.ClusterSettings) NodeContext(io.crate.metadata.NodeContext) Reference(io.crate.metadata.Reference) ThreadPool(org.elasticsearch.threadpool.ThreadPool) BatchIterator(io.crate.data.BatchIterator) InMemoryBatchIterator(io.crate.data.InMemoryBatchIterator) CollectExpression(io.crate.execution.engine.collect.CollectExpression) InputCollectExpression(io.crate.execution.engine.collect.InputCollectExpression) ReferenceIdent(io.crate.metadata.ReferenceIdent) RowN(io.crate.data.RowN) Bucket(io.crate.data.Bucket) InputColumn(io.crate.expression.symbol.InputColumn) NodeLimits(io.crate.execution.jobs.NodeLimits) RelationName(io.crate.metadata.RelationName) NoopCircuitBreaker(org.elasticsearch.common.breaker.NoopCircuitBreaker) Settings(org.elasticsearch.common.settings.Settings) TestingRowConsumer(io.crate.testing.TestingRowConsumer) Test(org.junit.Test)

Example 45 with BatchIterator

use of io.crate.data.BatchIterator in project crate by crate.

the class HashInnerJoinBatchIteratorTest method testInnerHashJoinWithBlockSizeSmallerThanDataSet.

@Test
public void testInnerHashJoinWithBlockSizeSmallerThanDataSet() throws Exception {
    Supplier<BatchIterator<Row>> batchIteratorSupplier = () -> new HashInnerJoinBatchIterator(leftIterator.get(), rightIterator.get(), mock(RowAccounting.class), new CombinedRow(1, 1), getCol0EqCol1JoinCondition(), getHashForLeft(), getHashForRight(), () -> 1);
    BatchIteratorTester tester = new BatchIteratorTester(batchIteratorSupplier);
    tester.verifyResultAndEdgeCaseBehaviour(expectedResult);
}
Also used : RowAccounting(io.crate.breaker.RowAccounting) BatchIteratorTester(io.crate.testing.BatchIteratorTester) BatchIterator(io.crate.data.BatchIterator) CombinedRow(io.crate.data.join.CombinedRow) Test(org.junit.Test)

Aggregations

BatchIterator (io.crate.data.BatchIterator): 50
Test (org.junit.Test): 37
BatchIteratorTester (io.crate.testing.BatchIteratorTester): 22
InMemoryBatchIterator (io.crate.data.InMemoryBatchIterator): 17
Row (io.crate.data.Row): 16
ArrayList (java.util.ArrayList): 10
CrateUnitTest (io.crate.test.integration.CrateUnitTest): 8
List (java.util.List): 8
Map (java.util.Map): 7
CompletableFuture (java.util.concurrent.CompletableFuture): 7
Bucket (io.crate.data.Bucket): 6
InputFactory (io.crate.expression.InputFactory): 6
Symbol (io.crate.analyze.symbol.Symbol): 4
RowAccounting (io.crate.breaker.RowAccounting): 4
RowN (io.crate.data.RowN): 4
CombinedRow (io.crate.data.join.CombinedRow): 4
InputFactory (io.crate.operation.InputFactory): 4
TestingHelpers.isRow (io.crate.testing.TestingHelpers.isRow): 4
UUID (java.util.UUID): 4
ClusterService (org.elasticsearch.cluster.service.ClusterService): 4