
Example 11 with NodeTaskMap

use of io.trino.execution.NodeTaskMap in project trino by trinodb.

the class TestSourcePartitionedScheduler method testScheduleSplitsOneAtATime.

@Test
public void testScheduleSplitsOneAtATime() {
    PlanFragment plan = createFragment();
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    StageExecution stage = createStageExecution(plan, nodeTaskMap);
    StageScheduler scheduler = getSourcePartitionedScheduler(createFixedSplitSource(60, TestingSplit::createRemoteSplit), stage, nodeManager, nodeTaskMap, 1, STAGE);
    for (int i = 0; i < 60; i++) {
        ScheduleResult scheduleResult = scheduler.schedule();
        // only finishes when last split is fetched
        if (i == 59) {
            assertEffectivelyFinished(scheduleResult, scheduler);
        } else {
            assertFalse(scheduleResult.isFinished());
        }
        // never blocks
        assertTrue(scheduleResult.getBlocked().isDone());
        // first three splits create new tasks
        assertEquals(scheduleResult.getNewTasks().size(), i < 3 ? 1 : 0);
        assertEquals(stage.getAllTasks().size(), i < 3 ? i + 1 : 3);
        assertPartitionedSplitCount(stage, min(i + 1, 60));
    }
    for (RemoteTask remoteTask : stage.getAllTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 20);
    }
    stage.abort();
}
Also used : NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) RemoteTask(io.trino.execution.RemoteTask) PlanFragment(io.trino.sql.planner.PlanFragment) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) Test(org.testng.annotations.Test)
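
The assertions in testScheduleSplitsOneAtATime imply three worker nodes (only the first three splits create a new task) and an even final spread of 20 splits per task. The following is a minimal sketch of why least-loaded placement produces that spread. The class and the `distribute` helper are hypothetical and not part of Trino; the real scheduler chooses among randomized candidate nodes and compares split weights rather than plain counts.

```java
import java.util.Arrays;

public class SplitDistributionSketch {
    // Hypothetical helper: hands out splits one at a time, always to the
    // node that currently holds the fewest splits (lowest index wins ties).
    static int[] distribute(int splitCount, int nodeCount) {
        int[] perNode = new int[nodeCount];
        for (int i = 0; i < splitCount; i++) {
            int target = 0;
            for (int n = 1; n < nodeCount; n++) {
                if (perNode[n] < perNode[target]) {
                    target = n;
                }
            }
            perNode[target]++;
        }
        return perNode;
    }

    public static void main(String[] args) {
        // 60 splits over 3 nodes, mirroring the numbers used in the test
        System.out.println(Arrays.toString(distribute(60, 3))); // prints [20, 20, 20]
    }
}
```

With this placement rule the first three splits land on three different nodes, which matches the test's expectation that only those iterations report a new task.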

Example 12 with NodeTaskMap

use of io.trino.execution.NodeTaskMap in project trino by trinodb.

the class TestSourcePartitionedScheduler method testDynamicFiltersUnblockedOnBlockedBuildSource.

@Test
public void testDynamicFiltersUnblockedOnBlockedBuildSource() {
    PlanFragment plan = createFragment();
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    StageExecution stage = createStageExecution(plan, nodeTaskMap);
    NodeScheduler nodeScheduler = new NodeScheduler(new UniformNodeSelectorFactory(nodeManager, new NodeSchedulerConfig().setIncludeCoordinator(false), nodeTaskMap));
    DynamicFilterService dynamicFilterService = new DynamicFilterService(metadata, functionManager, typeOperators, new DynamicFilterConfig());
    dynamicFilterService.registerQuery(QUERY_ID, TEST_SESSION, ImmutableSet.of(DYNAMIC_FILTER_ID), ImmutableSet.of(DYNAMIC_FILTER_ID), ImmutableSet.of(DYNAMIC_FILTER_ID));
    StageScheduler scheduler = newSourcePartitionedSchedulerAsStageScheduler(stage, TABLE_SCAN_NODE_ID, new ConnectorAwareSplitSource(CONNECTOR_ID, createBlockedSplitSource()), new DynamicSplitPlacementPolicy(nodeScheduler.createNodeSelector(session, Optional.of(CONNECTOR_ID)), stage::getAllTasks), 2, dynamicFilterService, new TableExecuteContextManager(), () -> true);
    SymbolAllocator symbolAllocator = new SymbolAllocator();
    Symbol symbol = symbolAllocator.newSymbol("DF_SYMBOL1", BIGINT);
    DynamicFilter dynamicFilter = dynamicFilterService.createDynamicFilter(QUERY_ID, ImmutableList.of(new DynamicFilters.Descriptor(DYNAMIC_FILTER_ID, symbol.toSymbolReference())), ImmutableMap.of(symbol, new TestingColumnHandle("probeColumnA")), symbolAllocator.getTypes());
    // make sure dynamic filtering collecting task was created immediately
    assertEquals(stage.getState(), PLANNED);
    scheduler.start();
    assertEquals(stage.getAllTasks().size(), 1);
    assertEquals(stage.getState(), SCHEDULING);
    // make sure dynamic filter is initially blocked
    assertFalse(dynamicFilter.isBlocked().isDone());
    // make sure dynamic filter is unblocked due to build side source tasks being blocked
    ScheduleResult scheduleResult = scheduler.schedule();
    assertTrue(dynamicFilter.isBlocked().isDone());
    // no new probe splits should be scheduled
    assertEquals(scheduleResult.getSplitsScheduled(), 0);
}
Also used : SymbolAllocator(io.trino.sql.planner.SymbolAllocator) NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) DynamicFilter(io.trino.spi.connector.DynamicFilter) Symbol(io.trino.sql.planner.Symbol) PlanFragment(io.trino.sql.planner.PlanFragment) ConnectorAwareSplitSource(io.trino.split.ConnectorAwareSplitSource) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) TestingColumnHandle(io.trino.testing.TestingMetadata.TestingColumnHandle) TableExecuteContextManager(io.trino.execution.TableExecuteContextManager) DynamicFilterService(io.trino.server.DynamicFilterService) DynamicFilterConfig(io.trino.execution.DynamicFilterConfig) Test(org.testng.annotations.Test)

Example 13 with NodeTaskMap

use of io.trino.execution.NodeTaskMap in project trino by trinodb.

the class TestFaultTolerantStageScheduler method beforeClass.

@BeforeClass
public void beforeClass() {
    finalizerService = new FinalizerService();
    finalizerService.start();
    nodeTaskMap = new NodeTaskMap(finalizerService);
}
Also used : FinalizerService(io.trino.util.FinalizerService) NodeTaskMap(io.trino.execution.NodeTaskMap) BeforeClass(org.testng.annotations.BeforeClass)

Example 14 with NodeTaskMap

use of io.trino.execution.NodeTaskMap in project trino by trinodb.

the class UniformNodeSelector method computeAssignments.

@Override
public SplitPlacementResult computeAssignments(Set<Split> splits, List<RemoteTask> existingTasks) {
    Multimap<InternalNode, Split> assignment = HashMultimap.create();
    NodeMap nodeMap = this.nodeMap.get().get();
    NodeAssignmentStats assignmentStats = new NodeAssignmentStats(nodeTaskMap, nodeMap, existingTasks);
    ResettableRandomizedIterator<InternalNode> randomCandidates = randomizedNodes(nodeMap, includeCoordinator, ImmutableSet.of());
    Set<InternalNode> blockedExactNodes = new HashSet<>();
    boolean splitWaitingForAnyNode = false;
    // splitsToBeRedistributed becomes true only when splits go through locality-based assignment
    boolean splitsToBeRedistributed = false;
    Set<Split> remainingSplits = new HashSet<>();
    // optimizedLocalScheduling enables prioritized assignment of splits to local nodes when splits contain locality information
    if (optimizedLocalScheduling) {
        for (Split split : splits) {
            if (split.isRemotelyAccessible() && !split.getAddresses().isEmpty()) {
                List<InternalNode> candidateNodes = selectExactNodes(nodeMap, split.getAddresses(), includeCoordinator);
                Optional<InternalNode> chosenNode = candidateNodes.stream().filter(ownerNode -> assignmentStats.getTotalSplitsWeight(ownerNode) < maxSplitsWeightPerNode && assignmentStats.getUnacknowledgedSplitCountForStage(ownerNode) < maxUnacknowledgedSplitsPerTask).min(comparingLong(assignmentStats::getTotalSplitsWeight));
                if (chosenNode.isPresent()) {
                    assignment.put(chosenNode.get(), split);
                    assignmentStats.addAssignedSplit(chosenNode.get(), split.getSplitWeight());
                    splitsToBeRedistributed = true;
                    continue;
                }
            }
            remainingSplits.add(split);
        }
    } else {
        remainingSplits = splits;
    }
    for (Split split : remainingSplits) {
        randomCandidates.reset();
        List<InternalNode> candidateNodes;
        if (!split.isRemotelyAccessible()) {
            candidateNodes = selectExactNodes(nodeMap, split.getAddresses(), includeCoordinator);
        } else {
            candidateNodes = selectNodes(minCandidates, randomCandidates);
        }
        if (candidateNodes.isEmpty()) {
            log.debug("No nodes available to schedule %s. Available nodes %s", split, nodeMap.getNodesByHost().keys());
            throw new TrinoException(NO_NODES_AVAILABLE, "No nodes available to run query");
        }
        InternalNode chosenNode = chooseNodeForSplit(assignmentStats, candidateNodes);
        if (chosenNode == null) {
            long minWeight = Long.MAX_VALUE;
            for (InternalNode node : candidateNodes) {
                long queuedWeight = assignmentStats.getQueuedSplitsWeightForStage(node);
                if (queuedWeight <= minWeight && queuedWeight < maxPendingSplitsWeightPerTask && assignmentStats.getUnacknowledgedSplitCountForStage(node) < maxUnacknowledgedSplitsPerTask) {
                    chosenNode = node;
                    minWeight = queuedWeight;
                }
            }
        }
        if (chosenNode != null) {
            assignment.put(chosenNode, split);
            assignmentStats.addAssignedSplit(chosenNode, split.getSplitWeight());
        } else {
            if (split.isRemotelyAccessible()) {
                splitWaitingForAnyNode = true;
            } else if (!splitWaitingForAnyNode) {
                // Exact node set won't matter if a split is waiting for any node
                blockedExactNodes.addAll(candidateNodes);
            }
        }
    }
    ListenableFuture<Void> blocked;
    if (splitWaitingForAnyNode) {
        blocked = toWhenHasSplitQueueSpaceFuture(existingTasks, calculateLowWatermark(maxPendingSplitsWeightPerTask));
    } else {
        blocked = toWhenHasSplitQueueSpaceFuture(blockedExactNodes, existingTasks, calculateLowWatermark(maxPendingSplitsWeightPerTask));
    }
    if (splitsToBeRedistributed) {
        equateDistribution(assignment, assignmentStats, nodeMap, includeCoordinator);
    }
    return new SplitPlacementResult(blocked, assignment);
}
Also used : InternalNodeManager(io.trino.metadata.InternalNodeManager) ListenableFuture(com.google.common.util.concurrent.ListenableFuture) NodeTaskMap(io.trino.execution.NodeTaskMap) Logger(io.airlift.log.Logger) Multimap(com.google.common.collect.Multimap) AtomicReference(java.util.concurrent.atomic.AtomicReference) Supplier(java.util.function.Supplier) SplitWeight(io.trino.spi.SplitWeight) InetAddress(java.net.InetAddress) HashSet(java.util.HashSet) Preconditions.checkArgument(com.google.common.base.Preconditions.checkArgument) HashMultimap(com.google.common.collect.HashMultimap) NodeScheduler.randomizedNodes(io.trino.execution.scheduler.NodeScheduler.randomizedNodes) ImmutableList(com.google.common.collect.ImmutableList) Objects.requireNonNull(java.util.Objects.requireNonNull) Suppliers(com.google.common.base.Suppliers) NodeScheduler.selectNodes(io.trino.execution.scheduler.NodeScheduler.selectNodes) Nullable(javax.annotation.Nullable) ImmutableSet(com.google.common.collect.ImmutableSet) SplitsBalancingPolicy(io.trino.execution.scheduler.NodeSchedulerConfig.SplitsBalancingPolicy) Iterator(java.util.Iterator) Collection(java.util.Collection) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) NodeScheduler.selectDistributionNodes(io.trino.execution.scheduler.NodeScheduler.selectDistributionNodes) RemoteTask(io.trino.execution.RemoteTask) Set(java.util.Set) TrinoException(io.trino.spi.TrinoException) UnknownHostException(java.net.UnknownHostException) SetMultimap(com.google.common.collect.SetMultimap) InternalNode(io.trino.metadata.InternalNode) List(java.util.List) NodeScheduler.selectExactNodes(io.trino.execution.scheduler.NodeScheduler.selectExactNodes) Comparator.comparingLong(java.util.Comparator.comparingLong) IndexedPriorityQueue(io.trino.execution.resourcegroups.IndexedPriorityQueue) Split(io.trino.metadata.Split) Optional(java.util.Optional) NodeScheduler.calculateLowWatermark(io.trino.execution.scheduler.NodeScheduler.calculateLowWatermark) NO_NODES_AVAILABLE(io.trino.spi.StandardErrorCode.NO_NODES_AVAILABLE) VisibleForTesting(com.google.common.annotations.VisibleForTesting) NodeScheduler.toWhenHasSplitQueueSpaceFuture(io.trino.execution.scheduler.NodeScheduler.toWhenHasSplitQueueSpaceFuture) NodeScheduler.getAllNodes(io.trino.execution.scheduler.NodeScheduler.getAllNodes) HostAddress(io.trino.spi.HostAddress)
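
When chooseNodeForSplit returns null, computeAssignments falls back to a loop that picks the candidate with the smallest queued split weight, provided the node is still under both the pending-weight limit and the unacknowledged-split limit. The sketch below isolates that loop, assuming plain arrays in place of NodeAssignmentStats. The class name, the `chooseFallback` helper, and its parameters are hypothetical stand-ins, not Trino API.

```java
public class FallbackNodeChoiceSketch {
    // Hypothetical stand-in for the fallback loop: returns the index of the
    // chosen node, or -1 when every candidate is over one of the limits.
    static int chooseFallback(long[] queuedWeight, int[] unacknowledged,
                              long maxPendingWeight, int maxUnacknowledged) {
        int chosen = -1;
        long minWeight = Long.MAX_VALUE;
        for (int node = 0; node < queuedWeight.length; node++) {
            // "<=" mirrors the original comparison, so the last of several
            // equally loaded eligible nodes is the one selected
            if (queuedWeight[node] <= minWeight
                    && queuedWeight[node] < maxPendingWeight
                    && unacknowledged[node] < maxUnacknowledged) {
                chosen = node;
                minWeight = queuedWeight[node];
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Nodes 1 and 2 tie on the lowest weight; the later one is chosen
        System.out.println(chooseFallback(new long[] {5, 3, 3}, new int[] {0, 0, 0}, 10, 2)); // prints 2
        // Every node is at or above the pending-weight limit
        System.out.println(chooseFallback(new long[] {10, 12}, new int[] {0, 0}, 10, 2)); // prints -1
    }
}
```

A result of -1 corresponds to chosenNode staying null in the method above, which is the case that sets splitWaitingForAnyNode or adds the candidates to blockedExactNodes.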

Example 15 with NodeTaskMap

use of io.trino.execution.NodeTaskMap in project trino by trinodb.

the class TestHashJoinOperator method setUp.

@BeforeMethod
public void setUp() {
    // Before/AfterMethod is chosen here because the executor needs to be shut down
    // after every single test case to terminate outstanding threads, if any.
    // The line below is the same as newCachedThreadPool(daemonThreadsNamed(...)) except for the RejectedExecutionHandler.
    // RejectedExecutionHandler is set to DiscardPolicy (instead of the default AbortPolicy) here.
    // Otherwise, a large number of RejectedExecutionException will flood logging, resulting in Travis failure.
    executor = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, SECONDS, new SynchronousQueue<>(), daemonThreadsNamed("test-executor-%s"), new ThreadPoolExecutor.DiscardPolicy());
    scheduledExecutor = newScheduledThreadPool(2, daemonThreadsNamed(getClass().getSimpleName() + "-scheduledExecutor-%s"));
    NodeScheduler nodeScheduler = new NodeScheduler(new UniformNodeSelectorFactory(new InMemoryNodeManager(), new NodeSchedulerConfig().setIncludeCoordinator(true), new NodeTaskMap(new FinalizerService())));
    nodePartitioningManager = new NodePartitioningManager(nodeScheduler, new BlockTypeOperators(new TypeOperators()));
}
Also used : BlockTypeOperators(io.trino.type.BlockTypeOperators) NodeTaskMap(io.trino.execution.NodeTaskMap) FinalizerService(io.trino.util.FinalizerService) UniformNodeSelectorFactory(io.trino.execution.scheduler.UniformNodeSelectorFactory) SynchronousQueue(java.util.concurrent.SynchronousQueue) NodeScheduler(io.trino.execution.scheduler.NodeScheduler) NodeSchedulerConfig(io.trino.execution.scheduler.NodeSchedulerConfig) ThreadPoolExecutor(java.util.concurrent.ThreadPoolExecutor) NodePartitioningManager(io.trino.sql.planner.NodePartitioningManager) InMemoryNodeManager(io.trino.metadata.InMemoryNodeManager) TypeOperators(io.trino.spi.type.TypeOperators) BeforeMethod(org.testng.annotations.BeforeMethod)
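
The comments in setUp explain that DiscardPolicy replaces the default AbortPolicy so that tasks submitted after shutdown are dropped silently instead of raising RejectedExecutionException. The sketch below shows that difference using only java.util.concurrent. The class name and the `throwsAfterShutdown` helper are hypothetical, and the thread factory from the original is omitted for brevity.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionPolicySketch {
    // Hypothetical helper: reports whether submitting a task to an executor
    // that has already been shut down throws RejectedExecutionException.
    static boolean throwsAfterShutdown(RejectedExecutionHandler handler) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<>(), handler);
        executor.shutdownNow();
        try {
            executor.execute(() -> {});
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // DiscardPolicy drops the rejected task without any exception
        System.out.println(throwsAfterShutdown(new ThreadPoolExecutor.DiscardPolicy())); // prints false
        // AbortPolicy (the default) throws on rejection
        System.out.println(throwsAfterShutdown(new ThreadPoolExecutor.AbortPolicy())); // prints true
    }
}
```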

Aggregations

NodeTaskMap (io.trino.execution.NodeTaskMap): 15
PipelinedStageExecution.createPipelinedStageExecution (io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution): 11
SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler (io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler): 11
PlanFragment (io.trino.sql.planner.PlanFragment): 11
Test (org.testng.annotations.Test): 11
RemoteTask (io.trino.execution.RemoteTask): 9
MockRemoteTask (io.trino.execution.MockRemoteTaskFactory.MockRemoteTask): 8
PartitionedSplitsInfo (io.trino.execution.PartitionedSplitsInfo): 8
InMemoryNodeManager (io.trino.metadata.InMemoryNodeManager): 7
InternalNode (io.trino.metadata.InternalNode): 6
DynamicFilterConfig (io.trino.execution.DynamicFilterConfig): 4
TableExecuteContextManager (io.trino.execution.TableExecuteContextManager): 4
DynamicFilterService (io.trino.server.DynamicFilterService): 4
TestingSplit (io.trino.testing.TestingSplit): 4
FinalizerService (io.trino.util.FinalizerService): 4
Duration (io.airlift.units.Duration): 3
ConnectorAwareSplitSource (io.trino.split.ConnectorAwareSplitSource): 3
Preconditions.checkArgument (com.google.common.base.Preconditions.checkArgument): 2
ImmutableList (com.google.common.collect.ImmutableList): 2
ImmutableSet (com.google.common.collect.ImmutableSet): 2