Search in sources :

Example 6 with PartitionedSplitsInfo

use of io.trino.execution.PartitionedSplitsInfo in project trino by trinodb.

the class TestSourcePartitionedScheduler method testScheduleSplitsBlock.

@Test
public void testScheduleSplitsBlock() {
    PlanFragment plan = createFragment();
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    StageExecution stage = createStageExecution(plan, nodeTaskMap);
    StageScheduler scheduler = getSourcePartitionedScheduler(createFixedSplitSource(80, TestingSplit::createRemoteSplit), stage, nodeManager, nodeTaskMap, 1, STAGE);
    // schedule first 60 splits, which will cause the scheduler to block
    for (int i = 0; i <= 60; i++) {
        ScheduleResult scheduleResult = scheduler.schedule();
        assertFalse(scheduleResult.isFinished());
        // blocks at 20 per node
        assertEquals(scheduleResult.getBlocked().isDone(), i != 60);
        // first three splits create new tasks
        assertEquals(scheduleResult.getNewTasks().size(), i < 3 ? 1 : 0);
        assertEquals(stage.getAllTasks().size(), i < 3 ? i + 1 : 3);
        assertPartitionedSplitCount(stage, min(i + 1, 60));
    }
    for (RemoteTask remoteTask : stage.getAllTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 20);
    }
    // todo rewrite MockRemoteTask to fire a tate transition when splits are cleared, and then validate blocked future completes
    // drop the 20 splits from one node
    ((MockRemoteTask) stage.getAllTasks().get(0)).clearSplits();
    // schedule remaining 20 splits
    for (int i = 0; i < 20; i++) {
        ScheduleResult scheduleResult = scheduler.schedule();
        // finishes when last split is fetched
        if (i == 19) {
            assertEffectivelyFinished(scheduleResult, scheduler);
        } else {
            assertFalse(scheduleResult.isFinished());
        }
        // does not block again
        assertTrue(scheduleResult.getBlocked().isDone());
        // no additional tasks will be created
        assertEquals(scheduleResult.getNewTasks().size(), 0);
        assertEquals(stage.getAllTasks().size(), 3);
        // we dropped 20 splits so start at 40 and count to 60
        assertPartitionedSplitCount(stage, min(i + 41, 60));
    }
    for (RemoteTask remoteTask : stage.getAllTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 20);
    }
    stage.abort();
}
Also used : NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) RemoteTask(io.trino.execution.RemoteTask) PlanFragment(io.trino.sql.planner.PlanFragment) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) Test(org.testng.annotations.Test)

Example 7 with PartitionedSplitsInfo

use of io.trino.execution.PartitionedSplitsInfo in project trino by trinodb.

the class TestSourcePartitionedScheduler method testScheduleSplitsBatched.

@Test
public void testScheduleSplitsBatched() {
    PlanFragment plan = createFragment();
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    StageExecution stage = createStageExecution(plan, nodeTaskMap);
    StageScheduler scheduler = getSourcePartitionedScheduler(createFixedSplitSource(60, TestingSplit::createRemoteSplit), stage, nodeManager, nodeTaskMap, 7, STAGE);
    for (int i = 0; i <= (60 / 7); i++) {
        ScheduleResult scheduleResult = scheduler.schedule();
        // finishes when last split is fetched
        if (i == (60 / 7)) {
            assertEffectivelyFinished(scheduleResult, scheduler);
        } else {
            assertFalse(scheduleResult.isFinished());
        }
        // never blocks
        assertTrue(scheduleResult.getBlocked().isDone());
        // first three splits create new tasks
        assertEquals(scheduleResult.getNewTasks().size(), i == 0 ? 3 : 0);
        assertEquals(stage.getAllTasks().size(), 3);
        assertPartitionedSplitCount(stage, min((i + 1) * 7, 60));
    }
    for (RemoteTask remoteTask : stage.getAllTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 20);
    }
    stage.abort();
}
Also used : NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) RemoteTask(io.trino.execution.RemoteTask) PlanFragment(io.trino.sql.planner.PlanFragment) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) Test(org.testng.annotations.Test)

Example 8 with PartitionedSplitsInfo

use of io.trino.execution.PartitionedSplitsInfo in project trino by trinodb.

the class TestSourcePartitionedScheduler method testWorkerBalancedSplitAssignment.

@Test
public void testWorkerBalancedSplitAssignment() {
    // use private node manager so we can add a node later
    InMemoryNodeManager nodeManager = new InMemoryNodeManager();
    nodeManager.addNode(CONNECTOR_ID, new InternalNode("other1", URI.create("http://127.0.0.1:11"), NodeVersion.UNKNOWN, false), new InternalNode("other2", URI.create("http://127.0.0.1:12"), NodeVersion.UNKNOWN, false), new InternalNode("other3", URI.create("http://127.0.0.1:13"), NodeVersion.UNKNOWN, false));
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    // Schedule 15 splits - there are 3 nodes, each node should get 5 splits
    PlanFragment firstPlan = createFragment();
    StageExecution firstStage = createStageExecution(firstPlan, nodeTaskMap);
    StageScheduler firstScheduler = getSourcePartitionedScheduler(createFixedSplitSource(15, TestingSplit::createRemoteSplit), firstStage, nodeManager, nodeTaskMap, 200, NODE);
    ScheduleResult scheduleResult = firstScheduler.schedule();
    assertEffectivelyFinished(scheduleResult, firstScheduler);
    assertTrue(scheduleResult.getBlocked().isDone());
    assertEquals(scheduleResult.getNewTasks().size(), 3);
    assertEquals(firstStage.getAllTasks().size(), 3);
    for (RemoteTask remoteTask : firstStage.getAllTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 5);
    }
    // Add new node
    InternalNode additionalNode = new InternalNode("other4", URI.create("http://127.0.0.1:14"), NodeVersion.UNKNOWN, false);
    nodeManager.addNode(CONNECTOR_ID, additionalNode);
    // Schedule 5 splits in another query. Since the new node does not have any splits, all 5 splits are assigned to the new node
    PlanFragment secondPlan = createFragment();
    StageExecution secondStage = createStageExecution(secondPlan, nodeTaskMap);
    StageScheduler secondScheduler = getSourcePartitionedScheduler(createFixedSplitSource(5, TestingSplit::createRemoteSplit), secondStage, nodeManager, nodeTaskMap, 200, NODE);
    scheduleResult = secondScheduler.schedule();
    assertEffectivelyFinished(scheduleResult, secondScheduler);
    assertTrue(scheduleResult.getBlocked().isDone());
    assertEquals(scheduleResult.getNewTasks().size(), 1);
    assertEquals(secondStage.getAllTasks().size(), 1);
    RemoteTask task = secondStage.getAllTasks().get(0);
    assertEquals(task.getPartitionedSplitsInfo().getCount(), 5);
    firstStage.abort();
    secondStage.abort();
}
Also used : NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) RemoteTask(io.trino.execution.RemoteTask) InternalNode(io.trino.metadata.InternalNode) PlanFragment(io.trino.sql.planner.PlanFragment) InMemoryNodeManager(io.trino.metadata.InMemoryNodeManager) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) Test(org.testng.annotations.Test)

Example 9 with PartitionedSplitsInfo

use of io.trino.execution.PartitionedSplitsInfo in project trino by trinodb.

the class TestSourcePartitionedScheduler method testScheduleSplitsOneAtATime.

@Test
public void testScheduleSplitsOneAtATime() {
    PlanFragment plan = createFragment();
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    StageExecution stage = createStageExecution(plan, nodeTaskMap);
    StageScheduler scheduler = getSourcePartitionedScheduler(createFixedSplitSource(60, TestingSplit::createRemoteSplit), stage, nodeManager, nodeTaskMap, 1, STAGE);
    for (int i = 0; i < 60; i++) {
        ScheduleResult scheduleResult = scheduler.schedule();
        // only finishes when last split is fetched
        if (i == 59) {
            assertEffectivelyFinished(scheduleResult, scheduler);
        } else {
            assertFalse(scheduleResult.isFinished());
        }
        // never blocks
        assertTrue(scheduleResult.getBlocked().isDone());
        // first three splits create new tasks
        assertEquals(scheduleResult.getNewTasks().size(), i < 3 ? 1 : 0);
        assertEquals(stage.getAllTasks().size(), i < 3 ? i + 1 : 3);
        assertPartitionedSplitCount(stage, min(i + 1, 60));
    }
    for (RemoteTask remoteTask : stage.getAllTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 20);
    }
    stage.abort();
}
Also used : NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) RemoteTask(io.trino.execution.RemoteTask) PlanFragment(io.trino.sql.planner.PlanFragment) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) Test(org.testng.annotations.Test)

Example 10 with PartitionedSplitsInfo

use of io.trino.execution.PartitionedSplitsInfo in project trino by trinodb.

the class HttpRemoteTask method getQueuedPartitionedSplitsInfo.

@Override
public PartitionedSplitsInfo getQueuedPartitionedSplitsInfo() {
    TaskStatus taskStatus = getTaskStatus();
    if (taskStatus.getState().isDone()) {
        return PartitionedSplitsInfo.forZeroSplits();
    }
    PartitionedSplitsInfo unacknowledgedSplitsInfo = getUnacknowledgedPartitionedSplitsInfo();
    int count = unacknowledgedSplitsInfo.getCount() + taskStatus.getQueuedPartitionedDrivers();
    long weight = unacknowledgedSplitsInfo.getWeightSum() + taskStatus.getQueuedPartitionedSplitsWeight();
    return PartitionedSplitsInfo.forSplitCountAndWeightSum(count, weight);
}
Also used : PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) TaskStatus(io.trino.execution.TaskStatus)

Aggregations

PartitionedSplitsInfo (io.trino.execution.PartitionedSplitsInfo)10 MockRemoteTask (io.trino.execution.MockRemoteTaskFactory.MockRemoteTask)7 NodeTaskMap (io.trino.execution.NodeTaskMap)7 RemoteTask (io.trino.execution.RemoteTask)7 PipelinedStageExecution.createPipelinedStageExecution (io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution)7 SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler (io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler)7 PlanFragment (io.trino.sql.planner.PlanFragment)7 Test (org.testng.annotations.Test)7 InMemoryNodeManager (io.trino.metadata.InMemoryNodeManager)4 InternalNode (io.trino.metadata.InternalNode)4 TestingSplit (io.trino.testing.TestingSplit)3 Duration (io.airlift.units.Duration)2 DynamicFilterConfig (io.trino.execution.DynamicFilterConfig)2 TableExecuteContextManager (io.trino.execution.TableExecuteContextManager)2 TaskStatus (io.trino.execution.TaskStatus)2 DynamicFilterService (io.trino.server.DynamicFilterService)2 ConnectorAwareSplitSource (io.trino.split.ConnectorAwareSplitSource)2