Example 16 with InMemoryNodeManager

Use of io.trino.metadata.InMemoryNodeManager in project trino by trinodb.

From class TestSourcePartitionedScheduler, method testNewTaskScheduledWhenChildStageBufferIsUnderutilized.

@Test
public void testNewTaskScheduledWhenChildStageBufferIsUnderutilized() {
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    // use private node manager so we can add a node later
    InMemoryNodeManager nodeManager = new InMemoryNodeManager();
    nodeManager.addNode(
            CONNECTOR_ID,
            new InternalNode("other1", URI.create("http://127.0.0.1:11"), NodeVersion.UNKNOWN, false),
            new InternalNode("other2", URI.create("http://127.0.0.1:12"), NodeVersion.UNKNOWN, false),
            new InternalNode("other3", URI.create("http://127.0.0.1:13"), NodeVersion.UNKNOWN, false));
    NodeScheduler nodeScheduler = new NodeScheduler(new UniformNodeSelectorFactory(
            nodeManager,
            new NodeSchedulerConfig().setIncludeCoordinator(false),
            nodeTaskMap,
            new Duration(0, SECONDS)));
    PlanFragment plan = createFragment();
    StageExecution stage = createStageExecution(plan, nodeTaskMap);
    // the () -> false supplier models an underutilized child output buffer (no downstream task is blocked)
    StageScheduler scheduler = newSourcePartitionedSchedulerAsStageScheduler(
            stage,
            TABLE_SCAN_NODE_ID,
            new ConnectorAwareSplitSource(CONNECTOR_ID, createFixedSplitSource(500, TestingSplit::createRemoteSplit)),
            new DynamicSplitPlacementPolicy(nodeScheduler.createNodeSelector(session, Optional.of(CONNECTOR_ID)), stage::getAllTasks),
            500,
            new DynamicFilterService(metadata, functionManager, typeOperators, new DynamicFilterConfig()),
            new TableExecuteContextManager(),
            () -> false);
    // the queues of 3 running nodes should be full
    ScheduleResult scheduleResult = scheduler.schedule();
    assertEquals(scheduleResult.getBlockedReason().get(), SPLIT_QUEUES_FULL);
    assertEquals(scheduleResult.getNewTasks().size(), 3);
    assertEquals(scheduleResult.getSplitsScheduled(), 300);
    for (RemoteTask remoteTask : scheduleResult.getNewTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 100);
    }
    // new node added - the pending splits should go to it since the child tasks are not blocked
    nodeManager.addNode(CONNECTOR_ID, new InternalNode("other4", URI.create("http://127.0.0.4:14"), NodeVersion.UNKNOWN, false));
    scheduleResult = scheduler.schedule();
    // the split queues are full, but source task creation is still not blocked
    assertEquals(scheduleResult.getBlockedReason().get(), SPLIT_QUEUES_FULL);
    assertEquals(scheduleResult.getNewTasks().size(), 1);
    assertEquals(scheduleResult.getSplitsScheduled(), 100);
}
Also used : NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) RemoteTask(io.trino.execution.RemoteTask) Duration(io.airlift.units.Duration) PlanFragment(io.trino.sql.planner.PlanFragment) ConnectorAwareSplitSource(io.trino.split.ConnectorAwareSplitSource) InMemoryNodeManager(io.trino.metadata.InMemoryNodeManager) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) TableExecuteContextManager(io.trino.execution.TableExecuteContextManager) InternalNode(io.trino.metadata.InternalNode) DynamicFilterService(io.trino.server.DynamicFilterService) TestingSplit(io.trino.testing.TestingSplit) DynamicFilterConfig(io.trino.execution.DynamicFilterConfig) Test(org.testng.annotations.Test)
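
The pattern above leans on InMemoryNodeManager being mutable at runtime: nodes can be registered after schedulers have already been built on top of it. A minimal sketch of just that pattern, using the same Trino-internal constructors as the example (the class name, catalog name, node identifiers, and ports below are hypothetical):

import io.trino.client.NodeVersion;
import io.trino.connector.CatalogName;
import io.trino.metadata.InMemoryNodeManager;
import io.trino.metadata.InternalNode;

import java.net.URI;

public class InMemoryNodeManagerSketch
{
    public static void main(String[] args)
    {
        CatalogName catalog = new CatalogName("test_catalog"); // hypothetical catalog
        InMemoryNodeManager nodeManager = new InMemoryNodeManager();

        // seed the in-memory "cluster" with a single worker
        nodeManager.addNode(catalog, new InternalNode(
                "worker1", URI.create("http://127.0.0.1:8081"), NodeVersion.UNKNOWN, false));

        // simulate a node joining later, as the test does after the first schedule() call
        nodeManager.addNode(catalog, new InternalNode(
                "worker2", URI.create("http://127.0.0.1:8082"), NodeVersion.UNKNOWN, false));

        // newly added nodes are visible immediately to any scheduler built on this manager
        // (read back via getAllNodes(); accessor names assumed from this Trino version)
        System.out.println(nodeManager.getAllNodes().getActiveNodes());
    }
}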

Example 17 with InMemoryNodeManager

Use of io.trino.metadata.InMemoryNodeManager in project trino by trinodb.

From class TestSourcePartitionedScheduler, method testNoNewTaskScheduledWhenChildStageBufferIsOverutilized.

@Test
public void testNoNewTaskScheduledWhenChildStageBufferIsOverutilized() {
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    // use private node manager so we can add a node later
    InMemoryNodeManager nodeManager = new InMemoryNodeManager();
    nodeManager.addNode(
            CONNECTOR_ID,
            new InternalNode("other1", URI.create("http://127.0.0.1:11"), NodeVersion.UNKNOWN, false),
            new InternalNode("other2", URI.create("http://127.0.0.1:12"), NodeVersion.UNKNOWN, false),
            new InternalNode("other3", URI.create("http://127.0.0.1:13"), NodeVersion.UNKNOWN, false));
    NodeScheduler nodeScheduler = new NodeScheduler(new UniformNodeSelectorFactory(
            nodeManager,
            new NodeSchedulerConfig().setIncludeCoordinator(false),
            nodeTaskMap,
            new Duration(0, SECONDS)));
    PlanFragment plan = createFragment();
    StageExecution stage = createStageExecution(plan, nodeTaskMap);
    // the () -> true supplier models an overutilized child output buffer (a downstream task is blocked)
    StageScheduler scheduler = newSourcePartitionedSchedulerAsStageScheduler(
            stage,
            TABLE_SCAN_NODE_ID,
            new ConnectorAwareSplitSource(CONNECTOR_ID, createFixedSplitSource(400, TestingSplit::createRemoteSplit)),
            new DynamicSplitPlacementPolicy(nodeScheduler.createNodeSelector(session, Optional.of(CONNECTOR_ID)), stage::getAllTasks),
            400,
            new DynamicFilterService(metadata, functionManager, typeOperators, new DynamicFilterConfig()),
            new TableExecuteContextManager(),
            () -> true);
    // the queues of 3 running nodes should be full
    ScheduleResult scheduleResult = scheduler.schedule();
    assertEquals(scheduleResult.getBlockedReason().get(), SPLIT_QUEUES_FULL);
    assertEquals(scheduleResult.getNewTasks().size(), 3);
    assertEquals(scheduleResult.getSplitsScheduled(), 300);
    for (RemoteTask remoteTask : scheduleResult.getNewTasks()) {
        PartitionedSplitsInfo splitsInfo = remoteTask.getPartitionedSplitsInfo();
        assertEquals(splitsInfo.getCount(), 100);
    }
    // a new node is added, but a child's output buffer is overutilized, so no new tasks are created
    nodeManager.addNode(CONNECTOR_ID, new InternalNode("other4", URI.create("http://127.0.0.4:14"), NodeVersion.UNKNOWN, false));
    scheduleResult = scheduler.schedule();
    assertEquals(scheduleResult.getBlockedReason().get(), SPLIT_QUEUES_FULL);
    assertEquals(scheduleResult.getNewTasks().size(), 0);
    assertEquals(scheduleResult.getSplitsScheduled(), 0);
}
Also used : NodeTaskMap(io.trino.execution.NodeTaskMap) PipelinedStageExecution.createPipelinedStageExecution(io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution) PartitionedSplitsInfo(io.trino.execution.PartitionedSplitsInfo) MockRemoteTask(io.trino.execution.MockRemoteTaskFactory.MockRemoteTask) RemoteTask(io.trino.execution.RemoteTask) Duration(io.airlift.units.Duration) PlanFragment(io.trino.sql.planner.PlanFragment) ConnectorAwareSplitSource(io.trino.split.ConnectorAwareSplitSource) InMemoryNodeManager(io.trino.metadata.InMemoryNodeManager) SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler(io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler) TableExecuteContextManager(io.trino.execution.TableExecuteContextManager) InternalNode(io.trino.metadata.InternalNode) DynamicFilterService(io.trino.server.DynamicFilterService) TestingSplit(io.trino.testing.TestingSplit) DynamicFilterConfig(io.trino.execution.DynamicFilterConfig) Test(org.testng.annotations.Test)
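
The only functional difference from Example 16 is the final BooleanSupplier argument: () -> true instead of () -> false. A standalone illustration of that knob, with semantics read off the two tests' comments and assertions (the class and the describe helper are hypothetical):

import java.util.function.BooleanSupplier;

public class BufferSignalSketch
{
    // mirrors the last argument of newSourcePartitionedSchedulerAsStageScheduler:
    // it reports whether any downstream source task is blocked on its output buffer
    static String describe(BooleanSupplier anySourceTaskBlocked)
    {
        return anySourceTaskBlocked.getAsBoolean()
                ? "overutilized: no new tasks are created, no further splits scheduled"
                : "underutilized: pending splits may spawn tasks on newly added nodes";
    }

    public static void main(String[] args)
    {
        System.out.println(describe(() -> false)); // Example 16's scheduler
        System.out.println(describe(() -> true));  // Example 17's scheduler
    }
}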

Example 18 with InMemoryNodeManager

Use of io.trino.metadata.InMemoryNodeManager in project trino by trinodb.

From class TestLocalExchange, method setUp.

@BeforeMethod
public void setUp() {
    NodeScheduler nodeScheduler = new NodeScheduler(new UniformNodeSelectorFactory(
            new InMemoryNodeManager(),
            new NodeSchedulerConfig().setIncludeCoordinator(true),
            new NodeTaskMap(new FinalizerService())));
    nodePartitioningManager = new NodePartitioningManager(nodeScheduler, new BlockTypeOperators(new TypeOperators()));
}
Also used : BlockTypeOperators(io.trino.type.BlockTypeOperators) NodeTaskMap(io.trino.execution.NodeTaskMap) FinalizerService(io.trino.util.FinalizerService) UniformNodeSelectorFactory(io.trino.execution.scheduler.UniformNodeSelectorFactory) NodeScheduler(io.trino.execution.scheduler.NodeScheduler) NodeSchedulerConfig(io.trino.execution.scheduler.NodeSchedulerConfig) NodePartitioningManager(io.trino.sql.planner.NodePartitioningManager) InMemoryNodeManager(io.trino.metadata.InMemoryNodeManager) TypeOperators(io.trino.spi.type.TypeOperators) BlockTypeOperators(io.trino.type.BlockTypeOperators) BeforeMethod(org.testng.annotations.BeforeMethod)
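
Unlike Examples 16 and 17, this setup passes setIncludeCoordinator(true), which fits local-exchange tests where a single node, coordinator included, runs all the work. The same wiring extracted as a reusable fixture, a sketch using only the constructors already shown in the example (the class name is hypothetical):

import io.trino.execution.NodeTaskMap;
import io.trino.execution.scheduler.NodeScheduler;
import io.trino.execution.scheduler.NodeSchedulerConfig;
import io.trino.execution.scheduler.UniformNodeSelectorFactory;
import io.trino.metadata.InMemoryNodeManager;
import io.trino.util.FinalizerService;
import org.testng.annotations.BeforeMethod;

public class LocalExchangeFixture
{
    private NodeScheduler nodeScheduler;

    @BeforeMethod
    public void setUp()
    {
        // a fresh InMemoryNodeManager per test keeps node membership isolated between methods
        nodeScheduler = new NodeScheduler(new UniformNodeSelectorFactory(
                new InMemoryNodeManager(),
                new NodeSchedulerConfig().setIncludeCoordinator(true),
                new NodeTaskMap(new FinalizerService())));
    }
}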

Example 19 with InMemoryNodeManager

Use of io.trino.metadata.InMemoryNodeManager in project trino by trinodb.

From class TestNodeScheduler, method testTopologyAwareScheduling.

@Test(timeOut = 60 * 1000)
public void testTopologyAwareScheduling() {
    NodeTaskMap nodeTaskMap = new NodeTaskMap(finalizerService);
    InMemoryNodeManager nodeManager = new InMemoryNodeManager();
    ImmutableList.Builder<InternalNode> nodeBuilder = ImmutableList.builder();
    nodeBuilder.add(new InternalNode("node1", URI.create("http://host1.rack1:11"), NodeVersion.UNKNOWN, false));
    nodeBuilder.add(new InternalNode("node2", URI.create("http://host2.rack1:12"), NodeVersion.UNKNOWN, false));
    nodeBuilder.add(new InternalNode("node3", URI.create("http://host3.rack2:13"), NodeVersion.UNKNOWN, false));
    ImmutableList<InternalNode> nodes = nodeBuilder.build();
    nodeManager.addNode(CONNECTOR_ID, nodes);
    // contents of taskMap indicate the node-task map for the current stage
    Map<InternalNode, RemoteTask> taskMap = new HashMap<>();
    NodeSchedulerConfig nodeSchedulerConfig = new NodeSchedulerConfig()
            .setMaxSplitsPerNode(25)
            .setIncludeCoordinator(false)
            .setMaxPendingSplitsPerTask(20);
    TestNetworkTopology topology = new TestNetworkTopology();
    NodeSelectorFactory nodeSelectorFactory = new TopologyAwareNodeSelectorFactory(topology, nodeManager, nodeSchedulerConfig, nodeTaskMap, getNetworkTopologyConfig());
    NodeScheduler nodeScheduler = new NodeScheduler(nodeSelectorFactory);
    NodeSelector nodeSelector = nodeScheduler.createNodeSelector(session, Optional.of(CONNECTOR_ID));
    // Fill up the nodes with non-local data
    ImmutableSet.Builder<Split> nonRackLocalBuilder = ImmutableSet.builder();
    for (int i = 0; i < (25 + 11) * 3; i++) {
        nonRackLocalBuilder.add(new Split(CONNECTOR_ID, new TestSplitRemote(HostAddress.fromParts("data.other_rack", 1)), Lifespan.taskWide()));
    }
    Set<Split> nonRackLocalSplits = nonRackLocalBuilder.build();
    Multimap<InternalNode, Split> assignments = nodeSelector.computeAssignments(nonRackLocalSplits, ImmutableList.copyOf(taskMap.values())).getAssignments();
    MockRemoteTaskFactory remoteTaskFactory = new MockRemoteTaskFactory(remoteTaskExecutor, remoteTaskScheduledExecutor);
    int task = 0;
    for (InternalNode node : assignments.keySet()) {
        TaskId taskId = new TaskId(new StageId("test", 1), task, 0);
        task++;
        MockRemoteTaskFactory.MockRemoteTask remoteTask = remoteTaskFactory.createTableScanTask(
                taskId,
                node,
                ImmutableList.copyOf(assignments.get(node)),
                nodeTaskMap.createPartitionedSplitCountTracker(node, taskId));
        remoteTask.startSplits(25);
        nodeTaskMap.addTask(node, remoteTask);
        taskMap.put(node, remoteTask);
    }
    // Continue assigning to fill up part of the queue
    nonRackLocalSplits = Sets.difference(nonRackLocalSplits, new HashSet<>(assignments.values()));
    assignments = nodeSelector.computeAssignments(nonRackLocalSplits, ImmutableList.copyOf(taskMap.values())).getAssignments();
    for (InternalNode node : assignments.keySet()) {
        RemoteTask remoteTask = taskMap.get(node);
        remoteTask.addSplits(ImmutableMultimap.<PlanNodeId, Split>builder().putAll(new PlanNodeId("sourceId"), assignments.get(node)).build());
    }
    nonRackLocalSplits = Sets.difference(nonRackLocalSplits, new HashSet<>(assignments.values()));
    // Check that 3 of the splits were rejected, since they're non-local
    assertEquals(nonRackLocalSplits.size(), 3);
    // Assign rack-local splits
    ImmutableSet.Builder<Split> rackLocalSplits = ImmutableSet.builder();
    HostAddress dataHost1 = HostAddress.fromParts("data.rack1", 1);
    HostAddress dataHost2 = HostAddress.fromParts("data.rack2", 1);
    for (int i = 0; i < 6 * 2; i++) {
        rackLocalSplits.add(new Split(CONNECTOR_ID, new TestSplitRemote(dataHost1), Lifespan.taskWide()));
    }
    for (int i = 0; i < 6; i++) {
        rackLocalSplits.add(new Split(CONNECTOR_ID, new TestSplitRemote(dataHost2), Lifespan.taskWide()));
    }
    assignments = nodeSelector.computeAssignments(rackLocalSplits.build(), ImmutableList.copyOf(taskMap.values())).getAssignments();
    for (InternalNode node : assignments.keySet()) {
        RemoteTask remoteTask = taskMap.get(node);
        remoteTask.addSplits(ImmutableMultimap.<PlanNodeId, Split>builder().putAll(new PlanNodeId("sourceId"), assignments.get(node)).build());
    }
    Set<Split> unassigned = Sets.difference(rackLocalSplits.build(), new HashSet<>(assignments.values()));
    // Compute the assignments a second time to account for the fact that some splits may not have been assigned due to asynchronous
    // loading of the NetworkLocationCache
    assignments = nodeSelector.computeAssignments(unassigned, ImmutableList.copyOf(taskMap.values())).getAssignments();
    for (InternalNode node : assignments.keySet()) {
        RemoteTask remoteTask = taskMap.get(node);
        remoteTask.addSplits(ImmutableMultimap.<PlanNodeId, Split>builder().putAll(new PlanNodeId("sourceId"), assignments.get(node)).build());
    }
    unassigned = Sets.difference(unassigned, new HashSet<>(assignments.values()));
    assertEquals(unassigned.size(), 3);
    int rack1 = 0;
    int rack2 = 0;
    for (Split split : unassigned) {
        String rack = topology.locate(split.getAddresses().get(0)).getSegments().get(0);
        switch(rack) {
            case "rack1":
                rack1++;
                break;
            case "rack2":
                rack2++;
                break;
            default:
                throw new AssertionError("Unexpected rack: " + rack);
        }
    }
    assertEquals(rack1, 2);
    assertEquals(rack2, 1);
    // Assign local splits
    ImmutableSet.Builder<Split> localSplits = ImmutableSet.builder();
    localSplits.add(new Split(CONNECTOR_ID, new TestSplitRemote(HostAddress.fromParts("host1.rack1", 1)), Lifespan.taskWide()));
    localSplits.add(new Split(CONNECTOR_ID, new TestSplitRemote(HostAddress.fromParts("host2.rack1", 1)), Lifespan.taskWide()));
    localSplits.add(new Split(CONNECTOR_ID, new TestSplitRemote(HostAddress.fromParts("host3.rack2", 1)), Lifespan.taskWide()));
    assignments = nodeSelector.computeAssignments(localSplits.build(), ImmutableList.copyOf(taskMap.values())).getAssignments();
    assertEquals(assignments.size(), 3);
    assertEquals(assignments.keySet().size(), 3);
}
Also used : HashMap(java.util.HashMap) ImmutableList(com.google.common.collect.ImmutableList) TopologyAwareNodeSelectorFactory(io.trino.execution.scheduler.TopologyAwareNodeSelectorFactory) NodeSelectorFactory(io.trino.execution.scheduler.NodeSelectorFactory) UniformNodeSelectorFactory(io.trino.execution.scheduler.UniformNodeSelectorFactory) NodeSchedulerConfig(io.trino.execution.scheduler.NodeSchedulerConfig) HostAddress(io.trino.spi.HostAddress) PlanNodeId(io.trino.sql.planner.plan.PlanNodeId) ImmutableSet(com.google.common.collect.ImmutableSet) NodeScheduler(io.trino.execution.scheduler.NodeScheduler) HashSet(java.util.HashSet) LinkedHashSet(java.util.LinkedHashSet) InMemoryNodeManager(io.trino.metadata.InMemoryNodeManager) InternalNode(io.trino.metadata.InternalNode) NodeSelector(io.trino.execution.scheduler.NodeSelector) UniformNodeSelector(io.trino.execution.scheduler.UniformNodeSelector) Split(io.trino.metadata.Split) ConnectorSplit(io.trino.spi.connector.ConnectorSplit) TopologyAwareNodeSelectorFactory(io.trino.execution.scheduler.TopologyAwareNodeSelectorFactory) Test(org.testng.annotations.Test)
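
The split counts in this test reward a quick back-of-envelope check. A sketch of the arithmetic implied by the assertions (the per-node pending figure of 10 is inferred from the numbers rather than read from the config, since maxPendingSplitsPerTask is 20; the topology-aware selector evidently admits fewer pending splits for off-rack work):

public class TopologySplitArithmetic
{
    public static void main(String[] args)
    {
        int offered = (25 + 11) * 3; // 108 non-rack-local splits created by the test
        int running = 3 * 25;        // each of the 3 nodes fills maxSplitsPerNode
        int pending = 3 * 10;        // inferred per-node pending acceptance for remote splits
        // 108 - 75 - 30 = 3 rejected, matching assertEquals(nonRackLocalSplits.size(), 3)
        System.out.println(offered - running - pending);
    }
}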

Example 20 with InMemoryNodeManager

Use of io.trino.metadata.InMemoryNodeManager in project trino by trinodb.

From class TestNodeScheduler, method setUp.

@BeforeMethod
public void setUp() {
    session = TestingSession.testSessionBuilder().build();
    finalizerService = new FinalizerService();
    nodeTaskMap = new NodeTaskMap(finalizerService);
    nodeManager = new InMemoryNodeManager();
    nodeSchedulerConfig = new NodeSchedulerConfig()
            .setMaxSplitsPerNode(20)
            .setIncludeCoordinator(false)
            .setMaxPendingSplitsPerTask(10);
    nodeScheduler = new NodeScheduler(new UniformNodeSelectorFactory(nodeManager, nodeSchedulerConfig, nodeTaskMap));
    // contents of taskMap indicate the node-task map for the current stage
    taskMap = new HashMap<>();
    nodeSelector = nodeScheduler.createNodeSelector(session, Optional.of(CONNECTOR_ID));
    remoteTaskExecutor = newCachedThreadPool(daemonThreadsNamed("remoteTaskExecutor-%s"));
    remoteTaskScheduledExecutor = newScheduledThreadPool(2, daemonThreadsNamed("remoteTaskScheduledExecutor-%s"));
    finalizerService.start();
}
Also used : FinalizerService(io.trino.util.FinalizerService) UniformNodeSelectorFactory(io.trino.execution.scheduler.UniformNodeSelectorFactory) NodeSchedulerConfig(io.trino.execution.scheduler.NodeSchedulerConfig) NodeScheduler(io.trino.execution.scheduler.NodeScheduler) InMemoryNodeManager(io.trino.metadata.InMemoryNodeManager) BeforeMethod(org.testng.annotations.BeforeMethod)
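
setUp starts two executors and a FinalizerService, so a matching tearDown is needed to avoid leaking threads between test classes. A sketch of what that would look like, assuming the same fields as the setUp above and that FinalizerService exposes a destroy() method (not shown in this listing):

import org.testng.annotations.AfterMethod;

// inside the same test class as the setUp above
@AfterMethod(alwaysRun = true)
public void tearDown()
{
    remoteTaskExecutor.shutdownNow();
    remoteTaskScheduledExecutor.shutdownNow();
    finalizerService.destroy(); // assumed counterpart to finalizerService.start()
}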

Aggregations

InMemoryNodeManager (io.trino.metadata.InMemoryNodeManager): 28
Test (org.testng.annotations.Test): 20
InternalNode (io.trino.metadata.InternalNode): 7
NodeTaskMap (io.trino.execution.NodeTaskMap): 6
NodeScheduler (io.trino.execution.scheduler.NodeScheduler): 5
NodeSchedulerConfig (io.trino.execution.scheduler.NodeSchedulerConfig): 5
UniformNodeSelectorFactory (io.trino.execution.scheduler.UniformNodeSelectorFactory): 5
PlanFragment (io.trino.sql.planner.PlanFragment): 5
CatalogName (io.trino.connector.CatalogName): 4
MockRemoteTask (io.trino.execution.MockRemoteTaskFactory.MockRemoteTask): 4
PartitionedSplitsInfo (io.trino.execution.PartitionedSplitsInfo): 4
RemoteTask (io.trino.execution.RemoteTask): 4
PipelinedStageExecution.createPipelinedStageExecution (io.trino.execution.scheduler.PipelinedStageExecution.createPipelinedStageExecution): 4
SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler (io.trino.execution.scheduler.SourcePartitionedScheduler.newSourcePartitionedSchedulerAsStageScheduler): 4
TestingSplit (io.trino.testing.TestingSplit): 4
FinalizerService (io.trino.util.FinalizerService): 4
ImmutableList (com.google.common.collect.ImmutableList): 3
NodePartitioningManager (io.trino.sql.planner.NodePartitioningManager): 3
BlockTypeOperators (io.trino.type.BlockTypeOperators): 3
ImmutableMap (com.google.common.collect.ImmutableMap): 2