
Example 76 with MockAM

Use of org.apache.hadoop.yarn.server.resourcemanager.MockAM in project hadoop by apache.

The class TestCapacitySchedulerSurgicalPreemption, method testSimpleSurgicalPreemption.

@Test(timeout = 60000)
public void testSimpleSurgicalPreemption() throws Exception {
    /**
     * Test case: Submit two applications (app1/app2) to different queues, queue
     * structure:
     *
     * <pre>
     *             Root
     *            /  |  \
     *           a   b   c
     *          10   20  70
     * </pre>
     *
     * 1) Two nodes (n1/n2) in the cluster, each of them has 20G.
     *
     * 2) app1 is submitted to queue-a first and asks for 32 * 1G containers;
     * 16 are allocated on n1 and 16 on n2 (in addition to app1's 1G AM on n1).
     *
     * 3) app2 is submitted to queue-c and asks for one 1G container (for its AM)
     *
     * 4) app2 asks for another 6G container, which will be reserved on n1
     *
     * Now we have:
     * n1: 17 from app1, 1 from app2, and 1 reserved for app2
     * n2: 16 from app1.
     *
     * After preemption, we should expect:
     * Preempt 4 containers from app1 on n1.
     */
    MockRM rm1 = new MockRM(conf);
    rm1.getRMContext().setNodeLabelManager(mgr);
    rm1.start();
    MockNM nm1 = rm1.registerNode("h1:1234", 20 * GB);
    MockNM nm2 = rm1.registerNode("h2:1234", 20 * GB);
    CapacityScheduler cs = (CapacityScheduler) rm1.getResourceScheduler();
    RMNode rmNode1 = rm1.getRMContext().getRMNodes().get(nm1.getNodeId());
    RMNode rmNode2 = rm1.getRMContext().getRMNodes().get(nm2.getNodeId());
    // Launch an app to queue-a; the AM container should be launched on nm1
    RMApp app1 = rm1.submitApp(1 * GB, "app", "user", null, "a");
    MockAM am1 = MockRM.launchAndRegisterAM(app1, rm1, nm1);
    am1.allocate("*", 1 * GB, 32, new ArrayList<ContainerId>());
    // Do allocation for node1/node2
    for (int i = 0; i < 32; i++) {
        cs.handle(new NodeUpdateSchedulerEvent(rmNode1));
        cs.handle(new NodeUpdateSchedulerEvent(rmNode2));
    }
    // App1 should have 33 containers now
    FiCaSchedulerApp schedulerApp1 = cs.getApplicationAttempt(am1.getApplicationAttemptId());
    Assert.assertEquals(33, schedulerApp1.getLiveContainers().size());
    // 17 from n1 and 16 from n2
    waitNumberOfLiveContainersOnNodeFromApp(cs.getNode(rmNode1.getNodeID()), am1.getApplicationAttemptId(), 17);
    waitNumberOfLiveContainersOnNodeFromApp(cs.getNode(rmNode2.getNodeID()), am1.getApplicationAttemptId(), 16);
    // Submit app2 to queue-c; it asks for a 1G container for its AM
    RMApp app2 = rm1.submitApp(1 * GB, "app", "user", null, "c");
    MockAM am2 = MockRM.launchAndRegisterAM(app2, rm1, nm1);
    // NM1/NM2 now have available resource = 2G/4G, respectively
    Assert.assertEquals(2 * GB, cs.getNode(nm1.getNodeId()).getUnallocatedResource().getMemorySize());
    Assert.assertEquals(4 * GB, cs.getNode(nm2.getNodeId()).getUnallocatedResource().getMemorySize());
    // am2 asks for a 6 * GB container
    am2.allocate(Arrays.asList(ResourceRequest.newInstance(Priority.newInstance(1), ResourceRequest.ANY, Resources.createResource(6 * GB), 1)), null);
    // Call allocation once on n1; we expect the 6G container to be reserved there
    cs.handle(new NodeUpdateSchedulerEvent(rmNode1));
    Assert.assertNotNull(cs.getNode(nm1.getNodeId()).getReservedContainer());
    // Get the scheduling edit policy
    SchedulingEditPolicy editPolicy = getSchedulingEditPolicy(rm1);
    // Call editSchedule twice and check that 4 containers from app1 on n1 are killed
    editPolicy.editSchedule();
    editPolicy.editSchedule();
    waitNumberOfLiveContainersFromApp(schedulerApp1, 29);
    // 13 from n1 (4 preempted) and 16 from n2
    waitNumberOfLiveContainersOnNodeFromApp(cs.getNode(rmNode1.getNodeID()), am1.getApplicationAttemptId(), 13);
    waitNumberOfLiveContainersOnNodeFromApp(cs.getNode(rmNode2.getNodeID()), am1.getApplicationAttemptId(), 16);
    rm1.close();
}
Also used : RMApp(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp) NodeUpdateSchedulerEvent(org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeUpdateSchedulerEvent) RMNode(org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNode) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) MockNM(org.apache.hadoop.yarn.server.resourcemanager.MockNM) SchedulingEditPolicy(org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingEditPolicy) FiCaSchedulerApp(org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) MockAM(org.apache.hadoop.yarn.server.resourcemanager.MockAM) MockRM(org.apache.hadoop.yarn.server.resourcemanager.MockRM) Test(org.junit.Test)
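
The conf object handed to MockRM in this example is built in the test's setup, which is not shown on this page. As a rough, hedged sketch, a CapacityScheduler configuration matching the root/a/b/c queue tree from the Javadoc could be assembled like this (the real base-class setup may differ in detail):

// Hedged sketch: queue layout root -> {a: 10%, b: 20%, c: 70%}, mirroring
// the tree in the Javadoc above. Not taken from the actual test setup.
CapacitySchedulerConfiguration csConf = new CapacitySchedulerConfiguration();
csConf.setQueues(CapacitySchedulerConfiguration.ROOT, new String[] {"a", "b", "c"});
csConf.setCapacity(CapacitySchedulerConfiguration.ROOT + ".a", 10);
csConf.setCapacity(CapacitySchedulerConfiguration.ROOT + ".b", 20);
csConf.setCapacity(CapacitySchedulerConfiguration.ROOT + ".c", 70);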

Example 77 with MockAM

Use of org.apache.hadoop.yarn.server.resourcemanager.MockAM in project hadoop by apache.

The class TestCapacitySchedulerSurgicalPreemption, method testPriorityPreemptionOnlyTriggeredWhenDemandingQueueUnsatisfied.

@Test(timeout = 60000)
public void testPriorityPreemptionOnlyTriggeredWhenDemandingQueueUnsatisfied() throws Exception {
    /**
     * Test case: Submit two applications (app1/app2) to different queues, queue
     * structure:
     *
     * <pre>
     *             Root
     *            /  |  \
     *           a   b   c
     *          10   20  70
     * </pre>
     *
     * 1) 10 nodes (n0-n9) in the cluster, each of them has 10G.
     *
     * 2) app1 is submitted to queue-b first and asks for 8 * 1G containers;
     * together with its 1G AM, one container runs on each of n0-n8
     *
     * 3) app2 is submitted to queue-c and asks for 10 * 10G containers (including its AM)
     *
     * After preemption, we should expect:
     * 6 containers preempted from app1, bringing app2's usage to 70%
     */
    conf.setPUOrderingPolicyUnderUtilizedPreemptionEnabled(true);
    conf.setPUOrderingPolicyUnderUtilizedPreemptionDelay(1000);
    conf.setQueueOrderingPolicy(CapacitySchedulerConfiguration.ROOT, CapacitySchedulerConfiguration.QUEUE_PRIORITY_UTILIZATION_ORDERING_POLICY);
    // Queue c has higher priority than a/b
    conf.setQueuePriority(CapacitySchedulerConfiguration.ROOT + ".c", 1);
    MockRM rm1 = new MockRM(conf);
    rm1.getRMContext().setNodeLabelManager(mgr);
    rm1.start();
    MockNM[] mockNMs = new MockNM[10];
    for (int i = 0; i < 10; i++) {
        mockNMs[i] = rm1.registerNode("h" + i + ":1234", 10 * GB);
    }
    CapacityScheduler cs = (CapacityScheduler) rm1.getResourceScheduler();
    RMNode[] rmNodes = new RMNode[10];
    for (int i = 0; i < 10; i++) {
        rmNodes[i] = rm1.getRMContext().getRMNodes().get(mockNMs[i].getNodeId());
    }
    // Launch an app to queue-b; the AM container is launched on the first NM (mockNMs[0])
    RMApp app1 = rm1.submitApp(1 * GB, "app", "user", null, "b");
    MockAM am1 = MockRM.launchAndRegisterAM(app1, rm1, mockNMs[0]);
    am1.allocate("*", 1 * GB, 8, new ArrayList<>());
    // Do allocation for nm1-nm8
    for (int i = 1; i < 9; i++) {
        cs.handle(new NodeUpdateSchedulerEvent(rmNodes[i]));
    }
    // App1 should have 9 containers now, so the abs-used-cap of b is 9%
    FiCaSchedulerApp schedulerApp1 = cs.getApplicationAttempt(am1.getApplicationAttemptId());
    Assert.assertEquals(9, schedulerApp1.getLiveContainers().size());
    for (int i = 0; i < 9; i++) {
        waitNumberOfLiveContainersOnNodeFromApp(cs.getNode(rmNodes[i].getNodeID()), am1.getApplicationAttemptId(), 1);
    }
    // Submit app2 to queue-c; it asks for a 10G container for its AM,
    // which is launched on nm9
    RMApp app2 = rm1.submitApp(10 * GB, "app", "user", null, "c");
    MockAM am2 = MockRM.launchAndRegisterAM(app2, rm1, mockNMs[9]);
    FiCaSchedulerApp schedulerApp2 = cs.getApplicationAttempt(ApplicationAttemptId.newInstance(app2.getApplicationId(), 1));
    // Ask 10 * 10GB containers
    am2.allocate("*", 10 * GB, 10, new ArrayList<>());
    // Do allocation for nm1-nm9
    for (int i = 1; i < 10; i++) {
        cs.handle(new NodeUpdateSchedulerEvent(rmNodes[i]));
    }
    // Check that am2 reserved resources on nm1-nm8
    for (int i = 1; i < 9; i++) {
        Assert.assertNotNull("Should reserve on nm-" + i, cs.getNode(rmNodes[i].getNodeID()).getReservedContainer());
    }
    // Sleep past the preemption delay; we should then see 6 containers selected.
    // Freeing those 6 nodes plus the 1 container already allocated brings app2
    // to its 70% target capacity
    Thread.sleep(1000);
    ProportionalCapacityPreemptionPolicy editPolicy = (ProportionalCapacityPreemptionPolicy) getSchedulingEditPolicy(rm1);
    editPolicy.editSchedule();
    checkNumberOfPreemptionCandidateFromApp(editPolicy, 6, am1.getApplicationAttemptId());
    // Call editSchedule again: selected containers are killed
    editPolicy.editSchedule();
    waitNumberOfLiveContainersFromApp(schedulerApp1, 3);
    // Do allocation for nm1-nm9 again
    for (int i = 1; i < 10; i++) {
        cs.handle(new NodeUpdateSchedulerEvent(rmNodes[i]));
    }
    waitNumberOfLiveContainersFromApp(schedulerApp2, 7);
    waitNumberOfLiveContainersFromApp(schedulerApp1, 3);
    rm1.close();
}
Also used : RMApp(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp) NodeUpdateSchedulerEvent(org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeUpdateSchedulerEvent) MockNM(org.apache.hadoop.yarn.server.resourcemanager.MockNM) MockRM(org.apache.hadoop.yarn.server.resourcemanager.MockRM) ProportionalCapacityPreemptionPolicy(org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy) RMNode(org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNode) FiCaSchedulerApp(org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) MockAM(org.apache.hadoop.yarn.server.resourcemanager.MockAM) Test(org.junit.Test)
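
Both surgical-preemption examples above rely on helpers such as waitNumberOfLiveContainersFromApp, which are defined elsewhere in the test class. A minimal sketch of what such a helper presumably looks like (the retry count and sleep interval are assumptions; the node-scoped variant waitNumberOfLiveContainersOnNodeFromApp would be analogous):

// Hedged sketch of the waitNumberOfLiveContainersFromApp helper: polls the
// scheduler until the app reaches the expected number of live containers,
// then asserts. Timeout values here are illustrative only.
private void waitNumberOfLiveContainersFromApp(FiCaSchedulerApp app, int expected) throws InterruptedException {
    int retries = 40;
    while (app.getLiveContainers().size() != expected && retries-- > 0) {
        Thread.sleep(100);
    }
    Assert.assertEquals(expected, app.getLiveContainers().size());
}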

Example 78 with MockAM

Use of org.apache.hadoop.yarn.server.resourcemanager.MockAM in project hadoop by apache.

The class TestIncreaseAllocationExpirer, method testConsecutiveContainerIncreaseAllocationExpiration.

@Test
public void testConsecutiveContainerIncreaseAllocationExpiration() throws Exception {
    /**
     * 1. Allocate 1 container: containerId2 (1G)
     * 2. Increase resource of containerId2: 1G -> 3G
     * 3. AM acquires the token
     * 4. Increase resource of containerId2 again: 3G -> 5G
     * 5. AM acquires the token
     * 6. AM uses the first token to increase the container in NM to 3G
     * 7. AM NEVER uses the second token
     * 8. Verify containerId2 is eventually rolled back to 3G after the unused token expires
     * 9. Verify NM eventually uses 3G for containerId2
     */
    // Set the allocation expiration to 5 seconds
    conf.setLong(YarnConfiguration.RM_CONTAINER_ALLOC_EXPIRY_INTERVAL_MS, 5000);
    MockRM rm1 = new MockRM(conf);
    rm1.start();
    // Submit an application
    MockNM nm1 = rm1.registerNode("127.0.0.1:1234", 20 * GB);
    RMApp app1 = rm1.submitApp(1 * GB, "app", "user", null, "default");
    MockAM am1 = MockRM.launchAndRegisterAM(app1, rm1, nm1);
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 1, ContainerState.RUNNING);
    // The AM requests a new container
    am1.allocate("127.0.0.1", 1 * GB, 1, new ArrayList<ContainerId>());
    ContainerId containerId2 = ContainerId.newContainerId(am1.getApplicationAttemptId(), 2);
    rm1.waitForState(nm1, containerId2, RMContainerState.ALLOCATED);
    // The AM acquires the new container, which starts the container allocation expirer
    am1.allocate(null, null).getAllocatedContainers();
    // Report container status
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 2, ContainerState.RUNNING);
    // Wait until container status is RUNNING, and is removed from
    // allocation expirer
    rm1.waitForState(nm1, containerId2, RMContainerState.RUNNING);
    // am1 asks to change containerId2 from 1GB to 3GB
    am1.sendContainerResizingRequest(Collections.singletonList(UpdateContainerRequest.newInstance(0, containerId2, ContainerUpdateType.INCREASE_RESOURCE, Resources.createResource(3 * GB), null)));
    // Kick off scheduling and sleep for 1 second to
    // make sure the allocation is done
    nm1.nodeHeartbeat(true);
    Thread.sleep(1000);
    // Start container increase allocation expirer
    am1.allocate(null, null);
    // Remember the resource (3G) in order to report status
    Resource resource1 = Resources.clone(rm1.getResourceScheduler().getRMContainer(containerId2).getAllocatedResource());
    // This request should be rejected, since it carries a stale container version (0)
    AllocateResponse response = am1.sendContainerResizingRequest(Collections.singletonList(UpdateContainerRequest.newInstance(0, containerId2, ContainerUpdateType.INCREASE_RESOURCE, Resources.createResource(5 * GB), null)));
    List<UpdateContainerError> updateErrors = response.getUpdateErrors();
    Assert.assertEquals(1, updateErrors.size());
    Assert.assertEquals("INCORRECT_CONTAINER_VERSION_ERROR", updateErrors.get(0).getReason());
    Assert.assertEquals(1, updateErrors.get(0).getCurrentContainerVersion());
    // am1 asks to change containerId2 from 3GB to 5GB
    am1.sendContainerResizingRequest(Collections.singletonList(UpdateContainerRequest.newInstance(1, containerId2, ContainerUpdateType.INCREASE_RESOURCE, Resources.createResource(5 * GB), null)));
    // Kick off scheduling and sleep for 1 second to
    // make sure the allocation is done
    nm1.nodeHeartbeat(true);
    Thread.sleep(1000);
    // Reset container increase allocation expirer
    am1.allocate(null, null);
    // Verify current resource allocation in RM
    checkUsedResource(rm1, "default", 6 * GB, null);
    FiCaSchedulerApp app = TestUtils.getFiCaSchedulerApp(rm1, app1.getApplicationId());
    Assert.assertEquals(6 * GB, app.getAppAttemptResourceUsage().getUsed().getMemorySize());
    // Verify available resource is now reduced to 14GB
    verifyAvailableResourceOfSchedulerNode(rm1, nm1.getNodeId(), 14 * GB);
    // Use the first token (3G)
    nm1.containerIncreaseStatus(getContainer(rm1, containerId2, resource1));
    // Wait long enough for the second token (5G) to expire, and verify that
    // the roll back action is completed as expected
    Thread.sleep(10000);
    am1.allocate(null, null);
    Thread.sleep(2000);
    // Verify container size is rolled back to 3G
    Assert.assertEquals(3 * GB, rm1.getResourceScheduler().getRMContainer(containerId2).getAllocatedResource().getMemorySize());
    // Verify total resource usage is 4G
    checkUsedResource(rm1, "default", 4 * GB, null);
    Assert.assertEquals(4 * GB, app.getAppAttemptResourceUsage().getUsed().getMemorySize());
    // Verify available resource is rolled back to 16GB
    verifyAvailableResourceOfSchedulerNode(rm1, nm1.getNodeId(), 16 * GB);
    // Verify NM receives the decrease message (3G)
    List<Container> containersToDecrease = nm1.nodeHeartbeat(true).getContainersToDecrease();
    Assert.assertEquals(1, containersToDecrease.size());
    Assert.assertEquals(3 * GB, containersToDecrease.get(0).getResource().getMemorySize());
    rm1.stop();
}
Also used : RMApp(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp) MockNM(org.apache.hadoop.yarn.server.resourcemanager.MockNM) Resource(org.apache.hadoop.yarn.api.records.Resource) MockRM(org.apache.hadoop.yarn.server.resourcemanager.MockRM) AllocateResponse(org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse) UpdateContainerError(org.apache.hadoop.yarn.api.records.UpdateContainerError) RMContainer(org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer) Container(org.apache.hadoop.yarn.api.records.Container) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) FiCaSchedulerApp(org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) MockAM(org.apache.hadoop.yarn.server.resourcemanager.MockAM) Test(org.junit.Test)
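
The getContainer helper used above (and again in the next two examples) builds the Container record that the mock NM hands to containerIncreaseStatus. It is defined elsewhere in this test class; a plausible hedged sketch, filling the node from the RM's view of the container and leaving the http address, priority, and token unset:

// Hedged sketch of the getContainer helper: builds a Container for the given
// id with the given resource so the mock NM can report the increase. The
// null fields (http address, priority, token) are assumptions.
private Container getContainer(MockRM rm, ContainerId containerId, Resource resource) {
    RMContainer rmContainer = rm.getResourceScheduler().getRMContainer(containerId);
    return Container.newInstance(containerId, rmContainer.getAllocatedNode(), null, resource, null, null);
}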

Example 79 with MockAM

Use of org.apache.hadoop.yarn.server.resourcemanager.MockAM in project hadoop by apache.

The class TestIncreaseAllocationExpirer, method testDecreaseAfterIncreaseWithAllocationExpiration.

@Test
public void testDecreaseAfterIncreaseWithAllocationExpiration() throws Exception {
    /**
     * 1. Allocate three containers: containerId2, containerId3, containerId4
     * 2. Increase resource of containerId2: 3G -> 6G
     * 3. Increase resource of containerId3: 3G -> 6G
     * 4. Increase resource of containerId4: 3G -> 6G
     * 5. Do NOT use the increase tokens for containerId2 and containerId3
     * 6. Decrease containerId2: 6G -> 2G (i.e., below last confirmed resource)
     * 7. Decrease containerId3: 6G -> 4G (i.e., above last confirmed resource)
     * 8. Decrease containerId4: 6G -> 4G (i.e., above last confirmed resource)
     * 9. Use token for containerId4 to increase containerId4 on NM to 6G
     * 10. Verify containerId2 eventually uses 2G (removed from expirer)
     * 11. verify containerId3 eventually uses 3G (increase token expires)
     * 12. Verify containerId4 eventually uses 4G (removed from expirer)
     * 13. Verify NM eventually uses 3G for containerId3, 4G for containerId4
     */
    // Set the allocation expiration to 5 seconds
    conf.setLong(YarnConfiguration.RM_CONTAINER_ALLOC_EXPIRY_INTERVAL_MS, 5000);
    MockRM rm1 = new MockRM(conf);
    rm1.start();
    // Submit an application
    MockNM nm1 = rm1.registerNode("127.0.0.1:1234", 20 * GB);
    RMApp app1 = rm1.submitApp(1 * GB, "app", "user", null, "default");
    MockAM am1 = MockRM.launchAndRegisterAM(app1, rm1, nm1);
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 1, ContainerState.RUNNING);
    // The AM requests three new 3G containers
    am1.allocate("127.0.0.1", 3 * GB, 3, new ArrayList<ContainerId>());
    ContainerId containerId2 = ContainerId.newContainerId(am1.getApplicationAttemptId(), 2);
    rm1.waitForState(nm1, containerId2, RMContainerState.ALLOCATED);
    ContainerId containerId3 = ContainerId.newContainerId(am1.getApplicationAttemptId(), 3);
    rm1.waitForState(nm1, containerId3, RMContainerState.ALLOCATED);
    ContainerId containerId4 = ContainerId.newContainerId(am1.getApplicationAttemptId(), 4);
    rm1.waitForState(nm1, containerId4, RMContainerState.ALLOCATED);
    // AM acquires tokens to start container allocation expirer
    List<Container> containers = am1.allocate(null, null).getAllocatedContainers();
    Assert.assertEquals(3, containers.size());
    Assert.assertNotNull(containers.get(0).getContainerToken());
    Assert.assertNotNull(containers.get(1).getContainerToken());
    Assert.assertNotNull(containers.get(2).getContainerToken());
    // Report container status
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 2, ContainerState.RUNNING);
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 3, ContainerState.RUNNING);
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 4, ContainerState.RUNNING);
    // Wait until container status becomes RUNNING
    rm1.waitForState(nm1, containerId2, RMContainerState.RUNNING);
    rm1.waitForState(nm1, containerId3, RMContainerState.RUNNING);
    rm1.waitForState(nm1, containerId4, RMContainerState.RUNNING);
    // am1 asks to increase containerId2, containerId3 and containerId4 from 3GB to 6GB
    List<UpdateContainerRequest> increaseRequests = new ArrayList<>();
    increaseRequests.add(UpdateContainerRequest.newInstance(0, containerId2, ContainerUpdateType.INCREASE_RESOURCE, Resources.createResource(6 * GB), null));
    increaseRequests.add(UpdateContainerRequest.newInstance(0, containerId3, ContainerUpdateType.INCREASE_RESOURCE, Resources.createResource(6 * GB), null));
    increaseRequests.add(UpdateContainerRequest.newInstance(0, containerId4, ContainerUpdateType.INCREASE_RESOURCE, Resources.createResource(6 * GB), null));
    am1.sendContainerResizingRequest(increaseRequests);
    nm1.nodeHeartbeat(true);
    Thread.sleep(1000);
    // Start container increase allocation expirer
    am1.allocate(null, null);
    // Decrease containers
    List<UpdateContainerRequest> decreaseRequests = new ArrayList<>();
    decreaseRequests.add(UpdateContainerRequest.newInstance(1, containerId2, ContainerUpdateType.DECREASE_RESOURCE, Resources.createResource(2 * GB), null));
    decreaseRequests.add(UpdateContainerRequest.newInstance(1, containerId3, ContainerUpdateType.DECREASE_RESOURCE, Resources.createResource(4 * GB), null));
    decreaseRequests.add(UpdateContainerRequest.newInstance(1, containerId4, ContainerUpdateType.DECREASE_RESOURCE, Resources.createResource(4 * GB), null));
    AllocateResponse response = am1.sendContainerResizingRequest(decreaseRequests);
    // Verify containers are decreased in scheduler
    Assert.assertEquals(3, response.getUpdatedContainers().size());
    // Use the token for containerId4 on NM (6G). This should set the last
    // confirmed resource to 4G, and cancel the allocation expirer
    nm1.containerIncreaseStatus(getContainer(rm1, containerId4, Resources.createResource(6 * GB)));
    // Wait for containerId3's increase token to expire
    Thread.sleep(10000);
    am1.allocate(null, null);
    Assert.assertEquals(2 * GB, rm1.getResourceScheduler().getRMContainer(containerId2).getAllocatedResource().getMemorySize());
    Assert.assertEquals(3 * GB, rm1.getResourceScheduler().getRMContainer(containerId3).getAllocatedResource().getMemorySize());
    Assert.assertEquals(4 * GB, rm1.getResourceScheduler().getRMContainer(containerId4).getAllocatedResource().getMemorySize());
    // Verify the NM receives 2 decrease messages
    List<Container> containersToDecrease = nm1.nodeHeartbeat(true).getContainersToDecrease();
    Assert.assertEquals(2, containersToDecrease.size());
    // Sort the list so that containerId3 comes first
    Collections.sort(containersToDecrease);
    Assert.assertEquals(3 * GB, containersToDecrease.get(0).getResource().getMemorySize());
    Assert.assertEquals(4 * GB, containersToDecrease.get(1).getResource().getMemorySize());
    rm1.stop();
}
Also used : AllocateResponse(org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse) RMApp(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp) RMContainer(org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer) Container(org.apache.hadoop.yarn.api.records.Container) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) MockNM(org.apache.hadoop.yarn.server.resourcemanager.MockNM) ArrayList(java.util.ArrayList) MockAM(org.apache.hadoop.yarn.server.resourcemanager.MockAM) MockRM(org.apache.hadoop.yarn.server.resourcemanager.MockRM) UpdateContainerRequest(org.apache.hadoop.yarn.api.records.UpdateContainerRequest) Test(org.junit.Test)

Example 80 with MockAM

Use of org.apache.hadoop.yarn.server.resourcemanager.MockAM in project hadoop by apache.

The class TestIncreaseAllocationExpirer, method testContainerIsRemovedFromAllocationExpirer.

@Test
public void testContainerIsRemovedFromAllocationExpirer() throws Exception {
    /**
     * 1. Allocate 1 container: containerId2 (1G)
     * 2. Increase resource of containerId2: 1G -> 3G
     * 3. AM acquires the token
     * 4. AM uses the token
     * 5. Verify containerId2 is removed from allocation expirer such
     *    that it still runs fine after allocation expiration interval
     */
    // Set the allocation expiration to 5 seconds
    conf.setLong(YarnConfiguration.RM_CONTAINER_ALLOC_EXPIRY_INTERVAL_MS, 5000);
    MockRM rm1 = new MockRM(conf);
    rm1.start();
    // Submit an application
    MockNM nm1 = rm1.registerNode("127.0.0.1:1234", 20 * GB);
    RMApp app1 = rm1.submitApp(1 * GB, "app", "user", null, "default");
    MockAM am1 = MockRM.launchAndRegisterAM(app1, rm1, nm1);
    // Report AM container status RUNNING to remove it from expirer
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 1, ContainerState.RUNNING);
    // The AM requests a new container
    am1.allocate("127.0.0.1", 1 * GB, 1, new ArrayList<ContainerId>());
    ContainerId containerId2 = ContainerId.newContainerId(am1.getApplicationAttemptId(), 2);
    rm1.waitForState(nm1, containerId2, RMContainerState.ALLOCATED);
    // The AM acquires the new container, which starts the container allocation expirer
    List<Container> containers = am1.allocate(null, null).getAllocatedContainers();
    Assert.assertEquals(containerId2, containers.get(0).getId());
    Assert.assertNotNull(containers.get(0).getContainerToken());
    checkUsedResource(rm1, "default", 2 * GB, null);
    FiCaSchedulerApp app = TestUtils.getFiCaSchedulerApp(rm1, app1.getApplicationId());
    Assert.assertEquals(2 * GB, app.getAppAttemptResourceUsage().getUsed().getMemorySize());
    verifyAvailableResourceOfSchedulerNode(rm1, nm1.getNodeId(), 18 * GB);
    // Report container status
    nm1.nodeHeartbeat(app1.getCurrentAppAttempt().getAppAttemptId(), 2, ContainerState.RUNNING);
    // Wait until container status is RUNNING, and is removed from
    // allocation expirer
    rm1.waitForState(nm1, containerId2, RMContainerState.RUNNING);
    // am1 asks to increase containerId2 from 1GB to 3GB
    am1.sendContainerResizingRequest(Collections.singletonList(UpdateContainerRequest.newInstance(0, containerId2, ContainerUpdateType.INCREASE_RESOURCE, Resources.createResource(3 * GB), null)));
    // Kick off scheduling and sleep for 1 second to make sure the allocation is done
    nm1.nodeHeartbeat(true);
    Thread.sleep(1000);
    // Start container increase allocation expirer
    am1.allocate(null, null);
    // Remember the resource in order to report status
    Resource resource = Resources.clone(rm1.getResourceScheduler().getRMContainer(containerId2).getAllocatedResource());
    nm1.containerIncreaseStatus(getContainer(rm1, containerId2, resource));
    // Wait long enough and verify that the container was removed
    // from allocation expirer, and the container is still running
    Thread.sleep(10000);
    Assert.assertEquals(RMContainerState.RUNNING, rm1.getResourceScheduler().getRMContainer(containerId2).getState());
    // Verify container size is 3G
    Assert.assertEquals(3 * GB, rm1.getResourceScheduler().getRMContainer(containerId2).getAllocatedResource().getMemorySize());
    // Verify total resource usage
    checkUsedResource(rm1, "default", 4 * GB, null);
    Assert.assertEquals(4 * GB, app.getAppAttemptResourceUsage().getUsed().getMemorySize());
    // Verify available resource
    verifyAvailableResourceOfSchedulerNode(rm1, nm1.getNodeId(), 16 * GB);
    rm1.stop();
}
Also used : RMApp(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp) RMContainer(org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer) Container(org.apache.hadoop.yarn.api.records.Container) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) MockNM(org.apache.hadoop.yarn.server.resourcemanager.MockNM) FiCaSchedulerApp(org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) Resource(org.apache.hadoop.yarn.api.records.Resource) MockAM(org.apache.hadoop.yarn.server.resourcemanager.MockAM) MockRM(org.apache.hadoop.yarn.server.resourcemanager.MockRM) Test(org.junit.Test)
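
checkUsedResource and verifyAvailableResourceOfSchedulerNode, used in Examples 78 and 80, are likewise class-local helpers. A hedged sketch of the former for the null-label case exercised here, asserting a queue's used memory in the CapacityScheduler (the real helper may also handle node labels):

// Hedged sketch of the checkUsedResource helper (null-label case only):
// asserts the used memory recorded for the named queue.
private void checkUsedResource(MockRM rm, String queueName, long memorySize, String label) {
    CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
    CSQueue queue = cs.getQueue(queueName);
    Assert.assertEquals(memorySize, queue.getQueueResourceUsage().getUsed().getMemorySize());
}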

Aggregations

MockAM (org.apache.hadoop.yarn.server.resourcemanager.MockAM): 128
MockNM (org.apache.hadoop.yarn.server.resourcemanager.MockNM): 127
RMApp (org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp): 124
Test (org.junit.Test): 124
MockRM (org.apache.hadoop.yarn.server.resourcemanager.MockRM): 110
ContainerId (org.apache.hadoop.yarn.api.records.ContainerId): 77
FiCaSchedulerApp (org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp): 47
YarnConfiguration (org.apache.hadoop.yarn.conf.YarnConfiguration): 35
RMNode (org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNode): 35
NodeUpdateSchedulerEvent (org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeUpdateSchedulerEvent): 35
Container (org.apache.hadoop.yarn.api.records.Container): 26
RMAppAttempt (org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttempt): 22
RMContainer (org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer): 22
ArrayList (java.util.ArrayList): 18
AllocateResponse (org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse): 18
Configuration (org.apache.hadoop.conf.Configuration): 16
MemoryRMStateStore (org.apache.hadoop.yarn.server.resourcemanager.recovery.MemoryRMStateStore): 14
ClientResponse (com.sun.jersey.api.client.ClientResponse): 13
WebResource (com.sun.jersey.api.client.WebResource): 13
JSONObject (org.codehaus.jettison.json.JSONObject): 13