Example 71 with FlinkException

use of org.apache.flink.util.FlinkException in project flink by apache.

the class JobMasterTest method testJobMasterAcceptsSlotsWhenJobIsRestarting.

@Test
public void testJobMasterAcceptsSlotsWhenJobIsRestarting() throws Exception {
    // Fixed-delay restarts with a one-day delay keep the job in RESTARTING for the rest of the test.
    configuration.set(RestartStrategyOptions.RESTART_STRATEGY, "fixed-delay");
    configuration.set(RestartStrategyOptions.RESTART_STRATEGY_FIXED_DELAY_DELAY, Duration.ofDays(1));
    final int numberSlots = 1;
    final JobMaster jobMaster = new JobMasterBuilder(jobGraph, rpcService).withConfiguration(configuration).createJobMaster();
    try {
        jobMaster.start();
        final JobMasterGateway jobMasterGateway = jobMaster.getSelfGateway(JobMasterGateway.class);
        final LocalUnresolvedTaskManagerLocation unresolvedTaskManagerLocation = new LocalUnresolvedTaskManagerLocation();
        registerSlotsAtJobMaster(numberSlots, jobMasterGateway, jobGraph.getJobID(), new TestingTaskExecutorGatewayBuilder().setAddress("firstTaskManager").createTestingTaskExecutorGateway(), unresolvedTaskManagerLocation);
        CommonTestUtils.waitUntilCondition(() -> jobMasterGateway.requestJobStatus(testingTimeout).get() == JobStatus.RUNNING, Deadline.fromNow(TimeUtils.toDuration(testingTimeout)));
        // Disconnect the only TaskManager; the FlinkException is passed as the cause of the disconnect.
        jobMasterGateway.disconnectTaskManager(unresolvedTaskManagerLocation.getResourceID(), new FlinkException("Test exception."));
        CommonTestUtils.waitUntilCondition(() -> jobMasterGateway.requestJobStatus(testingTimeout).get() == JobStatus.RESTARTING, Deadline.fromNow(TimeUtils.toDuration(testingTimeout)));
        // Slots offered by a second TaskManager must still be accepted while the job is restarting.
        assertThat(registerSlotsAtJobMaster(numberSlots, jobMasterGateway, jobGraph.getJobID(), new TestingTaskExecutorGatewayBuilder().setAddress("secondTaskManager").createTestingTaskExecutorGateway(), new LocalUnresolvedTaskManagerLocation()), hasSize(numberSlots));
    } finally {
        RpcUtils.terminateRpcEndpoint(jobMaster, testingTimeout);
    }
}
Also used : LocalUnresolvedTaskManagerLocation(org.apache.flink.runtime.taskmanager.LocalUnresolvedTaskManagerLocation) TestingTaskExecutorGatewayBuilder(org.apache.flink.runtime.taskexecutor.TestingTaskExecutorGatewayBuilder) CompletedCheckpoint(org.apache.flink.runtime.checkpoint.CompletedCheckpoint) JobMasterBuilder(org.apache.flink.runtime.jobmaster.utils.JobMasterBuilder) FlinkException(org.apache.flink.util.FlinkException) Test(org.junit.Test)
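
The test above waits for the job to reach RUNNING and then RESTARTING by polling a condition against a deadline. The following is a minimal plain-Java sketch of that polling idea, assuming nothing about how Flink's CommonTestUtils.waitUntilCondition is actually implemented. The class WaitUntilSketch, the method waitUntil, and its parameters are hypothetical names introduced only for illustration.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

final class WaitUntilSketch {

    // Hypothetical helper (not Flink's API): re-checks the condition until it
    // holds, sleeping between checks, and gives up once the timeout has elapsed.
    static void waitUntil(BooleanSupplier condition, Duration timeout, Duration retryInterval)
            throws InterruptedException, TimeoutException {
        final long deadlineNanos = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadlineNanos >= 0) {
                throw new TimeoutException("Condition was not met within " + timeout);
            }
            Thread.sleep(retryInterval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for "job status == RUNNING": becomes true on the third check.
        final AtomicInteger checks = new AtomicInteger();
        waitUntil(() -> checks.incrementAndGet() >= 3, Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println("condition met after " + checks.get() + " checks");
    }
}
```

Polling with a deadline lets a test observe an asynchronous state change without a fixed sleep, and a stuck state surfaces as a timeout instead of a hang.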

Example 72 with FlinkException

use of org.apache.flink.util.FlinkException in project flink by apache.

the class JobMasterTest method testReconnectionAfterDisconnect.

/**
 * Tests that we continue reconnecting to the latest known RM after a disconnection message.
 */
@Test
public void testReconnectionAfterDisconnect() throws Exception {
    final JobMaster jobMaster = new JobMasterBuilder(jobGraph, rpcService).withJobMasterId(jobMasterId).withConfiguration(configuration).withHighAvailabilityServices(haServices).withHeartbeatServices(heartbeatServices).createJobMaster();
    jobMaster.start();
    final JobMasterGateway jobMasterGateway = jobMaster.getSelfGateway(JobMasterGateway.class);
    try {
        final TestingResourceManagerGateway testingResourceManagerGateway = createAndRegisterTestingResourceManagerGateway();
        final BlockingQueue<JobMasterId> registrationsQueue = new ArrayBlockingQueue<>(1);
        testingResourceManagerGateway.setRegisterJobManagerFunction((jobMasterId, resourceID, s, jobID) -> {
            registrationsQueue.offer(jobMasterId);
            return CompletableFuture.completedFuture(testingResourceManagerGateway.getJobMasterRegistrationSuccess());
        });
        final ResourceManagerId resourceManagerId = testingResourceManagerGateway.getFencingToken();
        notifyResourceManagerLeaderListeners(testingResourceManagerGateway);
        // wait for first registration attempt
        final JobMasterId firstRegistrationAttempt = registrationsQueue.take();
        assertThat(firstRegistrationAttempt, equalTo(jobMasterId));
        assertThat(registrationsQueue.isEmpty(), is(true));
        jobMasterGateway.disconnectResourceManager(resourceManagerId, new FlinkException("Test exception"));
        // wait for the second registration attempt after the disconnect call
        assertThat(registrationsQueue.take(), equalTo(jobMasterId));
    } finally {
        RpcUtils.terminateRpcEndpoint(jobMaster, testingTimeout);
    }
}
Also used : ArrayBlockingQueue(java.util.concurrent.ArrayBlockingQueue) ResourceManagerId(org.apache.flink.runtime.resourcemanager.ResourceManagerId) TestingResourceManagerGateway(org.apache.flink.runtime.resourcemanager.utils.TestingResourceManagerGateway) JobMasterBuilder(org.apache.flink.runtime.jobmaster.utils.JobMasterBuilder) FlinkException(org.apache.flink.util.FlinkException) Test(org.junit.Test)
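
The test above records each registration attempt in a BlockingQueue, so the test thread can block until an attempt arrives. A minimal plain-Java sketch of that pattern follows, with no Flink dependency. ReconnectingClient, its methods, and the "client-1" identifier are hypothetical stand-ins for the JobMaster and its ID, and a timed poll replaces take() so the sketch cannot hang.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class RegistrationQueueSketch {

    // Hypothetical stand-in for a component that registers itself asynchronously
    // and registers again whenever it is told that it was disconnected.
    static final class ReconnectingClient {
        private final String clientId;
        private final Consumer<String> registerCallback;

        ReconnectingClient(String clientId, Consumer<String> registerCallback) {
            this.clientId = clientId;
            this.registerCallback = registerCallback;
        }

        void connect() {
            // Registration happens on another thread, as it would over RPC.
            new Thread(() -> registerCallback.accept(clientId)).start();
        }

        void disconnect(Exception cause) {
            // The cause only documents the disconnect here; the client then reconnects.
            connect();
        }
    }

    // Returns the IDs seen in the first and second registration attempts.
    static String[] runScenario() throws InterruptedException {
        final BlockingQueue<String> registrations = new ArrayBlockingQueue<>(1);
        final ReconnectingClient client = new ReconnectingClient("client-1", registrations::offer);

        client.connect();
        // Block until the first attempt arrives (or the timeout passes).
        final String first = registrations.poll(5, TimeUnit.SECONDS);

        client.disconnect(new Exception("Test exception"));
        // Block until the second attempt triggered by the disconnect arrives.
        final String second = registrations.poll(5, TimeUnit.SECONDS);
        return new String[] {first, second};
    }

    public static void main(String[] args) throws Exception {
        final String[] attempts = runScenario();
        System.out.println("first=" + attempts[0] + ", second=" + attempts[1]);
    }
}
```

A queue with capacity 1 is enough here because each attempt is consumed before the next one is triggered.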

Example 73 with FlinkException

use of org.apache.flink.util.FlinkException in project flink by apache.

the class DeclarativeSlotPoolServiceTest method testReleaseTaskManager.

@Test
public void testReleaseTaskManager() throws Exception {
    try (DeclarativeSlotPoolService declarativeSlotPoolService = createDeclarativeSlotPoolService()) {
        final ResourceID knownTaskManager = ResourceID.generate();
        declarativeSlotPoolService.registerTaskManager(knownTaskManager);
        declarativeSlotPoolService.releaseTaskManager(knownTaskManager, new FlinkException("Test cause"));
        assertFalse(declarativeSlotPoolService.isTaskManagerRegistered(knownTaskManager.getResourceID()));
    }
}
Also used : ResourceID(org.apache.flink.runtime.clusterframework.types.ResourceID) FlinkException(org.apache.flink.util.FlinkException) Test(org.junit.Test)

Example 74 with FlinkException

use of org.apache.flink.util.FlinkException in project flink by apache.

the class DefaultDeclarativeSlotPoolTest method testReleaseSlotOnlyReturnsFulfilledRequirementsOfReservedSlots.

@Test
public void testReleaseSlotOnlyReturnsFulfilledRequirementsOfReservedSlots() {
    withSlotPoolContainingOneTaskManagerWithTwoSlotsWithUniqueResourceProfiles((slotPool, freeSlot, slotToReserve, ignored) -> {
        // Reserve one of the two slots and assign it a payload; the other slot stays free.
        slotPool.reserveFreeSlot(slotToReserve.getAllocationId(), slotToReserve.getResourceProfile()).tryAssignPayload(new TestingPhysicalSlotPayload());
        // Release both slots and capture the fulfilled requirements that each release returns.
        final ResourceCounter fulfilledRequirementsOfFreeSlot = slotPool.releaseSlot(freeSlot.getAllocationId(), new FlinkException("Test failure"));
        final ResourceCounter fulfilledRequirementsOfReservedSlot = slotPool.releaseSlot(slotToReserve.getAllocationId(), new FlinkException("Test failure"));
        assertThat(fulfilledRequirementsOfFreeSlot.getResources(), is(empty()));
        assertThat(fulfilledRequirementsOfReservedSlot.getResourceCount(slotToReserve.getResourceProfile()), is(1));
    });
}
Also used : ResourceCounter(org.apache.flink.runtime.util.ResourceCounter) FlinkException(org.apache.flink.util.FlinkException) Test(org.junit.Test)

Example 75 with FlinkException

use of org.apache.flink.util.FlinkException in project flink by apache.

the class DefaultDeclarativeSlotPoolTest method testReleaseSlotsRemovesSlots.

@Test
public void testReleaseSlotsRemovesSlots() throws InterruptedException {
    final NewResourceRequirementsService notifyNewResourceRequirements = new NewResourceRequirementsService();
    final DefaultDeclarativeSlotPool slotPool = createDefaultDeclarativeSlotPool(notifyNewResourceRequirements);
    final LocalTaskManagerLocation taskManagerLocation = new LocalTaskManagerLocation();
    increaseRequirementsAndOfferSlotsToSlotPool(slotPool, createResourceRequirements(), taskManagerLocation);
    notifyNewResourceRequirements.takeResourceRequirements();
    slotPool.releaseSlots(taskManagerLocation.getResourceID(), new FlinkException("Test failure"));
    assertThat(slotPool.getAllSlotsInformation(), is(empty()));
}
Also used : LocalTaskManagerLocation(org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation) FlinkException(org.apache.flink.util.FlinkException) Test(org.junit.Test)

Aggregations

FlinkException (org.apache.flink.util.FlinkException) 197
Test (org.junit.Test) 91
CompletableFuture (java.util.concurrent.CompletableFuture) 59
IOException (java.io.IOException) 38
ExecutionException (java.util.concurrent.ExecutionException) 26
ArrayList (java.util.ArrayList) 25
JobID (org.apache.flink.api.common.JobID) 24
Collection (java.util.Collection) 22
CompletionException (java.util.concurrent.CompletionException) 22
Configuration (org.apache.flink.configuration.Configuration) 21
TimeoutException (java.util.concurrent.TimeoutException) 19
FutureUtils (org.apache.flink.util.concurrent.FutureUtils) 19
Time (org.apache.flink.api.common.time.Time) 16
OneShotLatch (org.apache.flink.core.testutils.OneShotLatch) 16
ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID) 16
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph) 15
AllocationID (org.apache.flink.runtime.clusterframework.types.AllocationID) 14
Collections (java.util.Collections) 13
List (java.util.List) 13
ExecutorService (java.util.concurrent.ExecutorService) 13