Example 1 with ExecutionDeploymentReport

Use of org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport in the Apache Flink project.

Class JobMasterExecutionDeploymentReconciliationTest, method testExecutionDeploymentReconciliation.

/**
 * Tests how the job master handles unknown/missing executions.
 */
@Test
public void testExecutionDeploymentReconciliation() throws Exception {
    JobMasterBuilder.TestingOnCompletionActions onCompletionActions = new JobMasterBuilder.TestingOnCompletionActions();
    TestingExecutionDeploymentTrackerWrapper deploymentTrackerWrapper = new TestingExecutionDeploymentTrackerWrapper();
    final JobGraph jobGraph = JobGraphTestUtils.singleNoOpJobGraph();
    JobMaster jobMaster = createAndStartJobMaster(onCompletionActions, deploymentTrackerWrapper, jobGraph);
    JobMasterGateway jobMasterGateway = jobMaster.getSelfGateway(JobMasterGateway.class);
    RPC_SERVICE_RESOURCE.getTestingRpcService().registerGateway(jobMasterGateway.getAddress(), jobMasterGateway);
    final CompletableFuture<ExecutionAttemptID> taskCancellationFuture = new CompletableFuture<>();
    TaskExecutorGateway taskExecutorGateway = createTaskExecutorGateway(taskCancellationFuture);
    LocalUnresolvedTaskManagerLocation localUnresolvedTaskManagerLocation = new LocalUnresolvedTaskManagerLocation();
    registerTaskExecutorAndOfferSlots(jobMasterGateway, jobGraph.getJobID(), taskExecutorGateway, localUnresolvedTaskManagerLocation);
    ExecutionAttemptID deployedExecution = deploymentTrackerWrapper.getTaskDeploymentFuture().get();
    assertFalse(taskCancellationFuture.isDone());
    ExecutionAttemptID unknownDeployment = new ExecutionAttemptID();
    // the deployment report is missing the just deployed task, but contains the ID of some
    // other unknown deployment
    // the job master should cancel the unknown deployment, and fail the job
    jobMasterGateway.heartbeatFromTaskManager(
            localUnresolvedTaskManagerLocation.getResourceID(),
            new TaskExecutorToJobManagerHeartbeatPayload(
                    new AccumulatorReport(Collections.emptyList()),
                    new ExecutionDeploymentReport(Collections.singleton(unknownDeployment))));
    assertThat(taskCancellationFuture.get(), is(unknownDeployment));
    assertThat(deploymentTrackerWrapper.getStopFuture().get(), is(deployedExecution));
    assertThat(onCompletionActions.getJobReachedGloballyTerminalStateFuture().get().getArchivedExecutionGraph().getState(), is(JobStatus.FAILED));
}
Also used: ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID), TestingTaskExecutorGateway (org.apache.flink.runtime.taskexecutor.TestingTaskExecutorGateway), TaskExecutorGateway (org.apache.flink.runtime.taskexecutor.TaskExecutorGateway), JobMasterBuilder (org.apache.flink.runtime.jobmaster.utils.JobMasterBuilder), JobGraph (org.apache.flink.runtime.jobgraph.JobGraph), CompletableFuture (java.util.concurrent.CompletableFuture), LocalUnresolvedTaskManagerLocation (org.apache.flink.runtime.taskmanager.LocalUnresolvedTaskManagerLocation), AccumulatorReport (org.apache.flink.runtime.taskexecutor.AccumulatorReport), TaskExecutorToJobManagerHeartbeatPayload (org.apache.flink.runtime.taskexecutor.TaskExecutorToJobManagerHeartbeatPayload), ExecutionDeploymentReport (org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport), Test (org.junit.Test)
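
The test above asserts two reactions to a mismatching heartbeat: the unknown execution is cancelled, and the job ends in the FAILED state because a deployed execution is missing from the report. The following is a minimal, self-contained sketch of a handler with that observable behavior. It is not Flink's JobMaster code: the names ReconciliationHandlerSketch, Handler, and RecordingHandler, the use of String IDs, and the string job status are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class ReconciliationHandlerSketch {

    /** Hypothetical callback interface; not a Flink API. */
    interface Handler {
        void onUnknownDeployments(Collection<String> executionIds);

        void onMissingDeployments(Collection<String> executionIds);
    }

    /** Records what a job master would do, instead of doing it. */
    static final class RecordingHandler implements Handler {
        final List<String> cancelled = new ArrayList<>();
        String jobStatus = "RUNNING";

        @Override
        public void onUnknownDeployments(Collection<String> executionIds) {
            // Executions the task executor runs but the job master does not know: cancel them.
            cancelled.addAll(executionIds);
        }

        @Override
        public void onMissingDeployments(Collection<String> executionIds) {
            // Executions the job master expects but the task executor did not report: fail the job.
            if (!executionIds.isEmpty()) {
                jobStatus = "FAILED";
            }
        }
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        handler.onUnknownDeployments(Collections.singleton("unknown-attempt"));
        handler.onMissingDeployments(Collections.singleton("deployed-attempt"));
        System.out.println("cancelled=" + handler.cancelled + ", jobStatus=" + handler.jobStatus);
    }
}
```

Recording the reactions rather than performing them keeps the sketch runnable without a cluster, which is also roughly what the testing handler in the later examples does.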

Example 2 with ExecutionDeploymentReport

Use of org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport in the Apache Flink project.

Class DefaultExecutionDeploymentReconcilerTest, method testUnknownDeployments.

@Test
public void testUnknownDeployments() {
    TestingExecutionDeploymentReconciliationHandler handler = new TestingExecutionDeploymentReconciliationHandler();
    DefaultExecutionDeploymentReconciler reconciler = new DefaultExecutionDeploymentReconciler(handler);
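    // generate() is presumably a static import of ResourceID.generate(); qualified here for clarity
    ResourceID resourceId = ResourceID.generate();
    ExecutionAttemptID attemptId = new ExecutionAttemptID();
    // the report contains one execution, while no deployments are expected
    reconciler.reconcileExecutionDeployments(
            resourceId,
            new ExecutionDeploymentReport(Collections.singleton(attemptId)),
            Collections.emptyMap());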
    assertThat(handler.getMissingExecutions(), empty());
    assertThat(handler.getUnknownExecutions(), hasItem(attemptId));
}
Also used: ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID), ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID), ExecutionDeploymentReport (org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport), Test (org.junit.Test)

Example 3 with ExecutionDeploymentReport

Use of org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport in the Apache Flink project.

Class DefaultExecutionDeploymentReconcilerTest, method testMissingAndUnknownDeployments.

@Test
public void testMissingAndUnknownDeployments() {
    TestingExecutionDeploymentReconciliationHandler handler = new TestingExecutionDeploymentReconciliationHandler();
    DefaultExecutionDeploymentReconciler reconciler = new DefaultExecutionDeploymentReconciler(handler);
    // generate() is presumably a static import of ResourceID.generate(); qualified here for clarity
    ResourceID resourceId = ResourceID.generate();
    ExecutionAttemptID unknownId = new ExecutionAttemptID();
    ExecutionAttemptID missingId = new ExecutionAttemptID();
    ExecutionAttemptID matchingId = new ExecutionAttemptID();
    // reported: {unknownId, matchingId}; expected as DEPLOYED: {missingId, matchingId}
    reconciler.reconcileExecutionDeployments(
            resourceId,
            new ExecutionDeploymentReport(new HashSet<>(Arrays.asList(unknownId, matchingId))),
            Stream.of(missingId, matchingId)
                    .collect(Collectors.toMap(x -> x, x -> ExecutionDeploymentState.DEPLOYED)));
    assertThat(handler.getMissingExecutions(), hasItem(missingId));
    assertThat(handler.getUnknownExecutions(), hasItem(unknownId));
}
Also used: ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID), ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID), ExecutionDeploymentReport (org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport), HashSet (java.util.HashSet), Test (org.junit.Test)

Example 4 with ExecutionDeploymentReport

Use of org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport in the Apache Flink project.

Class DefaultExecutionDeploymentReconcilerTest, method testPendingDeployments.

@Test
public void testPendingDeployments() {
    TestingExecutionDeploymentReconciliationHandler handler = new TestingExecutionDeploymentReconciliationHandler();
    DefaultExecutionDeploymentReconciler reconciler = new DefaultExecutionDeploymentReconciler(handler);
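    // generate() is presumably a static import of ResourceID.generate(); qualified here for clarity
    ResourceID resourceId = ResourceID.generate();
    ExecutionAttemptID matchingId = new ExecutionAttemptID();
    ExecutionAttemptID unknownId = new ExecutionAttemptID();
    ExecutionAttemptID missingId = new ExecutionAttemptID();
    // reported: {matchingId, unknownId}; expected as PENDING: {matchingId, missingId}
    // a PENDING deployment absent from the report must not be flagged as missing
    reconciler.reconcileExecutionDeployments(
            resourceId,
            new ExecutionDeploymentReport(new HashSet<>(Arrays.asList(matchingId, unknownId))),
            Stream.of(matchingId, missingId)
                    .collect(Collectors.toMap(x -> x, x -> ExecutionDeploymentState.PENDING)));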
    assertThat(handler.getMissingExecutions(), empty());
    assertThat(handler.getUnknownExecutions(), hasItem(unknownId));
}
Also used: ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID), ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID), ExecutionDeploymentReport (org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport), HashSet (java.util.HashSet), Test (org.junit.Test)
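
Taken together, Examples 2 to 4 pin down the reconciliation rule the tests expect: an execution that is reported but not expected is "unknown", and an execution that is expected in state DEPLOYED but not reported is "missing", while expected executions still in state PENDING are never flagged as missing. The sketch below reproduces that rule with plain collections. It is an illustration only, not the code of DefaultExecutionDeploymentReconciler: the class ReconciliationSketch, its State enum, the Result holder, and the String IDs are assumptions introduced here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ReconciliationSketch {

    /** Hypothetical stand-in for the deployment states used in the tests. */
    enum State { PENDING, DEPLOYED }

    /** Holds both outcomes of one reconciliation pass. */
    static final class Result {
        final Set<String> unknown;
        final Set<String> missing;

        Result(Set<String> unknown, Set<String> missing) {
            this.unknown = unknown;
            this.missing = missing;
        }
    }

    static Result reconcile(Set<String> reported, Map<String, State> expected) {
        // Unknown: everything that was reported but is not expected at all.
        Set<String> unknown = new HashSet<>(reported);
        unknown.removeAll(expected.keySet());

        // Missing: expected and already DEPLOYED, yet absent from the report.
        // PENDING deployments are skipped because they may not have been acknowledged yet.
        Set<String> missing = new HashSet<>();
        for (Map.Entry<String, State> entry : expected.entrySet()) {
            if (entry.getValue() == State.DEPLOYED && !reported.contains(entry.getKey())) {
                missing.add(entry.getKey());
            }
        }
        return new Result(unknown, missing);
    }

    public static void main(String[] args) {
        Set<String> reported = new HashSet<>(Arrays.asList("matching", "unknown"));
        Map<String, State> expected = new HashMap<>();
        expected.put("matching", State.DEPLOYED);
        expected.put("missing", State.DEPLOYED);

        Result result = reconcile(reported, expected);
        System.out.println("unknown=" + result.unknown + ", missing=" + result.missing);
    }
}
```

Switching both expected entries to State.PENDING in main mirrors testPendingDeployments: the unknown set stays the same and the missing set becomes empty.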

Example 5 with ExecutionDeploymentReport

Use of org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport in the Apache Flink project.

Class JobMasterExecutionDeploymentReconciliationTest, method testExecutionDeploymentReconciliationForPendingExecution.

/**
 * Tests that the job master does not issue a cancel call if the heartbeat reports an execution
 * for which the deployment was not yet acknowledged.
 */
@Test
public void testExecutionDeploymentReconciliationForPendingExecution() throws Exception {
    TestingExecutionDeploymentTrackerWrapper deploymentTrackerWrapper = new TestingExecutionDeploymentTrackerWrapper();
    final JobGraph jobGraph = JobGraphTestUtils.singleNoOpJobGraph();
    JobMaster jobMaster = createAndStartJobMaster(deploymentTrackerWrapper, jobGraph);
    JobMasterGateway jobMasterGateway = jobMaster.getSelfGateway(JobMasterGateway.class);
    RPC_SERVICE_RESOURCE.getTestingRpcService().registerGateway(jobMasterGateway.getAddress(), jobMasterGateway);
    final CompletableFuture<ExecutionAttemptID> taskSubmissionFuture = new CompletableFuture<>();
    final CompletableFuture<ExecutionAttemptID> taskCancellationFuture = new CompletableFuture<>();
    final CompletableFuture<Acknowledge> taskSubmissionAcknowledgeFuture = new CompletableFuture<>();
    TaskExecutorGateway taskExecutorGateway = createTaskExecutorGateway(taskCancellationFuture, taskSubmissionFuture, taskSubmissionAcknowledgeFuture);
    LocalUnresolvedTaskManagerLocation localUnresolvedTaskManagerLocation = new LocalUnresolvedTaskManagerLocation();
    registerTaskExecutorAndOfferSlots(jobMasterGateway, jobGraph.getJobID(), taskExecutorGateway, localUnresolvedTaskManagerLocation);
    ExecutionAttemptID pendingExecutionId = taskSubmissionFuture.get();
    // the execution has not been acknowledged yet by the TaskExecutor, but we already allow the
    // ID to be in the heartbeat payload
    jobMasterGateway.heartbeatFromTaskManager(
            localUnresolvedTaskManagerLocation.getResourceID(),
            new TaskExecutorToJobManagerHeartbeatPayload(
                    new AccumulatorReport(Collections.emptyList()),
                    new ExecutionDeploymentReport(Collections.singleton(pendingExecutionId))));
    taskSubmissionAcknowledgeFuture.complete(Acknowledge.get());
    deploymentTrackerWrapper.getTaskDeploymentFuture().get();
    assertFalse(taskCancellationFuture.isDone());
}
Also used: ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID), Acknowledge (org.apache.flink.runtime.messages.Acknowledge), TestingTaskExecutorGateway (org.apache.flink.runtime.taskexecutor.TestingTaskExecutorGateway), TaskExecutorGateway (org.apache.flink.runtime.taskexecutor.TaskExecutorGateway), JobGraph (org.apache.flink.runtime.jobgraph.JobGraph), CompletableFuture (java.util.concurrent.CompletableFuture), LocalUnresolvedTaskManagerLocation (org.apache.flink.runtime.taskmanager.LocalUnresolvedTaskManagerLocation), AccumulatorReport (org.apache.flink.runtime.taskexecutor.AccumulatorReport), TaskExecutorToJobManagerHeartbeatPayload (org.apache.flink.runtime.taskexecutor.TaskExecutorToJobManagerHeartbeatPayload), ExecutionDeploymentReport (org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport), Test (org.junit.Test)

Aggregations

ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID): 7
ExecutionDeploymentReport (org.apache.flink.runtime.taskexecutor.ExecutionDeploymentReport): 7
Test (org.junit.Test): 7
ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID): 5
HashSet (java.util.HashSet): 2
CompletableFuture (java.util.concurrent.CompletableFuture): 2
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph): 2
AccumulatorReport (org.apache.flink.runtime.taskexecutor.AccumulatorReport): 2
TaskExecutorGateway (org.apache.flink.runtime.taskexecutor.TaskExecutorGateway): 2
TaskExecutorToJobManagerHeartbeatPayload (org.apache.flink.runtime.taskexecutor.TaskExecutorToJobManagerHeartbeatPayload): 2
TestingTaskExecutorGateway (org.apache.flink.runtime.taskexecutor.TestingTaskExecutorGateway): 2
LocalUnresolvedTaskManagerLocation (org.apache.flink.runtime.taskmanager.LocalUnresolvedTaskManagerLocation): 2
JobMasterBuilder (org.apache.flink.runtime.jobmaster.utils.JobMasterBuilder): 1
Acknowledge (org.apache.flink.runtime.messages.Acknowledge): 1