
Example 6 with JobGraph

use of org.apache.flink.runtime.jobgraph.JobGraph in project flink by apache.

the class ExecutionGraphSchedulingTest method testScheduleSourceBeforeTarget.

// ------------------------------------------------------------------------
//  Tests
// ------------------------------------------------------------------------
/**
	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
	 * not deploy its task before the source vertex does.
	 */
@Test
public void testScheduleSourceBeforeTarget() throws Exception {
    //                                            [pipelined]
    //  we construct a simple graph    (source) ----------------> (target)
    final int parallelism = 1;
    final JobVertex sourceVertex = new JobVertex("source");
    sourceVertex.setParallelism(parallelism);
    sourceVertex.setInvokableClass(NoOpInvokable.class);
    final JobVertex targetVertex = new JobVertex("target");
    targetVertex.setParallelism(parallelism);
    targetVertex.setInvokableClass(NoOpInvokable.class);
    targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
    final JobID jobId = new JobID();
    final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
    final FlinkCompletableFuture<SimpleSlot> sourceFuture = new FlinkCompletableFuture<>();
    final FlinkCompletableFuture<SimpleSlot> targetFuture = new FlinkCompletableFuture<>();
    ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
    slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
    slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
    final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
    //  set up two TaskManager gateways and slots
    final TaskManagerGateway gatewaySource = createTaskManager();
    final TaskManagerGateway gatewayTarget = createTaskManager();
    final SimpleSlot sourceSlot = createSlot(gatewaySource, jobId);
    final SimpleSlot targetSlot = createSlot(gatewayTarget, jobId);
    eg.setScheduleMode(ScheduleMode.EAGER);
    eg.setQueuedSchedulingAllowed(true);
    eg.scheduleForExecution();
    // job should be running
    assertEquals(JobStatus.RUNNING, eg.getState());
    // we fulfill the target slot before the source slot
    // that should not cause a deployment or deployment related failure
    targetFuture.complete(targetSlot);
    verify(gatewayTarget, new Timeout(50, times(0))).submitTask(any(TaskDeploymentDescriptor.class), any(Time.class));
    assertEquals(JobStatus.RUNNING, eg.getState());
    // now supply the source slot
    sourceFuture.complete(sourceSlot);
    // by now, all deployments should have happened
    verify(gatewaySource, timeout(1000)).submitTask(any(TaskDeploymentDescriptor.class), any(Time.class));
    verify(gatewayTarget, timeout(1000)).submitTask(any(TaskDeploymentDescriptor.class), any(Time.class));
    assertEquals(JobStatus.RUNNING, eg.getState());
}
Also used : Timeout(org.mockito.verification.Timeout) TaskManagerGateway(org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway) Time(org.apache.flink.api.common.time.Time) FlinkCompletableFuture(org.apache.flink.runtime.concurrent.impl.FlinkCompletableFuture) SimpleSlot(org.apache.flink.runtime.instance.SimpleSlot) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) TaskDeploymentDescriptor(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)

Example 7 with JobGraph

use of org.apache.flink.runtime.jobgraph.JobGraph in project flink by apache.

the class ExecutionGraphSchedulingTest method testDeployPipelinedConnectedComponentsTogether.

/**
	 * This test verifies that a pipelined connected component is deployed only once the
	 * full set of slots is available, so that the system never deploys some tasks and
	 * only later realizes that not enough resources are available.
	 */
@Test
public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
    //                                            [pipelined]
    //  we construct a simple graph    (source) ----------------> (target)
    final int parallelism = 8;
    final JobVertex sourceVertex = new JobVertex("source");
    sourceVertex.setParallelism(parallelism);
    sourceVertex.setInvokableClass(NoOpInvokable.class);
    final JobVertex targetVertex = new JobVertex("target");
    targetVertex.setParallelism(parallelism);
    targetVertex.setInvokableClass(NoOpInvokable.class);
    targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
    final JobID jobId = new JobID();
    final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
    @SuppressWarnings({ "unchecked", "rawtypes" }) final FlinkCompletableFuture<SimpleSlot>[] sourceFutures = new FlinkCompletableFuture[parallelism];
    @SuppressWarnings({ "unchecked", "rawtypes" }) final FlinkCompletableFuture<SimpleSlot>[] targetFutures = new FlinkCompletableFuture[parallelism];
    //
    //  Create the slots, futures, and the slot provider
    final TaskManagerGateway[] sourceTaskManagers = new TaskManagerGateway[parallelism];
    final TaskManagerGateway[] targetTaskManagers = new TaskManagerGateway[parallelism];
    final SimpleSlot[] sourceSlots = new SimpleSlot[parallelism];
    final SimpleSlot[] targetSlots = new SimpleSlot[parallelism];
    for (int i = 0; i < parallelism; i++) {
        sourceTaskManagers[i] = createTaskManager();
        targetTaskManagers[i] = createTaskManager();
        sourceSlots[i] = createSlot(sourceTaskManagers[i], jobId);
        targetSlots[i] = createSlot(targetTaskManagers[i], jobId);
        sourceFutures[i] = new FlinkCompletableFuture<>();
        targetFutures[i] = new FlinkCompletableFuture<>();
    }
    ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
    slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
    slotProvider.addSlots(targetVertex.getID(), targetFutures);
    final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
    for (int i = 0; i < parallelism; i += 2) {
        sourceFutures[i].complete(sourceSlots[i]);
    }
    //
    //  kick off the scheduling
    eg.setScheduleMode(ScheduleMode.EAGER);
    eg.setQueuedSchedulingAllowed(true);
    eg.scheduleForExecution();
    verifyNothingDeployed(eg, sourceTaskManagers);
    //  complete the remaining sources
    for (int i = 1; i < parallelism; i += 2) {
        sourceFutures[i].complete(sourceSlots[i]);
    }
    verifyNothingDeployed(eg, sourceTaskManagers);
    //  complete the targets except for one
    for (int i = 1; i < parallelism; i++) {
        targetFutures[i].complete(targetSlots[i]);
    }
    verifyNothingDeployed(eg, targetTaskManagers);
    //  complete the last target slot future
    targetFutures[0].complete(targetSlots[0]);
    for (TaskManagerGateway gateway : sourceTaskManagers) {
        verify(gateway, timeout(50)).submitTask(any(TaskDeploymentDescriptor.class), any(Time.class));
    }
    for (TaskManagerGateway gateway : targetTaskManagers) {
        verify(gateway, timeout(50)).submitTask(any(TaskDeploymentDescriptor.class), any(Time.class));
    }
}
Also used : TaskManagerGateway(org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway) Time(org.apache.flink.api.common.time.Time) FlinkCompletableFuture(org.apache.flink.runtime.concurrent.impl.FlinkCompletableFuture) SimpleSlot(org.apache.flink.runtime.instance.SimpleSlot) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) TaskDeploymentDescriptor(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)
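The all-or-nothing behaviour that Example 7 checks can be sketched outside of Flink with the JDK's own java.util.concurrent.CompletableFuture. This is only an illustration under assumptions, not Flink's scheduler code: the class AllOrNothingDeploy, the method deployWhenAllReady, and the use of plain strings in place of SimpleSlot are hypothetical names chosen for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: not part of Flink. Strings stand in for slots.
final class AllOrNothingDeploy {

    /**
     * Returns a list that stays empty until every slot future has completed,
     * and is then filled with all slots in one step.
     */
    static List<String> deployWhenAllReady(List<CompletableFuture<String>> slotFutures) {
        List<String> deployed = new ArrayList<>();
        CompletableFuture<Void> all =
                CompletableFuture.allOf(slotFutures.toArray(new CompletableFuture[0]));
        // The callback runs only after the last outstanding future completes.
        all.thenRun(() -> {
            for (CompletableFuture<String> future : slotFutures) {
                deployed.add(future.join());
            }
        });
        return deployed;
    }

    public static void main(String[] args) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            futures.add(new CompletableFuture<>());
        }
        List<String> deployed = deployWhenAllReady(futures);

        // Complete all but one future: nothing is deployed yet.
        for (int i = 0; i < 3; i++) {
            futures.get(i).complete("slot-" + i);
        }
        System.out.println(deployed.size()); // prints 0

        // Completing the last future triggers the deployment of everything.
        futures.get(3).complete("slot-3");
        System.out.println(deployed); // prints [slot-0, slot-1, slot-2, slot-3]
    }
}
```

The sketch is single-threaded, so the callback runs inside the final complete(...) call; a real scheduler would also need to handle concurrent completion and failures.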

Example 8 with JobGraph

use of org.apache.flink.runtime.jobgraph.JobGraph in project flink by apache.

the class ExecutionGraphSchedulingTest method testExecutionJobVertexAllocateResourcesReleasesOnException.

/**
	 * Tests that the {@link ExecutionJobVertex#allocateResourcesForAll(SlotProvider, boolean)} method
	 * releases partially acquired resources upon exception.
	 */
@Test
public void testExecutionJobVertexAllocateResourcesReleasesOnException() throws Exception {
    final int parallelism = 8;
    final JobVertex vertex = new JobVertex("vertex");
    vertex.setParallelism(parallelism);
    vertex.setInvokableClass(NoOpInvokable.class);
    final JobID jobId = new JobID();
    final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
    // set up some available slots and some slot owner that accepts released slots back
    final List<SimpleSlot> returnedSlots = new ArrayList<>();
    final SlotOwner recycler = new SlotOwner() {

        @Override
        public boolean returnAllocatedSlot(Slot slot) {
            returnedSlots.add((SimpleSlot) slot);
            return true;
        }
    };
    // slot provider that hands out the three available slots, then throws an exception
    final SlotProvider slotProvider = mock(SlotProvider.class);
    final TaskManagerGateway taskManager = mock(TaskManagerGateway.class);
    final List<SimpleSlot> availableSlots = new ArrayList<>(Arrays.asList(createSlot(taskManager, jobId, recycler), createSlot(taskManager, jobId, recycler), createSlot(taskManager, jobId, recycler)));
    when(slotProvider.allocateSlot(any(ScheduledUnit.class), anyBoolean())).then(new Answer<Future<SimpleSlot>>() {

        @Override
        public Future<SimpleSlot> answer(InvocationOnMock invocation) {
            if (availableSlots.isEmpty()) {
                throw new TestRuntimeException();
            } else {
                return FlinkCompletableFuture.completed(availableSlots.remove(0));
            }
        }
    });
    final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
    final ExecutionJobVertex ejv = eg.getJobVertex(vertex.getID());
    // acquire resources and check that all are back after the failure
    final int numSlotsToExpectBack = availableSlots.size();
    try {
        ejv.allocateResourcesForAll(slotProvider, false);
        fail("should have failed with an exception");
    } catch (TestRuntimeException e) {
        // expected
    }
    assertEquals(numSlotsToExpectBack, returnedSlots.size());
}
Also used : SlotProvider(org.apache.flink.runtime.instance.SlotProvider) ArrayList(java.util.ArrayList) TaskManagerGateway(org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway) ScheduledUnit(org.apache.flink.runtime.jobmanager.scheduler.ScheduledUnit) SimpleSlot(org.apache.flink.runtime.instance.SimpleSlot) SlotOwner(org.apache.flink.runtime.jobmanager.slots.SlotOwner) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) InvocationOnMock(org.mockito.invocation.InvocationOnMock) Slot(org.apache.flink.runtime.instance.Slot) AllocatedSlot(org.apache.flink.runtime.jobmanager.slots.AllocatedSlot) Future(org.apache.flink.runtime.concurrent.Future) FlinkCompletableFuture(org.apache.flink.runtime.concurrent.impl.FlinkCompletableFuture) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)
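The release-on-exception contract tested in Example 8 can be illustrated with a small, Flink-independent sketch. It is an assumption about the general pattern, not the body of ExecutionJobVertex#allocateResourcesForAll: the class AllocateOrRelease, the method allocateAll, and the Supplier/Consumer parameters are hypothetical stand-ins for the slot provider and slot owner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch: not part of Flink.
final class AllocateOrRelease {

    /**
     * Requests {@code count} resources from {@code provider}. If a request throws,
     * every resource acquired so far is passed to {@code releaser} and the
     * exception is rethrown to the caller.
     */
    static <T> List<T> allocateAll(int count, Supplier<T> provider, Consumer<T> releaser) {
        List<T> acquired = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                acquired.add(provider.get());
            }
            return acquired;
        } catch (RuntimeException e) {
            // Hand back the partially acquired resources before propagating.
            for (T resource : acquired) {
                releaser.accept(resource);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        List<String> available = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> returned = new ArrayList<>();
        // Provider that hands out the available resources, then throws.
        Supplier<String> provider = () -> {
            if (available.isEmpty()) {
                throw new IllegalStateException("no slots left");
            }
            return available.remove(0);
        };
        try {
            allocateAll(8, provider, returned::add);
        } catch (IllegalStateException e) {
            System.out.println("returned " + returned.size() + " slots"); // prints "returned 3 slots"
        }
    }
}
```

As in the test, the check is that the number of resources handed back equals the number that had been handed out before the failure.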

Example 9 with JobGraph

use of org.apache.flink.runtime.jobgraph.JobGraph in project flink by apache.

the class ExecutionGraphSchedulingTest method testExecutionGraphScheduleReleasesResourcesOnException.

/**
	 * Tests that the {@link ExecutionGraph#scheduleForExecution()} method
	 * releases partially acquired resources upon exception.
	 */
@Test
public void testExecutionGraphScheduleReleasesResourcesOnException() throws Exception {
    //                                            [pipelined]
    //  we construct a simple graph    (source) ----------------> (target)
    final int parallelism = 3;
    final JobVertex sourceVertex = new JobVertex("source");
    sourceVertex.setParallelism(parallelism);
    sourceVertex.setInvokableClass(NoOpInvokable.class);
    final JobVertex targetVertex = new JobVertex("target");
    targetVertex.setParallelism(parallelism);
    targetVertex.setInvokableClass(NoOpInvokable.class);
    targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
    final JobID jobId = new JobID();
    final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
    // set up some available slots and some slot owner that accepts released slots back
    final List<SimpleSlot> returnedSlots = new ArrayList<>();
    final SlotOwner recycler = new SlotOwner() {

        @Override
        public boolean returnAllocatedSlot(Slot slot) {
            returnedSlots.add((SimpleSlot) slot);
            return true;
        }
    };
    final TaskManagerGateway taskManager = mock(TaskManagerGateway.class);
    final List<SimpleSlot> availableSlots = new ArrayList<>(Arrays.asList(createSlot(taskManager, jobId, recycler), createSlot(taskManager, jobId, recycler), createSlot(taskManager, jobId, recycler), createSlot(taskManager, jobId, recycler), createSlot(taskManager, jobId, recycler)));
    // slot provider that hands out the five available slots, then throws an exception
    final SlotProvider slotProvider = mock(SlotProvider.class);
    when(slotProvider.allocateSlot(any(ScheduledUnit.class), anyBoolean())).then(new Answer<Future<SimpleSlot>>() {

        @Override
        public Future<SimpleSlot> answer(InvocationOnMock invocation) {
            if (availableSlots.isEmpty()) {
                throw new TestRuntimeException();
            } else {
                return FlinkCompletableFuture.completed(availableSlots.remove(0));
            }
        }
    });
    final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
    // acquire resources and check that all are back after the failure
    final int numSlotsToExpectBack = availableSlots.size();
    try {
        eg.setScheduleMode(ScheduleMode.EAGER);
        eg.scheduleForExecution();
        fail("should have failed with an exception");
    } catch (TestRuntimeException e) {
        // expected
    }
    assertEquals(numSlotsToExpectBack, returnedSlots.size());
}
Also used : SlotProvider(org.apache.flink.runtime.instance.SlotProvider) ArrayList(java.util.ArrayList) TaskManagerGateway(org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway) ScheduledUnit(org.apache.flink.runtime.jobmanager.scheduler.ScheduledUnit) SimpleSlot(org.apache.flink.runtime.instance.SimpleSlot) SlotOwner(org.apache.flink.runtime.jobmanager.slots.SlotOwner) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) InvocationOnMock(org.mockito.invocation.InvocationOnMock) Slot(org.apache.flink.runtime.instance.Slot) AllocatedSlot(org.apache.flink.runtime.jobmanager.slots.AllocatedSlot) Future(org.apache.flink.runtime.concurrent.Future) FlinkCompletableFuture(org.apache.flink.runtime.concurrent.impl.FlinkCompletableFuture) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)

Example 10 with JobGraph

use of org.apache.flink.runtime.jobgraph.JobGraph in project flink by apache.

the class ExecutionGraphSchedulingTest method testOneSlotFailureAbortsDeploy.

/**
	 * This test verifies that if one slot future fails, the deployment will be aborted.
	 */
@Test
public void testOneSlotFailureAbortsDeploy() throws Exception {
    //                                            [pipelined]
    //  we construct a simple graph    (source) ----------------> (target)
    final int parallelism = 6;
    final JobVertex sourceVertex = new JobVertex("source");
    sourceVertex.setParallelism(parallelism);
    sourceVertex.setInvokableClass(NoOpInvokable.class);
    final JobVertex targetVertex = new JobVertex("target");
    targetVertex.setParallelism(parallelism);
    targetVertex.setInvokableClass(NoOpInvokable.class);
    targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
    final JobID jobId = new JobID();
    final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
    //
    //  Create the slots, futures, and the slot provider
    final TaskManagerGateway taskManager = mock(TaskManagerGateway.class);
    final SlotOwner slotOwner = mock(SlotOwner.class);
    final SimpleSlot[] sourceSlots = new SimpleSlot[parallelism];
    final SimpleSlot[] targetSlots = new SimpleSlot[parallelism];
    @SuppressWarnings({ "unchecked", "rawtypes" }) final FlinkCompletableFuture<SimpleSlot>[] sourceFutures = new FlinkCompletableFuture[parallelism];
    @SuppressWarnings({ "unchecked", "rawtypes" }) final FlinkCompletableFuture<SimpleSlot>[] targetFutures = new FlinkCompletableFuture[parallelism];
    for (int i = 0; i < parallelism; i++) {
        sourceSlots[i] = createSlot(taskManager, jobId, slotOwner);
        targetSlots[i] = createSlot(taskManager, jobId, slotOwner);
        sourceFutures[i] = new FlinkCompletableFuture<>();
        targetFutures[i] = new FlinkCompletableFuture<>();
    }
    ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
    slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
    slotProvider.addSlots(targetVertex.getID(), targetFutures);
    final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
    TerminalJobStatusListener testListener = new TerminalJobStatusListener();
    eg.registerJobStatusListener(testListener);
    for (int i = 0; i < parallelism; i += 2) {
        sourceFutures[i].complete(sourceSlots[i]);
        targetFutures[i + 1].complete(targetSlots[i + 1]);
    }
    //
    //  kick off the scheduling
    eg.setScheduleMode(ScheduleMode.EAGER);
    eg.setQueuedSchedulingAllowed(true);
    eg.scheduleForExecution();
    // fail one slot
    sourceFutures[1].completeExceptionally(new TestRuntimeException());
    // wait until the job failed as a whole
    testListener.waitForTerminalState(2000);
    // wait until all slots are back
    verify(slotOwner, new Timeout(2000, times(6))).returnAllocatedSlot(any(Slot.class));
    // no deployment calls must have happened
    verify(taskManager, times(0)).submitTask(any(TaskDeploymentDescriptor.class), any(Time.class));
    // all slots from completed futures must have been returned
    for (int i = 0; i < parallelism; i += 2) {
        assertTrue(sourceSlots[i].isCanceled());
        assertTrue(targetSlots[i + 1].isCanceled());
    }
}
Also used : Timeout(org.mockito.verification.Timeout) TaskManagerGateway(org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway) Time(org.apache.flink.api.common.time.Time) SimpleSlot(org.apache.flink.runtime.instance.SimpleSlot) FlinkCompletableFuture(org.apache.flink.runtime.concurrent.impl.FlinkCompletableFuture) SlotOwner(org.apache.flink.runtime.jobmanager.slots.SlotOwner) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) Slot(org.apache.flink.runtime.instance.Slot) AllocatedSlot(org.apache.flink.runtime.jobmanager.slots.AllocatedSlot) TaskDeploymentDescriptor(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)
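The abort-on-first-failure behaviour in Example 10 differs from CompletableFuture.allOf, which waits for every future before reporting a failure. A fail-fast variant can be sketched with plain JDK futures as follows. This is a hypothetical illustration, not Flink's implementation: FailFastSlots, awaitAllOrAbort, and the releaser callback are invented for the sketch, and slots that arrive after the abort are deliberately not handled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: not part of Flink. Strings stand in for slots.
final class FailFastSlots {

    /**
     * Completes normally once all futures have completed, or exceptionally as
     * soon as the first one fails. On failure, every slot that had already
     * arrived is passed to {@code releaser}.
     */
    static <T> CompletableFuture<Void> awaitAllOrAbort(
            List<CompletableFuture<T>> futures, Consumer<T> releaser) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        AtomicInteger remaining = new AtomicInteger(futures.size());
        if (futures.isEmpty()) {
            result.complete(null);
        }
        // On abort, hand back the slots of all futures that completed normally.
        result.whenComplete((ignored, error) -> {
            if (error != null) {
                for (CompletableFuture<T> future : futures) {
                    if (future.isDone() && !future.isCompletedExceptionally()) {
                        releaser.accept(future.join());
                    }
                }
            }
        });
        for (CompletableFuture<T> future : futures) {
            future.whenComplete((value, error) -> {
                if (error != null) {
                    result.completeExceptionally(error); // first failure wins
                } else if (remaining.decrementAndGet() == 0) {
                    result.complete(null);
                }
            });
        }
        return result;
    }

    public static void main(String[] args) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            futures.add(new CompletableFuture<>());
        }
        List<String> released = new ArrayList<>();
        CompletableFuture<Void> result = awaitAllOrAbort(futures, released::add);

        futures.get(0).complete("slot-0");
        futures.get(2).complete("slot-2");
        // One failure aborts the whole request, although slot 3 is still pending.
        futures.get(1).completeExceptionally(new IllegalStateException("slot lost"));

        System.out.println(result.isCompletedExceptionally()); // prints true
        System.out.println(released); // prints [slot-0, slot-2]
    }
}
```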
