Example 1 with FixedDelayRestartStrategy

Use of org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy in the Apache Flink project.

Class ExecutionGraphRestartTest, method testConstraintsAfterRestart.

@Test
public void testConstraintsAfterRestart() throws Exception {
    // Set up the instance and the scheduler
    Instance instance = ExecutionGraphTestUtils.getInstance(new ActorTaskManagerGateway(new SimpleActorGateway(TestingUtils.directExecutionContext())), NUM_TASKS);
    Scheduler scheduler = new Scheduler(TestingUtils.defaultExecutionContext());
    scheduler.newInstanceAvailable(instance);
    JobVertex groupVertex = newJobVertex("Task1", NUM_TASKS, NoOpInvokable.class);
    JobVertex groupVertex2 = newJobVertex("Task2", NUM_TASKS, NoOpInvokable.class);
    SlotSharingGroup sharingGroup = new SlotSharingGroup();
    groupVertex.setSlotSharingGroup(sharingGroup);
    groupVertex2.setSlotSharingGroup(sharingGroup);
    groupVertex.setStrictlyCoLocatedWith(groupVertex2);
    // Create the job graph and schedule the job
    JobGraph jobGraph = new JobGraph("Pointwise job", groupVertex, groupVertex2);
    ExecutionGraph eg = newExecutionGraph(new FixedDelayRestartStrategy(1, 0L), scheduler);
    eg.attachJobGraph(jobGraph.getVerticesSortedTopologicallyFromSources());
    assertEquals(JobStatus.CREATED, eg.getState());
    eg.scheduleForExecution();
    assertEquals(JobStatus.RUNNING, eg.getState());
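    // Sanity checks before the failure
    validateConstraints(eg);
    // Fail the job and let the restart strategy restart it automatically
    restartAfterFailure(eg, new FiniteDuration(2, TimeUnit.MINUTES), false);
    // Check that the execution vertex constraints still hold after the restart
    validateConstraints(eg);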
    haltExecution(eg);
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) Instance(org.apache.flink.runtime.instance.Instance) FixedDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy) Scheduler(org.apache.flink.runtime.jobmanager.scheduler.Scheduler) FiniteDuration(scala.concurrent.duration.FiniteDuration) SimpleActorGateway(org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.SimpleActorGateway) SlotSharingGroup(org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup) ActorTaskManagerGateway(org.apache.flink.runtime.jobmanager.slots.ActorTaskManagerGateway) Test(org.junit.Test)

Example 2 with FixedDelayRestartStrategy

Use of org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy in the Apache Flink project.

Class ExecutionVertexLocalityTest, method createTestGraph.

// ------------------------------------------------------------------------
//  Utilities
// ------------------------------------------------------------------------
/**
 * Creates a simple two-vertex graph with a parallel source and a parallel target.
 */
private ExecutionGraph createTestGraph(int parallelism, boolean allToAll) throws Exception {
    JobVertex source = new JobVertex("source", sourceVertexId);
    source.setParallelism(parallelism);
    source.setInvokableClass(NoOpInvokable.class);
    JobVertex target = new JobVertex("target", targetVertexId);
    target.setParallelism(parallelism);
    target.setInvokableClass(NoOpInvokable.class);
    DistributionPattern connectionPattern = allToAll ? DistributionPattern.ALL_TO_ALL : DistributionPattern.POINTWISE;
    target.connectNewDataSetAsInput(source, connectionPattern, ResultPartitionType.PIPELINED);
    JobGraph testJob = new JobGraph(jobId, "test job", source, target);
    return ExecutionGraphBuilder.buildGraph(null, testJob, new Configuration(), TestingUtils.defaultExecutor(), TestingUtils.defaultExecutor(), mock(SlotProvider.class), getClass().getClassLoader(), new StandaloneCheckpointRecoveryFactory(), Time.of(10, TimeUnit.SECONDS), new FixedDelayRestartStrategy(10, 0L), new UnregisteredMetricsGroup(), 1, log);
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) UnregisteredMetricsGroup(org.apache.flink.metrics.groups.UnregisteredMetricsGroup) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) SlotProvider(org.apache.flink.runtime.instance.SlotProvider) StandaloneCheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory) Configuration(org.apache.flink.configuration.Configuration) FixedDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy) DistributionPattern(org.apache.flink.runtime.jobgraph.DistributionPattern)

Example 3 with FixedDelayRestartStrategy

Use of org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy in the Apache Flink project.

Class ExecutionGraphRestartTest, method testRestartAutomatically.

@Test
public void testRestartAutomatically() throws Exception {
    RestartStrategy restartStrategy = new FixedDelayRestartStrategy(1, 1000);
    Tuple2<ExecutionGraph, Instance> executionGraphInstanceTuple = createExecutionGraph(restartStrategy);
    ExecutionGraph eg = executionGraphInstanceTuple.f0;
    restartAfterFailure(eg, new FiniteDuration(2, TimeUnit.MINUTES), true);
}
Also used : FixedDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy) Instance(org.apache.flink.runtime.instance.Instance) FailureRateRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FailureRateRestartStrategy) InfiniteDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy) NoRestartStrategy(org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy) RestartStrategy(org.apache.flink.runtime.executiongraph.restart.RestartStrategy) FiniteDuration(scala.concurrent.duration.FiniteDuration) Test(org.junit.Test)
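
The two constructor arguments used in these examples are the maximum number of restart attempts and the delay between attempts in milliseconds. The sketch below is a simplified, hypothetical model of that policy, written only to illustrate the semantics. The class FixedDelaySketch and its methods are assumptions made for this illustration and are not part of Flink, whose implementation schedules the restart asynchronously on an execution graph.

```java
/**
 * Hypothetical, simplified model of a fixed-delay restart policy.
 * This is NOT Flink's FixedDelayRestartStrategy; names and behavior are
 * assumptions chosen for illustration.
 */
public class FixedDelaySketch {
    private final int maxAttempts;   // maximum number of restarts allowed
    private final long delayMillis;  // fixed pause before each restart
    private int currentAttempt = 0;  // restarts performed so far

    public FixedDelaySketch(int maxAttempts, long delayMillis) {
        if (maxAttempts < 0 || delayMillis < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    /** True while the attempt budget is not exhausted. */
    public boolean canRestart() {
        return currentAttempt < maxAttempts;
    }

    public int getCurrentAttempt() {
        return currentAttempt;
    }

    /**
     * Consumes one attempt, waits for the fixed delay, then runs the action.
     * Returns false without running the action if no attempts are left or
     * the wait is interrupted.
     */
    public boolean restart(Runnable restartAction) {
        if (!canRestart()) {
            return false;
        }
        currentAttempt++;
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
        restartAction.run();
        return true;
    }

    public static void main(String[] args) {
        // Same arguments as in Example 1: one attempt, no delay.
        FixedDelaySketch strategy = new FixedDelaySketch(1, 0L);
        int[] restarts = {0};
        boolean first = strategy.restart(() -> restarts[0]++);
        boolean second = strategy.restart(() -> restarts[0]++);
        System.out.println(first + " " + second + " " + restarts[0]); // prints "true false 1"
    }
}
```

With one allowed attempt, the first failure triggers a restart and the second is refused, which is the budget the tests above rely on when they pass 1 as the first argument.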

Example 4 with FixedDelayRestartStrategy

Use of org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy in the Apache Flink project.

Class ExecutionGraphRestartTest, method testNoRestartOnSuppressException.

@Test
public void testNoRestartOnSuppressException() throws Exception {
    Tuple2<ExecutionGraph, Instance> executionGraphInstanceTuple = createSpyExecutionGraph(new FixedDelayRestartStrategy(1, 1000));
    ExecutionGraph eg = executionGraphInstanceTuple.f0;
    // Fail with unrecoverable Exception
    eg.getAllExecutionVertices().iterator().next().fail(new SuppressRestartsException(new Exception("Test Exception")));
    assertEquals(JobStatus.FAILING, eg.getState());
    for (ExecutionVertex vertex : eg.getAllExecutionVertices()) {
        vertex.getCurrentExecutionAttempt().cancelingComplete();
    }
    FiniteDuration timeout = new FiniteDuration(2, TimeUnit.MINUTES);
    // Wait (up to the timeout) for the job to reach FAILED; no restart is expected
    Deadline deadline = timeout.fromNow();
    while (deadline.hasTimeLeft() && eg.getState() != JobStatus.FAILED) {
        Thread.sleep(100);
    }
    assertEquals(JobStatus.FAILED, eg.getState());
    // No restart
    verify(eg, never()).restart();
    RestartStrategy restartStrategy = eg.getRestartStrategy();
    assertTrue(restartStrategy instanceof FixedDelayRestartStrategy);
    assertEquals(0, ((FixedDelayRestartStrategy) restartStrategy).getCurrentRestartAttempt());
}
Also used : SuppressRestartsException(org.apache.flink.runtime.execution.SuppressRestartsException) FixedDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy) Instance(org.apache.flink.runtime.instance.Instance) Deadline(scala.concurrent.duration.Deadline) FiniteDuration(scala.concurrent.duration.FiniteDuration) FailureRateRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FailureRateRestartStrategy) InfiniteDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy) NoRestartStrategy(org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy) RestartStrategy(org.apache.flink.runtime.executiongraph.restart.RestartStrategy) IOException(java.io.IOException) Test(org.junit.Test)

Aggregations

FixedDelayRestartStrategy (org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy) 4
Instance (org.apache.flink.runtime.instance.Instance) 3
Test (org.junit.Test) 3
FiniteDuration (scala.concurrent.duration.FiniteDuration) 3
FailureRateRestartStrategy (org.apache.flink.runtime.executiongraph.restart.FailureRateRestartStrategy) 2
InfiniteDelayRestartStrategy (org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy) 2
NoRestartStrategy (org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy) 2
RestartStrategy (org.apache.flink.runtime.executiongraph.restart.RestartStrategy) 2
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph) 2
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex) 2
IOException (java.io.IOException) 1
Configuration (org.apache.flink.configuration.Configuration) 1
UnregisteredMetricsGroup (org.apache.flink.metrics.groups.UnregisteredMetricsGroup) 1
StandaloneCheckpointRecoveryFactory (org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory) 1
SuppressRestartsException (org.apache.flink.runtime.execution.SuppressRestartsException) 1
SimpleActorGateway (org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.SimpleActorGateway) 1
SlotProvider (org.apache.flink.runtime.instance.SlotProvider) 1
DistributionPattern (org.apache.flink.runtime.jobgraph.DistributionPattern) 1
Scheduler (org.apache.flink.runtime.jobmanager.scheduler.Scheduler) 1
SlotSharingGroup (org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup) 1