Search in sources :

Example 16 with JobSnapshottingSettings

use of org.apache.flink.runtime.jobgraph.tasks.JobSnapshottingSettings in project flink by apache.

the class JobManagerTest method testSavepointWithDeactivatedPeriodicCheckpointing.

/**
	 * Tests that we can trigger a savepoint when periodic checkpoints are disabled.
	 */
@Test
public void testSavepointWithDeactivatedPeriodicCheckpointing() throws Exception {
    File defaultSavepointDir = tmpFolder.newFolder();
    FiniteDuration timeout = new FiniteDuration(30, TimeUnit.SECONDS);
    Configuration config = new Configuration();
    config.setString(ConfigConstants.SAVEPOINT_DIRECTORY_KEY, defaultSavepointDir.getAbsolutePath());
    ActorSystem actorSystem = null;
    ActorGateway jobManager = null;
    ActorGateway archiver = null;
    ActorGateway taskManager = null;
    try {
        actorSystem = AkkaUtils.createLocalActorSystem(new Configuration());
        Tuple2<ActorRef, ActorRef> master = JobManager.startJobManagerActors(config, actorSystem, TestingUtils.defaultExecutor(), TestingUtils.defaultExecutor(), Option.apply("jm"), Option.apply("arch"), TestingJobManager.class, TestingMemoryArchivist.class);
        jobManager = new AkkaActorGateway(master._1(), null);
        archiver = new AkkaActorGateway(master._2(), null);
        ActorRef taskManagerRef = TaskManager.startTaskManagerComponentsAndActor(config, ResourceID.generate(), actorSystem, "localhost", Option.apply("tm"), Option.<LeaderRetrievalService>apply(new StandaloneLeaderRetrievalService(jobManager.path())), true, TestingTaskManager.class);
        taskManager = new AkkaActorGateway(taskManagerRef, null);
        // Wait until connected
        Object msg = new TestingTaskManagerMessages.NotifyWhenRegisteredAtJobManager(jobManager.actor());
        Await.ready(taskManager.ask(msg, timeout), timeout);
        // Create job graph
        JobVertex sourceVertex = new JobVertex("Source");
        sourceVertex.setInvokableClass(BlockingStatefulInvokable.class);
        sourceVertex.setParallelism(1);
        JobGraph jobGraph = new JobGraph("TestingJob", sourceVertex);
        JobSnapshottingSettings snapshottingSettings = new JobSnapshottingSettings(Collections.singletonList(sourceVertex.getID()), Collections.singletonList(sourceVertex.getID()), Collections.singletonList(sourceVertex.getID()), // deactivated checkpointing
        Long.MAX_VALUE, 360000, 0, Integer.MAX_VALUE, ExternalizedCheckpointSettings.none(), null, true);
        jobGraph.setSnapshotSettings(snapshottingSettings);
        // Submit job graph
        msg = new JobManagerMessages.SubmitJob(jobGraph, ListeningBehaviour.DETACHED);
        Await.result(jobManager.ask(msg, timeout), timeout);
        // Wait for all tasks to be running
        msg = new TestingJobManagerMessages.WaitForAllVerticesToBeRunning(jobGraph.getJobID());
        Await.result(jobManager.ask(msg, timeout), timeout);
        // Cancel with savepoint
        File targetDirectory = tmpFolder.newFolder();
        msg = new TriggerSavepoint(jobGraph.getJobID(), Option.apply(targetDirectory.getAbsolutePath()));
        Future<Object> future = jobManager.ask(msg, timeout);
        Object result = Await.result(future, timeout);
        assertTrue("Did not trigger savepoint", result instanceof TriggerSavepointSuccess);
        assertEquals(1, targetDirectory.listFiles().length);
    } finally {
        if (actorSystem != null) {
            actorSystem.shutdown();
        }
        if (archiver != null) {
            archiver.actor().tell(PoisonPill.getInstance(), ActorRef.noSender());
        }
        if (jobManager != null) {
            jobManager.actor().tell(PoisonPill.getInstance(), ActorRef.noSender());
        }
        if (taskManager != null) {
            taskManager.actor().tell(PoisonPill.getInstance(), ActorRef.noSender());
        }
    }
}
Also used : ActorSystem(akka.actor.ActorSystem) AkkaActorGateway(org.apache.flink.runtime.instance.AkkaActorGateway) Configuration(org.apache.flink.configuration.Configuration) ActorRef(akka.actor.ActorRef) JobSnapshottingSettings(org.apache.flink.runtime.jobgraph.tasks.JobSnapshottingSettings) JobManagerMessages(org.apache.flink.runtime.messages.JobManagerMessages) TestingJobManagerMessages(org.apache.flink.runtime.testingUtils.TestingJobManagerMessages) FiniteDuration(scala.concurrent.duration.FiniteDuration) SubmitJob(org.apache.flink.runtime.messages.JobManagerMessages.SubmitJob) TriggerSavepointSuccess(org.apache.flink.runtime.messages.JobManagerMessages.TriggerSavepointSuccess) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) TestingJobManagerMessages(org.apache.flink.runtime.testingUtils.TestingJobManagerMessages) StandaloneLeaderRetrievalService(org.apache.flink.runtime.leaderretrieval.StandaloneLeaderRetrievalService) ActorGateway(org.apache.flink.runtime.instance.ActorGateway) AkkaActorGateway(org.apache.flink.runtime.instance.AkkaActorGateway) TriggerSavepoint(org.apache.flink.runtime.messages.JobManagerMessages.TriggerSavepoint) File(java.io.File) WaitForAllVerticesToBeRunning(org.apache.flink.runtime.testingUtils.TestingJobManagerMessages.WaitForAllVerticesToBeRunning) Test(org.junit.Test)

Example 17 with JobSnapshottingSettings

use of org.apache.flink.runtime.jobgraph.tasks.JobSnapshottingSettings in project flink by apache.

the class JobManagerTest method testCancelWithSavepointNoDirectoriesConfigured.

/**
	 * Tests that a meaningful exception is returned if no savepoint directory is
	 * configured.
	 */
@Test
public void testCancelWithSavepointNoDirectoriesConfigured() throws Exception {
    FiniteDuration timeout = new FiniteDuration(30, TimeUnit.SECONDS);
    Configuration config = new Configuration();
    ActorSystem actorSystem = null;
    ActorGateway jobManager = null;
    ActorGateway archiver = null;
    ActorGateway taskManager = null;
    try {
        actorSystem = AkkaUtils.createLocalActorSystem(new Configuration());
        Tuple2<ActorRef, ActorRef> master = JobManager.startJobManagerActors(config, actorSystem, TestingUtils.defaultExecutor(), TestingUtils.defaultExecutor(), Option.apply("jm"), Option.apply("arch"), TestingJobManager.class, TestingMemoryArchivist.class);
        jobManager = new AkkaActorGateway(master._1(), null);
        archiver = new AkkaActorGateway(master._2(), null);
        ActorRef taskManagerRef = TaskManager.startTaskManagerComponentsAndActor(config, ResourceID.generate(), actorSystem, "localhost", Option.apply("tm"), Option.<LeaderRetrievalService>apply(new StandaloneLeaderRetrievalService(jobManager.path())), true, TestingTaskManager.class);
        taskManager = new AkkaActorGateway(taskManagerRef, null);
        // Wait until connected
        Object msg = new TestingTaskManagerMessages.NotifyWhenRegisteredAtJobManager(jobManager.actor());
        Await.ready(taskManager.ask(msg, timeout), timeout);
        // Create job graph
        JobVertex sourceVertex = new JobVertex("Source");
        sourceVertex.setInvokableClass(BlockingStatefulInvokable.class);
        sourceVertex.setParallelism(1);
        JobGraph jobGraph = new JobGraph("TestingJob", sourceVertex);
        JobSnapshottingSettings snapshottingSettings = new JobSnapshottingSettings(Collections.singletonList(sourceVertex.getID()), Collections.singletonList(sourceVertex.getID()), Collections.singletonList(sourceVertex.getID()), 3600000, 3600000, 0, Integer.MAX_VALUE, ExternalizedCheckpointSettings.none(), null, true);
        jobGraph.setSnapshotSettings(snapshottingSettings);
        // Submit job graph
        msg = new JobManagerMessages.SubmitJob(jobGraph, ListeningBehaviour.DETACHED);
        Await.result(jobManager.ask(msg, timeout), timeout);
        // Wait for all tasks to be running
        msg = new TestingJobManagerMessages.WaitForAllVerticesToBeRunning(jobGraph.getJobID());
        Await.result(jobManager.ask(msg, timeout), timeout);
        // Cancel with savepoint
        msg = new JobManagerMessages.CancelJobWithSavepoint(jobGraph.getJobID(), null);
        CancellationResponse cancelResp = (CancellationResponse) Await.result(jobManager.ask(msg, timeout), timeout);
        if (cancelResp instanceof CancellationFailure) {
            CancellationFailure failure = (CancellationFailure) cancelResp;
            assertTrue(failure.cause() instanceof IllegalStateException);
            assertTrue(failure.cause().getMessage().contains("savepoint directory"));
        } else {
            fail("Unexpected cancellation response from JobManager: " + cancelResp);
        }
    } finally {
        if (actorSystem != null) {
            actorSystem.shutdown();
        }
        if (archiver != null) {
            archiver.actor().tell(PoisonPill.getInstance(), ActorRef.noSender());
        }
        if (jobManager != null) {
            jobManager.actor().tell(PoisonPill.getInstance(), ActorRef.noSender());
        }
        if (taskManager != null) {
            taskManager.actor().tell(PoisonPill.getInstance(), ActorRef.noSender());
        }
    }
}
Also used : ActorSystem(akka.actor.ActorSystem) AkkaActorGateway(org.apache.flink.runtime.instance.AkkaActorGateway) Configuration(org.apache.flink.configuration.Configuration) ActorRef(akka.actor.ActorRef) JobSnapshottingSettings(org.apache.flink.runtime.jobgraph.tasks.JobSnapshottingSettings) JobManagerMessages(org.apache.flink.runtime.messages.JobManagerMessages) TestingJobManagerMessages(org.apache.flink.runtime.testingUtils.TestingJobManagerMessages) FiniteDuration(scala.concurrent.duration.FiniteDuration) SubmitJob(org.apache.flink.runtime.messages.JobManagerMessages.SubmitJob) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) TestingJobManagerMessages(org.apache.flink.runtime.testingUtils.TestingJobManagerMessages) StandaloneLeaderRetrievalService(org.apache.flink.runtime.leaderretrieval.StandaloneLeaderRetrievalService) ActorGateway(org.apache.flink.runtime.instance.ActorGateway) AkkaActorGateway(org.apache.flink.runtime.instance.AkkaActorGateway) CancellationFailure(org.apache.flink.runtime.messages.JobManagerMessages.CancellationFailure) WaitForAllVerticesToBeRunning(org.apache.flink.runtime.testingUtils.TestingJobManagerMessages.WaitForAllVerticesToBeRunning) CancellationResponse(org.apache.flink.runtime.messages.JobManagerMessages.CancellationResponse) Test(org.junit.Test)

Example 18 with JobSnapshottingSettings

use of org.apache.flink.runtime.jobgraph.tasks.JobSnapshottingSettings in project flink by apache.

the class JobSubmitTest method createSimpleJobGraph.

private JobGraph createSimpleJobGraph() {
    JobVertex jobVertex = new JobVertex("Vertex");
    jobVertex.setInvokableClass(NoOpInvokable.class);
    List<JobVertexID> vertexIdList = Collections.singletonList(jobVertex.getID());
    JobGraph jg = new JobGraph("test job", jobVertex);
    jg.setSnapshotSettings(new JobSnapshottingSettings(vertexIdList, vertexIdList, vertexIdList, 5000, 5000, 0L, 10, ExternalizedCheckpointSettings.none(), null, true));
    return jg;
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) JobSnapshottingSettings(org.apache.flink.runtime.jobgraph.tasks.JobSnapshottingSettings)

Aggregations

JobSnapshottingSettings (org.apache.flink.runtime.jobgraph.tasks.JobSnapshottingSettings)18 Test (org.junit.Test)12 JobGraph (org.apache.flink.runtime.jobgraph.JobGraph)11 JobVertex (org.apache.flink.runtime.jobgraph.JobVertex)11 Configuration (org.apache.flink.configuration.Configuration)8 JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID)8 FiniteDuration (scala.concurrent.duration.FiniteDuration)8 ActorGateway (org.apache.flink.runtime.instance.ActorGateway)7 JobManagerMessages (org.apache.flink.runtime.messages.JobManagerMessages)7 TestingJobManagerMessages (org.apache.flink.runtime.testingUtils.TestingJobManagerMessages)6 ActorRef (akka.actor.ActorRef)5 AkkaActorGateway (org.apache.flink.runtime.instance.AkkaActorGateway)5 ActorSystem (akka.actor.ActorSystem)4 ExternalizedCheckpointSettings (org.apache.flink.runtime.jobgraph.tasks.ExternalizedCheckpointSettings)4 StandaloneLeaderRetrievalService (org.apache.flink.runtime.leaderretrieval.StandaloneLeaderRetrievalService)4 SubmitJob (org.apache.flink.runtime.messages.JobManagerMessages.SubmitJob)4 WaitForAllVerticesToBeRunning (org.apache.flink.runtime.testingUtils.TestingJobManagerMessages.WaitForAllVerticesToBeRunning)4 File (java.io.File)3 JobID (org.apache.flink.api.common.JobID)3 AccessExecutionGraph (org.apache.flink.runtime.executiongraph.AccessExecutionGraph)3