Example 16 with SavepointRestoreSettings

use of org.apache.flink.runtime.jobgraph.SavepointRestoreSettings in project beam by apache.

the class FlinkRequiresStableInputTest method restoreFromSavepoint.

private JobID restoreFromSavepoint(Pipeline pipeline, String savepointDir) throws ExecutionException, InterruptedException {
    JobGraph jobGraph = getJobGraph(pipeline);
    SavepointRestoreSettings savepointSettings = SavepointRestoreSettings.forPath(savepointDir);
    jobGraph.setSavepointRestoreSettings(savepointSettings);
    return flinkCluster.submitJob(jobGraph).get().getJobID();
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) SavepointRestoreSettings(org.apache.flink.runtime.jobgraph.SavepointRestoreSettings)

Example 17 with SavepointRestoreSettings

use of org.apache.flink.runtime.jobgraph.SavepointRestoreSettings in project flink by apache.

the class StreamingJobGraphGeneratorTest method generatorForwardsSavepointRestoreSettings.

@Test
public void generatorForwardsSavepointRestoreSettings() {
    StreamGraph streamGraph = new StreamGraph(new ExecutionConfig(), new CheckpointConfig(), SavepointRestoreSettings.forPath("hello"));
    JobGraph jobGraph = StreamingJobGraphGenerator.createJobGraph(streamGraph);
    SavepointRestoreSettings savepointRestoreSettings = jobGraph.getSavepointRestoreSettings();
    assertThat(savepointRestoreSettings.getRestorePath(), is("hello"));
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) CheckpointConfig(org.apache.flink.streaming.api.environment.CheckpointConfig) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) SavepointRestoreSettings(org.apache.flink.runtime.jobgraph.SavepointRestoreSettings) Test(org.junit.Test)

Example 18 with SavepointRestoreSettings

use of org.apache.flink.runtime.jobgraph.SavepointRestoreSettings in project flink by apache.

the class JobMasterTest method testCheckpointPrecedesSavepointRecovery.

/**
 * Tests that an existing checkpoint will have precedence over a savepoint.
 */
@Test
public void testCheckpointPrecedesSavepointRecovery() throws Exception {
    // create savepoint data
    final long savepointId = 42L;
    final File savepointFile = createSavepoint(savepointId);
    // set savepoint settings
    final SavepointRestoreSettings savepointRestoreSettings = SavepointRestoreSettings.forPath(savepointFile.getAbsolutePath(), true);
    final JobGraph jobGraph = createJobGraphWithCheckpointing(savepointRestoreSettings);
    final long checkpointId = 1L;
    final CompletedCheckpoint completedCheckpoint = new CompletedCheckpoint(jobGraph.getJobID(), checkpointId, 1L, 1L, Collections.emptyMap(), null, CheckpointProperties.forCheckpoint(CheckpointRetentionPolicy.NEVER_RETAIN_AFTER_TERMINATION), new DummyCheckpointStorageLocation());
    final StandaloneCompletedCheckpointStore completedCheckpointStore = new StandaloneCompletedCheckpointStore(1);
    completedCheckpointStore.addCheckpointAndSubsumeOldestOne(completedCheckpoint, new CheckpointsCleaner(), () -> {
    });
    final CheckpointRecoveryFactory testingCheckpointRecoveryFactory = PerJobCheckpointRecoveryFactory.withoutCheckpointStoreRecovery(maxCheckpoints -> completedCheckpointStore);
    haServices.setCheckpointRecoveryFactory(testingCheckpointRecoveryFactory);
    final JobMaster jobMaster = new JobMasterBuilder(jobGraph, rpcService).createJobMaster();
    try {
        // the existing checkpoint, not the savepoint, should still be the latest one in the store
        final CompletedCheckpoint savepointCheckpoint = completedCheckpointStore.getLatestCheckpoint();
        assertThat(savepointCheckpoint, Matchers.notNullValue());
        assertThat(savepointCheckpoint.getCheckpointID(), is(checkpointId));
    } finally {
        RpcUtils.terminateRpcEndpoint(jobMaster, testingTimeout);
    }
}
Also used : CompletedCheckpoint(org.apache.flink.runtime.checkpoint.CompletedCheckpoint) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) StandaloneCompletedCheckpointStore(org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore) CheckpointsCleaner(org.apache.flink.runtime.checkpoint.CheckpointsCleaner) PerJobCheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.PerJobCheckpointRecoveryFactory) StandaloneCheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory) CheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.CheckpointRecoveryFactory) File(java.io.File) JobMasterBuilder(org.apache.flink.runtime.jobmaster.utils.JobMasterBuilder) SavepointRestoreSettings(org.apache.flink.runtime.jobgraph.SavepointRestoreSettings) Test(org.junit.Test)

Example 19 with SavepointRestoreSettings

use of org.apache.flink.runtime.jobgraph.SavepointRestoreSettings in project flink by apache.

the class JobMasterTest method testRestoringFromSavepoint.

/**
 * Tests that a JobMaster will restore the given JobGraph from its savepoint upon initial
 * submission.
 */
@Test
public void testRestoringFromSavepoint() throws Exception {
    // create savepoint data
    final long savepointId = 42L;
    final File savepointFile = createSavepoint(savepointId);
    // set savepoint settings
    final SavepointRestoreSettings savepointRestoreSettings = SavepointRestoreSettings.forPath(savepointFile.getAbsolutePath(), true);
    final JobGraph jobGraph = createJobGraphWithCheckpointing(savepointRestoreSettings);
    final StandaloneCompletedCheckpointStore completedCheckpointStore = new StandaloneCompletedCheckpointStore(1);
    final CheckpointRecoveryFactory testingCheckpointRecoveryFactory = PerJobCheckpointRecoveryFactory.withoutCheckpointStoreRecovery(maxCheckpoints -> completedCheckpointStore);
    haServices.setCheckpointRecoveryFactory(testingCheckpointRecoveryFactory);
    final JobMaster jobMaster = new JobMasterBuilder(jobGraph, rpcService).withHighAvailabilityServices(haServices).createJobMaster();
    try {
        // we need to start and register the required slots to let the adaptive scheduler
        // restore from the savepoint
        jobMaster.start();
        final OneShotLatch taskSubmitLatch = new OneShotLatch();
        registerSlotsAtJobMaster(1, jobMaster.getSelfGateway(JobMasterGateway.class), jobGraph.getJobID(), new TestingTaskExecutorGatewayBuilder().setSubmitTaskConsumer((taskDeploymentDescriptor, jobMasterId) -> {
            taskSubmitLatch.trigger();
            return CompletableFuture.completedFuture(Acknowledge.get());
        }).createTestingTaskExecutorGateway(), new LocalUnresolvedTaskManagerLocation());
        // wait until a task has submitted because this guarantees that the ExecutionGraph has
        // been created
        taskSubmitLatch.await();
        final CompletedCheckpoint savepointCheckpoint = completedCheckpointStore.getLatestCheckpoint();
        assertThat(savepointCheckpoint, Matchers.notNullValue());
        assertThat(savepointCheckpoint.getCheckpointID(), is(savepointId));
    } finally {
        RpcUtils.terminateRpcEndpoint(jobMaster, testingTimeout);
    }
}
Also used : CompletedCheckpoint(org.apache.flink.runtime.checkpoint.CompletedCheckpoint) TestingTaskExecutorGatewayBuilder(org.apache.flink.runtime.taskexecutor.TestingTaskExecutorGatewayBuilder) PerJobCheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.PerJobCheckpointRecoveryFactory) StandaloneCheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory) CheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.CheckpointRecoveryFactory) JobMasterBuilder(org.apache.flink.runtime.jobmaster.utils.JobMasterBuilder) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) StandaloneCompletedCheckpointStore(org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore) OneShotLatch(org.apache.flink.core.testutils.OneShotLatch) LocalUnresolvedTaskManagerLocation(org.apache.flink.runtime.taskmanager.LocalUnresolvedTaskManagerLocation) File(java.io.File) SavepointRestoreSettings(org.apache.flink.runtime.jobgraph.SavepointRestoreSettings) Test(org.junit.Test)
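
Examples 18 and 19 differ mainly in whether the checkpoint store already holds a checkpoint: with checkpoint 1 present, the latest checkpoint stays 1 even though savepoint 42 is configured; with an empty store, the latest checkpoint becomes the savepoint (42). The rule those two assertions describe can be sketched without any Flink dependency. This is a toy model under that assumption, not Flink's restore code; `RestorePrecedenceSketch` and `chooseRestoreId` are hypothetical names introduced for illustration.

```java
import java.util.OptionalLong;

/** Toy model (not Flink code) of the precedence asserted by the two JobMasterTest examples. */
public class RestorePrecedenceSketch {

    /**
     * Picks the ID the job would restore from.
     *
     * @param latestCheckpointId newest checkpoint already in the store, if any
     * @param savepointId checkpoint ID of the configured savepoint, if any
     * @return the checkpoint ID when one exists, otherwise the savepoint ID (possibly empty)
     */
    public static OptionalLong chooseRestoreId(OptionalLong latestCheckpointId, OptionalLong savepointId) {
        // An existing checkpoint takes precedence; the savepoint is only a fallback.
        if (latestCheckpointId.isPresent()) {
            return latestCheckpointId;
        }
        return savepointId;
    }

    public static void main(String[] args) {
        // Mirrors testCheckpointPrecedesSavepointRecovery: checkpoint 1 and savepoint 42.
        System.out.println(chooseRestoreId(OptionalLong.of(1L), OptionalLong.of(42L)).getAsLong()); // prints 1
        // Mirrors testRestoringFromSavepoint: empty store and savepoint 42.
        System.out.println(chooseRestoreId(OptionalLong.empty(), OptionalLong.of(42L)).getAsLong()); // prints 42
    }
}
```

The sketch only captures which ID wins; it says nothing about how Flink loads state or about the `allowNonRestoredState` flag that both tests pass as `true`.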

Example 20 with SavepointRestoreSettings

use of org.apache.flink.runtime.jobgraph.SavepointRestoreSettings in project flink by apache.

the class MiniCluster method checkRestoreModeForRandomizedChangelogStateBackend.

// HACK: temporary hack to make the randomized changelog state backend tests work with forced
// full snapshots. This option should be removed once changelog state backend supports forced
// full snapshots
private void checkRestoreModeForRandomizedChangelogStateBackend(JobGraph jobGraph) {
    final SavepointRestoreSettings savepointRestoreSettings = jobGraph.getSavepointRestoreSettings();
    if (overrideRestoreModeForRandomizedChangelogStateBackend && savepointRestoreSettings.getRestoreMode() == RestoreMode.NO_CLAIM) {
        final Configuration conf = new Configuration();
        SavepointRestoreSettings.toConfiguration(savepointRestoreSettings, conf);
        conf.set(SavepointConfigOptions.RESTORE_MODE, RestoreMode.LEGACY);
        jobGraph.setSavepointRestoreSettings(SavepointRestoreSettings.fromConfiguration(conf));
    }
}
Also used : MetricRegistryConfiguration(org.apache.flink.runtime.metrics.MetricRegistryConfiguration) Configuration(org.apache.flink.configuration.Configuration) SavepointRestoreSettings(org.apache.flink.runtime.jobgraph.SavepointRestoreSettings)
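
Example 20 changes a single field of the restore settings by writing them into a configuration, overwriting one entry, and reading them back. The same round-trip pattern is sketched below with a plain map so it runs without Flink. Everything here is hypothetical: the `Settings` class, the `Mode` enum, and the `"path"` and `"mode"` keys are stand-ins chosen for this sketch, not Flink's option names, and the cluster-level override flag checked in the original method is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Flink code) of overriding one field of immutable settings via a config round trip. */
public class RestoreModeOverrideSketch {

    public enum Mode { CLAIM, NO_CLAIM, LEGACY }

    /** Immutable stand-in for the restore settings: no setters, only conversion helpers. */
    public static final class Settings {
        public final String path;
        public final Mode mode;

        public Settings(String path, Mode mode) {
            this.path = path;
            this.mode = mode;
        }

        /** Writes every field into the map under keys that exist only in this sketch. */
        public static void toConfiguration(Settings settings, Map<String, String> conf) {
            conf.put("path", settings.path);
            conf.put("mode", settings.mode.name());
        }

        /** Rebuilds a settings object from the map written by toConfiguration. */
        public static Settings fromConfiguration(Map<String, String> conf) {
            return new Settings(conf.get("path"), Mode.valueOf(conf.get("mode")));
        }
    }

    /** Rewrites NO_CLAIM to LEGACY and returns every other settings object unchanged. */
    public static Settings overrideNoClaim(Settings original) {
        if (original.mode != Mode.NO_CLAIM) {
            return original;
        }
        Map<String, String> conf = new HashMap<>();
        Settings.toConfiguration(original, conf);   // copy all fields
        conf.put("mode", Mode.LEGACY.name());       // overwrite only the restore mode
        return Settings.fromConfiguration(conf);    // rebuild an immutable instance
    }

    public static void main(String[] args) {
        Settings rewritten = overrideNoClaim(new Settings("/tmp/savepoint", Mode.NO_CLAIM));
        System.out.println(rewritten.path + " " + rewritten.mode); // prints /tmp/savepoint LEGACY
    }
}
```

Going through the map keeps the other fields intact without the caller having to list them, which is presumably why the original method takes the same route instead of constructing new settings field by field.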

Aggregations

SavepointRestoreSettings (org.apache.flink.runtime.jobgraph.SavepointRestoreSettings) 22
Test (org.junit.Test) 12
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph) 9
Configuration (org.apache.flink.configuration.Configuration) 7
JobID (org.apache.flink.api.common.JobID) 5
File (java.io.File) 4
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig) 3
CheckpointConfig (org.apache.flink.streaming.api.environment.CheckpointConfig) 3
Matchers.containsString (org.hamcrest.Matchers.containsString) 3
GlobalConfiguration (org.apache.flink.configuration.GlobalConfiguration) 2
CheckpointRecoveryFactory (org.apache.flink.runtime.checkpoint.CheckpointRecoveryFactory) 2
CompletedCheckpoint (org.apache.flink.runtime.checkpoint.CompletedCheckpoint) 2
PerJobCheckpointRecoveryFactory (org.apache.flink.runtime.checkpoint.PerJobCheckpointRecoveryFactory) 2
StandaloneCheckpointRecoveryFactory (org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory) 2
StandaloneCompletedCheckpointStore (org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore) 2
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex) 2
JobMasterBuilder (org.apache.flink.runtime.jobmaster.utils.JobMasterBuilder) 2
RemoteStreamEnvironment (org.apache.flink.streaming.api.environment.RemoteStreamEnvironment) 2
ActorRef (akka.actor.ActorRef) 1
ActorSystem (akka.actor.ActorSystem) 1