
Example 1 with CheckpointStorageLocation

Use of org.apache.flink.runtime.state.CheckpointStorageLocation in the Apache Flink project.

From the class DispatcherTest, method createTestingSavepoint.

@Nonnull
private URI createTestingSavepoint() throws IOException, URISyntaxException {
    final CheckpointStorage storage = Checkpoints.loadCheckpointStorage(configuration, Thread.currentThread().getContextClassLoader(), log);
    final CheckpointStorageCoordinatorView checkpointStorage = storage.createCheckpointStorage(jobGraph.getJobID());
    final File savepointFile = temporaryFolder.newFolder();
    final long checkpointId = 1L;
    final CheckpointStorageLocation checkpointStorageLocation = checkpointStorage.initializeLocationForSavepoint(checkpointId, savepointFile.getAbsolutePath());
    final CheckpointMetadataOutputStream metadataOutputStream = checkpointStorageLocation.createMetadataOutputStream();
    Checkpoints.storeCheckpointMetadata(new CheckpointMetadata(checkpointId, Collections.emptyList(), Collections.emptyList()), metadataOutputStream);
    final CompletedCheckpointStorageLocation completedCheckpointStorageLocation = metadataOutputStream.closeAndFinalizeCheckpoint();
    return new URI(completedCheckpointStorageLocation.getExternalPointer());
}
Also used : CheckpointMetadataOutputStream(org.apache.flink.runtime.state.CheckpointMetadataOutputStream) CheckpointStorage(org.apache.flink.runtime.state.CheckpointStorage) CheckpointStorageCoordinatorView(org.apache.flink.runtime.state.CheckpointStorageCoordinatorView) CheckpointStorageLocation(org.apache.flink.runtime.state.CheckpointStorageLocation) CompletedCheckpointStorageLocation(org.apache.flink.runtime.state.CompletedCheckpointStorageLocation) File(java.io.File) URI(java.net.URI) CheckpointMetadata(org.apache.flink.runtime.checkpoint.metadata.CheckpointMetadata) Nonnull(javax.annotation.Nonnull)

Example 2 with CheckpointStorageLocation

Use of org.apache.flink.runtime.state.CheckpointStorageLocation in the Apache Flink project.

From the class AbstractFileCheckpointStorageAccessTestBase, method testSavepoint.

private void testSavepoint(@Nullable Path savepointDir, @Nullable Path customDir, Path expectedParent) throws Exception {
    final CheckpointStorageAccess storage = savepointDir == null ? createCheckpointStorage(randomTempPath()) : createCheckpointStorageWithSavepointDir(randomTempPath(), savepointDir);
    final String customLocation = customDir == null ? null : customDir.toString();
    final CheckpointStorageLocation savepointLocation = storage.initializeLocationForSavepoint(52452L, customLocation);
    final byte[] data = { 77, 66, 55, 99, 88 };
    final CompletedCheckpointStorageLocation completed;
    try (CheckpointMetadataOutputStream out = savepointLocation.createMetadataOutputStream()) {
        out.write(data);
        completed = out.closeAndFinalizeCheckpoint();
    }
    // Normalize the path so that it ends with a slash (or not) in the same way as the
    // expected path does.
    final Path normalizedWithSlash = Path.fromLocalFile(new File(new Path(completed.getExternalPointer()).getParent().getPath()));
    assertEquals(expectedParent, normalizedWithSlash);
    validateContents(completed.getMetadataHandle(), data);
    // validate that the correct directory was used
    FileStateHandle fileStateHandle = (FileStateHandle) completed.getMetadataHandle();
    // Recreate that path in the same way as the "expected path" (via File and URI) to make
    // sure that either both use '/' suffixes or neither does (an annoying ambiguity).
    Path usedSavepointDir = new Path(new File(fileStateHandle.getFilePath().getParent().getParent().getPath()).toURI());
    assertEquals(expectedParent, usedSavepointDir);
}
Also used : Path(org.apache.flink.core.fs.Path) CheckpointMetadataOutputStream(org.apache.flink.runtime.state.CheckpointMetadataOutputStream) CheckpointStorageLocation(org.apache.flink.runtime.state.CheckpointStorageLocation) CompletedCheckpointStorageLocation(org.apache.flink.runtime.state.CompletedCheckpointStorageLocation) CheckpointStorageAccess(org.apache.flink.runtime.state.CheckpointStorageAccess) MemoryBackendCheckpointStorageAccess(org.apache.flink.runtime.state.memory.MemoryBackendCheckpointStorageAccess) File(java.io.File)

Example 3 with CheckpointStorageLocation

Use of org.apache.flink.runtime.state.CheckpointStorageLocation in the Apache Flink project.

From the class CheckpointCoordinator, method startTriggeringCheckpoint.

private void startTriggeringCheckpoint(CheckpointTriggerRequest request) {
    try {
        synchronized (lock) {
            preCheckGlobalState(request.isPeriodic);
        }
        // we will actually trigger this checkpoint!
        Preconditions.checkState(!isTriggering);
        isTriggering = true;
        final long timestamp = System.currentTimeMillis();
        CompletableFuture<CheckpointPlan> checkpointPlanFuture = checkpointPlanCalculator.calculateCheckpointPlan();
        boolean initializeBaseLocations = !baseLocationsForCheckpointInitialized;
        baseLocationsForCheckpointInitialized = true;
        final CompletableFuture<PendingCheckpoint> pendingCheckpointCompletableFuture = checkpointPlanFuture.thenApplyAsync(plan -> {
            try {
                CheckpointIdAndStorageLocation checkpointIdAndStorageLocation = initializeCheckpoint(request.props, request.externalSavepointLocation, initializeBaseLocations);
                return new Tuple2<>(plan, checkpointIdAndStorageLocation);
            } catch (Throwable e) {
                throw new CompletionException(e);
            }
        }, executor).thenApplyAsync((checkpointInfo) -> createPendingCheckpoint(timestamp, request.props, checkpointInfo.f0, request.isPeriodic, checkpointInfo.f1.checkpointId, checkpointInfo.f1.checkpointStorageLocation, request.getOnCompletionFuture()), timer);
        final CompletableFuture<?> coordinatorCheckpointsComplete = pendingCheckpointCompletableFuture.thenComposeAsync((pendingCheckpoint) -> OperatorCoordinatorCheckpoints.triggerAndAcknowledgeAllCoordinatorCheckpointsWithCompletion(coordinatorsToCheckpoint, pendingCheckpoint, timer), timer);
        // We have to take the snapshot of the master hooks after the coordinator checkpoints
        // have completed. This ensures that the tasks are checkpointed after the
        // OperatorCoordinators in case ExternallyInducedSource is used.
        final CompletableFuture<?> masterStatesComplete = coordinatorCheckpointsComplete.thenComposeAsync(ignored -> {
            // If the code reaches here, the pending checkpoint is guaranteed to be non-null.
            // We use FutureUtils.getWithoutException() to keep the compiler happy about
            // checked exceptions in the signature.
            PendingCheckpoint checkpoint = FutureUtils.getWithoutException(pendingCheckpointCompletableFuture);
            return snapshotMasterState(checkpoint);
        }, timer);
        FutureUtils.assertNoException(CompletableFuture.allOf(masterStatesComplete, coordinatorCheckpointsComplete).handleAsync((ignored, throwable) -> {
            final PendingCheckpoint checkpoint = FutureUtils.getWithoutException(pendingCheckpointCompletableFuture);
            Preconditions.checkState(checkpoint != null || throwable != null, "Either the pending checkpoint needs to be created or an error must have occurred.");
            if (throwable != null) {
                // the initialization might not be finished yet
                if (checkpoint == null) {
                    onTriggerFailure(request, throwable);
                } else {
                    onTriggerFailure(checkpoint, throwable);
                }
            } else {
                triggerCheckpointRequest(request, timestamp, checkpoint);
            }
            return null;
        }, timer).exceptionally(error -> {
            if (!isShutdown()) {
                throw new CompletionException(error);
            } else if (findThrowable(error, RejectedExecutionException.class).isPresent()) {
                LOG.debug("Execution rejected during shutdown");
            } else {
                LOG.warn("Error encountered during shutdown", error);
            }
            return null;
        }));
    } catch (Throwable throwable) {
        onTriggerFailure(request, throwable);
    }
}
Also used : SystemClock(org.apache.flink.util.clock.SystemClock) ScheduledFuture(java.util.concurrent.ScheduledFuture) Tuple2(org.apache.flink.api.java.tuple.Tuple2) PriorityQueue(java.util.PriorityQueue) BiFunction(java.util.function.BiFunction) LoggerFactory(org.slf4j.LoggerFactory) CheckpointCoordinatorConfiguration(org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration) ExceptionUtils(org.apache.flink.util.ExceptionUtils) AcknowledgeCheckpoint(org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint) CheckpointStorage(org.apache.flink.runtime.state.CheckpointStorage) ExceptionUtils.findThrowable(org.apache.flink.util.ExceptionUtils.findThrowable) Collectors.toMap(java.util.stream.Collectors.toMap) AtomicInteger(java.util.concurrent.atomic.AtomicInteger) Map(java.util.Map) Preconditions.checkNotNull(org.apache.flink.util.Preconditions.checkNotNull) ScheduledExecutor(org.apache.flink.util.concurrent.ScheduledExecutor) Predicate(java.util.function.Predicate) CompletedCheckpointStorageLocation(org.apache.flink.runtime.state.CompletedCheckpointStorageLocation) Collection(java.util.Collection) MasterHooks(org.apache.flink.runtime.checkpoint.hooks.MasterHooks) Set(java.util.Set) CompletionException(java.util.concurrent.CompletionException) GuardedBy(javax.annotation.concurrent.GuardedBy) Preconditions(org.apache.flink.util.Preconditions) StringUtils(org.apache.flink.util.StringUtils) Acknowledge(org.apache.flink.runtime.messages.Acknowledge) List(java.util.List) Stream(java.util.stream.Stream) Preconditions.checkArgument(org.apache.flink.util.Preconditions.checkArgument) OperatorID(org.apache.flink.runtime.jobgraph.OperatorID) Optional(java.util.Optional) PossibleInconsistentStateException(org.apache.flink.runtime.persistence.PossibleInconsistentStateException) SavepointFormatType(org.apache.flink.core.execution.SavepointFormatType) CheckpointStorageCoordinatorView(org.apache.flink.runtime.state.CheckpointStorageCoordinatorView) 
OperatorInfo(org.apache.flink.runtime.operators.coordination.OperatorInfo) HashMap(java.util.HashMap) CompletableFuture(java.util.concurrent.CompletableFuture) Clock(org.apache.flink.util.clock.Clock) ArrayList(java.util.ArrayList) Execution(org.apache.flink.runtime.executiongraph.Execution) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) HashSet(java.util.HashSet) LinkedHashMap(java.util.LinkedHashMap) OptionalLong(java.util.OptionalLong) RejectedExecutionException(java.util.concurrent.RejectedExecutionException) FutureUtils(org.apache.flink.util.concurrent.FutureUtils) ThreadLocalRandom(java.util.concurrent.ThreadLocalRandom) Nullable(javax.annotation.Nullable) ExecutionJobVertex(org.apache.flink.runtime.executiongraph.ExecutionJobVertex) Logger(org.slf4j.Logger) CheckpointStorageLocation(org.apache.flink.runtime.state.CheckpointStorageLocation) FlinkRuntimeException(org.apache.flink.util.FlinkRuntimeException) Executor(java.util.concurrent.Executor) IOException(java.io.IOException) JobStatusListener(org.apache.flink.runtime.executiongraph.JobStatusListener) DeclineCheckpoint(org.apache.flink.runtime.messages.checkpoint.DeclineCheckpoint) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting) TimeUnit(java.util.concurrent.TimeUnit) ExecutionAttemptID(org.apache.flink.runtime.executiongraph.ExecutionAttemptID) JobID(org.apache.flink.api.common.JobID) OperatorCoordinator(org.apache.flink.runtime.operators.coordination.OperatorCoordinator) ByteStreamStateHandle(org.apache.flink.runtime.state.memory.ByteStreamStateHandle) ArrayDeque(java.util.ArrayDeque) ExecutionVertex(org.apache.flink.runtime.executiongraph.ExecutionVertex) SavepointRestoreSettings(org.apache.flink.runtime.jobgraph.SavepointRestoreSettings) Collections(java.util.Collections)
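
The chaining in startTriggeringCheckpoint can be hard to follow at full size. The following is a minimal, JDK-only sketch of the same shape: one asynchronous stage computes a plan, a second stage initializes the checkpoint and rethrows any failure as a CompletionException, and a final handleAsync stage decides between the failure path and the trigger path. The class TriggerChainSketch, the nested Pending type, the failInit flag and the returned strings are all hypothetical stand-ins and are not part of Flink.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TriggerChainSketch {

    /** Hypothetical stand-in for the data gathered before a checkpoint is triggered. */
    static final class Pending {
        final String plan;
        final long checkpointId;

        Pending(String plan, long checkpointId) {
            this.plan = plan;
            this.checkpointId = checkpointId;
        }
    }

    /**
     * Chains plan calculation and initialization, then handles success and failure in one
     * place. Completes with "triggered:<id>" on success or "failed:<message>" on failure.
     */
    public static CompletableFuture<String> trigger(
            boolean failInit, ExecutorService io, ExecutorService timer) {
        // Stage 1: compute the plan on the I/O executor.
        CompletableFuture<String> planFuture = CompletableFuture.supplyAsync(() -> "plan", io);

        // Stage 2: potentially blocking initialization, also on the I/O executor.
        CompletableFuture<Pending> pendingFuture = planFuture.thenApplyAsync(plan -> {
            try {
                if (failInit) {
                    throw new IOException("storage unavailable");
                }
                return new Pending(plan, 1L);
            } catch (Throwable t) {
                // Checked exceptions cannot escape a lambda, so wrap them.
                throw new CompletionException(t);
            }
        }, io);

        // Stage 3: a single place that sees either the result or the failure.
        return pendingFuture.handleAsync((pending, throwable) -> {
            if (throwable != null) {
                Throwable cause = throwable;
                if (throwable instanceof CompletionException && throwable.getCause() != null) {
                    cause = throwable.getCause();
                }
                return "failed:" + cause.getMessage();
            }
            return "triggered:" + pending.checkpointId;
        }, timer);
    }

    public static void main(String[] args) {
        ExecutorService io = Executors.newSingleThreadExecutor();
        ExecutorService timer = Executors.newSingleThreadExecutor();
        try {
            System.out.println(trigger(false, io, timer).join());
            System.out.println(trigger(true, io, timer).join());
        } finally {
            io.shutdown();
            timer.shutdown();
        }
    }
}
```

Handling both outcomes in one handleAsync stage mirrors the structure of the method above, where the failure path and the trigger path are chosen in a single callback on the timer executor.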

Example 4 with CheckpointStorageLocation

Use of org.apache.flink.runtime.state.CheckpointStorageLocation in the Apache Flink project.

From the class CheckpointCoordinator, method initializeCheckpoint.

/**
 * Initializes the checkpoint trigger asynchronously. It is expected to be executed in an I/O
 * thread because it might be time-consuming.
 *
 * @param props checkpoint properties
 * @param externalSavepointLocation the external savepoint location, it might be null
 * @return the initialized result, checkpoint id and checkpoint location
 */
private CheckpointIdAndStorageLocation initializeCheckpoint(CheckpointProperties props, @Nullable String externalSavepointLocation, boolean initializeBaseLocations) throws Exception {
    // This must happen outside the coordinator-wide lock, because it communicates
    // with external services (in HA mode) and may block for a while.
    long checkpointID = checkpointIdCounter.getAndIncrement();
    final CheckpointStorageLocation checkpointStorageLocation;
    if (props.isSavepoint()) {
        checkpointStorageLocation = checkpointStorageView.initializeLocationForSavepoint(checkpointID, externalSavepointLocation);
    } else {
        if (initializeBaseLocations) {
            checkpointStorageView.initializeBaseLocationsForCheckpoint();
        }
        checkpointStorageLocation = checkpointStorageView.initializeLocationForCheckpoint(checkpointID);
    }
    return new CheckpointIdAndStorageLocation(checkpointID, checkpointStorageLocation);
}
Also used : CompletedCheckpointStorageLocation(org.apache.flink.runtime.state.CompletedCheckpointStorageLocation) CheckpointStorageLocation(org.apache.flink.runtime.state.CheckpointStorageLocation)

Example 5 with CheckpointStorageLocation

Use of org.apache.flink.runtime.state.CheckpointStorageLocation in the Apache Flink project.

From the class MemoryCheckpointStorageAccessTest, method testNonPersistentCheckpointLocation.

@Test
public void testNonPersistentCheckpointLocation() throws Exception {
    MemoryBackendCheckpointStorageAccess storage = new MemoryBackendCheckpointStorageAccess(new JobID(), null, null, DEFAULT_MAX_STATE_SIZE);
    CheckpointStorageLocation location = storage.initializeLocationForCheckpoint(9);
    CheckpointMetadataOutputStream stream = location.createMetadataOutputStream();
    stream.write(99);
    CompletedCheckpointStorageLocation completed = stream.closeAndFinalizeCheckpoint();
    StreamStateHandle handle = completed.getMetadataHandle();
    assertTrue(handle instanceof ByteStreamStateHandle);
    // the reference is not valid in that case
    try {
        storage.resolveCheckpoint(completed.getExternalPointer());
        fail("should fail with an exception");
    } catch (Exception e) {
        // expected
    }
}
Also used : StreamStateHandle(org.apache.flink.runtime.state.StreamStateHandle) CheckpointMetadataOutputStream(org.apache.flink.runtime.state.CheckpointMetadataOutputStream) CheckpointStorageLocation(org.apache.flink.runtime.state.CheckpointStorageLocation) CompletedCheckpointStorageLocation(org.apache.flink.runtime.state.CompletedCheckpointStorageLocation) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)
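
Several of the examples above share one lifecycle: obtain a metadata stream from a location, write to it, and call closeAndFinalizeCheckpoint() to get a completed location. The following JDK-only sketch illustrates that pattern with in-memory stand-ins. The names FinalizeOnCloseSketch, MetadataStream, Completed and writeMetadata are hypothetical, and the choice that a plain close() discards unfinalized data is an assumption of this sketch, not a statement about any particular Flink implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class FinalizeOnCloseSketch {

    /** Hypothetical stand-in for a completed location: only the finalized bytes. */
    public static final class Completed {
        public final byte[] metadata;

        Completed(byte[] metadata) {
            this.metadata = metadata;
        }
    }

    /** Hypothetical metadata stream whose contents are published only when finalized. */
    public static final class MetadataStream extends OutputStream {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private boolean closed;

        @Override
        public void write(int b) throws IOException {
            if (closed) {
                throw new IOException("stream is closed");
            }
            buffer.write(b);
        }

        /** Closes the stream and publishes everything written so far. */
        public Completed closeAndFinalize() throws IOException {
            if (closed) {
                throw new IOException("stream is already closed");
            }
            closed = true;
            return new Completed(buffer.toByteArray());
        }

        /** Assumption of this sketch: a plain close discards data that was never finalized. */
        @Override
        public void close() {
            if (!closed) {
                closed = true;
                buffer.reset();
            }
        }
    }

    /** Writes the data and finalizes; try-with-resources cleans up if writing fails. */
    public static Completed writeMetadata(byte[] data) {
        try (MetadataStream out = new MetadataStream()) {
            out.write(data);
            return out.closeAndFinalize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Completed completed = writeMetadata(new byte[] {77, 66, 55, 99, 88});
        System.out.println(completed.metadata.length);
    }
}
```

Using try-with-resources, as testSavepoint does, means a failure during write still closes the stream, while the implicit close() after a successful finalize is a no-op.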

Aggregations

CheckpointStorageLocation (org.apache.flink.runtime.state.CheckpointStorageLocation) 7
CompletedCheckpointStorageLocation (org.apache.flink.runtime.state.CompletedCheckpointStorageLocation) 7
CheckpointMetadataOutputStream (org.apache.flink.runtime.state.CheckpointMetadataOutputStream) 5
CheckpointStorageAccess (org.apache.flink.runtime.state.CheckpointStorageAccess) 3
MemoryBackendCheckpointStorageAccess (org.apache.flink.runtime.state.memory.MemoryBackendCheckpointStorageAccess) 3
Test (org.junit.Test) 3
File (java.io.File) 2
IOException (java.io.IOException) 2
JobID (org.apache.flink.api.common.JobID) 2
URI (java.net.URI) 1
ArrayDeque (java.util.ArrayDeque) 1
ArrayList (java.util.ArrayList) 1
Collection (java.util.Collection) 1
Collections (java.util.Collections) 1
HashMap (java.util.HashMap) 1
HashSet (java.util.HashSet) 1
LinkedHashMap (java.util.LinkedHashMap) 1
List (java.util.List) 1
Map (java.util.Map) 1
Optional (java.util.Optional) 1