
Example 1 with AsynchronousException

use of org.apache.flink.runtime.taskmanager.AsynchronousException in project flink by apache.

the class ChangelogStateBackend method restore.

@SuppressWarnings({ "unchecked", "rawtypes" })
private <K> ChangelogKeyedStateBackend<K> restore(Environment env, String operatorIdentifier, KeyGroupRange keyGroupRange, TtlTimeProvider ttlTimeProvider, Collection<KeyedStateHandle> stateHandles, BaseBackendBuilder<K> baseBackendBuilder) throws Exception {
    StateChangelogStorage<?> changelogStorage = Preconditions.checkNotNull(env.getTaskStateManager().getStateChangelogStorage(), "Changelog storage is null when creating and restoring" + " the ChangelogKeyedStateBackend.");
    String subtaskName = env.getTaskInfo().getTaskNameWithSubtasks();
    ExecutionConfig executionConfig = env.getExecutionConfig();
    Collection<ChangelogStateBackendHandle> stateBackendHandles = castHandles(stateHandles);
    ChangelogKeyedStateBackend<K> keyedStateBackend = ChangelogBackendRestoreOperation.restore(changelogStorage.createReader(), env.getUserCodeClassLoader().asClassLoader(), stateBackendHandles, baseBackendBuilder, (baseBackend, baseState) -> new ChangelogKeyedStateBackend(baseBackend, subtaskName, executionConfig, ttlTimeProvider, changelogStorage.createWriter(operatorIdentifier, keyGroupRange), baseState, env.getCheckpointStorageAccess()));
    PeriodicMaterializationManager periodicMaterializationManager = new PeriodicMaterializationManager(checkNotNull(env.getMainMailboxExecutor()), checkNotNull(env.getAsyncOperationsThreadPool()), subtaskName, (message, exception) -> env.failExternally(new AsynchronousException(message, exception)), keyedStateBackend, executionConfig.getPeriodicMaterializeIntervalMillis(), executionConfig.getMaterializationMaxAllowedFailures());
    // keyedStateBackend is responsible for closing periodicMaterializationManager,
    // which ties the manager's lifecycle to keyedStateBackend.
    // However, PeriodicMaterializationManager cannot be a member of keyedStateBackend
    // because that would create a cyclic reference.
    keyedStateBackend.registerCloseable(periodicMaterializationManager);
    periodicMaterializationManager.start();
    return keyedStateBackend;
}
Also used : ChangelogStateBackendHandle(org.apache.flink.runtime.state.changelog.ChangelogStateBackendHandle) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) AsynchronousException(org.apache.flink.runtime.taskmanager.AsynchronousException)
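
The comments in Example 1 describe an ownership pattern: the manager needs the backend at construction time, so the backend cannot create it, and instead the manager is registered with the backend afterwards so that closing the backend also closes the manager. A minimal sketch of that pattern follows. It has no Flink dependencies; `Backend`, `Manager`, and `CloseableOwnerSketch` are hypothetical stand-ins, not Flink classes, and only the `registerCloseable` name is borrowed from the example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class CloseableOwnerSketch {

    /** Hypothetical stand-in for the keyed state backend: it owns registered resources. */
    static final class Backend implements Closeable {
        private final List<Closeable> registered = new ArrayList<>();

        void registerCloseable(Closeable closeable) {
            registered.add(closeable);
        }

        @Override
        public void close() {
            // Closing the owner closes everything that was registered with it.
            for (Closeable closeable : registered) {
                try {
                    closeable.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }

    /** Hypothetical stand-in for the manager: it needs the backend when it is constructed. */
    static final class Manager implements Closeable {
        private final Backend backend;
        private boolean running;

        Manager(Backend backend) {
            this.backend = backend;
        }

        void start() {
            running = true;
        }

        boolean isRunning() {
            return running;
        }

        @Override
        public void close() {
            running = false;
        }
    }

    public static void main(String[] args) {
        Backend backend = new Backend();
        Manager manager = new Manager(backend); // the manager depends on the backend
        backend.registerCloseable(manager);     // the backend owns the manager's lifecycle
        manager.start();
        backend.close();                        // also stops the manager
        System.out.println(manager.isRunning()); // prints false
    }
}
```

Registering after construction keeps the dependency one-directional at build time while still giving the backend a single place to release everything it owns.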

Example 2 with AsynchronousException

use of org.apache.flink.runtime.taskmanager.AsynchronousException in project flink by apache.

the class StreamTaskTest method streamTaskAsyncExceptionHandler_handleException_forwardsMessageProperly.

/**
 * This test checks the async exceptions handling wraps the message and cause as an
 * AsynchronousException and propagates this to the environment.
 */
@Test
public void streamTaskAsyncExceptionHandler_handleException_forwardsMessageProperly() {
    MockEnvironment mockEnvironment = MockEnvironment.builder().build();
    RuntimeException expectedException = new RuntimeException("RUNTIME EXCEPTION");
    final StreamTask.StreamTaskAsyncExceptionHandler asyncExceptionHandler = new StreamTask.StreamTaskAsyncExceptionHandler(mockEnvironment);
    mockEnvironment.setExpectedExternalFailureCause(AsynchronousException.class);
    final String expectedErrorMessage = "EXPECTED_ERROR MESSAGE";
    asyncExceptionHandler.handleAsyncException(expectedErrorMessage, expectedException);
    // expect an AsynchronousException containing the supplied error details
    Optional<? extends Throwable> actualExternalFailureCause = mockEnvironment.getActualExternalFailureCause();
    final Throwable actualException = actualExternalFailureCause.orElseThrow(() -> new AssertionError("Expected exceptional completion"));
    assertThat(actualException, instanceOf(AsynchronousException.class));
    assertThat(actualException.getMessage(), is(expectedErrorMessage));
    assertThat(actualException.getCause(), is(expectedException));
}
Also used : FlinkRuntimeException(org.apache.flink.util.FlinkRuntimeException) MockEnvironment(org.apache.flink.runtime.operators.testutils.MockEnvironment) CoreMatchers.containsString(org.hamcrest.CoreMatchers.containsString) AsynchronousException(org.apache.flink.runtime.taskmanager.AsynchronousException) Test(org.junit.Test)
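
The test in Example 2 checks that a handler wraps a message and a cause into one exception and forwards it to the environment. The sketch below reproduces that behaviour without Flink on the classpath. `AsyncFailure`, `RecordingEnvironment`, and `ForwardingHandler` are hypothetical stand-ins for `AsynchronousException`, `MockEnvironment`, and `StreamTaskAsyncExceptionHandler`; they are not the Flink implementations.

```java
import java.util.Optional;

public class AsyncHandlerSketch {

    /** Hypothetical stand-in for AsynchronousException: carries a message and a cause. */
    static final class AsyncFailure extends Exception {
        AsyncFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for the environment: remembers the first reported failure. */
    static final class RecordingEnvironment {
        private Throwable failure;

        void failExternally(Throwable cause) {
            if (failure == null) {
                failure = cause;
            }
        }

        Optional<Throwable> getFailure() {
            return Optional.ofNullable(failure);
        }
    }

    /** Wraps the message and cause, then forwards the wrapper to the environment. */
    static final class ForwardingHandler {
        private final RecordingEnvironment environment;

        ForwardingHandler(RecordingEnvironment environment) {
            this.environment = environment;
        }

        void handleAsyncException(String message, Throwable cause) {
            environment.failExternally(new AsyncFailure(message, cause));
        }
    }

    public static void main(String[] args) {
        RecordingEnvironment environment = new RecordingEnvironment();
        RuntimeException cause = new RuntimeException("boom");
        new ForwardingHandler(environment).handleAsyncException("async step failed", cause);

        Throwable reported = environment.getFailure()
                .orElseThrow(() -> new AssertionError("expected a failure"));
        System.out.println(reported.getMessage());         // prints async step failed
        System.out.println(reported.getCause() == cause);  // prints true
    }
}
```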

Example 3 with AsynchronousException

use of org.apache.flink.runtime.taskmanager.AsynchronousException in project flink by apache.

the class AsyncCheckpointRunnable method handleExecutionException.

private void handleExecutionException(Exception e) {
    boolean didCleanup = false;
    AsyncCheckpointState currentState = asyncCheckpointState.get();
    while (AsyncCheckpointState.DISCARDED != currentState) {
        if (asyncCheckpointState.compareAndSet(currentState, AsyncCheckpointState.DISCARDED)) {
            didCleanup = true;
            try {
                cleanup();
            } catch (Exception cleanupException) {
                e.addSuppressed(cleanupException);
            }
            Exception checkpointException = new Exception("Could not materialize checkpoint " + checkpointMetaData.getCheckpointId() + " for operator " + taskName + '.', e);
            if (isTaskRunning.get()) {
                // The task is still running: decline the checkpoint, and fail the task
                // only if declining itself throws.
                try {
                    Optional<CheckpointException> underlyingCheckpointException = ExceptionUtils.findThrowable(checkpointException, CheckpointException.class);
                    // If this failure is already a CheckpointException, do not overwrite the
                    // original CheckpointFailureReason
                    CheckpointFailureReason reportedFailureReason = underlyingCheckpointException.map(exception -> exception.getCheckpointFailureReason()).orElse(CheckpointFailureReason.CHECKPOINT_ASYNC_EXCEPTION);
                    taskEnvironment.declineCheckpoint(checkpointMetaData.getCheckpointId(), new CheckpointException(reportedFailureReason, checkpointException));
                } catch (Exception unhandled) {
                    AsynchronousException asyncException = new AsynchronousException(unhandled);
                    asyncExceptionHandler.handleAsyncException("Failure in asynchronous checkpoint materialization", asyncException);
                }
            } else {
                // Never decline a checkpoint once the task is no longer running. Doing so
                // could trigger an unexpected job failover by exceeding the tolerable
                // checkpoint failure threshold.
                LOG.info("Ignore decline of checkpoint {} as task is not running anymore.", checkpointMetaData.getCheckpointId());
            }
            currentState = AsyncCheckpointState.DISCARDED;
        } else {
            currentState = asyncCheckpointState.get();
        }
    }
    if (!didCleanup) {
        LOG.trace("Caught followup exception from a failed checkpoint thread. This can be ignored.", e);
    }
}
Also used : CheckpointMetricsBuilder(org.apache.flink.runtime.checkpoint.CheckpointMetricsBuilder) Tuple2(org.apache.flink.api.java.tuple.Tuple2) CheckpointMetaData(org.apache.flink.runtime.checkpoint.CheckpointMetaData) OperatorSnapshotFinalizer(org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer) LoggerFactory(org.slf4j.LoggerFactory) ExceptionUtils(org.apache.flink.util.ExceptionUtils) CompletableFuture(java.util.concurrent.CompletableFuture) CheckpointFailureReason(org.apache.flink.runtime.checkpoint.CheckpointFailureReason) AtomicReference(java.util.concurrent.atomic.AtomicReference) Supplier(java.util.function.Supplier) OperatorSnapshotFutures(org.apache.flink.streaming.api.operators.OperatorSnapshotFutures) CheckpointException(org.apache.flink.runtime.checkpoint.CheckpointException) Map(java.util.Map) Preconditions.checkNotNull(org.apache.flink.util.Preconditions.checkNotNull) TaskStateSnapshot(org.apache.flink.runtime.checkpoint.TaskStateSnapshot) Preconditions.checkState(org.apache.flink.util.Preconditions.checkState) Logger(org.slf4j.Logger) FileSystemSafetyNet(org.apache.flink.core.fs.FileSystemSafetyNet) Consumer(java.util.function.Consumer) AsyncExceptionHandler(org.apache.flink.runtime.taskmanager.AsyncExceptionHandler) AsynchronousException(org.apache.flink.runtime.taskmanager.AsynchronousException) Closeable(java.io.Closeable) OperatorID(org.apache.flink.runtime.jobgraph.OperatorID) Optional(java.util.Optional) CheckpointMetrics(org.apache.flink.runtime.checkpoint.CheckpointMetrics) Environment(org.apache.flink.runtime.execution.Environment)
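
The loop in Example 3 uses compare-and-set so that, however many threads report a failure, only the one that wins the transition to DISCARDED runs the cleanup. A minimal sketch of that idiom in isolation follows. The `DiscardOnceSketch` class, its `State` enum, and the cleanup counter are hypothetical and stand in for `AsyncCheckpointState` and `cleanup()`; this is not the Flink implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class DiscardOnceSketch {

    /** Hypothetical states, loosely mirroring the ones used in the example. */
    enum State { RUNNING, COMPLETED, DISCARDED }

    private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
    private final AtomicInteger cleanups = new AtomicInteger();

    /** Returns true only for the caller that wins the transition to DISCARDED. */
    boolean discard() {
        State current = state.get();
        while (current != State.DISCARDED) {
            if (state.compareAndSet(current, State.DISCARDED)) {
                cleanups.incrementAndGet(); // stands in for cleanup(); runs exactly once
                return true;
            }
            current = state.get(); // lost the race: re-read and retry
        }
        return false; // someone else already discarded
    }

    int cleanupCount() {
        return cleanups.get();
    }

    public static void main(String[] args) throws InterruptedException {
        DiscardOnceSketch sketch = new DiscardOnceSketch();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(sketch::discard);
            threads[i].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println(sketch.cleanupCount()); // prints 1
    }
}
```

Callers that lose the race see `false`, which corresponds to the `didCleanup` flag staying unset in the example, where the follow-up exception is only logged at trace level.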

Aggregations

AsynchronousException (org.apache.flink.runtime.taskmanager.AsynchronousException): 3
Closeable (java.io.Closeable): 1
Map (java.util.Map): 1
Optional (java.util.Optional): 1
CompletableFuture (java.util.concurrent.CompletableFuture): 1
AtomicReference (java.util.concurrent.atomic.AtomicReference): 1
Consumer (java.util.function.Consumer): 1
Supplier (java.util.function.Supplier): 1
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig): 1
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 1
FileSystemSafetyNet (org.apache.flink.core.fs.FileSystemSafetyNet): 1
CheckpointException (org.apache.flink.runtime.checkpoint.CheckpointException): 1
CheckpointFailureReason (org.apache.flink.runtime.checkpoint.CheckpointFailureReason): 1
CheckpointMetaData (org.apache.flink.runtime.checkpoint.CheckpointMetaData): 1
CheckpointMetrics (org.apache.flink.runtime.checkpoint.CheckpointMetrics): 1
CheckpointMetricsBuilder (org.apache.flink.runtime.checkpoint.CheckpointMetricsBuilder): 1
TaskStateSnapshot (org.apache.flink.runtime.checkpoint.TaskStateSnapshot): 1
Environment (org.apache.flink.runtime.execution.Environment): 1
OperatorID (org.apache.flink.runtime.jobgraph.OperatorID): 1
MockEnvironment (org.apache.flink.runtime.operators.testutils.MockEnvironment): 1