Example 11 with RootExceptionHistoryEntry

Use of org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry in the Apache Flink project.

Class JobExceptionsHandlerTest, method testWithExceptionHistoryWithTruncationThroughParameter.

@Test
public void testWithExceptionHistoryWithTruncationThroughParameter() throws HandlerRequestException {
    final RootExceptionHistoryEntry rootCause = fromGlobalFailure(new RuntimeException("exception #0"), System.currentTimeMillis());
    final RootExceptionHistoryEntry otherFailure = new RootExceptionHistoryEntry(new RuntimeException("exception #1"), System.currentTimeMillis(), "task name", new LocalTaskManagerLocation(), Collections.emptySet());
    final ExecutionGraphInfo executionGraphInfo = createExecutionGraphInfo(rootCause, otherFailure);
    final HandlerRequest<EmptyRequestBody> request = createRequest(executionGraphInfo.getJobId(), 1);
    final JobExceptionsInfoWithHistory response = testInstance.handleRequest(request, executionGraphInfo);
    assertThat(response.getExceptionHistory().getEntries(), contains(historyContainsGlobalFailure(rootCause.getException(), rootCause.getTimestamp())));
    assertThat(response.getExceptionHistory().getEntries(), iterableWithSize(1));
    assertTrue(response.getExceptionHistory().isTruncated());
}
Also used : RootExceptionHistoryEntry(org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry) ExecutionGraphInfo(org.apache.flink.runtime.scheduler.ExecutionGraphInfo) LocalTaskManagerLocation(org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation) EmptyRequestBody(org.apache.flink.runtime.rest.messages.EmptyRequestBody) JobExceptionsInfoWithHistory(org.apache.flink.runtime.rest.messages.JobExceptionsInfoWithHistory) Test(org.junit.Test)
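
The test above asks for at most one entry and then expects a single entry plus a set truncation flag. The following is a minimal, self-contained sketch of that kind of limiting. It is not Flink's implementation: HistoryTruncationSketch, LimitedHistory and limit are hypothetical names introduced only for illustration, and keeping the entries from the front of the list is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are Flink classes.
class HistoryTruncationSketch {

    /** Result holder: the retained entries plus a flag telling whether any were dropped. */
    static final class LimitedHistory<T> {
        final List<T> entries;
        final boolean truncated;

        LimitedHistory(List<T> entries, boolean truncated) {
            this.entries = entries;
            this.truncated = truncated;
        }
    }

    /** Keeps the first maxEntries elements of history and records whether the rest was cut. */
    static <T> LimitedHistory<T> limit(List<T> history, int maxEntries) {
        if (maxEntries < 0) {
            throw new IllegalArgumentException("maxEntries must not be negative");
        }
        final int kept = Math.min(history.size(), maxEntries);
        // Copy the sublist so the result does not keep a view of the caller's list.
        final List<T> entries = new ArrayList<>(history.subList(0, kept));
        return new LimitedHistory<>(entries, history.size() > maxEntries);
    }

    public static void main(String[] args) {
        final List<String> history = Arrays.asList("exception #0", "exception #1");
        final LimitedHistory<String> limited = limit(history, 1);
        System.out.println(limited.entries + " truncated=" + limited.truncated);
    }
}
```

With two entries and a limit of 1, the sketch keeps only the first entry and reports truncation; with a limit of 10 it keeps both and reports none.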

Example 12 with RootExceptionHistoryEntry

Use of org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry in the Apache Flink project.

Class JobExceptionsHandlerTest, method testWithLocalExceptionHistoryEntryNotHavingATaskManagerInformationAvailable.

@Test
public void testWithLocalExceptionHistoryEntryNotHavingATaskManagerInformationAvailable() throws HandlerRequestException {
    final RootExceptionHistoryEntry failure = new RootExceptionHistoryEntry(new RuntimeException("exception #1"), System.currentTimeMillis(), "task name", null, Collections.emptySet());
    final ExecutionGraphInfo executionGraphInfo = createExecutionGraphInfo(failure);
    final HandlerRequest<EmptyRequestBody> request = createRequest(executionGraphInfo.getJobId(), 10);
    final JobExceptionsInfoWithHistory response = testInstance.handleRequest(request, executionGraphInfo);
    assertThat(response.getExceptionHistory().getEntries(), contains(historyContainsJobExceptionInfo(failure.getException(), failure.getTimestamp(), failure.getFailingTaskName(), JobExceptionsHandler.toString(failure.getTaskManagerLocation()))));
}
Also used : RootExceptionHistoryEntry(org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry) ExecutionGraphInfo(org.apache.flink.runtime.scheduler.ExecutionGraphInfo) EmptyRequestBody(org.apache.flink.runtime.rest.messages.EmptyRequestBody) JobExceptionsInfoWithHistory(org.apache.flink.runtime.rest.messages.JobExceptionsInfoWithHistory) Test(org.junit.Test)

Example 13 with RootExceptionHistoryEntry

Use of org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry in the Apache Flink project.

Class SchedulerBase, method archiveFromFailureHandlingResult.

protected final void archiveFromFailureHandlingResult(FailureHandlingResultSnapshot failureHandlingResult) {
    if (failureHandlingResult.getRootCauseExecution().isPresent()) {
        final Execution rootCauseExecution = failureHandlingResult.getRootCauseExecution().get();
        final RootExceptionHistoryEntry rootEntry = RootExceptionHistoryEntry.fromFailureHandlingResultSnapshot(failureHandlingResult);
        exceptionHistory.add(rootEntry);
        log.debug("Archive local failure causing attempt {} to fail: {}", rootCauseExecution.getAttemptId(), rootEntry.getExceptionAsString());
    } else {
        archiveGlobalFailure(failureHandlingResult.getRootCause(), failureHandlingResult.getTimestamp(), failureHandlingResult.getConcurrentlyFailedExecution());
    }
}
Also used : Execution(org.apache.flink.runtime.executiongraph.Execution) RootExceptionHistoryEntry(org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry)
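
The method above branches on whether a root-cause execution is present: if so, the failure is archived as a local (task-level) entry, otherwise as a global one. The following is a simplified, self-contained sketch of that branching pattern. FailureArchiveSketch and HistoryEntry are hypothetical stand-ins, not Flink classes, and a plain task name replaces the Execution object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; these types only mimic the shape of the method above.
class FailureArchiveSketch {

    /** One archived failure; failingTaskName is null when the failure is global. */
    static final class HistoryEntry {
        final Throwable cause;
        final String failingTaskName;

        private HistoryEntry(Throwable cause, String failingTaskName) {
            this.cause = cause;
            this.failingTaskName = failingTaskName;
        }

        static HistoryEntry local(Throwable cause, String failingTaskName) {
            return new HistoryEntry(cause, failingTaskName);
        }

        static HistoryEntry global(Throwable cause) {
            return new HistoryEntry(cause, null);
        }

        boolean isGlobal() {
            return failingTaskName == null;
        }
    }

    private final List<HistoryEntry> history = new ArrayList<>();

    /** Archives a local entry when a failing task is known, a global entry otherwise. */
    void archive(Throwable cause, Optional<String> failingTaskName) {
        if (failingTaskName.isPresent()) {
            history.add(HistoryEntry.local(cause, failingTaskName.get()));
        } else {
            history.add(HistoryEntry.global(cause));
        }
    }

    List<HistoryEntry> getHistory() {
        return Collections.unmodifiableList(history);
    }

    public static void main(String[] args) {
        final FailureArchiveSketch sketch = new FailureArchiveSketch();
        sketch.archive(new RuntimeException("exception #0"), Optional.empty());
        sketch.archive(new RuntimeException("exception #1"), Optional.of("task name"));
        for (HistoryEntry entry : sketch.getHistory()) {
            System.out.println(entry.cause.getMessage() + " global=" + entry.isGlobal());
        }
    }
}
```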

Example 14 with RootExceptionHistoryEntry

Use of org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry in the Apache Flink project.

Class DefaultSchedulerTest, method testExceptionHistoryWithGlobalFailOver.

@Test
public void testExceptionHistoryWithGlobalFailOver() {
    final JobGraph jobGraph = singleNonParallelJobVertexJobGraph();
    final DefaultScheduler scheduler = createSchedulerAndStartScheduling(jobGraph);
    final ExecutionAttemptID attemptId = Iterables.getOnlyElement(scheduler.requestJob().getArchivedExecutionGraph().getAllExecutionVertices()).getCurrentExecutionAttempt().getAttemptId();
    final Exception expectedException = new Exception("Expected exception");
    scheduler.handleGlobalFailure(expectedException);
    // We have to cancel the task and trigger the restart to have the exception history populated.
    scheduler.updateTaskExecutionState(new TaskExecutionState(attemptId, ExecutionState.CANCELED, expectedException));
    taskRestartExecutor.triggerScheduledTasks();
    final Iterable<RootExceptionHistoryEntry> actualExceptionHistory = scheduler.getExceptionHistory();
    assertThat(actualExceptionHistory, IsIterableWithSize.iterableWithSize(1));
    final RootExceptionHistoryEntry failure = actualExceptionHistory.iterator().next();
    assertThat(failure, ExceptionHistoryEntryMatcher.matchesGlobalFailure(expectedException, scheduler.getExecutionGraph().getFailureInfo().getTimestamp()));
    assertThat(failure.getConcurrentExceptions(), IsEmptyIterable.emptyIterable());
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) ExecutionAttemptID(org.apache.flink.runtime.executiongraph.ExecutionAttemptID) RootExceptionHistoryEntry(org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry) FlinkException(org.apache.flink.util.FlinkException) NoResourceAvailableException(org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException) TaskExecutionState(org.apache.flink.runtime.taskmanager.TaskExecutionState) AdaptiveSchedulerTest(org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerTest) Test(org.junit.Test)

Example 15 with RootExceptionHistoryEntry

Use of org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry in the Apache Flink project.

Class DefaultSchedulerTest, method testExceptionHistoryWithPreDeployFailure.

@Test
public void testExceptionHistoryWithPreDeployFailure() {
    // disable auto-completing slot requests to simulate timeout
    executionSlotAllocatorFactory.getTestExecutionSlotAllocator().disableAutoCompletePendingRequests();
    final DefaultScheduler scheduler = createSchedulerAndStartScheduling(singleNonParallelJobVertexJobGraph());
    executionSlotAllocatorFactory.getTestExecutionSlotAllocator().timeoutPendingRequests();
    final ArchivedExecutionVertex taskFailureExecutionVertex = Iterables.getOnlyElement(scheduler.requestJob().getArchivedExecutionGraph().getAllExecutionVertices());
    // pending slot request timeout triggers a task failure that needs to be processed
    taskRestartExecutor.triggerNonPeriodicScheduledTask();
    // sanity check that the TaskManagerLocation of the failed task is indeed null, as expected
    assertThat(taskFailureExecutionVertex.getCurrentAssignedResourceLocation(), is(nullValue()));
    final ErrorInfo failureInfo = taskFailureExecutionVertex.getFailureInfo().orElseThrow(() -> new AssertionError("A failureInfo should be set."));
    final Iterable<RootExceptionHistoryEntry> actualExceptionHistory = scheduler.getExceptionHistory();
    assertThat(actualExceptionHistory, IsIterableContainingInOrder.contains(ExceptionHistoryEntryMatcher.matchesFailure(failureInfo.getException(), failureInfo.getTimestamp(), taskFailureExecutionVertex.getTaskNameWithSubtaskIndex(), taskFailureExecutionVertex.getCurrentAssignedResourceLocation())));
}
Also used : RootExceptionHistoryEntry(org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry) ArchivedExecutionVertex(org.apache.flink.runtime.executiongraph.ArchivedExecutionVertex) ErrorInfo(org.apache.flink.runtime.executiongraph.ErrorInfo) AdaptiveSchedulerTest(org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerTest) Test(org.junit.Test)

Aggregations

RootExceptionHistoryEntry (org.apache.flink.runtime.scheduler.exceptionhistory.RootExceptionHistoryEntry) 19
Test (org.junit.Test) 14
LocalTaskManagerLocation (org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation) 12
ArchivedExecutionVertex (org.apache.flink.runtime.executiongraph.ArchivedExecutionVertex) 10
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph) 10
ArrayList (java.util.ArrayList) 8
CompletableFuture (java.util.concurrent.CompletableFuture) 8
JobID (org.apache.flink.api.common.JobID) 8
IOException (java.io.IOException) 7
Duration (java.time.Duration) 7
Arrays (java.util.Arrays) 7
List (java.util.List) 7
Optional (java.util.Optional) 7
ArrayBlockingQueue (java.util.concurrent.ArrayBlockingQueue) 7
BlockingQueue (java.util.concurrent.BlockingQueue) 7
ExecutionException (java.util.concurrent.ExecutionException) 7
Executors (java.util.concurrent.Executors) 7
ScheduledExecutorService (java.util.concurrent.ScheduledExecutorService) 7
TimeUnit (java.util.concurrent.TimeUnit) 7
AtomicBoolean (java.util.concurrent.atomic.AtomicBoolean) 7