Example 51 with ExecutionVertexID

Use of org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID in project flink-mirror by flink-ci.

Class DefaultScheduler, method deployOrHandleError.

private BiFunction<Object, Throwable, Void> deployOrHandleError(final DeploymentHandle deploymentHandle) {
    final ExecutionVertexVersion requiredVertexVersion = deploymentHandle.getRequiredVertexVersion();
    final ExecutionVertexID executionVertexId = requiredVertexVersion.getExecutionVertexId();
    return (ignored, throwable) -> {
        if (executionVertexVersioner.isModified(requiredVertexVersion)) {
            log.debug("Refusing to deploy execution vertex {} because this deployment was superseded by another deployment", executionVertexId);
            return null;
        }
        if (throwable == null) {
            deployTaskSafe(executionVertexId);
        } else {
            handleTaskDeploymentFailure(executionVertexId, throwable);
        }
        return null;
    };
}
Also used : ShuffleMaster(org.apache.flink.runtime.shuffle.ShuffleMaster) TaskManagerLocation(org.apache.flink.runtime.taskmanager.TaskManagerLocation) BiFunction(java.util.function.BiFunction) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) TimeoutException(java.util.concurrent.TimeoutException) ExceptionUtils(org.apache.flink.util.ExceptionUtils) Vertex(org.apache.flink.runtime.topology.Vertex) Map(java.util.Map) SchedulingTopology(org.apache.flink.runtime.scheduler.strategy.SchedulingTopology) Preconditions.checkNotNull(org.apache.flink.util.Preconditions.checkNotNull) CoLocationGroup(org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup) SchedulingStrategyFactory(org.apache.flink.runtime.scheduler.strategy.SchedulingStrategyFactory) ScheduledExecutor(org.apache.flink.util.concurrent.ScheduledExecutor) JobManagerJobMetricGroup(org.apache.flink.runtime.metrics.groups.JobManagerJobMetricGroup) Collection(java.util.Collection) Set(java.util.Set) CompletionException(java.util.concurrent.CompletionException) ExecutionVertexID(org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID) Collectors(java.util.stream.Collectors) ResourceProfile(org.apache.flink.runtime.clusterframework.types.ResourceProfile) List(java.util.List) FailoverStrategy(org.apache.flink.runtime.executiongraph.failover.flip1.FailoverStrategy) Optional(java.util.Optional) ExecutionFailureHandler(org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler) Time(org.apache.flink.api.common.time.Time) AllocationID(org.apache.flink.runtime.clusterframework.types.AllocationID) NoResourceAvailableException(org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException) IntermediateResultPartitionID(org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID) ComponentMainThreadExecutor(org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor) SlotSharingGroup(org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup) 
HashMap(java.util.HashMap) CompletableFuture(java.util.concurrent.CompletableFuture) JobStatus(org.apache.flink.api.common.JobStatus) Function(java.util.function.Function) ArrayList(java.util.ArrayList) FailureHandlingResult(org.apache.flink.runtime.executiongraph.failover.flip1.FailureHandlingResult) HashSet(java.util.HashSet) OperatorCoordinatorHolder(org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder) FutureUtils(org.apache.flink.util.concurrent.FutureUtils) SchedulingStrategy(org.apache.flink.runtime.scheduler.strategy.SchedulingStrategy) Nullable(javax.annotation.Nullable) Preconditions.checkState(org.apache.flink.util.Preconditions.checkState) ExecutionJobVertex(org.apache.flink.runtime.executiongraph.ExecutionJobVertex) Logger(org.slf4j.Logger) Executor(java.util.concurrent.Executor) Configuration(org.apache.flink.configuration.Configuration) ExecutionState(org.apache.flink.runtime.execution.ExecutionState) CheckpointsCleaner(org.apache.flink.runtime.checkpoint.CheckpointsCleaner) LogicalSlot(org.apache.flink.runtime.jobmaster.LogicalSlot) IterableUtils(org.apache.flink.util.IterableUtils) JobStatusListener(org.apache.flink.runtime.executiongraph.JobStatusListener) CheckpointRecoveryFactory(org.apache.flink.runtime.checkpoint.CheckpointRecoveryFactory) RestartBackoffTimeStrategy(org.apache.flink.runtime.executiongraph.failover.flip1.RestartBackoffTimeStrategy) TimeUnit(java.util.concurrent.TimeUnit) Consumer(java.util.function.Consumer) FailureHandlingResultSnapshot(org.apache.flink.runtime.scheduler.exceptionhistory.FailureHandlingResultSnapshot) TaskExecutionStateTransition(org.apache.flink.runtime.executiongraph.TaskExecutionStateTransition) ExecutionVertex(org.apache.flink.runtime.executiongraph.ExecutionVertex)
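
The version guard used by deployOrHandleError can be illustrated in isolation. The following is a minimal sketch under stated assumptions, not Flink's implementation: the class VersionGuardSketch, its String vertex ids, and the counter-based versioner are hypothetical stand-ins for ExecutionVertexVersioner and ExecutionVertexID. It shows a callback that captures a version when it is created and declines to act if the vertex has been modified since.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

public class VersionGuardSketch {
    // Hypothetical versioner: one monotonically increasing counter per vertex id.
    private final Map<String, Long> versions = new ConcurrentHashMap<>();

    // Bumps the version of a vertex and returns the new value.
    public long recordModification(String vertexId) {
        return versions.merge(vertexId, 1L, Long::sum);
    }

    // True if the vertex has been modified since requiredVersion was recorded.
    public boolean isModified(String vertexId, long requiredVersion) {
        return versions.getOrDefault(vertexId, 0L) != requiredVersion;
    }

    // Returns a callback that only acts if its captured version is still current.
    public BiFunction<Object, Throwable, String> deployOrHandleError(String vertexId, long requiredVersion) {
        return (ignored, throwable) -> {
            if (isModified(vertexId, requiredVersion)) {
                return "superseded"; // a newer deployment replaced this one
            }
            return throwable == null ? "deployed" : "failed";
        };
    }

    public static void main(String[] args) {
        VersionGuardSketch sketch = new VersionGuardSketch();
        long version = sketch.recordModification("vertex-0");
        BiFunction<Object, Throwable, String> callback = sketch.deployOrHandleError("vertex-0", version);
        // The captured version is still current, so the callback acts on the outcome.
        System.out.println(CompletableFuture.<Object>completedFuture(null).handle(callback).join());
        // A later modification makes the pending callback stale.
        sketch.recordModification("vertex-0");
        System.out.println(CompletableFuture.<Object>completedFuture(null).handle(callback).join());
    }
}
```

The sketch returns a status string only to make the three outcomes observable; the original method returns null in every branch and performs its work through side effects.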

Example 52 with ExecutionVertexID

Use of org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID in project flink-mirror by flink-ci.

Class LocalInputPreferredSlotSharingStrategy, method notifySchedulingTopologyUpdated.

@Override
public void notifySchedulingTopologyUpdated(SchedulingTopology schedulingTopology, List<ExecutionVertexID> newExecutionVertices) {
    final Map<ExecutionVertexID, ExecutionSlotSharingGroup> newMap = new LocalInputPreferredSlotSharingStrategy.ExecutionSlotSharingGroupBuilder(schedulingTopology, logicalSlotSharingGroups, coLocationGroups).build();
    for (ExecutionVertexID vertexId : newMap.keySet()) {
        final ExecutionSlotSharingGroup newEssg = newMap.get(vertexId);
        final ExecutionSlotSharingGroup oldEssg = executionSlotSharingGroupMap.get(vertexId);
        if (oldEssg == null) {
            executionSlotSharingGroupMap.put(vertexId, newEssg);
        } else {
            // ensures that existing slot sharing groups are not changed
            checkState(oldEssg.getExecutionVertexIds().equals(newEssg.getExecutionVertexIds()), "Existing ExecutionSlotSharingGroups are changed after topology update");
        }
    }
}
Also used : ExecutionVertexID(org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID)
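
The update rule in notifySchedulingTopologyUpdated (add groups for new vertices, and verify that groups of known vertices are unchanged) can be sketched without Flink types. This is an illustrative sketch only: GroupMergeSketch and mergeKeepingExisting are hypothetical names, and plain String sets stand in for ExecutionVertexID and ExecutionSlotSharingGroup.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class GroupMergeSketch {
    // Adds entries for unknown keys; throws if a known key maps to a different group.
    public static void mergeKeepingExisting(Map<String, Set<String>> existing, Map<String, Set<String>> rebuilt) {
        for (Map.Entry<String, Set<String>> entry : rebuilt.entrySet()) {
            // putIfAbsent returns the previous value, or null if the key was new.
            Set<String> old = existing.putIfAbsent(entry.getKey(), entry.getValue());
            if (old != null && !old.equals(entry.getValue())) {
                throw new IllegalStateException("Existing groups are changed after topology update");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> existing = new HashMap<>();
        existing.put("v1", new HashSet<>(Arrays.asList("v1", "v2")));

        Map<String, Set<String>> rebuilt = new HashMap<>();
        rebuilt.put("v1", new HashSet<>(Arrays.asList("v1", "v2"))); // unchanged group
        rebuilt.put("v3", new HashSet<>(Arrays.asList("v3")));       // group for a new vertex

        mergeKeepingExisting(existing, rebuilt);
        System.out.println(existing.keySet());
    }
}
```

As in the original loop, entries processed before a violation is detected stay in the map; the check signals a broken invariant rather than rolling back.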

Example 53 with ExecutionVertexID

Use of org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID in project flink-mirror by flink-ci.

Class RegionPartitionGroupReleaseStrategyTest, method releasePartitionsIfDownstreamRegionWithMultipleOperatorsIsFinished.

@Test
public void releasePartitionsIfDownstreamRegionWithMultipleOperatorsIsFinished() {
    final List<TestingSchedulingExecutionVertex> sourceVertices = testingSchedulingTopology.addExecutionVertices().finish();
    final List<TestingSchedulingExecutionVertex> intermediateVertices = testingSchedulingTopology.addExecutionVertices().finish();
    final List<TestingSchedulingExecutionVertex> sinkVertices = testingSchedulingTopology.addExecutionVertices().finish();
    final List<TestingSchedulingResultPartition> sourceResultPartitions = testingSchedulingTopology.connectAllToAll(sourceVertices, intermediateVertices).finish();
    testingSchedulingTopology.connectAllToAll(intermediateVertices, sinkVertices).withResultPartitionType(ResultPartitionType.PIPELINED).finish();
    final ExecutionVertexID onlyIntermediateVertexId = intermediateVertices.get(0).getId();
    final ExecutionVertexID onlySinkVertexId = sinkVertices.get(0).getId();
    final IntermediateResultPartitionID onlySourceResultPartitionId = sourceResultPartitions.get(0).getId();
    final RegionPartitionGroupReleaseStrategy regionPartitionGroupReleaseStrategy = new RegionPartitionGroupReleaseStrategy(testingSchedulingTopology);
    regionPartitionGroupReleaseStrategy.vertexFinished(onlyIntermediateVertexId);
    final List<IntermediateResultPartitionID> partitionsToRelease = getReleasablePartitions(regionPartitionGroupReleaseStrategy, onlySinkVertexId);
    assertThat(partitionsToRelease, contains(onlySourceResultPartitionId));
}
Also used : TestingSchedulingExecutionVertex(org.apache.flink.runtime.scheduler.strategy.TestingSchedulingExecutionVertex) TestingSchedulingResultPartition(org.apache.flink.runtime.scheduler.strategy.TestingSchedulingResultPartition) ExecutionVertexID(org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID) RegionPartitionGroupReleaseStrategy(org.apache.flink.runtime.executiongraph.failover.flip1.partitionrelease.RegionPartitionGroupReleaseStrategy) IntermediateResultPartitionID(org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID) Test(org.junit.Test)

Example 54 with ExecutionVertexID

Use of org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID in project flink-mirror by flink-ci.

Class ExecutionFailureHandlerTest, method testNonRecoverableFailureHandlingResult.

/**
 * Tests the case where the failure is of a non-recoverable type.
 */
@Test
public void testNonRecoverableFailureHandlingResult() {
    // trigger an unrecoverable task failure
    final Throwable error = new Exception(new SuppressRestartsException(new Exception("test failure")));
    final long timestamp = System.currentTimeMillis();
    final FailureHandlingResult result = executionFailureHandler.getFailureHandlingResult(new ExecutionVertexID(new JobVertexID(), 0), error, timestamp);
    // verify results
    assertFalse(result.canRestart());
    assertNotNull(result.getError());
    assertTrue(ExecutionFailureHandler.isUnrecoverableError(result.getError()));
    assertThat(result.getTimestamp(), is(timestamp));
    try {
        result.getVerticesToRestart();
        fail("get tasks to restart is not allowed when restarting is suppressed");
    } catch (IllegalStateException ex) {
        // expected
    }
    try {
        result.getRestartDelayMS();
        fail("get restart delay is not allowed when restarting is suppressed");
    } catch (IllegalStateException ex) {
        // expected
    }
    assertEquals(0, executionFailureHandler.getNumberOfRestarts());
}
Also used : SuppressRestartsException(org.apache.flink.runtime.execution.SuppressRestartsException) ExecutionVertexID(org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) Test(org.junit.Test)

Example 55 with ExecutionVertexID

Use of org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID in project flink-mirror by flink-ci.

Class ExecutionFailureHandlerTest, method testRestartingSuppressedFailureHandlingResult.

/**
 * Tests the case where task restarting is suppressed.
 */
@Test
public void testRestartingSuppressedFailureHandlingResult() {
    // restart strategy suppresses restarting
    backoffTimeStrategy.setCanRestart(false);
    // trigger a task failure
    final Throwable error = new Exception("expected test failure");
    final long timestamp = System.currentTimeMillis();
    final FailureHandlingResult result = executionFailureHandler.getFailureHandlingResult(new ExecutionVertexID(new JobVertexID(), 0), error, timestamp);
    // verify results
    assertFalse(result.canRestart());
    assertThat(result.getError(), containsCause(error));
    assertThat(result.getTimestamp(), is(timestamp));
    assertFalse(ExecutionFailureHandler.isUnrecoverableError(result.getError()));
    try {
        result.getVerticesToRestart();
        fail("get tasks to restart is not allowed when restarting is suppressed");
    } catch (IllegalStateException ex) {
        // expected
    }
    try {
        result.getRestartDelayMS();
        fail("get restart delay is not allowed when restarting is suppressed");
    } catch (IllegalStateException ex) {
        // expected
    }
    assertEquals(0, executionFailureHandler.getNumberOfRestarts());
}
Also used : ExecutionVertexID(org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) SuppressRestartsException(org.apache.flink.runtime.execution.SuppressRestartsException) Test(org.junit.Test)

Aggregations

ExecutionVertexID (org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID) 231
Test (org.junit.Test) 165
JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID) 63
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph) 57
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex) 54
SchedulingExecutionVertex (org.apache.flink.runtime.scheduler.strategy.SchedulingExecutionVertex) 51
Set (java.util.Set) 48
IntermediateResultPartitionID (org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID) 45
AdaptiveSchedulerTest (org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerTest) 45
TestingSchedulingExecutionVertex (org.apache.flink.runtime.scheduler.strategy.TestingSchedulingExecutionVertex) 45
Collection (java.util.Collection) 33
TestingSchedulingTopology (org.apache.flink.runtime.scheduler.strategy.TestingSchedulingTopology) 33
HashSet (java.util.HashSet) 30
ExecutionVertex (org.apache.flink.runtime.executiongraph.ExecutionVertex) 30
ArrayList (java.util.ArrayList) 27
Map (java.util.Map) 27
HashMap (java.util.HashMap) 24
List (java.util.List) 24
CompletableFuture (java.util.concurrent.CompletableFuture) 24
TaskManagerLocation (org.apache.flink.runtime.taskmanager.TaskManagerLocation) 24