Example 36 with Instance

use of org.apache.flink.runtime.instance.Instance in project flink by apache.

the class Scheduler method findInstance.

/**
 * Tries to find one of the requested instances. If none of them is available, it returns a
 * non-local instance instead. If no instance is available at all (all slots occupied), or if
 * localOnly is set and none of the requested instances is available, it returns null.
 *
 * <p><b>NOTE:</b> This method is not thread-safe; it must be synchronized by the caller.</p>
 *
 * @param requestedLocations The preferred locations. May be null or empty, which indicates that
 *                           no locality preference exists.
 * @param localOnly Flag indicating whether only one of the requested (local) instances may be chosen.
 * @return The chosen instance together with its locality, or null if no suitable instance is available.
 */
private Pair<Instance, Locality> findInstance(Iterable<TaskManagerLocation> requestedLocations, boolean localOnly) {
    // drain the queue of newly available instances
    while (this.newlyAvailableInstances.size() > 0) {
        Instance queuedInstance = this.newlyAvailableInstances.poll();
        if (queuedInstance != null) {
            this.instancesWithAvailableResources.put(queuedInstance.getTaskManagerID(), queuedInstance);
        }
    }
    // if nothing is available at all, return null
    if (this.instancesWithAvailableResources.isEmpty()) {
        return null;
    }
    Iterator<TaskManagerLocation> locations = requestedLocations == null ? null : requestedLocations.iterator();
    if (locations != null && locations.hasNext()) {
        while (locations.hasNext()) {
            TaskManagerLocation location = locations.next();
            if (location != null) {
                Instance instance = instancesWithAvailableResources.remove(location.getResourceID());
                if (instance != null) {
                    return new ImmutablePair<>(instance, Locality.LOCAL);
                }
            }
        }
        // no local instance available
        if (localOnly) {
            return null;
        } else {
            // take the first instance from the instances with resources
            Iterator<Instance> instances = instancesWithAvailableResources.values().iterator();
            Instance instanceToUse = instances.next();
            instances.remove();
            return new ImmutablePair<>(instanceToUse, Locality.NON_LOCAL);
        }
    } else {
        // no location preference, so use some instance
        Iterator<Instance> instances = instancesWithAvailableResources.values().iterator();
        Instance instanceToUse = instances.next();
        instances.remove();
        return new ImmutablePair<>(instanceToUse, Locality.UNCONSTRAINED);
    }
}
Also used : ImmutablePair(org.apache.commons.lang3.tuple.ImmutablePair) Instance(org.apache.flink.runtime.instance.Instance) TaskManagerLocation(org.apache.flink.runtime.taskmanager.TaskManagerLocation)
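
The selection order described in the Javadoc of findInstance above (preferred location first, then any free instance unless localOnly is set) can be illustrated with a small standalone sketch. This is not Flink code: the class LocalityPickSketch, the Pick holder, and the use of plain String ids and worker names are assumptions made for illustration, standing in for ResourceID, Instance, and Pair<Instance, Locality>.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalityPickSketch {

    /** Mirrors the three outcomes used by the method above. */
    enum Locality { LOCAL, NON_LOCAL, UNCONSTRAINED }

    /** Hypothetical result holder (stands in for Pair<Instance, Locality>). */
    static final class Pick {
        final String worker;
        final Locality locality;

        Pick(String worker, Locality locality) {
            this.worker = worker;
            this.locality = locality;
        }
    }

    /**
     * Removes and returns one worker from 'available', preferring the given ids.
     * Returns null if nothing is available, or if localOnly is set and no
     * preferred worker is free. Not thread-safe, like the original method.
     */
    static Pick pick(Map<String, String> available, List<String> preferredIds, boolean localOnly) {
        if (available.isEmpty()) {
            return null;
        }
        boolean hasPreference = preferredIds != null && !preferredIds.isEmpty();
        if (hasPreference) {
            for (String id : preferredIds) {
                if (id == null) {
                    continue; // skip null entries, as the original does
                }
                String worker = available.remove(id);
                if (worker != null) {
                    return new Pick(worker, Locality.LOCAL);
                }
            }
            if (localOnly) {
                return null; // no preferred worker is free
            }
        }
        // Fall back to the first worker that still has resources.
        Iterator<String> it = available.values().iterator();
        String worker = it.next();
        it.remove();
        return new Pick(worker, hasPreference ? Locality.NON_LOCAL : Locality.UNCONSTRAINED);
    }

    public static void main(String[] args) {
        Map<String, String> available = new LinkedHashMap<>();
        available.put("tm-1", "worker-1");
        available.put("tm-2", "worker-2");
        available.put("tm-3", "worker-3");

        Pick a = pick(available, Arrays.asList("tm-2"), false);
        System.out.println(a.worker + " " + a.locality); // worker-2 LOCAL

        Pick b = pick(available, Arrays.asList("tm-9"), false);
        System.out.println(b.worker + " " + b.locality); // worker-1 NON_LOCAL

        System.out.println(pick(available, Arrays.asList("tm-9"), true)); // null

        Pick c = pick(available, null, false);
        System.out.println(c.worker + " " + c.locality); // worker-3 UNCONSTRAINED
    }
}
```

A LinkedHashMap is used here only so that "the first instance" is deterministic in the demo; the sketch makes no claim about which map type the Scheduler itself uses.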

Example 37 with Instance

use of org.apache.flink.runtime.instance.Instance in project flink by apache.

the class Scheduler method shutdown.

/**
 * Shuts the scheduler down. After shutdown, no more tasks can be added to the scheduler.
 */
public void shutdown() {
    synchronized (globalLock) {
        for (Instance i : allInstances) {
            i.removeSlotListener();
            i.cancelAndReleaseAllSlots();
        }
        allInstances.clear();
        allInstancesByHost.clear();
        instancesWithAvailableResources.clear();
        taskQueue.clear();
    }
}
Also used : Instance(org.apache.flink.runtime.instance.Instance)

Example 38 with Instance

use of org.apache.flink.runtime.instance.Instance in project flink by apache.

the class ExecutionGraphRestartTest method testNoManualRestart.

@Test
public void testNoManualRestart() throws Exception {
    NoRestartStrategy restartStrategy = new NoRestartStrategy();
    Tuple2<ExecutionGraph, Instance> executionGraphInstanceTuple = createExecutionGraph(restartStrategy);
    ExecutionGraph eg = executionGraphInstanceTuple.f0;
    eg.getAllExecutionVertices().iterator().next().fail(new Exception("Test Exception"));
    for (ExecutionVertex vertex : eg.getAllExecutionVertices()) {
        vertex.getCurrentExecutionAttempt().cancelingComplete();
    }
    assertEquals(JobStatus.FAILED, eg.getState());
    // This should not restart the graph.
    eg.restart();
    assertEquals(JobStatus.FAILED, eg.getState());
}
Also used : Instance(org.apache.flink.runtime.instance.Instance) NoRestartStrategy(org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy) SuppressRestartsException(org.apache.flink.runtime.execution.SuppressRestartsException) IOException(java.io.IOException) Test(org.junit.Test)

Example 39 with Instance

use of org.apache.flink.runtime.instance.Instance in project flink by apache.

the class ExecutionGraphRestartTest method testRestartAutomatically.

@Test
public void testRestartAutomatically() throws Exception {
    RestartStrategy restartStrategy = new FixedDelayRestartStrategy(1, 1000);
    Tuple2<ExecutionGraph, Instance> executionGraphInstanceTuple = createExecutionGraph(restartStrategy);
    ExecutionGraph eg = executionGraphInstanceTuple.f0;
    restartAfterFailure(eg, new FiniteDuration(2, TimeUnit.MINUTES), true);
}
Also used : FixedDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy) Instance(org.apache.flink.runtime.instance.Instance) FailureRateRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FailureRateRestartStrategy) InfiniteDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy) NoRestartStrategy(org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy) RestartStrategy(org.apache.flink.runtime.executiongraph.restart.RestartStrategy) FiniteDuration(scala.concurrent.duration.FiniteDuration) Test(org.junit.Test)

Example 40 with Instance

use of org.apache.flink.runtime.instance.Instance in project flink by apache.

the class ExecutionGraphRestartTest method testNoRestartOnSuppressException.

@Test
public void testNoRestartOnSuppressException() throws Exception {
    Tuple2<ExecutionGraph, Instance> executionGraphInstanceTuple = createSpyExecutionGraph(new FixedDelayRestartStrategy(1, 1000));
    ExecutionGraph eg = executionGraphInstanceTuple.f0;
    // Fail with unrecoverable Exception
    eg.getAllExecutionVertices().iterator().next().fail(new SuppressRestartsException(new Exception("Test Exception")));
    assertEquals(JobStatus.FAILING, eg.getState());
    for (ExecutionVertex vertex : eg.getAllExecutionVertices()) {
        vertex.getCurrentExecutionAttempt().cancelingComplete();
    }
    FiniteDuration timeout = new FiniteDuration(2, TimeUnit.MINUTES);
    // Wait for the asynchronous transition to FAILED
    Deadline deadline = timeout.fromNow();
    while (deadline.hasTimeLeft() && eg.getState() != JobStatus.FAILED) {
        Thread.sleep(100);
    }
    assertEquals(JobStatus.FAILED, eg.getState());
    // No restart
    verify(eg, never()).restart();
    RestartStrategy restartStrategy = eg.getRestartStrategy();
    assertTrue(restartStrategy instanceof FixedDelayRestartStrategy);
    assertEquals(0, ((FixedDelayRestartStrategy) restartStrategy).getCurrentRestartAttempt());
}
Also used : SuppressRestartsException(org.apache.flink.runtime.execution.SuppressRestartsException) FixedDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy) Instance(org.apache.flink.runtime.instance.Instance) Deadline(scala.concurrent.duration.Deadline) FiniteDuration(scala.concurrent.duration.FiniteDuration) FailureRateRestartStrategy(org.apache.flink.runtime.executiongraph.restart.FailureRateRestartStrategy) InfiniteDelayRestartStrategy(org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy) NoRestartStrategy(org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy) RestartStrategy(org.apache.flink.runtime.executiongraph.restart.RestartStrategy) IOException(java.io.IOException) Test(org.junit.Test)
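
The wait loop in testNoRestartOnSuppressException above polls a state until a deadline passes. The same pattern can be sketched without the Scala Deadline type, using only the JDK. The class PollUntilSketch and the helper pollUntil are hypothetical names introduced for this illustration; they are not part of Flink or its test utilities.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class PollUntilSketch {

    /**
     * Polls 'condition' until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean pollUntil(BooleanSupplier condition, long timeout, TimeUnit unit, long sleepMillis) {
        // nanoTime is monotonic, so the deadline is unaffected by wall-clock changes
        long deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadlineNanos >= 0) {
                return false; // deadline passed without the condition holding
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for "eg.getState() == JobStatus.FAILED": true on the third check.
        AtomicInteger checks = new AtomicInteger();
        boolean reached = pollUntil(() -> checks.incrementAndGet() >= 3, 2, TimeUnit.MINUTES, 100);
        System.out.println(reached + " after " + checks.get() + " checks"); // true after 3 checks
    }
}
```

Unlike the loop in the test, this helper reports whether the deadline was hit, so a caller can fail with a timeout message instead of relying on a later assertion.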

Aggregations

Instance (org.apache.flink.runtime.instance.Instance): 63
Test (org.junit.Test): 52
SimpleSlot (org.apache.flink.runtime.instance.SimpleSlot): 38
JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID): 33
ActorTaskManagerGateway (org.apache.flink.runtime.jobmanager.slots.ActorTaskManagerGateway): 29
IOException (java.io.IOException): 19
JobID (org.apache.flink.api.common.JobID): 15
ExecutionException (java.util.concurrent.ExecutionException): 14
Scheduler (org.apache.flink.runtime.jobmanager.scheduler.Scheduler): 14
SchedulerTestUtils.getRandomInstance (org.apache.flink.runtime.jobmanager.scheduler.SchedulerTestUtils.getRandomInstance): 14
ExecutionGraphTestUtils.getInstance (org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.getInstance): 12
TaskManagerLocation (org.apache.flink.runtime.taskmanager.TaskManagerLocation): 12
SimpleActorGateway (org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.SimpleActorGateway): 11
ExecutionGraphTestUtils.getExecutionVertex (org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.getExecutionVertex): 11
ActorGateway (org.apache.flink.runtime.instance.ActorGateway): 11
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex): 10
FiniteDuration (scala.concurrent.duration.FiniteDuration): 9
SuppressRestartsException (org.apache.flink.runtime.execution.SuppressRestartsException): 8
BaseTestingActorGateway (org.apache.flink.runtime.instance.BaseTestingActorGateway): 8
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph): 8