Example 6 with CoLocationGroup

Use of org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup in the Apache Flink project.

Class ExecutionGraph, method restart.

public void restart() {
    try {
        synchronized (progressLock) {
            JobStatus current = state;
            if (current == JobStatus.CANCELED) {
                LOG.info("Canceled job during restart. Aborting restart.");
                return;
            } else if (current == JobStatus.FAILED) {
                LOG.info("Failed job during restart. Aborting restart.");
                return;
            } else if (current == JobStatus.SUSPENDED) {
                LOG.info("Suspended job during restart. Aborting restart.");
                return;
            } else if (current != JobStatus.RESTARTING) {
                throw new IllegalStateException("Can only restart job from state restarting.");
            }
            if (slotProvider == null) {
                throw new IllegalStateException("The execution graph has not been scheduled before - slotProvider is null.");
            }
            this.currentExecutions.clear();
            // Several job vertices may share one co-location group, so track the
            // groups already handled and reset each group's constraints only once.
            Collection<CoLocationGroup> colGroups = new HashSet<>();
            for (ExecutionJobVertex jv : this.verticesInCreationOrder) {
                CoLocationGroup cgroup = jv.getCoLocationGroup();
                if (cgroup != null && !colGroups.contains(cgroup)) {
                    cgroup.resetConstraints();
                    colGroups.add(cgroup);
                }
                jv.resetForNewExecution();
            }
            for (int i = 0; i < stateTimestamps.length; i++) {
                if (i != JobStatus.RESTARTING.ordinal()) {
                    // Only clear the timestamps of the non-restarting states, to preserve the
                    // time at which the job was restarted. The restarting time gauge needs it.
                    stateTimestamps[i] = 0;
                }
            }
            numFinishedJobVertices = 0;
            transitionState(JobStatus.RESTARTING, JobStatus.CREATED);
            // if we have checkpointed state, reload it into the executions
            if (checkpointCoordinator != null) {
                checkpointCoordinator.restoreLatestCheckpointedState(getAllVertices(), false, false);
            }
        }
        scheduleForExecution();
    } catch (Throwable t) {
        LOG.warn("Failed to restart the job.", t);
        fail(t);
    }
}
Also used: JobStatus (org.apache.flink.runtime.jobgraph.JobStatus), CoLocationGroup (org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup), SerializedThrowable (org.apache.flink.runtime.util.SerializedThrowable), HashSet (java.util.HashSet)
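
The restart method above resets each co-location group only once, even when several vertices share the same group. A minimal, self-contained sketch of that pattern follows. The class ResetOnceSketch, the nested Group class and the method resetEachGroupOnce are hypothetical stand-ins for illustration and are not part of Flink. The sketch relies on Set.add returning false for an element that is already present, which folds the contains/add pair into a single call.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResetOnceSketch {

    /** Hypothetical stand-in for a co-location group; it only counts its resets. */
    static final class Group {
        int resetCount = 0;

        void resetConstraints() {
            resetCount++;
        }
    }

    /**
     * Resets every distinct, non-null group exactly once.
     *
     * @param groupOfEachVertex one entry per vertex; null means "no group"
     * @return the number of distinct groups that were reset
     */
    static int resetEachGroupOnce(List<Group> groupOfEachVertex) {
        Set<Group> seen = new HashSet<>();
        for (Group group : groupOfEachVertex) {
            // Set.add returns false when the element is already in the set,
            // so a group shared by several vertices is reset only the first time.
            if (group != null && seen.add(group)) {
                group.resetConstraints();
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Group shared = new Group();
        Group single = new Group();
        // Two vertices share one group, one vertex has none, one has its own.
        int distinct = resetEachGroupOnce(Arrays.asList(shared, null, shared, single));
        System.out.println(distinct + " groups reset; shared.resetCount=" + shared.resetCount);
    }
}
```

Group does not override equals or hashCode here, so the set compares groups by identity. That is an assumption of this sketch, not a statement about CoLocationGroup.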

Example 7 with CoLocationGroup

Use of org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup in the Apache Flink project.

Class SharedSlotsTest, method testImmediateReleaseTwoLevel.

@Test
public void testImmediateReleaseTwoLevel() {
    try {
        JobID jobId = new JobID();
        JobVertexID vid = new JobVertexID();
        JobVertex vertex = new JobVertex("vertex", vid);
        SlotSharingGroup sharingGroup = new SlotSharingGroup(vid);
        SlotSharingGroupAssignment assignment = sharingGroup.getTaskAssignment();
        CoLocationGroup coLocationGroup = new CoLocationGroup(vertex);
        CoLocationConstraint constraint = coLocationGroup.getLocationConstraint(0);
        Instance instance = SchedulerTestUtils.getRandomInstance(1);
        SharedSlot sharedSlot = instance.allocateSharedSlot(jobId, assignment);
        SimpleSlot sub = assignment.addSharedSlotAndAllocateSubSlot(sharedSlot, Locality.UNCONSTRAINED, constraint);
        assertNull(sub.getGroupID());
        assertEquals(constraint.getSharedSlot(), sub.getParent());
        sub.releaseSlot();
        assertTrue(sub.isReleased());
        assertTrue(sharedSlot.isReleased());
        assertEquals(1, instance.getNumberOfAvailableSlots());
        assertEquals(0, instance.getNumberOfAllocatedSlots());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used: CoLocationConstraint (org.apache.flink.runtime.jobmanager.scheduler.CoLocationConstraint), JobVertex (org.apache.flink.runtime.jobgraph.JobVertex), CoLocationGroup (org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup), JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID), SlotSharingGroup (org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup), JobID (org.apache.flink.api.common.JobID), Test (org.junit.Test)
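
The test above asserts that releasing the only sub-slot also releases the enclosing shared slot. The toy model below sketches that kind of upward-cascading release across two levels of nesting. It is not Flink's implementation: TwoLevelReleaseSketch and its nested Slot class are hypothetical, and the rule "a parent releases itself once its last child is released" is an assumption chosen to mirror what the test checks.

```java
import java.util.HashSet;
import java.util.Set;

public class TwoLevelReleaseSketch {

    /** Hypothetical slot node; a slot with a null parent is the root. */
    static final class Slot {
        private final Slot parent;
        private final Set<Slot> children = new HashSet<>();
        private boolean released = false;

        Slot(Slot parent) {
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);
            }
        }

        boolean isReleased() {
            return released;
        }

        /** Releases this slot and notifies the parent, if there is one. */
        void release() {
            if (released) {
                return; // releasing twice is a no-op
            }
            released = true;
            if (parent != null) {
                parent.childReleased(this);
            }
        }

        /** Assumed rule: a parent with no remaining children releases itself. */
        private void childReleased(Slot child) {
            children.remove(child);
            if (children.isEmpty()) {
                release();
            }
        }
    }

    public static void main(String[] args) {
        Slot root = new Slot(null);   // plays the role of the top-level shared slot
        Slot middle = new Slot(root); // plays the role of the constraint's shared slot
        Slot leaf = new Slot(middle); // plays the role of the sub-slot
        leaf.release();
        System.out.println(leaf.isReleased() + " " + middle.isReleased() + " " + root.isReleased());
    }
}
```

If the middle slot had a second child, releasing one leaf would leave both the middle slot and the root unreleased until the other leaf is released as well.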

Aggregations

CoLocationGroup (org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup): 7
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex): 4
SlotSharingGroup (org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup): 4
JobID (org.apache.flink.api.common.JobID): 3
JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID): 3
CoLocationConstraint (org.apache.flink.runtime.jobmanager.scheduler.CoLocationConstraint): 3
Test (org.junit.Test): 3
HashMap (java.util.HashMap): 1
HashSet (java.util.HashSet): 1
ExecutionState (org.apache.flink.runtime.execution.ExecutionState): 1
JobStatus (org.apache.flink.runtime.jobgraph.JobStatus): 1
TaskManagerLocation (org.apache.flink.runtime.taskmanager.TaskManagerLocation): 1
SerializedThrowable (org.apache.flink.runtime.util.SerializedThrowable): 1