Example 1 with Task

use of org.apache.flink.runtime.taskmanager.Task in project flink by apache.

the class TaskSlot method add.

// ----------------------------------------------------------------------------------
// State changing methods
// ----------------------------------------------------------------------------------
/**
	 * Add the given task to the task slot. This is only possible if there is not already another
	 * task with the same execution attempt id added to the task slot. In this case, the method
	 * returns true. Otherwise the task slot is left unchanged and false is returned.
	 *
	 * In case that the task slot state is not active an {@link IllegalStateException} is thrown.
	 * In case that the task's job id and allocation id don't match with the job id and allocation
	 * id for which the task slot has been allocated, an {@link IllegalArgumentException} is thrown.
	 *
	 * @param task to be added to the task slot
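	 * @throws IllegalStateException if the task slot is not in state active
	 * @throws IllegalArgumentException if the task's job id or allocation id does not match the
	 *     job id or allocation id for which the task slot has been allocated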
	 * @return true if the task was added to the task slot; otherwise false
	 */
public boolean add(Task task) {
    // Check that this slot has been assigned to the job sending this task
    Preconditions.checkArgument(task.getJobID().equals(jobId), "The task's job id does not match the job id for which the slot has been allocated.");
    Preconditions.checkArgument(task.getAllocationId().equals(allocationId), "The task's allocation id does not match the allocation id for which the slot has been allocated.");
    Preconditions.checkState(TaskSlotState.ACTIVE == state, "The task slot is not in state active.");
    Task oldTask = tasks.put(task.getExecutionId(), task);
    if (oldTask != null) {
        tasks.put(task.getExecutionId(), oldTask);
        return false;
    } else {
        return true;
    }
}
Also used : Task(org.apache.flink.runtime.taskmanager.Task)
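
The add-or-reject contract described in the Javadoc above can be illustrated in isolation. The following is a minimal sketch, not Flink code: the class name TaskSlotAddSketch is hypothetical, and plain strings stand in for the execution attempt id and the task. It assumes the backing collection is a java.util.Map, where putIfAbsent gives the same observable result as the put-then-restore sequence in the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a task slot; only the add() bookkeeping is modelled. */
class TaskSlotAddSketch {
    // Maps an execution attempt id (a plain String here, by assumption) to a task name.
    private final Map<String, String> tasks = new HashMap<>();

    /** Returns true if the task was added; false if the id was already taken. */
    boolean add(String executionId, String task) {
        // putIfAbsent leaves an existing mapping untouched and returns it,
        // so a null result means this call inserted the task.
        return tasks.putIfAbsent(executionId, task) == null;
    }

    String get(String executionId) {
        return tasks.get(executionId);
    }

    public static void main(String[] args) {
        TaskSlotAddSketch slot = new TaskSlotAddSketch();
        System.out.println(slot.add("attempt-1", "map"));    // first add succeeds
        System.out.println(slot.add("attempt-1", "reduce")); // duplicate id is rejected
        System.out.println(slot.get("attempt-1"));           // original task is still registered
    }
}
```

The sketch leaves out the job id, allocation id and slot state checks, which the original method performs before touching the map.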

Example 2 with Task

use of org.apache.flink.runtime.taskmanager.Task in project flink by apache.

the class TaskSlotTable method freeSlot.

/**
	 * Tries to free the slot. If the slot is empty it will set the state of the task slot to free
	 * and return its index. If the slot is not empty, then it will set the state of the task slot
	 * to releasing, fail all tasks and return -1.
	 *
	 * @param allocationId identifying the task slot to be freed
	 * @param cause to fail the tasks with if slot is not empty
	 * @throws SlotNotFoundException if there is no task slot for the given allocation id
	 * @return Index of the freed slot if the slot could be freed; otherwise -1
	 */
public int freeSlot(AllocationID allocationId, Throwable cause) throws SlotNotFoundException {
    checkInit();
    if (LOG.isDebugEnabled()) {
        LOG.debug("Free slot {}.", allocationId, cause);
    } else {
        LOG.info("Free slot {}.", allocationId);
    }
    TaskSlot taskSlot = getTaskSlot(allocationId);
    if (taskSlot != null) {
        LOG.info("Free slot {}.", allocationId, cause);
        final JobID jobId = taskSlot.getJobId();
        if (taskSlot.markFree()) {
            // remove the allocation id to task slot mapping
            allocationIDTaskSlotMap.remove(allocationId);
            // unregister a potential timeout
            timerService.unregisterTimeout(allocationId);
            Set<AllocationID> slots = slotsPerJob.get(jobId);
            if (slots == null) {
                throw new IllegalStateException("There are no more slots allocated for the job " + jobId + ". This indicates a programming bug.");
            }
            slots.remove(allocationId);
            if (slots.isEmpty()) {
                slotsPerJob.remove(jobId);
            }
            return taskSlot.getIndex();
        } else {
            // we couldn't free the task slot because it still contains tasks, fail the tasks
            // and set the slot state to releasing so that it gets eventually freed
            taskSlot.markReleasing();
            Iterator<Task> taskIterator = taskSlot.getTasks();
            while (taskIterator.hasNext()) {
                taskIterator.next().failExternally(cause);
            }
            return -1;
        }
    } else {
        throw new SlotNotFoundException(allocationId);
    }
}
Also used : Task(org.apache.flink.runtime.taskmanager.Task) AllocationID(org.apache.flink.runtime.clusterframework.types.AllocationID) JobID(org.apache.flink.api.common.JobID)
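
The two outcomes of freeSlot (free an empty slot and return its index, or mark a non-empty slot as releasing, fail its tasks and return -1) can be reduced to a small state machine. This is a simplified sketch under stated assumptions: FreeSlotSketch, its State enum and the failedTasks list are hypothetical names, tasks are plain strings, and "failing" a task is modelled by recording it rather than calling failExternally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified slot; not Flink's TaskSlot or TaskSlotTable. */
class FreeSlotSketch {
    enum State { ACTIVE, RELEASING, FREE }

    private final int index;
    private State state = State.ACTIVE;
    private final List<String> tasks = new ArrayList<>();
    private final List<String> failedTasks = new ArrayList<>();

    FreeSlotSketch(int index) {
        this.index = index;
    }

    void addTask(String task) {
        tasks.add(task);
    }

    State getState() {
        return state;
    }

    List<String> getFailedTasks() {
        return failedTasks;
    }

    /** Returns the slot index if the slot was empty and is now free; otherwise -1. */
    int freeSlot() {
        if (tasks.isEmpty()) {
            state = State.FREE;
            return index;
        }
        // The slot still holds tasks: mark it releasing and fail every task.
        // The tasks stay registered until they are removed individually.
        state = State.RELEASING;
        failedTasks.addAll(tasks);
        return -1;
    }

    public static void main(String[] args) {
        FreeSlotSketch empty = new FreeSlotSketch(3);
        System.out.println(empty.freeSlot());  // index of the freed slot

        FreeSlotSketch busy = new FreeSlotSketch(4);
        busy.addTask("map");
        System.out.println(busy.freeSlot());   // -1, slot could not be freed yet
        System.out.println(busy.getState());   // RELEASING
    }
}
```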

Example 3 with Task

use of org.apache.flink.runtime.taskmanager.Task in project flink by apache.

the class TaskSlotTable method removeTask.

/**
	 * Remove the task with the given execution attempt id from its task slot. If the owning task
	 * slot is in state releasing and empty after removing the task, the slot is freed via the
	 * slot actions.
	 *
	 * @param executionAttemptID identifying the task to remove
	 * @return The removed task if there is any for the given execution attempt id; otherwise null
	 */
public Task removeTask(ExecutionAttemptID executionAttemptID) {
    checkInit();
    TaskSlotMapping taskSlotMapping = taskSlotMappings.remove(executionAttemptID);
    if (taskSlotMapping != null) {
        Task task = taskSlotMapping.getTask();
        TaskSlot taskSlot = taskSlotMapping.getTaskSlot();
        taskSlot.remove(task.getExecutionId());
        if (taskSlot.isReleasing() && taskSlot.isEmpty()) {
            slotActions.freeSlot(taskSlot.getAllocationId());
        }
        return task;
    } else {
        return null;
    }
}
Also used : Task(org.apache.flink.runtime.taskmanager.Task)
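
The rule that a releasing slot is freed once its last task is removed can be shown with a callback. The sketch below is illustrative only: RemoveTaskSketch is a hypothetical class, a java.util.function.Consumer stands in for the slot actions, and execution attempt ids and allocation ids are plain strings.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical single-slot model of the remove-then-free rule; not Flink code. */
class RemoveTaskSketch {
    private final String allocationId;
    private final Set<String> tasks = new HashSet<>();
    private final Consumer<String> freeSlotAction; // stand-in for the slot actions callback
    private boolean releasing = false;

    RemoveTaskSketch(String allocationId, Consumer<String> freeSlotAction) {
        this.allocationId = allocationId;
        this.freeSlotAction = freeSlotAction;
    }

    void addTask(String executionId) {
        tasks.add(executionId);
    }

    void markReleasing() {
        releasing = true;
    }

    /** Returns the removed id, or null if no task with that id is registered. */
    String removeTask(String executionId) {
        if (!tasks.remove(executionId)) {
            return null;
        }
        // Only a slot that is both releasing and empty is handed back.
        if (releasing && tasks.isEmpty()) {
            freeSlotAction.accept(allocationId);
        }
        return executionId;
    }

    public static void main(String[] args) {
        RemoveTaskSketch slot =
                new RemoveTaskSketch("alloc-1", id -> System.out.println("freeing " + id));
        slot.addTask("attempt-1");
        slot.addTask("attempt-2");
        slot.markReleasing();
        slot.removeTask("attempt-1"); // slot not empty yet, nothing is freed
        slot.removeTask("attempt-2"); // last task removed, callback fires
    }
}
```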

Example 4 with Task

use of org.apache.flink.runtime.taskmanager.Task in project flink by apache.

the class TaskExecutor method closeJobManagerConnection.

private void closeJobManagerConnection(JobID jobId, Exception cause) {
    log.info("Close JobManager connection for job {}.", jobId);
    // 1. fail tasks running under this JobID
    Iterator<Task> tasks = taskSlotTable.getTasks(jobId);
    while (tasks.hasNext()) {
        tasks.next().failExternally(new Exception("JobManager responsible for " + jobId + " lost the leadership."));
    }
    // 2. Move the active slots to state allocated (possible to time out again)
    Iterator<AllocationID> activeSlots = taskSlotTable.getActiveSlots(jobId);
    while (activeSlots.hasNext()) {
        AllocationID activeSlot = activeSlots.next();
        try {
            if (!taskSlotTable.markSlotInactive(activeSlot, taskManagerConfiguration.getTimeout())) {
                freeSlot(activeSlot, new Exception("Slot could not be marked inactive."));
            }
        } catch (SlotNotFoundException e) {
            log.debug("Could not mark the slot {} inactive.", activeSlot, e);
        }
    }
    // 3. Disassociate from the JobManager
    JobManagerConnection jobManagerConnection = jobManagerTable.remove(jobId);
    if (jobManagerConnection != null) {
        try {
            jobManagerHeartbeatManager.unmonitorTarget(jobManagerConnection.getResourceID());
            jobManagerConnections.remove(jobManagerConnection.getResourceID());
            disassociateFromJobManager(jobManagerConnection, cause);
        } catch (IOException e) {
            log.warn("Could not properly disassociate from JobManager {}.", jobManagerConnection.getJobManagerGateway().getAddress(), e);
        }
    }
}
Also used : SlotNotFoundException(org.apache.flink.runtime.taskexecutor.slot.SlotNotFoundException) Task(org.apache.flink.runtime.taskmanager.Task) AllocationID(org.apache.flink.runtime.clusterframework.types.AllocationID) IOException(java.io.IOException) TimeoutException(java.util.concurrent.TimeoutException) PartitionException(org.apache.flink.runtime.taskexecutor.exceptions.PartitionException) CheckpointException(org.apache.flink.runtime.taskexecutor.exceptions.CheckpointException) SlotAllocationException(org.apache.flink.runtime.taskexecutor.exceptions.SlotAllocationException) TaskSubmissionException(org.apache.flink.runtime.taskexecutor.exceptions.TaskSubmissionException) TaskException(org.apache.flink.runtime.taskexecutor.exceptions.TaskException) SlotNotActiveException(org.apache.flink.runtime.taskexecutor.slot.SlotNotActiveException)

Example 5 with Task

use of org.apache.flink.runtime.taskmanager.Task in project flink by apache.

the class TaskExecutor method unregisterTaskAndNotifyFinalState.

private void unregisterTaskAndNotifyFinalState(final UUID jobMasterLeaderId, final JobMasterGateway jobMasterGateway, final ExecutionAttemptID executionAttemptID) {
    Task task = taskSlotTable.removeTask(executionAttemptID);
    if (task != null) {
        if (!task.getExecutionState().isTerminal()) {
            try {
                task.failExternally(new IllegalStateException("Task is being removed from TaskManager."));
            } catch (Exception e) {
                log.error("Could not properly fail task.", e);
            }
        }
        log.info("Un-registering task and sending final execution state {} to JobManager for task {} {}.", task.getExecutionState(), task.getTaskInfo().getTaskName(), task.getExecutionId());
        AccumulatorSnapshot accumulatorSnapshot = task.getAccumulatorRegistry().getSnapshot();
        updateTaskExecutionState(jobMasterLeaderId, jobMasterGateway, new TaskExecutionState(task.getJobID(), task.getExecutionId(), task.getExecutionState(), task.getFailureCause(), accumulatorSnapshot, task.getMetricGroup().getIOMetricGroup().createSnapshot()));
    } else {
        log.error("Cannot find task with ID {} to unregister.", executionAttemptID);
    }
}
Also used : Task(org.apache.flink.runtime.taskmanager.Task) AccumulatorSnapshot(org.apache.flink.runtime.accumulators.AccumulatorSnapshot) TimeoutException(java.util.concurrent.TimeoutException) PartitionException(org.apache.flink.runtime.taskexecutor.exceptions.PartitionException) CheckpointException(org.apache.flink.runtime.taskexecutor.exceptions.CheckpointException) SlotAllocationException(org.apache.flink.runtime.taskexecutor.exceptions.SlotAllocationException) TaskSubmissionException(org.apache.flink.runtime.taskexecutor.exceptions.TaskSubmissionException) TaskException(org.apache.flink.runtime.taskexecutor.exceptions.TaskException) SlotNotActiveException(org.apache.flink.runtime.taskexecutor.slot.SlotNotActiveException) SlotNotFoundException(org.apache.flink.runtime.taskexecutor.slot.SlotNotFoundException) IOException(java.io.IOException) TaskExecutionState(org.apache.flink.runtime.taskmanager.TaskExecutionState)

Aggregations

Task (org.apache.flink.runtime.taskmanager.Task) 25
Configuration (org.apache.flink.configuration.Configuration) 13
Test (org.junit.Test) 10
StreamConfig (org.apache.flink.streaming.api.graph.StreamConfig) 9
JobID (org.apache.flink.api.common.JobID) 6
AllocationID (org.apache.flink.runtime.clusterframework.types.AllocationID) 6
RpcMethod (org.apache.flink.runtime.rpc.RpcMethod) 6
JobInformation (org.apache.flink.runtime.executiongraph.JobInformation) 5
TaskInformation (org.apache.flink.runtime.executiongraph.TaskInformation) 5
PartitionProducerStateChecker (org.apache.flink.runtime.io.network.netty.PartitionProducerStateChecker) 5
ResultPartitionConsumableNotifier (org.apache.flink.runtime.io.network.partition.ResultPartitionConsumableNotifier) 5
PrepareForTest (org.powermock.core.classloader.annotations.PrepareForTest) 5
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig) 4
BroadcastVariableManager (org.apache.flink.runtime.broadcast.BroadcastVariableManager) 4
ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID) 4
FileCache (org.apache.flink.runtime.filecache.FileCache) 4
IOManager (org.apache.flink.runtime.io.disk.iomanager.IOManager) 4
NetworkEnvironment (org.apache.flink.runtime.io.network.NetworkEnvironment) 4
JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID) 4
InputSplitProvider (org.apache.flink.runtime.jobgraph.tasks.InputSplitProvider) 4