Example 1 with DeschedulingFailure

Use of com.netflix.titus.supplementary.relocation.model.DeschedulingFailure in project titus-control-plane by Netflix.

From the class TaskMigrationDeschedulerTest, method testFailures.

@Test
public void testFailures() {
    Task job1Task0 = jobOperations.getTasks("job1").get(0);
    relocationConnectorStubs.place("removable1", job1Task0);
    relocationConnectorStubs.setQuota("job1", 0);
    DeschedulingFailure failure = newDescheduler(Collections.emptyMap()).getDeschedulingFailure(job1Task0);
    assertThat(failure.getReasonMessage()).contains("job quota");
}
Also used : Task(com.netflix.titus.api.jobmanager.model.job.Task) DeschedulingFailure(com.netflix.titus.supplementary.relocation.model.DeschedulingFailure) Test(org.junit.Test)
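The test above touches only `getReasonMessage()` on the failure object. As a rough illustration of that contract, here is a minimal sketch that assumes a failure carries nothing but a reason message. The names `DeschedulingFailureSketch`, `Failure`, and `checkQuota` are hypothetical stand-ins and are not part of titus-control-plane.

```java
import java.util.Optional;

public class DeschedulingFailureSketch {

    /** Hypothetical stand-in for DeschedulingFailure; only the reason message is modeled. */
    static final class Failure {
        private final String reasonMessage;

        Failure(String reasonMessage) {
            this.reasonMessage = reasonMessage;
        }

        String getReasonMessage() {
            return reasonMessage;
        }
    }

    /** Returns a failure when the job has no remaining eviction quota, otherwise empty (assumed rule). */
    static Optional<Failure> checkQuota(String jobId, int remainingQuota) {
        if (remainingQuota <= 0) {
            return Optional.of(new Failure("job quota exhausted: jobId=" + jobId));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Failure> failure = checkQuota("job1", 0);
        System.out.println(failure.map(Failure::getReasonMessage).orElse("no failure"));
    }
}
```

Returning `Optional<Failure>` here is a design choice of the sketch; the real descheduler may report failures differently.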

Example 2 with DeschedulingFailure

Use of com.netflix.titus.supplementary.relocation.model.DeschedulingFailure in project titus-control-plane by Netflix.

From the class DefaultDeschedulerService, method deschedule.

@Override
public List<DeschedulingResult> deschedule(Map<String, TaskRelocationPlan> plannedAheadTaskRelocationPlans) {
    List<Pair<Job, List<Task>>> allJobsAndTasks = jobOperations.getJobsAndTasks();
    Map<String, Job<?>> jobs = allJobsAndTasks.stream().map(Pair::getLeft).collect(Collectors.toMap(Job::getId, j -> j));
    Map<String, Task> tasksById = allJobsAndTasks.stream().flatMap(p -> p.getRight().stream()).collect(Collectors.toMap(Task::getId, t -> t));
    EvacuatedAgentsAllocationTracker evacuatedAgentsAllocationTracker = new EvacuatedAgentsAllocationTracker(nodeDataResolver.resolve(), tasksById);
    EvictionQuotaTracker evictionQuotaTracker = new EvictionQuotaTracker(evictionOperations, jobs);
    TaskMigrationDescheduler taskMigrationDescheduler = new TaskMigrationDescheduler(plannedAheadTaskRelocationPlans, evacuatedAgentsAllocationTracker, evictionQuotaTracker, evictionConfiguration, jobs, tasksById, titusRuntime);
    Map<String, DeschedulingResult> requestedImmediateEvictions = taskMigrationDescheduler.findAllImmediateEvictions();
    Map<String, DeschedulingResult> requestedEvictions = taskMigrationDescheduler.findRequestedJobOrTaskMigrations();
    Map<String, DeschedulingResult> allRequestedEvictions = CollectionsExt.merge(requestedImmediateEvictions, requestedEvictions);
    Map<String, DeschedulingResult> regularEvictions = new HashMap<>();
    Optional<Pair<TitusNode, List<Task>>> bestMatch;
    while ((bestMatch = taskMigrationDescheduler.nextBestMatch()).isPresent()) {
        TitusNode agent = bestMatch.get().getLeft();
        List<Task> tasks = bestMatch.get().getRight();
        tasks.forEach(task -> {
            if (!allRequestedEvictions.containsKey(task.getId())) {
                Optional<TaskRelocationPlan> relocationPlanForTask = getRelocationPlanForTask(agent, task, plannedAheadTaskRelocationPlans);
                relocationPlanForTask.ifPresent(rp -> regularEvictions.put(task.getId(), DeschedulingResult.newBuilder().withTask(task).withAgentInstance(agent).withTaskRelocationPlan(rp).build()));
            }
        });
    }
    // Find evictions that could not be scheduled now.
    for (Task task : tasksById.values()) {
        if (allRequestedEvictions.containsKey(task.getId()) || regularEvictions.containsKey(task.getId())) {
            continue;
        }
        if (evacuatedAgentsAllocationTracker.isEvacuated(task)) {
            DeschedulingFailure failure = taskMigrationDescheduler.getDeschedulingFailure(task);
            TaskRelocationPlan relocationPlan = plannedAheadTaskRelocationPlans.get(task.getId());
            if (relocationPlan == null) {
                relocationPlan = newNotDelayedRelocationPlan(task, false);
            }
            TitusNode agent = evacuatedAgentsAllocationTracker.getRemovableAgent(task);
            regularEvictions.put(task.getId(), DeschedulingResult.newBuilder().withTask(task).withAgentInstance(agent).withTaskRelocationPlan(relocationPlan).withFailure(failure).build());
        }
    }
    return CollectionsExt.merge(new ArrayList<>(allRequestedEvictions.values()), new ArrayList<>(regularEvictions.values()));
}
Also used : Task(com.netflix.titus.api.jobmanager.model.job.Task) CollectionsExt(com.netflix.titus.common.util.CollectionsExt) HashMap(java.util.HashMap) RelocationPredicates(com.netflix.titus.supplementary.relocation.util.RelocationPredicates) Singleton(javax.inject.Singleton) AtomicReference(java.util.concurrent.atomic.AtomicReference) ArrayList(java.util.ArrayList) Inject(javax.inject.Inject) Pair(com.netflix.titus.common.util.tuple.Pair) Map(java.util.Map) EvictionConfiguration(com.netflix.titus.runtime.connector.eviction.EvictionConfiguration) NodeDataResolver(com.netflix.titus.supplementary.relocation.connector.NodeDataResolver) ReadOnlyJobOperations(com.netflix.titus.api.jobmanager.service.ReadOnlyJobOperations) TaskRelocationReason(com.netflix.titus.api.relocation.model.TaskRelocationPlan.TaskRelocationReason) TaskRelocationPlan(com.netflix.titus.api.relocation.model.TaskRelocationPlan) DeschedulingFailure(com.netflix.titus.supplementary.relocation.model.DeschedulingFailure) DeschedulingResult(com.netflix.titus.supplementary.relocation.model.DeschedulingResult) Job(com.netflix.titus.api.jobmanager.model.job.Job) Collectors(java.util.stream.Collectors) List(java.util.List) ReadOnlyEvictionOperations(com.netflix.titus.api.eviction.service.ReadOnlyEvictionOperations) Optional(java.util.Optional) RelocationUtil(com.netflix.titus.supplementary.relocation.util.RelocationUtil) VisibleForTesting(com.google.common.annotations.VisibleForTesting) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) Clock(com.netflix.titus.common.util.time.Clock) TitusNode(com.netflix.titus.supplementary.relocation.connector.TitusNode) JobFunctions.hasDisruptionBudget(com.netflix.titus.api.jobmanager.model.job.JobFunctions.hasDisruptionBudget)
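The final loop of `deschedule` picks out tasks that sit on evacuated agents but did not make it into either eviction map, and records them with a failure. The selection step alone can be sketched with plain collections, as below. The class `LeftoverEvictionSketch`, the method `findUnscheduled`, and the use of task-ID sets in place of the Titus trackers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LeftoverEvictionSketch {

    /**
     * Returns IDs of tasks that are on evacuated agents but are in neither
     * the requested nor the regular eviction set (hypothetical simplification).
     */
    static List<String> findUnscheduled(Collection<String> allTaskIds,
                                        Set<String> requestedEvictions,
                                        Set<String> regularEvictions,
                                        Set<String> onEvacuatedAgents) {
        List<String> unscheduled = new ArrayList<>();
        for (String taskId : allTaskIds) {
            // Already planned for eviction: nothing to report.
            if (requestedEvictions.contains(taskId) || regularEvictions.contains(taskId)) {
                continue;
            }
            // Should be evicted (agent is evacuated) but could not be scheduled now.
            if (onEvacuatedAgents.contains(taskId)) {
                unscheduled.add(taskId);
            }
        }
        return unscheduled;
    }

    public static void main(String[] args) {
        List<String> result = findUnscheduled(
                Arrays.asList("t1", "t2", "t3", "t4"),
                Collections.singleton("t1"),
                Collections.singleton("t2"),
                new HashSet<>(Arrays.asList("t1", "t3")));
        System.out.println(result);
    }
}
```

In the original method each such task is additionally paired with a `DeschedulingFailure` and a relocation plan; the sketch omits that part.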

Example 3 with DeschedulingFailure

Use of com.netflix.titus.supplementary.relocation.model.DeschedulingFailure in project titus-control-plane by Netflix.

From the class RelocationTransactionLogger, method logTaskRelocationDeschedulingResult.

void logTaskRelocationDeschedulingResult(String stepName, DeschedulingResult deschedulingResult) {
    String taskId = deschedulingResult.getTask().getId();
    DeschedulingFailure failure = deschedulingResult.getFailure().orElse(null);
    if (failure == null) {
        doLog(findJob(taskId), taskId, stepName, "descheduling", "success", "Scheduled for being evicted now from agent: agentId=" + deschedulingResult.getAgentInstance().getId());
    } else {
        doLog(findJob(taskId), taskId, stepName, "descheduling", "failure", String.format("Task eviction not possible: agentId=%s, reason=%s", deschedulingResult.getAgentInstance().getId(), failure.getReasonMessage()));
    }
}
Also used : DeschedulingFailure(com.netflix.titus.supplementary.relocation.model.DeschedulingFailure)
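The logger above unwraps the optional failure with `orElse(null)` and then branches on null. One possible alternative keeps the value inside `Optional` and builds the message with `map` and `orElseGet`. The sketch below assumes the failure can be reduced to its reason string; `DeschedulingLogSketch` and `describe` are hypothetical names, not part of titus-control-plane.

```java
import java.util.Optional;

public class DeschedulingLogSketch {

    /** Builds the log message; failureReason stands in for the optional DeschedulingFailure. */
    static String describe(String agentId, Optional<String> failureReason) {
        return failureReason
                .map(reason -> String.format(
                        "Task eviction not possible: agentId=%s, reason=%s", agentId, reason))
                .orElseGet(() -> "Scheduled for being evicted now from agent: agentId=" + agentId);
    }

    public static void main(String[] args) {
        System.out.println(describe("agent1", Optional.empty()));
        System.out.println(describe("agent1", Optional.of("job quota")));
    }
}
```

The original code also logs a different status string ("success" or "failure") per branch, so an explicit if/else may remain the clearer choice there.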

Aggregations

DeschedulingFailure (com.netflix.titus.supplementary.relocation.model.DeschedulingFailure): 3
Task (com.netflix.titus.api.jobmanager.model.job.Task): 2
VisibleForTesting (com.google.common.annotations.VisibleForTesting): 1
ReadOnlyEvictionOperations (com.netflix.titus.api.eviction.service.ReadOnlyEvictionOperations): 1
Job (com.netflix.titus.api.jobmanager.model.job.Job): 1
JobFunctions.hasDisruptionBudget (com.netflix.titus.api.jobmanager.model.job.JobFunctions.hasDisruptionBudget): 1
ReadOnlyJobOperations (com.netflix.titus.api.jobmanager.service.ReadOnlyJobOperations): 1
TaskRelocationPlan (com.netflix.titus.api.relocation.model.TaskRelocationPlan): 1
TaskRelocationReason (com.netflix.titus.api.relocation.model.TaskRelocationPlan.TaskRelocationReason): 1
TitusRuntime (com.netflix.titus.common.runtime.TitusRuntime): 1
CollectionsExt (com.netflix.titus.common.util.CollectionsExt): 1
Clock (com.netflix.titus.common.util.time.Clock): 1
Pair (com.netflix.titus.common.util.tuple.Pair): 1
EvictionConfiguration (com.netflix.titus.runtime.connector.eviction.EvictionConfiguration): 1
NodeDataResolver (com.netflix.titus.supplementary.relocation.connector.NodeDataResolver): 1
TitusNode (com.netflix.titus.supplementary.relocation.connector.TitusNode): 1
DeschedulingResult (com.netflix.titus.supplementary.relocation.model.DeschedulingResult): 1
RelocationPredicates (com.netflix.titus.supplementary.relocation.util.RelocationPredicates): 1
RelocationUtil (com.netflix.titus.supplementary.relocation.util.RelocationUtil): 1
ArrayList (java.util.ArrayList): 1