
Example 1 with ReconciliationEngine

use of com.netflix.titus.common.framework.reconciler.ReconciliationEngine in project titus-control-plane by Netflix.

the class DefaultReconciliationFramework method changeReferenceModel.

@Override
public Observable<Void> changeReferenceModel(MultiEngineChangeAction multiEngineChangeAction, BiFunction<String, Observable<List<ModelActionHolder>>, ChangeAction> engineChangeActionFactory, String... rootEntityHolderIds) {
    Preconditions.checkArgument(rootEntityHolderIds.length > 1, "Change action for multiple engines requested, but %s root id holders provided", rootEntityHolderIds.length);
    return Observable.create(emitter -> {
        List<ReconciliationEngine<EVENT>> engines = new ArrayList<>();
        for (String id : rootEntityHolderIds) {
            ReconciliationEngine<EVENT> engine = findEngineByRootId(id).orElseThrow(() -> new IllegalArgumentException("Reconciliation engine not found: rootId=" + id));
            engines.add(engine);
        }
        List<Observable<Map<String, List<ModelActionHolder>>>> outputs = ObservableExt.propagate(multiEngineChangeAction.apply(), engines.size());
        List<Observable<Void>> engineActions = new ArrayList<>();
        for (int i = 0; i < engines.size(); i++) {
            ReconciliationEngine<EVENT> engine = engines.get(i);
            String rootId = engine.getReferenceView().getId();
            ChangeAction engineAction = engineChangeActionFactory.apply(rootId, outputs.get(i).map(r -> r.get(rootId)));
            engineActions.add(engine.changeReferenceModel(engineAction));
        }
        // Synchronize on subscription so that this operation is not interleaved with concurrent
        // subscriptions for the same set or subset of reconciliation engines. Interleaving might result
        // in a deadlock. For example, with two engines, engineA and engineB:
        // - multi-engine change action M1 for engineA and engineB is scheduled
        // - M1/engineA is added to its queue
        // - another multi-engine change action M2 for engineA and engineB is scheduled
        // - M2/engineB is added to its queue
        // - M1/engineB is added to its queue, followed by M2/engineA
        // Executing M1 requires that both M1/engineA and M1/engineB are at the head of their queues, but here
        // M2/engineB is ahead of M1/engineB. Likewise, M2 cannot run because M1/engineA is ahead of M2/engineA.
        // The result is a deadlock. Note that regular (engine-scoped) change actions can be ignored here.
        Subscription subscription;
        synchronized (multiEngineChangeLock) {
            subscription = Observable.mergeDelayError(engineActions).subscribe(emitter::onNext, emitter::onError, emitter::onCompleted);
        }
        emitter.setSubscription(subscription);
    }, Emitter.BackpressureMode.NONE);
}
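
The comment in the method above explains why the subscription is synchronized: the per-engine parts of two multi-engine actions must not interleave. The following is a minimal sketch of that idea using plain `java.util` queues instead of Titus classes. The class `MultiQueueEnqueue` and its methods are hypothetical names introduced only for illustration; this is not the Titus implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; not part of titus-control-plane.
final class MultiQueueEnqueue {
    private final Object lock = new Object();
    private final List<Deque<String>> queues = new ArrayList<>();

    MultiQueueEnqueue(int queueCount) {
        for (int i = 0; i < queueCount; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    // Adds the action to every queue while holding one shared lock, so no other
    // multi-queue action can be added between two of these additions.
    void enqueueOnAll(String actionName) {
        synchronized (lock) {
            for (Deque<String> queue : queues) {
                queue.addLast(actionName);
            }
        }
    }

    // An action is runnable only when it is at the head of every queue.
    boolean isRunnable(String actionName) {
        synchronized (lock) {
            for (Deque<String> queue : queues) {
                if (!actionName.equals(queue.peekFirst())) {
                    return false;
                }
            }
            return true;
        }
    }

    // Removes a runnable action from the head of every queue.
    void complete(String actionName) {
        synchronized (lock) {
            if (!isRunnable(actionName)) {
                throw new IllegalStateException("Not at the head of every queue: " + actionName);
            }
            for (Deque<String> queue : queues) {
                queue.removeFirst();
            }
        }
    }
}
```

Because every multi-queue action is added to all queues atomically, the relative order of two such actions is the same in every queue, so whichever was added first can always reach the head everywhere.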

Example 2 with ReconciliationEngine

use of com.netflix.titus.common.framework.reconciler.ReconciliationEngine in project titus-control-plane by Netflix.

the class DefaultV3JobOperations method moveServiceTask.

@Override
public Observable<Void> moveServiceTask(String sourceJobId, String targetJobId, String taskId, CallMetadata callMetadata) {
    return Observable.defer(() -> {
        Pair<ReconciliationEngine<JobManagerReconcilerEvent>, EntityHolder> fromEngineTaskPair = reconciliationFramework.findEngineByChildId(taskId).orElseThrow(() -> JobManagerException.taskNotFound(taskId));
        ReconciliationEngine<JobManagerReconcilerEvent> engineFrom = fromEngineTaskPair.getLeft();
        Job<ServiceJobExt> jobFrom = engineFrom.getReferenceView().getEntity();
        if (!JobFunctions.isServiceJob(jobFrom)) {
            throw JobManagerException.notServiceJob(jobFrom.getId());
        }
        if (!jobFrom.getId().equals(sourceJobId)) {
            throw JobManagerException.taskJobMismatch(taskId, sourceJobId);
        }
        if (jobFrom.getId().equals(targetJobId)) {
            throw JobManagerException.sameJobs(jobFrom.getId());
        }
        ReconciliationEngine<JobManagerReconcilerEvent> engineTo = reconciliationFramework.findEngineByRootId(targetJobId).orElseThrow(() -> JobManagerException.jobNotFound(targetJobId));
        Job<ServiceJobExt> jobTo = engineTo.getReferenceView().getEntity();
        if (!JobFunctions.isServiceJob(jobTo)) {
            throw JobManagerException.notServiceJob(jobTo.getId());
        }
        JobCompatibility compatibility = JobCompatibility.of(jobFrom, jobTo);
        if (featureActivationConfiguration.isMoveTaskValidationEnabled() && !compatibility.isCompatible()) {
            Optional<String> diffReport = ProtobufExt.diffReport(GrpcJobManagementModelConverters.toGrpcJobDescriptor(compatibility.getNormalizedDescriptorFrom()), GrpcJobManagementModelConverters.toGrpcJobDescriptor(compatibility.getNormalizedDescriptorTo()));
            throw JobManagerException.notCompatible(jobFrom, jobTo, diffReport.orElse(""));
        }
        return reconciliationFramework.changeReferenceModel(new MoveTaskBetweenJobsAction(engineFrom, engineTo, taskId, store, callMetadata, versionSupplier), (rootId, modelUpdatesObservable) -> {
            String name;
            String summary;
            if (targetJobId.equals(rootId)) {
                name = "moveTask(to)";
                summary = "Moving a task to this job from job " + jobFrom.getId();
            } else {
                name = "moveTask(from)";
                summary = "Moving a task out of this job to job " + jobTo.getId();
            }
            return new TitusChangeAction(Trigger.API, rootId, null, name, summary, callMetadata) {

                @Override
                public Observable<List<ModelActionHolder>> apply() {
                    return modelUpdatesObservable;
                }
            };
        }, jobFrom.getId(), jobTo.getId());
    });
}
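
The factory lambda passed to `changeReferenceModel` above is invoked once per engine and selects a name and summary depending on whether the engine's root id is the target job. A minimal sketch of just that selection logic, assuming plain strings in place of `TitusChangeAction` (the class `MoveTaskLabels` and method `labeler` are hypothetical), could look like this:

```java
import java.util.function.Function;

// Hypothetical illustration; not part of titus-control-plane.
final class MoveTaskLabels {
    // Returns a function that maps an engine's root id to a "name: summary" label.
    static Function<String, String> labeler(String sourceJobId, String targetJobId) {
        return rootId -> targetJobId.equals(rootId)
                ? "moveTask(to): Moving a task to this job from job " + sourceJobId
                : "moveTask(from): Moving a task out of this job to job " + targetJobId;
    }
}
```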

Example 3 with ReconciliationEngine

use of com.netflix.titus.common.framework.reconciler.ReconciliationEngine in project titus-control-plane by Netflix.

the class DefaultV3JobOperations method updateTask.

@Override
public Completable updateTask(String taskId, Function<Task, Optional<Task>> changeFunction, Trigger trigger, String reason, CallMetadata callMetadata) {
    Optional<ReconciliationEngine<JobManagerReconcilerEvent>> engineOpt = reconciliationFramework.findEngineByChildId(taskId).map(Pair::getLeft);
    if (!engineOpt.isPresent()) {
        return Completable.error(JobManagerException.taskNotFound(taskId));
    }
    ReconciliationEngine<JobManagerReconcilerEvent> engine = engineOpt.get();
    TitusChangeAction changeAction = BasicTaskActions.updateTaskInRunningModel(taskId, trigger, jobManagerConfiguration, engine, changeFunction, reason, versionSupplier, titusRuntime, callMetadata);
    return engine.changeReferenceModel(changeAction, taskId).toCompletable();
}

Example 4 with ReconciliationEngine

use of com.netflix.titus.common.framework.reconciler.ReconciliationEngine in project titus-control-plane by Netflix.

the class BasicTaskActions method writeReferenceTaskToStore.

/**
 * Write updated task record to a store. If a task is completed, remove it from the scheduling service.
 * This command calls {@link JobStore#updateTask(Task)}, which assumes that the task record was created already.
 */
public static TitusChangeAction writeReferenceTaskToStore(JobStore titusStore, ReconciliationEngine<JobManagerReconcilerEvent> engine, String taskId, CallMetadata callMetadata, TitusRuntime titusRuntime) {
    return TitusChangeAction.newAction("writeReferenceTaskToStore").trigger(V3JobOperations.Trigger.Reconciler).id(taskId).summary("Persisting task to the store").callMetadata(callMetadata).changeWithModelUpdate(self -> {
        Optional<EntityHolder> taskHolder = engine.getReferenceView().findById(taskId);
        if (!taskHolder.isPresent()) {
            // Should never happen
            titusRuntime.getCodeInvariants().inconsistent("Reference task with id %s not found.", taskId);
            return Observable.empty();
        }
        Task referenceTask = taskHolder.get().getEntity();
        return titusStore.updateTask(referenceTask).andThen(Observable.fromCallable(() -> {
            TitusModelAction modelUpdateAction = TitusModelAction.newModelUpdate(self).taskUpdate(storeRoot -> {
                EntityHolder storedHolder = EntityHolder.newRoot(referenceTask.getId(), referenceTask);
                return Pair.of(storeRoot.addChild(storedHolder), storedHolder);
            });
            return ModelActionHolder.store(modelUpdateAction);
        }));
    });
}
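
The method above persists the reference task first and only then emits an update for the store model, so the in-memory store view never gets ahead of the durable store. A minimal sketch of that ordering, assuming `CompletableFuture` and in-memory maps as stand-ins for RxJava and `JobStore` (the class `PersistThenUpdate` and all of its members are hypothetical), might be:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; not part of titus-control-plane.
final class PersistThenUpdate {
    private final Map<String, String> referenceModel = new ConcurrentHashMap<>();
    private final Map<String, String> durableStore = new ConcurrentHashMap<>(); // stand-in for an external store
    private final Map<String, String> storeModel = new ConcurrentHashMap<>();   // in-memory view of the store

    void putReference(String taskId, String task) {
        referenceModel.put(taskId, task);
    }

    Optional<String> storeView(String taskId) {
        return Optional.ofNullable(storeModel.get(taskId));
    }

    // Completes with true when the task was persisted and the store model updated,
    // or with false when the reference model has no such task.
    CompletableFuture<Boolean> writeReferenceToStore(String taskId) {
        String task = referenceModel.get(taskId);
        if (task == null) {
            return CompletableFuture.completedFuture(false);
        }
        return CompletableFuture
                .runAsync(() -> durableStore.put(taskId, task))
                // The model update runs only after the write has completed normally.
                .thenApply(ignored -> {
                    storeModel.put(taskId, task);
                    return true;
                });
    }
}
```

If the write step fails, `thenApply` is skipped and the returned future completes exceptionally, leaving the store model unchanged.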

Example 5 with ReconciliationEngine

use of com.netflix.titus.common.framework.reconciler.ReconciliationEngine in project titus-control-plane by Netflix.

the class KillInitiatedActions method reconcilerInitiatedAllTasksKillInitiated.

/**
 * For all active tasks, send terminate command to the compute provider, and change their state to {@link TaskState#KillInitiated}.
 * This method is used for internal state reconciliation.
 */
public static List<ChangeAction> reconcilerInitiatedAllTasksKillInitiated(ReconciliationEngine<JobManagerReconcilerEvent> engine, JobServiceRuntime runtime, JobStore jobStore, String reasonCode, String reason, int concurrencyLimit, VersionSupplier versionSupplier, TitusRuntime titusRuntime) {
    List<ChangeAction> result = new ArrayList<>();
    EntityHolder runningView = engine.getRunningView();
    Set<String> runningTaskIds = new HashSet<>();
    runningView.getChildren().forEach(taskHolder -> runningTaskIds.add(taskHolder.<Task>getEntity().getId()));
    // Immediately finish Accepted tasks, which are not yet in the running model.
    for (EntityHolder entityHolder : engine.getReferenceView().getChildren()) {
        if (result.size() >= concurrencyLimit) {
            return result;
        }
        Task task = entityHolder.getEntity();
        TaskState state = task.getStatus().getState();
        if (state == TaskState.Accepted && !runningTaskIds.contains(task.getId())) {
            result.add(BasicTaskActions.updateTaskAndWriteItToStore(task.getId(), engine, taskRef -> JobFunctions.changeTaskStatus(taskRef, TaskState.Finished, reasonCode, reason, titusRuntime.getClock()), jobStore, V3JobOperations.Trigger.Reconciler, reason, versionSupplier, titusRuntime, JobManagerConstants.RECONCILER_CALLMETADATA.toBuilder().withCallReason(reason).build()));
        }
    }
    // Move running tasks to KillInitiated state
    for (EntityHolder taskHolder : runningView.getChildren()) {
        if (result.size() >= concurrencyLimit) {
            return result;
        }
        Task task = taskHolder.getEntity();
        TaskState state = task.getStatus().getState();
        if (state != TaskState.KillInitiated && state != TaskState.Finished) {
            result.add(reconcilerInitiatedTaskKillInitiated(engine, task, runtime, jobStore, versionSupplier, reasonCode, reason, titusRuntime));
        }
    }
    return result;
}
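
The method above builds its result in two passes and returns early as soon as `concurrencyLimit` actions have been collected. A minimal sketch of that bounded two-pass collection, assuming string ids and string labels in place of tasks and `ChangeAction` objects (the class `BoundedActionCollector` and method `collect` are hypothetical), could be:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of titus-control-plane.
final class BoundedActionCollector {
    // First pass handles tasks that were accepted but never started; second pass
    // handles running tasks. Collection stops once `limit` actions are gathered.
    static List<String> collect(List<String> acceptedNotRunning, List<String> running, int limit) {
        List<String> result = new ArrayList<>();
        for (String id : acceptedNotRunning) {
            if (result.size() >= limit) {
                return result;
            }
            result.add("finish:" + id);
        }
        for (String id : running) {
            if (result.size() >= limit) {
                return result;
            }
            result.add("kill:" + id);
        }
        return result;
    }
}
```

Checking the limit before each addition means the first pass takes priority: running tasks are considered only when the limit has not already been reached.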
