Example 1 with JobManagerReconcilerEvent

Use of com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent in project titus-control-plane by Netflix.

Class JobReconciliationFrameworkFactory, method newInstance.

ReconciliationFramework<JobManagerReconcilerEvent> newInstance() {
    List<Pair<Job, List<Task>>> jobsAndTasks = loadJobsAndTasksFromStore(errorCollector);
    // initialize fenzo with running tasks
    List<InternalReconciliationEngine<JobManagerReconcilerEvent>> engines = new ArrayList<>();
    for (Pair<Job, List<Task>> pair : jobsAndTasks) {
        Job job = pair.getLeft();
        List<Task> tasks = pair.getRight();
        InternalReconciliationEngine<JobManagerReconcilerEvent> engine = newRestoredEngine(job, tasks);
        engines.add(engine);
        for (Task task : tasks) {
            Optional<Task> validatedTask = validateTask(task);
            if (!validatedTask.isPresent()) {
                errorCollector.invalidTaskRecord(task.getId());
            }
        }
    }
    errorCollector.failIfTooManyBadRecords();
    return new DefaultReconciliationFramework<>(
            engines,
            bootstrapModel -> newEngine(bootstrapModel, true),
            jobManagerConfiguration.getReconcilerIdleTimeoutMs(),
            jobManagerConfiguration.getReconcilerActiveTimeoutMs(),
            jobManagerConfiguration.getCheckpointIntervalMs(),
            INDEX_COMPARATORS,
            JOB_EVENT_FACTORY,
            registry,
            optionalScheduler
    );
}
Also used : Task(com.netflix.titus.api.jobmanager.model.job.Task) ArrayList(java.util.ArrayList) JobManagerReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent) DefaultReconciliationFramework(com.netflix.titus.common.framework.reconciler.internal.DefaultReconciliationFramework) List(java.util.List) InternalReconciliationEngine(com.netflix.titus.common.framework.reconciler.internal.InternalReconciliationEngine) Job(com.netflix.titus.api.jobmanager.model.job.Job) Pair(com.netflix.titus.common.util.tuple.Pair)

Example 2 with JobManagerReconcilerEvent

Use of com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent in project titus-control-plane by Netflix.

Class DefaultV3JobOperations, method moveServiceTask.

@Override
public Observable<Void> moveServiceTask(String sourceJobId, String targetJobId, String taskId, CallMetadata callMetadata) {
    return Observable.defer(() -> {
        Pair<ReconciliationEngine<JobManagerReconcilerEvent>, EntityHolder> fromEngineTaskPair = reconciliationFramework.findEngineByChildId(taskId).orElseThrow(() -> JobManagerException.taskNotFound(taskId));
        ReconciliationEngine<JobManagerReconcilerEvent> engineFrom = fromEngineTaskPair.getLeft();
        Job<ServiceJobExt> jobFrom = engineFrom.getReferenceView().getEntity();
        if (!JobFunctions.isServiceJob(jobFrom)) {
            throw JobManagerException.notServiceJob(jobFrom.getId());
        }
        if (!jobFrom.getId().equals(sourceJobId)) {
            throw JobManagerException.taskJobMismatch(taskId, sourceJobId);
        }
        if (jobFrom.getId().equals(targetJobId)) {
            throw JobManagerException.sameJobs(jobFrom.getId());
        }
        ReconciliationEngine<JobManagerReconcilerEvent> engineTo = reconciliationFramework.findEngineByRootId(targetJobId).orElseThrow(() -> JobManagerException.jobNotFound(targetJobId));
        Job<ServiceJobExt> jobTo = engineTo.getReferenceView().getEntity();
        if (!JobFunctions.isServiceJob(jobTo)) {
            throw JobManagerException.notServiceJob(jobTo.getId());
        }
        JobCompatibility compatibility = JobCompatibility.of(jobFrom, jobTo);
        if (featureActivationConfiguration.isMoveTaskValidationEnabled() && !compatibility.isCompatible()) {
            Optional<String> diffReport = ProtobufExt.diffReport(GrpcJobManagementModelConverters.toGrpcJobDescriptor(compatibility.getNormalizedDescriptorFrom()), GrpcJobManagementModelConverters.toGrpcJobDescriptor(compatibility.getNormalizedDescriptorTo()));
            throw JobManagerException.notCompatible(jobFrom, jobTo, diffReport.orElse(""));
        }
        return reconciliationFramework.changeReferenceModel(new MoveTaskBetweenJobsAction(engineFrom, engineTo, taskId, store, callMetadata, versionSupplier), (rootId, modelUpdatesObservable) -> {
            String name;
            String summary;
            if (targetJobId.equals(rootId)) {
                name = "moveTask(to)";
                summary = "Moving a task to this job from job " + jobFrom.getId();
            } else {
                name = "moveTask(from)";
                summary = "Moving a task out of this job to job " + jobTo.getId();
            }
            return new TitusChangeAction(Trigger.API, rootId, null, name, summary, callMetadata) {

                @Override
                public Observable<List<ModelActionHolder>> apply() {
                    return modelUpdatesObservable;
                }
            };
        }, jobFrom.getId(), jobTo.getId());
    });
}
Also used : MoveTaskBetweenJobsAction(com.netflix.titus.master.jobmanager.service.service.action.MoveTaskBetweenJobsAction) EntityHolder(com.netflix.titus.common.framework.reconciler.EntityHolder) JobManagerReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent) JobCompatibility(com.netflix.titus.api.jobmanager.model.job.JobCompatibility) ReconciliationEngine(com.netflix.titus.common.framework.reconciler.ReconciliationEngine) ServiceJobExt(com.netflix.titus.api.jobmanager.model.job.ext.ServiceJobExt) TitusChangeAction(com.netflix.titus.master.jobmanager.service.common.action.TitusChangeAction) List(java.util.List) ArrayList(java.util.ArrayList)

Example 3 with JobManagerReconcilerEvent

Use of com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent in project titus-control-plane by Netflix.

Class JobTransactionLoggerTest, method testLogFormatting.

/**
 * Sole purpose of this test is visual inspection of the generated log line.
 */
@Test
public void testLogFormatting() throws Exception {
    Job previousJob = createJob();
    Job currentJob = previousJob.toBuilder().withStatus(JobStatus.newBuilder().withState(JobState.Finished).build()).build();
    ModelActionHolder modelActionHolder = ModelActionHolder.reference(TitusModelAction.newModelUpdate("testModelAction").job(previousJob).trigger(Trigger.API).summary("Job model update").jobUpdate(jobHolder -> jobHolder.setEntity(currentJob)));
    TitusChangeAction changeAction = TitusChangeAction.newAction("testChangeAction").job(previousJob).trigger(Trigger.API).summary("Job update").callMetadata(CallMetadata.newBuilder().withCallerId("LoggerTest").withCallReason("Testing logger transaction").build()).applyModelUpdate(self -> modelActionHolder);
    JobManagerReconcilerEvent jobReconcilerEvent = new JobModelUpdateReconcilerEvent(previousJob, changeAction, modelActionHolder, EntityHolder.newRoot(currentJob.getId(), currentJob), Optional.of(EntityHolder.newRoot(previousJob.getId(), previousJob)), "1");
    String logLine = JobTransactionLogger.doFormat(jobReconcilerEvent);
    assertThat(logLine).isNotEmpty();
    logger.info("Job event: {}", logLine);
}
Also used : Trigger(com.netflix.titus.api.jobmanager.service.V3JobOperations.Trigger) Job(com.netflix.titus.api.jobmanager.model.job.Job) Logger(org.slf4j.Logger) JobModel(com.netflix.titus.api.jobmanager.model.job.JobModel) TitusChangeAction(com.netflix.titus.master.jobmanager.service.common.action.TitusChangeAction) Assertions.assertThat(org.assertj.core.api.Assertions.assertThat) LoggerFactory(org.slf4j.LoggerFactory) TitusModelAction(com.netflix.titus.master.jobmanager.service.common.action.TitusModelAction) Test(org.junit.Test) UUID(java.util.UUID) EntityHolder(com.netflix.titus.common.framework.reconciler.EntityHolder) JobStatus(com.netflix.titus.api.jobmanager.model.job.JobStatus) ModelActionHolder(com.netflix.titus.common.framework.reconciler.ModelActionHolder) JobModelUpdateReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobModelReconcilerEvent.JobModelUpdateReconcilerEvent) JobState(com.netflix.titus.api.jobmanager.model.job.JobState) Optional(java.util.Optional) JobManagerReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent) CallMetadata(com.netflix.titus.api.model.callmetadata.CallMetadata)

Example 4 with JobManagerReconcilerEvent

Use of com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent in project titus-control-plane by Netflix.

Class DefaultV3JobOperations, method enterActiveMode.

@Activator
public void enterActiveMode() {
    this.reconciliationFramework = jobReconciliationFrameworkFactory.newInstance();
    // BUG: event stream breaks permanently, and cannot be retried.
    // As we cannot fix the underlying issue yet, we have to be able to discover when it happens.
    AtomicLong eventStreamLastError = new AtomicLong();
    Clock clock = titusRuntime.getClock();
    this.transactionLoggerSubscription = JobTransactionLogger.logEvents(reconciliationFramework, eventStreamLastError, clock);
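    // Gauge: time elapsed since the last recorded event stream error, or 0 if none has been recorded.
    PolledMeter.using(titusRuntime.getRegistry())
            .withName(METRIC_EVENT_STREAM_LAST_ERROR)
            .monitorValue(eventStreamLastError, value -> value.get() <= 0 ? 0 : clock.wallTime() - value.get());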
    // Remove finished jobs from the reconciliation framework.
    Observable<JobManagerReconcilerEvent> reconciliationEventsObservable = reconciliationFramework.events().onBackpressureBuffer(OBSERVE_JOBS_BACKPRESSURE_BUFFER_SIZE, () -> logger.warn("Overflowed the buffer size: " + OBSERVE_JOBS_BACKPRESSURE_BUFFER_SIZE), BackpressureOverflow.ON_OVERFLOW_ERROR).doOnSubscribe(() -> {
        List<EntityHolder> entityHolders = reconciliationFramework.orderedView(IndexKind.StatusCreationTime);
        for (EntityHolder entityHolder : entityHolders) {
            handleJobCompletedEvent(entityHolder);
        }
    });
    this.reconcilerEventSubscription = titusRuntime.persistentStream(reconciliationEventsObservable).subscribe(event -> {
        if (event instanceof JobModelUpdateReconcilerEvent) {
            JobModelUpdateReconcilerEvent jobUpdateEvent = (JobModelUpdateReconcilerEvent) event;
            handleJobCompletedEvent(jobUpdateEvent.getChangedEntityHolder());
        }
    }, e -> logger.error("Event stream terminated with an error", e), () -> logger.info("Event stream completed"));
    reconciliationFramework.start();
}
Also used : Arrays(java.util.Arrays) JobCompatibility(com.netflix.titus.api.jobmanager.model.job.JobCompatibility) TitusChangeAction(com.netflix.titus.master.jobmanager.service.common.action.TitusChangeAction) Task(com.netflix.titus.api.jobmanager.model.job.Task) LoggerFactory(org.slf4j.LoggerFactory) BasicServiceJobActions(com.netflix.titus.master.jobmanager.service.service.action.BasicServiceJobActions) StringExt(com.netflix.titus.common.util.StringExt) ReactorExt(com.netflix.titus.common.util.rx.ReactorExt) JobStatus(com.netflix.titus.api.jobmanager.model.job.JobStatus) PreDestroy(javax.annotation.PreDestroy) FeatureActivationConfiguration(com.netflix.titus.api.FeatureActivationConfiguration) Map(java.util.Map) JobState(com.netflix.titus.api.jobmanager.model.job.JobState) BasicJobActions(com.netflix.titus.master.jobmanager.service.common.action.task.BasicJobActions) JobEntityHolders(com.netflix.titus.master.jobmanager.service.common.action.JobEntityHolders) JobStore(com.netflix.titus.api.jobmanager.store.JobStore) CallMetadata(com.netflix.titus.api.model.callmetadata.CallMetadata) FunctionExt.alwaysTrue(com.netflix.titus.common.util.FunctionExt.alwaysTrue) JobNewModelReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobModelReconcilerEvent.JobNewModelReconcilerEvent) ImmutableSet(com.google.common.collect.ImmutableSet) Job(com.netflix.titus.api.jobmanager.model.job.Job) ImmutableMap(com.google.common.collect.ImmutableMap) Predicate(java.util.function.Predicate) TaskStatus(com.netflix.titus.api.jobmanager.model.job.TaskStatus) Set(java.util.Set) JobFunctions(com.netflix.titus.api.jobmanager.model.job.JobFunctions) UUID(java.util.UUID) JobManagerEvent(com.netflix.titus.api.jobmanager.model.job.event.JobManagerEvent) Collectors(java.util.stream.Collectors) TaskState(com.netflix.titus.api.jobmanager.model.job.TaskState) ProtobufExt(com.netflix.titus.common.util.ProtobufExt) List(java.util.List) 
JobModelUpdateReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobModelReconcilerEvent.JobModelUpdateReconcilerEvent) Stream(java.util.stream.Stream) TaskUpdateEvent(com.netflix.titus.api.jobmanager.model.job.event.TaskUpdateEvent) ReconciliationEngine(com.netflix.titus.common.framework.reconciler.ReconciliationEngine) DisruptionBudget(com.netflix.titus.api.jobmanager.model.job.disruptionbudget.DisruptionBudget) ProxyConfiguration(com.netflix.titus.common.util.guice.annotation.ProxyConfiguration) Optional(java.util.Optional) JobManagerReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent) JobAttributes(com.netflix.titus.api.jobmanager.JobAttributes) ObservableExt(com.netflix.titus.common.util.rx.ObservableExt) Clock(com.netflix.titus.common.util.time.Clock) Subscription(rx.Subscription) KillInitiatedActions(com.netflix.titus.master.jobmanager.service.common.action.task.KillInitiatedActions) Completable(rx.Completable) JobManagerConstants(com.netflix.titus.api.jobmanager.service.JobManagerConstants) EntitySanitizer(com.netflix.titus.common.model.sanitizer.EntitySanitizer) ServiceJobProcesses(com.netflix.titus.api.jobmanager.model.job.ServiceJobProcesses) MoveTaskBetweenJobsAction(com.netflix.titus.master.jobmanager.service.service.action.MoveTaskBetweenJobsAction) ProxyType(com.netflix.titus.common.util.guice.ProxyType) MetricConstants(com.netflix.titus.master.MetricConstants) Singleton(javax.inject.Singleton) Function(java.util.function.Function) ArrayList(java.util.ArrayList) Observable(rx.Observable) Inject(javax.inject.Inject) CallMetadataUtils(com.netflix.titus.runtime.endpoint.metadata.CallMetadataUtils) Pair(com.netflix.titus.common.util.tuple.Pair) Model(com.netflix.titus.common.framework.reconciler.ModelActionHolder.Model) ChangeAction(com.netflix.titus.common.framework.reconciler.ChangeAction) JobManagerException(com.netflix.titus.api.jobmanager.service.JobManagerException) Named(javax.inject.Named) 
BackpressureOverflow(rx.BackpressureOverflow) JobDescriptor(com.netflix.titus.api.jobmanager.model.job.JobDescriptor) JobCheckpointReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobCheckpointReconcilerEvent) Logger(org.slf4j.Logger) JobUpdateEvent(com.netflix.titus.api.jobmanager.model.job.event.JobUpdateEvent) ServiceJobExt(com.netflix.titus.api.jobmanager.model.job.ext.ServiceJobExt) Mono(reactor.core.publisher.Mono) GrpcJobManagementModelConverters(com.netflix.titus.runtime.endpoint.v3.grpc.GrpcJobManagementModelConverters) ManagementSubsystemInitializer(com.netflix.titus.master.service.management.ManagementSubsystemInitializer) JOB_STRICT_SANITIZER(com.netflix.titus.api.jobmanager.model.job.sanitizer.JobSanitizerBuilder.JOB_STRICT_SANITIZER) EntityHolder(com.netflix.titus.common.framework.reconciler.EntityHolder) Activator(com.netflix.titus.common.util.guice.annotation.Activator) AtomicLong(java.util.concurrent.atomic.AtomicLong) ModelActionHolder(com.netflix.titus.common.framework.reconciler.ModelActionHolder) V3JobOperations(com.netflix.titus.api.jobmanager.service.V3JobOperations) TaskAttributes(com.netflix.titus.api.jobmanager.TaskAttributes) CapacityAttributes(com.netflix.titus.api.jobmanager.model.job.CapacityAttributes) ReconciliationFramework(com.netflix.titus.common.framework.reconciler.ReconciliationFramework) BasicTaskActions(com.netflix.titus.master.jobmanager.service.common.action.task.BasicTaskActions) JobSubmitLimiter(com.netflix.titus.master.jobmanager.service.limiter.JobSubmitLimiter) PolledMeter(com.netflix.spectator.api.patterns.PolledMeter) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) Evaluators(com.netflix.titus.common.util.Evaluators) Collections(java.util.Collections)
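
The enterActiveMode example publishes the age of the last event stream error as a gauge, so a permanently broken stream can at least be detected. The value function can be sketched on its own, without Spectator or Titus on the classpath. This is only an illustration: the class LastErrorAge, the method ageMs, and the fixed clock are hypothetical names, and the unit is assumed to be milliseconds.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration only; not part of titus-control-plane.
class LastErrorAge {

    // Returns 0 while no error timestamp has been recorded,
    // otherwise the time elapsed between that timestamp and "now".
    static long ageMs(AtomicLong lastErrorTimestamp, LongSupplier wallTime) {
        long recorded = lastErrorTimestamp.get();
        return recorded <= 0 ? 0 : wallTime.getAsLong() - recorded;
    }

    public static void main(String[] args) {
        AtomicLong lastError = new AtomicLong();   // starts at 0: no error seen yet
        LongSupplier fixedClock = () -> 10_000L;   // assumed fixed "now", for a repeatable demo

        System.out.println(ageMs(lastError, fixedClock)); // no error recorded

        lastError.set(4_000L);                     // an error was recorded at t=4000
        System.out.println(ageMs(lastError, fixedClock)); // elapsed time since that error
    }
}
```

Unlike the lambda in the example, this sketch reads the AtomicLong once per evaluation, so the check and the subtraction see the same value even if another thread updates it in between.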

Example 5 with JobManagerReconcilerEvent

Use of com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent in project titus-control-plane by Netflix.

Class DefaultV3JobOperations, method updateTask.

@Override
public Completable updateTask(String taskId, Function<Task, Optional<Task>> changeFunction, Trigger trigger, String reason, CallMetadata callMetadata) {
    Optional<ReconciliationEngine<JobManagerReconcilerEvent>> engineOpt = reconciliationFramework.findEngineByChildId(taskId).map(Pair::getLeft);
    if (!engineOpt.isPresent()) {
        return Completable.error(JobManagerException.taskNotFound(taskId));
    }
    ReconciliationEngine<JobManagerReconcilerEvent> engine = engineOpt.get();
    TitusChangeAction changeAction = BasicTaskActions.updateTaskInRunningModel(taskId, trigger, jobManagerConfiguration, engine, changeFunction, reason, versionSupplier, titusRuntime, callMetadata);
    return engine.changeReferenceModel(changeAction, taskId).toCompletable();
}
Also used : ReconciliationEngine(com.netflix.titus.common.framework.reconciler.ReconciliationEngine) TitusChangeAction(com.netflix.titus.master.jobmanager.service.common.action.TitusChangeAction) JobManagerReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent) Pair(com.netflix.titus.common.util.tuple.Pair)
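
The updateTask example looks up the owning engine as an Optional and, when it is absent, returns the failure through the asynchronous result instead of throwing. A minimal sketch of that shape using only the JDK follows. Everything in it is an assumption made for illustration: TaskUpdateSketch, its in-memory map, and the use of CompletableFuture in place of rx.Completable are not taken from Titus, and the change function is simplified to a UnaryOperator.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the engine lookup; none of these names come from Titus.
class TaskUpdateSketch {
    private final Map<String, String> taskStateById = new HashMap<>();

    void put(String taskId, String state) {
        taskStateById.put(taskId, state);
    }

    Optional<String> find(String taskId) {
        return Optional.ofNullable(taskStateById.get(taskId));
    }

    // Applies the change if the task exists; otherwise returns an already-failed future.
    CompletableFuture<Void> update(String taskId, UnaryOperator<String> change) {
        Optional<String> current = find(taskId);
        if (!current.isPresent()) {
            CompletableFuture<Void> failed = new CompletableFuture<>();
            failed.completeExceptionally(new IllegalArgumentException("Task not found: " + taskId));
            return failed;
        }
        taskStateById.put(taskId, change.apply(current.get()));
        return CompletableFuture.completedFuture(null);
    }

    public static void main(String[] args) {
        TaskUpdateSketch sketch = new TaskUpdateSketch();
        sketch.put("task-1", "Accepted");

        // Known task: the update is applied and the future completes normally.
        System.out.println(sketch.update("task-1", state -> "Started").isCompletedExceptionally());
        // Unknown task: the caller receives a failed future rather than an exception.
        System.out.println(sketch.update("task-2", state -> "Started").isCompletedExceptionally());
    }
}
```

Returning the error inside the result keeps the caller's handling in one place: subscribers see a missing task the same way they see any other failed update.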

Aggregations

JobManagerReconcilerEvent (com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent): 9
Job (com.netflix.titus.api.jobmanager.model.job.Job): 7
EntityHolder (com.netflix.titus.common.framework.reconciler.EntityHolder): 7
ReconciliationEngine (com.netflix.titus.common.framework.reconciler.ReconciliationEngine): 7
TitusChangeAction (com.netflix.titus.master.jobmanager.service.common.action.TitusChangeAction): 7
ArrayList (java.util.ArrayList): 7
List (java.util.List): 7
Task (com.netflix.titus.api.jobmanager.model.job.Task): 6
Optional (java.util.Optional): 6
JobFunctions (com.netflix.titus.api.jobmanager.model.job.JobFunctions): 5
JobState (com.netflix.titus.api.jobmanager.model.job.JobState): 5
TaskState (com.netflix.titus.api.jobmanager.model.job.TaskState): 5
TaskStatus (com.netflix.titus.api.jobmanager.model.job.TaskStatus): 5
ServiceJobExt (com.netflix.titus.api.jobmanager.model.job.ext.ServiceJobExt): 5
V3JobOperations (com.netflix.titus.api.jobmanager.service.V3JobOperations): 5
JobStore (com.netflix.titus.api.jobmanager.store.JobStore): 5
CallMetadata (com.netflix.titus.api.model.callmetadata.CallMetadata): 5
ModelActionHolder (com.netflix.titus.common.framework.reconciler.ModelActionHolder): 5
TitusRuntime (com.netflix.titus.common.runtime.TitusRuntime): 5
JobStatus (com.netflix.titus.api.jobmanager.model.job.JobStatus): 4