Example 1 with Activator

Use of com.netflix.titus.common.util.guice.annotation.Activator in project titus-control-plane by Netflix.

In the class DefaultEvictionOperations, method enterActiveMode.

@Activator
public void enterActiveMode() {
    this.quotEventEmitter = new QuotaEventEmitter(configuration, jobOperations, quotaManager, titusRuntime);
    this.taskTerminationExecutor = new TaskTerminationExecutor(configuration, jobOperations, quotaManager, titusRuntime, scheduler);
}
Also used : QuotaEventEmitter(com.netflix.titus.master.eviction.service.quota.QuotaEventEmitter) Activator(com.netflix.titus.common.util.guice.annotation.Activator)
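
As a rough illustration of the pattern in Example 1, a method-level marker annotation can be discovered and invoked by reflection when a service should become active. This is a minimal sketch and not the Titus implementation: the `Activate` annotation, `LazyService`, and `ActivationRunner` are hypothetical names introduced only for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-in for an activation marker; NOT the Titus @Activator itself.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Activate {
}

// Hypothetical service that allocates its helper only when activated.
class LazyService {
    private StringBuilder buffer; // stays null until activation

    @Activate
    public void enterActiveMode() {
        this.buffer = new StringBuilder("active");
    }

    boolean isActive() {
        return buffer != null;
    }
}

class ActivationRunner {
    // Invokes every public no-arg method marked with @Activate and returns how many ran.
    static int activate(Object service) {
        int invoked = 0;
        for (Method method : service.getClass().getMethods()) {
            if (method.isAnnotationPresent(Activate.class) && method.getParameterCount() == 0) {
                try {
                    method.setAccessible(true);
                    method.invoke(service);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Activation failed: " + method.getName(), e);
                }
                invoked++;
            }
        }
        return invoked;
    }
}
```

Calling `ActivationRunner.activate(new LazyService())` runs `enterActiveMode` once; until then the service holds no helper objects, which mirrors how Example 1 defers creating its collaborators.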

Example 2 with Activator

Use of com.netflix.titus.common.util.guice.annotation.Activator in project titus-control-plane by Netflix.

In the class KubeAndJobServiceSyncStatusWatcher, method enterActiveMode.

@Activator
public Observable<Void> enterActiveMode() {
    try {
        kubeApiFacade.getPodInformer().addEventHandler(new ResourceEventHandler<V1Pod>() {

            @Override
            public void onAdd(V1Pod obj) {
                capturedState.put(obj.getMetadata().getName(), new TaskHolder(obj, false));
            }

            @Override
            public void onUpdate(V1Pod oldObj, V1Pod newObj) {
                capturedState.put(newObj.getMetadata().getName(), new TaskHolder(newObj, false));
            }

            @Override
            public void onDelete(V1Pod obj, boolean deletedFinalStateUnknown) {
                capturedState.put(obj.getMetadata().getName(), new TaskHolder(obj, true));
            }
        });
        ScheduleDescriptor scheduleDescriptor = ScheduleDescriptor.newBuilder()
                .withName(KubeAndJobServiceSyncStatusWatcher.class.getSimpleName())
                .withDescription("Compare Kube pod state with Titus job service")
                .withInitialDelay(Duration.ofSeconds(60))
                .withInterval(Duration.ofSeconds(10))
                .withTimeout(Duration.ofSeconds(60))
                .build();
        this.schedulerRef = titusRuntime.getLocalScheduler().schedule(scheduleDescriptor, this::process, ExecutorsExt.namedSingleThreadExecutor(KubeAndJobServiceSyncStatusWatcher.class.getSimpleName()));
    } catch (Exception e) {
        return Observable.error(e);
    }
    return Observable.empty();
}
Also used : ScheduleDescriptor(com.netflix.titus.common.framework.scheduler.model.ScheduleDescriptor) V1Pod(io.kubernetes.client.openapi.models.V1Pod) Activator(com.netflix.titus.common.util.guice.annotation.Activator)
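
The event handler in Example 2 records a deletion as a flagged entry instead of removing the pod from the map. A minimal sketch of that bookkeeping, using only the JDK, might look like the following. `PodStateTracker` and its methods are hypothetical names, and a plain `Boolean` stands in for the `TaskHolder` value used in the original.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker; names are illustrative and not part of Titus.
class PodStateTracker {
    // Value is true when the pod was last reported as deleted.
    private final Map<String, Boolean> capturedState = new ConcurrentHashMap<>();

    void onAdd(String podName) {
        capturedState.put(podName, false);
    }

    void onUpdate(String podName) {
        capturedState.put(podName, false);
    }

    // Deletion keeps the entry and flags it instead of removing it.
    void onDelete(String podName) {
        capturedState.put(podName, true);
    }

    boolean isKnown(String podName) {
        return capturedState.containsKey(podName);
    }

    boolean isDeleted(String podName) {
        return capturedState.getOrDefault(podName, false);
    }
}
```

With this layout, a periodic check can still find a deleted pod by name and distinguish it from one that was never observed.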

Example 3 with Activator

Use of com.netflix.titus.common.util.guice.annotation.Activator in project titus-control-plane by Netflix.

In the class DefaultV3JobOperations, method enterActiveMode.

@Activator
public void enterActiveMode() {
    this.reconciliationFramework = jobReconciliationFrameworkFactory.newInstance();
    // BUG: event stream breaks permanently, and cannot be retried.
    // As we cannot fix the underlying issue yet, we have to be able to discover when it happens.
    AtomicLong eventStreamLastError = new AtomicLong();
    Clock clock = titusRuntime.getClock();
    this.transactionLoggerSubscription = JobTransactionLogger.logEvents(reconciliationFramework, eventStreamLastError, clock);
    PolledMeter.using(titusRuntime.getRegistry()).withName(METRIC_EVENT_STREAM_LAST_ERROR).monitorValue(eventStreamLastError, value -> value.get() <= 0 ? 0 : clock.wallTime() - value.get());
    // Remove finished jobs from the reconciliation framework.
    Observable<JobManagerReconcilerEvent> reconciliationEventsObservable = reconciliationFramework.events().onBackpressureBuffer(OBSERVE_JOBS_BACKPRESSURE_BUFFER_SIZE, () -> logger.warn("Overflowed the buffer size: " + OBSERVE_JOBS_BACKPRESSURE_BUFFER_SIZE), BackpressureOverflow.ON_OVERFLOW_ERROR).doOnSubscribe(() -> {
        List<EntityHolder> entityHolders = reconciliationFramework.orderedView(IndexKind.StatusCreationTime);
        for (EntityHolder entityHolder : entityHolders) {
            handleJobCompletedEvent(entityHolder);
        }
    });
    this.reconcilerEventSubscription = titusRuntime.persistentStream(reconciliationEventsObservable).subscribe(event -> {
        if (event instanceof JobModelUpdateReconcilerEvent) {
            JobModelUpdateReconcilerEvent jobUpdateEvent = (JobModelUpdateReconcilerEvent) event;
            handleJobCompletedEvent(jobUpdateEvent.getChangedEntityHolder());
        }
    }, e -> logger.error("Event stream terminated with an error", e), () -> logger.info("Event stream completed"));
    reconciliationFramework.start();
}
Also used : Arrays(java.util.Arrays) JobCompatibility(com.netflix.titus.api.jobmanager.model.job.JobCompatibility) TitusChangeAction(com.netflix.titus.master.jobmanager.service.common.action.TitusChangeAction) Task(com.netflix.titus.api.jobmanager.model.job.Task) LoggerFactory(org.slf4j.LoggerFactory) BasicServiceJobActions(com.netflix.titus.master.jobmanager.service.service.action.BasicServiceJobActions) StringExt(com.netflix.titus.common.util.StringExt) ReactorExt(com.netflix.titus.common.util.rx.ReactorExt) JobStatus(com.netflix.titus.api.jobmanager.model.job.JobStatus) PreDestroy(javax.annotation.PreDestroy) FeatureActivationConfiguration(com.netflix.titus.api.FeatureActivationConfiguration) Map(java.util.Map) JobState(com.netflix.titus.api.jobmanager.model.job.JobState) BasicJobActions(com.netflix.titus.master.jobmanager.service.common.action.task.BasicJobActions) JobEntityHolders(com.netflix.titus.master.jobmanager.service.common.action.JobEntityHolders) JobStore(com.netflix.titus.api.jobmanager.store.JobStore) CallMetadata(com.netflix.titus.api.model.callmetadata.CallMetadata) FunctionExt.alwaysTrue(com.netflix.titus.common.util.FunctionExt.alwaysTrue) JobNewModelReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobModelReconcilerEvent.JobNewModelReconcilerEvent) ImmutableSet(com.google.common.collect.ImmutableSet) Job(com.netflix.titus.api.jobmanager.model.job.Job) ImmutableMap(com.google.common.collect.ImmutableMap) Predicate(java.util.function.Predicate) TaskStatus(com.netflix.titus.api.jobmanager.model.job.TaskStatus) Set(java.util.Set) JobFunctions(com.netflix.titus.api.jobmanager.model.job.JobFunctions) UUID(java.util.UUID) JobManagerEvent(com.netflix.titus.api.jobmanager.model.job.event.JobManagerEvent) Collectors(java.util.stream.Collectors) TaskState(com.netflix.titus.api.jobmanager.model.job.TaskState) ProtobufExt(com.netflix.titus.common.util.ProtobufExt) List(java.util.List) 
JobModelUpdateReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobModelReconcilerEvent.JobModelUpdateReconcilerEvent) Stream(java.util.stream.Stream) TaskUpdateEvent(com.netflix.titus.api.jobmanager.model.job.event.TaskUpdateEvent) ReconciliationEngine(com.netflix.titus.common.framework.reconciler.ReconciliationEngine) DisruptionBudget(com.netflix.titus.api.jobmanager.model.job.disruptionbudget.DisruptionBudget) ProxyConfiguration(com.netflix.titus.common.util.guice.annotation.ProxyConfiguration) Optional(java.util.Optional) JobManagerReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent) JobAttributes(com.netflix.titus.api.jobmanager.JobAttributes) ObservableExt(com.netflix.titus.common.util.rx.ObservableExt) Clock(com.netflix.titus.common.util.time.Clock) Subscription(rx.Subscription) KillInitiatedActions(com.netflix.titus.master.jobmanager.service.common.action.task.KillInitiatedActions) Completable(rx.Completable) JobManagerConstants(com.netflix.titus.api.jobmanager.service.JobManagerConstants) EntitySanitizer(com.netflix.titus.common.model.sanitizer.EntitySanitizer) ServiceJobProcesses(com.netflix.titus.api.jobmanager.model.job.ServiceJobProcesses) MoveTaskBetweenJobsAction(com.netflix.titus.master.jobmanager.service.service.action.MoveTaskBetweenJobsAction) ProxyType(com.netflix.titus.common.util.guice.ProxyType) MetricConstants(com.netflix.titus.master.MetricConstants) Singleton(javax.inject.Singleton) Function(java.util.function.Function) ArrayList(java.util.ArrayList) Observable(rx.Observable) Inject(javax.inject.Inject) CallMetadataUtils(com.netflix.titus.runtime.endpoint.metadata.CallMetadataUtils) Pair(com.netflix.titus.common.util.tuple.Pair) Model(com.netflix.titus.common.framework.reconciler.ModelActionHolder.Model) ChangeAction(com.netflix.titus.common.framework.reconciler.ChangeAction) JobManagerException(com.netflix.titus.api.jobmanager.service.JobManagerException) Named(javax.inject.Named) 
BackpressureOverflow(rx.BackpressureOverflow) JobDescriptor(com.netflix.titus.api.jobmanager.model.job.JobDescriptor) JobCheckpointReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobCheckpointReconcilerEvent) Logger(org.slf4j.Logger) JobUpdateEvent(com.netflix.titus.api.jobmanager.model.job.event.JobUpdateEvent) ServiceJobExt(com.netflix.titus.api.jobmanager.model.job.ext.ServiceJobExt) Mono(reactor.core.publisher.Mono) GrpcJobManagementModelConverters(com.netflix.titus.runtime.endpoint.v3.grpc.GrpcJobManagementModelConverters) ManagementSubsystemInitializer(com.netflix.titus.master.service.management.ManagementSubsystemInitializer) JOB_STRICT_SANITIZER(com.netflix.titus.api.jobmanager.model.job.sanitizer.JobSanitizerBuilder.JOB_STRICT_SANITIZER) EntityHolder(com.netflix.titus.common.framework.reconciler.EntityHolder) Activator(com.netflix.titus.common.util.guice.annotation.Activator) AtomicLong(java.util.concurrent.atomic.AtomicLong) ModelActionHolder(com.netflix.titus.common.framework.reconciler.ModelActionHolder) V3JobOperations(com.netflix.titus.api.jobmanager.service.V3JobOperations) TaskAttributes(com.netflix.titus.api.jobmanager.TaskAttributes) CapacityAttributes(com.netflix.titus.api.jobmanager.model.job.CapacityAttributes) ReconciliationFramework(com.netflix.titus.common.framework.reconciler.ReconciliationFramework) BasicTaskActions(com.netflix.titus.master.jobmanager.service.common.action.task.BasicTaskActions) JobSubmitLimiter(com.netflix.titus.master.jobmanager.service.limiter.JobSubmitLimiter) PolledMeter(com.netflix.spectator.api.patterns.PolledMeter) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) Evaluators(com.netflix.titus.common.util.Evaluators) Collections(java.util.Collections) AtomicLong(java.util.concurrent.atomic.AtomicLong) JobModelUpdateReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobModelReconcilerEvent.JobModelUpdateReconcilerEvent) 
EntityHolder(com.netflix.titus.common.framework.reconciler.EntityHolder) Clock(com.netflix.titus.common.util.time.Clock) JobManagerReconcilerEvent(com.netflix.titus.master.jobmanager.service.event.JobManagerReconcilerEvent) Activator(com.netflix.titus.common.util.guice.annotation.Activator)
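
The gauge registered in Example 3 reports how long ago the event stream last failed, or 0 if no failure has been recorded. The arithmetic can be isolated in a small JDK-only sketch. `LastErrorAge` is a hypothetical helper, and a `LongSupplier` stands in for the Titus `Clock`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper mirroring the value function passed to monitorValue in Example 3.
class LastErrorAge {
    // Returns 0 while no error timestamp is recorded, otherwise the time elapsed since it.
    static long ageMs(AtomicLong lastErrorTimestamp, LongSupplier wallTime) {
        long recorded = lastErrorTimestamp.get();
        return recorded <= 0 ? 0 : wallTime.getAsLong() - recorded;
    }
}
```

For instance, with a recorded timestamp of 400 and a clock reading of 1000 the helper returns 600, while an unset timestamp always yields 0.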

Example 4 with Activator

Use of com.netflix.titus.common.util.guice.annotation.Activator in project titus-control-plane by Netflix.

In the class KubeNotificationProcessor, method enterActiveMode.

@Activator
public void enterActiveMode() {
    this.scheduler = initializeNotificationScheduler();
    AtomicLong pendingCounter = new AtomicLong();
    this.subscription = kubeApiServerIntegrator.events().mergeWith(kubeJobManagementReconciler.getPodEventSource()).subscribeOn(scheduler).publishOn(scheduler).doOnError(error -> logger.warn("Kube integration event stream terminated with an error (retrying soon)", error)).retryWhen(Retry.backoff(Long.MAX_VALUE, Duration.ofSeconds(1))).subscribe(event -> {
        Stopwatch stopwatch = Stopwatch.createStarted();
        pendingCounter.getAndIncrement();
        metricsRunning.set(pendingCounter.get());
        metricsLag.set(PodEvent.nextSequence() - event.getSequenceNumber());
        logger.info("New event [pending={}, lag={}]: {}", pendingCounter.get(), PodEvent.nextSequence() - event.getSequenceNumber(), event);
        processEvent(event).doAfterTerminate(() -> {
            pendingCounter.decrementAndGet();
            long elapsed = stopwatch.elapsed(TimeUnit.MILLISECONDS);
            metricsProcessed.record(elapsed, TimeUnit.MILLISECONDS);
            metricsRunning.set(pendingCounter.get());
            logger.info("Event processed [pending={}]: event={}, elapsed={}", pendingCounter.get(), event, elapsed);
        }).subscribe(next -> {
            // nothing
        }, error -> {
            logger.info("Kube notification event state update error: event={}, error={}", event, error.getMessage());
            logger.debug("Stack trace", error);
        }, () -> {
            // nothing
        });
    }, e -> logger.error("Event stream terminated", e), () -> logger.info("Event stream completed"));
}
Also used : Retry(reactor.util.retry.Retry) Task(com.netflix.titus.api.jobmanager.model.job.Task) CollectionsExt(com.netflix.titus.common.util.CollectionsExt) LoggerFactory(org.slf4j.LoggerFactory) V1PodStatus(io.kubernetes.client.openapi.models.V1PodStatus) ReactorExt(com.netflix.titus.common.util.rx.ReactorExt) KubeUtil(com.netflix.titus.master.kubernetes.KubeUtil) TITUS_NODE_DOMAIN(com.netflix.titus.runtime.kubernetes.KubeConstants.TITUS_NODE_DOMAIN) Duration(java.time.Duration) Map(java.util.Map) DirectKubeApiServerIntegrator(com.netflix.titus.master.kubernetes.client.DirectKubeApiServerIntegrator) Either(com.netflix.titus.common.util.tuple.Either) CallMetadata(com.netflix.titus.api.model.callmetadata.CallMetadata) PodEvent(com.netflix.titus.master.kubernetes.client.model.PodEvent) Job(com.netflix.titus.api.jobmanager.model.job.Job) TaskStatus(com.netflix.titus.api.jobmanager.model.job.TaskStatus) JobFunctions(com.netflix.titus.api.jobmanager.model.job.JobFunctions) TaskState(com.netflix.titus.api.jobmanager.model.job.TaskState) PodNotFoundEvent(com.netflix.titus.master.kubernetes.client.model.PodNotFoundEvent) Timer(com.netflix.spectator.api.Timer) List(java.util.List) Optional(java.util.Optional) PodWrapper(com.netflix.titus.master.kubernetes.client.model.PodWrapper) Gauge(com.netflix.spectator.api.Gauge) Disposable(reactor.core.Disposable) Stopwatch(com.google.common.base.Stopwatch) PodDeletedEvent(com.netflix.titus.master.kubernetes.client.model.PodDeletedEvent) Counter(com.netflix.spectator.api.Counter) HashMap(java.util.HashMap) MetricConstants(com.netflix.titus.master.MetricConstants) V1Node(io.kubernetes.client.openapi.models.V1Node) Singleton(javax.inject.Singleton) Scheduler(reactor.core.scheduler.Scheduler) ArrayList(java.util.ArrayList) Inject(javax.inject.Inject) Pair(com.netflix.titus.common.util.tuple.Pair) ContainerResultCodeResolver(com.netflix.titus.master.kubernetes.ContainerResultCodeResolver) Schedulers(reactor.core.scheduler.Schedulers) 
Evaluators.acceptNotNull(com.netflix.titus.common.util.Evaluators.acceptNotNull) KubeJobManagementReconciler(com.netflix.titus.master.kubernetes.controller.KubeJobManagementReconciler) ExecutorService(java.util.concurrent.ExecutorService) ExecutorsExt(com.netflix.titus.common.util.ExecutorsExt) Logger(org.slf4j.Logger) PodUpdatedEvent(com.netflix.titus.master.kubernetes.client.model.PodUpdatedEvent) Mono(reactor.core.publisher.Mono) Activator(com.netflix.titus.common.util.guice.annotation.Activator) TimeUnit(java.util.concurrent.TimeUnit) AtomicLong(java.util.concurrent.atomic.AtomicLong) ExecutableStatus(com.netflix.titus.api.jobmanager.model.job.ExecutableStatus) V3JobOperations(com.netflix.titus.api.jobmanager.service.V3JobOperations) TaskAttributes(com.netflix.titus.api.jobmanager.TaskAttributes) PodToTaskMapper(com.netflix.titus.master.kubernetes.PodToTaskMapper) V1ContainerState(io.kubernetes.client.openapi.models.V1ContainerState) VisibleForTesting(com.google.common.annotations.VisibleForTesting) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) Comparator(java.util.Comparator) Evaluators(com.netflix.titus.common.util.Evaluators) AtomicLong(java.util.concurrent.atomic.AtomicLong) Stopwatch(com.google.common.base.Stopwatch) Activator(com.netflix.titus.common.util.guice.annotation.Activator)
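
Example 4 counts an event as pending from the moment it arrives until its processing terminates, whether it succeeds or fails. The sketch below shows the same bookkeeping with the JDK's `CompletableFuture` swapped in for Reactor's `Mono`, so it is an analogy rather than the original code. `PendingTracker` is a hypothetical name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical tracker; CompletableFuture replaces the Reactor types used in Example 4.
class PendingTracker {
    private final AtomicLong pending = new AtomicLong();

    // Counts the task as pending until its future completes, normally or exceptionally.
    <T> CompletableFuture<T> track(CompletableFuture<T> task) {
        pending.incrementAndGet();
        return task.whenComplete((result, error) -> pending.decrementAndGet());
    }

    long pending() {
        return pending.get();
    }
}
```

Decrementing in a completion callback that runs on both outcomes plays the role that `doAfterTerminate` plays in the example, so a failed event does not leave the counter stuck.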

Example 5 with Activator

Use of com.netflix.titus.common.util.guice.annotation.Activator in project titus-control-plane by Netflix.

In the class DefaultFabric8IOConnector, method enterActiveMode.

@Activator
public void enterActiveMode() {
    logger.info("Kube api connector entering active mode");
    this.scheduler = initializeNotificationScheduler();
    AtomicLong pendingCounter = new AtomicLong();
    this.subscription = this.events().subscribeOn(scheduler).publishOn(scheduler).doOnError(error -> logger.warn("Kube integration event stream terminated with an error (retrying soon)", error)).retryWhen(Retry.backoff(Long.MAX_VALUE, Duration.ofSeconds(1))).subscribe(event -> {
        Stopwatch stopwatch = Stopwatch.createStarted();
        pendingCounter.getAndIncrement();
        logger.info("New event [pending={}, lag={}]: {}", pendingCounter.get(), PodEvent.nextSequence() - event.getSequenceNumber(), event);
        processEvent(event).doAfterTerminate(() -> {
            pendingCounter.decrementAndGet();
            long elapsed = stopwatch.elapsed(TimeUnit.MILLISECONDS);
            logger.info("Event processed [pending={}]: event={}, elapsed={}", pendingCounter.get(), event, elapsed);
        }).subscribe(next -> {
            // nothing
        }, error -> {
            logger.info("Kube api connector event state update error: event={}, error={}", event, error.getMessage());
            logger.debug("Stack trace", error);
        }, () -> {
            // nothing
        });
    }, e -> logger.error("Event stream terminated", e), () -> logger.info("Event stream completed"));
}
Also used : Disposable(reactor.core.Disposable) Retry(reactor.util.retry.Retry) Stopwatch(com.google.common.base.Stopwatch) ResourceEventHandler(io.fabric8.kubernetes.client.informers.ResourceEventHandler) SharedIndexInformer(io.fabric8.kubernetes.client.informers.SharedIndexInformer) LoggerFactory(org.slf4j.LoggerFactory) StringExt(com.netflix.titus.common.util.StringExt) Singleton(javax.inject.Singleton) ReactorExt(com.netflix.titus.common.util.rx.ReactorExt) Scheduler(reactor.core.scheduler.Scheduler) PodUpdatedEvent(com.netflix.titus.runtime.connector.kubernetes.fabric8io.model.PodUpdatedEvent) Inject(javax.inject.Inject) PreDestroy(javax.annotation.PreDestroy) NamespacedKubernetesClient(io.fabric8.kubernetes.client.NamespacedKubernetesClient) Duration(java.time.Duration) Map(java.util.Map) ExceptionExt(com.netflix.titus.common.util.ExceptionExt) Schedulers(reactor.core.scheduler.Schedulers) HashTreePMap(org.pcollections.HashTreePMap) Deactivator(com.netflix.titus.common.util.guice.annotation.Deactivator) ExecutorService(java.util.concurrent.ExecutorService) Node(io.fabric8.kubernetes.api.model.Node) SharedInformerFactory(io.fabric8.kubernetes.client.informers.SharedInformerFactory) ExecutorsExt(com.netflix.titus.common.util.ExecutorsExt) Logger(org.slf4j.Logger) Pod(io.fabric8.kubernetes.api.model.Pod) Mono(reactor.core.publisher.Mono) PodDeletedEvent(com.netflix.titus.runtime.connector.kubernetes.fabric8io.model.PodDeletedEvent) Activator(com.netflix.titus.common.util.guice.annotation.Activator) TimeUnit(java.util.concurrent.TimeUnit) F8KubeObjectFormatter.formatPodEssentials(com.netflix.titus.runtime.connector.kubernetes.fabric8io.model.F8KubeObjectFormatter.formatPodEssentials) AtomicLong(java.util.concurrent.atomic.AtomicLong) Flux(reactor.core.publisher.Flux) Optional(java.util.Optional) PodEvent(com.netflix.titus.runtime.connector.kubernetes.fabric8io.model.PodEvent) VisibleForTesting(com.google.common.annotations.VisibleForTesting) 
TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) Evaluators(com.netflix.titus.common.util.Evaluators) PMap(org.pcollections.PMap) AtomicLong(java.util.concurrent.atomic.AtomicLong) Stopwatch(com.google.common.base.Stopwatch) Activator(com.netflix.titus.common.util.guice.annotation.Activator)

Aggregations

Activator (com.netflix.titus.common.util.guice.annotation.Activator): 9
TitusRuntime (com.netflix.titus.common.runtime.TitusRuntime): 6
Evaluators (com.netflix.titus.common.util.Evaluators): 6
Logger (org.slf4j.Logger): 6
LoggerFactory (org.slf4j.LoggerFactory): 6
ExecutorsExt (com.netflix.titus.common.util.ExecutorsExt): 5
Duration (java.time.Duration): 5
List (java.util.List): 5
PreDestroy (javax.annotation.PreDestroy): 5
Inject (javax.inject.Inject): 5
Singleton (javax.inject.Singleton): 5
Gauge (com.netflix.spectator.api.Gauge): 4
Task (com.netflix.titus.api.jobmanager.model.job.Task): 4
TaskState (com.netflix.titus.api.jobmanager.model.job.TaskState): 4
V3JobOperations (com.netflix.titus.api.jobmanager.service.V3JobOperations): 4
ScheduleDescriptor (com.netflix.titus.common.framework.scheduler.model.ScheduleDescriptor): 4
Deactivator (com.netflix.titus.common.util.guice.annotation.Deactivator): 4
ReactorExt (com.netflix.titus.common.util.rx.ReactorExt): 4
MetricConstants (com.netflix.titus.master.MetricConstants): 4
ArrayList (java.util.ArrayList): 4