
Example 1 with V3JobQueryCriteriaEvaluator

Use of com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator in the titus-control-plane project by Netflix, in the method tryInitialize of the class ObserveJobsSubscription.

private boolean tryInitialize() {
    ObserveJobsQuery query = getLastObserveJobsQueryEvent();
    if (query == null) {
        return false;
    }
    Stopwatch start = Stopwatch.createStarted();
    String trxId = UUID.randomUUID().toString();
    CallMetadata callMetadata = context.getCallMetadataResolver().resolve().orElse(CallMetadataConstants.UNDEFINED_CALL_METADATA);
    metrics.observeJobsStarted(trxId, callMetadata);
    JobQueryCriteria<TaskStatus.TaskState, JobDescriptor.JobSpecCase> criteria = toJobQueryCriteria(query);
    V3JobQueryCriteriaEvaluator jobsPredicate = new V3JobQueryCriteriaEvaluator(criteria, titusRuntime);
    V3TaskQueryCriteriaEvaluator tasksPredicate = new V3TaskQueryCriteriaEvaluator(criteria, titusRuntime);
    Observable<JobChangeNotification> eventStream = context.getJobOperations()
            .observeJobs(jobsPredicate, tasksPredicate, true)
            .filter(event -> withArchived || !event.isArchived())
            .observeOn(context.getObserveJobsScheduler())
            .subscribeOn(context.getObserveJobsScheduler(), false)
            .map(event -> GrpcJobManagementModelConverters.toGrpcJobChangeNotification(event, context.getGrpcObjectsCache(), titusRuntime.getClock().wallTime()))
            .compose(ObservableExt.head(() -> {
                List<JobChangeNotification> snapshot = createJobsSnapshot(jobsPredicate, tasksPredicate);
                snapshot.add(SNAPSHOT_END_MARKER);
                return snapshot;
            }))
            .doOnError(e -> logger.error("Unexpected error in jobs event stream", e));
    AtomicBoolean closingProcessed = new AtomicBoolean();
    this.jobServiceSubscription = eventStream.doOnUnsubscribe(() -> {
        if (!closingProcessed.getAndSet(true)) {
            metrics.observeJobsUnsubscribed(trxId, start.elapsed(TimeUnit.MILLISECONDS));
        }
    }).subscribe(event -> {
        metrics.observeJobsEventEmitted(trxId);
        jobServiceEvents.add(event);
        drain();
    }, e -> {
        if (!closingProcessed.getAndSet(true)) {
            metrics.observeJobsError(trxId, start.elapsed(TimeUnit.MILLISECONDS), e);
        }
        jobServiceCompleted = true;
        jobServiceError = new StatusRuntimeException(Status.INTERNAL.withDescription("All jobs monitoring stream terminated with an error").withCause(e));
        drain();
    }, () -> {
        if (!closingProcessed.getAndSet(true)) {
            metrics.observeJobsCompleted(trxId, start.elapsed(TimeUnit.MILLISECONDS));
        }
        jobServiceCompleted = true;
        drain();
    });
    this.grpcStreamInitiated = true;
    return true;
}
Also used : KeepAliveResponse(com.netflix.titus.grpc.protogen.KeepAliveResponse) Stopwatch(com.google.common.base.Stopwatch) ObserveJobsQuery(com.netflix.titus.grpc.protogen.ObserveJobsQuery) Task(com.netflix.titus.api.jobmanager.model.job.Task) CallMetadataConstants(com.netflix.titus.api.model.callmetadata.CallMetadataConstants) LoggerFactory(org.slf4j.LoggerFactory) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) ArrayList(java.util.ArrayList) Observable(rx.Observable) StreamObserver(io.grpc.stub.StreamObserver) Pair(com.netflix.titus.common.util.tuple.Pair) TaskStatus(com.netflix.titus.grpc.protogen.TaskStatus) ExceptionExt(com.netflix.titus.common.util.ExceptionExt) Status(io.grpc.Status) CallMetadata(com.netflix.titus.api.model.callmetadata.CallMetadata) ServerCallStreamObserver(io.grpc.stub.ServerCallStreamObserver) JobDescriptor(com.netflix.titus.grpc.protogen.JobDescriptor) V3TaskQueryCriteriaEvaluator(com.netflix.titus.runtime.endpoint.v3.grpc.query.V3TaskQueryCriteriaEvaluator) Job(com.netflix.titus.api.jobmanager.model.job.Job) KeepAliveRequest(com.netflix.titus.grpc.protogen.KeepAliveRequest) Logger(org.slf4j.Logger) Predicate(java.util.function.Predicate) SNAPSHOT_END_MARKER(com.netflix.titus.master.jobmanager.endpoint.v3.grpc.ObserveJobsContext.SNAPSHOT_END_MARKER) BlockingQueue(java.util.concurrent.BlockingQueue) UUID(java.util.UUID) GrpcJobManagementModelConverters(com.netflix.titus.runtime.endpoint.v3.grpc.GrpcJobManagementModelConverters) TimeUnit(java.util.concurrent.TimeUnit) StatusRuntimeException(io.grpc.StatusRuntimeException) AtomicLong(java.util.concurrent.atomic.AtomicLong) List(java.util.List) JobQueryCriteria(com.netflix.titus.runtime.endpoint.JobQueryCriteria) LinkedBlockingDeque(java.util.concurrent.LinkedBlockingDeque) V3JobQueryCriteriaEvaluator(com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator) 
GrpcJobQueryModelConverters.toJobQueryCriteria(com.netflix.titus.runtime.endpoint.v3.grpc.GrpcJobQueryModelConverters.toJobQueryCriteria) ObserveJobsWithKeepAliveRequest(com.netflix.titus.grpc.protogen.ObserveJobsWithKeepAliveRequest) VisibleForTesting(com.google.common.annotations.VisibleForTesting) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) ObservableExt(com.netflix.titus.common.util.rx.ObservableExt) Subscription(rx.Subscription) JobChangeNotification(com.netflix.titus.grpc.protogen.JobChangeNotification)
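
The `closingProcessed` flag in the example above makes sure that only one of the three terminal paths (unsubscribe, error, completion) records its metric. A minimal, self-contained sketch of that guard follows. The class `OnceGuardSketch` and its methods are hypothetical names for illustration and are not part of Titus.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; not part of titus-control-plane.
public class OnceGuardSketch {
    private final AtomicBoolean closingProcessed = new AtomicBoolean();
    private final List<String> recorded = new ArrayList<>();

    // Records the terminal reason only if no other terminal path was handled before.
    public void onTerminal(String reason) {
        // getAndSet returns the previous value, so only the first caller sees false.
        if (!closingProcessed.getAndSet(true)) {
            recorded.add(reason);
        }
    }

    public List<String> recorded() {
        return recorded;
    }

    public static void main(String[] args) {
        OnceGuardSketch guard = new OnceGuardSketch();
        guard.onTerminal("error");
        guard.onTerminal("unsubscribed"); // ignored: a terminal event was already recorded
        System.out.println(guard.recorded()); // prints [error]
    }
}
```

Because `getAndSet` is atomic, the check and the update cannot be interleaved by two threads, which is presumably why the original uses an `AtomicBoolean` rather than a plain `boolean` field.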

Example 2 with V3JobQueryCriteriaEvaluator

Use of com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator in the titus-control-plane project by Netflix, in the method findJobs of the class DefaultJobManagementServiceGrpc.

@Override
public void findJobs(JobQuery jobQuery, StreamObserver<JobQueryResult> responseObserver) {
    if (!checkPageIsValid(jobQuery.getPage(), responseObserver)) {
        return;
    }
    try {
        // We need to find all jobs to get the total number of them.
        List<com.netflix.titus.api.jobmanager.model.job.Job<?>> allFilteredJobs = jobOperations.findJobs(new V3JobQueryCriteriaEvaluator(toJobQueryCriteria(jobQuery), titusRuntime), 0, Integer.MAX_VALUE / 2);
        Pair<List<com.netflix.titus.api.jobmanager.model.job.Job<?>>, Pagination> queryResult = PaginationUtil.takePageWithCursorAndKeyExtractor(toPage(jobQuery.getPage()), allFilteredJobs, JobComparators::createJobKeyOf, JobManagerCursors::coreJobIndexOf, JobManagerCursors::newJobCoreCursorFrom);
        List<Job> grpcJobs = new ArrayList<>();
        for (com.netflix.titus.api.jobmanager.model.job.Job<?> job : queryResult.getLeft()) {
            Job toGrpcJob = grpcObjectsCache.getJob(job);
            grpcJobs.add(toGrpcJob);
        }
        JobQueryResult grpcQueryResult;
        if (jobQuery.getFieldsList().isEmpty()) {
            grpcQueryResult = toJobQueryResult(grpcJobs, queryResult.getRight());
        } else {
            Set<String> fields = new HashSet<>(jobQuery.getFieldsList());
            fields.addAll(JOB_MINIMUM_FIELD_SET);
            List<Job> list = new ArrayList<>();
            for (Job j : grpcJobs) {
                list.add(ProtobufExt.copy(j, fields));
            }
            grpcQueryResult = toJobQueryResult(list, queryResult.getRight());
        }
        responseObserver.onNext(grpcQueryResult);
        responseObserver.onCompleted();
    } catch (Exception e) {
        safeOnError(logger, e, responseObserver);
    }
}
Also used : ArrayList(java.util.ArrayList) JobManagerCursors(com.netflix.titus.runtime.jobmanager.JobManagerCursors) V3JobQueryCriteriaEvaluator(com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator) JobQueryResult(com.netflix.titus.grpc.protogen.JobQueryResult) StatusRuntimeException(io.grpc.StatusRuntimeException) TitusServiceException(com.netflix.titus.api.service.TitusServiceException) JobManagerException(com.netflix.titus.api.jobmanager.service.JobManagerException) Pagination(com.netflix.titus.api.model.Pagination) GrpcJobQueryModelConverters.toGrpcPagination(com.netflix.titus.runtime.endpoint.v3.grpc.GrpcJobQueryModelConverters.toGrpcPagination) JobComparators(com.netflix.titus.runtime.jobmanager.JobComparators) List(java.util.List) Job(com.netflix.titus.grpc.protogen.Job) HashSet(java.util.HashSet)
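
The comment in `findJobs` notes that all matching jobs are collected first so that their total number is known before a page is cut out. The sketch below illustrates that idea with plain offset-based paging. It is an assumption-laden simplification: Titus uses cursor-based pagination through `PaginationUtil`, which is not reproduced here, and `PageSketch`, `Page` and `takePage` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration; not part of titus-control-plane.
public class PageSketch {

    // One page of items plus the total number of items that matched the filter.
    public static final class Page<T> {
        public final List<T> items;
        public final int totalItems;

        Page(List<T> items, int totalItems) {
            this.items = items;
            this.totalItems = totalItems;
        }
    }

    public static <T> Page<T> takePage(List<T> all, Predicate<? super T> filter, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize must be > 0");
        }
        // Filter everything first, so the total count is known before slicing.
        List<T> filtered = all.stream().filter(filter).collect(Collectors.toList());
        // Clamp both bounds to the list size; long arithmetic avoids int overflow.
        int from = (int) Math.min((long) pageNumber * pageSize, filtered.size());
        int to = (int) Math.min((long) from + pageSize, filtered.size());
        return new Page<>(new ArrayList<>(filtered.subList(from, to)), filtered.size());
    }

    public static void main(String[] args) {
        Page<Integer> page = takePage(Arrays.asList(1, 2, 3, 4, 5, 6), n -> n % 2 == 0, 0, 2);
        System.out.println(page.items + " of " + page.totalItems); // prints [2, 4] of 3
    }
}
```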

Example 3 with V3JobQueryCriteriaEvaluator

Use of com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator in the titus-control-plane project by Netflix, in the method observeJobs of the class LocalCacheQueryProcessor.

public Observable<JobChangeNotification> observeJobs(ObserveJobsQuery query) {
    JobQueryCriteria<TaskStatus.TaskState, JobDescriptor.JobSpecCase> criteria = toJobQueryCriteria(query);
    V3JobQueryCriteriaEvaluator jobsPredicate = new V3JobQueryCriteriaEvaluator(criteria, titusRuntime);
    V3TaskQueryCriteriaEvaluator tasksPredicate = new V3TaskQueryCriteriaEvaluator(criteria, titusRuntime);
    Set<String> jobFields = newFieldsFilter(query.getJobFieldsList(), JOB_MINIMUM_FIELD_SET);
    Set<String> taskFields = newFieldsFilter(query.getTaskFieldsList(), TASK_MINIMUM_FIELD_SET);
    Flux<JobChangeNotification> eventStream = Flux.defer(() -> {
        AtomicBoolean first = new AtomicBoolean(true);
        return jobDataReplicator.events().subscribeOn(scheduler).publishOn(scheduler).flatMap(event -> {
            JobManagerEvent<?> jobManagerEvent = event.getRight();
            long now = titusRuntime.getClock().wallTime();
            JobSnapshot snapshot = event.getLeft();
            Optional<JobChangeNotification> grpcEvent = toObserveJobsEvent(snapshot, jobManagerEvent, now, jobsPredicate, tasksPredicate, jobFields, taskFields);
            // On first event emit full snapshot first
            if (first.getAndSet(false)) {
                List<JobChangeNotification> snapshotEvents = buildSnapshot(snapshot, now, jobsPredicate, tasksPredicate, jobFields, taskFields);
                grpcEvent.ifPresent(snapshotEvents::add);
                return Flux.fromIterable(snapshotEvents);
            }
            // A snapshot marker indicates that the underlying GRPC stream was disconnected; abort so that the client subscribes again.
            if (jobManagerEvent == JobManagerEvent.snapshotMarker()) {
                return Mono.error(new StatusRuntimeException(Status.ABORTED.augmentDescription("Downstream event stream reconnected.")));
            }
            // Keep-alive events are not forwarded to the client, so they are filtered out here.
            if (jobManagerEvent instanceof JobKeepAliveEvent) {
                // Check if staleness is not too high.
                if (jobDataReplicator.getStalenessMs() > configuration.getObserveJobsStalenessDisconnectMs()) {
                    rejectedByStalenessTooHighMetric.increment();
                    return Mono.error(new StatusRuntimeException(Status.ABORTED.augmentDescription("Data staleness in the event stream is too high. Most likely caused by connectivity issue to the downstream server.")));
                }
                return Mono.empty();
            }
            return grpcEvent.map(Flux::just).orElseGet(Flux::empty);
        });
    });
    return ReactorExt.toObservable(eventStream);
}
Also used : V3TaskQueryCriteriaEvaluator(com.netflix.titus.runtime.endpoint.v3.grpc.query.V3TaskQueryCriteriaEvaluator) Flux(reactor.core.publisher.Flux) JobKeepAliveEvent(com.netflix.titus.api.jobmanager.model.job.event.JobKeepAliveEvent) V3JobQueryCriteriaEvaluator(com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) JobChangeNotification(com.netflix.titus.grpc.protogen.JobChangeNotification) StatusRuntimeException(io.grpc.StatusRuntimeException) JobSnapshot(com.netflix.titus.runtime.connector.jobmanager.snapshot.JobSnapshot)

Example 4 with V3JobQueryCriteriaEvaluator

Use of com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator in the titus-control-plane project by Netflix, in the method findMatchingJob of the class LocalCacheQueryProcessor.

private List<com.netflix.titus.api.jobmanager.model.job.Job> findMatchingJob(JobQueryCriteria<TaskStatus.TaskState, JobDescriptor.JobSpecCase> queryCriteria) {
    JobSnapshot jobSnapshot = jobDataReplicator.getCurrent();
    Map<String, Job<?>> jobsById = jobSnapshot.getJobMap();
    V3JobQueryCriteriaEvaluator queryFilter = new V3JobQueryCriteriaEvaluator(queryCriteria, titusRuntime);
    List<com.netflix.titus.api.jobmanager.model.job.Job> matchingJobs = new ArrayList<>();
    jobsById.forEach((jobId, job) -> {
        List<com.netflix.titus.api.jobmanager.model.job.Task> tasks = new ArrayList<>(jobSnapshot.getTasks(jobId).values());
        Pair<Job<?>, List<com.netflix.titus.api.jobmanager.model.job.Task>> jobTaskPair = Pair.of(job, tasks);
        if (queryFilter.test(jobTaskPair)) {
            matchingJobs.add(job);
        }
    });
    return matchingJobs;
}
Also used : Task(com.netflix.titus.grpc.protogen.Task) ArrayList(java.util.ArrayList) V3JobQueryCriteriaEvaluator(com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator) JobSnapshot(com.netflix.titus.runtime.connector.jobmanager.snapshot.JobSnapshot) List(java.util.List) Job(com.netflix.titus.api.jobmanager.model.job.Job)
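
In `findMatchingJob` the evaluator is called through `test(...)` on a pair of a job and its tasks, that is, it is used as a predicate over that pair. The following sketch shows the same filtering shape with standard library types only. It assumes nothing about the evaluator's internals: `PairFilterSketch` and `findMatching` are hypothetical names, strings stand in for jobs and tasks, and `Map.Entry` stands in for the Titus `Pair`.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration; not part of titus-control-plane.
public class PairFilterSketch {

    // Returns the ids of all jobs whose (jobId, tasks) pair satisfies the filter.
    public static List<String> findMatching(Map<String, List<String>> tasksByJobId,
                                            Predicate<Map.Entry<String, List<String>>> filter) {
        List<String> matching = new ArrayList<>();
        tasksByJobId.forEach((jobId, tasks) -> {
            if (filter.test(new SimpleImmutableEntry<>(jobId, tasks))) {
                matching.add(jobId);
            }
        });
        return matching;
    }

    public static void main(String[] args) {
        Map<String, List<String>> snapshot = new LinkedHashMap<>();
        snapshot.put("job-1", Arrays.asList("task-a", "task-b"));
        snapshot.put("job-2", Collections.emptyList());
        // Hypothetical criterion: keep only jobs that have at least one task.
        System.out.println(findMatching(snapshot, pair -> !pair.getValue().isEmpty())); // prints [job-1]
    }
}
```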

Aggregations

V3JobQueryCriteriaEvaluator (com.netflix.titus.runtime.endpoint.v3.grpc.query.V3JobQueryCriteriaEvaluator) 4
StatusRuntimeException (io.grpc.StatusRuntimeException) 3
ArrayList (java.util.ArrayList) 3
List (java.util.List) 3
Job (com.netflix.titus.api.jobmanager.model.job.Job) 2
JobChangeNotification (com.netflix.titus.grpc.protogen.JobChangeNotification) 2
JobSnapshot (com.netflix.titus.runtime.connector.jobmanager.snapshot.JobSnapshot) 2
V3TaskQueryCriteriaEvaluator (com.netflix.titus.runtime.endpoint.v3.grpc.query.V3TaskQueryCriteriaEvaluator) 2
AtomicBoolean (java.util.concurrent.atomic.AtomicBoolean) 2
VisibleForTesting (com.google.common.annotations.VisibleForTesting) 1
Stopwatch (com.google.common.base.Stopwatch) 1
Task (com.netflix.titus.api.jobmanager.model.job.Task) 1
JobKeepAliveEvent (com.netflix.titus.api.jobmanager.model.job.event.JobKeepAliveEvent) 1
JobManagerException (com.netflix.titus.api.jobmanager.service.JobManagerException) 1
Pagination (com.netflix.titus.api.model.Pagination) 1
CallMetadata (com.netflix.titus.api.model.callmetadata.CallMetadata) 1
CallMetadataConstants (com.netflix.titus.api.model.callmetadata.CallMetadataConstants) 1
TitusServiceException (com.netflix.titus.api.service.TitusServiceException) 1
TitusRuntime (com.netflix.titus.common.runtime.TitusRuntime) 1
ExceptionExt (com.netflix.titus.common.util.ExceptionExt) 1