
Example 1 with JobEventPropagationMetrics

Use of com.netflix.titus.runtime.connector.jobmanager.JobEventPropagationMetrics in the titus-control-plane project by Netflix.

Class ObserveJobsCommand, method execute.

@Override
public void execute(CommandContext context) throws Exception {
    long keepAliveMs = context.getCLI().hasOption('k') ? Long.parseLong(context.getCLI().getOptionValue('k')) : -1;
    RemoteJobManagementClient service = keepAliveMs > 0 ? context.getJobManagementClientWithKeepAlive(keepAliveMs) : context.getJobManagementClient();
    Flux<JobManagerEvent<?>> events;
    Set<String> jobFields = StringExt.splitByCommaIntoSet(context.getCLI().getOptionValue('j'));
    Set<String> taskFields = StringExt.splitByCommaIntoSet(context.getCLI().getOptionValue('t'));
    boolean printLatency = context.getCLI().hasOption('l');
    boolean printEvents = !context.getCLI().hasOption('n');
    boolean snapshotOnly = context.getCLI().hasOption('s');
    JobEventPropagationMetrics metrics = JobEventPropagationMetrics.newExternalClientMetrics("cli", context.getTitusRuntime());
    if (context.getCLI().hasOption('i')) {
        String jobId = context.getCLI().getOptionValue('i');
        events = service.observeJob(jobId);
    } else if (jobFields.isEmpty() && taskFields.isEmpty()) {
        events = service.observeJobs(Collections.emptyMap());
    } else {
        // Special case. Fields filtering cannot be used with RemoteJobManagementClient which converts data to
        // the core model. We have to use GRPC directly.
        executeWithFiltering(context, jobFields, taskFields, printEvents, snapshotOnly);
        return;
    }
    while (true) {
        logger.info("Establishing a new connection to the job event stream endpoint...");
        executeOnce(events, metrics, printLatency, printEvents, snapshotOnly);
        if (snapshotOnly) {
            return;
        }
    }
}
Also used : JobManagerEvent(com.netflix.titus.api.jobmanager.model.job.event.JobManagerEvent) JobEventPropagationMetrics(com.netflix.titus.runtime.connector.jobmanager.JobEventPropagationMetrics) RemoteJobManagementClient(com.netflix.titus.runtime.connector.jobmanager.RemoteJobManagementClient)
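
The loop at the end of execute re-runs executeOnce whenever the stream terminates, and exits after the first pass only when snapshotOnly is set. The following is a minimal sketch of that control flow in isolation, not the Titus implementation. StreamSession, observe, and maxConnections are hypothetical names introduced here; the connection bound is an assumption added so the sketch terminates, whereas the excerpt keeps reconnecting until the process is stopped.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReconnectLoopSketch {

    // Hypothetical stand-in for one blocking pass over the event stream
    // (executeOnce in the excerpt). It returns when the stream terminates.
    interface StreamSession {
        void runOnce() throws Exception;
    }

    // Re-runs the session each time it returns and reports how many
    // connections were made. maxConnections is an assumption added so the
    // sketch terminates; the excerpt has no such bound.
    static int observe(StreamSession session, boolean snapshotOnly, int maxConnections) throws Exception {
        int connections = 0;
        while (connections < maxConnections) {
            connections++;
            session.runOnce();
            if (snapshotOnly) {
                break; // one pass is enough when only the snapshot is wanted
            }
        }
        return connections;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        System.out.println(observe(calls::incrementAndGet, true, 3));
        System.out.println(observe(calls::incrementAndGet, false, 3));
    }
}
```

In snapshot-only mode the session runs once; otherwise it is restarted after every termination, which is how the command recovers from a dropped connection.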

Example 2 with JobEventPropagationMetrics

Use of com.netflix.titus.runtime.connector.jobmanager.JobEventPropagationMetrics in the titus-control-plane project by Netflix.

Class ObserveJobsCommand, method executeOnce.

private void executeOnce(Flux<JobManagerEvent<?>> events, JobEventPropagationMetrics metrics, boolean printLatency, boolean printEvents, boolean snapshotOnly) throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(1);
    AtomicBoolean snapshotRead = new AtomicBoolean();
    Stopwatch stopwatch = Stopwatch.createStarted();
    Disposable disposable = events.subscribe(next -> {
        if (next == JobManagerEvent.snapshotMarker()) {
            logger.info("Emitted: snapshot marker in {}ms", stopwatch.elapsed(TimeUnit.MILLISECONDS));
            snapshotRead.set(true);
            if (snapshotOnly) {
                latch.countDown();
            }
        } else if (next instanceof JobUpdateEvent) {
            Job<?> job = ((JobUpdateEvent) next).getCurrent();
            if (printEvents) {
                logger.info("Emitted job update: jobId={}({}), jobState={}, version={}", job.getId(), next.isArchived() ? "archived" : job.getStatus().getState(), job.getStatus(), job.getVersion());
            }
            Optional<EventPropagationTrace> trace = metrics.recordJob(((JobUpdateEvent) next).getCurrent(), !snapshotRead.get());
            if (printLatency) {
                trace.ifPresent(t -> {
                    logger.info("Event propagation data: stages={}", t);
                });
            }
        } else if (next instanceof TaskUpdateEvent) {
            Task task = ((TaskUpdateEvent) next).getCurrent();
            if (printEvents) {
                logger.info("Emitted task update: jobId={}({}), taskId={}, taskState={}, version={}", task.getJobId(), next.isArchived() ? "archived" : task.getStatus().getState(), task.getId(), task.getStatus(), task.getVersion());
            }
            Optional<EventPropagationTrace> trace = metrics.recordTask(((TaskUpdateEvent) next).getCurrent(), !snapshotRead.get());
            if (printLatency) {
                trace.ifPresent(t -> logger.info("Event propagation data: {}", t));
            }
        } else if (next instanceof JobKeepAliveEvent) {
            if (printEvents) {
                logger.info("Keep alive response: {}", next);
            }
        } else {
            logger.info("Unrecognized event type: {}", next);
        }
    }, e -> {
        ErrorReports.handleReplyError("Error in the event stream", e);
        latch.countDown();
    }, () -> {
        logger.info("Event stream closed");
        latch.countDown();
    });
    latch.await();
    disposable.dispose();
}
Also used : Disposable(reactor.core.Disposable) CommandContext(com.netflix.titus.cli.CommandContext) Stopwatch(com.google.common.base.Stopwatch) ObserveJobsQuery(com.netflix.titus.grpc.protogen.ObserveJobsQuery) Task(com.netflix.titus.api.jobmanager.model.job.Task) Options(org.apache.commons.cli.Options) LoggerFactory(org.slf4j.LoggerFactory) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) StringExt(com.netflix.titus.common.util.StringExt) CliCommand(com.netflix.titus.cli.CliCommand) JobEventPropagationMetrics(com.netflix.titus.runtime.connector.jobmanager.JobEventPropagationMetrics) Option(org.apache.commons.cli.Option) EventPropagationTrace(com.netflix.titus.common.util.event.EventPropagationTrace) Job(com.netflix.titus.api.jobmanager.model.job.Job) Logger(org.slf4j.Logger) Iterator(java.util.Iterator) JobUpdateEvent(com.netflix.titus.api.jobmanager.model.job.event.JobUpdateEvent) Set(java.util.Set) JobManagerEvent(com.netflix.titus.api.jobmanager.model.job.event.JobManagerEvent) JobKeepAliveEvent(com.netflix.titus.api.jobmanager.model.job.event.JobKeepAliveEvent) TimeUnit(java.util.concurrent.TimeUnit) CountDownLatch(java.util.concurrent.CountDownLatch) Flux(reactor.core.publisher.Flux) TaskUpdateEvent(com.netflix.titus.api.jobmanager.model.job.event.TaskUpdateEvent) JobManagementServiceBlockingStub(com.netflix.titus.grpc.protogen.JobManagementServiceGrpc.JobManagementServiceBlockingStub) Optional(java.util.Optional) ErrorReports(com.netflix.titus.cli.command.ErrorReports) Collections(java.util.Collections) JobChangeNotification(com.netflix.titus.grpc.protogen.JobChangeNotification) RemoteJobManagementClient(com.netflix.titus.runtime.connector.jobmanager.RemoteJobManagementClient)
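
executeOnce turns an asynchronous subscription into a blocking call: the calling thread waits on a CountDownLatch, and the callbacks release it on error, on completion, or on the snapshot marker when snapshotOnly is set. The sketch below reproduces that pattern with the standard library only, under stated assumptions. SNAPSHOT_MARKER, emitAsync, and readSnapshot are hypothetical names; a plain background thread stands in for the Reactor subscription, and interrupting it stands in for disposable.dispose(). The error callback is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class LatchedStreamSketch {

    // Hypothetical sentinel standing in for JobManagerEvent.snapshotMarker();
    // compared by identity, as in the excerpt.
    static final Object SNAPSHOT_MARKER = new Object();

    // Hypothetical event source: pushes events to the callbacks from a
    // background thread, the way an asynchronous subscription would.
    static Thread emitAsync(List<Object> events, Consumer<Object> onNext, Runnable onComplete) {
        Thread emitter = new Thread(() -> {
            for (Object event : events) {
                if (Thread.currentThread().isInterrupted()) {
                    return; // the consumer has cancelled the stream
                }
                onNext.accept(event);
            }
            onComplete.run();
        });
        emitter.setDaemon(true);
        emitter.start();
        return emitter;
    }

    // Blocks until the stream completes, or until the snapshot marker arrives
    // when snapshotOnly is set. Returns the events that preceded the marker.
    static List<Object> readSnapshot(List<Object> events, boolean snapshotOnly) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicBoolean snapshotRead = new AtomicBoolean();
        List<Object> snapshot = new ArrayList<>();

        Thread emitter = emitAsync(events, next -> {
            if (next == SNAPSHOT_MARKER) {
                snapshotRead.set(true);
                if (snapshotOnly) {
                    latch.countDown(); // release the caller early
                }
            } else if (!snapshotRead.get()) {
                snapshot.add(next); // written only by the emitter, before the marker
            }
        }, latch::countDown);

        latch.await();       // park the calling thread until a callback releases it
        emitter.interrupt(); // plays the role of disposable.dispose()
        return snapshot;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Object> events = Arrays.asList("job-1", "task-1", SNAPSHOT_MARKER, "job-1-update");
        System.out.println(readSnapshot(events, true));
        System.out.println(readSnapshot(events, false));
    }
}
```

The snapshotRead flag serves the same purpose as in the excerpt, where it is passed (negated) to recordJob and recordTask so that events replayed as part of the initial snapshot can be told apart from live updates.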
