Search in sources :

Example 1 with MultipleJobsDetails

use of org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails in project flink by apache.

the class CurrentJobsOverviewHandler method handleJsonRequest.

@Override
public String handleJsonRequest(Map<String, String> pathParams, Map<String, String> queryParams, ActorGateway jobManager) throws Exception {
    try {
        if (jobManager != null) {
            Future<Object> future = jobManager.ask(new RequestJobDetails(includeRunningJobs, includeFinishedJobs), timeout);
            MultipleJobsDetails result = (MultipleJobsDetails) Await.result(future, timeout);
            final long now = System.currentTimeMillis();
            StringWriter writer = new StringWriter();
            JsonGenerator gen = JsonFactory.jacksonFactory.createGenerator(writer);
            gen.writeStartObject();
            if (includeRunningJobs && includeFinishedJobs) {
                gen.writeArrayFieldStart("running");
                for (JobDetails detail : result.getRunningJobs()) {
                    writeJobDetailOverviewAsJson(detail, gen, now);
                }
                gen.writeEndArray();
                gen.writeArrayFieldStart("finished");
                for (JobDetails detail : result.getFinishedJobs()) {
                    writeJobDetailOverviewAsJson(detail, gen, now);
                }
                gen.writeEndArray();
            } else {
                gen.writeArrayFieldStart("jobs");
                for (JobDetails detail : includeRunningJobs ? result.getRunningJobs() : result.getFinishedJobs()) {
                    writeJobDetailOverviewAsJson(detail, gen, now);
                }
                gen.writeEndArray();
            }
            gen.writeEndObject();
            gen.close();
            return writer.toString();
        } else {
            throw new Exception("No connection to the leading JobManager.");
        }
    } catch (Exception e) {
        throw new Exception("Failed to fetch the status overview: " + e.getMessage(), e);
    }
}
Also used : StringWriter(java.io.StringWriter) JsonGenerator(com.fasterxml.jackson.core.JsonGenerator) RequestJobDetails(org.apache.flink.runtime.messages.webmonitor.RequestJobDetails) MultipleJobsDetails(org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails) JobDetails(org.apache.flink.runtime.messages.webmonitor.JobDetails) RequestJobDetails(org.apache.flink.runtime.messages.webmonitor.RequestJobDetails) IOException(java.io.IOException)

Example 2 with MultipleJobsDetails

use of org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails in project flink by apache.

the class MetricFetcher method fetchMetrics.

private void fetchMetrics() {
    try {
        Option<scala.Tuple2<ActorGateway, Integer>> jobManagerGatewayAndWebPort = retriever.getJobManagerGatewayAndWebPort();
        if (jobManagerGatewayAndWebPort.isDefined()) {
            ActorGateway jobManager = jobManagerGatewayAndWebPort.get()._1();
            /**
				 * Remove all metrics that belong to a job that is not running and no longer archived.
				 */
            Future<Object> jobDetailsFuture = jobManager.ask(new RequestJobDetails(true, true), timeout);
            jobDetailsFuture.onSuccess(new OnSuccess<Object>() {

                @Override
                public void onSuccess(Object result) throws Throwable {
                    MultipleJobsDetails details = (MultipleJobsDetails) result;
                    ArrayList<String> toRetain = new ArrayList<>();
                    for (JobDetails job : details.getRunningJobs()) {
                        toRetain.add(job.getJobId().toString());
                    }
                    for (JobDetails job : details.getFinishedJobs()) {
                        toRetain.add(job.getJobId().toString());
                    }
                    synchronized (metrics) {
                        metrics.jobs.keySet().retainAll(toRetain);
                    }
                }
            }, ctx);
            logErrorOnFailure(jobDetailsFuture, "Fetching of JobDetails failed.");
            String jobManagerPath = jobManager.path();
            String queryServicePath = jobManagerPath.substring(0, jobManagerPath.lastIndexOf('/') + 1) + MetricQueryService.METRIC_QUERY_SERVICE_NAME;
            ActorRef jobManagerQueryService = actorSystem.actorFor(queryServicePath);
            queryMetrics(jobManagerQueryService);
            /**
				 * We first request the list of all registered task managers from the job manager, and then
				 * request the respective metric dump from each task manager.
				 *
				 * All stored metrics that do not belong to a registered task manager will be removed.
				 */
            Future<Object> registeredTaskManagersFuture = jobManager.ask(JobManagerMessages.getRequestRegisteredTaskManagers(), timeout);
            registeredTaskManagersFuture.onSuccess(new OnSuccess<Object>() {

                @Override
                public void onSuccess(Object result) throws Throwable {
                    Iterable<Instance> taskManagers = ((JobManagerMessages.RegisteredTaskManagers) result).asJavaIterable();
                    List<String> activeTaskManagers = new ArrayList<>();
                    for (Instance taskManager : taskManagers) {
                        activeTaskManagers.add(taskManager.getId().toString());
                        String taskManagerPath = taskManager.getTaskManagerGateway().getAddress();
                        String queryServicePath = taskManagerPath.substring(0, taskManagerPath.lastIndexOf('/') + 1) + MetricQueryService.METRIC_QUERY_SERVICE_NAME + "_" + taskManager.getTaskManagerID().getResourceIdString();
                        ActorRef taskManagerQueryService = actorSystem.actorFor(queryServicePath);
                        queryMetrics(taskManagerQueryService);
                    }
                    synchronized (metrics) {
                        // remove all metrics belonging to unregistered task managers
                        metrics.taskManagers.keySet().retainAll(activeTaskManagers);
                    }
                }
            }, ctx);
            logErrorOnFailure(registeredTaskManagersFuture, "Fetchin list of registered TaskManagers failed.");
        }
    } catch (Exception e) {
        LOG.warn("Exception while fetching metrics.", e);
    }
}
Also used : Instance(org.apache.flink.runtime.instance.Instance) ActorRef(akka.actor.ActorRef) ArrayList(java.util.ArrayList) JobManagerMessages(org.apache.flink.runtime.messages.JobManagerMessages) RequestJobDetails(org.apache.flink.runtime.messages.webmonitor.RequestJobDetails) MultipleJobsDetails(org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails) RequestJobDetails(org.apache.flink.runtime.messages.webmonitor.RequestJobDetails) JobDetails(org.apache.flink.runtime.messages.webmonitor.JobDetails) ActorGateway(org.apache.flink.runtime.instance.ActorGateway) ArrayList(java.util.ArrayList) List(java.util.List)

Example 3 with MultipleJobsDetails

use of org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails in project flink by apache.

the class MetricFetcherTest method testUpdate.

@Test
public void testUpdate() throws Exception {
    // ========= setup TaskManager =================================================================================
    JobID jobID = new JobID();
    InstanceID tmID = new InstanceID();
    ResourceID tmRID = new ResourceID(tmID.toString());
    TaskManagerGateway taskManagerGateway = mock(TaskManagerGateway.class);
    when(taskManagerGateway.getAddress()).thenReturn("/tm/address");
    Instance taskManager = mock(Instance.class);
    when(taskManager.getTaskManagerGateway()).thenReturn(taskManagerGateway);
    when(taskManager.getId()).thenReturn(tmID);
    when(taskManager.getTaskManagerID()).thenReturn(tmRID);
    // ========= setup JobManager ==================================================================================
    JobDetails details = mock(JobDetails.class);
    when(details.getJobId()).thenReturn(jobID);
    ActorGateway jobManagerGateway = mock(ActorGateway.class);
    Object registeredTaskManagersAnswer = new JobManagerMessages.RegisteredTaskManagers(JavaConverters.collectionAsScalaIterableConverter(Collections.singletonList(taskManager)).asScala());
    when(jobManagerGateway.ask(isA(RequestJobDetails.class), any(FiniteDuration.class))).thenReturn(Future$.MODULE$.successful((Object) new MultipleJobsDetails(new JobDetails[0], new JobDetails[0])));
    when(jobManagerGateway.ask(isA(JobManagerMessages.RequestRegisteredTaskManagers$.class), any(FiniteDuration.class))).thenReturn(Future$.MODULE$.successful(registeredTaskManagersAnswer));
    when(jobManagerGateway.path()).thenReturn("/jm/address");
    JobManagerRetriever retriever = mock(JobManagerRetriever.class);
    when(retriever.getJobManagerGatewayAndWebPort()).thenReturn(Option.apply(new scala.Tuple2<ActorGateway, Integer>(jobManagerGateway, 0)));
    // ========= setup QueryServices ================================================================================
    Object requestMetricsAnswer = createRequestDumpAnswer(tmID, jobID);
    final ActorRef jmQueryService = mock(ActorRef.class);
    final ActorRef tmQueryService = mock(ActorRef.class);
    ActorSystem actorSystem = mock(ActorSystem.class);
    when(actorSystem.actorFor(eq("/jm/" + METRIC_QUERY_SERVICE_NAME))).thenReturn(jmQueryService);
    when(actorSystem.actorFor(eq("/tm/" + METRIC_QUERY_SERVICE_NAME + "_" + tmRID.getResourceIdString()))).thenReturn(tmQueryService);
    MetricFetcher.BasicGateway jmQueryServiceGateway = mock(MetricFetcher.BasicGateway.class);
    when(jmQueryServiceGateway.ask(any(MetricQueryService.getCreateDump().getClass()), any(FiniteDuration.class))).thenReturn(Future$.MODULE$.successful((Object) new MetricDumpSerialization.MetricSerializationResult(new byte[0], 0, 0, 0, 0)));
    MetricFetcher.BasicGateway tmQueryServiceGateway = mock(MetricFetcher.BasicGateway.class);
    when(tmQueryServiceGateway.ask(any(MetricQueryService.getCreateDump().getClass()), any(FiniteDuration.class))).thenReturn(Future$.MODULE$.successful(requestMetricsAnswer));
    whenNew(MetricFetcher.BasicGateway.class).withArguments(eq(new Object() {

        @Override
        public boolean equals(Object o) {
            return o == jmQueryService;
        }
    })).thenReturn(jmQueryServiceGateway);
    whenNew(MetricFetcher.BasicGateway.class).withArguments(eq(new Object() {

        @Override
        public boolean equals(Object o) {
            return o == tmQueryService;
        }
    })).thenReturn(tmQueryServiceGateway);
    // ========= start MetricFetcher testing =======================================================================
    ExecutionContextExecutor context = ExecutionContext$.MODULE$.fromExecutor(new CurrentThreadExecutor());
    MetricFetcher fetcher = new MetricFetcher(actorSystem, retriever, context);
    // verify that update fetches metrics and updates the store
    fetcher.update();
    MetricStore store = fetcher.getMetricStore();
    synchronized (store) {
        assertEquals("7", store.jobManager.metrics.get("abc.hist_min"));
        assertEquals("6", store.jobManager.metrics.get("abc.hist_max"));
        assertEquals("4.0", store.jobManager.metrics.get("abc.hist_mean"));
        assertEquals("0.5", store.jobManager.metrics.get("abc.hist_median"));
        assertEquals("5.0", store.jobManager.metrics.get("abc.hist_stddev"));
        assertEquals("0.75", store.jobManager.metrics.get("abc.hist_p75"));
        assertEquals("0.9", store.jobManager.metrics.get("abc.hist_p90"));
        assertEquals("0.95", store.jobManager.metrics.get("abc.hist_p95"));
        assertEquals("0.98", store.jobManager.metrics.get("abc.hist_p98"));
        assertEquals("0.99", store.jobManager.metrics.get("abc.hist_p99"));
        assertEquals("0.999", store.jobManager.metrics.get("abc.hist_p999"));
        assertEquals("x", store.getTaskManagerMetricStore(tmID.toString()).metrics.get("abc.gauge"));
        assertEquals("5.0", store.getJobMetricStore(jobID.toString()).metrics.get("abc.jc"));
        assertEquals("2", store.getTaskMetricStore(jobID.toString(), "taskid").metrics.get("2.abc.tc"));
        assertEquals("1", store.getTaskMetricStore(jobID.toString(), "taskid").metrics.get("2.opname.abc.oc"));
    }
}
Also used : ActorSystem(akka.actor.ActorSystem) InstanceID(org.apache.flink.runtime.instance.InstanceID) Instance(org.apache.flink.runtime.instance.Instance) ActorRef(akka.actor.ActorRef) TaskManagerGateway(org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway) MultipleJobsDetails(org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails) RequestJobDetails(org.apache.flink.runtime.messages.webmonitor.RequestJobDetails) JobDetails(org.apache.flink.runtime.messages.webmonitor.JobDetails) MetricDumpSerialization(org.apache.flink.runtime.metrics.dump.MetricDumpSerialization) ResourceID(org.apache.flink.runtime.clusterframework.types.ResourceID) ActorGateway(org.apache.flink.runtime.instance.ActorGateway) ExecutionContextExecutor(scala.concurrent.ExecutionContextExecutor) FiniteDuration(scala.concurrent.duration.FiniteDuration) RequestJobDetails(org.apache.flink.runtime.messages.webmonitor.RequestJobDetails) Tuple2(org.apache.flink.api.java.tuple.Tuple2) JobManagerRetriever(org.apache.flink.runtime.webmonitor.JobManagerRetriever) JobID(org.apache.flink.api.common.JobID) PrepareForTest(org.powermock.core.classloader.annotations.PrepareForTest) Test(org.junit.Test)

Example 4 with MultipleJobsDetails

use of org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails in project flink by apache.

the class WebMonitorMessagesTest method testMultipleJobDetails.

@Test
public void testMultipleJobDetails() {
    try {
        final Random rnd = new Random();
        GenericMessageTester.testMessageInstance(new MultipleJobsDetails(randomJobDetails(rnd), randomJobDetails(rnd)));
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : Random(java.util.Random) MultipleJobsDetails(org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails) Test(org.junit.Test)

Aggregations

MultipleJobsDetails (org.apache.flink.runtime.messages.webmonitor.MultipleJobsDetails)4 JobDetails (org.apache.flink.runtime.messages.webmonitor.JobDetails)3 RequestJobDetails (org.apache.flink.runtime.messages.webmonitor.RequestJobDetails)3 ActorRef (akka.actor.ActorRef)2 ActorGateway (org.apache.flink.runtime.instance.ActorGateway)2 Instance (org.apache.flink.runtime.instance.Instance)2 Test (org.junit.Test)2 ActorSystem (akka.actor.ActorSystem)1 JsonGenerator (com.fasterxml.jackson.core.JsonGenerator)1 IOException (java.io.IOException)1 StringWriter (java.io.StringWriter)1 ArrayList (java.util.ArrayList)1 List (java.util.List)1 Random (java.util.Random)1 JobID (org.apache.flink.api.common.JobID)1 Tuple2 (org.apache.flink.api.java.tuple.Tuple2)1 ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID)1 InstanceID (org.apache.flink.runtime.instance.InstanceID)1 TaskManagerGateway (org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway)1 JobManagerMessages (org.apache.flink.runtime.messages.JobManagerMessages)1