
Example 1 with JobMessage

use of com.google.api.services.dataflow.model.JobMessage in project beam by apache.

the class MonitoringUtilTest method testLoggingHandler.

@Test
public void testLoggingHandler() {
    DateTime errorTime = new DateTime(1000L, ISOChronology.getInstanceUTC());
    DateTime warningTime = new DateTime(2000L, ISOChronology.getInstanceUTC());
    DateTime basicTime = new DateTime(3000L, ISOChronology.getInstanceUTC());
    DateTime detailedTime = new DateTime(4000L, ISOChronology.getInstanceUTC());
    DateTime debugTime = new DateTime(5000L, ISOChronology.getInstanceUTC());
    DateTime unknownTime = new DateTime(6000L, ISOChronology.getInstanceUTC());
    JobMessage errorJobMessage = new JobMessage();
    errorJobMessage.setMessageImportance("JOB_MESSAGE_ERROR");
    errorJobMessage.setMessageText("ERRORERROR");
    errorJobMessage.setTime(TimeUtil.toCloudTime(errorTime));
    JobMessage warningJobMessage = new JobMessage();
    warningJobMessage.setMessageImportance("JOB_MESSAGE_WARNING");
    warningJobMessage.setMessageText("WARNINGWARNING");
    warningJobMessage.setTime(TimeUtil.toCloudTime(warningTime));
    JobMessage basicJobMessage = new JobMessage();
    basicJobMessage.setMessageImportance("JOB_MESSAGE_BASIC");
    basicJobMessage.setMessageText("BASICBASIC");
    basicJobMessage.setTime(TimeUtil.toCloudTime(basicTime));
    JobMessage detailedJobMessage = new JobMessage();
    detailedJobMessage.setMessageImportance("JOB_MESSAGE_DETAILED");
    detailedJobMessage.setMessageText("DETAILEDDETAILED");
    detailedJobMessage.setTime(TimeUtil.toCloudTime(detailedTime));
    JobMessage debugJobMessage = new JobMessage();
    debugJobMessage.setMessageImportance("JOB_MESSAGE_DEBUG");
    debugJobMessage.setMessageText("DEBUGDEBUG");
    debugJobMessage.setTime(TimeUtil.toCloudTime(debugTime));
    JobMessage unknownJobMessage = new JobMessage();
    unknownJobMessage.setMessageImportance("JOB_MESSAGE_UNKNOWN");
    unknownJobMessage.setMessageText("UNKNOWNUNKNOWN");
    unknownJobMessage.setTime("");
    // The empty message is intentionally left out of the process() call below, so its
    // timestamp must never be logged (checked by verifyNotLogged at the end of the test).
    JobMessage emptyJobMessage = new JobMessage();
    emptyJobMessage.setMessageImportance("JOB_MESSAGE_EMPTY");
    emptyJobMessage.setTime(TimeUtil.toCloudTime(unknownTime));
    new LoggingHandler().process(Arrays.asList(errorJobMessage, warningJobMessage, basicJobMessage, detailedJobMessage, debugJobMessage, unknownJobMessage));
    expectedLogs.verifyError("ERRORERROR");
    expectedLogs.verifyError(errorTime.toString());
    expectedLogs.verifyWarn("WARNINGWARNING");
    expectedLogs.verifyWarn(warningTime.toString());
    expectedLogs.verifyInfo("BASICBASIC");
    expectedLogs.verifyInfo(basicTime.toString());
    expectedLogs.verifyInfo("DETAILEDDETAILED");
    expectedLogs.verifyInfo(detailedTime.toString());
    expectedLogs.verifyDebug("DEBUGDEBUG");
    expectedLogs.verifyDebug(debugTime.toString());
    expectedLogs.verifyTrace("UNKNOWN TIMESTAMP");
    expectedLogs.verifyTrace("UNKNOWNUNKNOWN");
    expectedLogs.verifyNotLogged(unknownTime.toString());
}
Also used : LoggingHandler(org.apache.beam.runners.dataflow.util.MonitoringUtil.LoggingHandler) JobMessage(com.google.api.services.dataflow.model.JobMessage) DateTime(org.joda.time.DateTime) Test(org.junit.Test)
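
The assertions above document how LoggingHandler maps a message's importance to a log level: JOB_MESSAGE_ERROR is logged at error, JOB_MESSAGE_WARNING at warn, JOB_MESSAGE_BASIC and JOB_MESSAGE_DETAILED at info, JOB_MESSAGE_DEBUG at debug, and anything else at trace. A minimal sketch outside a test harness (message text and time are invented for illustration):

JobMessage message = new JobMessage();
// Per the test above, JOB_MESSAGE_WARNING is routed to the WARN level.
message.setMessageImportance("JOB_MESSAGE_WARNING");
message.setMessageText("Autoscaling: resized worker pool.");
message.setTime(TimeUtil.toCloudTime(Instant.now()));
new MonitoringUtil.LoggingHandler().process(Arrays.asList(message));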

Example 2 with JobMessage

use of com.google.api.services.dataflow.model.JobMessage in project beam by apache.

the class MonitoringUtilTest method testGetJobMessages.

@Test
public void testGetJobMessages() throws IOException {
    DataflowClient dataflowClient = mock(DataflowClient.class);
    ListJobMessagesResponse firstResponse = new ListJobMessagesResponse();
    firstResponse.setJobMessages(new ArrayList<JobMessage>());
    for (int i = 0; i < 100; ++i) {
        JobMessage message = new JobMessage();
        message.setId("message_" + i);
        message.setTime(TimeUtil.toCloudTime(new Instant(i)));
        firstResponse.getJobMessages().add(message);
    }
    String pageToken = "page_token";
    firstResponse.setNextPageToken(pageToken);
    ListJobMessagesResponse secondResponse = new ListJobMessagesResponse();
    secondResponse.setJobMessages(new ArrayList<JobMessage>());
    for (int i = 100; i < 150; ++i) {
        JobMessage message = new JobMessage();
        message.setId("message_" + i);
        message.setTime(TimeUtil.toCloudTime(new Instant(i)));
        secondResponse.getJobMessages().add(message);
    }
    when(dataflowClient.listJobMessages(JOB_ID, null)).thenReturn(firstResponse);
    when(dataflowClient.listJobMessages(JOB_ID, pageToken)).thenReturn(secondResponse);
    MonitoringUtil util = new MonitoringUtil(dataflowClient);
    List<JobMessage> messages = util.getJobMessages(JOB_ID, -1);
    assertEquals(150, messages.size());
}
Also used : DataflowClient(org.apache.beam.runners.dataflow.DataflowClient) JobMessage(com.google.api.services.dataflow.model.JobMessage) Instant(org.joda.time.Instant) ListJobMessagesResponse(com.google.api.services.dataflow.model.ListJobMessagesResponse) Test(org.junit.Test)
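
The two stubbed pages above verify that MonitoringUtil.getJobMessages follows nextPageToken internally and concatenates the results. The second argument is a millisecond cutoff, with -1 meaning all messages; Example 5 passes the timestamp of the last message already seen so that only newer messages are fetched. A minimal usage sketch (hypothetical job id):

MonitoringUtil util = new MonitoringUtil(dataflowClient);
// getJobMessages throws IOException, as the test signature above indicates.
List<JobMessage> all = util.getJobMessages("2017-01-01_00_00_00-123456789", -1);
for (JobMessage m : all) {
    System.out.println(m.getTime() + " " + m.getMessageText());
}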

Example 3 with JobMessage

use of com.google.api.services.dataflow.model.JobMessage in project beam by apache.

the class TestDataflowRunnerTest method testBatchPipelineFailsIfException.

@Test
public void testBatchPipelineFailsIfException() throws Exception {
    Pipeline p = TestPipeline.create(options);
    PCollection<Integer> pc = p.apply(Create.of(1, 2, 3));
    PAssert.that(pc).containsInAnyOrder(1, 2, 3);
    DataflowPipelineJob mockJob = Mockito.mock(DataflowPipelineJob.class);
    when(mockJob.getState()).thenReturn(State.RUNNING);
    when(mockJob.getProjectId()).thenReturn("test-project");
    when(mockJob.getJobId()).thenReturn("test-job");
    when(mockJob.waitUntilFinish(any(Duration.class), any(JobMessagesHandler.class))).thenAnswer(new Answer<State>() {

        @Override
        public State answer(InvocationOnMock invocation) {
            JobMessage message = new JobMessage();
            message.setMessageText("FooException");
            message.setTime(TimeUtil.toCloudTime(Instant.now()));
            message.setMessageImportance("JOB_MESSAGE_ERROR");
            ((MonitoringUtil.JobMessagesHandler) invocation.getArguments()[1]).process(Arrays.asList(message));
            return State.CANCELLED;
        }
    });
    DataflowRunner mockRunner = Mockito.mock(DataflowRunner.class);
    when(mockRunner.run(any(Pipeline.class))).thenReturn(mockJob);
    when(mockClient.getJobMetrics(anyString())).thenReturn(generateMockMetricResponse(false, /* success */
    true));
    TestDataflowRunner runner = TestDataflowRunner.fromOptionsAndClient(options, mockClient);
    try {
        runner.run(p, mockRunner);
    } catch (AssertionError expected) {
        assertThat(expected.getMessage(), containsString("FooException"));
        verify(mockJob, never()).cancel();
        return;
    }
    // Note that fail() throws an AssertionError, which is why it is placed here,
    // outside the try-catch block, rather than inside it.
    fail("AssertionError expected");
}
Also used : Duration(org.joda.time.Duration) TestPipeline(org.apache.beam.sdk.testing.TestPipeline) Pipeline(org.apache.beam.sdk.Pipeline) JobMessagesHandler(org.apache.beam.runners.dataflow.util.MonitoringUtil.JobMessagesHandler) MonitoringUtil(org.apache.beam.runners.dataflow.util.MonitoringUtil) State(org.apache.beam.sdk.PipelineResult.State) InvocationOnMock(org.mockito.invocation.InvocationOnMock) JobMessage(com.google.api.services.dataflow.model.JobMessage) Test(org.junit.Test)

Example 4 with JobMessage

use of com.google.api.services.dataflow.model.JobMessage in project beam by apache.

the class TestDataflowRunnerTest method testStreamingPipelineFailsIfException.

/**
   * Tests that if a streaming pipeline crash-loops for a non-assertion reason, the test run
   * throws an {@link AssertionError}.
   *
   * <p>It is a known limitation/bug of the runner that it does not distinguish between the two
   * modes of failure.
   */
@Test
public void testStreamingPipelineFailsIfException() throws Exception {
    options.setStreaming(true);
    Pipeline pipeline = TestPipeline.create(options);
    PCollection<Integer> pc = pipeline.apply(Create.of(1, 2, 3));
    PAssert.that(pc).containsInAnyOrder(1, 2, 3);
    DataflowPipelineJob mockJob = Mockito.mock(DataflowPipelineJob.class);
    when(mockJob.getState()).thenReturn(State.RUNNING);
    when(mockJob.getProjectId()).thenReturn("test-project");
    when(mockJob.getJobId()).thenReturn("test-job");
    when(mockJob.waitUntilFinish(any(Duration.class), any(JobMessagesHandler.class))).thenAnswer(new Answer<State>() {

        @Override
        public State answer(InvocationOnMock invocation) {
            JobMessage message = new JobMessage();
            message.setMessageText("FooException");
            message.setTime(TimeUtil.toCloudTime(Instant.now()));
            message.setMessageImportance("JOB_MESSAGE_ERROR");
            ((MonitoringUtil.JobMessagesHandler) invocation.getArguments()[1]).process(Arrays.asList(message));
            return State.CANCELLED;
        }
    });
    DataflowRunner mockRunner = Mockito.mock(DataflowRunner.class);
    when(mockRunner.run(any(Pipeline.class))).thenReturn(mockJob);
    when(mockClient.getJobMetrics(anyString())).thenReturn(generateMockMetricResponse(false, /* success */
    true));
    TestDataflowRunner runner = TestDataflowRunner.fromOptionsAndClient(options, mockClient);
    expectedException.expect(RuntimeException.class);
    runner.run(pipeline, mockRunner);
}
Also used : Duration(org.joda.time.Duration) TestPipeline(org.apache.beam.sdk.testing.TestPipeline) Pipeline(org.apache.beam.sdk.Pipeline) JobMessagesHandler(org.apache.beam.runners.dataflow.util.MonitoringUtil.JobMessagesHandler) MonitoringUtil(org.apache.beam.runners.dataflow.util.MonitoringUtil) State(org.apache.beam.sdk.PipelineResult.State) InvocationOnMock(org.mockito.invocation.InvocationOnMock) JobMessage(com.google.api.services.dataflow.model.JobMessage) Test(org.junit.Test)
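
Examples 3 and 4 stub waitUntilFinish so that it pushes a JOB_MESSAGE_ERROR through the supplied MonitoringUtil.JobMessagesHandler. For reference, a hand-written handler that collects error-level messages might look like this sketch (assuming the usual getter counterparts of the setters shown above):

MonitoringUtil.JobMessagesHandler errorHandler = new MonitoringUtil.JobMessagesHandler() {

    @Override
    public void process(List<JobMessage> messages) {
        for (JobMessage message : messages) {
            if ("JOB_MESSAGE_ERROR".equals(message.getMessageImportance())) {
                // Collect, rethrow, or report; printing is enough for this sketch.
                System.err.println(message.getTime() + " " + message.getMessageText());
            }
        }
    }
};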

Example 5 with JobMessage

use of com.google.api.services.dataflow.model.JobMessage in project beam by apache.

the class DataflowPipelineJob method waitUntilFinish.

/**
   * Waits until the pipeline finishes and returns the final status.
   *
   * @param duration The time to wait for the job to finish.
   *     Provide a value less than 1 ms for an infinite wait.
   *
   * @param messageHandler If non-null, this handler will be invoked for each
   *   batch of messages received.
   * @param sleeper A sleeper to use to sleep between attempts.
   * @param nanoClock A nanoClock used to time the total time taken.
   * @return The final state of the job or null on timeout.
   * @throws IOException If there is a persistent problem getting job
   *   information.
   * @throws InterruptedException if the thread is interrupted.
   */
@Nullable
@VisibleForTesting
State waitUntilFinish(Duration duration, @Nullable MonitoringUtil.JobMessagesHandler messageHandler, Sleeper sleeper, NanoClock nanoClock, MonitoringUtil monitor) throws IOException, InterruptedException {
    BackOff backoff;
    if (!duration.isLongerThan(Duration.ZERO)) {
        backoff = BackOffAdapter.toGcpBackOff(MESSAGES_BACKOFF_FACTORY.backoff());
    } else {
        backoff = BackOffAdapter.toGcpBackOff(MESSAGES_BACKOFF_FACTORY.withMaxCumulativeBackoff(duration).backoff());
    }
    // This function tracks the cumulative time from the *first request* to enforce the wall-clock
    // limit. Any backoff instance could, at best, track the time since the first attempt at a
    // given request. Thus, we need to track the cumulative time ourselves.
    long startNanos = nanoClock.nanoTime();
    State state;
    do {
        // Get the state of the job before listing messages. This ensures we always fetch job
        // messages after the job finishes, so we have all of them.
        state = getStateWithRetries(BackOffAdapter.toGcpBackOff(STATUS_BACKOFF_FACTORY.withMaxRetries(0).backoff()), sleeper);
        boolean hasError = state == State.UNKNOWN;
        if (messageHandler != null && !hasError) {
            // Process all the job messages that have accumulated so far.
            try {
                List<JobMessage> allMessages = monitor.getJobMessages(jobId, lastTimestamp);
                if (!allMessages.isEmpty()) {
                    lastTimestamp = fromCloudTime(allMessages.get(allMessages.size() - 1).getTime()).getMillis();
                    messageHandler.process(allMessages);
                }
            } catch (GoogleJsonResponseException | SocketTimeoutException e) {
                hasError = true;
                LOG.warn("There were problems getting current job messages: {}.", e.getMessage());
                LOG.debug("Exception information:", e);
            }
        }
        if (!hasError) {
            // We can stop if the job is done.
            if (state.isTerminal()) {
                switch(state) {
                    case DONE:
                    case CANCELLED:
                        LOG.info("Job {} finished with status {}.", getJobId(), state);
                        break;
                    case UPDATED:
                        LOG.info("Job {} has been updated and is running as the new job with id {}. " + "To access the updated job on the Dataflow monitoring console, " + "please navigate to {}", getJobId(), getReplacedByJob().getJobId(), MonitoringUtil.getJobMonitoringPageURL(getReplacedByJob().getProjectId(), getReplacedByJob().getJobId()));
                        break;
                    default:
                        LOG.info("Job {} failed with status {}.", getJobId(), state);
                }
                return state;
            }
            // The job is not done, so we must keep polling.
            backoff.reset();
            // If a total wait duration was given, cap further backoff at the remaining portion
            // of the allotted time.
            if (duration.isLongerThan(Duration.ZERO)) {
                long nanosConsumed = nanoClock.nanoTime() - startNanos;
                Duration consumed = Duration.millis((nanosConsumed + 999999) / 1000000);
                Duration remaining = duration.minus(consumed);
                if (remaining.isLongerThan(Duration.ZERO)) {
                    backoff = BackOffAdapter.toGcpBackOff(MESSAGES_BACKOFF_FACTORY.withMaxCumulativeBackoff(remaining).backoff());
                } else {
                    // If there is no time remaining, don't bother backing off.
                    backoff = BackOff.STOP_BACKOFF;
                }
            }
        }
    } while (BackOffUtils.next(sleeper, backoff));
    LOG.warn("No terminal state was returned. State value {}", state);
    // Timed out.
    return null;
}
Also used : GoogleJsonResponseException(com.google.api.client.googleapis.json.GoogleJsonResponseException) SocketTimeoutException(java.net.SocketTimeoutException) JobMessage(com.google.api.services.dataflow.model.JobMessage) Duration(org.joda.time.Duration) BackOff(com.google.api.client.util.BackOff) VisibleForTesting(com.google.common.annotations.VisibleForTesting) Nullable(javax.annotation.Nullable)
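
Callers typically reach this polling loop through the shorter waitUntilFinish(Duration, JobMessagesHandler) overload that the mocks in Examples 3 and 4 stub out. A minimal usage sketch (hypothetical helper name, and assuming that overload declares the same checked exceptions as the variant above):

static State waitAndLog(DataflowPipelineJob job) throws IOException, InterruptedException {
    // LoggingHandler (Example 1) logs each JobMessage at a level matching its importance.
    State finalState =
        job.waitUntilFinish(Duration.standardMinutes(10), new MonitoringUtil.LoggingHandler());
    // A null result means the wait timed out before the job reached a terminal state.
    return finalState;
}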

Aggregations

JobMessage (com.google.api.services.dataflow.model.JobMessage) — 8 usages
Test (org.junit.Test) — 5 usages
MonitoringUtil (org.apache.beam.runners.dataflow.util.MonitoringUtil) — 3 usages
Duration (org.joda.time.Duration) — 3 usages
Instant (org.joda.time.Instant) — 3 usages
ListJobMessagesResponse (com.google.api.services.dataflow.model.ListJobMessagesResponse) — 2 usages
Nullable (javax.annotation.Nullable) — 2 usages
JobMessagesHandler (org.apache.beam.runners.dataflow.util.MonitoringUtil.JobMessagesHandler) — 2 usages
Pipeline (org.apache.beam.sdk.Pipeline) — 2 usages
State (org.apache.beam.sdk.PipelineResult.State) — 2 usages
TestPipeline (org.apache.beam.sdk.testing.TestPipeline) — 2 usages
InvocationOnMock (org.mockito.invocation.InvocationOnMock) — 2 usages
GoogleJsonResponseException (com.google.api.client.googleapis.json.GoogleJsonResponseException) — 1 usage
BackOff (com.google.api.client.util.BackOff) — 1 usage
NanoClock (com.google.api.client.util.NanoClock) — 1 usage
Sleeper (com.google.api.client.util.Sleeper) — 1 usage
Job (com.google.api.services.dataflow.model.Job) — 1 usage
VisibleForTesting (com.google.common.annotations.VisibleForTesting) — 1 usage
SocketTimeoutException (java.net.SocketTimeoutException) — 1 usage
ArrayList (java.util.ArrayList) — 1 usage