Example 6 with ReadOperation

use of org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation in project beam by apache.

the class IntrinsicMapTaskExecutorTest method testGetProgressAndRequestSplit.

@Test
public void testGetProgressAndRequestSplit() throws Exception {
    TestOutputReceiver receiver = new TestOutputReceiver(counterSet, NameContextsForTests.nameContextForTest());
    TestReadOperation operation = new TestReadOperation(receiver, createContext("ReadOperation"));
    ExecutionStateTracker stateTracker = ExecutionStateTracker.newForTest();
    try (IntrinsicMapTaskExecutor executor = IntrinsicMapTaskExecutor.withSharedCounterSet(Arrays.<Operation>asList(operation), counterSet, stateTracker)) {
        operation.setProgress(approximateProgressAtIndex(1L));
        Assert.assertEquals(positionAtIndex(1L), positionFromProgress(executor.getWorkerProgress()));
        Assert.assertEquals(positionAtIndex(1L), positionFromSplitResult(executor.requestDynamicSplit(splitRequestAtIndex(1L))));
    }
}
Also used : ExecutionStateTracker(org.apache.beam.runners.core.metrics.ExecutionStateTracker) DataflowExecutionStateTracker(org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowExecutionStateTracker) ParDoOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation) ReadOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation) Operation(org.apache.beam.runners.dataflow.worker.util.common.worker.Operation) TestOutputReceiver(org.apache.beam.runners.dataflow.worker.util.common.worker.TestOutputReceiver) Test(org.junit.Test)

Example 7 with ReadOperation

use of org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation in project beam by apache.

the class IntrinsicMapTaskExecutorTest method testNoReadOperation.

@Test
public void testNoReadOperation() throws Exception {
    // Test MapTaskExecutor without ReadOperation.
    List<Operation> operations = Arrays.<Operation>asList(createOperation("o1", 1), createOperation("o2", 2));
    ExecutionStateTracker stateTracker = ExecutionStateTracker.newForTest();
    try (IntrinsicMapTaskExecutor executor = IntrinsicMapTaskExecutor.withSharedCounterSet(operations, counterSet, stateTracker)) {
        thrown.expect(IllegalStateException.class);
        thrown.expectMessage("is not a ReadOperation");
        executor.getReadOperation();
    }
}
Also used : ExecutionStateTracker(org.apache.beam.runners.core.metrics.ExecutionStateTracker) DataflowExecutionStateTracker(org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowExecutionStateTracker) ParDoOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation) ReadOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation) Operation(org.apache.beam.runners.dataflow.worker.util.common.worker.Operation) Test(org.junit.Test)

Example 8 with ReadOperation

use of org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation in project beam by apache.

the class IntrinsicMapTaskExecutorTest method testPerElementProcessingTimeCounters.

/**
 * Verify that counts for the per-element-processing-time counter are correct.
 */
@Test
public void testPerElementProcessingTimeCounters() throws Exception {
    PipelineOptions options = PipelineOptionsFactory.create();
    options.as(DataflowPipelineDebugOptions.class).setExperiments(Lists.newArrayList(DataflowElementExecutionTracker.TIME_PER_ELEMENT_EXPERIMENT));
    DataflowExecutionStateTracker stateTracker = new DataflowExecutionStateTracker(
        ExecutionStateSampler.newForTest(),
        new TestDataflowExecutionState(
            NameContext.forStage("test-stage"),
            "other",
            null /* requestingStepName */,
            null /* sideInputIndex */,
            null /* metricsContainer */,
            NoopProfileScope.NOOP),
        counterSet,
        options,
        "test-work-item-id");
    NameContext parDoName = nameForStep("s1");
    // Wire a read operation with 3 elements to a ParDoOperation and assert that we count
    // the correct number of elements.
    ReadOperation read = ReadOperation.forTest(new TestReader("a", "b", "c"), new OutputReceiver(), TestOperationContext.create(counterSet, nameForStep("s0"), null, stateTracker));
    ParDoOperation parDo = new ParDoOperation(new NoopParDoFn(), new OutputReceiver[0], TestOperationContext.create(counterSet, parDoName, null, stateTracker));
    parDo.attachInput(read, 0);
    List<Operation> operations = Lists.newArrayList(read, parDo);
    try (IntrinsicMapTaskExecutor executor = IntrinsicMapTaskExecutor.withSharedCounterSet(operations, counterSet, stateTracker)) {
        executor.execute();
    }
    CounterName counterName = CounterName.named("per-element-processing-time").withOriginalName(parDoName);
    Counter<Long, CounterDistribution> counter = (Counter<Long, CounterDistribution>) counterSet.getExistingCounter(counterName);
    assertThat(counter.getAggregate().getCount(), equalTo(3L));
}
Also used : CounterDistribution(org.apache.beam.runners.dataflow.worker.counters.CounterFactory.CounterDistribution) ReadOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation) NameContext(org.apache.beam.runners.dataflow.worker.counters.NameContext) TestReader(org.apache.beam.runners.dataflow.worker.util.common.worker.ExecutorTestUtils.TestReader) OutputReceiver(org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver) TestOutputReceiver(org.apache.beam.runners.dataflow.worker.util.common.worker.TestOutputReceiver) TestDataflowExecutionState(org.apache.beam.runners.dataflow.worker.TestOperationContext.TestDataflowExecutionState) ParDoOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation) Operation(org.apache.beam.runners.dataflow.worker.util.common.worker.Operation) Counter(org.apache.beam.runners.dataflow.worker.counters.Counter) CounterName(org.apache.beam.runners.dataflow.worker.counters.CounterName) PipelineOptions(org.apache.beam.sdk.options.PipelineOptions) DataflowPipelineDebugOptions(org.apache.beam.runners.dataflow.options.DataflowPipelineDebugOptions) DataflowExecutionStateTracker(org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowExecutionStateTracker) Test(org.junit.Test)

Example 9 with ReadOperation

use of org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation in project beam by apache.

the class SingularProcessBundleProgressTrackerTest method testProgressInterpolation.

@Test
public void testProgressInterpolation() throws Exception {
    ReadOperation read = Mockito.mock(ReadOperation.class);
    RemoteGrpcPortWriteOperation grpcWrite = Mockito.mock(RemoteGrpcPortWriteOperation.class);
    RegisterAndProcessBundleOperation process = Mockito.mock(RegisterAndProcessBundleOperation.class);
    when(grpcWrite.processedElementsConsumer()).thenReturn(elementsConsumed -> {});
    SingularProcessBundleProgressTracker tracker = new SingularProcessBundleProgressTracker(read, grpcWrite, process);
    when(read.getProgress()).thenReturn(new TestProgress("A"), new TestProgress("B"), new TestProgress("C"));
    when(grpcWrite.getElementsSent()).thenReturn(1, 10, 20, 30);
    // The progress responses themselves are ignored by this test; it works directly on the mocked getInputElementsConsumed.
    when(process.getProcessBundleProgress()).thenReturn(CompletableFuture.completedFuture(BeamFnApi.ProcessBundleProgressResponse.getDefaultInstance()));
    when(process.getInputElementsConsumed(any(Iterable.class))).thenReturn(1L, 4L, 10L).thenThrow(new RuntimeException());
    // Initially no progress is known.
    assertEquals(null, tracker.getWorkerProgress());
    // After reading, and writing, and processing one element, the progress is aligned at A.
    tracker.updateProgress();
    assertEquals(new TestProgress("A"), tracker.getWorkerProgress());
    // We've read up to B (10 elements) but only consumed 4.  Progress remains at A.
    tracker.updateProgress();
    assertEquals(new TestProgress("A"), tracker.getWorkerProgress());
    // Once 10 elements have been consumed, advance to B.
    tracker.updateProgress();
    assertEquals(new TestProgress("B"), tracker.getWorkerProgress());
    // An exception is thrown, default to latest read progress.
    tracker.updateProgress();
    assertEquals(new TestProgress("C"), tracker.getWorkerProgress());
}
Also used : ReadOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation) SingularProcessBundleProgressTracker(org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.SingularProcessBundleProgressTracker) RemoteGrpcPortWriteOperation(org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation) Test(org.junit.Test)
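
The comments in testProgressInterpolation describe reported progress advancing to a read position only once the downstream consumer has caught up with the elements sent at that position. A minimal sketch of that idea follows. The class ProgressInterpolationSketch and its methods are hypothetical names invented for illustration, not part of Beam, and the sketch does not cover the fallback to the latest read progress when an exception is thrown.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not a Beam class): maps consumed-element counts back to read progress. */
public class ProgressInterpolationSketch {
    // Key: elements sent downstream when the snapshot was taken.
    // Value: the read progress observed at that moment.
    private final TreeMap<Long, String> snapshots = new TreeMap<>();

    /** Records that the reader was at readProgress once elementsSent elements had been sent. */
    public void recordSnapshot(long elementsSent, String readProgress) {
        snapshots.put(elementsSent, readProgress);
    }

    /** Returns the latest snapshot fully covered by elementsConsumed, or null if there is none. */
    public String progressFor(long elementsConsumed) {
        Map.Entry<Long, String> entry = snapshots.floorEntry(elementsConsumed);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        ProgressInterpolationSketch sketch = new ProgressInterpolationSketch();
        sketch.recordSnapshot(1, "A");
        sketch.recordSnapshot(10, "B");
        System.out.println(sketch.progressFor(0));   // nothing consumed yet: null
        System.out.println(sketch.progressFor(4));   // only 4 of 10 consumed: A
        System.out.println(sketch.progressFor(10));  // caught up with snapshot B: B
    }
}
```

The sorted map keeps the lookup to a single floorEntry call. The tracker exercised by the test may store and prune its snapshots differently.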
