Example 6 with CounterSet

use of org.apache.beam.runners.dataflow.worker.counters.CounterSet in project beam by apache.

the class UserParDoFnFactoryTest method testFactoryReuseInStep.

@Test
public void testFactoryReuseInStep() throws Exception {
    PipelineOptions options = PipelineOptionsFactory.create();
    CounterSet counters = new CounterSet();
    TestDoFn initialFn = new TestDoFn(Collections.<TupleTag<String>>emptyList());
    CloudObject cloudObject = getCloudObject(initialFn);
    TestOperationContext operationContext = TestOperationContext.create(counters);
    ParDoFn parDoFn = factory.create(options, cloudObject, null, MAIN_OUTPUT, ImmutableMap.<TupleTag<?>, Integer>of(MAIN_OUTPUT, 0), BatchModeExecutionContext.forTesting(options, "testStage"), operationContext);
    Receiver rcvr = new OutputReceiver();
    parDoFn.startBundle(rcvr);
    parDoFn.processElement(WindowedValue.valueInGlobalWindow("foo"));
    TestDoFn fn = (TestDoFn) ((SimpleParDoFn) parDoFn).getDoFnInfo().getDoFn();
    assertThat(fn, not(theInstance(initialFn)));
    parDoFn.finishBundle();
    assertThat(fn.state, equalTo(TestDoFn.State.FINISHED));
    // The fn should be reused for the second call to create
    ParDoFn secondParDoFn = factory.create(options, cloudObject, null, MAIN_OUTPUT, ImmutableMap.<TupleTag<?>, Integer>of(MAIN_OUTPUT, 0), BatchModeExecutionContext.forTesting(options, "testStage"), operationContext);
    // The fn should still be finished from the last call; it should not be set up again
    assertThat(fn.state, equalTo(TestDoFn.State.FINISHED));
    secondParDoFn.startBundle(rcvr);
    secondParDoFn.processElement(WindowedValue.valueInGlobalWindow("spam"));
    TestDoFn reobtainedFn = (TestDoFn) ((SimpleParDoFn) secondParDoFn).getDoFnInfo().getDoFn();
    secondParDoFn.finishBundle();
    assertThat(reobtainedFn.state, equalTo(TestDoFn.State.FINISHED));
    assertThat(fn, theInstance(reobtainedFn));
}
Also used : CounterSet(org.apache.beam.runners.dataflow.worker.counters.CounterSet) CloudObject(org.apache.beam.runners.dataflow.util.CloudObject) PipelineOptions(org.apache.beam.sdk.options.PipelineOptions) Receiver(org.apache.beam.runners.dataflow.worker.util.common.worker.Receiver) OutputReceiver(org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver) ParDoFn(org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoFn) Test(org.junit.Test)
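The test above asserts two things: the function that runs is not the instance originally handed in, and the same instance comes back on the second call to create. A minimal sketch of that reuse pattern follows, assuming a simple pool that hands out an idle instance when one exists and otherwise builds a fresh copy. The names InstancePoolSketch, Pool, get, and complete are hypothetical stand-ins for illustration; this is not Beam's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Illustrative only: a tiny instance pool, not a Beam class. */
public class InstancePoolSketch {

    /** Hands out instances and takes them back once a bundle finishes cleanly. */
    static final class Pool<T> {
        private final Deque<T> idle = new ArrayDeque<>();
        private final Supplier<T> freshCopy;

        Pool(Supplier<T> freshCopy) {
            this.freshCopy = freshCopy;
        }

        /** Returns an idle instance if one exists, otherwise a fresh copy. */
        T get() {
            T cached = idle.poll();
            return cached != null ? cached : freshCopy.get();
        }

        /** Returns an instance to the pool so a later get() can reuse it. */
        void complete(T instance) {
            idle.push(instance);
        }
    }

    public static void main(String[] args) {
        Object initial = new Object();
        // Each fresh copy is a new object, so it is never the initial instance.
        Pool<Object> pool = new Pool<>(Object::new);

        Object first = pool.get();
        pool.complete(first); // the bundle finished, so the instance goes back
        Object second = pool.get();

        System.out.println("distinct from initial: " + (first != initial));
        System.out.println("reused on second get:  " + (first == second));
    }
}
```

In this sketch an instance is only eligible for reuse after complete is called, which mirrors the order in the test: finishBundle runs before the second create.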

Example 7 with CounterSet

use of org.apache.beam.runners.dataflow.worker.counters.CounterSet in project beam by apache.

the class SimpleParDoFnTest method testStateTracking.

@Test
public void testStateTracking() throws Exception {
    ExecutionStateTracker tracker = ExecutionStateTracker.newForTest();
    TestOperationContext operationContext = TestOperationContext.create(new CounterSet(), NameContextsForTests.nameContextForTest(), new MetricsContainerImpl(NameContextsForTests.ORIGINAL_NAME), tracker);
    class StateTestingDoFn extends DoFn<Integer, String> {

        private boolean startCalled = false;

        @StartBundle
        public void startBundle() throws Exception {
            startCalled = true;
            assertThat(tracker.getCurrentState(), equalTo(operationContext.getStartState()));
        }

        @ProcessElement
        public void processElement(ProcessContext c) throws Exception {
            assertThat(startCalled, equalTo(true));
            assertThat(tracker.getCurrentState(), equalTo(operationContext.getProcessState()));
        }
    }
    StateTestingDoFn fn = new StateTestingDoFn();
    DoFnInfo<?, ?> fnInfo = DoFnInfo.forFn(fn, WindowingStrategy.globalDefault(), null /* side input views */, null /* input coder */, MAIN_OUTPUT, DoFnSchemaInformation.create(), Collections.emptyMap());
    ParDoFn userParDoFn = new SimpleParDoFn<>(options, DoFnInstanceManagers.singleInstance(fnInfo), NullSideInputReader.empty(), MAIN_OUTPUT, ImmutableMap.of(MAIN_OUTPUT, 0, new TupleTag<>("declared"), 1), BatchModeExecutionContext.forTesting(options, operationContext.counterFactory(), "testStage").getStepContext(operationContext), operationContext, DoFnSchemaInformation.create(), Collections.emptyMap(), SimpleDoFnRunnerFactory.INSTANCE);
    // This test ensures proper behavior of the state sampling even with lazy initialization.
    try (Closeable trackerCloser = tracker.activate()) {
        try (Closeable processCloser = operationContext.enterProcess()) {
            userParDoFn.processElement(WindowedValue.valueInGlobalWindow(5));
        }
    }
}
Also used : MetricsContainerImpl(org.apache.beam.runners.core.metrics.MetricsContainerImpl) ParDoFn(org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoFn) DoFn(org.apache.beam.sdk.transforms.DoFn) CounterSet(org.apache.beam.runners.dataflow.worker.counters.CounterSet) ExecutionStateTracker(org.apache.beam.runners.core.metrics.ExecutionStateTracker) Closeable(java.io.Closeable) TupleTag(org.apache.beam.sdk.values.TupleTag) Test(org.junit.Test)
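The nested try-with-resources blocks in the test above rely on scopes that set a current state on entry and restore the previous one on close. A minimal sketch of that idea follows. The names StateScopeSketch, Tracker, Scope, and enter are hypothetical, and Scope is a local interface used here instead of java.io.Closeable so that close() declares no checked exception. It illustrates the pattern only and makes no claim about how ExecutionStateTracker is implemented.

```java
/** Illustrative only: scoped state tracking with try-with-resources. */
public class StateScopeSketch {

    /** A scope whose close() restores the state that was active before it. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    /** Tracks a single current state name. */
    static final class Tracker {
        private String current = "idle";

        String current() {
            return current;
        }

        /** Switches to the given state and returns a scope that switches back. */
        Scope enter(String next) {
            String previous = current;
            current = next;
            return () -> current = previous;
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        try (Scope start = tracker.enter("start")) {
            System.out.println(tracker.current());
            try (Scope process = tracker.enter("process")) {
                System.out.println(tracker.current());
            }
            // Closing the inner scope restored the outer state.
            System.out.println(tracker.current());
        }
        System.out.println(tracker.current());
    }
}
```

Because each scope remembers the state it replaced, scopes can nest to any depth as long as they are closed in reverse order, which try-with-resources guarantees.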

Example 8 with CounterSet

use of org.apache.beam.runners.dataflow.worker.counters.CounterSet in project beam by apache.

the class GroupingShuffleReaderTest method runIterationOverGroupingShuffleReader.

@SuppressWarnings("ReturnValueIgnored")
private List<KV<Integer, List<KV<Integer, Integer>>>> runIterationOverGroupingShuffleReader(BatchModeExecutionContext context, TestShuffleReader shuffleReader, GroupingShuffleReader<Integer, KV<Integer, Integer>> groupingShuffleReader, Coder<WindowedValue<KV<Integer, Iterable<KV<Integer, Integer>>>>> coder, ValuesToRead valuesToRead) throws Exception {
    CounterSet counterSet = new CounterSet();
    Counter<Long, ?> elementByteSizeCounter = counterSet.longSum(CounterName.named("element-byte-size-counter"));
    CounterBackedElementByteSizeObserver elementObserver = new CounterBackedElementByteSizeObserver(elementByteSizeCounter);
    List<KV<Integer, List<KV<Integer, Integer>>>> actual = new ArrayList<>();
    assertFalse(shuffleReader.isClosed());
    try (GroupingShuffleReaderIterator<Integer, KV<Integer, Integer>> iter = groupingShuffleReader.iterator(shuffleReader)) {
        Iterable<KV<Integer, Integer>> prevValuesIterable = null;
        Iterator<KV<Integer, Integer>> prevValuesIterator = null;
        for (boolean more = iter.start(); more; more = iter.advance()) {
            // Should not fail.
            iter.getCurrent();
            iter.getCurrent();
            // Safe covariant cast from Reiterable to Iterable.
            // TODO(https://issues.apache.org/jira/browse/BEAM-10556)
            @SuppressWarnings({"rawtypes", "unchecked"})
            WindowedValue<KV<Integer, Iterable<KV<Integer, Integer>>>> windowedValue = (WindowedValue) iter.getCurrent();
            // Verify that the byte size observer is lazy for every value the GroupingShuffleReader
            // produces.
            coder.registerByteSizeObserver(windowedValue, elementObserver);
            assertTrue(elementObserver.getIsLazy());
            // Verify that the value has an empty set of windows.
            assertEquals(BoundedWindow.TIMESTAMP_MIN_VALUE, windowedValue.getTimestamp());
            assertEquals(0, windowedValue.getWindows().size());
            KV<Integer, Iterable<KV<Integer, Integer>>> elem = windowedValue.getValue();
            Integer key = elem.getKey();
            List<KV<Integer, Integer>> values = new ArrayList<>();
            if (valuesToRead.ordinal() > ValuesToRead.SKIP_VALUES.ordinal()) {
                if (prevValuesIterable != null) {
                    // Verifies that this does not throw.
                    prevValuesIterable.iterator();
                }
                if (prevValuesIterator != null) {
                    // Verifies that this does not throw.
                    prevValuesIterator.hasNext();
                }
                Iterable<KV<Integer, Integer>> valuesIterable = elem.getValue();
                Iterator<KV<Integer, Integer>> valuesIterator = valuesIterable.iterator();
                if (valuesToRead.ordinal() >= ValuesToRead.READ_ONE_VALUE.ordinal()) {
                    while (valuesIterator.hasNext()) {
                        assertTrue(valuesIterator.hasNext());
                        assertTrue(valuesIterator.hasNext());
                        assertEquals("BatchModeExecutionContext key", key, context.getKey());
                        values.add(valuesIterator.next());
                        if (valuesToRead == ValuesToRead.READ_ONE_VALUE) {
                            break;
                        }
                    }
                    if (valuesToRead.ordinal() >= ValuesToRead.READ_ALL_VALUES.ordinal()) {
                        assertFalse(valuesIterator.hasNext());
                        assertFalse(valuesIterator.hasNext());
                        try {
                            valuesIterator.next();
                            fail("Expected NoSuchElementException");
                        } catch (NoSuchElementException exn) {
                            // As expected.
                        }
                        // Verifies that this does not throw.
                        valuesIterable.iterator();
                    }
                }
                if (valuesToRead == ValuesToRead.READ_ALL_VALUES_TWICE) {
                    // Create a new iterator.
                    valuesIterator = valuesIterable.iterator();
                    while (valuesIterator.hasNext()) {
                        assertTrue(valuesIterator.hasNext());
                        assertTrue(valuesIterator.hasNext());
                        assertEquals("BatchModeExecutionContext key", key, context.getKey());
                        valuesIterator.next();
                    }
                    assertFalse(valuesIterator.hasNext());
                    assertFalse(valuesIterator.hasNext());
                    try {
                        valuesIterator.next();
                        fail("Expected NoSuchElementException");
                    } catch (NoSuchElementException exn) {
                        // As expected.
                    }
                }
                prevValuesIterable = valuesIterable;
                prevValuesIterator = valuesIterator;
            }
            actual.add(KV.of(key, values));
        }
        assertFalse(iter.advance());
        assertFalse(iter.advance());
        try {
            iter.getCurrent();
            fail("Expected NoSuchElementException");
        } catch (NoSuchElementException exn) {
            // As expected.
        }
    }
    assertTrue(shuffleReader.isClosed());
    return actual;
}
Also used : ArrayList(java.util.ArrayList) KV(org.apache.beam.sdk.values.KV) CounterSet(org.apache.beam.runners.dataflow.worker.counters.CounterSet) CounterBackedElementByteSizeObserver(org.apache.beam.runners.dataflow.worker.counters.CounterBackedElementByteSizeObserver) WindowedValue(org.apache.beam.sdk.util.WindowedValue) NoSuchElementException(java.util.NoSuchElementException)
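The helper above repeatedly checks a small iterator contract: hasNext() gives the same answer when called twice, next() throws NoSuchElementException once the values are exhausted, and the Iterable can hand out a new iterator afterwards. A minimal sketch of the same checks against a plain java.util list follows. The class and method names IteratorContractSketch and obeysContract are hypothetical, and the sketch does not involve GroupingShuffleReader.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Illustrative only: the iterator contract exercised by the test above. */
public class IteratorContractSketch {

    /** Returns true if iterating the values obeys the checked contract. */
    static <T> boolean obeysContract(Iterable<T> values) {
        Iterator<T> it = values.iterator();
        while (it.hasNext()) {
            // hasNext() must be idempotent: asking again must not consume anything.
            if (!it.hasNext()) {
                return false;
            }
            it.next();
        }
        // Once exhausted, hasNext() must keep returning false.
        if (it.hasNext() || it.hasNext()) {
            return false;
        }
        try {
            it.next();
            return false; // next() on an exhausted iterator must throw
        } catch (NoSuchElementException expected) {
            // As expected.
        }
        // Asking for a second iterator must not throw.
        values.iterator();
        return true;
    }

    public static void main(String[] args) {
        System.out.println(obeysContract(Arrays.asList(1, 2, 3)));
    }
}
```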

Example 9 with CounterSet

use of org.apache.beam.runners.dataflow.worker.counters.CounterSet in project beam by apache.

the class IntrinsicMapTaskExecutorTest method testGetMetricContainers.

/**
 * This test makes sure that any metrics reported within an operation are part of the metric
 * containers returned by {@link getMetricContainers}.
 */
@Test
@SuppressWarnings("unchecked")
public void testGetMetricContainers() throws Exception {
    ExecutionStateTracker stateTracker = new DataflowExecutionStateTracker(ExecutionStateSampler.newForTest(), new TestDataflowExecutionState(NameContext.forStage("testStage"), "other", null /* requestingStepName */, null /* sideInputIndex */, null /* metricsContainer */, NoopProfileScope.NOOP), new CounterSet(), PipelineOptionsFactory.create(), "test-work-item-id");
    final String o1 = "o1";
    TestOperationContext context1 = createContext(o1, stateTracker);
    final String o2 = "o2";
    TestOperationContext context2 = createContext(o2, stateTracker);
    final String o3 = "o3";
    TestOperationContext context3 = createContext(o3, stateTracker);
    List<Operation> operations = Arrays.asList(new Operation(new OutputReceiver[] {}, context1) {

        @Override
        public void start() throws Exception {
            super.start();
            try (Closeable scope = context.enterStart()) {
                Metrics.counter("TestMetric", "MetricCounter").inc(1L);
            }
        }
    }, new Operation(new OutputReceiver[] {}, context2) {

        @Override
        public void start() throws Exception {
            super.start();
            try (Closeable scope = context.enterStart()) {
                Metrics.counter("TestMetric", "MetricCounter").inc(2L);
            }
        }
    }, new Operation(new OutputReceiver[] {}, context3) {

        @Override
        public void start() throws Exception {
            super.start();
            try (Closeable scope = context.enterStart()) {
                Metrics.counter("TestMetric", "MetricCounter").inc(3L);
            }
        }
    });
    try (IntrinsicMapTaskExecutor executor = IntrinsicMapTaskExecutor.withSharedCounterSet(operations, counterSet, stateTracker)) {
        // Call execute so that we run all the counters
        executor.execute();
        assertThat(context1.metricsContainer().getUpdates().counterUpdates(), contains(metricUpdate("TestMetric", "MetricCounter", o1, 1L)));
        assertThat(context2.metricsContainer().getUpdates().counterUpdates(), contains(metricUpdate("TestMetric", "MetricCounter", o2, 2L)));
        assertThat(context3.metricsContainer().getUpdates().counterUpdates(), contains(metricUpdate("TestMetric", "MetricCounter", o3, 3L)));
    }
}
Also used : Closeable(java.io.Closeable) OutputReceiver(org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver) TestOutputReceiver(org.apache.beam.runners.dataflow.worker.util.common.worker.TestOutputReceiver) TestDataflowExecutionState(org.apache.beam.runners.dataflow.worker.TestOperationContext.TestDataflowExecutionState) ParDoOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation) ReadOperation(org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation) Operation(org.apache.beam.runners.dataflow.worker.util.common.worker.Operation) ExpectedException(org.junit.rules.ExpectedException) CounterSet(org.apache.beam.runners.dataflow.worker.counters.CounterSet) ExecutionStateTracker(org.apache.beam.runners.core.metrics.ExecutionStateTracker) DataflowExecutionStateTracker(org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowExecutionStateTracker) Test(org.junit.Test)

Example 10 with CounterSet

use of org.apache.beam.runners.dataflow.worker.counters.CounterSet in project beam by apache.

the class DataflowExecutionStateTrackerTest method setUp.

@Before
public void setUp() {
    options = PipelineOptionsFactory.create();
    clock = mock(MillisProvider.class);
    sampler = ExecutionStateSampler.newForTest(clock);
    counterSet = new CounterSet();
}
Also used : CounterSet(org.apache.beam.runners.dataflow.worker.counters.CounterSet) MillisProvider(org.joda.time.DateTimeUtils.MillisProvider) Before(org.junit.Before)

Aggregations

CounterSet (org.apache.beam.runners.dataflow.worker.counters.CounterSet) 22
Test (org.junit.Test) 14
CloudObject (org.apache.beam.runners.dataflow.util.CloudObject) 7
ParDoFn (org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoFn) 7
ExecutionStateTracker (org.apache.beam.runners.core.metrics.ExecutionStateTracker) 6
DataflowExecutionStateTracker (org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowExecutionStateTracker) 6
OutputReceiver (org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver) 6
Receiver (org.apache.beam.runners.dataflow.worker.util.common.worker.Receiver) 5
PipelineOptions (org.apache.beam.sdk.options.PipelineOptions) 5
Instant (org.joda.time.Instant) 4
CounterUpdate (com.google.api.services.dataflow.model.CounterUpdate) 3
WorkItemStatus (com.google.api.services.dataflow.model.WorkItemStatus) 3
Closeable (java.io.Closeable) 3
IOException (java.io.IOException) 3
CounterStructuredName (com.google.api.services.dataflow.model.CounterStructuredName) 2
NameAndKind (com.google.api.services.dataflow.model.NameAndKind) 2
ArrayList (java.util.ArrayList) 2
ConcurrentHashMap (java.util.concurrent.ConcurrentHashMap) 2
MetricsContainerImpl (org.apache.beam.runners.core.metrics.MetricsContainerImpl) 2
DataflowPipelineDebugOptions (org.apache.beam.runners.dataflow.options.DataflowPipelineDebugOptions) 2