Example 81 with IntervalWindow

use of org.apache.beam.sdk.transforms.windowing.IntervalWindow in project beam by apache.

the class ReduceFnRunnerTest method testEmptyOnTimeWithOnTimeBehaviorBackwardCompatibility.

/**
 * Test that it fires an empty on-time isFinished pane when OnTimeBehavior is FIRE_ALWAYS and
 * ClosingBehavior is FIRE_IF_NON_EMPTY.
 *
 * <p>This is a test just for backward compatibility.
 */
@Test
public void testEmptyOnTimeWithOnTimeBehaviorBackwardCompatibility() throws Exception {
    WindowingStrategy<?, IntervalWindow> strategy =
        WindowingStrategy.of((WindowFn<?, IntervalWindow>) FixedWindows.of(Duration.millis(10)))
            .withTimestampCombiner(TimestampCombiner.EARLIEST)
            .withTrigger(AfterWatermark.pastEndOfWindow().withEarlyFirings(AfterPane.elementCountAtLeast(1)))
            .withMode(AccumulationMode.ACCUMULATING_FIRED_PANES)
            .withAllowedLateness(Duration.ZERO)
            .withClosingBehavior(ClosingBehavior.FIRE_IF_NON_EMPTY);
    ReduceFnTester<Integer, Integer, IntervalWindow> tester = ReduceFnTester.combining(strategy, Sum.ofIntegers(), VarIntCoder.of());
    tester.advanceInputWatermark(new Instant(0));
    tester.advanceProcessingTime(new Instant(0));
    tester.injectElements(TimestampedValue.of(1, new Instant(1)));
    // Should fire empty on time isFinished pane
    tester.advanceInputWatermark(new Instant(11));
    List<WindowedValue<Integer>> output = tester.extractOutput();
    assertEquals(2, output.size());
    assertThat(output.get(0), WindowMatchers.valueWithPaneInfo(PaneInfo.createPane(true, false, Timing.EARLY, 0, -1)));
    assertThat(output.get(1), WindowMatchers.valueWithPaneInfo(PaneInfo.createPane(false, true, Timing.ON_TIME, 1, 0)));
}
Also used : WindowedValue(org.apache.beam.sdk.util.WindowedValue) WindowMatchers.isWindowedValue(org.apache.beam.runners.core.WindowMatchers.isWindowedValue) WindowMatchers.isSingleWindowedValue(org.apache.beam.runners.core.WindowMatchers.isSingleWindowedValue) WindowFn(org.apache.beam.sdk.transforms.windowing.WindowFn) Instant(org.joda.time.Instant) IntervalWindow(org.apache.beam.sdk.transforms.windowing.IntervalWindow) Test(org.junit.Test)

Example 82 with IntervalWindow

use of org.apache.beam.sdk.transforms.windowing.IntervalWindow in project beam by apache.

the class ReduceFnRunnerTest method testPaneInfoAllStatesAfterWatermarkAccumulating.

@Test
public void testPaneInfoAllStatesAfterWatermarkAccumulating() throws Exception {
    ReduceFnTester<Integer, Iterable<Integer>, IntervalWindow> tester =
        ReduceFnTester.nonCombining(
            WindowingStrategy.of(FixedWindows.of(Duration.millis(10)))
                .withTrigger(Repeatedly.forever(AfterFirst.of(AfterPane.elementCountAtLeast(2), AfterWatermark.pastEndOfWindow())))
                .withMode(AccumulationMode.ACCUMULATING_FIRED_PANES)
                .withAllowedLateness(Duration.millis(100))
                .withTimestampCombiner(TimestampCombiner.EARLIEST)
                .withClosingBehavior(ClosingBehavior.FIRE_ALWAYS));
    tester.advanceInputWatermark(new Instant(0));
    tester.injectElements(TimestampedValue.of(1, new Instant(1)), TimestampedValue.of(2, new Instant(2)));
    List<WindowedValue<Iterable<Integer>>> output = tester.extractOutput();
    assertThat(output, contains(WindowMatchers.valueWithPaneInfo(PaneInfo.createPane(true, false, Timing.EARLY, 0, -1))));
    assertThat(output, contains(isSingleWindowedValue(containsInAnyOrder(1, 2), 1, 0, 10)));
    tester.advanceInputWatermark(new Instant(50));
    // We should get the ON_TIME pane even though it is empty,
    // because we have an AfterWatermark.pastEndOfWindow() trigger.
    output = tester.extractOutput();
    assertThat(output, contains(WindowMatchers.valueWithPaneInfo(PaneInfo.createPane(false, false, Timing.ON_TIME, 1, 0))));
    assertThat(output, contains(isSingleWindowedValue(containsInAnyOrder(1, 2), 9, 0, 10)));
    // We should get the final pane even though it is empty.
    tester.advanceInputWatermark(new Instant(150));
    output = tester.extractOutput();
    assertThat(output, contains(WindowMatchers.valueWithPaneInfo(PaneInfo.createPane(false, true, Timing.LATE, 2, 1))));
    assertThat(output, contains(isSingleWindowedValue(containsInAnyOrder(1, 2), 9, 0, 10)));
}
Also used : Matchers.emptyIterable(org.hamcrest.Matchers.emptyIterable) WindowedValue(org.apache.beam.sdk.util.WindowedValue) WindowMatchers.isWindowedValue(org.apache.beam.runners.core.WindowMatchers.isWindowedValue) WindowMatchers.isSingleWindowedValue(org.apache.beam.runners.core.WindowMatchers.isSingleWindowedValue) Instant(org.joda.time.Instant) IntervalWindow(org.apache.beam.sdk.transforms.windowing.IntervalWindow) Test(org.junit.Test)
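
The three panes asserted in this test follow a regular numbering pattern in the last two arguments of PaneInfo.createPane: the pane index counts every firing, while the non-speculative index stays at -1 for EARLY panes and counts only ON_TIME and LATE firings. The following is a minimal plain-Java sketch of that numbering, not Beam code; PaneIndexSketch, Pane, and number are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of pane numbering; it does not use or reproduce Beam's PaneInfo class.
final class PaneIndexSketch {
    enum Timing { EARLY, ON_TIME, LATE }

    /** Stand-in for the five values the tests pass to PaneInfo.createPane. */
    static final class Pane {
        final boolean isFirst;
        final boolean isLast;
        final Timing timing;
        final long index;
        final long nonSpeculativeIndex;

        Pane(boolean isFirst, boolean isLast, Timing timing, long index, long nonSpeculativeIndex) {
            this.isFirst = isFirst;
            this.isLast = isLast;
            this.timing = timing;
            this.index = index;
            this.nonSpeculativeIndex = nonSpeculativeIndex;
        }

        @Override
        public String toString() {
            return "Pane(" + isFirst + ", " + isLast + ", " + timing + ", "
                + index + ", " + nonSpeculativeIndex + ")";
        }
    }

    /** Numbers a complete sequence of firings for one window, in firing order. */
    static List<Pane> number(List<Timing> timings) {
        List<Pane> panes = new ArrayList<>();
        long nonSpeculative = -1; // no ON_TIME or LATE pane seen yet
        for (int i = 0; i < timings.size(); i++) {
            Timing timing = timings.get(i);
            if (timing != Timing.EARLY) {
                nonSpeculative++; // only non-early panes advance this counter
            }
            long nsi = (timing == Timing.EARLY) ? -1 : nonSpeculative;
            panes.add(new Pane(i == 0, i == timings.size() - 1, timing, i, nsi));
        }
        return panes;
    }

    public static void main(String[] args) {
        for (Pane pane : number(Arrays.asList(Timing.EARLY, Timing.ON_TIME, Timing.LATE))) {
            System.out.println(pane);
        }
    }
}
```

For the sequence EARLY, ON_TIME, LATE the sketch yields (true, false, EARLY, 0, -1), (false, false, ON_TIME, 1, 0) and (false, true, LATE, 2, 1), which are the same tuples the test above expects.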

Example 83 with IntervalWindow

use of org.apache.beam.sdk.transforms.windowing.IntervalWindow in project beam by apache.

the class ReduceFnRunnerTest method dontSetHoldIfTooLateForEndOfWindowTimer.

/**
 * Make sure that if data comes in too late to make it on time, the hold is the GC time.
 */
@Test
public void dontSetHoldIfTooLateForEndOfWindowTimer() throws Exception {
    ReduceFnTester<Integer, Iterable<Integer>, IntervalWindow> tester = ReduceFnTester.nonCombining(FixedWindows.of(Duration.millis(10)), mockTriggerStateMachine, AccumulationMode.ACCUMULATING_FIRED_PANES, Duration.millis(10), ClosingBehavior.FIRE_ALWAYS);
    tester.setAutoAdvanceOutputWatermark(false);
    // Case: Unobservably "late" relative to input watermark, but on time for output watermark
    tester.advanceInputWatermark(new Instant(15));
    tester.advanceOutputWatermark(new Instant(11));
    IntervalWindow expectedWindow = new IntervalWindow(new Instant(10), new Instant(20));
    injectElement(tester, 14);
    // Hold was applied, waiting for end-of-window timer.
    assertEquals(new Instant(14), tester.getWatermarkHold());
    // Trigger the end-of-window timer, fire a timer as though the mock trigger set it
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(true);
    tester.advanceInputWatermark(new Instant(20));
    tester.fireTimer(expectedWindow, expectedWindow.maxTimestamp(), TimeDomain.EVENT_TIME);
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(false);
    // Hold has been replaced with garbage collection hold. Waiting for garbage collection.
    assertEquals(new Instant(29), tester.getWatermarkHold());
    assertEquals(new Instant(29), tester.getNextTimer(TimeDomain.EVENT_TIME));
    // Case: Maybe late 1
    injectElement(tester, 13);
    // No change to hold or timers.
    assertEquals(new Instant(29), tester.getWatermarkHold());
    assertEquals(new Instant(29), tester.getNextTimer(TimeDomain.EVENT_TIME));
    // Trigger the garbage collection timer.
    tester.advanceInputWatermark(new Instant(30));
    // Everything should be cleaned up.
    assertFalse(tester.isMarkedFinished(new IntervalWindow(new Instant(10), new Instant(20))));
    tester.assertHasOnlyGlobalAndFinishedSetsFor();
}
Also used : Matchers.emptyIterable(org.hamcrest.Matchers.emptyIterable) Instant(org.joda.time.Instant) IntervalWindow(org.apache.beam.sdk.transforms.windowing.IntervalWindow) Test(org.junit.Test)
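
The constants in this test (window [10, 20) for an element at 14, and a garbage collection hold at 29) can be reproduced with simple arithmetic. The sketch below is plain Java with no Beam dependency and assumes fixed windows with zero offset; FixedWindowSketch and its methods are hypothetical helpers, and the GC time formula (max timestamp plus allowed lateness) is taken from the expression expectedWindow.maxTimestamp().plus(allowedLateness) used in these tests.

```java
// Hypothetical arithmetic model of fixed windows; assumes a zero window offset.
final class FixedWindowSketch {
    /** Start (inclusive) of the fixed window of the given size that contains the timestamp. */
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    /** End (exclusive) of that window. */
    static long windowEnd(long timestampMillis, long sizeMillis) {
        return windowStart(timestampMillis, sizeMillis) + sizeMillis;
    }

    /** Largest timestamp inside the window: one millisecond before the exclusive end. */
    static long maxTimestamp(long windowEndMillis) {
        return windowEndMillis - 1;
    }

    /** Garbage collection time: the window's max timestamp plus the allowed lateness. */
    static long gcTime(long windowEndMillis, long allowedLatenessMillis) {
        return maxTimestamp(windowEndMillis) + allowedLatenessMillis;
    }

    public static void main(String[] args) {
        long start = windowStart(14, 10);
        long end = windowEnd(14, 10);
        System.out.println("[" + start + ", " + end + ")"); // [10, 20)
        System.out.println(gcTime(end, 10));                // 29
    }
}
```

With a 10 ms window and 10 ms allowed lateness, the element at 14 lands in [10, 20), whose max timestamp is 19, so the GC time is 29, matching the hold and timer values asserted above.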

Example 84 with IntervalWindow

use of org.apache.beam.sdk.transforms.windowing.IntervalWindow in project beam by apache.

the class ReduceFnRunnerTest method testPaneInfoSkipToFinish.

@Test
public void testPaneInfoSkipToFinish() throws Exception {
    ReduceFnTester<Integer, Iterable<Integer>, IntervalWindow> tester = ReduceFnTester.nonCombining(FixedWindows.of(Duration.millis(10)), mockTriggerStateMachine, AccumulationMode.DISCARDING_FIRED_PANES, Duration.millis(100), ClosingBehavior.FIRE_IF_NON_EMPTY);
    tester.advanceInputWatermark(new Instant(0));
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(true);
    triggerShouldFinish(mockTriggerStateMachine);
    injectElement(tester, 1);
    assertThat(tester.extractOutput(), contains(WindowMatchers.valueWithPaneInfo(PaneInfo.createPane(true, true, Timing.EARLY))));
}
Also used : Matchers.emptyIterable(org.hamcrest.Matchers.emptyIterable) Instant(org.joda.time.Instant) IntervalWindow(org.apache.beam.sdk.transforms.windowing.IntervalWindow) Test(org.junit.Test)

Example 85 with IntervalWindow

use of org.apache.beam.sdk.transforms.windowing.IntervalWindow in project beam by apache.

the class ReduceFnRunnerTest method testWatermarkHoldAndLateData.

@Test
public void testWatermarkHoldAndLateData() throws Exception {
    MetricsContainerImpl container = new MetricsContainerImpl("any");
    MetricsEnvironment.setCurrentContainer(container);
    // Test handling of late data. Specifically, ensure the watermark hold is correct.
    Duration allowedLateness = Duration.millis(10);
    ReduceFnTester<Integer, Iterable<Integer>, IntervalWindow> tester = ReduceFnTester.nonCombining(FixedWindows.of(Duration.millis(10)), mockTriggerStateMachine, AccumulationMode.ACCUMULATING_FIRED_PANES, allowedLateness, ClosingBehavior.FIRE_IF_NON_EMPTY);
    // Input watermark -> null
    assertEquals(null, tester.getWatermarkHold());
    assertEquals(null, tester.getOutputWatermark());
    // All on time data, verify watermark hold.
    IntervalWindow expectedWindow = new IntervalWindow(new Instant(0), new Instant(10));
    injectElement(tester, 1);
    injectElement(tester, 3);
    assertEquals(new Instant(1), tester.getWatermarkHold());
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(true);
    injectElement(tester, 2);
    List<WindowedValue<Iterable<Integer>>> output = tester.extractOutput();
    assertThat(output, contains(isSingleWindowedValue(containsInAnyOrder(1, 2, 3), equalTo(new Instant(1)), equalTo((BoundedWindow) expectedWindow))));
    assertThat(output.get(0).getPane(), equalTo(PaneInfo.createPane(true, false, Timing.EARLY, 0, -1)));
    // There is no end-of-window hold, but the timer set by the trigger holds the watermark
    assertThat(tester.getWatermarkHold(), nullValue());
    // Nothing dropped.
    long droppedElements = container.getCounter(MetricName.named(ReduceFnRunner.class, ReduceFnRunner.DROPPED_DUE_TO_CLOSED_WINDOW)).getCumulative();
    assertEquals(0, droppedElements);
    // Input watermark -> 4, output watermark should advance that far as well
    tester.advanceInputWatermark(new Instant(4));
    assertEquals(new Instant(4), tester.getOutputWatermark());
    // Some late, some on time. Verify that we only hold to the minimum of on-time.
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(false);
    tester.advanceInputWatermark(new Instant(4));
    injectElement(tester, 2);
    injectElement(tester, 3);
    // Late data has arrived behind the _output_ watermark. The ReduceFnRunner sets a GC hold
    // since this data is not permitted to hold up the output watermark.
    assertThat(tester.getWatermarkHold(), equalTo(expectedWindow.maxTimestamp().plus(allowedLateness)));
    // Now data just ahead of the output watermark arrives and sets an earlier "element" hold
    injectElement(tester, 5);
    assertEquals(new Instant(5), tester.getWatermarkHold());
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(true);
    injectElement(tester, 4);
    output = tester.extractOutput();
    assertThat(output, contains(isSingleWindowedValue(containsInAnyOrder(
        1, 2, 3, // earlier firing
        2, 3, 4, 5), // new elements
        4, // timestamp
        0, // window start
        10))); // window end
    assertThat(output.get(0).getPane(), equalTo(PaneInfo.createPane(false, false, Timing.EARLY, 1, -1)));
    // Since the element hold is cleared, there is no hold remaining
    assertThat(tester.getWatermarkHold(), nullValue());
    // All behind the output watermark -- hold is at GC time (if we imagine the
    // trigger sets a timer for ON_TIME firing, that is actually when they'll be emitted)
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(false);
    tester.advanceInputWatermark(new Instant(8));
    injectElement(tester, 6);
    injectElement(tester, 5);
    assertThat(tester.getWatermarkHold(), equalTo(expectedWindow.maxTimestamp().plus(allowedLateness)));
    injectElement(tester, 4);
    // Fire the ON_TIME pane
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(true);
    // To get an ON_TIME pane, we need the output watermark to be held back a little; this would
    // be done by way of the timers set by the trigger, which are mocked here
    tester.setAutoAdvanceOutputWatermark(false);
    tester.advanceInputWatermark(expectedWindow.maxTimestamp().plus(Duration.millis(1)));
    tester.fireTimer(expectedWindow, expectedWindow.maxTimestamp(), TimeDomain.EVENT_TIME);
    // Output time is end of the window, because all the new data was late, but the pane
    // is the ON_TIME pane.
    output = tester.extractOutput();
    assertThat(output, contains(isSingleWindowedValue(containsInAnyOrder(
        1, 2, 3, // earlier firing
        2, 3, 4, 5, // earlier firing
        4, 5, 6), // new elements
        9, // timestamp
        0, // window start
        10))); // window end
    assertThat(output.get(0).getPane(), equalTo(PaneInfo.createPane(false, false, Timing.ON_TIME, 2, 0)));
    tester.setAutoAdvanceOutputWatermark(true);
    // This is "pending" at the time the watermark makes it way-late.
    // Because we're about to expire the window, we output it.
    when(mockTriggerStateMachine.shouldFire(anyTriggerContext())).thenReturn(false);
    injectElement(tester, 8);
    droppedElements = container.getCounter(MetricName.named(ReduceFnRunner.class, ReduceFnRunner.DROPPED_DUE_TO_CLOSED_WINDOW)).getCumulative();
    assertEquals(0, droppedElements);
    // Exceed the GC limit, triggering the last pane to be fired
    tester.advanceInputWatermark(new Instant(50));
    output = tester.extractOutput();
    // Output time is still end of the window, because the new data (8) was behind
    // the output watermark.
    assertThat(output, contains(isSingleWindowedValue(containsInAnyOrder(
        1, 2, 3, // earlier firing
        2, 3, 4, 5, // earlier firing
        4, 5, 6, // earlier firing
        8), // new element prior to window becoming expired
        9, // timestamp
        0, // window start
        10))); // window end
    assertThat(output.get(0).getPane(), equalTo(PaneInfo.createPane(false, true, Timing.LATE, 3, 1)));
    assertEquals(new Instant(50), tester.getOutputWatermark());
    assertEquals(null, tester.getWatermarkHold());
    // Late timers are ignored
    tester.fireTimer(new IntervalWindow(new Instant(0), new Instant(10)), new Instant(12), TimeDomain.EVENT_TIME);
    // And because we're past the end of window + allowed lateness, everything should be cleaned up.
    assertFalse(tester.isMarkedFinished(firstWindow));
    tester.assertHasOnlyGlobalAndFinishedSetsFor();
}
Also used : MetricsContainerImpl(org.apache.beam.runners.core.metrics.MetricsContainerImpl) Matchers.emptyIterable(org.hamcrest.Matchers.emptyIterable) WindowedValue(org.apache.beam.sdk.util.WindowedValue) WindowMatchers.isWindowedValue(org.apache.beam.runners.core.WindowMatchers.isWindowedValue) WindowMatchers.isSingleWindowedValue(org.apache.beam.runners.core.WindowMatchers.isSingleWindowedValue) Instant(org.joda.time.Instant) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) Duration(org.joda.time.Duration) IntervalWindow(org.apache.beam.sdk.transforms.windowing.IntervalWindow) Test(org.junit.Test)
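
Each pane in this test contains every element from the earlier firings plus the new ones, which is the visible effect of AccumulationMode.ACCUMULATING_FIRED_PANES. As a rough illustration only, the following plain-Java sketch contrasts that with discarding behavior; AccumulationSketch is a hypothetical class and deliberately ignores watermarks, triggers, and state handling in the real ReduceFnRunner.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical buffer model: accumulating keeps fired elements, discarding drops them.
final class AccumulationSketch {
    private final boolean accumulating;
    private final List<Integer> buffer = new ArrayList<>();

    AccumulationSketch(boolean accumulating) {
        this.accumulating = accumulating;
    }

    /** Buffers one element for the window. */
    void add(int value) {
        buffer.add(value);
    }

    /** Emits a pane; in discarding mode the buffered elements are cleared afterwards. */
    List<Integer> fire() {
        List<Integer> pane = new ArrayList<>(buffer);
        if (!accumulating) {
            buffer.clear();
        }
        return pane;
    }

    public static void main(String[] args) {
        AccumulationSketch acc = new AccumulationSketch(true);
        acc.add(1); acc.add(3); acc.add(2);
        System.out.println(acc.fire()); // [1, 3, 2]
        acc.add(2); acc.add(3); acc.add(5); acc.add(4);
        System.out.println(acc.fire()); // [1, 3, 2, 2, 3, 5, 4]
    }
}
```

The second accumulating pane holds the same multiset (1, 2, 3, 2, 3, 4, 5) that the test asserts with containsInAnyOrder. Constructing the sketch with false instead would make the second pane contain only the four new elements.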

Aggregations

IntervalWindow (org.apache.beam.sdk.transforms.windowing.IntervalWindow): 238
Test (org.junit.Test): 214
Instant (org.joda.time.Instant): 213
WindowedValue (org.apache.beam.sdk.util.WindowedValue): 67
BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow): 56
KV (org.apache.beam.sdk.values.KV): 56
Duration (org.joda.time.Duration): 33
Matchers.emptyIterable (org.hamcrest.Matchers.emptyIterable): 32
WindowMatchers.isSingleWindowedValue (org.apache.beam.runners.core.WindowMatchers.isSingleWindowedValue): 20
WindowMatchers.isWindowedValue (org.apache.beam.runners.core.WindowMatchers.isWindowedValue): 20
ArrayList (java.util.ArrayList): 16
TupleTag (org.apache.beam.sdk.values.TupleTag): 16
HashMap (java.util.HashMap): 14
PCollectionView (org.apache.beam.sdk.values.PCollectionView): 14
Category (org.junit.experimental.categories.Category): 13
MetricsContainerImpl (org.apache.beam.runners.core.metrics.MetricsContainerImpl): 12
FixedWindows (org.apache.beam.sdk.transforms.windowing.FixedWindows): 12
ByteBuffer (java.nio.ByteBuffer): 11
Map (java.util.Map): 11
StringUtf8Coder (org.apache.beam.sdk.coders.StringUtf8Coder): 11