Example 16 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

From class WindowedValueTest, method testExplodeWindowsManyWindowsMultipleWindowedValues.

@Test
public void testExplodeWindowsManyWindowsMultipleWindowedValues() {
    Instant now = Instant.now();
    BoundedWindow centerWindow = new IntervalWindow(now.minus(1000L), now.plus(1000L));
    BoundedWindow pastWindow = new IntervalWindow(now.minus(1500L), now.plus(500L));
    BoundedWindow futureWindow = new IntervalWindow(now.minus(500L), now.plus(1500L));
    BoundedWindow futureFutureWindow = new IntervalWindow(now, now.plus(2000L));
    PaneInfo pane = PaneInfo.createPane(false, false, Timing.ON_TIME, 3L, 0L);
    WindowedValue<String> value = WindowedValue.of(
        "foo", now, ImmutableList.of(pastWindow, centerWindow, futureWindow, futureFutureWindow), pane);
    // Exploding must yield one single-window value per original window.
    assertThat(
        value.explodeWindows(),
        containsInAnyOrder(
            WindowedValue.of("foo", now, futureFutureWindow, pane),
            WindowedValue.of("foo", now, futureWindow, pane),
            WindowedValue.of("foo", now, centerWindow, pane),
            WindowedValue.of("foo", now, pastWindow, pane)));
}
Also used: Instant (org.joda.time.Instant), PaneInfo (org.apache.beam.sdk.transforms.windowing.PaneInfo), BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow), IntervalWindow (org.apache.beam.sdk.transforms.windowing.IntervalWindow), Test (org.junit.Test)
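The test above checks that a value assigned to several windows is "exploded" into one copy per window. A minimal sketch of that idea follows, using only the Java standard library. The class ExplodeSketch and its nested Windowed type are hypothetical stand-ins invented for illustration; they are not Beam's WindowedValue, and windows are reduced to plain string names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of the Beam API.
final class ExplodeSketch {

    // A value tagged with a timestamp and the names of the windows it belongs to.
    static final class Windowed {
        final String value;
        final long timestampMillis;
        final List<String> windows;

        Windowed(String value, long timestampMillis, List<String> windows) {
            this.value = value;
            this.timestampMillis = timestampMillis;
            this.windows = windows;
        }
    }

    // Returns one single-window copy of the input for each window it belongs to.
    static List<Windowed> explode(Windowed input) {
        List<Windowed> exploded = new ArrayList<>();
        for (String window : input.windows) {
            exploded.add(new Windowed(
                input.value, input.timestampMillis, Collections.singletonList(window)));
        }
        return exploded;
    }
}
```

Each copy keeps the original value and timestamp and differs only in its window, which is the property the assertion in the test relies on.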

Example 17 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

From class TriggerStateMachineTester, method injectElements.

public final void injectElements(Collection<TimestampedValue<InputT>> values) throws Exception {
    for (TimestampedValue<InputT> value : values) {
        WindowTracing.trace("TriggerTester.injectElements: {}", value);
    }
    List<WindowedValue<InputT>> windowedValues = Lists.newArrayListWithCapacity(values.size());
    for (TimestampedValue<InputT> input : values) {
        try {
            InputT value = input.getValue();
            Instant timestamp = input.getTimestamp();
            Collection<W> assignedWindows = windowFn.assignWindows(new TestAssignContext<W>(windowFn, value, timestamp, GlobalWindow.INSTANCE));
            for (W window : assignedWindows) {
                activeWindows.addActiveForTesting(window);
                // Today, triggers assume onTimer firing at the watermark time, whether or not they
                // explicitly set the timer themselves. So this tester must set it.
                timerInternals.setTimer(TimerData.of(windowNamespace(window), window.maxTimestamp(), TimeDomain.EVENT_TIME));
            }
            windowedValues.add(WindowedValue.of(value, timestamp, assignedWindows, PaneInfo.NO_FIRING));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    for (WindowedValue<InputT> windowedValue : windowedValues) {
        for (BoundedWindow untypedWindow : windowedValue.getWindows()) {
            // SDK is responsible for type safety
            @SuppressWarnings("unchecked")
            W window = mergeResult((W) untypedWindow);
            TriggerStateMachine.OnElementContext context = contextFactory.createOnElementContext(
                window,
                new TestTimers(windowNamespace(window)),
                windowedValue.getTimestamp(),
                executableTrigger,
                getFinishedSet(window));
            if (!context.trigger().isFinished()) {
                executableTrigger.invokeOnElement(context);
            }
        }
    }
}
Also used: Instant (org.joda.time.Instant), WindowedValue (org.apache.beam.sdk.util.WindowedValue), BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow)

Example 18 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

From class SideInputContainer, method indexValuesByWindow.

/**
   * Index the provided values by all {@link BoundedWindow windows} in which they appear.
   */
private Map<BoundedWindow, Collection<WindowedValue<?>>> indexValuesByWindow(Iterable<? extends WindowedValue<?>> values) {
    Map<BoundedWindow, Collection<WindowedValue<?>>> valuesPerWindow = new HashMap<>();
    for (WindowedValue<?> value : values) {
        for (BoundedWindow window : value.getWindows()) {
            Collection<WindowedValue<?>> windowValues = valuesPerWindow.get(window);
            if (windowValues == null) {
                windowValues = new ArrayList<>();
                valuesPerWindow.put(window, windowValues);
            }
            windowValues.add(value);
        }
    }
    return valuesPerWindow;
}
Also used: HashMap (java.util.HashMap), WindowedValue (org.apache.beam.sdk.util.WindowedValue), BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow), Collection (java.util.Collection)
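The get / null-check / put sequence in indexValuesByWindow is the classic multimap-building idiom. On Java 8 and later the same grouping can be written with Map.computeIfAbsent. The sketch below shows that variant on plain strings; the class WindowIndexSketch and its input shape (value name mapped to window names) are assumptions made for illustration and do not appear in Beam.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Beam API.
final class WindowIndexSketch {

    // Groups value names by every window they are listed under.
    // The input maps a value name to the names of the windows it belongs to.
    static Map<String, List<String>> indexByWindow(Map<String, List<String>> windowsPerValue) {
        Map<String, List<String>> valuesPerWindow = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : windowsPerValue.entrySet()) {
            for (String window : entry.getValue()) {
                // Create the bucket on first use, then append the value to it.
                valuesPerWindow
                    .computeIfAbsent(window, w -> new ArrayList<>())
                    .add(entry.getKey());
            }
        }
        return valuesPerWindow;
    }
}
```

A value that belongs to two windows appears in both buckets, just as in the original method.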

Example 19 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

From class EvaluationContextTest, method writeToViewWriterThenReadReads.

@Test
public void writeToViewWriterThenReadReads() {
    PCollectionViewWriter<Integer, Iterable<Integer>> viewWriter = context.createPCollectionViewWriter(PCollection.<Iterable<Integer>>createPrimitiveOutputInternal(p, WindowingStrategy.globalDefault(), IsBounded.BOUNDED), view);
    BoundedWindow window = new TestBoundedWindow(new Instant(1024L));
    BoundedWindow second = new TestBoundedWindow(new Instant(899999L));
    WindowedValue<Integer> firstValue = WindowedValue.of(1, new Instant(1222), window, PaneInfo.ON_TIME_AND_ONLY_FIRING);
    WindowedValue<Integer> secondValue = WindowedValue.of(2, new Instant(8766L), second, PaneInfo.createPane(true, false, Timing.ON_TIME, 0, 0));
    Iterable<WindowedValue<Integer>> values = ImmutableList.of(firstValue, secondValue);
    viewWriter.add(values);
    SideInputReader reader = context.createSideInputReader(ImmutableList.<PCollectionView<?>>of(view));
    assertThat(reader.get(view, window), containsInAnyOrder(1));
    assertThat(reader.get(view, second), containsInAnyOrder(2));
    WindowedValue<Integer> overwrittenSecondValue = WindowedValue.of(4444, new Instant(8677L), second, PaneInfo.createPane(false, true, Timing.LATE, 1, 1));
    viewWriter.add(Collections.singleton(overwrittenSecondValue));
    // The earlier reader still serves its cached value.
    assertThat(reader.get(view, second), containsInAnyOrder(2));
    // A newly created reader observes the overwritten value.
    reader = context.createSideInputReader(ImmutableList.<PCollectionView<?>>of(view));
    assertThat(reader.get(view, second), containsInAnyOrder(4444));
}
Also used: PCollectionView (org.apache.beam.sdk.values.PCollectionView), Matchers.emptyIterable (org.hamcrest.Matchers.emptyIterable), WindowedValue (org.apache.beam.sdk.util.WindowedValue), Instant (org.joda.time.Instant), BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow), SideInputReader (org.apache.beam.runners.core.SideInputReader), Test (org.junit.Test)

Example 20 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

From class SparkGlobalCombineFn, method createAccumulator.

private Iterable<WindowedValue<AccumT>> createAccumulator(WindowedValue<InputT> input) {
    // sort exploded inputs.
    Iterable<WindowedValue<InputT>> sortedInputs = sortByWindows(input.explodeWindows());
    TimestampCombiner timestampCombiner = windowingStrategy.getTimestampCombiner();
    WindowFn<?, BoundedWindow> windowFn = windowingStrategy.getWindowFn();
    // iterate over the inputs in window order.
    final Iterator<WindowedValue<InputT>> iterator = sortedInputs.iterator();
    WindowedValue<InputT> currentInput = iterator.next();
    BoundedWindow currentWindow = Iterables.getFirst(currentInput.getWindows(), null);
    // first create the accumulator and accumulate first input.
    AccumT accumulator = combineFn.createAccumulator(ctxtForInput(currentInput));
    accumulator = combineFn.addInput(accumulator, currentInput.getValue(), ctxtForInput(currentInput));
    // keep track of the timestamps assigned by the TimestampCombiner.
    Instant windowTimestamp = timestampCombiner.assign(currentWindow, windowingStrategy.getWindowFn().getOutputTime(currentInput.getTimestamp(), currentWindow));
    // accumulate the next windows, or output.
    List<WindowedValue<AccumT>> output = Lists.newArrayList();
    // if merging, merge overlapping windows, e.g. Sessions.
    final boolean merging = !windowingStrategy.getWindowFn().isNonMerging();
    while (iterator.hasNext()) {
        WindowedValue<InputT> nextValue = iterator.next();
        BoundedWindow nextWindow = Iterables.getOnlyElement(nextValue.getWindows());
        boolean mergingAndIntersecting = merging && isIntersecting((IntervalWindow) currentWindow, (IntervalWindow) nextWindow);
        if (mergingAndIntersecting || nextWindow.equals(currentWindow)) {
            if (mergingAndIntersecting) {
                // merge intersecting windows.
                currentWindow = merge((IntervalWindow) currentWindow, (IntervalWindow) nextWindow);
            }
            // keep accumulating and carry on ;-)
            accumulator = combineFn.addInput(accumulator, nextValue.getValue(), ctxtForInput(nextValue));
            windowTimestamp = timestampCombiner.merge(currentWindow, windowTimestamp, windowingStrategy.getWindowFn().getOutputTime(nextValue.getTimestamp(), currentWindow));
        } else {
            // moving to the next window, first add the current accumulation to output
            // and initialize the accumulator.
            output.add(WindowedValue.of(accumulator, windowTimestamp, currentWindow, PaneInfo.NO_FIRING));
            // re-init accumulator, window and timestamp.
            accumulator = combineFn.createAccumulator(ctxtForInput(nextValue));
            accumulator = combineFn.addInput(accumulator, nextValue.getValue(), ctxtForInput(nextValue));
            currentWindow = nextWindow;
            windowTimestamp = timestampCombiner.assign(currentWindow, windowFn.getOutputTime(nextValue.getTimestamp(), currentWindow));
        }
    }
    // add last accumulator to the output.
    output.add(WindowedValue.of(accumulator, windowTimestamp, currentWindow, PaneInfo.NO_FIRING));
    return output;
}
Also used: TimestampCombiner (org.apache.beam.sdk.transforms.windowing.TimestampCombiner), Instant (org.joda.time.Instant), WindowedValue (org.apache.beam.sdk.util.WindowedValue), BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow), IntervalWindow (org.apache.beam.sdk.transforms.windowing.IntervalWindow)
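createAccumulator walks its inputs in window order and, for merging window functions, folds each intersecting window into the current one before moving on. The helpers it calls (isIntersecting, merge) are not shown in the listing, so the sketch below is only one plausible reading of that single-pass merge. It assumes half-open intervals [start, end) that are already sorted by start; the class MergeWindowsSketch and its Interval type are hypothetical and are not Beam's IntervalWindow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Beam API.
final class MergeWindowsSketch {

    // Half-open interval [start, end) in milliseconds (an assumption of this sketch).
    static final class Interval {
        final long start;
        final long end;

        Interval(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    // True when the two half-open intervals share at least one instant.
    static boolean intersects(Interval a, Interval b) {
        return a.start < b.end && b.start < a.end;
    }

    // Smallest interval that covers both arguments.
    static Interval span(Interval a, Interval b) {
        return new Interval(Math.min(a.start, b.start), Math.max(a.end, b.end));
    }

    // Merges intersecting intervals in one pass; the input must be sorted by start.
    static List<Interval> mergeSorted(List<Interval> sorted) {
        List<Interval> merged = new ArrayList<>();
        if (sorted.isEmpty()) {
            return merged;
        }
        Interval current = sorted.get(0);
        for (int i = 1; i < sorted.size(); i++) {
            Interval next = sorted.get(i);
            if (intersects(current, next)) {
                // Keep growing the current interval.
                current = span(current, next);
            } else {
                // Emit the finished interval and start a new one.
                merged.add(current);
                current = next;
            }
        }
        merged.add(current);
        return merged;
    }
}
```

As in the original method, the last interval is emitted after the loop, because the loop only emits when it moves on to a non-intersecting window.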

Aggregations

BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow): 54
Instant (org.joda.time.Instant): 27
Test (org.junit.Test): 26
IntervalWindow (org.apache.beam.sdk.transforms.windowing.IntervalWindow): 21
KV (org.apache.beam.sdk.values.KV): 20
WindowedValue (org.apache.beam.sdk.util.WindowedValue): 14
ArrayList (java.util.ArrayList): 7
TimerSpec (org.apache.beam.sdk.state.TimerSpec): 7
Timer (org.apache.beam.sdk.state.Timer): 6
Matchers.containsString (org.hamcrest.Matchers.containsString): 6
DoFn (org.apache.beam.sdk.transforms.DoFn): 5
StringUtils.byteArrayToJsonString (org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString): 5
ImmutableList (com.google.common.collect.ImmutableList): 4
List (java.util.List): 4
ValueState (org.apache.beam.sdk.state.ValueState): 4
OnTimer (org.apache.beam.sdk.transforms.DoFn.OnTimer): 4
TimestampCombiner (org.apache.beam.sdk.transforms.windowing.TimestampCombiner): 4
PCollection (org.apache.beam.sdk.values.PCollection): 4
TupleTag (org.apache.beam.sdk.values.TupleTag): 4
Duration (org.joda.time.Duration): 4