Example 36 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

Class FlinkStatefulDoFnFunction, method fireTimer.

private void fireTimer(TimerInternals.TimerData timer, DoFnRunner<KV<K, V>, OutputT> doFnRunner) {
    StateNamespace namespace = timer.getNamespace();
    checkArgument(namespace instanceof StateNamespaces.WindowNamespace);
    BoundedWindow window = ((StateNamespaces.WindowNamespace) namespace).getWindow();
    doFnRunner.onTimer(timer.getTimerId(), window, timer.getTimestamp(), timer.getDomain());
}
Also used : StateNamespaces(org.apache.beam.runners.core.StateNamespaces) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) StateNamespace(org.apache.beam.runners.core.StateNamespace)

Example 37 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

Class HashingFlinkCombineRunner, method collectWindows.

private Set<W> collectWindows(Iterable<WindowedValue<KV<K, InputT>>> values) {
    Set<W> windows = new HashSet<>();
    for (WindowedValue<?> value : values) {
        for (BoundedWindow untypedWindow : value.getWindows()) {
            @SuppressWarnings("unchecked") W window = (W) untypedWindow;
            windows.add(window);
        }
    }
    return windows;
}
Also used : BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) HashSet(java.util.HashSet)

Example 38 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

Class HashingFlinkCombineRunner, method combine.

@Override
public void combine(FlinkCombiner<K, InputT, AccumT, OutputT> flinkCombiner, WindowingStrategy<Object, W> windowingStrategy, SideInputReader sideInputReader, PipelineOptions options, Iterable<WindowedValue<KV<K, InputT>>> elements, Collector<WindowedValue<KV<K, OutputT>>> out) throws Exception {
    @SuppressWarnings("unchecked") TimestampCombiner timestampCombiner = windowingStrategy.getTimestampCombiner();
    WindowFn<Object, W> windowFn = windowingStrategy.getWindowFn();
    // Flink Iterable can be iterated over only once.
    List<WindowedValue<KV<K, InputT>>> inputs = new ArrayList<>();
    Iterables.addAll(inputs, elements);
    Set<W> windows = collectWindows(inputs);
    Map<W, W> windowToMergeResult = mergeWindows(windowingStrategy, windows);
    // Combine all windowedValues into map
    Map<W, Tuple2<AccumT, Instant>> mapState = new HashMap<>();
    Iterator<WindowedValue<KV<K, InputT>>> iterator = inputs.iterator();
    WindowedValue<KV<K, InputT>> currentValue = iterator.next();
    K key = currentValue.getValue().getKey();
    do {
        for (BoundedWindow w : currentValue.getWindows()) {
            @SuppressWarnings("unchecked") W currentWindow = (W) w;
            W mergedWindow = windowToMergeResult.get(currentWindow);
            mergedWindow = mergedWindow == null ? currentWindow : mergedWindow;
            Set<W> singletonW = Collections.singleton(mergedWindow);
            Tuple2<AccumT, Instant> accumAndInstant = mapState.get(mergedWindow);
            if (accumAndInstant == null) {
                AccumT accumT = flinkCombiner.firstInput(key, currentValue.getValue().getValue(), options, sideInputReader, singletonW);
                Instant windowTimestamp = timestampCombiner.assign(mergedWindow, windowFn.getOutputTime(currentValue.getTimestamp(), mergedWindow));
                accumAndInstant = new Tuple2<>(accumT, windowTimestamp);
                mapState.put(mergedWindow, accumAndInstant);
            } else {
                accumAndInstant.f0 = flinkCombiner.addInput(key, accumAndInstant.f0, currentValue.getValue().getValue(), options, sideInputReader, singletonW);
                accumAndInstant.f1 = timestampCombiner.combine(accumAndInstant.f1, timestampCombiner.assign(mergedWindow, windowingStrategy.getWindowFn().getOutputTime(currentValue.getTimestamp(), mergedWindow)));
            }
        }
        if (iterator.hasNext()) {
            currentValue = iterator.next();
        } else {
            break;
        }
    } while (true);
    // Output the final value of combiners
    for (Map.Entry<W, Tuple2<AccumT, Instant>> entry : mapState.entrySet()) {
        AccumT accumulator = entry.getValue().f0;
        Instant windowTimestamp = entry.getValue().f1;
        out.collect(WindowedValue.of(KV.of(key, flinkCombiner.extractOutput(key, accumulator, options, sideInputReader, Collections.singleton(entry.getKey()))), windowTimestamp, entry.getKey(), PaneInfo.NO_FIRING));
    }
}
Also used : HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) WindowedValue(org.apache.beam.sdk.util.WindowedValue) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) TimestampCombiner(org.apache.beam.sdk.transforms.windowing.TimestampCombiner) Instant(org.joda.time.Instant) KV(org.apache.beam.sdk.values.KV) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Map(java.util.Map)
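
The combine method above looks up each element's window in a merge-result map, falls back to the window itself when no merge applies, and then folds the element into a per-window accumulator. The following is a minimal sketch of that lookup-then-accumulate pattern in plain Java, without any Beam or Flink dependency. The class MergedWindowCombineSketch, the method sumPerMergedWindow, the string window names, and the integer sum are all hypothetical stand-ins chosen for illustration; they are not part of the Beam API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergedWindowCombineSketch {

    // Hypothetical helper: each element is a (window name, value) pair.
    // Windows are plain strings here; Beam uses BoundedWindow subclasses.
    static Map<String, Integer> sumPerMergedWindow(
            Iterable<Map.Entry<String, Integer>> elements,
            Map<String, String> windowToMergeResult) {
        Map<String, Integer> accumulators = new HashMap<>();
        for (Map.Entry<String, Integer> element : elements) {
            String window = element.getKey();
            // Fall back to the original window when it was not merged.
            String merged = windowToMergeResult.getOrDefault(window, window);
            // First input creates the accumulator; later inputs are added to it.
            accumulators.merge(merged, element.getValue(), Integer::sum);
        }
        return accumulators;
    }

    public static void main(String[] args) {
        Map<String, String> merges = new HashMap<>();
        merges.put("w1", "w1+w2");
        merges.put("w2", "w1+w2");

        List<Map.Entry<String, Integer>> elements = new ArrayList<>();
        elements.add(new SimpleEntry<>("w1", 1));
        elements.add(new SimpleEntry<>("w2", 2));
        elements.add(new SimpleEntry<>("w3", 5));

        System.out.println(sumPerMergedWindow(elements, merges));
    }
}
```

In this sketch Map.merge plays the role that the firstInput/addInput branch plays in the original method: it inserts the value on first sight of a window and combines it with the existing accumulator afterwards.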

Example 39 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

Class SideInputInitializer, method initializeBroadcastVariable.

@Override
public Map<BoundedWindow, ViewT> initializeBroadcastVariable(Iterable<WindowedValue<ElemT>> inputValues) {
    // first partition into windows
    Map<BoundedWindow, List<WindowedValue<ElemT>>> partitionedElements = new HashMap<>();
    for (WindowedValue<ElemT> value : inputValues) {
        for (BoundedWindow window : value.getWindows()) {
            List<WindowedValue<ElemT>> windowedValues = partitionedElements.get(window);
            if (windowedValues == null) {
                windowedValues = new ArrayList<>();
                partitionedElements.put(window, windowedValues);
            }
            windowedValues.add(value);
        }
    }
    Map<BoundedWindow, ViewT> resultMap = new HashMap<>();
    for (Map.Entry<BoundedWindow, List<WindowedValue<ElemT>>> elements : partitionedElements.entrySet()) {
        @SuppressWarnings("unchecked") Iterable<WindowedValue<?>> elementsIterable = (List<WindowedValue<?>>) (List<?>) elements.getValue();
        resultMap.put(elements.getKey(), view.getViewFn().apply(elementsIterable));
    }
    return resultMap;
}
Also used : HashMap(java.util.HashMap) WindowedValue(org.apache.beam.sdk.util.WindowedValue) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) List(java.util.List) ArrayList(java.util.ArrayList) Map(java.util.Map)
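
The "first partition into windows" step in the method above groups every value under each window it belongs to, so a value assigned to several windows appears in several lists. Below is a minimal sketch of that grouping in plain Java, using Map.computeIfAbsent in place of the explicit null check. The class WindowPartitionSketch, the nested Windowed type, and the string window names are hypothetical stand-ins for illustration only; they are not Beam types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowPartitionSketch {

    // Hypothetical stand-in for a windowed value: a payload plus its windows.
    static final class Windowed<T> {
        final T value;
        final List<String> windows;

        Windowed(T value, List<String> windows) {
            this.value = value;
            this.windows = windows;
        }
    }

    // Groups payloads by window; a payload in N windows lands in N lists.
    static <T> Map<String, List<T>> partitionByWindow(Iterable<Windowed<T>> inputs) {
        Map<String, List<T>> partitioned = new HashMap<>();
        for (Windowed<T> input : inputs) {
            for (String window : input.windows) {
                // Create the per-window list on first use, then append.
                partitioned.computeIfAbsent(window, w -> new ArrayList<>()).add(input.value);
            }
        }
        return partitioned;
    }

    public static void main(String[] args) {
        List<Windowed<String>> inputs = new ArrayList<>();
        inputs.add(new Windowed<String>("a", Arrays.asList("w1", "w2")));
        inputs.add(new Windowed<String>("b", Arrays.asList("w2")));

        System.out.println(partitionByWindow(inputs));
    }
}
```

The original method goes one step further and applies a view function to each per-window list; the sketch stops at the grouping step.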

Example 40 with BoundedWindow

Use of org.apache.beam.sdk.transforms.windowing.BoundedWindow in the Apache Beam project.

Class WatermarkManagerTest, method multiWindowedBundle.

@SafeVarargs
private final <T> CommittedBundle<T> multiWindowedBundle(PCollection<T> pc, T... values) {
    UncommittedBundle<T> bundle = bundleFactory.createBundle(pc);
    Collection<BoundedWindow> windows = ImmutableList.of(GlobalWindow.INSTANCE, new IntervalWindow(BoundedWindow.TIMESTAMP_MIN_VALUE, new Instant(0)));
    for (T value : values) {
        bundle.add(WindowedValue.of(value, BoundedWindow.TIMESTAMP_MIN_VALUE, windows, PaneInfo.NO_FIRING));
    }
    return bundle.commit(BoundedWindow.TIMESTAMP_MAX_VALUE);
}
Also used : ReadableInstant(org.joda.time.ReadableInstant) Instant(org.joda.time.Instant) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) IntervalWindow(org.apache.beam.sdk.transforms.windowing.IntervalWindow)

Aggregations

BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow) 54
Instant (org.joda.time.Instant) 27
Test (org.junit.Test) 26
IntervalWindow (org.apache.beam.sdk.transforms.windowing.IntervalWindow) 21
KV (org.apache.beam.sdk.values.KV) 20
WindowedValue (org.apache.beam.sdk.util.WindowedValue) 14
ArrayList (java.util.ArrayList) 7
TimerSpec (org.apache.beam.sdk.state.TimerSpec) 7
Timer (org.apache.beam.sdk.state.Timer) 6
Matchers.containsString (org.hamcrest.Matchers.containsString) 6
DoFn (org.apache.beam.sdk.transforms.DoFn) 5
StringUtils.byteArrayToJsonString (org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString) 5
ImmutableList (com.google.common.collect.ImmutableList) 4
List (java.util.List) 4
ValueState (org.apache.beam.sdk.state.ValueState) 4
OnTimer (org.apache.beam.sdk.transforms.DoFn.OnTimer) 4
TimestampCombiner (org.apache.beam.sdk.transforms.windowing.TimestampCombiner) 4
PCollection (org.apache.beam.sdk.values.PCollection) 4
TupleTag (org.apache.beam.sdk.values.TupleTag) 4
Duration (org.joda.time.Duration) 4