Example 1 with ByteToWindowFunction

Use of org.apache.beam.runners.twister2.translators.functions.ByteToWindowFunction in project beam by apache.

From the class GroupByKeyTranslatorBatch, method translateNode.

@Override
public void translateNode(GroupByKey<K, V> transform, Twister2BatchTranslationContext context) {
    PCollection<KV<K, V>> input = context.getInput(transform);
    BatchTSetImpl<WindowedValue<KV<K, V>>> inputTTSet = context.getInputDataSet(input);
    final KvCoder<K, V> coder = (KvCoder<K, V>) input.getCoder();
    Coder<K> inputKeyCoder = coder.getKeyCoder();
    WindowingStrategy windowingStrategy = input.getWindowingStrategy();
    WindowFn<KV<K, V>, BoundedWindow> windowFn = (WindowFn<KV<K, V>, BoundedWindow>) windowingStrategy.getWindowFn();
    final WindowedValue.WindowedValueCoder<V> wvCoder = WindowedValue.FullWindowedValueCoder.of(coder.getValueCoder(), windowFn.windowCoder());
    KeyedTSet<byte[], byte[]> keyedTSet = inputTTSet.mapToTuple(new MapToTupleFunction<K, V>(inputKeyCoder, wvCoder));
    // TODO: add support for specifying a partition function; this would use
    // a keyedPartition function instead of keyedGather.
    ComputeTSet<KV<K, Iterable<WindowedValue<V>>>, Iterator<Tuple<byte[], Iterator<byte[]>>>> groupedbyKeyTset = keyedTSet.keyedGather().map(new ByteToWindowFunction(inputKeyCoder, wvCoder));
    // --- now group also by window.
    SystemReduceFnBuffering reduceFnBuffering = new SystemReduceFnBuffering(coder.getValueCoder());
    ComputeTSet<WindowedValue<KV<K, Iterable<V>>>, Iterable<KV<K, Iterator<WindowedValue<V>>>>> outputTset = groupedbyKeyTset.direct().<WindowedValue<KV<K, Iterable<V>>>>flatmap(new GroupByWindowFunction(windowingStrategy, reduceFnBuffering, context.getOptions()));
    PCollection output = context.getOutput(transform);
    context.setOutputDataSet(output, outputTset);
}
Also used : WindowFn(org.apache.beam.sdk.transforms.windowing.WindowFn) KvCoder(org.apache.beam.sdk.coders.KvCoder) KV(org.apache.beam.sdk.values.KV) SystemReduceFnBuffering(org.apache.beam.runners.twister2.translators.functions.internal.SystemReduceFnBuffering) WindowingStrategy(org.apache.beam.sdk.values.WindowingStrategy) PCollection(org.apache.beam.sdk.values.PCollection) ByteToWindowFunction(org.apache.beam.runners.twister2.translators.functions.ByteToWindowFunction) WindowedValue(org.apache.beam.sdk.util.WindowedValue) Iterator(java.util.Iterator) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) GroupByWindowFunction(org.apache.beam.runners.twister2.translators.functions.GroupByWindowFunction)
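
The translateNode method above serializes each key to bytes, gathers values by the serialized key, and then decodes the groups back into typed form. The following is a minimal plain-Java sketch of that encode, gather, decode pattern only. It does not use the Beam or Twister2 APIs; the class GroupByEncodedKeySketch and its methods encodeKey, decodeKey and groupByKey are hypothetical names introduced for illustration, and UTF-8 strings stand in for a real key coder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Beam or Twister2.
final class GroupByEncodedKeySketch {

    // Stand-in for a key coder's encode step (assumption: keys are Strings, encoded as UTF-8).
    static byte[] encodeKey(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // Stand-in for a key coder's decode step.
    static String decodeKey(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Groups values by their serialized key, then decodes the keys again.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> input) {
        // byte[] uses identity-based equals/hashCode, so the encoded key is wrapped
        // in a ByteBuffer, which compares by content.
        Map<ByteBuffer, List<Integer>> gathered = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> kv : input) {
            ByteBuffer encoded = ByteBuffer.wrap(encodeKey(kv.getKey()));
            gathered.computeIfAbsent(encoded, k -> new ArrayList<>()).add(kv.getValue());
        }

        // Decode each gathered group back to its typed key.
        Map<String, List<Integer>> decoded = new LinkedHashMap<>();
        for (Map.Entry<ByteBuffer, List<Integer>> group : gathered.entrySet()) {
            decoded.put(decodeKey(group.getKey().array()), group.getValue());
        }
        return decoded;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input =
            List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        System.out.println(groupByKey(input)); // prints {a=[1, 3], b=[2]}
    }
}
```

Grouping on encoded bytes means equal keys are decided by their serialized form rather than by the key type's own equals method, which is why the sketch wraps the byte array before using it as a map key.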

Example 2 with ByteToWindowFunction

Use of org.apache.beam.runners.twister2.translators.functions.ByteToWindowFunction in project twister2 by DSC-SPIDAL.

From the class GroupByKeyTranslatorBatch, method translateNode.

@Override
public void translateNode(GroupByKey<K, V> transform, Twister2BatchTranslationContext context) {
    PCollection<KV<K, V>> input = context.getInput(transform);
    BatchTSetImpl<WindowedValue<KV<K, V>>> inputTTSet = context.getInputDataSet(input);
    final KvCoder<K, V> coder = (KvCoder<K, V>) context.getInput(transform).getCoder();
    Coder<K> inputKeyCoder = ((KvCoder<K, V>) input.getCoder()).getKeyCoder();
    WindowingStrategy windowingStrategy = input.getWindowingStrategy();
    WindowFn<KV<K, V>, BoundedWindow> windowFn = (WindowFn<KV<K, V>, BoundedWindow>) windowingStrategy.getWindowFn();
    final WindowedValue.WindowedValueCoder<V> wvCoder = WindowedValue.FullWindowedValueCoder.of(coder.getValueCoder(), windowFn.windowCoder());
    KeyedTSet<byte[], byte[]> keyedTSet = inputTTSet.mapToTuple(new MapToTupleFunction<K, V>(inputKeyCoder, wvCoder));
    // TODO: add support for specifying a partition function; this would use
    // a keyedPartition function instead of keyedGather.
    ComputeTSet<KV<K, Iterable<WindowedValue<V>>>> groupedbyKeyTset = keyedTSet.keyedGather().map(new ByteToWindowFunction(inputKeyCoder, wvCoder));
    // --- now group also by window.
    ComputeTSet<WindowedValue<KV<K, Iterable<V>>>> outputTset = groupedbyKeyTset.direct().<WindowedValue<KV<K, Iterable<V>>>>flatmap(new GroupByWindowFunction(windowingStrategy, SystemReduceFn.buffering(coder.getValueCoder())));
    PCollection output = context.getOutput(transform);
    context.setOutputDataSet(output, outputTset);
}
Also used : WindowFn(org.apache.beam.sdk.transforms.windowing.WindowFn) KvCoder(org.apache.beam.sdk.coders.KvCoder) KV(org.apache.beam.sdk.values.KV) WindowingStrategy(org.apache.beam.sdk.values.WindowingStrategy) PCollection(org.apache.beam.sdk.values.PCollection) ByteToWindowFunction(org.apache.beam.runners.twister2.translators.functions.ByteToWindowFunction) WindowedValue(org.apache.beam.sdk.util.WindowedValue) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) GroupByWindowFunction(org.apache.beam.runners.twister2.translators.functions.GroupByWindowFunction)

Aggregations

ByteToWindowFunction (org.apache.beam.runners.twister2.translators.functions.ByteToWindowFunction) 2
GroupByWindowFunction (org.apache.beam.runners.twister2.translators.functions.GroupByWindowFunction) 2
KvCoder (org.apache.beam.sdk.coders.KvCoder) 2
BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow) 2
WindowFn (org.apache.beam.sdk.transforms.windowing.WindowFn) 2
WindowedValue (org.apache.beam.sdk.util.WindowedValue) 2
KV (org.apache.beam.sdk.values.KV) 2
PCollection (org.apache.beam.sdk.values.PCollection) 2
WindowingStrategy (org.apache.beam.sdk.values.WindowingStrategy) 2
Iterator (java.util.Iterator) 1
SystemReduceFnBuffering (org.apache.beam.runners.twister2.translators.functions.internal.SystemReduceFnBuffering) 1