
Example 1 with WindowFn

Use of org.apache.beam.sdk.transforms.windowing.WindowFn in the Apache Beam project.

Class StreamingTransformTranslator, method groupByKey.

private static <K, V, W extends BoundedWindow> TransformEvaluator<GroupByKey<K, V>> groupByKey() {
    return new TransformEvaluator<GroupByKey<K, V>>() {

        @Override
        public void evaluate(GroupByKey<K, V> transform, EvaluationContext context) {
            @SuppressWarnings("unchecked")
            UnboundedDataset<KV<K, V>> inputDataset = (UnboundedDataset<KV<K, V>>) context.borrowDataset(transform);
            List<Integer> streamSources = inputDataset.getStreamSources();
            JavaDStream<WindowedValue<KV<K, V>>> dStream = inputDataset.getDStream();
            @SuppressWarnings("unchecked")
            final KvCoder<K, V> coder = (KvCoder<K, V>) context.getInput(transform).getCoder();
            final SparkRuntimeContext runtimeContext = context.getRuntimeContext();
            @SuppressWarnings("unchecked")
            final WindowingStrategy<?, W> windowingStrategy = (WindowingStrategy<?, W>) context.getInput(transform).getWindowingStrategy();
            @SuppressWarnings("unchecked")
            final WindowFn<Object, W> windowFn = (WindowFn<Object, W>) windowingStrategy.getWindowFn();
            //--- coders.
            final WindowedValue.WindowedValueCoder<V> wvCoder = WindowedValue.FullWindowedValueCoder.of(coder.getValueCoder(), windowFn.windowCoder());
            //--- group by key only.
            JavaDStream<WindowedValue<KV<K, Iterable<WindowedValue<V>>>>> groupedByKeyStream = dStream.transform(new Function<JavaRDD<WindowedValue<KV<K, V>>>, JavaRDD<WindowedValue<KV<K, Iterable<WindowedValue<V>>>>>>() {

                @Override
                public JavaRDD<WindowedValue<KV<K, Iterable<WindowedValue<V>>>>> call(JavaRDD<WindowedValue<KV<K, V>>> rdd) throws Exception {
                    return GroupCombineFunctions.groupByKeyOnly(rdd, coder.getKeyCoder(), wvCoder);
                }
            });
            //--- now group also by window.
            JavaDStream<WindowedValue<KV<K, Iterable<V>>>> outStream = SparkGroupAlsoByWindowViaWindowSet.groupAlsoByWindow(groupedByKeyStream, coder.getKeyCoder(), wvCoder, windowingStrategy, runtimeContext, streamSources);
            context.putDataset(transform, new UnboundedDataset<>(outStream, streamSources));
        }

        @Override
        public String toNativeString() {
            return "groupByKey()";
        }
    };
}
Also used: GroupByKey(org.apache.beam.sdk.transforms.GroupByKey) WindowingStrategy(org.apache.beam.sdk.values.WindowingStrategy) KV(org.apache.beam.sdk.values.KV) WindowedValue(org.apache.beam.sdk.util.WindowedValue) SparkRuntimeContext(org.apache.beam.runners.spark.translation.SparkRuntimeContext) WindowFn(org.apache.beam.sdk.transforms.windowing.WindowFn) SparkAssignWindowFn(org.apache.beam.runners.spark.translation.SparkAssignWindowFn) KvCoder(org.apache.beam.sdk.coders.KvCoder) TransformEvaluator(org.apache.beam.runners.spark.translation.TransformEvaluator) JavaRDD(org.apache.spark.api.java.JavaRDD) EvaluationContext(org.apache.beam.runners.spark.translation.EvaluationContext)
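
The two comments in the method above ("group by key only", then "now group also by window") describe a two-phase grouping. The following is a minimal plain-Java sketch of that idea only, with no Beam or Spark dependency. The class TwoPhaseGroupingSketch, its Element type, and both method bodies are hypothetical illustrations; the real GroupCombineFunctions.groupByKeyOnly and SparkGroupAlsoByWindowViaWindowSet.groupAlsoByWindow also deal with coders, triggers, and streaming state, none of which is modelled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TwoPhaseGroupingSketch {

    /** Hypothetical element: a key, a value, and the start of its already-assigned window. */
    public static final class Element {
        public final String key;
        public final int value;
        public final long windowStart;

        public Element(String key, int value, long windowStart) {
            this.key = key;
            this.value = value;
            this.windowStart = windowStart;
        }
    }

    /** Phase 1: group by key only; each value keeps its window information. */
    public static Map<String, List<Element>> groupByKeyOnly(List<Element> input) {
        Map<String, List<Element>> byKey = new LinkedHashMap<>();
        for (Element e : input) {
            byKey.computeIfAbsent(e.key, k -> new ArrayList<>()).add(e);
        }
        return byKey;
    }

    /** Phase 2: within each key, split the values by the window they belong to. */
    public static Map<String, Map<Long, List<Integer>>> groupAlsoByWindow(
            Map<String, List<Element>> byKey) {
        Map<String, Map<Long, List<Integer>>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Element>> entry : byKey.entrySet()) {
            Map<Long, List<Integer>> byWindow = new LinkedHashMap<>();
            for (Element e : entry.getValue()) {
                byWindow.computeIfAbsent(e.windowStart, w -> new ArrayList<>()).add(e.value);
            }
            result.put(entry.getKey(), byWindow);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Element> input = Arrays.asList(
                new Element("a", 1, 0L),
                new Element("b", 2, 0L),
                new Element("a", 3, 60_000L),
                new Element("a", 4, 0L));
        // prints {a={0=[1, 4], 60000=[3]}, b={0=[2]}}
        System.out.println(groupAlsoByWindow(groupByKeyOnly(input)));
    }
}
```

Keeping the window attached to each value in phase 1 is what lets phase 2 run per key without looking at the original input again.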

Example 2 with WindowFn

Use of org.apache.beam.sdk.transforms.windowing.WindowFn in the Apache Beam project.

Class WindowIntoTranslationTest, method testToFromProto.

@Test
public void testToFromProto() throws InvalidProtocolBufferException {
    pipeline.apply(GenerateSequence.from(0)).apply(Window.<Long>into((WindowFn) windowFn));
    final AtomicReference<AppliedPTransform<?, ?, Assign<?>>> assign = new AtomicReference<>(null);
    pipeline.traverseTopologically(new PipelineVisitor.Defaults() {

        @Override
        public void visitPrimitiveTransform(Node node) {
            if (node.getTransform() instanceof Window.Assign) {
                checkState(assign.get() == null);
                assign.set((AppliedPTransform<?, ?, Assign<?>>) node.toAppliedPTransform(getPipeline()));
            }
        }
    });
    checkState(assign.get() != null);
    SdkComponents components = SdkComponents.create();
    WindowIntoPayload payload = WindowIntoTranslation.toProto(assign.get().getTransform(), components);
    assertEquals(windowFn, WindowIntoTranslation.getWindowFn(payload));
}
Also used: Window(org.apache.beam.sdk.transforms.windowing.Window) GlobalWindow(org.apache.beam.sdk.transforms.windowing.GlobalWindow) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) WindowIntoPayload(org.apache.beam.sdk.common.runner.v1.RunnerApi.WindowIntoPayload) AppliedPTransform(org.apache.beam.sdk.runners.AppliedPTransform) WindowFn(org.apache.beam.sdk.transforms.windowing.WindowFn) PartitioningWindowFn(org.apache.beam.sdk.transforms.windowing.PartitioningWindowFn) Node(org.apache.beam.sdk.runners.TransformHierarchy.Node) Assign(org.apache.beam.sdk.transforms.windowing.Window.Assign) PipelineVisitor(org.apache.beam.sdk.Pipeline.PipelineVisitor) AtomicReference(java.util.concurrent.atomic.AtomicReference) Test(org.junit.Test)

Example 3 with WindowFn

Use of org.apache.beam.sdk.transforms.windowing.WindowFn in the Apache Beam project.

Class SparkAssignWindowFn, method call.

@Override
@SuppressWarnings("unchecked")
public WindowedValue<T> call(WindowedValue<T> windowedValue) throws Exception {
    final BoundedWindow boundedWindow = Iterables.getOnlyElement(windowedValue.getWindows());
    final T element = windowedValue.getValue();
    final Instant timestamp = windowedValue.getTimestamp();
    Collection<W> windows = ((WindowFn<T, W>) fn).assignWindows(((WindowFn<T, W>) fn).new AssignContext() {

        @Override
        public T element() {
            return element;
        }

        @Override
        public Instant timestamp() {
            return timestamp;
        }

        @Override
        public BoundedWindow window() {
            return boundedWindow;
        }
    });
    return WindowedValue.of(element, timestamp, windows, PaneInfo.NO_FIRING);
}
Also used: Instant(org.joda.time.Instant) WindowFn(org.apache.beam.sdk.transforms.windowing.WindowFn) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow)
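
The call method above hands the element's timestamp to a WindowFn through an AssignContext and gets back the windows the element belongs to. As a rough illustration of what such an assignment can compute, here is a plain-Java sketch for fixed-size, non-overlapping windows. FixedWindowSketch and IntervalSketch are hypothetical names, not Beam classes, and the arithmetic is an assumption about one simple windowing scheme, not a copy of any Beam WindowFn.

```java
import java.util.Collections;
import java.util.List;

public class FixedWindowSketch {

    /** Hypothetical stand-in for a window: the half-open interval [start, end) in milliseconds. */
    public static final class IntervalSketch {
        public final long start;
        public final long end;

        public IntervalSketch(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Assigns a timestamp to the single fixed window that contains it.
     * Math.floorMod keeps the result correct for negative timestamps as well.
     */
    public static List<IntervalSketch> assignWindows(long timestampMillis, long sizeMillis) {
        if (sizeMillis <= 0) {
            throw new IllegalArgumentException("sizeMillis must be positive");
        }
        long start = timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
        return Collections.singletonList(new IntervalSketch(start, start + sizeMillis));
    }

    public static void main(String[] args) {
        // A timestamp of 125 s with 60 s windows falls in [120 s, 180 s).
        // prints [[120000, 180000)]
        System.out.println(assignWindows(125_000L, 60_000L));
    }
}
```

Returning a collection even for a single window mirrors the shape of the assignWindows call in the example, where a sliding-style scheme could place one element in several windows.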

Aggregations

WindowFn (org.apache.beam.sdk.transforms.windowing.WindowFn) 3
BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow) 2
AtomicReference (java.util.concurrent.atomic.AtomicReference) 1
EvaluationContext (org.apache.beam.runners.spark.translation.EvaluationContext) 1
SparkAssignWindowFn (org.apache.beam.runners.spark.translation.SparkAssignWindowFn) 1
SparkRuntimeContext (org.apache.beam.runners.spark.translation.SparkRuntimeContext) 1
TransformEvaluator (org.apache.beam.runners.spark.translation.TransformEvaluator) 1
PipelineVisitor (org.apache.beam.sdk.Pipeline.PipelineVisitor) 1
KvCoder (org.apache.beam.sdk.coders.KvCoder) 1
WindowIntoPayload (org.apache.beam.sdk.common.runner.v1.RunnerApi.WindowIntoPayload) 1
AppliedPTransform (org.apache.beam.sdk.runners.AppliedPTransform) 1
Node (org.apache.beam.sdk.runners.TransformHierarchy.Node) 1
GroupByKey (org.apache.beam.sdk.transforms.GroupByKey) 1
GlobalWindow (org.apache.beam.sdk.transforms.windowing.GlobalWindow) 1
PartitioningWindowFn (org.apache.beam.sdk.transforms.windowing.PartitioningWindowFn) 1
Window (org.apache.beam.sdk.transforms.windowing.Window) 1
Assign (org.apache.beam.sdk.transforms.windowing.Window.Assign) 1
WindowedValue (org.apache.beam.sdk.util.WindowedValue) 1
KV (org.apache.beam.sdk.values.KV) 1
WindowingStrategy (org.apache.beam.sdk.values.WindowingStrategy) 1