Example 1 with JavaDStreamLike

Use of org.apache.spark.streaming.api.java.JavaDStreamLike in the spark-dataflow project by cloudera.

Method window of class StreamingTransformTranslator.

private static <T, W extends BoundedWindow> TransformEvaluator<Window.Bound<T>> window() {
    return new TransformEvaluator<Window.Bound<T>>() {

        @Override
        public void evaluate(Window.Bound<T> transform, EvaluationContext context) {
            StreamingEvaluationContext sec = (StreamingEvaluationContext) context;
            // First, apply windowing to the stream itself.
            WindowFn<? super T, W> windowFn = WINDOW_FG.get("windowFn", transform);
            @SuppressWarnings("unchecked")
            JavaDStream<WindowedValue<T>> dStream =
                (JavaDStream<WindowedValue<T>>) sec.getStream(transform);
            if (windowFn instanceof FixedWindows) {
                Duration windowDuration = Durations.milliseconds(((FixedWindows) windowFn).getSize().getMillis());
                sec.setStream(transform, dStream.window(windowDuration));
            } else if (windowFn instanceof SlidingWindows) {
                Duration windowDuration = Durations.milliseconds(((SlidingWindows) windowFn).getSize().getMillis());
                Duration slideDuration = Durations.milliseconds(((SlidingWindows) windowFn).getPeriod().getMillis());
                sec.setStream(transform, dStream.window(windowDuration, slideDuration));
            }
            // Then assign windows to the individual elements. The stream is re-read from the
            // context because the branch above may have replaced it with a windowed stream.
            DoFn<T, T> addWindowsDoFn = new AssignWindowsDoFn<>(windowFn);
            DoFnFunction<T, T> dofn = new DoFnFunction<>(addWindowsDoFn, sec.getRuntimeContext(), null);
            @SuppressWarnings("unchecked")
            JavaDStreamLike<WindowedValue<T>, ?, JavaRDD<WindowedValue<T>>> currentStream =
                (JavaDStreamLike<WindowedValue<T>, ?, JavaRDD<WindowedValue<T>>>) sec.getStream(transform);
            sec.setStream(transform, currentStream.mapPartitions(dofn));
        }
    };
}
Also used:
BoundedWindow (com.google.cloud.dataflow.sdk.transforms.windowing.BoundedWindow)
Window (com.google.cloud.dataflow.sdk.transforms.windowing.Window)
FixedWindows (com.google.cloud.dataflow.sdk.transforms.windowing.FixedWindows)
Duration (org.apache.spark.streaming.Duration)
AssignWindowsDoFn (com.google.cloud.dataflow.sdk.util.AssignWindowsDoFn)
JavaDStream (org.apache.spark.streaming.api.java.JavaDStream)
TransformEvaluator (com.cloudera.dataflow.spark.TransformEvaluator)
JavaRDD (org.apache.spark.api.java.JavaRDD)
DoFnFunction (com.cloudera.dataflow.spark.DoFnFunction)
WindowedValue (com.google.cloud.dataflow.sdk.util.WindowedValue)
JavaDStreamLike (org.apache.spark.streaming.api.java.JavaDStreamLike)
EvaluationContext (com.cloudera.dataflow.spark.EvaluationContext)
SlidingWindows (com.google.cloud.dataflow.sdk.transforms.windowing.SlidingWindows)
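
The window method above distinguishes fixed windows (a size only) from sliding windows (a size and a period), and then assigns windows to each element. As a rough illustration of what that assignment means, the sketch below computes which windows contain a given timestamp using plain arithmetic. It is not the Dataflow SDK's or Spark's implementation: the class WindowAssignSketch and its methods are hypothetical names, windows are assumed to be aligned to time 0, and timestamps are assumed to be non-negative milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical helper, not part of the Dataflow SDK or Spark.
final class WindowAssignSketch {

    // Start of the fixed window [start, start + sizeMs) that contains timestampMs.
    // Assumes windows are aligned to 0 and timestampMs >= 0.
    static long fixedWindowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Starts of every sliding window [start, start + sizeMs) that contains timestampMs,
    // where a new window begins every periodMs. The latest window comes first.
    static List<Long> slidingWindowStarts(long timestampMs, long sizeMs, long periodMs) {
        List<Long> starts = new ArrayList<>();
        long lastStart = timestampMs - (timestampMs % periodMs);
        // Walk backwards one period at a time while the window still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= periodMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(fixedWindowStart(1234L, 1000L));           // prints 1000
        System.out.println(slidingWindowStarts(1234L, 1000L, 500L));  // prints [1000, 500]
    }
}
```

With a size of 1000 ms and a period of 500 ms, each element falls into two overlapping windows, which is why a sliding window needs both durations while a fixed window needs only one.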