
Example 1 with ProcessElement

Use of org.apache.beam.sdk.transforms.DoFn.ProcessElement in the Apache Beam project.

From the class ParDoTest, method testWindowingInStartAndFinishBundle.

@Test
@Category(ValidatesRunner.class)
public void testWindowingInStartAndFinishBundle() {
    final FixedWindows windowFn = FixedWindows.of(Duration.millis(1));
    PCollection<String> output = pipeline
        .apply(Create.timestamped(TimestampedValue.of("elem", new Instant(1))))
        .apply(Window.<String>into(windowFn))
        .apply(ParDo.of(new DoFn<String, String>() {

        @ProcessElement
        public void processElement(ProcessContext c) {
            c.output(c.element());
            System.out.println("Process: " + c.element() + ":" + c.timestamp().getMillis());
        }

        @FinishBundle
        public void finishBundle(FinishBundleContext c) {
            // No current element here, so the output needs an explicit timestamp and window.
            Instant ts = new Instant(3);
            c.output("finish", ts, windowFn.assignWindow(ts));
            System.out.println("Finish: 3");
        }
    })).apply(ParDo.of(new PrintingDoFn()));
    PAssert.that(output).satisfies(new Checker());
    pipeline.run();
}
Also used: FixedWindows (org.apache.beam.sdk.transforms.windowing.FixedWindows), Instant (org.joda.time.Instant), ProcessElement (org.apache.beam.sdk.transforms.DoFn.ProcessElement), StringUtils.byteArrayToJsonString (org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString), Matchers.containsString (org.hamcrest.Matchers.containsString), Category (org.junit.experimental.categories.Category), Test (org.junit.Test)
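
To see why the two outputs in Example 1 land in different windows, here is a minimal plain-Java sketch of fixed-window assignment. It assumes a zero window offset and non-negative timestamps, and it is not Beam code: the class FixedWindowSketch and the helper windowStart are hypothetical names introduced only for illustration.

```java
class FixedWindowSketch {

    // Hypothetical helper (not a Beam API): inclusive start of the fixed
    // window that contains timestampMillis, assuming a zero offset.
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    public static void main(String[] args) {
        long sizeMillis = 1; // mirrors FixedWindows.of(Duration.millis(1)) in the test
        // Timestamp 1 is the "elem" input; timestamp 3 is the "finish" output.
        for (long ts : new long[] {1, 3}) {
            long start = windowStart(ts, sizeMillis);
            System.out.println("timestamp " + ts + " -> window [" + start + ", " + (start + sizeMillis) + ")");
        }
    }
}
```

With a 1 ms window size every millisecond timestamp gets its own window, so under these assumptions "elem" at timestamp 1 falls in [1, 2) and "finish" at timestamp 3 falls in [3, 4).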

Example 2 with ProcessElement

Use of org.apache.beam.sdk.transforms.DoFn.ProcessElement in the Apache Beam project.

From the class ParDoTest, method testMainOutputApplyTaggedOutputNoCoder.

@Test
@Category(NeedsRunner.class)
public void testMainOutputApplyTaggedOutputNoCoder() {
    // Regression test: applying a transform to the main output
    // should not cause a crash based on lack of a coder for the
    // additional output.
    final TupleTag<TestDummy> mainOutputTag = new TupleTag<TestDummy>("main");
    final TupleTag<TestDummy> additionalOutputTag = new TupleTag<TestDummy>("additionalOutput");
    PCollectionTuple tuple = pipeline
        .apply(Create.of(new TestDummy()).withCoder(TestDummyCoder.of()))
        .apply(ParDo.of(new DoFn<TestDummy, TestDummy>() {

        @ProcessElement
        public void processElement(ProcessContext context) {
            TestDummy element = context.element();
            context.output(element);
            context.output(additionalOutputTag, element);
        }
    }).withOutputTags(mainOutputTag, TupleTagList.of(additionalOutputTag)));
    // Before fix, tuple.get(mainOutputTag).apply(...) would indirectly trigger
    // tuple.get(additionalOutputTag).finishSpecifyingOutput(), which would crash
    // on a missing coder.
    tuple.get(mainOutputTag)
        .setCoder(TestDummyCoder.of())
        .apply("Output1", ParDo.of(new DoFn<TestDummy, Integer>() {

        @ProcessElement
        public void processElement(ProcessContext context) {
            context.output(1);
        }
    }));
    tuple.get(additionalOutputTag).setCoder(TestDummyCoder.of());
    pipeline.run();
}
Also used: ProcessElement (org.apache.beam.sdk.transforms.DoFn.ProcessElement), TupleTag (org.apache.beam.sdk.values.TupleTag), PCollectionTuple (org.apache.beam.sdk.values.PCollectionTuple), Category (org.junit.experimental.categories.Category), Test (org.junit.Test)

Example 3 with ProcessElement

Use of org.apache.beam.sdk.transforms.DoFn.ProcessElement in the Apache Beam project.

From the class ParDoTest, method testParDoWithOnlyTaggedOutput.

@Test
@Category(ValidatesRunner.class)
public void testParDoWithOnlyTaggedOutput() {
    List<Integer> inputs = Arrays.asList(3, -42, 666);
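    // The trailing {} creates anonymous subclasses, which keep the generic
    // type available so a coder can be inferred for each output.
    final TupleTag<Void> mainOutputTag = new TupleTag<Void>("main") {};
    final TupleTag<Integer> additionalOutputTag = new TupleTag<Integer>("additional") {};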
    PCollectionTuple outputs = pipeline
        .apply(Create.of(inputs))
        .apply(ParDo.of(new DoFn<Integer, Void>() {

        @ProcessElement
        public void processElement(ProcessContext c) {
            c.output(additionalOutputTag, c.element());
        }
    }).withOutputTags(mainOutputTag, TupleTagList.of(additionalOutputTag)));
    PAssert.that(outputs.get(mainOutputTag)).empty();
    PAssert.that(outputs.get(additionalOutputTag)).containsInAnyOrder(inputs);
    pipeline.run();
}
Also used: ProcessElement (org.apache.beam.sdk.transforms.DoFn.ProcessElement), TupleTag (org.apache.beam.sdk.values.TupleTag), PCollectionTuple (org.apache.beam.sdk.values.PCollectionTuple), Category (org.junit.experimental.categories.Category), Test (org.junit.Test)

Aggregations

ProcessElement (org.apache.beam.sdk.transforms.DoFn.ProcessElement): 3
Test (org.junit.Test): 3
Category (org.junit.experimental.categories.Category): 3
PCollectionTuple (org.apache.beam.sdk.values.PCollectionTuple): 2
TupleTag (org.apache.beam.sdk.values.TupleTag): 2
FixedWindows (org.apache.beam.sdk.transforms.windowing.FixedWindows): 1
StringUtils.byteArrayToJsonString (org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString): 1
Matchers.containsString (org.hamcrest.Matchers.containsString): 1
Instant (org.joda.time.Instant): 1