Example 6 with DoFn

use of org.apache.beam.sdk.transforms.DoFn in project beam by apache.

the class TopWikipediaSessionsITCase method testProgram.

@Override
protected void testProgram() throws Exception {
    Pipeline p = FlinkTestPipeline.createForStreaming();
    Long now = (System.currentTimeMillis() + 10000) / 1000;
    PCollection<KV<String, Long>> output = p.apply(Create.of(Arrays.asList(
            new TableRow().set("timestamp", now).set("contributor_username", "user1"),
            new TableRow().set("timestamp", now + 10).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now).set("contributor_username", "user1"),
            new TableRow().set("timestamp", now + 2).set("contributor_username", "user1"),
            new TableRow().set("timestamp", now).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 1).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 5).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 7).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 8).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 200).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 230).set("contributor_username", "user1"),
            new TableRow().set("timestamp", now + 230).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 240).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now + 245).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now + 235).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now + 236).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now + 237).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now + 238).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now + 239).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now + 240).set("contributor_username", "user3"),
            new TableRow().set("timestamp", now + 241).set("contributor_username", "user2"),
            new TableRow().set("timestamp", now).set("contributor_username", "user3")))).apply(ParDo.of(new DoFn<TableRow, String>() {

        @ProcessElement
        public void processElement(ProcessContext c) throws Exception {
            TableRow row = c.element();
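            // The rows above store a Long; reading the value as a Number avoids a
            // ClassCastException whichever boxed numeric type arrives here.
            long timestamp = ((Number) row.get("timestamp")).longValue();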
            String userName = (String) row.get("contributor_username");
            if (userName != null) {
                // Sets the timestamp field to be used in windowing.
                c.outputWithTimestamp(userName, new Instant(timestamp * 1000L));
            }
        }
    })).apply(Window.<String>into(Sessions.withGapDuration(Duration.standardMinutes(1)))).apply(Count.<String>perElement());
    PCollection<String> format = output.apply(ParDo.of(new DoFn<KV<String, Long>, String>() {

        @ProcessElement
        public void processElement(ProcessContext c) throws Exception {
            KV<String, Long> el = c.element();
            String out = "user: " + el.getKey() + " value:" + el.getValue();
            c.output(out);
        }
    }));
    format.apply(TextIO.write().to(resultPath));
    p.run();
}
Also used: Instant (org.joda.time.Instant), KV (org.apache.beam.sdk.values.KV), FlinkTestPipeline (org.apache.beam.runners.flink.FlinkTestPipeline), Pipeline (org.apache.beam.sdk.Pipeline), DoFn (org.apache.beam.sdk.transforms.DoFn), TableRow (com.google.api.services.bigquery.model.TableRow)
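
The test above counts elements per user inside session windows with a one-minute gap. As a rough illustration of what that grouping produces for a single user, here is a plain-Java sketch. It is not Beam code and not how Beam implements Sessions: the class SessionCountSketch and the method sessionSizes are hypothetical names, and the sketch assumes a new session starts whenever an element is at least the gap duration after the previous one. Watermarks, triggers and late data are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Beam: approximates per-key session counting.
final class SessionCountSketch {

    // Returns the number of elements in each session, in timestamp order.
    // Assumption: a new session starts when an element is at least
    // gapSeconds after the previous element of the same key.
    static List<Integer> sessionSizes(List<Long> timestampsSeconds, long gapSeconds) {
        List<Long> sorted = new ArrayList<>(timestampsSeconds);
        Collections.sort(sorted);
        List<Integer> sizes = new ArrayList<>();
        int current = 0;
        long previous = 0L;
        for (long t : sorted) {
            if (current > 0 && t - previous >= gapSeconds) {
                // Gap reached: close the running session and start a new one.
                sizes.add(current);
                current = 0;
            }
            current++;
            previous = t;
        }
        if (current > 0) {
            sizes.add(current);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Offsets in seconds after "now" that the test assigns to user1.
        List<Long> user1 = Arrays.asList(0L, 0L, 2L, 230L);
        System.out.println(sessionSizes(user1, 60L)); // prints [3, 1]
    }
}
```

Under these assumptions user1 ends up with two sessions, one holding three elements and one holding a single element, which corresponds to two "user: user1" lines in the written output.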

Example 7 with DoFn

use of org.apache.beam.sdk.transforms.DoFn in project beam by apache.

the class DoFnInvokersTest method testSplittableDoFnDefaultMethods.

@Test
public void testSplittableDoFnDefaultMethods() throws Exception {
    class MockFn extends DoFn<String, String> {

        @ProcessElement
        public void processElement(ProcessContext c, DefaultTracker tracker) {
        }

        @GetInitialRestriction
        public RestrictionWithDefaultTracker getInitialRestriction(String element) {
            return null;
        }
    }
    MockFn fn = mock(MockFn.class);
    DoFnInvoker<String, String> invoker = DoFnInvokers.invokerFor(fn);
    CoderRegistry coderRegistry = CoderRegistry.createDefault();
    coderRegistry.registerCoderProvider(CoderProviders.fromStaticMethods(RestrictionWithDefaultTracker.class, CoderForDefaultTracker.class));
    assertThat(invoker.<RestrictionWithDefaultTracker>invokeGetRestrictionCoder(coderRegistry), instanceOf(CoderForDefaultTracker.class));
    invoker.invokeSplitRestriction("blah", "foo", new DoFn.OutputReceiver<String>() {

        private boolean invoked;

        @Override
        public void output(String output) {
            assertFalse(invoked);
            invoked = true;
            assertEquals("foo", output);
        }
    });
    invoker.invokeProcessElement(mockArgumentProvider);
    assertThat(invoker.invokeNewTracker(new RestrictionWithDefaultTracker()), instanceOf(DefaultTracker.class));
}
Also used: CoderRegistry (org.apache.beam.sdk.coders.CoderRegistry), DoFn (org.apache.beam.sdk.transforms.DoFn), HasDefaultTracker (org.apache.beam.sdk.transforms.splittabledofn.HasDefaultTracker), Test (org.junit.Test)

Example 8 with DoFn

use of org.apache.beam.sdk.transforms.DoFn in project beam by apache.

the class DoFnSignaturesSplittableDoFnTest method testSplittableProcessElementMustNotHaveOtherParams.

@Test
public void testSplittableProcessElementMustNotHaveOtherParams() throws Exception {
    thrown.expect(IllegalArgumentException.class);
    thrown.expectMessage("Illegal parameter");
    thrown.expectMessage("BoundedWindow");
    DoFnSignature.ProcessElementMethod signature = analyzeProcessElementMethod(new AnonymousMethod() {

        // ProcessContext is an inner class of DoFn and takes no type arguments of its own.
        private void method(DoFn<Integer, String>.ProcessContext context, SomeRestrictionTracker tracker, BoundedWindow window) {
        }
    });
}
Also used: DoFn (org.apache.beam.sdk.transforms.DoFn), FakeDoFn (org.apache.beam.sdk.transforms.reflect.DoFnSignaturesTestUtils.FakeDoFn), BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow), AnonymousMethod (org.apache.beam.sdk.transforms.reflect.DoFnSignaturesTestUtils.AnonymousMethod), Test (org.junit.Test)

Example 9 with DoFn

use of org.apache.beam.sdk.transforms.DoFn in project DataflowJavaSDK by GoogleCloudPlatform.

the class StarterPipeline method main.

public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
    p.apply(Create.of("Hello", "World")).apply(MapElements.via(new SimpleFunction<String, String>() {

        @Override
        public String apply(String input) {
            return input.toUpperCase();
        }
    })).apply(ParDo.of(new DoFn<String, Void>() {

        @ProcessElement
        public void processElement(ProcessContext c) {
            LOG.info(c.element());
        }
    }));
    p.run();
}
Also used: DoFn (org.apache.beam.sdk.transforms.DoFn), Pipeline (org.apache.beam.sdk.Pipeline)

Example 10 with DoFn

use of org.apache.beam.sdk.transforms.DoFn in project DataflowJavaSDK-examples by GoogleCloudPlatform.

the class StarterPipeline method main.

public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
    p.apply(Create.of("Hello", "World")).apply(MapElements.via(new SimpleFunction<String, String>() {

        @Override
        public String apply(String input) {
            return input.toUpperCase();
        }
    })).apply(ParDo.of(new DoFn<String, Void>() {

        @ProcessElement
        public void processElement(ProcessContext c) {
            LOG.info(c.element());
        }
    }));
    p.run();
}
Also used: DoFn (org.apache.beam.sdk.transforms.DoFn), Pipeline (org.apache.beam.sdk.Pipeline)

Aggregations

DoFn (org.apache.beam.sdk.transforms.DoFn): 154
Test (org.junit.Test): 98
Pipeline (org.apache.beam.sdk.Pipeline): 60
KV (org.apache.beam.sdk.values.KV): 45
TupleTag (org.apache.beam.sdk.values.TupleTag): 28
StateSpec (org.apache.beam.sdk.state.StateSpec): 26
Instant (org.joda.time.Instant): 26
ArrayList (java.util.ArrayList): 23
TestPipeline (org.apache.beam.sdk.testing.TestPipeline): 23
BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow): 22
PCollection (org.apache.beam.sdk.values.PCollection): 21
TimerSpec (org.apache.beam.sdk.state.TimerSpec): 19
WindowedValue (org.apache.beam.sdk.util.WindowedValue): 18
PCollectionView (org.apache.beam.sdk.values.PCollectionView): 18
HashMap (java.util.HashMap): 17
Coder (org.apache.beam.sdk.coders.Coder): 17
List (java.util.List): 16
Map (java.util.Map): 14
ValueState (org.apache.beam.sdk.state.ValueState): 14
RunnerApi (org.apache.beam.model.pipeline.v1.RunnerApi): 13