
Example 1 with DoFn

Use of com.google.cloud.dataflow.sdk.transforms.DoFn in the spark-dataflow project by Cloudera.

From the class SideEffectsTest, method test:

@Test
public void test() throws Exception {
    SparkPipelineOptions options = SparkPipelineOptionsFactory.create();
    options.setRunner(SparkPipelineRunner.class);
    Pipeline pipeline = Pipeline.create(options);
    pipeline.getCoderRegistry().registerCoder(URI.class, StringDelegateCoder.of(URI.class));
    pipeline.apply(Create.of("a")).apply(ParDo.of(new DoFn<String, String>() {

        @Override
        public void processElement(ProcessContext c) throws Exception {
            throw new UserException();
        }
    }));
    try {
        pipeline.run();
        fail("Run should have thrown an exception");
    } catch (RuntimeException e) {
        assertNotNull(e.getCause());
        // TODO: remove the version check (and the setup and teardown methods) when we no
        // longer support Spark 1.3 or 1.4
        String version = SparkContextFactory.getSparkContext(options.getSparkMaster(), options.getAppName()).version();
        if (!version.startsWith("1.3.") && !version.startsWith("1.4.")) {
            assertTrue(e.getCause() instanceof UserException);
        }
    }
}
Also used: DoFn (com.google.cloud.dataflow.sdk.transforms.DoFn), URI (java.net.URI), Pipeline (com.google.cloud.dataflow.sdk.Pipeline), Test (org.junit.Test)
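Example 1 relies on the classic try/fail/catch idiom: run the pipeline, fail if no exception escapes, and otherwise assert that the original user exception survives as the cause of the runner's wrapping RuntimeException. A minimal, SDK-independent sketch of that idiom (UserException and the run() stand-in below are illustrative, not part of the Dataflow SDK):

```java
public class CauseCheck {
    // Stand-in for the test's custom exception type.
    static class UserException extends RuntimeException {}

    // Simulates pipeline.run(): the runner wraps the user's error
    // in a RuntimeException, preserving it as the cause.
    static void run() {
        throw new RuntimeException("pipeline failed", new UserException());
    }

    public static void main(String[] args) {
        boolean threw = false;
        try {
            run();
        } catch (RuntimeException e) {
            threw = true;
            // The original user exception must survive as the cause.
            if (!(e.getCause() instanceof UserException)) {
                throw new AssertionError("cause was not UserException");
            }
        }
        if (!threw) {
            throw new AssertionError("run() should have thrown an exception");
        }
        System.out.println("cause preserved");
    }
}
```

The version check in the real test exists because older Spark runners (1.3/1.4) did not reliably preserve the cause chain across the cluster boundary.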

Example 2 with DoFn

Use of com.google.cloud.dataflow.sdk.transforms.DoFn in the spark-dataflow project by Cloudera.

From the class DoFnOutputTest, method test:

@Test
public void test() throws Exception {
    SparkPipelineOptions options = SparkPipelineOptionsFactory.create();
    options.setRunner(SparkPipelineRunner.class);
    Pipeline pipeline = Pipeline.create(options);
    PCollection<String> strings = pipeline.apply(Create.of("a"));
    // Test that values written from startBundle() and finishBundle() are written to
    // the output
    PCollection<String> output = strings.apply(ParDo.of(new DoFn<String, String>() {

        @Override
        public void startBundle(Context c) throws Exception {
            c.output("start");
        }

        @Override
        public void processElement(ProcessContext c) throws Exception {
            c.output(c.element());
        }

        @Override
        public void finishBundle(Context c) throws Exception {
            c.output("finish");
        }
    }));
    DataflowAssert.that(output).containsInAnyOrder("start", "a", "finish");
    EvaluationResult res = SparkPipelineRunner.create().run(pipeline);
    res.close();
}
Also used: DoFn (com.google.cloud.dataflow.sdk.transforms.DoFn), Pipeline (com.google.cloud.dataflow.sdk.Pipeline), Test (org.junit.Test)
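Example 2 exercises the DoFn bundle lifecycle: the runner calls startBundle once, then processElement once per input element, then finishBundle once, and output emitted from any of the three lands in the output PCollection. A plain-Java stand-in for that ordering (runBundle and its names are illustrative, not the SDK's API):

```java
import java.util.ArrayList;
import java.util.List;

public class BundleLifecycle {
    // Illustrative stand-in for a runner driving a DoFn that
    // emits from every lifecycle method, as in DoFnOutputTest.
    static List<String> runBundle(List<String> elements) {
        List<String> output = new ArrayList<>();
        output.add("start");        // startBundle(): c.output("start")
        for (String e : elements) {
            output.add(e);          // processElement(): c.output(c.element())
        }
        output.add("finish");       // finishBundle(): c.output("finish")
        return output;
    }

    public static void main(String[] args) {
        // Mirrors the test's input of a single element "a".
        System.out.println(runBundle(List.of("a")));
    }
}
```

Note that DataflowAssert uses containsInAnyOrder rather than an ordered check, since a runner is free to interleave bundle output.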

Aggregations

Pipeline (com.google.cloud.dataflow.sdk.Pipeline): 2 uses
DoFn (com.google.cloud.dataflow.sdk.transforms.DoFn): 2 uses
Test (org.junit.Test): 2 uses
URI (java.net.URI): 1 use