Search in sources :

Example 36 with SdkComponents

use of org.apache.beam.runners.core.construction.SdkComponents in project beam by apache.

the class PubSubWritePayloadTranslationTest method testTranslateSinkWithTopic.

@Test
public void testTranslateSinkWithTopic() throws Exception {
    PubsubUnboundedSink pubsubUnboundedSink = new PubsubUnboundedSink(null, StaticValueProvider.of(TOPIC), TIMESTAMP_ATTRIBUTE, ID_ATTRIBUTE, 0, 0, 0, Duration.ZERO, null);
    PubsubUnboundedSink.PubsubSink pubsubSink = new PubsubSink(pubsubUnboundedSink);
    PCollection<byte[]> input = pipeline.apply(Create.of(new byte[0]));
    PDone output = input.apply(pubsubSink);
    AppliedPTransform<?, ?, PubsubSink> appliedPTransform = AppliedPTransform.of("sink", PValues.expandInput(input), PValues.expandOutput(output), pubsubSink, ResourceHints.create(), pipeline);
    SdkComponents components = SdkComponents.create();
    components.registerEnvironment(Environments.createDockerEnvironment("java"));
    RunnerApi.FunctionSpec spec = sinkTranslator.translate(appliedPTransform, components);
    assertEquals(PTransformTranslation.PUBSUB_WRITE, spec.getUrn());
    PubSubWritePayload payload = PubSubWritePayload.parseFrom(spec.getPayload());
    assertEquals(TOPIC.getFullPath(), payload.getTopic());
    assertTrue(payload.getTopicRuntimeOverridden().isEmpty());
    assertEquals(TIMESTAMP_ATTRIBUTE, payload.getTimestampAttribute());
    assertEquals(ID_ATTRIBUTE, payload.getIdAttribute());
}
Also used : RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) PDone(org.apache.beam.sdk.values.PDone) PubsubSink(org.apache.beam.sdk.io.gcp.pubsub.PubsubUnboundedSink.PubsubSink) PubsubSink(org.apache.beam.sdk.io.gcp.pubsub.PubsubUnboundedSink.PubsubSink) SdkComponents(org.apache.beam.runners.core.construction.SdkComponents) PubSubWritePayload(org.apache.beam.model.pipeline.v1.RunnerApi.PubSubWritePayload) Test(org.junit.Test)

Example 37 with SdkComponents

use of org.apache.beam.runners.core.construction.SdkComponents in project beam by apache.

the class DataflowPipelineTranslatorTest method testToMap.

@Test
public void testToMap() throws Exception {
    DataflowPipelineOptions options = buildPipelineOptions();
    Pipeline pipeline = Pipeline.create(options);
    final PCollectionView<Map<String, Integer>> view = pipeline.apply("CreateSideInput", Create.of(KV.of("a", 1), KV.of("b", 3))).apply(View.asMap());
    PCollection<KV<String, Integer>> output = pipeline.apply("CreateMainInput", Create.of("apple", "banana", "blackberry")).apply("OutputSideInputs", ParDo.of(new DoFn<String, KV<String, Integer>>() {

        @ProcessElement
        public void processElement(ProcessContext c) {
            c.output(KV.of(c.element(), c.sideInput(view).get(c.element().substring(0, 1))));
        }
    }).withSideInputs(view));
    PAssert.that(output).containsInAnyOrder(KV.of("apple", 1), KV.of("banana", 3), KV.of("blackberry", 3));
    DataflowRunner runner = DataflowRunner.fromOptions(options);
    DataflowPipelineTranslator translator = DataflowPipelineTranslator.fromOptions(options);
    runner.replaceV1Transforms(pipeline);
    SdkComponents sdkComponents = createSdkComponents(options);
    RunnerApi.Pipeline pipelineProto = PipelineTranslation.toProto(pipeline, sdkComponents, true);
    Job job = translator.translate(pipeline, pipelineProto, sdkComponents, runner, Collections.emptyList()).getJob();
    List<Step> steps = job.getSteps();
    // Change detector assertion just to make sure the test was not a noop.
    // No need to actually check the pipeline as the ValidatesRunner tests
    // ensure translation is correct. This is just a quick check to see that translation
    // does not crash.
    assertEquals(24, steps.size());
}
Also used : DataflowPipelineOptions(org.apache.beam.runners.dataflow.options.DataflowPipelineOptions) KV(org.apache.beam.sdk.values.KV) Structs.getString(org.apache.beam.runners.dataflow.util.Structs.getString) ByteString(org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString) Step(com.google.api.services.dataflow.model.Step) SdkComponents(org.apache.beam.runners.core.construction.SdkComponents) Pipeline(org.apache.beam.sdk.Pipeline) RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) Job(com.google.api.services.dataflow.model.Job) ImmutableMap(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap) Map(java.util.Map) Test(org.junit.Test)

Example 38 with SdkComponents

use of org.apache.beam.runners.core.construction.SdkComponents in project beam by apache.

the class DataflowPipelineTranslatorTest method testNumWorkersCannotExceedMaxNumWorkers.

@Test
public void testNumWorkersCannotExceedMaxNumWorkers() throws IOException {
    DataflowPipelineOptions options = buildPipelineOptions();
    options.setNumWorkers(43);
    options.setMaxNumWorkers(42);
    Pipeline p = buildPipeline(options);
    p.traverseTopologically(new RecordingPipelineVisitor());
    SdkComponents sdkComponents = createSdkComponents(options);
    RunnerApi.Pipeline pipelineProto = PipelineTranslation.toProto(p, sdkComponents, true);
    thrown.expect(IllegalArgumentException.class);
    thrown.expectMessage("numWorkers (43) cannot exceed maxNumWorkers (42).");
    DataflowPipelineTranslator.fromOptions(options).translate(p, pipelineProto, sdkComponents, DataflowRunner.fromOptions(options), Collections.emptyList()).getJob();
}
Also used : RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) DataflowPipelineOptions(org.apache.beam.runners.dataflow.options.DataflowPipelineOptions) SdkComponents(org.apache.beam.runners.core.construction.SdkComponents) Pipeline(org.apache.beam.sdk.Pipeline) Test(org.junit.Test)

Example 39 with SdkComponents

use of org.apache.beam.runners.core.construction.SdkComponents in project beam by apache.

the class DataflowPipelineTranslatorTest method testStepDisplayData.

@Test
public void testStepDisplayData() throws Exception {
    DataflowPipelineOptions options = buildPipelineOptions();
    DataflowPipelineTranslator translator = DataflowPipelineTranslator.fromOptions(options);
    Pipeline pipeline = Pipeline.create(options);
    DoFn<Integer, Integer> fn1 = new DoFn<Integer, Integer>() {

        @ProcessElement
        public void processElement(ProcessContext c) throws Exception {
            c.output(c.element());
        }

        @Override
        public void populateDisplayData(DisplayData.Builder builder) {
            builder.add(DisplayData.item("foo", "bar")).add(DisplayData.item("foo2", DataflowPipelineTranslatorTest.class).withLabel("Test Class").withLinkUrl("http://www.google.com"));
        }
    };
    DoFn<Integer, Integer> fn2 = new DoFn<Integer, Integer>() {

        @ProcessElement
        public void processElement(ProcessContext c) throws Exception {
            c.output(c.element());
        }

        @Override
        public void populateDisplayData(DisplayData.Builder builder) {
            builder.add(DisplayData.item("foo3", 1234));
        }
    };
    ParDo.SingleOutput<Integer, Integer> parDo1 = ParDo.of(fn1);
    ParDo.SingleOutput<Integer, Integer> parDo2 = ParDo.of(fn2);
    pipeline.apply(Create.of(1, 2, 3)).apply(parDo1).apply(parDo2);
    DataflowRunner runner = DataflowRunner.fromOptions(options);
    runner.replaceV1Transforms(pipeline);
    SdkComponents sdkComponents = createSdkComponents(options);
    RunnerApi.Pipeline pipelineProto = PipelineTranslation.toProto(pipeline, sdkComponents, true);
    Job job = translator.translate(pipeline, pipelineProto, sdkComponents, runner, Collections.emptyList()).getJob();
    assertAllStepOutputsHaveUniqueIds(job);
    List<Step> steps = job.getSteps();
    assertEquals(3, steps.size());
    Map<String, Object> parDo1Properties = steps.get(1).getProperties();
    Map<String, Object> parDo2Properties = steps.get(2).getProperties();
    assertThat(parDo1Properties, hasKey("display_data"));
    @SuppressWarnings("unchecked") Collection<Map<String, String>> fn1displayData = (Collection<Map<String, String>>) parDo1Properties.get("display_data");
    @SuppressWarnings("unchecked") Collection<Map<String, String>> fn2displayData = (Collection<Map<String, String>>) parDo2Properties.get("display_data");
    ImmutableSet<ImmutableMap<String, Object>> expectedFn1DisplayData = ImmutableSet.of(ImmutableMap.<String, Object>builder().put("key", "foo").put("type", "STRING").put("value", "bar").put("namespace", fn1.getClass().getName()).build(), ImmutableMap.<String, Object>builder().put("key", "fn").put("label", "Transform Function").put("type", "JAVA_CLASS").put("value", fn1.getClass().getName()).put("shortValue", fn1.getClass().getSimpleName()).put("namespace", parDo1.getClass().getName()).build(), ImmutableMap.<String, Object>builder().put("key", "foo2").put("type", "JAVA_CLASS").put("value", DataflowPipelineTranslatorTest.class.getName()).put("shortValue", DataflowPipelineTranslatorTest.class.getSimpleName()).put("namespace", fn1.getClass().getName()).put("label", "Test Class").put("linkUrl", "http://www.google.com").build());
    ImmutableSet<ImmutableMap<String, Object>> expectedFn2DisplayData = ImmutableSet.of(ImmutableMap.<String, Object>builder().put("key", "fn").put("label", "Transform Function").put("type", "JAVA_CLASS").put("value", fn2.getClass().getName()).put("shortValue", fn2.getClass().getSimpleName()).put("namespace", parDo2.getClass().getName()).build(), ImmutableMap.<String, Object>builder().put("key", "foo3").put("type", "INTEGER").put("value", 1234L).put("namespace", fn2.getClass().getName()).build());
    assertEquals(expectedFn1DisplayData, ImmutableSet.copyOf(fn1displayData));
    assertEquals(expectedFn2DisplayData, ImmutableSet.copyOf(fn2displayData));
}
Also used : Step(com.google.api.services.dataflow.model.Step) Structs.getString(org.apache.beam.runners.dataflow.util.Structs.getString) ByteString(org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString) SdkComponents(org.apache.beam.runners.core.construction.SdkComponents) RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) Job(com.google.api.services.dataflow.model.Job) DataflowPipelineOptions(org.apache.beam.runners.dataflow.options.DataflowPipelineOptions) ImmutableMap(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap) Pipeline(org.apache.beam.sdk.Pipeline) DoFn(org.apache.beam.sdk.transforms.DoFn) ParDo(org.apache.beam.sdk.transforms.ParDo) Collection(java.util.Collection) PCollection(org.apache.beam.sdk.values.PCollection) CloudObject(org.apache.beam.runners.dataflow.util.CloudObject) ImmutableMap(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap) Map(java.util.Map) Test(org.junit.Test)

Example 40 with SdkComponents

use of org.apache.beam.runners.core.construction.SdkComponents in project beam by apache.

the class DataflowPipelineTranslatorTest method testInaccessibleProvider.

@Test
public void testInaccessibleProvider() throws Exception {
    DataflowPipelineOptions options = buildPipelineOptions();
    Pipeline pipeline = Pipeline.create(options);
    DataflowPipelineTranslator t = DataflowPipelineTranslator.fromOptions(options);
    pipeline.apply(TextIO.read().from(new TestValueProvider()));
    // Check that translation does not fail.
    SdkComponents sdkComponents = createSdkComponents(options);
    RunnerApi.Pipeline pipelineProto = PipelineTranslation.toProto(pipeline, sdkComponents, true);
    t.translate(pipeline, pipelineProto, sdkComponents, DataflowRunner.fromOptions(options), Collections.emptyList());
}
Also used : RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) DataflowPipelineOptions(org.apache.beam.runners.dataflow.options.DataflowPipelineOptions) SdkComponents(org.apache.beam.runners.core.construction.SdkComponents) Pipeline(org.apache.beam.sdk.Pipeline) Test(org.junit.Test)

Aggregations

SdkComponents (org.apache.beam.runners.core.construction.SdkComponents)61 RunnerApi (org.apache.beam.model.pipeline.v1.RunnerApi)48 Test (org.junit.Test)46 Pipeline (org.apache.beam.sdk.Pipeline)37 DataflowPipelineOptions (org.apache.beam.runners.dataflow.options.DataflowPipelineOptions)36 Job (com.google.api.services.dataflow.model.Job)25 ByteString (org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString)25 Structs.getString (org.apache.beam.runners.dataflow.util.Structs.getString)21 KV (org.apache.beam.sdk.values.KV)14 Map (java.util.Map)12 Step (com.google.api.services.dataflow.model.Step)11 ArrayList (java.util.ArrayList)11 List (java.util.List)9 CloudObject (org.apache.beam.runners.dataflow.util.CloudObject)9 HashMap (java.util.HashMap)8 ImmutableList (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList)8 WindowedValue (org.apache.beam.sdk.util.WindowedValue)7 ImmutableMap (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap)7 InstructionOutput (com.google.api.services.dataflow.model.InstructionOutput)6 ParDoInstruction (com.google.api.services.dataflow.model.ParDoInstruction)6