
Example 96 with Pipeline

Use of org.apache.beam.sdk.Pipeline in project beam by apache.

From the class DataflowRunnerTest, method buildDataflowPipeline.

private Pipeline buildDataflowPipeline(DataflowPipelineOptions options) {
    options.setStableUniqueNames(CheckEnabled.ERROR);
    options.setRunner(DataflowRunner.class);
    Pipeline p = Pipeline.create(options);
    p.apply("ReadMyFile", TextIO.read().from("gs://bucket/object")).apply("WriteMyFile", TextIO.write().to("gs://bucket/object"));
    // Enable the FileSystems API to know about gs:// URIs in this test.
    FileSystems.setDefaultPipelineOptions(options);
    return p;
}
Also used: TestPipeline (org.apache.beam.sdk.testing.TestPipeline), Pipeline (org.apache.beam.sdk.Pipeline)
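
For context, a minimal sketch of the Pipeline lifecycle this helper relies on: create options, register gs:// URI handling, build the graph, and (outside a unit test) run it. The project id, bucket paths, and step names below are placeholders, not values from the test; imports mirror the "Also used" list above.

DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
options.setRunner(DataflowRunner.class);
options.setProject("my-project");                // placeholder project id
options.setTempLocation("gs://my-bucket/tmp");   // placeholder staging location
// Register gs:// URI handling, as the helper above does.
FileSystems.setDefaultPipelineOptions(options);

Pipeline p = Pipeline.create(options);
p.apply("Read", TextIO.read().from("gs://my-bucket/input*"))
    .apply("Write", TextIO.write().to("gs://my-bucket/output"));
// p.run() would submit the job to the Dataflow service; the helper above only builds the graph.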

Example 97 with Pipeline

Use of org.apache.beam.sdk.Pipeline in project beam by apache.

From the class DataflowPipelineTranslatorTest, method testPartiallyBoundFailure.

@Test
public void testPartiallyBoundFailure() throws IOException {
    Pipeline p = Pipeline.create(buildPipelineOptions());
    PCollection<Integer> input = p.begin().apply(Create.of(1, 2, 3));
    thrown.expect(IllegalArgumentException.class);
    input.apply(new PartiallyBoundOutputCreator());
    Assert.fail("Failure expected from use of partially bound output");
}
Also used: Pipeline (org.apache.beam.sdk.Pipeline), Test (org.junit.Test)
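
The thrown field used in this test is JUnit 4's ExpectedException rule. If the declaration is not shown in the snippet, it looks roughly like this on the test class (assuming org.junit.Rule and org.junit.rules.ExpectedException are imported):

@Rule
public transient ExpectedException thrown = ExpectedException.none();

The transient modifier is a common convention in Beam test classes so the rule is not captured by serializable anonymous DoFns; whether this particular class uses it is not visible here.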

Example 98 with Pipeline

Use of org.apache.beam.sdk.Pipeline in project beam by apache.

From the class DataflowPipelineTranslatorTest, method testNamesOverridden.

/**
   * Test that in translation the name for a collection (in this case just a Create output) is
   * overridden to be what the Dataflow service expects.
   */
@Test
public void testNamesOverridden() throws Exception {
    DataflowPipelineOptions options = buildPipelineOptions();
    DataflowRunner runner = DataflowRunner.fromOptions(options);
    options.setStreaming(false);
    DataflowPipelineTranslator translator = DataflowPipelineTranslator.fromOptions(options);
    Pipeline pipeline = Pipeline.create(options);
    pipeline.apply("Jazzy", Create.of(3)).setName("foobizzle");
    runner.replaceTransforms(pipeline);
    Job job =
        translator
            .translate(pipeline, runner, Collections.<DataflowPackage>emptyList())
            .getJob();
    // The Create step
    Step step = job.getSteps().get(0);
    // This is the name that is "set by the user" that the Dataflow translator must override
    String userSpecifiedName =
        Structs.getString(
            Structs.getListOfMaps(step.getProperties(), PropertyNames.OUTPUT_INFO, null).get(0),
            PropertyNames.USER_NAME);
    // This is the calculated name that must actually be used
    String calculatedName = getString(step.getProperties(), PropertyNames.USER_NAME) + ".out0";
    assertThat(userSpecifiedName, equalTo(calculatedName));
}
Also used: DataflowPipelineOptions (org.apache.beam.runners.dataflow.options.DataflowPipelineOptions), Step (com.google.api.services.dataflow.model.Step), Structs.getString (org.apache.beam.runners.dataflow.util.Structs.getString), Job (com.google.api.services.dataflow.model.Job), DataflowPackage (com.google.api.services.dataflow.model.DataflowPackage), Pipeline (org.apache.beam.sdk.Pipeline), Test (org.junit.Test)
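
Schematically, the assertion compares two properties of the same translated step; the relationship (not the concrete values, which depend on how Create expands) is:

// userSpecifiedName : USER_NAME of the first OUTPUT_INFO entry on the step
// calculatedName    : USER_NAME of the step itself, with ".out0" appended
// The two must be equal, i.e. the "foobizzle" name set on the PCollection is
// discarded and the translator derives the output name from the step name.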

Example 99 with Pipeline

Use of org.apache.beam.sdk.Pipeline in project beam by apache.

From the class DataflowPipelineTranslatorTest, method testInaccessibleProvider.

@Test
public void testInaccessibleProvider() throws Exception {
    DataflowPipelineOptions options = buildPipelineOptions();
    Pipeline pipeline = Pipeline.create(options);
    DataflowPipelineTranslator t = DataflowPipelineTranslator.fromOptions(options);
    pipeline.apply(TextIO.read().from(new TestValueProvider()));
    // Check that translation does not fail.
    t.translate(
        pipeline, DataflowRunner.fromOptions(options), Collections.<DataflowPackage>emptyList());
}
Also used: DataflowPipelineOptions (org.apache.beam.runners.dataflow.options.DataflowPipelineOptions), Pipeline (org.apache.beam.sdk.Pipeline), Test (org.junit.Test)
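
TestValueProvider is a class local to this test. In application code, a value that is only known at runtime is usually exposed through a ValueProvider-typed pipeline option; the interface and option name below are illustrative, not part of the Beam SDK:

public interface MyReadOptions extends PipelineOptions {
    @Description("Input file pattern")
    ValueProvider<String> getInputFile();
    void setInputFile(ValueProvider<String> value);
}

// Later, when constructing the pipeline:
MyReadOptions opts = PipelineOptionsFactory.fromArgs(args).as(MyReadOptions.class);
Pipeline p = Pipeline.create(opts);
p.apply(TextIO.read().from(opts.getInputFile()));  // the value may be inaccessible until runtime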

Example 100 with Pipeline

Use of org.apache.beam.sdk.Pipeline in project beam by apache.

From the class DataflowPipelineTranslatorTest, method testSubnetworkConfigMissing.

@Test
public void testSubnetworkConfigMissing() throws IOException {
    DataflowPipelineOptions options = buildPipelineOptions();
    Pipeline p = buildPipeline(options);
    p.traverseTopologically(new RecordingPipelineVisitor());
    Job job =
        DataflowPipelineTranslator.fromOptions(options)
            .translate(
                p, DataflowRunner.fromOptions(options), Collections.<DataflowPackage>emptyList())
            .getJob();
    assertEquals(1, job.getEnvironment().getWorkerPools().size());
    assertNull(job.getEnvironment().getWorkerPools().get(0).getSubnetwork());
}
Also used: DataflowPipelineOptions (org.apache.beam.runners.dataflow.options.DataflowPipelineOptions), Job (com.google.api.services.dataflow.model.Job), DataflowPackage (com.google.api.services.dataflow.model.DataflowPackage), Pipeline (org.apache.beam.sdk.Pipeline), Test (org.junit.Test)
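
As a counterpart to the assertion above, a sketch of the positive case: when a subnetwork is set on the options before the pipeline is built, it is expected to appear on the worker pool. The subnetwork path is a placeholder, not a value from the test.

options.setSubnetwork("regions/REGION/subnetworks/SUBNETWORK");  // placeholder path
Pipeline p2 = buildPipeline(options);
p2.traverseTopologically(new RecordingPipelineVisitor());
Job job2 =
    DataflowPipelineTranslator.fromOptions(options)
        .translate(
            p2, DataflowRunner.fromOptions(options), Collections.<DataflowPackage>emptyList())
        .getJob();
assertEquals(
    "regions/REGION/subnetworks/SUBNETWORK",
    job2.getEnvironment().getWorkerPools().get(0).getSubnetwork());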

Aggregations

Pipeline (org.apache.beam.sdk.Pipeline): 184
Test (org.junit.Test): 123
TestPipeline (org.apache.beam.sdk.testing.TestPipeline): 86
DataflowPipelineOptions (org.apache.beam.runners.dataflow.options.DataflowPipelineOptions): 39
KV (org.apache.beam.sdk.values.KV): 35
Job (com.google.api.services.dataflow.model.Job): 26
DoFn (org.apache.beam.sdk.transforms.DoFn): 24
PipelineOptions (org.apache.beam.sdk.options.PipelineOptions): 22
DataflowPackage (com.google.api.services.dataflow.model.DataflowPackage): 21
TableRow (com.google.api.services.bigquery.model.TableRow): 16
PipelineResult (org.apache.beam.sdk.PipelineResult): 14
Structs.getString (org.apache.beam.runners.dataflow.util.Structs.getString): 13
TableSchema (com.google.api.services.bigquery.model.TableSchema): 12
ApexPipelineOptions (org.apache.beam.runners.apex.ApexPipelineOptions): 12
Map (java.util.Map): 11
TableFieldSchema (com.google.api.services.bigquery.model.TableFieldSchema): 10
ArrayList (java.util.ArrayList): 10
Instant (org.joda.time.Instant): 10
TableReference (com.google.api.services.bigquery.model.TableReference): 9
JsonSchemaToTableSchema (org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.JsonSchemaToTableSchema): 9