
Example 1 with StreamingShardedWriteFactory

Use of org.apache.beam.runners.dataflow.DataflowRunner.StreamingShardedWriteFactory in the Apache Beam project.

From the class DataflowRunnerTest, method testStreamingWriteOverride.

// Checks that StreamingShardedWriteFactory replaces a WriteFiles transform with one
// that reports the expected shard count, and that the replacement's output maps
// back to the original's output.
private void testStreamingWriteOverride(PipelineOptions options, int expectedNumShards) {
    TestPipeline p = TestPipeline.fromOptions(options);
    StreamingShardedWriteFactory<Object, Void, Object> factory =
        new StreamingShardedWriteFactory<>(p.getOptions());
    WriteFiles<Object, Void, Object> original = WriteFiles.to(new TestSink(tmpFolder.toString()));
    // Raw cast: Create.empty(VoidCoder.of()) produces a PCollection<Void>.
    PCollection<Object> objs = (PCollection) p.apply(Create.empty(VoidCoder.of()));
    AppliedPTransform<PCollection<Object>, WriteFilesResult<Void>, WriteFiles<Object, Void, Object>>
        originalApplication = AppliedPTransform.of(
            "writefiles", PValues.expandInput(objs), Collections.emptyMap(), original,
            ResourceHints.create(), p);
    // The factory must return a different transform with the expected number of shards.
    WriteFiles<Object, Void, Object> replacement =
        (WriteFiles<Object, Void, Object>) factory.getReplacementTransform(originalApplication).getTransform();
    assertThat(replacement, not(equalTo((Object) original)));
    assertThat(replacement.getNumShardsProvider().get(), equalTo(expectedNumShards));
    // Apply both transforms so their outputs can be related through mapOutputs.
    WriteFilesResult<Void> originalResult = objs.apply(original);
    WriteFilesResult<Void> replacementResult = objs.apply(replacement);
    Map<PCollection<?>, ReplacementOutput> res =
        factory.mapOutputs(PValues.expandOutput(originalResult), replacementResult);
    // Exactly one output is mapped, and it points back to the original's filenames output.
    assertEquals(1, res.size());
    assertEquals(
        originalResult.getPerDestinationOutputFilenames(),
        res.get(replacementResult.getPerDestinationOutputFilenames()).getOriginal().getValue());
}
Also used:
StreamingShardedWriteFactory (org.apache.beam.runners.dataflow.DataflowRunner.StreamingShardedWriteFactory)
WriteFilesResult (org.apache.beam.sdk.io.WriteFilesResult)
TestPipeline (org.apache.beam.sdk.testing.TestPipeline)
PCollection (org.apache.beam.sdk.values.PCollection)
ReplacementOutput (org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput)
StorageObject (com.google.api.services.storage.model.StorageObject)
WriteFiles (org.apache.beam.sdk.io.WriteFiles)

Aggregations

StorageObject (com.google.api.services.storage.model.StorageObject): 1
StreamingShardedWriteFactory (org.apache.beam.runners.dataflow.DataflowRunner.StreamingShardedWriteFactory): 1
WriteFiles (org.apache.beam.sdk.io.WriteFiles): 1
WriteFilesResult (org.apache.beam.sdk.io.WriteFilesResult): 1
ReplacementOutput (org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput): 1
TestPipeline (org.apache.beam.sdk.testing.TestPipeline): 1
PCollection (org.apache.beam.sdk.values.PCollection): 1