
Example 1 with ReplacementOutput

use of org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput in project beam by apache.

the class ReplacementOutputs method tagged.

public static Map<PValue, ReplacementOutput> tagged(Map<TupleTag<?>, PValue> original, POutput replacement) {
    Map<TupleTag<?>, TaggedPValue> originalTags = new HashMap<>();
    for (Map.Entry<TupleTag<?>, PValue> originalValue : original.entrySet()) {
        originalTags.put(originalValue.getKey(), TaggedPValue.of(originalValue.getKey(), originalValue.getValue()));
    }
    ImmutableMap.Builder<PValue, ReplacementOutput> resultBuilder = ImmutableMap.builder();
    Set<TupleTag<?>> missingTags = new HashSet<>(originalTags.keySet());
    for (Map.Entry<TupleTag<?>, PValue> replacementValue : replacement.expand().entrySet()) {
        TaggedPValue mapped = originalTags.get(replacementValue.getKey());
        checkArgument(mapped != null, "Missing original output for Tag %s and Value %s Between original %s and replacement %s", replacementValue.getKey(), replacementValue.getValue(), original, replacement.expand());
        resultBuilder.put(replacementValue.getValue(), ReplacementOutput.of(mapped, TaggedPValue.of(replacementValue.getKey(), replacementValue.getValue())));
        missingTags.remove(replacementValue.getKey());
    }
    ImmutableMap<PValue, ReplacementOutput> result = resultBuilder.build();
    checkArgument(missingTags.isEmpty(), "Missing replacement for tags %s. Encountered tags: %s", missingTags, result.keySet());
    return result;
}
Also used : HashMap(java.util.HashMap) TupleTag(org.apache.beam.sdk.values.TupleTag) PValue(org.apache.beam.sdk.values.PValue) TaggedPValue(org.apache.beam.sdk.values.TaggedPValue) ImmutableMap(com.google.common.collect.ImmutableMap) ReplacementOutput(org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput) Map(java.util.Map) HashSet(java.util.HashSet)
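
The tag-matching idea in the method above can be sketched without any Beam dependency. This is a minimal illustration, not Beam's implementation: `TagMatchingSketch`, `Pair`, and `matchByTag` are hypothetical names, and plain strings stand in for `TupleTag` and `PValue`. It assumes only the two checks visible in the example: every replacement tag must have an original, and every original tag must receive a replacement.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class TagMatchingSketch {
    /** Hypothetical stand-in for ReplacementOutput: an original value paired with its replacement. */
    public static final class Pair {
        public final String original;
        public final String replacement;

        public Pair(String original, String replacement) {
            this.original = original;
            this.replacement = replacement;
        }
    }

    /**
     * Maps each replacement value to the original value that shares its tag.
     * Throws IllegalArgumentException when the two tag sets differ.
     */
    public static Map<String, Pair> matchByTag(
            Map<String, String> original, Map<String, String> replacement) {
        Map<String, Pair> result = new LinkedHashMap<>();
        // Tracks original tags that have not yet been matched by a replacement.
        Map<String, String> remaining = new HashMap<>(original);
        for (Map.Entry<String, String> entry : replacement.entrySet()) {
            String originalValue = original.get(entry.getKey());
            if (originalValue == null) {
                throw new IllegalArgumentException(
                        "Missing original output for tag " + entry.getKey());
            }
            result.put(entry.getValue(), new Pair(originalValue, entry.getValue()));
            remaining.remove(entry.getKey());
        }
        if (!remaining.isEmpty()) {
            throw new IllegalArgumentException(
                    "Missing replacement for tags " + remaining.keySet());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("main", "origMain");
        original.put("side", "origSide");
        Map<String, String> replacement = new HashMap<>();
        replacement.put("main", "newMain");
        replacement.put("side", "newSide");
        // Looks up the original value paired with the replacement "newMain".
        System.out.println(matchByTag(original, replacement).get("newMain").original);
    }
}
```

Keying the result by the replacement value mirrors the example's return type, so a caller holding a replacement output can look up the original it stands in for.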

Example 2 with ReplacementOutput

use of org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput in project beam by apache.

the class TransformHierarchyTest method replaceSucceeds.

@Test
public void replaceSucceeds() {
    PTransform<?, ?> enclosingPT = new PTransform<PInput, POutput>() {

        @Override
        public POutput expand(PInput input) {
            return PDone.in(input.getPipeline());
        }
    };
    TransformHierarchy.Node enclosing = hierarchy.pushNode("Enclosing", PBegin.in(pipeline), enclosingPT);
    Create.Values<Long> originalTransform = Create.of(1L);
    TransformHierarchy.Node original = hierarchy.pushNode("Create", PBegin.in(pipeline), originalTransform);
    assertThat(hierarchy.getCurrent(), equalTo(original));
    PCollection<Long> originalOutput = pipeline.apply(originalTransform);
    hierarchy.setOutput(originalOutput);
    hierarchy.popNode();
    assertThat(original.finishedSpecifying, is(true));
    hierarchy.setOutput(PDone.in(pipeline));
    hierarchy.popNode();
    assertThat(hierarchy.getCurrent(), not(equalTo(enclosing)));
    Read.Bounded<Long> replacementTransform = Read.from(CountingSource.upTo(1L));
    PCollection<Long> replacementOutput = pipeline.apply(replacementTransform);
    Node replacement = hierarchy.replaceNode(original, PBegin.in(pipeline), replacementTransform);
    assertThat(hierarchy.getCurrent(), equalTo(replacement));
    hierarchy.setOutput(replacementOutput);
    TaggedPValue taggedReplacement = TaggedPValue.ofExpandedValue(replacementOutput);
    Map<PCollection<?>, ReplacementOutput> replacementOutputs = Collections.singletonMap(replacementOutput, ReplacementOutput.of(TaggedPValue.ofExpandedValue(originalOutput), taggedReplacement));
    hierarchy.replaceOutputs(replacementOutputs);
    assertThat(replacement.getInputs(), equalTo(original.getInputs()));
    assertThat(replacement.getEnclosingNode(), equalTo(original.getEnclosingNode()));
    assertThat(replacement.getEnclosingNode(), equalTo(enclosing));
    assertThat(replacement.getTransform(), equalTo(replacementTransform));
    // The tags of the replacement transform are matched to the appropriate PValues of the original
    assertThat(replacement.getOutputs().keySet(), Matchers.contains(taggedReplacement.getTag()));
    assertThat(replacement.getOutputs().values(), Matchers.contains(originalOutput));
    hierarchy.popNode();
}
Also used : Node(org.apache.beam.sdk.runners.TransformHierarchy.Node) PInput(org.apache.beam.sdk.values.PInput) Read(org.apache.beam.sdk.io.Read) PCollection(org.apache.beam.sdk.values.PCollection) ReplacementOutput(org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput) Create(org.apache.beam.sdk.transforms.Create) TaggedPValue(org.apache.beam.sdk.values.TaggedPValue) PTransform(org.apache.beam.sdk.transforms.PTransform) Test(org.junit.Test)

Example 3 with ReplacementOutput

use of org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput in project beam by apache.

the class ReplacementOutputs method tagged.

public static Map<PCollection<?>, ReplacementOutput> tagged(Map<TupleTag<?>, PCollection<?>> original, POutput replacement) {
    Map<TupleTag<?>, TaggedPValue> originalTags = new HashMap<>();
    for (Map.Entry<TupleTag<?>, PCollection<?>> originalValue : original.entrySet()) {
        originalTags.put(originalValue.getKey(), TaggedPValue.of(originalValue.getKey(), originalValue.getValue()));
    }
    ImmutableMap.Builder<PCollection<?>, ReplacementOutput> resultBuilder = ImmutableMap.builder();
    Map<TupleTag<?>, PCollection<?>> remainingTaggedOriginals = new HashMap<>(original);
    Map<TupleTag<?>, PCollection<?>> taggedReplacements = PValues.expandOutput(replacement);
    for (Map.Entry<TupleTag<?>, PCollection<?>> replacementValue : taggedReplacements.entrySet()) {
        TaggedPValue mapped = originalTags.get(replacementValue.getKey());
        checkArgument(mapped != null, "Missing original output for Tag %s and Value %s Between original %s and replacement %s", replacementValue.getKey(), replacementValue.getValue(), original, replacement.expand());
        resultBuilder.put(replacementValue.getValue(), ReplacementOutput.of(mapped, TaggedPValue.of(replacementValue.getKey(), (PCollection<?>) replacementValue.getValue())));
        remainingTaggedOriginals.remove(replacementValue.getKey());
    }
    checkArgument(remainingTaggedOriginals.isEmpty(), "Missing replacement for tagged values %s. Replacement was: %s", remainingTaggedOriginals, taggedReplacements);
    return resultBuilder.build();
}
Also used : PCollection(org.apache.beam.sdk.values.PCollection) ReplacementOutput(org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput) HashMap(java.util.HashMap) TaggedPValue(org.apache.beam.sdk.values.TaggedPValue) TupleTag(org.apache.beam.sdk.values.TupleTag) ImmutableMap(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap) Map(java.util.Map)

Example 4 with ReplacementOutput

use of org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput in project beam by apache.

the class ReplacementOutputsTest method singletonSucceeds.

@Test
public void singletonSucceeds() {
    Map<PCollection<?>, ReplacementOutput> replacements = ReplacementOutputs.singleton(PValues.expandValue(ints), replacementInts);
    assertThat(replacements, Matchers.hasKey(replacementInts));
    ReplacementOutput replacement = replacements.get(replacementInts);
    Map.Entry<TupleTag<?>, PValue> taggedInts = Iterables.getOnlyElement(ints.expand().entrySet());
    assertThat(replacement.getOriginal().getTag(), equalTo(taggedInts.getKey()));
    assertThat(replacement.getOriginal().getValue(), equalTo(taggedInts.getValue()));
    assertThat(replacement.getReplacement().getValue(), equalTo(replacementInts));
}
Also used : PCollection(org.apache.beam.sdk.values.PCollection) ReplacementOutput(org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput) TupleTag(org.apache.beam.sdk.values.TupleTag) PValue(org.apache.beam.sdk.values.PValue) TaggedPValue(org.apache.beam.sdk.values.TaggedPValue) ImmutableMap(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap) Map(java.util.Map) Test(org.junit.Test)

Example 5 with ReplacementOutput

use of org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput in project beam by apache.

the class DataflowRunnerTest method testStreamingWriteOverride.

private void testStreamingWriteOverride(PipelineOptions options, int expectedNumShards) {
    TestPipeline p = TestPipeline.fromOptions(options);
    StreamingShardedWriteFactory<Object, Void, Object> factory = new StreamingShardedWriteFactory<>(p.getOptions());
    WriteFiles<Object, Void, Object> original = WriteFiles.to(new TestSink(tmpFolder.toString()));
    PCollection<Object> objs = (PCollection) p.apply(Create.empty(VoidCoder.of()));
    AppliedPTransform<PCollection<Object>, WriteFilesResult<Void>, WriteFiles<Object, Void, Object>> originalApplication = AppliedPTransform.of("writefiles", PValues.expandInput(objs), Collections.emptyMap(), original, ResourceHints.create(), p);
    WriteFiles<Object, Void, Object> replacement = (WriteFiles<Object, Void, Object>) factory.getReplacementTransform(originalApplication).getTransform();
    assertThat(replacement, not(equalTo((Object) original)));
    assertThat(replacement.getNumShardsProvider().get(), equalTo(expectedNumShards));
    WriteFilesResult<Void> originalResult = objs.apply(original);
    WriteFilesResult<Void> replacementResult = objs.apply(replacement);
    Map<PCollection<?>, ReplacementOutput> res = factory.mapOutputs(PValues.expandOutput(originalResult), replacementResult);
    assertEquals(1, res.size());
    assertEquals(originalResult.getPerDestinationOutputFilenames(), res.get(replacementResult.getPerDestinationOutputFilenames()).getOriginal().getValue());
}
Also used : StreamingShardedWriteFactory(org.apache.beam.runners.dataflow.DataflowRunner.StreamingShardedWriteFactory) WriteFilesResult(org.apache.beam.sdk.io.WriteFilesResult) TestPipeline(org.apache.beam.sdk.testing.TestPipeline) PCollection(org.apache.beam.sdk.values.PCollection) ReplacementOutput(org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput) StorageObject(com.google.api.services.storage.model.StorageObject) WriteFiles(org.apache.beam.sdk.io.WriteFiles)

Aggregations

ReplacementOutput (org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput): 6
PCollection (org.apache.beam.sdk.values.PCollection): 5
TaggedPValue (org.apache.beam.sdk.values.TaggedPValue): 4
Map (java.util.Map): 3
TupleTag (org.apache.beam.sdk.values.TupleTag): 3
Test (org.junit.Test): 3
HashMap (java.util.HashMap): 2
PValue (org.apache.beam.sdk.values.PValue): 2
ImmutableMap (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap): 2
StorageObject (com.google.api.services.storage.model.StorageObject): 1
ImmutableMap (com.google.common.collect.ImmutableMap): 1
HashSet (java.util.HashSet): 1
StreamingShardedWriteFactory (org.apache.beam.runners.dataflow.DataflowRunner.StreamingShardedWriteFactory): 1
Read (org.apache.beam.sdk.io.Read): 1
WriteFiles (org.apache.beam.sdk.io.WriteFiles): 1
WriteFilesResult (org.apache.beam.sdk.io.WriteFilesResult): 1
Node (org.apache.beam.sdk.runners.TransformHierarchy.Node): 1
TestPipeline (org.apache.beam.sdk.testing.TestPipeline): 1
Create (org.apache.beam.sdk.transforms.Create): 1
PTransform (org.apache.beam.sdk.transforms.PTransform): 1