
Example 1 with PValue

Use of org.apache.beam.sdk.values.PValue in the Apache Beam project.

Class TransformHierarchy, method setOutput.

/**
 * Set the output of the current {@link Node}. If the output is new (setOutput has
 * not previously been called with it as the parameter), the current node is set as the producer
 * of that {@link POutput}.
 *
 * <p>Also validates the output - specifically, a Primitive {@link PTransform} produces all of
 * its outputs, and a Composite {@link PTransform} produces none of its outputs. Verifies that the
 * expanded output does not contain {@link PValue PValues} produced by both this node and other
 * nodes.
 */
public void setOutput(POutput output) {
    for (PValue value : output.expand().values()) {
        if (!producers.containsKey(value)) {
            producers.put(value, current);
            value.finishSpecifyingOutput(current.getFullName(), unexpandedInputs.get(current), current.transform);
        }
        producerInput.put(value, unexpandedInputs.get(current));
    }
    output.finishSpecifyingOutput(current.getFullName(), unexpandedInputs.get(current), current.transform);
    current.setOutput(output);
}
Also used : PValue(org.apache.beam.sdk.values.PValue)
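
The bookkeeping in setOutput above comes down to a first-writer-wins map: a value is assigned a producer only the first time it is seen. The following is a minimal sketch of that idea in plain Java, with no Beam dependency. ProducerRegistry, register, producerOf, and the string-typed values and nodes are hypothetical stand-ins chosen for illustration, not Beam's API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the producers map kept by TransformHierarchy. */
public class ProducerRegistry {
    // Maps each value name to the node that first produced it.
    private final Map<String, String> producers = new HashMap<>();

    /**
     * Records node as the producer of value unless a producer is already recorded.
     * Returns true if this call registered the producer.
     */
    public boolean register(String value, String node) {
        // putIfAbsent returns null only when no mapping existed before this call.
        return producers.putIfAbsent(value, node) == null;
    }

    /** Returns the recorded producer of value, or null if none was registered. */
    public String producerOf(String value) {
        return producers.get(value);
    }

    public static void main(String[] args) {
        ProducerRegistry registry = new ProducerRegistry();
        System.out.println(registry.register("pc1", "Read"));      // prints true
        System.out.println(registry.register("pc1", "Composite")); // prints false
        System.out.println(registry.producerOf("pc1"));            // prints Read
    }
}
```

In this model a composite that re-emits a value created by one of its inner transforms does not take over as producer, which mirrors the containsKey guard in the example.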

Example 2 with PValue

Use of org.apache.beam.sdk.values.PValue in the Apache Beam project.

Class TransformHierarchy, method replaceNode.

public Node replaceNode(Node existing, PInput input, PTransform<?, ?> transform) {
    checkNotNull(existing);
    checkNotNull(input);
    checkNotNull(transform);
    checkState(unexpandedInputs.isEmpty(), "Replacing a node when the graph has an unexpanded input. This is an SDK bug.");
    Node replacement = new Node(existing.getEnclosingNode(), transform, existing.getFullName(), input);
    for (PValue output : existing.getOutputs().values()) {
        Node producer = producers.get(output);
        boolean producedInExisting = false;
        // Walk outward through the enclosing nodes until we either find the node being
        // replaced or arrive at the root.
        do {
            if (producer.equals(existing)) {
                producedInExisting = true;
            } else {
                producer = producer.getEnclosingNode();
            }
        } while (!producedInExisting && !producer.isRootNode());
        if (producedInExisting) {
            // The value was produced inside the replaced node, so its producer entry is stale.
            producers.remove(output);
            LOG.debug("Removed producer for value {} as it is part of a replaced composite {}", output, existing.getFullName());
        } else {
            LOG.debug("Value {} not produced in existing node {}", output, existing.getFullName());
        }
    }
    existing.getEnclosingNode().replaceChild(existing, replacement);
    unexpandedInputs.remove(existing);
    unexpandedInputs.put(replacement, input);
    current = replacement;
    return replacement;
}
Also used : PValue(org.apache.beam.sdk.values.PValue)
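
The loop in replaceNode decides whether a value's producer sits inside the node being replaced by following the chain of enclosing nodes. A minimal sketch of that ancestry check, written as plain Java with no Beam dependency, is given here. EnclosureCheck, its nested Node class, and isProducedWithin are hypothetical names for illustration. Unlike the do-while loop in the example, this version also tests the root itself and tolerates a null producer.

```java
import java.util.Objects;

/** Hypothetical model of a transform tree; only the enclosing-node chain is represented. */
public class EnclosureCheck {
    public static final class Node {
        final String name;
        final Node enclosing; // null for the root node

        public Node(String name, Node enclosing) {
            this.name = Objects.requireNonNull(name);
            this.enclosing = enclosing;
        }
    }

    /** Returns true if producer is ancestor itself or is nested anywhere inside it. */
    public static boolean isProducedWithin(Node producer, Node ancestor) {
        // Walk outward from the producer; stop at the first match or after the root.
        for (Node n = producer; n != null; n = n.enclosing) {
            if (n == ancestor) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node root = new Node("root", null);
        Node composite = new Node("composite", root);
        Node primitive = new Node("primitive", composite);
        Node sibling = new Node("sibling", root);
        System.out.println(isProducedWithin(primitive, composite)); // prints true
        System.out.println(isProducedWithin(sibling, composite));   // prints false
    }
}
```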

Example 3 with PValue

Use of org.apache.beam.sdk.values.PValue in the Apache Beam project.

Class ReplacementOutputs, method tagged.

public static Map<PValue, ReplacementOutput> tagged(Map<TupleTag<?>, PValue> original, POutput replacement) {
    Map<TupleTag<?>, TaggedPValue> originalTags = new HashMap<>();
    for (Map.Entry<TupleTag<?>, PValue> originalValue : original.entrySet()) {
        originalTags.put(originalValue.getKey(), TaggedPValue.of(originalValue.getKey(), originalValue.getValue()));
    }
    ImmutableMap.Builder<PValue, ReplacementOutput> resultBuilder = ImmutableMap.builder();
    Set<TupleTag<?>> missingTags = new HashSet<>(originalTags.keySet());
    for (Map.Entry<TupleTag<?>, PValue> replacementValue : replacement.expand().entrySet()) {
        TaggedPValue mapped = originalTags.get(replacementValue.getKey());
        checkArgument(mapped != null, "Missing original output for Tag %s and Value %s Between original %s and replacement %s", replacementValue.getKey(), replacementValue.getValue(), original, replacement.expand());
        resultBuilder.put(replacementValue.getValue(), ReplacementOutput.of(mapped, TaggedPValue.of(replacementValue.getKey(), replacementValue.getValue())));
        missingTags.remove(replacementValue.getKey());
    }
    ImmutableMap<PValue, ReplacementOutput> result = resultBuilder.build();
    checkArgument(missingTags.isEmpty(), "Missing replacement for tags %s. Encountered tags: %s", missingTags, result.keySet());
    return result;
}
Also used : HashMap(java.util.HashMap) TupleTag(org.apache.beam.sdk.values.TupleTag) PValue(org.apache.beam.sdk.values.PValue) TaggedPValue(org.apache.beam.sdk.values.TaggedPValue) ImmutableMap(com.google.common.collect.ImmutableMap) ReplacementOutput(org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput) Map(java.util.Map) HashSet(java.util.HashSet)
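
The tagged method pairs each replacement output with the original output that carries the same tag, and rejects any tag present on only one side. A simplified sketch of that two-way check in plain Java is given here. TagMatcher and match are hypothetical names, and tags and values are modeled as strings rather than TupleTag and PValue.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, string-based model of matching replacement outputs to originals by tag. */
public class TagMatcher {
    /**
     * Both arguments map tag to value. The result maps each replacement value to the
     * original value that shares its tag.
     */
    public static Map<String, String> match(Map<String, String> original, Map<String, String> replacement) {
        Map<String, String> result = new LinkedHashMap<>();
        // Start with every original tag; remove each one as its replacement is found.
        Set<String> missingTags = new HashSet<>(original.keySet());
        for (Map.Entry<String, String> entry : replacement.entrySet()) {
            String originalValue = original.get(entry.getKey());
            if (originalValue == null) {
                throw new IllegalArgumentException("Missing original output for tag " + entry.getKey());
            }
            result.put(entry.getValue(), originalValue);
            missingTags.remove(entry.getKey());
        }
        // Any tag still present had an original output but no replacement.
        if (!missingTags.isEmpty()) {
            throw new IllegalArgumentException("Missing replacement for tags " + missingTags);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> original = new LinkedHashMap<>();
        original.put("main", "origMain");
        original.put("side", "origSide");
        Map<String, String> replacement = new LinkedHashMap<>();
        replacement.put("main", "newMain");
        replacement.put("side", "newSide");
        System.out.println(match(original, replacement).get("newMain")); // prints origMain
    }
}
```

Tracking the unmatched tags in a separate set is what lets the second check report exactly which originals were left without a replacement.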

Example 4 with PValue

Use of org.apache.beam.sdk.values.PValue in the Apache Beam project.

Class ReplacementOutputsTest, method singletonSucceeds.

@Test
public void singletonSucceeds() {
    Map<PValue, ReplacementOutput> replacements = ReplacementOutputs.singleton(ints.expand(), replacementInts);
    assertThat(replacements, Matchers.<PValue>hasKey(replacementInts));
    ReplacementOutput replacement = replacements.get(replacementInts);
    Map.Entry<TupleTag<?>, PValue> taggedInts = Iterables.getOnlyElement(ints.expand().entrySet());
    assertThat(replacement.getOriginal().getTag(), Matchers.<TupleTag<?>>equalTo(taggedInts.getKey()));
    assertThat(replacement.getOriginal().getValue(), equalTo(taggedInts.getValue()));
    assertThat(replacement.getReplacement().getValue(), Matchers.<PValue>equalTo(replacementInts));
}
Also used : ReplacementOutput(org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput) TupleTag(org.apache.beam.sdk.values.TupleTag) PValue(org.apache.beam.sdk.values.PValue) TaggedPValue(org.apache.beam.sdk.values.TaggedPValue) ImmutableMap(com.google.common.collect.ImmutableMap) Map(java.util.Map) Test(org.junit.Test)

Example 5 with PValue

Use of org.apache.beam.sdk.values.PValue in the Apache Beam project.

Class SdkComponentsTest, method translatePipeline.

@Test
public void translatePipeline() {
    BigEndianLongCoder customCoder = BigEndianLongCoder.of();
    PCollection<Long> elems = pipeline.apply(GenerateSequence.from(0L).to(207L));
    PCollection<Long> counted = elems.apply(Count.<Long>globally()).setCoder(customCoder);
    PCollection<Long> windowed = counted.apply(Window.<Long>into(FixedWindows.of(Duration.standardMinutes(7))).triggering(AfterWatermark.pastEndOfWindow().withEarlyFirings(AfterPane.elementCountAtLeast(19))).accumulatingFiredPanes().withAllowedLateness(Duration.standardMinutes(3L)));
    final WindowingStrategy<?, ?> windowedStrategy = windowed.getWindowingStrategy();
    PCollection<KV<String, Long>> keyed = windowed.apply(WithKeys.<String, Long>of("foo"));
    PCollection<KV<String, Iterable<Long>>> grouped = keyed.apply(GroupByKey.<String, Long>create());
    final RunnerApi.Pipeline pipelineProto = SdkComponents.translatePipeline(pipeline);
    pipeline.traverseTopologically(new PipelineVisitor.Defaults() {

        Set<Node> transforms = new HashSet<>();

        Set<PCollection<?>> pcollections = new HashSet<>();

        Set<Equivalence.Wrapper<? extends Coder<?>>> coders = new HashSet<>();

        Set<WindowingStrategy<?, ?>> windowingStrategies = new HashSet<>();

        @Override
        public void leaveCompositeTransform(Node node) {
            if (node.isRootNode()) {
                assertThat("Unexpected number of PTransforms", pipelineProto.getComponents().getTransformsCount(), equalTo(transforms.size()));
                assertThat("Unexpected number of PCollections", pipelineProto.getComponents().getPcollectionsCount(), equalTo(pcollections.size()));
                assertThat("Unexpected number of Coders", pipelineProto.getComponents().getCodersCount(), equalTo(coders.size()));
                assertThat("Unexpected number of Windowing Strategies", pipelineProto.getComponents().getWindowingStrategiesCount(), equalTo(windowingStrategies.size()));
            } else {
                transforms.add(node);
            }
        }

        @Override
        public void visitPrimitiveTransform(Node node) {
            transforms.add(node);
        }

        @Override
        public void visitValue(PValue value, Node producer) {
            if (value instanceof PCollection) {
                PCollection<?> pc = (PCollection<?>) value;
                pcollections.add(pc);
                addCoders(pc.getCoder());
                windowingStrategies.add(pc.getWindowingStrategy());
                addCoders(pc.getWindowingStrategy().getWindowFn().windowCoder());
            }
        }

        private void addCoders(Coder<?> coder) {
            coders.add(Equivalence.<Coder<?>>identity().wrap(coder));
            if (coder instanceof StructuredCoder) {
                for (Coder<?> component : ((StructuredCoder<?>) coder).getComponents()) {
                    addCoders(component);
                }
            }
        }
    });
}
Also used : Node(org.apache.beam.sdk.runners.TransformHierarchy.Node) WindowingStrategy(org.apache.beam.sdk.values.WindowingStrategy) RunnerApi(org.apache.beam.sdk.common.runner.v1.RunnerApi) PipelineVisitor(org.apache.beam.sdk.Pipeline.PipelineVisitor) BigEndianLongCoder(org.apache.beam.sdk.coders.BigEndianLongCoder) HashSet(java.util.HashSet) Coder(org.apache.beam.sdk.coders.Coder) SetCoder(org.apache.beam.sdk.coders.SetCoder) StringUtf8Coder(org.apache.beam.sdk.coders.StringUtf8Coder) KvCoder(org.apache.beam.sdk.coders.KvCoder) IterableCoder(org.apache.beam.sdk.coders.IterableCoder) VarLongCoder(org.apache.beam.sdk.coders.VarLongCoder) StructuredCoder(org.apache.beam.sdk.coders.StructuredCoder) ByteArrayCoder(org.apache.beam.sdk.coders.ByteArrayCoder) KV(org.apache.beam.sdk.values.KV) PValue(org.apache.beam.sdk.values.PValue) PCollection(org.apache.beam.sdk.values.PCollection) Test(org.junit.Test)
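
The addCoders helper in the test collects a coder and all of its components recursively, keyed by object identity rather than by equals. A rough sketch of the same traversal using only the JDK is given here. IdentityCollector, Component, and collect are hypothetical names, and an identity-based set from java.util stands in for Guava's Equivalence wrapper. Unlike the test, this version skips a subtree whose root instance it has already visited.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a coder-like tree whose nodes compare equal by name. */
public class IdentityCollector {
    public static final class Component {
        final String name;
        final List<Component> children;

        public Component(String name, Component... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Component && ((Component) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    /** Collects root and every nested component, treating distinct instances as distinct. */
    public static Set<Component> collect(Component root) {
        // A set backed by IdentityHashMap compares elements with ==, not equals().
        Set<Component> seen = Collections.newSetFromMap(new IdentityHashMap<Component, Boolean>());
        collectInto(root, seen);
        return seen;
    }

    private static void collectInto(Component component, Set<Component> seen) {
        if (!seen.add(component)) {
            return; // this exact instance was already visited
        }
        for (Component child : component.children) {
            collectInto(child, seen);
        }
    }

    public static void main(String[] args) {
        // Two distinct instances that are equal by name.
        Component kv = new Component("kv", new Component("long"), new Component("long"));
        System.out.println(collect(kv).size()); // prints 3
    }
}
```

A regular HashSet would merge the two equal "long" components into one entry, so the identity-based set is what makes the count reflect instances rather than equivalence classes.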

Aggregations

PValue (org.apache.beam.sdk.values.PValue): 28
TupleTag (org.apache.beam.sdk.values.TupleTag): 13
PCollection (org.apache.beam.sdk.values.PCollection): 12
Test (org.junit.Test): 9
TaggedPValue (org.apache.beam.sdk.values.TaggedPValue): 7
HashSet (java.util.HashSet): 5
Map (java.util.Map): 5
Node (org.apache.beam.sdk.runners.TransformHierarchy.Node): 5
WindowedValue (org.apache.beam.sdk.util.WindowedValue): 5
ImmutableMap (com.google.common.collect.ImmutableMap): 4
ReplacementOutput (org.apache.beam.sdk.runners.PTransformOverrideFactory.ReplacementOutput): 4
PTransform (org.apache.beam.sdk.transforms.PTransform): 4
PCollectionTuple (org.apache.beam.sdk.values.PCollectionTuple): 4
JavaRDD (org.apache.spark.api.java.JavaRDD): 4
DoFn (org.apache.beam.sdk.transforms.DoFn): 3
ParDo (org.apache.beam.sdk.transforms.ParDo): 3
ImmutableList (com.google.common.collect.ImmutableList): 2
HashMap (java.util.HashMap): 2
MetricsContainerStepMap (org.apache.beam.runners.core.metrics.MetricsContainerStepMap): 2
EvaluationContext (org.apache.beam.runners.spark.translation.EvaluationContext): 2