Search in sources :

Example 66 with PCollectionView

use of org.apache.beam.sdk.values.PCollectionView in project twister2 by DSC-SPIDAL.

the class ParDoMultiOutputTranslatorBatch method translateNode.

@Override
public void translateNode(ParDo.MultiOutput<IT, OT> transform, Twister2BatchTranslationContext context) {
    DoFn<IT, OT> doFn;
    doFn = transform.getFn();
    BatchTSetImpl<WindowedValue<IT>> inputTTSet = context.getInputDataSet(context.getInput(transform));
    WindowingStrategy<?, ?> windowingStrategy = context.getInput(transform).getWindowingStrategy();
    Coder<IT> inputCoder = (Coder<IT>) context.getInput(transform).getCoder();
    Map<TupleTag<?>, PValue> outputs = context.getOutputs();
    Map<TupleTag<?>, Coder<?>> outputCoders = context.getOutputCoders();
    DoFnSignature signature = DoFnSignatures.getSignature(transform.getFn().getClass());
    DoFnSchemaInformation doFnSchemaInformation;
    doFnSchemaInformation = ParDoTranslation.getSchemaInformation(context.getCurrentTransform());
    TupleTag<OT> mainOutput = transform.getMainOutputTag();
    List<TupleTag<?>> additionalOutputTags = new ArrayList<>(outputs.size() - 1);
    Collection<PCollectionView<?>> sideInputs = transform.getSideInputs();
    // construct a map from side input to WindowingStrategy so that
    // the DoFn runner can map main-input windows to side input windows
    Map<PCollectionView<?>, WindowingStrategy<?, ?>> sideInputStrategies = new HashMap<>();
    for (PCollectionView<?> sideInput : sideInputs) {
        sideInputStrategies.put(sideInput, sideInput.getWindowingStrategyInternal());
    }
    TupleTag<?> mainOutputTag;
    try {
        mainOutputTag = ParDoTranslation.getMainOutputTag(context.getCurrentTransform());
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    Map<TupleTag<?>, Integer> outputMap = Maps.newHashMap();
    // put the main output at index 0, FlinkMultiOutputDoFnFunction  expects this
    outputMap.put(mainOutputTag, 0);
    int count = 1;
    for (TupleTag<?> tag : outputs.keySet()) {
        if (!outputMap.containsKey(tag)) {
            outputMap.put(tag, count++);
        }
    }
    ComputeTSet<RawUnionValue> outputTSet = inputTTSet.direct().<RawUnionValue>compute(new DoFnFunction<OT, IT>(context, doFn, inputCoder, outputCoders, additionalOutputTags, windowingStrategy, sideInputStrategies, mainOutput, doFnSchemaInformation, outputMap));
    for (Map.Entry<TupleTag<?>, PValue> output : outputs.entrySet()) {
        ComputeTSet<WindowedValue<OT>> tempTSet = outputTSet.direct().compute(new OutputTagFilter<>(outputMap.get(output.getKey())));
        context.setOutputDataSet((PCollection) output.getValue(), tempTSet);
    }
}
Also used : HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) TupleTag(org.apache.beam.sdk.values.TupleTag) WindowingStrategy(org.apache.beam.sdk.values.WindowingStrategy) WindowedValue(org.apache.beam.sdk.util.WindowedValue) Coder(org.apache.beam.sdk.coders.Coder) RawUnionValue(org.apache.beam.sdk.transforms.join.RawUnionValue) IOException(java.io.IOException) PValue(org.apache.beam.sdk.values.PValue) PCollectionView(org.apache.beam.sdk.values.PCollectionView) DoFnSchemaInformation(org.apache.beam.sdk.transforms.DoFnSchemaInformation) HashMap(java.util.HashMap) Map(java.util.Map) DoFnSignature(org.apache.beam.sdk.transforms.reflect.DoFnSignature)

Example 67 with PCollectionView

use of org.apache.beam.sdk.values.PCollectionView in project twister2 by DSC-SPIDAL.

the class BeamBatchWorker method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    Config config = env.getConfig();
    Map<PCollectionView<?>, BatchTSet<?>> sideInputs = (Map<PCollectionView<?>, BatchTSet<?>>) config.get(SIDEINPUTS);
    Set<TSet> leaves = (Set<TSet>) config.get(LEAVES);
}
Also used : PCollectionView(org.apache.beam.sdk.values.PCollectionView) BatchTSet(edu.iu.dsc.tws.api.tset.sets.batch.BatchTSet) Set(java.util.Set) TSet(edu.iu.dsc.tws.api.tset.sets.TSet) BatchTSet(edu.iu.dsc.tws.api.tset.sets.batch.BatchTSet) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) Config(edu.iu.dsc.tws.api.config.Config) TSet(edu.iu.dsc.tws.api.tset.sets.TSet) BatchTSet(edu.iu.dsc.tws.api.tset.sets.batch.BatchTSet) Map(java.util.Map)

Aggregations

PCollectionView (org.apache.beam.sdk.values.PCollectionView)67 Map (java.util.Map)29 HashMap (java.util.HashMap)28 Test (org.junit.Test)28 TupleTag (org.apache.beam.sdk.values.TupleTag)27 BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow)22 Coder (org.apache.beam.sdk.coders.Coder)21 KV (org.apache.beam.sdk.values.KV)20 Instant (org.joda.time.Instant)20 KvCoder (org.apache.beam.sdk.coders.KvCoder)18 WindowedValue (org.apache.beam.sdk.util.WindowedValue)18 PCollection (org.apache.beam.sdk.values.PCollection)18 DoFn (org.apache.beam.sdk.transforms.DoFn)16 ArrayList (java.util.ArrayList)15 IntervalWindow (org.apache.beam.sdk.transforms.windowing.IntervalWindow)14 List (java.util.List)13 ImmutableMap (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap)13 IOException (java.io.IOException)12 RunnerApi (org.apache.beam.model.pipeline.v1.RunnerApi)12 ByteString (org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString)10