Example 1 with StreamingPCollectionViewWriterFn

Use of org.apache.beam.runners.dataflow.DataflowRunner.StreamingPCollectionViewWriterFn in the Apache Beam project.

Class UserParDoFnFactory, method create.

@Override
public ParDoFn create(PipelineOptions options, CloudObject cloudUserFn,
        @Nullable List<SideInputInfo> sideInputInfos, TupleTag<?> mainOutputTag,
        Map<TupleTag<?>, Integer> outputTupleTagsToReceiverIndices,
        DataflowExecutionContext<?> executionContext,
        DataflowOperationContext operationContext) throws Exception {
    DoFnInstanceManager instanceManager = fnCache.get(
            operationContext.nameContext().systemName(),
            () -> DoFnInstanceManagers.cloningPool(doFnExtractor.getDoFnInfo(cloudUserFn), options));
    DoFnInfo<?, ?> doFnInfo = instanceManager.peek();
    DataflowExecutionContext.DataflowStepContext stepContext = executionContext.getStepContext(operationContext);
    Iterable<PCollectionView<?>> sideInputViews = doFnInfo.getSideInputViews();
    SideInputReader sideInputReader = executionContext.getSideInputReader(sideInputInfos, sideInputViews, operationContext);
    if (doFnInfo.getDoFn() instanceof BatchStatefulParDoOverrides.BatchStatefulDoFn) {
        // HACK: BatchStatefulDoFn is a class from DataflowRunner's overrides
        // that just instructs the worker to execute it differently. This will
        // be replaced by metadata in the Runner API payload
        BatchStatefulParDoOverrides.BatchStatefulDoFn fn = (BatchStatefulParDoOverrides.BatchStatefulDoFn) doFnInfo.getDoFn();
        DoFn underlyingFn = fn.getUnderlyingDoFn();
        return new BatchModeUngroupingParDoFn(
                (BatchModeExecutionContext.StepContext) stepContext,
                new SimpleParDoFn(
                        options,
                        DoFnInstanceManagers.singleInstance(doFnInfo.withFn(underlyingFn)),
                        sideInputReader, doFnInfo.getMainOutput(),
                        outputTupleTagsToReceiverIndices, stepContext, operationContext,
                        doFnInfo.getDoFnSchemaInformation(), doFnInfo.getSideInputMapping(),
                        runnerFactory));
    } else if (doFnInfo.getDoFn() instanceof StreamingPCollectionViewWriterFn) {
        // HACK: StreamingPCollectionViewWriterFn is a class from
        // DataflowPipelineTranslator. Using the class as an indicator is a migration path
        // to simply having an indicator string.
        checkArgument(
                stepContext instanceof StreamingModeExecutionContext.StreamingModeStepContext,
                "stepContext must be a StreamingModeStepContext to use StreamingPCollectionViewWriterFn");
        DataflowRunner.StreamingPCollectionViewWriterFn<Object> writerFn =
                (StreamingPCollectionViewWriterFn<Object>) doFnInfo.getDoFn();
        return new StreamingPCollectionViewWriterParDoFn(
                (StreamingModeExecutionContext.StreamingModeStepContext) stepContext,
                writerFn.getView().getTagInternal(),
                writerFn.getDataCoder(),
                (Coder<BoundedWindow>) doFnInfo.getWindowingStrategy().getWindowFn().windowCoder());
    } else {
        return new SimpleParDoFn(
                options, instanceManager, sideInputReader, doFnInfo.getMainOutput(),
                outputTupleTagsToReceiverIndices, stepContext, operationContext,
                doFnInfo.getDoFnSchemaInformation(), doFnInfo.getSideInputMapping(), runnerFactory);
    }
}
Also used:
Coder (org.apache.beam.sdk.coders.Coder)
SideInputReader (org.apache.beam.runners.core.SideInputReader)
BatchStatefulParDoOverrides (org.apache.beam.runners.dataflow.BatchStatefulParDoOverrides)
StreamingPCollectionViewWriterFn (org.apache.beam.runners.dataflow.DataflowRunner.StreamingPCollectionViewWriterFn)
PCollectionView (org.apache.beam.sdk.values.PCollectionView)
DoFn (org.apache.beam.sdk.transforms.DoFn)
ParDoFn (org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoFn)
CloudObject (org.apache.beam.runners.dataflow.util.CloudObject)
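
The create method above picks an executor by testing the concrete class of the DoFn with instanceof, which its own comments call a hack: the class acts only as an indicator. The following is a minimal, self-contained sketch of that dispatch pattern. Every name in it (MarkerDispatchSketch, Fn, PlainFn, ViewWriterFn, BatchStatefulFn, and the string labels) is a hypothetical stand-in, not a Beam class, and the returned strings merely label which branch was taken.

```java
// Hypothetical stand-ins for illustration only; none of these are Beam classes.
public class MarkerDispatchSketch {

    interface Fn {}

    /** An ordinary user function with no special handling. */
    static final class PlainFn implements Fn {}

    /** Marker class: its type alone tells the factory to build a view writer. */
    static final class ViewWriterFn implements Fn {}

    /** Wrapper class: the factory unwraps it and executes the inner function differently. */
    static final class BatchStatefulFn implements Fn {
        private final Fn underlying;

        BatchStatefulFn(Fn underlying) {
            this.underlying = underlying;
        }

        Fn getUnderlying() {
            return underlying;
        }
    }

    /** Returns a label naming the executor that would be built for the given function. */
    static String create(Fn fn, boolean streamingContext) {
        if (fn instanceof BatchStatefulFn) {
            // Unwrap the marker and wrap the plain executor, as the first branch above does.
            Fn inner = ((BatchStatefulFn) fn).getUnderlying();
            return "ungrouping(simple:" + inner.getClass().getSimpleName() + ")";
        } else if (fn instanceof ViewWriterFn) {
            // The view writer is only valid in a streaming context, so check the precondition first.
            if (!streamingContext) {
                throw new IllegalArgumentException("a streaming context is required for ViewWriterFn");
            }
            return "viewWriter";
        } else {
            // Default path: run the function as-is.
            return "simple:" + fn.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(create(new PlainFn(), false));                      // simple:PlainFn
        System.out.println(create(new BatchStatefulFn(new PlainFn()), false)); // ungrouping(simple:PlainFn)
        System.out.println(create(new ViewWriterFn(), true));                  // viewWriter
    }
}
```

As in the original method, the order of the checks matters only if one marker type could also be an instance of another. The source comments suggest the instanceof checks are meant to give way eventually to an indicator string or metadata in the Runner API payload.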
