Example 1 with FunctionSpec

Use of org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec in project beam by apache.

the class ProcessBundleHandlerTest method testOrderOfSetupTeardownCalls.

@Test
public void testOrderOfSetupTeardownCalls() throws Exception {
    DoFnWithExecutionInformation doFnWithExecutionInformation = DoFnWithExecutionInformation.of(new TestDoFn(), TestDoFn.mainOutput, Collections.emptyMap(), DoFnSchemaInformation.create());
    RunnerApi.FunctionSpec functionSpec = RunnerApi.FunctionSpec.newBuilder().setUrn(ParDoTranslation.CUSTOM_JAVA_DO_FN_URN).setPayload(ByteString.copyFrom(SerializableUtils.serializeToByteArray(doFnWithExecutionInformation))).build();
    RunnerApi.ParDoPayload parDoPayload = RunnerApi.ParDoPayload.newBuilder().setDoFn(functionSpec).build();
    BeamFnApi.ProcessBundleDescriptor processBundleDescriptor = BeamFnApi.ProcessBundleDescriptor.newBuilder()
        .putTransforms("2L", PTransform.newBuilder()
            .setSpec(RunnerApi.FunctionSpec.newBuilder().setUrn(DATA_INPUT_URN).build())
            .putOutputs("2L-output", "2L-output-pc")
            .build())
        .putTransforms("3L", PTransform.newBuilder()
            .setSpec(RunnerApi.FunctionSpec.newBuilder().setUrn(PTransformTranslation.PAR_DO_TRANSFORM_URN).setPayload(parDoPayload.toByteString()))
            .putInputs("3L-input", "2L-output-pc")
            .build())
        .putPcollections("2L-output-pc", PCollection.newBuilder()
            .setWindowingStrategyId("window-strategy")
            .setCoderId("2L-output-coder")
            .setIsBounded(IsBounded.Enum.BOUNDED)
            .build())
        .putWindowingStrategies("window-strategy", WindowingStrategy.newBuilder()
            .setWindowCoderId("window-strategy-coder")
            .setWindowFn(RunnerApi.FunctionSpec.newBuilder().setUrn("beam:window_fn:global_windows:v1"))
            .setOutputTime(RunnerApi.OutputTime.Enum.END_OF_WINDOW)
            .setAccumulationMode(RunnerApi.AccumulationMode.Enum.ACCUMULATING)
            .setTrigger(RunnerApi.Trigger.newBuilder().setAlways(RunnerApi.Trigger.Always.getDefaultInstance()))
            .setClosingBehavior(RunnerApi.ClosingBehavior.Enum.EMIT_ALWAYS)
            .setOnTimeBehavior(RunnerApi.OnTimeBehavior.Enum.FIRE_ALWAYS)
            .build())
        .putCoders("2L-output-coder", CoderTranslation.toProto(StringUtf8Coder.of()).getCoder())
        .putCoders("window-strategy-coder", Coder.newBuilder()
            .setSpec(RunnerApi.FunctionSpec.newBuilder().setUrn(ModelCoders.GLOBAL_WINDOW_CODER_URN).build())
            .build())
        .build();
    Map<String, BeamFnApi.ProcessBundleDescriptor> fnApiRegistry = ImmutableMap.of("1L", processBundleDescriptor);
    Map<String, PTransformRunnerFactory> urnToPTransformRunnerFactoryMap = Maps.newHashMap(REGISTERED_RUNNER_FACTORIES);
    urnToPTransformRunnerFactoryMap.put(DATA_INPUT_URN, (context) -> null);
    ProcessBundleHandler handler = new ProcessBundleHandler(
        PipelineOptionsFactory.create(),
        Collections.emptySet(),
        fnApiRegistry::get,
        beamFnDataClient,
        null /* beamFnStateClient */,
        null /* finalizeBundleHandler */,
        new ShortIdMap(),
        urnToPTransformRunnerFactoryMap,
        Caches.noop(),
        new BundleProcessorCache());
    handler.processBundle(BeamFnApi.InstructionRequest.newBuilder().setInstructionId("998L").setProcessBundle(BeamFnApi.ProcessBundleRequest.newBuilder().setProcessBundleDescriptorId("1L")).build());
    handler.processBundle(BeamFnApi.InstructionRequest.newBuilder().setInstructionId("999L").setProcessBundle(BeamFnApi.ProcessBundleRequest.newBuilder().setProcessBundleDescriptorId("1L")).build());
    handler.shutdown();
    // setup and teardown should occur only once when processing multiple bundles for the same
    // descriptor
    assertThat(TestDoFn.orderOfOperations, contains("setUp", "startBundle", "finishBundle", "startBundle", "finishBundle", "tearDown"));
}
Also used : ProcessBundleDescriptor(org.apache.beam.model.fnexecution.v1.BeamFnApi.ProcessBundleDescriptor) ParDoPayload(org.apache.beam.model.pipeline.v1.RunnerApi.ParDoPayload) BeamFnApi(org.apache.beam.model.fnexecution.v1.BeamFnApi) BundleProcessorCache(org.apache.beam.fn.harness.control.ProcessBundleHandler.BundleProcessorCache) FunctionSpec(org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec) DoFnWithExecutionInformation(org.apache.beam.sdk.util.DoFnWithExecutionInformation) ByteString(org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString) ShortIdMap(org.apache.beam.runners.core.metrics.ShortIdMap) RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) PTransformRunnerFactory(org.apache.beam.fn.harness.PTransformRunnerFactory) Test(org.junit.Test)
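
The ordering asserted in Example 1 (one setUp, a startBundle/finishBundle pair per bundle, one tearDown at shutdown) can be illustrated without any Beam dependency. The following is a minimal sketch, not Beam's implementation: LifecycleSketch, RecordingFn and Handler are hypothetical names, and the per-descriptor cache is only an assumption about how such reuse could be arranged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the lifecycle checked by the test; none of these are Beam classes.
class LifecycleSketch {
    // Records lifecycle calls in the order they happen.
    static final List<String> calls = new ArrayList<>();

    // Stand-in for a DoFn with setup, bundle and teardown hooks.
    static class RecordingFn {
        void setUp() { calls.add("setUp"); }
        void startBundle() { calls.add("startBundle"); }
        void finishBundle() { calls.add("finishBundle"); }
        void tearDown() { calls.add("tearDown"); }
    }

    // Keeps one fn per descriptor id, so setUp runs only on first use.
    static class Handler {
        private final Map<String, RecordingFn> cache = new HashMap<>();

        void processBundle(String descriptorId) {
            RecordingFn fn = cache.computeIfAbsent(descriptorId, id -> {
                RecordingFn created = new RecordingFn();
                created.setUp();
                return created;
            });
            fn.startBundle();
            fn.finishBundle();
        }

        // Tears down every cached fn exactly once.
        void shutdown() {
            cache.values().forEach(RecordingFn::tearDown);
            cache.clear();
        }
    }

    // Processes two bundles for the same descriptor, then shuts down.
    static List<String> run() {
        calls.clear();
        Handler handler = new Handler();
        handler.processBundle("1L");
        handler.processBundle("1L");
        handler.shutdown();
        return new ArrayList<>(calls);
    }

    public static void main(String[] args) {
        // prints [setUp, startBundle, finishBundle, startBundle, finishBundle, tearDown]
        System.out.println(run());
    }
}
```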

Example 2 with FunctionSpec

Use of org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec in project beam by apache.

the class WindowMappingFnRunner method createMapFunctionForPTransform.

static <T, W1 extends BoundedWindow, W2 extends BoundedWindow> ThrowingFunction<KV<T, W1>, KV<T, W2>> createMapFunctionForPTransform(String ptransformId, PTransform pTransform) throws IOException {
    FunctionSpec windowMappingFnPayload = FunctionSpec.parseFrom(pTransform.getSpec().getPayload());
    WindowMappingFn<W2> windowMappingFn = (WindowMappingFn<W2>) PCollectionViewTranslation.windowMappingFnFromProto(windowMappingFnPayload);
    return (KV<T, W1> input) -> KV.of(input.getKey(), windowMappingFn.getSideInputWindow(input.getValue()));
}
Also used : WindowMappingFn(org.apache.beam.sdk.transforms.windowing.WindowMappingFn) FunctionSpec(org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec)

Example 3 with FunctionSpec

Use of org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec in project beam by apache.

the class ParDoTranslation method translateParDo.

/**
 * Translate a ParDo.
 */
public static <InputT> ParDoPayload translateParDo(ParDo.MultiOutput<InputT, ?> parDo, PCollection<InputT> mainInput, DoFnSchemaInformation doFnSchemaInformation, Pipeline pipeline, SdkComponents components) throws IOException {
    final DoFn<?, ?> doFn = parDo.getFn();
    final DoFnSignature signature = DoFnSignatures.getSignature(doFn.getClass());
    final String restrictionCoderId;
    if (signature.processElement().isSplittable()) {
        DoFnInvoker<?, ?> doFnInvoker = DoFnInvokers.invokerFor(doFn);
        final Coder<?> restrictionAndWatermarkStateCoder = KvCoder.of(doFnInvoker.invokeGetRestrictionCoder(pipeline.getCoderRegistry()), doFnInvoker.invokeGetWatermarkEstimatorStateCoder(pipeline.getCoderRegistry()));
        restrictionCoderId = components.registerCoder(restrictionAndWatermarkStateCoder);
    } else {
        restrictionCoderId = "";
    }
    Coder<BoundedWindow> windowCoder = (Coder<BoundedWindow>) mainInput.getWindowingStrategy().getWindowFn().windowCoder();
    Coder<?> keyCoder;
    if (signature.usesState() || signature.usesTimers()) {
        checkArgument(mainInput.getCoder() instanceof KvCoder, "DoFn's that use state or timers must have an input PCollection with a KvCoder but received %s", mainInput.getCoder());
        keyCoder = ((KvCoder) mainInput.getCoder()).getKeyCoder();
    } else {
        keyCoder = null;
    }
    return payloadForParDoLike(new ParDoLike() {

        @Override
        public FunctionSpec translateDoFn(SdkComponents newComponents) {
            return ParDoTranslation.translateDoFn(parDo.getFn(), parDo.getMainOutputTag(), parDo.getSideInputs(), doFnSchemaInformation, newComponents);
        }

        @Override
        public Map<String, SideInput> translateSideInputs(SdkComponents components) {
            Map<String, SideInput> sideInputs = new HashMap<>();
            for (PCollectionView<?> sideInput : parDo.getSideInputs().values()) {
                sideInputs.put(sideInput.getTagInternal().getId(), translateView(sideInput, components));
            }
            return sideInputs;
        }

        @Override
        public Map<String, RunnerApi.StateSpec> translateStateSpecs(SdkComponents components) throws IOException {
            Map<String, RunnerApi.StateSpec> stateSpecs = new HashMap<>();
            for (Map.Entry<String, StateDeclaration> state : signature.stateDeclarations().entrySet()) {
                RunnerApi.StateSpec spec = translateStateSpec(getStateSpecOrThrow(state.getValue(), doFn), components);
                stateSpecs.put(state.getKey(), spec);
            }
            return stateSpecs;
        }

        @Override
        public ParDoLikeTimerFamilySpecs translateTimerFamilySpecs(SdkComponents newComponents) {
            Map<String, RunnerApi.TimerFamilySpec> timerFamilySpecs = new HashMap<>();
            for (Map.Entry<String, TimerDeclaration> timer : signature.timerDeclarations().entrySet()) {
                RunnerApi.TimerFamilySpec spec = translateTimerFamilySpec(getTimerSpecOrThrow(timer.getValue(), doFn), newComponents, keyCoder, windowCoder);
                timerFamilySpecs.put(timer.getKey(), spec);
            }
            for (Map.Entry<String, DoFnSignature.TimerFamilyDeclaration> timerFamily : signature.timerFamilyDeclarations().entrySet()) {
                RunnerApi.TimerFamilySpec spec = translateTimerFamilySpec(DoFnSignatures.getTimerFamilySpecOrThrow(timerFamily.getValue(), doFn), newComponents, keyCoder, windowCoder);
                timerFamilySpecs.put(timerFamily.getKey(), spec);
            }
            String onWindowExpirationTimerFamilySpec = null;
            if (signature.onWindowExpiration() != null) {
                RunnerApi.TimerFamilySpec spec = RunnerApi.TimerFamilySpec.newBuilder().setTimeDomain(translateTimeDomain(TimeDomain.EVENT_TIME)).setTimerFamilyCoderId(registerCoderOrThrow(components, Timer.Coder.of(keyCoder, windowCoder))).build();
                for (int i = 0; i < Integer.MAX_VALUE; ++i) {
                    onWindowExpirationTimerFamilySpec = "onWindowExpiration" + i;
                    if (!timerFamilySpecs.containsKey(onWindowExpirationTimerFamilySpec)) {
                        break;
                    }
                }
                timerFamilySpecs.put(onWindowExpirationTimerFamilySpec, spec);
            }
            return ParDoLikeTimerFamilySpecs.create(timerFamilySpecs, onWindowExpirationTimerFamilySpec);
        }

        @Override
        public boolean isStateful() {
            return !signature.stateDeclarations().isEmpty() || !signature.timerDeclarations().isEmpty() || !signature.timerFamilyDeclarations().isEmpty() || signature.onWindowExpiration() != null;
        }

        @Override
        public boolean isSplittable() {
            return signature.processElement().isSplittable();
        }

        @Override
        public boolean isRequiresStableInput() {
            return signature.processElement().requiresStableInput();
        }

        @Override
        public boolean isRequiresTimeSortedInput() {
            return signature.processElement().requiresTimeSortedInput();
        }

        @Override
        public boolean requestsFinalization() {
            return (signature.startBundle() != null && signature.startBundle().extraParameters().contains(Parameter.bundleFinalizer())) || (signature.processElement() != null && signature.processElement().extraParameters().contains(Parameter.bundleFinalizer())) || (signature.finishBundle() != null && signature.finishBundle().extraParameters().contains(Parameter.bundleFinalizer()));
        }

        @Override
        public String translateRestrictionCoderId(SdkComponents newComponents) {
            return restrictionCoderId;
        }
    }, components);
}
Also used : KvCoder(org.apache.beam.sdk.coders.KvCoder) Coder(org.apache.beam.sdk.coders.Coder) FunctionSpec(org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec) ByteString(org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString) IOException(java.io.IOException) RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) StateSpec(org.apache.beam.sdk.state.StateSpec) PCollectionView(org.apache.beam.sdk.values.PCollectionView) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) Map(java.util.Map) HashMap(java.util.HashMap) DoFnSignature(org.apache.beam.sdk.transforms.reflect.DoFnSignature)
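
In Example 3, translateTimerFamilySpecs picks a key for the onWindowExpiration timer family by appending an increasing counter to a prefix until the key is not already in the map. The sketch below isolates that step. UniqueKeySketch and firstFreeKey are hypothetical names, and unlike the loop in the example this version throws if no free key is found rather than falling through with the last candidate.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper illustrating the "prefix + counter" search for an unused key.
class UniqueKeySketch {
    // Returns prefix + i for the smallest i >= 0 whose key is not in taken.
    static String firstFreeKey(String prefix, Set<String> taken) {
        for (int i = 0; i < Integer.MAX_VALUE; ++i) {
            String candidate = prefix + i;
            if (!taken.contains(candidate)) {
                return candidate;
            }
        }
        // Assumption of this sketch: treat exhaustion as an error.
        throw new IllegalStateException("No free key for prefix " + prefix);
    }

    public static void main(String[] args) {
        Set<String> taken = new HashSet<>(Arrays.asList("onWindowExpiration0", "userTimer"));
        // prints onWindowExpiration1
        System.out.println(firstFreeKey("onWindowExpiration", taken));
    }
}
```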

Example 4 with FunctionSpec

Use of org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec in project beam by apache.

the class ParDoTranslation method fromProto.

@VisibleForTesting
static StateSpec<?> fromProto(RunnerApi.StateSpec stateSpec, RehydratedComponents components) throws IOException {
    switch(stateSpec.getSpecCase()) {
        case READ_MODIFY_WRITE_SPEC:
            return StateSpecs.value(components.getCoder(stateSpec.getReadModifyWriteSpec().getCoderId()));
        case BAG_SPEC:
            return StateSpecs.bag(components.getCoder(stateSpec.getBagSpec().getElementCoderId()));
        case COMBINING_SPEC:
            FunctionSpec combineFnSpec = stateSpec.getCombiningSpec().getCombineFn();
            if (!combineFnSpec.getUrn().equals(CombineTranslation.JAVA_SERIALIZED_COMBINE_FN_URN)) {
                throw new UnsupportedOperationException(String.format("Cannot create %s from non-Java %s: %s", StateSpec.class.getSimpleName(), Combine.CombineFn.class.getSimpleName(), combineFnSpec.getUrn()));
            }
            Combine.CombineFn<?, ?, ?> combineFn = (Combine.CombineFn<?, ?, ?>) SerializableUtils.deserializeFromByteArray(combineFnSpec.getPayload().toByteArray(), Combine.CombineFn.class.getSimpleName());
            // Raw-type Coder cast: the accumulator coder is required to be a valid coder
            // for the CombineFn, by construction
            return StateSpecs.combining((Coder) components.getCoder(stateSpec.getCombiningSpec().getAccumulatorCoderId()), combineFn);
        case MAP_SPEC:
            return StateSpecs.map(components.getCoder(stateSpec.getMapSpec().getKeyCoderId()), components.getCoder(stateSpec.getMapSpec().getValueCoderId()));
        case SET_SPEC:
            return StateSpecs.set(components.getCoder(stateSpec.getSetSpec().getElementCoderId()));
        case SPEC_NOT_SET:
        default:
            throw new IllegalArgumentException(String.format("Unknown %s: %s", RunnerApi.StateSpec.class.getName(), stateSpec));
    }
}
Also used : RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) Combine(org.apache.beam.sdk.transforms.Combine) FunctionSpec(org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec) VisibleForTesting(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting)
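
The COMBINING_SPEC branch in Example 4 follows a pattern that recurs across these examples: a FunctionSpec carries a URN plus opaque payload bytes, and the payload is only deserialized after the URN has been checked. The sketch below models that pattern with plain Java serialization. It is an illustration under stated assumptions: Spec, toSpec, fromSpec and the URN string are hypothetical and are not Beam's API or constants.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical model of "URN + payload" encoding; not Beam's FunctionSpec.
class UrnPayloadSketch {
    // Made-up URN for this sketch only.
    static final String JAVA_SERIALIZED_URN = "example:java_serialized:v1";

    // A URN naming the payload format, plus the raw payload bytes.
    static final class Spec {
        final String urn;
        final byte[] payload;

        Spec(String urn, byte[] payload) {
            this.urn = urn;
            this.payload = payload;
        }
    }

    // Serializes the object and tags the bytes with the Java-serialization URN.
    static Spec toSpec(Serializable fn) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(fn);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Spec(JAVA_SERIALIZED_URN, bytes.toByteArray());
    }

    // Checks the URN first and only then attempts to deserialize the payload.
    static Object fromSpec(Spec spec) {
        if (!JAVA_SERIALIZED_URN.equals(spec.urn)) {
            throw new UnsupportedOperationException("Cannot decode non-Java payload: " + spec.urn);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(spec.payload))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Spec spec = toSpec("sum-ints");
        // prints sum-ints
        System.out.println(fromSpec(spec));
    }
}
```

Checking the URN before touching the payload keeps a payload written by another SDK from being fed to Java deserialization.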

Example 5 with FunctionSpec

Use of org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec in project beam by apache.

the class WindowingStrategyTranslation method fromProto.

/**
 * Converts from {@link RunnerApi.WindowingStrategy} to the SDK's {@link WindowingStrategy} using
 * the provided components to dereference identifiers found in the proto.
 */
public static WindowingStrategy<?, ?> fromProto(RunnerApi.WindowingStrategy proto, RehydratedComponents components) throws InvalidProtocolBufferException {
    FunctionSpec windowFnSpec = proto.getWindowFn();
    WindowFn<?, ?> windowFn = windowFnFromProto(windowFnSpec);
    TimestampCombiner timestampCombiner = timestampCombinerFromProto(proto.getOutputTime());
    AccumulationMode accumulationMode = fromProto(proto.getAccumulationMode());
    Trigger trigger = TriggerTranslation.fromProto(proto.getTrigger());
    ClosingBehavior closingBehavior = fromProto(proto.getClosingBehavior());
    Duration allowedLateness = Duration.millis(proto.getAllowedLateness());
    OnTimeBehavior onTimeBehavior = fromProto(proto.getOnTimeBehavior());
    String environmentId = proto.getEnvironmentId();
    return WindowingStrategy.of(windowFn).withAllowedLateness(allowedLateness).withMode(accumulationMode).withTrigger(trigger).withTimestampCombiner(timestampCombiner).withClosingBehavior(closingBehavior).withOnTimeBehavior(onTimeBehavior).withEnvironmentId(environmentId);
}
Also used : Trigger(org.apache.beam.sdk.transforms.windowing.Trigger) TimestampCombiner(org.apache.beam.sdk.transforms.windowing.TimestampCombiner) FunctionSpec(org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec) AccumulationMode(org.apache.beam.sdk.values.WindowingStrategy.AccumulationMode) Duration(org.joda.time.Duration) ByteString(org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString) OnTimeBehavior(org.apache.beam.sdk.transforms.windowing.Window.OnTimeBehavior) ClosingBehavior(org.apache.beam.sdk.transforms.windowing.Window.ClosingBehavior)

Aggregations

FunctionSpec (org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec) 10
ByteString (org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString) 5
RunnerApi (org.apache.beam.model.pipeline.v1.RunnerApi) 4
FunctionSpec (org.apache.beam.sdk.common.runner.v1.RunnerApi.FunctionSpec) 4
Test (org.junit.Test) 4
Map (java.util.Map) 3
SdkFunctionSpec (org.apache.beam.sdk.common.runner.v1.RunnerApi.SdkFunctionSpec) 3
IOException (java.io.IOException) 2
Collection (java.util.Collection) 2
Collections (java.util.Collections) 2
List (java.util.List) 2
PTransformRunnerFactory (org.apache.beam.fn.harness.PTransformRunnerFactory) 2
BundleProcessorCache (org.apache.beam.fn.harness.control.ProcessBundleHandler.BundleProcessorCache) 2
BeamFnApi (org.apache.beam.model.fnexecution.v1.BeamFnApi) 2
PCollectionView (org.apache.beam.sdk.values.PCollectionView) 2
TupleTag (org.apache.beam.sdk.values.TupleTag) 2
VisibleForTesting (com.google.common.annotations.VisibleForTesting) 1
ImmutableMap (com.google.common.collect.ImmutableMap) 1
BytesValue (com.google.protobuf.BytesValue) 1
ArrayList (java.util.ArrayList) 1