Search in sources :

Example 1 with WireCoderSetting

use of org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting in project beam by apache.

the class ProcessBundleDescriptors method fromExecutableStageInternal.

private static ExecutableProcessBundleDescriptor fromExecutableStageInternal(String id, ExecutableStage stage, ApiServiceDescriptor dataEndpoint, @Nullable ApiServiceDescriptor stateEndpoint) throws IOException {
    // Create with all of the processing transforms, and all of the components.
    // TODO: Remove the unreachable subcomponents if the size of the descriptor matters.
    Map<String, PTransform> stageTransforms = stage.getTransforms().stream().collect(Collectors.toMap(PTransformNode::getId, PTransformNode::getTransform));
    Components.Builder components = stage.getComponents().toBuilder().clearTransforms().putAllTransforms(stageTransforms);
    ImmutableList.Builder<RemoteInputDestination> inputDestinationsBuilder = ImmutableList.builder();
    ImmutableMap.Builder<String, Coder> remoteOutputCodersBuilder = ImmutableMap.builder();
    WireCoderSetting wireCoderSetting = stage.getWireCoderSettings().stream().filter(ws -> ws.getInputOrOutputId().equals(stage.getInputPCollection().getId())).findAny().orElse(WireCoderSetting.getDefaultInstance());
    // The order of these does not matter.
    inputDestinationsBuilder.add(addStageInput(dataEndpoint, stage.getInputPCollection(), components, wireCoderSetting));
    remoteOutputCodersBuilder.putAll(addStageOutputs(dataEndpoint, stage.getOutputPCollections(), components, stage.getWireCoderSettings()));
    Map<String, Map<String, SideInputSpec>> sideInputSpecs = addSideInputs(stage, components);
    Map<String, Map<String, BagUserStateSpec>> bagUserStateSpecs = forBagUserStates(stage, components.build());
    Map<String, Map<String, TimerSpec>> timerSpecs = forTimerSpecs(stage, components);
    lengthPrefixAnyInputCoder(stage.getInputPCollection().getId(), components);
    // Copy data from components to ProcessBundleDescriptor.
    ProcessBundleDescriptor.Builder bundleDescriptorBuilder = ProcessBundleDescriptor.newBuilder().setId(id);
    if (stateEndpoint != null) {
        bundleDescriptorBuilder.setStateApiServiceDescriptor(stateEndpoint);
    }
    if (timerSpecs.size() > 0) {
        // By default use the data endpoint for timers, in the future considering enabling specifying
        // a different ApiServiceDescriptor for timers.
        bundleDescriptorBuilder.setTimerApiServiceDescriptor(dataEndpoint);
    }
    bundleDescriptorBuilder.putAllCoders(components.getCodersMap()).putAllEnvironments(components.getEnvironmentsMap()).putAllPcollections(components.getPcollectionsMap()).putAllWindowingStrategies(components.getWindowingStrategiesMap()).putAllTransforms(components.getTransformsMap());
    return ExecutableProcessBundleDescriptor.of(bundleDescriptorBuilder.build(), inputDestinationsBuilder.build(), remoteOutputCodersBuilder.build(), sideInputSpecs, bagUserStateSpecs, timerSpecs);
}
Also used : Coder(org.apache.beam.sdk.coders.Coder) ByteStringCoder(org.apache.beam.runners.fnexecution.wire.ByteStringCoder) FullWindowedValueCoder(org.apache.beam.sdk.util.WindowedValue.FullWindowedValueCoder) RemoteInputDestination(org.apache.beam.runners.fnexecution.data.RemoteInputDestination) ImmutableList(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList) ProcessBundleDescriptor(org.apache.beam.model.fnexecution.v1.BeamFnApi.ProcessBundleDescriptor) WireCoderSetting(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting) ImmutableMap(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap) RehydratedComponents(org.apache.beam.runners.core.construction.RehydratedComponents) Components(org.apache.beam.model.pipeline.v1.RunnerApi.Components) ImmutableMap(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap) LinkedHashMap(java.util.LinkedHashMap) Map(java.util.Map) PTransform(org.apache.beam.model.pipeline.v1.RunnerApi.PTransform)

Example 2 with WireCoderSetting

use of org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting in project beam by apache.

the class ProcessBundleDescriptors method addStageOutputs.

private static Map<String, Coder<WindowedValue<?>>> addStageOutputs(ApiServiceDescriptor dataEndpoint, Collection<PCollectionNode> outputPCollections, Components.Builder components, Collection<WireCoderSetting> wireCoderSettings) throws IOException {
    Map<String, Coder<WindowedValue<?>>> remoteOutputCoders = new LinkedHashMap<>();
    for (PCollectionNode outputPCollection : outputPCollections) {
        WireCoderSetting wireCoderSetting = wireCoderSettings.stream().filter(ws -> ws.getInputOrOutputId().equals(outputPCollection.getId())).findAny().orElse(WireCoderSetting.getDefaultInstance());
        OutputEncoding outputEncoding = addStageOutput(dataEndpoint, components, outputPCollection, wireCoderSetting);
        remoteOutputCoders.put(outputEncoding.getPTransformId(), outputEncoding.getCoder());
    }
    return remoteOutputCoders;
}
Also used : Coder(org.apache.beam.sdk.coders.Coder) ByteStringCoder(org.apache.beam.runners.fnexecution.wire.ByteStringCoder) FullWindowedValueCoder(org.apache.beam.sdk.util.WindowedValue.FullWindowedValueCoder) WireCoderSetting(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting) PCollectionNode(org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode) LinkedHashMap(java.util.LinkedHashMap)

Example 3 with WireCoderSetting

use of org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting in project beam by apache.

the class ProcessBundleDescriptors method addStageOutput.

private static OutputEncoding addStageOutput(ApiServiceDescriptor dataEndpoint, Components.Builder components, PCollectionNode outputPCollection, WireCoderSetting wireCoderSetting) throws IOException {
    String outputWireCoderId = WireCoders.addSdkWireCoder(outputPCollection, components, wireCoderSetting);
    @SuppressWarnings("unchecked") Coder<WindowedValue<?>> wireCoder = (Coder) WireCoders.instantiateRunnerWireCoder(outputPCollection, components.build(), wireCoderSetting);
    RemoteGrpcPort outputPort = RemoteGrpcPort.newBuilder().setApiServiceDescriptor(dataEndpoint).setCoderId(outputWireCoderId).build();
    RemoteGrpcPortWrite outputWrite = RemoteGrpcPortWrite.writeToPort(outputPCollection.getId(), outputPort);
    String outputId = uniqueId(String.format("fn/write/%s", outputPCollection.getId()), components::containsTransforms);
    PTransform outputTransform = outputWrite.toPTransform();
    components.putTransforms(outputId, outputTransform);
    return new AutoValue_ProcessBundleDescriptors_OutputEncoding(outputId, wireCoder);
}
Also used : Coder(org.apache.beam.sdk.coders.Coder) ByteStringCoder(org.apache.beam.runners.fnexecution.wire.ByteStringCoder) FullWindowedValueCoder(org.apache.beam.sdk.util.WindowedValue.FullWindowedValueCoder) RemoteGrpcPort(org.apache.beam.model.fnexecution.v1.BeamFnApi.RemoteGrpcPort) WindowedValue(org.apache.beam.sdk.util.WindowedValue) RemoteGrpcPortWrite(org.apache.beam.sdk.fn.data.RemoteGrpcPortWrite) PTransform(org.apache.beam.model.pipeline.v1.RunnerApi.PTransform)

Example 4 with WireCoderSetting

use of org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting in project beam by apache.

the class ProcessBundleDescriptors method forTimerSpecs.

private static Map<String, Map<String, TimerSpec>> forTimerSpecs(ExecutableStage stage, Components.Builder components) throws IOException {
    ImmutableTable.Builder<String, String, TimerSpec> idsToSpec = ImmutableTable.builder();
    for (TimerReference timerReference : stage.getTimers()) {
        RunnerApi.ParDoPayload payload = RunnerApi.ParDoPayload.parseFrom(timerReference.transform().getTransform().getSpec().getPayload());
        RunnerApi.TimerFamilySpec timerFamilySpec = payload.getTimerFamilySpecsOrThrow(timerReference.localName());
        org.apache.beam.sdk.state.TimerSpec spec;
        switch(timerFamilySpec.getTimeDomain()) {
            case EVENT_TIME:
                spec = TimerSpecs.timer(TimeDomain.EVENT_TIME);
                break;
            case PROCESSING_TIME:
                spec = TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
                break;
            default:
                throw new IllegalArgumentException(String.format("Unknown or unsupported time domain %s", timerFamilySpec.getTimeDomain()));
        }
        for (WireCoderSetting wireCoderSetting : stage.getWireCoderSettings()) {
            if (wireCoderSetting.hasTimer() && wireCoderSetting.getTimer().getTransformId().equals(timerReference.transform().getId()) && wireCoderSetting.getTimer().getLocalName().equals(timerReference.localName())) {
                throw new UnsupportedOperationException("WireCoderSetting for timer is yet to be supported.");
            }
        }
        String originalTimerCoderId = timerFamilySpec.getTimerFamilyCoderId();
        String sdkCoderId = LengthPrefixUnknownCoders.addLengthPrefixedCoder(originalTimerCoderId, components, false);
        String runnerCoderId = LengthPrefixUnknownCoders.addLengthPrefixedCoder(originalTimerCoderId, components, true);
        Coder<?> timerCoder = RehydratedComponents.forComponents(components.build()).getCoder(runnerCoderId);
        checkArgument(timerCoder instanceof Timer.Coder, "Expected a timer coder but received %s.", timerCoder);
        RunnerApi.FunctionSpec.Builder updatedSpec = components.getTransformsOrThrow(timerReference.transform().getId()).toBuilder().getSpecBuilder();
        RunnerApi.ParDoPayload.Builder updatedPayload = RunnerApi.ParDoPayload.parseFrom(updatedSpec.getPayload()).toBuilder();
        updatedPayload.putTimerFamilySpecs(timerReference.localName(), updatedPayload.getTimerFamilySpecsOrThrow(timerReference.localName()).toBuilder().setTimerFamilyCoderId(sdkCoderId).build());
        updatedSpec.setPayload(updatedPayload.build().toByteString());
        components.putTransforms(timerReference.transform().getId(), // and not the original
        components.getTransformsOrThrow(timerReference.transform().getId()).toBuilder().setSpec(updatedSpec).build());
        idsToSpec.put(timerReference.transform().getId(), timerReference.localName(), TimerSpec.of(timerReference.transform().getId(), timerReference.localName(), spec, (Coder) timerCoder));
    }
    return idsToSpec.build().rowMap();
}
Also used : Coder(org.apache.beam.sdk.coders.Coder) ByteStringCoder(org.apache.beam.runners.fnexecution.wire.ByteStringCoder) FullWindowedValueCoder(org.apache.beam.sdk.util.WindowedValue.FullWindowedValueCoder) TimerReference(org.apache.beam.runners.core.construction.graph.TimerReference) WireCoderSetting(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting) ImmutableTable(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableTable) RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) Timer(org.apache.beam.runners.core.construction.Timer)

Example 5 with WireCoderSetting

use of org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting in project beam by apache.

the class ExecutableStage method fromPayload.

/**
 * Return an {@link ExecutableStage} constructed from the provided {@link FunctionSpec}
 * representation.
 *
 * <p>See {@link #toPTransform} for how the payload is constructed.
 *
 * <p>Note: The payload contains some information redundant with the {@link PTransform} it is the
 * payload of. The {@link ExecutableStagePayload} should be sufficiently rich to construct a
 * {@code ProcessBundleDescriptor} using only the payload.
 */
static ExecutableStage fromPayload(ExecutableStagePayload payload) {
    Components components = payload.getComponents();
    Environment environment = payload.getEnvironment();
    Collection<WireCoderSetting> wireCoderSettings = payload.getWireCoderSettingsList();
    PCollectionNode input = PipelineNode.pCollection(payload.getInput(), components.getPcollectionsOrThrow(payload.getInput()));
    List<SideInputReference> sideInputs = payload.getSideInputsList().stream().map(sideInputId -> SideInputReference.fromSideInputId(sideInputId, components)).collect(Collectors.toList());
    List<UserStateReference> userStates = payload.getUserStatesList().stream().map(userStateId -> UserStateReference.fromUserStateId(userStateId, components)).collect(Collectors.toList());
    List<TimerReference> timers = payload.getTimersList().stream().map(timerId -> TimerReference.fromTimerId(timerId, components)).collect(Collectors.toList());
    List<PTransformNode> transforms = payload.getTransformsList().stream().map(id -> PipelineNode.pTransform(id, components.getTransformsOrThrow(id))).collect(Collectors.toList());
    List<PCollectionNode> outputs = payload.getOutputsList().stream().map(id -> PipelineNode.pCollection(id, components.getPcollectionsOrThrow(id))).collect(Collectors.toList());
    return ImmutableExecutableStage.of(components, environment, input, sideInputs, userStates, timers, transforms, outputs, wireCoderSettings);
}
Also used : RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) PTransform(org.apache.beam.model.pipeline.v1.RunnerApi.PTransform) Collection(java.util.Collection) WireCoderSetting(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting) Collectors(java.util.stream.Collectors) UserStateId(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.UserStateId) ExecutableStagePayload(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload) List(java.util.List) Pipeline(org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline) FunctionSpec(org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec) PCollection(org.apache.beam.model.pipeline.v1.RunnerApi.PCollection) TimerId(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.TimerId) Environment(org.apache.beam.model.pipeline.v1.RunnerApi.Environment) Components(org.apache.beam.model.pipeline.v1.RunnerApi.Components) SideInputId(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId) PTransformNode(org.apache.beam.runners.core.construction.graph.PipelineNode.PTransformNode) PCollectionNode(org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode) Collections(java.util.Collections) PTransformNode(org.apache.beam.runners.core.construction.graph.PipelineNode.PTransformNode) WireCoderSetting(org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting) PCollectionNode(org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode) Components(org.apache.beam.model.pipeline.v1.RunnerApi.Components) Environment(org.apache.beam.model.pipeline.v1.RunnerApi.Environment)

Aggregations

ByteStringCoder (org.apache.beam.runners.fnexecution.wire.ByteStringCoder)5 Coder (org.apache.beam.sdk.coders.Coder)5 FullWindowedValueCoder (org.apache.beam.sdk.util.WindowedValue.FullWindowedValueCoder)5 WireCoderSetting (org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.WireCoderSetting)4 PTransform (org.apache.beam.model.pipeline.v1.RunnerApi.PTransform)4 LinkedHashMap (java.util.LinkedHashMap)2 RemoteGrpcPort (org.apache.beam.model.fnexecution.v1.BeamFnApi.RemoteGrpcPort)2 RunnerApi (org.apache.beam.model.pipeline.v1.RunnerApi)2 Components (org.apache.beam.model.pipeline.v1.RunnerApi.Components)2 PCollectionNode (org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode)2 WindowedValue (org.apache.beam.sdk.util.WindowedValue)2 Collection (java.util.Collection)1 Collections (java.util.Collections)1 List (java.util.List)1 Map (java.util.Map)1 Collectors (java.util.stream.Collectors)1 ProcessBundleDescriptor (org.apache.beam.model.fnexecution.v1.BeamFnApi.ProcessBundleDescriptor)1 Environment (org.apache.beam.model.pipeline.v1.RunnerApi.Environment)1 ExecutableStagePayload (org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload)1 SideInputId (org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId)1