Example 21 with ExecutableStage

use of org.apache.beam.runners.core.construction.graph.ExecutableStage in project beam by apache.

the class BatchSideInputHandlerFactoryTest method invalidSideInputThrowsException.

@Test
public void invalidSideInputThrowsException() {
    ExecutableStage stage = createExecutableStage(Collections.emptyList());
    BatchSideInputHandlerFactory factory = BatchSideInputHandlerFactory.forStage(stage, context);
    thrown.expect(instanceOf(IllegalArgumentException.class));
    factory.forMultimapSideInput("transform-id", "side-input", KvCoder.of(VoidCoder.of(), VoidCoder.of()), GlobalWindow.Coder.INSTANCE);
}
Also used : ImmutableExecutableStage(org.apache.beam.runners.core.construction.graph.ImmutableExecutableStage) ExecutableStage(org.apache.beam.runners.core.construction.graph.ExecutableStage) Test(org.junit.Test)
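
The test above builds a stage that declares no side inputs and expects the factory to reject a request for one. A minimal plain-Java sketch of that kind of check follows. It is not Beam's implementation: the class `SideInputLookupSketch`, the method `handlerFor`, and the `transformId/sideInputId` key convention are all hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a side-input handler factory; not Beam's code. */
final class SideInputLookupSketch {
    // Side inputs the (imaginary) stage declared, keyed as "transformId/sideInputId".
    private final Set<String> declared = new HashSet<>();

    SideInputLookupSketch(List<String> declaredSideInputs) {
        declared.addAll(declaredSideInputs);
    }

    /** Returns a handler label for a declared side input, or throws if it is unknown. */
    String handlerFor(String transformId, String sideInputId) {
        String key = transformId + "/" + sideInputId;
        if (!declared.contains(key)) {
            throw new IllegalArgumentException("Unknown side input: " + key);
        }
        return "handler:" + key;
    }

    public static void main(String[] args) {
        // Mirrors the test: no side inputs declared, so any lookup is rejected.
        SideInputLookupSketch factory = new SideInputLookupSketch(Collections.emptyList());
        try {
            factory.handlerFor("transform-id", "side-input");
            System.out.println("no exception");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at lookup time, rather than returning null, is what lets a test assert on the exception type as the JUnit rule in the example does.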

Example 22 with ExecutableStage

use of org.apache.beam.runners.core.construction.graph.ExecutableStage in project beam by apache.

the class SparkExecutableStageFunction method call.

@Override
public Iterator<RawUnionValue> call(Iterator<WindowedValue<InputT>> inputs) throws Exception {
    SparkPipelineOptions options = pipelineOptions.get().as(SparkPipelineOptions.class);
    // Register standard file systems.
    FileSystems.setDefaultPipelineOptions(options);
    // Do not call processElements if there are no inputs;
    // otherwise, this may cause validation errors (e.g. ParDoTest).
    if (!inputs.hasNext()) {
        return Collections.emptyIterator();
    }
    try (ExecutableStageContext stageContext = contextFactory.get(jobInfo)) {
        ExecutableStage executableStage = ExecutableStage.fromPayload(stagePayload);
        try (StageBundleFactory stageBundleFactory = stageContext.getStageBundleFactory(executableStage)) {
            ConcurrentLinkedQueue<RawUnionValue> collector = new ConcurrentLinkedQueue<>();
            StateRequestHandler stateRequestHandler = getStateRequestHandler(executableStage, stageBundleFactory.getProcessBundleDescriptor());
            if (executableStage.getTimers().isEmpty()) {
                ReceiverFactory receiverFactory = new ReceiverFactory(collector, outputMap);
                processElements(stateRequestHandler, receiverFactory, null, stageBundleFactory, inputs);
                return collector.iterator();
            }
            // Used with Batch, we know that all the data is available for this key. We can't use the
            // timer manager from the context because it doesn't exist. So we create one and advance
            // time to the end after processing all elements.
            final InMemoryTimerInternals timerInternals = new InMemoryTimerInternals();
            timerInternals.advanceProcessingTime(Instant.now());
            timerInternals.advanceSynchronizedProcessingTime(Instant.now());
            ReceiverFactory receiverFactory = new ReceiverFactory(collector, outputMap);
            TimerReceiverFactory timerReceiverFactory = new TimerReceiverFactory(stageBundleFactory, (Timer<?> timer, TimerInternals.TimerData timerData) -> {
                currentTimerKey = timer.getUserKey();
                if (timer.getClearBit()) {
                    timerInternals.deleteTimer(timerData);
                } else {
                    timerInternals.setTimer(timerData);
                }
            }, windowCoder);
            // Process inputs.
            processElements(stateRequestHandler, receiverFactory, timerReceiverFactory, stageBundleFactory, inputs);
            // Finish any pending windows by advancing the input watermark to infinity.
            timerInternals.advanceInputWatermark(BoundedWindow.TIMESTAMP_MAX_VALUE);
            // Finally, advance the processing time to infinity to fire any timers.
            timerInternals.advanceProcessingTime(BoundedWindow.TIMESTAMP_MAX_VALUE);
            timerInternals.advanceSynchronizedProcessingTime(BoundedWindow.TIMESTAMP_MAX_VALUE);
            // Now fire the timers and process any elements generated by timers (which may
            // themselves be timers).
            while (timerInternals.hasPendingTimers()) {
                try (RemoteBundle bundle = stageBundleFactory.getBundle(receiverFactory, timerReceiverFactory, stateRequestHandler, getBundleProgressHandler())) {
                    PipelineTranslatorUtils.fireEligibleTimers(timerInternals, bundle.getTimerReceivers(), currentTimerKey);
                }
            }
            return collector.iterator();
        }
    }
}
Also used : TimerReceiverFactory(org.apache.beam.runners.fnexecution.control.TimerReceiverFactory) OutputReceiverFactory(org.apache.beam.runners.fnexecution.control.OutputReceiverFactory) StateRequestHandler(org.apache.beam.runners.fnexecution.state.StateRequestHandler) RawUnionValue(org.apache.beam.sdk.transforms.join.RawUnionValue) InMemoryTimerInternals(org.apache.beam.runners.core.InMemoryTimerInternals) SparkPipelineOptions(org.apache.beam.runners.spark.SparkPipelineOptions) StageBundleFactory(org.apache.beam.runners.fnexecution.control.StageBundleFactory) Timer(org.apache.beam.runners.core.construction.Timer) ExecutableStageContext(org.apache.beam.runners.fnexecution.control.ExecutableStageContext) ExecutableStage(org.apache.beam.runners.core.construction.graph.ExecutableStage) ConcurrentLinkedQueue(java.util.concurrent.ConcurrentLinkedQueue) RemoteBundle(org.apache.beam.runners.fnexecution.control.RemoteBundle)
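
The batch branch of `call` follows a drain pattern: process all inputs, advance time to the maximum timestamp, then keep firing timers until none are pending, because a fired timer may set further timers. The sketch below models only that loop in plain Java. It does not use Beam's `InMemoryTimerInternals`; `BatchTimerDrainSketch`, `SimpleTimer`, and `drain` are hypothetical names, and timestamps are plain `long` values for simplicity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.BiConsumer;

/** Simplified, hypothetical model of draining timers after a bounded input. */
final class BatchTimerDrainSketch {
    static final long END_OF_TIME = Long.MAX_VALUE;

    /** A timer with an id and a firing time; not Beam's TimerData. */
    static final class SimpleTimer {
        final String id;
        final long fireAt;

        SimpleTimer(String id, long fireAt) {
            this.id = id;
            this.fireAt = fireAt;
        }
    }

    // Pending timers ordered by firing time.
    private final PriorityQueue<SimpleTimer> pending =
        new PriorityQueue<>(Comparator.comparingLong((SimpleTimer t) -> t.fireAt));
    private long now = Long.MIN_VALUE;

    void setTimer(SimpleTimer timer) {
        pending.add(timer);
    }

    void advanceTo(long time) {
        now = Math.max(now, time);
    }

    boolean hasPendingTimers() {
        return !pending.isEmpty();
    }

    /** Fires eligible timers in rounds; the callback may set new timers, which fire in later rounds. */
    List<String> drain(BiConsumer<SimpleTimer, BatchTimerDrainSketch> onFire) {
        List<String> fired = new ArrayList<>();
        while (hasPendingTimers()) {
            // Collect the timers eligible at the current time for this round.
            List<SimpleTimer> round = new ArrayList<>();
            while (!pending.isEmpty() && pending.peek().fireAt <= now) {
                round.add(pending.poll());
            }
            if (round.isEmpty()) {
                break; // Nothing is eligible yet; stop instead of spinning.
            }
            for (SimpleTimer timer : round) {
                fired.add(timer.id);
                onFire.accept(timer, this);
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        BatchTimerDrainSketch timers = new BatchTimerDrainSketch();
        timers.setTimer(new SimpleTimer("a", 10L));
        // All input has been processed, so jump straight to the end of time.
        timers.advanceTo(END_OF_TIME);
        // Firing "a" sets a follow-up timer "b", which the loop picks up in the next round.
        List<String> fired = timers.drain((timer, self) -> {
            if (timer.id.equals("a")) {
                self.setTimer(new SimpleTimer("b", 20L));
            }
        });
        System.out.println(fired); // prints [a, b]
    }
}
```

Looping on `hasPendingTimers()` rather than firing once is the point of the pattern: a single pass would miss timers that are set while earlier timers fire.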

Aggregations

ExecutableStage (org.apache.beam.runners.core.construction.graph.ExecutableStage) 22
Test (org.junit.Test) 17
RunnerApi (org.apache.beam.model.pipeline.v1.RunnerApi) 16
Pipeline (org.apache.beam.sdk.Pipeline) 15
Coder (org.apache.beam.sdk.coders.Coder) 14
HashMap (java.util.HashMap) 12
FusedPipeline (org.apache.beam.runners.core.construction.graph.FusedPipeline) 12
KvCoder (org.apache.beam.sdk.coders.KvCoder) 12
StringUtf8Coder (org.apache.beam.sdk.coders.StringUtf8Coder) 12
WindowedValue (org.apache.beam.sdk.util.WindowedValue) 12
ByteString (org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString) 11
Map (java.util.Map) 10
ConcurrentHashMap (java.util.concurrent.ConcurrentHashMap) 10
ExecutableProcessBundleDescriptor (org.apache.beam.runners.fnexecution.control.ProcessBundleDescriptors.ExecutableProcessBundleDescriptor) 10
BundleProcessor (org.apache.beam.runners.fnexecution.control.SdkHarnessClient.BundleProcessor) 10
BigEndianLongCoder (org.apache.beam.sdk.coders.BigEndianLongCoder) 10
Collection (java.util.Collection) 9
KV (org.apache.beam.sdk.values.KV) 9
PCollection (org.apache.beam.sdk.values.PCollection) 9
ArrayList (java.util.ArrayList) 7