Example 1 with CloudObject

use of org.apache.beam.runners.dataflow.util.CloudObject in project beam by apache.

the class AvroByteReaderFactoryTest method runTestCreateAvroReader.

NativeReader<?> runTestCreateAvroReader(String filename, @Nullable Long start, @Nullable Long end, CloudObject encoding) throws Exception {
    CloudObject spec = CloudObject.forClassName("AvroSource");
    addString(spec, "filename", filename);
    if (start != null) {
        addLong(spec, "start_offset", start);
    }
    if (end != null) {
        addLong(spec, "end_offset", end);
    }
    Source cloudSource = new Source();
    cloudSource.setSpec(spec);
    cloudSource.setCodec(encoding);
    NativeReader<?> reader = ReaderRegistry.defaultRegistry().create(cloudSource, PipelineOptionsFactory.create(), /* ExecutionContext */ null, null);
    return reader;
}
Also used: CloudObject (org.apache.beam.runners.dataflow.util.CloudObject), Source (com.google.api.services.dataflow.model.Source)

Example 2 with CloudObject

use of org.apache.beam.runners.dataflow.util.CloudObject in project beam by apache.

the class ConcatReaderTest method createSourceForTestReader.

private Source createSourceForTestReader(TestReader<String> testReader) {
    Source source = new Source();
    CloudObject specObj = CloudObject.forClass(TestReader.class);
    specObj.put(READER_OBJECT, testReader);
    source.setSpec(specObj);
    return source;
}
Also used: CloudObject (org.apache.beam.runners.dataflow.util.CloudObject), Source (com.google.api.services.dataflow.model.Source)

Example 3 with CloudObject

use of org.apache.beam.runners.dataflow.util.CloudObject in project beam by apache.

the class BeamFnMapTaskExecutorFactory method createWriteOperation.

OperationNode createWriteOperation(ParallelInstructionNode node, PipelineOptions options, SinkFactory sinkFactory, DataflowExecutionContext executionContext, DataflowOperationContext context) throws Exception {
    ParallelInstruction instruction = node.getParallelInstruction();
    WriteInstruction write = instruction.getWrite();
    Coder<?> coder = CloudObjects.coderFromCloudObject(CloudObject.fromSpec(write.getSink().getCodec()));
    CloudObject cloudSink = CloudObject.fromSpec(write.getSink().getSpec());
    Sink<?> sink = sinkFactory.create(cloudSink, coder, options, executionContext, context);
    return OperationNode.create(WriteOperation.create(sink, EMPTY_OUTPUT_RECEIVER_ARRAY, context));
}
Also used: ParallelInstruction (com.google.api.services.dataflow.model.ParallelInstruction), CloudObject (org.apache.beam.runners.dataflow.util.CloudObject), WriteInstruction (com.google.api.services.dataflow.model.WriteInstruction)

Example 4 with CloudObject

use of org.apache.beam.runners.dataflow.util.CloudObject in project beam by apache.

the class BeamFnMapTaskExecutorFactory method createReadOperation.

OperationNode createReadOperation(Network<Node, Edge> network, ParallelInstructionNode node, PipelineOptions options, ReaderFactory readerFactory, DataflowExecutionContext<?> executionContext, DataflowOperationContext operationContext) throws Exception {
    ParallelInstruction instruction = node.getParallelInstruction();
    ReadInstruction read = instruction.getRead();
    Source cloudSource = CloudSourceUtils.flattenBaseSpecs(read.getSource());
    CloudObject sourceSpec = CloudObject.fromSpec(cloudSource.getSpec());
    Coder<?> coder = CloudObjects.coderFromCloudObject(CloudObject.fromSpec(cloudSource.getCodec()));
    NativeReader<?> reader = readerFactory.create(sourceSpec, coder, options, executionContext, operationContext);
    OutputReceiver[] receivers = getOutputReceivers(network, node);
    return OperationNode.create(ReadOperation.create(reader, receivers, operationContext));
}
Also used: ParallelInstruction (com.google.api.services.dataflow.model.ParallelInstruction), CloudObject (org.apache.beam.runners.dataflow.util.CloudObject), OutputReceiver (org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver), ReadInstruction (com.google.api.services.dataflow.model.ReadInstruction), Source (com.google.api.services.dataflow.model.Source)

Example 5 with CloudObject

use of org.apache.beam.runners.dataflow.util.CloudObject in project beam by apache.

the class LengthPrefixUnknownCodersTest method testLengthPrefixParDoInstructionCoder.

@Test
public void testLengthPrefixParDoInstructionCoder() throws Exception {
    ParDoInstruction parDo = new ParDoInstruction();
    CloudObject spec = CloudObject.forClassName(MERGE_BUCKETS_DO_FN);
    spec.put(WorkerPropertyNames.INPUT_CODER, CloudObjects.asCloudObject(windowedValueCoder, /*sdkComponents=*/ null));
    parDo.setUserFn(spec);
    instruction.setParDo(parDo);
    ParallelInstruction prefixedInstruction = forParallelInstruction(instruction, false);
    assertEqualsAsJson(CloudObjects.asCloudObject(prefixedWindowedValueCoder, /*sdkComponents=*/ null), prefixedInstruction.getParDo().getUserFn().get(WorkerPropertyNames.INPUT_CODER));
    // Should not mutate the instruction.
    assertEqualsAsJson(CloudObjects.asCloudObject(windowedValueCoder, /*sdkComponents=*/ null), parDo.getUserFn().get(WorkerPropertyNames.INPUT_CODER));
}
Also used: ParDoInstruction (com.google.api.services.dataflow.model.ParDoInstruction), LengthPrefixUnknownCoders.forParallelInstruction (org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCoders.forParallelInstruction), ParallelInstruction (com.google.api.services.dataflow.model.ParallelInstruction), CloudObject (org.apache.beam.runners.dataflow.util.CloudObject), Test (org.junit.Test)

Aggregations

CloudObject (org.apache.beam.runners.dataflow.util.CloudObject): 62
ParallelInstruction (com.google.api.services.dataflow.model.ParallelInstruction): 23
Test (org.junit.Test): 21
Source (com.google.api.services.dataflow.model.Source): 15
ParDoInstruction (com.google.api.services.dataflow.model.ParDoInstruction): 13
InstructionOutput (com.google.api.services.dataflow.model.InstructionOutput): 12
ParDoFn (org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoFn): 11
OutputReceiver (org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver): 10
PipelineOptions (org.apache.beam.sdk.options.PipelineOptions): 10
ByteString (org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString): 10
ReadInstruction (com.google.api.services.dataflow.model.ReadInstruction): 9
HashMap (java.util.HashMap): 9
InstructionInput (com.google.api.services.dataflow.model.InstructionInput): 8
Map (java.util.Map): 8
ArrayList (java.util.ArrayList): 7
Structs.getString (org.apache.beam.runners.dataflow.util.Structs.getString): 7
SdkComponents (org.apache.beam.runners.core.construction.SdkComponents): 6
Structs.addString (org.apache.beam.runners.dataflow.util.Structs.addString): 6
ImmutableList (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList): 6
List (java.util.List): 5