
Example 1 with Source

use of com.google.api.services.dataflow.model.Source in project beam by apache.

the class AvroByteReaderFactoryTest method runTestCreateAvroReader.

NativeReader<?> runTestCreateAvroReader(String filename, @Nullable Long start, @Nullable Long end, CloudObject encoding) throws Exception {
    CloudObject spec = CloudObject.forClassName("AvroSource");
    addString(spec, "filename", filename);
    if (start != null) {
        addLong(spec, "start_offset", start);
    }
    if (end != null) {
        addLong(spec, "end_offset", end);
    }
    Source cloudSource = new Source();
    cloudSource.setSpec(spec);
    cloudSource.setCodec(encoding);
    NativeReader<?> reader = ReaderRegistry.defaultRegistry().create(
        cloudSource,
        PipelineOptionsFactory.create(),
        null, // ExecutionContext
        null);
    return reader;
}
Also used : CloudObject(org.apache.beam.runners.dataflow.util.CloudObject) Source(com.google.api.services.dataflow.model.Source)

Example 2 with Source

use of com.google.api.services.dataflow.model.Source in project beam by apache.

the class ConcatReaderTest method createSourceForTestReader.

private Source createSourceForTestReader(TestReader<String> testReader) {
    Source source = new Source();
    CloudObject specObj = CloudObject.forClass(TestReader.class);
    specObj.put(READER_OBJECT, testReader);
    source.setSpec(specObj);
    return source;
}
Also used : CloudObject(org.apache.beam.runners.dataflow.util.CloudObject) Source(com.google.api.services.dataflow.model.Source)

Example 3 with Source

use of com.google.api.services.dataflow.model.Source in project beam by apache.

the class CloudSourceUtils method flattenBaseSpecs.

/**
 * Returns a copy of the source with {@code baseSpecs} flattened into {@code spec}. On conflict
 * for a parameter name, values in {@code spec} override values in {@code baseSpecs}, and later
 * values in {@code baseSpecs} override earlier ones.
 */
public static Source flattenBaseSpecs(Source source) {
    if (source.getBaseSpecs() == null) {
        return source;
    }
    Map<String, Object> params = new HashMap<>();
    for (Map<String, Object> baseSpec : source.getBaseSpecs()) {
        params.putAll(baseSpec);
    }
    params.putAll(source.getSpec());
    Source result = source.clone();
    result.setSpec(params);
    result.setBaseSpecs(null);
    return result;
}
Also used : HashMap(java.util.HashMap) Source(com.google.api.services.dataflow.model.Source)
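The precedence rule in the Javadoc of flattenBaseSpecs can be illustrated without the Dataflow model classes. The following is a minimal sketch that assumes plain java.util maps in place of Source; the class name FlattenBaseSpecsSketch, the flatten helper, and the sample keys and values are hypothetical and are not part of Beam.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; it does not use the Dataflow Source class.
class FlattenBaseSpecsSketch {

    // Applies the base specs in order, then the main spec last,
    // so later writes replace earlier ones for the same key.
    static Map<String, Object> flatten(
            List<Map<String, Object>> baseSpecs, Map<String, Object> spec) {
        Map<String, Object> params = new HashMap<>();
        if (baseSpecs != null) {
            for (Map<String, Object> baseSpec : baseSpecs) {
                params.putAll(baseSpec); // later base specs overwrite earlier ones
            }
        }
        params.putAll(spec); // the main spec overwrites every base spec
        return params;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new HashMap<>();
        first.put("filename", "base-1.avro");
        first.put("start_offset", 0L);

        Map<String, Object> second = new HashMap<>();
        second.put("filename", "base-2.avro");

        Map<String, Object> spec = new HashMap<>();
        spec.put("start_offset", 100L);

        List<Map<String, Object>> baseSpecs = new ArrayList<>();
        baseSpecs.add(first);
        baseSpecs.add(second);

        Map<String, Object> merged = flatten(baseSpecs, spec);
        System.out.println(merged.get("filename"));     // base-2.avro
        System.out.println(merged.get("start_offset")); // 100
    }
}
```

Here "filename" comes from the second base spec because it is applied after the first, and "start_offset" comes from the main spec because it is applied last.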

Example 4 with Source

use of com.google.api.services.dataflow.model.Source in project beam by apache.

the class BeamFnMapTaskExecutorFactory method createReadOperation.

OperationNode createReadOperation(Network<Node, Edge> network, ParallelInstructionNode node, PipelineOptions options, ReaderFactory readerFactory, DataflowExecutionContext<?> executionContext, DataflowOperationContext operationContext) throws Exception {
    ParallelInstruction instruction = node.getParallelInstruction();
    ReadInstruction read = instruction.getRead();
    Source cloudSource = CloudSourceUtils.flattenBaseSpecs(read.getSource());
    CloudObject sourceSpec = CloudObject.fromSpec(cloudSource.getSpec());
    Coder<?> coder = CloudObjects.coderFromCloudObject(CloudObject.fromSpec(cloudSource.getCodec()));
    NativeReader<?> reader = readerFactory.create(sourceSpec, coder, options, executionContext, operationContext);
    OutputReceiver[] receivers = getOutputReceivers(network, node);
    return OperationNode.create(ReadOperation.create(reader, receivers, operationContext));
}
Also used : ParallelInstruction(com.google.api.services.dataflow.model.ParallelInstruction) CloudObject(org.apache.beam.runners.dataflow.util.CloudObject) OutputReceiver(org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver) ReadInstruction(com.google.api.services.dataflow.model.ReadInstruction) Source(com.google.api.services.dataflow.model.Source)

Example 5 with Source

use of com.google.api.services.dataflow.model.Source in project beam by apache.

the class LengthPrefixUnknownCodersTest method createReadNode.

private static ParallelInstructionNode createReadNode(String name, String readClassName, Coder<?> coder) {
    ParallelInstruction parallelInstruction = new ParallelInstruction()
        .setName(name)
        .setRead(new ReadInstruction().setSource(new Source()
            .setCodec(CloudObjects.asCloudObject(coder, /*sdkComponents=*/ null))
            .setSpec(CloudObject.forClassName(readClassName))));
    parallelInstruction.setFactory(new JacksonFactory());
    return ParallelInstructionNode.create(parallelInstruction, Nodes.ExecutionLocation.UNKNOWN);
}
Also used : LengthPrefixUnknownCoders.forParallelInstruction(org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCoders.forParallelInstruction) ParallelInstruction(com.google.api.services.dataflow.model.ParallelInstruction) ReadInstruction(com.google.api.services.dataflow.model.ReadInstruction) JacksonFactory(com.google.api.client.json.jackson2.JacksonFactory) Source(com.google.api.services.dataflow.model.Source)

Aggregations

Source (com.google.api.services.dataflow.model.Source): 51
Test (org.junit.Test): 31
ArrayList (java.util.ArrayList): 20
WindowedValue (org.apache.beam.sdk.util.WindowedValue): 18
CloudObject (org.apache.beam.runners.dataflow.util.CloudObject): 16
Map (java.util.Map): 15
Callable (java.util.concurrent.Callable): 15
Future (java.util.concurrent.Future): 15
HashMap (java.util.HashMap): 13
ImmutableMap (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap): 12
SortedMap (java.util.SortedMap): 11
TreeMap (java.util.TreeMap): 11
BoundedWindow (org.apache.beam.sdk.transforms.windowing.BoundedWindow): 8
ParallelInstruction (com.google.api.services.dataflow.model.ParallelInstruction): 7
ReadInstruction (com.google.api.services.dataflow.model.ReadInstruction): 6
KV (org.apache.beam.sdk.values.KV): 6
Collection (java.util.Collection): 5
List (java.util.List): 5
IsmRecord (org.apache.beam.runners.dataflow.internal.IsmFormat.IsmRecord): 5
Structs.getString (org.apache.beam.runners.dataflow.util.Structs.getString): 5