Example 1 with BeamJobRuntimeContainer

Use of org.talend.components.adapter.beam.BeamJobRuntimeContainer in the project components by Talend.

From the class BigQueryBeamRuntimeTestIT, method createSparkRunnerPipeline.

// TODO extract this to utils
private Pipeline createSparkRunnerPipeline() {
    PipelineOptions o = PipelineOptionsFactory.create();
    SparkContextOptions options = o.as(SparkContextOptions.class);
    options.setProvidedSparkContext(jsc);
    options.setUsesProvidedSparkContext(true);
    options.setRunner(SparkRunner.class);
    runtimeContainer = new BeamJobRuntimeContainer(options);
    return Pipeline.create(options);
}
Also used: BeamJobRuntimeContainer (org.talend.components.adapter.beam.BeamJobRuntimeContainer), SparkContextOptions (org.apache.beam.runners.spark.SparkContextOptions), PipelineOptions (org.apache.beam.sdk.options.PipelineOptions)

Example 2 with BeamJobRuntimeContainer

Use of org.talend.components.adapter.beam.BeamJobRuntimeContainer in the project components by Talend.

From the class PubSubInputRuntimeTestIT, method createSparkRunnerPipeline.

// TODO extract this to utils
private Pipeline createSparkRunnerPipeline() {
    PipelineOptions o = PipelineOptionsFactory.create();
    SparkContextOptions options = o.as(SparkContextOptions.class);
    JavaSparkContext jsc = new JavaSparkContext("local[2]", "PubSubInput");
    options.setProvidedSparkContext(jsc);
    options.setUsesProvidedSparkContext(true);
    options.setRunner(SparkRunner.class);
    runtimeContainer = new BeamJobRuntimeContainer(options);
    return Pipeline.create(options);
}
Also used: BeamJobRuntimeContainer (org.talend.components.adapter.beam.BeamJobRuntimeContainer), SparkContextOptions (org.apache.beam.runners.spark.SparkContextOptions), PipelineOptions (org.apache.beam.sdk.options.PipelineOptions), JavaSparkContext (org.apache.spark.api.java.JavaSparkContext)

Example 3 with BeamJobRuntimeContainer

Use of org.talend.components.adapter.beam.BeamJobRuntimeContainer in the project components by Talend.

From the class BigQueryBeamRuntimeTestIT, method init.

@Before
public void init() {
    datastore = createDatastore();
    runtimeContainer = new BeamJobRuntimeContainer(pipeline.getOptions());
}
Also used: BeamJobRuntimeContainer (org.talend.components.adapter.beam.BeamJobRuntimeContainer), Before (org.junit.Before)

Example 4 with BeamJobRuntimeContainer

Use of org.talend.components.adapter.beam.BeamJobRuntimeContainer in the project components by Talend.

From the class BigQueryDatasetRuntime, method getSample.

@Override
public void getSample(int limit, Consumer<IndexedRecord> consumer) {
    // Create a pipeline using the input component to get records.
    DirectOptions options = BeamLocalRunnerOption.getOptions();
    final Pipeline p = Pipeline.create(options);
    // Create an input runtime based on the properties.
    BigQueryInputRuntime inputRuntime = new BigQueryInputRuntime();
    BigQueryInputProperties inputProperties = new BigQueryInputProperties(null);
    inputProperties.init();
    inputProperties.setDatasetProperties(properties);
    inputRuntime.initialize(new BeamJobRuntimeContainer(options), inputProperties);
    try (DirectConsumerCollector<IndexedRecord> collector = DirectConsumerCollector.of(consumer)) {
        // Collect a sample of the input records.
        p.apply(inputRuntime).apply(Sample.<IndexedRecord>any(limit)).apply(collector);
        PipelineResult pr = p.run();
        pr.waitUntilFinish();
    }
}
Also used: BeamJobRuntimeContainer (org.talend.components.adapter.beam.BeamJobRuntimeContainer), IndexedRecord (org.apache.avro.generic.IndexedRecord), PipelineResult (org.apache.beam.sdk.PipelineResult), BigQueryInputProperties (org.talend.components.bigquery.input.BigQueryInputProperties), DirectOptions (org.apache.beam.runners.direct.DirectOptions), Pipeline (org.apache.beam.sdk.Pipeline)

Example 5 with BeamJobRuntimeContainer

Use of org.talend.components.adapter.beam.BeamJobRuntimeContainer in the project components by Talend.

From the class PubSubInputRuntimeTestIT, method init.

@Before
public void init() {
    datastoreProperties = createDatastore();
    datasetProperties = createDataset(datastoreProperties, topicName);
    runtimeContainer = new BeamJobRuntimeContainer(pipeline.getOptions());
}
Also used: BeamJobRuntimeContainer (org.talend.components.adapter.beam.BeamJobRuntimeContainer), Before (org.junit.Before)

Aggregations

BeamJobRuntimeContainer (org.talend.components.adapter.beam.BeamJobRuntimeContainer): 7
SparkContextOptions (org.apache.beam.runners.spark.SparkContextOptions): 3
PipelineOptions (org.apache.beam.sdk.options.PipelineOptions): 3
JavaSparkContext (org.apache.spark.api.java.JavaSparkContext): 2
Before (org.junit.Before): 2
IndexedRecord (org.apache.avro.generic.IndexedRecord): 1
DirectOptions (org.apache.beam.runners.direct.DirectOptions): 1
Pipeline (org.apache.beam.sdk.Pipeline): 1
PipelineResult (org.apache.beam.sdk.PipelineResult): 1
BigQueryInputProperties (org.talend.components.bigquery.input.BigQueryInputProperties): 1