
Example 6 with DirectOptions

Use of org.apache.beam.runners.direct.DirectOptions in the project components by Talend.

From the class SimpleFileIODatasetRuntime, method getSample.

@Override
public void getSample(int limit, Consumer<IndexedRecord> consumer) {
    // Create an input runtime based on the properties.
    SimpleFileIOInputRuntime inputRuntime = new SimpleFileIOInputRuntime();
    SimpleFileIOInputProperties inputProperties = new SimpleFileIOInputProperties(null);
    inputProperties.limit.setValue(limit);
    inputProperties.init();
    inputProperties.setDatasetProperties(properties);
    inputRuntime.initialize(null, inputProperties);
    // Create a pipeline using the input component to get records.
    DirectOptions options = BeamLocalRunnerOption.getOptions();
    final Pipeline p = Pipeline.create(options);
    try (DirectConsumerCollector<IndexedRecord> collector = DirectConsumerCollector.of(consumer)) {
        // Collect a sample of the input records.
        p.apply(inputRuntime).apply(Sample.<IndexedRecord>any(limit)).apply(collector);
        try {
            p.run().waitUntilFinish();
        } catch (Pipeline.PipelineExecutionException e) {
            if (e.getCause() instanceof TalendRuntimeException)
                throw (TalendRuntimeException) e.getCause();
            throw e;
        }
    }
}
Also used: TalendRuntimeException (org.talend.daikon.exception.TalendRuntimeException), IndexedRecord (org.apache.avro.generic.IndexedRecord), SimpleFileIOInputProperties (org.talend.components.simplefileio.input.SimpleFileIOInputProperties), DirectOptions (org.apache.beam.runners.direct.DirectOptions), Pipeline (org.apache.beam.sdk.Pipeline)
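
The method above does two things that may be easier to see without the Beam and Talend dependencies: it caps the number of records handed to the consumer, and it rethrows the cause of a wrapper exception when that cause is a domain exception. The following is a minimal plain-Java sketch of that pattern only. The names SampleUnwrapSketch, ExecutionWrapperException, DomainException, sample and runUnwrapping are hypothetical stand-ins invented for illustration; they are not part of Beam or the Talend components. The sketch also simplifies the sampling: it takes the first records it sees, whereas Sample.any(limit) may return any subset of up to limit elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class SampleUnwrapSketch {

    /** Hypothetical stand-in for Pipeline.PipelineExecutionException: wraps a failure raised during a run. */
    public static class ExecutionWrapperException extends RuntimeException {
        public ExecutionWrapperException(Throwable cause) {
            super(cause);
        }
    }

    /** Hypothetical stand-in for TalendRuntimeException. */
    public static class DomainException extends RuntimeException {
        public DomainException(String message) {
            super(message);
        }
    }

    /**
     * Passes at most `limit` records to the consumer.
     * Simplification: takes the first records in iteration order.
     */
    public static <T> void sample(Iterable<T> source, int limit, Consumer<? super T> consumer) {
        int seen = 0;
        for (T record : source) {
            if (seen++ >= limit) {
                break;
            }
            consumer.accept(record);
        }
    }

    /**
     * Runs the job. If it fails with the wrapper exception and the cause is a
     * DomainException, the cause is rethrown directly; otherwise the wrapper is rethrown.
     */
    public static void runUnwrapping(Runnable job) {
        try {
            job.run();
        } catch (ExecutionWrapperException e) {
            if (e.getCause() instanceof DomainException) {
                throw (DomainException) e.getCause();
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        List<String> collected = new ArrayList<>();
        runUnwrapping(() -> sample(Arrays.asList("a", "b", "c"), 2, collected::add));
        System.out.println(collected.size() + " records collected");

        try {
            runUnwrapping(() -> {
                throw new ExecutionWrapperException(new DomainException("cannot read input"));
            });
        } catch (DomainException e) {
            // The caller sees the domain exception, not the wrapper.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Rethrowing the cause lets callers handle the domain exception without knowing that the work ran inside a pipeline; any other failure still surfaces as the wrapper.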

Aggregations

IndexedRecord (org.apache.avro.generic.IndexedRecord): 6
DirectOptions (org.apache.beam.runners.direct.DirectOptions): 6
Pipeline (org.apache.beam.sdk.Pipeline): 6
PipelineResult (org.apache.beam.sdk.PipelineResult): 1
BeamJobRuntimeContainer (org.talend.components.adapter.beam.BeamJobRuntimeContainer): 1
BigQueryInputProperties (org.talend.components.bigquery.input.BigQueryInputProperties): 1
ElasticsearchInputProperties (org.talend.components.elasticsearch.input.ElasticsearchInputProperties): 1
KinesisInputProperties (org.talend.components.kinesis.input.KinesisInputProperties): 1
PubSubInputProperties (org.talend.components.pubsub.input.PubSubInputProperties): 1
SimpleFileIOInputProperties (org.talend.components.simplefileio.input.SimpleFileIOInputProperties): 1
S3InputProperties (org.talend.components.simplefileio.s3.input.S3InputProperties): 1
TalendRuntimeException (org.talend.daikon.exception.TalendRuntimeException): 1