Example 1 with ConvertToIndexedRecord

Use of org.talend.components.adapter.beam.transform.ConvertToIndexedRecord in the components project by Talend.

From the class SimpleRecordFormatAvroIO, method read:

@Override
public PCollection<IndexedRecord> read(PBegin in) {
    // Reusable coder, shared by the source and the intermediate collection below.
    LazyAvroCoder<Object> lac = LazyAvroCoder.of();
    AvroHdfsFileSource source = AvroHdfsFileSource.of(doAs, path, lac);
    source.getExtraHadoopConfiguration().addFrom(getExtraHadoopConfiguration());
    source.setLimit(limit);
    // Read the source as (AvroKey, NullWritable) pairs.
    PCollection<KV<AvroKey, NullWritable>> read =
            in.apply(Read.from(source)).setCoder(source.getDefaultOutputCoder());
    // Keep only the keys; the NullWritable values carry no data.
    PCollection<AvroKey> pc1 = read.apply(Keys.<AvroKey>create());
    // Extract the record from each AvroKey.
    PCollection<Object> pc2 = pc1.apply(ParDo.of(new ExtractRecordFromAvroKey()));
    pc2 = pc2.setCoder(lac);
    // Convert each extracted object to an IndexedRecord.
    PCollection<IndexedRecord> pc3 = pc2.apply(ConvertToIndexedRecord.<Object>of());
    return pc3;
}
Also used:
AvroHdfsFileSource (org.talend.components.simplefileio.runtime.sources.AvroHdfsFileSource)
ConvertToIndexedRecord (org.talend.components.adapter.beam.transform.ConvertToIndexedRecord)
IndexedRecord (org.apache.avro.generic.IndexedRecord)
AvroKey (org.apache.avro.mapred.AvroKey)
KV (org.apache.beam.sdk.values.KV)
