Example 1 with SimpleRecordFormatAvroIO

Use of org.talend.components.simplefileio.runtime.SimpleRecordFormatAvroIO in the project components by Talend.

From the class S3InputRuntime, method expand.

@Override
public PCollection<IndexedRecord> expand(PBegin in) {
    // The UGI does not control security for S3.
    UgiDoAs doAs = UgiDoAs.ofNone();
    String path = S3Connection.getUriPath(properties.getDatasetProperties());
    // overwrite is ignored for reads.
    boolean overwrite = false;
    int limit = properties.limit.getValue();
    // mergeOutput is ignored for reads.
    boolean mergeOutput = false;
    SimpleRecordFormatBase rf = null;
    switch(properties.getDatasetProperties().format.getValue()) {
        case AVRO:
            rf = new SimpleRecordFormatAvroIO(doAs, path, overwrite, limit, mergeOutput);
            break;
        case CSV:
            rf = new SimpleRecordFormatCsvIO(doAs, path, overwrite, limit, properties.getDatasetProperties().getRecordDelimiter(), properties.getDatasetProperties().getFieldDelimiter(), mergeOutput);
            break;
        case PARQUET:
            rf = new SimpleRecordFormatParquetIO(doAs, path, overwrite, limit, mergeOutput);
            break;
    }
    if (rf == null) {
        throw new RuntimeException("To be implemented: " + properties.getDatasetProperties().format.getValue());
    }
    S3Connection.setS3Configuration(rf.getExtraHadoopConfiguration(), properties.getDatasetProperties());
    return rf.read(in);
}
Also used: SimpleRecordFormatCsvIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatCsvIO), UgiDoAs (org.talend.components.simplefileio.runtime.ugi.UgiDoAs), SimpleRecordFormatBase (org.talend.components.simplefileio.runtime.SimpleRecordFormatBase), SimpleRecordFormatParquetIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatParquetIO), SimpleRecordFormatAvroIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatAvroIO)

Example 2 with SimpleRecordFormatAvroIO

Use of org.talend.components.simplefileio.runtime.SimpleRecordFormatAvroIO in the project components by Talend.

From the class S3OutputRuntime, method expand.

@Override
public PDone expand(PCollection<IndexedRecord> in) {
    // The UGI does not control security for S3.
    UgiDoAs doAs = UgiDoAs.ofNone();
    String path = S3Connection.getUriPath(properties.getDatasetProperties());
    boolean overwrite = properties.overwrite.getValue();
    // limit is ignored for sinks
    int limit = -1;
    boolean mergeOutput = properties.mergeOutput.getValue();
    SimpleRecordFormatBase rf = null;
    switch(properties.getDatasetProperties().format.getValue()) {
        case AVRO:
            rf = new SimpleRecordFormatAvroIO(doAs, path, overwrite, limit, mergeOutput);
            break;
        case CSV:
            rf = new SimpleRecordFormatCsvIO(doAs, path, overwrite, limit, properties.getDatasetProperties().getRecordDelimiter(), properties.getDatasetProperties().getFieldDelimiter(), mergeOutput);
            break;
        case PARQUET:
            rf = new SimpleRecordFormatParquetIO(doAs, path, overwrite, limit, mergeOutput);
            break;
    }
    if (rf == null) {
        throw new RuntimeException("To be implemented: " + properties.getDatasetProperties().format.getValue());
    }
    S3Connection.setS3Configuration(rf.getExtraHadoopConfiguration(), properties.getDatasetProperties());
    return rf.write(in);
}
Also used: SimpleRecordFormatCsvIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatCsvIO), TalendRuntimeException (org.talend.daikon.exception.TalendRuntimeException), UgiDoAs (org.talend.components.simplefileio.runtime.ugi.UgiDoAs), SimpleRecordFormatBase (org.talend.components.simplefileio.runtime.SimpleRecordFormatBase), SimpleRecordFormatParquetIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatParquetIO), SimpleRecordFormatAvroIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatAvroIO)
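
Both examples repeat the same format switch and differ only in the flags they pass: the read side fixes overwrite and mergeOutput to false, and the write side fixes limit to -1. The following is a minimal sketch of how that selection could be shared, not the Talend implementation. The class FormatDispatchSketch, the Format enum (including the JSON value, added only to show the failure path), the RecordFormat interface, and the s3a:// paths are all hypothetical stand-ins, so the block compiles without Beam or the Talend libraries.

```java
public class FormatDispatchSketch {

    // Hypothetical stand-in for the dataset format property.
    // JSON is an assumption, included only to exercise the unsupported branch.
    enum Format { AVRO, CSV, PARQUET, JSON }

    // Hypothetical stand-in for SimpleRecordFormatBase; it only reports its settings.
    interface RecordFormat {
        String describe();
    }

    // One selection method that a source and a sink could both call.
    // Unsupported formats fail immediately instead of returning null.
    static RecordFormat select(Format format, String path, boolean overwrite,
                               int limit, boolean mergeOutput) {
        String settings = path + " overwrite=" + overwrite
                + " limit=" + limit + " mergeOutput=" + mergeOutput;
        switch (format) {
            case AVRO:
                return () -> "avro:" + settings;
            case CSV:
                return () -> "csv:" + settings;
            case PARQUET:
                return () -> "parquet:" + settings;
            default:
                throw new UnsupportedOperationException("To be implemented: " + format);
        }
    }

    public static void main(String[] args) {
        // Read side: overwrite and mergeOutput are ignored, so both are false.
        RecordFormat source = select(Format.AVRO, "s3a://bucket/in", false, 100, false);
        // Write side: limit is ignored, so it is -1.
        RecordFormat sink = select(Format.PARQUET, "s3a://bucket/out", true, -1, true);
        System.out.println(source.describe());
        System.out.println(sink.describe());
    }
}
```

Throwing from the default branch removes the need for the separate null check after the switch, at the cost of a different exception type than the RuntimeException used in the examples above.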

Aggregations

SimpleRecordFormatAvroIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatAvroIO): 2
SimpleRecordFormatBase (org.talend.components.simplefileio.runtime.SimpleRecordFormatBase): 2
SimpleRecordFormatCsvIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatCsvIO): 2
SimpleRecordFormatParquetIO (org.talend.components.simplefileio.runtime.SimpleRecordFormatParquetIO): 2
UgiDoAs (org.talend.components.simplefileio.runtime.ugi.UgiDoAs): 2
TalendRuntimeException (org.talend.daikon.exception.TalendRuntimeException): 1