
Example 1 with S3OutputProperties

use of org.talend.components.simplefileio.s3.output.S3OutputProperties in project components by Talend.

the class S3OutputRuntimeTestIT method testParquet_merge.

@Test
public void testParquet_merge() throws IOException {
    S3DatasetProperties datasetProps = s3.createS3DatasetProperties();
    datasetProps.format.setValue(SimpleFileIOFormat.PARQUET);
    S3OutputProperties outputProperties = new S3OutputProperties("out");
    outputProperties.init();
    outputProperties.setDatasetProperties(datasetProps);
    outputProperties.mergeOutput.setValue(true);
    // Create the runtime.
    S3OutputRuntime runtime = new S3OutputRuntime();
    runtime.initialize(null, outputProperties);
    // Use the runtime in a Spark pipeline to test.
    final Pipeline p = spark.createPipeline();
    PCollection<IndexedRecord> input = p.apply(Create.of(
            ConvertToIndexedRecord.convertToAvro(new String[] { "1", "one" }),
            ConvertToIndexedRecord.convertToAvro(new String[] { "2", "two" })));
    input.apply(runtime);
    // And run the test.
    p.run().waitUntilFinish();
    FileSystem s3FileSystem = S3Connection.createFileSystem(datasetProps);
    MiniDfsResource.assertReadParquetFile(s3FileSystem, s3.getS3APath(datasetProps),
            new HashSet<IndexedRecord>(Arrays.asList(
                    ConvertToIndexedRecord.convertToAvro(new String[] { "1", "one" }),
                    ConvertToIndexedRecord.convertToAvro(new String[] { "2", "two" }))),
            false);
    MiniDfsResource.assertFileNumber(s3FileSystem, s3.getS3APath(datasetProps), 1);
}
Also used : S3DatasetProperties(org.talend.components.simplefileio.s3.S3DatasetProperties) ConvertToIndexedRecord(org.talend.components.adapter.beam.transform.ConvertToIndexedRecord) IndexedRecord(org.apache.avro.generic.IndexedRecord) S3OutputProperties(org.talend.components.simplefileio.s3.output.S3OutputProperties) FileSystem(org.apache.hadoop.fs.FileSystem) Pipeline(org.apache.beam.sdk.Pipeline) Test(org.junit.Test)

Example 2 with S3OutputProperties

use of org.talend.components.simplefileio.s3.output.S3OutputProperties in project components by Talend.

the class S3Connection method createClient.

public static AmazonS3 createClient(S3OutputProperties properties) {
    S3DatasetProperties data_set = properties.getDatasetProperties();
    S3DatastoreProperties data_store = data_set.getDatastoreProperties();
    // Static credentials built from the datastore's access key and secret key.
    com.amazonaws.auth.AWSCredentials credentials = new com.amazonaws.auth.BasicAWSCredentials(data_store.accessKey.getValue(), data_store.secretKey.getValue());
    Region region = RegionUtils.getRegion(data_set.region.getValue().getValue());
    Boolean clientSideEnc = data_set.encryptDataInMotion.getValue();
    AmazonS3 conn = null;
    if (clientSideEnc != null && clientSideEnc) {
        // Client-side encryption: use an encryption client backed by the configured KMS key.
        String kms_cmk = data_set.kmsForDataInMotion.getValue();
        KMSEncryptionMaterialsProvider encryptionMaterialsProvider = new KMSEncryptionMaterialsProvider(kms_cmk);
        conn = new AmazonS3EncryptionClient(credentials, encryptionMaterialsProvider, new CryptoConfiguration().withAwsKmsRegion(region));
    } else {
        // No client-side encryption: use the plain S3 client.
        AWSCredentialsProvider basicCredentialsProvider = new StaticCredentialsProvider(credentials);
        conn = new AmazonS3Client(basicCredentialsProvider);
    }
    conn.setRegion(region);
    return conn;
}
Also used : AmazonS3(com.amazonaws.services.s3.AmazonS3) AmazonS3EncryptionClient(com.amazonaws.services.s3.AmazonS3EncryptionClient) StaticCredentialsProvider(com.amazonaws.internal.StaticCredentialsProvider) KMSEncryptionMaterialsProvider(com.amazonaws.services.s3.model.KMSEncryptionMaterialsProvider) CryptoConfiguration(com.amazonaws.services.s3.model.CryptoConfiguration) S3DatasetProperties(org.talend.components.simplefileio.s3.S3DatasetProperties) AmazonS3Client(com.amazonaws.services.s3.AmazonS3Client) Region(com.amazonaws.regions.Region) S3DatastoreProperties(org.talend.components.simplefileio.s3.S3DatastoreProperties) AWSCredentialsProvider(com.amazonaws.auth.AWSCredentialsProvider)
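
The guard `clientSideEnc != null && clientSideEnc` in Example 2 treats the boxed `Boolean` from `getValue()` as possibly unset, since unboxing a null `Boolean` in an `if` would throw a `NullPointerException`. A minimal, self-contained sketch of the same null-safe check is shown here; the class `NullableFlag` and the method `isEnabled` are hypothetical names for illustration and are not part of the Talend or AWS APIs.

```java
public class NullableFlag {

    // Hypothetical helper: returns true only when the flag is non-null and true.
    // Equivalent to "flag != null && flag", without any risk of unboxing null.
    static boolean isEnabled(Boolean flag) {
        return Boolean.TRUE.equals(flag);
    }

    public static void main(String[] args) {
        System.out.println(isEnabled(null));          // unset flag
        System.out.println(isEnabled(Boolean.FALSE)); // explicitly disabled
        System.out.println(isEnabled(Boolean.TRUE));  // explicitly enabled
    }
}
```

Either spelling works; `Boolean.TRUE.equals(flag)` only folds the null check and the truth check into one expression.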

Example 3 with S3OutputProperties

use of org.talend.components.simplefileio.s3.output.S3OutputProperties in project components by Talend.

the class S3SinkTestIT method testAction.

@Test
public void testAction() throws IOException {
    S3OutputProperties properties = PropertiesPreparer.createS3OtuputProperties();
    runtime.initialize(null, properties);
    S3WriteOperation writeOperation = runtime.createWriteOperation();
    S3OutputWriter writer = writeOperation.createWriter(null);
    writer.open("u001");
    Schema schema = PropertiesPreparer.createTestSchema();
    IndexedRecord r1 = new GenericData.Record(schema);
    r1.put(0, 1);
    r1.put(1, "wangwei");
    writer.write(r1);
    IndexedRecord r2 = new GenericData.Record(schema);
    r2.put(0, 2);
    r2.put(1, "gaoyan");
    writer.write(r2);
    IndexedRecord r3 = new GenericData.Record(schema);
    r3.put(0, 3);
    r3.put(1, "dabao");
    writer.write(r3);
    writer.close();
    AmazonS3 s3_client = S3Connection.createClient(properties);
    String data = s3_client.getObjectAsString(PropertiesPreparer.bucket, PropertiesPreparer.objectkey);
    String expect = "ID;NAME1;wangwei2;gaoyan3;dabao";
    org.junit.Assert.assertEquals("data content is not right", expect, data.replaceAll("[\r\n]+", ""));
}
Also used : AmazonS3(com.amazonaws.services.s3.AmazonS3) IndexedRecord(org.apache.avro.generic.IndexedRecord) S3OutputProperties(org.talend.components.simplefileio.s3.output.S3OutputProperties) Schema(org.apache.avro.Schema) Test(org.junit.Test)
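
The assertion in Example 3 strips every run of carriage returns and line feeds before comparing, so the check passes regardless of which line separator the writer used. The sketch below illustrates that normalization in isolation; the class `LineBreakNormalization`, the method `stripLineBreaks`, and the two sample strings are assumptions made for illustration, not the actual object content written by the test.

```java
public class LineBreakNormalization {

    // Same regex as in the test: remove every run of CR and/or LF characters.
    static String stripLineBreaks(String content) {
        return content.replaceAll("[\r\n]+", "");
    }

    public static void main(String[] args) {
        // Assumed sample content with Windows-style and Unix-style line endings.
        String windows = "ID;NAME\r\n1;wangwei\r\n2;gaoyan\r\n3;dabao\r\n";
        String unix = "ID;NAME\n1;wangwei\n2;gaoyan\n3;dabao\n";

        // Both variants collapse to the single-line form the test expects.
        System.out.println(stripLineBreaks(windows));
        System.out.println(stripLineBreaks(windows).equals(stripLineBreaks(unix)));
    }
}
```

One consequence of this approach is that the comparison no longer verifies where the line breaks fall, only that the field values appear in the expected order.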

Example 4 with S3OutputProperties

use of org.talend.components.simplefileio.s3.output.S3OutputProperties in project components by Talend.

the class PropertiesPreparer method createS3OtuputProperties.

public static S3OutputProperties createS3OtuputProperties() {
    S3OutputProperties properties = new S3OutputProperties("s3output");
    S3DatasetProperties dataset = new S3DatasetProperties("dataset");
    S3DatastoreProperties datastore = new S3DatastoreProperties("datastore");
    properties.setDatasetProperties(dataset);
    dataset.setDatastoreProperties(datastore);
    datastore.accessKey.setValue(accessKey);
    datastore.secretKey.setValue(secretkey);
    dataset.region.setValue(S3Region.valueOf(region));
    dataset.bucket.setValue(bucket);
    dataset.kmsForDataInMotion.setValue(ssekmskey);
    dataset.kmsForDataAtRest.setValue(csekmskey);
    dataset.object.setValue(objectkey);
    return properties;
}
Also used : S3DatasetProperties(org.talend.components.simplefileio.s3.S3DatasetProperties) S3OutputProperties(org.talend.components.simplefileio.s3.output.S3OutputProperties) S3DatastoreProperties(org.talend.components.simplefileio.s3.S3DatastoreProperties)

Example 5 with S3OutputProperties

use of org.talend.components.simplefileio.s3.output.S3OutputProperties in project components by Talend.

the class S3SourceOrSinkTestIT method initialize_success.

@Test
public void initialize_success() {
    ValidationResult result = runtime.initialize(null, new S3OutputProperties("s3output"));
    org.junit.Assert.assertEquals("expect ok, but not", ValidationResult.OK, result);
}
Also used : S3OutputProperties(org.talend.components.simplefileio.s3.output.S3OutputProperties) ValidationResult(org.talend.daikon.properties.ValidationResult) Test(org.junit.Test)

Aggregations

S3OutputProperties (org.talend.components.simplefileio.s3.output.S3OutputProperties): 9
IndexedRecord (org.apache.avro.generic.IndexedRecord): 6
Test (org.junit.Test): 6
S3DatasetProperties (org.talend.components.simplefileio.s3.S3DatasetProperties): 6
S3DatastoreProperties (org.talend.components.simplefileio.s3.S3DatastoreProperties): 4
Schema (org.apache.avro.Schema): 3
Pipeline (org.apache.beam.sdk.Pipeline): 3
FileSystem (org.apache.hadoop.fs.FileSystem): 3
ConvertToIndexedRecord (org.talend.components.adapter.beam.transform.ConvertToIndexedRecord): 3
AmazonS3 (com.amazonaws.services.s3.AmazonS3): 2
S3InputProperties (org.talend.components.simplefileio.s3.input.S3InputProperties): 2
RecordSet (org.talend.components.test.RecordSet): 2
ValidationResult (org.talend.daikon.properties.ValidationResult): 2
AWSCredentialsProvider (com.amazonaws.auth.AWSCredentialsProvider): 1
StaticCredentialsProvider (com.amazonaws.internal.StaticCredentialsProvider): 1
Region (com.amazonaws.regions.Region): 1
AmazonS3Client (com.amazonaws.services.s3.AmazonS3Client): 1
AmazonS3EncryptionClient (com.amazonaws.services.s3.AmazonS3EncryptionClient): 1
CryptoConfiguration (com.amazonaws.services.s3.model.CryptoConfiguration): 1
KMSEncryptionMaterialsProvider (com.amazonaws.services.s3.model.KMSEncryptionMaterialsProvider): 1