Example 1 with CompressionType

Use of io.confluent.connect.s3.storage.CompressionType in the kafka-connect-storage-cloud project by confluentinc.

From class TestWithMockedS3, method readRecords:

public static Collection<Object> readRecords(String topicsDir, String directory, TopicPartition tp, long startOffset, String extension, String zeroPadFormat, String bucketName, AmazonS3 s3) throws IOException {
    // Reconstruct the S3 object key the connector would have committed
    // for this topic-partition and starting offset.
    String fileKey = FileUtils.fileKeyToCommit(topicsDir, directory, tp, startOffset, extension, zeroPadFormat);
    // A compound extension such as ".json.gz" marks a GZIP-wrapped file.
    CompressionType compressionType = CompressionType.NONE;
    if (extension.endsWith(".gz")) {
        compressionType = CompressionType.GZIP;
    }
    // Dispatch on the base extension. Avro and Parquet carry their own
    // internal compression, so compressionType is only passed to the
    // line-oriented readers.
    if (".avro".equals(extension)) {
        return readRecordsAvro(bucketName, fileKey, s3);
    } else if (extension.startsWith(".json")) {
        return readRecordsJson(bucketName, fileKey, s3, compressionType);
    } else if (extension.startsWith(".bin")) {
        return readRecordsByteArray(bucketName, fileKey, s3, compressionType, S3SinkConnectorConfig.FORMAT_BYTEARRAY_LINE_SEPARATOR_DEFAULT.getBytes());
    } else if (extension.endsWith(".parquet")) {
        return readRecordsParquet(bucketName, fileKey, s3);
    } else if (extension.startsWith(".customExtensionForTest")) {
        return readRecordsByteArray(bucketName, fileKey, s3, compressionType, "SEPARATOR".getBytes());
    } else {
        throw new IllegalArgumentException("Unknown extension: " + extension);
    }
}
Also used: CompressionType (io.confluent.connect.s3.storage.CompressionType)
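
The extension-to-compression rule at the top of readRecords is the crux: a compound suffix selects both a reader and a decompression wrapper. Below, that rule is pulled out as a standalone helper; a minimal sketch that assumes only the connector's CompressionType enum on the classpath (the helper name fromExtension is ours, not the project's).

import io.confluent.connect.s3.storage.CompressionType;

public class ExtensionCompression {

    // Mirrors the dispatch above: any extension ending in ".gz"
    // (".json.gz", ".bin.gz", ...) is read through a GZIP wrapper;
    // everything else is treated as uncompressed.
    static CompressionType fromExtension(String extension) {
        return extension.endsWith(".gz") ? CompressionType.GZIP : CompressionType.NONE;
    }

    public static void main(String[] args) {
        System.out.println(fromExtension(".json.gz")); // GZIP
        System.out.println(fromExtension(".avro"));    // NONE
    }
}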

Example 2 with CompressionType

Use of io.confluent.connect.s3.storage.CompressionType in the kafka-connect-storage-cloud project by confluentinc.

From class DataWriterByteArrayTest, method testBestGzipCompression:

@Test
public void testBestGzipCompression() throws Exception {
    CompressionType compressionType = CompressionType.GZIP;
    localProps.put(S3SinkConnectorConfig.FORMAT_CLASS_CONFIG, ByteArrayFormat.class.getName());
    localProps.put(S3SinkConnectorConfig.COMPRESSION_TYPE_CONFIG, compressionType.name);
    // Raise the GZIP Deflater level from the default to BEST_COMPRESSION (9).
    localProps.put(S3SinkConnectorConfig.COMPRESSION_LEVEL_CONFIG, String.valueOf(Deflater.BEST_COMPRESSION));
    setUp();
    task = new S3SinkTask(connectorConfig, context, storage, partitioner, format, SYSTEM_TIME);
    List<SinkRecord> sinkRecords = createByteArrayRecordsWithoutSchema(7 * context.assignment().size(), 0, context.assignment());
    task.put(sinkRecords);
    task.close(context.assignment());
    task.stop();
    // Offsets 0, 3, and 6 delimit the committed files (flush size 3 in this test setup).
    long[] validOffsets = { 0, 3, 6 };
    // Compression appends ".gz" to the format's default extension (".bin").
    verify(sinkRecords, validOffsets, context.assignment(), DEFAULT_EXTENSION + ".gz");
}
Also used: ByteArrayFormat (io.confluent.connect.s3.format.bytearray.ByteArrayFormat), SinkRecord (org.apache.kafka.connect.sink.SinkRecord), CompressionType (io.confluent.connect.s3.storage.CompressionType), Test (org.junit.Test)
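
What COMPRESSION_LEVEL_CONFIG ultimately controls is the level of the java.util.zip.Deflater inside the GZIP stream. The pure-JDK sketch below (independent of the connector; the class and method names are ours) shows the size difference between the default level and Deflater.BEST_COMPRESSION:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.GZIPOutputStream;

public class GzipLevelDemo {

    // GZIPOutputStream exposes its Deflater only through the protected
    // `def` field, so an anonymous subclass is the standard JDK idiom
    // for setting a non-default level.
    static byte[] gzip(byte[] input, int level) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos) {
            { def.setLevel(level); }
        }) {
            gz.write(input);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "the quick brown fox ".repeat(500).getBytes(); // repeat() needs Java 11+
        System.out.println("default: " + gzip(data, Deflater.DEFAULT_COMPRESSION).length);
        System.out.println("best:    " + gzip(data, Deflater.BEST_COMPRESSION).length);
    }
}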

Example 3 with CompressionType

Use of io.confluent.connect.s3.storage.CompressionType in the kafka-connect-storage-cloud project by confluentinc.

From class DataWriterJsonTest, method testGzipCompressionWithSchema:

@Test
public void testGzipCompressionWithSchema() throws Exception {
    CompressionType compressionType = CompressionType.GZIP;
    localProps.put(S3SinkConnectorConfig.FORMAT_CLASS_CONFIG, JsonFormat.class.getName());
    localProps.put(S3SinkConnectorConfig.COMPRESSION_TYPE_CONFIG, compressionType.name);
    setUp();
    task = new S3SinkTask(connectorConfig, context, storage, partitioner, format, SYSTEM_TIME);
    // Schema-backed records; JsonFormat serializes them to JSON before
    // the GZIP wrapper compresses the output.
    List<SinkRecord> sinkRecords = createRecordsInterleaved(7 * context.assignment().size(), 0, context.assignment());
    task.put(sinkRecords);
    task.close(context.assignment());
    task.stop();
    long[] validOffsets = { 0, 3, 6 };
    verify(sinkRecords, validOffsets, context.assignment(), EXTENSION + ".gz");
}
Also used: JsonFormat (io.confluent.connect.s3.format.json.JsonFormat), SinkRecord (org.apache.kafka.connect.sink.SinkRecord), CompressionType (io.confluent.connect.s3.storage.CompressionType), Test (org.junit.Test)
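
Both JSON tests verify files whose names end in ".json.gz". Assuming JsonFormat's default record separator (one JSON document per line), reading such a file back is just gunzip plus line reads; a minimal pure-JDK sketch:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class ReadJsonGz {
    public static void main(String[] args) throws IOException {
        // args[0]: path to a ".json.gz" file produced by the connector.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new FileInputStream(args[0])), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // one record's JSON per line
            }
        }
    }
}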

Example 4 with CompressionType

Use of io.confluent.connect.s3.storage.CompressionType in the kafka-connect-storage-cloud project by confluentinc.

From class DataWriterJsonTest, method testGzipCompressionNoSchema:

@Test
public void testGzipCompressionNoSchema() throws Exception {
    CompressionType compressionType = CompressionType.GZIP;
    localProps.put(S3SinkConnectorConfig.FORMAT_CLASS_CONFIG, JsonFormat.class.getName());
    localProps.put(S3SinkConnectorConfig.COMPRESSION_TYPE_CONFIG, compressionType.name);
    setUp();
    task = new S3SinkTask(connectorConfig, context, storage, partitioner, format, SYSTEM_TIME);
    // Same flow as testGzipCompressionWithSchema, but the records are schemaless.
    List<SinkRecord> sinkRecords = createJsonRecordsWithoutSchema(7 * context.assignment().size(), 0, context.assignment());
    task.put(sinkRecords);
    task.close(context.assignment());
    task.stop();
    long[] validOffsets = { 0, 3, 6 };
    verify(sinkRecords, validOffsets, context.assignment(), EXTENSION + ".gz");
}
Also used: JsonFormat (io.confluent.connect.s3.format.json.JsonFormat), SinkRecord (org.apache.kafka.connect.sink.SinkRecord), CompressionType (io.confluent.connect.s3.storage.CompressionType), Test (org.junit.Test)
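
Judging by the helper names, the only difference between the two JSON tests is the shape of the records: createRecordsInterleaved builds schema-backed Structs, while createJsonRecordsWithoutSchema builds plain Maps with a null value schema. A minimal sketch of those two shapes using the Connect API (topic, key, and field names here are hypothetical):

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.sink.SinkRecord;

public class RecordShapes {
    public static void main(String[] args) {
        // With schema: the value is a Struct typed by a Connect Schema.
        Schema valueSchema = SchemaBuilder.struct().name("record")
                .field("id", Schema.INT32_SCHEMA)
                .build();
        Struct structValue = new Struct(valueSchema).put("id", 1);
        SinkRecord withSchema = new SinkRecord(
                "test-topic", 0, Schema.STRING_SCHEMA, "key", valueSchema, structValue, 0L);

        // Schemaless: the value is a plain Map and the value schema is null;
        // JsonFormat serializes both shapes to JSON text.
        Map<String, Object> mapValue = new HashMap<>();
        mapValue.put("id", 1);
        SinkRecord noSchema = new SinkRecord(
                "test-topic", 0, Schema.STRING_SCHEMA, "key", null, mapValue, 0L);
    }
}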

Example 5 with CompressionType

Use of io.confluent.connect.s3.storage.CompressionType in the kafka-connect-storage-cloud project by confluentinc.

From class DataWriterByteArrayTest, method testGzipCompression:

@Test
public void testGzipCompression() throws Exception {
    CompressionType compressionType = CompressionType.GZIP;
    localProps.put(S3SinkConnectorConfig.FORMAT_CLASS_CONFIG, ByteArrayFormat.class.getName());
    // No COMPRESSION_LEVEL_CONFIG here, so GZIP runs at the Deflater
    // default level (contrast with testBestGzipCompression).
    localProps.put(S3SinkConnectorConfig.COMPRESSION_TYPE_CONFIG, compressionType.name);
    setUp();
    task = new S3SinkTask(connectorConfig, context, storage, partitioner, format, SYSTEM_TIME);
    List<SinkRecord> sinkRecords = createByteArrayRecordsWithoutSchema(7 * context.assignment().size(), 0, context.assignment());
    task.put(sinkRecords);
    task.close(context.assignment());
    task.stop();
    long[] validOffsets = { 0, 3, 6 };
    verify(sinkRecords, validOffsets, context.assignment(), DEFAULT_EXTENSION + ".gz");
}
Also used: ByteArrayFormat (io.confluent.connect.s3.format.bytearray.ByteArrayFormat), SinkRecord (org.apache.kafka.connect.sink.SinkRecord), CompressionType (io.confluent.connect.s3.storage.CompressionType), Test (org.junit.Test)
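
For the byte-array format, a "record" on disk is just the raw payload followed by a separator (the tests pass S3SinkConnectorConfig.FORMAT_BYTEARRAY_LINE_SEPARATOR_DEFAULT; Example 1's custom-extension branch uses the literal "SEPARATOR"). A hedged sketch of that write path, with compression applied as an outer GZIP wrapper; the class and method names are ours, not the connector's:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class ByteArrayWriteSketch {
    // Concatenate payload + separator per record, all through one GZIP
    // stream -- the shape the ".bin.gz" readers in Example 1 expect.
    static byte[] writeGzipped(List<byte[]> payloads, byte[] separator) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(bos)) {
            for (byte[] payload : payloads) {
                out.write(payload);
                out.write(separator);
            }
        }
        return bos.toByteArray();
    }
}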

Aggregations

CompressionType (io.confluent.connect.s3.storage.CompressionType): 6 usages
SinkRecord (org.apache.kafka.connect.sink.SinkRecord): 5 usages
Test (org.junit.Test): 5 usages
JsonFormat (io.confluent.connect.s3.format.json.JsonFormat): 3 usages
ByteArrayFormat (io.confluent.connect.s3.format.bytearray.ByteArrayFormat): 2 usages