
Example 1 with InputStreamFromOutputStream

Use of com.gc.iotools.stream.is.InputStreamFromOutputStream in project bender by Nextdoor.

In class S3Transport, method sendBatch.

@Override
public void sendBatch(TransportBuffer buffer, LinkedHashMap<String, String> partitions, Context context) throws TransportException {
    S3TransportBuffer buf = (S3TransportBuffer) buffer;
    /*
     * Create s3 key (filepath + filename)
     */
    LinkedHashMap<String, String> parts = new LinkedHashMap<String, String>(partitions);
    String filename = parts.remove(FILENAME_KEY);
    if (filename == null) {
        filename = context.getAwsRequestId();
    }
    String key = parts.entrySet().stream().map(s -> s.getKey() + "=" + s.getValue()).collect(Collectors.joining("/"));
    key = (key.equals("") ? filename : key + '/' + filename);
    if (this.basePath.endsWith("/")) {
        key = this.basePath + key;
    } else {
        key = this.basePath + '/' + key;
    }
    // TODO: make this dynamic
    if (key.endsWith(".gz")) {
        key = key.substring(0, key.length() - 3);
    }
    /*
     * Add or strip out compression format extension
     *
     * TODO: get this based on the compression codec
     */
    if (this.compress || buf.isCompressed()) {
        key += ".bz2";
    }
    ByteArrayOutputStream os = buf.getInternalBuffer();
    /*
     * Compress stream if needed. Don't compress a compressed stream.
     */
    ByteArrayOutputStream payload;
    if (this.compress && !buf.isCompressed()) {
        payload = compress(os);
    } else {
        payload = os;
    }
    /*
     * For memory efficiency convert the output stream into an InputStream. This is done using the
     * easystream library, which uses piped streams under the hood. This avoids copying the entire
     * contents of the OutputStream to populate the InputStream. Note that this process creates
     * another thread, which writes into the pipe while this thread consumes the InputStream.
     */
    final String s3Key = key;
    /*
     * Write to OutputStream
     */
    final InputStreamFromOutputStream<String> isos = new InputStreamFromOutputStream<String>() {

        public String produce(final OutputStream dataSink) throws Exception {
            /*
             * Note this is executed in a different thread
             */
            payload.writeTo(dataSink);
            return null;
        }
    };
    /*
     * Consume InputStream
     */
    try {
        sendStream(isos, s3Key, payload.size());
    } finally {
        try {
            isos.close();
        } catch (IOException e) {
            throw new TransportException(e);
        } finally {
            buf.close();
        }
    }
}
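
The key-building steps at the top of sendBatch can be tried in isolation. The following is a minimal sketch, not code from bender: the class name S3KeySketch and the method buildKey are hypothetical, and the filename is passed as a separate argument instead of being removed from the partition map under FILENAME_KEY.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class S3KeySketch {

    /**
     * Builds a key of the form basePath/k1=v1/k2=v2/filename, mirroring the steps in sendBatch.
     * All parameter names here are assumptions made for illustration.
     */
    public static String buildKey(String basePath, Map<String, String> partitions,
                                  String filename, boolean compressed) {
        // Join partitions as k=v segments, in insertion order.
        String key = partitions.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("/"));
        key = key.isEmpty() ? filename : key + "/" + filename;

        // Prefix the base path without doubling the slash.
        key = basePath.endsWith("/") ? basePath + key : basePath + "/" + key;

        // Strip an existing .gz extension, then append .bz2 for compressed payloads.
        if (key.endsWith(".gz")) {
            key = key.substring(0, key.length() - 3);
        }
        if (compressed) {
            key += ".bz2";
        }
        return key;
    }

    public static void main(String[] args) {
        Map<String, String> partitions = new LinkedHashMap<>();
        partitions.put("dt", "2017-01-01");
        partitions.put("hour", "05");
        // prints logs/dt=2017-01-01/hour=05/batch.bz2
        System.out.println(buildKey("logs", partitions, "batch.gz", true));
    }
}
```

A LinkedHashMap matters here because the order of the partition segments in the key follows the map's insertion order.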
Also used : OutputStream(java.io.OutputStream) UploadPartRequest(com.amazonaws.services.s3.model.UploadPartRequest) PartitionedTransport(com.nextdoor.bender.ipc.PartitionedTransport) BZip2CompressorOutputStream(org.apache.commons.compress.compressors.bzip2.BZip2CompressorOutputStream) ByteArrayOutputStream(org.apache.commons.io.output.ByteArrayOutputStream) Context(com.amazonaws.services.lambda.runtime.Context) IOException(java.io.IOException) InputStreamFromOutputStream(com.gc.iotools.stream.is.InputStreamFromOutputStream) AmazonS3Client(com.amazonaws.services.s3.AmazonS3Client) Collectors(java.util.stream.Collectors) LinkedHashMap(java.util.LinkedHashMap) Logger(org.apache.log4j.Logger) TransportBuffer(com.nextdoor.bender.ipc.TransportBuffer) InitiateMultipartUploadRequest(com.amazonaws.services.s3.model.InitiateMultipartUploadRequest) TransportException(com.nextdoor.bender.ipc.TransportException) ObjectMetadata(com.amazonaws.services.s3.model.ObjectMetadata) Map(java.util.Map) InitiateMultipartUploadResult(com.amazonaws.services.s3.model.InitiateMultipartUploadResult) UploadPartResult(com.amazonaws.services.s3.model.UploadPartResult) AmazonClientException(com.amazonaws.AmazonClientException) InputStream(java.io.InputStream)
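
The comment in sendBatch says that easystream relies on piped streams to turn an OutputStream into an InputStream. The sketch below shows that general technique with JDK classes only. It is not easystream's implementation: the class PipedStreamSketch and its methods are hypothetical, and it uses java.io.ByteArrayOutputStream where the original uses the commons-io class of the same name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

public class PipedStreamSketch {

    /**
     * Exposes the contents of a buffer as an InputStream. A producer thread writes into the
     * pipe while the caller reads from the returned stream, so no second full copy is made.
     */
    public static InputStream asInputStream(final ByteArrayOutputStream payload)
            throws IOException {
        final PipedInputStream in = new PipedInputStream(8192);
        final PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            // Runs in a different thread from the reader. Closing the output side
            // (via try-with-resources) is what signals end-of-stream to the reader.
            try (PipedOutputStream sink = out) {
                payload.writeTo(sink);
            } catch (IOException e) {
                // The reader closed the pipe early; there is nothing left to write.
            }
        }, "pipe-producer");
        producer.setDaemon(true);
        producer.start();
        return in;
    }

    /** Drains a stream into a byte array, standing in for the upload step. */
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int n;
        while ((n = in.read(chunk)) != -1) {
            copy.write(chunk, 0, n);
        }
        return copy.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        payload.write("hello pipe".getBytes(StandardCharsets.UTF_8));
        try (InputStream in = asInputStream(payload)) {
            System.out.println(new String(readAll(in), StandardCharsets.UTF_8));
        }
    }
}
```

The pipe buffer is bounded (8192 bytes in this sketch), so the producer blocks whenever the reader falls behind. That is why the write has to run on its own thread: writing a large payload and reading it back on the same thread would deadlock once the buffer fills.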
