Example 1 with SelectObjectContentRequest

use of com.amazonaws.services.s3.model.SelectObjectContentRequest in project trino by trinodb.

the class TrinoS3SelectClient method getRecordsContent.

public InputStream getRecordsContent(SelectObjectContentRequest selectObjectRequest) {
    this.selectObjectRequest = requireNonNull(selectObjectRequest, "selectObjectRequest is null");
    this.selectObjectContentResult = s3Client.selectObjectContent(selectObjectRequest);
    return selectObjectContentResult.getPayload().getRecordsInputStream(new SelectObjectContentEventVisitor() {

        @Override
        public void visit(EndEvent endEvent) {
            requestComplete = true;
        }
    });
}
Also used : EndEvent(com.amazonaws.services.s3.model.SelectObjectContentEvent.EndEvent) SelectObjectContentEventVisitor(com.amazonaws.services.s3.model.SelectObjectContentEventVisitor)

Example 2 with SelectObjectContentRequest

use of com.amazonaws.services.s3.model.SelectObjectContentRequest in project presto by prestodb.

the class PrestoS3SelectClient method getRecordsContent.

public InputStream getRecordsContent(SelectObjectContentRequest selectObjectRequest) {
    this.selectObjectRequest = requireNonNull(selectObjectRequest, "selectObjectRequest is null");
    this.selectObjectContentResult = s3Client.selectObjectContent(selectObjectRequest);
    return selectObjectContentResult.getPayload().getRecordsInputStream(new SelectObjectContentEventVisitor() {

        @Override
        public void visit(EndEvent endEvent) {
            requestComplete = true;
        }
    });
}
Also used : EndEvent(com.amazonaws.services.s3.model.SelectObjectContentEvent.EndEvent) SelectObjectContentEventVisitor(com.amazonaws.services.s3.model.SelectObjectContentEventVisitor)

Example 3 with SelectObjectContentRequest

use of com.amazonaws.services.s3.model.SelectObjectContentRequest in project presto by prestodb.

the class S3SelectLineRecordReader method readLine.

private int readLine(Text value) throws IOException {
    try {
        // Retry with exponential backoff, but stop immediately on interruption or an unrecoverable S3 error.
        return retry()
                .maxAttempts(maxAttempts)
                .exponentialBackoff(BACKOFF_MIN_SLEEP, maxBackoffTime, maxRetryTime, 2.0)
                .stopOn(InterruptedException.class, UnrecoverableS3OperationException.class)
                .run("readRecordsContentStream", () -> {
            if (isFirstLine) {
                // First read, or first read after a failure: issue the S3 Select request and open a fresh stream.
                recordsFromS3 = 0;
                selectObjectContent = selectClient.getRecordsContent(selectObjectContentRequest);
                closer.register(selectObjectContent);
                reader = new LineReader(selectObjectContent, lineDelimiter.getBytes(StandardCharsets.UTF_8));
                closer.register(reader);
                isFirstLine = false;
            }
            try {
                return reader.readLine(value);
            } catch (RuntimeException e) {
                // Reset state so that the next attempt reopens the stream from the beginning.
                isFirstLine = true;
                recordsFromS3 = 0;
                if (e instanceof AmazonS3Exception) {
                    // These client-side errors are wrapped so that stopOn(...) ends the retry loop.
                    switch(((AmazonS3Exception) e).getStatusCode()) {
                        case HTTP_FORBIDDEN:
                        case HTTP_NOT_FOUND:
                        case HTTP_BAD_REQUEST:
                            throw new UnrecoverableS3OperationException(selectClient.getBucketName(), selectClient.getKeyName(), e);
                    }
                }
                throw e;
            }
        });
    } catch (Exception e) {
        // Rethrow IOException and unchecked exceptions as-is; wrap any other checked exception.
        throwIfInstanceOf(e, IOException.class);
        throwIfUnchecked(e);
        throw new RuntimeException(e);
    }
}
Also used : LineReader(org.apache.hadoop.util.LineReader) AmazonS3Exception(com.amazonaws.services.s3.model.AmazonS3Exception) IOException(java.io.IOException)
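
The retry logic in readLine combines three ideas: exponential backoff between attempts, a cap on the number of attempts, and an immediate stop for errors that retrying cannot fix (HTTP 400, 403 and 404). The following is a minimal, dependency-free sketch of that pattern. It is not the retry() helper used in the snippet above; the class RetrySketch, its methods and UnrecoverableException are hypothetical names introduced only for illustration, and the sketch leaves out the stream-reset step that readLine performs before each new attempt.

```java
import java.net.HttpURLConnection;
import java.util.concurrent.Callable;

// Hypothetical stand-alone sketch; not the retry() helper used by readLine.
final class RetrySketch {
    /** Hypothetical stand-in for UnrecoverableS3OperationException. */
    static final class UnrecoverableException extends RuntimeException {
        UnrecoverableException(String message) {
            super(message);
        }
    }

    /** Mirrors the switch in readLine: 400, 403 and 404 are treated as unrecoverable. */
    static boolean isUnrecoverable(int statusCode) {
        switch (statusCode) {
            case HttpURLConnection.HTTP_FORBIDDEN:
            case HttpURLConnection.HTTP_NOT_FOUND:
            case HttpURLConnection.HTTP_BAD_REQUEST:
                return true;
            default:
                return false;
        }
    }

    /** Delay before the retry that follows attempt number `attempt` (1-based): doubles each time, capped at maxMillis. */
    static long backoffMillis(int attempt, long minMillis, long maxMillis) {
        double delay = minMillis * Math.pow(2.0, attempt - 1);
        return (long) Math.min(delay, maxMillis);
    }

    /** Runs the action up to maxAttempts times, sleeping between failed attempts. */
    static <T> T run(int maxAttempts, long minMillis, long maxMillis, Callable<T> action) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (UnrecoverableException | InterruptedException e) {
                throw e; // never retried
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // attempts exhausted
                }
                Thread.sleep(backoffMillis(attempt, minMillis, maxMillis));
            }
        }
    }
}
```

A caller would wrap the read in RetrySketch.run(...) and throw UnrecoverableException whenever isUnrecoverable(statusCode) is true, so that such failures surface immediately instead of consuming the remaining attempts.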

Example 4 with SelectObjectContentRequest

use of com.amazonaws.services.s3.model.SelectObjectContentRequest in project presto by prestodb.

the class S3SelectCsvRecordReader method buildSelectObjectRequest.

@Override
public SelectObjectContentRequest buildSelectObjectRequest(Properties schema, String query, Path path) {
    SelectObjectContentRequest selectObjectRequest = new SelectObjectContentRequest();
    URI uri = path.toUri();
    selectObjectRequest.setBucketName(PrestoS3FileSystem.getBucketName(uri));
    selectObjectRequest.setKey(PrestoS3FileSystem.keyFromPath(path));
    selectObjectRequest.setExpression(query);
    selectObjectRequest.setExpressionType(ExpressionType.SQL);
    String fieldDelimiter = getFieldDelimiter(schema);
    String quoteChar = schema.getProperty(QUOTE_CHAR, null);
    String escapeChar = schema.getProperty(ESCAPE_CHAR, null);
    CSVInput selectObjectCSVInputSerialization = new CSVInput();
    selectObjectCSVInputSerialization.setRecordDelimiter(lineDelimiter);
    selectObjectCSVInputSerialization.setFieldDelimiter(fieldDelimiter);
    selectObjectCSVInputSerialization.setComments(COMMENTS_CHAR_STR);
    selectObjectCSVInputSerialization.setQuoteCharacter(quoteChar);
    selectObjectCSVInputSerialization.setQuoteEscapeCharacter(escapeChar);
    InputSerialization selectObjectInputSerialization = new InputSerialization();
    // Only gzip and bzip2 input is passed through to S3 Select; any other detected codec is rejected.
    // A null codec means the file is treated as uncompressed and no compression type is set.
    CompressionCodec codec = compressionCodecFactory.getCodec(path);
    if (codec instanceof GzipCodec) {
        selectObjectInputSerialization.setCompressionType(CompressionType.GZIP);
    } else if (codec instanceof BZip2Codec) {
        selectObjectInputSerialization.setCompressionType(CompressionType.BZIP2);
    } else if (codec != null) {
        throw new PrestoException(NOT_SUPPORTED, "Compression extension not supported for S3 Select: " + path);
    }
    selectObjectInputSerialization.setCsv(selectObjectCSVInputSerialization);
    selectObjectRequest.setInputSerialization(selectObjectInputSerialization);
    OutputSerialization selectObjectOutputSerialization = new OutputSerialization();
    CSVOutput selectObjectCSVOutputSerialization = new CSVOutput();
    selectObjectCSVOutputSerialization.setRecordDelimiter(lineDelimiter);
    selectObjectCSVOutputSerialization.setFieldDelimiter(fieldDelimiter);
    selectObjectCSVOutputSerialization.setQuoteCharacter(quoteChar);
    selectObjectCSVOutputSerialization.setQuoteEscapeCharacter(escapeChar);
    selectObjectOutputSerialization.setCsv(selectObjectCSVOutputSerialization);
    selectObjectRequest.setOutputSerialization(selectObjectOutputSerialization);
    return selectObjectRequest;
}
Also used : SelectObjectContentRequest(com.amazonaws.services.s3.model.SelectObjectContentRequest) GzipCodec(org.apache.hadoop.io.compress.GzipCodec) InputSerialization(com.amazonaws.services.s3.model.InputSerialization) CSVInput(com.amazonaws.services.s3.model.CSVInput) BZip2Codec(org.apache.hadoop.io.compress.BZip2Codec) PrestoException(com.facebook.presto.spi.PrestoException) CompressionCodec(org.apache.hadoop.io.compress.CompressionCodec) URI(java.net.URI) OutputSerialization(com.amazonaws.services.s3.model.OutputSerialization) CSVOutput(com.amazonaws.services.s3.model.CSVOutput)
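
The compression branch in buildSelectObjectRequest accepts gzip and bzip2, treats a missing codec as uncompressed input, and rejects every other codec. A dependency-free sketch of that decision, keyed directly on the file suffix, might look like the following. The class SelectCompressionSketch, its enum and the list of rejected suffixes are assumptions made for illustration; the snippet above relies on a Hadoop codec factory to resolve the codec from the path.

```java
// Hypothetical sketch of the compression decision; not part of Presto or the AWS SDK.
final class SelectCompressionSketch {
    enum Compression { NONE, GZIP, BZIP2 }

    // Assumed examples of suffixes that a codec factory could recognise but that this reader would reject.
    private static final String[] REJECTED_SUFFIXES = {".snappy", ".lz4", ".deflate", ".zst"};

    /** Picks the compression type for a path, or throws if the suffix names a rejected codec. */
    static Compression fromPath(String path) {
        if (path.endsWith(".gz")) {
            return Compression.GZIP;
        }
        if (path.endsWith(".bz2")) {
            return Compression.BZIP2;
        }
        for (String suffix : REJECTED_SUFFIXES) {
            if (path.endsWith(suffix)) {
                throw new UnsupportedOperationException(
                        "Compression extension not supported for S3 Select: " + path);
            }
        }
        // No recognised compression suffix: treat the object as uncompressed.
        return Compression.NONE;
    }
}
```

Failing fast on a rejected codec, as the original does, avoids sending a request whose input serialization the service cannot decode.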

Example 5 with SelectObjectContentRequest

use of com.amazonaws.services.s3.model.SelectObjectContentRequest in project boostkit-bigdata by kunpengcompute.

the class PrestoS3SelectClient method getRecordsContent.

public InputStream getRecordsContent(SelectObjectContentRequest selectObjectRequest) {
    this.selectObjectRequest = requireNonNull(selectObjectRequest, "selectObjectRequest is null");
    this.selectObjectContentResult = s3Client.selectObjectContent(selectObjectRequest);
    return selectObjectContentResult.getPayload().getRecordsInputStream(new SelectObjectContentEventVisitor() {

        @Override
        public void visit(EndEvent endEvent) {
            requestComplete = true;
        }
    });
}
Also used : EndEvent(com.amazonaws.services.s3.model.SelectObjectContentEvent.EndEvent) SelectObjectContentEventVisitor(com.amazonaws.services.s3.model.SelectObjectContentEventVisitor)

Aggregations

SelectObjectContentRequest (com.amazonaws.services.s3.model.SelectObjectContentRequest) 9
InputSerialization (com.amazonaws.services.s3.model.InputSerialization) 6
OutputSerialization (com.amazonaws.services.s3.model.OutputSerialization) 6
SelectObjectContentEventVisitor (com.amazonaws.services.s3.model.SelectObjectContentEventVisitor) 6
AmazonS3Exception (com.amazonaws.services.s3.model.AmazonS3Exception) 5
CSVInput (com.amazonaws.services.s3.model.CSVInput) 5
CSVOutput (com.amazonaws.services.s3.model.CSVOutput) 5
EndEvent (com.amazonaws.services.s3.model.SelectObjectContentEvent.EndEvent) 5
IOException (java.io.IOException) 5
URI (java.net.URI) 5
LineReader (org.apache.hadoop.util.LineReader) 5
BZip2Codec (org.apache.hadoop.io.compress.BZip2Codec) 4
CompressionCodec (org.apache.hadoop.io.compress.CompressionCodec) 4
GzipCodec (org.apache.hadoop.io.compress.GzipCodec) 4
PrestoException (com.facebook.presto.spi.PrestoException) 2
PrestoException (io.prestosql.spi.PrestoException) 2
Configuration (org.apache.hadoop.conf.Configuration) 2
RequestContext (org.greenplum.pxf.api.model.RequestContext) 2
Test (org.junit.jupiter.api.Test) 2
InputSerialization (software.amazon.awssdk.services.s3.model.InputSerialization) 2