
Example 1 with ResponseInputStream

Use of software.amazon.awssdk.core.ResponseInputStream in project flyway by flyway.

In class AwsS3Resource, the read() method:

@Override
public Reader read() {
    S3Client s3 = S3ClientFactory.getClient();
    try {
        GetObjectRequest.Builder builder = GetObjectRequest.builder().bucket(bucketName).key(s3ObjectSummary.key());
        GetObjectRequest request = builder.build();
        // getObject returns a ResponseInputStream<GetObjectResponse> that streams the object's content
        ResponseInputStream<GetObjectResponse> responseInputStream = s3.getObject(request);
        return Channels.newReader(Channels.newChannel(responseInputStream), encoding.name());
    } catch (AwsServiceException e) {
        LOG.error(e.getMessage(), e);
        throw new FlywayException("Failed to get object from s3: " + e.getMessage(), e);
    }
}
Also used : FlywayException(org.flywaydb.core.api.FlywayException) ResponseInputStream(software.amazon.awssdk.core.ResponseInputStream) AwsServiceException(software.amazon.awssdk.awscore.exception.AwsServiceException) S3Client(software.amazon.awssdk.services.s3.S3Client) GetObjectRequest(software.amazon.awssdk.services.s3.model.GetObjectRequest)
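
For context, here is a minimal standalone sketch of the same GetObject-to-Reader pattern outside of Flyway. The bucket name and object key are hypothetical and it assumes credentials from the default AWS provider chain; the ResponseInputStream is wrapped in try-with-resources because it is backed by the open HTTP connection and should be closed promptly.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import software.amazon.awssdk.core.ResponseInputStream;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;

public class S3ReadSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical bucket and key, for illustration only
        String bucket = "my-bucket";
        String key = "db/migration/V1__init.sql";
        try (S3Client s3 = S3Client.create()) {
            GetObjectRequest request = GetObjectRequest.builder()
                    .bucket(bucket)
                    .key(key)
                    .build();
            // Close the response stream (and the reader over it) once reading finishes
            try (ResponseInputStream<GetObjectResponse> body = s3.getObject(request);
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(body, StandardCharsets.UTF_8))) {
                reader.lines().forEach(System.out::println);
            }
        }
    }
}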

Example 2 with ResponseInputStream

Use of software.amazon.awssdk.core.ResponseInputStream in project hazelcast by hazelcast.

In class S3Sources, the s3() method:

/**
 * Creates an AWS S3 {@link BatchSource} which lists the objects in the
 * given buckets that match the given {@code prefix}, reads them line by
 * line, transforms each line into the desired output object using the
 * given {@code mapFn} and emits the results downstream.
 * <p>
 * The source does not save any state to snapshot. If the job is restarted,
 * it will re-emit all entries.
 * <p>
 * The default local parallelism for this processor is 2.
 * <p>
 * Here is an example which reads the objects from the given buckets,
 * applying the given prefix:
 *
 * <pre>{@code
 * Pipeline p = Pipeline.create();
 * BatchStage<String> srcStage = p.readFrom(S3Sources.s3(
 *      Arrays.asList("bucket1", "bucket2"),
 *      "prefix",
 *      StandardCharsets.UTF_8,
 *      () -> S3Client.create(),
 *      (filename, line) -> line
 * ));
 * }</pre>
 *
 * @param bucketNames    list of bucket names
 * @param prefix         the prefix used to filter the objects. Optional;
 *                       passing {@code null} will list all objects
 * @param charset        the charset used to decode the objects' contents
 * @param clientSupplier function which returns the S3 client to use;
 *                       one client per processor instance is created
 * @param mapFn          the function which creates the output object from
 *                       each line. Gets the object name and the line as
 *                       parameters
 * @param <T>            the type of the items the source emits
 */
@Nonnull
public static <T> BatchSource<T> s3(
        @Nonnull List<String> bucketNames,
        @Nullable String prefix,
        @Nonnull Charset charset,
        @Nonnull SupplierEx<? extends S3Client> clientSupplier,
        @Nonnull BiFunctionEx<String, String, ? extends T> mapFn
) {
    // Capture the charset by name: Charset is not serializable, but the FunctionEx lambda must be
    String charsetName = charset.name();
    FunctionEx<InputStream, Stream<String>> readFileFn = responseInputStream -> {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(responseInputStream, Charset.forName(charsetName)));
        return reader.lines();
    };
    return s3(bucketNames, prefix, clientSupplier, readFileFn, mapFn);
}
Also used : Traverser(com.hazelcast.jet.Traverser) S3Object(software.amazon.awssdk.services.s3.model.S3Object) Traversers.traverseStream(com.hazelcast.jet.Traversers.traverseStream) GetObjectResponse(software.amazon.awssdk.services.s3.model.GetObjectResponse) BiFunctionEx(com.hazelcast.function.BiFunctionEx) Charset(java.nio.charset.Charset) Util.entry(com.hazelcast.jet.Util.entry) GetObjectRequest(software.amazon.awssdk.services.s3.model.GetObjectRequest) SourceBuffer(com.hazelcast.jet.pipeline.SourceBuilder.SourceBuffer) ResponseInputStream(software.amazon.awssdk.core.ResponseInputStream) Nonnull(javax.annotation.Nonnull) Nullable(javax.annotation.Nullable) FunctionEx(com.hazelcast.function.FunctionEx) BatchSource(com.hazelcast.jet.pipeline.BatchSource) Iterator(java.util.Iterator) S3Client(software.amazon.awssdk.services.s3.S3Client) UTF_8(java.nio.charset.StandardCharsets.UTF_8) InputStreamReader(java.io.InputStreamReader) SupplierEx(com.hazelcast.function.SupplierEx) StandardCharsets(java.nio.charset.StandardCharsets) List(java.util.List) Stream(java.util.stream.Stream) Context(com.hazelcast.jet.core.Processor.Context) Entry(java.util.Map.Entry) TriFunction(com.hazelcast.jet.function.TriFunction) BufferedReader(java.io.BufferedReader) SourceBuilder(com.hazelcast.jet.pipeline.SourceBuilder) InputStream(java.io.InputStream)
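
To show the source in context, here is a hypothetical complete job that submits a pipeline built on this source. It assumes Hazelcast 5.x with the hazelcast-jet-s3 module on the classpath and AWS credentials available to the default provider chain; the bucket name, prefix, and logger sink are placeholders.

import java.util.Collections;

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.s3.S3Sources;

import software.amazon.awssdk.services.s3.S3Client;

import static java.nio.charset.StandardCharsets.UTF_8;

public class S3SourceJob {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create();
        // Read every object under the (hypothetical) prefix and tag each line with its object name
        p.readFrom(S3Sources.s3(
                Collections.singletonList("my-bucket"),
                "logs/",
                UTF_8,
                S3Client::create,
                (objectName, line) -> objectName + ": " + line))
         .writeTo(Sinks.logger());

        HazelcastInstance hz = Hazelcast.bootstrappedInstance();
        hz.getJet().newJob(p).join();
    }
}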

Aggregations

ResponseInputStream (software.amazon.awssdk.core.ResponseInputStream) 2
S3Client (software.amazon.awssdk.services.s3.S3Client) 2
GetObjectRequest (software.amazon.awssdk.services.s3.model.GetObjectRequest) 2
BiFunctionEx (com.hazelcast.function.BiFunctionEx) 1
FunctionEx (com.hazelcast.function.FunctionEx) 1
SupplierEx (com.hazelcast.function.SupplierEx) 1
Traverser (com.hazelcast.jet.Traverser) 1
Traversers.traverseStream (com.hazelcast.jet.Traversers.traverseStream) 1
Util.entry (com.hazelcast.jet.Util.entry) 1
Context (com.hazelcast.jet.core.Processor.Context) 1
TriFunction (com.hazelcast.jet.function.TriFunction) 1
BatchSource (com.hazelcast.jet.pipeline.BatchSource) 1
SourceBuilder (com.hazelcast.jet.pipeline.SourceBuilder) 1
SourceBuffer (com.hazelcast.jet.pipeline.SourceBuilder.SourceBuffer) 1
BufferedReader (java.io.BufferedReader) 1
InputStream (java.io.InputStream) 1
InputStreamReader (java.io.InputStreamReader) 1
Charset (java.nio.charset.Charset) 1
StandardCharsets (java.nio.charset.StandardCharsets) 1
UTF_8 (java.nio.charset.StandardCharsets.UTF_8) 1