Example 6 with EncryptionKey

use of com.amazonaws.athena.connector.lambda.security.EncryptionKey in project aws-athena-query-federation by awslabs.

the class S3BlockSpiller method write.

/**
 * Writes (aka spills) a Block.
 */
protected SpillLocation write(Block block) {
    try {
        S3SpillLocation spillLocation = makeSpillLocation();
        EncryptionKey encryptionKey = spillConfig.getEncryptionKey();
        logger.info("write: Started encrypting block for write to {}", spillLocation);
        byte[] bytes = blockCrypto.encrypt(encryptionKey, block);
        totalBytesSpilled.addAndGet(bytes.length);
        logger.info("write: Started spilling block of size {} bytes", bytes.length);
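        // Set the content length up front; otherwise the S3 client has to buffer the stream to compute it.
        ObjectMetadata objectMetadata = new ObjectMetadata();
        objectMetadata.setContentLength(bytes.length);
        amazonS3.putObject(spillLocation.getBucket(), spillLocation.getKey(), new ByteArrayInputStream(bytes), objectMetadata);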
        logger.info("write: Completed spilling block of size {} bytes", bytes.length);
        return spillLocation;
    } catch (RuntimeException ex) {
        asyncException.compareAndSet(null, ex);
        logger.warn("write: Encountered error while writing block.", ex);
        throw ex;
    }
}
Also used : ByteArrayInputStream(java.io.ByteArrayInputStream) S3SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) EncryptionKey(com.amazonaws.athena.connector.lambda.security.EncryptionKey) ObjectMetadata(com.amazonaws.services.s3.model.ObjectMetadata)
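
The write method above follows an encrypt, count, store pattern and records the first failure for later inspection. A minimal, JDK-only sketch of that pattern might look like the following. It is not the SDK's implementation: SpillSketch, its in-memory store (a stand-in for S3), the AES-GCM layout with a fresh nonce prepended to each ciphertext, and the all-zero demo key are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the spill pattern; not the SDK's S3BlockSpiller.
final class SpillSketch {
    private static final int NONCE_BYTES = 12;
    private static final int TAG_BITS = 128;

    // Stand-in for the S3 bucket: object key -> stored bytes.
    final Map<String, byte[]> store = new ConcurrentHashMap<>();
    final AtomicLong totalBytesSpilled = new AtomicLong();
    // Keeps only the first failure so a later caller can surface it.
    final AtomicReference<RuntimeException> asyncException = new AtomicReference<>();

    private final SecretKeySpec key;
    private final SecureRandom random = new SecureRandom();

    SpillSketch(byte[] aesKey) {
        this.key = new SecretKeySpec(aesKey, "AES");
    }

    // Encrypts the block, counts the encrypted size, and stores it under the given location.
    String write(String location, byte[] block) {
        try {
            Objects.requireNonNull(block, "block");
            byte[] bytes = encrypt(block);
            totalBytesSpilled.addAndGet(bytes.length);
            store.put(location, bytes);
            return location;
        } catch (RuntimeException ex) {
            asyncException.compareAndSet(null, ex);
            throw ex;
        }
    }

    // Reads a stored block back: the first 12 bytes are the nonce, the rest is ciphertext plus tag.
    byte[] read(String location) {
        byte[] stored = store.get(location);
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, NONCE_BYTES));
            return cipher.doFinal(stored, NONCE_BYTES, stored.length - NONCE_BYTES);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("decrypt failed", ex);
        }
    }

    // Uses a fresh random nonce per call and prepends it to the output.
    private byte[] encrypt(byte[] plain) {
        byte[] nonce = new byte[NONCE_BYTES];
        random.nextBytes(nonce);
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce));
            byte[] sealed = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(nonce, NONCE_BYTES + sealed.length);
            System.arraycopy(sealed, 0, out, NONCE_BYTES, sealed.length);
            return out;
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("encrypt failed", ex);
        }
    }

    public static void main(String[] args) {
        // All-zero key for demonstration only; never use a fixed key in practice.
        SpillSketch spiller = new SpillSketch(new byte[16]);
        String location = spiller.write("spill/block-0", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(spiller.read(location), StandardCharsets.UTF_8));
        System.out.println(spiller.totalBytesSpilled.get());
    }
}
```

As in the example above, the counter tracks the encrypted size rather than the plaintext size, so it reflects what actually lands in storage.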

Example 7 with EncryptionKey

use of com.amazonaws.athena.connector.lambda.security.EncryptionKey in project foundry-athena-query-federation-connector by palantir.

the class FoundryMetadataHandler method doGetSplits.

@Override
public GetSplitsResponse doGetSplits(BlockAllocator _allocator, GetSplitsRequest request) {
    log.debug("Getting splits with constraints: {}", request.getConstraints());
    // can be shared across
    EncryptionKey encryptionKey = makeEncryptionKey();
    return splitsFetcher.getSplits(request, () -> makeSpillLocation(request), encryptionKey);
}
Also used : EncryptionKey(com.amazonaws.athena.connector.lambda.security.EncryptionKey)

Example 8 with EncryptionKey

use of com.amazonaws.athena.connector.lambda.security.EncryptionKey in project foundry-athena-query-federation-connector by palantir.

the class S3Spiller method write.

private SpillLocation write(Block block) {
    try {
        S3SpillLocation spillLocation = makeSpillLocation();
        EncryptionKey encryptionKey = spillConfig.getEncryptionKey();
        log.info("write: Started encrypting block for write to {}", spillLocation);
        byte[] bytes = blockCrypto.encrypt(encryptionKey, block);
        totalBytesSpilled.addAndGet(bytes.length);
        log.info("write: Started spilling block of size {} bytes", bytes.length);
        ObjectMetadata objectMetadata = new ObjectMetadata();
        objectMetadata.setContentLength(bytes.length);
        amazonS3.putObject(spillLocation.getBucket(), spillLocation.getKey(), new ByteArrayInputStream(bytes), objectMetadata);
        log.info("write: Completed spilling block of size {} bytes", bytes.length);
        return spillLocation;
    } catch (RuntimeException ex) {
        asyncException.compareAndSet(null, ex);
        log.warn("write: Encountered error while writing block.", ex);
        throw ex;
    }
}
Also used : SafeRuntimeException(com.palantir.logsafe.exceptions.SafeRuntimeException) ByteArrayInputStream(java.io.ByteArrayInputStream) S3SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) EncryptionKey(com.amazonaws.athena.connector.lambda.security.EncryptionKey) ObjectMetadata(com.amazonaws.services.s3.model.ObjectMetadata)

Example 9 with EncryptionKey

use of com.amazonaws.athena.connector.lambda.security.EncryptionKey in project foundry-athena-query-federation-connector by palantir.

the class SplitsFetcher method getSplits.

GetSplitsResponse getSplits(GetSplitsRequest request, SpillLocationFactory spillLocationFactory, EncryptionKey encryptionKey) {
    CatalogLocator locator = FoundryAthenaObjectMapper.objectMapper().convertValue(request.getSchema().getCustomMetadata(), CatalogLocator.class);
    Optional<Filter> filter;
    if (request.getConstraints().getSummary().isEmpty()) {
        filter = Optional.empty();
    } else {
        // we just push down all constraints which will include those for any partition columns
        filter = Optional.of(Filter.and(AndFilter.of(request.getConstraints().getSummary().entrySet().stream()
                .map(entry -> ConstraintConverter.convert(entry.getKey(), entry.getValue()))
                .collect(Collectors.toList()))));
    }
    Set<Split> splits = new HashSet<>();
    Optional<String> pageToken = Optional.empty();
    while (true) {
        GetSlicesResponse response = metadataService.getSlices(authProvider.getAuthHeader(), GetSlicesRequest.builder().locator(locator).filter(filter).nextPageToken(pageToken).build());
        splits.addAll(response.getSlices().stream().map(slice -> slices.toSplit(spillLocationFactory.makeSpillLocation(), encryptionKey, slice)).collect(Collectors.toSet()));
        if (response.getNextPageToken().isPresent()) {
            pageToken = response.getNextPageToken();
        } else {
            log.debug("finished planning splits. number of splits: {}", splits.size());
            return new GetSplitsResponse(request.getCatalogName(), splits);
        }
    }
}
Also used : CatalogLocator(com.palantir.foundry.athena.api.CatalogLocator) GetSplitsResponse(com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse) Filter(com.palantir.foundry.athena.api.Filter) Logger(org.slf4j.Logger) Split(com.amazonaws.athena.connector.lambda.domain.Split) LoggerFactory(org.slf4j.LoggerFactory) GetSlicesRequest(com.palantir.foundry.athena.api.GetSlicesRequest) Set(java.util.Set) SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation) GetSlicesResponse(com.palantir.foundry.athena.api.GetSlicesResponse) Collectors(java.util.stream.Collectors) AndFilter(com.palantir.foundry.athena.api.AndFilter) HashSet(java.util.HashSet) FoundryAthenaMetadataServiceBlocking(com.palantir.foundry.athena.api.FoundryAthenaMetadataServiceBlocking) EncryptionKey(com.amazonaws.athena.connector.lambda.security.EncryptionKey) Optional(java.util.Optional) GetSplitsRequest(com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest)
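
The loop in getSplits keeps requesting pages until the response carries no next-page token. The same token-driven pagination can be sketched in isolation as follows. PagingSketch, Page, fetchAll, and fakeService are hypothetical names introduced for illustration and are not part of the Foundry or Athena APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of token-based pagination; not the connector's code.
final class PagingSketch {
    // One page of results plus an optional token pointing at the next page.
    static final class Page<T> {
        final List<T> items;
        final Optional<String> nextPageToken;

        Page(List<T> items, Optional<String> nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    // Starts with an empty token and keeps fetching until a page has no next token.
    static <T> List<T> fetchAll(Function<Optional<String>, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        Optional<String> pageToken = Optional.empty();
        while (true) {
            Page<T> page = fetchPage.apply(pageToken);
            all.addAll(page.items);
            if (!page.nextPageToken.isPresent()) {
                return all;
            }
            pageToken = page.nextPageToken;
        }
    }

    // Fake three-page service used only to exercise the loop.
    static Page<Integer> fakeService(Optional<String> token) {
        if (!token.isPresent()) {
            return new Page<>(Arrays.asList(1, 2), Optional.of("p2"));
        }
        if (token.get().equals("p2")) {
            return new Page<>(Arrays.asList(3), Optional.of("p3"));
        }
        return new Page<>(Arrays.asList(4, 5), Optional.empty());
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(PagingSketch::fakeService)); // prints [1, 2, 3, 4, 5]
    }
}
```

The first request is always made with an empty token, so a service that returns a single page terminates after one call.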

Example 10 with EncryptionKey

use of com.amazonaws.athena.connector.lambda.security.EncryptionKey in project aws-athena-query-federation by awslabs.

the class ExampleRecordHandlerTest method doReadRecordsSpill.

@Test
public void doReadRecordsSpill() throws Exception {
    logger.info("doReadRecordsSpill: enter");
    for (int i = 0; i < 2; i++) {
        EncryptionKey encryptionKey = (i % 2 == 0) ? keyFactory.create() : null;
        logger.info("doReadRecordsSpill: Using encryptionKey[{}]", encryptionKey);
        Map<String, ValueSet> constraintsMap = new HashMap<>();
        constraintsMap.put("col3", SortedRangeSet.copyOf(Types.MinorType.FLOAT8.getType(), ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.FLOAT8.getType(), -10000D)), false));
        constraintsMap.put("unknown", EquatableValueSet.newBuilder(allocator, Types.MinorType.FLOAT8.getType(), false, true).add(1.1D).build());
        constraintsMap.put("unknown2", new AllOrNoneValueSet(Types.MinorType.FLOAT8.getType(), false, true));
        ReadRecordsRequest request = new ReadRecordsRequest(
                IdentityUtil.fakeIdentity(),
                "catalog",
                "queryId-" + System.currentTimeMillis(),
                new TableName("schema", "table"),
                schemaForRead,
                Split.newBuilder(makeSpillLocation(), encryptionKey).add("year", "10").add("month", "10").add("day", "10").build(),
                new Constraints(constraintsMap),
                // ~1.5MB so we should see some spill
                1_600_000L,
                1000L);
        ObjectMapperUtil.assertSerialization(request);
        RecordResponse rawResponse = recordService.readRecords(request);
        ObjectMapperUtil.assertSerialization(rawResponse);
        assertTrue(rawResponse instanceof RemoteReadRecordsResponse);
        try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) {
            logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size());
            assertTrue(response.getNumberBlocks() > 1);
            int blockNum = 0;
            for (SpillLocation next : response.getRemoteBlocks()) {
                S3SpillLocation spillLocation = (S3SpillLocation) next;
                try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) {
                    logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount());
                    // assertTrue(++blockNum < response.getRemoteBlocks().size() && block.getRowCount() > 10_000);
                    logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0));
                    assertNotNull(BlockUtils.rowToString(block, 0));
                }
            }
        }
    }
    logger.info("doReadRecordsSpill: exit");
}
Also used : RemoteReadRecordsResponse(com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse) SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation) S3SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) HashMap(java.util.HashMap) AllOrNoneValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.AllOrNoneValueSet) EncryptionKey(com.amazonaws.athena.connector.lambda.security.EncryptionKey) Matchers.anyString(org.mockito.Matchers.anyString) RecordResponse(com.amazonaws.athena.connector.lambda.records.RecordResponse) TableName(com.amazonaws.athena.connector.lambda.domain.TableName) ReadRecordsRequest(com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest) Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) Block(com.amazonaws.athena.connector.lambda.data.Block) ValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) EquatableValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet) Test(org.junit.Test)

Aggregations

EncryptionKey (com.amazonaws.athena.connector.lambda.security.EncryptionKey) 11
S3SpillLocation (com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) 6
SpillLocation (com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation) 6
Split (com.amazonaws.athena.connector.lambda.domain.Split) 5
GetSplitsResponse (com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse) 4
Block (com.amazonaws.athena.connector.lambda.data.Block) 3
TableName (com.amazonaws.athena.connector.lambda.domain.TableName) 3
AllOrNoneValueSet (com.amazonaws.athena.connector.lambda.domain.predicate.AllOrNoneValueSet) 3
Constraints (com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) 3
EquatableValueSet (com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet) 3
ValueSet (com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) 3
ReadRecordsRequest (com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest) 3
RemoteReadRecordsResponse (com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse) 3
HashMap (java.util.HashMap) 3
Before (org.junit.Before) 3
RecordResponse (com.amazonaws.athena.connector.lambda.records.RecordResponse) 2
ObjectMetadata (com.amazonaws.services.s3.model.ObjectMetadata) 2
ByteArrayInputStream (java.io.ByteArrayInputStream) 2
HashSet (java.util.HashSet) 2
ArrowType (org.apache.arrow.vector.types.pojo.ArrowType) 2