
Example 11 with RemoteReadRecordsResponse

Use of com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse in project aws-athena-query-federation by awslabs.

From the class ElasticsearchRecordHandlerTest, method doReadRecordsSpill.

@Test
public void doReadRecordsSpill() throws Exception {
    logger.info("doReadRecordsSpill: enter");
    int batchSize = handler.getQueryBatchSize();
    SearchHit[] searchHit1 = new SearchHit[batchSize];
    for (int i = 0; i < batchSize; ++i) {
        searchHit1[i] = new SearchHit(i + 1);
    }
    SearchHit[] searchHit2 = new SearchHit[2];
    searchHit2[0] = new SearchHit(batchSize + 1);
    searchHit2[1] = new SearchHit(batchSize + 2);
    SearchHits searchHits1 = new SearchHits(searchHit1, new TotalHits(batchSize, TotalHits.Relation.EQUAL_TO), 4);
    SearchHits searchHits2 = new SearchHits(searchHit2, new TotalHits(2, TotalHits.Relation.EQUAL_TO), 4);
    when(mockResponse.getHits()).thenReturn(searchHits1, searchHits1, searchHits2, searchHits2);
    Map<String, ValueSet> constraintsMap = new HashMap<>();
    constraintsMap.put("myshort", SortedRangeSet.copyOf(Types.MinorType.SMALLINT.getType(), ImmutableList.of(Range.range(allocator, Types.MinorType.SMALLINT.getType(), (short) 1955, false, (short) 1972, true)), false));
    ReadRecordsRequest request = new ReadRecordsRequest(fakeIdentity(), "elasticsearch", "queryId-" + System.currentTimeMillis(), new TableName("movies", "mishmash"), mapping, split, new Constraints(constraintsMap),
            10_000L, // 10KB: expect this to spill
            0L);
    RecordResponse rawResponse = handler.doReadRecords(allocator, request);
    assertTrue(rawResponse instanceof RemoteReadRecordsResponse);
    try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) {
        logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size());
        assertEquals(3, response.getNumberBlocks());
        int blockNum = 0;
        for (SpillLocation next : response.getRemoteBlocks()) {
            S3SpillLocation spillLocation = (S3SpillLocation) next;
            try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) {
                logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount());
                logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0));
                assertNotNull(BlockUtils.rowToString(block, 0));
            }
        }
    }
    logger.info("doReadRecordsSpill: exit");
}
Also used : TotalHits(org.apache.lucene.search.TotalHits) RemoteReadRecordsResponse(com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse) SearchHit(org.elasticsearch.search.SearchHit) SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation) S3SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) HashMap(java.util.HashMap) Mockito.anyString(org.mockito.Mockito.anyString) RecordResponse(com.amazonaws.athena.connector.lambda.records.RecordResponse) TableName(com.amazonaws.athena.connector.lambda.domain.TableName) ReadRecordsRequest(com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest) Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) Block(com.amazonaws.athena.connector.lambda.data.Block) SearchHits(org.elasticsearch.search.SearchHits) ValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) Test(org.junit.Test)
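
The constraint in the test above passes two booleans alongside the bounds 1955 and 1972. Assuming they mark whether the low and high bounds are inclusive (so the predicate reads 1955 < myshort <= 1972), the filter can be sketched in plain Java as follows. RangeSketch and inRange are hypothetical names used only for illustration; they are not part of the Athena Query Federation SDK.

```java
public class RangeSketch {
    // Hypothetical helper (not an SDK method): true when value lies between
    // low and high, honoring the inclusiveness flag on each bound.
    public static boolean inRange(short value, short low, boolean lowInclusive,
                                  short high, boolean highInclusive) {
        boolean aboveLow = lowInclusive ? value >= low : value > low;
        boolean belowHigh = highInclusive ? value <= high : value < high;
        return aboveLow && belowHigh;
    }

    public static void main(String[] args) {
        short low = 1955;
        short high = 1972;
        // Same flags as the test: low bound exclusive (false), high bound inclusive (true).
        for (short year : new short[] {1955, 1956, 1972, 1973}) {
            System.out.println(year + " -> " + inRange(year, low, false, high, true));
        }
    }
}
```

Under that assumption, a row with myshort equal to 1955 would be filtered out, while a row with 1972 would be kept.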

Example 12 with RemoteReadRecordsResponse

Use of com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse in project aws-athena-query-federation by awslabs.

From the class RecordHandler, method doReadRecords.

/**
 * Used to read the row data associated with the provided Split.
 *
 * @param allocator Tool for creating and managing Apache Arrow Blocks.
 * @param request Details of the read request, including:
 * 1. The Split
 * 2. The Catalog, Database, and Table the read request is for.
 * 3. The filtering predicate (if any)
 * 4. The columns required for projection.
 * @return A RecordResponse which is either a ReadRecordsResponse or a RemoteReadRecordsResponse containing the row
 * data for the requested Split.
 */
public RecordResponse doReadRecords(BlockAllocator allocator, ReadRecordsRequest request) throws Exception {
    logger.info("doReadRecords: {}:{}", request.getSchema(), request.getSplit().getSpillLocation());
    SpillConfig spillConfig = getSpillConfig(request);
    try (ConstraintEvaluator evaluator = new ConstraintEvaluator(allocator, request.getSchema(), request.getConstraints());
        S3BlockSpiller spiller = new S3BlockSpiller(amazonS3, spillConfig, allocator, request.getSchema(), evaluator);
        QueryStatusChecker queryStatusChecker = new QueryStatusChecker(athena, athenaInvoker, request.getQueryId())) {
        readWithConstraint(spiller, request, queryStatusChecker);
        if (!spiller.spilled()) {
            return new ReadRecordsResponse(request.getCatalogName(), spiller.getBlock());
        } else {
            return new RemoteReadRecordsResponse(request.getCatalogName(), request.getSchema(), spiller.getSpillLocations(), spillConfig.getEncryptionKey());
        }
    }
}
Also used : SpillConfig(com.amazonaws.athena.connector.lambda.data.SpillConfig) QueryStatusChecker(com.amazonaws.athena.connector.lambda.QueryStatusChecker) RemoteReadRecordsResponse(com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse) ReadRecordsResponse(com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse) S3BlockSpiller(com.amazonaws.athena.connector.lambda.data.S3BlockSpiller) ConstraintEvaluator(com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator)
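
The method above returns an inline response when nothing spilled and a remote response otherwise. The following is a minimal, self-contained sketch of that branching only. Every type in it (Response, InlineResponse, RemoteResponse, Spiller) is a hypothetical stand-in rather than the SDK class of a similar name, and using String length as the row size is an assumption made to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpillSketch {
    // Hypothetical stand-ins for the two response kinds.
    interface Response {}

    static final class InlineResponse implements Response {
        final List<String> rows;
        InlineResponse(List<String> rows) { this.rows = rows; }
    }

    static final class RemoteResponse implements Response {
        final List<List<String>> blocks;
        RemoteResponse(List<List<String>> blocks) { this.blocks = blocks; }
    }

    // Hypothetical spiller: buffers rows and sets a block aside ("spills" it)
    // whenever adding the next row would exceed the size budget.
    static final class Spiller implements AutoCloseable {
        private final long maxBlockSize;
        private final List<List<String>> spilledBlocks = new ArrayList<>();
        private List<String> current = new ArrayList<>();
        private long currentSize = 0;

        Spiller(long maxBlockSize) { this.maxBlockSize = maxBlockSize; }

        void write(String row) {
            if (!current.isEmpty() && currentSize + row.length() > maxBlockSize) {
                spilledBlocks.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(row);
            currentSize += row.length();
        }

        boolean spilled() { return !spilledBlocks.isEmpty(); }

        List<String> getBlock() { return current; }

        // All spilled blocks, plus the partially filled block if it holds rows.
        List<List<String>> getAllBlocks() {
            List<List<String>> all = new ArrayList<>(spilledBlocks);
            if (!current.isEmpty()) {
                all.add(current);
            }
            return all;
        }

        @Override
        public void close() {}
    }

    // Mirrors the shape of the branch: inline if nothing spilled, remote otherwise.
    static Response read(List<String> rows, long maxBlockSize) {
        try (Spiller spiller = new Spiller(maxBlockSize)) {
            for (String row : rows) {
                spiller.write(row);
            }
            if (!spiller.spilled()) {
                return new InlineResponse(spiller.getBlock());
            }
            return new RemoteResponse(spiller.getAllBlocks());
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("aaaa", "bbbb", "cccc");
        System.out.println(read(rows, 100L).getClass().getSimpleName());
        System.out.println(read(rows, 8L).getClass().getSimpleName());
    }
}
```

A caller of such a method would check the concrete type of the result, much as the test in Example 11 asserts that the raw response is a RemoteReadRecordsResponse before reading its blocks.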

Aggregations

RemoteReadRecordsResponse (com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse) 12
Block (com.amazonaws.athena.connector.lambda.data.Block) 9
S3SpillLocation (com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) 9
SpillLocation (com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation) 9
Test (org.junit.Test) 9
Constraints (com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) 8
ValueSet (com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) 8
ReadRecordsRequest (com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest) 8
RecordResponse (com.amazonaws.athena.connector.lambda.records.RecordResponse) 8
HashMap (java.util.HashMap) 8
Matchers.anyString (org.mockito.Matchers.anyString) 7
TableName (com.amazonaws.athena.connector.lambda.domain.TableName) 6
EquatableValueSet (com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet) 4
ReadRecordsResponse (com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse) 3
InputStream (java.io.InputStream) 3
InvocationOnMock (org.mockito.invocation.InvocationOnMock) 3
QueryStatusChecker (com.amazonaws.athena.connector.lambda.QueryStatusChecker) 2
BlockAllocatorImpl (com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl) 2
SpillConfig (com.amazonaws.athena.connector.lambda.data.SpillConfig) 2
Split (com.amazonaws.athena.connector.lambda.domain.Split) 2