Search in sources :

Example 81 with ValueSet

use of com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet in project aws-athena-query-federation by awslabs.

the class ElasticsearchRecordHandlerTest method doReadRecordsSpill.

@Test
public void doReadRecordsSpill() throws Exception {
    logger.info("doReadRecordsSpill: enter");
    int batchSize = handler.getQueryBatchSize();
    SearchHit[] searchHit1 = new SearchHit[batchSize];
    for (int i = 0; i < batchSize; ++i) {
        searchHit1[i] = new SearchHit(i + 1);
    }
    SearchHit[] searchHit2 = new SearchHit[2];
    searchHit2[0] = new SearchHit(batchSize + 1);
    searchHit2[1] = new SearchHit(batchSize + 2);
    SearchHits searchHits1 = new SearchHits(searchHit1, new TotalHits(batchSize, TotalHits.Relation.EQUAL_TO), 4);
    SearchHits searchHits2 = new SearchHits(searchHit2, new TotalHits(2, TotalHits.Relation.EQUAL_TO), 4);
    when(mockResponse.getHits()).thenReturn(searchHits1, searchHits1, searchHits2, searchHits2);
    Map<String, ValueSet> constraintsMap = new HashMap<>();
    constraintsMap.put("myshort", SortedRangeSet.copyOf(Types.MinorType.SMALLINT.getType(), ImmutableList.of(Range.range(allocator, Types.MinorType.SMALLINT.getType(), (short) 1955, false, (short) 1972, true)), false));
    ReadRecordsRequest request = new ReadRecordsRequest(fakeIdentity(), "elasticsearch", "queryId-" + System.currentTimeMillis(), new TableName("movies", "mishmash"), mapping, split, new Constraints(constraintsMap), // 10KB Expect this to spill
    10_000L, 0L);
    RecordResponse rawResponse = handler.doReadRecords(allocator, request);
    assertTrue(rawResponse instanceof RemoteReadRecordsResponse);
    try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) {
        logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size());
        assertEquals(3, response.getNumberBlocks());
        int blockNum = 0;
        for (SpillLocation next : response.getRemoteBlocks()) {
            S3SpillLocation spillLocation = (S3SpillLocation) next;
            try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) {
                logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount());
                logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0));
                assertNotNull(BlockUtils.rowToString(block, 0));
            }
        }
    }
    logger.info("doReadRecordsSpill: exit");
}
Also used : TotalHits(org.apache.lucene.search.TotalHits) RemoteReadRecordsResponse(com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse) SearchHit(org.elasticsearch.search.SearchHit) SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation) S3SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) HashMap(java.util.HashMap) Mockito.anyString(org.mockito.Mockito.anyString) RecordResponse(com.amazonaws.athena.connector.lambda.records.RecordResponse) TableName(com.amazonaws.athena.connector.lambda.domain.TableName) ReadRecordsRequest(com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest) Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) S3SpillLocation(com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation) Block(com.amazonaws.athena.connector.lambda.data.Block) SearchHits(org.elasticsearch.search.SearchHits) ValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) Test(org.junit.Test)

Example 82 with ValueSet

use of com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet in project aws-athena-query-federation by awslabs.

the class RouteTableProvider method readWithConstraint.

/**
 * Calls DescribeRouteTables on the AWS EC2 Client returning all Routes that match the supplied predicate and attempting
 * to push down certain predicates (namely queries for specific RoutingTables) to EC2.
 *
 * @See TableProvider
 */
@Override
public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) {
    boolean done = false;
    DescribeRouteTablesRequest request = new DescribeRouteTablesRequest();
    ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("route_table_id");
    if (idConstraint != null && idConstraint.isSingleValue()) {
        request.setRouteTableIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
    }
    while (!done) {
        DescribeRouteTablesResult response = ec2.describeRouteTables(request);
        for (RouteTable nextRouteTable : response.getRouteTables()) {
            for (Route route : nextRouteTable.getRoutes()) {
                instanceToRow(nextRouteTable, route, spiller);
            }
        }
        request.setNextToken(response.getNextToken());
        if (response.getNextToken() == null || !queryStatusChecker.isQueryRunning()) {
            done = true;
        }
    }
}
Also used : RouteTable(com.amazonaws.services.ec2.model.RouteTable) DescribeRouteTablesRequest(com.amazonaws.services.ec2.model.DescribeRouteTablesRequest) DescribeRouteTablesResult(com.amazonaws.services.ec2.model.DescribeRouteTablesResult) ValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) Route(com.amazonaws.services.ec2.model.Route)

Example 83 with ValueSet

use of com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet in project aws-athena-query-federation by awslabs.

the class SubnetTableProvider method readWithConstraint.

/**
 * Calls DescribeSubnets on the AWS EC2 Client returning all subnets that match the supplied predicate and attempting
 * to push down certain predicates (namely queries for specific subnet) to EC2.
 *
 * @See TableProvider
 */
@Override
public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) {
    DescribeSubnetsRequest request = new DescribeSubnetsRequest();
    ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("id");
    if (idConstraint != null && idConstraint.isSingleValue()) {
        request.setSubnetIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
    }
    DescribeSubnetsResult response = ec2.describeSubnets(request);
    for (Subnet subnet : response.getSubnets()) {
        instanceToRow(subnet, spiller);
    }
}
Also used : Subnet(com.amazonaws.services.ec2.model.Subnet) ValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) DescribeSubnetsResult(com.amazonaws.services.ec2.model.DescribeSubnetsResult) DescribeSubnetsRequest(com.amazonaws.services.ec2.model.DescribeSubnetsRequest)

Example 84 with ValueSet

use of com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet in project aws-athena-query-federation by awslabs.

the class S3ObjectsTableProvider method readWithConstraint.

/**
 * Calls DescribeDBInstances on the AWS RDS Client returning all DB Instances that match the supplied predicate and attempting
 * to push down certain predicates (namely queries for specific DB Instance) to EC2.
 *
 * @See TableProvider
 */
@Override
public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) {
    ValueSet bucketConstraint = recordsRequest.getConstraints().getSummary().get("bucket_name");
    String bucket;
    if (bucketConstraint != null && bucketConstraint.isSingleValue()) {
        bucket = bucketConstraint.getSingleValue().toString();
    } else {
        throw new IllegalArgumentException("Queries against the objects table must filter on a single bucket " + "(e.g. where bucket_name='my_bucket'.");
    }
    ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucket).withMaxKeys(MAX_KEYS);
    ListObjectsV2Result result;
    do {
        result = amazonS3.listObjectsV2(req);
        for (S3ObjectSummary objectSummary : result.getObjectSummaries()) {
            toRow(objectSummary, spiller);
        }
        req.setContinuationToken(result.getNextContinuationToken());
    } while (result.isTruncated() && queryStatusChecker.isQueryRunning());
}
Also used : ListObjectsV2Request(com.amazonaws.services.s3.model.ListObjectsV2Request) ListObjectsV2Result(com.amazonaws.services.s3.model.ListObjectsV2Result) S3ObjectSummary(com.amazonaws.services.s3.model.S3ObjectSummary) ValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet)

Example 85 with ValueSet

use of com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet in project aws-athena-query-federation by awslabs.

the class HiveRecordHandlerTest method buildSplitSql.

@Test
public void buildSplitSql() throws SQLException {
    TableName tableName = new TableName("testSchema", "testTable");
    SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
    schemaBuilder.addField(FieldBuilder.newBuilder("testCol1", Types.MinorType.INT.getType()).build());
    schemaBuilder.addField(FieldBuilder.newBuilder("testCol2", Types.MinorType.DATEDAY.getType()).build());
    schemaBuilder.addField(FieldBuilder.newBuilder("testCol3", Types.MinorType.DATEMILLI.getType()).build());
    schemaBuilder.addField(FieldBuilder.newBuilder("testCol4", Types.MinorType.VARBINARY.getType()).build());
    schemaBuilder.addField(FieldBuilder.newBuilder("partition", Types.MinorType.VARCHAR.getType()).build());
    Schema schema = schemaBuilder.build();
    Split split = Mockito.mock(Split.class);
    Mockito.when(split.getProperties()).thenReturn(Collections.singletonMap("partition", "p0"));
    Mockito.when(split.getProperty(Mockito.eq("partition"))).thenReturn("p0");
    Range range1a = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS);
    Mockito.when(range1a.isSingleValue()).thenReturn(true);
    Mockito.when(range1a.getLow().getValue()).thenReturn(1);
    Range range1b = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS);
    Mockito.when(range1b.isSingleValue()).thenReturn(true);
    Mockito.when(range1b.getLow().getValue()).thenReturn(2);
    ValueSet valueSet1 = Mockito.mock(SortedRangeSet.class, Mockito.RETURNS_DEEP_STUBS);
    Mockito.when(valueSet1.getRanges().getOrderedRanges()).thenReturn(ImmutableList.of(range1a, range1b));
    final long dateDays = TimeUnit.DAYS.toDays(Date.valueOf("2020-01-05").getTime());
    ValueSet valueSet2 = getSingleValueSet(dateDays);
    Constraints constraints = Mockito.mock(Constraints.class);
    Mockito.when(constraints.getSummary()).thenReturn(new ImmutableMap.Builder<String, ValueSet>().put("testCol2", valueSet2).build());
    PreparedStatement expectedPreparedStatement = Mockito.mock(PreparedStatement.class);
    Mockito.when(this.connection.prepareStatement(Mockito.anyString())).thenReturn(expectedPreparedStatement);
    PreparedStatement preparedStatement = this.hiveRecordHandler.buildSplitSql(this.connection, "testCatalogName", tableName, schema, constraints, split);
    Assert.assertEquals(expectedPreparedStatement, preparedStatement);
}
Also used : TableName(com.amazonaws.athena.connector.lambda.domain.TableName) Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) Schema(org.apache.arrow.vector.types.pojo.Schema) SchemaBuilder(com.amazonaws.athena.connector.lambda.data.SchemaBuilder) PreparedStatement(java.sql.PreparedStatement) Split(com.amazonaws.athena.connector.lambda.domain.Split) Range(com.amazonaws.athena.connector.lambda.domain.predicate.Range) ValueSet(com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet) ImmutableMap(com.google.common.collect.ImmutableMap) Test(org.junit.Test)

Aggregations

ValueSet (com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet)104 Test (org.junit.Test)66 Constraints (com.amazonaws.athena.connector.lambda.domain.predicate.Constraints)63 HashMap (java.util.HashMap)48 TableName (com.amazonaws.athena.connector.lambda.domain.TableName)47 Schema (org.apache.arrow.vector.types.pojo.Schema)37 Split (com.amazonaws.athena.connector.lambda.domain.Split)31 Range (com.amazonaws.athena.connector.lambda.domain.predicate.Range)27 ReadRecordsRequest (com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest)27 EquatableValueSet (com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet)26 ArrayList (java.util.ArrayList)25 Matchers.anyString (org.mockito.Matchers.anyString)25 RecordResponse (com.amazonaws.athena.connector.lambda.records.RecordResponse)24 Block (com.amazonaws.athena.connector.lambda.data.Block)23 S3SpillLocation (com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation)21 RemoteReadRecordsResponse (com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse)18 SchemaBuilder (com.amazonaws.athena.connector.lambda.data.SchemaBuilder)17 ReadRecordsResponse (com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse)17 InvocationOnMock (org.mockito.invocation.InvocationOnMock)17 BlockAllocatorImpl (com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl)13