Example 21 with GetSplitsRequest

use of com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest in project aws-athena-query-federation by awslabs.

the class HiveMuxMetadataHandlerTest method doGetSplits.

@Test
public void doGetSplits() {
    GetSplitsRequest getSplitsRequest = Mockito.mock(GetSplitsRequest.class);
    Mockito.when(getSplitsRequest.getCatalogName()).thenReturn("metaHive");
    this.jdbcMetadataHandler.doGetSplits(this.allocator, getSplitsRequest);
    Mockito.verify(this.hiveMetadataHandler, Mockito.times(1)).doGetSplits(Mockito.eq(this.allocator), Mockito.eq(getSplitsRequest));
}
Also used : GetSplitsRequest(com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest) Test(org.junit.Test)

Example 22 with GetSplitsRequest

use of com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest in project aws-athena-query-federation by awslabs.

the class CloudwatchMetadataHandlerTest method doGetSplits.

@Test
public void doGetSplits() {
    logger.info("doGetSplits: enter");
    Schema schema = SchemaBuilder.newBuilder().addField(CloudwatchMetadataHandler.LOG_STREAM_FIELD, new ArrowType.Utf8()).addField(CloudwatchMetadataHandler.LOG_STREAM_SIZE_FIELD, new ArrowType.Int(64, true)).addField(CloudwatchMetadataHandler.LOG_GROUP_FIELD, new ArrowType.Utf8()).build();
    Block partitions = allocator.createBlock(schema);
    int numPartitions = 2_000;
    for (int i = 0; i < numPartitions; i++) {
        BlockUtils.setValue(partitions.getFieldVector(CloudwatchMetadataHandler.LOG_STREAM_SIZE_FIELD), i, 2016L + i);
        BlockUtils.setValue(partitions.getFieldVector(CloudwatchMetadataHandler.LOG_STREAM_FIELD), i, "log_stream_" + i);
        BlockUtils.setValue(partitions.getFieldVector(CloudwatchMetadataHandler.LOG_GROUP_FIELD), i, "log_group_" + i);
    }
    partitions.setRowCount(numPartitions);
    String continuationToken = null;
    GetSplitsRequest originalReq = new GetSplitsRequest(identity, "queryId", "catalog_name", new TableName("schema", "all_log_streams"), partitions, Collections.singletonList(CloudwatchMetadataHandler.LOG_STREAM_FIELD), new Constraints(new HashMap<>()), continuationToken);
    int numContinuations = 0;
    do {
        GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken);
        logger.info("doGetSplits: req[{}]", req);
        MetadataResponse rawResponse = handler.doGetSplits(allocator, req);
        assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType());
        GetSplitsResponse response = (GetSplitsResponse) rawResponse;
        continuationToken = response.getContinuationToken();
        logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size());
        for (Split nextSplit : response.getSplits()) {
            assertNotNull(nextSplit.getProperty(CloudwatchMetadataHandler.LOG_STREAM_SIZE_FIELD));
            assertNotNull(nextSplit.getProperty(CloudwatchMetadataHandler.LOG_STREAM_FIELD));
            assertNotNull(nextSplit.getProperty(CloudwatchMetadataHandler.LOG_GROUP_FIELD));
        }
        if (continuationToken != null) {
            numContinuations++;
        }
    } while (continuationToken != null);
    assertTrue(numContinuations > 0);
    logger.info("doGetSplits: exit");
}
Also used : GetSplitsRequest(com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest) HashMap(java.util.HashMap) Schema(org.apache.arrow.vector.types.pojo.Schema) ArrowType(org.apache.arrow.vector.types.pojo.ArrowType) TableName(com.amazonaws.athena.connector.lambda.domain.TableName) Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) GetSplitsResponse(com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse) MetadataResponse(com.amazonaws.athena.connector.lambda.metadata.MetadataResponse) Block(com.amazonaws.athena.connector.lambda.data.Block) Split(com.amazonaws.athena.connector.lambda.domain.Split) Test(org.junit.Test)
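
The do/while loop in Example 22 follows a common continuation-token pattern: send the first request with a null token, feed each returned token into the next request, and stop when the token comes back null. The following is a minimal, dependency-free sketch of that pattern only. Page, ContinuationLoop, drain and fakeFetch are hypothetical names invented for illustration; they are not part of the Athena Query Federation SDK and do not reproduce how CloudwatchMetadataHandler paginates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a paged response; NOT the SDK's GetSplitsResponse.
final class Page {
    final List<String> items;
    final String continuationToken; // null means there are no more pages

    Page(List<String> items, String continuationToken) {
        this.items = items;
        this.continuationToken = continuationToken;
    }
}

final class ContinuationLoop {
    // Calls fetchPage repeatedly, passing back the previous token,
    // until a page arrives with a null token.
    static List<String> drain(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null; // the first request carries no token
        do {
            Page page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.continuationToken;
        } while (token != null);
        return all;
    }

    // Hypothetical in-memory source: serves `total` items in pages of
    // `pageSize`, using the next start index as the token (an assumption).
    static Page fakeFetch(String token, int total, int pageSize) {
        int start = (token == null) ? 0 : Integer.parseInt(token);
        int end = Math.min(start + pageSize, total);
        List<String> items = new ArrayList<>();
        for (int i = start; i < end; i++) {
            items.add("split_" + i);
        }
        return new Page(items, end < total ? String.valueOf(end) : null);
    }
}
```

Calling ContinuationLoop.drain(t -> ContinuationLoop.fakeFetch(t, 2000, 1000)) makes two fetches and collects every item. A do/while fits here because at least one request is always needed before any token exists.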

Example 23 with GetSplitsRequest

use of com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest in project aws-athena-query-federation by awslabs.

the class DocDBMetadataHandlerTest method doGetSplits.

@Test
public void doGetSplits() {
    List<String> partitionCols = new ArrayList<>();
    Block partitions = BlockUtils.newBlock(allocator, PARTITION_ID, Types.MinorType.INT.getType(), 0);
    String continuationToken = null;
    GetSplitsRequest originalReq = new GetSplitsRequest(IDENTITY, QUERY_ID, DEFAULT_CATALOG, TABLE_NAME, partitions, partitionCols, new Constraints(new HashMap<>()), null);
    GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken);
    logger.info("doGetSplits: req[{}]", req);
    MetadataResponse rawResponse = handler.doGetSplits(allocator, req);
    assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType());
    GetSplitsResponse response = (GetSplitsResponse) rawResponse;
    continuationToken = response.getContinuationToken();
    logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size());
    // A single partition should yield exactly one split and no further pages.
    assertEquals("Expected exactly one split", 1, response.getSplits().size());
    assertTrue("Continuation criteria violated", response.getContinuationToken() == null);
}
Also used : Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) GetSplitsRequest(com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest) HashMap(java.util.HashMap) GetSplitsResponse(com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse) ArrayList(java.util.ArrayList) MetadataResponse(com.amazonaws.athena.connector.lambda.metadata.MetadataResponse) Block(com.amazonaws.athena.connector.lambda.data.Block) Matchers.anyString(org.mockito.Matchers.anyString) Test(org.junit.Test)

Example 24 with GetSplitsRequest

use of com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest in project aws-athena-query-federation by awslabs.

the class MetricsMetadataHandlerTest method doGetMetricsSplits.

@Test
public void doGetMetricsSplits() throws Exception {
    logger.info("doGetMetricsSplits: enter");
    Schema schema = SchemaBuilder.newBuilder().addIntField("partitionId").build();
    Block partitions = allocator.createBlock(schema);
    // Write the value at row 0, the only row covered by setRowCount(1) below.
    BlockUtils.setValue(partitions.getFieldVector("partitionId"), 0, 1);
    partitions.setRowCount(1);
    String continuationToken = null;
    GetSplitsRequest originalReq = new GetSplitsRequest(identity, "queryId", "catalog_name", new TableName(defaultSchema, "metrics"), partitions, Collections.singletonList("partitionId"), new Constraints(new HashMap<>()), continuationToken);
    int numContinuations = 0;
    do {
        GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken);
        logger.info("doGetMetricsSplits: req[{}]", req);
        MetadataResponse rawResponse = handler.doGetSplits(allocator, req);
        assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType());
        GetSplitsResponse response = (GetSplitsResponse) rawResponse;
        continuationToken = response.getContinuationToken();
        logger.info("doGetMetricsSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size());
        assertEquals(1, response.getSplits().size());
        if (continuationToken != null) {
            numContinuations++;
        }
    } while (continuationToken != null);
    assertEquals(0, numContinuations);
    logger.info("doGetMetricsSplits: exit");
}
Also used : TableName(com.amazonaws.athena.connector.lambda.domain.TableName) Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) GetSplitsRequest(com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest) HashMap(java.util.HashMap) GetSplitsResponse(com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse) Schema(org.apache.arrow.vector.types.pojo.Schema) MetadataResponse(com.amazonaws.athena.connector.lambda.metadata.MetadataResponse) Block(com.amazonaws.athena.connector.lambda.data.Block) Test(org.junit.Test)

Example 25 with GetSplitsRequest

use of com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest in project aws-athena-query-federation by awslabs.

the class DataLakeGen2MetadataHandlerTest method doGetSplitsWithNoPartition.

@Test
public void doGetSplitsWithNoPartition() throws Exception {
    BlockAllocator blockAllocator = new BlockAllocatorImpl();
    Constraints constraints = Mockito.mock(Constraints.class);
    TableName tableName = new TableName("testSchema", "testTable");
    Schema partitionSchema = this.dataLakeGen2MetadataHandler.getPartitionSchema("testCatalogName");
    Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet());
    GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols);
    GetTableLayoutResponse getTableLayoutResponse = this.dataLakeGen2MetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest);
    BlockAllocator splitBlockAllocator = new BlockAllocatorImpl();
    GetSplitsRequest getSplitsRequest = new GetSplitsRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, getTableLayoutResponse.getPartitions(), new ArrayList<>(partitionCols), constraints, null);
    GetSplitsResponse getSplitsResponse = this.dataLakeGen2MetadataHandler.doGetSplits(splitBlockAllocator, getSplitsRequest);
    Set<Map<String, String>> expectedSplits = new HashSet<>();
    expectedSplits.add(Collections.singletonMap(DataLakeGen2MetadataHandler.PARTITION_NUMBER, "0"));
    Assert.assertEquals(expectedSplits.size(), getSplitsResponse.getSplits().size());
    Set<Map<String, String>> actualSplits = getSplitsResponse.getSplits().stream().map(Split::getProperties).collect(Collectors.toSet());
    Assert.assertEquals(expectedSplits, actualSplits);
}
Also used : GetSplitsRequest(com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest) Schema(org.apache.arrow.vector.types.pojo.Schema) TableName(com.amazonaws.athena.connector.lambda.domain.TableName) Constraints(com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) GetTableLayoutResponse(com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse) BlockAllocatorImpl(com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl) GetSplitsResponse(com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse) BlockAllocator(com.amazonaws.athena.connector.lambda.data.BlockAllocator) GetTableLayoutRequest(com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest) Map(java.util.Map) HashSet(java.util.HashSet) Test(org.junit.Test)

Aggregations

GetSplitsRequest (com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest) 46
Test (org.junit.Test) 41
GetSplitsResponse (com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse) 32
Constraints (com.amazonaws.athena.connector.lambda.domain.predicate.Constraints) 29
TableName (com.amazonaws.athena.connector.lambda.domain.TableName) 24
Schema (org.apache.arrow.vector.types.pojo.Schema) 24
GetTableLayoutRequest (com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest) 17
BlockAllocator (com.amazonaws.athena.connector.lambda.data.BlockAllocator) 16
BlockAllocatorImpl (com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl) 16
GetTableLayoutResponse (com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse) 16
HashMap (java.util.HashMap) 16
HashSet (java.util.HashSet) 15
MetadataResponse (com.amazonaws.athena.connector.lambda.metadata.MetadataResponse) 14
Map (java.util.Map) 14
Block (com.amazonaws.athena.connector.lambda.data.Block) 13
ResultSet (java.sql.ResultSet) 12
AtomicInteger (java.util.concurrent.atomic.AtomicInteger) 12
PreparedStatement (java.sql.PreparedStatement) 9
ArrayList (java.util.ArrayList) 9
Split (com.amazonaws.athena.connector.lambda.domain.Split) 8