Example 1 with Datum

Use of com.amazonaws.services.timestreamquery.model.Datum in project aws-athena-query-federation by awslabs.

From class TestUtils, method makeMockQueryResult:

public static QueryResult makeMockQueryResult(Schema schemaForRead, int numRows) {
    QueryResult mockResult = mock(QueryResult.class);
    final AtomicLong nextToken = new AtomicLong(0);
    when(mockResult.getRows()).thenAnswer((Answer<List<Row>>) invocationOnMock -> {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            nextToken.incrementAndGet();
            List<Datum> columnData = new ArrayList<>();
            for (Field nextField : schemaForRead.getFields()) {
                columnData.add(makeValue(nextField));
            }
            Row row = new Row();
            row.setData(columnData);
            rows.add(row);
        }
        return rows;
    });
    when(mockResult.getNextToken()).thenAnswer((Answer<String>) invocationOnMock -> {
        if (nextToken.get() < numRows) {
            return String.valueOf(nextToken.get());
        }
        return null;
    });
    return mockResult;
}
Also used : QueryResult(com.amazonaws.services.timestreamquery.model.QueryResult) Schema(org.apache.arrow.vector.types.pojo.Schema) FLOAT8(org.apache.arrow.vector.types.Types.MinorType.FLOAT8) Date(java.util.Date) Types(org.apache.arrow.vector.types.Types) SimpleDateFormat(java.text.SimpleDateFormat) HashMap(java.util.HashMap) Random(java.util.Random) TimeSeriesDataPoint(com.amazonaws.services.timestreamquery.model.TimeSeriesDataPoint) ArrayList(java.util.ArrayList) Answer(org.mockito.stubbing.Answer) Map(java.util.Map) GeneratedRowWriter(com.amazonaws.athena.connector.lambda.data.writers.GeneratedRowWriter) FieldResolver(com.amazonaws.athena.connector.lambda.data.FieldResolver) ConstraintProjector(com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintProjector) FieldVector(org.apache.arrow.vector.FieldVector) Datum(com.amazonaws.services.timestreamquery.model.Datum) Field(org.apache.arrow.vector.types.pojo.Field) Mockito.when(org.mockito.Mockito.when) Row(com.amazonaws.services.timestreamquery.model.Row) AtomicLong(java.util.concurrent.atomic.AtomicLong) List(java.util.List) BlockUtils(com.amazonaws.athena.connector.lambda.data.BlockUtils) Extractor(com.amazonaws.athena.connector.lambda.data.writers.extractors.Extractor) Assert.assertEquals(org.junit.Assert.assertEquals) Mockito.mock(org.mockito.Mockito.mock)
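The mock above emulates Timestream's pagination contract: `getRows()` returns a page of rows and `getNextToken()` returns null once all rows are served. A minimal sketch of that contract, with no AWS SDK dependency (the `Page` and `fetch` names are hypothetical stand-ins for `QueryResult` and the query call):

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {
    // Hypothetical page holder standing in for QueryResult.
    static class Page {
        final List<String> rows;
        final String nextToken; // null signals the last page
        Page(List<String> rows, String nextToken) { this.rows = rows; this.nextToken = nextToken; }
    }

    // Hypothetical data source: serves up to 100 rows per call, like the mock.
    static Page fetch(String token, int totalRows) {
        int start = (token == null) ? 0 : Integer.parseInt(token);
        int end = Math.min(start + 100, totalRows);
        List<String> rows = new ArrayList<>();
        for (int i = start; i < end; i++) {
            rows.add("row-" + i);
        }
        return new Page(rows, end < totalRows ? String.valueOf(end) : null);
    }

    // Consumer loop: keep fetching until nextToken comes back null.
    public static int drain(int totalRows) {
        int seen = 0;
        String token = null;
        do {
            Page page = fetch(token, totalRows);
            seen += page.rows.size();
            token = page.nextToken;
        } while (token != null);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(drain(250)); // prints 250
    }
}
```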

Example 2 with Datum

Use of com.amazonaws.services.timestreamquery.model.Datum in project aws-athena-query-federation by awslabs.

From class TimestreamMetadataHandlerTest, method doGetTable:

@Test
public void doGetTable() throws Exception {
    logger.info("doGetTable - enter");
    when(mockGlue.getTable(any(com.amazonaws.services.glue.model.GetTableRequest.class))).thenReturn(mock(GetTableResult.class));
    when(mockTsQuery.query(any(QueryRequest.class))).thenAnswer((InvocationOnMock invocation) -> {
        QueryRequest request = invocation.getArgumentAt(0, QueryRequest.class);
        assertEquals("DESCRIBE \"default\".\"table1\"", request.getQueryString());
        List<Row> rows = new ArrayList<>();
        // TODO: Add types here
        rows.add(new Row().withData(new Datum().withScalarValue("availability_zone"), new Datum().withScalarValue("varchar"), new Datum().withScalarValue("dimension")));
        rows.add(new Row().withData(new Datum().withScalarValue("measure_value"), new Datum().withScalarValue("double"), new Datum().withScalarValue("measure_value")));
        rows.add(new Row().withData(new Datum().withScalarValue("measure_name"), new Datum().withScalarValue("varchar"), new Datum().withScalarValue("measure_name")));
        rows.add(new Row().withData(new Datum().withScalarValue("time"), new Datum().withScalarValue("timestamp"), new Datum().withScalarValue("timestamp")));
        return new QueryResult().withRows(rows);
    });
    GetTableRequest req = new GetTableRequest(identity, "query-id", "default", new TableName(defaultSchema, "table1"));
    GetTableResponse res = handler.doGetTable(allocator, req);
    logger.info("doGetTable - {}", res);
    assertEquals(4, res.getSchema().getFields().size());
    Field measureName = res.getSchema().findField("measure_name");
    assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(measureName.getType()));
    Field measureValue = res.getSchema().findField("measure_value");
    assertEquals(Types.MinorType.FLOAT8, Types.getMinorTypeForArrowType(measureValue.getType()));
    Field availabilityZone = res.getSchema().findField("availability_zone");
    assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(availabilityZone.getType()));
    Field time = res.getSchema().findField("time");
    assertEquals(Types.MinorType.DATEMILLI, Types.getMinorTypeForArrowType(time.getType()));
    logger.info("doGetTable - exit");
}
Also used : Datum(com.amazonaws.services.timestreamquery.model.Datum) QueryRequest(com.amazonaws.services.timestreamquery.model.QueryRequest) ArrayList(java.util.ArrayList) GetTableRequest(com.amazonaws.athena.connector.lambda.metadata.GetTableRequest) TableName(com.amazonaws.athena.connector.lambda.domain.TableName) Field(org.apache.arrow.vector.types.pojo.Field) QueryResult(com.amazonaws.services.timestreamquery.model.QueryResult) GetTableResponse(com.amazonaws.athena.connector.lambda.metadata.GetTableResponse) InvocationOnMock(org.mockito.invocation.InvocationOnMock) Row(com.amazonaws.services.timestreamquery.model.Row) GetTableResult(com.amazonaws.services.glue.model.GetTableResult) Test(org.junit.Test)
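The test above stubs DESCRIBE output with type strings ("varchar", "double", "timestamp") and asserts the resulting Arrow minor types (VARCHAR, FLOAT8, DATEMILLI). A hedged sketch of a mapping consistent with those assertions; this is an illustration, not the connector's actual TimestreamSchemaUtils logic, and the extra cases are assumptions:

```java
public class TypeMappingSketch {
    // Map a Timestream DESCRIBE type string to an Arrow minor-type name,
    // matching what the test asserts for varchar/double/timestamp.
    public static String toArrowMinorType(String timestreamType) {
        switch (timestreamType.toLowerCase()) {
            case "varchar":   return "VARCHAR";
            case "double":    return "FLOAT8";
            case "timestamp": return "DATEMILLI";
            case "bigint":    return "BIGINT";   // assumed mapping
            case "boolean":   return "BIT";      // assumed mapping
            default:
                throw new IllegalArgumentException("Unmapped Timestream type: " + timestreamType);
        }
    }

    public static void main(String[] args) {
        System.out.println(toArrowMinorType("double")); // prints FLOAT8
    }
}
```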

Example 3 with Datum

Use of com.amazonaws.services.timestreamquery.model.Datum in project aws-athena-query-federation by awslabs.

From class TimestreamMetadataHandler, method doGetTable:

@Override
public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest request) throws Exception {
    logger.info("doGetTable: enter table[{}]", request.getTableName());
    Schema schema = null;
    try {
        if (glue != null) {
            schema = super.doGetTable(blockAllocator, request, TABLE_FILTER).getSchema();
            logger.info("doGetTable: Retrieved schema for table[{}] from AWS Glue.", request.getTableName());
        }
    } catch (RuntimeException ex) {
        logger.warn("doGetTable: Unable to retrieve table[{}:{}] from AWS Glue.", request.getTableName().getSchemaName(), request.getTableName().getTableName(), ex);
    }
    if (schema == null) {
        TableName tableName = request.getTableName();
        String describeQuery = queryFactory.createDescribeTableQueryBuilder().withTablename(tableName.getTableName()).withDatabaseName(tableName.getSchemaName()).build();
        logger.info("doGetTable: Retrieving schema for table[{}] from TimeStream using describeQuery[{}].", request.getTableName(), describeQuery);
        QueryRequest queryRequest = new QueryRequest().withQueryString(describeQuery);
        SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
        do {
            QueryResult queryResult = tsQuery.query(queryRequest);
            for (Row next : queryResult.getRows()) {
                List<Datum> datum = next.getData();
                if (datum.size() != 3) {
                    throw new RuntimeException("Unexpected datum size " + datum.size() + " while getting schema from datum[" + datum.toString() + "]");
                }
                Field nextField = TimestreamSchemaUtils.makeField(datum.get(0).getScalarValue(), datum.get(1).getScalarValue());
                schemaBuilder.addField(nextField);
            }
            queryRequest = new QueryRequest().withNextToken(queryResult.getNextToken());
        } while (queryRequest.getNextToken() != null);
        schema = schemaBuilder.build();
    }
    return new GetTableResponse(request.getCatalogName(), request.getTableName(), schema);
}
Also used : TableName(com.amazonaws.athena.connector.lambda.domain.TableName) Field(org.apache.arrow.vector.types.pojo.Field) QueryResult(com.amazonaws.services.timestreamquery.model.QueryResult) Datum(com.amazonaws.services.timestreamquery.model.Datum) QueryRequest(com.amazonaws.services.timestreamquery.model.QueryRequest) GetTableResponse(com.amazonaws.athena.connector.lambda.metadata.GetTableResponse) Schema(org.apache.arrow.vector.types.pojo.Schema) SchemaBuilder(com.amazonaws.athena.connector.lambda.data.SchemaBuilder) Row(com.amazonaws.services.timestreamquery.model.Row)
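When Glue has no schema, the handler falls back to issuing a DESCRIBE statement against Timestream (the real code builds it via queryFactory.createDescribeTableQueryBuilder()). A minimal sketch of that statement shape, with identifiers double-quoted as in the test's expected string:

```java
public class DescribeQuerySketch {
    // Build a DESCRIBE statement with quoted database and table identifiers,
    // matching the shape asserted in TimestreamMetadataHandlerTest.
    public static String describe(String database, String table) {
        return String.format("DESCRIBE \"%s\".\"%s\"", database, table);
    }

    public static void main(String[] args) {
        System.out.println(describe("default", "table1")); // prints DESCRIBE "default"."table1"
    }
}
```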

Example 4 with Datum

Use of com.amazonaws.services.timestreamquery.model.Datum in project amazon-timestream-tools by awslabs.

From class QueryExample, method parseRow:

private String parseRow(List<ColumnInfo> columnInfo, Row row) {
    List<Datum> data = row.getData();
    List<String> rowOutput = new ArrayList<>();
    // iterate every column per row
    for (int j = 0; j < data.size(); j++) {
        ColumnInfo info = columnInfo.get(j);
        Datum datum = data.get(j);
        rowOutput.add(parseDatum(info, datum));
    }
    return String.format("{%s}", rowOutput.stream().map(Object::toString).collect(Collectors.joining(",")));
}
Also used : Datum(com.amazonaws.services.timestreamquery.model.Datum) ArrayList(java.util.ArrayList) ColumnInfo(com.amazonaws.services.timestreamquery.model.ColumnInfo) TimeSeriesDataPoint(com.amazonaws.services.timestreamquery.model.TimeSeriesDataPoint)
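parseRow renders each column via parseDatum and joins the results into a `{v1,v2,...}` string. A pure-Java sketch of that pattern, with parseDatum reduced to a hypothetical `name=value` scalar formatter (no AWS SDK types):

```java
import java.util.ArrayList;
import java.util.List;

public class RowFormatSketch {
    // Hypothetical stand-in for parseDatum: pair a column name with its value.
    static String formatColumn(String name, String value) {
        return name + "=" + value;
    }

    // Mirror parseRow: render every column for the row, then join with commas
    // inside braces.
    public static String formatRow(List<String> names, List<String> values) {
        List<String> out = new ArrayList<>();
        for (int j = 0; j < values.size(); j++) {
            out.add(formatColumn(names.get(j), values.get(j)));
        }
        return String.format("{%s}", String.join(",", out));
    }

    public static void main(String[] args) {
        System.out.println(formatRow(List.of("time", "v"), List.of("t1", "1.0"))); // prints {time=t1,v=1.0}
    }
}
```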

Example 5 with Datum

Use of com.amazonaws.services.timestreamquery.model.Datum in project aws-athena-query-federation by awslabs.

From class TimestreamRecordHandler, method buildTimeSeriesExtractor:

private void buildTimeSeriesExtractor(GeneratedRowWriter.RowWriterBuilder builder, Field field, int curFieldNum) {
    builder.withFieldWriterFactory(field.getName(), (FieldVector vector, Extractor extractor, ConstraintProjector constraint) -> (Object context, int rowNum) -> {
        Row row = (Row) context;
        Datum datum = row.getData().get(curFieldNum);
        Field timeField = field.getChildren().get(0).getChildren().get(0);
        Field valueField = field.getChildren().get(0).getChildren().get(1);
        if (datum.getTimeSeriesValue() != null) {
            List<Map<String, Object>> values = new ArrayList<>();
            for (TimeSeriesDataPoint nextDatum : datum.getTimeSeriesValue()) {
                Map<String, Object> eventMap = new HashMap<>();
                eventMap.put(timeField.getName(), TIMESTAMP_FORMATTER.parse(nextDatum.getTime()).getTime());
                switch(Types.getMinorTypeForArrowType(valueField.getType())) {
                    case FLOAT8:
                        eventMap.put(valueField.getName(), Double.valueOf(nextDatum.getValue().getScalarValue()));
                        break;
                    case BIGINT:
                        eventMap.put(valueField.getName(), Long.valueOf(nextDatum.getValue().getScalarValue()));
                        break;
                    case INT:
                        eventMap.put(valueField.getName(), Integer.valueOf(nextDatum.getValue().getScalarValue()));
                        break;
                    case BIT:
                        // read the boolean from the time-series point itself, not the row-level datum
                        eventMap.put(valueField.getName(), Boolean.parseBoolean(nextDatum.getValue().getScalarValue()) ? 1 : 0);
                        break;
                }
                values.add(eventMap);
            }
            BlockUtils.setComplexValue(vector, rowNum, FieldResolver.DEFAULT, values);
        } else {
            throw new RuntimeException("Only LISTs of type TimeSeries are presently supported.");
        }
        // we don't yet support predicate pushdown on complex types
        return true;
    });
}
Also used : TimeSeriesDataPoint(com.amazonaws.services.timestreamquery.model.TimeSeriesDataPoint) Datum(com.amazonaws.services.timestreamquery.model.Datum) HashMap(java.util.HashMap) ConstraintProjector(com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintProjector) ArrayList(java.util.ArrayList) FieldVector(org.apache.arrow.vector.FieldVector) Field(org.apache.arrow.vector.types.pojo.Field) BigIntExtractor(com.amazonaws.athena.connector.lambda.data.writers.extractors.BigIntExtractor) BitExtractor(com.amazonaws.athena.connector.lambda.data.writers.extractors.BitExtractor) Float8Extractor(com.amazonaws.athena.connector.lambda.data.writers.extractors.Float8Extractor) VarCharExtractor(com.amazonaws.athena.connector.lambda.data.writers.extractors.VarCharExtractor) DateMilliExtractor(com.amazonaws.athena.connector.lambda.data.writers.extractors.DateMilliExtractor) Extractor(com.amazonaws.athena.connector.lambda.data.writers.extractors.Extractor) Row(com.amazonaws.services.timestreamquery.model.Row) HashMap(java.util.HashMap) Map(java.util.Map)
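The switch above converts each point's scalar value, which Timestream returns as a string, into a typed Java value chosen by the target Arrow minor type. A standalone sketch of that conversion (the minor type is passed as a string here for simplicity; the BIT-to-int encoding mirrors the handler's 0/1 convention):

```java
public class ScalarConvertSketch {
    // Convert a Timestream scalar string to the Java value expected for the
    // given Arrow minor-type name.
    public static Object convert(String minorType, String scalar) {
        switch (minorType) {
            case "FLOAT8": return Double.valueOf(scalar);
            case "BIGINT": return Long.valueOf(scalar);
            case "INT":    return Integer.valueOf(scalar);
            case "BIT":    return Boolean.parseBoolean(scalar) ? 1 : 0;
            default:
                throw new IllegalArgumentException("Unsupported minor type: " + minorType);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("FLOAT8", "1.5")); // prints 1.5
    }
}
```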

Aggregations

Datum (com.amazonaws.services.timestreamquery.model.Datum) 6
ArrayList (java.util.ArrayList) 5
Field (org.apache.arrow.vector.types.pojo.Field) 5
Row (com.amazonaws.services.timestreamquery.model.Row) 4
TimeSeriesDataPoint (com.amazonaws.services.timestreamquery.model.TimeSeriesDataPoint) 4
QueryResult (com.amazonaws.services.timestreamquery.model.QueryResult) 3
Extractor (com.amazonaws.athena.connector.lambda.data.writers.extractors.Extractor) 2
TableName (com.amazonaws.athena.connector.lambda.domain.TableName) 2
ConstraintProjector (com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintProjector) 2
GetTableResponse (com.amazonaws.athena.connector.lambda.metadata.GetTableResponse) 2
QueryRequest (com.amazonaws.services.timestreamquery.model.QueryRequest) 2
Date (java.util.Date) 2
HashMap (java.util.HashMap) 2
Map (java.util.Map) 2
FieldVector (org.apache.arrow.vector.FieldVector) 2
Schema (org.apache.arrow.vector.types.pojo.Schema) 2
BlockUtils (com.amazonaws.athena.connector.lambda.data.BlockUtils) 1
FieldResolver (com.amazonaws.athena.connector.lambda.data.FieldResolver) 1
SchemaBuilder (com.amazonaws.athena.connector.lambda.data.SchemaBuilder) 1
GeneratedRowWriter (com.amazonaws.athena.connector.lambda.data.writers.GeneratedRowWriter) 1