Search in sources :

Example 1 with TableStatistics

use of com.facebook.presto.spi.statistics.TableStatistics in project presto by prestodb.

the class TestMetastoreHiveStatisticsProvider method testGetTableStatistics.

@Test
public void testGetTableStatistics() {
    String partitionName = "p1=string1/p2=1234";
    PartitionStatistics statistics = PartitionStatistics.builder().setBasicStatistics(new HiveBasicStatistics(OptionalLong.empty(), OptionalLong.of(1000), OptionalLong.of(5000), OptionalLong.empty())).setColumnStatistics(ImmutableMap.of(COLUMN, createIntegerColumnStatistics(OptionalLong.of(-100), OptionalLong.of(100), OptionalLong.of(500), OptionalLong.of(300)))).build();
    MetastoreHiveStatisticsProvider statisticsProvider = new MetastoreHiveStatisticsProvider((session, table, hivePartitions) -> ImmutableMap.of(partitionName, statistics));
    TestingConnectorSession session = new TestingConnectorSession(new HiveSessionProperties(new HiveClientConfig(), new OrcFileWriterConfig(), new ParquetFileWriterConfig(), new CacheConfig()).getSessionProperties());
    HiveColumnHandle columnHandle = new HiveColumnHandle(COLUMN, HIVE_LONG, BIGINT.getTypeSignature(), 2, REGULAR, Optional.empty(), Optional.empty());
    TableStatistics expected = TableStatistics.builder().setRowCount(Estimate.of(1000)).setTotalSize(Estimate.of(5000)).setColumnStatistics(PARTITION_COLUMN_1, ColumnStatistics.builder().setDataSize(Estimate.of(7000)).setNullsFraction(Estimate.of(0)).setDistinctValuesCount(Estimate.of(1)).build()).setColumnStatistics(PARTITION_COLUMN_2, ColumnStatistics.builder().setRange(new DoubleRange(1234, 1234)).setNullsFraction(Estimate.of(0)).setDistinctValuesCount(Estimate.of(1)).build()).setColumnStatistics(columnHandle, ColumnStatistics.builder().setRange(new DoubleRange(-100, 100)).setNullsFraction(Estimate.of(0.5)).setDistinctValuesCount(Estimate.of(300)).build()).build();
    assertEquals(statisticsProvider.getTableStatistics(session, TABLE, ImmutableMap.of("p1", PARTITION_COLUMN_1, "p2", PARTITION_COLUMN_2, COLUMN, columnHandle), ImmutableMap.of("p1", VARCHAR, "p2", BIGINT, COLUMN, BIGINT), ImmutableList.of(partition(partitionName))), expected);
}
Also used : TestingConnectorSession(com.facebook.presto.testing.TestingConnectorSession) OrcFileWriterConfig(com.facebook.presto.hive.OrcFileWriterConfig) HiveBasicStatistics(com.facebook.presto.hive.HiveBasicStatistics) HiveSessionProperties(com.facebook.presto.hive.HiveSessionProperties) DoubleRange(com.facebook.presto.spi.statistics.DoubleRange) MetastoreHiveStatisticsProvider.validatePartitionStatistics(com.facebook.presto.hive.statistics.MetastoreHiveStatisticsProvider.validatePartitionStatistics) PartitionStatistics(com.facebook.presto.hive.metastore.PartitionStatistics) TableStatistics(com.facebook.presto.spi.statistics.TableStatistics) CacheConfig(com.facebook.presto.cache.CacheConfig) ParquetFileWriterConfig(com.facebook.presto.hive.ParquetFileWriterConfig) HiveColumnHandle(com.facebook.presto.hive.HiveColumnHandle) HiveClientConfig(com.facebook.presto.hive.HiveClientConfig) Test(org.testng.annotations.Test)

Example 2 with TableStatistics

use of com.facebook.presto.spi.statistics.TableStatistics in project presto by prestodb.

the class TestMetastoreHiveStatisticsProvider method testGetTableStatisticsUnpartitioned.

@Test
public void testGetTableStatisticsUnpartitioned() {
    PartitionStatistics statistics = PartitionStatistics.builder().setBasicStatistics(new HiveBasicStatistics(OptionalLong.empty(), OptionalLong.of(1000), OptionalLong.of(5000), OptionalLong.empty())).setColumnStatistics(ImmutableMap.of(COLUMN, createIntegerColumnStatistics(OptionalLong.of(-100), OptionalLong.of(100), OptionalLong.of(500), OptionalLong.of(300)))).build();
    MetastoreHiveStatisticsProvider statisticsProvider = new MetastoreHiveStatisticsProvider((session, table, hivePartitions) -> ImmutableMap.of(UNPARTITIONED_ID, statistics));
    TestingConnectorSession session = new TestingConnectorSession(new HiveSessionProperties(new HiveClientConfig(), new OrcFileWriterConfig(), new ParquetFileWriterConfig(), new CacheConfig()).getSessionProperties());
    HiveColumnHandle columnHandle = new HiveColumnHandle(COLUMN, HIVE_LONG, BIGINT.getTypeSignature(), 2, REGULAR, Optional.empty(), Optional.empty());
    TableStatistics expected = TableStatistics.builder().setRowCount(Estimate.of(1000)).setTotalSize(Estimate.of(5000)).setColumnStatistics(columnHandle, ColumnStatistics.builder().setRange(new DoubleRange(-100, 100)).setNullsFraction(Estimate.of(0.5)).setDistinctValuesCount(Estimate.of(300)).build()).build();
    assertEquals(statisticsProvider.getTableStatistics(session, TABLE, ImmutableMap.of(COLUMN, columnHandle), ImmutableMap.of(COLUMN, BIGINT), ImmutableList.of(new HivePartition(TABLE))), expected);
}
Also used : TestingConnectorSession(com.facebook.presto.testing.TestingConnectorSession) OrcFileWriterConfig(com.facebook.presto.hive.OrcFileWriterConfig) HiveBasicStatistics(com.facebook.presto.hive.HiveBasicStatistics) HiveSessionProperties(com.facebook.presto.hive.HiveSessionProperties) DoubleRange(com.facebook.presto.spi.statistics.DoubleRange) MetastoreHiveStatisticsProvider.validatePartitionStatistics(com.facebook.presto.hive.statistics.MetastoreHiveStatisticsProvider.validatePartitionStatistics) PartitionStatistics(com.facebook.presto.hive.metastore.PartitionStatistics) TableStatistics(com.facebook.presto.spi.statistics.TableStatistics) CacheConfig(com.facebook.presto.cache.CacheConfig) ParquetFileWriterConfig(com.facebook.presto.hive.ParquetFileWriterConfig) HiveColumnHandle(com.facebook.presto.hive.HiveColumnHandle) HiveClientConfig(com.facebook.presto.hive.HiveClientConfig) HivePartition(com.facebook.presto.hive.HivePartition) Test(org.testng.annotations.Test)

Example 3 with TableStatistics

use of com.facebook.presto.spi.statistics.TableStatistics in project presto by prestodb.

the class TestTpchMetadata method testColumnStats.

private void testColumnStats(String schema, TpchTable<?> table, TpchColumn<?> column, Constraint<ColumnHandle> constraint, ColumnStatistics expected) {
    TpchTableHandle tableHandle = tpchMetadata.getTableHandle(session, new SchemaTableName(schema, table.getTableName()));
    List<ColumnHandle> columnHandles = ImmutableList.copyOf(tpchMetadata.getColumnHandles(session, tableHandle).values());
    TableStatistics tableStatistics = tpchMetadata.getTableStatistics(session, tableHandle, Optional.empty(), columnHandles, constraint);
    ColumnHandle columnHandle = tpchMetadata.getColumnHandles(session, tableHandle).get(column.getSimplifiedColumnName());
    ColumnStatistics actual = tableStatistics.getColumnStatistics().get(columnHandle);
    EstimateAssertion estimateAssertion = new EstimateAssertion(TOLERANCE);
    estimateAssertion.assertClose(actual.getDistinctValuesCount(), expected.getDistinctValuesCount(), "distinctValuesCount");
    estimateAssertion.assertClose(actual.getDataSize(), expected.getDataSize(), "dataSize");
    estimateAssertion.assertClose(actual.getNullsFraction(), expected.getNullsFraction(), "nullsFraction");
    estimateAssertion.assertClose(actual.getRange(), expected.getRange(), "range");
}
Also used : ColumnStatistics(com.facebook.presto.spi.statistics.ColumnStatistics) ColumnHandle(com.facebook.presto.spi.ColumnHandle) TableStatistics(com.facebook.presto.spi.statistics.TableStatistics) SchemaTableName(com.facebook.presto.spi.SchemaTableName)

Example 4 with TableStatistics

use of com.facebook.presto.spi.statistics.TableStatistics in project presto by prestodb.

the class TestTpchMetadata method testTableStats.

private void testTableStats(String schema, TpchTable<?> table, Constraint<ColumnHandle> constraint, double expectedRowCount) {
    TpchTableHandle tableHandle = tpchMetadata.getTableHandle(session, new SchemaTableName(schema, table.getTableName()));
    List<ColumnHandle> columnHandles = ImmutableList.copyOf(tpchMetadata.getColumnHandles(session, tableHandle).values());
    TableStatistics tableStatistics = tpchMetadata.getTableStatistics(session, tableHandle, Optional.empty(), columnHandles, constraint);
    double actualRowCountValue = tableStatistics.getRowCount().getValue();
    assertEquals(tableStatistics.getRowCount(), Estimate.of(actualRowCountValue));
    assertEquals(actualRowCountValue, expectedRowCount, expectedRowCount * TOLERANCE);
}
Also used : ColumnHandle(com.facebook.presto.spi.ColumnHandle) TableStatistics(com.facebook.presto.spi.statistics.TableStatistics) SchemaTableName(com.facebook.presto.spi.SchemaTableName)

Example 5 with TableStatistics

use of com.facebook.presto.spi.statistics.TableStatistics in project presto by prestodb.

the class TestTpcdsMetadataStatistics method testTableStatsDetails.

@Test
public void testTableStatsDetails() {
    SchemaTableName schemaTableName = new SchemaTableName("sf1", Table.CALL_CENTER.getName());
    ConnectorTableHandle tableHandle = metadata.getTableHandle(session, schemaTableName);
    Map<String, ColumnHandle> columnHandles = metadata.getColumnHandles(session, tableHandle);
    TableStatistics tableStatistics = metadata.getTableStatistics(session, tableHandle, Optional.empty(), ImmutableList.copyOf(columnHandles.values()), alwaysTrue());
    estimateAssertion.assertClose(tableStatistics.getRowCount(), Estimate.of(6), "Row count does not match");
    // all columns have stats
    for (ColumnHandle column : columnHandles.values()) {
        assertTrue(tableStatistics.getColumnStatistics().containsKey(column));
        assertNotNull(tableStatistics.getColumnStatistics().get(column));
    }
    // identifier
    assertColumnStatistics(tableStatistics.getColumnStatistics().get(columnHandles.get(CallCenterColumn.CC_CALL_CENTER_SK.getName())), ColumnStatistics.builder().setNullsFraction(Estimate.of(0)).setDistinctValuesCount(Estimate.of(6)).setRange(new DoubleRange(1, 6)).build());
    // varchar
    assertColumnStatistics(tableStatistics.getColumnStatistics().get(columnHandles.get(CallCenterColumn.CC_CALL_CENTER_ID.getName())), ColumnStatistics.builder().setNullsFraction(Estimate.of(0)).setDistinctValuesCount(Estimate.of(3)).setDataSize(Estimate.of(48.0)).build());
    // char
    assertColumnStatistics(tableStatistics.getColumnStatistics().get(columnHandles.get(CallCenterColumn.CC_ZIP.getName())), ColumnStatistics.builder().setNullsFraction(Estimate.of(0)).setDistinctValuesCount(Estimate.of(1)).setDataSize(Estimate.of(5.0)).build());
    // decimal
    assertColumnStatistics(tableStatistics.getColumnStatistics().get(columnHandles.get(CallCenterColumn.CC_GMT_OFFSET.getName())), ColumnStatistics.builder().setNullsFraction(Estimate.of(0)).setDistinctValuesCount(Estimate.of(1)).setRange(new DoubleRange(-5, -5)).build());
    // date
    assertColumnStatistics(tableStatistics.getColumnStatistics().get(columnHandles.get(CallCenterColumn.CC_REC_START_DATE.getName())), ColumnStatistics.builder().setNullsFraction(Estimate.of(0)).setDistinctValuesCount(Estimate.of(4)).setRange(new DoubleRange(10227L, 11688L)).build());
    // only null values
    assertColumnStatistics(tableStatistics.getColumnStatistics().get(columnHandles.get(CallCenterColumn.CC_CLOSED_DATE_SK.getName())), ColumnStatistics.builder().setNullsFraction(Estimate.of(1)).setDistinctValuesCount(Estimate.of(0)).build());
}
Also used : ColumnHandle(com.facebook.presto.spi.ColumnHandle) DoubleRange(com.facebook.presto.spi.statistics.DoubleRange) TableStatistics(com.facebook.presto.spi.statistics.TableStatistics) SchemaTableName(com.facebook.presto.spi.SchemaTableName) ConnectorTableHandle(com.facebook.presto.spi.ConnectorTableHandle) Test(org.testng.annotations.Test)

Aggregations

TableStatistics (com.facebook.presto.spi.statistics.TableStatistics)20 ColumnHandle (com.facebook.presto.spi.ColumnHandle)14 SchemaTableName (com.facebook.presto.spi.SchemaTableName)10 ColumnStatistics (com.facebook.presto.spi.statistics.ColumnStatistics)9 ConnectorTableHandle (com.facebook.presto.spi.ConnectorTableHandle)7 DoubleRange (com.facebook.presto.spi.statistics.DoubleRange)7 Map (java.util.Map)7 Test (org.testng.annotations.Test)7 Type (com.facebook.presto.common.type.Type)6 Chars.isCharType (com.facebook.presto.common.type.Chars.isCharType)5 Varchars.isVarcharType (com.facebook.presto.common.type.Varchars.isVarcharType)5 ConnectorSession (com.facebook.presto.spi.ConnectorSession)5 Constraint (com.facebook.presto.spi.Constraint)5 RowExpression (com.facebook.presto.spi.relation.RowExpression)5 ImmutableMap (com.google.common.collect.ImmutableMap)5 NullableValue (com.facebook.presto.common.predicate.NullableValue)4 TupleDomain (com.facebook.presto.common.predicate.TupleDomain)4 Objects.requireNonNull (java.util.Objects.requireNonNull)4 Optional (java.util.Optional)4 Subfield (com.facebook.presto.common.Subfield)3