
Example 21 with OrcColumnId

use of io.prestosql.orc.metadata.OrcColumnId in project hetu-core by openlookeng.

the class TestAbstractNumbericColumnReader method testTypeCoercionShort.

@Test
public void testTypeCoercionShort() throws OrcCorruptionException {
    // The same HDFS path identifies both the column and its data source.
    String path = "hdfs://hacluster/user/hive/warehouse/tpcds_orc_hive_1000.db/catalog_sales/cs_sold_date_sk=2452268/000896_0";
    OrcColumn column = new OrcColumn(path, new OrcColumnId(3), "cs_order_number", OrcType.OrcTypeKind.SHORT, new OrcDataSourceId(path), ImmutableList.of());
    // `type` is not declared in this excerpt; presumably it is a field of the test class.
    ColumnReader actualShortColumnReader = ColumnReaders.createColumnReader(type, column, AggregatedMemoryContext.newSimpleAggregatedMemoryContext(), null);
    ShortColumnReader expectedShortColumnReader = new ShortColumnReader(type, column, AggregatedMemoryContext.newSimpleAggregatedMemoryContext().newLocalMemoryContext(ColumnReaders.class.getSimpleName()));
    // The factory should pick a ShortColumnReader for a SHORT column; the readers are compared by their string form.
    assertEquals(actualShortColumnReader.toString(), expectedShortColumnReader.toString());
}
Also used : OrcColumnId(io.prestosql.orc.metadata.OrcColumnId) ShortColumnReader(io.prestosql.orc.reader.ShortColumnReader) DateColumnReader(io.prestosql.orc.reader.DateColumnReader) LongColumnReader(io.prestosql.orc.reader.LongColumnReader) ColumnReader(io.prestosql.orc.reader.ColumnReader) IntegerColumnReader(io.prestosql.orc.reader.IntegerColumnReader) Test(org.testng.annotations.Test)

Example 22 with OrcColumnId

use of io.prestosql.orc.metadata.OrcColumnId in project hetu-core by openlookeng.

the class StructColumnWriter method finishRowGroup.

@Override
public Map<OrcColumnId, ColumnStatistics> finishRowGroup() {
    checkState(!closed);
    ColumnStatistics statistics = new ColumnStatistics((long) nonNullValueCount, 0, null, null, null, null, null, null, null, null);
    rowGroupColumnStatistics.add(statistics);
    nonNullValueCount = 0;
    ImmutableMap.Builder<OrcColumnId, ColumnStatistics> columnStatistics = ImmutableMap.builder();
    columnStatistics.put(columnId, statistics);
    structFields.stream().map(ColumnWriter::finishRowGroup).forEach(columnStatistics::putAll);
    return columnStatistics.build();
}
Also used : ColumnStatistics(io.prestosql.orc.metadata.statistics.ColumnStatistics) OrcColumnId(io.prestosql.orc.metadata.OrcColumnId) ImmutableMap(com.google.common.collect.ImmutableMap)
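
The pattern in finishRowGroup above (record the struct's own statistics, reset the counter, then merge in whatever each nested field writer reports) can be tried in isolation. The following is a minimal sketch under stated assumptions, not the hetu-core implementation: Writer, LeafWriter, StructWriter and markNonNull are hypothetical names, plain Integer keys stand in for OrcColumnId, and a bare non-null count stands in for ColumnStatistics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RowGroupStatsSketch {
    // Hypothetical stand-in for a column writer; not the hetu-core ColumnWriter interface.
    interface Writer {
        Map<Integer, Long> finishRowGroup();
    }

    // Leaf column: reports its own non-null count, then resets it.
    static final class LeafWriter implements Writer {
        private final int columnId;
        private long nonNullValueCount;

        LeafWriter(int columnId) {
            this.columnId = columnId;
        }

        void write(Object value) {
            if (value != null) {
                nonNullValueCount++;
            }
        }

        @Override
        public Map<Integer, Long> finishRowGroup() {
            Map<Integer, Long> stats = new LinkedHashMap<>();
            stats.put(columnId, nonNullValueCount);
            nonNullValueCount = 0; // start afresh for the next row group
            return stats;
        }
    }

    // Struct column: reports its own count, then merges the counts of its fields.
    static final class StructWriter implements Writer {
        private final int columnId;
        private final List<Writer> fields = new ArrayList<>();
        private long nonNullValueCount;

        StructWriter(int columnId, List<? extends Writer> fields) {
            this.columnId = columnId;
            this.fields.addAll(fields);
        }

        void markNonNull() {
            nonNullValueCount++;
        }

        @Override
        public Map<Integer, Long> finishRowGroup() {
            Map<Integer, Long> stats = new LinkedHashMap<>();
            stats.put(columnId, nonNullValueCount);
            nonNullValueCount = 0;
            for (Writer field : fields) {
                stats.putAll(field.finishRowGroup()); // nested writers contribute their own ids
            }
            return stats;
        }
    }

    // Writes two struct rows; the second row has a null in column 2.
    static Map<Integer, Long> demo() {
        LeafWriter name = new LeafWriter(2);
        LeafWriter age = new LeafWriter(3);
        StructWriter struct = new StructWriter(1, Arrays.asList(name, age));
        struct.markNonNull();
        name.write("a");
        age.write(1);
        struct.markNonNull();
        name.write(null);
        age.write(2);
        return struct.finishRowGroup();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because every writer resets its counter inside finishRowGroup, a second call straight after the first reports zero for all three column ids in this sketch.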

Example 23 with OrcColumnId

use of io.prestosql.orc.metadata.OrcColumnId in project hetu-core by openlookeng.

the class TestReadBloomFilter method testType.

private static <T> void testType(Type type, List<T> uniqueValues, T inBloomFilter, T notInBloomFilter) throws Exception {
    Stream<T> writeValues = newArrayList(limit(cycle(uniqueValues), 30_000)).stream();
    try (TempFile tempFile = new TempFile()) {
        writeOrcColumnHive(tempFile.getFile(), ORC_12, LZ4, type, writeValues.iterator());
        // without predicate a normal block will be created
        try (OrcRecordReader recordReader = createCustomOrcRecordReader(tempFile, OrcPredicate.TRUE, type, MAX_BATCH_SIZE)) {
            assertEquals(recordReader.nextPage().getLoadedPage().getPositionCount(), 1024);
        }
        // predicate for specific value within the min/max range without bloom filter being enabled
        TupleDomainOrcPredicate noBloomFilterPredicate = TupleDomainOrcPredicate.builder().addColumn(new OrcColumnId(1), Domain.singleValue(type, notInBloomFilter)).build();
        try (OrcRecordReader recordReader = createCustomOrcRecordReader(tempFile, noBloomFilterPredicate, type, MAX_BATCH_SIZE)) {
            assertEquals(recordReader.nextPage().getLoadedPage().getPositionCount(), 1024);
        }
        // predicate for specific value within the min/max range with bloom filter enabled, but a value not in the bloom filter
        TupleDomainOrcPredicate notMatchBloomFilterPredicate = TupleDomainOrcPredicate.builder().addColumn(new OrcColumnId(1), Domain.singleValue(type, notInBloomFilter)).setBloomFiltersEnabled(true).build();
        try (OrcRecordReader recordReader = createCustomOrcRecordReader(tempFile, notMatchBloomFilterPredicate, type, MAX_BATCH_SIZE)) {
            assertNull(recordReader.nextPage());
        }
        // predicate for specific value within the min/max range with bloom filter enabled, and a value in the bloom filter
        TupleDomainOrcPredicate matchBloomFilterPredicate = TupleDomainOrcPredicate.builder().addColumn(new OrcColumnId(1), Domain.singleValue(type, inBloomFilter)).setBloomFiltersEnabled(true).build();
        try (OrcRecordReader recordReader = createCustomOrcRecordReader(tempFile, matchBloomFilterPredicate, type, MAX_BATCH_SIZE)) {
            assertEquals(recordReader.nextPage().getLoadedPage().getPositionCount(), 1024);
        }
    }
}
Also used : OrcColumnId(io.prestosql.orc.metadata.OrcColumnId) BIGINT(io.prestosql.spi.type.BigintType.BIGINT) TINYINT(io.prestosql.spi.type.TinyintType.TINYINT) SMALLINT(io.prestosql.spi.type.SmallintType.SMALLINT)
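
The three predicate cases exercised by testType follow from the basic bloom filter contract: a filter can say "definitely absent" or "possibly present", never "definitely present". The following is a minimal sketch of that contract and of a row-group skip decision built on it. It is an illustration only: BloomFilterSketch, its hash mixing and shouldReadRowGroup are hypothetical and do not reflect the ORC bloom filter format or the TupleDomainOrcPredicate internals.

```java
import java.util.BitSet;

final class BloomFilterSketch {
    private static final int HASH_COUNT = 3; // arbitrary choice for this sketch
    private final BitSet bits;
    private final int size;

    BloomFilterSketch(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    // Illustrative hash mixing; real formats specify their own hash functions.
    private int index(long value, int seed) {
        long h = value * 0x9E3779B97F4A7C15L + seed * 0xC2B2AE3D27D4EB4FL;
        h ^= (h >>> 32);
        return (int) Math.floorMod(h, (long) size);
    }

    void add(long value) {
        for (int seed = 1; seed <= HASH_COUNT; seed++) {
            bits.set(index(value, seed));
        }
    }

    // false means "definitely absent"; true means "possibly present".
    boolean mightContain(long value) {
        for (int seed = 1; seed <= HASH_COUNT; seed++) {
            if (!bits.get(index(value, seed))) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical skip decision: without bloom filters the row group must be read.
    static boolean shouldReadRowGroup(BloomFilterSketch filter, boolean bloomFiltersEnabled, long wanted) {
        if (!bloomFiltersEnabled || filter == null) {
            return true;
        }
        return filter.mightContain(wanted);
    }

    public static void main(String[] args) {
        BloomFilterSketch filter = new BloomFilterSketch(1024);
        filter.add(42L);
        System.out.println(shouldReadRowGroup(filter, false, 7L));
        System.out.println(shouldReadRowGroup(filter, true, 42L));
    }
}
```

In these terms, the noBloomFilterPredicate case corresponds to bloomFiltersEnabled being false, so the row group is read even though the value is absent. The notMatchBloomFilterPredicate case corresponds to mightContain returning false, so nothing is read. A real filter can also return a false positive for an absent value, which only costs an unnecessary read and never drops matching rows.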

Aggregations

OrcColumnId (io.prestosql.orc.metadata.OrcColumnId): 23
Stream (io.prestosql.orc.metadata.Stream): 9
ImmutableMap (com.google.common.collect.ImmutableMap): 8
ColumnStatistics (io.prestosql.orc.metadata.statistics.ColumnStatistics): 7
Test (org.testng.annotations.Test): 7
ArrayList (java.util.ArrayList): 6
List (java.util.List): 6
Slice (io.airlift.slice.Slice): 5
ImmutableList (com.google.common.collect.ImmutableList): 4
CompressionKind (io.prestosql.orc.metadata.CompressionKind): 4
ColumnReader (io.prestosql.orc.reader.ColumnReader): 4
DateColumnReader (io.prestosql.orc.reader.DateColumnReader): 4
IntegerColumnReader (io.prestosql.orc.reader.IntegerColumnReader): 4
LongColumnReader (io.prestosql.orc.reader.LongColumnReader): 4
ShortColumnReader (io.prestosql.orc.reader.ShortColumnReader): 4
OrcInputStream (io.prestosql.orc.stream.OrcInputStream): 4
InputStream (java.io.InputStream): 4
Map (java.util.Map): 4
ColumnEncoding (io.prestosql.orc.metadata.ColumnEncoding): 3
StripeFooter (io.prestosql.orc.metadata.StripeFooter): 3