Example 1 with COMMENT

Use of io.airlift.tpch.NationColumn.COMMENT in project hetu-core by openlookeng.

From the class TestOrcAcidPageSource, method readFile.

private static List<Nation> readFile(Map<NationColumn, Integer> columns, TupleDomain<HiveColumnHandle> tupleDomain, Optional<DeleteDeltaLocations> deleteDeltaLocations) {
    List<HiveColumnHandle> columnHandles = columns.entrySet().stream().map(column -> toHiveColumnHandle(column.getKey(), column.getValue())).collect(toImmutableList());
    List<String> columnNames = columnHandles.stream().map(HiveColumnHandle::getName).collect(toImmutableList());
    // This file contains the TPC-H nation table with each row repeated 1000 times
    File nationFileWithReplicatedRows = new File(TestOrcAcidPageSource.class.getClassLoader().getResource("nationFile25kRowsSortedOnNationKey/bucket_00000").getPath());
    ConnectorPageSource pageSource = PAGE_SOURCE_FACTORY.createPageSource(new JobConf(new Configuration(false)), HiveTestUtils.SESSION, new Path(nationFileWithReplicatedRows.getAbsoluteFile().toURI()), 0, nationFileWithReplicatedRows.length(), nationFileWithReplicatedRows.length(), createSchema(), columnHandles, tupleDomain, Optional.empty(), deleteDeltaLocations, Optional.empty(), Optional.empty(), null, false, -1L).get();
    int nationKeyColumn = columnNames.indexOf("n_nationkey");
    int nameColumn = columnNames.indexOf("n_name");
    int regionKeyColumn = columnNames.indexOf("n_regionkey");
    int commentColumn = columnNames.indexOf("n_comment");
    ImmutableList.Builder<Nation> rows = ImmutableList.builder();
    while (!pageSource.isFinished()) {
        Page page = pageSource.getNextPage();
        if (page == null) {
            continue;
        }
        page = page.getLoadedPage();
        for (int position = 0; position < page.getPositionCount(); position++) {
            long nationKey = -42;
            if (nationKeyColumn >= 0) {
                nationKey = BIGINT.getLong(page.getBlock(nationKeyColumn), position);
            }
            String name = "<not read>";
            if (nameColumn >= 0) {
                name = VARCHAR.getSlice(page.getBlock(nameColumn), position).toStringUtf8();
            }
            long regionKey = -42;
            if (regionKeyColumn >= 0) {
                regionKey = BIGINT.getLong(page.getBlock(regionKeyColumn), position);
            }
            String comment = "<not read>";
            if (commentColumn >= 0) {
                comment = VARCHAR.getSlice(page.getBlock(commentColumn), position).toStringUtf8();
            }
            rows.add(new Nation(position, nationKey, name, regionKey, comment));
        }
    }
    return rows.build();
}
Also used : HiveType.toHiveType(io.prestosql.plugin.hive.HiveType.toHiveType) Nation(io.airlift.tpch.Nation) Test(org.testng.annotations.Test) HiveColumnHandle(io.prestosql.plugin.hive.HiveColumnHandle) HiveConfig(io.prestosql.plugin.hive.HiveConfig) HiveTestUtils(io.prestosql.plugin.hive.HiveTestUtils) Configuration(org.apache.hadoop.conf.Configuration) Duration(java.time.Duration) Map(java.util.Map) Path(org.apache.hadoop.fs.Path) Type(io.prestosql.spi.type.Type) LongPredicate(java.util.function.LongPredicate) COMMENT(io.airlift.tpch.NationColumn.COMMENT) BIGINT(io.prestosql.spi.type.BigintType.BIGINT) SERIALIZATION_LIB(org.apache.hadoop.hive.serde.serdeConstants.SERIALIZATION_LIB) ImmutableMap(com.google.common.collect.ImmutableMap) MetadataManager.createTestMetadataManager(io.prestosql.metadata.MetadataManager.createTestMetadataManager) Collections.nCopies(java.util.Collections.nCopies) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) Set(java.util.Set) AcidUtils.deleteDeltaSubdir(org.apache.hadoop.hive.ql.io.AcidUtils.deleteDeltaSubdir) Metadata(io.prestosql.metadata.Metadata) FileFormatDataSourceStats(io.prestosql.plugin.hive.FileFormatDataSourceStats) List(java.util.List) ConnectorPageSource(io.prestosql.spi.connector.ConnectorPageSource) Domain(io.prestosql.spi.predicate.Domain) Optional(java.util.Optional) ORC(io.prestosql.plugin.hive.HiveStorageFormat.ORC) NAME(io.airlift.tpch.NationColumn.NAME) NationColumn(io.airlift.tpch.NationColumn) HiveTypeTranslator(io.prestosql.plugin.hive.HiveTypeTranslator) Assert.assertEquals(org.testng.Assert.assertEquals) ArrayList(java.util.ArrayList) NationGenerator(io.airlift.tpch.NationGenerator) OptionalLong(java.util.OptionalLong) REGULAR(io.prestosql.plugin.hive.HiveColumnHandle.ColumnType.REGULAR) VARCHAR(io.prestosql.spi.type.VarcharType.VARCHAR) ImmutableList(com.google.common.collect.ImmutableList) HivePageSourceFactory(io.prestosql.plugin.hive.HivePageSourceFactory) 
InternalTypeManager(io.prestosql.type.InternalTypeManager) Properties(java.util.Properties) DeleteDeltaLocations(io.prestosql.plugin.hive.DeleteDeltaLocations) TupleDomain(io.prestosql.spi.predicate.TupleDomain) TypeManager(io.prestosql.spi.type.TypeManager) Page(io.prestosql.spi.Page) TABLE_IS_TRANSACTIONAL(org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.TABLE_IS_TRANSACTIONAL) File(java.io.File) JobConf(org.apache.hadoop.mapred.JobConf) NATION_KEY(io.airlift.tpch.NationColumn.NATION_KEY) OrcCacheStore(io.prestosql.orc.OrcCacheStore) REGION_KEY(io.airlift.tpch.NationColumn.REGION_KEY) FILE_INPUT_FORMAT(org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.FILE_INPUT_FORMAT)
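
The readFile method above looks up each column index with columnNames.indexOf and falls back to a placeholder (-42 or "<not read>") when a column was not requested. A minimal sketch of that fallback pattern in plain Java, without any Presto dependencies, might look like the following. The names ColumnProjectionSketch, NationRow, and project are hypothetical and introduced only for illustration, and rows are modeled as plain Object[] arrays rather than Page blocks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnProjectionSketch {
    // Placeholder values mirroring the ones used in readFile.
    static final long NOT_READ_LONG = -42;
    static final String NOT_READ_STRING = "<not read>";

    // Hypothetical holder for one decoded row; this is not io.airlift.tpch.Nation.
    static final class NationRow {
        final long nationKey;
        final String name;
        final long regionKey;
        final String comment;

        NationRow(long nationKey, String name, long regionKey, String comment) {
            this.nationKey = nationKey;
            this.name = name;
            this.regionKey = regionKey;
            this.comment = comment;
        }

        @Override
        public String toString() {
            return nationKey + "|" + name + "|" + regionKey + "|" + comment;
        }
    }

    // Each Object[] holds the values of one row, in the same order as columnNames.
    static List<NationRow> project(List<String> columnNames, List<Object[]> rows) {
        // indexOf returns -1 for a column that was not requested.
        int nationKeyColumn = columnNames.indexOf("n_nationkey");
        int nameColumn = columnNames.indexOf("n_name");
        int regionKeyColumn = columnNames.indexOf("n_regionkey");
        int commentColumn = columnNames.indexOf("n_comment");

        List<NationRow> result = new ArrayList<>();
        for (Object[] row : rows) {
            long nationKey = nationKeyColumn >= 0 ? (Long) row[nationKeyColumn] : NOT_READ_LONG;
            String name = nameColumn >= 0 ? (String) row[nameColumn] : NOT_READ_STRING;
            long regionKey = regionKeyColumn >= 0 ? (Long) row[regionKeyColumn] : NOT_READ_LONG;
            String comment = commentColumn >= 0 ? (String) row[commentColumn] : NOT_READ_STRING;
            result.add(new NationRow(nationKey, name, regionKey, comment));
        }
        return result;
    }

    public static void main(String[] args) {
        // Only the name and comment columns are requested; the values are sample data.
        List<String> columnNames = Arrays.asList("n_name", "n_comment");
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"ALGERIA", "sample comment"});
        System.out.println(project(columnNames, rows));
    }
}
```

With only n_name and n_comment requested, the two key fields keep their -42 placeholder, so a caller can tell a column that was never read apart from one that was read.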

Example 2 with COMMENT

Use of io.airlift.tpch.NationColumn.COMMENT in project boostkit-bigdata by kunpengcompute.

From the class TestOrcAcidPageSource, method readFile.

private static List<Nation> readFile(Map<NationColumn, Integer> columns, TupleDomain<HiveColumnHandle> tupleDomain, Optional<DeleteDeltaLocations> deleteDeltaLocations) {
    List<HiveColumnHandle> columnHandles = columns.entrySet().stream().map(column -> toHiveColumnHandle(column.getKey(), column.getValue())).collect(toImmutableList());
    List<String> columnNames = columnHandles.stream().map(HiveColumnHandle::getName).collect(toImmutableList());
    // This file contains the TPC-H nation table with each row repeated 1000 times
    File nationFileWithReplicatedRows = new File(TestOrcAcidPageSource.class.getClassLoader().getResource("nationFile25kRowsSortedOnNationKey/bucket_00000").getPath());
    ConnectorPageSource pageSource = PAGE_SOURCE_FACTORY.createPageSource(new JobConf(new Configuration(false)), HiveTestUtils.SESSION, new Path(nationFileWithReplicatedRows.getAbsoluteFile().toURI()), 0, nationFileWithReplicatedRows.length(), nationFileWithReplicatedRows.length(), createSchema(), columnHandles, tupleDomain, Optional.empty(), deleteDeltaLocations, Optional.empty(), Optional.empty(), null, false, -1L).get();
    int nationKeyColumn = columnNames.indexOf("n_nationkey");
    int nameColumn = columnNames.indexOf("n_name");
    int regionKeyColumn = columnNames.indexOf("n_regionkey");
    int commentColumn = columnNames.indexOf("n_comment");
    ImmutableList.Builder<Nation> rows = ImmutableList.builder();
    while (!pageSource.isFinished()) {
        Page page = pageSource.getNextPage();
        if (page == null) {
            continue;
        }
        page = page.getLoadedPage();
        for (int position = 0; position < page.getPositionCount(); position++) {
            long nationKey = -42;
            if (nationKeyColumn >= 0) {
                nationKey = BIGINT.getLong(page.getBlock(nationKeyColumn), position);
            }
            String name = "<not read>";
            if (nameColumn >= 0) {
                name = VARCHAR.getSlice(page.getBlock(nameColumn), position).toStringUtf8();
            }
            long regionKey = -42;
            if (regionKeyColumn >= 0) {
                regionKey = BIGINT.getLong(page.getBlock(regionKeyColumn), position);
            }
            String comment = "<not read>";
            if (commentColumn >= 0) {
                comment = VARCHAR.getSlice(page.getBlock(commentColumn), position).toStringUtf8();
            }
            rows.add(new Nation(position, nationKey, name, regionKey, comment));
        }
    }
    return rows.build();
}
Also used : HiveType.toHiveType(io.prestosql.plugin.hive.HiveType.toHiveType) Nation(io.airlift.tpch.Nation) Test(org.testng.annotations.Test) HiveColumnHandle(io.prestosql.plugin.hive.HiveColumnHandle) HiveConfig(io.prestosql.plugin.hive.HiveConfig) HiveTestUtils(io.prestosql.plugin.hive.HiveTestUtils) Configuration(org.apache.hadoop.conf.Configuration) Duration(java.time.Duration) Map(java.util.Map) Path(org.apache.hadoop.fs.Path) Type(io.prestosql.spi.type.Type) LongPredicate(java.util.function.LongPredicate) COMMENT(io.airlift.tpch.NationColumn.COMMENT) BIGINT(io.prestosql.spi.type.BigintType.BIGINT) SERIALIZATION_LIB(org.apache.hadoop.hive.serde.serdeConstants.SERIALIZATION_LIB) ImmutableMap(com.google.common.collect.ImmutableMap) MetadataManager.createTestMetadataManager(io.prestosql.metadata.MetadataManager.createTestMetadataManager) Collections.nCopies(java.util.Collections.nCopies) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) Set(java.util.Set) AcidUtils.deleteDeltaSubdir(org.apache.hadoop.hive.ql.io.AcidUtils.deleteDeltaSubdir) Metadata(io.prestosql.metadata.Metadata) FileFormatDataSourceStats(io.prestosql.plugin.hive.FileFormatDataSourceStats) List(java.util.List) ConnectorPageSource(io.prestosql.spi.connector.ConnectorPageSource) Domain(io.prestosql.spi.predicate.Domain) Optional(java.util.Optional) ORC(io.prestosql.plugin.hive.HiveStorageFormat.ORC) NAME(io.airlift.tpch.NationColumn.NAME) NationColumn(io.airlift.tpch.NationColumn) HiveTypeTranslator(io.prestosql.plugin.hive.HiveTypeTranslator) Assert.assertEquals(org.testng.Assert.assertEquals) ArrayList(java.util.ArrayList) NationGenerator(io.airlift.tpch.NationGenerator) OptionalLong(java.util.OptionalLong) REGULAR(io.prestosql.plugin.hive.HiveColumnHandle.ColumnType.REGULAR) VARCHAR(io.prestosql.spi.type.VarcharType.VARCHAR) ImmutableList(com.google.common.collect.ImmutableList) HivePageSourceFactory(io.prestosql.plugin.hive.HivePageSourceFactory) 
InternalTypeManager(io.prestosql.type.InternalTypeManager) Properties(java.util.Properties) DeleteDeltaLocations(io.prestosql.plugin.hive.DeleteDeltaLocations) TupleDomain(io.prestosql.spi.predicate.TupleDomain) TypeManager(io.prestosql.spi.type.TypeManager) Page(io.prestosql.spi.Page) TABLE_IS_TRANSACTIONAL(org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.TABLE_IS_TRANSACTIONAL) File(java.io.File) JobConf(org.apache.hadoop.mapred.JobConf) NATION_KEY(io.airlift.tpch.NationColumn.NATION_KEY) OrcCacheStore(io.prestosql.orc.OrcCacheStore) REGION_KEY(io.airlift.tpch.NationColumn.REGION_KEY) FILE_INPUT_FORMAT(org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.FILE_INPUT_FORMAT)

Aggregations

ImmutableList (com.google.common.collect.ImmutableList): 2
ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList): 2
ImmutableMap (com.google.common.collect.ImmutableMap): 2
Nation (io.airlift.tpch.Nation): 2
NationColumn (io.airlift.tpch.NationColumn): 2
COMMENT (io.airlift.tpch.NationColumn.COMMENT): 2
NAME (io.airlift.tpch.NationColumn.NAME): 2
NATION_KEY (io.airlift.tpch.NationColumn.NATION_KEY): 2
REGION_KEY (io.airlift.tpch.NationColumn.REGION_KEY): 2
NationGenerator (io.airlift.tpch.NationGenerator): 2
Metadata (io.prestosql.metadata.Metadata): 2
MetadataManager.createTestMetadataManager (io.prestosql.metadata.MetadataManager.createTestMetadataManager): 2
OrcCacheStore (io.prestosql.orc.OrcCacheStore): 2
DeleteDeltaLocations (io.prestosql.plugin.hive.DeleteDeltaLocations): 2
FileFormatDataSourceStats (io.prestosql.plugin.hive.FileFormatDataSourceStats): 2
HiveColumnHandle (io.prestosql.plugin.hive.HiveColumnHandle): 2
REGULAR (io.prestosql.plugin.hive.HiveColumnHandle.ColumnType.REGULAR): 2
HiveConfig (io.prestosql.plugin.hive.HiveConfig): 2
HivePageSourceFactory (io.prestosql.plugin.hive.HivePageSourceFactory): 2
ORC (io.prestosql.plugin.hive.HiveStorageFormat.ORC): 2