Example 1 with MemoryColumnHandle

Use of io.prestosql.plugin.memory.MemoryColumnHandle in project hetu-core by openlookeng.

Class LogicalPart, method partitionPage.

static Map<String, Page> partitionPage(Page page, List<String> partitionedBy, List<MemoryColumnHandle> columns, TypeManager typeManager) {
    // derive the column handles that correspond to the partitionedBy list
    List<MemoryColumnHandle> partitionChannels = new ArrayList<>(partitionedBy.size());
    for (String name : partitionedBy) {
        for (MemoryColumnHandle handle : columns) {
            if (handle.getColumnName().equals(name)) {
                partitionChannels.add(handle);
            }
        }
    }
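    // build the partitions; only the first matching partition column is used,
    // so partitionedBy must name at least one existing column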
    Map<String, Page> partitions = new HashMap<>();
    MemoryColumnHandle partitionColumnHandle = partitionChannels.get(0);
    Block block = page.getBlock(partitionColumnHandle.getColumnIndex());
    Type type = partitionColumnHandle.getType(typeManager);
    Map<Object, ArrayList<Integer>> uniqueValues = new HashMap<>();
    for (int i = 0; i < page.getPositionCount(); i++) {
        Object value = getNativeValue(type, block, i);
        // HashMap permits a null key, so null partition values are grouped too
        uniqueValues.computeIfAbsent(value, k -> new ArrayList<>()).add(i);
    }
    for (Map.Entry<Object, ArrayList<Integer>> valueAndPosition : uniqueValues.entrySet()) {
        int[] retainedPositions = valueAndPosition.getValue().stream().mapToInt(i -> i).toArray();
        Object valueKey = valueAndPosition.getKey();
        Page subPage = page.getPositions(retainedPositions, 0, retainedPositions.length);
        // NOTE: a null partition key is allowed in this map, but when the
        // partition map is sent to the coordinator via MemoryDataFragment,
        // the JSON parser cannot handle null keys and ignores them.
        // Therefore, if the query predicate is for null, scheduling
        // MUST NOT do any partition filtering, because the coordinator's
        // partition map is missing the null partitions: the coordinator
        // must schedule all splits in that case.
        // see: MemorySplitManager#getSplits
        //
        // The other option is to use an empty string as the null key so the
        // JSON parser could send the key to the coordinator, but that would
        // conflict with actual empty string values.
        partitions.put(valueKey == null ? null : valueKey.toString(), subPage);
    }
    return partitions;
}
Also used : GZIPInputStream(java.util.zip.GZIPInputStream) ObjectInputStream(java.io.ObjectInputStream) SliceInput(io.airlift.slice.SliceInput) SortOrder(io.prestosql.spi.block.SortOrder) InputStreamSliceInput(io.airlift.slice.InputStreamSliceInput) Locale(java.util.Locale) Slices(io.airlift.slice.Slices) Map(java.util.Map) Type(io.prestosql.spi.type.Type) Path(java.nio.file.Path) Set(java.util.Set) NavigableMap(java.util.NavigableMap) SortedRangeSet(io.prestosql.spi.predicate.SortedRangeSet) Serializable(java.io.Serializable) DataSize(io.airlift.units.DataSize) List(java.util.List) Domain(io.prestosql.spi.predicate.Domain) GZIPOutputStream(java.util.zip.GZIPOutputStream) TypeSignature(io.prestosql.spi.type.TypeSignature) JsonCodec(io.airlift.json.JsonCodec) Slice(io.airlift.slice.Slice) Logger(io.airlift.log.Logger) SliceOutput(io.airlift.slice.SliceOutput) Marker(io.prestosql.spi.predicate.Marker) HashMap(java.util.HashMap) AtomicReference(java.util.concurrent.atomic.AtomicReference) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) OutputStreamSliceOutput(io.airlift.slice.OutputStreamSliceOutput) BloomFilter(io.prestosql.spi.util.BloomFilter) PagesSerde(io.hetu.core.transport.execution.buffer.PagesSerde) Range(io.prestosql.spi.predicate.Range) Objects.requireNonNull(java.util.Objects.requireNonNull) ObjectOutputStream(java.io.ObjectOutputStream) Block(io.prestosql.spi.block.Block) OutputStream(java.io.OutputStream) BaseEncoding(com.google.common.io.BaseEncoding) Files(java.nio.file.Files) TupleDomain(io.prestosql.spi.predicate.TupleDomain) TypeManager(io.prestosql.spi.type.TypeManager) Page(io.prestosql.spi.Page) IOException(java.io.IOException) PageSorter(io.prestosql.spi.PageSorter) TypeUtils(io.prestosql.spi.type.TypeUtils) AbstractMap(java.util.AbstractMap) TreeMap(java.util.TreeMap) ColumnHandle(io.prestosql.spi.connector.ColumnHandle) MemoryColumnHandle(io.prestosql.plugin.memory.MemoryColumnHandle) 
PagesSerdeUtil(io.hetu.core.transport.execution.buffer.PagesSerdeUtil) VisibleForTesting(com.google.common.annotations.VisibleForTesting) SortingColumn(io.prestosql.plugin.memory.SortingColumn) Collections(java.util.Collections) InputStream(java.io.InputStream)
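
The grouping step in partitionPage can be tried without any Presto types. The following is a minimal sketch, not the hetu-core implementation: the class PartitionSketch and the method groupPositions are hypothetical names, a plain List<Object> stands in for a Block, and positions are grouped directly by the string key rather than by the native value. It illustrates why a null key is kept separate from the empty string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionSketch {
    // Hypothetical helper: groups row positions by the string form of each value.
    // A null value is stored under a null key, which HashMap permits.
    static Map<String, int[]> groupPositions(List<Object> column) {
        Map<String, List<Integer>> positionsByKey = new HashMap<>();
        for (int i = 0; i < column.size(); i++) {
            Object value = column.get(i);
            String key = value == null ? null : value.toString();
            positionsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        // Convert each position list to an int[] of retained positions.
        Map<String, int[]> result = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : positionsByKey.entrySet()) {
            result.put(e.getKey(), e.getValue().stream().mapToInt(Integer::intValue).toArray());
        }
        return result;
    }

    public static void main(String[] args) {
        // Row 2 is null and row 4 is an empty string; they land in different groups.
        List<Object> column = Arrays.asList("a", "b", null, "a", "");
        Map<String, int[]> groups = groupPositions(column);
        groups.forEach((key, positions) ->
                System.out.println(key + " -> " + Arrays.toString(positions)));
    }
}
```

Replacing the null key with an empty string would merge rows 2 and 4 into one group, which is the conflict the comment in partitionPage warns about.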

Example 2 with MemoryColumnHandle

Use of io.prestosql.plugin.memory.MemoryColumnHandle in project hetu-core by openlookeng.

Class StatisticsUtils, method fromComputedStatistics.

public static TableStatisticsData fromComputedStatistics(ConnectorSession session, Collection<ComputedStatistics> computedStatistics, long rowCount, Map<String, ColumnHandle> columnHandles, TypeManager typeManager) {
    TableStatisticsData.Builder tableStatBuilder = TableStatisticsData.builder();
    tableStatBuilder.setRowCount(rowCount);
    // organize the column stats into a per-column view
    Map<String, Map<ColumnStatisticType, Block>> perColumnStats = new HashMap<>();
    for (ComputedStatistics stat : computedStatistics) {
        for (Map.Entry<ColumnStatisticMetadata, Block> entry : stat.getColumnStatistics().entrySet()) {
            perColumnStats.computeIfAbsent(entry.getKey().getColumnName(), k -> new HashMap<>())
                    .put(entry.getKey().getStatisticType(), entry.getValue());
        }
    }
    // build the per-column statistics
    for (Map.Entry<String, ColumnHandle> entry : columnHandles.entrySet()) {
        Map<ColumnStatisticType, Block> columnStat = perColumnStats.get(entry.getKey());
        if (columnStat == null) {
            continue;
        }
        MemoryColumnHandle handle = (MemoryColumnHandle) entry.getValue();
        Type columnType = handle.getType(typeManager);
        tableStatBuilder.setColumnStatistics(handle.getColumnName(), fromComputedStatistics(session, columnStat, rowCount, columnType));
    }
    return tableStatBuilder.build();
}
Also used : ColumnStatisticMetadata(io.prestosql.spi.statistics.ColumnStatisticMetadata) ColumnHandle(io.prestosql.spi.connector.ColumnHandle) MemoryColumnHandle(io.prestosql.plugin.memory.MemoryColumnHandle) HashMap(java.util.HashMap) Varchars.isVarcharType(io.prestosql.spi.type.Varchars.isVarcharType) DecimalType(io.prestosql.spi.type.DecimalType) ColumnStatisticType(io.prestosql.spi.statistics.ColumnStatisticType) MapType(io.prestosql.spi.type.MapType) RowType(io.prestosql.spi.type.RowType) Type(io.prestosql.spi.type.Type) Chars.isCharType(io.prestosql.spi.type.Chars.isCharType) ArrayType(io.prestosql.spi.type.ArrayType) ComputedStatistics(io.prestosql.spi.statistics.ComputedStatistics) Block(io.prestosql.spi.block.Block) Map(java.util.Map)
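
The first loop of fromComputedStatistics pivots a flat set of (column, statistic type) entries into a per-column view. The following is a minimal sketch of that pivot using only java.util, under stated assumptions: the class PerColumnPivotSketch, the StatEntry holder, and the pivot method are hypothetical, and strings and longs stand in for ColumnStatisticType and Block.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerColumnPivotSketch {
    // Hypothetical stand-in for one (column, statistic type, value) entry.
    static final class StatEntry {
        final String columnName;
        final String statisticType;
        final long value;

        StatEntry(String columnName, String statisticType, long value) {
            this.columnName = columnName;
            this.statisticType = statisticType;
            this.value = value;
        }
    }

    // Groups the flat entries into column name -> (statistic type -> value).
    static Map<String, Map<String, Long>> pivot(List<StatEntry> entries) {
        Map<String, Map<String, Long>> perColumn = new HashMap<>();
        for (StatEntry entry : entries) {
            perColumn.computeIfAbsent(entry.columnName, k -> new HashMap<>())
                    .put(entry.statisticType, entry.value);
        }
        return perColumn;
    }

    public static void main(String[] args) {
        List<StatEntry> entries = Arrays.asList(
                new StatEntry("price", "MIN_VALUE", 1L),
                new StatEntry("price", "MAX_VALUE", 90L),
                new StatEntry("name", "NUMBER_OF_DISTINCT_VALUES", 7L));
        System.out.println(pivot(entries));
    }
}
```

A column with no entries is simply absent from the result, which corresponds to the `columnStat == null` check that skips such columns in the second loop.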
