
Example 21 with ColumnStatistics

Use of io.trino.orc.metadata.statistics.ColumnStatistics in project trino by trinodb.

From the class DoubleColumnWriter, method getBloomFilters.

@Override
public List<StreamDataOutput> getBloomFilters(CompressedMetadataWriter metadataWriter) throws IOException {
    List<BloomFilter> bloomFilters = rowGroupColumnStatistics.stream()
            .map(ColumnStatistics::getBloomFilter)
            .filter(Objects::nonNull)
            .collect(toImmutableList());
    if (!bloomFilters.isEmpty()) {
        Slice slice = metadataWriter.writeBloomFilters(bloomFilters);
        Stream stream = new Stream(columnId, StreamKind.BLOOM_FILTER_UTF8, slice.length(), false);
        return ImmutableList.of(new StreamDataOutput(slice, stream));
    }
    return ImmutableList.of();
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) Slice(io.airlift.slice.Slice) PresentOutputStream(io.trino.orc.stream.PresentOutputStream) DoubleOutputStream(io.trino.orc.stream.DoubleOutputStream) Stream(io.trino.orc.metadata.Stream) StreamDataOutput(io.trino.orc.stream.StreamDataOutput) BloomFilter(io.trino.orc.metadata.statistics.BloomFilter)
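
The method above keeps only the row groups that actually carry a bloom filter and emits a stream only when at least one exists. A minimal, self-contained sketch of that filter-then-emit shape follows. The class BloomFilterCollectSketch, its nested Stats type, and the use of a String in place of a real bloom filter are hypothetical stand-ins for illustration, not Trino classes.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these types are part of Trino.
final class BloomFilterCollectSketch {
    // Stand-in for per-row-group statistics; bloomFilter may be null.
    static final class Stats {
        final String bloomFilter;

        Stats(String bloomFilter) {
            this.bloomFilter = bloomFilter;
        }

        String getBloomFilter() {
            return bloomFilter;
        }
    }

    // Returns one combined output when at least one filter exists,
    // otherwise an empty list (no stream is produced at all).
    static List<String> bloomFilterOutputs(List<Stats> rowGroupStats) {
        List<String> filters = rowGroupStats.stream()
                .map(Stats::getBloomFilter)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
        if (filters.isEmpty()) {
            return List.of();
        }
        return List.of(String.join(",", filters));
    }

    public static void main(String[] args) {
        List<Stats> stats = List.of(new Stats("a"), new Stats(null), new Stats("b"));
        System.out.println(bloomFilterOutputs(stats));
    }
}
```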

Example 22 with ColumnStatistics

Use of io.trino.orc.metadata.statistics.ColumnStatistics in project trino by trinodb.

From the class FloatColumnWriter, method finishRowGroup.

@Override
public Map<OrcColumnId, ColumnStatistics> finishRowGroup() {
    checkState(!closed);
    ColumnStatistics statistics = statisticsBuilder.buildColumnStatistics();
    rowGroupColumnStatistics.add(statistics);
    statisticsBuilder = statisticsBuilderSupplier.get();
    return ImmutableMap.of(columnId, statistics);
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics)
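
The finishRowGroup method above follows a snapshot, store, reset sequence: it builds the statistics for the current row group, appends them to the per-row-group list, and then takes a fresh builder from the supplier so the next row group starts empty. The sketch below reproduces only that sequence. RowGroupStatsSketch, Stats, and StatsBuilder are hypothetical simplified types (a count and a sum), not the Trino statistics classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; these are simplified stand-ins, not Trino classes.
final class RowGroupStatsSketch {
    // Immutable statistics snapshot for one row group.
    static final class Stats {
        final long count;
        final double sum;

        Stats(long count, double sum) {
            this.count = count;
            this.sum = sum;
        }
    }

    // Mutable accumulator for the row group currently being written.
    static final class StatsBuilder {
        private long count;
        private double sum;

        void addValue(double value) {
            count++;
            sum += value;
        }

        Stats build() {
            return new Stats(count, sum);
        }
    }

    private final Supplier<StatsBuilder> builderSupplier;
    private final List<Stats> rowGroupStats = new ArrayList<>();
    private StatsBuilder builder;

    RowGroupStatsSketch(Supplier<StatsBuilder> builderSupplier) {
        this.builderSupplier = builderSupplier;
        this.builder = builderSupplier.get();
    }

    void write(double value) {
        builder.addValue(value);
    }

    // Snapshot the current builder, remember the result, then start a fresh builder.
    Stats finishRowGroup() {
        Stats stats = builder.build();
        rowGroupStats.add(stats);
        builder = builderSupplier.get();
        return stats;
    }

    List<Stats> rowGroupStats() {
        return new ArrayList<>(rowGroupStats);
    }

    public static void main(String[] args) {
        RowGroupStatsSketch writer = new RowGroupStatsSketch(StatsBuilder::new);
        writer.write(1.0);
        writer.write(2.0);
        writer.finishRowGroup();
        writer.write(5.0);
        writer.finishRowGroup();
        System.out.println("row groups: " + writer.rowGroupStats().size());
    }
}
```

Taking the builder from a supplier, rather than calling a constructor directly, lets the caller decide which builder variant is used without changing the writer.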

Example 23 with ColumnStatistics

Use of io.trino.orc.metadata.statistics.ColumnStatistics in project trino by trinodb.

From the class FloatColumnWriter, method getIndexStreams.

@Override
public List<StreamDataOutput> getIndexStreams(CompressedMetadataWriter metadataWriter) throws IOException {
    checkState(closed);
    ImmutableList.Builder<RowGroupIndex> rowGroupIndexes = ImmutableList.builder();
    List<FloatStreamCheckpoint> dataCheckpoints = dataStream.getCheckpoints();
    Optional<List<BooleanStreamCheckpoint>> presentCheckpoints = presentStream.getCheckpoints();
    for (int i = 0; i < rowGroupColumnStatistics.size(); i++) {
        // Effectively final copy of i, so the lambda below can capture it.
        int groupId = i;
        ColumnStatistics columnStatistics = rowGroupColumnStatistics.get(groupId);
        FloatStreamCheckpoint dataCheckpoint = dataCheckpoints.get(groupId);
        Optional<BooleanStreamCheckpoint> presentCheckpoint = presentCheckpoints.map(checkpoints -> checkpoints.get(groupId));
        List<Integer> positions = createFloatColumnPositionList(compressed, dataCheckpoint, presentCheckpoint);
        rowGroupIndexes.add(new RowGroupIndex(positions, columnStatistics));
    }
    Slice slice = metadataWriter.writeRowIndexes(rowGroupIndexes.build());
    Stream stream = new Stream(columnId, StreamKind.ROW_INDEX, slice.length(), false);
    return ImmutableList.of(new StreamDataOutput(slice, stream));
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) BooleanStreamCheckpoint(io.trino.orc.checkpoint.BooleanStreamCheckpoint) ImmutableList(com.google.common.collect.ImmutableList) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) FloatStreamCheckpoint(io.trino.orc.checkpoint.FloatStreamCheckpoint) StreamDataOutput(io.trino.orc.stream.StreamDataOutput) RowGroupIndex(io.trino.orc.metadata.RowGroupIndex) Slice(io.airlift.slice.Slice) ArrayList(java.util.ArrayList) List(java.util.List) PresentOutputStream(io.trino.orc.stream.PresentOutputStream) Stream(io.trino.orc.metadata.Stream) FloatOutputStream(io.trino.orc.stream.FloatOutputStream)
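
In getIndexStreams the loop index is copied into groupId before use. Java lambdas may capture only final or effectively final local variables, and the loop variable i is modified on every iteration, so the copy is what lets the Optional.map lambda look up the present checkpoint for the same row group. The sketch below shows that pattern in isolation. CheckpointLookupSketch and its Integer "checkpoints" are hypothetical placeholders, not the Trino checkpoint types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; Integer values stand in for stream checkpoints.
final class CheckpointLookupSketch {
    // Pairs each data checkpoint with the present checkpoint of the same
    // row group, or with "none" when no present stream was written.
    static List<String> pair(List<Integer> dataCheckpoints, Optional<List<Integer>> presentCheckpoints) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < dataCheckpoints.size(); i++) {
            // i changes each iteration, so a lambda cannot capture it;
            // groupId is assigned once and is therefore effectively final.
            int groupId = i;
            Integer data = dataCheckpoints.get(groupId);
            Optional<Integer> present = presentCheckpoints.map(checkpoints -> checkpoints.get(groupId));
            result.add(data + ":" + present.map(Object::toString).orElse("none"));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pair(List.of(10, 20), Optional.of(List.of(1, 2))));
        System.out.println(pair(List.of(10, 20), Optional.empty()));
    }
}
```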

Example 24 with ColumnStatistics

Use of io.trino.orc.metadata.statistics.ColumnStatistics in project trino by trinodb.

From the class BooleanColumnWriter, method finishRowGroup.

@Override
public Map<OrcColumnId, ColumnStatistics> finishRowGroup() {
    checkState(!closed);
    ColumnStatistics statistics = statisticsBuilder.buildColumnStatistics();
    rowGroupColumnStatistics.add(statistics);
    statisticsBuilder = new BooleanStatisticsBuilder();
    return ImmutableMap.of(columnId, statistics);
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) BooleanStatisticsBuilder(io.trino.orc.metadata.statistics.BooleanStatisticsBuilder)

Example 25 with ColumnStatistics

Use of io.trino.orc.metadata.statistics.ColumnStatistics in project trino by trinodb.

From the class BooleanColumnWriter, method getIndexStreams.

@Override
public List<StreamDataOutput> getIndexStreams(CompressedMetadataWriter metadataWriter) throws IOException {
    checkState(closed);
    ImmutableList.Builder<RowGroupIndex> rowGroupIndexes = ImmutableList.builder();
    List<BooleanStreamCheckpoint> dataCheckpoints = dataStream.getCheckpoints();
    Optional<List<BooleanStreamCheckpoint>> presentCheckpoints = presentStream.getCheckpoints();
    for (int i = 0; i < rowGroupColumnStatistics.size(); i++) {
        // Effectively final copy of i, so the lambda below can capture it.
        int groupId = i;
        ColumnStatistics columnStatistics = rowGroupColumnStatistics.get(groupId);
        BooleanStreamCheckpoint dataCheckpoint = dataCheckpoints.get(groupId);
        Optional<BooleanStreamCheckpoint> presentCheckpoint = presentCheckpoints.map(checkpoints -> checkpoints.get(groupId));
        List<Integer> positions = createBooleanColumnPositionList(compressed, dataCheckpoint, presentCheckpoint);
        rowGroupIndexes.add(new RowGroupIndex(positions, columnStatistics));
    }
    Slice slice = metadataWriter.writeRowIndexes(rowGroupIndexes.build());
    Stream stream = new Stream(columnId, StreamKind.ROW_INDEX, slice.length(), false);
    return ImmutableList.of(new StreamDataOutput(slice, stream));
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) BooleanStreamCheckpoint(io.trino.orc.checkpoint.BooleanStreamCheckpoint) ImmutableList(com.google.common.collect.ImmutableList) StreamDataOutput(io.trino.orc.stream.StreamDataOutput) RowGroupIndex(io.trino.orc.metadata.RowGroupIndex) Slice(io.airlift.slice.Slice) ArrayList(java.util.ArrayList) List(java.util.List) PresentOutputStream(io.trino.orc.stream.PresentOutputStream) Stream(io.trino.orc.metadata.Stream) BooleanOutputStream(io.trino.orc.stream.BooleanOutputStream)

Aggregations

ColumnStatistics (io.trino.orc.metadata.statistics.ColumnStatistics) 45
Slice (io.airlift.slice.Slice) 23
Stream (io.trino.orc.metadata.Stream) 23
StreamDataOutput (io.trino.orc.stream.StreamDataOutput) 20
ArrayList (java.util.ArrayList) 20
ImmutableList (com.google.common.collect.ImmutableList) 19
List (java.util.List) 19
PresentOutputStream (io.trino.orc.stream.PresentOutputStream) 17
RowGroupIndex (io.trino.orc.metadata.RowGroupIndex) 15
BooleanStreamCheckpoint (io.trino.orc.checkpoint.BooleanStreamCheckpoint) 12
OrcColumnId (io.trino.orc.metadata.OrcColumnId) 12
ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList) 9
LongOutputStream (io.trino.orc.stream.LongOutputStream) 9
BloomFilter (io.trino.orc.metadata.statistics.BloomFilter) 8
ImmutableMap (com.google.common.collect.ImmutableMap) 7
LongStreamCheckpoint (io.trino.orc.checkpoint.LongStreamCheckpoint) 7
ColumnEncoding (io.trino.orc.metadata.ColumnEncoding) 5
ColumnMetadata (io.trino.orc.metadata.ColumnMetadata) 5
StripeFooter (io.trino.orc.metadata.StripeFooter) 5
LongOutputStream.createLengthOutputStream (io.trino.orc.stream.LongOutputStream.createLengthOutputStream) 5