Example 1 with Stream

use of io.trino.orc.metadata.Stream in project trino by trinodb.

the class ListColumnWriter method getIndexStreams.

@Override
public List<StreamDataOutput> getIndexStreams(CompressedMetadataWriter metadataWriter) throws IOException {
    checkState(closed);
    ImmutableList.Builder<RowGroupIndex> rowGroupIndexes = ImmutableList.builder();
    List<LongStreamCheckpoint> lengthCheckpoints = lengthStream.getCheckpoints();
    Optional<List<BooleanStreamCheckpoint>> presentCheckpoints = presentStream.getCheckpoints();
    for (int i = 0; i < rowGroupColumnStatistics.size(); i++) {
        int groupId = i;
        ColumnStatistics columnStatistics = rowGroupColumnStatistics.get(groupId);
        LongStreamCheckpoint lengthCheckpoint = lengthCheckpoints.get(groupId);
        Optional<BooleanStreamCheckpoint> presentCheckpoint = presentCheckpoints.map(checkpoints -> checkpoints.get(groupId));
        List<Integer> positions = createArrayColumnPositionList(compressed, lengthCheckpoint, presentCheckpoint);
        rowGroupIndexes.add(new RowGroupIndex(positions, columnStatistics));
    }
    Slice slice = metadataWriter.writeRowIndexes(rowGroupIndexes.build());
    Stream stream = new Stream(columnId, StreamKind.ROW_INDEX, slice.length(), false);
    ImmutableList.Builder<StreamDataOutput> indexStreams = ImmutableList.builder();
    indexStreams.add(new StreamDataOutput(slice, stream));
    indexStreams.addAll(elementWriter.getIndexStreams(metadataWriter));
    indexStreams.addAll(elementWriter.getBloomFilters(metadataWriter));
    return indexStreams.build();
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) BooleanStreamCheckpoint(io.trino.orc.checkpoint.BooleanStreamCheckpoint) ImmutableList(com.google.common.collect.ImmutableList) StreamDataOutput(io.trino.orc.stream.StreamDataOutput) LongStreamCheckpoint(io.trino.orc.checkpoint.LongStreamCheckpoint) RowGroupIndex(io.trino.orc.metadata.RowGroupIndex) Slice(io.airlift.slice.Slice) ArrayList(java.util.ArrayList) List(java.util.List) PresentOutputStream(io.trino.orc.stream.PresentOutputStream) Stream(io.trino.orc.metadata.Stream) LongOutputStream(io.trino.orc.stream.LongOutputStream) LongOutputStream.createLengthOutputStream(io.trino.orc.stream.LongOutputStream.createLengthOutputStream)
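
The loop in Example 1 copies `i` into `groupId` before using it inside the lambda passed to `Optional.map`. Java lambdas can only capture effectively final locals, and a loop counter that is incremented is not one. The following is a minimal sketch of that per-row-group pattern using only JDK types. `RowGroupPositionsSketch`, `buildPositions`, and the use of plain integers as checkpoints are hypothetical stand-ins, not Trino's API, and the ordering of positions is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class RowGroupPositionsSketch {
    // Hypothetical stand-in: each checkpoint is a single Integer position.
    // Builds one position list per row group; the present position (if any)
    // is placed before the length position (ordering assumed for illustration).
    static List<List<Integer>> buildPositions(
            List<Integer> lengthCheckpoints,
            Optional<List<Integer>> presentCheckpoints) {
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < lengthCheckpoints.size(); i++) {
            // Effectively final copy so the lambda below may capture it.
            int groupId = i;
            Optional<Integer> present = presentCheckpoints.map(checkpoints -> checkpoints.get(groupId));
            List<Integer> positions = new ArrayList<>();
            present.ifPresent(positions::add);
            positions.add(lengthCheckpoints.get(groupId));
            result.add(positions);
        }
        return result;
    }

    public static void main(String[] args) {
        // With present checkpoints: prints [[1, 10], [2, 20]]
        System.out.println(buildPositions(List.of(10, 20), Optional.of(List.of(1, 2))));
        // Without present checkpoints: prints [[10], [20]]
        System.out.println(buildPositions(List.of(10, 20), Optional.empty()));
    }
}
```

When the optional list is absent, `map` is never invoked, so no index lookup happens and each row group only carries its length position.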

Example 2 with Stream

use of io.trino.orc.metadata.Stream in project trino by trinodb.

the class SliceDirectColumnWriter method getIndexStreams.

@Override
public List<StreamDataOutput> getIndexStreams(CompressedMetadataWriter metadataWriter) throws IOException {
    checkState(closed);
    ImmutableList.Builder<RowGroupIndex> rowGroupIndexes = ImmutableList.builder();
    List<LongStreamCheckpoint> lengthCheckpoints = lengthStream.getCheckpoints();
    List<ByteArrayStreamCheckpoint> dataCheckpoints = dataStream.getCheckpoints();
    Optional<List<BooleanStreamCheckpoint>> presentCheckpoints = presentStream.getCheckpoints();
    for (int i = 0; i < rowGroupColumnStatistics.size(); i++) {
        int groupId = i;
        ColumnStatistics columnStatistics = rowGroupColumnStatistics.get(groupId);
        LongStreamCheckpoint lengthCheckpoint = lengthCheckpoints.get(groupId);
        ByteArrayStreamCheckpoint dataCheckpoint = dataCheckpoints.get(groupId);
        Optional<BooleanStreamCheckpoint> presentCheckpoint = presentCheckpoints.map(checkpoints -> checkpoints.get(groupId));
        List<Integer> positions = createSliceColumnPositionList(compressed, lengthCheckpoint, dataCheckpoint, presentCheckpoint);
        rowGroupIndexes.add(new RowGroupIndex(positions, columnStatistics));
    }
    Slice slice = metadataWriter.writeRowIndexes(rowGroupIndexes.build());
    Stream stream = new Stream(columnId, StreamKind.ROW_INDEX, slice.length(), false);
    return ImmutableList.of(new StreamDataOutput(slice, stream));
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) BooleanStreamCheckpoint(io.trino.orc.checkpoint.BooleanStreamCheckpoint) ByteArrayStreamCheckpoint(io.trino.orc.checkpoint.ByteArrayStreamCheckpoint) ImmutableList(com.google.common.collect.ImmutableList) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) StreamDataOutput(io.trino.orc.stream.StreamDataOutput) LongStreamCheckpoint(io.trino.orc.checkpoint.LongStreamCheckpoint) RowGroupIndex(io.trino.orc.metadata.RowGroupIndex) Slice(io.airlift.slice.Slice) ArrayList(java.util.ArrayList) List(java.util.List) PresentOutputStream(io.trino.orc.stream.PresentOutputStream) Stream(io.trino.orc.metadata.Stream) LongOutputStream(io.trino.orc.stream.LongOutputStream) LongOutputStream.createLengthOutputStream(io.trino.orc.stream.LongOutputStream.createLengthOutputStream) ByteArrayOutputStream(io.trino.orc.stream.ByteArrayOutputStream)

Example 3 with Stream

use of io.trino.orc.metadata.Stream in project trino by trinodb.

the class FloatColumnWriter method getBloomFilters.

@Override
public List<StreamDataOutput> getBloomFilters(CompressedMetadataWriter metadataWriter) throws IOException {
    List<BloomFilter> bloomFilters = rowGroupColumnStatistics.stream().map(ColumnStatistics::getBloomFilter).filter(Objects::nonNull).collect(toImmutableList());
    if (!bloomFilters.isEmpty()) {
        Slice slice = metadataWriter.writeBloomFilters(bloomFilters);
        Stream stream = new Stream(columnId, StreamKind.BLOOM_FILTER_UTF8, slice.length(), false);
        return ImmutableList.of(new StreamDataOutput(slice, stream));
    }
    return ImmutableList.of();
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) Slice(io.airlift.slice.Slice) PresentOutputStream(io.trino.orc.stream.PresentOutputStream) Stream(io.trino.orc.metadata.Stream) FloatOutputStream(io.trino.orc.stream.FloatOutputStream) StreamDataOutput(io.trino.orc.stream.StreamDataOutput) BloomFilter(io.trino.orc.metadata.statistics.BloomFilter)
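
Example 3 gathers the non-null per-row-group Bloom filters and produces a stream only when at least one exists; otherwise it returns an empty list. A rough sketch of that shape with plain JDK types is shown here. `BloomFilterStreamSketch`, `bloomFilterStreams`, and the string placeholders are hypothetical and do not correspond to Trino classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class BloomFilterStreamSketch {
    // Hypothetical stand-in: a filter is a String (null when a row group has none),
    // and the resulting "stream" is a single summary String.
    static List<String> bloomFilterStreams(List<String> perRowGroupFilters) {
        List<String> filters = perRowGroupFilters.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
        if (!filters.isEmpty()) {
            return List.of("BLOOM_FILTER[" + filters.size() + "]");
        }
        // No row group carried a filter, so no stream is emitted.
        return List.of();
    }

    public static void main(String[] args) {
        // prints [BLOOM_FILTER[2]]
        System.out.println(bloomFilterStreams(Arrays.asList("a", null, "b")));
        // prints []
        System.out.println(bloomFilterStreams(Arrays.<String>asList(null, null)));
    }
}
```

Returning an empty list rather than `null` lets callers append the result unconditionally, as Example 1 does with `addAll`.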

Example 4 with Stream

use of io.trino.orc.metadata.Stream in project trino by trinodb.

the class ByteColumnWriter method getIndexStreams.

@Override
public List<StreamDataOutput> getIndexStreams(CompressedMetadataWriter metadataWriter) throws IOException {
    checkState(closed);
    ImmutableList.Builder<RowGroupIndex> rowGroupIndexes = ImmutableList.builder();
    List<ByteStreamCheckpoint> dataCheckpoints = dataStream.getCheckpoints();
    Optional<List<BooleanStreamCheckpoint>> presentCheckpoints = presentStream.getCheckpoints();
    for (int i = 0; i < rowGroupColumnStatistics.size(); i++) {
        int groupId = i;
        ColumnStatistics columnStatistics = rowGroupColumnStatistics.get(groupId);
        ByteStreamCheckpoint dataCheckpoint = dataCheckpoints.get(groupId);
        Optional<BooleanStreamCheckpoint> presentCheckpoint = presentCheckpoints.map(checkpoints -> checkpoints.get(groupId));
        List<Integer> positions = createByteColumnPositionList(compressed, dataCheckpoint, presentCheckpoint);
        rowGroupIndexes.add(new RowGroupIndex(positions, columnStatistics));
    }
    Slice slice = metadataWriter.writeRowIndexes(rowGroupIndexes.build());
    Stream stream = new Stream(columnId, StreamKind.ROW_INDEX, slice.length(), false);
    return ImmutableList.of(new StreamDataOutput(slice, stream));
}
Also used : ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) BooleanStreamCheckpoint(io.trino.orc.checkpoint.BooleanStreamCheckpoint) ImmutableList(com.google.common.collect.ImmutableList) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) ByteStreamCheckpoint(io.trino.orc.checkpoint.ByteStreamCheckpoint) StreamDataOutput(io.trino.orc.stream.StreamDataOutput) RowGroupIndex(io.trino.orc.metadata.RowGroupIndex) Slice(io.airlift.slice.Slice) ArrayList(java.util.ArrayList) List(java.util.List) PresentOutputStream(io.trino.orc.stream.PresentOutputStream) Stream(io.trino.orc.metadata.Stream) ByteOutputStream(io.trino.orc.stream.ByteOutputStream)

Example 5 with Stream

use of io.trino.orc.metadata.Stream in project trino by trinodb.

the class PresentOutputStream method getStreamDataOutput.

public Optional<StreamDataOutput> getStreamDataOutput(OrcColumnId columnId) {
    checkArgument(closed);
    if (booleanOutputStream == null) {
        return Optional.empty();
    }
    StreamDataOutput streamDataOutput = booleanOutputStream.getStreamDataOutput(columnId);
    // rewrite the DATA stream created by the boolean output stream to a PRESENT stream
    Stream stream = new Stream(columnId, PRESENT, toIntExact(streamDataOutput.size()), streamDataOutput.getStream().isUseVInts());
    return Optional.of(new StreamDataOutput(sliceOutput -> {
        streamDataOutput.writeData(sliceOutput);
        return stream.getLength();
    }, stream));
}
Also used : OrcOutputBuffer(io.trino.orc.OrcOutputBuffer) PRESENT(io.trino.orc.metadata.Stream.StreamKind.PRESENT) BooleanStreamCheckpoint(io.trino.orc.checkpoint.BooleanStreamCheckpoint) CompressionKind(io.trino.orc.metadata.CompressionKind) Stream(io.trino.orc.metadata.Stream) ArrayList(java.util.ArrayList) Preconditions.checkState(com.google.common.base.Preconditions.checkState) List(java.util.List) Preconditions.checkArgument(com.google.common.base.Preconditions.checkArgument) ClassLayout(org.openjdk.jol.info.ClassLayout) Optional(java.util.Optional) Math.toIntExact(java.lang.Math.toIntExact) Nullable(javax.annotation.Nullable) OrcColumnId(io.trino.orc.metadata.OrcColumnId)
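
Example 5 relabels an existing output: it keeps the bytes and length produced by the underlying boolean stream but attaches a new descriptor whose kind is `PRESENT`, delegating the actual write to the original. The sketch below illustrates that wrapping idea under simplified assumptions. `RelabelStreamSketch`, `LabeledOutput`, and `relabel` are hypothetical names, and a `StringBuilder` stands in for the slice output.

```java
import java.util.function.ToLongFunction;

public class RelabelStreamSketch {
    // Hypothetical stand-in for a stream descriptor paired with a writer.
    static final class LabeledOutput {
        final String kind;
        final long length;
        final ToLongFunction<StringBuilder> writer;

        LabeledOutput(String kind, long length, ToLongFunction<StringBuilder> writer) {
            this.kind = kind;
            this.length = length;
            this.writer = writer;
        }

        // Writes the data to the sink and returns the number of units written.
        long writeTo(StringBuilder out) {
            return writer.applyAsLong(out);
        }
    }

    // Wraps the original output under a new kind; the data and length are unchanged.
    static LabeledOutput relabel(LabeledOutput original, String newKind) {
        return new LabeledOutput(newKind, original.length, out -> {
            original.writeTo(out);
            return original.length;
        });
    }

    public static void main(String[] args) {
        LabeledOutput data = new LabeledOutput("DATA", 3L, out -> {
            out.append("abc");
            return 3L;
        });
        LabeledOutput present = relabel(data, "PRESENT");
        StringBuilder sink = new StringBuilder();
        long written = present.writeTo(sink);
        // prints PRESENT 3 abc
        System.out.println(present.kind + " " + written + " " + sink);
    }
}
```

Because the wrapper delegates to the original writer, the data is not copied; only the descriptor that accompanies it changes.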

Aggregations

Stream (io.trino.orc.metadata.Stream) 33
Slice (io.airlift.slice.Slice) 23
ColumnStatistics (io.trino.orc.metadata.statistics.ColumnStatistics) 23
StreamDataOutput (io.trino.orc.stream.StreamDataOutput) 20
ArrayList (java.util.ArrayList) 20
List (java.util.List) 20
ImmutableList (com.google.common.collect.ImmutableList) 19
PresentOutputStream (io.trino.orc.stream.PresentOutputStream) 18
RowGroupIndex (io.trino.orc.metadata.RowGroupIndex) 16
BooleanStreamCheckpoint (io.trino.orc.checkpoint.BooleanStreamCheckpoint) 14
OrcColumnId (io.trino.orc.metadata.OrcColumnId) 11
BloomFilter (io.trino.orc.metadata.statistics.BloomFilter) 9
OrcInputStream (io.trino.orc.stream.OrcInputStream) 9
InputStream (java.io.InputStream) 9
ImmutableMap (com.google.common.collect.ImmutableMap) 8
LongOutputStream (io.trino.orc.stream.LongOutputStream) 8
ValueInputStream (io.trino.orc.stream.ValueInputStream) 8
ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList) 7
ImmutableMap.toImmutableMap (com.google.common.collect.ImmutableMap.toImmutableMap) 7
StreamCheckpoint (io.trino.orc.checkpoint.StreamCheckpoint) 6