Example 6 with CodedInputStream

Use of io.trino.orc.protobuf.CodedInputStream in project trino by trinodb.

Class OrcMetadataReader, method readRowIndexes.

@Override
public List<RowGroupIndex> readRowIndexes(HiveWriterVersion hiveWriterVersion, InputStream inputStream) throws IOException {
    // Wrap the raw stream so the generated protobuf parser can read from it.
    CodedInputStream input = CodedInputStream.newInstance(inputStream);
    OrcProto.RowIndex rowIndex = OrcProto.RowIndex.parseFrom(input);
    // Convert each protobuf row index entry into a RowGroupIndex.
    return rowIndex.getEntryList().stream()
            .map(rowIndexEntry -> toRowGroupIndex(hiveWriterVersion, rowIndexEntry))
            .collect(toImmutableList());
}
Also used : BINARY_VALUE_BYTES_OVERHEAD(io.trino.orc.metadata.statistics.BinaryStatistics.BINARY_VALUE_BYTES_OVERHEAD) DATE_VALUE_BYTES(io.trino.orc.metadata.statistics.DateStatistics.DATE_VALUE_BYTES) IntegerStatistics(io.trino.orc.metadata.statistics.IntegerStatistics) ORC_HIVE_8732(io.trino.orc.metadata.PostScript.HiveWriterVersion.ORC_HIVE_8732) OrcTypeKind(io.trino.orc.metadata.OrcType.OrcTypeKind) TIMESTAMP_VALUE_BYTES(io.trino.orc.metadata.statistics.TimestampStatistics.TIMESTAMP_VALUE_BYTES) DecimalStatistics(io.trino.orc.metadata.statistics.DecimalStatistics) NONE(io.trino.orc.metadata.CompressionKind.NONE) BigDecimal(java.math.BigDecimal) Preconditions.checkArgument(com.google.common.base.Preconditions.checkArgument) DoubleStatistics(io.trino.orc.metadata.statistics.DoubleStatistics) BOOLEAN_VALUE_BYTES(io.trino.orc.metadata.statistics.BooleanStatistics.BOOLEAN_VALUE_BYTES) DOUBLE_VALUE_BYTES(io.trino.orc.metadata.statistics.DoubleStatistics.DOUBLE_VALUE_BYTES) Slices(io.airlift.slice.Slices) Map(java.util.Map) SliceUtf8.tryGetCodePointAt(io.airlift.slice.SliceUtf8.tryGetCodePointAt) DateStatistics(io.trino.orc.metadata.statistics.DateStatistics) ByteString(io.trino.orc.protobuf.ByteString) Longs(com.google.common.primitives.Longs) ImmutableMap(com.google.common.collect.ImmutableMap) RowIndexEntry(io.trino.orc.proto.OrcProto.RowIndexEntry) HiveWriterVersion(io.trino.orc.metadata.PostScript.HiveWriterVersion) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) OrcProto(io.trino.orc.proto.OrcProto) ZoneId(java.time.ZoneId) ColumnEncodingKind(io.trino.orc.metadata.ColumnEncoding.ColumnEncodingKind) ByteOrder(java.nio.ByteOrder) DataSize(io.airlift.units.DataSize) List(java.util.List) DECIMAL_VALUE_BYTES_OVERHEAD(io.trino.orc.metadata.statistics.DecimalStatistics.DECIMAL_VALUE_BYTES_OVERHEAD) Optional(java.util.Optional) MIN_SUPPLEMENTARY_CODE_POINT(java.lang.Character.MIN_SUPPLEMENTARY_CODE_POINT) 
ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) BinaryStatistics(io.trino.orc.metadata.statistics.BinaryStatistics) Slice(io.airlift.slice.Slice) CodedInputStream(io.trino.orc.protobuf.CodedInputStream) OptionalInt(java.util.OptionalInt) ZSTD(io.trino.orc.metadata.CompressionKind.ZSTD) StringStatistics(io.trino.orc.metadata.statistics.StringStatistics) GIGABYTE(io.airlift.units.DataSize.Unit.GIGABYTE) ImmutableList(com.google.common.collect.ImmutableList) SHORT_DECIMAL_VALUE_BYTES(io.trino.orc.metadata.statistics.ShortDecimalStatisticsBuilder.SHORT_DECIMAL_VALUE_BYTES) SNAPPY(io.trino.orc.metadata.CompressionKind.SNAPPY) Math.toIntExact(java.lang.Math.toIntExact) StreamKind(io.trino.orc.metadata.Stream.StreamKind) SliceUtf8.lengthOfCodePoint(io.airlift.slice.SliceUtf8.lengthOfCodePoint) BloomFilter(io.trino.orc.metadata.statistics.BloomFilter) IOException(java.io.IOException) ZLIB(io.trino.orc.metadata.CompressionKind.ZLIB) TimestampStatistics(io.trino.orc.metadata.statistics.TimestampStatistics) Strings.emptyToNull(com.google.common.base.Strings.emptyToNull) LZ4(io.trino.orc.metadata.CompressionKind.LZ4) STRING_VALUE_BYTES_OVERHEAD(io.trino.orc.metadata.statistics.StringStatistics.STRING_VALUE_BYTES_OVERHEAD) BooleanStatistics(io.trino.orc.metadata.statistics.BooleanStatistics) StripeStatistics(io.trino.orc.metadata.statistics.StripeStatistics) ORIGINAL(io.trino.orc.metadata.PostScript.HiveWriterVersion.ORIGINAL) VisibleForTesting(com.google.common.annotations.VisibleForTesting) INTEGER_VALUE_BYTES(io.trino.orc.metadata.statistics.IntegerStatistics.INTEGER_VALUE_BYTES) InputStream(java.io.InputStream) CodedInputStream(io.trino.orc.protobuf.CodedInputStream) OrcProto(io.trino.orc.proto.OrcProto)
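
As background for the example above: CodedInputStream reads protobuf wire-format data, in which integers are stored as base-128 varints (7 payload bits per byte, with the high bit marking continuation). The following is a minimal stdlib-only sketch of that decoding step. The class VarintSketch and the method readVarint32 are hypothetical names for illustration, not Trino's or protobuf's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class; not part of Trino or protobuf.
final class VarintSketch {
    // Decode one unsigned base-128 varint (at most 5 bytes) into an int.
    static int readVarint32(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("truncated varint");
            }
            // The low 7 bits carry payload, least significant group first.
            result |= (b & 0x7F) << shift;
            // A clear high bit marks the last byte of the varint.
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("malformed varint: more than 5 bytes");
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = {(byte) 0xAC, 0x02};
        System.out.println(readVarint32(new ByteArrayInputStream(encoded))); // prints 300
    }
}
```

In real code the generated parser (here OrcProto.RowIndex.parseFrom) drives this kind of decoding internally, so callers normally never read varints by hand.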

Example 7 with CodedInputStream

Use of io.trino.orc.protobuf.CodedInputStream in project trino by trinodb.

Class OrcMetadataReader, method readFooter.

@Override
public Footer readFooter(HiveWriterVersion hiveWriterVersion, InputStream inputStream) throws IOException {
    CodedInputStream input = CodedInputStream.newInstance(inputStream);
    input.setSizeLimit(PROTOBUF_MESSAGE_MAX_LIMIT);
    OrcProto.Footer footer = OrcProto.Footer.parseFrom(input);
    return new Footer(
            footer.getNumberOfRows(),
            footer.getRowIndexStride() == 0 ? OptionalInt.empty() : OptionalInt.of(footer.getRowIndexStride()),
            toStripeInformation(footer.getStripesList()),
            toType(footer.getTypesList()),
            toColumnStatistics(hiveWriterVersion, footer.getStatisticsList(), false),
            toUserMetadata(footer.getMetadataList()),
            Optional.of(footer.getWriter()));
}
Also used : CodedInputStream(io.trino.orc.protobuf.CodedInputStream) OrcProto(io.trino.orc.proto.OrcProto)
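
The readFooter example maps a row index stride of 0 to an empty OptionalInt, presumably because the protobuf getter returns 0 when the field was never set. A minimal stdlib-only sketch of that mapping follows. The class StrideSketch and the method toOptionalStride are hypothetical names introduced here for illustration; they do not exist in Trino.

```java
import java.util.OptionalInt;

// Hypothetical illustration class; not part of Trino.
final class StrideSketch {
    // Treat a raw stride of 0 as "not set" and expose it as an empty OptionalInt.
    static OptionalInt toOptionalStride(int rawStride) {
        return rawStride == 0 ? OptionalInt.empty() : OptionalInt.of(rawStride);
    }

    public static void main(String[] args) {
        System.out.println(toOptionalStride(0));     // prints OptionalInt.empty
        System.out.println(toOptionalStride(10000)); // prints OptionalInt[10000]
    }
}
```

Returning OptionalInt makes the "no stride recorded" case explicit to callers, instead of leaving 0 as a sentinel they have to remember to check.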

Aggregations

OrcProto (io.trino.orc.proto.OrcProto)7 CodedInputStream (io.trino.orc.protobuf.CodedInputStream)7 BloomFilter (io.trino.orc.metadata.statistics.BloomFilter)3 ImmutableList (com.google.common.collect.ImmutableList)2 ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList)2 Slice (io.airlift.slice.Slice)2 ZoneId (java.time.ZoneId)2 VisibleForTesting (com.google.common.annotations.VisibleForTesting)1 Preconditions.checkArgument (com.google.common.base.Preconditions.checkArgument)1 Strings.emptyToNull (com.google.common.base.Strings.emptyToNull)1 ImmutableMap (com.google.common.collect.ImmutableMap)1 Longs (com.google.common.primitives.Longs)1 SliceUtf8.lengthOfCodePoint (io.airlift.slice.SliceUtf8.lengthOfCodePoint)1 SliceUtf8.tryGetCodePointAt (io.airlift.slice.SliceUtf8.tryGetCodePointAt)1 Slices (io.airlift.slice.Slices)1 DataSize (io.airlift.units.DataSize)1 GIGABYTE (io.airlift.units.DataSize.Unit.GIGABYTE)1 TupleDomainOrcPredicate.checkInBloomFilter (io.trino.orc.TupleDomainOrcPredicate.checkInBloomFilter)1 ColumnEncodingKind (io.trino.orc.metadata.ColumnEncoding.ColumnEncodingKind)1 CompressedMetadataWriter (io.trino.orc.metadata.CompressedMetadataWriter)1