Example 11 with OrcInputStream

Use of com.facebook.presto.orc.stream.OrcInputStream in project presto by prestodb.

From the class AbstractTestDwrfStripeCaching, method readFileFooter.

static DwrfProto.Footer readFileFooter(File orcFile) {
    try (RandomAccessFile file = new RandomAccessFile(orcFile, "r")) {
        // read postscript size
        file.seek(file.length() - 1);
        int postScriptSize = file.read() & 0xff;
        // read postscript
        long postScriptPosition = file.length() - postScriptSize - 1;
        byte[] postScriptBytes = readBytes(file, postScriptPosition, postScriptSize);
        CodedInputStream postScriptInput = CodedInputStream.newInstance(postScriptBytes, 0, postScriptSize);
        DwrfProto.PostScript postScript = DwrfProto.PostScript.parseFrom(postScriptInput);
        // read footer
        long footerPosition = postScriptPosition - postScript.getFooterLength();
        int footerLength = toIntExact(postScript.getFooterLength());
        byte[] footerBytes = readBytes(file, footerPosition, footerLength);
        int compressionBufferSize = toIntExact(postScript.getCompressionBlockSize());
        OrcDataSourceId dataSourceId = new OrcDataSourceId(orcFile.getName());
        Optional<OrcDecompressor> decompressor = OrcDecompressor.createOrcDecompressor(dataSourceId, ZLIB, compressionBufferSize);
        InputStream footerInputStream = new OrcInputStream(dataSourceId, new SharedBuffer(NOOP_ORC_LOCAL_MEMORY_CONTEXT), Slices.wrappedBuffer(footerBytes).slice(0, footerLength).getInput(), decompressor, Optional.empty(), NOOP_ORC_AGGREGATED_MEMORY_CONTEXT, footerLength);
        return DwrfProto.Footer.parseFrom(footerInputStream);
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
Also used : OrcInputStream(com.facebook.presto.orc.stream.OrcInputStream) CodedInputStream(com.facebook.presto.orc.protobuf.CodedInputStream) InputStream(java.io.InputStream) UncheckedIOException(java.io.UncheckedIOException) DwrfProto(com.facebook.presto.orc.proto.DwrfProto) IOException(java.io.IOException) SharedBuffer(com.facebook.presto.orc.stream.SharedBuffer) RandomAccessFile(java.io.RandomAccessFile)
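
The offset arithmetic in readFileFooter can be tried without any Presto classes. The following is a minimal sketch using only the JDK; the class name OrcTailLayout and its methods are hypothetical, and readBytes here is an assumed stand-in for the helper of the same name that the test class calls but that is not shown above. It only locates the postscript and footer regions and does not parse or decompress them.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper class: illustrates the file-tail layout only.
public class OrcTailLayout {
    // Reads the last byte of the file as the postscript size and returns
    // {postScriptPosition, postScriptSize}.
    static long[] locatePostScript(RandomAccessFile file) throws IOException {
        long length = file.length();
        if (length < 1) {
            throw new IOException("file is empty");
        }
        file.seek(length - 1);
        int postScriptSize = file.read() & 0xff;
        // The postscript sits directly before the trailing size byte.
        long postScriptPosition = length - 1 - postScriptSize;
        if (postScriptPosition < 0) {
            throw new IOException("postscript size exceeds file length");
        }
        return new long[] {postScriptPosition, postScriptSize};
    }

    // The footer ends where the postscript begins.
    static long footerPosition(long postScriptPosition, long footerLength) {
        return Math.subtractExact(postScriptPosition, footerLength);
    }

    // Assumed equivalent of the readBytes helper used in the example above.
    static byte[] readBytes(RandomAccessFile file, long position, int length) throws IOException {
        byte[] buffer = new byte[length];
        file.seek(position);
        file.readFully(buffer);
        return buffer;
    }
}
```

For a 14-byte file whose last byte is 3, locatePostScript yields position 10 and size 3; a footer of length 4 would then start at offset 6.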

Example 12 with OrcInputStream

Use of com.facebook.presto.orc.stream.OrcInputStream in project presto by prestodb.

From the class StripeReader, method readDiskRanges.

private Map<StreamId, OrcInputStream> readDiskRanges(StripeId stripeId, Map<StreamId, DiskRange> diskRanges, OrcAggregatedMemoryContext systemMemoryUsage, Optional<DwrfEncryptionInfo> decryptors, SharedBuffer sharedDecompressionBuffer) throws IOException {
    // Note: this code does not use the Java 8 stream APIs to avoid any extra object allocation
    // read ranges
    Map<StreamId, OrcDataSourceInput> streamsData = stripeMetadataSource.getInputs(orcDataSource, stripeId, diskRanges, cacheable);
    // transform streams to OrcInputStream
    ImmutableMap.Builder<StreamId, OrcInputStream> streamsBuilder = ImmutableMap.builder();
    for (Entry<StreamId, OrcDataSourceInput> entry : streamsData.entrySet()) {
        OrcDataSourceInput sourceInput = entry.getValue();
        Optional<DwrfDataEncryptor> dwrfDecryptor = createDwrfDecryptor(entry.getKey(), decryptors);
        streamsBuilder.put(entry.getKey(), new OrcInputStream(orcDataSource.getId(), sharedDecompressionBuffer, sourceInput.getInput(), decompressor, dwrfDecryptor, systemMemoryUsage, sourceInput.getRetainedSizeInBytes()));
    }
    return streamsBuilder.build();
}
Also used : OrcInputStream(com.facebook.presto.orc.stream.OrcInputStream) ImmutableMap(com.google.common.collect.ImmutableMap)

Example 13 with OrcInputStream

Use of com.facebook.presto.orc.stream.OrcInputStream in project presto by prestodb.

From the class StripeReader, method readBloomFilterIndexes.

private Map<Integer, List<HiveBloomFilter>> readBloomFilterIndexes(Map<StreamId, Stream> streams, Map<StreamId, OrcInputStream> streamsData) throws IOException {
    ImmutableMap.Builder<Integer, List<HiveBloomFilter>> bloomFilters = ImmutableMap.builder();
    for (Entry<StreamId, Stream> entry : streams.entrySet()) {
        Stream stream = entry.getValue();
        if (stream.getStreamKind() == BLOOM_FILTER) {
            OrcInputStream inputStream = streamsData.get(entry.getKey());
            bloomFilters.put(entry.getKey().getColumn(), metadataReader.readBloomFilterIndexes(inputStream));
        }
        // TODO: add support for BLOOM_FILTER_UTF8
    }
    return bloomFilters.build();
}
Also used : OrcInputStream(com.facebook.presto.orc.stream.OrcInputStream) List(java.util.List) ArrayList(java.util.ArrayList) ImmutableList(com.google.common.collect.ImmutableList) ValueInputStream(com.facebook.presto.orc.stream.ValueInputStream) Stream(com.facebook.presto.orc.metadata.Stream) InputStream(java.io.InputStream) ImmutableMap(com.google.common.collect.ImmutableMap)

Example 14 with OrcInputStream

Use of com.facebook.presto.orc.stream.OrcInputStream in project presto by prestodb.

From the class StripeReader, method readStripeFooter.

public StripeFooter readStripeFooter(StripeId stripeId, StripeInformation stripe, OrcAggregatedMemoryContext systemMemoryUsage) throws IOException {
    long footerOffset = stripe.getOffset() + stripe.getIndexLength() + stripe.getDataLength();
    int footerLength = toIntExact(stripe.getFooterLength());
    // read the footer
    Slice footerSlice = stripeMetadataSource.getStripeFooterSlice(orcDataSource, stripeId, footerOffset, footerLength, cacheable);
    try (InputStream inputStream = new OrcInputStream(
            orcDataSource.getId(),
            // Memory is not accounted as the buffer is expected to be tiny and will be immediately discarded
            new SharedBuffer(NOOP_ORC_LOCAL_MEMORY_CONTEXT),
            footerSlice.getInput(), decompressor, Optional.empty(), systemMemoryUsage, footerLength)) {
        return metadataReader.readStripeFooter(orcDataSource.getId(), types, inputStream);
    }
}
Also used : SharedBuffer(com.facebook.presto.orc.stream.SharedBuffer) OrcInputStream(com.facebook.presto.orc.stream.OrcInputStream) Slice(io.airlift.slice.Slice) ValueInputStream(com.facebook.presto.orc.stream.ValueInputStream) InputStream(java.io.InputStream) Checkpoints.getDictionaryStreamCheckpoint(com.facebook.presto.orc.checkpoint.Checkpoints.getDictionaryStreamCheckpoint) StreamCheckpoint(com.facebook.presto.orc.checkpoint.StreamCheckpoint)
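
readStripeFooter locates the stripe footer by adding the index and data lengths to the stripe offset, which implies the footer follows the index and data sections within the stripe. A minimal sketch of that calculation in plain Java follows; the class StripeLayout and its method are hypothetical names, not part of Presto, and the overflow check is an addition that the original method does not perform.

```java
// Hypothetical helper: mirrors the footerOffset arithmetic in readStripeFooter.
public class StripeLayout {
    // Stripe layout assumed from the method above: [index][data][footer].
    static long footerOffset(long stripeOffset, long indexLength, long dataLength) {
        // addExact throws ArithmeticException on overflow instead of wrapping silently.
        return Math.addExact(Math.addExact(stripeOffset, indexLength), dataLength);
    }

    public static void main(String[] args) {
        // A stripe at offset 100 with a 20-byte index and 300 bytes of data.
        System.out.println(footerOffset(100, 20, 300));
    }
}
```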

Aggregations

OrcInputStream (com.facebook.presto.orc.stream.OrcInputStream): 14
InputStream (java.io.InputStream): 11
ImmutableMap (com.google.common.collect.ImmutableMap): 9
Stream (com.facebook.presto.orc.metadata.Stream): 7
ImmutableList (com.google.common.collect.ImmutableList): 6
ValueInputStream (com.facebook.presto.orc.stream.ValueInputStream): 5
List (java.util.List): 5
SharedBuffer (com.facebook.presto.orc.stream.SharedBuffer): 4
Checkpoints.getDictionaryStreamCheckpoint (com.facebook.presto.orc.checkpoint.Checkpoints.getDictionaryStreamCheckpoint): 3
StreamCheckpoint (com.facebook.presto.orc.checkpoint.StreamCheckpoint): 3
ColumnEncodingKind (com.facebook.presto.orc.metadata.ColumnEncoding.ColumnEncodingKind): 3
RowGroupIndex (com.facebook.presto.orc.metadata.RowGroupIndex): 3
StripeFooter (com.facebook.presto.orc.metadata.StripeFooter): 3
ValueStream (com.facebook.presto.orc.stream.ValueStream): 3
Slice (io.airlift.slice.Slice): 3
ArrayList (java.util.ArrayList): 3
InvalidCheckpointException (com.facebook.presto.orc.checkpoint.InvalidCheckpointException): 2
ColumnEncoding (com.facebook.presto.orc.metadata.ColumnEncoding): 2
OrcTypeKind (com.facebook.presto.orc.metadata.OrcType.OrcTypeKind): 2
ColumnStatistics (com.facebook.presto.orc.metadata.statistics.ColumnStatistics): 2