
Example 1 with PositionProvider

Use of org.apache.orc.impl.PositionProvider in the Apache Hive project.

In class OrcEncodedDataConsumer, method createPositionProviders.

private PositionProvider[] createPositionProviders(TreeReaderFactory.TreeReader[] columnReaders, OrcBatchKey batchKey, ConsumerStripeMetadata stripeMetadata) throws IOException {
    if (columnReaders.length == 0)
        return null;
    PositionProvider[] pps = null;
    if (!stripeMetadata.supportsRowIndexes()) {
        PositionProvider singleRgPp = new IndexlessPositionProvider();
        pps = new PositionProvider[stripeMetadata.getEncodings().size()];
        for (int i = 0; i < pps.length; ++i) {
            pps[i] = singleRgPp;
        }
    } else {
        int rowGroupIndex = batchKey.rgIx;
        if (rowGroupIndex == OrcEncodedColumnBatch.ALL_RGS) {
            throw new IOException("Cannot position readers without RG information");
        }
        // TODO: this assumes indexes in getRowIndexes would match column IDs
        OrcProto.RowIndex[] ris = stripeMetadata.getRowIndexes();
        pps = new PositionProvider[ris.length];
        for (int i = 0; i < ris.length; ++i) {
            OrcProto.RowIndex ri = ris[i];
            if (ri == null)
                continue;
            pps[i] = new RecordReaderImpl.PositionProviderImpl(ri.getEntry(rowGroupIndex));
        }
    }
    return pps;
}
Also used : OrcProto(org.apache.orc.OrcProto) PositionProvider(org.apache.orc.impl.PositionProvider) IOException(java.io.IOException) RecordReaderImpl(org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl)
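The method above builds one provider per column: a single shared instance when the stripe has no row indexes, or one provider per row-index entry otherwise, with null left in place for columns that have no index. A minimal, self-contained sketch of that pattern follows. It does not use the ORC or Hive classes: the nested PositionProvider interface is a local stand-in that assumes the real one exposes a single getNext() method, and ListPositionProvider, ZeroPositionProvider, and createProviders are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class PositionProviderSketch {
    /** Local stand-in for org.apache.orc.impl.PositionProvider (assumed shape: one getNext() method). */
    interface PositionProvider {
        long getNext();
    }

    /** Hypothetical provider that hands out recorded positions one at a time, in stored order. */
    static final class ListPositionProvider implements PositionProvider {
        private final List<Long> positions;
        private int next = 0;

        ListPositionProvider(List<Long> positions) {
            this.positions = positions;
        }

        @Override
        public long getNext() {
            return positions.get(next++);
        }
    }

    /** Hypothetical provider for the "no row index" case: every position is 0 (start of stream). */
    static final class ZeroPositionProvider implements PositionProvider {
        @Override
        public long getNext() {
            return 0L;
        }
    }

    /** Builds one provider per column; a null input entry leaves a null slot, as in the method above. */
    static PositionProvider[] createProviders(List<List<Long>> perColumnPositions, boolean hasRowIndexes) {
        PositionProvider[] pps = new PositionProvider[perColumnPositions.size()];
        if (!hasRowIndexes) {
            // One stateless instance can safely be shared by every column.
            Arrays.fill(pps, new ZeroPositionProvider());
            return pps;
        }
        for (int i = 0; i < pps.length; i++) {
            List<Long> positions = perColumnPositions.get(i);
            if (positions != null) {
                pps[i] = new ListPositionProvider(positions);
            }
        }
        return pps;
    }

    public static void main(String[] args) {
        // Column 0 has two recorded positions; column 1 has no index entry.
        PositionProvider[] pps = createProviders(Arrays.asList(List.of(3L, 40L), null), true);
        System.out.println("column 0, first position: " + pps[0].getNext());
        System.out.println("column 0, second position: " + pps[0].getNext());
        System.out.println("column 1 has a provider: " + (pps[1] != null));
    }
}
```

Sharing one instance in the index-less branch only works because that provider keeps no cursor; the list-backed provider is stateful and has to be created per column.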

Example 2 with PositionProvider

Use of org.apache.orc.impl.PositionProvider in the Apache Hive project.

In class OrcEncodedDataConsumer, method repositionInStreams.

private void repositionInStreams(TreeReaderFactory.TreeReader[] columnReaders, EncodedColumnBatch<OrcBatchKey> batch, boolean sameStripe, ConsumerStripeMetadata stripeMetadata) throws IOException {
    PositionProvider[] pps = createPositionProviders(columnReaders, batch.getBatchKey(), stripeMetadata);
    if (pps == null)
        return;
    for (int i = 0; i < columnReaders.length; i++) {
        TreeReader reader = columnReaders[i];
        // Note: we assume this never happens for SerDe reader - the batch would never have vectors.
        // That is always true now; but if it weren't some day, the below would throw in getColumnData.
        ((SettableTreeReader) reader).setBuffers(batch, sameStripe);
        // TODO: updateTimezone() could be declared on SettableTreeReader so that we can avoid this check.
        if (reader instanceof EncodedTreeReaderFactory.TimestampStreamReader && !sameStripe) {
            ((EncodedTreeReaderFactory.TimestampStreamReader) reader).updateTimezone(stripeMetadata.getWriterTimezone());
        }
        reader.seek(pps);
    }
}
Also used : SettableTreeReader(org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.SettableTreeReader) PositionProvider(org.apache.orc.impl.PositionProvider) TreeReader(org.apache.orc.impl.TreeReaderFactory.TreeReader) StructTreeReader(org.apache.orc.impl.TreeReaderFactory.StructTreeReader) EncodedTreeReaderFactory(org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory)
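In the loop above, every reader receives the same PositionProvider array through seek(pps). The sketch below illustrates one way that can work, under the assumption that each reader picks the entry for its own column ID and then pulls as many positions as it needs. ColumnReader, fromValues, and reposition are hypothetical names, and the two-value layout (a data offset followed by a count of values to skip) is chosen for illustration only; it is not the actual ORC stream layout.

```java
public class RepositionSketch {
    /** Local stand-in for org.apache.orc.impl.PositionProvider (assumed shape: one getNext() method). */
    interface PositionProvider {
        long getNext();
    }

    /** Hypothetical reader for one column; it records where it was asked to seek. */
    static final class ColumnReader {
        final int columnId;
        long dataOffset = -1;
        long valuesToSkip = -1;

        ColumnReader(int columnId) {
            this.columnId = columnId;
        }

        /** Receives the whole array and consumes positions only from its own column's entry. */
        void seek(PositionProvider[] index) {
            PositionProvider mine = index[columnId];
            dataOffset = mine.getNext();
            valuesToSkip = mine.getNext();
        }
    }

    /** Hypothetical helper: a provider that returns the given values in order. */
    static PositionProvider fromValues(long... values) {
        return new PositionProvider() {
            private int next = 0;

            @Override
            public long getNext() {
                return values[next++];
            }
        };
    }

    /** Mirrors the shape of the loop above: do nothing without providers, otherwise seek every reader. */
    static void reposition(ColumnReader[] readers, PositionProvider[] pps) {
        if (pps == null) {
            return;
        }
        for (ColumnReader reader : readers) {
            reader.seek(pps);
        }
    }

    public static void main(String[] args) {
        ColumnReader[] readers = { new ColumnReader(0), new ColumnReader(1) };
        PositionProvider[] pps = { fromValues(128L, 5L), fromValues(4096L, 0L) };
        reposition(readers, pps);
        for (ColumnReader reader : readers) {
            System.out.println("column " + reader.columnId
                    + " offset=" + reader.dataOffset
                    + " skip=" + reader.valuesToSkip);
        }
    }
}
```

Passing the full array rather than a single provider lets a compound reader forward the same array to its children, each of which can then look up its own column.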

Aggregations

PositionProvider (org.apache.orc.impl.PositionProvider): 2
IOException (java.io.IOException): 1
RecordReaderImpl (org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl): 1
EncodedTreeReaderFactory (org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory): 1
SettableTreeReader (org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.SettableTreeReader): 1
OrcProto (org.apache.orc.OrcProto): 1
StructTreeReader (org.apache.orc.impl.TreeReaderFactory.StructTreeReader): 1
TreeReader (org.apache.orc.impl.TreeReaderFactory.TreeReader): 1