Example 1 with OrcMapreduceRecordReader

Use of org.apache.orc.mapreduce.OrcMapreduceRecordReader in the project incubator-gobblin by apache.

From the class OrcCompactionTaskTest, method readOrcFile.

/**
 * Reads a compacted ORC output file into memory.
 * This only works if all fields are int values.
 */
private List<OrcStruct> readOrcFile(Path orcFilePath) throws IOException, InterruptedException {
    ReaderImpl orcReader = new ReaderImpl(orcFilePath, new OrcFile.ReaderOptions(new Configuration()));
    Reader.Options options = new Reader.Options().schema(orcReader.getSchema());
    OrcMapreduceRecordReader recordReader = new OrcMapreduceRecordReader(orcReader, options);
    List<OrcStruct> result = new ArrayList<>();
    OrcStruct recordContainer;
    while (recordReader.nextKeyValue()) {
        recordContainer = (OrcStruct) OrcUtils.createValueRecursively(orcReader.getSchema());
        OrcUtils.upConvertOrcStruct((OrcStruct) recordReader.getCurrentValue(), recordContainer, orcReader.getSchema());
        result.add(recordContainer);
    }
    return result;
}
Also used: OrcStruct (org.apache.orc.mapred.OrcStruct), Configuration (org.apache.hadoop.conf.Configuration), OrcFile (org.apache.orc.OrcFile), ArrayList (java.util.ArrayList), Reader (org.apache.orc.Reader), OrcMapreduceRecordReader (org.apache.orc.mapreduce.OrcMapreduceRecordReader), ReaderImpl (org.apache.orc.impl.ReaderImpl)