Example 11 with SeekableInput

Use of org.apache.avro.file.SeekableInput in the incubator-gobblin project by Apache.

From the class TestAvroExtractor, method getRecordFromFile.

public static List<GenericRecord> getRecordFromFile(String path) throws IOException {
    Configuration config = new Configuration();
    DatumReader<GenericRecord> reader = new GenericDatumReader<>();
    List<GenericRecord> records = new ArrayList<>();
    // try-with-resources closes the reader and the input even if reading fails partway through.
    try (SeekableInput input = new FsInput(new Path(path), config);
         FileReader<GenericRecord> fileReader = DataFileReader.openReader(input, reader)) {
        // FileReader is Iterable, so the records of the Avro file are collected in file order.
        for (GenericRecord datum : fileReader) {
            records.add(datum);
        }
    }
    return records;
}
Also used: Path (org.apache.hadoop.fs.Path), Configuration (org.apache.hadoop.conf.Configuration), FsInput (org.apache.avro.mapred.FsInput), GenericDatumReader (org.apache.avro.generic.GenericDatumReader), ArrayList (java.util.ArrayList), SeekableInput (org.apache.avro.file.SeekableInput), GenericRecord (org.apache.avro.generic.GenericRecord)
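
The example above hands a SeekableInput to DataFileReader, which relies on being able to jump to a position, report the current position, report the total length, and read bytes. The following is a minimal sketch of that contract, not Avro code: SeekableSource and ByteArraySource are hypothetical stand-ins that only mirror the seek/tell/length/read method shape, so the sketch runs without Avro or Hadoop on the classpath. The sample payload and the readAt helper are likewise illustrative assumptions.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class SeekableSketch {

    // Hypothetical stand-in mirroring the method shape of a seekable input
    // (seek, tell, length, read). It is NOT the Avro interface itself.
    interface SeekableSource extends Closeable {
        void seek(long position) throws IOException;
        long tell() throws IOException;
        long length() throws IOException;
        int read(byte[] buffer, int offset, int count) throws IOException;
    }

    // Illustrative in-memory implementation backed by a byte array.
    static final class ByteArraySource implements SeekableSource {
        private final byte[] data;
        private int position = 0;

        ByteArraySource(byte[] data) {
            this.data = data;
        }

        @Override
        public void seek(long newPosition) throws IOException {
            if (newPosition < 0 || newPosition > data.length) {
                throw new IOException("seek out of range: " + newPosition);
            }
            position = (int) newPosition;
        }

        @Override
        public long tell() {
            return position;
        }

        @Override
        public long length() {
            return data.length;
        }

        @Override
        public int read(byte[] buffer, int offset, int count) {
            if (count == 0) {
                return 0;
            }
            if (position >= data.length) {
                return -1; // end of input, same convention as InputStream.read
            }
            int n = Math.min(count, data.length - position);
            System.arraycopy(data, position, buffer, offset, n);
            position += n;
            return n;
        }

        @Override
        public void close() {
            // Nothing to release for an in-memory array.
        }
    }

    // Seeks to `offset` and reads up to `count` bytes as ASCII text.
    static String readAt(SeekableSource source, long offset, int count) throws IOException {
        source.seek(offset);
        byte[] buffer = new byte[count];
        int n = source.read(buffer, 0, count);
        return n <= 0 ? "" : new String(buffer, 0, n, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "header|block-1|block-2".getBytes(StandardCharsets.US_ASCII);
        try (SeekableSource source = new ByteArraySource(payload)) {
            System.out.println(readAt(source, 15, 7)); // jump straight to the second block
            System.out.println(readAt(source, 7, 7));  // then back to the first block
            System.out.println(source.tell());         // position after the last read
        }
    }
}
```

Random access of this kind is why the file reader in the example takes a seekable input rather than a plain InputStream: a plain stream can only be consumed front to back.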

Aggregations

SeekableInput (org.apache.avro.file.SeekableInput): 11
DataFileReader (org.apache.avro.file.DataFileReader): 6
GenericRecord (org.apache.avro.generic.GenericRecord): 5
GenericDatumReader (org.apache.avro.generic.GenericDatumReader): 4
FsInput (org.apache.avro.mapred.FsInput): 4
Schema (org.apache.avro.Schema): 3
Configuration (org.apache.hadoop.conf.Configuration): 3
InputStream (java.io.InputStream): 2
ArrayList (java.util.ArrayList): 2
SeekableByteArrayInput (org.apache.avro.file.SeekableByteArrayInput): 2
ReflectDatumReader (org.apache.avro.reflect.ReflectDatumReader): 2
SpecificDatumReader (org.apache.avro.specific.SpecificDatumReader): 2
Utf8 (org.apache.avro.util.Utf8): 2
Path (org.apache.hadoop.fs.Path): 2
Formats (org.apache.parquet.cli.util.Formats): 2
Test (org.junit.Test): 2
AbstractIterator (com.google.common.collect.AbstractIterator): 1
RowVisitor (com.thinkbiganalytics.nifi.thrift.api.RowVisitor): 1
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 1
IOException (java.io.IOException): 1