
Example 16 with SpecificDatumReader

Use of org.apache.avro.specific.SpecificDatumReader in project gora by Apache.

Class HBaseByteInterface, method fromBytes:

/**
   * Deserializes an array of bytes matching the given schema to the proper basic 
   * (enum, Utf8,...) or complex type (Persistent/Record).
   * 
   * Does not handle <code>arrays/maps</code> if not inside a <code>record</code> type.
   * 
   * @param schema Avro schema describing the expected data
   * @param val array of bytes with the data serialized
   * @return Enum|Utf8|ByteBuffer|Integer|Long|Float|Double|Boolean|Persistent|Null
   * @throws IOException
   */
@SuppressWarnings({ "rawtypes" })
public static Object fromBytes(Schema schema, byte[] val) throws IOException {
    Type type = schema.getType();
    switch(type) {
        case ENUM:
            return AvroUtils.getEnumValue(schema, val[0]);
        case STRING:
            return new Utf8(Bytes.toString(val));
        case BYTES:
            return ByteBuffer.wrap(val);
        case INT:
            return Bytes.toInt(val);
        case LONG:
            return Bytes.toLong(val);
        case FLOAT:
            return Bytes.toFloat(val);
        case DOUBLE:
            return Bytes.toDouble(val);
        case BOOLEAN:
            return val[0] != 0;
        case UNION:
            // if 'val' is empty we ignore the special case (will match Null in "case RECORD")  
            if (schema.getTypes().size() == 2) {
                // schema [type0, type1]
                Type type0 = schema.getTypes().get(0).getType();
                Type type1 = schema.getTypes().get(1).getType();
                // Check if types are different and there's a "null", like ["null","type"] or ["type","null"]
                if (!type0.equals(type1) && (type0.equals(Schema.Type.NULL) || type1.equals(Schema.Type.NULL))) {
                    if (type0.equals(Schema.Type.NULL))
                        schema = schema.getTypes().get(1);
                    else
                        schema = schema.getTypes().get(0);
                    // Deserialize as if schema was ["type"] 
                    return fromBytes(schema, val);
                }
            }
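            // no break: a union that is not the two-branch ["null","type"] case
            // falls through to RECORD below and is read with a SpecificDatumReader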
        case RECORD:
            // For UNION schemas, must use a specific SpecificDatumReader
            // from the readerMap since unions don't have own name
            // (key name in map will be "UNION-type-type-...")
            String schemaId = schema.getType().equals(Schema.Type.UNION) ? String.valueOf(schema.hashCode()) : schema.getFullName();
            SpecificDatumReader<?> reader = readerMap.get(schemaId);
            if (reader == null) {
                reader = new SpecificDatumReader(schema); // ignore dirty bits
                SpecificDatumReader localReader = null;
                if ((localReader = readerMap.putIfAbsent(schemaId, reader)) != null) {
                    reader = localReader;
                }
            }
            // initialize a decoder, possibly reusing previous one
            BinaryDecoder decoderFromCache = decoders.get();
            BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(val, decoderFromCache);
            // put in threadlocal cache if the initial get was empty
            if (decoderFromCache == null) {
                decoders.set(decoder);
            }
            return reader.read(null, decoder);
        default:
            throw new RuntimeException("Unknown type: " + type);
    }
}
Also used : Type(org.apache.avro.Schema.Type) Utf8(org.apache.avro.util.Utf8) SpecificDatumReader(org.apache.avro.specific.SpecificDatumReader) BinaryDecoder(org.apache.avro.io.BinaryDecoder)
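
The method above also depends on two caches, readerMap and decoders, which are declared elsewhere in HBaseByteInterface and are not part of this excerpt. A minimal sketch of how such fields might be declared, assuming a ConcurrentHashMap keyed by the schema id and one reusable BinaryDecoder per thread (the class name below is illustrative, not the project's actual declaration):

import java.util.concurrent.ConcurrentHashMap;

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.specific.SpecificDatumReader;

// Illustrative sketch only: the caches assumed by fromBytes above.
class ByteInterfaceCaches {

    // Shared reader cache; fromBytes calls putIfAbsent, so a ConcurrentMap is required.
    static final ConcurrentHashMap<String, SpecificDatumReader<?>> readerMap =
            new ConcurrentHashMap<String, SpecificDatumReader<?>>();

    // One BinaryDecoder per thread; fromBytes seeds it on the first call and
    // DecoderFactory reuses it on later calls.
    static final ThreadLocal<BinaryDecoder> decoders = new ThreadLocal<BinaryDecoder>();
}

With fields like these in place, repeated fromBytes calls for the same schema reuse both the cached SpecificDatumReader and the thread's BinaryDecoder instead of allocating new ones for every value.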

Example 17 with SpecificDatumReader

Use of org.apache.avro.specific.SpecificDatumReader in project sling by Apache.

Class AvroContentSerializer, method readAvroResources:

private Collection<AvroShallowResource> readAvroResources(byte[] bytes) throws IOException {
    DatumReader<AvroShallowResource> datumReader = new SpecificDatumReader<AvroShallowResource>(AvroShallowResource.class);
    DataFileReader<AvroShallowResource> dataFileReader = new DataFileReader<AvroShallowResource>(new SeekableByteArrayInput(bytes), datumReader);
    Collection<AvroShallowResource> avroResources = new LinkedList<AvroShallowResource>();
    try {
        for (AvroShallowResource avroResource : dataFileReader) {
            avroResources.add(avroResource);
        }
    } finally {
        dataFileReader.close();
    }
    return avroResources;
}
Also used : DataFileReader(org.apache.avro.file.DataFileReader) SpecificDatumReader(org.apache.avro.specific.SpecificDatumReader) SeekableByteArrayInput(org.apache.avro.file.SeekableByteArrayInput) LinkedList(java.util.LinkedList)
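
The bytes passed to readAvroResources are in Avro's object container format, so they must have been produced by a matching DataFileWriter. A hedged sketch of what that writer side could look like, assuming AvroShallowResource is an Avro-generated class that exposes getClassSchema() (the writeAvroResources method and class below are hypothetical, not part of AvroContentSerializer as shown):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Collection;

import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumWriter;

// Hypothetical counterpart to readAvroResources, shown for illustration only.
class AvroShallowResourceWriterSketch {

    byte[] writeAvroResources(Collection<AvroShallowResource> resources) throws IOException {
        DatumWriter<AvroShallowResource> datumWriter =
                new SpecificDatumWriter<AvroShallowResource>(AvroShallowResource.class);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataFileWriter<AvroShallowResource> dataFileWriter =
                new DataFileWriter<AvroShallowResource>(datumWriter);
        try {
            // the container file embeds the schema, which is what lets the
            // DataFileReader above deserialize given only the target class
            dataFileWriter.create(AvroShallowResource.getClassSchema(), out);
            for (AvroShallowResource resource : resources) {
                dataFileWriter.append(resource);
            }
        } finally {
            dataFileWriter.close();
        }
        return out.toByteArray();
    }
}

Because the container header carries the writer schema, the reader in readAvroResources only needs the SpecificDatumReader built from AvroShallowResource.class to resolve the records.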

Aggregations

SpecificDatumReader (org.apache.avro.specific.SpecificDatumReader): 17 usages
Schema (org.apache.avro.Schema): 6 usages
HashMap (java.util.HashMap): 5 usages
GenericRecord (org.apache.avro.generic.GenericRecord): 5 usages
BinaryDecoder (org.apache.avro.io.BinaryDecoder): 5 usages
Test (org.junit.Test): 5 usages
DataFileStream (org.apache.avro.file.DataFileStream): 4 usages
Utf8 (org.apache.avro.util.Utf8): 4 usages
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 4 usages
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 4 usages
Path (org.apache.hadoop.fs.Path): 4 usages
DataFileReader (org.apache.avro.file.DataFileReader): 3 usages
IOException (java.io.IOException): 2 usages
ArrayList (java.util.ArrayList): 2 usages
Type (org.apache.avro.Schema.Type): 2 usages
ReflectDatumReader (org.apache.avro.reflect.ReflectDatumReader): 2 usages
TypeHint (org.apache.flink.api.common.typeinfo.TypeHint): 2 usages
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 2 usages
AvroKeyValueSinkWriter (org.apache.flink.streaming.connectors.fs.AvroKeyValueSinkWriter): 2 usages
AvroKeyValue (org.apache.flink.streaming.connectors.fs.AvroKeyValueSinkWriter.AvroKeyValue): 2 usages