Example 1 with UnexpectedFormatException

Use of io.cdap.cdap.api.data.format.UnexpectedFormatException in project cdap by caskdata.

From the class ObjectDeserializer, method deserializeField.

/**
 * Translate a field that fits a {@link Schema} field into a type that Hive understands.
 * For example, a ByteBuffer is allowed by schema but Hive only understands byte arrays, so all ByteBuffers must
 * be changed into byte arrays. Reflection is used to examine java objects if the expected hive type is a struct.
 *
 * @param field value of the field to deserialize.
 * @param typeInfo type of the field as expected by Hive.
 * @param schema schema of the field.
 * @return translated field.
 * @throws NoSuchFieldException if a struct field was expected but not found in the object.
 * @throws IllegalAccessException if a struct field was not accessible.
 */
private Object deserializeField(Object field, TypeInfo typeInfo, Schema schema) throws NoSuchFieldException, IllegalAccessException {
    boolean isNullable = schema.isNullable();
    if (field == null) {
        if (isNullable) {
            return null;
        } else {
            throw new UnexpectedFormatException("Non-nullable field was null.");
        }
    }
    if (isNullable) {
        schema = schema.getNonNullable();
    }
    switch(typeInfo.getCategory()) {
        case PRIMITIVE:
            return deserializePrimitive(field, (PrimitiveTypeInfo) typeInfo, schema);
        case LIST:
            // Some Hive versions turn bytes into array<tinyint> instead of binary, so special-case it.
            // TODO: remove once CDAP-1556 is done
            ListTypeInfo listTypeInfo = (ListTypeInfo) typeInfo;
            if (isByteArray(listTypeInfo) && !(field instanceof Collection)) {
                return deserializeByteArray(field);
            }
            // Reuse the cast made above instead of casting typeInfo a second time.
            return deserializeList(field, listTypeInfo, schema.getComponentSchema());
        case MAP:
            return deserializeMap(field, (MapTypeInfo) typeInfo, schema.getMapSchema());
        case STRUCT:
            StructTypeInfo structTypeInfo = (StructTypeInfo) typeInfo;
            ArrayList<String> innerFieldNames = structTypeInfo.getAllStructFieldNames();
            ArrayList<TypeInfo> innerFieldTypes = structTypeInfo.getAllStructFieldTypeInfos();
            return flattenRecord(field, innerFieldNames, innerFieldTypes, schema);
        case UNION:
            // TODO: decide what to do here
            return field;
    }
    return null;
}
Also used: UnexpectedFormatException (io.cdap.cdap.api.data.format.UnexpectedFormatException), Collection (java.util.Collection), ListTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo), MapTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.MapTypeInfo), PrimitiveTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo), StructTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo), TypeInfo (org.apache.hadoop.hive.serde2.typeinfo.TypeInfo)
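
The Javadoc above says that ByteBuffers must be changed into byte arrays before Hive can use them. The helper deserializeByteArray is not shown in this example, so the following is only a minimal sketch of how such a conversion might look using plain JDK classes. The class name ByteArraySketch and the method toByteArray are hypothetical and are not part of CDAP or Hive.

```java
import java.nio.ByteBuffer;

public class ByteArraySketch {

    // Hypothetical helper (NOT CDAP's deserializeByteArray): accepts either a
    // byte[] or a ByteBuffer and always returns a byte[].
    static byte[] toByteArray(Object field) {
        if (field == null) {
            throw new IllegalArgumentException("field must not be null");
        }
        if (field instanceof byte[]) {
            return (byte[]) field;
        }
        if (field instanceof ByteBuffer) {
            // duplicate() shares the content but has its own position,
            // so the caller's buffer is left untouched.
            ByteBuffer copy = ((ByteBuffer) field).duplicate();
            byte[] bytes = new byte[copy.remaining()];
            copy.get(bytes);
            return bytes;
        }
        throw new IllegalArgumentException(
            "Expected byte[] or ByteBuffer but got " + field.getClass().getName());
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {10, 20, 30});
        byte[] bytes = toByteArray(buffer);
        System.out.println(bytes.length + " bytes copied, buffer position is " + buffer.position());
    }
}
```

Copying from a duplicate rather than from the buffer itself is a design choice in this sketch: it keeps the conversion free of side effects if the same record is read more than once.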

Example 2 with UnexpectedFormatException

Use of io.cdap.cdap.api.data.format.UnexpectedFormatException in project cdap by caskdata.

From the class ObjectDeserializer, method flattenRecord.

private List<Object> flattenRecord(Object obj, List<String> fieldNames, List<TypeInfo> fieldTypes, Schema schema) throws NoSuchFieldException, IllegalAccessException {
    boolean isNullable = schema.isNullable();
    if (obj == null) {
        if (isNullable) {
            return null;
        } else {
            throw new UnexpectedFormatException("Non-nullable field is null.");
        }
    }
    if (isNullable) {
        schema = schema.getNonNullable();
    }
    Map<String, Schema.Field> fieldMap = getFieldMap(schema);
    List<Object> objectFields = Lists.newArrayListWithCapacity(fieldNames.size());
    for (int i = 0; i < fieldNames.size(); i++) {
        String hiveName = fieldNames.get(i);
        TypeInfo fieldType = fieldTypes.get(i);
        Schema.Field schemaField = fieldMap.get(hiveName);
        // use the name from the schema field in case it is not all lowercase
        Object recordField = getRecordField(obj, schemaField.getName());
        objectFields.add(deserializeField(recordField, fieldType, schemaField.getSchema()));
    }
    return objectFields;
}
Also used: Field (java.lang.reflect.Field), Schema (io.cdap.cdap.api.data.schema.Schema), UnexpectedFormatException (io.cdap.cdap.api.data.format.UnexpectedFormatException), MapTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.MapTypeInfo), ListTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo), StructTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo), PrimitiveTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo), TypeInfo (org.apache.hadoop.hive.serde2.typeinfo.TypeInfo)
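
flattenRecord looks up each Hive column name in a field map and then reads the record field under the name stored in the schema, because that name may not be all lowercase. The helpers getFieldMap and getRecordField are not shown here, so the sketch below only illustrates the general idea with plain JDK reflection. Every name in it (FlattenSketch, User, lowerCaseIndex, readField, flatten) is hypothetical, and it assumes a lowercase-to-declared-name index rather than a real Schema. It also adds a guard for a column with no matching field, which the original method does not have.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class FlattenSketch {

    // Example record type; the mixed-case field name is deliberate.
    static class User {
        String firstName = "Ada";
        int age = 36;
    }

    // Maps lowercased names (as the Hive side supplies them) to declared names.
    static Map<String, String> lowerCaseIndex(List<String> declaredNames) {
        Map<String, String> index = new HashMap<>();
        for (String name : declaredNames) {
            index.put(name.toLowerCase(Locale.ROOT), name);
        }
        return index;
    }

    // Reads a field declared directly on the object's class (superclasses are
    // not searched). Reflection failures are rethrown unchecked to keep the
    // sketch short; the original method declares the checked exceptions.
    static Object readField(Object obj, String name) {
        try {
            Field field = obj.getClass().getDeclaredField(name);
            field.setAccessible(true);
            return field.get(obj);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read field " + name, e);
        }
    }

    // Returns the field values in the order of the Hive column names.
    static List<Object> flatten(Object obj, List<String> hiveNames, Map<String, String> index) {
        List<Object> values = new ArrayList<>(hiveNames.size());
        for (String hiveName : hiveNames) {
            String declaredName = index.get(hiveName);
            if (declaredName == null) {
                // Guard added for this sketch; not present in the original.
                throw new IllegalArgumentException("No field for Hive column " + hiveName);
            }
            values.add(readField(obj, declaredName));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> index = lowerCaseIndex(List.of("firstName", "age"));
        System.out.println(flatten(new User(), List.of("firstname", "age"), index));
    }
}
```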

Aggregations

UnexpectedFormatException (io.cdap.cdap.api.data.format.UnexpectedFormatException): 2
ListTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo): 2
MapTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.MapTypeInfo): 2
PrimitiveTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo): 2
StructTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo): 2
TypeInfo (org.apache.hadoop.hive.serde2.typeinfo.TypeInfo): 2
Schema (io.cdap.cdap.api.data.schema.Schema): 1
Field (java.lang.reflect.Field): 1
Collection (java.util.Collection): 1