Example 1 with ArrayData

Use of org.apache.spark.sql.catalyst.util.ArrayData in project iceberg by apache.

From the class GenericsHelpers, method assertEqualsUnsafe:

private static void assertEqualsUnsafe(Types.MapType map, Map<?, ?> expected, MapData actual) {
    Type keyType = map.keyType();
    Type valueType = map.valueType();
    List<Map.Entry<?, ?>> expectedElements = Lists.newArrayList(expected.entrySet());
    ArrayData actualKeys = actual.keyArray();
    ArrayData actualValues = actual.valueArray();
    for (int i = 0; i < expectedElements.size(); i += 1) {
        Map.Entry<?, ?> expectedPair = expectedElements.get(i);
        Object actualKey = actualKeys.get(i, convert(keyType));
        Object actualValue = actualValues.get(i, convert(valueType));
        assertEqualsUnsafe(keyType, expectedPair.getKey(), actualKey);
        assertEqualsUnsafe(valueType, expectedPair.getValue(), actualValue);
    }
}
Also used: Type(org.apache.iceberg.types.Type) Map(java.util.Map) ArrayData(org.apache.spark.sql.catalyst.util.ArrayData)
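The pattern above relies on MapData exposing its entries as two parallel ArrayData arrays aligned by ordinal. Below is a minimal, self-contained sketch of that access pattern; the class name and sample values are illustrative, and only Spark SQL's catalyst classes are assumed.

import org.apache.spark.sql.catalyst.util.ArrayBasedMapData;
import org.apache.spark.sql.catalyst.util.ArrayData;
import org.apache.spark.sql.catalyst.util.GenericArrayData;
import org.apache.spark.sql.catalyst.util.MapData;
import org.apache.spark.sql.types.DataTypes;

public class MapDataAccessSketch {
    public static void main(String[] args) {
        // Build a MapData for {1 -> 10L, 2 -> 20L} from two parallel arrays.
        ArrayData keys = new GenericArrayData(new Object[] { 1, 2 });
        ArrayData values = new GenericArrayData(new Object[] { 10L, 20L });
        MapData map = new ArrayBasedMapData(keys, values);

        // keyArray() and valueArray() return the parallel arrays; entry i of the
        // map is (keyArray().get(i, keyType), valueArray().get(i, valueType)).
        for (int i = 0; i < map.numElements(); i++) {
            Object key = map.keyArray().get(i, DataTypes.IntegerType);
            Object value = map.valueArray().get(i, DataTypes.LongType);
            System.out.println(key + " -> " + value);
        }
    }
}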

Example 2 with ArrayData

Use of org.apache.spark.sql.catalyst.util.ArrayData in project iceberg by apache.

From the class TestHelpers, method assertEqualsMaps:

private static void assertEqualsMaps(String prefix, Types.MapType type, MapData expected, Map<?, ?> actual) {
    if (expected == null || actual == null) {
        Assert.assertEquals(prefix, expected, actual);
    } else {
        Type keyType = type.keyType();
        Type valueType = type.valueType();
        ArrayData expectedKeyArray = expected.keyArray();
        ArrayData expectedValueArray = expected.valueArray();
        Assert.assertEquals(prefix + " length", expected.numElements(), actual.size());
        for (int e = 0; e < expected.numElements(); ++e) {
            Object expectedKey = getValue(expectedKeyArray, e, keyType);
            Object actualValue = actual.get(expectedKey);
            if (actualValue == null) {
                Assert.assertEquals(prefix + ".key=" + expectedKey + " has null", true, expected.valueArray().isNullAt(e));
            } else {
                switch(valueType.typeId()) {
                    case BOOLEAN:
                    case INTEGER:
                    case LONG:
                    case FLOAT:
                    case DOUBLE:
                    case STRING:
                    case DECIMAL:
                    case DATE:
                    case TIMESTAMP:
                        Assert.assertEquals(prefix + ".key=" + expectedKey + " - " + valueType, getValue(expectedValueArray, e, valueType), actual.get(expectedKey));
                        break;
                    case UUID:
                    case FIXED:
                    case BINARY:
                        assertEqualBytes(prefix + ".key=" + expectedKey, (byte[]) getValue(expectedValueArray, e, valueType), (byte[]) actual.get(expectedKey));
                        break;
                    case STRUCT:
                        {
                            Types.StructType st = (Types.StructType) valueType;
                            assertEquals(prefix + ".key=" + expectedKey, st, expectedValueArray.getStruct(e, st.fields().size()), (Row) actual.get(expectedKey));
                            break;
                        }
                    case LIST:
                        assertEqualsLists(prefix + ".key=" + expectedKey, valueType.asListType(), expectedValueArray.getArray(e), toList((Seq<?>) actual.get(expectedKey)));
                        break;
                    case MAP:
                        assertEqualsMaps(prefix + ".key=" + expectedKey, valueType.asMapType(), expectedValueArray.getMap(e), toJavaMap((scala.collection.Map<?, ?>) actual.get(expectedKey)));
                        break;
                    default:
                        throw new IllegalArgumentException("Unhandled type " + valueType);
                }
            }
        }
    }
}
Also used: Types(org.apache.iceberg.types.Types) BinaryType(org.apache.spark.sql.types.BinaryType) DataType(org.apache.spark.sql.types.DataType) StructType(org.apache.spark.sql.types.StructType) Type(org.apache.iceberg.types.Type) ArrayType(org.apache.spark.sql.types.ArrayType) MapType(org.apache.spark.sql.types.MapType) InternalRow(org.apache.spark.sql.catalyst.InternalRow) GenericRow(org.apache.spark.sql.catalyst.expressions.GenericRow) Row(org.apache.spark.sql.Row) Map(java.util.Map) Seq(scala.collection.Seq) ArrayData(org.apache.spark.sql.catalyst.util.ArrayData)
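The getValue helper used above is not shown in this example. A plausible sketch, assumed for illustration rather than taken from the Iceberg sources, dispatches on the Iceberg type ID to the matching ArrayData getter:

// Assumed sketch of a getValue-style helper (illustrative, not the project's code):
// reads position `pos` from the ArrayData with the getter matching the Iceberg type.
private static Object getValue(ArrayData array, int pos, Type type) {
    if (array.isNullAt(pos)) {
        return null;
    }
    switch (type.typeId()) {
        case BOOLEAN:
            return array.getBoolean(pos);
        case INTEGER:
        case DATE:
            return array.getInt(pos);
        case LONG:
        case TIMESTAMP:
            return array.getLong(pos);
        case FLOAT:
            return array.getFloat(pos);
        case DOUBLE:
            return array.getDouble(pos);
        case STRING:
            return array.getUTF8String(pos).toString();
        default:
            throw new IllegalArgumentException("Unhandled type " + type);
    }
}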

Example 3 with ArrayData

Use of org.apache.spark.sql.catalyst.util.ArrayData in project spark-cassandra-bulkreader by jberragan.

From the class CqlMap, method sparkSqlRowValue:

@Override
public Object sparkSqlRowValue(GenericInternalRow row, int pos) {
    final MapData map = row.getMap(pos);
    final ArrayData keys = map.keyArray();
    final ArrayData values = map.valueArray();
    final Map<Object, Object> result = new HashMap<>(keys.numElements());
    for (int i = 0; i < keys.numElements(); i++) {
        final Object key = keyType().toTestRowType(keys.get(i, keyType().sparkSqlType()));
        final Object value = valueType().toTestRowType(values.get(i, valueType().sparkSqlType()));
        result.put(key, value);
    }
    return result;
}
Also used: HashMap(java.util.HashMap) MapData(org.apache.spark.sql.catalyst.util.MapData) ArrayBasedMapData(org.apache.spark.sql.catalyst.util.ArrayBasedMapData) ArrayData(org.apache.spark.sql.catalyst.util.ArrayData)
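The loop above reads every value unconditionally, which assumes the map contains no null values. A self-contained, null-tolerant variant is sketched below; the helper name and signature are assumptions for illustration, not the project's API.

import java.util.HashMap;
import java.util.Map;
import org.apache.spark.sql.catalyst.util.ArrayData;
import org.apache.spark.sql.catalyst.util.MapData;
import org.apache.spark.sql.types.DataType;

public final class MapDataToJava {
    // Assumed helper: converts a catalyst MapData into a java.util.Map,
    // checking isNullAt before reading each value.
    public static Map<Object, Object> toJavaMap(MapData map, DataType keyType, DataType valueType) {
        ArrayData keys = map.keyArray();
        ArrayData values = map.valueArray();
        Map<Object, Object> result = new HashMap<>(map.numElements());
        for (int i = 0; i < map.numElements(); i++) {
            Object key = keys.get(i, keyType);
            Object value = values.isNullAt(i) ? null : values.get(i, valueType);
            result.put(key, value);
        }
        return result;
    }
}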

Example 4 with ArrayData

Use of org.apache.spark.sql.catalyst.util.ArrayData in project spark-bigquery-connector by GoogleCloudDataproc.

From the class AvroSchemaConverter, method createConverterFor:

static Converter createConverterFor(DataType sparkType, Schema avroType) {
    if (sparkType instanceof NullType && avroType.getType() == Schema.Type.NULL) {
        return (getter, ordinal) -> null;
    }
    if (sparkType instanceof BooleanType && avroType.getType() == Schema.Type.BOOLEAN) {
        return (getter, ordinal) -> getter.getBoolean(ordinal);
    }
    if (sparkType instanceof ByteType && avroType.getType() == Schema.Type.LONG) {
        return (getter, ordinal) -> Long.valueOf(getter.getByte(ordinal));
    }
    if (sparkType instanceof ShortType && avroType.getType() == Schema.Type.LONG) {
        return (getter, ordinal) -> Long.valueOf(getter.getShort(ordinal));
    }
    if (sparkType instanceof IntegerType && avroType.getType() == Schema.Type.LONG) {
        return (getter, ordinal) -> Long.valueOf(getter.getInt(ordinal));
    }
    if (sparkType instanceof LongType && avroType.getType() == Schema.Type.LONG) {
        return (getter, ordinal) -> getter.getLong(ordinal);
    }
    if (sparkType instanceof FloatType && avroType.getType() == Schema.Type.DOUBLE) {
        return (getter, ordinal) -> Double.valueOf(getter.getFloat(ordinal));
    }
    if (sparkType instanceof DoubleType && avroType.getType() == Schema.Type.DOUBLE) {
        return (getter, ordinal) -> getter.getDouble(ordinal);
    }
    if (sparkType instanceof DecimalType && avroType.getType() == Schema.Type.BYTES) {
        DecimalType decimalType = (DecimalType) sparkType;
        return (getter, ordinal) -> {
            Decimal decimal = getter.getDecimal(ordinal, decimalType.precision(), decimalType.scale());
            return DECIMAL_CONVERSIONS.toBytes(decimal.toJavaBigDecimal(), avroType, LogicalTypes.decimal(decimalType.precision(), decimalType.scale()));
        };
    }
    if (sparkType instanceof StringType && avroType.getType() == Schema.Type.STRING) {
        return (getter, ordinal) -> new Utf8(getter.getUTF8String(ordinal).getBytes());
    }
    if (sparkType instanceof BinaryType && avroType.getType() == Schema.Type.FIXED) {
        int size = avroType.getFixedSize();
        return (getter, ordinal) -> {
            byte[] data = getter.getBinary(ordinal);
            if (data.length != size) {
                throw new IllegalArgumentException(String.format("Cannot write %s bytes of binary data into FIXED Type with size of %s bytes", data.length, size));
            }
            return new GenericData.Fixed(avroType, data);
        };
    }
    if (sparkType instanceof BinaryType && avroType.getType() == Schema.Type.BYTES) {
        return (getter, ordinal) -> ByteBuffer.wrap(getter.getBinary(ordinal));
    }
    if (sparkType instanceof DateType && avroType.getType() == Schema.Type.INT) {
        return (getter, ordinal) -> getter.getInt(ordinal);
    }
    if (sparkType instanceof TimestampType && avroType.getType() == Schema.Type.LONG) {
        return (getter, ordinal) -> getter.getLong(ordinal);
    }
    if (sparkType instanceof ArrayType && avroType.getType() == Schema.Type.ARRAY) {
        DataType et = ((ArrayType) sparkType).elementType();
        boolean containsNull = ((ArrayType) sparkType).containsNull();
        Converter elementConverter = createConverterFor(et, resolveNullableType(avroType.getElementType(), containsNull));
        return (getter, ordinal) -> {
            ArrayData arrayData = getter.getArray(ordinal);
            int len = arrayData.numElements();
            Object[] result = new Object[len];
            for (int i = 0; i < len; i++) {
                if (containsNull && arrayData.isNullAt(i)) {
                    result[i] = null;
                } else {
                    result[i] = elementConverter.convert(arrayData, i);
                }
            }
            // Arrays.asList returns a fixed-size List view backed by the array (no copy).
            return java.util.Arrays.asList(result);
        };
    }
    if (sparkType instanceof StructType && avroType.getType() == Schema.Type.RECORD) {
        StructType sparkStruct = (StructType) sparkType;
        StructConverter structConverter = new StructConverter(sparkStruct, avroType);
        int numFields = sparkStruct.length();
        return (getter, ordinal) -> structConverter.convert(getter.getStruct(ordinal, numFields));
    }
    if (sparkType instanceof UserDefinedType) {
        UserDefinedType userDefinedType = (UserDefinedType) sparkType;
        return createConverterFor(userDefinedType.sqlType(), avroType);
    }
    throw new IllegalArgumentException(String.format("Cannot convert Catalyst type %s to Avro type %s", sparkType, avroType));
}
Also used: BinaryType(org.apache.spark.sql.types.BinaryType) DataType(org.apache.spark.sql.types.DataType) Decimal(org.apache.spark.sql.types.Decimal) InternalRow(org.apache.spark.sql.catalyst.InternalRow) FloatType(org.apache.spark.sql.types.FloatType) DecimalType(org.apache.spark.sql.types.DecimalType) ByteBuffer(java.nio.ByteBuffer) GenericData(org.apache.avro.generic.GenericData) ArrayType(org.apache.spark.sql.types.ArrayType) ByteType(org.apache.spark.sql.types.ByteType) LogicalTypes(org.apache.avro.LogicalTypes) ArrayData(org.apache.spark.sql.catalyst.util.ArrayData) SpecializedGetters(org.apache.spark.sql.catalyst.expressions.SpecializedGetters) DoubleType(org.apache.spark.sql.types.DoubleType) Conversions(org.apache.avro.Conversions) NullType(org.apache.spark.sql.types.NullType) StructField(org.apache.spark.sql.types.StructField) StructType(org.apache.spark.sql.types.StructType) Utf8(org.apache.avro.util.Utf8) Schema(org.apache.avro.Schema) UserDefinedType(org.apache.spark.sql.types.UserDefinedType) IntegerType(org.apache.spark.sql.types.IntegerType) StringType(org.apache.spark.sql.types.StringType) LongType(org.apache.spark.sql.types.LongType) TimestampType(org.apache.spark.sql.types.TimestampType) ShortType(org.apache.spark.sql.types.ShortType) SchemaBuilder(org.apache.avro.SchemaBuilder) List(java.util.List) Optional(java.util.Optional) Preconditions(com.google.common.base.Preconditions) BooleanType(org.apache.spark.sql.types.BooleanType) DateType(org.apache.spark.sql.types.DateType) MapType(org.apache.spark.sql.types.MapType)
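Every branch above returns a lambda implementing a two-argument Converter contract over SpecializedGetters (ArrayData and InternalRow both implement that interface, which is why the array branch can pass arrayData straight to elementConverter.convert). The interface shape inferred from the call sites is sketched below; the connector's actual declaration may differ.

import org.apache.spark.sql.catalyst.expressions.SpecializedGetters;

// Inferred shape of the Converter contract (the connector's declaration may differ).
@FunctionalInterface
interface Converter {
    Object convert(SpecializedGetters getter, int ordinal);
}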

Example 5 with ArrayData

Use of org.apache.spark.sql.catalyst.util.ArrayData in project iceberg by apache.

From the class TestHelpers, method assertEqualsUnsafe:

private static void assertEqualsUnsafe(Types.MapType map, Map<?, ?> expected, MapData actual) {
    Type keyType = map.keyType();
    Type valueType = map.valueType();
    List<Map.Entry<?, ?>> expectedElements = Lists.newArrayList(expected.entrySet());
    ArrayData actualKeys = actual.keyArray();
    ArrayData actualValues = actual.valueArray();
    for (int i = 0; i < expectedElements.size(); i += 1) {
        Map.Entry<?, ?> expectedPair = expectedElements.get(i);
        Object actualKey = actualKeys.get(i, convert(keyType));
        Object actualValue = actualValues.get(i, convert(valueType));
        assertEqualsUnsafe(keyType, expectedPair.getKey(), actualKey);
        assertEqualsUnsafe(valueType, expectedPair.getValue(), actualValue);
    }
}
Also used: BinaryType(org.apache.spark.sql.types.BinaryType) DataType(org.apache.spark.sql.types.DataType) StructType(org.apache.spark.sql.types.StructType) Type(org.apache.iceberg.types.Type) ArrayType(org.apache.spark.sql.types.ArrayType) MapType(org.apache.spark.sql.types.MapType) Map(java.util.Map) ArrayData(org.apache.spark.sql.catalyst.util.ArrayData)

Aggregations

ArrayData (org.apache.spark.sql.catalyst.util.ArrayData): 7
DataType (org.apache.spark.sql.types.DataType): 4
Map (java.util.Map): 3
Type (org.apache.iceberg.types.Type): 3
InternalRow (org.apache.spark.sql.catalyst.InternalRow): 3
ArrayType (org.apache.spark.sql.types.ArrayType): 3
BinaryType (org.apache.spark.sql.types.BinaryType): 3
MapType (org.apache.spark.sql.types.MapType): 3
StructType (org.apache.spark.sql.types.StructType): 3
Message (com.google.cloud.pubsublite.Message): 1
SequencedMessage (com.google.cloud.pubsublite.SequencedMessage): 1
Preconditions (com.google.common.base.Preconditions): 1
Timestamp (com.google.protobuf.Timestamp): 1
ByteBuffer (java.nio.ByteBuffer): 1
HashMap (java.util.HashMap): 1
List (java.util.List): 1
Optional (java.util.Optional): 1
Conversions (org.apache.avro.Conversions): 1
LogicalTypes (org.apache.avro.LogicalTypes): 1
Schema (org.apache.avro.Schema): 1