Example 1 with IcebergArrowColumnVector

Use of org.apache.iceberg.spark.data.vectorized.IcebergArrowColumnVector in project iceberg by apache, in the method assertEqualsBatch of the class TestHelpers.

public static void assertEqualsBatch(Types.StructType struct, Iterator<Record> expected, ColumnarBatch batch, boolean checkArrowValidityVector) {
    // Walk the batch row by row, consuming one expected record per row.
    for (int rowId = 0; rowId < batch.numRows(); rowId++) {
        List<Types.NestedField> fields = struct.fields();
        InternalRow row = batch.getRow(rowId);
        Record rec = expected.next();
        for (int i = 0; i < fields.size(); i += 1) {
            Type fieldType = fields.get(i).type();
            Object expectedValue = rec.get(i);
            // Read the Spark value only when the row reports it as non-null.
            Object actualValue = row.isNullAt(i) ? null : row.get(i, convert(fieldType));
            assertEqualsUnsafe(fieldType, expectedValue, actualValue);
            if (checkArrowValidityVector) {
                // Unwrap the Arrow vector backing this column to inspect its validity bits.
                ColumnVector columnVector = batch.column(i);
                ValueVector arrowVector = ((IcebergArrowColumnVector) columnVector).vectorAccessor().getVector();
                // The XOR is true only when exactly one side is null, i.e. on a mismatch.
                Assert.assertFalse("Nullability doesn't match of " + columnVector.dataType(), expectedValue == null ^ arrowVector.isNull(rowId));
            }
        }
    }
}
Also used: ValueVector (org.apache.arrow.vector.ValueVector), BinaryType (org.apache.spark.sql.types.BinaryType), DataType (org.apache.spark.sql.types.DataType), StructType (org.apache.spark.sql.types.StructType), Type (org.apache.iceberg.types.Type), ArrayType (org.apache.spark.sql.types.ArrayType), MapType (org.apache.spark.sql.types.MapType), Record (org.apache.avro.generic.GenericData.Record), InternalRow (org.apache.spark.sql.catalyst.InternalRow), ColumnVector (org.apache.spark.sql.vectorized.ColumnVector), IcebergArrowColumnVector (org.apache.iceberg.spark.data.vectorized.IcebergArrowColumnVector)
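
The validity check in assertEqualsBatch rests on a boolean XOR: `expectedValue == null ^ arrowVector.isNull(rowId)` is true exactly when one side is null and the other is not. The following is a minimal standalone sketch of that idiom. It assumes a plain `boolean[]` in place of the Arrow validity vector, and the names `NullabilityCheck` and `firstMismatch` are hypothetical, introduced only for illustration; they are not part of Iceberg or Arrow.

```java
public class NullabilityCheck {

    /**
     * Returns the index of the first row whose null-ness disagrees between the
     * expected values and the isNull flags, or -1 if every row agrees.
     * The isNull array stands in for ValueVector.isNull(rowId) (assumption).
     */
    public static int firstMismatch(Object[] expected, boolean[] isNull) {
        for (int rowId = 0; rowId < isNull.length; rowId++) {
            // == binds tighter than ^, so this reads (expected[rowId] == null) ^ isNull[rowId].
            boolean mismatch = expected[rowId] == null ^ isNull[rowId];
            if (mismatch) {
                return rowId;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Object[] expected = {"a", null, 3};
        // Flags agree with the expected values: only row 1 is null.
        System.out.println(firstMismatch(expected, new boolean[] {false, true, false}));
        // Row 1 is expected to be null but is flagged as non-null.
        System.out.println(firstMismatch(expected, new boolean[] {false, false, false}));
    }
}
```

Returning the offending index rather than a bare boolean is a choice made for this sketch; the original test instead asserts per cell and reports the column's data type in the failure message.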
