
Example 6 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the Apache Iceberg project.

Class TestSchemaAndMappingUpdate, method testDeleteAndAddColumnReassign.

@Test
public void testDeleteAndAddColumnReassign() {
    NameMapping mapping = MappingUtil.create(table.schema());
    String mappingJson = NameMappingParser.toJson(mapping);
    table.updateProperties().set(TableProperties.DEFAULT_NAME_MAPPING, mappingJson).commit();
    // the original field ID
    int startIdColumnId = table.schema().findField("id").fieldId();
    table.updateSchema().deleteColumn("id").commit();
    // add the same column name back to the table with a different field ID
    table.updateSchema().addColumn("id", Types.StringType.get()).commit();
    String updatedJson = table.properties().get(TableProperties.DEFAULT_NAME_MAPPING);
    NameMapping updated = NameMappingParser.fromJson(updatedJson);
    // the new field ID
    int idColumnId = table.schema().findField("id").fieldId();
    Set<Integer> changedIds = Sets.newHashSet(startIdColumnId, idColumnId);
    validateUnchanged(Iterables.filter(mapping.asMappedFields().fields(), field -> !changedIds.contains(field.id())), updated);
    MappedField newMapping = updated.find("id");
    Assert.assertNotNull("Mapping for id column should exist", newMapping);
    Assert.assertEquals("Mapping should use the new field ID", (Integer) idColumnId, newMapping.id());
    Assert.assertNull("Should not contain a nested mapping", newMapping.nestedMapping());
    MappedField updatedMapping = updated.find(startIdColumnId);
    Assert.assertNotNull("Mapping for original id column should exist", updatedMapping);
    Assert.assertEquals("Mapping should use the original field ID", (Integer) startIdColumnId, updatedMapping.id());
    Assert.assertFalse("Should not use id as a name", updatedMapping.names().contains("id"));
    Assert.assertNull("Should not contain a nested mapping", updatedMapping.nestedMapping());
}
Also used : Types(org.apache.iceberg.types.Types) MappedField(org.apache.iceberg.mapping.MappedField) RunWith(org.junit.runner.RunWith) Set(java.util.Set) NameMappingParser(org.apache.iceberg.mapping.NameMappingParser) Iterables(org.apache.iceberg.relocated.com.google.common.collect.Iterables) Test(org.junit.Test) ImmutableList(org.apache.iceberg.relocated.com.google.common.collect.ImmutableList) Objects(java.util.Objects) ValidationException(org.apache.iceberg.exceptions.ValidationException) MappingUtil(org.apache.iceberg.mapping.MappingUtil) Sets(org.apache.iceberg.relocated.com.google.common.collect.Sets) NameMapping(org.apache.iceberg.mapping.NameMapping) MappedFields(org.apache.iceberg.mapping.MappedFields) Assert(org.junit.Assert) Parameterized(org.junit.runners.Parameterized)

Example 7 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the Apache Iceberg project.

Class TestSchemaAndMappingUpdate, method testAddStructColumn.

@Test
public void testAddStructColumn() {
    NameMapping mapping = MappingUtil.create(table.schema());
    String mappingJson = NameMappingParser.toJson(mapping);
    table.updateProperties().set(TableProperties.DEFAULT_NAME_MAPPING, mappingJson).commit();
    // add a struct column with two nested double fields
    table.updateSchema()
        .addColumn(
            "location",
            Types.StructType.of(
                Types.NestedField.optional(1, "lat", Types.DoubleType.get()),
                Types.NestedField.optional(2, "long", Types.DoubleType.get())))
        .commit();
    String updatedJson = table.properties().get(TableProperties.DEFAULT_NAME_MAPPING);
    NameMapping updated = NameMappingParser.fromJson(updatedJson);
    validateUnchanged(mapping, updated);
    MappedField newMapping = updated.find("location");
    Assert.assertNotNull("Mapping for new column should be added", newMapping);
    Assert.assertEquals("Mapping should use the assigned field ID", (Integer) table.schema().findField("location").fieldId(), updated.find("location").id());
    Assert.assertNotNull("Should contain a nested mapping", updated.find("location").nestedMapping());
    Assert.assertEquals("Mapping should use the assigned field ID", (Integer) table.schema().findField("location.lat").fieldId(), updated.find("location.lat").id());
    Assert.assertNull("Should not contain a nested mapping", updated.find("location.lat").nestedMapping());
    Assert.assertEquals("Mapping should use the assigned field ID", (Integer) table.schema().findField("location.long").fieldId(), updated.find("location.long").id());
    Assert.assertNull("Should not contain a nested mapping", updated.find("location.long").nestedMapping());
}
Also used : MappedField(org.apache.iceberg.mapping.MappedField) NameMapping(org.apache.iceberg.mapping.NameMapping) Test(org.junit.Test)
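
The dotted-path lookups in the test above, such as updated.find("location.lat"), can be pictured with a small standalone model. This is a minimal sketch and not Iceberg code: the names NestedMappingSketch, Field, and find are hypothetical, the field IDs are made up, and it assumes field names contain no dots.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a nested name mapping; not the Iceberg implementation.
final class NestedMappingSketch {

    // A mapped field: an ID, a name, and zero or more nested children.
    static final class Field {
        final int id;
        final String name;
        final List<Field> children;

        Field(int id, String name, Field... children) {
            this.id = id;
            this.name = name;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }
    }

    // Resolves a dotted path one segment at a time.
    // Returns null as soon as a segment has no matching field at its level.
    static Field find(List<Field> roots, String path) {
        List<Field> level = roots;
        Field current = null;
        for (String segment : path.split("\\.")) {
            current = null;
            for (Field candidate : level) {
                if (candidate.name.equals(segment)) {
                    current = candidate;
                    break;
                }
            }
            if (current == null) {
                return null;
            }
            level = current.children;
        }
        return current;
    }

    public static void main(String[] args) {
        // IDs here are illustrative only.
        List<Field> roots = Arrays.asList(
            new Field(1, "id"),
            new Field(2, "data"),
            new Field(3, "location", new Field(4, "lat"), new Field(5, "long")));

        Field lat = find(roots, "location.lat");
        System.out.println("location.lat has id " + lat.id);
        System.out.println("location.alt found: " + (find(roots, "location.alt") != null));
    }
}
```

In this model a struct field carries children while a primitive field has none, which loosely mirrors the test's checks that "location" has a nested mapping and "location.lat" does not.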

Example 8 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the Apache Iceberg project.

Class TestSchemaAndMappingUpdate, method testAddPrimitiveColumn.

@Test
public void testAddPrimitiveColumn() {
    NameMapping mapping = MappingUtil.create(table.schema());
    String mappingJson = NameMappingParser.toJson(mapping);
    table.updateProperties().set(TableProperties.DEFAULT_NAME_MAPPING, mappingJson).commit();
    table.updateSchema().addColumn("count", Types.LongType.get()).commit();
    String updatedJson = table.properties().get(TableProperties.DEFAULT_NAME_MAPPING);
    NameMapping updated = NameMappingParser.fromJson(updatedJson);
    validateUnchanged(mapping, updated);
    MappedField newMapping = updated.find("count");
    Assert.assertNotNull("Mapping for new column should be added", newMapping);
    Assert.assertEquals("Mapping should use the assigned field ID", (Integer) table.schema().findField("count").fieldId(), updated.find("count").id());
    Assert.assertNull("Should not contain a nested mapping", updated.find("count").nestedMapping());
}
Also used : MappedField(org.apache.iceberg.mapping.MappedField) NameMapping(org.apache.iceberg.mapping.NameMapping) Test(org.junit.Test)

Example 9 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the Apache Iceberg project.

Class TestSchemaAndMappingUpdate, method testRenameAndRenameColumnReassign.

@Test
public void testRenameAndRenameColumnReassign() {
    NameMapping mapping = MappingUtil.create(table.schema());
    String mappingJson = NameMappingParser.toJson(mapping);
    table.updateProperties().set(TableProperties.DEFAULT_NAME_MAPPING, mappingJson).commit();
    // the original field ID
    int startIdColumnId = table.schema().findField("id").fieldId();
    table.updateSchema().renameColumn("id", "object_id").commit();
    NameMapping afterRename = NameMappingParser.fromJson(table.properties().get(TableProperties.DEFAULT_NAME_MAPPING));
    Assert.assertEquals("Renamed column should have both names", Sets.newHashSet("id", "object_id"), afterRename.find(startIdColumnId).names());
    // rename the data column to the renamed column's old name
    // also, rename the original column again to ensure its names are handled correctly
    table.updateSchema().renameColumn("object_id", "oid").renameColumn("data", "id").commit();
    String updatedJson = table.properties().get(TableProperties.DEFAULT_NAME_MAPPING);
    NameMapping updated = NameMappingParser.fromJson(updatedJson);
    // the new field ID
    int idColumnId = table.schema().findField("id").fieldId();
    Set<Integer> changedIds = Sets.newHashSet(startIdColumnId, idColumnId);
    validateUnchanged(Iterables.filter(afterRename.asMappedFields().fields(), field -> !changedIds.contains(field.id())), updated);
    MappedField newMapping = updated.find("id");
    Assert.assertNotNull("Mapping for id column should exist", newMapping);
    Assert.assertEquals("Renamed column should have both names", Sets.newHashSet("id", "data"), newMapping.names());
    Assert.assertEquals("Mapping should use the new field ID", (Integer) idColumnId, newMapping.id());
    Assert.assertNull("Should not contain a nested mapping", newMapping.nestedMapping());
    MappedField updatedMapping = updated.find(startIdColumnId);
    Assert.assertNotNull("Mapping for original id column should exist", updatedMapping);
    Assert.assertEquals("Mapping should use the original field ID", (Integer) startIdColumnId, updatedMapping.id());
    Assert.assertEquals("Should not use id as a name", Sets.newHashSet("object_id", "oid"), updatedMapping.names());
    Assert.assertNull("Should not contain a nested mapping", updatedMapping.nestedMapping());
}
Also used : Types(org.apache.iceberg.types.Types) MappedField(org.apache.iceberg.mapping.MappedField) RunWith(org.junit.runner.RunWith) Set(java.util.Set) NameMappingParser(org.apache.iceberg.mapping.NameMappingParser) Iterables(org.apache.iceberg.relocated.com.google.common.collect.Iterables) Test(org.junit.Test) ImmutableList(org.apache.iceberg.relocated.com.google.common.collect.ImmutableList) Objects(java.util.Objects) ValidationException(org.apache.iceberg.exceptions.ValidationException) MappingUtil(org.apache.iceberg.mapping.MappingUtil) Sets(org.apache.iceberg.relocated.com.google.common.collect.Sets) NameMapping(org.apache.iceberg.mapping.NameMapping) MappedFields(org.apache.iceberg.mapping.MappedFields) Assert(org.junit.Assert) Parameterized(org.junit.runners.Parameterized)
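
The name bookkeeping that the rename test asserts (a renamed field keeps its old names as aliases, and a name moves away from a field once another field takes it) can be illustrated with a toy model. This is a sketch under stated assumptions, not how Iceberg implements it: NameMappingSketch and its methods are hypothetical, and the field IDs 1 and 2 are invented for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of ID-to-names bookkeeping; not the Iceberg implementation.
final class NameMappingSketch {
    private final Map<Integer, Set<String>> namesById = new HashMap<>();

    // Registers a field ID under its initial name.
    void add(int id, String name) {
        namesById.computeIfAbsent(id, key -> new LinkedHashSet<>()).add(name);
    }

    // Renames a field: the new name is removed from every other field,
    // and the renamed field keeps its earlier names as aliases.
    void rename(int id, String newName) {
        for (Map.Entry<Integer, Set<String>> entry : namesById.entrySet()) {
            if (entry.getKey() != id) {
                entry.getValue().remove(newName);
            }
        }
        add(id, newName);
    }

    // Returns the ID that currently owns the name, or null if no field does.
    Integer find(String name) {
        for (Map.Entry<Integer, Set<String>> entry : namesById.entrySet()) {
            if (entry.getValue().contains(name)) {
                return entry.getKey();
            }
        }
        return null;
    }

    // Returns a copy of the names recorded for a field ID.
    Set<String> names(int id) {
        return new LinkedHashSet<>(namesById.getOrDefault(id, new LinkedHashSet<>()));
    }

    public static void main(String[] args) {
        NameMappingSketch mapping = new NameMappingSketch();
        mapping.add(1, "id");     // illustrative IDs
        mapping.add(2, "data");

        mapping.rename(1, "object_id");
        mapping.rename(1, "oid");
        mapping.rename(2, "id");  // "id" now belongs to field 2

        System.out.println("field 1 names: " + mapping.names(1));
        System.out.println("field 2 names: " + mapping.names(2));
        System.out.println("owner of id: " + mapping.find("id"));
    }
}
```

Replaying the test's renames against this model leaves the original field with the names object_id and oid, and the former data field with data and id, which matches the sets the assertions above expect.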

Example 10 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the Apache Iceberg project.

Class ApplyNameMapping, method struct.

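@Override
public Type struct(GroupType struct, List<Type> types) {
    // look up the mapping entry for the struct at the current path
    MappedField field = nameMapping.find(currentPath());
    // keep only the child types that are non-null
    List<Type> actualTypes = types.stream().filter(Objects::nonNull).collect(Collectors.toList());
    Type structType = struct.withNewFields(actualTypes);
    // attach the mapped field ID only when a mapping entry was found
    return field == null ? structType : structType.withId(field.id());
}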
Also used : MappedField(org.apache.iceberg.mapping.MappedField) PrimitiveType(org.apache.parquet.schema.PrimitiveType) GroupType(org.apache.parquet.schema.GroupType) MessageType(org.apache.parquet.schema.MessageType) Type(org.apache.parquet.schema.Type)

Aggregations

MappedField (org.apache.iceberg.mapping.MappedField): 14
NameMapping (org.apache.iceberg.mapping.NameMapping): 7
Test (org.junit.Test): 7
Objects (java.util.Objects): 5
Set (java.util.Set): 5
ValidationException (org.apache.iceberg.exceptions.ValidationException): 5
MappedFields (org.apache.iceberg.mapping.MappedFields): 5
MappingUtil (org.apache.iceberg.mapping.MappingUtil): 5
NameMappingParser (org.apache.iceberg.mapping.NameMappingParser): 5
ImmutableList (org.apache.iceberg.relocated.com.google.common.collect.ImmutableList): 5
Iterables (org.apache.iceberg.relocated.com.google.common.collect.Iterables): 5
Sets (org.apache.iceberg.relocated.com.google.common.collect.Sets): 5
Types (org.apache.iceberg.types.Types): 5
Assert (org.junit.Assert): 5
RunWith (org.junit.runner.RunWith): 5
Parameterized (org.junit.runners.Parameterized): 5
TypeDescription (org.apache.orc.TypeDescription): 3
GroupType (org.apache.parquet.schema.GroupType): 3
MessageType (org.apache.parquet.schema.MessageType): 3
PrimitiveType (org.apache.parquet.schema.PrimitiveType): 3