Example 1 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the apache/iceberg project.

Class ApplyNameMapping (the variant operating on Parquet schema types), method map.

@Override
public Type map(GroupType map, Type keyType, Type valueType) {
    Preconditions.checkArgument(keyType != null && valueType != null,
        "Map type must have both key field and value field");
    // Look up the mapping for the current path; null when the path is unmapped.
    MappedField field = nameMapping.find(currentPath());
    // Rebuild the map group, reusing the name of the original repeated group.
    Type mapType = Types.buildGroup(map.getRepetition())
        .as(LogicalTypeAnnotation.mapType())
        .repeatedGroup().addFields(keyType, valueType).named(map.getFieldName(0))
        .named(map.getName());
    // Attach the field ID only when a mapping exists.
    return field == null ? mapType : mapType.withId(field.id());
}
Also used : MappedField(org.apache.iceberg.mapping.MappedField) PrimitiveType(org.apache.parquet.schema.PrimitiveType) GroupType(org.apache.parquet.schema.GroupType) MessageType(org.apache.parquet.schema.MessageType) Type(org.apache.parquet.schema.Type)

Example 2 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the apache/iceberg project.

Class ApplyNameMapping (the variant operating on ORC TypeDescription), method map.

@Override
public TypeDescription map(TypeDescription map, TypeDescription key, TypeDescription value) {
    Preconditions.checkArgument(key != null && value != null, "Map type must have both key and value types");
    MappedField field = nameMapping.find(currentPath());
    TypeDescription mapType = TypeDescription.createMap(key, value);
    return setId(mapType, field);
}
Also used : MappedField(org.apache.iceberg.mapping.MappedField) TypeDescription(org.apache.orc.TypeDescription)

Example 3 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the apache/iceberg project.

Class TestSchemaAndMappingUpdate, method testRenameColumn.

@Test
public void testRenameColumn() {
    NameMapping mapping = MappingUtil.create(table.schema());
    String mappingJson = NameMappingParser.toJson(mapping);
    table.updateProperties().set(TableProperties.DEFAULT_NAME_MAPPING, mappingJson).commit();
    table.updateSchema().renameColumn("id", "object_id").commit();
    String updatedJson = table.properties().get(TableProperties.DEFAULT_NAME_MAPPING);
    NameMapping updated = NameMappingParser.fromJson(updatedJson);
    // the rename keeps the field ID, so look it up under the new name
    int idColumnId = table.schema().findField("object_id").fieldId();
    // every mapping except the renamed column's should be unchanged
    validateUnchanged(Iterables.filter(mapping.asMappedFields().fields(), field -> !Objects.equals(idColumnId, field.id())), updated);
    MappedField updatedMapping = updated.find(idColumnId);
    Assert.assertNotNull("Mapping for id column should exist", updatedMapping);
    Assert.assertEquals("Should add the new column name to the existing mapping", MappedField.of(idColumnId, ImmutableList.of("id", "object_id")), updatedMapping);
}
Also used : Types(org.apache.iceberg.types.Types) MappedField(org.apache.iceberg.mapping.MappedField) RunWith(org.junit.runner.RunWith) Set(java.util.Set) NameMappingParser(org.apache.iceberg.mapping.NameMappingParser) Iterables(org.apache.iceberg.relocated.com.google.common.collect.Iterables) Test(org.junit.Test) ImmutableList(org.apache.iceberg.relocated.com.google.common.collect.ImmutableList) Objects(java.util.Objects) ValidationException(org.apache.iceberg.exceptions.ValidationException) MappingUtil(org.apache.iceberg.mapping.MappingUtil) Sets(org.apache.iceberg.relocated.com.google.common.collect.Sets) NameMapping(org.apache.iceberg.mapping.NameMapping) MappedFields(org.apache.iceberg.mapping.MappedFields) Assert(org.junit.Assert) Parameterized(org.junit.runners.Parameterized)
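
The behavior asserted by testRenameColumn (a rename adds the new name to the existing mapping while the field ID and the old name stay in place) can be modeled without Iceberg on the classpath. The following is a minimal sketch under stated assumptions: RenameMappingSketch and its methods add, rename, and names are hypothetical stand-ins invented for illustration, not Iceberg's NameMapping or MappedField API.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a name mapping: field ID -> every name that resolves to it.
// This is NOT Iceberg's NameMapping; it only models the behavior the test asserts.
class RenameMappingSketch {
    private final Map<Integer, Set<String>> namesById = new LinkedHashMap<>();

    // Register a field ID under its initial name.
    void add(int id, String name) {
        namesById.computeIfAbsent(id, k -> new LinkedHashSet<>()).add(name);
    }

    // A rename adds the new name to the entry that already holds the old name.
    // The ID and the old name are left untouched.
    void rename(String oldName, String newName) {
        for (Set<String> names : namesById.values()) {
            if (names.contains(oldName)) {
                names.add(newName);
                return;
            }
        }
        throw new IllegalArgumentException("Unknown column: " + oldName);
    }

    // All names currently mapped to the given ID, or null if the ID is unknown.
    Set<String> names(int id) {
        return namesById.get(id);
    }

    public static void main(String[] args) {
        RenameMappingSketch mapping = new RenameMappingSketch();
        mapping.add(1, "id");
        mapping.add(2, "data");
        mapping.rename("id", "object_id");
        System.out.println(mapping.names(1)); // prints [id, object_id]
        System.out.println(mapping.names(2)); // prints [data]
    }
}
```

Keeping the old name as an alias means a lookup by either "id" or "object_id" lands on the same field ID, which is what the MappedField.of(idColumnId, ImmutableList.of("id", "object_id")) assertion in the test checks.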

Example 4 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the apache/iceberg project.

Class TestSchemaAndMappingUpdate, method testDeleteAndRenameColumnReassign.

@Test
public void testDeleteAndRenameColumnReassign() {
    NameMapping mapping = MappingUtil.create(table.schema());
    String mappingJson = NameMappingParser.toJson(mapping);
    table.updateProperties().set(TableProperties.DEFAULT_NAME_MAPPING, mappingJson).commit();
    // the original field ID
    int startIdColumnId = table.schema().findField("id").fieldId();
    table.updateSchema().deleteColumn("id").commit();
    // rename the data column to id
    table.updateSchema().renameColumn("data", "id").commit();
    String updatedJson = table.properties().get(TableProperties.DEFAULT_NAME_MAPPING);
    NameMapping updated = NameMappingParser.fromJson(updatedJson);
    // the new field ID
    int idColumnId = table.schema().findField("id").fieldId();
    Set<Integer> changedIds = Sets.newHashSet(startIdColumnId, idColumnId);
    validateUnchanged(Iterables.filter(mapping.asMappedFields().fields(), field -> !changedIds.contains(field.id())), updated);
    MappedField newMapping = updated.find("id");
    Assert.assertNotNull("Mapping for id column should exist", newMapping);
    Assert.assertEquals("Mapping should use the new field ID", (Integer) idColumnId, newMapping.id());
    Assert.assertEquals("Should have both names", Sets.newHashSet("id", "data"), newMapping.names());
    Assert.assertNull("Should not contain a nested mapping", newMapping.nestedMapping());
    MappedField updatedMapping = updated.find(startIdColumnId);
    Assert.assertNotNull("Mapping for original id column should exist", updatedMapping);
    Assert.assertEquals("Mapping should use the original field ID", (Integer) startIdColumnId, updatedMapping.id());
    Assert.assertFalse("Should not use id as a name", updatedMapping.names().contains("id"));
    Assert.assertNull("Should not contain a nested mapping", updatedMapping.nestedMapping());
}
Also used : Types(org.apache.iceberg.types.Types) MappedField(org.apache.iceberg.mapping.MappedField) RunWith(org.junit.runner.RunWith) Set(java.util.Set) NameMappingParser(org.apache.iceberg.mapping.NameMappingParser) Iterables(org.apache.iceberg.relocated.com.google.common.collect.Iterables) Test(org.junit.Test) ImmutableList(org.apache.iceberg.relocated.com.google.common.collect.ImmutableList) Objects(java.util.Objects) ValidationException(org.apache.iceberg.exceptions.ValidationException) MappingUtil(org.apache.iceberg.mapping.MappingUtil) Sets(org.apache.iceberg.relocated.com.google.common.collect.Sets) NameMapping(org.apache.iceberg.mapping.NameMapping) MappedFields(org.apache.iceberg.mapping.MappedFields) Assert(org.junit.Assert) Parameterized(org.junit.runners.Parameterized)

Example 5 with MappedField

Use of org.apache.iceberg.mapping.MappedField in the apache/iceberg project.

Class TestSchemaAndMappingUpdate, method testRenameAndAddColumnReassign.

@Test
public void testRenameAndAddColumnReassign() {
    NameMapping mapping = MappingUtil.create(table.schema());
    String mappingJson = NameMappingParser.toJson(mapping);
    table.updateProperties().set(TableProperties.DEFAULT_NAME_MAPPING, mappingJson).commit();
    // the original field ID
    int startIdColumnId = table.schema().findField("id").fieldId();
    table.updateSchema().renameColumn("id", "object_id").commit();
    NameMapping afterRename = NameMappingParser.fromJson(table.properties().get(TableProperties.DEFAULT_NAME_MAPPING));
    Assert.assertEquals("Renamed column should have both names", Sets.newHashSet("id", "object_id"), afterRename.find(startIdColumnId).names());
    // add a new column with the renamed column's old name
    // also, rename the original column again to ensure its names are handled correctly
    table.updateSchema().renameColumn("object_id", "oid").addColumn("id", Types.StringType.get()).commit();
    String updatedJson = table.properties().get(TableProperties.DEFAULT_NAME_MAPPING);
    NameMapping updated = NameMappingParser.fromJson(updatedJson);
    // the new field ID
    int idColumnId = table.schema().findField("id").fieldId();
    Set<Integer> changedIds = Sets.newHashSet(startIdColumnId, idColumnId);
    validateUnchanged(Iterables.filter(afterRename.asMappedFields().fields(), field -> !changedIds.contains(field.id())), updated);
    MappedField newMapping = updated.find("id");
    Assert.assertNotNull("Mapping for id column should exist", newMapping);
    Assert.assertEquals("Mapping should use the new field ID", (Integer) idColumnId, newMapping.id());
    Assert.assertNull("Should not contain a nested mapping", newMapping.nestedMapping());
    MappedField updatedMapping = updated.find(startIdColumnId);
    Assert.assertNotNull("Mapping for original id column should exist", updatedMapping);
    Assert.assertEquals("Mapping should use the original field ID", (Integer) startIdColumnId, updatedMapping.id());
    Assert.assertEquals("Should not use id as a name", Sets.newHashSet("object_id", "oid"), updatedMapping.names());
    Assert.assertNull("Should not contain a nested mapping", updatedMapping.nestedMapping());
}
Also used : Types(org.apache.iceberg.types.Types) MappedField(org.apache.iceberg.mapping.MappedField) RunWith(org.junit.runner.RunWith) Set(java.util.Set) NameMappingParser(org.apache.iceberg.mapping.NameMappingParser) Iterables(org.apache.iceberg.relocated.com.google.common.collect.Iterables) Test(org.junit.Test) ImmutableList(org.apache.iceberg.relocated.com.google.common.collect.ImmutableList) Objects(java.util.Objects) ValidationException(org.apache.iceberg.exceptions.ValidationException) MappingUtil(org.apache.iceberg.mapping.MappingUtil) Sets(org.apache.iceberg.relocated.com.google.common.collect.Sets) NameMapping(org.apache.iceberg.mapping.NameMapping) MappedFields(org.apache.iceberg.mapping.MappedFields) Assert(org.junit.Assert) Parameterized(org.junit.runners.Parameterized)
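
The reassignment that testRenameAndAddColumnReassign checks (a name reused by a newer column is dropped from the older column's mapping, so each name resolves to one field ID) can be sketched in plain Java. This is an illustrative model only: ReassignMappingSketch, addColumn, renameColumn, find, and names are hypothetical names chosen here, and the "claim" rule is an assumption inferred from the test's assertions rather than a description of Iceberg's internals.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model: field ID -> names, where each name belongs to at most one ID.
// This is NOT Iceberg's implementation; it mirrors only what the test asserts.
class ReassignMappingSketch {
    private final Map<Integer, Set<String>> namesById = new LinkedHashMap<>();

    // Adding a column claims its name: any other ID that held it gives it up.
    void addColumn(int id, String name) {
        claim(id, name);
    }

    // Renaming keeps earlier names as aliases and claims the new name.
    void renameColumn(String oldName, String newName) {
        Integer id = find(oldName);
        if (id == null) {
            throw new IllegalArgumentException("Unknown column: " + oldName);
        }
        claim(id, newName);
    }

    // Resolve a name to its field ID, or null if no ID holds it.
    Integer find(String name) {
        for (Map.Entry<Integer, Set<String>> entry : namesById.entrySet()) {
            if (entry.getValue().contains(name)) {
                return entry.getKey();
            }
        }
        return null;
    }

    // All names currently mapped to the given ID, or null if the ID is unknown.
    Set<String> names(int id) {
        return namesById.get(id);
    }

    // Remove the name from every entry, then assign it to the given ID.
    private void claim(int id, String name) {
        for (Set<String> names : namesById.values()) {
            names.remove(name);
        }
        namesById.computeIfAbsent(id, k -> new LinkedHashSet<>()).add(name);
    }

    public static void main(String[] args) {
        ReassignMappingSketch mapping = new ReassignMappingSketch();
        mapping.addColumn(1, "id");
        mapping.addColumn(2, "data");
        mapping.renameColumn("id", "object_id");  // ID 1 now holds id and object_id
        mapping.renameColumn("object_id", "oid"); // ID 1 now holds id, object_id, oid
        mapping.addColumn(3, "id");               // the new column takes over "id"
        System.out.println(mapping.names(1));     // prints [object_id, oid]
        System.out.println(mapping.names(3));     // prints [id]
        System.out.println(mapping.find("id"));   // prints 3
    }
}
```

The same claim rule also reproduces the outcome of testDeleteAndRenameColumnReassign: renaming "data" to "id" moves "id" onto the data column's ID and removes it from the original column's entry, which stays in the mapping under its original field ID.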

Aggregations

MappedField (org.apache.iceberg.mapping.MappedField): 14
NameMapping (org.apache.iceberg.mapping.NameMapping): 7
Test (org.junit.Test): 7
Objects (java.util.Objects): 5
Set (java.util.Set): 5
ValidationException (org.apache.iceberg.exceptions.ValidationException): 5
MappedFields (org.apache.iceberg.mapping.MappedFields): 5
MappingUtil (org.apache.iceberg.mapping.MappingUtil): 5
NameMappingParser (org.apache.iceberg.mapping.NameMappingParser): 5
ImmutableList (org.apache.iceberg.relocated.com.google.common.collect.ImmutableList): 5
Iterables (org.apache.iceberg.relocated.com.google.common.collect.Iterables): 5
Sets (org.apache.iceberg.relocated.com.google.common.collect.Sets): 5
Types (org.apache.iceberg.types.Types): 5
Assert (org.junit.Assert): 5
RunWith (org.junit.runner.RunWith): 5
Parameterized (org.junit.runners.Parameterized): 5
TypeDescription (org.apache.orc.TypeDescription): 3
GroupType (org.apache.parquet.schema.GroupType): 3
MessageType (org.apache.parquet.schema.MessageType): 3
PrimitiveType (org.apache.parquet.schema.PrimitiveType): 3