Example 11 with MetadataTableType

Use of org.apache.iceberg.MetadataTableType in project iceberg by apache.

From the class TestMetadataTablesWithPartitionEvolution, method testEntriesMetadataTable.

@Test
public void testEntriesMetadataTable() throws ParseException {
    sql("CREATE TABLE %s (id bigint NOT NULL, category string, data string) USING iceberg", tableName);
    initTable();
    sql("INSERT INTO TABLE %s VALUES (1, 'a1', 'b1')", tableName);
    // verify the metadata tables while the current spec is still unpartitioned
    for (MetadataTableType tableType : Arrays.asList(ENTRIES, ALL_ENTRIES)) {
        Dataset<Row> df = loadMetadataTable(tableType);
        StructType dataFileType = (StructType) df.schema().apply("data_file").dataType();
        Assert.assertTrue("Partition must be skipped", dataFileType.getFieldIndex("partition").isEmpty());
    }
    Table table = validationCatalog.loadTable(tableIdent);
    table.updateSpec().addField("data").commit();
    sql("REFRESH TABLE %s", tableName);
    sql("INSERT INTO TABLE %s VALUES (1, 'a1', 'b1')", tableName);
    // verify the metadata tables after adding the first partition column
    for (MetadataTableType tableType : Arrays.asList(ENTRIES, ALL_ENTRIES)) {
        assertPartitions(ImmutableList.of(row(new Object[] { null }), row("b1")), "STRUCT<data:STRING>", tableType);
    }
    table.updateSpec().addField(Expressions.bucket("category", 8)).commit();
    sql("REFRESH TABLE %s", tableName);
    sql("INSERT INTO TABLE %s VALUES (1, 'a1', 'b1')", tableName);
    // verify the metadata tables after adding the second partition column
    for (MetadataTableType tableType : Arrays.asList(ENTRIES, ALL_ENTRIES)) {
        assertPartitions(ImmutableList.of(row(null, null), row("b1", null), row("b1", 2)), "STRUCT<data:STRING,category_bucket_8:INT>", tableType);
    }
    table.updateSpec().removeField("data").commit();
    sql("REFRESH TABLE %s", tableName);
    sql("INSERT INTO TABLE %s VALUES (1, 'a1', 'b1')", tableName);
    // verify the metadata tables after dropping the first partition column
    for (MetadataTableType tableType : Arrays.asList(ENTRIES, ALL_ENTRIES)) {
        assertPartitions(ImmutableList.of(row(null, null), row(null, 2), row("b1", null), row("b1", 2)), "STRUCT<data:STRING,category_bucket_8:INT>", tableType);
    }
    table.updateSpec().renameField("category_bucket_8", "category_bucket_8_another_name").commit();
    sql("REFRESH TABLE %s", tableName);
    // verify the metadata tables after renaming the second partition column
    for (MetadataTableType tableType : Arrays.asList(ENTRIES, ALL_ENTRIES)) {
        assertPartitions(ImmutableList.of(row(null, null), row(null, 2), row("b1", null), row("b1", 2)), "STRUCT<data:STRING,category_bucket_8_another_name:INT>", tableType);
    }
}
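The null entries in the expected rows above come from projecting data files written under older partition specs onto the table's latest partition struct: any field that a file's spec did not have surfaces as null. The sketch below is a toy model of that projection (plain JDK types, not the Iceberg API; the field names mirror the test above):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PartitionProjection {
    // Project a file's partition values (keyed by field name) onto the unified
    // field list; fields missing from the file's spec come back as null.
    static List<Object> project(List<String> unifiedFields, Map<String, Object> filePartition) {
        List<Object> row = new ArrayList<>();
        for (String field : unifiedFields) {
            row.add(filePartition.get(field)); // null when the file's spec lacked the field
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> unified = List.of("data", "category_bucket_8");
        // File written while the table was unpartitioned:
        System.out.println(project(unified, Map.of()));             // [null, null]
        // File written when only "data" was a partition field:
        System.out.println(project(unified, Map.of("data", "b1"))); // [b1, null]
        // File written under the two-field spec:
        System.out.println(project(unified, Map.of("data", "b1", "category_bucket_8", 2)));
    }
}
```

This is why the expected row sets in the test grow to include rows like `row(null, null)` and `row("b1", null)` as the spec evolves: older files keep their original partition tuples, and the metadata tables render them against the unified struct.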
Also used : MetadataTableType (org.apache.iceberg.MetadataTableType), Table (org.apache.iceberg.Table), StructType (org.apache.spark.sql.types.StructType), Row (org.apache.spark.sql.Row), Test (org.junit.Test)

Example 12 with MetadataTableType

Use of org.apache.iceberg.MetadataTableType in project iceberg by apache.

From the class HadoopTables, method parseMetadataType.

/**
 * Try to resolve a metadata table, which we encode as URI fragments
 * e.g. hdfs:///warehouse/my_table#snapshots
 * @param location Path to parse
 * @return A base table name and MetadataTableType if a type is found, null if not
 */
private Pair<String, MetadataTableType> parseMetadataType(String location) {
    int hashIndex = location.lastIndexOf('#');
    if (hashIndex != -1 && !location.endsWith("#")) {
        String baseTable = location.substring(0, hashIndex);
        String metaTable = location.substring(hashIndex + 1);
        MetadataTableType type = MetadataTableType.from(metaTable);
        return (type == null) ? null : Pair.of(baseTable, type);
    } else {
        return null;
    }
}
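The method above resolves a metadata table from a URI fragment, returning null both when no fragment is present and when the fragment names an unknown metadata table. A standalone sketch of the same parsing logic follows, using plain JDK types instead of Iceberg's `Pair` and `MetadataTableType` (the set of table names is a hypothetical subset for illustration):

```java
import java.util.Locale;
import java.util.Set;

public class MetadataFragmentParser {
    // Illustrative subset of metadata table names; the real lookup is
    // MetadataTableType.from(name), which knows the full enum (assumption).
    private static final Set<String> KNOWN_TYPES =
            Set.of("entries", "all_entries", "snapshots", "files", "history");

    // Returns {baseTable, type} when the location ends in a known "#type"
    // fragment; returns null for no fragment, a trailing '#', or an unknown type.
    static String[] parse(String location) {
        int hashIndex = location.lastIndexOf('#');
        if (hashIndex != -1 && !location.endsWith("#")) {
            String baseTable = location.substring(0, hashIndex);
            String metaTable = location.substring(hashIndex + 1).toLowerCase(Locale.ROOT);
            return KNOWN_TYPES.contains(metaTable) ? new String[] { baseTable, metaTable } : null;
        }
        return null;
    }

    public static void main(String[] args) {
        String[] parsed = parse("hdfs:///warehouse/my_table#snapshots");
        System.out.println(parsed[0] + " -> " + parsed[1]); // hdfs:///warehouse/my_table -> snapshots
        System.out.println(parse("hdfs:///warehouse/my_table"));  // null (no fragment)
        System.out.println(parse("hdfs:///warehouse/my_table#")); // null (empty fragment)
    }
}
```

Note that returning null rather than throwing lets the caller fall back to treating the whole location, including the `#`, as an ordinary table path.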
Also used : MetadataTableType(org.apache.iceberg.MetadataTableType)

Aggregations

MetadataTableType (org.apache.iceberg.MetadataTableType): 12
Table (org.apache.iceberg.Table): 9
Test (org.junit.Test): 9
TableOperations (org.apache.iceberg.TableOperations): 3
HasTableOperations (org.apache.iceberg.HasTableOperations): 2
Row (org.apache.spark.sql.Row): 2
Set (java.util.Set): 1
FileSelection (org.apache.drill.exec.store.dfs.FileSelection): 1
BaseTable (org.apache.iceberg.BaseTable): 1
CachingCatalog (org.apache.iceberg.CachingCatalog): 1
PartitionSpec (org.apache.iceberg.PartitionSpec): 1
SerializableTable (org.apache.iceberg.SerializableTable): 1
Snapshot (org.apache.iceberg.Snapshot): 1
StaticTableOperations (org.apache.iceberg.StaticTableOperations): 1
TableMetadata (org.apache.iceberg.TableMetadata): 1
TestableCachingCatalog (org.apache.iceberg.TestableCachingCatalog): 1
Catalog (org.apache.iceberg.catalog.Catalog): 1
TableIdentifier (org.apache.iceberg.catalog.TableIdentifier): 1
NoSuchTableException (org.apache.iceberg.exceptions.NoSuchTableException): 1
StructType (org.apache.spark.sql.types.StructType): 1