
Example 1 with SparkTable

Use of org.apache.iceberg.spark.source.SparkTable in project iceberg by apache.

From class Spark3Util, method toIcebergTable:

public static org.apache.iceberg.Table toIcebergTable(Table table) {
    Preconditions.checkArgument(table instanceof SparkTable, "Table %s is not an Iceberg table", table);
    SparkTable sparkTable = (SparkTable) table;
    return sparkTable.table();
}
Also used: SparkTable (org.apache.iceberg.spark.source.SparkTable)
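The method above is a small unwrap helper: Spark hands the catalog a connector-level Table, and the helper narrows it to Iceberg's SparkTable wrapper before pulling out the underlying Iceberg table. A minimal, self-contained sketch of the same pattern, using hypothetical stand-in types rather than the real Spark and Iceberg classes:

```java
// Stand-ins for the real types: Spark's Table interface and Iceberg's table class.
interface ConnectorTable {}

final class IcebergTable {
    final String name;
    IcebergTable(String name) { this.name = name; }
}

// Wrapper that exposes the underlying Iceberg table, like SparkTable.table().
final class WrapperTable implements ConnectorTable {
    private final IcebergTable wrapped;
    WrapperTable(IcebergTable wrapped) { this.wrapped = wrapped; }
    IcebergTable table() { return wrapped; }
}

public class UnwrapDemo {
    // Mirrors Spark3Util.toIcebergTable: check the runtime type, then unwrap.
    static IcebergTable toIcebergTable(ConnectorTable table) {
        if (!(table instanceof WrapperTable)) {
            throw new IllegalArgumentException("Table " + table + " is not an Iceberg table");
        }
        return ((WrapperTable) table).table();
    }

    public static void main(String[] args) {
        IcebergTable t = toIcebergTable(new WrapperTable(new IcebergTable("db.t")));
        System.out.println(t.name); // prints "db.t"
    }
}
```

The real helper uses Preconditions.checkArgument for the type check; the effect is the same: fail fast with a clear message when a non-Iceberg table reaches Iceberg-specific code.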

Example 2 with SparkTable

Use of org.apache.iceberg.spark.source.SparkTable in project iceberg by apache.

From class SparkCatalog, method createTable:

@Override
public SparkTable createTable(Identifier ident, StructType schema, Transform[] transforms, Map<String, String> properties) throws TableAlreadyExistsException {
    Schema icebergSchema = SparkSchemaUtil.convert(schema, useTimestampsWithoutZone);
    try {
        Catalog.TableBuilder builder = newBuilder(ident, icebergSchema);
        Table icebergTable = builder
            .withPartitionSpec(Spark3Util.toPartitionSpec(icebergSchema, transforms))
            .withLocation(properties.get("location"))
            .withProperties(Spark3Util.rebuildCreateProperties(properties))
            .create();
        return new SparkTable(icebergTable, !cacheEnabled);
    } catch (AlreadyExistsException e) {
        throw new TableAlreadyExistsException(ident);
    }
}
Also used: TableAlreadyExistsException (org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException), StagedSparkTable (org.apache.iceberg.spark.source.StagedSparkTable), StagedTable (org.apache.spark.sql.connector.catalog.StagedTable), Table (org.apache.iceberg.Table), SparkTable (org.apache.iceberg.spark.source.SparkTable), AlreadyExistsException (org.apache.iceberg.exceptions.AlreadyExistsException), NamespaceAlreadyExistsException (org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException), Schema (org.apache.iceberg.Schema), TableCatalog (org.apache.spark.sql.connector.catalog.TableCatalog), CachingCatalog (org.apache.iceberg.CachingCatalog), HadoopCatalog (org.apache.iceberg.hadoop.HadoopCatalog), Catalog (org.apache.iceberg.catalog.Catalog)
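createTable translates between two exception worlds: the Iceberg catalog signals a conflict with its own AlreadyExistsException, while the Spark connector API expects TableAlreadyExistsException. A hedged sketch of that boundary-translation pattern, with hypothetical stand-in exception types rather than the real Spark and Iceberg classes:

```java
// Stand-ins for the two exception hierarchies involved (not the real classes).
class EngineAlreadyExistsException extends RuntimeException {}    // like Iceberg's AlreadyExistsException
class ApiTableAlreadyExistsException extends Exception {          // like Spark's TableAlreadyExistsException
    ApiTableAlreadyExistsException(String ident) { super("Table already exists: " + ident); }
}

public class ExceptionTranslationDemo {
    // Pretend backing store that rejects duplicate identifiers.
    private static final java.util.Set<String> TABLES = new java.util.HashSet<>();

    static void createInEngine(String ident) {
        if (!TABLES.add(ident)) {
            throw new EngineAlreadyExistsException();
        }
    }

    // Mirrors SparkCatalog.createTable: catch the engine's exception at the
    // boundary and rethrow the exception type the calling API expects.
    static void createTable(String ident) throws ApiTableAlreadyExistsException {
        try {
            createInEngine(ident);
        } catch (EngineAlreadyExistsException e) {
            throw new ApiTableAlreadyExistsException(ident);
        }
    }
}
```

Keeping the translation at the catalog boundary means callers of the Spark API never see Iceberg's internal exception types.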

Example 3 with SparkTable

Use of org.apache.iceberg.spark.source.SparkTable in project iceberg by apache.

From class SparkCatalog, method alterTable:

@Override
public SparkTable alterTable(Identifier ident, TableChange... changes) throws NoSuchTableException {
    SetProperty setLocation = null;
    SetProperty setSnapshotId = null;
    SetProperty pickSnapshotId = null;
    List<TableChange> propertyChanges = Lists.newArrayList();
    List<TableChange> schemaChanges = Lists.newArrayList();
    for (TableChange change : changes) {
        if (change instanceof SetProperty) {
            SetProperty set = (SetProperty) change;
            if (TableCatalog.PROP_LOCATION.equalsIgnoreCase(set.property())) {
                setLocation = set;
            } else if ("current-snapshot-id".equalsIgnoreCase(set.property())) {
                setSnapshotId = set;
            } else if ("cherry-pick-snapshot-id".equalsIgnoreCase(set.property())) {
                pickSnapshotId = set;
            } else if ("sort-order".equalsIgnoreCase(set.property())) {
                throw new UnsupportedOperationException(
                    "Cannot specify the 'sort-order' because it's a reserved table property. " +
                    "Please use the command 'ALTER TABLE ... WRITE ORDERED BY' to specify write sort-orders.");
            } else {
                propertyChanges.add(set);
            }
        } else if (change instanceof RemoveProperty) {
            propertyChanges.add(change);
        } else if (change instanceof ColumnChange) {
            schemaChanges.add(change);
        } else {
            throw new UnsupportedOperationException("Cannot apply unknown table change: " + change);
        }
    }
    try {
        Table table = load(ident).first();
        commitChanges(table, setLocation, setSnapshotId, pickSnapshotId, propertyChanges, schemaChanges);
        return new SparkTable(table, true);
    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
        throw new NoSuchTableException(ident);
    }
}
Also used: RemoveProperty (org.apache.spark.sql.connector.catalog.TableChange.RemoveProperty), StagedSparkTable (org.apache.iceberg.spark.source.StagedSparkTable), StagedTable (org.apache.spark.sql.connector.catalog.StagedTable), Table (org.apache.iceberg.Table), SparkTable (org.apache.iceberg.spark.source.SparkTable), ColumnChange (org.apache.spark.sql.connector.catalog.TableChange.ColumnChange), NoSuchTableException (org.apache.spark.sql.catalyst.analysis.NoSuchTableException), SetProperty (org.apache.spark.sql.connector.catalog.TableChange.SetProperty), TableChange (org.apache.spark.sql.connector.catalog.TableChange)
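alterTable is essentially a dispatch loop: it walks the incoming TableChange array once, peels off the few SetProperty keys that need special handling (location, snapshot pins), and buckets the rest into property changes and schema changes before committing them together. A self-contained sketch of that classification step, using hypothetical stand-ins for Spark's TableChange hierarchy:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Spark's TableChange hierarchy.
interface Change {}
record SetProp(String property, String value) implements Change {}
record RemoveProp(String property) implements Change {}
record ColumnChangeStub(String column) implements Change {}

public class ChangeDispatchDemo {
    final List<Change> propertyChanges = new ArrayList<>();
    final List<Change> schemaChanges = new ArrayList<>();
    SetProp setLocation = null;

    // Mirrors the classification loop in SparkCatalog.alterTable.
    void classify(Change... changes) {
        for (Change change : changes) {
            if (change instanceof SetProp set) {
                if ("location".equalsIgnoreCase(set.property())) {
                    setLocation = set;          // handled specially, not a plain property
                } else {
                    propertyChanges.add(set);
                }
            } else if (change instanceof RemoveProp) {
                propertyChanges.add(change);
            } else if (change instanceof ColumnChangeStub) {
                schemaChanges.add(change);
            } else {
                throw new UnsupportedOperationException("Cannot apply unknown table change: " + change);
            }
        }
    }
}
```

Classifying first and committing afterwards lets all the changes land in a single atomic commit, which is why the real method passes the buckets to one commitChanges call.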

Example 4 with SparkTable

Use of org.apache.iceberg.spark.source.SparkTable in project iceberg by apache.

From class BaseProcedure, method loadSparkTable:

protected SparkTable loadSparkTable(Identifier ident) {
    try {
        Table table = tableCatalog.loadTable(ident);
        ValidationException.check(table instanceof SparkTable, "%s is not %s", ident, SparkTable.class.getName());
        return (SparkTable) table;
    } catch (NoSuchTableException e) {
        String errMsg = String.format("Couldn't load table '%s' in catalog '%s'", ident, tableCatalog.name());
        throw new RuntimeException(errMsg, e);
    }
}
Also used: SparkTable (org.apache.iceberg.spark.source.SparkTable), Table (org.apache.spark.sql.connector.catalog.Table), NoSuchTableException (org.apache.spark.sql.catalyst.analysis.NoSuchTableException)

Example 5 with SparkTable

Use of org.apache.iceberg.spark.source.SparkTable in project iceberg by apache.

From class TestCreateActions, method testAddColumnOnMigratedTableAtMiddle:

@Test
public void testAddColumnOnMigratedTableAtMiddle() throws Exception {
    Assume.assumeTrue("Cannot migrate to a hadoop based catalog", !type.equals("hadoop"));
    Assume.assumeTrue("Can only migrate from Spark Session Catalog", catalog.name().equals("spark_catalog"));
    String source = sourceName("test_add_column_migrated_table_middle");
    String dest = source;
    createSourceTable(CREATE_PARQUET, source);
    // migrate table
    SparkActions.get().migrateTable(source).execute();
    SparkTable sparkTable = loadTable(dest);
    Table table = sparkTable.table();
    List<Object[]> expected = sql("select id, null, data from %s order by id", source);
    // test column addition on migrated table
    Schema beforeSchema = table.schema();
    String newCol1 = "newCol";
    sparkTable.table()
        .updateSchema()
        .addColumn(newCol1, Types.IntegerType.get())
        .moveAfter(newCol1, "id")
        .commit();
    Schema afterSchema = table.schema();
    Assert.assertNull(beforeSchema.findField(newCol1));
    Assert.assertNotNull(afterSchema.findField(newCol1));
    // reads should succeed
    List<Object[]> results = sql("select * from %s order by id", dest);
    Assert.assertTrue(results.size() > 0);
    assertEquals("Output must match", results, expected);
}
Also used: CatalogTable (org.apache.spark.sql.catalyst.catalog.CatalogTable), SnapshotTable (org.apache.iceberg.actions.SnapshotTable), MigrateTable (org.apache.iceberg.actions.MigrateTable), Table (org.apache.iceberg.Table), SparkTable (org.apache.iceberg.spark.source.SparkTable), Schema (org.apache.iceberg.Schema), Test (org.junit.Test)
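The interesting call in this test is updateSchema().addColumn(...).moveAfter(...), which appends a new column and then repositions it directly after an anchor column. A small sketch of the moveAfter ordering semantics on a plain list of column names (a hypothetical helper, not the Iceberg API):

```java
import java.util.ArrayList;
import java.util.List;

public class MoveAfterDemo {
    // Hypothetical helper mirroring addColumn(...).moveAfter(anchor):
    // the new column ends up immediately after the anchor column.
    static List<String> addColumnAfter(List<String> columns, String newCol, String anchor) {
        List<String> result = new ArrayList<>(columns);
        int idx = result.indexOf(anchor);
        if (idx < 0) {
            throw new IllegalArgumentException("No such column: " + anchor);
        }
        result.add(idx + 1, newCol);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(addColumnAfter(List.of("id", "data"), "newCol", "id"));
        // prints "[id, newCol, data]"
    }
}
```

That middle position is exactly why the test's expected rows are built as "select id, null, data": existing data files have no values for the new column, so reads fill it with null in the new middle slot.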

Aggregations

SparkTable (org.apache.iceberg.spark.source.SparkTable): 23
Test (org.junit.Test): 12
Identifier (org.apache.spark.sql.connector.catalog.Identifier): 8
File (java.io.File): 7
SparkCatalog (org.apache.iceberg.spark.SparkCatalog): 7
Map (java.util.Map): 6
Table (org.apache.iceberg.Table): 6
StreamSupport (java.util.stream.StreamSupport): 5
DeleteOrphanFiles (org.apache.iceberg.actions.DeleteOrphanFiles): 5
Maps (org.apache.iceberg.relocated.com.google.common.collect.Maps): 5
SparkSchemaUtil (org.apache.iceberg.spark.SparkSchemaUtil): 5
SparkSessionCatalog (org.apache.iceberg.spark.SparkSessionCatalog): 5
Transform (org.apache.spark.sql.connector.expressions.Transform): 5
After (org.junit.After): 5
Assert (org.junit.Assert): 5
Schema (org.apache.iceberg.Schema): 4
MigrateTable (org.apache.iceberg.actions.MigrateTable): 3
SnapshotTable (org.apache.iceberg.actions.SnapshotTable): 3
NoSuchTableException (org.apache.spark.sql.catalyst.analysis.NoSuchTableException): 3
CatalogTable (org.apache.spark.sql.catalyst.catalog.CatalogTable): 3