
Example 1 with BaseSnapshotTableActionResult

Use of org.apache.iceberg.actions.BaseSnapshotTableActionResult in project iceberg by apache.

Class BaseSnapshotTableSparkAction, method doExecute.

private SnapshotTable.Result doExecute() {
    Preconditions.checkArgument(
        destCatalog() != null && destTableIdent() != null,
        "The destination catalog and identifier cannot be null. Make sure to configure the action with a valid destination table identifier via the `as` method.");
    LOG.info("Staging a new Iceberg table {} as a snapshot of {}", destTableIdent(), sourceTableIdent());
    StagedSparkTable stagedTable = stageDestTable();
    Table icebergTable = stagedTable.table();
    // TODO: Check the dest table location does not overlap with the source table location
    boolean threw = true;
    try {
        LOG.info("Ensuring {} has a valid name mapping", destTableIdent());
        ensureNameMappingPresent(icebergTable);
        TableIdentifier v1TableIdent = v1SourceTable().identifier();
        String stagingLocation = getMetadataLocation(icebergTable);
        LOG.info("Generating Iceberg metadata for {} in {}", destTableIdent(), stagingLocation);
        SparkTableUtil.importSparkTable(spark(), v1TableIdent, icebergTable, stagingLocation);
        LOG.info("Committing staged changes to {}", destTableIdent());
        stagedTable.commitStagedChanges();
        threw = false;
    } finally {
        if (threw) {
            LOG.error("Error when populating the staged table with metadata, aborting changes");
            try {
                stagedTable.abortStagedChanges();
            } catch (Exception abortException) {
                LOG.error("Cannot abort staged changes", abortException);
            }
        }
    }
    Snapshot snapshot = icebergTable.currentSnapshot();
    long importedDataFilesCount = Long.parseLong(snapshot.summary().get(SnapshotSummary.TOTAL_DATA_FILES_PROP));
    LOG.info("Successfully loaded Iceberg metadata for {} files to {}", importedDataFilesCount, destTableIdent());
    return new BaseSnapshotTableActionResult(importedDataFilesCount);
}
Also used: TableIdentifier (org.apache.spark.sql.catalyst.TableIdentifier), Snapshot (org.apache.iceberg.Snapshot), Table (org.apache.iceberg.Table), StagedSparkTable (org.apache.iceberg.spark.source.StagedSparkTable), SnapshotTable (org.apache.iceberg.actions.SnapshotTable), BaseSnapshotTableActionResult (org.apache.iceberg.actions.BaseSnapshotTableActionResult)
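
The try/finally block in doExecute uses a `threw` flag so that the staged table is aborted only when populating or committing it fails, while the original exception still reaches the caller. The following is a minimal standalone sketch of that pattern, not Iceberg code: `StagedChanges`, `RecordingStage`, and `populateAndCommit` are hypothetical names introduced for illustration, and only the method names `commitStagedChanges` and `abortStagedChanges` are borrowed from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

public class StagedCommitSketch {

    /** Hypothetical stand-in for a staged table; not an Iceberg or Spark interface. */
    interface StagedChanges {
        void commitStagedChanges();
        void abortStagedChanges();
    }

    /** Records which calls were made so the control flow can be inspected. */
    static class RecordingStage implements StagedChanges {
        final List<String> calls = new ArrayList<>();

        @Override
        public void commitStagedChanges() {
            calls.add("commit");
        }

        @Override
        public void abortStagedChanges() {
            calls.add("abort");
        }
    }

    /**
     * Runs the work and then commits. If either step throws, the stage is
     * aborted in the finally block and the original exception propagates.
     */
    static void populateAndCommit(StagedChanges stage, Runnable work) {
        boolean threw = true;
        try {
            work.run();
            stage.commitStagedChanges();
            threw = false; // reached only when nothing above threw
        } finally {
            if (threw) {
                try {
                    stage.abortStagedChanges();
                } catch (Exception abortException) {
                    // Log and swallow so the original failure is the one the caller sees.
                    System.err.println("Cannot abort staged changes: " + abortException);
                }
            }
        }
    }

    public static void main(String[] args) {
        RecordingStage ok = new RecordingStage();
        populateAndCommit(ok, () -> { });
        System.out.println(ok.calls); // prints [commit]

        RecordingStage failing = new RecordingStage();
        try {
            populateAndCommit(failing, () -> {
                throw new IllegalStateException("import failed");
            });
        } catch (IllegalStateException e) {
            System.out.println(failing.calls + " after " + e.getMessage()); // prints [abort] after import failed
        }
    }
}
```

Using a flag with `finally` rather than `catch (Exception e)` means nothing has to be caught and rethrown, so errors and unchecked exceptions pass through unchanged.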

Aggregations

Snapshot (org.apache.iceberg.Snapshot): 1
Table (org.apache.iceberg.Table): 1
BaseSnapshotTableActionResult (org.apache.iceberg.actions.BaseSnapshotTableActionResult): 1
SnapshotTable (org.apache.iceberg.actions.SnapshotTable): 1
StagedSparkTable (org.apache.iceberg.spark.source.StagedSparkTable): 1
TableIdentifier (org.apache.spark.sql.catalyst.TableIdentifier): 1
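
The last step of doExecute reads the imported file count from the snapshot summary, which is a map of string keys to string values, and passes it to `Long.parseLong`. `Long.parseLong(null)` throws `NumberFormatException`, so a more defensive variant could look like the sketch below. It works on a plain `Map<String, String>` rather than a real `Snapshot`; `totalDataFiles` is a hypothetical helper, and the key literal is the value of `SnapshotSummary.TOTAL_DATA_FILES_PROP`. Whether a missing key or missing snapshot can actually occur in this code path is not something the example above establishes.

```java
import java.util.Map;

public class SummaryCountSketch {

    // Same string as SnapshotSummary.TOTAL_DATA_FILES_PROP in Iceberg.
    static final String TOTAL_DATA_FILES_PROP = "total-data-files";

    /**
     * Hypothetical helper: returns the data file count from a summary map,
     * or 0 when the map or the entry is absent.
     */
    static long totalDataFiles(Map<String, String> summary) {
        if (summary == null) {
            return 0L;
        }
        String value = summary.get(TOTAL_DATA_FILES_PROP);
        return value == null ? 0L : Long.parseLong(value);
    }

    public static void main(String[] args) {
        System.out.println(totalDataFiles(Map.of(TOTAL_DATA_FILES_PROP, "42"))); // prints 42
        System.out.println(totalDataFiles(Map.of()));                           // prints 0
        System.out.println(totalDataFiles(null));                               // prints 0
    }
}
```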