Example 1 with SparkReadConf

Use of org.apache.iceberg.spark.SparkReadConf in the apache/iceberg project.

From the class SparkScanBuilder, method buildMergeOnReadScan:

public Scan buildMergeOnReadScan() {
    // row-level commands must read the table's current state, so explicit
    // time-travel and incremental-scan options are rejected up front
    Preconditions.checkArgument(
        readConf.snapshotId() == null && readConf.asOfTimestamp() == null,
        "Cannot set time travel options %s and %s for row-level command scans",
        SparkReadOptions.SNAPSHOT_ID, SparkReadOptions.AS_OF_TIMESTAMP);
    Preconditions.checkArgument(
        readConf.startSnapshotId() == null && readConf.endSnapshotId() == null,
        "Cannot set incremental scan options %s and %s for row-level command scans",
        SparkReadOptions.START_SNAPSHOT_ID, SparkReadOptions.END_SNAPSHOT_ID);

    Snapshot snapshot = table.currentSnapshot();
    if (snapshot == null) {
        // empty table: no snapshot to pin, so build the scan without one
        return new SparkBatchQueryScan(
            spark, table, null, readConf, schemaWithMetadataColumns(), filterExpressions);
    }

    // remember the current snapshot ID for commit validation
    long snapshotId = snapshot.snapshotId();
    CaseInsensitiveStringMap adjustedOptions =
        Spark3Util.setOption(SparkReadOptions.SNAPSHOT_ID, Long.toString(snapshotId), options);
    SparkReadConf adjustedReadConf = new SparkReadConf(spark, table, adjustedOptions);

    Schema expectedSchema = schemaWithMetadataColumns();
    TableScan scan = table.newScan()
        .useSnapshot(snapshotId)
        .caseSensitive(caseSensitive)
        .filter(filterExpression())
        .project(expectedSchema);
    scan = configureSplitPlanning(scan);

    return new SparkBatchQueryScan(
        spark, table, scan, adjustedReadConf, expectedSchema, filterExpressions);
}
Also used: SparkReadConf (org.apache.iceberg.spark), Snapshot (org.apache.iceberg), TableScan (org.apache.iceberg), Schema (org.apache.iceberg), CaseInsensitiveStringMap (org.apache.spark.sql.util)
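The core idea of buildMergeOnReadScan is snapshot pinning: reject any explicit time-travel options, then write the table's current snapshot ID into an adjusted option map so that later commit validation can check against the exact state the scan read. A minimal standalone sketch of that pattern, with no Iceberg dependency (the class name SnapshotPinning and method pinSnapshot are hypothetical, not Iceberg APIs):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical illustration of the snapshot-pinning pattern: fail if the
// caller already set a time-travel option, otherwise pin the table's
// current snapshot ID into a copy of the read options.
public class SnapshotPinning {

    static final String SNAPSHOT_ID = "snapshot-id";

    static Map<String, String> pinSnapshot(Map<String, String> options,
                                           OptionalLong currentSnapshotId) {
        if (options.containsKey(SNAPSHOT_ID)) {
            throw new IllegalArgumentException(
                "Cannot set time travel option " + SNAPSHOT_ID
                + " for row-level command scans");
        }
        if (currentSnapshotId.isEmpty()) {
            // empty table: nothing to pin, keep the options unchanged
            return options;
        }
        Map<String, String> adjusted = new HashMap<>(options);
        adjusted.put(SNAPSHOT_ID, Long.toString(currentSnapshotId.getAsLong()));
        return adjusted;
    }

    public static void main(String[] args) {
        Map<String, String> adjusted =
            pinSnapshot(new HashMap<>(), OptionalLong.of(42L));
        System.out.println(adjusted.get(SNAPSHOT_ID)); // prints 42
    }
}
```

Copying the option map rather than mutating it mirrors how the real method builds a fresh CaseInsensitiveStringMap and SparkReadConf instead of altering the caller's configuration.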
