
Example 1 with HoodieSnapshotExporterException

Use of org.apache.hudi.utilities.exception.HoodieSnapshotExporterException in project hudi by apache.

From the export method of the class HoodieSnapshotExporter.

public void export(JavaSparkContext jsc, Config cfg) throws IOException {
    FileSystem fs = FSUtils.getFs(cfg.sourceBasePath, jsc.hadoopConfiguration());
    HoodieSparkEngineContext engineContext = new HoodieSparkEngineContext(jsc);
    if (outputPathExists(fs, cfg)) {
        throw new HoodieSnapshotExporterException("The target output path already exists.");
    }
    // The supplier returns the exception; orElseThrow throws it when no commit is present.
    final String latestCommitTimestamp = getLatestCommitTimestamp(fs, cfg).<HoodieSnapshotExporterException>orElseThrow(
        () -> new HoodieSnapshotExporterException("No commits present. Nothing to snapshot."));
    LOG.info(String.format("Starting to snapshot latest version files which are also no-late-than %s.", latestCommitTimestamp));
    final List<String> partitions = getPartitions(engineContext, cfg);
    if (partitions.isEmpty()) {
        throw new HoodieSnapshotExporterException("The source dataset has 0 partition to snapshot.");
    }
    LOG.info(String.format("The job needs to export %d partitions.", partitions.size()));
    if (cfg.outputFormat.equals(OutputFormatValidator.HUDI)) {
        exportAsHudi(jsc, cfg, partitions, latestCommitTimestamp);
    } else {
        exportAsNonHudi(jsc, cfg, partitions, latestCommitTimestamp);
    }
    createSuccessTag(fs, cfg);
}
Also used: HoodieSparkEngineContext (org.apache.hudi.client.common.HoodieSparkEngineContext), FileSystem (org.apache.hadoop.fs.FileSystem), HoodieSnapshotExporterException (org.apache.hudi.utilities.exception.HoodieSnapshotExporterException)
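
The three guard checks in export (output path already exists, no commits, no partitions) can be tried without Hudi or Spark on the classpath. The following is a minimal sketch, not the project's code: SnapshotGuardSketch, ExportException, and validate are hypothetical names introduced here, java.util.Optional stands in for the type returned by getLatestCommitTimestamp, and the exception is assumed to be unchecked because export declares only IOException.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class SnapshotGuardSketch {

    // Hypothetical stand-in for HoodieSnapshotExporterException (assumed unchecked).
    static class ExportException extends RuntimeException {
        ExportException(String message) {
            super(message);
        }
    }

    // Applies the same three checks, in the same order, as the export method
    // and returns the commit timestamp when all of them pass.
    static String validate(boolean outputExists, Optional<String> latestCommit, List<String> partitions) {
        if (outputExists) {
            throw new ExportException("The target output path already exists.");
        }
        // The supplier only builds the exception; orElseThrow is what throws it.
        String timestamp = latestCommit.orElseThrow(
            () -> new ExportException("No commits present. Nothing to snapshot."));
        if (partitions.isEmpty()) {
            throw new ExportException("The source dataset has 0 partition to snapshot.");
        }
        return timestamp;
    }

    public static void main(String[] args) {
        // All checks pass: the timestamp is returned unchanged.
        System.out.println(validate(false, Optional.of("20210101000000"), Arrays.asList("2021/01/01")));

        // An empty Optional triggers the "no commits" failure.
        try {
            validate(false, Optional.empty(), Collections.singletonList("2021/01/01"));
        } catch (ExportException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because each check fails fast, the first violated precondition decides the error message: an existing output path is reported even when the source also has no commits.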
