Example 1 with FullRecordBootstrapDataProvider

Use of org.apache.hudi.client.bootstrap.FullRecordBootstrapDataProvider in the Apache Hudi project.

Class SparkBootstrapCommitActionExecutor, method fullBootstrap.

/**
 * Perform Full Bootstrap.
 * @param partitionFilesList List of partitions and the files within those partitions
 * @return the bulk insert write metadata, or an empty Option if there is nothing to bootstrap
 */
protected Option<HoodieWriteMetadata<HoodieData<WriteStatus>>> fullBootstrap(List<Pair<String, List<HoodieFileStatus>>> partitionFilesList) {
    if (null == partitionFilesList || partitionFilesList.isEmpty()) {
        return Option.empty();
    }
    // Copy the write config properties so the provider gets its own instance.
    TypedProperties properties = new TypedProperties();
    properties.putAll(config.getProps());
    // Load the configured input provider reflectively and read the source files as Hoodie records.
    FullRecordBootstrapDataProvider inputProvider = (FullRecordBootstrapDataProvider) ReflectionUtils.loadClass(config.getFullBootstrapInputProvider(), properties, context);
    JavaRDD<HoodieRecord> inputRecordsRDD = (JavaRDD<HoodieRecord>) inputProvider.generateInputRecords("bootstrap_source", config.getBootstrapSourceBasePath(), partitionFilesList);
    // Start Full Bootstrap
    final HoodieInstant requested = new HoodieInstant(State.REQUESTED, table.getMetaClient().getCommitActionType(), HoodieTimeline.FULL_BOOTSTRAP_INSTANT_TS);
    table.getActiveTimeline().createNewInstant(requested);
    // Setup correct schema and run bulk insert.
    return Option.of(getBulkInsertActionExecutor(HoodieJavaRDD.of(inputRecordsRDD)).execute());
}
Also used: HoodieInstant (org.apache.hudi.common.table.timeline.HoodieInstant), HoodieRecord (org.apache.hudi.common.model.HoodieRecord), FullRecordBootstrapDataProvider (org.apache.hudi.client.bootstrap.FullRecordBootstrapDataProvider), TypedProperties (org.apache.hudi.common.config.TypedProperties), HoodieJavaRDD (org.apache.hudi.data.HoodieJavaRDD), JavaRDD (org.apache.spark.api.java.JavaRDD)
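
The control flow of fullBootstrap (return early on empty input, copy the properties, load a provider class by name, ask it for input records) can be sketched without Hudi or Spark on the classpath. Everything in the sketch below is a hypothetical stand-in written for illustration: BootstrapSketch, InputProvider, OneRecordPerFileProvider, and loadProvider are not Hudi classes or methods, plain strings replace HoodieRecord, and a List replaces the RDD. The timeline instant and the bulk insert step are left out.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

public class BootstrapSketch {

    /** Hypothetical stand-in for a full-record input provider; not the Hudi class. */
    public abstract static class InputProvider {
        protected final Properties props;

        protected InputProvider(Properties props) {
            this.props = props;
        }

        /** Turns a (partition -> files) listing into the records to be written. */
        public abstract List<String> generateInputRecords(
                String tableName, String sourceBasePath, Map<String, List<String>> partitionFiles);
    }

    /** Hypothetical provider that emits one record (here, a path string) per source file. */
    public static class OneRecordPerFileProvider extends InputProvider {
        public OneRecordPerFileProvider(Properties props) {
            super(props);
        }

        @Override
        public List<String> generateInputRecords(
                String tableName, String sourceBasePath, Map<String, List<String>> partitionFiles) {
            List<String> records = new ArrayList<>();
            partitionFiles.forEach((partition, files) ->
                    files.forEach(file -> records.add(sourceBasePath + "/" + partition + "/" + file)));
            return records;
        }
    }

    /** Instantiates the named provider through its public (Properties) constructor. */
    static InputProvider loadProvider(String className, Properties props) {
        try {
            Constructor<?> ctor = Class.forName(className).getConstructor(Properties.class);
            return (InputProvider) ctor.newInstance(props);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load provider " + className, e);
        }
    }

    /** Mirrors the shape of fullBootstrap: empty input yields an empty result. */
    public static Optional<List<String>> fullBootstrap(
            String providerClass, Properties config, String sourceBasePath,
            Map<String, List<String>> partitionFiles) {
        if (partitionFiles == null || partitionFiles.isEmpty()) {
            return Optional.empty();
        }
        // Copy the config so the provider cannot mutate the caller's properties.
        Properties props = new Properties();
        props.putAll(config);
        InputProvider provider = loadProvider(providerClass, props);
        return Optional.of(provider.generateInputRecords("bootstrap_source", sourceBasePath, partitionFiles));
    }

    public static void main(String[] args) {
        String providerClass = OneRecordPerFileProvider.class.getName();
        Map<String, List<String>> listing = Map.of("2024/01", List.of("a.parquet", "b.parquet"));
        System.out.println(fullBootstrap(providerClass, new Properties(), "/data/src", listing));
        System.out.println(fullBootstrap(providerClass, new Properties(), "/data/src", Map.of()));
    }
}
```

Loading the provider by class name keeps the executor independent of any particular source format, which appears to be the reason the original method resolves the provider from configuration rather than constructing it directly.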
