
Example 1 with HoodieCompactionException

Usage of org.apache.hudi.exception.HoodieCompactionException in the Apache Hudi project.

From the execute method of the class RunCompactionActionExecutor:

@Override
public HoodieWriteMetadata<HoodieData<WriteStatus>> execute() {
    HoodieTimeline pendingCompactionTimeline = table.getActiveTimeline().filterPendingCompactionTimeline();
    compactor.preCompact(table, pendingCompactionTimeline, instantTime);
    HoodieWriteMetadata<HoodieData<WriteStatus>> compactionMetadata = new HoodieWriteMetadata<>();
    try {
        // generate compaction plan
        // should support configurable commit metadata
        HoodieCompactionPlan compactionPlan = CompactionUtils.getCompactionPlan(table.getMetaClient(), instantTime);
        HoodieData<WriteStatus> statuses = compactor.compact(context, compactionPlan, table, config, instantTime, compactionHandler);
        compactor.maybePersist(statuses, config);
        context.setJobStatus(this.getClass().getSimpleName(), "Preparing compaction metadata");
        // collect the per-file write statistics and assemble them into commit metadata
        List<HoodieWriteStat> updateStatusMap = statuses.map(WriteStatus::getStat).collectAsList();
        HoodieCommitMetadata metadata = new HoodieCommitMetadata(true);
        for (HoodieWriteStat stat : updateStatusMap) {
            metadata.addWriteStat(stat.getPartitionPath(), stat);
        }
        metadata.addMetadata(HoodieCommitMetadata.SCHEMA_KEY, config.getSchema());
        // the metadata is returned uncommitted; the caller completes the compaction commit
        compactionMetadata.setWriteStatuses(statuses);
        compactionMetadata.setCommitted(false);
        compactionMetadata.setCommitMetadata(Option.of(metadata));
    } catch (IOException e) {
        throw new HoodieCompactionException("Could not compact " + config.getBasePath(), e);
    }
    return compactionMetadata;
}
Also used: HoodieData (org.apache.hudi.common.data.HoodieData), HoodieCommitMetadata (org.apache.hudi.common.model.HoodieCommitMetadata), HoodieCompactionException (org.apache.hudi.exception.HoodieCompactionException), HoodieWriteStat (org.apache.hudi.common.model.HoodieWriteStat), HoodieCompactionPlan (org.apache.hudi.avro.model.HoodieCompactionPlan), HoodieTimeline (org.apache.hudi.common.table.timeline.HoodieTimeline), HoodieWriteMetadata (org.apache.hudi.table.action.HoodieWriteMetadata), IOException (java.io.IOException), WriteStatus (org.apache.hudi.client.WriteStatus)
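
A usage-side sketch (not taken from the Hudi sources) of how a driver might invoke this path through SparkRDDWriteClient, as Example 3 below does, and handle the resulting HoodieCompactionException. The class CompactionCaller, the method runCompaction, and its parameters are hypothetical names; only client.compact(...) and getCommitMetadata() appear in the examples on this page, and the exception is assumed to propagate unchecked.

import org.apache.hudi.client.SparkRDDWriteClient;
import org.apache.hudi.client.WriteStatus;
import org.apache.hudi.common.model.HoodieRecordPayload;
import org.apache.hudi.exception.HoodieCompactionException;
import org.apache.hudi.table.action.HoodieWriteMetadata;
import org.apache.spark.api.java.JavaRDD;

public class CompactionCaller {

    // Runs an already-scheduled compaction instant and reports whether commit metadata was produced.
    // The write client and the compaction instant time are assumed to be prepared by the caller.
    static boolean runCompaction(SparkRDDWriteClient<HoodieRecordPayload> client, String compactionInstantTime) {
        try {
            // execute() above wraps IOExceptions from plan loading into HoodieCompactionException,
            // which propagates out of client.compact(...) as an unchecked exception
            HoodieWriteMetadata<JavaRDD<WriteStatus>> result = client.compact(compactionInstantTime);
            return result.getCommitMetadata().isPresent();
        } catch (HoodieCompactionException e) {
            // getCause() carries the original IOException; log it and decide whether to retry
            System.err.println("Compaction failed for instant " + compactionInstantTime + ": " + e.getMessage());
            return false;
        }
    }
}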

Example 2 with HoodieCompactionException

Usage of org.apache.hudi.exception.HoodieCompactionException in the Apache Hudi project.

From the scheduleCompaction method of the class ScheduleCompactionActionExecutor:

private HoodieCompactionPlan scheduleCompaction() {
    LOG.info("Checking if compaction needs to be run on " + config.getBasePath());
    // decide whether we need to compact, based on the number of delta commits and the time elapsed
    boolean compactable = needCompact(config.getInlineCompactTriggerStrategy());
    if (compactable) {
        LOG.info("Generating compaction plan for merge on read table " + config.getBasePath());
        try {
            SyncableFileSystemView fileSystemView = (SyncableFileSystemView) table.getSliceView();
            Set<HoodieFileGroupId> fgInPendingCompactionAndClustering = fileSystemView.getPendingCompactionOperations().map(instantTimeOpPair -> instantTimeOpPair.getValue().getFileGroupId()).collect(Collectors.toSet());
            // exclude files in pending clustering from compaction.
            fgInPendingCompactionAndClustering.addAll(fileSystemView.getFileGroupsInPendingClustering().map(Pair::getLeft).collect(Collectors.toSet()));
            context.setJobStatus(this.getClass().getSimpleName(), "Compaction: generating compaction plan");
            return compactor.generateCompactionPlan(context, table, config, instantTime, fgInPendingCompactionAndClustering);
        } catch (IOException e) {
            throw new HoodieCompactionException("Could not schedule compaction " + config.getBasePath(), e);
        }
    }
    return new HoodieCompactionPlan();
}
Also used: HoodieTable (org.apache.hudi.table.HoodieTable), BaseActionExecutor (org.apache.hudi.table.action.BaseActionExecutor), HoodieInstant (org.apache.hudi.common.table.timeline.HoodieInstant), Option (org.apache.hudi.common.util.Option), HoodieEngineContext (org.apache.hudi.common.engine.HoodieEngineContext), Logger (org.apache.log4j.Logger), Map (java.util.Map), ParseException (java.text.ParseException), HoodieFileGroupId (org.apache.hudi.common.model.HoodieFileGroupId), HoodieActiveTimeline (org.apache.hudi.common.table.timeline.HoodieActiveTimeline), HoodieTimeline (org.apache.hudi.common.table.timeline.HoodieTimeline), SyncableFileSystemView (org.apache.hudi.common.table.view.SyncableFileSystemView), ValidationUtils (org.apache.hudi.common.util.ValidationUtils), HoodieWriteConfig (org.apache.hudi.config.HoodieWriteConfig), Set (java.util.Set), TimelineMetadataUtils (org.apache.hudi.common.table.timeline.TimelineMetadataUtils), IOException (java.io.IOException), Collectors (java.util.stream.Collectors), HoodieRecordPayload (org.apache.hudi.common.model.HoodieRecordPayload), HoodieCompactionException (org.apache.hudi.exception.HoodieCompactionException), List (java.util.List), HoodieCompactionPlan (org.apache.hudi.avro.model.HoodieCompactionPlan), HoodieIOException (org.apache.hudi.exception.HoodieIOException), LogManager (org.apache.log4j.LogManager), EngineType (org.apache.hudi.common.engine.EngineType), CompactionUtils (org.apache.hudi.common.util.CompactionUtils), Pair (org.apache.hudi.common.util.collection.Pair)
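
One detail worth calling out: when nothing is compactable, scheduleCompaction() returns new HoodieCompactionPlan() rather than null, so callers have to check whether the plan actually contains operations. Below is a minimal sketch of such a check, assuming the Avro-generated getOperations() accessor on HoodieCompactionPlan; the class PlanChecks and the method hasWork are hypothetical helpers, not Hudi API.

import java.util.List;

import org.apache.hudi.avro.model.HoodieCompactionOperation;
import org.apache.hudi.avro.model.HoodieCompactionPlan;

public class PlanChecks {

    // Returns true only if the scheduled plan has at least one file group to compact.
    // An empty HoodieCompactionPlan (the "nothing to do" return above) carries a null or empty operations list.
    static boolean hasWork(HoodieCompactionPlan plan) {
        List<HoodieCompactionOperation> operations = plan == null ? null : plan.getOperations();
        return operations != null && !operations.isEmpty();
    }
}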

Example 3 with HoodieCompactionException

Usage of org.apache.hudi.exception.HoodieCompactionException in the Apache Hudi project.

From the doCompact method of the class HoodieCompactor:

private int doCompact(JavaSparkContext jsc) throws Exception {
    // Get schema.
    String schemaStr;
    if (StringUtils.isNullOrEmpty(cfg.schemaFile)) {
        schemaStr = getSchemaFromLatestInstant();
    } else {
        schemaStr = UtilHelpers.parseSchema(fs, cfg.schemaFile);
    }
    LOG.info("Schema --> : " + schemaStr);
    try (SparkRDDWriteClient<HoodieRecordPayload> client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
        // if no compaction instant time was provided, pick the earliest scheduled (REQUESTED) compaction instant from the active timeline
        if (StringUtils.isNullOrEmpty(cfg.compactionInstantTime)) {
            HoodieTableMetaClient metaClient = UtilHelpers.createMetaClient(jsc, cfg.basePath, true);
            Option<HoodieInstant> firstCompactionInstant = metaClient.getActiveTimeline().firstInstant(HoodieTimeline.COMPACTION_ACTION, HoodieInstant.State.REQUESTED);
            if (firstCompactionInstant.isPresent()) {
                cfg.compactionInstantTime = firstCompactionInstant.get().getTimestamp();
                LOG.info("Found the earliest scheduled compaction instant which will be executed: " + cfg.compactionInstantTime);
            } else {
                throw new HoodieCompactionException("There is no scheduled compaction in the table.");
            }
        }
        HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = client.compact(cfg.compactionInstantTime);
        return UtilHelpers.handleErrors(compactionMetadata.getCommitMetadata().get(), cfg.compactionInstantTime);
    }
}
Also used: HoodieTableMetaClient (org.apache.hudi.common.table.HoodieTableMetaClient), HoodieInstant (org.apache.hudi.common.table.timeline.HoodieInstant), HoodieCompactionException (org.apache.hudi.exception.HoodieCompactionException), HoodieRecordPayload (org.apache.hudi.common.model.HoodieRecordPayload), JavaRDD (org.apache.spark.api.java.JavaRDD)
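
The "There is no scheduled compaction in the table" failure can be avoided by probing the timeline up front with the same lookup doCompact uses. A small sketch of such a pre-check follows; CompactionPreCheck and earliestScheduledCompaction are hypothetical names, and UtilHelpers is assumed to live in org.apache.hudi.utilities as in the tool above.

import org.apache.hudi.common.table.HoodieTableMetaClient;
import org.apache.hudi.common.table.timeline.HoodieInstant;
import org.apache.hudi.common.table.timeline.HoodieTimeline;
import org.apache.hudi.common.util.Option;
import org.apache.hudi.utilities.UtilHelpers;
import org.apache.spark.api.java.JavaSparkContext;

public class CompactionPreCheck {

    // Mirrors the lookup in doCompact(): returns the timestamp of the earliest REQUESTED compaction
    // instant, or Option.empty() when nothing is scheduled, so the caller can skip the run instead of
    // hitting the HoodieCompactionException thrown above.
    static Option<String> earliestScheduledCompaction(JavaSparkContext jsc, String basePath) {
        HoodieTableMetaClient metaClient = UtilHelpers.createMetaClient(jsc, basePath, true);
        Option<HoodieInstant> instant = metaClient.getActiveTimeline()
                .firstInstant(HoodieTimeline.COMPACTION_ACTION, HoodieInstant.State.REQUESTED);
        return instant.isPresent() ? Option.of(instant.get().getTimestamp()) : Option.empty();
    }
}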

Aggregations

HoodieCompactionException (org.apache.hudi.exception.HoodieCompactionException): 3 usages
IOException (java.io.IOException): 2 usages
HoodieCompactionPlan (org.apache.hudi.avro.model.HoodieCompactionPlan): 2 usages
HoodieRecordPayload (org.apache.hudi.common.model.HoodieRecordPayload): 2 usages
HoodieInstant (org.apache.hudi.common.table.timeline.HoodieInstant): 2 usages
HoodieTimeline (org.apache.hudi.common.table.timeline.HoodieTimeline): 2 usages
ParseException (java.text.ParseException): 1 usage
List (java.util.List): 1 usage
Map (java.util.Map): 1 usage
Set (java.util.Set): 1 usage
Collectors (java.util.stream.Collectors): 1 usage
WriteStatus (org.apache.hudi.client.WriteStatus): 1 usage
HoodieData (org.apache.hudi.common.data.HoodieData): 1 usage
EngineType (org.apache.hudi.common.engine.EngineType): 1 usage
HoodieEngineContext (org.apache.hudi.common.engine.HoodieEngineContext): 1 usage
HoodieCommitMetadata (org.apache.hudi.common.model.HoodieCommitMetadata): 1 usage
HoodieFileGroupId (org.apache.hudi.common.model.HoodieFileGroupId): 1 usage
HoodieWriteStat (org.apache.hudi.common.model.HoodieWriteStat): 1 usage
HoodieTableMetaClient (org.apache.hudi.common.table.HoodieTableMetaClient): 1 usage
HoodieActiveTimeline (org.apache.hudi.common.table.timeline.HoodieActiveTimeline): 1 usage