Example 1 with RetentionAction

Use of org.apache.gobblin.data.management.retention.action.RetentionAction in project incubator-gobblin by apache.

In class MultiVersionCleanableDatasetBase, method clean.

/**
 * Method to perform the Retention operations for this dataset.
 *
 *<ul>
 * <li>{@link MultiVersionCleanableDatasetBase#getVersionFindersAndPolicies()} gets a list of {@link VersionFinderAndPolicy}s
 * <li>Each {@link VersionFinderAndPolicy} contains a {@link VersionFinder} and a {@link VersionSelectionPolicy}. It can
 * optionally have a {@link RetentionAction}
 * <li>The {@link MultiVersionCleanableDatasetBase#clean()} method finds all the {@link FileSystemDatasetVersion}s using
 * {@link VersionFinderAndPolicy#versionFinder}
 * <li>It gets the deletable {@link FileSystemDatasetVersion}s by applying {@link VersionFinderAndPolicy#versionSelectionPolicy}.
 * These deletable versions are deleted, and then any empty parent directories are deleted.
 * <li>If additional retention actions are available at {@link VersionFinderAndPolicy#getRetentionActions()}, all versions
 * found by the {@link VersionFinderAndPolicy#versionFinder} are passed to {@link RetentionAction#execute(List)} for
 * each {@link RetentionAction}
 * </ul>
 */
@Override
public void clean() throws IOException {
    if (this.isDatasetBlacklisted) {
        this.log.info("Dataset blacklisted. Cleanup skipped for " + datasetRoot());
        return;
    }
    boolean atLeastOneFailureSeen = false;
    for (VersionFinderAndPolicy<T> versionFinderAndPolicy : getVersionFindersAndPolicies()) {
        VersionSelectionPolicy<T> selectionPolicy = versionFinderAndPolicy.getVersionSelectionPolicy();
        VersionFinder<? extends T> versionFinder = versionFinderAndPolicy.getVersionFinder();
        if (!selectionPolicy.versionClass().isAssignableFrom(versionFinder.versionClass())) {
            throw new IOException("Incompatible dataset version classes.");
        }
        this.log.info(String.format("Cleaning dataset %s. Using version finder %s and policy %s", this, versionFinder.getClass().getName(), selectionPolicy));
        List<T> versions = Lists.newArrayList(versionFinder.findDatasetVersions(this));
        if (versions.isEmpty()) {
            this.log.warn("No dataset version can be found. Ignoring.");
            continue;
        }
        Collections.sort(versions, Collections.reverseOrder());
        Collection<T> deletableVersions = selectionPolicy.listSelectedVersions(versions);
        cleanImpl(deletableVersions);
        List<DatasetVersion> allVersions = Lists.newArrayList();
        for (T ver : versions) {
            allVersions.add(ver);
        }
        for (RetentionAction retentionAction : versionFinderAndPolicy.getRetentionActions()) {
            try {
                retentionAction.execute(allVersions);
            } catch (Throwable t) {
                atLeastOneFailureSeen = true;
                log.error(String.format("RetentionAction %s failed for dataset %s", retentionAction.getClass().getName(), this.datasetRoot()), t);
            }
        }
    }
    if (atLeastOneFailureSeen) {
        throw new RuntimeException(String.format("At least one failure happened while processing %s. Look for previous logs for failures", datasetRoot()));
    }
}
Also used : DatasetVersion(org.apache.gobblin.data.management.version.DatasetVersion) FileSystemDatasetVersion(org.apache.gobblin.data.management.version.FileSystemDatasetVersion) IOException(java.io.IOException) RetentionAction(org.apache.gobblin.data.management.retention.action.RetentionAction)
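
The control flow of clean() can be tried out without Gobblin on the classpath. The following is a minimal sketch, not the Gobblin implementation: RetentionFlowSketch, Action, and selectDeletable are hypothetical stand-ins (for the dataset class, RetentionAction, and a VersionSelectionPolicy that keeps the newest N versions), and versions are plain strings. It only illustrates the order of operations described in the Javadoc: sort newest first, select the deletable versions, hand all versions to every action, and report failures only after every action has had a chance to run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RetentionFlowSketch {

    /** Hypothetical stand-in for RetentionAction: receives every version that was found. */
    interface Action {
        void execute(List<String> allVersions) throws Exception;
    }

    /** Hypothetical policy: everything older than the newest `keep` versions is deletable. */
    static List<String> selectDeletable(List<String> newestFirst, int keep) {
        if (newestFirst.size() <= keep) {
            return new ArrayList<>();
        }
        return new ArrayList<>(newestFirst.subList(keep, newestFirst.size()));
    }

    /**
     * Mirrors the shape of clean(): returns the versions selected for deletion
     * (deletion itself is not performed here), then runs each action on ALL versions.
     */
    static List<String> clean(List<String> versions, int keep, List<Action> actions) {
        List<String> sorted = new ArrayList<>(versions);
        Collections.sort(sorted, Collections.reverseOrder()); // newest first
        List<String> deletable = selectDeletable(sorted, keep);

        boolean atLeastOneFailureSeen = false;
        for (Action action : actions) {
            try {
                action.execute(sorted);
            } catch (Exception e) {
                // Remember the failure but keep running the remaining actions.
                atLeastOneFailureSeen = true;
            }
        }
        if (atLeastOneFailureSeen) {
            throw new RuntimeException("At least one action failed");
        }
        return deletable;
    }

    public static void main(String[] args) {
        List<String> versions = Arrays.asList("2024-01-01", "2024-01-03", "2024-01-02");
        List<String> seen = new ArrayList<>();
        Action record = all -> seen.addAll(all);

        List<String> deletable = clean(versions, 2, Arrays.asList(record));
        System.out.println(deletable); // prints [2024-01-01]
        System.out.println(seen);      // prints [2024-01-03, 2024-01-02, 2024-01-01]
    }
}
```

Note that the action receives every version, including the ones selected for deletion, which matches the allVersions list passed to RetentionAction#execute(List) in the method above.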