Example 6 with ParallelRunner

Use of org.apache.gobblin.util.ParallelRunner in project incubator-gobblin by apache.

From the class MRJobLauncher, method prepareJobInput.

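/**
 * Prepare the job input by serializing each work unit into its own file
 * under the job input path.
 *
 * @param workUnits the work units to serialize
 * @throws IOException if writing a work unit file or closing the runner fails
 */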
private void prepareJobInput(List<WorkUnit> workUnits) throws IOException {
    Closer closer = Closer.create();
    try {
        ParallelRunner parallelRunner = closer.register(new ParallelRunner(this.parallelRunnerThreads, this.fs));
        int multiTaskIdSequence = 0;
        // Serialize each work unit into a file named after the task ID
        for (WorkUnit workUnit : workUnits) {
            String workUnitFileName;
            if (workUnit instanceof MultiWorkUnit) {
                workUnitFileName = JobLauncherUtils.newMultiTaskId(this.jobContext.getJobId(), multiTaskIdSequence++) + MULTI_WORK_UNIT_FILE_EXTENSION;
            } else {
                workUnitFileName = workUnit.getProp(ConfigurationKeys.TASK_ID_KEY) + WORK_UNIT_FILE_EXTENSION;
            }
            Path workUnitFile = new Path(this.jobInputPath, workUnitFileName);
            LOG.debug("Writing work unit file " + workUnitFileName);
            parallelRunner.serializeToFile(workUnit, workUnitFile);
            // Append the work unit file path to the job input file
        }
    } catch (Throwable t) {
        throw closer.rethrow(t);
    } finally {
        closer.close();
    }
}
Also used: Closer(com.google.common.io.Closer) Path(org.apache.hadoop.fs.Path) MultiWorkUnit(org.apache.gobblin.source.workunit.MultiWorkUnit) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) ParallelRunner(org.apache.gobblin.util.ParallelRunner)
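
The method above queues one file write per work unit and relies on closing the runner to finish them. The following is a minimal sketch of that queue-then-close pattern using only the JDK, assuming (as the Closer registration suggests) that queued writes are only guaranteed complete once the runner is closed. ParallelFileWriter, writeToFile, and the ".wu" file names are hypothetical stand-ins for illustration; this is not Gobblin's ParallelRunner and does not use Hadoop's FileSystem.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a parallel file writer; NOT Gobblin's ParallelRunner.
final class ParallelFileWriter implements Closeable {
    private final ExecutorService executor;
    private final List<Future<?>> pending = new ArrayList<>();

    ParallelFileWriter(int threads) {
        this.executor = Executors.newFixedThreadPool(threads);
    }

    // Queue one write on the thread pool; returns without waiting for it.
    void writeToFile(String content, Path file) {
        pending.add(executor.submit(() -> {
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            return null;
        }));
    }

    // Wait for every queued write and report the first failure as an IOException.
    @Override
    public void close() throws IOException {
        try {
            for (Future<?> write : pending) {
                write.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting for queued writes", e);
        } catch (ExecutionException e) {
            throw new IOException("A queued write failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("work-units");
        // try-with-resources plays the role of Closer here: close() always runs.
        try (ParallelFileWriter writer = new ParallelFileWriter(4)) {
            for (int i = 0; i < 3; i++) {
                writer.writeToFile("unit " + i, dir.resolve("task_" + i + ".wu"));
            }
        }
        // All writes have finished by the time close() has returned.
        System.out.println(Files.exists(dir.resolve("task_2.wu")));
    }
}
```

In this sketch try-with-resources stands in for the Closer try/catch/finally idiom of the original; both ensure that close() runs, and that a failed write surfaces to the caller, even when the loop itself throws.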

Aggregations

ParallelRunner (org.apache.gobblin.util.ParallelRunner) 6
Closer (com.google.common.io.Closer) 3
IOException (java.io.IOException) 3
MultiWorkUnit (org.apache.gobblin.source.workunit.MultiWorkUnit) 3
WorkUnit (org.apache.gobblin.source.workunit.WorkUnit) 3
Path (org.apache.hadoop.fs.Path) 3
WorkUnitState (org.apache.gobblin.configuration.WorkUnitState) 1
JobConfig (org.apache.helix.task.JobConfig) 1
TaskConfig (org.apache.helix.task.TaskConfig) 1