Example 1 with HadoopExternalSplit

Usage of org.apache.ignite.internal.processors.hadoop.HadoopExternalSplit in project ignite by apache.

Source: the input() method of the HadoopV2Job class.

/** {@inheritDoc} */
@Override
public Collection<HadoopInputSplit> input() {
    ClassLoader oldLdr = HadoopCommonUtils.setContextClassLoader(jobConf.getClassLoader());
    try {
        String jobDirPath = jobConf.get(MRJobConfig.MAPREDUCE_JOB_DIR);
        if (jobDirPath == null) {
            // Assume that we have the needed classes locally and try to generate input splits ourselves.
            if (jobConf.getUseNewMapper())
                return HadoopV2Splitter.splitJob(jobCtx);
            else
                return HadoopV1Splitter.splitJob(jobConf);
        }
        Path jobDir = new Path(jobDirPath);
        try {
            FileSystem fs = fileSystem(jobDir.toUri(), jobConf);
            JobSplit.TaskSplitMetaInfo[] metaInfos = SplitMetaInfoReader.readSplitMetaInfo(hadoopJobID, fs, jobConf, jobDir);
            if (F.isEmpty(metaInfos))
                throw new IgniteCheckedException("No input splits found.");
            Path splitsFile = JobSubmissionFiles.getJobSplitFile(jobDir);
            try (FSDataInputStream in = fs.open(splitsFile)) {
                Collection<HadoopInputSplit> res = new ArrayList<>(metaInfos.length);
                for (JobSplit.TaskSplitMetaInfo metaInfo : metaInfos) {
                    long off = metaInfo.getStartOffset();
                    String[] hosts = metaInfo.getLocations();
                    in.seek(off);
                    String clsName = Text.readString(in); // Class name of the serialized split.
                    // Try to deserialize the split as a file block; if neither splitter
                    // recognizes the class, keep only (hosts, offset) as an external split
                    // to be re-read later where the class is available.
                    HadoopFileBlock block = HadoopV1Splitter.readFileBlock(clsName, in, hosts);
                    if (block == null)
                        block = HadoopV2Splitter.readFileBlock(clsName, in, hosts);
                    res.add(block != null ? block : new HadoopExternalSplit(hosts, off));
                }
                return res;
            }
        } catch (Throwable e) {
            if (e instanceof Error)
                throw (Error) e;
            else
                throw transformException(e);
        }
    } catch (IgniteCheckedException e) {
        throw new IgniteException(e);
    } finally {
        HadoopCommonUtils.restoreContextClassLoader(oldLdr);
    }
}
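The fallback in the loop above — seek to each recorded offset, read the split's class name, and, when the split type cannot be materialized locally, retain only the hosts and file offset as a HadoopExternalSplit — can be sketched in self-contained form with plain java.io. Note that ExternalSplit, KNOWN, and readSplits below are hypothetical stand-ins for illustration, not Ignite or Hadoop APIs, and readUTF/skipBytes stand in for Text.readString and FSDataInputStream.seek:

```java
import java.io.*;
import java.util.*;

public class SplitFileSketch {
    // Hypothetical record mirroring what HadoopExternalSplit retains: hosts + file offset.
    static final class ExternalSplit {
        final String[] hosts;
        final long off;
        ExternalSplit(String[] hosts, long off) { this.hosts = hosts; this.off = off; }
    }

    // Split class names we can deserialize locally (stand-in for the
    // HadoopV1Splitter/HadoopV2Splitter readFileBlock checks).
    static final Set<String> KNOWN = new HashSet<>(Arrays.asList(
        "org.apache.hadoop.mapred.FileSplit",
        "org.apache.hadoop.mapreduce.lib.input.FileSplit"));

    // Mimics the loop in input(): for each meta-info offset, read the split's
    // class name; if it is unknown, keep only (hosts, offset) as an external split.
    static List<Object> readSplits(byte[] splitsFile, long[] offsets, String[][] hosts)
        throws IOException {
        List<Object> res = new ArrayList<>(offsets.length);
        for (int i = 0; i < offsets.length; i++) {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(splitsFile));
            in.skipBytes((int) offsets[i]);  // stands in for in.seek(off)
            String clsName = in.readUTF();   // stands in for Text.readString(in)
            if (KNOWN.contains(clsName))
                res.add(clsName);            // a locally deserializable split
            else
                res.add(new ExternalSplit(hosts[i], offsets[i])); // defer deserialization
        }
        return res;
    }

    public static void main(String[] args) throws IOException {
        // Build a fake splits file: one known split class, one unknown one.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        long off0 = buf.size();
        out.writeUTF("org.apache.hadoop.mapred.FileSplit");
        long off1 = buf.size();
        out.writeUTF("com.example.CustomSplit"); // hypothetical unknown split class

        List<Object> res = readSplits(buf.toByteArray(),
            new long[] {off0, off1}, new String[][] {{"h1"}, {"h2"}});

        System.out.println(res.get(0));                           // the known class name
        System.out.println(res.get(1) instanceof ExternalSplit);  // true
    }
}
```

The real method does the same triage, but deserializes known splits into HadoopFileBlock instances and lets an external process re-read the splits file at the saved offset for everything else.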
Also used:
Path (org.apache.hadoop.fs.Path)
ArrayList (java.util.ArrayList)
HadoopInputSplit (org.apache.ignite.hadoop.HadoopInputSplit)
HadoopFileBlock (org.apache.ignite.internal.processors.hadoop.HadoopFileBlock)
IgniteCheckedException (org.apache.ignite.IgniteCheckedException)
JobSplit (org.apache.hadoop.mapreduce.split.JobSplit)
IgniteException (org.apache.ignite.IgniteException)
FileSystem (org.apache.hadoop.fs.FileSystem)
HadoopClassLoader (org.apache.ignite.internal.processors.hadoop.HadoopClassLoader)
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream)
HadoopExternalSplit (org.apache.ignite.internal.processors.hadoop.HadoopExternalSplit)

Aggregations

ArrayList (java.util.ArrayList): 1
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 1
FileSystem (org.apache.hadoop.fs.FileSystem): 1
Path (org.apache.hadoop.fs.Path): 1
JobSplit (org.apache.hadoop.mapreduce.split.JobSplit): 1
IgniteCheckedException (org.apache.ignite.IgniteCheckedException): 1
IgniteException (org.apache.ignite.IgniteException): 1
HadoopInputSplit (org.apache.ignite.hadoop.HadoopInputSplit): 1
HadoopClassLoader (org.apache.ignite.internal.processors.hadoop.HadoopClassLoader): 1
HadoopExternalSplit (org.apache.ignite.internal.processors.hadoop.HadoopExternalSplit): 1
HadoopFileBlock (org.apache.ignite.internal.processors.hadoop.HadoopFileBlock): 1