Search in sources :

Example 66 with FileSplit

use of org.apache.hadoop.mapred.FileSplit in project ignite by apache.

the class HadoopV1Splitter method splitJob.

/**
 * @param jobConf Job configuration.
 * @return Collection of mapped splits.
 * @throws IgniteCheckedException If mapping failed.
 */
public static Collection<HadoopInputSplit> splitJob(JobConf jobConf) throws IgniteCheckedException {
    try {
        InputFormat<?, ?> format = jobConf.getInputFormat();
        assert format != null;
        InputSplit[] splits = format.getSplits(jobConf, 0);
        Collection<HadoopInputSplit> res = new ArrayList<>(splits.length);
        for (int i = 0; i < splits.length; i++) {
            InputSplit nativeSplit = splits[i];
            if (nativeSplit instanceof FileSplit) {
                FileSplit s = (FileSplit) nativeSplit;
                res.add(new HadoopFileBlock(s.getLocations(), s.getPath().toUri(), s.getStart(), s.getLength()));
            } else
                res.add(HadoopUtils.wrapSplit(i, nativeSplit, nativeSplit.getLocations()));
        }
        return res;
    } catch (IOException e) {
        throw new IgniteCheckedException(e);
    }
}
Also used : IgniteCheckedException(org.apache.ignite.IgniteCheckedException) ArrayList(java.util.ArrayList) HadoopInputSplit(org.apache.ignite.hadoop.HadoopInputSplit) IOException(java.io.IOException) FileSplit(org.apache.hadoop.mapred.FileSplit) HadoopFileBlock(org.apache.ignite.internal.processors.hadoop.HadoopFileBlock) HadoopInputSplit(org.apache.ignite.hadoop.HadoopInputSplit) InputSplit(org.apache.hadoop.mapred.InputSplit)

Aggregations

FileSplit (org.apache.hadoop.mapred.FileSplit)66 Path (org.apache.hadoop.fs.Path)38 InputSplit (org.apache.hadoop.mapred.InputSplit)23 JobConf (org.apache.hadoop.mapred.JobConf)16 File (java.io.File)10 IOException (java.io.IOException)10 Configuration (org.apache.hadoop.conf.Configuration)10 FileStatus (org.apache.hadoop.fs.FileStatus)10 FileSystem (org.apache.hadoop.fs.FileSystem)10 Test (org.junit.Test)9 RecordReader (org.apache.hadoop.mapred.RecordReader)8 ArrayList (java.util.ArrayList)7 Properties (java.util.Properties)7 StructField (org.apache.hadoop.hive.serde2.objectinspector.StructField)7 ObjectInspector (org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector)5 NullWritable (org.apache.hadoop.io.NullWritable)5 InputFormat (org.apache.hadoop.mapred.InputFormat)4 NodeControllerInfo (org.apache.hyracks.api.client.NodeControllerInfo)4 ClusterTopology (org.apache.hyracks.api.topology.ClusterTopology)4 VertexLocationHint (org.apache.tez.dag.api.VertexLocationHint)4