Example 41 with TaskAttemptContext

Use of org.apache.hadoop.mapreduce.TaskAttemptContext in project mongo-hadoop by mongodb: class GridFSInputFormatTest, method testReadWholeFileNoDelimiter.

@Test
public void testReadWholeFileNoDelimiter() throws IOException, InterruptedException {
    Configuration conf = getConfiguration();
    MongoConfigUtil.setGridFSWholeFileSplit(conf, true);
    JobContext jobContext = mockJobContext(conf);
    List<InputSplit> splits = inputFormat.getSplits(jobContext);
    // Empty delimiter == no delimiter.
    MongoConfigUtil.setGridFSDelimiterPattern(conf, "");
    TaskAttemptContext context = mockTaskAttemptContext(conf);
    assertEquals(1, splits.size());
    String fileText = null;
    for (InputSplit split : splits) {
        GridFSInputFormat.GridFSTextRecordReader reader = new GridFSInputFormat.GridFSTextRecordReader();
        reader.initialize(split, context);
        int i;
        for (i = 0; reader.nextKeyValue(); ++i) {
            fileText = reader.getCurrentValue().toString();
        }
        assertEquals(1, i);
    }
    assertEquals(fileContents.toString(), fileText);
}
Also used : Configuration(org.apache.hadoop.conf.Configuration) TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) JobContext(org.apache.hadoop.mapreduce.JobContext) InputSplit(org.apache.hadoop.mapreduce.InputSplit) Test(org.junit.Test) BaseHadoopTest(com.mongodb.hadoop.testutils.BaseHadoopTest)
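The mockJobContext and mockTaskAttemptContext helpers are assumed by these tests; a minimal Mockito sketch of what they might look like (hypothetical implementations, not the project's actual test utilities):

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Hypothetical stubs: the record reader only needs the context to hand back its Configuration.
static JobContext mockJobContext(Configuration conf) {
    JobContext jobContext = mock(JobContext.class);
    when(jobContext.getConfiguration()).thenReturn(conf);
    return jobContext;
}

static TaskAttemptContext mockTaskAttemptContext(Configuration conf) {
    TaskAttemptContext context = mock(TaskAttemptContext.class);
    when(context.getConfiguration()).thenReturn(conf);
    return context;
}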

Example 42 with TaskAttemptContext

Use of org.apache.hadoop.mapreduce.TaskAttemptContext in project mongo-hadoop by mongodb: class GridFSInputFormatTest, method testRecordReaderNoDelimiter.

@Test
public void testRecordReaderNoDelimiter() throws IOException, InterruptedException {
    List<InputSplit> splits = getSplits();
    Configuration conf = getConfiguration();
    // Empty delimiter == no delimiter.
    MongoConfigUtil.setGridFSDelimiterPattern(conf, "");
    TaskAttemptContext context = mockTaskAttemptContext(conf);
    StringBuilder fileText = new StringBuilder();
    for (InputSplit split : splits) {
        GridFSInputFormat.GridFSTextRecordReader reader = new GridFSInputFormat.GridFSTextRecordReader();
        reader.initialize(split, context);
        while (reader.nextKeyValue()) {
            fileText.append(reader.getCurrentValue().toString());
        }
    }
    assertEquals(fileContents.toString(), fileText.toString());
}
Also used : Configuration(org.apache.hadoop.conf.Configuration) TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) InputSplit(org.apache.hadoop.mapreduce.InputSplit) Test(org.junit.Test) BaseHadoopTest(com.mongodb.hadoop.testutils.BaseHadoopTest)
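The getSplits() helper is not shown here; judging from Example 41, it presumably wraps the InputFormat call along these lines (an assumed sketch, not the project's code):

// Assumed helper, mirroring the split setup in Example 41.
private List<InputSplit> getSplits() throws IOException, InterruptedException {
    Configuration conf = getConfiguration();
    return inputFormat.getSplits(mockJobContext(conf));
}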

Example 43 with TaskAttemptContext

Use of org.apache.hadoop.mapreduce.TaskAttemptContext in project cassandra by apache: class CqlInputFormat, method getSplits.

// Old Hadoop API
public InputSplit[] getSplits(JobConf jobConf, int numSplits) throws IOException {
    TaskAttemptContext tac = HadoopCompat.newTaskAttemptContext(jobConf, new TaskAttemptID());
    List<org.apache.hadoop.mapreduce.InputSplit> newInputSplits = this.getSplits(tac);
    InputSplit[] oldInputSplits = new InputSplit[newInputSplits.size()];
    for (int i = 0; i < newInputSplits.size(); i++) {
        oldInputSplits[i] = (ColumnFamilySplit) newInputSplits.get(i);
    }
    return oldInputSplits;
}
Also used : TaskAttemptID(org.apache.hadoop.mapreduce.TaskAttemptID) TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) org.apache.cassandra.hadoop(org.apache.cassandra.hadoop) InputSplit(org.apache.hadoop.mapred.InputSplit)
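On Hadoop 2.x alone the HadoopCompat shim would be unnecessary, because the concrete context class can be constructed directly; a minimal sketch, assuming Hadoop 2.x is on the classpath:

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.TaskAttemptID;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;

// JobConf extends Configuration, so it can seed the new-API context directly.
TaskAttemptContext tac = new TaskAttemptContextImpl(jobConf, new TaskAttemptID());

Example 45 below shows the reflection-based version of the same version bridge in Flink.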

Example 44 with TaskAttemptContext

Use of org.apache.hadoop.mapreduce.TaskAttemptContext in project flink by apache: class HadoopOutputFormatBase, method finalizeGlobal.

@Override
public void finalizeGlobal(int parallelism) throws IOException {
    JobContext jobContext;
    TaskAttemptContext taskContext;
    try {
        // Builds the literal ID "attempt__0000_r_000001_0": the format/replace dance zero-pads the task number to six digits.
        TaskAttemptID taskAttemptID = TaskAttemptID.forName("attempt__0000_r_" + String.format("%" + (6 - Integer.toString(1).length()) + "s", " ").replace(" ", "0") + Integer.toString(1) + "_0");
        jobContext = HadoopUtils.instantiateJobContext(this.configuration, new JobID());
        taskContext = HadoopUtils.instantiateTaskAttemptContext(this.configuration, taskAttemptID);
        this.outputCommitter = this.mapreduceOutputFormat.getOutputCommitter(taskContext);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
    jobContext.getCredentials().addAll(this.credentials);
    Credentials currentUserCreds = getCredentialsFromUGI(UserGroupInformation.getCurrentUser());
    if (currentUserCreds != null) {
        jobContext.getCredentials().addAll(currentUserCreds);
    }
    // finalize HDFS output format
    if (this.outputCommitter != null) {
        this.outputCommitter.commitJob(jobContext);
    }
}
Also used : TaskAttemptID(org.apache.hadoop.mapreduce.TaskAttemptID) TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) JobContext(org.apache.hadoop.mapreduce.JobContext) JobID(org.apache.hadoop.mapreduce.JobID) IOException(java.io.IOException) Credentials(org.apache.hadoop.security.Credentials)
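The padded-string expression in the try block assembles the literal ID attempt__0000_r_000001_0. An equivalent, more readable construction (a sketch, not the project's code):

// Same ID, built with a zero-padding format specifier instead of string surgery.
TaskAttemptID taskAttemptID = TaskAttemptID.forName(String.format("attempt__0000_r_%06d_0", 1));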

Example 45 with TaskAttemptContext

Use of org.apache.hadoop.mapreduce.TaskAttemptContext in project flink by apache: class HadoopUtils, method instantiateTaskAttemptContext.

public static TaskAttemptContext instantiateTaskAttemptContext(Configuration configuration, TaskAttemptID taskAttemptID) throws Exception {
    try {
        Class<?> clazz;
        if (JobContext.class.isInterface()) {
            // Hadoop 2.x: JobContext is an interface, so instantiate the concrete TaskAttemptContextImpl.
            clazz = Class.forName("org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl");
        } else {
            // Hadoop 1.x: TaskAttemptContext is itself a concrete class with a public constructor.
            clazz = Class.forName("org.apache.hadoop.mapreduce.TaskAttemptContext");
        }
        Constructor<?> constructor = clazz.getConstructor(Configuration.class, TaskAttemptID.class);
        return (TaskAttemptContext) constructor.newInstance(configuration, taskAttemptID);
    } catch (Exception e) {
        // Chain the original failure as the cause instead of discarding it.
        throw new Exception("Could not create instance of TaskAttemptContext.", e);
    }
}
Also used : TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext)
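A real call site for this helper appears in Example 44's finalizeGlobal; in isolation, usage looks roughly like this (a sketch):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.TaskAttemptID;

// Works on both Hadoop major versions thanks to the reflective class lookup above.
TaskAttemptContext ctx = HadoopUtils.instantiateTaskAttemptContext(new Configuration(), new TaskAttemptID());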

Aggregations

TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext): 110
Configuration (org.apache.hadoop.conf.Configuration): 58
Job (org.apache.hadoop.mapreduce.Job): 44
Path (org.apache.hadoop.fs.Path): 39
TaskAttemptContextImpl (org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl): 38
InputSplit (org.apache.hadoop.mapreduce.InputSplit): 36
Test (org.junit.Test): 35
TaskAttemptID (org.apache.hadoop.mapreduce.TaskAttemptID): 33
JobContext (org.apache.hadoop.mapreduce.JobContext): 28
IOException (java.io.IOException): 27
File (java.io.File): 22
LongWritable (org.apache.hadoop.io.LongWritable): 22
JobContextImpl (org.apache.hadoop.mapreduce.task.JobContextImpl): 21
RecordWriter (org.apache.hadoop.mapreduce.RecordWriter): 19
MapContextImpl (org.apache.hadoop.mapreduce.task.MapContextImpl): 17
FileSystem (org.apache.hadoop.fs.FileSystem): 16
OutputCommitter (org.apache.hadoop.mapreduce.OutputCommitter): 12
ArrayList (java.util.ArrayList): 11
BytesWritable (org.apache.hadoop.io.BytesWritable): 10
MapFile (org.apache.hadoop.io.MapFile): 10