Examples with CustomOutputCommitter - org.apache.hadoop.CustomOutputCommitter

Example 1 with CustomOutputCommitter

use of org.apache.hadoop.CustomOutputCommitter in project hadoop by apache.

the class TestGridMixClasses method testSleepReducer.

/*
   * test SleepReducer
   */
@Test(timeout = 3000)
public void testSleepReducer() throws Exception {
    Configuration conf = new Configuration();
    conf.setInt(JobContext.NUM_REDUCES, 2);
    CompressionEmulationUtil.setCompressionEmulationEnabled(conf, true);
    conf.setBoolean(FileOutputFormat.COMPRESS, true);
    CompressionEmulationUtil.setCompressionEmulationEnabled(conf, true);
    conf.setBoolean(MRJobConfig.MAP_OUTPUT_COMPRESS, true);
    TaskAttemptID taskId = new TaskAttemptID();
    RawKeyValueIterator input = new FakeRawKeyValueReducerIterator();
    Counter counter = new GenericCounter();
    Counter inputValueCounter = new GenericCounter();
    RecordWriter<NullWritable, NullWritable> output = new LoadRecordReduceWriter();
    OutputCommitter committer = new CustomOutputCommitter();
    StatusReporter reporter = new DummyReporter();
    RawComparator<GridmixKey> comparator = new FakeRawComparator();
    ReduceContext<GridmixKey, NullWritable, NullWritable, NullWritable> reducecontext = new ReduceContextImpl<GridmixKey, NullWritable, NullWritable, NullWritable>(conf, taskId, input, counter, inputValueCounter, output, committer, reporter, comparator, GridmixKey.class, NullWritable.class);
    org.apache.hadoop.mapreduce.Reducer<GridmixKey, NullWritable, NullWritable, NullWritable>.Context<GridmixKey, NullWritable, NullWritable, NullWritable> context = new WrappedReducer<GridmixKey, NullWritable, NullWritable, NullWritable>().getReducerContext(reducecontext);
    SleepReducer test = new SleepReducer();
    long start = System.currentTimeMillis();
    test.setup(context);
    long sleeper = context.getCurrentKey().getReduceOutputBytes();
    // status has been changed
    assertEquals("Sleeping... " + sleeper + " ms left", context.getStatus());
    // should sleep 0.9 sec
    assertTrue(System.currentTimeMillis() >= (start + sleeper));
    test.cleanup(context);
    // status has been changed again
    assertEquals("Slept for " + sleeper, context.getStatus());
}

Also used : Configuration(org.apache.hadoop.conf.Configuration) ReduceContextImpl(org.apache.hadoop.mapreduce.task.ReduceContextImpl) TaskAttemptID(org.apache.hadoop.mapreduce.TaskAttemptID) GenericCounter(org.apache.hadoop.mapreduce.counters.GenericCounter) Counter(org.apache.hadoop.mapreduce.Counter) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) OutputCommitter(org.apache.hadoop.mapreduce.OutputCommitter) GenericCounter(org.apache.hadoop.mapreduce.counters.GenericCounter) DummyReporter(org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.DummyReporter) NullWritable(org.apache.hadoop.io.NullWritable) RawKeyValueIterator(org.apache.hadoop.mapred.RawKeyValueIterator) SleepReducer(org.apache.hadoop.mapred.gridmix.SleepJob.SleepReducer) WrappedReducer(org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer) SleepReducer(org.apache.hadoop.mapred.gridmix.SleepJob.SleepReducer) StatusReporter(org.apache.hadoop.mapreduce.StatusReporter) Test(org.junit.Test)

Example 2 with CustomOutputCommitter

use of org.apache.hadoop.CustomOutputCommitter in project hadoop by apache.

the class TestGridMixClasses method testLoadMapper.

/*
   * test LoadMapper loadMapper should write to writer record for each reduce
   */
@SuppressWarnings({ "rawtypes", "unchecked" })
@Test(timeout = 10000)
public void testLoadMapper() throws Exception {
    Configuration conf = new Configuration();
    conf.setInt(JobContext.NUM_REDUCES, 2);
    CompressionEmulationUtil.setCompressionEmulationEnabled(conf, true);
    conf.setBoolean(MRJobConfig.MAP_OUTPUT_COMPRESS, true);
    TaskAttemptID taskId = new TaskAttemptID();
    RecordReader<NullWritable, GridmixRecord> reader = new FakeRecordReader();
    LoadRecordGkGrWriter writer = new LoadRecordGkGrWriter();
    OutputCommitter committer = new CustomOutputCommitter();
    StatusReporter reporter = new TaskAttemptContextImpl.DummyReporter();
    LoadSplit split = getLoadSplit();
    MapContext<NullWritable, GridmixRecord, GridmixKey, GridmixRecord> mapContext = new MapContextImpl<NullWritable, GridmixRecord, GridmixKey, GridmixRecord>(conf, taskId, reader, writer, committer, reporter, split);
    // context
    Context ctx = new WrappedMapper<NullWritable, GridmixRecord, GridmixKey, GridmixRecord>().getMapContext(mapContext);
    reader.initialize(split, ctx);
    ctx.getConfiguration().setBoolean(MRJobConfig.MAP_OUTPUT_COMPRESS, true);
    CompressionEmulationUtil.setCompressionEmulationEnabled(ctx.getConfiguration(), true);
    LoadJob.LoadMapper mapper = new LoadJob.LoadMapper();
    // setup, map, clean
    mapper.run(ctx);
    Map<GridmixKey, GridmixRecord> data = writer.getData();
    // check result
    assertEquals(2, data.size());
}

Also used : Context(org.apache.hadoop.mapreduce.Mapper.Context) ReduceContext(org.apache.hadoop.mapreduce.ReduceContext) MapContext(org.apache.hadoop.mapreduce.MapContext) TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) JobContext(org.apache.hadoop.mapred.JobContext) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) OutputCommitter(org.apache.hadoop.mapreduce.OutputCommitter) Configuration(org.apache.hadoop.conf.Configuration) MapContextImpl(org.apache.hadoop.mapreduce.task.MapContextImpl) TaskAttemptID(org.apache.hadoop.mapreduce.TaskAttemptID) DummyReporter(org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.DummyReporter) NullWritable(org.apache.hadoop.io.NullWritable) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) StatusReporter(org.apache.hadoop.mapreduce.StatusReporter) Test(org.junit.Test)

Example 3 with CustomOutputCommitter

use of org.apache.hadoop.CustomOutputCommitter in project hadoop by apache.

the class TestGridMixClasses method testLoadJobLoadReducer.

/*
   * test LoadReducer
   */
@Test(timeout = 3000)
public void testLoadJobLoadReducer() throws Exception {
    LoadJob.LoadReducer test = new LoadJob.LoadReducer();
    Configuration conf = new Configuration();
    conf.setInt(JobContext.NUM_REDUCES, 2);
    CompressionEmulationUtil.setCompressionEmulationEnabled(conf, true);
    conf.setBoolean(FileOutputFormat.COMPRESS, true);
    CompressionEmulationUtil.setCompressionEmulationEnabled(conf, true);
    conf.setBoolean(MRJobConfig.MAP_OUTPUT_COMPRESS, true);
    TaskAttemptID taskid = new TaskAttemptID();
    RawKeyValueIterator input = new FakeRawKeyValueIterator();
    Counter counter = new GenericCounter();
    Counter inputValueCounter = new GenericCounter();
    LoadRecordWriter output = new LoadRecordWriter();
    OutputCommitter committer = new CustomOutputCommitter();
    StatusReporter reporter = new DummyReporter();
    RawComparator<GridmixKey> comparator = new FakeRawComparator();
    ReduceContext<GridmixKey, GridmixRecord, NullWritable, GridmixRecord> reduceContext = new ReduceContextImpl<GridmixKey, GridmixRecord, NullWritable, GridmixRecord>(conf, taskid, input, counter, inputValueCounter, output, committer, reporter, comparator, GridmixKey.class, GridmixRecord.class);
    // read for previous data
    reduceContext.nextKeyValue();
    org.apache.hadoop.mapreduce.Reducer<GridmixKey, GridmixRecord, NullWritable, GridmixRecord>.Context<GridmixKey, GridmixRecord, NullWritable, GridmixRecord> context = new WrappedReducer<GridmixKey, GridmixRecord, NullWritable, GridmixRecord>().getReducerContext(reduceContext);
    // test.setup(context);
    test.run(context);
    // have been readed 9 records (-1 for previous)
    assertEquals(9, counter.getValue());
    assertEquals(10, inputValueCounter.getValue());
    assertEquals(1, output.getData().size());
    GridmixRecord record = output.getData().values().iterator().next();
    assertEquals(1593, record.getSize());
}

Also used : Configuration(org.apache.hadoop.conf.Configuration) ReduceContextImpl(org.apache.hadoop.mapreduce.task.ReduceContextImpl) TaskAttemptID(org.apache.hadoop.mapreduce.TaskAttemptID) GenericCounter(org.apache.hadoop.mapreduce.counters.GenericCounter) Counter(org.apache.hadoop.mapreduce.Counter) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) OutputCommitter(org.apache.hadoop.mapreduce.OutputCommitter) GenericCounter(org.apache.hadoop.mapreduce.counters.GenericCounter) DummyReporter(org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.DummyReporter) NullWritable(org.apache.hadoop.io.NullWritable) RawKeyValueIterator(org.apache.hadoop.mapred.RawKeyValueIterator) WrappedReducer(org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer) SleepReducer(org.apache.hadoop.mapred.gridmix.SleepJob.SleepReducer) StatusReporter(org.apache.hadoop.mapreduce.StatusReporter) Test(org.junit.Test)

Example 4 with CustomOutputCommitter

use of org.apache.hadoop.CustomOutputCommitter in project hadoop by apache.

the class TestGridMixClasses method testSleepMapper.

/*
   * test SleepMapper
   */
@SuppressWarnings({ "unchecked", "rawtypes" })
@Test(timeout = 30000)
public void testSleepMapper() throws Exception {
    SleepJob.SleepMapper test = new SleepJob.SleepMapper();
    Configuration conf = new Configuration();
    conf.setInt(JobContext.NUM_REDUCES, 2);
    CompressionEmulationUtil.setCompressionEmulationEnabled(conf, true);
    conf.setBoolean(MRJobConfig.MAP_OUTPUT_COMPRESS, true);
    TaskAttemptID taskId = new TaskAttemptID();
    FakeRecordLLReader reader = new FakeRecordLLReader();
    LoadRecordGkNullWriter writer = new LoadRecordGkNullWriter();
    OutputCommitter committer = new CustomOutputCommitter();
    StatusReporter reporter = new TaskAttemptContextImpl.DummyReporter();
    SleepSplit split = getSleepSplit();
    MapContext<LongWritable, LongWritable, GridmixKey, NullWritable> mapcontext = new MapContextImpl<LongWritable, LongWritable, GridmixKey, NullWritable>(conf, taskId, reader, writer, committer, reporter, split);
    Context context = new WrappedMapper<LongWritable, LongWritable, GridmixKey, NullWritable>().getMapContext(mapcontext);
    long start = System.currentTimeMillis();
    LOG.info("start:" + start);
    LongWritable key = new LongWritable(start + 2000);
    LongWritable value = new LongWritable(start + 2000);
    // should slip 2 sec
    test.map(key, value, context);
    LOG.info("finish:" + System.currentTimeMillis());
    assertTrue(System.currentTimeMillis() >= (start + 2000));
    test.cleanup(context);
    assertEquals(1, writer.getData().size());
}

Also used : Context(org.apache.hadoop.mapreduce.Mapper.Context) ReduceContext(org.apache.hadoop.mapreduce.ReduceContext) MapContext(org.apache.hadoop.mapreduce.MapContext) TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) JobContext(org.apache.hadoop.mapred.JobContext) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) OutputCommitter(org.apache.hadoop.mapreduce.OutputCommitter) Configuration(org.apache.hadoop.conf.Configuration) MapContextImpl(org.apache.hadoop.mapreduce.task.MapContextImpl) TaskAttemptID(org.apache.hadoop.mapreduce.TaskAttemptID) DummyReporter(org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.DummyReporter) NullWritable(org.apache.hadoop.io.NullWritable) CustomOutputCommitter(org.apache.hadoop.CustomOutputCommitter) LongWritable(org.apache.hadoop.io.LongWritable) SleepSplit(org.apache.hadoop.mapred.gridmix.SleepJob.SleepSplit) StatusReporter(org.apache.hadoop.mapreduce.StatusReporter) Test(org.junit.Test)

Aggregations

CustomOutputCommitter (org.apache.hadoop.CustomOutputCommitter)4 Configuration (org.apache.hadoop.conf.Configuration)4 NullWritable (org.apache.hadoop.io.NullWritable)4 OutputCommitter (org.apache.hadoop.mapreduce.OutputCommitter)4 StatusReporter (org.apache.hadoop.mapreduce.StatusReporter)4 TaskAttemptID (org.apache.hadoop.mapreduce.TaskAttemptID)4 DummyReporter (org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.DummyReporter)4 Test (org.junit.Test)4 JobContext (org.apache.hadoop.mapred.JobContext)2 RawKeyValueIterator (org.apache.hadoop.mapred.RawKeyValueIterator)2 SleepReducer (org.apache.hadoop.mapred.gridmix.SleepJob.SleepReducer)2 Counter (org.apache.hadoop.mapreduce.Counter)2 MapContext (org.apache.hadoop.mapreduce.MapContext)2 Context (org.apache.hadoop.mapreduce.Mapper.Context)2 ReduceContext (org.apache.hadoop.mapreduce.ReduceContext)2 TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext)2 GenericCounter (org.apache.hadoop.mapreduce.counters.GenericCounter)2 WrappedReducer (org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer)2 MapContextImpl (org.apache.hadoop.mapreduce.task.MapContextImpl)2 ReduceContextImpl (org.apache.hadoop.mapreduce.task.ReduceContextImpl)2