
Example 6 with WrappedReducer

Use of org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer in project tez by apache.

The class MRTask, method createReduceContext.

protected static <INKEY, INVALUE, OUTKEY, OUTVALUE> org.apache.hadoop.mapreduce.Reducer<INKEY, INVALUE, OUTKEY, OUTVALUE>.Context createReduceContext(
        org.apache.hadoop.mapreduce.Reducer<INKEY, INVALUE, OUTKEY, OUTVALUE> reducer,
        Configuration job,
        TaskAttemptID taskId,
        final TezRawKeyValueIterator rIter,
        org.apache.hadoop.mapreduce.Counter inputKeyCounter,
        org.apache.hadoop.mapreduce.Counter inputValueCounter,
        org.apache.hadoop.mapreduce.RecordWriter<OUTKEY, OUTVALUE> output,
        org.apache.hadoop.mapreduce.OutputCommitter committer,
        org.apache.hadoop.mapreduce.StatusReporter reporter,
        RawComparator<INKEY> comparator,
        Class<INKEY> keyClass,
        Class<INVALUE> valueClass) throws IOException, InterruptedException {
    // Adapt the Tez iterator to Hadoop's RawKeyValueIterator by delegating every call.
    RawKeyValueIterator r = new RawKeyValueIterator() {

        @Override
        public boolean next() throws IOException {
            return rIter.next();
        }

        @Override
        public DataInputBuffer getValue() throws IOException {
            return rIter.getValue();
        }

        @Override
        public Progress getProgress() {
            return rIter.getProgress();
        }

        @Override
        public DataInputBuffer getKey() throws IOException {
            return rIter.getKey();
        }

        @Override
        public void close() throws IOException {
            rIter.close();
        }
    };
    org.apache.hadoop.mapreduce.ReduceContext<INKEY, INVALUE, OUTKEY, OUTVALUE> reduceContext = new ReduceContextImpl<INKEY, INVALUE, OUTKEY, OUTVALUE>(job, taskId, r, inputKeyCounter, inputValueCounter, output, committer, reporter, comparator, keyClass, valueClass);
    if (LOG.isDebugEnabled()) {
        LOG.debug("Using key class: " + keyClass + ", valueClass: " + valueClass);
    }
    // Wrap the ReduceContext so it can be handed to the Reducer as its Context.
    org.apache.hadoop.mapreduce.Reducer<INKEY, INVALUE, OUTKEY, OUTVALUE>.Context reducerContext = new WrappedReducer<INKEY, INVALUE, OUTKEY, OUTVALUE>().getReducerContext(reduceContext);
    return reducerContext;
}
Also used: ReduceContextImpl (org.apache.hadoop.mapreduce.task.ReduceContextImpl), TezRawKeyValueIterator (org.apache.tez.runtime.library.common.sort.impl.TezRawKeyValueIterator), RawKeyValueIterator (org.apache.hadoop.mapred.RawKeyValueIterator), WrappedReducer (org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer)
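
The anonymous RawKeyValueIterator in this example is a plain delegating adapter: each method forwards to the Tez iterator so that it can be passed where the Hadoop interface is expected. The following is a minimal standalone sketch of that pattern only. SourceIterator, TargetIterator, ListSource and AdapterDemo are hypothetical stand-ins invented for illustration; they are not the Tez or Hadoop types, and String keys and values replace DataInputBuffer.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for TezRawKeyValueIterator (not the real Tez type).
interface SourceIterator {
    boolean next();
    String getKey();
    String getValue();
    void close();
}

// Hypothetical stand-in for RawKeyValueIterator (not the real Hadoop type).
interface TargetIterator {
    boolean next();
    String getKey();
    String getValue();
    void close();
}

// Simple in-memory source used only to exercise the adapter.
class ListSource implements SourceIterator {
    private final Iterator<String[]> pairs;
    private String[] current;
    boolean closed = false;

    ListSource(List<String[]> pairs) {
        this.pairs = pairs.iterator();
    }

    @Override public boolean next() {
        if (!pairs.hasNext()) {
            return false;
        }
        current = pairs.next();
        return true;
    }
    @Override public String getKey() { return current[0]; }
    @Override public String getValue() { return current[1]; }
    @Override public void close() { closed = true; }
}

class AdapterDemo {
    // Expose a SourceIterator through the TargetIterator interface by delegation.
    static TargetIterator adapt(final SourceIterator src) {
        return new TargetIterator() {
            @Override public boolean next() { return src.next(); }
            @Override public String getKey() { return src.getKey(); }
            @Override public String getValue() { return src.getValue(); }
            @Override public void close() { src.close(); }
        };
    }

    // Two sample key/value pairs for the demo.
    static ListSource sample() {
        return new ListSource(Arrays.asList(
                new String[] {"a", "1"},
                new String[] {"b", "2"}));
    }

    public static void main(String[] args) {
        TargetIterator it = adapt(sample());
        while (it.next()) {
            System.out.println(it.getKey() + "=" + it.getValue());
        }
        it.close();
    }
}
```

The adapter holds no state of its own, so closing it closes the underlying source, which mirrors how close() is forwarded in the example.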

Example 7 with WrappedReducer

Use of org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer in project cdap by caskdata.

The class ReducerWrapper, method createAutoFlushingContext.

private WrappedReducer.Context createAutoFlushingContext(final Context context, final BasicMapReduceTaskContext basicMapReduceContext, final ReduceTaskMetricsWriter metricsWriter) {
    // NOTE: we will change auto-flush to take into account size of buffered data, so no need to do/test a lot with
    // current approach
    final int flushFreq = context.getConfiguration().getInt("c.reducer.flush.freq", 10000);
    final long reportIntervalInMillis = basicMapReduceContext.getMetricsReportIntervalMillis();
    @SuppressWarnings("unchecked")
    WrappedReducer.Context flushingContext = new WrappedReducer().new Context(context) {

        private int processedRecords = 0;

        private long nextTimeToReportMetrics = 0L;

        @Override
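        public boolean nextKeyValue() throws IOException, InterruptedException {
            // Delegates to super.nextKey(), as in the quoted source; every call counts as one processed record.
            boolean result = super.nextKey();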
            if (++processedRecords > flushFreq) {
                try {
                    LOG.trace("Flushing dataset operations...");
                    basicMapReduceContext.flushOperations();
                } catch (Exception e) {
                    LOG.error("Failed to persist changes", e);
                    throw Throwables.propagate(e);
                }
                processedRecords = 0;
            }
            if (System.currentTimeMillis() >= nextTimeToReportMetrics) {
                metricsWriter.reportMetrics();
                nextTimeToReportMetrics = System.currentTimeMillis() + reportIntervalInMillis;
            }
            return result;
        }
    };
    return flushingContext;
}
Also used: RuntimeContext (co.cask.cdap.api.RuntimeContext), WrappedReducer (org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer), IOException (java.io.IOException)
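
The overridden context in this example combines two throttles: a flush once more than flushFreq records have been counted, and a metrics report at most once per reporting interval. The following is a minimal standalone sketch of that bookkeeping without any Hadoop or CDAP dependency. FlushingCounter, its constructor parameters, and the injected clock are hypothetical names introduced for illustration; unlike the example, the sketch reads the clock once per record.

```java
import java.util.function.LongSupplier;

// Hypothetical helper (not part of CDAP or Hadoop) that mirrors the counting logic.
class FlushingCounter {
    private final int flushFreq;
    private final long reportIntervalMillis;
    private final Runnable flush;
    private final Runnable report;
    private final LongSupplier clock;

    private int processedRecords = 0;
    private long nextTimeToReport = 0L;

    FlushingCounter(int flushFreq, long reportIntervalMillis,
                    Runnable flush, Runnable report, LongSupplier clock) {
        this.flushFreq = flushFreq;
        this.reportIntervalMillis = reportIntervalMillis;
        this.flush = flush;
        this.report = report;
        this.clock = clock;
    }

    // Call once per processed record.
    void onRecord() {
        // Same comparison as the example: flush when the count exceeds flushFreq.
        if (++processedRecords > flushFreq) {
            flush.run();
            processedRecords = 0;
        }
        // Report when the clock has reached the next scheduled report time.
        long now = clock.getAsLong();
        if (now >= nextTimeToReport) {
            report.run();
            nextTimeToReport = now + reportIntervalMillis;
        }
    }
}

class FlushingCounterDemo {
    public static void main(String[] args) {
        final int[] flushes = {0};
        final int[] reports = {0};
        final long[] now = {0L}; // fake clock, frozen at time 0

        FlushingCounter counter = new FlushingCounter(
                3, 100L, () -> flushes[0]++, () -> reports[0]++, () -> now[0]);

        for (int i = 0; i < 8; i++) {
            counter.onRecord();
        }
        // prints "2 flushes, 1 reports"
        System.out.println(flushes[0] + " flushes, " + reports[0] + " reports");
    }
}
```

Because the comparison is `>` rather than `>=`, the flush happens on record flushFreq + 1, so with flushFreq = 3 the flushes land on records 4 and 8. Injecting the clock keeps the time-based branch testable without sleeping.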

Aggregations

WrappedReducer (org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer): 7
RawKeyValueIterator (org.apache.hadoop.mapred.RawKeyValueIterator): 4
ReduceContextImpl (org.apache.hadoop.mapreduce.task.ReduceContextImpl): 4
RuntimeContext (co.cask.cdap.api.RuntimeContext): 2
IOException (java.io.IOException): 2
CustomOutputCommitter (org.apache.hadoop.CustomOutputCommitter): 2
Configuration (org.apache.hadoop.conf.Configuration): 2
NullWritable (org.apache.hadoop.io.NullWritable): 2
SleepReducer (org.apache.hadoop.mapred.gridmix.SleepJob.SleepReducer): 2
Counter (org.apache.hadoop.mapreduce.Counter): 2
OutputCommitter (org.apache.hadoop.mapreduce.OutputCommitter): 2
StatusReporter (org.apache.hadoop.mapreduce.StatusReporter): 2
TaskAttemptID (org.apache.hadoop.mapreduce.TaskAttemptID): 2
GenericCounter (org.apache.hadoop.mapreduce.counters.GenericCounter): 2
DummyReporter (org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.DummyReporter): 2
TezRawKeyValueIterator (org.apache.tez.runtime.library.common.sort.impl.TezRawKeyValueIterator): 2
Test (org.junit.Test): 2
JobContextImpl (org.apache.hadoop.mapred.JobContextImpl): 1
Reducer (org.apache.hadoop.mapred.Reducer): 1
OutputFormat (org.apache.hadoop.mapreduce.OutputFormat): 1