
Example 6 with Reducer

Use of org.apache.hadoop.mapred.Reducer in project tez by apache.

The class MRCombiner, method runNewCombiner.

// /////////////// End of methods for old API //////////////////////
// /////////////// Methods for new API //////////////////////
private void runNewCombiner(final TezRawKeyValueIterator rawIter, final Writer writer) throws InterruptedException, IOException {
    RecordWriter recordWriter = new RecordWriter() {

        @Override
        public void write(Object key, Object value) throws IOException, InterruptedException {
            writer.append(key, value);
            combineOutputRecordsCounter.increment(1);
        }

        @Override
        public void close(TaskAttemptContext context) throws IOException, InterruptedException {
            // Will be closed by whoever invokes the combiner.
        }
    };
    Class<? extends org.apache.hadoop.mapreduce.Reducer> reducerClazz = (Class<? extends org.apache.hadoop.mapreduce.Reducer>) conf.getClass(MRJobConfig.COMBINE_CLASS_ATTR, null, org.apache.hadoop.mapreduce.Reducer.class);
    org.apache.hadoop.mapreduce.Reducer reducer = ReflectionUtils.newInstance(reducerClazz, conf);
    org.apache.hadoop.mapreduce.Reducer.Context reducerContext = createReduceContext(conf, mrTaskAttemptID, rawIter, new MRCounters.MRCounter(combineInputRecordsCounter), new MRCounters.MRCounter(combineOutputRecordsCounter), recordWriter, reporter, (RawComparator) comparator, keyClass, valClass);
    reducer.run(reducerContext);
    recordWriter.close(reducerContext);
}
Also used : TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) MRCounters(org.apache.tez.mapreduce.hadoop.mapred.MRCounters) RecordWriter(org.apache.hadoop.mapreduce.RecordWriter) Reducer(org.apache.hadoop.mapred.Reducer) WrappedReducer(org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer)
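
The combiner above is not tied to a particular class: it instantiates whatever new-API Reducer the job registered under MRJobConfig.COMBINE_CLASS_ATTR. A minimal sketch of the client-side setup that populates that attribute is shown below; the SumCombiner class is illustrative and is not part of the Tez code above.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

public class CombinerSetup {

    // A summing combiner: input and output types match, as a combiner requires.
    public static class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws IOException {
        Job job = Job.getInstance(new Configuration(), "combiner-setup");
        // Stores SumCombiner under MRJobConfig.COMBINE_CLASS_ATTR
        // ("mapreduce.job.combine.class"), which runNewCombiner reads back.
        job.setCombinerClass(SumCombiner.class);
    }
}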

Example 7 with Reducer

Use of org.apache.hadoop.mapred.Reducer in project flink by apache.

The class HadoopReduceFunction, method readObject.

@SuppressWarnings("unchecked")
private void readObject(final ObjectInputStream in) throws IOException, ClassNotFoundException {
    Class<Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>> reducerClass = (Class<Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>>) in.readObject();
    reducer = InstantiationUtil.instantiate(reducerClass);
    jobConf = new JobConf();
    jobConf.readFields(in);
}
Also used : Reducer(org.apache.hadoop.mapred.Reducer) JobConf(org.apache.hadoop.mapred.JobConf)
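
readObject above restores the reducer from a Class object and the JobConf from Hadoop Writable data, so the serializing side must write exactly those two pieces. The following sketch of the matching writeObject is inferred from the fields read here rather than copied from the Flink source:

// Counterpart to readObject above (inferred from what readObject reads; the
// actual Flink source may differ in detail).
private void writeObject(final ObjectOutputStream out) throws IOException {
    // matched by in.readObject() in readObject
    out.writeObject(reducer.getClass());
    // matched by jobConf.readFields(in)
    jobConf.write(out);
}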

Example 8 with Reducer

Use of org.apache.hadoop.mapred.Reducer in project tez by apache.

The class ReduceProcessor, method runNewReducer.

void runNewReducer(JobConf job, final MRTaskReporter reporter, OrderedGroupedInputLegacy input, RawComparator comparator, Class keyClass, Class valueClass, final KeyValueWriter out) throws IOException, InterruptedException, ClassNotFoundException, TezException {
    // make a task context so we can get the classes
    org.apache.hadoop.mapreduce.TaskAttemptContext taskContext = getTaskAttemptContext();
    // make a reducer
    org.apache.hadoop.mapreduce.Reducer reducer = (org.apache.hadoop.mapreduce.Reducer) ReflectionUtils.newInstance(taskContext.getReducerClass(), job);
    // wrap value iterator to report progress.
    final TezRawKeyValueIterator rawIter = input.getIterator();
    TezRawKeyValueIterator rIter = new TezRawKeyValueIterator() {

        public void close() throws IOException {
            rawIter.close();
        }

        public DataInputBuffer getKey() throws IOException {
            return rawIter.getKey();
        }

        public Progress getProgress() {
            return rawIter.getProgress();
        }

        @Override
        public boolean isSameKey() throws IOException {
            return rawIter.isSameKey();
        }

        public DataInputBuffer getValue() throws IOException {
            return rawIter.getValue();
        }

        @Override
        public boolean hasNext() throws IOException {
            return rawIter.hasNext();
        }

        public boolean next() throws IOException {
            boolean ret = rawIter.next();
            reporter.setProgress(rawIter.getProgress().getProgress());
            return ret;
        }
    };
    org.apache.hadoop.mapreduce.RecordWriter trackedRW = new org.apache.hadoop.mapreduce.RecordWriter() {

        @Override
        public void write(Object key, Object value) throws IOException, InterruptedException {
            out.write(key, value);
        }

        @Override
        public void close(TaskAttemptContext context) throws IOException, InterruptedException {
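            // No-op: the wrapped KeyValueWriter is managed and closed outside this RecordWriter.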
        }
    };
    org.apache.hadoop.mapreduce.Reducer.Context reducerContext = createReduceContext(reducer, job, taskAttemptId, rIter, reduceInputKeyCounter, reduceInputValueCounter, trackedRW, committer, reporter, comparator, keyClass, valueClass);
    reducer.run(reducerContext);
    // Set progress to 1.0f if there was no exception.
    reporter.setProgress(1.0f);
    trackedRW.close(reducerContext);
}
Also used : TaskAttemptContext(org.apache.hadoop.mapreduce.TaskAttemptContext) Reducer(org.apache.hadoop.mapred.Reducer) TezRawKeyValueIterator(org.apache.tez.runtime.library.common.sort.impl.TezRawKeyValueIterator)
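
The only Reducer-specific step in runNewReducer is the call reducer.run(reducerContext); the rest adapts Tez iterators, counters, and writers to the MapReduce context. For reference, the default run loop of org.apache.hadoop.mapreduce.Reducer looks roughly like the sketch below (simplified; the real Hadoop implementation also resets the value iterator's backup store and wraps cleanup in a try/finally):

// Simplified view of org.apache.hadoop.mapreduce.Reducer.run(Context),
// the method invoked by reducer.run(reducerContext) above.
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    // one reduce() call per key group supplied by the (Tez-backed) context
    while (context.nextKey()) {
        reduce(context.getCurrentKey(), context.getValues(), context);
    }
    cleanup(context);
}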

Example 9 with Reducer

Use of org.apache.hadoop.mapred.Reducer in project flink by apache.

The class HadoopReduceCombineFunction, method readObject.

@SuppressWarnings("unchecked")
private void readObject(final ObjectInputStream in) throws IOException, ClassNotFoundException {
    Class<Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>> reducerClass = (Class<Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>>) in.readObject();
    reducer = InstantiationUtil.instantiate(reducerClass);
    Class<Reducer<KEYIN, VALUEIN, KEYIN, VALUEIN>> combinerClass = (Class<Reducer<KEYIN, VALUEIN, KEYIN, VALUEIN>>) in.readObject();
    combiner = InstantiationUtil.instantiate(combinerClass);
    jobConf = new JobConf();
    jobConf.readFields(in);
}
Also used : Reducer(org.apache.hadoop.mapred.Reducer) JobConf(org.apache.hadoop.mapred.JobConf)
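
For context, HadoopReduceCombineFunction lets a Flink DataSet program reuse an old-API (mapred) Reducer as both its group-reduce and its combine step. The sketch below assumes the flink-hadoop-compatibility module is on the classpath; the SumReducer class and the inline test data are made up for illustration, and the package of HadoopReduceCombineFunction should be verified against the Flink version in use.

import java.io.IOException;
import java.util.Iterator;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.mapred.HadoopReduceCombineFunction;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class HadoopReduceCombineExample {

    // Old-API Hadoop Reducer that sums the values of each key. Because its input
    // and output types are identical, the same class can serve as the combiner.
    public static class SumReducer implements Reducer<Text, LongWritable, Text, LongWritable> {

        @Override
        public void reduce(Text key, Iterator<LongWritable> values,
                OutputCollector<Text, LongWritable> out, Reporter reporter) throws IOException {
            long sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            out.collect(key, new LongWritable(sum));
        }

        @Override
        public void configure(JobConf jobConf) {
            // no configuration needed
        }

        @Override
        public void close() throws IOException {
            // nothing to release
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Text, LongWritable>> counts = env.fromElements(
                new Tuple2<>(new Text("a"), new LongWritable(1L)),
                new Tuple2<>(new Text("a"), new LongWritable(2L)),
                new Tuple2<>(new Text("b"), new LongWritable(3L)));

        // Both wrapped instances travel through the writeObject/readObject pair shown above.
        DataSet<Tuple2<Text, LongWritable>> summed = counts
                .groupBy(0)
                .reduceGroup(new HadoopReduceCombineFunction<Text, LongWritable, Text, LongWritable>(
                        new SumReducer(), new SumReducer()));

        summed.print();
    }
}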

Aggregations

Reducer (org.apache.hadoop.mapred.Reducer): 9 usages
JobConf (org.apache.hadoop.mapred.JobConf): 5 usages
OutputCollector (org.apache.hadoop.mapred.OutputCollector): 2 usages
TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext): 2 usages
WrappedReducer (org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer): 2 usages
IOException (java.io.IOException): 1 usage
URL (java.net.URL): 1 usage
URLClassLoader (java.net.URLClassLoader): 1 usage
StringTokenizer (java.util.StringTokenizer): 1 usage
BasicParser (org.apache.commons.cli.BasicParser): 1 usage
CommandLine (org.apache.commons.cli.CommandLine): 1 usage
ParseException (org.apache.commons.cli.ParseException): 1 usage
Parser (org.apache.commons.cli.Parser): 1 usage
Path (org.apache.hadoop.fs.Path): 1 usage
RawComparator (org.apache.hadoop.io.RawComparator): 1 usage
FileInputFormat (org.apache.hadoop.mapred.FileInputFormat): 1 usage
FileOutputFormat (org.apache.hadoop.mapred.FileOutputFormat): 1 usage
InputFormat (org.apache.hadoop.mapred.InputFormat): 1 usage
Mapper (org.apache.hadoop.mapred.Mapper): 1 usage
OutputFormat (org.apache.hadoop.mapred.OutputFormat): 1 usage