Example 71 with JobConf

use of org.apache.hadoop.mapred.JobConf in project flink by apache.

the class WordCountMapredITCase method internalRun.

private void internalRun(boolean isTestDeprecatedAPI) throws Exception {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple2<LongWritable, Text>> input;
    if (isTestDeprecatedAPI) {
        input = env.readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class, textPath);
    } else {
        input = env.createInput(readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class, textPath));
    }
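    // keep only the line text; the LongWritable key produced by TextInputFormat is the byte offset of each line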
    DataSet<String> text = input.map(new MapFunction<Tuple2<LongWritable, Text>, String>() {

        @Override
        public String map(Tuple2<LongWritable, Text> value) throws Exception {
            return value.f1.toString();
        }
    });
    // split up the lines into pairs (2-tuples) containing (word, 1),
    // then group by the word and sum the counts
    DataSet<Tuple2<String, Integer>> counts = text.flatMap(new Tokenizer()).groupBy(0).sum(1);
    DataSet<Tuple2<Text, LongWritable>> words = counts.map(new MapFunction<Tuple2<String, Integer>, Tuple2<Text, LongWritable>>() {

        @Override
        public Tuple2<Text, LongWritable> map(Tuple2<String, Integer> value) throws Exception {
            return new Tuple2<Text, LongWritable>(new Text(value.f0), new LongWritable(value.f1));
        }
    });
    // Set up Hadoop Output Format
    HadoopOutputFormat<Text, LongWritable> hadoopOutputFormat = new HadoopOutputFormat<Text, LongWritable>(new TextOutputFormat<Text, LongWritable>(), new JobConf());
    hadoopOutputFormat.getJobConf().set("mapred.textoutputformat.separator", " ");
    TextOutputFormat.setOutputPath(hadoopOutputFormat.getJobConf(), new Path(resultPath));
    // Output & Execute
    words.output(hadoopOutputFormat);
    env.execute("Hadoop Compat WordCount");
}
Also used : Path(org.apache.hadoop.fs.Path) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Text(org.apache.hadoop.io.Text) HadoopOutputFormat(org.apache.flink.api.java.hadoop.mapred.HadoopOutputFormat) TextInputFormat(org.apache.hadoop.mapred.TextInputFormat) Tuple2(org.apache.flink.api.java.tuple.Tuple2) LongWritable(org.apache.hadoop.io.LongWritable) Tokenizer(org.apache.flink.test.testfunctions.Tokenizer) JobConf(org.apache.hadoop.mapred.JobConf)
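
The Tokenizer above comes from Flink's test utilities (org.apache.flink.test.testfunctions.Tokenizer). As a hedged sketch, a tokenizer in the style of Flink's standard WordCount examples looks like the following; the actual test class may differ in detail:

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// Illustrative sketch in the style of Flink's WordCount tokenizers;
// not a verbatim copy of org.apache.flink.test.testfunctions.Tokenizer.
public final class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {

    @Override
    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
        // normalize and split each line into words
        for (String token : value.toLowerCase().split("\\W+")) {
            if (token.length() > 0) {
                // emit a (word, 1) pair for every word
                out.collect(new Tuple2<>(token, 1));
            }
        }
    }
}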

Example 72 with JobConf

use of org.apache.hadoop.mapred.JobConf in project hadoop by apache.

the class ValueAggregatorJob method createValueAggregatorJobs.

public static JobControl createValueAggregatorJobs(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors) throws IOException {
    JobControl theControl = new JobControl("ValueAggregatorJobs");
    ArrayList<Job> dependingJobs = new ArrayList<Job>();
    JobConf aJobConf = createValueAggregatorJob(args);
    if (descriptors != null)
        setAggregatorDescriptors(aJobConf, descriptors);
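    // dependingJobs is empty: this aggregator job has no prerequisite jobs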
    Job aJob = new Job(aJobConf, dependingJobs);
    theControl.addJob(aJob);
    return theControl;
}
Also used : ArrayList(java.util.ArrayList) JobControl(org.apache.hadoop.mapred.jobcontrol.JobControl) Job(org.apache.hadoop.mapred.jobcontrol.Job) JobConf(org.apache.hadoop.mapred.JobConf)
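
The returned JobControl does not run the jobs by itself; it has to be driven by the caller. A minimal sketch of the usual pattern (the method name runAggregatorJobs is hypothetical; the JobControl calls are standard org.apache.hadoop.mapred.jobcontrol.JobControl API):

// Illustrative driver; assumes the JobControl API used in the example above.
public static void runAggregatorJobs(String[] args,
        Class<? extends ValueAggregatorDescriptor>[] descriptors)
        throws IOException, InterruptedException {
    JobControl control = ValueAggregatorJob.createValueAggregatorJobs(args, descriptors);
    // JobControl implements Runnable: the scheduler runs in its own thread
    Thread runner = new Thread(control);
    runner.setDaemon(true);
    runner.start();
    // poll until every job in the group has succeeded or failed
    while (!control.allFinished()) {
        Thread.sleep(1000);
    }
    // shut down the scheduler thread
    control.stop();
}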

Example 73 with JobConf

use of org.apache.hadoop.mapred.JobConf in project hadoop by apache.

the class ValueAggregatorJob method createValueAggregatorJob.

public static JobConf createValueAggregatorJob(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors, Class<?> caller) throws IOException {
    JobConf job = createValueAggregatorJob(args, caller);
    setAggregatorDescriptors(job, descriptors);
    return job;
}
Also used : JobConf(org.apache.hadoop.mapred.JobConf)

Example 74 with JobConf

use of org.apache.hadoop.mapred.JobConf in project hadoop by apache.

the class ValueAggregatorJob method createValueAggregatorJob.

public static JobConf createValueAggregatorJob(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors) throws IOException {
    JobConf job = createValueAggregatorJob(args);
    setAggregatorDescriptors(job, descriptors);
    return job;
}
Also used : JobConf(org.apache.hadoop.mapred.JobConf)

Example 75 with JobConf

use of org.apache.hadoop.mapred.JobConf in project hadoop by apache.

the class ValueAggregatorJob method main.

/**
   * Create and run an Aggregate-based map/reduce job.
   * 
   * @param args the arguments used for job creation
   * @throws IOException
   */
public static void main(String[] args) throws IOException {
    JobConf job = ValueAggregatorJob.createValueAggregatorJob(args);
    JobClient.runJob(job);
}
Also used : JobConf(org.apache.hadoop.mapred.JobConf)
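
Because createValueAggregatorJob returns an ordinary JobConf, callers are free to adjust it before submission. A minimal sketch of that pattern (the job name and reducer count below are illustrative, not taken from the Hadoop source):

public static void main(String[] args) throws IOException {
    JobConf job = ValueAggregatorJob.createValueAggregatorJob(args);
    // both setters are standard JobConf API; the values are only examples
    job.setJobName("value-aggregator-demo");
    job.setNumReduceTasks(4);
    // blocks until completion; throws IOException if the job fails
    JobClient.runJob(job);
}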

Aggregations

JobConf (org.apache.hadoop.mapred.JobConf): 1037
Path (org.apache.hadoop.fs.Path): 510
Test (org.junit.Test): 317
FileSystem (org.apache.hadoop.fs.FileSystem): 264
IOException (java.io.IOException): 204
Configuration (org.apache.hadoop.conf.Configuration): 163
InputSplit (org.apache.hadoop.mapred.InputSplit): 110
ArrayList (java.util.ArrayList): 89
Text (org.apache.hadoop.io.Text): 82
File (java.io.File): 81
RunningJob (org.apache.hadoop.mapred.RunningJob): 67
Properties (java.util.Properties): 58
List (java.util.List): 49
HashMap (java.util.HashMap): 47
DMLRuntimeException (org.apache.sysml.runtime.DMLRuntimeException): 47
SequenceFile (org.apache.hadoop.io.SequenceFile): 45
TextInputFormat (org.apache.hadoop.mapred.TextInputFormat): 44
Map (java.util.Map): 42
Job (org.apache.hadoop.mapreduce.Job): 42
LongWritable (org.apache.hadoop.io.LongWritable): 41