Examples with MapFunction - org.apache.flink.api.common.functions.MapFunction

Example 41 with MapFunction

Use of org.apache.flink.api.common.functions.MapFunction in the Apache Flink project.

Class CustomKvStateProgram, method main.

public static void main(String[] args) throws Exception {
    final String jarFile = args[0];
    final String host = args[1];
    final int port = Integer.parseInt(args[2]);
    final int parallelism = Integer.parseInt(args[3]);
    final String checkpointPath = args[4];
    final int checkpointingInterval = Integer.parseInt(args[5]);
    final String outputPath = args[6];
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(host, port, jarFile);
    env.setParallelism(parallelism);
    env.getConfig().disableSysoutLogging();
    env.enableCheckpointing(checkpointingInterval);
    env.setStateBackend(new FsStateBackend(checkpointPath));
    DataStream<Integer> source = env.addSource(new InfiniteIntegerSource());
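    // Pair every integer with a random key in [0, parallelism) so the stream can be keyed on field f0.
    source.map(new MapFunction<Integer, Tuple2<Integer, Integer>>() {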

        private static final long serialVersionUID = 1L;

        @Override
        public Tuple2<Integer, Integer> map(Integer value) throws Exception {
            return new Tuple2<>(ThreadLocalRandom.current().nextInt(parallelism), value);
        }
    }).keyBy(new KeySelector<Tuple2<Integer, Integer>, Integer>() {

        private static final long serialVersionUID = 1L;

        @Override
        public Integer getKey(Tuple2<Integer, Integer> value) throws Exception {
            return value.f0;
        }
    }).flatMap(new ReducingStateFlatMap()).writeAsText(outputPath);
    env.execute();
}

Also used : RichFlatMapFunction(org.apache.flink.api.common.functions.RichFlatMapFunction) MapFunction(org.apache.flink.api.common.functions.MapFunction) Tuple2(org.apache.flink.api.java.tuple.Tuple2) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) FsStateBackend(org.apache.flink.runtime.state.filesystem.FsStateBackend)
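
The MapFunction in the example above does one thing: it pairs each incoming integer with a random key in [0, parallelism). The following is a minimal sketch of that step without a Flink dependency. The `MapFunction` interface and the `Pair` class are hypothetical stand-ins for Flink's `MapFunction` and `Tuple2`, not the library types, and `randomKeyMapper` and `applyAll` are illustrative helpers that do not appear in the original program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class MapFunctionSketch {

    /**
     * Hypothetical stand-in for Flink's MapFunction: one record in, one record out.
     * Flink's interface also declares "throws Exception"; that is omitted here for brevity.
     */
    @FunctionalInterface
    interface MapFunction<T, O> {
        O map(T value);
    }

    /** Hypothetical stand-in for Flink's Tuple2, reduced to two public fields. */
    static final class Pair<A, B> {
        final A f0;
        final B f1;

        Pair(A f0, B f1) {
            this.f0 = f0;
            this.f1 = f1;
        }
    }

    /** Builds a mapper that pairs each value with a random key in [0, parallelism). */
    static MapFunction<Integer, Pair<Integer, Integer>> randomKeyMapper(int parallelism) {
        return value -> new Pair<>(ThreadLocalRandom.current().nextInt(parallelism), value);
    }

    /** Applies the mapper to every element, one record at a time. */
    static <T, O> List<O> applyAll(List<T> input, MapFunction<T, O> fn) {
        List<O> out = new ArrayList<>();
        for (T value : input) {
            out.add(fn.map(value));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pair<Integer, Integer>> keyed = applyAll(Arrays.asList(10, 20, 30), randomKeyMapper(4));
        for (Pair<Integer, Integer> p : keyed) {
            // The key differs from run to run; the value is passed through unchanged.
            System.out.println("key=" + p.f0 + " value=" + p.f1);
        }
    }
}
```

Because the stand-in interface has a single abstract method, the mapper can be written as a lambda instead of an anonymous class. Whether a lambda works in a real Flink job depends on type extraction for the generic return type, which this sketch does not cover.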

Example 42 with MapFunction

Use of org.apache.flink.api.common.functions.MapFunction in the Apache Flink project.

Class CoGroupConnectedComponentsITCase, method testProgram.

// --------------------------------------------------------------------------------------------
//  The test program
// --------------------------------------------------------------------------------------------
@Override
protected void testProgram() throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple1<Long>> initialVertices = env.readCsvFile(verticesPath).fieldDelimiter(" ").types(Long.class).name("Vertices");
    DataSet<Tuple2<Long, Long>> edges = env.readCsvFile(edgesPath).fieldDelimiter(" ").types(Long.class, Long.class).name("Edges");
    DataSet<Tuple2<Long, Long>> verticesWithId = initialVertices.map(new MapFunction<Tuple1<Long>, Tuple2<Long, Long>>() {

        @Override
        public Tuple2<Long, Long> map(Tuple1<Long> value) throws Exception {
            return new Tuple2<>(value.f0, value.f0);
        }
    }).name("Assign Vertex Ids");
    DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration = verticesWithId.iterateDelta(verticesWithId, MAX_ITERATIONS, 0);
    JoinOperator<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>> joinWithNeighbors = iteration.getWorkset().join(edges).where(0).equalTo(0).with(new JoinFunction<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>>() {

        @Override
        public Tuple2<Long, Long> join(Tuple2<Long, Long> first, Tuple2<Long, Long> second) throws Exception {
            return new Tuple2<>(second.f1, first.f1);
        }
    }).name("Join Candidate Id With Neighbor");
    CoGroupOperator<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>> minAndUpdate = joinWithNeighbors.coGroup(iteration.getSolutionSet()).where(0).equalTo(0).with(new MinIdAndUpdate()).name("min Id and Update");
    iteration.closeWith(minAndUpdate, minAndUpdate).writeAsCsv(resultPath, "\n", " ").name("Result");
    env.execute("Workset Connected Components");
}

Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple1(org.apache.flink.api.java.tuple.Tuple1) Tuple2(org.apache.flink.api.java.tuple.Tuple2) JoinFunction(org.apache.flink.api.common.functions.JoinFunction) MapFunction(org.apache.flink.api.common.functions.MapFunction)
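
The delta iteration above starts every vertex with its own ID as its component ID, sends each workset entry's ID along the edges leaving that vertex, and keeps a candidate only when it lowers the stored ID. The sketch below shows that idea in plain Java collections, with no Flink types. It is an assumption-laden illustration: `MinIdAndUpdate` is not shown in the source, so the update rule here (take the minimum candidate, update only if it is smaller than the current ID) is a guess at its behavior, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MinIdPropagationSketch {

    /**
     * Propagates the smallest reachable vertex ID along directed edges {source, target}.
     * For undirected graphs, list each edge in both directions.
     */
    static Map<Long, Long> components(List<Long> vertices, List<long[]> edges, int maxIterations) {
        // Solution set: vertex -> current component ID, initialised to the vertex's own ID.
        Map<Long, Long> solution = new HashMap<>();
        for (Long v : vertices) {
            solution.put(v, v);
        }
        // Workset: vertices whose component ID changed in the previous round.
        Map<Long, Long> workset = new HashMap<>(solution);

        for (int i = 0; i < maxIterations && !workset.isEmpty(); i++) {
            // "Join" step: each workset entry offers its ID to the targets of its outgoing edges.
            Map<Long, Long> candidates = new HashMap<>();
            for (long[] e : edges) {
                Long id = workset.get(e[0]);
                if (id != null) {
                    candidates.merge(e[1], id, Long::min);
                }
            }
            // "Min and update" step (assumed): keep only candidates that improve the solution.
            Map<Long, Long> next = new HashMap<>();
            for (Map.Entry<Long, Long> c : candidates.entrySet()) {
                Long current = solution.get(c.getKey());
                if (current != null && c.getValue() < current) {
                    solution.put(c.getKey(), c.getValue());
                    next.put(c.getKey(), c.getValue());
                }
            }
            workset = next;
        }
        return solution;
    }

    public static void main(String[] args) {
        List<Long> vertices = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        List<long[]> edges = Arrays.asList(
                new long[] {1, 2}, new long[] {2, 1},
                new long[] {2, 3}, new long[] {3, 2},
                new long[] {4, 5}, new long[] {5, 4});
        // Sorted for stable printing: {1=1, 2=1, 3=1, 4=4, 5=4}
        System.out.println(new TreeMap<>(components(vertices, edges, 10)));
    }
}
```

The loop stops early once the workset is empty, which is the same termination condition a delta iteration uses in addition to the iteration cap.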

Example 43 with MapFunction

Use of org.apache.flink.api.common.functions.MapFunction in the Apache Flink project.

Class DataSinkITCase, method testSortingParallelism4.

@Test
public void testSortingParallelism4() throws Exception {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Long> ds = env.generateSequence(0, 1000);
    // Randomize: replace every element with a pseudo-random long (fixed seed) so the sink receives unsorted input.
    ds.map(new MapFunction<Long, Long>() {

        Random rand = new Random(1234L);

        @Override
        public Long map(Long value) throws Exception {
            return rand.nextLong();
        }
    }).writeAsText(resultPath).sortLocalOutput("*", Order.ASCENDING).setParallelism(4);
    env.execute();
    BufferedReader[] resReaders = getResultReader(resultPath);
    for (BufferedReader br : resReaders) {
        long cmp = Long.MIN_VALUE;
        while (br.ready()) {
            long cur = Long.parseLong(br.readLine());
            assertTrue("Invalid order of sorted output", cmp <= cur);
            cmp = cur;
        }
        br.close();
    }
}

Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Random(java.util.Random) BufferedReader(java.io.BufferedReader) MapFunction(org.apache.flink.api.common.functions.MapFunction) Test(org.junit.Test)
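
The verification loop in this test reads each result file line by line and asserts that the values never decrease. The sketch below isolates that check as a reusable method. It is an illustration under assumptions, not code from the test: the class and method names are hypothetical, it reads until `readLine()` returns null instead of polling `ready()`, and it closes the reader with try-with-resources.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class SortedOutputCheck {

    /** Returns true if the reader yields one long per line in non-decreasing order. */
    static boolean isAscending(BufferedReader reader) {
        long previous = Long.MIN_VALUE;
        try (BufferedReader br = reader) {
            String line;
            while ((line = br.readLine()) != null) {
                long current = Long.parseLong(line.trim());
                if (previous > current) {
                    return false;
                }
                previous = current;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    /** Convenience overload for in-memory text, used here in place of a result file. */
    static boolean isAscendingText(String text) {
        return isAscending(new BufferedReader(new StringReader(text)));
    }

    public static void main(String[] args) {
        System.out.println(isAscendingText("-5\n0\n0\n42\n")); // prints "true"
        System.out.println(isAscendingText("3\n1\n"));         // prints "false"
    }
}
```

As in the test, the order is checked per reader. With parallelism 4 each output file is sorted locally, so nothing here asserts a global order across files.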

Example 44 with MapFunction

Use of org.apache.flink.api.common.functions.MapFunction in the Apache Flink project.

Class GroupCombineITCase, method testPartialReduceWithDifferentInputOutputType.

@Test
public void testPartialReduceWithDifferentInputOutputType() throws Exception {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // Input data: 3-tuples from the shared test collection
    DataSet<Tuple3<Integer, Long, String>> ds = CollectionDataSets.get3TupleDataSet(env);
    DataSet<Tuple2<Long, Tuple3<Integer, Long, String>>> dsWrapped = ds.map(new Tuple3KvWrapper());
    List<Tuple2<Integer, Long>> result = dsWrapped.groupBy(0).combineGroup(new Tuple3toTuple2GroupReduce()).groupBy(0).reduceGroup(new Tuple2toTuple2GroupReduce()).map(new MapFunction<Tuple2<Long, Tuple2<Integer, Long>>, Tuple2<Integer, Long>>() {

        @Override
        public Tuple2<Integer, Long> map(Tuple2<Long, Tuple2<Integer, Long>> value) throws Exception {
            return value.f1;
        }
    }).collect();
    String expected = "1,3\n" + "5,20\n" + "15,58\n" + "34,52\n" + "65,70\n" + "111,96\n";
    compareResultAsTuples(result, expected);
}

Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) MapFunction(org.apache.flink.api.common.functions.MapFunction) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Tuple3(org.apache.flink.api.java.tuple.Tuple3) Test(org.junit.Test)

Example 45 with MapFunction

Use of org.apache.flink.api.common.functions.MapFunction in the Apache Flink project.

Class GroupCombineITCase, method testPartialReduceWithIdenticalInputOutputType.

@Test
public void testPartialReduceWithIdenticalInputOutputType() throws Exception {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // Input data: 3-tuples from the shared test collection
    DataSet<Tuple3<Integer, Long, String>> ds = CollectionDataSets.get3TupleDataSet(env);
    DataSet<Tuple2<Long, Tuple3<Integer, Long, String>>> dsWrapped = ds.map(new Tuple3KvWrapper());
    List<Tuple3<Integer, Long, String>> result = dsWrapped.groupBy(0).combineGroup(new Tuple3toTuple3GroupReduce()).groupBy(0).reduceGroup(new Tuple3toTuple3GroupReduce()).map(new MapFunction<Tuple2<Long, Tuple3<Integer, Long, String>>, Tuple3<Integer, Long, String>>() {

        @Override
        public Tuple3<Integer, Long, String> map(Tuple2<Long, Tuple3<Integer, Long, String>> value) throws Exception {
            return value.f1;
        }
    }).collect();
    String expected = "1,1,combined\n" + "5,4,combined\n" + "15,9,combined\n" + "34,16,combined\n" + "65,25,combined\n" + "111,36,combined\n";
    compareResultAsTuples(result, expected);
}

Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Tuple3(org.apache.flink.api.java.tuple.Tuple3) MapFunction(org.apache.flink.api.common.functions.MapFunction) Test(org.junit.Test)

Aggregations

MapFunction (org.apache.flink.api.common.functions.MapFunction): 48
Test (org.junit.Test): 31
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 29
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 19
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 19
Configuration (org.apache.flink.configuration.Configuration): 10
FlatMapFunction (org.apache.flink.api.common.functions.FlatMapFunction): 9
Plan (org.apache.flink.api.common.Plan): 8
RichMapFunction (org.apache.flink.api.common.functions.RichMapFunction): 8
OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan): 8
RichFlatMapFunction (org.apache.flink.api.common.functions.RichFlatMapFunction): 7
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph): 7
DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat): 6
Edge (org.apache.flink.graph.Edge): 6
SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode): 6
NullValue (org.apache.flink.types.NullValue): 6
FilterFunction (org.apache.flink.api.common.functions.FilterFunction): 5
FieldList (org.apache.flink.api.common.operators.util.FieldList): 5
DataSet (org.apache.flink.api.java.DataSet): 5
Tuple1 (org.apache.flink.api.java.tuple.Tuple1): 5