Example 1 with DummyCoGroupFunction

Use of org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction in the Apache Flink project.

Class CoGroupCustomPartitioningTest, method testIncompatibleHashAndCustomPartitioning.

@Test
public void testIncompatibleHashAndCustomPartitioning() {
    try {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple3<Long, Long, Long>> input = env.fromElements(new Tuple3<Long, Long, Long>(0L, 0L, 0L));
        DataSet<Tuple3<Long, Long, Long>> partitioned = input.partitionCustom(new Partitioner<Long>() {

            @Override
            public int partition(Long key, int numPartitions) {
                return 0;
            }
        }, 0).map(new IdentityMapper<Tuple3<Long, Long, Long>>()).withForwardedFields("0", "1", "2");
        DataSet<Tuple3<Long, Long, Long>> grouped = partitioned.distinct(0, 1).groupBy(1).sortGroup(0, Order.ASCENDING).reduceGroup(new IdentityGroupReducerCombinable<Tuple3<Long, Long, Long>>()).withForwardedFields("0", "1");
        grouped.coGroup(partitioned).where(0).equalTo(0).with(new DummyCoGroupFunction<Tuple3<Long, Long, Long>, Tuple3<Long, Long, Long>>()).output(new DiscardingOutputFormat<Tuple2<Tuple3<Long, Long, Long>, Tuple3<Long, Long, Long>>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        DualInputPlanNode coGroup = (DualInputPlanNode) sink.getInput().getSource();
        assertEquals(ShipStrategyType.PARTITION_HASH, coGroup.getInput1().getShipStrategy());
        assertTrue(coGroup.getInput2().getShipStrategy() == ShipStrategyType.PARTITION_HASH || coGroup.getInput2().getShipStrategy() == ShipStrategyType.FORWARD);
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) InvalidProgramException(org.apache.flink.api.common.InvalidProgramException) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) IdentityMapper(org.apache.flink.optimizer.testfunctions.IdentityMapper) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Tuple3(org.apache.flink.api.java.tuple.Tuple3) IdentityGroupReducerCombinable(org.apache.flink.optimizer.testfunctions.IdentityGroupReducerCombinable) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) DummyCoGroupFunction(org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction) Test(org.junit.Test)
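
The source of DummyCoGroupFunction is not shown on this page. Judging only from how the examples use it (two type parameters, and a sink typed as a Tuple2 of the two input types), it is presumably a co-group function that consumes both groups and emits nothing. The following is a minimal sketch of that idea under this assumption. Pair, Output, CoGroupFn and NoOpCoGroup are hypothetical stand-ins defined here so the block runs without Flink; they are not the Flink classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NoOpCoGroupSketch {

    /** Hypothetical stand-in for a two-field tuple (not Flink's Tuple2). */
    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    /** Hypothetical stand-in for an output collector. */
    interface Output<T> {
        void collect(T record);
    }

    /** Hypothetical stand-in for a co-group function over two grouped inputs. */
    interface CoGroupFn<L, R, O> {
        void coGroup(Iterable<L> left, Iterable<R> right, Output<O> out);
    }

    /** Assumed behaviour of a "dummy" co-group: accept both groups, emit nothing. */
    static final class NoOpCoGroup<L, R> implements CoGroupFn<L, R, Pair<L, R>> {
        @Override
        public void coGroup(Iterable<L> left, Iterable<R> right, Output<Pair<L, R>> out) {
            // Intentionally empty: only the operator's place in the plan matters.
        }
    }

    /** Runs the no-op function once and returns how many records it emitted. */
    static int runOnce() {
        List<Pair<Long, String>> collected = new ArrayList<>();
        new NoOpCoGroup<Long, String>()
                .coGroup(Arrays.asList(1L, 2L), Arrays.asList("a"), collected::add);
        return collected.size();
    }

    public static void main(String[] args) {
        System.out.println("records emitted: " + runOnce());
    }
}
```

A function like this is enough for optimizer tests such as the ones listed here, because they inspect ship strategies and plan nodes rather than the data the operator produces.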

Example 2 with DummyCoGroupFunction

Use of org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction in the Apache Flink project.

Class CoGroupCustomPartitioningTest, method testCoGroupWithTuples.

@Test
public void testCoGroupWithTuples() {
    try {
        final Partitioner<Long> partitioner = new TestPartitionerLong();
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Long>> input1 = env.fromElements(new Tuple2<Long, Long>(0L, 0L));
        DataSet<Tuple3<Long, Long, Long>> input2 = env.fromElements(new Tuple3<Long, Long, Long>(0L, 0L, 0L));
        input1.coGroup(input2).where(1).equalTo(0).withPartitioner(partitioner).with(new DummyCoGroupFunction<Tuple2<Long, Long>, Tuple3<Long, Long, Long>>()).output(new DiscardingOutputFormat<Tuple2<Tuple2<Long, Long>, Tuple3<Long, Long, Long>>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput1().getShipStrategy());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput2().getShipStrategy());
        assertEquals(partitioner, join.getInput1().getPartitioner());
        assertEquals(partitioner, join.getInput2().getPartitioner());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) InvalidProgramException(org.apache.flink.api.common.InvalidProgramException) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) DummyCoGroupFunction(org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction) Test(org.junit.Test)

Example 3 with DummyCoGroupFunction

Use of org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction in the Apache Flink project.

Class PipelineBreakingTest, method testReJoinedBranches.

/**
 * Tests that branches that are re-joined have pipeline breakers placed.
 *
 * <pre>
 *                                         /-> (sink)
 *                                        /
 *                         /-> (reduce) -+          /-> (flatmap) -> (sink)
 *                        /               \        /
 *     (source) -> (map) -                (join) -+-----\
 *                        \               /              \
 *                         \-> (filter) -+                \
 *                                       \                (co group) -> (sink)
 *                                        \                /
 *                                         \-> (reduce) - /
 * </pre>
 */
@Test
public void testReJoinedBranches() {
    try {
        // build a test program
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Long>> data = env.fromElements(33L, 44L).map(new MapFunction<Long, Tuple2<Long, Long>>() {

            @Override
            public Tuple2<Long, Long> map(Long value) {
                return new Tuple2<Long, Long>(value, value);
            }
        });
        DataSet<Tuple2<Long, Long>> reduced = data.groupBy(0).reduce(new SelectOneReducer<Tuple2<Long, Long>>());
        reduced.output(new DiscardingOutputFormat<Tuple2<Long, Long>>());
        DataSet<Tuple2<Long, Long>> filtered = data.filter(new FilterFunction<Tuple2<Long, Long>>() {

            @Override
            public boolean filter(Tuple2<Long, Long> value) throws Exception {
                return false;
            }
        });
        DataSet<Tuple2<Long, Long>> joined = reduced.join(filtered).where(1).equalTo(1).with(new DummyFlatJoinFunction<Tuple2<Long, Long>>());
        joined.flatMap(new IdentityFlatMapper<Tuple2<Long, Long>>()).output(new DiscardingOutputFormat<Tuple2<Long, Long>>());
        joined.coGroup(filtered.groupBy(1).reduceGroup(new Top1GroupReducer<Tuple2<Long, Long>>())).where(0).equalTo(0).with(new DummyCoGroupFunction<Tuple2<Long, Long>, Tuple2<Long, Long>>()).output(new DiscardingOutputFormat<Tuple2<Tuple2<Long, Long>, Tuple2<Long, Long>>>());
        List<DataSinkNode> sinks = convertPlan(env.createProgramPlan());
        // gather the optimizer DAG nodes
        DataSinkNode sinkAfterReduce = sinks.get(0);
        DataSinkNode sinkAfterFlatMap = sinks.get(1);
        DataSinkNode sinkAfterCoGroup = sinks.get(2);
        SingleInputNode reduceNode = (SingleInputNode) sinkAfterReduce.getPredecessorNode();
        SingleInputNode mapNode = (SingleInputNode) reduceNode.getPredecessorNode();
        SingleInputNode flatMapNode = (SingleInputNode) sinkAfterFlatMap.getPredecessorNode();
        TwoInputNode joinNode = (TwoInputNode) flatMapNode.getPredecessorNode();
        SingleInputNode filterNode = (SingleInputNode) joinNode.getSecondPredecessorNode();
        TwoInputNode coGroupNode = (TwoInputNode) sinkAfterCoGroup.getPredecessorNode();
        SingleInputNode otherReduceNode = (SingleInputNode) coGroupNode.getSecondPredecessorNode();
        // test sanity checks (that we constructed the DAG correctly)
        assertEquals(reduceNode, joinNode.getFirstPredecessorNode());
        assertEquals(mapNode, filterNode.getPredecessorNode());
        assertEquals(joinNode, coGroupNode.getFirstPredecessorNode());
        assertEquals(filterNode, otherReduceNode.getPredecessorNode());
        // verify the pipeline breaking status
        assertFalse(sinkAfterReduce.getInputConnection().isBreakingPipeline());
        assertFalse(sinkAfterFlatMap.getInputConnection().isBreakingPipeline());
        assertFalse(sinkAfterCoGroup.getInputConnection().isBreakingPipeline());
        assertFalse(mapNode.getIncomingConnection().isBreakingPipeline());
        assertFalse(flatMapNode.getIncomingConnection().isBreakingPipeline());
        assertFalse(joinNode.getFirstIncomingConnection().isBreakingPipeline());
        assertFalse(coGroupNode.getFirstIncomingConnection().isBreakingPipeline());
        assertFalse(coGroupNode.getSecondIncomingConnection().isBreakingPipeline());
        // these should be pipeline breakers
        assertTrue(reduceNode.getIncomingConnection().isBreakingPipeline());
        assertTrue(filterNode.getIncomingConnection().isBreakingPipeline());
        assertTrue(otherReduceNode.getIncomingConnection().isBreakingPipeline());
        assertTrue(joinNode.getSecondIncomingConnection().isBreakingPipeline());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : SingleInputNode(org.apache.flink.optimizer.dag.SingleInputNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Top1GroupReducer(org.apache.flink.optimizer.testfunctions.Top1GroupReducer) DataSinkNode(org.apache.flink.optimizer.dag.DataSinkNode) Tuple2(org.apache.flink.api.java.tuple.Tuple2) IdentityFlatMapper(org.apache.flink.optimizer.testfunctions.IdentityFlatMapper) DummyCoGroupFunction(org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction) TwoInputNode(org.apache.flink.optimizer.dag.TwoInputNode) Test(org.junit.Test)

Example 4 with DummyCoGroupFunction

Use of org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction in the Apache Flink project.

Class BinaryCustomPartitioningCompatibilityTest, method testCompatiblePartitioningCoGroup.

@Test
public void testCompatiblePartitioningCoGroup() {
    try {
        final Partitioner<Long> partitioner = new Partitioner<Long>() {

            @Override
            public int partition(Long key, int numPartitions) {
                return 0;
            }
        };
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Long>> input1 = env.fromElements(new Tuple2<Long, Long>(0L, 0L));
        DataSet<Tuple3<Long, Long, Long>> input2 = env.fromElements(new Tuple3<Long, Long, Long>(0L, 0L, 0L));
        input1.partitionCustom(partitioner, 1).coGroup(input2.partitionCustom(partitioner, 0)).where(1).equalTo(0).with(new DummyCoGroupFunction<Tuple2<Long, Long>, Tuple3<Long, Long, Long>>()).output(new DiscardingOutputFormat<Tuple2<Tuple2<Long, Long>, Tuple3<Long, Long, Long>>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        DualInputPlanNode coGroup = (DualInputPlanNode) sink.getInput().getSource();
        SingleInputPlanNode partitioner1 = (SingleInputPlanNode) coGroup.getInput1().getSource();
        SingleInputPlanNode partitioner2 = (SingleInputPlanNode) coGroup.getInput2().getSource();
        assertEquals(ShipStrategyType.FORWARD, coGroup.getInput1().getShipStrategy());
        assertEquals(ShipStrategyType.FORWARD, coGroup.getInput2().getShipStrategy());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, partitioner1.getInput().getShipStrategy());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, partitioner2.getInput().getShipStrategy());
        assertEquals(partitioner, partitioner1.getInput().getPartitioner());
        assertEquals(partitioner, partitioner2.getInput().getPartitioner());
        new JobGraphGenerator().compileJobGraph(op);
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) Tuple2(org.apache.flink.api.java.tuple.Tuple2) JobGraphGenerator(org.apache.flink.optimizer.plantranslate.JobGraphGenerator) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Partitioner(org.apache.flink.api.common.functions.Partitioner) DummyCoGroupFunction(org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction) Test(org.junit.Test)
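
The test above asserts that, once both inputs are custom-partitioned on their co-group keys with the same partitioner, the co-group itself can use the FORWARD ship strategy. One way to see why: the same partitioner sends equal keys to the same partition index on both sides, so no further shuffle is needed. The sketch below illustrates that property in plain Java. KeyPartitioner, distribute and coLocated are hypothetical helpers written for this illustration (KeyPartitioner only mirrors the partition(key, numPartitions) shape seen in the examples), and the modulo partitioner is an assumed example, not something the tests use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CoLocationSketch {

    /** Hypothetical stand-in mirroring the partition(key, numPartitions) shape. */
    interface KeyPartitioner<K> {
        int partition(K key, int numPartitions);
    }

    /** Assigns each key to one of numPartitions buckets using the partitioner. */
    static <K> List<List<K>> distribute(List<K> keys, KeyPartitioner<K> partitioner, int numPartitions) {
        List<List<K>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (K key : keys) {
            buckets.get(partitioner.partition(key, numPartitions)).add(key);
        }
        return buckets;
    }

    /** True if no key of the left side appears in a different bucket index on the right side. */
    static <K> boolean coLocated(List<List<K>> left, List<List<K>> right) {
        for (int i = 0; i < left.size(); i++) {
            for (K key : left.get(i)) {
                for (int j = 0; j < right.size(); j++) {
                    if (j != i && right.get(j).contains(key)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed example partitioner: key modulo the number of partitions.
        KeyPartitioner<Long> modulo = (key, n) -> (int) Math.floorMod(key.longValue(), (long) n);
        // A different partitioner that sends every key to partition 0.
        KeyPartitioner<Long> constant = (key, n) -> 0;

        List<Long> leftKeys = Arrays.asList(1L, 2L, 3L, 4L);
        List<Long> rightKeys = Arrays.asList(2L, 4L, 6L);

        List<List<Long>> left = distribute(leftKeys, modulo, 3);

        // Same partitioner on both sides: matching keys share a partition index.
        System.out.println(coLocated(left, distribute(rightKeys, modulo, 3)));   // prints "true"
        // Different partitioners: key 2 sits in bucket 2 on the left, bucket 0 on the right.
        System.out.println(coLocated(left, distribute(rightKeys, constant, 3))); // prints "false"
    }
}
```

The second case is the kind of mismatch that forces a repartitioning step before keys can be grouped together.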

Example 5 with DummyCoGroupFunction

Use of org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction in the Apache Flink project.

Class CoGroupCustomPartitioningTest, method testCoGroupWithKeySelectors.

@Test
public void testCoGroupWithKeySelectors() {
    try {
        final Partitioner<Integer> partitioner = new TestPartitionerInt();
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Pojo2> input1 = env.fromElements(new Pojo2());
        DataSet<Pojo3> input2 = env.fromElements(new Pojo3());
        input1.coGroup(input2).where(new Pojo2KeySelector()).equalTo(new Pojo3KeySelector()).withPartitioner(partitioner).with(new DummyCoGroupFunction<Pojo2, Pojo3>()).output(new DiscardingOutputFormat<Tuple2<Pojo2, Pojo3>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput1().getShipStrategy());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput2().getShipStrategy());
        assertEquals(partitioner, join.getInput1().getPartitioner());
        assertEquals(partitioner, join.getInput2().getPartitioner());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) InvalidProgramException(org.apache.flink.api.common.InvalidProgramException) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) Tuple2(org.apache.flink.api.java.tuple.Tuple2) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) DummyCoGroupFunction(org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction) Test(org.junit.Test)

Aggregations

ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 7
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 7
DummyCoGroupFunction (org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction): 7
Test (org.junit.Test): 7
Plan (org.apache.flink.api.common.Plan): 6
OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan): 6
DualInputPlanNode (org.apache.flink.optimizer.plan.DualInputPlanNode): 5
SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode): 5
InvalidProgramException (org.apache.flink.api.common.InvalidProgramException): 4
Tuple3 (org.apache.flink.api.java.tuple.Tuple3): 3
JobGraphGenerator (org.apache.flink.optimizer.plantranslate.JobGraphGenerator): 2
Partitioner (org.apache.flink.api.common.functions.Partitioner): 1
DataSinkNode (org.apache.flink.optimizer.dag.DataSinkNode): 1
SingleInputNode (org.apache.flink.optimizer.dag.SingleInputNode): 1
TwoInputNode (org.apache.flink.optimizer.dag.TwoInputNode): 1
SingleInputPlanNode (org.apache.flink.optimizer.plan.SingleInputPlanNode): 1
IdentityFlatMapper (org.apache.flink.optimizer.testfunctions.IdentityFlatMapper): 1
IdentityGroupReducerCombinable (org.apache.flink.optimizer.testfunctions.IdentityGroupReducerCombinable): 1
IdentityMapper (org.apache.flink.optimizer.testfunctions.IdentityMapper): 1
Top1GroupReducer (org.apache.flink.optimizer.testfunctions.Top1GroupReducer): 1