Search in sources :

Example 41 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class CoGroupCustomPartitioningTest method testIncompatibleHashAndCustomPartitioning.

@Test
public void testIncompatibleHashAndCustomPartitioning() {
    try {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple3<Long, Long, Long>> input = env.fromElements(new Tuple3<Long, Long, Long>(0L, 0L, 0L));
        DataSet<Tuple3<Long, Long, Long>> partitioned = input.partitionCustom(new Partitioner<Long>() {

            @Override
            public int partition(Long key, int numPartitions) {
                return 0;
            }
        }, 0).map(new IdentityMapper<Tuple3<Long, Long, Long>>()).withForwardedFields("0", "1", "2");
        DataSet<Tuple3<Long, Long, Long>> grouped = partitioned.distinct(0, 1).groupBy(1).sortGroup(0, Order.ASCENDING).reduceGroup(new IdentityGroupReducerCombinable<Tuple3<Long, Long, Long>>()).withForwardedFields("0", "1");
        grouped.coGroup(partitioned).where(0).equalTo(0).with(new DummyCoGroupFunction<Tuple3<Long, Long, Long>, Tuple3<Long, Long, Long>>()).output(new DiscardingOutputFormat<Tuple2<Tuple3<Long, Long, Long>, Tuple3<Long, Long, Long>>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        DualInputPlanNode coGroup = (DualInputPlanNode) sink.getInput().getSource();
        assertEquals(ShipStrategyType.PARTITION_HASH, coGroup.getInput1().getShipStrategy());
        assertTrue(coGroup.getInput2().getShipStrategy() == ShipStrategyType.PARTITION_HASH || coGroup.getInput2().getShipStrategy() == ShipStrategyType.FORWARD);
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) InvalidProgramException(org.apache.flink.api.common.InvalidProgramException) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) IdentityMapper(org.apache.flink.optimizer.testfunctions.IdentityMapper) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Tuple3(org.apache.flink.api.java.tuple.Tuple3) IdentityGroupReducerCombinable(org.apache.flink.optimizer.testfunctions.IdentityGroupReducerCombinable) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) DummyCoGroupFunction(org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction) Test(org.junit.Test)

Example 42 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class CoGroupCustomPartitioningTest method testCoGroupWithTuples.

@Test
public void testCoGroupWithTuples() {
    try {
        final Partitioner<Long> partitioner = new TestPartitionerLong();
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Long>> input1 = env.fromElements(new Tuple2<Long, Long>(0L, 0L));
        DataSet<Tuple3<Long, Long, Long>> input2 = env.fromElements(new Tuple3<Long, Long, Long>(0L, 0L, 0L));
        input1.coGroup(input2).where(1).equalTo(0).withPartitioner(partitioner).with(new DummyCoGroupFunction<Tuple2<Long, Long>, Tuple3<Long, Long, Long>>()).output(new DiscardingOutputFormat<Tuple2<Tuple2<Long, Long>, Tuple3<Long, Long, Long>>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput1().getShipStrategy());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput2().getShipStrategy());
        assertEquals(partitioner, join.getInput1().getPartitioner());
        assertEquals(partitioner, join.getInput2().getPartitioner());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) InvalidProgramException(org.apache.flink.api.common.InvalidProgramException) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) DummyCoGroupFunction(org.apache.flink.optimizer.testfunctions.DummyCoGroupFunction) Test(org.junit.Test)

Example 43 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class CustomPartitioningGlobalOptimizationTest method testJoinReduceCombination.

@Test
public void testJoinReduceCombination() {
    try {
        final Partitioner<Long> partitioner = new TestPartitionerLong();
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Long>> input1 = env.fromElements(new Tuple2<Long, Long>(0L, 0L));
        DataSet<Tuple3<Long, Long, Long>> input2 = env.fromElements(new Tuple3<Long, Long, Long>(0L, 0L, 0L));
        DataSet<Tuple3<Long, Long, Long>> joined = input1.join(input2).where(1).equalTo(0).projectFirst(0, 1).<Tuple3<Long, Long, Long>>projectSecond(2).withPartitioner(partitioner);
        joined.groupBy(1).withPartitioner(partitioner).reduceGroup(new IdentityGroupReducerCombinable<Tuple3<Long, Long, Long>>()).output(new DiscardingOutputFormat<Tuple3<Long, Long, Long>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        SingleInputPlanNode reducer = (SingleInputPlanNode) sink.getInput().getSource();
        assertTrue("Reduce is not chained, property reuse does not happen", reducer.getInput().getSource() instanceof DualInputPlanNode);
        DualInputPlanNode join = (DualInputPlanNode) reducer.getInput().getSource();
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput1().getShipStrategy());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, join.getInput2().getShipStrategy());
        assertEquals(partitioner, join.getInput1().getPartitioner());
        assertEquals(partitioner, join.getInput2().getPartitioner());
        assertEquals(ShipStrategyType.FORWARD, reducer.getInput().getShipStrategy());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Tuple3(org.apache.flink.api.java.tuple.Tuple3) IdentityGroupReducerCombinable(org.apache.flink.optimizer.testfunctions.IdentityGroupReducerCombinable) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Test(org.junit.Test)

Example 44 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class CustomPartitioningTest method testPartitionTuples.

@Test
public void testPartitionTuples() {
    try {
        final Partitioner<Integer> part = new TestPartitionerInt();
        final int parallelism = 4;
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(parallelism);
        DataSet<Tuple2<Integer, Integer>> data = env.fromElements(new Tuple2<Integer, Integer>(0, 0)).rebalance();
        data.partitionCustom(part, 0).mapPartition(new IdentityPartitionerMapper<Tuple2<Integer, Integer>>()).output(new DiscardingOutputFormat<Tuple2<Integer, Integer>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        SingleInputPlanNode mapper = (SingleInputPlanNode) sink.getInput().getSource();
        SingleInputPlanNode partitioner = (SingleInputPlanNode) mapper.getInput().getSource();
        SingleInputPlanNode balancer = (SingleInputPlanNode) partitioner.getInput().getSource();
        assertEquals(ShipStrategyType.FORWARD, sink.getInput().getShipStrategy());
        assertEquals(parallelism, sink.getParallelism());
        assertEquals(ShipStrategyType.FORWARD, mapper.getInput().getShipStrategy());
        assertEquals(parallelism, mapper.getParallelism());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, partitioner.getInput().getShipStrategy());
        assertEquals(part, partitioner.getInput().getPartitioner());
        assertEquals(parallelism, partitioner.getParallelism());
        assertEquals(ShipStrategyType.PARTITION_FORCED_REBALANCE, balancer.getInput().getShipStrategy());
        assertEquals(parallelism, balancer.getParallelism());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) InvalidProgramException(org.apache.flink.api.common.InvalidProgramException) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) IdentityPartitionerMapper(org.apache.flink.optimizer.testfunctions.IdentityPartitionerMapper) Tuple2(org.apache.flink.api.java.tuple.Tuple2) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Test(org.junit.Test)

Example 45 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class CustomPartitioningTest method testPartitionPojo.

@Test
public void testPartitionPojo() {
    try {
        final Partitioner<Integer> part = new TestPartitionerInt();
        final int parallelism = 4;
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(parallelism);
        DataSet<Pojo> data = env.fromElements(new Pojo()).rebalance();
        data.partitionCustom(part, "a").mapPartition(new IdentityPartitionerMapper<Pojo>()).output(new DiscardingOutputFormat<Pojo>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        SingleInputPlanNode mapper = (SingleInputPlanNode) sink.getInput().getSource();
        SingleInputPlanNode partitioner = (SingleInputPlanNode) mapper.getInput().getSource();
        SingleInputPlanNode balancer = (SingleInputPlanNode) partitioner.getInput().getSource();
        assertEquals(ShipStrategyType.FORWARD, sink.getInput().getShipStrategy());
        assertEquals(parallelism, sink.getParallelism());
        assertEquals(ShipStrategyType.FORWARD, mapper.getInput().getShipStrategy());
        assertEquals(parallelism, mapper.getParallelism());
        assertEquals(ShipStrategyType.PARTITION_CUSTOM, partitioner.getInput().getShipStrategy());
        assertEquals(part, partitioner.getInput().getPartitioner());
        assertEquals(parallelism, partitioner.getParallelism());
        assertEquals(ShipStrategyType.PARTITION_FORCED_REBALANCE, balancer.getInput().getShipStrategy());
        assertEquals(parallelism, balancer.getParallelism());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) InvalidProgramException(org.apache.flink.api.common.InvalidProgramException) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) IdentityPartitionerMapper(org.apache.flink.optimizer.testfunctions.IdentityPartitionerMapper) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Test(org.junit.Test)

Aggregations

OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan)221 Test (org.junit.Test)197 Plan (org.apache.flink.api.common.Plan)192 ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment)183 SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode)146 Tuple2 (org.apache.flink.api.java.tuple.Tuple2)91 SingleInputPlanNode (org.apache.flink.optimizer.plan.SingleInputPlanNode)83 DualInputPlanNode (org.apache.flink.optimizer.plan.DualInputPlanNode)82 JobGraphGenerator (org.apache.flink.optimizer.plantranslate.JobGraphGenerator)55 Tuple3 (org.apache.flink.api.java.tuple.Tuple3)54 SourcePlanNode (org.apache.flink.optimizer.plan.SourcePlanNode)48 DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat)33 InvalidProgramException (org.apache.flink.api.common.InvalidProgramException)27 FieldList (org.apache.flink.api.common.operators.util.FieldList)27 Channel (org.apache.flink.optimizer.plan.Channel)26 FieldSet (org.apache.flink.api.common.operators.util.FieldSet)25 GlobalProperties (org.apache.flink.optimizer.dataproperties.GlobalProperties)25 LocalProperties (org.apache.flink.optimizer.dataproperties.LocalProperties)25 IdentityMapper (org.apache.flink.optimizer.testfunctions.IdentityMapper)20 WorksetIterationPlanNode (org.apache.flink.optimizer.plan.WorksetIterationPlanNode)16