Search in sources :

Example 66 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class PartitionPushdownTest method testPartitioningNotPushedDown.

@Test
public void testPartitioningNotPushedDown() {
    try {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        @SuppressWarnings("unchecked") DataSet<Tuple3<Long, Long, Long>> input = env.fromElements(new Tuple3<Long, Long, Long>(0L, 0L, 0L));
        input.groupBy(0, 1).sum(2).groupBy(0).sum(1).output(new DiscardingOutputFormat<Tuple3<Long, Long, Long>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sink = op.getDataSinks().iterator().next();
        SingleInputPlanNode agg2Reducer = (SingleInputPlanNode) sink.getInput().getSource();
        SingleInputPlanNode agg2Combiner = (SingleInputPlanNode) agg2Reducer.getInput().getSource();
        SingleInputPlanNode agg1Reducer = (SingleInputPlanNode) agg2Combiner.getInput().getSource();
        assertEquals(ShipStrategyType.PARTITION_HASH, agg2Reducer.getInput().getShipStrategy());
        assertEquals(new FieldList(0), agg2Reducer.getInput().getShipStrategyKeys());
        assertEquals(ShipStrategyType.FORWARD, agg2Combiner.getInput().getShipStrategy());
        assertEquals(ShipStrategyType.PARTITION_HASH, agg1Reducer.getInput().getShipStrategy());
        assertEquals(new FieldList(0, 1), agg1Reducer.getInput().getShipStrategyKeys());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) FieldList(org.apache.flink.api.common.operators.util.FieldList) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Test(org.junit.Test)

Example 67 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class PartitioningReusageTest method reuseBothPartitioningJoin6.

@Test
public void reuseBothPartitioningJoin6() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> joined = set1.partitionByHash(0).map(new MockMapper()).withForwardedFields("0").join(set2.partitionByHash(1).map(new MockMapper()).withForwardedFields("1"), JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0, 2).equalTo(1, 2).with(new MockJoin());
    joined.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
    checkValidJoinInputProperties(join);
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)

Example 68 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class PartitioningReusageTest method noPreviousPartitioningCoGroup1.

@Test
public void noPreviousPartitioningCoGroup1() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> coGrouped = set1.coGroup(set2).where(0, 1).equalTo(0, 1).with(new MockCoGroup());
    coGrouped.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode coGroup = (DualInputPlanNode) sink.getInput().getSource();
    checkValidCoGroupInputProperties(coGroup);
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)

Example 69 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class PartitioningReusageTest method reuseBothPartitioningJoin1.

@Test
public void reuseBothPartitioningJoin1() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> joined = set1.partitionByHash(0, 1).map(new MockMapper()).withForwardedFields("0;1").join(set2.partitionByHash(0, 1).map(new MockMapper()).withForwardedFields("0;1"), JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0, 1).equalTo(0, 1).with(new MockJoin());
    joined.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
    checkValidJoinInputProperties(join);
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)

Example 70 with OptimizedPlan

use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

the class PartitioningReusageTest method reuseBothPartitioningJoin4.

@Test
public void reuseBothPartitioningJoin4() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> joined = set1.partitionByHash(0, 2).map(new MockMapper()).withForwardedFields("0;2").join(set2.partitionByHash(1).map(new MockMapper()).withForwardedFields("1"), JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0, 2).equalTo(2, 1).with(new MockJoin());
    joined.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
    checkValidJoinInputProperties(join);
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)

Aggregations

OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan)221 Test (org.junit.Test)197 Plan (org.apache.flink.api.common.Plan)192 ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment)183 SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode)146 Tuple2 (org.apache.flink.api.java.tuple.Tuple2)91 SingleInputPlanNode (org.apache.flink.optimizer.plan.SingleInputPlanNode)83 DualInputPlanNode (org.apache.flink.optimizer.plan.DualInputPlanNode)82 JobGraphGenerator (org.apache.flink.optimizer.plantranslate.JobGraphGenerator)55 Tuple3 (org.apache.flink.api.java.tuple.Tuple3)54 SourcePlanNode (org.apache.flink.optimizer.plan.SourcePlanNode)48 DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat)33 InvalidProgramException (org.apache.flink.api.common.InvalidProgramException)27 FieldList (org.apache.flink.api.common.operators.util.FieldList)27 Channel (org.apache.flink.optimizer.plan.Channel)26 FieldSet (org.apache.flink.api.common.operators.util.FieldSet)25 GlobalProperties (org.apache.flink.optimizer.dataproperties.GlobalProperties)25 LocalProperties (org.apache.flink.optimizer.dataproperties.LocalProperties)25 IdentityMapper (org.apache.flink.optimizer.testfunctions.IdentityMapper)20 WorksetIterationPlanNode (org.apache.flink.optimizer.plan.WorksetIterationPlanNode)16