Example 61 with SinkPlanNode

Use of org.apache.flink.optimizer.plan.SinkPlanNode in the Apache Flink project.

From class PartitioningReusageTest, method noPreviousPartitioningCoGroup1.

@Test
public void noPreviousPartitioningCoGroup1() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // Two CSV sources with identical schemas; neither input carries a prior partitioning.
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    // CoGroup on the composite key (0, 1), so the optimizer must establish the partitioning itself.
    DataSet<Tuple3<Integer, Integer, Integer>> coGrouped = set1.coGroup(set2).where(0, 1).equalTo(0, 1).with(new MockCoGroup());
    coGrouped.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    // Walk from the sink back to the co-group node and validate its input properties.
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode coGroup = (DualInputPlanNode) sink.getInput().getSource();
    checkValidCoGroupInputProperties(coGroup);
}
Also used: DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)
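
MockCoGroup is a test helper that this listing does not reproduce. As a hedged sketch only, an identity-style implementation consistent with the test's types might look like this (the body shown here is an assumption, not the actual Flink helper):

import org.apache.flink.api.common.functions.CoGroupFunction;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.util.Collector;

// Hypothetical stand-in for MockCoGroup: forwards every record of the first input group unchanged.
public class MockCoGroup implements CoGroupFunction<Tuple3<Integer, Integer, Integer>, Tuple3<Integer, Integer, Integer>, Tuple3<Integer, Integer, Integer>> {

    @Override
    public void coGroup(Iterable<Tuple3<Integer, Integer, Integer>> first,
                        Iterable<Tuple3<Integer, Integer, Integer>> second,
                        Collector<Tuple3<Integer, Integer, Integer>> out) {
        for (Tuple3<Integer, Integer, Integer> t : first) {
            out.collect(t);
        }
    }
}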

Example 62 with SinkPlanNode

Use of org.apache.flink.optimizer.plan.SinkPlanNode in the Apache Flink project.

From class PartitioningReusageTest, method noPreviousPartitioningJoin1.

@Test
public void noPreviousPartitioningJoin1() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    // Join on matching key positions (0, 1) == (0, 1) with a repartition-hash hint; no input is pre-partitioned.
    DataSet<Tuple3<Integer, Integer, Integer>> joined = set1.join(set2, JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0, 1).equalTo(0, 1).with(new MockJoin());
    joined.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
    checkValidJoinInputProperties(join);
}
Also used: DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)
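
MockJoin is likewise a helper that is not shown in this listing. A minimal sketch, assuming it simply emits the left record of every matched pair (the name matches the test, the body is an assumption):

import org.apache.flink.api.common.functions.JoinFunction;
import org.apache.flink.api.java.tuple.Tuple3;

// Hypothetical stand-in for MockJoin: returns the first (left) record of each matched pair.
public class MockJoin implements JoinFunction<Tuple3<Integer, Integer, Integer>, Tuple3<Integer, Integer, Integer>, Tuple3<Integer, Integer, Integer>> {

    @Override
    public Tuple3<Integer, Integer, Integer> join(Tuple3<Integer, Integer, Integer> first,
                                                  Tuple3<Integer, Integer, Integer> second) {
        return first;
    }
}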

Example 63 with SinkPlanNode

Use of org.apache.flink.optimizer.plan.SinkPlanNode in the Apache Flink project.

From class PartitioningReusageTest, method reuseSinglePartitioningJoin5.

@Test
public void reuseSinglePartitioningJoin5() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    // set2 is hash-partitioned on field 2 before the join, and the mapper declares that it
    // forwards field 2 unchanged, so the optimizer may reuse that partitioning for the key (2, 1).
    DataSet<Tuple3<Integer, Integer, Integer>> joined = set1
            .join(set2.partitionByHash(2).map(new MockMapper()).withForwardedFields("2"),
                    JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST)
            .where(0, 1)
            .equalTo(2, 1)
            .with(new MockJoin());
    joined.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
    checkValidJoinInputProperties(join);
}
Also used: DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)
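
The withForwardedFields("2") declaration only holds if MockMapper really leaves field 2 untouched; that declaration is what lets the optimizer keep the hash partitioning on field 2 alive across the map. A sketch of an identity mapper consistent with it (the actual MockMapper is not shown, so this body is an assumption):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple3;

// Hypothetical identity mapper. Any real implementation must not modify field 2,
// or the withForwardedFields("2") declaration above would be incorrect.
public class MockMapper implements MapFunction<Tuple3<Integer, Integer, Integer>, Tuple3<Integer, Integer, Integer>> {

    @Override
    public Tuple3<Integer, Integer, Integer> map(Tuple3<Integer, Integer, Integer> value) {
        return value;
    }
}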

Example 64 with SinkPlanNode

Use of org.apache.flink.optimizer.plan.SinkPlanNode in the Apache Flink project.

From class PartitioningReusageTest, method noPreviousPartitioningJoin2.

@Test
public void noPreviousPartitioningJoin2() {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<Integer, Integer, Integer>> set1 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    DataSet<Tuple3<Integer, Integer, Integer>> set2 = env.readCsvFile(IN_FILE).types(Integer.class, Integer.class, Integer.class);
    // Same setup as noPreviousPartitioningJoin1, but with mismatched key positions:
    // (0, 1) on the first input against (2, 1) on the second.
    DataSet<Tuple3<Integer, Integer, Integer>> joined = set1.join(set2, JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0, 1).equalTo(2, 1).with(new MockJoin());
    joined.output(new DiscardingOutputFormat<Tuple3<Integer, Integer, Integer>>());
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan = compileWithStats(plan);
    SinkPlanNode sink = oPlan.getDataSinks().iterator().next();
    DualInputPlanNode join = (DualInputPlanNode) sink.getInput().getSource();
    checkValidJoinInputProperties(join);
}
Also used: DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple3(org.apache.flink.api.java.tuple.Tuple3) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)
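
checkValidJoinInputProperties is the shared assertion helper of PartitioningReusageTest and is not reproduced in this listing. Conceptually, it inspects the global properties of the join's two input channels. A much-simplified sketch of that idea follows; the real helper accepts more partitioning combinations than plain hash partitioning, so treat this as an illustration only:

import org.apache.flink.optimizer.dataproperties.GlobalProperties;
import org.apache.flink.optimizer.dataproperties.PartitioningProperty;
import org.apache.flink.optimizer.plan.DualInputPlanNode;
import org.junit.Assert;

// Simplified stand-in: assert that both join inputs arrive hash-partitioned.
private static void checkHashPartitionedInputs(DualInputPlanNode join) {
    GlobalProperties in1 = join.getInput1().getGlobalProperties();
    GlobalProperties in2 = join.getInput2().getGlobalProperties();
    Assert.assertEquals(PartitioningProperty.HASH_PARTITIONED, in1.getPartitioning());
    Assert.assertEquals(PartitioningProperty.HASH_PARTITIONED, in2.getPartitioning());
}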

Example 65 with SinkPlanNode

Use of org.apache.flink.optimizer.plan.SinkPlanNode in the Apache Flink project.

From class RelationalQueryCompilerTest, method testQueryGeneric.

private void testQueryGeneric(Plan p, long orderSize, long lineitemSize, float orderSelectivity, float joinSelectivity, boolean broadcastOkay, boolean partitionedOkay, boolean hashJoinFirstOkay, boolean hashJoinSecondOkay, boolean mergeJoinOkay) {
    try {
        // set statistics
        OperatorResolver cr = getContractResolver(p);
        GenericDataSourceBase<?, ?> ordersSource = cr.getNode(ORDERS);
        GenericDataSourceBase<?, ?> lineItemSource = cr.getNode(LINEITEM);
        SingleInputOperator<?, ?, ?> mapper = cr.getNode(MAPPER_NAME);
        DualInputOperator<?, ?, ?, ?> joiner = cr.getNode(JOIN_NAME);
        setSourceStatistics(ordersSource, orderSize, 100f);
        setSourceStatistics(lineItemSource, lineitemSize, 140f);
        mapper.getCompilerHints().setAvgOutputRecordSize(16f);
        mapper.getCompilerHints().setFilterFactor(orderSelectivity);
        joiner.getCompilerHints().setFilterFactor(joinSelectivity);
        // compile
        final OptimizedPlan plan = compileWithStats(p);
        final OptimizerPlanNodeResolver or = getOptimizerPlanNodeResolver(plan);
        // get the nodes from the final plan
        final SinkPlanNode sink = or.getNode(SINK);
        final SingleInputPlanNode reducer = or.getNode(REDUCE_NAME);
        final SingleInputPlanNode combiner = reducer.getPredecessor() instanceof SingleInputPlanNode ? (SingleInputPlanNode) reducer.getPredecessor() : null;
        final DualInputPlanNode join = or.getNode(JOIN_NAME);
        final SingleInputPlanNode filteringMapper = or.getNode(MAPPER_NAME);
        checkStandardStrategies(filteringMapper, join, combiner, reducer, sink);
        // check the possible variants and that the chosen variant is allowed in this specific setting
        if (checkBroadcastShipStrategies(join, reducer, combiner)) {
            Assert.assertTrue("Broadcast join incorrectly chosen.", broadcastOkay);
            if (checkHashJoinStrategies(join, reducer, true)) {
                Assert.assertTrue("Hash join (build orders) incorrectly chosen", hashJoinFirstOkay);
            } else if (checkHashJoinStrategies(join, reducer, false)) {
                Assert.assertTrue("Hash join (build lineitem) incorrectly chosen", hashJoinSecondOkay);
            } else if (checkBroadcastMergeJoin(join, reducer)) {
                Assert.assertTrue("Merge join incorrectly chosen", mergeJoinOkay);
            } else {
                Assert.fail("Plan has no correct hash join or merge join strategies.");
            }
        } else if (checkRepartitionShipStrategies(join, reducer, combiner)) {
            Assert.assertTrue("Partitioned join incorrectly chosen.", partitionedOkay);
            if (checkHashJoinStrategies(join, reducer, true)) {
                Assert.assertTrue("Hash join (build orders) incorrectly chosen", hashJoinFirstOkay);
            } else if (checkHashJoinStrategies(join, reducer, false)) {
                Assert.assertTrue("Hash join (build lineitem) incorrectly chosen", hashJoinSecondOkay);
            } else if (checkRepartitionMergeJoin(join, reducer)) {
                Assert.assertTrue("Merge join incorrectly chosen", mergeJoinOkay);
            } else {
                Assert.fail("Plan has no correct hash join or merge join strategies.");
            }
        } else {
            Assert.fail("Plan has neither correct BC join or partitioned join configuration.");
        }
    } catch (Exception e) {
        e.printStackTrace();
        Assert.fail(e.getMessage());
    }
}
Also used: SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) OperatorResolver(org.apache.flink.optimizer.util.OperatorResolver) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan)
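
A caller drives testQueryGeneric with source sizes, selectivities, and a set of flags stating which join strategies are acceptable under those statistics. A hypothetical invocation, with all numbers made up for illustration:

// Large lineitem side, selective filter and join: broadcast is not acceptable,
// repartitioning is, and any of the hash or merge variants may be chosen.
testQueryGeneric(p, 100_000_000L, 1_000_000_000L, 0.05f, 0.05f,
        false, true, true, true, true);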

Aggregations

SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode): 153
OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan): 146
Plan (org.apache.flink.api.common.Plan): 139
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 139
Test (org.junit.Test): 138
SingleInputPlanNode (org.apache.flink.optimizer.plan.SingleInputPlanNode): 72
DualInputPlanNode (org.apache.flink.optimizer.plan.DualInputPlanNode): 67
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 66
Tuple3 (org.apache.flink.api.java.tuple.Tuple3): 53
SourcePlanNode (org.apache.flink.optimizer.plan.SourcePlanNode): 52
FieldSet (org.apache.flink.api.common.operators.util.FieldSet): 24
DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat): 24
GlobalProperties (org.apache.flink.optimizer.dataproperties.GlobalProperties): 24
LocalProperties (org.apache.flink.optimizer.dataproperties.LocalProperties): 24
InvalidProgramException (org.apache.flink.api.common.InvalidProgramException): 23
FieldList (org.apache.flink.api.common.operators.util.FieldList): 23
WorksetIterationPlanNode (org.apache.flink.optimizer.plan.WorksetIterationPlanNode): 16
IdentityGroupReducerCombinable (org.apache.flink.optimizer.testfunctions.IdentityGroupReducerCombinable): 16
IdentityMapper (org.apache.flink.optimizer.testfunctions.IdentityMapper): 16
Channel (org.apache.flink.optimizer.plan.Channel): 13