
Example 36 with OptimizedPlan

Use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

The class UnionReplacementTest, method testConsecutiveUnionsWithRebalance.

/**
	 *
	 * Checks that a plan with consecutive UNIONs followed by REBALANCE is correctly translated.
	 *
	 * The program can be illustrated as follows:
	 *
	 * Src1 -\
	 *        >-> Union12--<
	 * Src2 -/              \
	 *                       >-> Union123 -> Rebalance -> Output
	 * Src3 ----------------/
	 *
	 * In the resulting plan, the Rebalance (ShipStrategyType.PARTITION_FORCED_REBALANCE) must be
	 * pushed to the inputs of the unions (Src1, Src2, Src3).
	 *
	 */
@Test
public void testConsecutiveUnionsWithRebalance() throws Exception {
    // -----------------------------------------------------------------------------------------
    // Build test program
    // -----------------------------------------------------------------------------------------
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(DEFAULT_PARALLELISM);
    DataSet<Tuple2<Long, Long>> src1 = env.fromElements(new Tuple2<>(0L, 0L));
    DataSet<Tuple2<Long, Long>> src2 = env.fromElements(new Tuple2<>(0L, 0L));
    DataSet<Tuple2<Long, Long>> src3 = env.fromElements(new Tuple2<>(0L, 0L));
    DataSet<Tuple2<Long, Long>> union12 = src1.union(src2);
    DataSet<Tuple2<Long, Long>> union123 = union12.union(src3);
    union123.rebalance().output(new DiscardingOutputFormat<Tuple2<Long, Long>>()).name("out");
    // -----------------------------------------------------------------------------------------
    // Verify optimized plan
    // -----------------------------------------------------------------------------------------
    OptimizedPlan optimizedPlan = compileNoStats(env.createProgramPlan());
    OptimizerPlanNodeResolver resolver = getOptimizerPlanNodeResolver(optimizedPlan);
    SingleInputPlanNode sink = resolver.getNode("out");
    // check partitioning is correct
    assertEquals("Sink input should be force rebalanced.", PartitioningProperty.FORCED_REBALANCED, sink.getInput().getGlobalProperties().getPartitioning());
    SingleInputPlanNode partitioner = (SingleInputPlanNode) sink.getInput().getSource();
    assertEquals(DriverStrategy.UNARY_NO_OP, partitioner.getDriverStrategy());
    assertEquals("Partitioner input should be force rebalanced.", PartitioningProperty.FORCED_REBALANCED, partitioner.getInput().getGlobalProperties().getPartitioning());
    assertEquals("Partitioner input channel should be forwarding", ShipStrategyType.FORWARD, partitioner.getInput().getShipStrategy());
    NAryUnionPlanNode union = (NAryUnionPlanNode) partitioner.getInput().getSource();
    // all union inputs should be force rebalanced
    for (Channel c : union.getInputs()) {
        assertEquals("Union input should be force rebalanced", PartitioningProperty.FORCED_REBALANCED, c.getGlobalProperties().getPartitioning());
        assertEquals("Union input channel should be rebalancing", ShipStrategyType.PARTITION_FORCED_REBALANCE, c.getShipStrategy());
        assertTrue("Union input should be data source", c.getSource() instanceof SourcePlanNode);
    }
}
Also used: SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) NAryUnionPlanNode(org.apache.flink.optimizer.plan.NAryUnionPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Channel(org.apache.flink.optimizer.plan.Channel) SourcePlanNode(org.apache.flink.optimizer.plan.SourcePlanNode) DiscardingOutputFormat(org.apache.flink.api.java.io.DiscardingOutputFormat) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)
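
The helper compileNoStats comes from Flink's CompilerTestBase and is not shown in this listing. As a minimal sketch of the idea, assuming empty DataStatistics and the DefaultCostEstimator (both are assumptions about what the test base does internally), a Plan can be compiled into an OptimizedPlan directly with the Optimizer:

// Sketch only: compile a program plan without statistics, similar in spirit to
// CompilerTestBase#compileNoStats; the actual test-base implementation may differ.
// Optimizer, DataStatistics, and DefaultCostEstimator live in flink-optimizer;
// Configuration is org.apache.flink.configuration.Configuration.
Plan plan = env.createProgramPlan();
Optimizer optimizer = new Optimizer(new DataStatistics(), new DefaultCostEstimator(), new Configuration());
OptimizedPlan optimized = optimizer.compile(plan);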

Example 37 with OptimizedPlan

Use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

The class UnionReplacementTest, method testUnionWithTwoOutputs.

/**
	 *
	 * Test for FLINK-2662.
	 *
	 * Checks that a plan with a union that has two outputs is correctly translated.
	 * The program can be illustrated as follows:
	 *
	 * Src1 ----------------\
	 *                       >-> Union123 -> GroupBy(0) -> Sum -> Output
	 * Src2 -\              /
	 *        >-> Union23--<
	 * Src3 -/              \
	 *                       >-> Union234 -> GroupBy(1) -> Sum -> Output
	 * Src4 ----------------/
	 *
	 * The fix for FLINK-2662 translates the union with two outputs (Union-23) into two separate
	 * unions (Union-23_1 and Union-23_2) with one output each. Due to this change, the interesting
	 * partitioning properties for GroupBy(0) and GroupBy(1) are pushed through Union-23_1 and
	 * Union-23_2 and do not interfere with each other (which would be the case if Union-23 were
	 * a single operator with two outputs).
	 *
	 */
@Test
public void testUnionWithTwoOutputs() throws Exception {
    // -----------------------------------------------------------------------------------------
    // Build test program
    // -----------------------------------------------------------------------------------------
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(DEFAULT_PARALLELISM);
    DataSet<Tuple2<Long, Long>> src1 = env.fromElements(new Tuple2<>(0L, 0L));
    DataSet<Tuple2<Long, Long>> src2 = env.fromElements(new Tuple2<>(0L, 0L));
    DataSet<Tuple2<Long, Long>> src3 = env.fromElements(new Tuple2<>(0L, 0L));
    DataSet<Tuple2<Long, Long>> src4 = env.fromElements(new Tuple2<>(0L, 0L));
    DataSet<Tuple2<Long, Long>> union23 = src2.union(src3);
    DataSet<Tuple2<Long, Long>> union123 = src1.union(union23);
    DataSet<Tuple2<Long, Long>> union234 = src4.union(union23);
    union123.groupBy(0).sum(1).name("1").output(new DiscardingOutputFormat<Tuple2<Long, Long>>());
    union234.groupBy(1).sum(0).name("2").output(new DiscardingOutputFormat<Tuple2<Long, Long>>());
    // -----------------------------------------------------------------------------------------
    // Verify optimized plan
    // -----------------------------------------------------------------------------------------
    OptimizedPlan optimizedPlan = compileNoStats(env.createProgramPlan());
    OptimizerPlanNodeResolver resolver = getOptimizerPlanNodeResolver(optimizedPlan);
    SingleInputPlanNode groupRed1 = resolver.getNode("1");
    SingleInputPlanNode groupRed2 = resolver.getNode("2");
    // check partitioning is correct
    assertTrue("Reduce input should be partitioned on 0.", groupRed1.getInput().getGlobalProperties().getPartitioningFields().isExactMatch(new FieldList(0)));
    assertTrue("Reduce input should be partitioned on 1.", groupRed2.getInput().getGlobalProperties().getPartitioningFields().isExactMatch(new FieldList(1)));
    // check group reduce inputs are n-ary unions with three inputs
    assertTrue("Reduce input should be n-ary union with three inputs.", groupRed1.getInput().getSource() instanceof NAryUnionPlanNode && ((NAryUnionPlanNode) groupRed1.getInput().getSource()).getListOfInputs().size() == 3);
    assertTrue("Reduce input should be n-ary union with three inputs.", groupRed2.getInput().getSource() instanceof NAryUnionPlanNode && ((NAryUnionPlanNode) groupRed2.getInput().getSource()).getListOfInputs().size() == 3);
    // check channel from union to group reduce is forwarding
    assertTrue("Channel between union and group reduce should be forwarding", groupRed1.getInput().getShipStrategy().equals(ShipStrategyType.FORWARD));
    assertTrue("Channel between union and group reduce should be forwarding", groupRed2.getInput().getShipStrategy().equals(ShipStrategyType.FORWARD));
    // check that all inputs of unions are hash partitioned
    List<Channel> union123In = ((NAryUnionPlanNode) groupRed1.getInput().getSource()).getListOfInputs();
    for (Channel i : union123In) {
        assertTrue("Union input channel should hash partition on 0", i.getShipStrategy().equals(ShipStrategyType.PARTITION_HASH) && i.getShipStrategyKeys().isExactMatch(new FieldList(0)));
    }
    List<Channel> union234In = ((NAryUnionPlanNode) groupRed2.getInput().getSource()).getListOfInputs();
    for (Channel i : union234In) {
        assertTrue("Union input channel should hash partition on 0", i.getShipStrategy().equals(ShipStrategyType.PARTITION_HASH) && i.getShipStrategyKeys().isExactMatch(new FieldList(1)));
    }
}
Also used: SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) NAryUnionPlanNode(org.apache.flink.optimizer.plan.NAryUnionPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Channel(org.apache.flink.optimizer.plan.Channel) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) FieldList(org.apache.flink.api.common.operators.util.FieldList) Test(org.junit.Test)
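
As a hypothetical debugging aid (not part of the test), the inputs of one of the translated n-ary unions can be walked and printed, using only accessors already exercised above, to verify visually that the FLINK-2662 rewrite pushed the hash partitioning below the union:

// Assumes groupRed1 from the test above; prints ship strategy and keys of each union input.
NAryUnionPlanNode union123Node = (NAryUnionPlanNode) groupRed1.getInput().getSource();
for (Channel in : union123Node.getListOfInputs()) {
    System.out.println(in.getShipStrategy() + " on keys " + in.getShipStrategyKeys());
}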

Example 38 with OptimizedPlan

Use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

The class WorksetIterationCornerCasesTest, method testWorksetIterationNotDependingOnSolutionSet.

@Test
public void testWorksetIterationNotDependingOnSolutionSet() {
    try {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Long>> input = env.generateSequence(1, 100).map(new Duplicator<Long>());
        DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration = input.iterateDelta(input, 100, 1);
        DataSet<Tuple2<Long, Long>> iterEnd = iteration.getWorkset().map(new TestMapper<Tuple2<Long, Long>>());
        iteration.closeWith(iterEnd, iterEnd).output(new DiscardingOutputFormat<Tuple2<Long, Long>>());
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        WorksetIterationPlanNode wipn = (WorksetIterationPlanNode) op.getDataSinks().iterator().next().getInput().getSource();
        assertTrue(wipn.getSolutionSetPlanNode().getOutgoingChannels().isEmpty());
        JobGraphGenerator jgg = new JobGraphGenerator();
        jgg.compileJobGraph(op);
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used: ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) WorksetIterationPlanNode(org.apache.flink.optimizer.plan.WorksetIterationPlanNode) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Tuple2(org.apache.flink.api.java.tuple.Tuple2) JobGraphGenerator(org.apache.flink.optimizer.plantranslate.JobGraphGenerator) Test(org.junit.Test)
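
The Duplicator and TestMapper helpers are defined elsewhere in the test class and do not appear in this listing. A plausible minimal sketch, assuming Duplicator pairs each value with itself and TestMapper is an identity map (the actual helpers may differ in detail):

// Sketch of the helper functions referenced above (org.apache.flink.api.common.functions.MapFunction).
public static class Duplicator<T> implements MapFunction<T, Tuple2<T, T>> {
    @Override
    public Tuple2<T, T> map(T value) {
        return new Tuple2<>(value, value);
    }
}

public static class TestMapper<T> implements MapFunction<T, T> {
    @Override
    public T map(T value) {
        return value;
    }
}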

Example 39 with OptimizedPlan

Use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

The class WorksetIterationsRecordApiCompilerTest, method testRecordApiWithDeferredSoltionSetUpdateWithNonPreservingJoin.

@Test
public void testRecordApiWithDeferredSoltionSetUpdateWithNonPreservingJoin() {
    Plan plan = getTestPlan(false, false);
    OptimizedPlan oPlan;
    try {
        oPlan = compileNoStats(plan);
    } catch (CompilerException ce) {
        ce.printStackTrace();
        fail("The pact compiler is unable to compile this plan correctly.");
        // silence the compiler
        return;
    }
    OptimizerPlanNodeResolver resolver = getOptimizerPlanNodeResolver(oPlan);
    DualInputPlanNode joinWithInvariantNode = resolver.getNode(JOIN_WITH_INVARIANT_NAME);
    DualInputPlanNode joinWithSolutionSetNode = resolver.getNode(JOIN_WITH_SOLUTION_SET);
    SingleInputPlanNode worksetReducer = resolver.getNode(NEXT_WORKSET_REDUCER_NAME);
    // iteration preserves partitioning in reducer, so the first partitioning is out of the loop, 
    // the in-loop partitioning is before the final reducer
    // verify joinWithInvariant
    assertEquals(ShipStrategyType.FORWARD, joinWithInvariantNode.getInput1().getShipStrategy());
    assertEquals(ShipStrategyType.PARTITION_HASH, joinWithInvariantNode.getInput2().getShipStrategy());
    assertEquals(list0, joinWithInvariantNode.getKeysForInput1());
    assertEquals(list0, joinWithInvariantNode.getKeysForInput2());
    // verify joinWithSolutionSet
    assertEquals(ShipStrategyType.FORWARD, joinWithSolutionSetNode.getInput1().getShipStrategy());
    assertEquals(ShipStrategyType.FORWARD, joinWithSolutionSetNode.getInput2().getShipStrategy());
    // verify reducer
    assertEquals(ShipStrategyType.PARTITION_HASH, worksetReducer.getInput().getShipStrategy());
    assertEquals(list0, worksetReducer.getKeys(0));
    // verify solution delta
    assertEquals(2, joinWithSolutionSetNode.getOutgoingChannels().size());
    assertEquals(ShipStrategyType.PARTITION_HASH, joinWithSolutionSetNode.getOutgoingChannels().get(0).getShipStrategy());
    assertEquals(ShipStrategyType.PARTITION_HASH, joinWithSolutionSetNode.getOutgoingChannels().get(1).getShipStrategy());
    new JobGraphGenerator().compileJobGraph(oPlan);
}
Also used: DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) JobGraphGenerator(org.apache.flink.optimizer.plantranslate.JobGraphGenerator) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)
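
The final new JobGraphGenerator().compileJobGraph(oPlan) call only verifies that the optimized plan can be translated. As a hedged usage sketch (inspecting the result is not something the test does), the generated JobGraph could also be examined:

// Sketch: translate the optimized plan and inspect the resulting runtime JobGraph
// (org.apache.flink.runtime.jobgraph.JobGraph).
JobGraph jobGraph = new JobGraphGenerator().compileJobGraph(oPlan);
System.out.println(jobGraph.getName() + " with " + jobGraph.getNumberOfVertices() + " vertices");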

Example 40 with OptimizedPlan

Use of org.apache.flink.optimizer.plan.OptimizedPlan in project flink by apache.

The class WorksetIterationsRecordApiCompilerTest, method testRecordApiWithDeferredSoltionSetUpdateWithMapper.

@Test
public void testRecordApiWithDeferredSoltionSetUpdateWithMapper() {
    Plan plan = getTestPlan(false, true);
    OptimizedPlan oPlan;
    try {
        oPlan = compileNoStats(plan);
    } catch (CompilerException ce) {
        ce.printStackTrace();
        fail("The pact compiler is unable to compile this plan correctly.");
        // silence the compiler
        return;
    }
    OptimizerPlanNodeResolver resolver = getOptimizerPlanNodeResolver(oPlan);
    DualInputPlanNode joinWithInvariantNode = resolver.getNode(JOIN_WITH_INVARIANT_NAME);
    DualInputPlanNode joinWithSolutionSetNode = resolver.getNode(JOIN_WITH_SOLUTION_SET);
    SingleInputPlanNode worksetReducer = resolver.getNode(NEXT_WORKSET_REDUCER_NAME);
    SingleInputPlanNode deltaMapper = resolver.getNode(SOLUTION_DELTA_MAPPER_NAME);
    // iteration preserves partitioning in reducer, so the first partitioning is out of the loop, 
    // the in-loop partitioning is before the final reducer
    // verify joinWithInvariant
    assertEquals(ShipStrategyType.FORWARD, joinWithInvariantNode.getInput1().getShipStrategy());
    assertEquals(ShipStrategyType.PARTITION_HASH, joinWithInvariantNode.getInput2().getShipStrategy());
    assertEquals(list0, joinWithInvariantNode.getKeysForInput1());
    assertEquals(list0, joinWithInvariantNode.getKeysForInput2());
    // verify joinWithSolutionSet
    assertEquals(ShipStrategyType.FORWARD, joinWithSolutionSetNode.getInput1().getShipStrategy());
    assertEquals(ShipStrategyType.FORWARD, joinWithSolutionSetNode.getInput2().getShipStrategy());
    // verify reducer
    assertEquals(ShipStrategyType.PARTITION_HASH, worksetReducer.getInput().getShipStrategy());
    assertEquals(list0, worksetReducer.getKeys(0));
    // currently, the system may partition before or after the mapper
    ShipStrategyType ss1 = deltaMapper.getInput().getShipStrategy();
    ShipStrategyType ss2 = deltaMapper.getOutgoingChannels().get(0).getShipStrategy();
    assertTrue((ss1 == ShipStrategyType.FORWARD && ss2 == ShipStrategyType.PARTITION_HASH) || (ss2 == ShipStrategyType.FORWARD && ss1 == ShipStrategyType.PARTITION_HASH));
    new JobGraphGenerator().compileJobGraph(oPlan);
}
Also used: DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) JobGraphGenerator(org.apache.flink.optimizer.plantranslate.JobGraphGenerator) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) ShipStrategyType(org.apache.flink.runtime.operators.shipping.ShipStrategyType) Test(org.junit.Test)
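
The last assertTrue verifies that exactly one of the two channels around the delta mapper performs the hash partitioning. A small hypothetical helper (not in the original test) that states the same condition more directly:

// Hypothetical readability helper: exactly one side must be PARTITION_HASH, the other FORWARD.
private static void assertPartitionedOnExactlyOneSide(ShipStrategyType a, ShipStrategyType b) {
    boolean partitionThenForward = a == ShipStrategyType.PARTITION_HASH && b == ShipStrategyType.FORWARD;
    boolean forwardThenPartition = a == ShipStrategyType.FORWARD && b == ShipStrategyType.PARTITION_HASH;
    assertTrue("Expected hash partitioning on exactly one side of the delta mapper", partitionThenForward || forwardThenPartition);
}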

Aggregations

OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan): 221
Test (org.junit.Test): 197
Plan (org.apache.flink.api.common.Plan): 192
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 183
SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode): 146
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 91
SingleInputPlanNode (org.apache.flink.optimizer.plan.SingleInputPlanNode): 83
DualInputPlanNode (org.apache.flink.optimizer.plan.DualInputPlanNode): 82
JobGraphGenerator (org.apache.flink.optimizer.plantranslate.JobGraphGenerator): 55
Tuple3 (org.apache.flink.api.java.tuple.Tuple3): 54
SourcePlanNode (org.apache.flink.optimizer.plan.SourcePlanNode): 48
DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat): 33
InvalidProgramException (org.apache.flink.api.common.InvalidProgramException): 27
FieldList (org.apache.flink.api.common.operators.util.FieldList): 27
Channel (org.apache.flink.optimizer.plan.Channel): 26
FieldSet (org.apache.flink.api.common.operators.util.FieldSet): 25
GlobalProperties (org.apache.flink.optimizer.dataproperties.GlobalProperties): 25
LocalProperties (org.apache.flink.optimizer.dataproperties.LocalProperties): 25
IdentityMapper (org.apache.flink.optimizer.testfunctions.IdentityMapper): 20
WorksetIterationPlanNode (org.apache.flink.optimizer.plan.WorksetIterationPlanNode): 16