Example 6 with GenericDataSinkBase

Use of org.apache.flink.api.common.operators.GenericDataSinkBase in project flink by apache.

In the class UnionTranslationTest, method translateUnion3SortedGroup.

@Test
public void translateUnion3SortedGroup() {
    try {
        final int parallelism = 4;
        ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(parallelism);
        DataSet<Tuple3<Double, StringValue, LongValue>> dataset1 = getSourceDataSet(env, 2);
        DataSet<Tuple3<Double, StringValue, LongValue>> dataset2 = getSourceDataSet(env, 3);
        DataSet<Tuple3<Double, StringValue, LongValue>> dataset3 = getSourceDataSet(env, -1);
        dataset1.union(dataset2)
                .union(dataset3)
                .groupBy((KeySelector<Tuple3<Double, StringValue, LongValue>, String>) value -> "")
                .sortGroup((KeySelector<Tuple3<Double, StringValue, LongValue>, String>) value -> "", Order.ASCENDING)
                .reduceGroup((GroupReduceFunction<Tuple3<Double, StringValue, LongValue>, String>) (values, out) -> {
                })
                .returns(String.class)
                .output(new DiscardingOutputFormat<>());
        Plan p = env.createProgramPlan();
        // The plan should look like the following one.
        //
        // DataSet1(2) - MapOperator(2)-+
        //                              |- Union(-1) -+
        // DataSet2(3) - MapOperator(3)-+             |- Union(-1) - SingleInputOperator - Sink
        //                                            |
        //             DataSet3(-1) - MapOperator(-1)-+
        GenericDataSinkBase<?> sink = p.getDataSinks().iterator().next();
        Union secondUnionOperator = (Union) ((SingleInputOperator) sink.getInput()).getInput();
        // The first input of the second union should be the first union.
        Union firstUnionOperator = (Union) secondUnionOperator.getFirstInput();
        // The key mapper should be added to the second input stream of the second union.
        assertTrue(secondUnionOperator.getSecondInput() instanceof MapOperatorBase<?, ?, ?>);
        // The key mappers should be added to both of the two input streams for the first union.
        assertTrue(firstUnionOperator.getFirstInput() instanceof MapOperatorBase<?, ?, ?>);
        assertTrue(firstUnionOperator.getSecondInput() instanceof MapOperatorBase<?, ?, ?>);
        // The parallelisms of the key mappers should be equal to those of their inputs.
        // JUnit's assertEquals takes the expected value first, then the actual value.
        assertEquals(2, firstUnionOperator.getFirstInput().getParallelism());
        assertEquals(3, firstUnionOperator.getSecondInput().getParallelism());
        assertEquals(-1, secondUnionOperator.getSecondInput().getParallelism());
        // The union should always have the default parallelism.
        assertEquals(ExecutionConfig.PARALLELISM_DEFAULT, secondUnionOperator.getParallelism());
        assertEquals(ExecutionConfig.PARALLELISM_DEFAULT, firstUnionOperator.getParallelism());
    } catch (Exception e) {
        System.err.println(e.getMessage());
        e.printStackTrace();
        fail("Test caused an error: " + e.getMessage());
    }
}
Also used: KeySelector (org.apache.flink.api.java.functions.KeySelector), Tuple3 (org.apache.flink.api.java.tuple.Tuple3), DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat), LongValue (org.apache.flink.types.LongValue), GroupReduceFunction (org.apache.flink.api.common.functions.GroupReduceFunction), MapOperatorBase (org.apache.flink.api.common.operators.base.MapOperatorBase), Union (org.apache.flink.api.common.operators.Union), Assert.assertTrue (org.junit.Assert.assertTrue), Test (org.junit.Test), SingleInputOperator (org.apache.flink.api.common.operators.SingleInputOperator), DataSet (org.apache.flink.api.java.DataSet), ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment), StringValue (org.apache.flink.types.StringValue), GenericDataSinkBase (org.apache.flink.api.common.operators.GenericDataSinkBase), ExecutionConfig (org.apache.flink.api.common.ExecutionConfig), Plan (org.apache.flink.api.common.Plan), Assert.fail (org.junit.Assert.fail), Order (org.apache.flink.api.common.operators.Order), Assert.assertEquals (org.junit.Assert.assertEquals)
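
The plan shape that translateUnion3SortedGroup asserts can be easier to follow as a small tree walk. The sketch below does not use Flink at all: `PlanShapeSketch`, `Node`, `buildPlan` and `render` are hypothetical names invented for illustration, and the parallelism given to the sink and to the single-input operator is a placeholder, since the test does not check it. It only mirrors the navigation the test performs: start at the sink, step through the single-input operator to the second union, then take that union's first input to reach the first union.

```java
import java.util.Arrays;
import java.util.List;

public class PlanShapeSketch {
    // Same value the plan comment shows for "default" parallelism (-1).
    static final int PARALLELISM_DEFAULT = -1;

    /** Hypothetical operator node; a stand-in, not a Flink class. */
    static final class Node {
        final String kind;
        final int parallelism;
        final List<Node> inputs;

        Node(String kind, int parallelism, Node... inputs) {
            this.kind = kind;
            this.parallelism = parallelism;
            this.inputs = Arrays.asList(inputs);
        }
    }

    /** Builds the graph drawn in the test's comment and returns the sink. */
    static Node buildPlan() {
        Node map1 = new Node("Map", 2, new Node("Source", 2));
        Node map2 = new Node("Map", 3, new Node("Source", 3));
        Node map3 = new Node("Map", PARALLELISM_DEFAULT,
                new Node("Source", PARALLELISM_DEFAULT));
        Node firstUnion = new Node("Union", PARALLELISM_DEFAULT, map1, map2);
        Node secondUnion = new Node("Union", PARALLELISM_DEFAULT, firstUnion, map3);
        // Placeholder parallelism for the next two nodes (not asserted by the test).
        Node reducer = new Node("SingleInput", PARALLELISM_DEFAULT, secondUnion);
        return new Node("Sink", PARALLELISM_DEFAULT, reducer);
    }

    /** Renders a node as kind(parallelism)[input, input, ...]. */
    static String render(Node node) {
        StringBuilder sb = new StringBuilder(node.kind)
                .append('(').append(node.parallelism).append(')');
        if (!node.inputs.isEmpty()) {
            sb.append('[');
            for (int i = 0; i < node.inputs.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(render(node.inputs.get(i)));
            }
            sb.append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node sink = buildPlan();
        // Sink -> single-input operator -> second union, as in the test.
        Node secondUnion = sink.inputs.get(0).inputs.get(0);
        System.out.println(render(secondUnion));
    }
}
```

In this model the mappers keep the parallelism of their sources (2, 3 and -1) while both unions stay at the default, which is the property the assertions in the test check on the real operators.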

Aggregations

GenericDataSinkBase (org.apache.flink.api.common.operators.GenericDataSinkBase): 6
Plan (org.apache.flink.api.common.Plan): 4
MapOperatorBase (org.apache.flink.api.common.operators.base.MapOperatorBase): 4
Union (org.apache.flink.api.common.operators.Union): 3
DataSet (org.apache.flink.api.java.DataSet): 3
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 3
DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat): 3
Tuple3 (org.apache.flink.api.java.tuple.Tuple3): 3
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig): 2
InvalidProgramException (org.apache.flink.api.common.InvalidProgramException): 2
GroupReduceFunction (org.apache.flink.api.common.functions.GroupReduceFunction): 2
Order (org.apache.flink.api.common.operators.Order): 2
SingleInputOperator (org.apache.flink.api.common.operators.SingleInputOperator): 2
DeltaIterationBase (org.apache.flink.api.common.operators.base.DeltaIterationBase): 2
InnerJoinOperatorBase (org.apache.flink.api.common.operators.base.InnerJoinOperatorBase): 2
KeySelector (org.apache.flink.api.java.functions.KeySelector): 2
Test (org.junit.Test): 2
ArrayList (java.util.ArrayList): 1
Iterator (java.util.Iterator): 1
LongSumAggregator (org.apache.flink.api.common.aggregators.LongSumAggregator): 1