Search in sources :

Example 1 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class PythonOperatorChainingOptimizer method apply.

/**
 * Perform chaining optimization. It will iterate the transformations defined in the given
 * StreamExecutionEnvironment and update them with the chained transformations.
 */
@SuppressWarnings("unchecked")
public static void apply(StreamExecutionEnvironment env) throws Exception {
    if (env.getConfiguration().get(PythonOptions.PYTHON_OPERATOR_CHAINING_ENABLED)) {
        final Field transformationsField = StreamExecutionEnvironment.class.getDeclaredField("transformations");
        transformationsField.setAccessible(true);
        final List<Transformation<?>> transformations = (List<Transformation<?>>) transformationsField.get(env);
        transformationsField.set(env, optimize(transformations));
    }
}
Also used : Field(java.lang.reflect.Field) FeedbackTransformation(org.apache.flink.streaming.api.transformations.FeedbackTransformation) ReduceTransformation(org.apache.flink.streaming.api.transformations.ReduceTransformation) TimestampsAndWatermarksTransformation(org.apache.flink.streaming.api.transformations.TimestampsAndWatermarksTransformation) AbstractMultipleInputTransformation(org.apache.flink.streaming.api.transformations.AbstractMultipleInputTransformation) PhysicalTransformation(org.apache.flink.streaming.api.transformations.PhysicalTransformation) SinkTransformation(org.apache.flink.streaming.api.transformations.SinkTransformation) PartitionTransformation(org.apache.flink.streaming.api.transformations.PartitionTransformation) TwoInputTransformation(org.apache.flink.streaming.api.transformations.TwoInputTransformation) UnionTransformation(org.apache.flink.streaming.api.transformations.UnionTransformation) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) SideOutputTransformation(org.apache.flink.streaming.api.transformations.SideOutputTransformation) LegacySinkTransformation(org.apache.flink.streaming.api.transformations.LegacySinkTransformation) Transformation(org.apache.flink.api.dag.Transformation) AbstractBroadcastStateTransformation(org.apache.flink.streaming.api.transformations.AbstractBroadcastStateTransformation) ArrayList(java.util.ArrayList) List(java.util.List)

Example 2 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class PythonOperatorChainingOptimizer method apply.

/**
 * Perform chaining optimization. It will iterate the transformations defined in the given
 * StreamExecutionEnvironment and update them with the chained transformations. Besides, it will
 * return the transformation after chaining optimization for the given transformation.
 */
@SuppressWarnings("unchecked")
public static Transformation<?> apply(StreamExecutionEnvironment env, Transformation<?> transformation) throws Exception {
    if (env.getConfiguration().get(PythonOptions.PYTHON_OPERATOR_CHAINING_ENABLED)) {
        final Field transformationsField = StreamExecutionEnvironment.class.getDeclaredField("transformations");
        transformationsField.setAccessible(true);
        final List<Transformation<?>> transformations = (List<Transformation<?>>) transformationsField.get(env);
        final Tuple2<List<Transformation<?>>, Transformation<?>> resultTuple = optimize(transformations, transformation);
        transformationsField.set(env, resultTuple.f0);
        return resultTuple.f1;
    } else {
        return transformation;
    }
}
Also used : Field(java.lang.reflect.Field) FeedbackTransformation(org.apache.flink.streaming.api.transformations.FeedbackTransformation) ReduceTransformation(org.apache.flink.streaming.api.transformations.ReduceTransformation) TimestampsAndWatermarksTransformation(org.apache.flink.streaming.api.transformations.TimestampsAndWatermarksTransformation) AbstractMultipleInputTransformation(org.apache.flink.streaming.api.transformations.AbstractMultipleInputTransformation) PhysicalTransformation(org.apache.flink.streaming.api.transformations.PhysicalTransformation) SinkTransformation(org.apache.flink.streaming.api.transformations.SinkTransformation) PartitionTransformation(org.apache.flink.streaming.api.transformations.PartitionTransformation) TwoInputTransformation(org.apache.flink.streaming.api.transformations.TwoInputTransformation) UnionTransformation(org.apache.flink.streaming.api.transformations.UnionTransformation) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) SideOutputTransformation(org.apache.flink.streaming.api.transformations.SideOutputTransformation) LegacySinkTransformation(org.apache.flink.streaming.api.transformations.LegacySinkTransformation) Transformation(org.apache.flink.api.dag.Transformation) AbstractBroadcastStateTransformation(org.apache.flink.streaming.api.transformations.AbstractBroadcastStateTransformation) ArrayList(java.util.ArrayList) List(java.util.List)

Example 3 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class PythonOperatorChainingOptimizerTest method testChainingMultipleOperators.

@Test
public void testChainingMultipleOperators() {
    PythonKeyedProcessOperator<?> keyedProcessOperator = createKeyedProcessOperator("f1", new RowTypeInfo(Types.INT(), Types.INT()), Types.STRING());
    PythonProcessOperator<?, ?> processOperator1 = createProcessOperator("f2", Types.STRING(), Types.LONG());
    PythonProcessOperator<?, ?> processOperator2 = createProcessOperator("f3", Types.LONG(), Types.INT());
    Transformation<?> sourceTransformation = mock(SourceTransformation.class);
    OneInputTransformation<?, ?> keyedProcessTransformation = new OneInputTransformation(sourceTransformation, "keyedProcess", keyedProcessOperator, keyedProcessOperator.getProducedType(), 2);
    Transformation<?> processTransformation1 = new OneInputTransformation(keyedProcessTransformation, "process", processOperator1, processOperator1.getProducedType(), 2);
    Transformation<?> processTransformation2 = new OneInputTransformation(processTransformation1, "process", processOperator2, processOperator2.getProducedType(), 2);
    List<Transformation<?>> transformations = new ArrayList<>();
    transformations.add(sourceTransformation);
    transformations.add(keyedProcessTransformation);
    transformations.add(processTransformation1);
    transformations.add(processTransformation2);
    List<Transformation<?>> optimized = PythonOperatorChainingOptimizer.optimize(transformations);
    assertEquals(2, optimized.size());
    OneInputTransformation<?, ?> chainedTransformation = (OneInputTransformation<?, ?>) optimized.get(1);
    assertEquals(sourceTransformation.getOutputType(), chainedTransformation.getInputType());
    assertEquals(processOperator2.getProducedType(), chainedTransformation.getOutputType());
    OneInputStreamOperator<?, ?> chainedOperator = chainedTransformation.getOperator();
    assertTrue(chainedOperator instanceof PythonKeyedProcessOperator);
    validateChainedPythonFunctions(((PythonKeyedProcessOperator<?>) chainedOperator).getPythonFunctionInfo(), "f3", "f2", "f1");
}
Also used : SourceTransformation(org.apache.flink.streaming.api.transformations.SourceTransformation) TwoInputTransformation(org.apache.flink.streaming.api.transformations.TwoInputTransformation) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Transformation(org.apache.flink.api.dag.Transformation) PythonKeyedProcessOperator(org.apache.flink.streaming.api.operators.python.PythonKeyedProcessOperator) ArrayList(java.util.ArrayList) RowTypeInfo(org.apache.flink.api.java.typeutils.RowTypeInfo) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Test(org.junit.Test)

Example 4 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class PythonOperatorChainingOptimizerTest method testContinuousKeyedOperators.

@Test
public void testContinuousKeyedOperators() {
    PythonKeyedProcessOperator<?> keyedProcessOperator1 = createKeyedProcessOperator("f1", new RowTypeInfo(Types.INT(), Types.INT()), new RowTypeInfo(Types.INT(), Types.INT()));
    PythonKeyedProcessOperator<?> keyedProcessOperator2 = createKeyedProcessOperator("f2", new RowTypeInfo(Types.INT(), Types.INT()), Types.STRING());
    Transformation<?> sourceTransformation = mock(SourceTransformation.class);
    OneInputTransformation<?, ?> processTransformation1 = new OneInputTransformation(sourceTransformation, "KeyedProcess1", keyedProcessOperator1, keyedProcessOperator1.getProducedType(), 2);
    OneInputTransformation<?, ?> processTransformation2 = new OneInputTransformation(processTransformation1, "KeyedProcess2", keyedProcessOperator2, keyedProcessOperator2.getProducedType(), 2);
    List<Transformation<?>> transformations = new ArrayList<>();
    transformations.add(sourceTransformation);
    transformations.add(processTransformation1);
    transformations.add(processTransformation2);
    List<Transformation<?>> optimized = PythonOperatorChainingOptimizer.optimize(transformations);
    assertEquals(3, optimized.size());
    assertEquals(processTransformation1, optimized.get(1));
    assertEquals(processTransformation2, optimized.get(2));
}
Also used : SourceTransformation(org.apache.flink.streaming.api.transformations.SourceTransformation) TwoInputTransformation(org.apache.flink.streaming.api.transformations.TwoInputTransformation) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Transformation(org.apache.flink.api.dag.Transformation) ArrayList(java.util.ArrayList) RowTypeInfo(org.apache.flink.api.java.typeutils.RowTypeInfo) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Test(org.junit.Test)

Example 5 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class PythonOperatorChainingOptimizerTest method testSingleTransformation.

@Test
public void testSingleTransformation() {
    PythonKeyedProcessOperator<?> keyedProcessOperator = createKeyedProcessOperator("f1", new RowTypeInfo(Types.INT(), Types.INT()), Types.STRING());
    PythonProcessOperator<?, ?> processOperator1 = createProcessOperator("f2", Types.STRING(), Types.LONG());
    PythonProcessOperator<?, ?> processOperator2 = createProcessOperator("f3", Types.LONG(), Types.INT());
    Transformation<?> sourceTransformation = mock(SourceTransformation.class);
    OneInputTransformation<?, ?> keyedProcessTransformation = new OneInputTransformation(sourceTransformation, "keyedProcess", keyedProcessOperator, keyedProcessOperator.getProducedType(), 2);
    Transformation<?> processTransformation1 = new OneInputTransformation(keyedProcessTransformation, "process", processOperator1, processOperator1.getProducedType(), 2);
    Transformation<?> processTransformation2 = new OneInputTransformation(processTransformation1, "process", processOperator2, processOperator2.getProducedType(), 2);
    List<Transformation<?>> transformations = new ArrayList<>();
    transformations.add(processTransformation2);
    List<Transformation<?>> optimized = PythonOperatorChainingOptimizer.optimize(transformations);
    assertEquals(2, optimized.size());
    OneInputTransformation<?, ?> chainedTransformation = (OneInputTransformation<?, ?>) optimized.get(0);
    assertEquals(sourceTransformation.getOutputType(), chainedTransformation.getInputType());
    assertEquals(processOperator2.getProducedType(), chainedTransformation.getOutputType());
    OneInputStreamOperator<?, ?> chainedOperator = chainedTransformation.getOperator();
    assertTrue(chainedOperator instanceof PythonKeyedProcessOperator);
    validateChainedPythonFunctions(((PythonKeyedProcessOperator<?>) chainedOperator).getPythonFunctionInfo(), "f3", "f2", "f1");
}
Also used : SourceTransformation(org.apache.flink.streaming.api.transformations.SourceTransformation) TwoInputTransformation(org.apache.flink.streaming.api.transformations.TwoInputTransformation) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Transformation(org.apache.flink.api.dag.Transformation) PythonKeyedProcessOperator(org.apache.flink.streaming.api.operators.python.PythonKeyedProcessOperator) ArrayList(java.util.ArrayList) RowTypeInfo(org.apache.flink.api.java.typeutils.RowTypeInfo) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Test(org.junit.Test)

Aggregations

Transformation (org.apache.flink.api.dag.Transformation)98 RowData (org.apache.flink.table.data.RowData)69 ExecEdge (org.apache.flink.table.planner.plan.nodes.exec.ExecEdge)53 RowType (org.apache.flink.table.types.logical.RowType)50 OneInputTransformation (org.apache.flink.streaming.api.transformations.OneInputTransformation)45 TableException (org.apache.flink.table.api.TableException)28 RowDataKeySelector (org.apache.flink.table.runtime.keyselector.RowDataKeySelector)28 ArrayList (java.util.ArrayList)25 CodeGeneratorContext (org.apache.flink.table.planner.codegen.CodeGeneratorContext)21 Configuration (org.apache.flink.configuration.Configuration)19 TwoInputTransformation (org.apache.flink.streaming.api.transformations.TwoInputTransformation)18 List (java.util.List)17 PartitionTransformation (org.apache.flink.streaming.api.transformations.PartitionTransformation)17 AggregateInfoList (org.apache.flink.table.planner.plan.utils.AggregateInfoList)17 LogicalType (org.apache.flink.table.types.logical.LogicalType)16 Test (org.junit.Test)16 StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment)13 SourceTransformation (org.apache.flink.streaming.api.transformations.SourceTransformation)13 Arrays (java.util.Arrays)11 Collections (java.util.Collections)10