
Example 66 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class CommonExecCorrelate method translateToPlanInternal.

@SuppressWarnings("unchecked")
@Override
protected Transformation<RowData> translateToPlanInternal(PlannerBase planner, ExecNodeConfig config) {
    final ExecEdge inputEdge = getInputEdges().get(0);
    final Transformation<RowData> inputTransform = (Transformation<RowData>) inputEdge.translateToPlan(planner);
    final CodeGeneratorContext ctx = new CodeGeneratorContext(config.getTableConfig()).setOperatorBaseClass(operatorBaseClass);
    return CorrelateCodeGenerator.generateCorrelateTransformation(
            config.getTableConfig(),
            ctx,
            inputTransform,
            (RowType) inputEdge.getOutputType(),
            invocation,
            JavaScalaConversionUtil.toScala(Optional.ofNullable(condition)),
            (RowType) getOutputType(),
            joinType,
            inputTransform.getParallelism(),
            retainHeader,
            getClass().getSimpleName(),
            createTransformationMeta(CORRELATE_TRANSFORMATION, config));
}
Also used : RowData(org.apache.flink.table.data.RowData) Transformation(org.apache.flink.api.dag.Transformation) ExecEdge(org.apache.flink.table.planner.plan.nodes.exec.ExecEdge) CodeGeneratorContext(org.apache.flink.table.planner.codegen.CodeGeneratorContext)
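Each of these translateToPlanInternal methods returns a Transformation<RowData>, the node type of Flink's dataflow DAG. As a minimal, hedged sketch (not planner code; the class name and variables below are illustrative), the same Transformation type can be observed from the public DataStream API:

import org.apache.flink.api.dag.Transformation;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TransformationPeek {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Every DataStream is backed by a Transformation node of the DAG;
        // planner methods like translateToPlanInternal construct such nodes
        // directly instead of going through the DataStream API.
        DataStream<String> upper = env.fromElements("a", "b", "c").map(String::toUpperCase);
        Transformation<String> t = upper.getTransformation();
        System.out.println(t.getName() + ", parallelism=" + t.getParallelism());
    }
}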

Example 67 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class CommonExecExpand method translateToPlanInternal.

@SuppressWarnings("unchecked")
@Override
protected Transformation<RowData> translateToPlanInternal(PlannerBase planner, ExecNodeConfig config) {
    final ExecEdge inputEdge = getInputEdges().get(0);
    final Transformation<RowData> inputTransform = (Transformation<RowData>) inputEdge.translateToPlan(planner);
    final CodeGenOperatorFactory<RowData> operatorFactory =
            ExpandCodeGenerator.generateExpandOperator(
                    new CodeGeneratorContext(config.getTableConfig()),
                    (RowType) inputEdge.getOutputType(),
                    (RowType) getOutputType(),
                    projects,
                    retainHeader,
                    getClass().getSimpleName());
    return ExecNodeUtil.createOneInputTransformation(
            inputTransform,
            createTransformationMeta(EXPAND_TRANSFORMATION, config),
            operatorFactory,
            InternalTypeInfo.of(getOutputType()),
            inputTransform.getParallelism());
}
Also used : RowData(org.apache.flink.table.data.RowData) Transformation(org.apache.flink.api.dag.Transformation) ExecEdge(org.apache.flink.table.planner.plan.nodes.exec.ExecEdge) CodeGeneratorContext(org.apache.flink.table.planner.codegen.CodeGeneratorContext)
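ExecNodeUtil.createOneInputTransformation, used above, essentially wires an operator (factory) and its input into a OneInputTransformation node. A hedged sketch of that shape using only public constructors; the StreamMap operator stands in for the generated expand operator, and all names here are illustrative:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.dag.Transformation;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.operators.StreamMap;
import org.apache.flink.streaming.api.transformations.OneInputTransformation;

public class OneInputSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        Transformation<String> input = env.fromElements("a", "bb").getTransformation();
        // Wrap an operator and its input into a new DAG node -- the same
        // shape the planner produces for the generated expand operator.
        OneInputTransformation<String, Integer> lengths =
                new OneInputTransformation<>(
                        input,
                        "lengths",                                   // node name
                        new StreamMap<String, Integer>(String::length), // stand-in operator
                        Types.INT,                                   // output type info
                        input.getParallelism());
        env.addOperator(lengths);
    }
}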

Example 68 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class CommonExecSink method applyUpsertMaterialize.

private Transformation<RowData> applyUpsertMaterialize(
        Transformation<RowData> inputTransform,
        int[] primaryKeys,
        int sinkParallelism,
        ReadableConfig config,
        RowType physicalRowType) {
    GeneratedRecordEqualiser equaliser =
            new EqualiserCodeGenerator(physicalRowType)
                    .generateRecordEqualiser("SinkMaterializeEqualiser");
    SinkUpsertMaterializer operator =
            new SinkUpsertMaterializer(
                    StateConfigUtil.createTtlConfig(
                            config.get(ExecutionConfigOptions.IDLE_STATE_RETENTION).toMillis()),
                    InternalSerializers.create(physicalRowType),
                    equaliser);
    final String[] fieldNames = physicalRowType.getFieldNames().toArray(new String[0]);
    final List<String> pkFieldNames =
            Arrays.stream(primaryKeys)
                    .mapToObj(idx -> fieldNames[idx])
                    .collect(Collectors.toList());
    OneInputTransformation<RowData, RowData> materializeTransform =
            ExecNodeUtil.createOneInputTransformation(
                    inputTransform,
                    createTransformationMeta(
                            UPSERT_MATERIALIZE_TRANSFORMATION,
                            String.format("SinkMaterializer(pk=[%s])", String.join(", ", pkFieldNames)),
                            "SinkMaterializer",
                            config),
                    operator,
                    inputTransform.getOutputType(),
                    sinkParallelism);
    RowDataKeySelector keySelector = KeySelectorUtil.getRowDataSelector(primaryKeys, InternalTypeInfo.of(physicalRowType));
    materializeTransform.setStateKeySelector(keySelector);
    materializeTransform.setStateKeyType(keySelector.getProducedType());
    return materializeTransform;
}
Also used : TransformationMetadata(org.apache.flink.table.planner.plan.nodes.exec.utils.TransformationMetadata) Arrays(java.util.Arrays) InputProperty(org.apache.flink.table.planner.plan.nodes.exec.InputProperty) ResolvedSchema(org.apache.flink.table.catalog.ResolvedSchema) ExecNode(org.apache.flink.table.planner.plan.nodes.exec.ExecNode) CharType(org.apache.flink.table.types.logical.CharType) KeySelectorUtil(org.apache.flink.table.planner.plan.utils.KeySelectorUtil) InternalSerializers(org.apache.flink.table.runtime.typeutils.InternalSerializers) ConstraintEnforcer(org.apache.flink.table.runtime.operators.sink.ConstraintEnforcer) OutputFormat(org.apache.flink.api.common.io.OutputFormat) SinkUpsertMaterializer(org.apache.flink.table.runtime.operators.sink.SinkUpsertMaterializer) PartitionTransformation(org.apache.flink.streaming.api.transformations.PartitionTransformation) SinkFunction(org.apache.flink.streaming.api.functions.sink.SinkFunction) DynamicTableSink(org.apache.flink.table.connector.sink.DynamicTableSink) StateConfigUtil(org.apache.flink.table.runtime.util.StateConfigUtil) RowDataKeySelector(org.apache.flink.table.runtime.keyselector.RowDataKeySelector) Collectors(java.util.stream.Collectors) JsonProperty(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.annotation.JsonProperty) SimpleOperatorFactory(org.apache.flink.streaming.api.operators.SimpleOperatorFactory) List(java.util.List) InternalTypeInfo(org.apache.flink.table.runtime.typeutils.InternalTypeInfo) LogicalType(org.apache.flink.table.types.logical.LogicalType) DataStreamSinkProvider(org.apache.flink.table.connector.sink.DataStreamSinkProvider) LegacySinkTransformation(org.apache.flink.streaming.api.transformations.LegacySinkTransformation) Optional(java.util.Optional) ExecNodeBase(org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase) KeyGroupRangeAssignment(org.apache.flink.runtime.state.KeyGroupRangeAssignment) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) SinkRuntimeProvider(org.apache.flink.table.connector.sink.DynamicTableSink.SinkRuntimeProvider) IntStream(java.util.stream.IntStream) BinaryType(org.apache.flink.table.types.logical.BinaryType) ParallelismProvider(org.apache.flink.table.connector.ParallelismProvider) ChangelogMode(org.apache.flink.table.connector.ChangelogMode) MultipleTransformationTranslator(org.apache.flink.table.planner.plan.nodes.exec.MultipleTransformationTranslator) StreamRecordTimestampInserter(org.apache.flink.table.runtime.operators.sink.StreamRecordTimestampInserter) TransformationSinkProvider(org.apache.flink.table.planner.connectors.TransformationSinkProvider) RowType(org.apache.flink.table.types.logical.RowType) ArrayList(java.util.ArrayList) SinkV2Provider(org.apache.flink.table.connector.sink.SinkV2Provider) OutputFormatSinkFunction(org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction) DynamicTableSinkSpec(org.apache.flink.table.planner.plan.nodes.exec.spec.DynamicTableSinkSpec) ExecNodeUtil(org.apache.flink.table.planner.plan.nodes.exec.utils.ExecNodeUtil) ReadableConfig(org.apache.flink.configuration.ReadableConfig) SinkFunctionProvider(org.apache.flink.table.connector.sink.SinkFunctionProvider) DataStreamSink(org.apache.flink.streaming.api.datastream.DataStreamSink) ExecNodeContext(org.apache.flink.table.planner.plan.nodes.exec.ExecNodeContext) RowData(org.apache.flink.table.data.RowData) ProviderContext(org.apache.flink.table.connector.ProviderContext) SinkOperator(org.apache.flink.table.runtime.operators.sink.SinkOperator) TableException(org.apache.flink.table.api.TableException) SinkProvider(org.apache.flink.table.connector.sink.SinkProvider) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) DataStream(org.apache.flink.streaming.api.datastream.DataStream) OutputFormatProvider(org.apache.flink.table.connector.sink.OutputFormatProvider) KeyGroupStreamPartitioner(org.apache.flink.streaming.runtime.partitioner.KeyGroupStreamPartitioner) EqualiserCodeGenerator(org.apache.flink.table.planner.codegen.EqualiserCodeGenerator) SinkRuntimeProviderContext(org.apache.flink.table.runtime.connector.sink.SinkRuntimeProviderContext) RowKind(org.apache.flink.types.RowKind) StreamExecNode(org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecNode) GeneratedRecordEqualiser(org.apache.flink.table.runtime.generated.GeneratedRecordEqualiser) Transformation(org.apache.flink.api.dag.Transformation) InputTypeConfigurable(org.apache.flink.api.java.typeutils.InputTypeConfigurable) ExecutionConfigOptions(org.apache.flink.table.api.config.ExecutionConfigOptions) LogicalTypeRoot(org.apache.flink.table.types.logical.LogicalTypeRoot) LogicalTypeChecks(org.apache.flink.table.types.logical.utils.LogicalTypeChecks)
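The two setState* calls at the end of applyUpsertMaterialize are what make the materializer a keyed, stateful node: the runtime uses the key selector and key type to partition operator state by primary key. A hedged, self-contained sketch of that wiring on a plain String stream (no planner classes; all names are illustrative):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.dag.Transformation;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.operators.StreamMap;
import org.apache.flink.streaming.api.transformations.OneInputTransformation;

public class KeyedStateWiring {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        Transformation<String> input = env.fromElements("a", "b", "a").getTransformation();
        OneInputTransformation<String, String> op =
                new OneInputTransformation<>(
                        input, "stateful-op", new StreamMap<String, String>(s -> s), Types.STRING, 1);
        // Same pattern as materializeTransform above: declare how records are
        // keyed so the operator's state is partitioned consistently.
        KeySelector<String, String> key = value -> value;
        op.setStateKeySelector(key);
        op.setStateKeyType(Types.STRING);
        env.addOperator(op);
    }
}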

Example 69 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class CommonExecSink method applySinkProvider.

private Transformation<?> applySinkProvider(
        Transformation<RowData> inputTransform,
        StreamExecutionEnvironment env,
        SinkRuntimeProvider runtimeProvider,
        int rowtimeFieldIndex,
        int sinkParallelism,
        ReadableConfig config) {
    TransformationMetadata sinkMeta = createTransformationMeta(SINK_TRANSFORMATION, config);
    if (runtimeProvider instanceof DataStreamSinkProvider) {
        Transformation<RowData> sinkTransformation = applyRowtimeTransformation(inputTransform, rowtimeFieldIndex, sinkParallelism, config);
        final DataStream<RowData> dataStream = new DataStream<>(env, sinkTransformation);
        final DataStreamSinkProvider provider = (DataStreamSinkProvider) runtimeProvider;
        return provider.consumeDataStream(createProviderContext(), dataStream).getTransformation();
    } else if (runtimeProvider instanceof TransformationSinkProvider) {
        final TransformationSinkProvider provider = (TransformationSinkProvider) runtimeProvider;
        return provider.createTransformation(new TransformationSinkProvider.Context() {

            @Override
            public Transformation<RowData> getInputTransformation() {
                return inputTransform;
            }

            @Override
            public int getRowtimeIndex() {
                return rowtimeFieldIndex;
            }

            @Override
            public Optional<String> generateUid(String name) {
                return createProviderContext().generateUid(name);
            }
        });
    } else if (runtimeProvider instanceof SinkFunctionProvider) {
        final SinkFunction<RowData> sinkFunction = ((SinkFunctionProvider) runtimeProvider).createSinkFunction();
        return createSinkFunctionTransformation(sinkFunction, env, inputTransform, rowtimeFieldIndex, sinkMeta, sinkParallelism);
    } else if (runtimeProvider instanceof OutputFormatProvider) {
        OutputFormat<RowData> outputFormat = ((OutputFormatProvider) runtimeProvider).createOutputFormat();
        final SinkFunction<RowData> sinkFunction = new OutputFormatSinkFunction<>(outputFormat);
        return createSinkFunctionTransformation(sinkFunction, env, inputTransform, rowtimeFieldIndex, sinkMeta, sinkParallelism);
    } else if (runtimeProvider instanceof SinkProvider) {
        Transformation<RowData> sinkTransformation = applyRowtimeTransformation(inputTransform, rowtimeFieldIndex, sinkParallelism, config);
        final DataStream<RowData> dataStream = new DataStream<>(env, sinkTransformation);
        final Transformation<?> transformation =
                DataStreamSink.forSinkV1(dataStream, ((SinkProvider) runtimeProvider).createSink())
                        .getTransformation();
        transformation.setParallelism(sinkParallelism);
        sinkMeta.fill(transformation);
        return transformation;
    } else if (runtimeProvider instanceof SinkV2Provider) {
        Transformation<RowData> sinkTransformation = applyRowtimeTransformation(inputTransform, rowtimeFieldIndex, sinkParallelism, config);
        final DataStream<RowData> dataStream = new DataStream<>(env, sinkTransformation);
        final Transformation<?> transformation =
                DataStreamSink.forSink(dataStream, ((SinkV2Provider) runtimeProvider).createSink())
                        .getTransformation();
        transformation.setParallelism(sinkParallelism);
        sinkMeta.fill(transformation);
        return transformation;
    } else {
        throw new TableException("Unsupported sink runtime provider.");
    }
}
Also used : ExecNodeContext(org.apache.flink.table.planner.plan.nodes.exec.ExecNodeContext) ProviderContext(org.apache.flink.table.connector.ProviderContext) SinkRuntimeProviderContext(org.apache.flink.table.runtime.connector.sink.SinkRuntimeProviderContext) TransformationMetadata(org.apache.flink.table.planner.plan.nodes.exec.utils.TransformationMetadata) PartitionTransformation(org.apache.flink.streaming.api.transformations.PartitionTransformation) LegacySinkTransformation(org.apache.flink.streaming.api.transformations.LegacySinkTransformation) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Transformation(org.apache.flink.api.dag.Transformation) TableException(org.apache.flink.table.api.TableException) DataStream(org.apache.flink.streaming.api.datastream.DataStream) DataStreamSinkProvider(org.apache.flink.table.connector.sink.DataStreamSinkProvider) OutputFormat(org.apache.flink.api.common.io.OutputFormat) TransformationSinkProvider(org.apache.flink.table.planner.connectors.TransformationSinkProvider) SinkProvider(org.apache.flink.table.connector.sink.SinkProvider) SinkFunctionProvider(org.apache.flink.table.connector.sink.SinkFunctionProvider) RowData(org.apache.flink.table.data.RowData) SinkFunction(org.apache.flink.streaming.api.functions.sink.SinkFunction) OutputFormatSinkFunction(org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction) SinkV2Provider(org.apache.flink.table.connector.sink.SinkV2Provider) OutputFormatProvider(org.apache.flink.table.connector.sink.OutputFormatProvider)
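In the SinkFunctionProvider branch, the provider's SinkFunction ends up wrapped in a sink transformation at the tail of the DAG. A hedged sketch of the analogous shape through the public API, with PrintSinkFunction standing in for the provider's function (illustrative names throughout):

import org.apache.flink.api.dag.Transformation;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;

public class SinkTransformationSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> input = env.fromElements("a", "b");
        // addSink wraps the SinkFunction into a (legacy) sink transformation,
        // analogous to createSinkFunctionTransformation above.
        DataStreamSink<String> sink = input.addSink(new PrintSinkFunction<>());
        Transformation<?> t = sink.getTransformation();
        t.setParallelism(1); // applySinkProvider sets the derived sink parallelism
    }
}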

Example 70 with Transformation

use of org.apache.flink.api.dag.Transformation in project flink by apache.

the class CommonExecSink method createSinkTransformation.

@SuppressWarnings("unchecked")
protected Transformation<Object> createSinkTransformation(
        StreamExecutionEnvironment streamExecEnv,
        ReadableConfig config,
        Transformation<RowData> inputTransform,
        DynamicTableSink tableSink,
        int rowtimeFieldIndex,
        boolean upsertMaterialize) {
    final ResolvedSchema schema = tableSinkSpec.getContextResolvedTable().getResolvedSchema();
    final SinkRuntimeProvider runtimeProvider = tableSink.getSinkRuntimeProvider(new SinkRuntimeProviderContext(isBounded));
    final RowType physicalRowType = getPhysicalRowType(schema);
    final int[] primaryKeys = getPrimaryKeyIndices(physicalRowType, schema);
    final int sinkParallelism = deriveSinkParallelism(inputTransform, runtimeProvider);
    final int inputParallelism = inputTransform.getParallelism();
    final boolean inputInsertOnly = inputChangelogMode.containsOnly(RowKind.INSERT);
    final boolean hasPk = primaryKeys.length > 0;
    if (!inputInsertOnly && sinkParallelism != inputParallelism && !hasPk) {
        throw new TableException(String.format("The sink for table '%s' has a configured parallelism of %s, while the input parallelism is %s. " + "Since the configured parallelism is different from the input's parallelism and " + "the changelog mode is not insert-only, a primary key is required but could not " + "be found.", tableSinkSpec.getContextResolvedTable().getIdentifier().asSummaryString(), sinkParallelism, inputParallelism));
    }
    // only add materialization if the input contains changes (not insert-only)
    final boolean needMaterialization = !inputInsertOnly && upsertMaterialize;
    Transformation<RowData> sinkTransform = applyConstraintValidations(inputTransform, config, physicalRowType);
    if (hasPk) {
        sinkTransform = applyKeyBy(config, sinkTransform, primaryKeys, sinkParallelism, inputParallelism, inputInsertOnly, needMaterialization);
    }
    if (needMaterialization) {
        sinkTransform = applyUpsertMaterialize(sinkTransform, primaryKeys, sinkParallelism, config, physicalRowType);
    }
    return (Transformation<Object>) applySinkProvider(sinkTransform, streamExecEnv, runtimeProvider, rowtimeFieldIndex, sinkParallelism, config);
}
Also used : SinkRuntimeProviderContext(org.apache.flink.table.runtime.connector.sink.SinkRuntimeProviderContext) TableException(org.apache.flink.table.api.TableException) RowData(org.apache.flink.table.data.RowData) PartitionTransformation(org.apache.flink.streaming.api.transformations.PartitionTransformation) LegacySinkTransformation(org.apache.flink.streaming.api.transformations.LegacySinkTransformation) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) Transformation(org.apache.flink.api.dag.Transformation) RowType(org.apache.flink.table.types.logical.RowType) ResolvedSchema(org.apache.flink.table.catalog.ResolvedSchema) SinkRuntimeProvider(org.apache.flink.table.connector.sink.DynamicTableSink.SinkRuntimeProvider)
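The applyKeyBy step exists because, with a primary key and a changelog that is not insert-only, all records for the same key must reach the same sink subtask (and the same materializer state). At the Transformation level such a repartition is a PartitionTransformation carrying a KeyGroupStreamPartitioner; a hedged sketch on a plain String stream (illustrative names, not the planner's applyKeyBy):

import org.apache.flink.api.dag.Transformation;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.runtime.state.KeyGroupRangeAssignment;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.transformations.PartitionTransformation;
import org.apache.flink.streaming.runtime.partitioner.KeyGroupStreamPartitioner;

public class KeyByAtTransformationLevel {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        Transformation<String> input = env.fromElements("a", "b", "a").getTransformation();
        KeySelector<String, String> key = value -> value;
        // Route equal keys to the same downstream subtask, as applyKeyBy does
        // for the primary-key columns before a non-insert-only sink.
        Transformation<String> partitioned =
                new PartitionTransformation<>(
                        input,
                        new KeyGroupStreamPartitioner<>(
                                key, KeyGroupRangeAssignment.DEFAULT_LOWER_BOUND_MAX_PARALLELISM));
        partitioned.setParallelism(2);
    }
}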

Aggregations

Transformation (org.apache.flink.api.dag.Transformation): 98
RowData (org.apache.flink.table.data.RowData): 69
ExecEdge (org.apache.flink.table.planner.plan.nodes.exec.ExecEdge): 53
RowType (org.apache.flink.table.types.logical.RowType): 50
OneInputTransformation (org.apache.flink.streaming.api.transformations.OneInputTransformation): 45
TableException (org.apache.flink.table.api.TableException): 28
RowDataKeySelector (org.apache.flink.table.runtime.keyselector.RowDataKeySelector): 28
ArrayList (java.util.ArrayList): 25
CodeGeneratorContext (org.apache.flink.table.planner.codegen.CodeGeneratorContext): 21
Configuration (org.apache.flink.configuration.Configuration): 19
TwoInputTransformation (org.apache.flink.streaming.api.transformations.TwoInputTransformation): 18
List (java.util.List): 17
PartitionTransformation (org.apache.flink.streaming.api.transformations.PartitionTransformation): 17
AggregateInfoList (org.apache.flink.table.planner.plan.utils.AggregateInfoList): 17
LogicalType (org.apache.flink.table.types.logical.LogicalType): 16
Test (org.junit.Test): 16
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 13
SourceTransformation (org.apache.flink.streaming.api.transformations.SourceTransformation): 13
Arrays (java.util.Arrays): 11
Collections (java.util.Collections): 10