
Example 1 with DataStreamScanProvider

Use of org.apache.flink.table.connector.source.DataStreamScanProvider in project flink by apache.

From the class UpsertKafkaDynamicTableFactoryTest, the method assertKafkaSource:

private void assertKafkaSource(ScanTableSource.ScanRuntimeProvider provider) {
    assertThat(provider, instanceOf(DataStreamScanProvider.class));
    final DataStreamScanProvider dataStreamScanProvider = (DataStreamScanProvider) provider;
    // Materialize the provider's stream against a local environment and inspect
    // the underlying transformation.
    final Transformation<RowData> transformation =
            dataStreamScanProvider
                    .produceDataStream(n -> Optional.empty(), StreamExecutionEnvironment.createLocalEnvironment())
                    .getTransformation();
    assertThat(transformation, instanceOf(SourceTransformation.class));
    final SourceTransformation<RowData, KafkaPartitionSplit, KafkaSourceEnumState> sourceTransformation =
            (SourceTransformation<RowData, KafkaPartitionSplit, KafkaSourceEnumState>) transformation;
    // The transformation must be backed by the unified KafkaSource.
    assertThat(sourceTransformation.getSource(), instanceOf(KafkaSource.class));
}
Also used : DataType(org.apache.flink.table.types.DataType) AtomicDataType(org.apache.flink.table.types.AtomicDataType) Arrays(java.util.Arrays) ResolvedSchema(org.apache.flink.table.catalog.ResolvedSchema) SourceTransformation(org.apache.flink.streaming.api.transformations.SourceTransformation) DataStreamScanProvider(org.apache.flink.table.connector.source.DataStreamScanProvider) CoreMatchers.instanceOf(org.hamcrest.CoreMatchers.instanceOf) DecodingFormat(org.apache.flink.table.connector.format.DecodingFormat) Map(java.util.Map) TestLogger(org.apache.flink.util.TestLogger) FactoryMocks.createTableSink(org.apache.flink.table.factories.utils.FactoryMocks.createTableSink) ConfluentRegistryAvroSerializationSchema(org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroSerializationSchema) DynamicTableSource(org.apache.flink.table.connector.source.DynamicTableSource) DynamicTableSink(org.apache.flink.table.connector.sink.DynamicTableSink) FlinkMatchers.containsCause(org.apache.flink.core.testutils.FlinkMatchers.containsCause) AVRO_CONFLUENT(org.apache.flink.streaming.connectors.kafka.table.KafkaConnectorOptionsUtil.AVRO_CONFLUENT) AvroRowDataSerializationSchema(org.apache.flink.formats.avro.AvroRowDataSerializationSchema) FactoryUtil(org.apache.flink.table.factories.FactoryUtil) DataStreamSinkProvider(org.apache.flink.table.connector.sink.DataStreamSinkProvider) ValidationException(org.apache.flink.table.api.ValidationException) Optional(java.util.Optional) ScanRuntimeProviderContext(org.apache.flink.table.runtime.connector.source.ScanRuntimeProviderContext) SerializationSchema(org.apache.flink.api.common.serialization.SerializationSchema) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) TestFormatFactory(org.apache.flink.table.factories.TestFormatFactory) DeliveryGuarantee(org.apache.flink.connector.base.DeliveryGuarantee) EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) Sink(org.apache.flink.api.connector.sink2.Sink) ChangelogMode(org.apache.flink.table.connector.ChangelogMode) StreamOperatorFactory(org.apache.flink.streaming.api.operators.StreamOperatorFactory) Column(org.apache.flink.table.catalog.Column) HashMap(java.util.HashMap) RowType(org.apache.flink.table.types.logical.RowType) ScanTableSource(org.apache.flink.table.connector.source.ScanTableSource) SinkV2Provider(org.apache.flink.table.connector.sink.SinkV2Provider) KafkaSink(org.apache.flink.connector.kafka.sink.KafkaSink) RowDataToAvroConverters(org.apache.flink.formats.avro.RowDataToAvroConverters) FactoryMocks.createTableSource(org.apache.flink.table.factories.utils.FactoryMocks.createTableSource) MatcherAssert.assertThat(org.hamcrest.MatcherAssert.assertThat) SinkWriterOperatorFactory(org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperatorFactory) ExpectedException(org.junit.rules.ExpectedException) RowData(org.apache.flink.table.data.RowData) Properties(java.util.Properties) Assert.assertTrue(org.junit.Assert.assertTrue) DataTypes(org.apache.flink.table.api.DataTypes) VarCharType(org.apache.flink.table.types.logical.VarCharType) Test(org.junit.Test) BinaryRowData(org.apache.flink.table.data.binary.BinaryRowData) KafkaSourceEnumState(org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumState) DeserializationSchema(org.apache.flink.api.common.serialization.DeserializationSchema) Consumer(java.util.function.Consumer) StartupMode(org.apache.flink.streaming.connectors.kafka.config.StartupMode) Rule(org.junit.Rule) 
KafkaSource(org.apache.flink.connector.kafka.source.KafkaSource) UniqueConstraint(org.apache.flink.table.catalog.UniqueConstraint) SinkRuntimeProviderContext(org.apache.flink.table.runtime.connector.sink.SinkRuntimeProviderContext) FactoryMocks(org.apache.flink.table.factories.utils.FactoryMocks) KafkaPartitionSplit(org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit) Transformation(org.apache.flink.api.dag.Transformation) Collections(java.util.Collections) Assert.assertEquals(org.junit.Assert.assertEquals) AvroSchemaConverter(org.apache.flink.formats.avro.typeutils.AvroSchemaConverter)
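
A DataStreamScanProvider hands the planner an arbitrary DataStream of RowData; produceDataStream is exactly what the test above unwraps. For orientation, here is a minimal sketch of such a provider, assuming placeholder in-memory rows rather than the Kafka connector's real stream (the literal values are illustrative only):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.connector.ProviderContext;
import org.apache.flink.table.connector.source.DataStreamScanProvider;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;

// Minimal sketch: a provider that emits a bounded, in-memory stream of rows.
// A real connector would attach its own source transformation instead.
DataStreamScanProvider sketchProvider =
        new DataStreamScanProvider() {

            @Override
            public DataStream<RowData> produceDataStream(
                    ProviderContext providerContext, StreamExecutionEnvironment execEnv) {
                // Placeholder rows; the test above calls this method with a
                // no-op ProviderContext (n -> Optional.empty()).
                return execEnv.fromElements(
                        (RowData) GenericRowData.of(StringData.fromString("a")),
                        GenericRowData.of(StringData.fromString("b")));
            }

            @Override
            public boolean isBounded() {
                // Bounded, so the sketch also works under batch execution.
                return true;
            }
        };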

Example 2 with DataStreamScanProvider

Use of org.apache.flink.table.connector.source.DataStreamScanProvider in project flink by apache.

From the class KafkaDynamicTableFactoryTest, the method assertKafkaSource:

private KafkaSource<?> assertKafkaSource(ScanTableSource.ScanRuntimeProvider provider) {
    assertThat(provider).isInstanceOf(DataStreamScanProvider.class);
    final DataStreamScanProvider dataStreamScanProvider = (DataStreamScanProvider) provider;
    // Materialize the provider's stream against a local environment and inspect
    // the underlying transformation.
    final Transformation<RowData> transformation =
            dataStreamScanProvider
                    .produceDataStream(n -> Optional.empty(), StreamExecutionEnvironment.createLocalEnvironment())
                    .getTransformation();
    assertThat(transformation).isInstanceOf(SourceTransformation.class);
    final SourceTransformation<RowData, KafkaPartitionSplit, KafkaSourceEnumState> sourceTransformation =
            (SourceTransformation<RowData, KafkaPartitionSplit, KafkaSourceEnumState>) transformation;
    assertThat(sourceTransformation.getSource()).isInstanceOf(KafkaSource.class);
    // Hand the unwrapped source back for further, connector-specific assertions.
    return (KafkaSource<?>) sourceTransformation.getSource();
}
Also used : DataType(org.apache.flink.table.types.DataType) ConfigOptions(org.apache.flink.configuration.ConfigOptions) Arrays(java.util.Arrays) Assertions.assertThat(org.assertj.core.api.Assertions.assertThat) ResolvedSchema(org.apache.flink.table.catalog.ResolvedSchema) SourceTransformation(org.apache.flink.streaming.api.transformations.SourceTransformation) DataStreamScanProvider(org.apache.flink.table.connector.source.DataStreamScanProvider) DecodingFormat(org.apache.flink.table.connector.format.DecodingFormat) ExtendWith(org.junit.jupiter.api.extension.ExtendWith) Map(java.util.Map) FactoryMocks.createTableSink(org.apache.flink.table.factories.utils.FactoryMocks.createTableSink) FlinkFixedPartitioner(org.apache.flink.streaming.connectors.kafka.partitioner.FlinkFixedPartitioner) ConfluentRegistryAvroSerializationSchema(org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroSerializationSchema) DynamicTableSource(org.apache.flink.table.connector.source.DynamicTableSource) DynamicTableSink(org.apache.flink.table.connector.sink.DynamicTableSink) KafkaTopicPartition(org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition) Set(java.util.Set) EncodingFormatMock(org.apache.flink.table.factories.TestFormatFactory.EncodingFormatMock) ConsumerConfig(org.apache.kafka.clients.consumer.ConsumerConfig) AVRO_CONFLUENT(org.apache.flink.streaming.connectors.kafka.table.KafkaConnectorOptionsUtil.AVRO_CONFLUENT) ResolvedExpressionMock(org.apache.flink.table.expressions.utils.ResolvedExpressionMock) AvroRowDataSerializationSchema(org.apache.flink.formats.avro.AvroRowDataSerializationSchema) Test(org.junit.jupiter.api.Test) List(java.util.List) FactoryUtil(org.apache.flink.table.factories.FactoryUtil) ValidationException(org.apache.flink.table.api.ValidationException) FlinkAssertions.containsCause(org.apache.flink.core.testutils.FlinkAssertions.containsCause) Optional(java.util.Optional) Pattern(java.util.regex.Pattern) ScanRuntimeProviderContext(org.apache.flink.table.runtime.connector.source.ScanRuntimeProviderContext) SerializationSchema(org.apache.flink.api.common.serialization.SerializationSchema) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) TestFormatFactory(org.apache.flink.table.factories.TestFormatFactory) DeliveryGuarantee(org.apache.flink.connector.base.DeliveryGuarantee) EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) Sink(org.apache.flink.api.connector.sink2.Sink) ChangelogMode(org.apache.flink.table.connector.ChangelogMode) Column(org.apache.flink.table.catalog.Column) HashMap(java.util.HashMap) RowType(org.apache.flink.table.types.logical.RowType) ScanTableSource(org.apache.flink.table.connector.source.ScanTableSource) SinkV2Provider(org.apache.flink.table.connector.sink.SinkV2Provider) HashSet(java.util.HashSet) TestLoggerExtension(org.apache.flink.util.TestLoggerExtension) PROPERTIES_PREFIX(org.apache.flink.streaming.connectors.kafka.table.KafkaConnectorOptionsUtil.PROPERTIES_PREFIX) KafkaSink(org.apache.flink.connector.kafka.sink.KafkaSink) Assertions.assertThatThrownBy(org.assertj.core.api.Assertions.assertThatThrownBy) Assertions.assertThatExceptionOfType(org.assertj.core.api.Assertions.assertThatExceptionOfType) RowDataToAvroConverters(org.apache.flink.formats.avro.RowDataToAvroConverters) KafkaSourceOptions(org.apache.flink.connector.kafka.source.KafkaSourceOptions) FactoryMocks.createTableSource(org.apache.flink.table.factories.utils.FactoryMocks.createTableSource) 
Nullable(javax.annotation.Nullable) ValueSource(org.junit.jupiter.params.provider.ValueSource) DEBEZIUM_AVRO_CONFLUENT(org.apache.flink.streaming.connectors.kafka.table.KafkaConnectorOptionsUtil.DEBEZIUM_AVRO_CONFLUENT) RowData(org.apache.flink.table.data.RowData) Properties(java.util.Properties) WatermarkSpec(org.apache.flink.table.catalog.WatermarkSpec) Configuration(org.apache.flink.configuration.Configuration) DataTypes(org.apache.flink.table.api.DataTypes) ScanStartupMode(org.apache.flink.streaming.connectors.kafka.table.KafkaConnectorOptions.ScanStartupMode) KafkaSourceEnumState(org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumState) FlinkKafkaPartitioner(org.apache.flink.streaming.connectors.kafka.partitioner.FlinkKafkaPartitioner) DeserializationSchema(org.apache.flink.api.common.serialization.DeserializationSchema) Consumer(java.util.function.Consumer) StartupMode(org.apache.flink.streaming.connectors.kafka.config.StartupMode) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest) KafkaSource(org.apache.flink.connector.kafka.source.KafkaSource) UniqueConstraint(org.apache.flink.table.catalog.UniqueConstraint) DecodingFormatMock(org.apache.flink.table.factories.TestFormatFactory.DecodingFormatMock) SinkRuntimeProviderContext(org.apache.flink.table.runtime.connector.sink.SinkRuntimeProviderContext) ImmutableList(org.apache.flink.shaded.guava30.com.google.common.collect.ImmutableList) KafkaSourceTestUtils(org.apache.flink.connector.kafka.source.KafkaSourceTestUtils) FactoryMocks(org.apache.flink.table.factories.utils.FactoryMocks) KafkaPartitionSplit(org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit) DebeziumAvroSerializationSchema(org.apache.flink.formats.avro.registry.confluent.debezium.DebeziumAvroSerializationSchema) NullSource(org.junit.jupiter.params.provider.NullSource) Transformation(org.apache.flink.api.dag.Transformation) Collections(java.util.Collections) AvroSchemaConverter(org.apache.flink.formats.avro.typeutils.AvroSchemaConverter)
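
The helper mirrors Example 1 but additionally returns the unwrapped source. A hedged sketch of how a test could drive it follows; SCHEMA and getBasicSourceOptions() stand in for the test class's actual fixtures and are assumptions, not verbatim Flink code:

// Build a table source from a mocked schema and connector options (fixtures assumed).
final DynamicTableSource actualSource = createTableSource(SCHEMA, getBasicSourceOptions());
final ScanTableSource scanSource = (ScanTableSource) actualSource;

// Obtain the scan runtime provider and let the helper unwrap the KafkaSource.
final KafkaSource<?> kafkaSource =
        assertKafkaSource(scanSource.getScanRuntimeProvider(ScanRuntimeProviderContext.INSTANCE));

// Connector-specific assertions against kafkaSource would follow here.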

Example 3 with DataStreamScanProvider

Use of org.apache.flink.table.connector.source.DataStreamScanProvider in project flink by apache.

From the class CommonExecTableSourceScan, the method translateToPlanInternal:

@Override
protected Transformation<RowData> translateToPlanInternal(PlannerBase planner, ExecNodeConfig config) {
    final StreamExecutionEnvironment env = planner.getExecEnv();
    final TransformationMetadata meta = createTransformationMeta(SOURCE_TRANSFORMATION, config);
    final InternalTypeInfo<RowData> outputTypeInfo = InternalTypeInfo.of((RowType) getOutputType());
    final ScanTableSource tableSource = tableSourceSpec.getScanTableSource(planner.getFlinkContext());
    // Dispatch on the concrete runtime provider implementation returned by the connector.
    ScanTableSource.ScanRuntimeProvider provider =
            tableSource.getScanRuntimeProvider(ScanRuntimeProviderContext.INSTANCE);
    if (provider instanceof SourceFunctionProvider) {
        final SourceFunctionProvider sourceFunctionProvider = (SourceFunctionProvider) provider;
        final SourceFunction<RowData> function = sourceFunctionProvider.createSourceFunction();
        final Transformation<RowData> transformation = createSourceFunctionTransformation(env, function, sourceFunctionProvider.isBounded(), meta.getName(), outputTypeInfo);
        return meta.fill(transformation);
    } else if (provider instanceof InputFormatProvider) {
        final InputFormat<RowData, ?> inputFormat = ((InputFormatProvider) provider).createInputFormat();
        final Transformation<RowData> transformation = createInputFormatTransformation(env, inputFormat, outputTypeInfo, meta.getName());
        return meta.fill(transformation);
    } else if (provider instanceof SourceProvider) {
        final Source<RowData, ?, ?> source = ((SourceProvider) provider).createSource();
        // TODO: Push down watermark strategy to source scan
        final Transformation<RowData> transformation = env.fromSource(source, WatermarkStrategy.noWatermarks(), meta.getName(), outputTypeInfo).getTransformation();
        return meta.fill(transformation);
    } else if (provider instanceof DataStreamScanProvider) {
        Transformation<RowData> transformation =
                ((DataStreamScanProvider) provider)
                        .produceDataStream(createProviderContext(), env)
                        .getTransformation();
        meta.fill(transformation);
        // Ensure the transformation exposes the planner's internal RowData type info.
        transformation.setOutputType(outputTypeInfo);
        return transformation;
    } else if (provider instanceof TransformationScanProvider) {
        final Transformation<RowData> transformation = ((TransformationScanProvider) provider).createTransformation(createProviderContext());
        meta.fill(transformation);
        transformation.setOutputType(outputTypeInfo);
        return transformation;
    } else {
        throw new UnsupportedOperationException(provider.getClass().getSimpleName() + " is unsupported now.");
    }
}
Also used : TransformationMetadata(org.apache.flink.table.planner.plan.nodes.exec.utils.TransformationMetadata) LegacySourceTransformation(org.apache.flink.streaming.api.transformations.LegacySourceTransformation) Transformation(org.apache.flink.api.dag.Transformation) TransformationScanProvider(org.apache.flink.table.planner.connectors.TransformationScanProvider) InputFormatProvider(org.apache.flink.table.connector.source.InputFormatProvider) SourceFunctionProvider(org.apache.flink.table.connector.source.SourceFunctionProvider) SourceProvider(org.apache.flink.table.connector.source.SourceProvider) ScanTableSource(org.apache.flink.table.connector.source.ScanTableSource) RowData(org.apache.flink.table.data.RowData) InputFormat(org.apache.flink.api.common.io.InputFormat) DataStreamScanProvider(org.apache.flink.table.connector.source.DataStreamScanProvider) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment)
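
Note that the DataStreamScanProvider branch passes createProviderContext() into produceDataStream, which lets a connector ask the planner for stable operator uids. A minimal sketch of a provider using that hook (the "my-source" uid key and the buildStream helper are illustrative assumptions):

@Override
public DataStream<RowData> produceDataStream(
        ProviderContext providerContext, StreamExecutionEnvironment execEnv) {
    // buildStream is a hypothetical helper assembling the connector's stream.
    final DataStream<RowData> stream = buildStream(execEnv);
    // Ask the planner for a stable uid; a caller may also return Optional.empty(),
    // as the "n -> Optional.empty()" lambdas in Examples 1 and 2 do.
    providerContext
            .generateUid("my-source")
            .ifPresent(uid -> stream.getTransformation().setUid(uid));
    return stream;
}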

Aggregations

Transformation (org.apache.flink.api.dag.Transformation): 3 usages
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 3 usages
DataStreamScanProvider (org.apache.flink.table.connector.source.DataStreamScanProvider): 3 usages
Arrays (java.util.Arrays): 2 usages
Collections (java.util.Collections): 2 usages
HashMap (java.util.HashMap): 2 usages
Map (java.util.Map): 2 usages
Optional (java.util.Optional): 2 usages
Properties (java.util.Properties): 2 usages
Consumer (java.util.function.Consumer): 2 usages
DeserializationSchema (org.apache.flink.api.common.serialization.DeserializationSchema): 2 usages
SerializationSchema (org.apache.flink.api.common.serialization.SerializationSchema): 2 usages
Sink (org.apache.flink.api.connector.sink2.Sink): 2 usages
DeliveryGuarantee (org.apache.flink.connector.base.DeliveryGuarantee): 2 usages
KafkaSink (org.apache.flink.connector.kafka.sink.KafkaSink): 2 usages
KafkaSource (org.apache.flink.connector.kafka.source.KafkaSource): 2 usages
KafkaSourceEnumState (org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumState): 2 usages
KafkaPartitionSplit (org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit): 2 usages
AvroRowDataSerializationSchema (org.apache.flink.formats.avro.AvroRowDataSerializationSchema): 2 usages
RowDataToAvroConverters (org.apache.flink.formats.avro.RowDataToAvroConverters): 2 usages