Use of org.apache.flink.streaming.api.datastream.DataStream in project flink by apache.
The class StreamOperatorChainingTest, method testMultiChainingWithSplit.
/**
* Verify that multi-chaining works with object reuse enabled.
*/
private void testMultiChainingWithSplit(StreamExecutionEnvironment env) throws Exception {
    // set parallelism to 2 to avoid chaining with the source when only one processor is available
    env.setParallelism(2);
    // the actual elements will not be used
    DataStream<Integer> input = env.fromElements(1, 2, 3);
    sink1Results = new ArrayList<>();
    sink2Results = new ArrayList<>();
    sink3Results = new ArrayList<>();
    input = input.map(value -> value);
    OutputTag<Integer> oneOutput = new OutputTag<Integer>("one") {};
    OutputTag<Integer> otherOutput = new OutputTag<Integer>("other") {};
    SingleOutputStreamOperator<Object> split = input.process(new ProcessFunction<Integer, Object>() {
        private static final long serialVersionUID = 1L;

        @Override
        public void processElement(Integer value, Context ctx, Collector<Object> out) throws Exception {
            if (value.equals(1)) {
                ctx.output(oneOutput, value);
            } else {
                ctx.output(otherOutput, value);
            }
        }
    });
    split.getSideOutput(oneOutput)
            .map(value -> "First 1: " + value)
            .addSink(new SinkFunction<String>() {
                @Override
                public void invoke(String value, Context ctx) throws Exception {
                    sink1Results.add(value);
                }
            });
    split.getSideOutput(oneOutput)
            .map(value -> "First 2: " + value)
            .addSink(new SinkFunction<String>() {
                @Override
                public void invoke(String value, Context ctx) throws Exception {
                    sink2Results.add(value);
                }
            });
    split.getSideOutput(otherOutput)
            .map(value -> "Second: " + value)
            .addSink(new SinkFunction<String>() {
                @Override
                public void invoke(String value, Context ctx) throws Exception {
                    sink3Results.add(value);
                }
            });
    // now we build our own StreamTask and OperatorChain
    JobGraph jobGraph = env.getStreamGraph().getJobGraph();
    Assert.assertTrue(jobGraph.getVerticesSortedTopologicallyFromSources().size() == 2);
    JobVertex chainedVertex = jobGraph.getVerticesSortedTopologicallyFromSources().get(1);
    Configuration configuration = chainedVertex.getConfiguration();
    StreamConfig streamConfig = new StreamConfig(configuration);
    StreamMap<Integer, Integer> headOperator =
            streamConfig.getStreamOperator(Thread.currentThread().getContextClassLoader());
    try (MockEnvironment environment = createMockEnvironment(chainedVertex.getName())) {
        StreamTask<Integer, StreamMap<Integer, Integer>> mockTask =
                createMockTask(streamConfig, environment);
        OperatorChain<Integer, StreamMap<Integer, Integer>> operatorChain =
                createOperatorChain(streamConfig, environment, mockTask);
        headOperator.setup(mockTask, streamConfig, operatorChain.getMainOperatorOutput());
        operatorChain.initializeStateAndOpenOperators(null);
        headOperator.processElement(new StreamRecord<>(1));
        headOperator.processElement(new StreamRecord<>(2));
        headOperator.processElement(new StreamRecord<>(3));
        assertThat(sink1Results, contains("First 1: 1"));
        assertThat(sink2Results, contains("First 2: 1"));
        assertThat(sink3Results, contains("Second: 2", "Second: 3"));
    }
}
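The side-output pattern used above (OutputTag plus ProcessFunction, the replacement for the removed split()/select() API) also works outside of the test harness. The following is a minimal sketch; the tag name, element values, and job name are illustrative, and it assumes execution from a static main:
// Minimal sketch (illustrative): route elements to a side output with OutputTag.
OutputTag<Integer> oddTag = new OutputTag<Integer>("odd") {};
StreamExecutionEnvironment sketchEnv = StreamExecutionEnvironment.getExecutionEnvironment();
SingleOutputStreamOperator<Integer> evens =
        sketchEnv.fromElements(1, 2, 3, 4)
                .process(new ProcessFunction<Integer, Integer>() {
                    @Override
                    public void processElement(Integer value, Context ctx, Collector<Integer> out) {
                        if (value % 2 == 0) {
                            // main output
                            out.collect(value);
                        } else {
                            // side output
                            ctx.output(oddTag, value);
                        }
                    }
                });
// prints 2, 4
evens.print();
// prints 1, 3
evens.getSideOutput(oddTag).print();
sketchEnv.execute("side-output-sketch");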
Use of org.apache.flink.streaming.api.datastream.DataStream in project flink by apache.
The class DataStreamJavaITCase, method getComplexUnifiedPipeline.
// --------------------------------------------------------------------------------------------
// Helper methods
// --------------------------------------------------------------------------------------------
private Table getComplexUnifiedPipeline(StreamExecutionEnvironment env) {
    final DataStream<String> allowedNamesStream = env.fromElements("Bob", "Alice");
    final StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
    tableEnv.createTemporaryView(
            "AllowedNamesTable", tableEnv.fromDataStream(allowedNamesStream).as("allowedName"));
    final Table nameCountTable = tableEnv.sqlQuery(
            "SELECT name, COUNT(*) AS c "
                    + "FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) "
                    + "WHERE name IN (SELECT allowedName FROM AllowedNamesTable) "
                    + "GROUP BY name");
    final DataStream<Row> nameCountStream = tableEnv.toChangelogStream(nameCountTable);
    final DataStream<Tuple2<String, Long>> updatesPerNameStream = nameCountStream
            .keyBy(r -> r.<String>getFieldAs("name"))
            .process(new KeyedProcessFunction<String, Row, Tuple2<String, Long>>() {
                ValueState<Long> count;

                @Override
                public void open(Configuration parameters) {
                    count = getRuntimeContext()
                            .getState(new ValueStateDescriptor<>("count", Long.class));
                }

                @Override
                public void processElement(Row r, Context ctx, Collector<Tuple2<String, Long>> out)
                        throws IOException {
                    Long currentCount = count.value();
                    if (currentCount == null) {
                        currentCount = 0L;
                    }
                    final long updatedCount = currentCount + 1;
                    count.update(updatedCount);
                    out.collect(Tuple2.of(ctx.getCurrentKey(), updatedCount));
                }
            });
    tableEnv.createTemporaryView("UpdatesPerName", updatesPerNameStream);
    return tableEnv.sqlQuery("SELECT DISTINCT f0, f1 FROM UpdatesPerName");
}
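The helper only builds the mixed Table/DataStream pipeline and returns a Table. One minimal way a caller might materialize it, assuming the caller simply wants to print the bounded result (the calling code is not part of this excerpt):
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
Table result = getComplexUnifiedPipeline(env);
// execute() submits the unified pipeline; print() blocks until the bounded job finishes
result.execute().print();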
Use of org.apache.flink.streaming.api.datastream.DataStream in project flink by apache.
The class DataStreamJavaITCase, method testFromAndToChangelogStreamEventTime.
@Test
public void testFromAndToChangelogStreamEventTime() throws Exception {
    final StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
    final DataStream<Tuple3<Long, Integer, String>> dataStream = getWatermarkedDataStream();
    final DataStream<Row> changelogStream =
            dataStream.map(t -> Row.ofKind(RowKind.INSERT, t.f1, t.f2))
                    .returns(Types.ROW(Types.INT, Types.STRING));
    // derive physical columns and add a rowtime
    final Table table = tableEnv.fromChangelogStream(
            changelogStream,
            Schema.newBuilder()
                    .columnByMetadata("rowtime", TIMESTAMP_LTZ(3))
                    .columnByExpression("computed", $("f1").upperCase())
                    .watermark("rowtime", sourceWatermark())
                    .build());
    tableEnv.createTemporaryView("t", table);
    // access and reorder columns
    final Table reordered = tableEnv.sqlQuery("SELECT computed, rowtime, f0 FROM t");
    // write out the rowtime column with a fully declared schema
    final DataStream<Row> result = tableEnv.toChangelogStream(
            reordered,
            Schema.newBuilder()
                    .column("f1", STRING())
                    .columnByMetadata("rowtime", TIMESTAMP_LTZ(3))
                    .columnByExpression("ignored", $("f1").upperCase())
                    .column("f0", INT())
                    .build());
    // test event time window and field access
    testResult(
            result.keyBy(k -> k.getField("f1"))
                    .window(TumblingEventTimeWindows.of(Time.milliseconds(5)))
                    .<Row>apply((key, window, input, out) -> {
                        int sum = 0;
                        for (Row row : input) {
                            sum += row.<Integer>getFieldAs("f0");
                        }
                        out.collect(Row.of(key, sum));
                    })
                    .returns(Types.ROW(Types.STRING, Types.INT)),
            Row.of("A", 47), Row.of("C", 1000), Row.of("C", 1000));
}
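The getWatermarkedDataStream() helper is not part of this excerpt. A hypothetical stand-in, with illustrative sample data that would fall into the 5 ms tumbling windows checked above, could assign event-time timestamps from the first tuple field like this (this is a sketch, not the original helper):
private DataStream<Tuple3<Long, Integer, String>> getWatermarkedDataStream() {
    // hypothetical stand-in: timestamps are taken from f0, watermarks advance monotonically
    return env.fromElements(
                    Tuple3.of(1L, 42, "A"),
                    Tuple3.of(2L, 5, "A"),
                    Tuple3.of(5L, 1000, "C"),
                    Tuple3.of(10L, 1000, "C"))
            .assignTimestampsAndWatermarks(
                    WatermarkStrategy.<Tuple3<Long, Integer, String>>forMonotonousTimestamps()
                            .withTimestampAssigner((t, ts) -> t.f0));
}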
Use of org.apache.flink.streaming.api.datastream.DataStream in project beam by apache.
The class FlinkStreamingPortablePipelineTranslator, method translateFlatten.
private <T> void translateFlatten(
        String id, RunnerApi.Pipeline pipeline, StreamingTranslationContext context) {
    RunnerApi.PTransform transform = pipeline.getComponents().getTransformsOrThrow(id);
    Map<String, String> allInputs = transform.getInputsMap();
    if (allInputs.isEmpty()) {
        // Create an empty dummy source to satisfy downstream operations. We cannot create an
        // empty source in Flink, so we add a flatMap that never forwards its single element.
        long shutdownAfterIdleSourcesMs =
                context.getPipelineOptions().getShutdownSourcesAfterIdleMs();
        DataStreamSource<WindowedValue<byte[]>> dummySource =
                context.getExecutionEnvironment()
                        .addSource(new ImpulseSourceFunction(shutdownAfterIdleSourcesMs));
        DataStream<WindowedValue<T>> result =
                dummySource
                        .<WindowedValue<T>>flatMap((s, collector) -> {
                            // never return anything
                        })
                        .returns(new CoderTypeInformation<>(
                                WindowedValue.getFullCoder(
                                        (Coder<T>) VoidCoder.of(), GlobalWindow.Coder.INSTANCE),
                                context.getPipelineOptions()));
        context.addDataStream(
                Iterables.getOnlyElement(transform.getOutputsMap().values()), result);
    } else {
        DataStream<T> result = null;
        // Determine which DataStreams are used as input several times. Those inputs need to be
        // made distinct, because Flink seems to swallow watermarks when a stream is unioned
        // with itself.
        HashMultiset<DataStream<T>> inputCounts = HashMultiset.create();
        for (String input : allInputs.values()) {
            DataStream<T> current = context.getDataStreamOrThrow(input);
            inputCounts.add(current, 1);
        }
        for (String input : allInputs.values()) {
            DataStream<T> current = context.getDataStreamOrThrow(input);
            final int timesRequired = inputCounts.count(current);
            if (timesRequired > 1) {
                current = current.flatMap(new FlatMapFunction<T, T>() {
                    private static final long serialVersionUID = 1L;

                    @Override
                    public void flatMap(T t, Collector<T> collector) {
                        collector.collect(t);
                    }
                });
            }
            result = (result == null) ? current : result.union(current);
        }
        context.addDataStream(
                Iterables.getOnlyElement(transform.getOutputsMap().values()), result);
    }
}
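The identity flatMap above exists only to make a duplicated input distinct before the union. A minimal sketch of the same idea in plain Flink (stream name and contents are illustrative):
DataStream<String> words = env.fromElements("a", "b");
// routing one copy through an identity flatMap yields a distinct stream, so the union below is
// not a union of a stream with itself (which, as noted above, can swallow watermarks)
DataStream<String> distinctCopy =
        words.flatMap((String w, Collector<String> out) -> out.collect(w)).returns(Types.STRING);
DataStream<String> doubled = words.union(distinctCopy);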
Use of org.apache.flink.streaming.api.datastream.DataStream in project flink by apache.
The class KafkaTableSinkTestBase, method testKafkaTableSink.
@Test
@SuppressWarnings("unchecked")
public void testKafkaTableSink() throws Exception {
    DataStream dataStream = mock(DataStream.class);
    KafkaTableSink kafkaTableSink = spy(createTableSink());
    kafkaTableSink.emitDataStream(dataStream);
    verify(dataStream).addSink(eq(PRODUCER));
    verify(kafkaTableSink)
            .createKafkaProducer(
                    eq(TOPIC), eq(PROPERTIES), any(getSerializationSchema().getClass()), eq(PARTITIONER));
}
Aggregations