
Example 1 with SinkFunction

Use of org.apache.flink.streaming.api.functions.sink.SinkFunction in project flink by apache.

From the class KafkaProducerTestBase, method runCustomPartitioningTest.

/**
	 * 
	 * <pre>
	 *             +------> (sink) --+--> [KAFKA-1] --> (source) -> (map) --+
	 *            /                  |                                       \
	 *           /                   |                                        \
	 * (source) ----------> (sink) --+--> [KAFKA-2] --> (source) -> (map) -----+-> (sink)
	 *           \                   |                                        /
	 *            \                  |                                       /
	 *             +------> (sink) --+--> [KAFKA-3] --> (source) -> (map) --+
	 * </pre>
	 * 
	 * The mapper validates that the values come consistently from the correct Kafka partition.
	 * 
	 * The final sink validates that there are no duplicates and that all partitions are present.
	 */
public void runCustomPartitioningTest() {
    try {
        LOG.info("Starting KafkaProducerITCase.testCustomPartitioning()");
        final String topic = "customPartitioningTestTopic";
        final int parallelism = 3;
        createTestTopic(topic, parallelism, 1);
        TypeInformation<Tuple2<Long, String>> longStringInfo = TypeInfoParser.parse("Tuple2<Long, String>");
        StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort);
        env.setRestartStrategy(RestartStrategies.noRestart());
        env.getConfig().disableSysoutLogging();
        TypeInformationSerializationSchema<Tuple2<Long, String>> serSchema = new TypeInformationSerializationSchema<>(longStringInfo, env.getConfig());
        TypeInformationSerializationSchema<Tuple2<Long, String>> deserSchema = new TypeInformationSerializationSchema<>(longStringInfo, env.getConfig());
        // ------ producing topology ---------
        // source has parallelism 1 to make sure it generates no duplicates
        DataStream<Tuple2<Long, String>> stream = env.addSource(new SourceFunction<Tuple2<Long, String>>() {

            private boolean running = true;

            @Override
            public void run(SourceContext<Tuple2<Long, String>> ctx) throws Exception {
                long cnt = 0;
                while (running) {
                    ctx.collect(new Tuple2<Long, String>(cnt, "kafka-" + cnt));
                    cnt++;
                }
            }

            @Override
            public void cancel() {
                running = false;
            }
        }).setParallelism(1);
        Properties props = new Properties();
        props.putAll(FlinkKafkaProducerBase.getPropertiesFromBrokerList(brokerConnectionStrings));
        props.putAll(secureProps);
        // sink partitions the stream into Kafka using the custom partitioner
        kafkaServer.produceIntoKafka(stream, topic, new KeyedSerializationSchemaWrapper<>(serSchema), props, new CustomPartitioner(parallelism)).setParallelism(parallelism);
        // ------ consuming topology ---------
        Properties consumerProps = new Properties();
        consumerProps.putAll(standardProps);
        consumerProps.putAll(secureProps);
        FlinkKafkaConsumerBase<Tuple2<Long, String>> source = kafkaServer.getConsumer(topic, deserSchema, consumerProps);
        env.addSource(source).setParallelism(parallelism).map(new RichMapFunction<Tuple2<Long, String>, Integer>() {

            private int ourPartition = -1;

            @Override
            public Integer map(Tuple2<Long, String> value) {
                int partition = value.f0.intValue() % parallelism;
                if (ourPartition != -1) {
                    assertEquals("inconsistent partitioning", ourPartition, partition);
                } else {
                    ourPartition = partition;
                }
                return partition;
            }
        }).setParallelism(parallelism).addSink(new SinkFunction<Integer>() {

            private int[] valuesPerPartition = new int[parallelism];

            @Override
            public void invoke(Integer value) throws Exception {
                valuesPerPartition[value]++;
                boolean missing = false;
                for (int i : valuesPerPartition) {
                    if (i < 100) {
                        missing = true;
                        break;
                    }
                }
                if (!missing) {
                    throw new SuccessException();
                }
            }
        }).setParallelism(1);
        tryExecute(env, "custom partitioning test");
        deleteTestTopic(topic);
        LOG.info("Finished KafkaProducerITCase.testCustomPartitioning()");
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : SourceFunction(org.apache.flink.streaming.api.functions.source.SourceFunction) KeyedSerializationSchemaWrapper(org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper) Properties(java.util.Properties) SuccessException(org.apache.flink.test.util.SuccessException) TypeInformationSerializationSchema(org.apache.flink.streaming.util.serialization.TypeInformationSerializationSchema) SinkFunction(org.apache.flink.streaming.api.functions.sink.SinkFunction) Tuple2(org.apache.flink.api.java.tuple.Tuple2) RichMapFunction(org.apache.flink.api.common.functions.RichMapFunction) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment)
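
The CustomPartitioner used above is defined elsewhere in KafkaProducerTestBase and is not shown in this snippet. As a rough sketch of what such a partitioner could look like against the Kafka connector's FlinkKafkaPartitioner base class (the class name and modulo scheme below are illustrative, not the test's actual implementation):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.connectors.kafka.partitioner.FlinkKafkaPartitioner;

// Illustrative partitioner: derives the target partition from the record's long key,
// so every key range maps to exactly one Kafka partition.
public class ExampleCustomPartitioner extends FlinkKafkaPartitioner<Tuple2<Long, String>> {

    private final int expectedPartitions;

    public ExampleCustomPartitioner(int expectedPartitions) {
        this.expectedPartitions = expectedPartitions;
    }

    @Override
    public int partition(Tuple2<Long, String> record, byte[] key, byte[] value,
            String targetTopic, int[] partitions) {
        // 'partitions' holds the ids of the topic's partitions; pick one deterministically
        return partitions[(int) (record.f0 % expectedPartitions) % partitions.length];
    }
}

Pinning each key range to a single partition is what allows the downstream RichMapFunction in the test to assert that every subtask sees values from exactly one partition.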

Example 2 with SinkFunction

Use of org.apache.flink.streaming.api.functions.sink.SinkFunction in project flink by apache.

From the class BufferTimeoutITCase, method testDisablingBufferTimeout.

/**
 * The test verifies that it is possible to disable explicit buffer flushing. It checks that
 * the OutputFlusher thread is not started while the task is running. This does not, however,
 * guarantee that unfinished buffers cannot be flushed by other events.
 */
@Test
public void testDisablingBufferTimeout() throws Exception {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(1);
    env.setBufferTimeout(-1);
    final SharedReference<ArrayList<Integer>> results = sharedObjects.add(new ArrayList<>());
    env.addSource(new SourceFunction<Integer>() {

        @Override
        public void run(SourceContext<Integer> ctx) throws Exception {
            ctx.collect(1);
            // just sleep forever
            Thread.sleep(Long.MAX_VALUE);
        }

        @Override
        public void cancel() {
        }
    }).slotSharingGroup("source").addSink(new SinkFunction<Integer>() {

        @Override
        public void invoke(Integer value, Context context) {
            results.get().add(value);
        }
    }).slotSharingGroup("sink");
    final JobClient jobClient = env.executeAsync();
    CommonTestUtils.waitForAllTaskRunning(MINI_CLUSTER_RESOURCE.getMiniCluster(), jobClient.getJobID(), false);
    assertTrue(RecordWriter.DEFAULT_OUTPUT_FLUSH_THREAD_NAME + " thread is unexpectedly running", Thread.getAllStackTraces().keySet().stream().noneMatch(thread -> thread.getName().startsWith(RecordWriter.DEFAULT_OUTPUT_FLUSH_THREAD_NAME)));
}
Also used : SinkFunction(org.apache.flink.streaming.api.functions.sink.SinkFunction) SharedObjects(org.apache.flink.testutils.junit.SharedObjects) Assert.assertTrue(org.junit.Assert.assertTrue) Test(org.junit.Test) JobClient(org.apache.flink.core.execution.JobClient) ArrayList(java.util.ArrayList) RecordWriter(org.apache.flink.runtime.io.network.api.writer.RecordWriter) Rule(org.junit.Rule) SourceFunction(org.apache.flink.streaming.api.functions.source.SourceFunction) CommonTestUtils(org.apache.flink.runtime.testutils.CommonTestUtils) SharedReference(org.apache.flink.testutils.junit.SharedReference) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) AbstractTestBase(org.apache.flink.test.util.AbstractTestBase)
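
For background, setBufferTimeout controls how aggressively partially filled network buffers are flushed; the value -1 used above disables timer-based flushing entirely, which is why no OutputFlusher thread should exist. A minimal sketch of the documented settings:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// default behavior: flush partially filled buffers on a timer (100 ms by default)
env.setBufferTimeout(100);

// 0 flushes after every record: lowest latency at the cost of throughput
env.setBufferTimeout(0);

// -1 disables timer-based flushing: buffers are sent only when full,
// so no OutputFlusher thread needs to run
env.setBufferTimeout(-1);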

Example 3 with SinkFunction

Use of org.apache.flink.streaming.api.functions.sink.SinkFunction in project flink by apache.

From the class StreamOperatorChainingTest, method testMultiChainingWithSplit.

/**
 * Verify that multi-chaining works with split (side outputs).
 */
private void testMultiChainingWithSplit(StreamExecutionEnvironment env) throws Exception {
    // set parallelism to 2 to avoid chaining with the source when only one
    // processor is available.
    env.setParallelism(2);
    // the actual elements will not be used
    DataStream<Integer> input = env.fromElements(1, 2, 3);
    sink1Results = new ArrayList<>();
    sink2Results = new ArrayList<>();
    sink3Results = new ArrayList<>();
    input = input.map(value -> value);
    OutputTag<Integer> oneOutput = new OutputTag<Integer>("one") {
    };
    OutputTag<Integer> otherOutput = new OutputTag<Integer>("other") {
    };
    SingleOutputStreamOperator<Object> split = input.process(new ProcessFunction<Integer, Object>() {

        private static final long serialVersionUID = 1L;

        @Override
        public void processElement(Integer value, Context ctx, Collector<Object> out) throws Exception {
            if (value.equals(1)) {
                ctx.output(oneOutput, value);
            } else {
                ctx.output(otherOutput, value);
            }
        }
    });
    split.getSideOutput(oneOutput).map(value -> "First 1: " + value).addSink(new SinkFunction<String>() {

        @Override
        public void invoke(String value, Context ctx) throws Exception {
            sink1Results.add(value);
        }
    });
    split.getSideOutput(oneOutput).map(value -> "First 2: " + value).addSink(new SinkFunction<String>() {

        @Override
        public void invoke(String value, Context ctx) throws Exception {
            sink2Results.add(value);
        }
    });
    split.getSideOutput(otherOutput).map(value -> "Second: " + value).addSink(new SinkFunction<String>() {

        @Override
        public void invoke(String value, Context ctx) throws Exception {
            sink3Results.add(value);
        }
    });
    // we build our own StreamTask and OperatorChain
    JobGraph jobGraph = env.getStreamGraph().getJobGraph();
    Assert.assertTrue(jobGraph.getVerticesSortedTopologicallyFromSources().size() == 2);
    JobVertex chainedVertex = jobGraph.getVerticesSortedTopologicallyFromSources().get(1);
    Configuration configuration = chainedVertex.getConfiguration();
    StreamConfig streamConfig = new StreamConfig(configuration);
    StreamMap<Integer, Integer> headOperator = streamConfig.getStreamOperator(Thread.currentThread().getContextClassLoader());
    try (MockEnvironment environment = createMockEnvironment(chainedVertex.getName())) {
        StreamTask<Integer, StreamMap<Integer, Integer>> mockTask = createMockTask(streamConfig, environment);
        OperatorChain<Integer, StreamMap<Integer, Integer>> operatorChain = createOperatorChain(streamConfig, environment, mockTask);
        headOperator.setup(mockTask, streamConfig, operatorChain.getMainOperatorOutput());
        operatorChain.initializeStateAndOpenOperators(null);
        headOperator.processElement(new StreamRecord<>(1));
        headOperator.processElement(new StreamRecord<>(2));
        headOperator.processElement(new StreamRecord<>(3));
        assertThat(sink1Results, contains("First 1: 1"));
        assertThat(sink2Results, contains("First 2: 1"));
        assertThat(sink3Results, contains("Second: 2", "Second: 3"));
    }
}
Also used : StreamConfig(org.apache.flink.streaming.api.graph.StreamConfig) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) RecordWriterDelegate(org.apache.flink.runtime.io.network.api.writer.RecordWriterDelegate) ArrayList(java.util.ArrayList) StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord) StreamTaskStateInitializer(org.apache.flink.streaming.api.operators.StreamTaskStateInitializer) Collector(org.apache.flink.util.Collector) StreamMap(org.apache.flink.streaming.api.operators.StreamMap) ProcessFunction(org.apache.flink.streaming.api.functions.ProcessFunction) MatcherAssert.assertThat(org.hamcrest.MatcherAssert.assertThat) MockEnvironment(org.apache.flink.runtime.operators.testutils.MockEnvironment) SinkFunction(org.apache.flink.streaming.api.functions.sink.SinkFunction) MockInputSplitProvider(org.apache.flink.runtime.operators.testutils.MockInputSplitProvider) StreamTask(org.apache.flink.streaming.runtime.tasks.StreamTask) Configuration(org.apache.flink.configuration.Configuration) SerializationDelegate(org.apache.flink.runtime.plugable.SerializationDelegate) SingleOutputStreamOperator(org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) MockStreamTaskBuilder(org.apache.flink.streaming.util.MockStreamTaskBuilder) OutputTag(org.apache.flink.util.OutputTag) Test(org.junit.Test) MockEnvironmentBuilder(org.apache.flink.runtime.operators.testutils.MockEnvironmentBuilder) DataStream(org.apache.flink.streaming.api.datastream.DataStream) StreamOperator(org.apache.flink.streaming.api.operators.StreamOperator) List(java.util.List) Matchers.contains(org.hamcrest.Matchers.contains) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) OperatorChain(org.apache.flink.streaming.runtime.tasks.OperatorChain) RegularOperatorChain(org.apache.flink.streaming.runtime.tasks.RegularOperatorChain) Assert(org.junit.Assert) Environment(org.apache.flink.runtime.execution.Environment) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) StreamOperatorWrapper(org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper)
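
A side note on the OutputTag instances above: they are created as anonymous subclasses (note the trailing {}) so that Flink can recover the generic element type despite erasure. Supplying the type information explicitly is an equivalent option; a minimal sketch, using the Types utility class:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.util.OutputTag;

// the trailing {} creates an anonymous subclass whose generic type can be extracted
OutputTag<Integer> oneOutput = new OutputTag<Integer>("one") {};

// equivalent: pass the element type explicitly and skip the subclassing trick
OutputTag<Integer> otherOutput = new OutputTag<>("other", Types.INT);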

Example 4 with SinkFunction

Use of org.apache.flink.streaming.api.functions.sink.SinkFunction in project flink by apache.

From the class SavepointITCase, method testTriggerSavepointAndResumeWithNoClaim.

@Test
@Ignore("Disabling this test because it regularly fails on AZP. See FLINK-25427.")
public void testTriggerSavepointAndResumeWithNoClaim() throws Exception {
    final int numTaskManagers = 2;
    final int numSlotsPerTaskManager = 2;
    final int parallelism = numTaskManagers * numSlotsPerTaskManager;
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStateBackend(new EmbeddedRocksDBStateBackend(true));
    env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
    env.getCheckpointConfig().setCheckpointStorage(folder.newFolder().toURI());
    env.setParallelism(parallelism);
    final SharedReference<CountDownLatch> counter = sharedObjects.add(new CountDownLatch(10_000));
    env.fromSequence(1, Long.MAX_VALUE).keyBy(i -> i % parallelism).process(new KeyedProcessFunction<Long, Long, Long>() {

        private ListState<Long> last;

        @Override
        public void open(Configuration parameters) {
            // we use list state here to create sst files of a significant size
            // if sst files do not reach certain thresholds they are not stored
            // in files, but as a byte stream in checkpoints metadata
            last = getRuntimeContext().getListState(new ListStateDescriptor<>("last", BasicTypeInfo.LONG_TYPE_INFO));
        }

        @Override
        public void processElement(Long value, KeyedProcessFunction<Long, Long, Long>.Context ctx, Collector<Long> out) throws Exception {
            last.add(value);
            out.collect(value);
        }
    }).addSink(new SinkFunction<Long>() {

        @Override
        public void invoke(Long value) {
            counter.consumeSync(CountDownLatch::countDown);
        }
    }).setParallelism(1);
    final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
    MiniClusterWithClientResource cluster = new MiniClusterWithClientResource(new MiniClusterResourceConfiguration.Builder().setNumberTaskManagers(numTaskManagers).setNumberSlotsPerTaskManager(numSlotsPerTaskManager).build());
    cluster.before();
    try {
        final JobID jobID1 = new JobID();
        jobGraph.setJobID(jobID1);
        cluster.getClusterClient().submitJob(jobGraph).get();
        CommonTestUtils.waitForAllTaskRunning(cluster.getMiniCluster(), jobID1, false);
        // wait for some records to be processed before taking the checkpoint
        counter.get().await();
        final String firstCheckpoint = cluster.getMiniCluster().triggerCheckpoint(jobID1).get();
        cluster.getClusterClient().cancel(jobID1).get();
        jobGraph.setSavepointRestoreSettings(SavepointRestoreSettings.forPath(firstCheckpoint, false, RestoreMode.NO_CLAIM));
        final JobID jobID2 = new JobID();
        jobGraph.setJobID(jobID2);
        cluster.getClusterClient().submitJob(jobGraph).get();
        CommonTestUtils.waitForAllTaskRunning(cluster.getMiniCluster(), jobID2, false);
        String secondCheckpoint = cluster.getMiniCluster().triggerCheckpoint(jobID2).get();
        cluster.getClusterClient().cancel(jobID2).get();
        // delete the checkpoint we restored from
        FileUtils.deleteDirectory(Paths.get(new URI(firstCheckpoint)).getParent().toFile());
        // we should be able to restore from the second checkpoint even though it has been built
        // on top of the first checkpoint
        jobGraph.setSavepointRestoreSettings(SavepointRestoreSettings.forPath(secondCheckpoint, false, RestoreMode.NO_CLAIM));
        final JobID jobID3 = new JobID();
        jobGraph.setJobID(jobID3);
        cluster.getClusterClient().submitJob(jobGraph).get();
        CommonTestUtils.waitForAllTaskRunning(cluster.getMiniCluster(), jobID3, false);
    } finally {
        cluster.after();
    }
}
Also used : Arrays(java.util.Arrays) SharedObjects(org.apache.flink.testutils.junit.SharedObjects) MemorySize(org.apache.flink.configuration.MemorySize) EmptyRequestBody(org.apache.flink.runtime.rest.messages.EmptyRequestBody) MiniClusterResourceConfiguration(org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration) ExceptionUtils.findThrowable(org.apache.flink.util.ExceptionUtils.findThrowable) CheckpointException(org.apache.flink.runtime.checkpoint.CheckpointException) TestUtils.submitJobAndWaitForResult(org.apache.flink.test.util.TestUtils.submitJobAndWaitForResult) FSDataOutputStream(org.apache.flink.core.fs.FSDataOutputStream) CheckpointListener(org.apache.flink.api.common.state.CheckpointListener) Duration(java.time.Duration) Map(java.util.Map) StreamGraph(org.apache.flink.streaming.api.graph.StreamGraph) ExceptionUtils.assertThrowable(org.apache.flink.util.ExceptionUtils.assertThrowable) RichSourceFunction(org.apache.flink.streaming.api.functions.source.RichSourceFunction) Path(java.nio.file.Path) StateSnapshotContext(org.apache.flink.runtime.state.StateSnapshotContext) SinkFunction(org.apache.flink.streaming.api.functions.sink.SinkFunction) BoundedOneInput(org.apache.flink.streaming.api.operators.BoundedOneInput) FileSystemFactory(org.apache.flink.core.fs.FileSystemFactory) CountDownLatch(java.util.concurrent.CountDownLatch) JobMessageParameters(org.apache.flink.runtime.rest.messages.JobMessageParameters) Stream(java.util.stream.Stream) ValueState(org.apache.flink.api.common.state.ValueState) ClusterClient(org.apache.flink.client.program.ClusterClient) Assert.assertFalse(org.junit.Assert.assertFalse) OneInputStreamOperator(org.apache.flink.streaming.api.operators.OneInputStreamOperator) Time(org.apache.flink.api.common.time.Time) CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList) FlinkException(org.apache.flink.util.FlinkException) LocalFileSystem(org.apache.flink.core.fs.local.LocalFileSystem) JobStatus(org.apache.flink.api.common.JobStatus) KeyedProcessFunction(org.apache.flink.streaming.api.functions.KeyedProcessFunction) TypeSafeDiagnosingMatcher(org.hamcrest.TypeSafeDiagnosingMatcher) TaskManagerOptions(org.apache.flink.configuration.TaskManagerOptions) SourceFunction(org.apache.flink.streaming.api.functions.source.SourceFunction) FutureUtils(org.apache.flink.util.concurrent.FutureUtils) RichMapFunction(org.apache.flink.api.common.functions.RichMapFunction) Collector(org.apache.flink.util.Collector) JobExecutionException(org.apache.flink.runtime.client.JobExecutionException) Before(org.junit.Before) MiniClusterWithClientResource(org.apache.flink.test.util.MiniClusterWithClientResource) Files(java.nio.file.Files) ValueStateDescriptor(org.apache.flink.api.common.state.ValueStateDescriptor) ExecutionState(org.apache.flink.runtime.execution.ExecutionState) Assert.assertTrue(org.junit.Assert.assertTrue) Test(org.junit.Test) IOException(java.io.IOException) FSDataInputStream(org.apache.flink.core.fs.FSDataInputStream) File(java.io.File) AbstractStreamOperator(org.apache.flink.streaming.api.operators.AbstractStreamOperator) ExecutionException(java.util.concurrent.ExecutionException) JobID(org.apache.flink.api.common.JobID) Paths(java.nio.file.Paths) Matcher(org.hamcrest.Matcher) Assert(org.junit.Assert) SavepointRestoreSettings(org.apache.flink.runtime.jobgraph.SavepointRestoreSettings) Assert.assertEquals(org.junit.Assert.assertEquals) StateBackendOptions(org.apache.flink.configuration.StateBackendOptions) EntropyInjectingTestFileSystem(org.apache.flink.testutils.EntropyInjectingTestFileSystem) Deadline(org.apache.flink.api.common.time.Deadline) ExceptionUtils.findThrowableWithMessage(org.apache.flink.util.ExceptionUtils.findThrowableWithMessage) ClusterOptions(org.apache.flink.configuration.ClusterOptions) FileUtils(org.apache.flink.util.FileUtils) URISyntaxException(java.net.URISyntaxException) BiFunction(java.util.function.BiFunction) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) LoggerFactory(org.slf4j.LoggerFactory) BlockingNoOpInvokable(org.apache.flink.runtime.testtasks.BlockingNoOpInvokable) Random(java.util.Random) FunctionSnapshotContext(org.apache.flink.runtime.state.FunctionSnapshotContext) EmbeddedRocksDBStateBackend(org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend) MapFunction(org.apache.flink.api.common.functions.MapFunction) BasicTypeInfo(org.apache.flink.api.common.typeinfo.BasicTypeInfo) Assert.assertThat(org.junit.Assert.assertThat) ListState(org.apache.flink.api.common.state.ListState) CommonTestUtils.waitForAllTaskRunning(org.apache.flink.runtime.testutils.CommonTestUtils.waitForAllTaskRunning) ChainingStrategy(org.apache.flink.streaming.api.operators.ChainingStrategy) TestLogger(org.apache.flink.util.TestLogger) ListStateDescriptor(org.apache.flink.api.common.state.ListStateDescriptor) Assert.fail(org.junit.Assert.fail) URI(java.net.URI) KeySelector(org.apache.flink.api.java.functions.KeySelector) CheckpointedFunction(org.apache.flink.streaming.api.checkpoint.CheckpointedFunction) FunctionInitializationContext(org.apache.flink.runtime.state.FunctionInitializationContext) Collection(java.util.Collection) Collectors(java.util.stream.Collectors) FileNotFoundException(java.io.FileNotFoundException) CheckpointingOptions(org.apache.flink.configuration.CheckpointingOptions) Objects(java.util.Objects) TestingUtils(org.apache.flink.testutils.TestingUtils) List(java.util.List) FileSystem(org.apache.flink.core.fs.FileSystem) FlinkJobNotFoundException(org.apache.flink.runtime.messages.FlinkJobNotFoundException) Optional(java.util.Optional) CheckpointConfig(org.apache.flink.streaming.api.environment.CheckpointConfig) ParallelSourceFunction(org.apache.flink.streaming.api.functions.source.ParallelSourceFunction) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) RichFlatMapFunction(org.apache.flink.api.common.functions.RichFlatMapFunction) OneShotLatch(org.apache.flink.core.testutils.OneShotLatch) SavepointFormatType(org.apache.flink.core.execution.SavepointFormatType) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) IterativeStream(org.apache.flink.streaming.api.datastream.IterativeStream) CompletableFuture(java.util.concurrent.CompletableFuture) RestartStrategies(org.apache.flink.api.common.restartstrategy.RestartStrategies) RestClusterClient(org.apache.flink.client.program.rest.RestClusterClient) RestoreMode(org.apache.flink.runtime.jobgraph.RestoreMode) StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord) CompletableFuture.allOf(java.util.concurrent.CompletableFuture.allOf) JobGraphTestUtils(org.apache.flink.runtime.jobgraph.JobGraphTestUtils) JobDetailsHeaders(org.apache.flink.runtime.rest.messages.job.JobDetailsHeaders) SharedReference(org.apache.flink.testutils.junit.SharedReference) Description(org.hamcrest.Description) Logger(org.slf4j.Logger) LocalRecoverableWriter(org.apache.flink.core.fs.local.LocalRecoverableWriter) DiscardingSink(org.apache.flink.streaming.api.functions.sink.DiscardingSink) Assert.assertNotNull(org.junit.Assert.assertNotNull) Configuration(org.apache.flink.configuration.Configuration) ExceptionUtils.assertThrowableWithMessage(org.apache.flink.util.ExceptionUtils.assertThrowableWithMessage) DataStream(org.apache.flink.streaming.api.datastream.DataStream) TimeUnit(java.util.concurrent.TimeUnit) Rule(org.junit.Rule) Ignore(org.junit.Ignore) ListCheckpointed(org.apache.flink.streaming.api.checkpoint.ListCheckpointed) FileVisitOption(java.nio.file.FileVisitOption) CommonTestUtils(org.apache.flink.runtime.testutils.CommonTestUtils) Collections(java.util.Collections) TemporaryFolder(org.junit.rules.TemporaryFolder)
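
The NO_CLAIM mode exercised here tells Flink not to assume ownership of the restored snapshot: the first checkpoint taken after the restore is a full, self-contained one, so the job no longer depends on the restored files afterwards, which is why deleting the first checkpoint above is safe. A minimal sketch of the restore wiring (the path is a placeholder):

import org.apache.flink.runtime.jobgraph.RestoreMode;
import org.apache.flink.runtime.jobgraph.SavepointRestoreSettings;

// restore from a snapshot without claiming ownership of its files;
// the boolean controls whether state that cannot be mapped to the new job is allowed
jobGraph.setSavepointRestoreSettings(
        SavepointRestoreSettings.forPath("file:///path/to/checkpoint", false, RestoreMode.NO_CLAIM));

// other documented modes: RestoreMode.CLAIM (Flink owns the snapshot and may delete it)
// and RestoreMode.LEGACY (the pre-1.15 behavior)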

Example 5 with SinkFunction

Use of org.apache.flink.streaming.api.functions.sink.SinkFunction in project flink by apache.

From the class StreamOperatorChainingTest, method testMultiChaining.

/**
 * Verify that multi-chaining works.
 */
private void testMultiChaining(StreamExecutionEnvironment env) throws Exception {
    // set parallelism to 2 to avoid chaining with the source when only one
    // processor is available.
    env.setParallelism(2);
    // the actual elements will not be used
    DataStream<Integer> input = env.fromElements(1, 2, 3);
    sink1Results = new ArrayList<>();
    sink2Results = new ArrayList<>();
    input = input.map(value -> value);
    input.map(value -> "First: " + value).addSink(new SinkFunction<String>() {

        @Override
        public void invoke(String value, Context ctx) throws Exception {
            sink1Results.add(value);
        }
    });
    input.map(value -> "Second: " + value).addSink(new SinkFunction<String>() {

        @Override
        public void invoke(String value, Context ctx) throws Exception {
            sink2Results.add(value);
        }
    });
    // we build our own StreamTask and OperatorChain
    JobGraph jobGraph = env.getStreamGraph().getJobGraph();
    Assert.assertTrue(jobGraph.getVerticesSortedTopologicallyFromSources().size() == 2);
    JobVertex chainedVertex = jobGraph.getVerticesSortedTopologicallyFromSources().get(1);
    Configuration configuration = chainedVertex.getConfiguration();
    StreamConfig streamConfig = new StreamConfig(configuration);
    StreamMap<Integer, Integer> headOperator = streamConfig.getStreamOperator(Thread.currentThread().getContextClassLoader());
    try (MockEnvironment environment = createMockEnvironment(chainedVertex.getName())) {
        StreamTask<Integer, StreamMap<Integer, Integer>> mockTask = createMockTask(streamConfig, environment);
        OperatorChain<Integer, StreamMap<Integer, Integer>> operatorChain = createOperatorChain(streamConfig, environment, mockTask);
        headOperator.setup(mockTask, streamConfig, operatorChain.getMainOperatorOutput());
        operatorChain.initializeStateAndOpenOperators(null);
        headOperator.processElement(new StreamRecord<>(1));
        headOperator.processElement(new StreamRecord<>(2));
        headOperator.processElement(new StreamRecord<>(3));
        assertThat(sink1Results, contains("First: 1", "First: 2", "First: 3"));
        assertThat(sink2Results, contains("Second: 1", "Second: 2", "Second: 3"));
    }
}
Also used : StreamConfig(org.apache.flink.streaming.api.graph.StreamConfig) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) RecordWriterDelegate(org.apache.flink.runtime.io.network.api.writer.RecordWriterDelegate) ArrayList(java.util.ArrayList) StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord) StreamTaskStateInitializer(org.apache.flink.streaming.api.operators.StreamTaskStateInitializer) Collector(org.apache.flink.util.Collector) StreamMap(org.apache.flink.streaming.api.operators.StreamMap) ProcessFunction(org.apache.flink.streaming.api.functions.ProcessFunction) MatcherAssert.assertThat(org.hamcrest.MatcherAssert.assertThat) MockEnvironment(org.apache.flink.runtime.operators.testutils.MockEnvironment) SinkFunction(org.apache.flink.streaming.api.functions.sink.SinkFunction) MockInputSplitProvider(org.apache.flink.runtime.operators.testutils.MockInputSplitProvider) StreamTask(org.apache.flink.streaming.runtime.tasks.StreamTask) Configuration(org.apache.flink.configuration.Configuration) SerializationDelegate(org.apache.flink.runtime.plugable.SerializationDelegate) SingleOutputStreamOperator(org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) MockStreamTaskBuilder(org.apache.flink.streaming.util.MockStreamTaskBuilder) OutputTag(org.apache.flink.util.OutputTag) Test(org.junit.Test) MockEnvironmentBuilder(org.apache.flink.runtime.operators.testutils.MockEnvironmentBuilder) DataStream(org.apache.flink.streaming.api.datastream.DataStream) StreamOperator(org.apache.flink.streaming.api.operators.StreamOperator) List(java.util.List) Matchers.contains(org.hamcrest.Matchers.contains) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) OperatorChain(org.apache.flink.streaming.runtime.tasks.OperatorChain) RegularOperatorChain(org.apache.flink.streaming.runtime.tasks.RegularOperatorChain) Assert(org.junit.Assert) Environment(org.apache.flink.runtime.execution.Environment) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) StreamOperatorWrapper(org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper)
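
Note that the sinks across these examples override different invoke variants. In recent Flink 1.x releases, SinkFunction declares both invoke(IN value) and invoke(IN value, Context context) as default methods, with the Context variant delegating to the one-argument form, so overriding either works; the Context variant is the one to prefer since it exposes timestamps and processing time. A minimal sketch:

import org.apache.flink.streaming.api.functions.sink.SinkFunction;

// minimal stateless sink overriding the two-argument variant
SinkFunction<String> printSink = new SinkFunction<String>() {
    @Override
    public void invoke(String value, Context context) {
        // Context also offers currentWatermark() and timestamp()
        System.out.println(value + " @ " + context.currentProcessingTime());
    }
};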

Aggregations

SinkFunction (org.apache.flink.streaming.api.functions.sink.SinkFunction) 8
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) 7
Test (org.junit.Test) 6
ArrayList (java.util.ArrayList) 5
List (java.util.List) 5
SourceFunction (org.apache.flink.streaming.api.functions.source.SourceFunction) 5
DataStream (org.apache.flink.streaming.api.datastream.DataStream) 4
SharedObjects (org.apache.flink.testutils.junit.SharedObjects) 4
SharedReference (org.apache.flink.testutils.junit.SharedReference) 4
Arrays (java.util.Arrays) 3
Collection (java.util.Collection) 3
Collections (java.util.Collections) 3
DataStreamSinkProvider (org.apache.flink.table.connector.sink.DataStreamSinkProvider) 3
SinkProvider (org.apache.flink.table.connector.sink.SinkProvider) 3
SinkV2Provider (org.apache.flink.table.connector.sink.SinkV2Provider) 3
RowData (org.apache.flink.table.data.RowData) 3
Instant (java.time.Instant) 2
ExecutionException (java.util.concurrent.ExecutionException) 2
Collectors (java.util.stream.Collectors) 2
RichMapFunction (org.apache.flink.api.common.functions.RichMapFunction) 2