Example 86 with DataStream

Use of org.apache.flink.streaming.api.datastream.DataStream in the Apache Flink project.

Class BlockingShuffleITCase, method createJobGraph.

private JobGraph createJobGraph(int numRecordsToSend, boolean deletePartitionFile) {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Allow a single restart attempt with no delay.
    env.setRestartStrategy(RestartStrategies.fixedDelayRestart(1, 0L));
    // -1 disables timeout-based flushing: network buffers are sent only once they are full.
    env.setBufferTimeout(-1);
    // numTaskManagers and numSlotsPerTaskManager are defined in the enclosing test class (not shown).
    env.setParallelism(numTaskManagers * numSlotsPerTaskManager);
    DataStream<String> source = env.addSource(new StringSource(numRecordsToSend));
    source.rebalance()
            .map((MapFunction<String, String>) value -> value)
            .broadcast()
            .addSink(new VerifySink(deletePartitionFile));
    StreamGraph streamGraph = env.getStreamGraph();
    streamGraph.setGlobalStreamExchangeMode(GlobalStreamExchangeMode.ALL_EDGES_BLOCKING);
    // A scheduler supporting batch jobs is required for this job graph because it contains
    // blocking data exchanges. The scheduler is selected based on the JobType.
    streamGraph.setJobType(JobType.BATCH);
    return StreamingJobGraphGenerator.createJobGraph(streamGraph);
}
Also used: Files(java.nio.file.Files), Configuration(org.apache.flink.configuration.Configuration), JobGraph(org.apache.flink.runtime.jobgraph.JobGraph), Test(org.junit.Test), IOException(java.io.IOException), StreamingJobGraphGenerator(org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator), RestartStrategies(org.apache.flink.api.common.restartstrategy.RestartStrategies), JobType(org.apache.flink.runtime.jobgraph.JobType), File(java.io.File), MapFunction(org.apache.flink.api.common.functions.MapFunction), RichSinkFunction(org.apache.flink.streaming.api.functions.sink.RichSinkFunction), DataStream(org.apache.flink.streaming.api.datastream.DataStream), NettyShuffleEnvironmentOptions(org.apache.flink.configuration.NettyShuffleEnvironmentOptions), CoreOptions(org.apache.flink.configuration.CoreOptions), StreamGraph(org.apache.flink.streaming.api.graph.StreamGraph), ParallelSourceFunction(org.apache.flink.streaming.api.functions.source.ParallelSourceFunction), GlobalStreamExchangeMode(org.apache.flink.streaming.api.graph.GlobalStreamExchangeMode), ClassRule(org.junit.ClassRule), TemporaryFolder(org.junit.rules.TemporaryFolder), Assert.assertEquals(org.junit.Assert.assertEquals), StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment)
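
The pipeline in Example 86 chains rebalance() and broadcast(), and it may help to see what each partitioning step does to the records. The following is a minimal plain-Java sketch, not Flink code: the class PartitioningSketch and its methods are hypothetical names introduced for illustration. It assumes round-robin assignment starts at channel 0, whereas Flink's own partitioner may pick a different starting channel.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified plain-Java model of rebalance and broadcast partitioning; not Flink code. */
public class PartitioningSketch {

    /** Distributes records round-robin over the given number of downstream channels. */
    public static <T> List<List<T>> rebalance(List<T> records, int channels) {
        List<List<T>> out = emptyChannels(channels);
        for (int i = 0; i < records.size(); i++) {
            out.get(i % channels).add(records.get(i));
        }
        return out;
    }

    /** Copies every record to all downstream channels. */
    public static <T> List<List<T>> broadcast(List<T> records, int channels) {
        List<List<T>> out = emptyChannels(channels);
        for (List<T> channel : out) {
            channel.addAll(records);
        }
        return out;
    }

    /** Creates one empty, mutable list per channel. */
    private static <T> List<List<T>> emptyChannels(int channels) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < channels; i++) {
            out.add(new ArrayList<>());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> records = List.of("a", "b", "c", "d", "e");
        System.out.println(rebalance(records, 2)); // [[a, c, e], [b, d]]
        System.out.println(broadcast(records, 2)); // [[a, b, c, d, e], [a, b, c, d, e]]
    }
}
```

Under this model, a broadcast edge means each sink subtask sees every record produced upstream, so the data volume on that edge grows with the sink parallelism.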

Example 87 with DataStream

Use of org.apache.flink.streaming.api.datastream.DataStream in the Apache Flink project.

Class SourceNAryInputChainingITCase, method createProgramWithUnionInput.

/**
 * Creates a DataStream program as shown below.
 *
 * <pre>
 *                                   +--------------+
 *             (src 1) --> (map) --> |              |
 *                                   |              |
 *            (src 2) --+            |    N-Ary     |
 *                      +-- UNION -> |   Operator   |
 *   (src 3) -> (map) --+            |              |
 *                                   |              |
 *                       (src 4) --> |              |
 *                                   +--------------+
 * </pre>
 */
private DataStream<Long> createProgramWithUnionInput() {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(PARALLELISM);
    env.getConfig().enableObjectReuse();
    final DataStream<Long> source1 = env.fromSource(new NumberSequenceSource(1L, 10L), WatermarkStrategy.noWatermarks(), "source-1");
    final DataStream<Long> source2 = env.fromSource(new NumberSequenceSource(11L, 20L), WatermarkStrategy.noWatermarks(), "source-2");
    final DataStream<Long> source3 = env.fromSource(new NumberSequenceSource(21L, 30L), WatermarkStrategy.noWatermarks(), "source-3");
    final DataStream<Long> source4 = env.fromSource(new NumberSequenceSource(31L, 40L), WatermarkStrategy.noWatermarks(), "source-4");
    return nAryInputStreamOperation(source1.map((v) -> v), source2.union(source3), source4);
}
Also used: MultipleInputTransformation(org.apache.flink.streaming.api.transformations.MultipleInputTransformation), NumberSequenceSource(org.apache.flink.api.connector.source.lib.NumberSequenceSource), MultipleConnectedStreams(org.apache.flink.streaming.api.datastream.MultipleConnectedStreams), JobGraph(org.apache.flink.runtime.jobgraph.JobGraph), AbstractStreamOperatorV2(org.apache.flink.streaming.api.operators.AbstractStreamOperatorV2), MiniClusterResourceConfiguration(org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration), AbstractInput(org.apache.flink.streaming.api.operators.AbstractInput), ArrayList(java.util.ArrayList), AbstractStreamOperatorFactory(org.apache.flink.streaming.api.operators.AbstractStreamOperatorFactory), ChainingStrategy(org.apache.flink.streaming.api.operators.ChainingStrategy), StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord), StreamGraph(org.apache.flink.streaming.api.graph.StreamGraph), TestLogger(org.apache.flink.util.TestLogger), Assert.fail(org.junit.Assert.fail), ClassRule(org.junit.ClassRule), Types(org.apache.flink.api.common.typeinfo.Types), MiniClusterWithClientResource(org.apache.flink.test.util.MiniClusterWithClientResource), DiscardingSink(org.apache.flink.streaming.api.functions.sink.DiscardingSink), DataStreamUtils(org.apache.flink.streaming.api.datastream.DataStreamUtils), WatermarkStrategy(org.apache.flink.api.common.eventtime.WatermarkStrategy), StreamOperatorParameters(org.apache.flink.streaming.api.operators.StreamOperatorParameters), Test(org.junit.Test), StreamingJobGraphGenerator(org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator), DataStream(org.apache.flink.streaming.api.datastream.DataStream), StreamOperator(org.apache.flink.streaming.api.operators.StreamOperator), MultipleInputStreamOperator(org.apache.flink.streaming.api.operators.MultipleInputStreamOperator), List(java.util.List), TemporaryFolder(org.junit.rules.TemporaryFolder), Assert.assertEquals(org.junit.Assert.assertEquals), StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment), Input(org.apache.flink.streaming.api.operators.Input)
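
To make the topology of Example 87 concrete, the sketch below models in plain Java which records reach each of the three logical inputs of the N-ary operator. It is not Flink code: UnionInputSketch and its methods are hypothetical names, the lists stand in for streams, and the union here simply concatenates, whereas a real union gives no ordering guarantee between its inputs. The only Flink behavior relied on is that NumberSequenceSource(from, to) emits both bounds inclusively.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;
import java.util.stream.Stream;

/** Plain-Java model of the union topology in Example 87; not Flink code. */
public class UnionInputSketch {

    /** Models NumberSequenceSource(from, to): both bounds are inclusive. */
    public static List<Long> sequence(long from, long to) {
        return LongStream.rangeClosed(from, to).boxed().collect(Collectors.toList());
    }

    /** Models union: one logical stream carrying the records of both inputs. */
    public static List<Long> union(List<Long> a, List<Long> b) {
        return Stream.concat(a.stream(), b.stream()).collect(Collectors.toList());
    }

    /** Returns the three logical inputs that reach the N-ary operator. */
    public static List<List<Long>> operatorInputs() {
        List<Long> input1 = sequence(1L, 10L);                              // src 1 after the identity map
        List<Long> input2 = union(sequence(11L, 20L), sequence(21L, 30L)); // src 2 UNION src 3
        List<Long> input3 = sequence(31L, 40L);                             // src 4
        return List.of(input1, input2, input3);
    }

    public static void main(String[] args) {
        for (List<Long> input : operatorInputs()) {
            System.out.println(input.size() + " records: "
                    + input.get(0) + ".." + input.get(input.size() - 1));
        }
    }
}
```

In this model the operator has three inputs rather than four, because the union merges two sources into a single logical input before it reaches the operator.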

Aggregations

DataStream (org.apache.flink.streaming.api.datastream.DataStream): 87
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 78
Test (org.junit.Test): 70
List (java.util.List): 62
Collector (org.apache.flink.util.Collector): 60
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 50
SingleOutputStreamOperator (org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator): 48
Arrays (java.util.Arrays): 46
ArrayList (java.util.ArrayList): 40
TypeInformation (org.apache.flink.api.common.typeinfo.TypeInformation): 40
Assert.assertEquals (org.junit.Assert.assertEquals): 38
WatermarkStrategy (org.apache.flink.api.common.eventtime.WatermarkStrategy): 36
Configuration (org.apache.flink.configuration.Configuration): 36
Assert.assertTrue (org.junit.Assert.assertTrue): 33
BasicTypeInfo (org.apache.flink.api.common.typeinfo.BasicTypeInfo): 32
StreamOperator (org.apache.flink.streaming.api.operators.StreamOperator): 32
Types (org.apache.flink.api.common.typeinfo.Types): 31
Assert (org.junit.Assert): 31
ReduceFunction (org.apache.flink.api.common.functions.ReduceFunction): 29
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph): 29