Use of org.apache.flink.api.common.functions.RichFlatMapFunction in project flink by apache.
From the class KafkaConsumerTestBase: method readSequence.
// ------------------------------------------------------------------------
// Reading and writing test data sets
// ------------------------------------------------------------------------
/**
 * Runs a job using the provided environment to read a sequence of records from a single Kafka topic.
 * The expected starting offset and total value count can be specified individually for each partition.
 * The job is considered successful only if the values read from every partition match its expected start offset and value count.
 */
protected void readSequence(
        final StreamExecutionEnvironment env,
        final StartupMode startupMode,
        final Map<KafkaTopicPartition, Long> specificStartupOffsets,
        final Properties cc,
        final String topicName,
        final Map<Integer, Tuple2<Integer, Integer>> partitionsToValuesCountAndStartOffset) throws Exception {

    final int sourceParallelism = partitionsToValuesCountAndStartOffset.keySet().size();

    int finalCountTmp = 0;
    for (Map.Entry<Integer, Tuple2<Integer, Integer>> valuesCountAndStartOffset : partitionsToValuesCountAndStartOffset.entrySet()) {
        finalCountTmp += valuesCountAndStartOffset.getValue().f0;
    }
    final int finalCount = finalCountTmp;

    final TypeInformation<Tuple2<Integer, Integer>> intIntTupleType = TypeInfoParser.parse("Tuple2<Integer, Integer>");
    final TypeInformationSerializationSchema<Tuple2<Integer, Integer>> deser =
            new TypeInformationSerializationSchema<>(intIntTupleType, env.getConfig());

    // create the consumer
    cc.putAll(secureProps);
    FlinkKafkaConsumerBase<Tuple2<Integer, Integer>> consumer = kafkaServer.getConsumer(topicName, deser, cc);
    switch (startupMode) {
        case EARLIEST:
            consumer.setStartFromEarliest();
            break;
        case LATEST:
            consumer.setStartFromLatest();
            break;
        case SPECIFIC_OFFSETS:
            consumer.setStartFromSpecificOffsets(specificStartupOffsets);
            break;
        case GROUP_OFFSETS:
            consumer.setStartFromGroupOffsets();
            break;
    }
    DataStream<Tuple2<Integer, Integer>> source = env
            .addSource(consumer).setParallelism(sourceParallelism)
            .map(new ThrottledMapper<Tuple2<Integer, Integer>>(20)).setParallelism(sourceParallelism);

    // verify data
    source.flatMap(new RichFlatMapFunction<Tuple2<Integer, Integer>, Integer>() {

        private HashMap<Integer, BitSet> partitionsToValueCheck;
        private int count = 0;

        @Override
        public void open(Configuration parameters) throws Exception {
            partitionsToValueCheck = new HashMap<>();
            for (Integer partition : partitionsToValuesCountAndStartOffset.keySet()) {
                partitionsToValueCheck.put(partition, new BitSet());
            }
        }

        @Override
        public void flatMap(Tuple2<Integer, Integer> value, Collector<Integer> out) throws Exception {
            int partition = value.f0;
            int val = value.f1;
            BitSet bitSet = partitionsToValueCheck.get(partition);
            if (bitSet == null) {
                throw new RuntimeException("Got a record from an unknown partition");
            } else {
                bitSet.set(val - partitionsToValuesCountAndStartOffset.get(partition).f1);
            }
            count++;
            LOG.info("Received message {}, total {} messages", value, count);

            // verify if we've seen everything
            if (count == finalCount) {
                for (Map.Entry<Integer, BitSet> partitionsToValueCheck : this.partitionsToValueCheck.entrySet()) {
                    BitSet check = partitionsToValueCheck.getValue();
                    int expectedValueCount = partitionsToValuesCountAndStartOffset.get(partitionsToValueCheck.getKey()).f0;

                    if (check.cardinality() != expectedValueCount) {
                        throw new RuntimeException("Expected cardinality to be " + expectedValueCount + ", but was " + check.cardinality());
                    } else if (check.nextClearBit(0) != expectedValueCount) {
                        throw new RuntimeException("Expected next clear bit to be " + expectedValueCount + ", but was " + check.nextClearBit(0));
                    }
                }

                // test has passed
                throw new SuccessException();
            }
        }
    }).setParallelism(1);
    tryExecute(env, "Read data from Kafka");

    LOG.info("Successfully read sequence for verification");
}
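A hypothetical invocation of readSequence might look like the following sketch. The topic name, partition layout, and the standardProps properties object are assumptions for illustration, not code taken from the actual test suite.

// Hypothetical usage sketch (not from the Flink test suite): read 50 values
// from each of two partitions of "testTopic", both starting at offset 0,
// beginning from the earliest available offsets.
Map<Integer, Tuple2<Integer, Integer>> partitionsToValuesCountAndStartOffset = new HashMap<>();
partitionsToValuesCountAndStartOffset.put(0, new Tuple2<>(50, 0)); // partition 0: 50 values from offset 0
partitionsToValuesCountAndStartOffset.put(1, new Tuple2<>(50, 0)); // partition 1: 50 values from offset 0

Properties readProps = new Properties();
readProps.putAll(standardProps); // assumed harness-provided Kafka connection settings

readSequence(
        StreamExecutionEnvironment.getExecutionEnvironment(),
        StartupMode.EARLIEST,
        null, // specific startup offsets are only consulted in SPECIFIC_OFFSETS mode
        readProps,
        "testTopic",
        partitionsToValuesCountAndStartOffset);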
Use of org.apache.flink.api.common.functions.RichFlatMapFunction in project flink by apache.
From the class SavepointITCase: method testSavepointForJobWithIteration.
@Test
public void testSavepointForJobWithIteration() throws Exception {
    for (int i = 0; i < ITER_TEST_PARALLELISM; ++i) {
        ITER_TEST_SNAPSHOT_WAIT[i] = new OneShotLatch();
        ITER_TEST_RESTORE_WAIT[i] = new OneShotLatch();
        ITER_TEST_CHECKPOINT_VERIFY[i] = 0;
    }

    TemporaryFolder folder = new TemporaryFolder();
    folder.create();

    // Temporary directory for file state backend
    final File tmpDir = folder.newFolder();

    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    final IntegerStreamSource source = new IntegerStreamSource();
    IterativeStream<Integer> iteration = env.addSource(source)
            .flatMap(new RichFlatMapFunction<Integer, Integer>() {

                private static final long serialVersionUID = 1L;

                @Override
                public void flatMap(Integer in, Collector<Integer> clctr) throws Exception {
                    clctr.collect(in);
                }
            }).setParallelism(ITER_TEST_PARALLELISM)
            .keyBy(new KeySelector<Integer, Object>() {

                private static final long serialVersionUID = 1L;

                @Override
                public Object getKey(Integer value) throws Exception {
                    return value;
                }
            })
            .flatMap(new DuplicateFilter()).setParallelism(ITER_TEST_PARALLELISM)
            .iterate();
    DataStream<Integer> iterationBody = iteration.map(new MapFunction<Integer, Integer>() {

        private static final long serialVersionUID = 1L;

        @Override
        public Integer map(Integer value) throws Exception {
            return value;
        }
    }).setParallelism(ITER_TEST_PARALLELISM);

    iteration.closeWith(iterationBody);
    StreamGraph streamGraph = env.getStreamGraph();
    streamGraph.setJobName("Test");

    JobGraph jobGraph = streamGraph.getJobGraph();

    Configuration config = new Configuration();
    config.addAll(jobGraph.getJobConfiguration());
    config.setLong(ConfigConstants.TASK_MANAGER_MEMORY_SIZE_KEY, -1L);
    config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 2 * jobGraph.getMaximumParallelism());

    final File checkpointDir = new File(tmpDir, "checkpoints");
    final File savepointDir = new File(tmpDir, "savepoints");

    if (!checkpointDir.mkdir() || !savepointDir.mkdirs()) {
        fail("Test setup failed: failed to create temporary directories.");
    }

    config.setString(CoreOptions.STATE_BACKEND, "filesystem");
    config.setString(FsStateBackendFactory.CHECKPOINT_DIRECTORY_URI_CONF_KEY, checkpointDir.toURI().toString());
    config.setString(FsStateBackendFactory.MEMORY_THRESHOLD_CONF_KEY, "0");
    config.setString(ConfigConstants.SAVEPOINT_DIRECTORY_KEY, savepointDir.toURI().toString());
    TestingCluster cluster = new TestingCluster(config, false);
    String savepointPath = null;
    try {
        cluster.start();
        cluster.submitJobDetached(jobGraph);

        for (OneShotLatch latch : ITER_TEST_SNAPSHOT_WAIT) {
            latch.await();
        }

        savepointPath = cluster.triggerSavepoint(jobGraph.getJobID());
        source.cancel();

        jobGraph = streamGraph.getJobGraph();
        jobGraph.setSavepointRestoreSettings(SavepointRestoreSettings.forPath(savepointPath));
        cluster.submitJobDetached(jobGraph);

        for (OneShotLatch latch : ITER_TEST_RESTORE_WAIT) {
            latch.await();
        }

        source.cancel();
    } finally {
        if (null != savepointPath) {
            cluster.disposeSavepoint(savepointPath);
        }
        cluster.stop();
        cluster.awaitTermination();
    }
}
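The DuplicateFilter applied before iterate() above is defined elsewhere in SavepointITCase and is not part of this excerpt. As a rough idea of what such a stateful filter can look like, here is a minimal sketch that emits each key only once, assuming Flink's keyed ValueState API; the class name, state name, and implementation details are illustrative, not the actual test class.

// Illustrative sketch only; the real DuplicateFilter in SavepointITCase may differ.
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class DuplicateFilterSketch extends RichFlatMapFunction<Integer, Integer> {

    private static final long serialVersionUID = 1L;

    // Keyed state: set once the current key has been emitted.
    private transient ValueState<Boolean> seen;

    @Override
    public void open(Configuration parameters) {
        seen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("seen", Boolean.class));
    }

    @Override
    public void flatMap(Integer value, Collector<Integer> out) throws Exception {
        // Forward a value only the first time its key is encountered.
        if (seen.value() == null) {
            seen.update(true);
            out.collect(value);
        }
    }
}

Because the function runs on a keyed stream, each parallel instance keeps one "seen" flag per key, which is exactly the state a savepoint must capture and restore for the test to be meaningful.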
Use of org.apache.flink.api.common.functions.RichFlatMapFunction in project flink by apache.
From the class MiscellaneousIssuesITCase: method testAccumulatorsAfterNoOp.
@Test
public void testAccumulatorsAfterNoOp() {
    final String ACC_NAME = "test_accumulator";
    try {
        ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment("localhost", cluster.getLeaderRPCPort());
        env.setParallelism(6);
        env.getConfig().disableSysoutLogging();

        env.generateSequence(1, 1000000).rebalance().flatMap(new RichFlatMapFunction<Long, Long>() {

            private LongCounter counter;

            @Override
            public void open(Configuration parameters) {
                counter = getRuntimeContext().getLongCounter(ACC_NAME);
            }

            @Override
            public void flatMap(Long value, Collector<Long> out) {
                counter.add(1L);
            }
        }).output(new DiscardingOutputFormat<Long>());

        JobExecutionResult result = env.execute();
        assertEquals(1000000L, result.getAllAccumulatorResults().get(ACC_NAME));
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
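All three usages share the same lifecycle pattern: initialize per-task resources in open(), then emit zero or more records per input through the Collector in flatMap(). The following self-contained sketch condenses that pattern into a runnable local job; the class name, accumulator name, and even/odd filtering logic are illustrative assumptions, not code from the Flink test suite.

import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.DiscardingOutputFormat;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class RichFlatMapAccumulatorExample {

    public static void main(String[] args) throws Exception {
        // Local environment instead of the remote test cluster used above.
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        env.generateSequence(1, 1000).flatMap(new RichFlatMapFunction<Long, Long>() {

            private LongCounter evenCounter;

            @Override
            public void open(Configuration parameters) {
                // Register the accumulator once per parallel task instance.
                evenCounter = getRuntimeContext().getLongCounter("even-values");
            }

            @Override
            public void flatMap(Long value, Collector<Long> out) {
                // Emit only even values and count them.
                if (value % 2 == 0) {
                    evenCounter.add(1L);
                    out.collect(value);
                }
            }
        }).output(new DiscardingOutputFormat<Long>());

        JobExecutionResult result = env.execute("accumulator example");
        System.out.println("even values: " + result.getAccumulatorResult("even-values"));
    }
}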