Example 1 with ValidatingExactlyOnceSink

Use of org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink in the Apache Flink project.

Class KafkaConsumerTestBase, method runOneToOneExactlyOnceTest.

/**
 * Tests the proper consumption when having a 1:1 correspondence between Kafka partitions and
 * Flink sources.
 */
public void runOneToOneExactlyOnceTest() throws Exception {
    final String topic = "oneToOneTopic";
    final int parallelism = 5;
    final int numElementsPerPartition = 1000;
    final int totalElements = parallelism * numElementsPerPartition;
    final int failAfterElements = numElementsPerPartition / 3;
    createTestTopic(topic, parallelism, 1);
    DataGenerators.generateRandomizedIntegerSequence(
        StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort),
        kafkaServer, topic, parallelism, numElementsPerPartition, true);
    // run the topology that fails and recovers
    DeserializationSchema<Integer> schema = new TypeInformationSerializationSchema<>(BasicTypeInfo.INT_TYPE_INFO, new ExecutionConfig());
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort);
    env.enableCheckpointing(500);
    env.setParallelism(parallelism);
    env.setRestartStrategy(RestartStrategies.fixedDelayRestart(1, 0));
    env.getConfig().disableSysoutLogging();
    Properties props = new Properties();
    props.putAll(standardProps);
    props.putAll(secureProps);
    FlinkKafkaConsumerBase<Integer> kafkaSource = kafkaServer.getConsumer(topic, schema, props);
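    // Source -> partition check -> one artificial failure -> exactly-once check.
    // Only the sink is set to parallelism 1, so a single instance sees every element.
    env.addSource(kafkaSource)
        .map(new PartitionValidatingMapper(parallelism, 1))
        .map(new FailingIdentityMapper<Integer>(failAfterElements))
        .addSink(new ValidatingExactlyOnceSink(totalElements))
        .setParallelism(1);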
    FailingIdentityMapper.failedBefore = false;
    tryExecute(env, "One-to-one exactly once test");
    deleteTestTopic(topic);
}
Also used : TypeInformationSerializationSchema(org.apache.flink.streaming.util.serialization.TypeInformationSerializationSchema) ValidatingExactlyOnceSink(org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink) PartitionValidatingMapper(org.apache.flink.streaming.connectors.kafka.testutils.PartitionValidatingMapper) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) Properties(java.util.Properties) TypeHint(org.apache.flink.api.common.typeinfo.TypeHint)
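
The test above passes totalElements to the sink so that it can tell when every element has arrived. The following is a minimal sketch of that kind of check, not Flink's actual ValidatingExactlyOnceSink. The class name ExactlyOnceValidator and its accept method are hypothetical. It assumes the stream carries each integer in [0, expectedElements) exactly once, and it leaves out the checkpointed state a real sink would need to survive the restart.

```java
import java.util.BitSet;

/**
 * Hypothetical stand-in for an exactly-once validating sink (not Flink's class).
 * Assumption: the input is every integer in [0, expectedElements), each once.
 */
class ExactlyOnceValidator {
    private final int expectedElements;
    private final BitSet seen;
    private int count;

    ExactlyOnceValidator(int expectedElements) {
        this.expectedElements = expectedElements;
        this.seen = new BitSet(expectedElements);
    }

    /** Records one value; returns true once every expected value has arrived. */
    boolean accept(int value) {
        if (value < 0 || value >= expectedElements) {
            throw new IllegalStateException("Unexpected value: " + value);
        }
        if (seen.get(value)) {
            // A repeated value means the element was delivered more than once.
            throw new IllegalStateException("Duplicate value: " + value);
        }
        seen.set(value);
        count++;
        return count == expectedElements;
    }

    public static void main(String[] args) {
        ExactlyOnceValidator validator = new ExactlyOnceValidator(3);
        System.out.println(validator.accept(2)); // false
        System.out.println(validator.accept(0)); // false
        System.out.println(validator.accept(1)); // true
    }
}
```

A lost element shows up as accept never returning true, and a replayed element shows up as the duplicate exception, so the two failure modes of exactly-once delivery are both covered.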

Example 2 with ValidatingExactlyOnceSink

Use of org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink in the Apache Flink project.

Class KafkaConsumerTestBase, method runOneSourceMultiplePartitionsExactlyOnceTest.

/**
 * Tests the proper consumption when having fewer Flink sources than Kafka partitions, so
 * one Flink source will read multiple Kafka partitions.
 */
public void runOneSourceMultiplePartitionsExactlyOnceTest() throws Exception {
    final String topic = "oneToManyTopic";
    final int numPartitions = 5;
    final int numElementsPerPartition = 1000;
    final int totalElements = numPartitions * numElementsPerPartition;
    final int failAfterElements = numElementsPerPartition / 3;
    final int parallelism = 2;
    createTestTopic(topic, numPartitions, 1);
    DataGenerators.generateRandomizedIntegerSequence(
        StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort),
        kafkaServer, topic, numPartitions, numElementsPerPartition, false);
    // run the topology that fails and recovers
    DeserializationSchema<Integer> schema = new TypeInformationSerializationSchema<>(BasicTypeInfo.INT_TYPE_INFO, new ExecutionConfig());
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort);
    env.enableCheckpointing(500);
    env.setParallelism(parallelism);
    env.setRestartStrategy(RestartStrategies.fixedDelayRestart(1, 0));
    env.getConfig().disableSysoutLogging();
    Properties props = new Properties();
    props.putAll(standardProps);
    props.putAll(secureProps);
    FlinkKafkaConsumerBase<Integer> kafkaSource = kafkaServer.getConsumer(topic, schema, props);
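    // Source -> partition check -> one artificial failure -> exactly-once check.
    // Only the sink is set to parallelism 1, so a single instance sees every element.
    env.addSource(kafkaSource)
        .map(new PartitionValidatingMapper(numPartitions, 3))
        .map(new FailingIdentityMapper<Integer>(failAfterElements))
        .addSink(new ValidatingExactlyOnceSink(totalElements))
        .setParallelism(1);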
    FailingIdentityMapper.failedBefore = false;
    tryExecute(env, "One-source-multi-partitions exactly once test");
    deleteTestTopic(topic);
}
Also used : TypeInformationSerializationSchema(org.apache.flink.streaming.util.serialization.TypeInformationSerializationSchema) ValidatingExactlyOnceSink(org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink) PartitionValidatingMapper(org.apache.flink.streaming.connectors.kafka.testutils.PartitionValidatingMapper) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) Properties(java.util.Properties) TypeHint(org.apache.flink.api.common.typeinfo.TypeHint)
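
The second argument 3 passed to PartitionValidatingMapper in this test matches the largest number of partitions one source can receive when 5 partitions are spread over 2 sources. The arithmetic can be sketched as below. The class PartitionAssignmentSketch is hypothetical, and the round-robin scheme (partition p goes to source p % numSources) is an assumption made for illustration, since the connector's real assignment may differ by version.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that illustrates partition-to-source arithmetic. */
class PartitionAssignmentSketch {

    /** Assumed round-robin scheme: partition p is read by source p % numSources. */
    static List<List<Integer>> assign(int numPartitions, int numSources) {
        List<List<Integer>> bySource = new ArrayList<>();
        for (int s = 0; s < numSources; s++) {
            bySource.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            bySource.get(p % numSources).add(p);
        }
        return bySource;
    }

    /** Largest number of partitions that any single source ends up reading. */
    static int maxPartitionsPerSource(int numPartitions, int numSources) {
        int max = 0;
        for (List<Integer> partitions : assign(numPartitions, numSources)) {
            max = Math.max(max, partitions.size());
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(assign(5, 2));                 // [[0, 2, 4], [1, 3]]
        System.out.println(maxPartitionsPerSource(5, 2)); // 3
    }
}
```

Under the same assumption, 5 partitions over 8 sources gives a maximum of 1 partition per source and leaves 3 sources with nothing to read.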

Example 3 with ValidatingExactlyOnceSink

Use of org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink in the Apache Flink project.

Class KafkaConsumerTestBase, method runBrokerFailureTest.

public void runBrokerFailureTest() throws Exception {
    final String topic = "brokerFailureTestTopic";
    final int parallelism = 2;
    final int numElementsPerPartition = 1000;
    final int totalElements = parallelism * numElementsPerPartition;
    final int failAfterElements = numElementsPerPartition / 3;
    createTestTopic(topic, parallelism, 2);
    DataGenerators.generateRandomizedIntegerSequence(
        StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort),
        kafkaServer, topic, parallelism, numElementsPerPartition, true);
    // find leader to shut down
    int leaderId = kafkaServer.getLeaderToShutDown(topic);
    LOG.info("Leader to shutdown {}", leaderId);
    // run the topology (the consumers must handle the failures)
    DeserializationSchema<Integer> schema = new TypeInformationSerializationSchema<>(BasicTypeInfo.INT_TYPE_INFO, new ExecutionConfig());
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort);
    env.setParallelism(parallelism);
    env.enableCheckpointing(500);
    env.setRestartStrategy(RestartStrategies.noRestart());
    env.getConfig().disableSysoutLogging();
    Properties props = new Properties();
    props.putAll(standardProps);
    props.putAll(secureProps);
    FlinkKafkaConsumerBase<Integer> kafkaSource = kafkaServer.getConsumer(topic, schema, props);
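    // Source -> partition check -> kill the leader broker once -> exactly-once check.
    // Only the sink is set to parallelism 1, so a single instance sees every element.
    env.addSource(kafkaSource)
        .map(new PartitionValidatingMapper(parallelism, 1))
        .map(new BrokerKillingMapper<Integer>(leaderId, failAfterElements))
        .addSink(new ValidatingExactlyOnceSink(totalElements))
        .setParallelism(1);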
    BrokerKillingMapper.killedLeaderBefore = false;
    tryExecute(env, "Broker failure once test");
    // start a new broker:
    kafkaServer.restartBroker(leaderId);
}
Also used : TypeInformationSerializationSchema(org.apache.flink.streaming.util.serialization.TypeInformationSerializationSchema) ValidatingExactlyOnceSink(org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink) PartitionValidatingMapper(org.apache.flink.streaming.connectors.kafka.testutils.PartitionValidatingMapper) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) Properties(java.util.Properties) TypeHint(org.apache.flink.api.common.typeinfo.TypeHint)
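
The tests reset a static flag (FailingIdentityMapper.failedBefore, BrokerKillingMapper.killedLeaderBefore) before each run, which suggests the mappers use it to trigger their disruption only once per test. A minimal sketch of that pattern follows. The class ActOnceAfterN and its members are hypothetical and simplified, and the real mappers may apply extra conditions before acting.

```java
import java.util.function.IntUnaryOperator;

/**
 * Hypothetical identity mapper that runs an action once after a threshold.
 * The static flag outlives individual instances, so an instance created
 * after a restart does not repeat the action.
 */
class ActOnceAfterN implements IntUnaryOperator {
    static volatile boolean actedBefore = false;

    private final int threshold;
    private final Runnable action;
    private int seen;

    ActOnceAfterN(int threshold, Runnable action) {
        this.threshold = threshold;
        this.action = action;
    }

    @Override
    public int applyAsInt(int value) {
        seen++;
        if (!actedBefore && seen >= threshold) {
            actedBefore = true; // set first so the action cannot fire twice
            action.run();
        }
        return value; // identity: elements pass through unchanged
    }

    public static void main(String[] args) {
        int[] runs = {0};
        ActOnceAfterN.actedBefore = false; // reset, as the tests do before executing

        ActOnceAfterN first = new ActOnceAfterN(2, () -> runs[0]++);
        first.applyAsInt(10);
        first.applyAsInt(11); // threshold reached, action runs
        first.applyAsInt(12);

        // A fresh instance stands in for the mapper after a restart.
        ActOnceAfterN second = new ActOnceAfterN(2, () -> runs[0]++);
        second.applyAsInt(10);
        second.applyAsInt(11);

        System.out.println(runs[0]); // 1
    }
}
```

Because the flag is static, it has to be reset explicitly before each test, otherwise a later test in the same JVM would never see the disruption.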

Example 4 with ValidatingExactlyOnceSink

Use of org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink in the Apache Flink project.

Class KafkaConsumerTestBase, method runMultipleSourcesOnePartitionExactlyOnceTest.

/**
 * Tests the proper consumption when having more Flink sources than Kafka partitions, which means
 * that some Flink sources will read no partitions.
 */
public void runMultipleSourcesOnePartitionExactlyOnceTest() throws Exception {
    final String topic = "manyToOneTopic";
    final int numPartitions = 5;
    final int numElementsPerPartition = 1000;
    final int totalElements = numPartitions * numElementsPerPartition;
    final int failAfterElements = numElementsPerPartition / 3;
    final int parallelism = 8;
    createTestTopic(topic, numPartitions, 1);
    DataGenerators.generateRandomizedIntegerSequence(
        StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort),
        kafkaServer, topic, numPartitions, numElementsPerPartition, true);
    // run the topology that fails and recovers
    DeserializationSchema<Integer> schema = new TypeInformationSerializationSchema<>(BasicTypeInfo.INT_TYPE_INFO, new ExecutionConfig());
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment("localhost", flinkPort);
    env.enableCheckpointing(500);
    env.setParallelism(parallelism);
    // set the number of restarts to one. The failing mapper will fail once, then it's only success exceptions.
    env.setRestartStrategy(RestartStrategies.fixedDelayRestart(1, 0));
    env.getConfig().disableSysoutLogging();
    env.setBufferTimeout(0);
    Properties props = new Properties();
    props.putAll(standardProps);
    props.putAll(secureProps);
    FlinkKafkaConsumerBase<Integer> kafkaSource = kafkaServer.getConsumer(topic, schema, props);
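    // Source -> partition check -> one artificial failure -> exactly-once check.
    // Only the sink is set to parallelism 1, so a single instance sees every element.
    env.addSource(kafkaSource)
        .map(new PartitionValidatingMapper(numPartitions, 1))
        .map(new FailingIdentityMapper<Integer>(failAfterElements))
        .addSink(new ValidatingExactlyOnceSink(totalElements))
        .setParallelism(1);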
    FailingIdentityMapper.failedBefore = false;
    tryExecute(env, "multi-source-one-partitions exactly once test");
    deleteTestTopic(topic);
}
Also used : TypeInformationSerializationSchema(org.apache.flink.streaming.util.serialization.TypeInformationSerializationSchema) ValidatingExactlyOnceSink(org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink) PartitionValidatingMapper(org.apache.flink.streaming.connectors.kafka.testutils.PartitionValidatingMapper) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) Properties(java.util.Properties) TypeHint(org.apache.flink.api.common.typeinfo.TypeHint)

Aggregations

Properties (java.util.Properties): 4
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig): 4
TypeHint (org.apache.flink.api.common.typeinfo.TypeHint): 4
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 4
PartitionValidatingMapper (org.apache.flink.streaming.connectors.kafka.testutils.PartitionValidatingMapper): 4
ValidatingExactlyOnceSink (org.apache.flink.streaming.connectors.kafka.testutils.ValidatingExactlyOnceSink): 4
TypeInformationSerializationSchema (org.apache.flink.streaming.util.serialization.TypeInformationSerializationSchema): 4