Search in sources:

Example 1 with ConsumerConfig

Use of org.apache.kafka.clients.consumer.ConsumerConfig in the kafka-streams-examples project by confluentinc.

From the class StreamToTableJoinIntegrationTest, method shouldCountClicksPerRegion.

@Test
public void shouldCountClicksPerRegion() throws Exception {
    // Input 1: Clicks per user (multiple records allowed per user).
    List<KeyValue<String, Long>> userClicks = Arrays.asList(
        new KeyValue<>("alice", 13L),
        new KeyValue<>("bob", 4L),
        new KeyValue<>("chao", 25L),
        new KeyValue<>("bob", 19L),
        new KeyValue<>("dave", 56L),
        new KeyValue<>("eve", 78L),
        new KeyValue<>("alice", 40L),
        new KeyValue<>("fang", 99L));
    // Input 2: Region per user (multiple records allowed per user).
    List<KeyValue<String, String>> userRegions = Arrays.asList(
        new KeyValue<>("alice", "asia"), /* Alice lived in Asia originally... */
        new KeyValue<>("bob", "americas"),
        new KeyValue<>("chao", "asia"),
        new KeyValue<>("dave", "europe"),
        new KeyValue<>("alice", "europe"), /* ...but moved to Europe some time later. */
        new KeyValue<>("eve", "americas"),
        new KeyValue<>("fang", "asia"));
    List<KeyValue<String, Long>> expectedClicksPerRegion = Arrays.asList(
        new KeyValue<>("americas", 101L),
        new KeyValue<>("europe", 109L),
        new KeyValue<>("asia", 124L));
    // 
    // Step 1: Configure and start the processor topology.
    // 
    final Serde<String> stringSerde = Serdes.String();
    final Serde<Long> longSerde = Serdes.Long();
    Properties streamsConfiguration = new Properties();
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "stream-table-join-lambda-integration-test");
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    // The commit interval for flushing records to state stores and downstream must be lower than
    // this integration test's timeout (30 secs) to ensure we observe the expected processing results.
    streamsConfiguration.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10 * 1000);
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    // Use a temporary directory for storing state, which will be automatically removed after the test.
    streamsConfiguration.put(StreamsConfig.STATE_DIR_CONFIG, TestUtils.tempDirectory().getAbsolutePath());
    StreamsBuilder builder = new StreamsBuilder();
    // This KStream contains information such as "alice" -> 13L.
    // 
    // Because this is a KStream ("record stream"), multiple records for the same user will be
    // considered as separate click-count events, each of which will be added to the total count.
    KStream<String, Long> userClicksStream = builder.stream(userClicksTopic, Consumed.with(stringSerde, longSerde));
    // This KTable contains information such as "alice" -> "europe".
    // 
    // Because this is a KTable ("changelog stream"), only the latest value (here: region) for a
    // record key will be considered at the time when a new user-click record (see above) is
    // received for the `leftJoin` below.  Any previous region values are considered out of
    // date.  This behavior is quite different from the KStream of user clicks above.
    // 
    // For example, the user "alice" will be considered to live in "europe" (although originally she
    // lived in "asia") because, at the time her first user-click record is being received and
    // subsequently processed in the `leftJoin`, the latest region update for "alice" is "europe"
    // (which overrides her previous region value of "asia").
    KTable<String, String> userRegionsTable = builder.table(userRegionsTopic);
    // Compute the number of clicks per region, e.g. "europe" -> 13L.
    // 
    // The resulting KTable is continuously being updated as new data records are arriving in the
    // input KStream `userClicksStream` and input KTable `userRegionsTable`.
    KTable<String, Long> clicksPerRegion = userClicksStream
        .leftJoin(userRegionsTable, (clicks, region) ->
            new RegionWithClicks(region == null ? "UNKNOWN" : region, clicks))
        .map((user, regionWithClicks) ->
            new KeyValue<>(regionWithClicks.getRegion(), regionWithClicks.getClicks()))
        .groupByKey(Serialized.with(stringSerde, longSerde))
        .reduce((firstClicks, secondClicks) -> firstClicks + secondClicks);
    // Write the (continuously updating) results to the output topic.
    clicksPerRegion.toStream().to(outputTopic, Produced.with(stringSerde, longSerde));
    KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
    streams.start();
    // 
    // Step 2: Publish user-region information.
    // 
    // To keep this code example simple and easier to understand/reason about, we publish all
    // user-region records before any user-click records (cf. step 3).  In practice though,
    // data records would typically be arriving concurrently in both input streams/topics.
    Properties userRegionsProducerConfig = new Properties();
    userRegionsProducerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    userRegionsProducerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    userRegionsProducerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    userRegionsProducerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    userRegionsProducerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    IntegrationTestUtils.produceKeyValuesSynchronously(userRegionsTopic, userRegions, userRegionsProducerConfig);
    // 
    // Step 3: Publish some user click events.
    // 
    Properties userClicksProducerConfig = new Properties();
    userClicksProducerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    userClicksProducerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    userClicksProducerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    userClicksProducerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    userClicksProducerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, LongSerializer.class);
    IntegrationTestUtils.produceKeyValuesSynchronously(userClicksTopic, userClicks, userClicksProducerConfig);
    // 
    // Step 4: Verify the application's output data.
    // 
    Properties consumerConfig = new Properties();
    consumerConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerConfig.put(ConsumerConfig.GROUP_ID_CONFIG, "join-lambda-integration-test-standard-consumer");
    consumerConfig.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    consumerConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    consumerConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class);
    List<KeyValue<String, Long>> actualClicksPerRegion = IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(consumerConfig, outputTopic, expectedClicksPerRegion.size());
    streams.close();
    assertThat(actualClicksPerRegion).containsExactlyElementsOf(expectedClicksPerRegion);
}
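
A note on the grouping step: `Serialized.with(...)` as used above was later deprecated in favor of `Grouped.with(...)` (Kafka Streams 2.1+). The following is a minimal sketch, not taken from the project above, of the same stream-table join written against the newer API; the topic names and the class name are illustrative assumptions:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class StreamTableJoinSketch {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        // Illustrative topic names, not taken from the test above.
        KStream<String, Long> clicks =
            builder.stream("user-clicks", Consumed.with(Serdes.String(), Serdes.Long()));
        KTable<String, String> regions =
            builder.table("user-regions", Consumed.with(Serdes.String(), Serdes.String()));

        clicks
            // Join each click against the user's latest known region; "UNKNOWN" if none yet.
            .leftJoin(regions, (clickCount, region) ->
                KeyValue.pair(region == null ? "UNKNOWN" : region, clickCount))
            // Re-key by region so the aggregation below is per region, not per user.
            .map((user, regionAndClicks) -> regionAndClicks)
            .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
            .reduce(Long::sum)
            .toStream()
            .to("clicks-per-region", Produced.with(Serdes.String(), Serdes.Long()));

        // Print the wiring; starting it against a broker would look like the test above,
        // i.e. new KafkaStreams(builder.build(), streamsConfiguration).start().
        System.out.println(builder.build().describe());
    }
}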

Example 2 with ConsumerConfig

Use of org.apache.kafka.clients.consumer.ConsumerConfig in the kafka-streams-examples project by confluentinc.

From the class WordCountLambdaIntegrationTest, method shouldCountWords.

@Test
public void shouldCountWords() throws Exception {
    List<String> inputValues = Arrays.asList("Hello Kafka Streams", "All streams lead to Kafka", "Join Kafka Summit", "И теперь пошли русские слова");
    List<KeyValue<String, Long>> expectedWordCounts = Arrays.asList(
        new KeyValue<>("hello", 1L),
        new KeyValue<>("all", 1L),
        new KeyValue<>("streams", 2L),
        new KeyValue<>("lead", 1L),
        new KeyValue<>("to", 1L),
        new KeyValue<>("join", 1L),
        new KeyValue<>("kafka", 3L),
        new KeyValue<>("summit", 1L),
        new KeyValue<>("и", 1L),
        new KeyValue<>("теперь", 1L),
        new KeyValue<>("пошли", 1L),
        new KeyValue<>("русские", 1L),
        new KeyValue<>("слова", 1L));
    // 
    // Step 1: Configure and start the processor topology.
    // 
    final Serde<String> stringSerde = Serdes.String();
    final Serde<Long> longSerde = Serdes.Long();
    Properties streamsConfiguration = new Properties();
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-lambda-integration-test");
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    // The commit interval for flushing records to state stores and downstream must be lower than
    // this integration test's timeout (30 secs) to ensure we observe the expected processing results.
    streamsConfiguration.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10 * 1000);
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    // Use a temporary directory for storing state, which will be automatically removed after the test.
    streamsConfiguration.put(StreamsConfig.STATE_DIR_CONFIG, TestUtils.tempDirectory().getAbsolutePath());
    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> textLines = builder.stream(inputTopic);
    Pattern pattern = Pattern.compile("\\W+", Pattern.UNICODE_CHARACTER_CLASS);
    KTable<String, Long> wordCounts = textLines
        .flatMapValues(value -> Arrays.asList(pattern.split(value.toLowerCase())))
        .groupBy((key, word) -> word)
        .count();
    wordCounts.toStream().to(outputTopic, Produced.with(stringSerde, longSerde));
    KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
    streams.start();
    // 
    // Step 2: Produce some input data to the input topic.
    // 
    Properties producerConfig = new Properties();
    producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    producerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    producerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    producerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    producerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    IntegrationTestUtils.produceValuesSynchronously(inputTopic, inputValues, producerConfig);
    // 
    // Step 3: Verify the application's output data.
    // 
    Properties consumerConfig = new Properties();
    consumerConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerConfig.put(ConsumerConfig.GROUP_ID_CONFIG, "wordcount-lambda-integration-test-standard-consumer");
    consumerConfig.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    consumerConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    consumerConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class);
    List<KeyValue<String, Long>> actualWordCounts = IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(consumerConfig, outputTopic, expectedWordCounts.size());
    streams.close();
    assertThat(actualWordCounts).containsExactlyElementsOf(expectedWordCounts);
}
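
For iterating on topology logic without an embedded Kafka cluster, the same kind of word-count topology can also be exercised with TopologyTestDriver. The following is a minimal sketch, assuming Kafka Streams 2.4+ (where TestInputTopic/TestOutputTopic are available); the topic names and class name are illustrative and not taken from the project above:

import java.util.Arrays;
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Produced;

public class WordCountTopologySketch {

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("\\W+", Pattern.UNICODE_CHARACTER_CLASS);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("lines", Consumed.with(Serdes.String(), Serdes.String()))
            .flatMapValues(line -> Arrays.asList(pattern.split(line.toLowerCase())))
            .groupBy((key, word) -> word, Grouped.with(Serdes.String(), Serdes.String()))
            .count()
            .toStream()
            .to("counts", Produced.with(Serdes.String(), Serdes.Long()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-topology-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted by the driver

        try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
            TestInputTopic<String, String> input =
                driver.createInputTopic("lines", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, Long> output =
                driver.createOutputTopic("counts", new StringDeserializer(), new LongDeserializer());

            input.pipeInput("Hello Kafka Streams");
            input.pipeInput("All streams lead to Kafka");

            // readKeyValuesToMap() keeps only the latest count per word.
            System.out.println(output.readKeyValuesToMap());
        }
    }
}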

Example 3 with ConsumerConfig

Use of org.apache.kafka.clients.consumer.ConsumerConfig in the kafka-streams-examples project by confluentinc.

From the class EventDeduplicationLambdaIntegrationTest, method shouldRemoveDuplicatesFromTheInput.

@Test
public void shouldRemoveDuplicatesFromTheInput() throws Exception {
    // e.g. "4ff3cb44-abcb-46e3-8f9a-afb7cc74fbb8"
    String firstId = UUID.randomUUID().toString();
    String secondId = UUID.randomUUID().toString();
    String thirdId = UUID.randomUUID().toString();
    List<String> inputValues = Arrays.asList(firstId, secondId, firstId, firstId, secondId, thirdId, thirdId, firstId, secondId);
    List<String> expectedValues = Arrays.asList(firstId, secondId, thirdId);
    // 
    // Step 1: Configure and start the processor topology.
    // 
    StreamsBuilder builder = new StreamsBuilder();
    Properties streamsConfiguration = new Properties();
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "deduplication-lambda-integration-test");
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass().getName());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    // The commit interval for flushing records to state stores and downstream must be lower than
    // this integration test's timeout (30 secs) to ensure we observe the expected processing results.
    streamsConfiguration.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, TimeUnit.SECONDS.toMillis(10));
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    // Use a temporary directory for storing state, which will be automatically removed after the test.
    streamsConfiguration.put(StreamsConfig.STATE_DIR_CONFIG, TestUtils.tempDirectory().getAbsolutePath());
    // How long we "remember" an event.  During this time, any incoming duplicates of the event
    // will be, well, dropped, thereby de-duplicating the input data.
    // 
    // The actual value depends on your use case.  To reduce memory and disk usage, you could
    // decrease the size to purge old windows more frequently at the cost of potentially missing out
    // on de-duplicating late-arriving records.
    long maintainDurationPerEventInMs = TimeUnit.MINUTES.toMillis(10);
    // The number of segments has no impact on "correctness".
    // Using more segments implies larger overhead but allows for more fine-grained record expiration.
    // Note: the specified retention time is a _minimum_ time span, not a strict upper bound.
    int numberOfSegments = 3;
    // retention period must be at least window size -- for this use case, we don't need a longer retention period
    // and thus just use the window size as retention time
    long retentionPeriod = maintainDurationPerEventInMs;
    StoreBuilder<WindowStore<String, Long>> dedupStoreBuilder = Stores.windowStoreBuilder(
        Stores.persistentWindowStore(storeName, retentionPeriod, numberOfSegments,
            maintainDurationPerEventInMs, false),
        Serdes.String(),
        Serdes.Long());
    builder.addStateStore(dedupStoreBuilder);
    KStream<byte[], String> input = builder.stream(inputTopic);
    KStream<byte[], String> deduplicated = input.transform(
        // In this example the record value itself serves as the unique event ID used for
        // de-duplication; if your records are shaped differently, adapt the ID-extractor
        // function as needed.
        () -> new DeduplicationTransformer<>(maintainDurationPerEventInMs, (key, value) -> value),
        storeName);
    deduplicated.to(outputTopic);
    KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
    streams.start();
    // 
    // Step 2: Produce some input data to the input topic.
    // 
    Properties producerConfig = new Properties();
    producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    producerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    producerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    producerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
    producerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    IntegrationTestUtils.produceValuesSynchronously(inputTopic, inputValues, producerConfig);
    // 
    // Step 3: Verify the application's output data.
    // 
    Properties consumerConfig = new Properties();
    consumerConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerConfig.put(ConsumerConfig.GROUP_ID_CONFIG, "deduplication-integration-test-standard-consumer");
    consumerConfig.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    consumerConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
    consumerConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    List<String> actualValues = IntegrationTestUtils.waitUntilMinValuesRecordsReceived(consumerConfig, outputTopic, expectedValues.size());
    streams.close();
    assertThat(actualValues).containsExactlyElementsOf(expectedValues);
}
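
A note on the store definition: the `Stores.persistentWindowStore` overload used above (long-based retention, an explicit segment count, long-based window size) was later deprecated. Below is a minimal sketch, assuming Kafka Streams 2.1+, of the same de-duplication store declared via the Duration-based overload, which no longer takes a segment count; the class and method names are illustrative:

import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;
import org.apache.kafka.streams.state.WindowStore;

public class DedupStoreSketch {

    // Builds a persistent window store that "remembers" event IDs for the given duration.
    public static StoreBuilder<WindowStore<String, Long>> dedupStore(String storeName,
                                                                     Duration rememberFor) {
        // Retention must be at least the window size; as in the test above, we use the same value for both.
        return Stores.windowStoreBuilder(
            Stores.persistentWindowStore(storeName, rememberFor, rememberFor, false),
            Serdes.String(),
            Serdes.Long());
    }
}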

Example 4 with ConsumerConfig

Use of org.apache.kafka.clients.consumer.ConsumerConfig in the kafka-streams-examples project by confluentinc.

From the class HandlingCorruptedInputRecordsIntegrationTest, method shouldIgnoreCorruptInputRecords.

@Test
public void shouldIgnoreCorruptInputRecords() throws Exception {
    List<Long> inputValues = Arrays.asList(1L, 2L, 3L);
    List<Long> expectedValues = inputValues.stream().map(x -> 2 * x).collect(Collectors.toList());
    // 
    // Step 1: Configure and start the processor topology.
    // 
    StreamsBuilder builder = new StreamsBuilder();
    Properties streamsConfiguration = new Properties();
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "failure-handling-integration-test");
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass().getName());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass().getName());
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    Serde<String> stringSerde = Serdes.String();
    Serde<Long> longSerde = Serdes.Long();
    KStream<byte[], byte[]> input = builder.stream(inputTopic);
    // Note how the returned stream is of type `KStream<String, Long>`.
    KStream<String, Long> doubled = input.flatMap((k, v) -> {
        try {
            // Attempt deserialization
            String key = stringSerde.deserializer().deserialize(inputTopic, k);
            long value = longSerde.deserializer().deserialize(inputTopic, v);
            // The record deserialized cleanly, i.e. it is not corrupted, so we process it:
            // here we simply double its value.
            return Collections.singletonList(KeyValue.pair(key, 2 * value));
        } catch (SerializationException e) {
            // Ignore/skip the corrupted record by catching the exception.
            // Optionally, we can log the fact that we did so:
            System.err.println("Could not deserialize record: " + e.getMessage());
        }
        return Collections.emptyList();
    });
    // Write the processing results (which were generated from valid records only) to Kafka.
    doubled.to(outputTopic, Produced.with(stringSerde, longSerde));
    KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
    streams.start();
    // 
    // Step 2: Produce some corrupt input data to the input topic.
    // 
    Properties producerConfigForCorruptRecords = new Properties();
    producerConfigForCorruptRecords.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    producerConfigForCorruptRecords.put(ProducerConfig.ACKS_CONFIG, "all");
    producerConfigForCorruptRecords.put(ProducerConfig.RETRIES_CONFIG, 0);
    producerConfigForCorruptRecords.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
    producerConfigForCorruptRecords.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    IntegrationTestUtils.produceValuesSynchronously(inputTopic, Collections.singletonList("corrupt"), producerConfigForCorruptRecords);
    // 
    // Step 3: Produce some (valid) input data to the input topic.
    // 
    Properties producerConfig = new Properties();
    producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    producerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    producerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    producerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
    producerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, LongSerializer.class);
    IntegrationTestUtils.produceValuesSynchronously(inputTopic, inputValues, producerConfig);
    // 
    // Step 4: Verify the application's output data.
    // 
    Properties consumerConfig = new Properties();
    consumerConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerConfig.put(ConsumerConfig.GROUP_ID_CONFIG, "map-function-lambda-integration-test-standard-consumer");
    consumerConfig.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    consumerConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
    consumerConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class);
    List<Long> actualValues = IntegrationTestUtils.waitUntilMinValuesRecordsReceived(consumerConfig, outputTopic, expectedValues.size());
    streams.close();
    assertThat(actualValues).isEqualTo(expectedValues);
}
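
The flatMap approach above gives per-record control over how corrupt input is handled. When simply skipping such records is enough, Kafka Streams (1.0+) can alternatively be configured with a deserialization exception handler. A minimal sketch, with illustrative application id and bootstrap servers:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;

public class SkipCorruptRecordsConfigSketch {

    public static Properties streamsConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "failure-handling-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.Long().getClass().getName());
        // Log records that fail deserialization and continue processing instead of failing the application.
        props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
            LogAndContinueExceptionHandler.class);
        return props;
    }
}

With this handler in place the topology can work directly with typed serdes instead of raw bytes, at the cost of the per-record logging and branching flexibility shown in the test above.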

Example 5 with ConsumerConfig

Use of org.apache.kafka.clients.consumer.ConsumerConfig in the samza project by apache.

From the class IntegrationTestHarness, method createSystemAdmin.

private KafkaSystemAdmin createSystemAdmin(String system) {
    String kafkaConsumerPropertyPrefix = "systems." + system + ".consumer.";
    Map<String, String> map = new HashMap<>();
    map.put(kafkaConsumerPropertyPrefix + CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, brokerList());
    map.put(JobConfig.JOB_NAME, "test.job");
    Config config = new MapConfig(map);
    HashMap<String, Object> consumerConfig = KafkaConsumerConfig.getKafkaSystemConsumerConfig(config, system, KafkaConsumerConfig.createClientId("kafka-admin-consumer", config));
    return new KafkaSystemAdmin(system, new MapConfig(map), KafkaSystemConsumer.createKafkaConsumerImpl(system, consumerConfig));
}
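
For reference, and not specific to Samza: a config map keyed by the standard ConsumerConfig constants, like the one returned by KafkaConsumerConfig above, is the kind of map that can be handed straight to a plain Kafka consumer. A minimal, hypothetical sketch with illustrative values:

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class PlainConsumerFromMapSketch {

    public static KafkaConsumer<byte[], byte[]> create(String brokerList) {
        Map<String, Object> config = new HashMap<>();
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokerList);
        config.put(ConsumerConfig.GROUP_ID_CONFIG, "kafka-admin-consumer"); // illustrative group id
        config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        // Deserializers can be passed explicitly instead of via config keys.
        return new KafkaConsumer<>(config, new ByteArrayDeserializer(), new ByteArrayDeserializer());
    }
}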
