
Example 1 with WikiFeed

Use of io.confluent.examples.streams.avro.WikiFeed in project kafka-streams-examples by confluentinc.

From the class WikipediaFeedAvroExampleTest, method shouldRunTheWikipediaFeedExample:

@Test
public void shouldRunTheWikipediaFeedExample() throws Exception {
    final Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, io.confluent.kafka.serializers.KafkaAvroSerializer.class);
    props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, CLUSTER.schemaRegistryUrl());
    final KafkaProducer<String, WikiFeed> producer = new KafkaProducer<>(props);
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("donna", true, "first post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("donna", true, "second post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("donna", true, "third post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("becca", true, "first post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("becca", true, "second post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("john", true, "first post")));
    producer.flush();
    streams.start();
    final Properties consumerProperties = new Properties();
    consumerProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerProperties.put(ConsumerConfig.GROUP_ID_CONFIG, "wikipedia-feed-consumer");
    consumerProperties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    final KafkaConsumer<String, Long> consumer = new KafkaConsumer<>(consumerProperties, new StringDeserializer(), new LongDeserializer());
    final Map<String, Long> expected = new HashMap<>();
    expected.put("donna", 3L);
    expected.put("becca", 2L);
    expected.put("john", 1L);
    final Map<String, Long> actual = new HashMap<>();
    consumer.subscribe(Collections.singleton(WikipediaFeedAvroExample.WIKIPEDIA_STATS));
    final long timeout = System.currentTimeMillis() + 30000L;
    while (!actual.equals(expected) && System.currentTimeMillis() < timeout) {
        final ConsumerRecords<String, Long> records = consumer.poll(1000);
        records.forEach(record -> actual.put(record.key(), record.value()));
    }
    assertThat(actual, equalTo(expected));
}
Also used : KafkaProducer(org.apache.kafka.clients.producer.KafkaProducer) HashMap(java.util.HashMap) StringDeserializer(org.apache.kafka.common.serialization.StringDeserializer) KafkaConsumer(org.apache.kafka.clients.consumer.KafkaConsumer) Properties(java.util.Properties) LongDeserializer(org.apache.kafka.common.serialization.LongDeserializer) WikiFeed(io.confluent.examples.streams.avro.WikiFeed) Test(org.junit.Test)
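
Note: the `streams` instance that the test starts above is created outside of the snippet. A minimal sketch of how it could be initialized and torn down around the test, assuming a JUnit setup method and a temporary state directory (the field and method names below are illustrative, not taken from the original test class):

private KafkaStreams streams;

@Before
public void createStreams() {
    // Build the example topology against the embedded cluster and schema registry.
    // The temporary state directory (org.apache.kafka.test.TestUtils) is an assumption for illustration.
    streams = WikipediaFeedAvroExample.buildWikipediaFeed(
        CLUSTER.bootstrapServers(),
        CLUSTER.schemaRegistryUrl(),
        TestUtils.tempDirectory().getPath());
}

@After
public void shutdown() {
    if (streams != null) {
        streams.close();
    }
}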

Example 2 with WikiFeed

Use of io.confluent.examples.streams.avro.WikiFeed in project kafka-streams-examples by confluentinc.

From the class WikipediaFeedAvroExample, method buildWikipediaFeed:

static KafkaStreams buildWikipediaFeed(final String bootstrapServers, final String schemaRegistryUrl, final String stateDir) {
    final Properties streamsConfiguration = new Properties();
    // Give the Streams application a unique name.  The name must be unique in the Kafka cluster
    // against which the application is run.
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-avro-example");
    streamsConfiguration.put(StreamsConfig.CLIENT_ID_CONFIG, "wordcount-avro-example-client");
    // Where to find Kafka broker(s).
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    // Where to find the Confluent schema registry instance(s)
    streamsConfiguration.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, schemaRegistryUrl);
    // Specify default (de)serializers for record keys and for record values.
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
    streamsConfiguration.put(StreamsConfig.STATE_DIR_CONFIG, stateDir);
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    // Records should be flushed every 10 seconds. This is less than the default
    // in order to keep this example interactive.
    streamsConfiguration.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10 * 1000);
    final Serde<String> stringSerde = Serdes.String();
    final Serde<Long> longSerde = Serdes.Long();
    final StreamsBuilder builder = new StreamsBuilder();
    // read the source stream
    final KStream<String, WikiFeed> feeds = builder.stream(WIKIPEDIA_FEED);
    // aggregate the new-feed counts by user
    final KTable<String, Long> aggregated = feeds.filter(new Predicate<String, WikiFeed>() {

        @Override
        public boolean test(final String dummy, final WikiFeed value) {
            return value.getIsNew();
        }
    }).map(new KeyValueMapper<String, WikiFeed, KeyValue<String, WikiFeed>>() {

        @Override
        public KeyValue<String, WikiFeed> apply(final String key, final WikiFeed value) {
            return new KeyValue<>(value.getUser(), value);
        }
    }).groupByKey().count();
    // write to the result topic, need to override serdes
    aggregated.toStream().to(WIKIPEDIA_STATS, Produced.with(stringSerde, longSerde));
    return new KafkaStreams(builder.build(), streamsConfiguration);
}
Also used : KafkaStreams(org.apache.kafka.streams.KafkaStreams) KeyValue(org.apache.kafka.streams.KeyValue) Properties(java.util.Properties) Predicate(org.apache.kafka.streams.kstream.Predicate) StreamsBuilder(org.apache.kafka.streams.StreamsBuilder) WikiFeed(io.confluent.examples.streams.avro.WikiFeed)
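
The anonymous Predicate and KeyValueMapper above can be written more compactly with Java 8 lambdas. A minimal sketch of the equivalent pipeline, derived from the snippet above (not copied from the separate lambda variant of the example):

final KStream<String, WikiFeed> feeds = builder.stream(WIKIPEDIA_FEED);
final KTable<String, Long> aggregated = feeds
    // keep only feeds that are flagged as new
    .filter((dummy, value) -> value.getIsNew())
    // re-key each record by the user who posted it
    .map((key, value) -> new KeyValue<>(value.getUser(), value))
    .groupByKey()
    .count();
// write to the result topic, overriding the default serdes
aggregated.toStream().to(WIKIPEDIA_STATS, Produced.with(Serdes.String(), Serdes.Long()));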

Example 3 with WikiFeed

Use of io.confluent.examples.streams.avro.WikiFeed in project kafka-streams-examples by confluentinc.

From the class WikipediaFeedAvroLambdaExampleTest, method shouldRunTheWikipediaFeedLambdaExample:

@Test
public void shouldRunTheWikipediaFeedLambdaExample() {
    final Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, io.confluent.kafka.serializers.KafkaAvroSerializer.class);
    props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, CLUSTER.schemaRegistryUrl());
    final KafkaProducer<String, WikiFeed> producer = new KafkaProducer<>(props);
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("donna", true, "first post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("donna", true, "second post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("donna", true, "third post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("becca", true, "first post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("becca", true, "second post")));
    producer.send(new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, new WikiFeed("john", true, "first post")));
    producer.flush();
    streams.start();
    final Properties consumerProperties = new Properties();
    consumerProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerProperties.put(ConsumerConfig.GROUP_ID_CONFIG, "wikipedia-lambda-feed-consumer");
    consumerProperties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    final KafkaConsumer<String, Long> consumer = new KafkaConsumer<>(consumerProperties, new StringDeserializer(), new LongDeserializer());
    final Map<String, Long> expected = new HashMap<>();
    expected.put("donna", 3L);
    expected.put("becca", 2L);
    expected.put("john", 1L);
    final Map<String, Long> actual = new HashMap<>();
    consumer.subscribe(Collections.singleton(WikipediaFeedAvroExample.WIKIPEDIA_STATS));
    final long timeout = System.currentTimeMillis() + 30000L;
    while (!actual.equals(expected) && System.currentTimeMillis() < timeout) {
        final ConsumerRecords<String, Long> records = consumer.poll(1000);
        records.forEach(record -> actual.put(record.key(), record.value()));
    }
    assertThat(actual, equalTo(expected));
}
Also used : KafkaProducer(org.apache.kafka.clients.producer.KafkaProducer) HashMap(java.util.HashMap) StringDeserializer(org.apache.kafka.common.serialization.StringDeserializer) KafkaConsumer(org.apache.kafka.clients.consumer.KafkaConsumer) Properties(java.util.Properties) LongDeserializer(org.apache.kafka.common.serialization.LongDeserializer) WikiFeed(io.confluent.examples.streams.avro.WikiFeed) Test(org.junit.Test)
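
The `poll(long)` overload used in both tests is deprecated in newer Kafka client versions. A sketch of the same verification loop against a current client, assuming the consumer and the `expected`/`actual` maps from above (requires `java.time.Duration`):

final long deadline = System.currentTimeMillis() + 30_000L;
while (!actual.equals(expected) && System.currentTimeMillis() < deadline) {
    // Duration-based poll replaces the deprecated poll(long) overload.
    final ConsumerRecords<String, Long> records = consumer.poll(Duration.ofSeconds(1));
    records.forEach(record -> actual.put(record.key(), record.value()));
}
consumer.close();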

Example 4 with WikiFeed

Use of io.confluent.examples.streams.avro.WikiFeed in project kafka-streams-examples by confluentinc.

From the class SpecificAvroIntegrationTest, method shouldRoundTripSpecificAvroDataThroughKafka:

@Test
public void shouldRoundTripSpecificAvroDataThroughKafka() throws Exception {
    List<WikiFeed> inputValues = Collections.singletonList(WikiFeed.newBuilder().setUser("alice").setIsNew(true).setContent("lorem ipsum").build());
    // 
    // Step 1: Configure and start the processor topology.
    // 
    StreamsBuilder builder = new StreamsBuilder();
    Properties streamsConfiguration = new Properties();
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "specific-avro-integration-test");
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass().getName());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
    streamsConfiguration.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, CLUSTER.schemaRegistryUrl());
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    // Write the input data as-is to the output topic.
    // 
    // Normally, because a) we have already configured the correct default serdes for keys and
    // values and b) the types for keys and values are the same for both the input topic and the
    // output topic, we would only need to define:
    // 
    // builder.stream(inputTopic).to(outputTopic);
    // 
    // However, in the code below we intentionally override the default serdes in `to()` to
    // demonstrate how you can construct and configure a specific Avro serde manually.
    final Serde<String> stringSerde = Serdes.String();
    final Serde<WikiFeed> specificAvroSerde = new SpecificAvroSerde<>();
    // Note how we must manually call `configure()` on this serde to configure the schema registry
    // url.  This is different from the case of setting default serdes (see `streamsConfiguration`
    // above), which are configured automatically from the Streams configuration properties.
    final boolean isKeySerde = false;
    specificAvroSerde.configure(Collections.singletonMap(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, CLUSTER.schemaRegistryUrl()), isKeySerde);
    KStream<String, WikiFeed> stream = builder.stream(inputTopic);
    stream.to(outputTopic, Produced.with(stringSerde, specificAvroSerde));
    KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
    streams.start();
    // 
    // Step 2: Produce some input data to the input topic.
    // 
    Properties producerConfig = new Properties();
    producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    producerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    producerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    producerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
    producerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
    producerConfig.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, CLUSTER.schemaRegistryUrl());
    IntegrationTestUtils.produceValuesSynchronously(inputTopic, inputValues, producerConfig);
    // 
    // Step 3: Verify the application's output data.
    // 
    Properties consumerConfig = new Properties();
    consumerConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerConfig.put(ConsumerConfig.GROUP_ID_CONFIG, "specific-avro-integration-test-standard-consumer");
    consumerConfig.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    consumerConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
    consumerConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class);
    consumerConfig.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, CLUSTER.schemaRegistryUrl());
    consumerConfig.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, true);
    List<WikiFeed> actualValues = IntegrationTestUtils.waitUntilMinValuesRecordsReceived(consumerConfig, outputTopic, inputValues.size());
    streams.close();
    assertEquals(inputValues, actualValues);
}
Also used : StreamsBuilder(org.apache.kafka.streams.StreamsBuilder) SpecificAvroSerde(io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde) KafkaStreams(org.apache.kafka.streams.KafkaStreams) WikiFeed(io.confluent.examples.streams.avro.WikiFeed) Properties(java.util.Properties) Test(org.junit.Test)
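
The manually configured serde can also be exercised on its own, outside of a topology, which makes the round-trip behaviour easy to see. A minimal sketch (the topic name is illustrative only; it merely scopes the schema subject in the registry):

final SpecificAvroSerde<WikiFeed> serde = new SpecificAvroSerde<>();
serde.configure(
    Collections.singletonMap(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG,
        CLUSTER.schemaRegistryUrl()),
    /* isKeySerde */ false);
final WikiFeed feed = WikiFeed.newBuilder()
    .setUser("alice").setIsNew(true).setContent("lorem ipsum").build();
// Serialize to Avro bytes and back; both calls go through the schema registry client.
final byte[] bytes = serde.serializer().serialize("some-topic", feed);
final WikiFeed roundTripped = serde.deserializer().deserialize("some-topic", bytes);
assertEquals(feed, roundTripped);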

Example 5 with WikiFeed

Use of io.confluent.examples.streams.avro.WikiFeed in project kafka-streams-examples by confluentinc.

From the class WikipediaFeedAvroExampleDriver, method produceInputs:

private static void produceInputs(String bootstrapServers, String schemaRegistryUrl) throws IOException {
    final String[] users = { "erica", "bob", "joe", "damian", "tania", "phil", "sam", "lauren", "joseph" };
    final Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, io.confluent.kafka.serializers.KafkaAvroSerializer.class);
    props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, schemaRegistryUrl);
    final KafkaProducer<String, WikiFeed> producer = new KafkaProducer<>(props);
    final Random random = new Random();
    IntStream.range(0, random.nextInt(100))
        .mapToObj(value -> new WikiFeed(users[random.nextInt(users.length)], true, "content"))
        .forEach(record -> producer.send(
            new ProducerRecord<>(WikipediaFeedAvroExample.WIKIPEDIA_FEED, null, record)));
    producer.flush();
}
Also used : KafkaProducer(org.apache.kafka.clients.producer.KafkaProducer) IntStream(java.util.stream.IntStream) ProducerRecord(org.apache.kafka.clients.producer.ProducerRecord) Properties(java.util.Properties) WikiFeed(io.confluent.examples.streams.avro.WikiFeed) LongDeserializer(org.apache.kafka.common.serialization.LongDeserializer) ConsumerConfig(org.apache.kafka.clients.consumer.ConsumerConfig) IOException(java.io.IOException) Random(java.util.Random) AbstractKafkaAvroSerDeConfig(io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig) ConsumerRecords(org.apache.kafka.clients.consumer.ConsumerRecords) StringDeserializer(org.apache.kafka.common.serialization.StringDeserializer) ConsumerRecord(org.apache.kafka.clients.consumer.ConsumerRecord) StringSerializer(org.apache.kafka.common.serialization.StringSerializer) ProducerConfig(org.apache.kafka.clients.producer.ProducerConfig) Collections(java.util.Collections) KafkaConsumer(org.apache.kafka.clients.consumer.KafkaConsumer)
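
The driver only produces random feeds in the snippet above; to watch the corresponding counts, a consumer along the following lines could read the stats topic (the group id and loop shape are assumptions for illustration, not copied from the driver class):

private static void consumeOutput(final String bootstrapServers) {
    final Properties consumerProps = new Properties();
    consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "wikipedia-feed-example-consumer");
    consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    final KafkaConsumer<String, Long> consumer =
        new KafkaConsumer<>(consumerProps, new StringDeserializer(), new LongDeserializer());
    consumer.subscribe(Collections.singleton(WikipediaFeedAvroExample.WIKIPEDIA_STATS));
    while (true) {
        // Print each user's running count as the Streams application updates it.
        final ConsumerRecords<String, Long> records = consumer.poll(1000);
        records.forEach(record -> System.out.println(record.key() + " -> " + record.value()));
    }
}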

Aggregations

WikiFeed (io.confluent.examples.streams.avro.WikiFeed): 6
Properties (java.util.Properties): 6
KafkaConsumer (org.apache.kafka.clients.consumer.KafkaConsumer): 3
KafkaProducer (org.apache.kafka.clients.producer.KafkaProducer): 3
LongDeserializer (org.apache.kafka.common.serialization.LongDeserializer): 3
StringDeserializer (org.apache.kafka.common.serialization.StringDeserializer): 3
KafkaStreams (org.apache.kafka.streams.KafkaStreams): 3
StreamsBuilder (org.apache.kafka.streams.StreamsBuilder): 3
Test (org.junit.Test): 3
AbstractKafkaAvroSerDeConfig (io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig): 2
SpecificAvroSerde (io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde): 2
HashMap (java.util.HashMap): 2
ConsumerConfig (org.apache.kafka.clients.consumer.ConsumerConfig): 2
KeyValue (org.apache.kafka.streams.KeyValue): 2
IOException (java.io.IOException): 1
Collections (java.util.Collections): 1
Random (java.util.Random): 1
IntStream (java.util.stream.IntStream): 1
ConsumerRecord (org.apache.kafka.clients.consumer.ConsumerRecord): 1
ConsumerRecords (org.apache.kafka.clients.consumer.ConsumerRecords): 1