
Example 21 with KafkaStreams

Use of org.apache.kafka.streams.KafkaStreams in the apache/kafka project.

The class KStreamKTableJoinIntegrationTest, method shouldCountClicksPerRegion.

@Test
public void shouldCountClicksPerRegion() throws Exception {
    // Input 1: Clicks per user (multiple records allowed per user).
    final List<KeyValue<String, Long>> userClicks = Arrays.asList(
        new KeyValue<>("alice", 13L),
        new KeyValue<>("bob", 4L),
        new KeyValue<>("chao", 25L),
        new KeyValue<>("bob", 19L),
        new KeyValue<>("dave", 56L),
        new KeyValue<>("eve", 78L),
        new KeyValue<>("alice", 40L),
        new KeyValue<>("fang", 99L));
    // Input 2: Region per user (multiple records allowed per user).
    final List<KeyValue<String, String>> userRegions = Arrays.asList(
        new KeyValue<>("alice", "asia"),   /* Alice lived in Asia originally... */
        new KeyValue<>("bob", "americas"),
        new KeyValue<>("chao", "asia"),
        new KeyValue<>("dave", "europe"),
        new KeyValue<>("alice", "europe"), /* ...but moved to Europe some time later. */
        new KeyValue<>("eve", "americas"),
        new KeyValue<>("fang", "asia"));
    final List<KeyValue<String, Long>> expectedClicksPerRegion = (cacheSizeBytes == 0)
        ? Arrays.asList(
            new KeyValue<>("europe", 13L),
            new KeyValue<>("americas", 4L),
            new KeyValue<>("asia", 25L),
            new KeyValue<>("americas", 23L),
            new KeyValue<>("europe", 69L),
            new KeyValue<>("americas", 101L),
            new KeyValue<>("europe", 109L),
            new KeyValue<>("asia", 124L))
        : Arrays.asList(
            new KeyValue<>("americas", 101L),
            new KeyValue<>("europe", 109L),
            new KeyValue<>("asia", 124L));
    //
    // Step 1: Configure and start the processor topology.
    //
    final Serde<String> stringSerde = Serdes.String();
    final Serde<Long> longSerde = Serdes.Long();
    final KStreamBuilder builder = new KStreamBuilder();
    // This KStream contains information such as "alice" -> 13L.
    //
    // Because this is a KStream ("record stream"), multiple records for the same user are
    // treated as separate click-count events, each of which is added to the total count.
    final KStream<String, Long> userClicksStream = builder.stream(stringSerde, longSerde, userClicksTopic);
    // This KTable contains information such as "alice" -> "europe".
    //
    // Because this is a KTable ("changelog stream"), only the latest value (here: region) for a
    // record key will be considered at the time when a new user-click record (see above) is
    // received for the `leftJoin` below.  Any previous region values are considered out of
    // date.  This behavior is quite different from the KStream of user clicks above.
    //
    // For example, the user "alice" will be considered to live in "europe" (although originally she
    // lived in "asia") because, at the time her first user-click record is being received and
    // subsequently processed in the `leftJoin`, the latest region update for "alice" is "europe"
    // (which overrides her previous region value of "asia").
    final KTable<String, String> userRegionsTable = builder.table(stringSerde, stringSerde, userRegionsTopic, userRegionsStoreName);
    // Compute the number of clicks per region, e.g. "europe" -> 13L.
    //
    // The resulting KTable is continuously updated as new records arrive in the input
    // KStream `userClicksStream` and the input KTable `userRegionsTable`.
    final KTable<String, Long> clicksPerRegion = userClicksStream.leftJoin(userRegionsTable, new ValueJoiner<Long, String, RegionWithClicks>() {

        @Override
        public RegionWithClicks apply(final Long clicks, final String region) {
            return new RegionWithClicks(region == null ? "UNKNOWN" : region, clicks);
        }
    }).map(new KeyValueMapper<String, RegionWithClicks, KeyValue<String, Long>>() {

        @Override
        public KeyValue<String, Long> apply(final String key, final RegionWithClicks value) {
            return new KeyValue<>(value.getRegion(), value.getClicks());
        }
    }).groupByKey(stringSerde, longSerde).reduce(new Reducer<Long>() {

        @Override
        public Long apply(final Long value1, final Long value2) {
            return value1 + value2;
        }
    }, "ClicksPerRegionUnwindowed");
    // Write the (continuously updating) results to the output topic.
    clicksPerRegion.to(stringSerde, longSerde, outputTopic);
    kafkaStreams = new KafkaStreams(builder, streamsConfiguration);
    kafkaStreams.start();
    //
    // Step 2: Publish user-region information.
    //
    // To keep this example simple and easy to reason about, we publish all user-region
    // records before any user-click records (cf. step 3). In practice, records would
    // typically arrive concurrently on both input topics.
    final Properties userRegionsProducerConfig = new Properties();
    userRegionsProducerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    userRegionsProducerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    userRegionsProducerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    userRegionsProducerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    userRegionsProducerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    IntegrationTestUtils.produceKeyValuesSynchronously(userRegionsTopic, userRegions, userRegionsProducerConfig, mockTime);
    //
    // Step 3: Publish some user click events.
    //
    final Properties userClicksProducerConfig = new Properties();
    userClicksProducerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    userClicksProducerConfig.put(ProducerConfig.ACKS_CONFIG, "all");
    userClicksProducerConfig.put(ProducerConfig.RETRIES_CONFIG, 0);
    userClicksProducerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    userClicksProducerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, LongSerializer.class);
    IntegrationTestUtils.produceKeyValuesSynchronously(userClicksTopic, userClicks, userClicksProducerConfig, mockTime);
    //
    // Step 4: Verify the application's output data.
    //
    final Properties consumerConfig = new Properties();
    consumerConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
    consumerConfig.put(ConsumerConfig.GROUP_ID_CONFIG, "join-integration-test-standard-consumer");
    consumerConfig.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    consumerConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    consumerConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class);
    final List<KeyValue<String, Long>> actualClicksPerRegion = IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(consumerConfig, outputTopic, expectedClicksPerRegion.size());
    assertThat(actualClicksPerRegion, equalTo(expectedClicksPerRegion));
}
Also used: KStreamBuilder (org.apache.kafka.streams.kstream.KStreamBuilder), KafkaStreams (org.apache.kafka.streams.KafkaStreams), KeyValue (org.apache.kafka.streams.KeyValue), Properties (java.util.Properties), ValueJoiner (org.apache.kafka.streams.kstream.ValueJoiner), Test (org.junit.Test)
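
For comparison, the topology in Example 21 reads much more compactly against the newer StreamsBuilder DSL. The following is a minimal sketch, assuming Kafka Streams 2.1+ (for Grouped) and placeholder topic names ("user-clicks", "user-regions", "clicks-per-region"); it is an illustration, not part of the test above.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

final StreamsBuilder builder = new StreamsBuilder();

final KStream<String, Long> userClicksStream =
    builder.stream("user-clicks", Consumed.with(Serdes.String(), Serdes.Long()));
final KTable<String, String> userRegionsTable =
    builder.table("user-regions", Consumed.with(Serdes.String(), Serdes.String()));

final KTable<String, Long> clicksPerRegion = userClicksStream
    // Enrich each click with the user's latest known region ("UNKNOWN" if none yet).
    .leftJoin(userRegionsTable,
        (clicks, region) -> new KeyValue<>(region == null ? "UNKNOWN" : region, clicks))
    // Re-key by region; Streams repartitions the data before the aggregation.
    .map((user, regionWithClicks) -> regionWithClicks)
    .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
    .reduce(Long::sum);

clicksPerRegion.toStream().to("clicks-per-region", Produced.with(Serdes.String(), Serdes.Long()));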

Example 22 with KafkaStreams

Use of org.apache.kafka.streams.KafkaStreams in the apache/kafka project.

The class KStreamRepartitionJoinTest, method startStreams.

private void startStreams() {
    kafkaStreams = new KafkaStreams(builder, streamsConfiguration);
    kafkaStreams.start();
}
Also used: KafkaStreams (org.apache.kafka.streams.KafkaStreams)
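
startStreams in Example 22 leaves shutdown to the caller. Outside a test harness, a common companion pattern is a JVM shutdown hook that closes the instance so stream threads stop and state stores are released cleanly. A minimal sketch, assuming builder and streamsConfiguration are in scope as above and the enclosing method declares throws InterruptedException:

import java.util.concurrent.CountDownLatch;
import org.apache.kafka.streams.KafkaStreams;

final CountDownLatch shutdownLatch = new CountDownLatch(1);
final KafkaStreams streams = new KafkaStreams(builder, streamsConfiguration);
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    // Stop all stream threads and flush/close the state stores.
    streams.close();
    shutdownLatch.countDown();
}));
streams.start();
// Keep the main thread alive until the shutdown hook fires.
shutdownLatch.await();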

Example 23 with KafkaStreams

Use of org.apache.kafka.streams.KafkaStreams in the apache/kafka project.

The class KStreamsFineGrainedAutoResetIntegrationTest, method shouldThrowStreamsExceptionNoResetSpecified.

@Test
public void shouldThrowStreamsExceptionNoResetSpecified() throws Exception {
    Properties props = new Properties();
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "none");
    Properties localConfig = StreamsTestUtils.getStreamsConfig("testAutoOffsetWithNone", CLUSTER.bootstrapServers(), STRING_SERDE_CLASSNAME, STRING_SERDE_CLASSNAME, props);
    final KStreamBuilder builder = new KStreamBuilder();
    final KStream<String, String> exceptionStream = builder.stream(NOOP);
    exceptionStream.to(stringSerde, stringSerde, DEFAULT_OUTPUT_TOPIC);
    KafkaStreams streams = new KafkaStreams(builder, localConfig);
    final TestingUncaughtExceptionHandler uncaughtExceptionHandler = new TestingUncaughtExceptionHandler();
    final TestCondition correctExceptionThrownCondition = new TestCondition() {

        @Override
        public boolean conditionMet() {
            return uncaughtExceptionHandler.correctExceptionThrown;
        }
    };
    streams.setUncaughtExceptionHandler(uncaughtExceptionHandler);
    streams.start();
    TestUtils.waitForCondition(correctExceptionThrownCondition, "The expected NoOffsetForPartitionException was never thrown");
    streams.close();
}
Also used: KStreamBuilder (org.apache.kafka.streams.kstream.KStreamBuilder), KafkaStreams (org.apache.kafka.streams.KafkaStreams), TestCondition (org.apache.kafka.test.TestCondition), Properties (java.util.Properties), Test (org.junit.Test)
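
The negative test in Example 23 has a positive counterpart: when a source is given an explicit per-topic reset policy, the topology starts cleanly even though the consumer default is "none". A minimal sketch against the newer DSL (Kafka 1.0+), with a placeholder topic name:

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;

final StreamsBuilder builder = new StreamsBuilder();
// Per-source override: this stream starts from the earliest offset regardless
// of the global ConsumerConfig.AUTO_OFFSET_RESET_CONFIG setting.
final KStream<String, String> input =
    builder.stream("input-topic", Consumed.with(Topology.AutoOffsetReset.EARLIEST));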

Example 24 with KafkaStreams

Use of org.apache.kafka.streams.KafkaStreams in the apache/kafka project.

The class QueryableStateIntegrationTest, method shouldBeAbleToQueryState.

@Test
public void shouldBeAbleToQueryState() throws Exception {
    final KStreamBuilder builder = new KStreamBuilder();
    final String[] keys = { "hello", "goodbye", "welcome", "go", "kafka" };
    final Set<KeyValue<String, String>> batch1 = new TreeSet<>(stringComparator);
    batch1.addAll(Arrays.asList(
        new KeyValue<>(keys[0], "hello"),
        new KeyValue<>(keys[1], "goodbye"),
        new KeyValue<>(keys[2], "welcome"),
        new KeyValue<>(keys[3], "go"),
        new KeyValue<>(keys[4], "kafka")));
    final Set<KeyValue<String, Long>> expectedCount = new TreeSet<>(stringLongComparator);
    for (final String key : keys) {
        expectedCount.add(new KeyValue<>(key, 1L));
    }
    IntegrationTestUtils.produceKeyValuesSynchronously(streamOne, batch1, TestUtils.producerConfig(CLUSTER.bootstrapServers(), StringSerializer.class, StringSerializer.class, new Properties()), mockTime);
    final KStream<String, String> s1 = builder.stream(streamOne);
    // Non-windowed count, materialized in the "my-count" store.
    s1.groupByKey().count("my-count").to(Serdes.String(), Serdes.Long(), outputTopic);
    // Windowed count, materialized in the "windowed-count" store.
    s1.groupByKey().count(TimeWindows.of(WINDOW_SIZE), "windowed-count");
    kafkaStreams = new KafkaStreams(builder, streamsConfiguration);
    kafkaStreams.start();
    waitUntilAtLeastNumRecordProcessed(outputTopic, 1);
    final ReadOnlyKeyValueStore<String, Long> myCount = kafkaStreams.store("my-count", QueryableStoreTypes.<String, Long>keyValueStore());
    final ReadOnlyWindowStore<String, Long> windowStore = kafkaStreams.store("windowed-count", QueryableStoreTypes.<String, Long>windowStore());
    verifyCanGetByKey(keys, expectedCount, expectedCount, windowStore, myCount);
    verifyRangeAndAll(expectedCount, myCount);
}
Also used: KStreamBuilder (org.apache.kafka.streams.kstream.KStreamBuilder), KafkaStreams (org.apache.kafka.streams.KafkaStreams), KeyValue (org.apache.kafka.streams.KeyValue), Properties (java.util.Properties), TreeSet (java.util.TreeSet), StringSerializer (org.apache.kafka.common.serialization.StringSerializer), KafkaStreamsTest (org.apache.kafka.streams.KafkaStreamsTest), Test (org.junit.Test)
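
Example 24's verifyCanGetByKey hides the actual lookups; for reference, the windowed store obtained above can be read back roughly as follows. This is a sketch: the key and time range are arbitrary, and fetch(key, from, to) is the long-based signature of this Kafka era (later versions prefer Instant overloads).

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.state.WindowStoreIterator;

try (final WindowStoreIterator<Long> windows =
         windowStore.fetch("hello", 0L, System.currentTimeMillis())) {
    while (windows.hasNext()) {
        // The iterator key is the window start timestamp; the value is the count.
        final KeyValue<Long, Long> window = windows.next();
        System.out.println("window starting at " + window.key + " has count " + window.value);
    }
}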

Example 25 with KafkaStreams

Use of org.apache.kafka.streams.KafkaStreams in the apache/kafka project.

The class QueryableStateIntegrationTest, method shouldNotMakeStoreAvailableUntilAllStoresAvailable.

@Test
public void shouldNotMakeStoreAvailableUntilAllStoresAvailable() throws Exception {
    final KStreamBuilder builder = new KStreamBuilder();
    final KStream<String, String> stream = builder.stream(streamThree);
    final String storeName = "count-by-key";
    stream.groupByKey().count(storeName);
    kafkaStreams = new KafkaStreams(builder, streamsConfiguration);
    kafkaStreams.start();
    final KeyValue<String, String> hello = KeyValue.pair("hello", "hello");
    IntegrationTestUtils.produceKeyValuesSynchronously(streamThree, Arrays.asList(hello, hello, hello, hello, hello, hello, hello, hello), TestUtils.producerConfig(CLUSTER.bootstrapServers(), StringSerializer.class, StringSerializer.class, new Properties()), mockTime);
    final int maxWaitMs = 30000;
    TestUtils.waitForCondition(new TestCondition() {

        @Override
        public boolean conditionMet() {
            try {
                kafkaStreams.store(storeName, QueryableStoreTypes.<String, Long>keyValueStore());
                return true;
            } catch (InvalidStateStoreException ise) {
                return false;
            }
        }
    }, maxWaitMs, "waiting for store " + storeName);
    final ReadOnlyKeyValueStore<String, Long> store = kafkaStreams.store(storeName, QueryableStoreTypes.<String, Long>keyValueStore());
    TestUtils.waitForCondition(new TestCondition() {

        @Override
        public boolean conditionMet() {
            return Long.valueOf(8L).equals(store.get("hello"));
        }
    }, maxWaitMs, "wait for count to be 8");
    // close stream
    kafkaStreams.close();
    // start again
    kafkaStreams = new KafkaStreams(builder, streamsConfiguration);
    kafkaStreams.start();
    // make sure we never get any value other than 8 for hello
    TestUtils.waitForCondition(new TestCondition() {

        @Override
        public boolean conditionMet() {
            try {
                assertEquals(Long.valueOf(8L), kafkaStreams.store(storeName, QueryableStoreTypes.<String, Long>keyValueStore()).get("hello"));
                return true;
            } catch (InvalidStateStoreException ise) {
                return false;
            }
        }
    }, maxWaitMs, "waiting for store " + storeName);
}
Also used: KStreamBuilder (org.apache.kafka.streams.kstream.KStreamBuilder), KafkaStreams (org.apache.kafka.streams.KafkaStreams), Properties (java.util.Properties), InvalidStateStoreException (org.apache.kafka.streams.errors.InvalidStateStoreException), TestCondition (org.apache.kafka.test.TestCondition), StringSerializer (org.apache.kafka.common.serialization.StringSerializer), KafkaStreamsTest (org.apache.kafka.streams.KafkaStreamsTest), Test (org.junit.Test)
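
The waitForCondition blocks in Example 25 all poll for the same thing: a store that is temporarily unavailable while the instance rebalances or restores. That retry is easy to factor into a helper; a minimal sketch (the helper name and polling interval are ours, not part of the Kafka API):

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.errors.InvalidStateStoreException;
import org.apache.kafka.streams.state.QueryableStoreType;

public static <T> T waitUntilStoreIsQueryable(final KafkaStreams streams,
                                              final String storeName,
                                              final QueryableStoreType<T> storeType,
                                              final long timeoutMs) throws InterruptedException {
    final long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
        try {
            return streams.store(storeName, storeType);
        } catch (final InvalidStateStoreException storeNotReady) {
            if (System.currentTimeMillis() > deadline) {
                throw storeNotReady; // the store never became queryable in time
            }
            Thread.sleep(100);
        }
    }
}

With such a helper, the first waitForCondition above collapses to a single call.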

Aggregations

KafkaStreams (org.apache.kafka.streams.KafkaStreams): 40
Properties (java.util.Properties): 24
KStreamBuilder (org.apache.kafka.streams.kstream.KStreamBuilder): 23
Test (org.junit.Test): 15
KeyValue (org.apache.kafka.streams.KeyValue): 9
CountDownLatch (java.util.concurrent.CountDownLatch): 8
TestCondition (org.apache.kafka.test.TestCondition): 5
StreamsConfig (org.apache.kafka.streams.StreamsConfig): 4
ValueJoiner (org.apache.kafka.streams.kstream.ValueJoiner): 4
ValueMapper (org.apache.kafka.streams.kstream.ValueMapper): 4
Field (java.lang.reflect.Field): 3
ArrayList (java.util.ArrayList): 3
Metrics (org.apache.kafka.common.metrics.Metrics): 3
StringSerializer (org.apache.kafka.common.serialization.StringSerializer): 3
DefaultKafkaClientSupplier (org.apache.kafka.streams.processor.internals.DefaultKafkaClientSupplier): 3
StreamThread (org.apache.kafka.streams.processor.internals.StreamThread): 3
MockKeyValueMapper (org.apache.kafka.test.MockKeyValueMapper): 3
KafkaProducer (org.apache.kafka.clients.producer.KafkaProducer): 2
KafkaStreamsTest (org.apache.kafka.streams.KafkaStreamsTest): 2
Windowed (org.apache.kafka.streams.kstream.Windowed): 2