use of org.apache.kafka.streams.kstream.KTable in project kafka-streams-examples by confluentinc.
the class WordCountLambdaExample method main.
public static void main(final String[] args) throws Exception {
final String bootstrapServers = args.length > 0 ? args[0] : "localhost:9092";
final Properties streamsConfiguration = new Properties();
// Give the Streams application a unique name. The name must be unique in the Kafka cluster
// against which the application is run.
streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-lambda-example");
streamsConfiguration.put(StreamsConfig.CLIENT_ID_CONFIG, "wordcount-lambda-example-client");
// Where to find Kafka broker(s).
streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
// Specify default (de)serializers for record keys and for record values.
streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
// Records should be flushed every 10 seconds. This is less than the default
// in order to keep this example interactive.
streamsConfiguration.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10 * 1000);
// For illustrative purposes we disable record caches
streamsConfiguration.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
// Set up serializers and deserializers, which we will use for overriding the default serdes
// specified above.
final Serde<String> stringSerde = Serdes.String();
final Serde<Long> longSerde = Serdes.Long();
// In the subsequent lines we define the processing topology of the Streams application.
final StreamsBuilder builder = new StreamsBuilder();
// Construct a `KStream` from the input topic "streams-plaintext-input", where message values
// represent lines of text (for the sake of this example, we ignore whatever may be stored
// in the message keys).
//
// Note: We could also just call `builder.stream("streams-plaintext-input")` if we wanted to leverage
// the default serdes specified in the Streams configuration above, because these defaults
// match what's in the actual topic. However we explicitly set the deserializers in the
// call to `stream()` below in order to show how that's done, too.
final KStream<String, String> textLines = builder.stream("streams-plaintext-input");
final Pattern pattern = Pattern.compile("\\W+", Pattern.UNICODE_CHARACTER_CLASS);
final KTable<String, Long> wordCounts = textLines.flatMapValues(value -> Arrays.asList(pattern.split(value.toLowerCase()))).groupBy((key, word) -> word).count();
// Write the `KTable<String, Long>` to the output topic.
wordCounts.toStream().to("streams-wordcount-output", Produced.with(stringSerde, longSerde));
// Now that we have finished the definition of the processing topology we can actually run
// it via `start()`. The Streams application as a whole can be launched just like any
// normal Java application that has a `main()` method.
final KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
// Always (and unconditionally) clean local state prior to starting the processing topology.
// We opt for this unconditional call here because this will make it easier for you to play around with the example
// when resetting the application for doing a re-run (via the Application Reset Tool,
// http://docs.confluent.io/current/streams/developer-guide.html#application-reset-tool).
//
// The drawback of cleaning up local state prior is that your app must rebuilt its local state from scratch, which
// will take time and will require reading all the state-relevant data from the Kafka cluster over the network.
// Thus in a production scenario you typically do not want to clean up always as we do here but rather only when it
// is truly needed, i.e., only under certain conditions (e.g., the presence of a command line flag for your app).
// See `ApplicationResetExample.java` for a production-like example.
streams.cleanUp();
streams.start();
// Add shutdown hook to respond to SIGTERM and gracefully close Kafka Streams
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
}
use of org.apache.kafka.streams.kstream.KTable in project kafka by apache.
the class CogroupedKStreamImplTest method shouldInsertRepartitionsTopicForUpstreamKeyModificationWithGroupedReusedInSameCogroupsWithOptimization.
@Test
public void shouldInsertRepartitionsTopicForUpstreamKeyModificationWithGroupedReusedInSameCogroupsWithOptimization() {
final Properties properties = new Properties();
properties.setProperty(StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG, StreamsConfig.OPTIMIZE);
final StreamsBuilder builder = new StreamsBuilder();
final KStream<String, String> stream1 = builder.stream("one", stringConsumed);
final KStream<String, String> stream2 = builder.stream("two", stringConsumed);
final KGroupedStream<String, String> groupedOne = stream1.map((k, v) -> new KeyValue<>(v, k)).groupByKey();
final KGroupedStream<String, String> groupedTwo = stream2.groupByKey();
final KTable<String, String> cogroupedTwo = groupedOne.cogroup(STRING_AGGREGATOR).cogroup(groupedTwo, STRING_AGGREGATOR).aggregate(STRING_INITIALIZER);
final KTable<String, String> cogroupedOne = groupedOne.cogroup(STRING_AGGREGATOR).cogroup(groupedTwo, STRING_AGGREGATOR).aggregate(STRING_INITIALIZER);
cogroupedOne.toStream().to(OUTPUT);
cogroupedTwo.toStream().to("OUTPUT2");
final String topologyDescription = builder.build(properties).describe().toString();
assertThat(topologyDescription, equalTo("Topologies:\n" + " Sub-topology: 0\n" + " Source: KSTREAM-SOURCE-0000000000 (topics: [one])\n" + " --> KSTREAM-MAP-0000000002\n" + " Processor: KSTREAM-MAP-0000000002 (stores: [])\n" + " --> COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-filter\n" + " <-- KSTREAM-SOURCE-0000000000\n" + " Processor: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-filter (stores: [])\n" + " --> COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-sink\n" + " <-- KSTREAM-MAP-0000000002\n" + " Sink: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-sink (topic: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition)\n" + " <-- COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-filter\n\n" + " Sub-topology: 1\n" + " Source: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-source (topics: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition])\n" + " --> COGROUPKSTREAM-AGGREGATE-0000000014, COGROUPKSTREAM-AGGREGATE-0000000007\n" + " Source: KSTREAM-SOURCE-0000000001 (topics: [two])\n" + " --> COGROUPKSTREAM-AGGREGATE-0000000015, COGROUPKSTREAM-AGGREGATE-0000000008\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000007 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003])\n" + " --> COGROUPKSTREAM-MERGE-0000000009\n" + " <-- COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-source\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000008 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003])\n" + " --> COGROUPKSTREAM-MERGE-0000000009\n" + " <-- KSTREAM-SOURCE-0000000001\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000014 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010])\n" + " --> COGROUPKSTREAM-MERGE-0000000016\n" + " <-- COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-source\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000015 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010])\n" + " --> COGROUPKSTREAM-MERGE-0000000016\n" + " <-- KSTREAM-SOURCE-0000000001\n" + " Processor: COGROUPKSTREAM-MERGE-0000000009 (stores: [])\n" + " --> KTABLE-TOSTREAM-0000000019\n" + " <-- COGROUPKSTREAM-AGGREGATE-0000000007, COGROUPKSTREAM-AGGREGATE-0000000008\n" + " Processor: COGROUPKSTREAM-MERGE-0000000016 (stores: [])\n" + " --> KTABLE-TOSTREAM-0000000017\n" + " <-- COGROUPKSTREAM-AGGREGATE-0000000014, COGROUPKSTREAM-AGGREGATE-0000000015\n" + " Processor: KTABLE-TOSTREAM-0000000017 (stores: [])\n" + " --> KSTREAM-SINK-0000000018\n" + " <-- COGROUPKSTREAM-MERGE-0000000016\n" + " Processor: KTABLE-TOSTREAM-0000000019 (stores: [])\n" + " --> KSTREAM-SINK-0000000020\n" + " <-- COGROUPKSTREAM-MERGE-0000000009\n" + " Sink: KSTREAM-SINK-0000000018 (topic: output)\n" + " <-- KTABLE-TOSTREAM-0000000017\n" + " Sink: KSTREAM-SINK-0000000020 (topic: OUTPUT2)\n" + " <-- KTABLE-TOSTREAM-0000000019\n\n"));
}
use of org.apache.kafka.streams.kstream.KTable in project kafka by apache.
the class KStreamImplTest method shouldSupportKeyChangeKTableFromKStream.
@Test
public void shouldSupportKeyChangeKTableFromKStream() {
final Consumed<String, String> consumed = Consumed.with(Serdes.String(), Serdes.String());
final StreamsBuilder builder = new StreamsBuilder();
final String input = "input";
final String output = "output";
builder.stream(input, consumed).map((key, value) -> new KeyValue<>(key.charAt(0) - 'A', value)).toTable(Materialized.with(Serdes.Integer(), null)).toStream().to(output);
final Topology topology = builder.build();
final String topologyDescription = topology.describe().toString();
assertThat(topologyDescription, equalTo("Topologies:\n" + " Sub-topology: 0\n" + " Source: KSTREAM-SOURCE-0000000000 (topics: [input])\n" + " --> KSTREAM-MAP-0000000001\n" + " Processor: KSTREAM-MAP-0000000001 (stores: [])\n" + " --> KSTREAM-FILTER-0000000005\n" + " <-- KSTREAM-SOURCE-0000000000\n" + " Processor: KSTREAM-FILTER-0000000005 (stores: [])\n" + " --> KSTREAM-SINK-0000000004\n" + " <-- KSTREAM-MAP-0000000001\n" + " Sink: KSTREAM-SINK-0000000004 (topic: KSTREAM-TOTABLE-0000000002-repartition)\n" + " <-- KSTREAM-FILTER-0000000005\n" + "\n" + " Sub-topology: 1\n" + " Source: KSTREAM-SOURCE-0000000006 (topics: [KSTREAM-TOTABLE-0000000002-repartition])\n" + " --> KSTREAM-TOTABLE-0000000002\n" + " Processor: KSTREAM-TOTABLE-0000000002 (stores: [])\n" + " --> KTABLE-TOSTREAM-0000000007\n" + " <-- KSTREAM-SOURCE-0000000006\n" + " Processor: KTABLE-TOSTREAM-0000000007 (stores: [])\n" + " --> KSTREAM-SINK-0000000008\n" + " <-- KSTREAM-TOTABLE-0000000002\n" + " Sink: KSTREAM-SINK-0000000008 (topic: output)\n" + " <-- KTABLE-TOSTREAM-0000000007\n\n"));
try (final TopologyTestDriver driver = new TopologyTestDriver(topology, props)) {
final TestInputTopic<String, String> inputTopic = driver.createInputTopic(input, Serdes.String().serializer(), Serdes.String().serializer());
final TestOutputTopic<Integer, String> outputTopic = driver.createOutputTopic(output, Serdes.Integer().deserializer(), Serdes.String().deserializer());
inputTopic.pipeInput("A", "01", 5L);
inputTopic.pipeInput("B", "02", 100L);
inputTopic.pipeInput("C", "03", 0L);
inputTopic.pipeInput("D", "04", 0L);
inputTopic.pipeInput("A", "05", 10L);
inputTopic.pipeInput("A", "06", 8L);
final List<TestRecord<Integer, String>> outputExpectRecords = new ArrayList<>();
outputExpectRecords.add(new TestRecord<>(0, "01", Instant.ofEpochMilli(5L)));
outputExpectRecords.add(new TestRecord<>(1, "02", Instant.ofEpochMilli(100L)));
outputExpectRecords.add(new TestRecord<>(2, "03", Instant.ofEpochMilli(0L)));
outputExpectRecords.add(new TestRecord<>(3, "04", Instant.ofEpochMilli(0L)));
outputExpectRecords.add(new TestRecord<>(0, "05", Instant.ofEpochMilli(10L)));
outputExpectRecords.add(new TestRecord<>(0, "06", Instant.ofEpochMilli(8L)));
assertEquals(outputTopic.readRecordsToList(), outputExpectRecords);
}
}
use of org.apache.kafka.streams.kstream.KTable in project kafka by apache.
the class CogroupedKStreamImplTest method shouldNameRepartitionTopic.
@Test
public void shouldNameRepartitionTopic() {
final StreamsBuilder builder = new StreamsBuilder();
final KStream<String, String> stream1 = builder.stream("one", stringConsumed);
final KStream<String, String> test2 = builder.stream("two", stringConsumed);
final KGroupedStream<String, String> groupedOne = stream1.map((k, v) -> new KeyValue<>(v, k)).groupByKey(Grouped.as("repartition-test"));
final KGroupedStream<String, String> groupedTwo = test2.groupByKey();
final KTable<String, String> customers = groupedOne.cogroup(STRING_AGGREGATOR).cogroup(groupedTwo, STRING_AGGREGATOR).aggregate(STRING_INITIALIZER);
customers.toStream().to(OUTPUT);
final String topologyDescription = builder.build().describe().toString();
assertThat(topologyDescription, equalTo("Topologies:\n" + " Sub-topology: 0\n" + " Source: KSTREAM-SOURCE-0000000000 (topics: [one])\n" + " --> KSTREAM-MAP-0000000002\n" + " Processor: KSTREAM-MAP-0000000002 (stores: [])\n" + " --> repartition-test-repartition-filter\n" + " <-- KSTREAM-SOURCE-0000000000\n" + " Processor: repartition-test-repartition-filter (stores: [])\n" + " --> repartition-test-repartition-sink\n" + " <-- KSTREAM-MAP-0000000002\n" + " Sink: repartition-test-repartition-sink (topic: repartition-test-repartition)\n" + " <-- repartition-test-repartition-filter\n\n" + " Sub-topology: 1\n" + " Source: KSTREAM-SOURCE-0000000001 (topics: [two])\n" + " --> COGROUPKSTREAM-AGGREGATE-0000000008\n" + " Source: repartition-test-repartition-source (topics: [repartition-test-repartition])\n" + " --> COGROUPKSTREAM-AGGREGATE-0000000007\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000007 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003])\n" + " --> COGROUPKSTREAM-MERGE-0000000009\n" + " <-- repartition-test-repartition-source\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000008 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003])\n" + " --> COGROUPKSTREAM-MERGE-0000000009\n" + " <-- KSTREAM-SOURCE-0000000001\n" + " Processor: COGROUPKSTREAM-MERGE-0000000009 (stores: [])\n" + " --> KTABLE-TOSTREAM-0000000010\n" + " <-- COGROUPKSTREAM-AGGREGATE-0000000007, COGROUPKSTREAM-AGGREGATE-0000000008\n" + " Processor: KTABLE-TOSTREAM-0000000010 (stores: [])\n" + " --> KSTREAM-SINK-0000000011\n" + " <-- COGROUPKSTREAM-MERGE-0000000009\n" + " Sink: KSTREAM-SINK-0000000011 (topic: output)\n" + " <-- KTABLE-TOSTREAM-0000000010\n\n"));
}
use of org.apache.kafka.streams.kstream.KTable in project kafka by apache.
the class CogroupedKStreamImplTest method shouldInsertRepartitionsTopicForUpstreamKeyModificationWithGroupedReusedInSameCogroups.
@Test
public void shouldInsertRepartitionsTopicForUpstreamKeyModificationWithGroupedReusedInSameCogroups() {
final StreamsBuilder builder = new StreamsBuilder();
final KStream<String, String> stream1 = builder.stream("one", stringConsumed);
final KStream<String, String> stream2 = builder.stream("two", stringConsumed);
final KGroupedStream<String, String> groupedOne = stream1.map((k, v) -> new KeyValue<>(v, k)).groupByKey();
final KGroupedStream<String, String> groupedTwo = stream2.groupByKey();
final KTable<String, String> cogroupedTwo = groupedOne.cogroup(STRING_AGGREGATOR).cogroup(groupedTwo, STRING_AGGREGATOR).aggregate(STRING_INITIALIZER);
final KTable<String, String> cogroupedOne = groupedOne.cogroup(STRING_AGGREGATOR).cogroup(groupedTwo, STRING_AGGREGATOR).aggregate(STRING_INITIALIZER);
cogroupedOne.toStream().to(OUTPUT);
cogroupedTwo.toStream().to("OUTPUT2");
final String topologyDescription = builder.build().describe().toString();
assertThat(topologyDescription, equalTo("Topologies:\n" + " Sub-topology: 0\n" + " Source: KSTREAM-SOURCE-0000000000 (topics: [one])\n" + " --> KSTREAM-MAP-0000000002\n" + " Processor: KSTREAM-MAP-0000000002 (stores: [])\n" + " --> COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition-filter, COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-filter\n" + " <-- KSTREAM-SOURCE-0000000000\n" + " Processor: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-filter (stores: [])\n" + " --> COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-sink\n" + " <-- KSTREAM-MAP-0000000002\n" + " Processor: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition-filter (stores: [])\n" + " --> COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition-sink\n" + " <-- KSTREAM-MAP-0000000002\n" + " Sink: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-sink (topic: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition)\n" + " <-- COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-filter\n" + " Sink: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition-sink (topic: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition)\n" + " <-- COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition-filter\n\n" + " Sub-topology: 1\n" + " Source: KSTREAM-SOURCE-0000000001 (topics: [two])\n" + " --> COGROUPKSTREAM-AGGREGATE-0000000008, COGROUPKSTREAM-AGGREGATE-0000000015\n" + " Source: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-source (topics: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition])\n" + " --> COGROUPKSTREAM-AGGREGATE-0000000007\n" + " Source: COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition-source (topics: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition])\n" + " --> COGROUPKSTREAM-AGGREGATE-0000000014\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000007 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003])\n" + " --> COGROUPKSTREAM-MERGE-0000000009\n" + " <-- COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-source\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000008 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000003])\n" + " --> COGROUPKSTREAM-MERGE-0000000009\n" + " <-- KSTREAM-SOURCE-0000000001\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000014 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010])\n" + " --> COGROUPKSTREAM-MERGE-0000000016\n" + " <-- COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010-repartition-source\n" + " Processor: COGROUPKSTREAM-AGGREGATE-0000000015 (stores: [COGROUPKSTREAM-AGGREGATE-STATE-STORE-0000000010])\n" + " --> COGROUPKSTREAM-MERGE-0000000016\n" + " <-- KSTREAM-SOURCE-0000000001\n" + " Processor: COGROUPKSTREAM-MERGE-0000000009 (stores: [])\n" + " --> KTABLE-TOSTREAM-0000000019\n" + " <-- COGROUPKSTREAM-AGGREGATE-0000000007, COGROUPKSTREAM-AGGREGATE-0000000008\n" + " Processor: COGROUPKSTREAM-MERGE-0000000016 (stores: [])\n" + " --> KTABLE-TOSTREAM-0000000017\n" + " <-- COGROUPKSTREAM-AGGREGATE-0000000014, COGROUPKSTREAM-AGGREGATE-0000000015\n" + " Processor: KTABLE-TOSTREAM-0000000017 (stores: [])\n" + " --> KSTREAM-SINK-0000000018\n" + " <-- COGROUPKSTREAM-MERGE-0000000016\n" + " Processor: KTABLE-TOSTREAM-0000000019 (stores: [])\n" + " --> KSTREAM-SINK-0000000020\n" + " <-- COGROUPKSTREAM-MERGE-0000000009\n" + " Sink: KSTREAM-SINK-0000000018 (topic: output)\n" + " <-- KTABLE-TOSTREAM-0000000017\n" + " Sink: KSTREAM-SINK-0000000020 (topic: OUTPUT2)\n" + " <-- KTABLE-TOSTREAM-0000000019\n\n"));
}
Aggregations