Example 1 with GraphNode

Use of org.apache.kafka.streams.kstream.internals.graph.GraphNode in the Apache Kafka project.

From class CogroupedStreamAggregateBuilder, method processRepartitions.

private void processRepartitions(final Map<KGroupedStreamImpl<K, ?>, Aggregator<? super K, ? super Object, VOut>> groupPatterns, final StoreBuilder<?> storeBuilder) {
    for (final KGroupedStreamImpl<K, ?> repartitionReqs : groupPatterns.keySet()) {
        if (repartitionReqs.repartitionRequired) {
            final OptimizableRepartitionNodeBuilder<K, ?> repartitionNodeBuilder = optimizableRepartitionNodeBuilder();
            final String repartitionNamePrefix = repartitionReqs.userProvidedRepartitionTopicName != null ? repartitionReqs.userProvidedRepartitionTopicName : storeBuilder.name();
            createRepartitionSource(repartitionNamePrefix, repartitionNodeBuilder, repartitionReqs.keySerde, repartitionReqs.valueSerde);
            if (!parentNodes.containsKey(repartitionReqs)) {
                final GraphNode repartitionNode = repartitionNodeBuilder.build();
                builder.addGraphNode(repartitionReqs.graphNode, repartitionNode);
                parentNodes.put(repartitionReqs, repartitionNode);
            }
        } else {
            parentNodes.put(repartitionReqs, repartitionReqs.graphNode);
        }
    }
    final Collection<? extends AbstractStream<K, ?>> groupedStreams = new ArrayList<>(parentNodes.keySet());
    final AbstractStream<K, ?> kGrouped = groupedStreams.iterator().next();
    groupedStreams.remove(kGrouped);
    kGrouped.ensureCopartitionWith(groupedStreams);
}
Also used: ArrayList (java.util.ArrayList), ProcessorGraphNode (org.apache.kafka.streams.kstream.internals.graph.ProcessorGraphNode), GraphNode (org.apache.kafka.streams.kstream.internals.graph.GraphNode)

Example 2 with GraphNode

Use of org.apache.kafka.streams.kstream.internals.graph.GraphNode in the Apache Kafka project.

From class GroupedStreamAggregateBuilder, method build.

<KR, VR> KTable<KR, VR> build(final NamedInternal functionName, final StoreBuilder<?> storeBuilder, final KStreamAggProcessorSupplier<K, V, KR, VR> aggregateSupplier, final String queryableStoreName, final Serde<KR> keySerde, final Serde<VR> valueSerde) {
    assert queryableStoreName == null || queryableStoreName.equals(storeBuilder.name());
    final String aggFunctionName = functionName.name();
    String sourceName = this.name;
    GraphNode parentNode = graphNode;
    if (repartitionRequired) {
        final OptimizableRepartitionNodeBuilder<K, V> repartitionNodeBuilder = optimizableRepartitionNodeBuilder();
        final String repartitionTopicPrefix = userProvidedRepartitionTopicName != null ? userProvidedRepartitionTopicName : storeBuilder.name();
        sourceName = createRepartitionSource(repartitionTopicPrefix, repartitionNodeBuilder);
        // If a repartition node was already built and the user provided a repartition topic name,
        // reuse the existing repartition node, otherwise we create a new one.
        if (repartitionNode == null || userProvidedRepartitionTopicName == null) {
            repartitionNode = repartitionNodeBuilder.build();
        }
        builder.addGraphNode(parentNode, repartitionNode);
        parentNode = repartitionNode;
    }
    final StatefulProcessorNode<K, V> statefulProcessorNode = new StatefulProcessorNode<>(aggFunctionName, new ProcessorParameters<>(aggregateSupplier, aggFunctionName), storeBuilder);
    builder.addGraphNode(parentNode, statefulProcessorNode);
    return new KTableImpl<>(aggFunctionName, keySerde, valueSerde, sourceName.equals(this.name) ? subTopologySourceNodes : Collections.singleton(sourceName), queryableStoreName, aggregateSupplier, statefulProcessorNode, builder);
}
Also used: StatefulProcessorNode (org.apache.kafka.streams.kstream.internals.graph.StatefulProcessorNode), GraphNode (org.apache.kafka.streams.kstream.internals.graph.GraphNode)

Example 3 with GraphNode

Use of org.apache.kafka.streams.kstream.internals.graph.GraphNode in the Apache Kafka project.

From class InternalStreamsBuilder, method getKeyChangingParentNode.

private GraphNode getKeyChangingParentNode(final GraphNode repartitionNode) {
    final GraphNode shouldBeKeyChangingNode = findParentNodeMatching(repartitionNode, n -> n.isKeyChangingOperation() || n.isValueChangingOperation());
    final GraphNode keyChangingNode = findParentNodeMatching(repartitionNode, GraphNode::isKeyChangingOperation);
    if (shouldBeKeyChangingNode != null && shouldBeKeyChangingNode.equals(keyChangingNode)) {
        return keyChangingNode;
    }
    return null;
}
Also used: GraphNode (org.apache.kafka.streams.kstream.internals.graph.GraphNode)
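
The method above depends on findParentNodeMatching, which this page does not show. The following is a minimal sketch of the same two-step check, assuming that findParentNodeMatching searches upward from the given node (the node itself included) and returns the first node that satisfies the predicate. The Node class, its fields, and the keyChangingParent method are hypothetical stand-ins for illustration, not Kafka's GraphNode API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class KeyChangingParentSketch {

    // Hypothetical stand-in for GraphNode: a name, two flags, and parent links.
    static final class Node {
        final String name;
        final boolean keyChanging;
        final boolean valueChanging;
        final List<Node> parents = new ArrayList<>();

        Node(final String name, final boolean keyChanging, final boolean valueChanging, final Node... parents) {
            this.name = name;
            this.keyChanging = keyChanging;
            this.valueChanging = valueChanging;
            for (final Node parent : parents) {
                this.parents.add(parent);
            }
        }
    }

    // Assumed behaviour: depth-first search upward, starting at the node itself,
    // returning the first node that satisfies the predicate, or null if none does.
    static Node findParentNodeMatching(final Node start, final Predicate<Node> predicate) {
        if (predicate.test(start)) {
            return start;
        }
        for (final Node parent : start.parents) {
            final Node found = findParentNodeMatching(parent, predicate);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    // Same two-step check as in the listing: the nearest key-or-value-changing
    // ancestor must be the nearest key-changing ancestor.
    static Node keyChangingParent(final Node repartitionNode) {
        final Node keyOrValueChanging = findParentNodeMatching(repartitionNode, n -> n.keyChanging || n.valueChanging);
        final Node keyChanging = findParentNodeMatching(repartitionNode, n -> n.keyChanging);
        // Node does not override equals(), so this is an identity comparison.
        if (keyOrValueChanging != null && keyOrValueChanging.equals(keyChanging)) {
            return keyChanging;
        }
        return null;
    }

    public static void main(final String[] args) {
        final Node source = new Node("source", false, false);
        final Node selectKey = new Node("selectKey", true, false, source);
        final Node repartitionA = new Node("repartitionA", false, false, selectKey);

        // A value-changing node sits between the key change and the second repartition node.
        final Node mapValues = new Node("mapValues", false, true, selectKey);
        final Node repartitionB = new Node("repartitionB", false, false, mapValues);

        System.out.println(keyChangingParent(repartitionA).name); // prints "selectKey"
        System.out.println(keyChangingParent(repartitionB));      // prints "null"
    }
}
```

In this sketch the check yields a key-changing node only when no value-changing node lies between it and the repartition node, which is what comparing the results of the two searches achieves.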

Example 4 with GraphNode

Use of org.apache.kafka.streams.kstream.internals.graph.GraphNode in the Apache Kafka project.

From class InternalStreamsBuilder, method maybeUpdateKeyChangingRepartitionNodeMap.

private void maybeUpdateKeyChangingRepartitionNodeMap() {
    final Map<GraphNode, Set<GraphNode>> mergeNodesToKeyChangers = new HashMap<>();
    final Set<GraphNode> mergeNodeKeyChangingParentsToRemove = new HashSet<>();
    for (final GraphNode mergeNode : mergeNodes) {
        mergeNodesToKeyChangers.put(mergeNode, new LinkedHashSet<>());
        final Set<Map.Entry<GraphNode, LinkedHashSet<OptimizableRepartitionNode<?, ?>>>> entrySet = keyChangingOperationsToOptimizableRepartitionNodes.entrySet();
        for (final Map.Entry<GraphNode, LinkedHashSet<OptimizableRepartitionNode<?, ?>>> entry : entrySet) {
            if (mergeNodeHasRepartitionChildren(mergeNode, entry.getValue())) {
                final GraphNode maybeParentKey = findParentNodeMatching(mergeNode, node -> node.parentNodes().contains(entry.getKey()));
                if (maybeParentKey != null) {
                    mergeNodesToKeyChangers.get(mergeNode).add(entry.getKey());
                }
            }
        }
    }
    for (final Map.Entry<GraphNode, Set<GraphNode>> entry : mergeNodesToKeyChangers.entrySet()) {
        final GraphNode mergeKey = entry.getKey();
        final Collection<GraphNode> keyChangingParents = entry.getValue();
        final LinkedHashSet<OptimizableRepartitionNode<?, ?>> repartitionNodes = new LinkedHashSet<>();
        for (final GraphNode keyChangingParent : keyChangingParents) {
            repartitionNodes.addAll(keyChangingOperationsToOptimizableRepartitionNodes.get(keyChangingParent));
            mergeNodeKeyChangingParentsToRemove.add(keyChangingParent);
        }
        keyChangingOperationsToOptimizableRepartitionNodes.put(mergeKey, repartitionNodes);
    }
    for (final GraphNode mergeNodeKeyChangingParent : mergeNodeKeyChangingParentsToRemove) {
        keyChangingOperationsToOptimizableRepartitionNodes.remove(mergeNodeKeyChangingParent);
    }
}
Also used: LinkedHashSet (java.util.LinkedHashSet), HashSet (java.util.HashSet), Set (java.util.Set), HashMap (java.util.HashMap), LinkedHashMap (java.util.LinkedHashMap), GraphNode (org.apache.kafka.streams.kstream.internals.graph.GraphNode), OptimizableRepartitionNode (org.apache.kafka.streams.kstream.internals.graph.OptimizableRepartitionNode), Entry (java.util.Map.Entry), Map (java.util.Map), TreeMap (java.util.TreeMap)
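
The second half of the method above re-keys map entries: the repartition nodes recorded under each key-changing parent are combined under the merge node, and the parents' own entries are removed afterwards. A minimal sketch of that re-keying step follows, using plain strings in place of GraphNode and OptimizableRepartitionNode instances. The class name, the rekey method, and all node names are hypothetical, and the merge-node-to-parents mapping is taken as given rather than derived from a graph.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class RekeySketch {

    // Combines the value sets of each merge node's key-changing parents under the merge node,
    // then drops the parents' own entries. Strings stand in for graph nodes.
    static void rekey(final Map<String, LinkedHashSet<String>> repartitionsByKeyChanger,
                      final Map<String, Set<String>> mergeNodeToKeyChangers) {
        final Set<String> toRemove = new LinkedHashSet<>();
        for (final Map.Entry<String, Set<String>> entry : mergeNodeToKeyChangers.entrySet()) {
            final LinkedHashSet<String> combined = new LinkedHashSet<>();
            for (final String keyChanger : entry.getValue()) {
                combined.addAll(repartitionsByKeyChanger.get(keyChanger));
                toRemove.add(keyChanger);
            }
            repartitionsByKeyChanger.put(entry.getKey(), combined);
        }
        // Removal is deferred so that every merge node still sees the original entries.
        for (final String keyChanger : toRemove) {
            repartitionsByKeyChanger.remove(keyChanger);
        }
    }

    public static void main(final String[] args) {
        final Map<String, LinkedHashSet<String>> repartitions = new LinkedHashMap<>();
        repartitions.put("selectKeyA", new LinkedHashSet<>(Arrays.asList("rep1")));
        repartitions.put("selectKeyB", new LinkedHashSet<>(Arrays.asList("rep2", "rep3")));

        final Map<String, Set<String>> mergeToKeyChangers = new LinkedHashMap<>();
        mergeToKeyChangers.put("merge", new LinkedHashSet<>(Arrays.asList("selectKeyA", "selectKeyB")));

        rekey(repartitions, mergeToKeyChangers);
        System.out.println(repartitions); // prints "{merge=[rep1, rep2, rep3]}"
    }
}
```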

Example 5 with GraphNode

Use of org.apache.kafka.streams.kstream.internals.graph.GraphNode in the Apache Kafka project.

From class InternalStreamsBuilder, method mergeDuplicateSourceNodes.

private void mergeDuplicateSourceNodes() {
    final Map<String, StreamSourceNode<?, ?>> topicsToSourceNodes = new HashMap<>();
    // We don't really care about the order here, but since Pattern does not implement equals() we can't rely on
    // a regular HashMap and containsKey(Pattern). But for our purposes it's sufficient to compare the compiled
    // string and flags to determine if two pattern subscriptions can be merged into a single source node
    final Map<Pattern, StreamSourceNode<?, ?>> patternsToSourceNodes = new TreeMap<>(Comparator.comparing(Pattern::pattern).thenComparing(Pattern::flags));
    for (final GraphNode graphNode : root.children()) {
        if (graphNode instanceof StreamSourceNode) {
            final StreamSourceNode<?, ?> currentSourceNode = (StreamSourceNode<?, ?>) graphNode;
            if (currentSourceNode.topicPattern().isPresent()) {
                final Pattern topicPattern = currentSourceNode.topicPattern().get();
                if (!patternsToSourceNodes.containsKey(topicPattern)) {
                    patternsToSourceNodes.put(topicPattern, currentSourceNode);
                } else {
                    final StreamSourceNode<?, ?> mainSourceNode = patternsToSourceNodes.get(topicPattern);
                    mainSourceNode.merge(currentSourceNode);
                    root.removeChild(graphNode);
                }
            } else {
                for (final String topic : currentSourceNode.topicNames().get()) {
                    if (!topicsToSourceNodes.containsKey(topic)) {
                        topicsToSourceNodes.put(topic, currentSourceNode);
                    } else {
                        final StreamSourceNode<?, ?> mainSourceNode = topicsToSourceNodes.get(topic);
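                        // Source nodes are only merged when their subscribed topics match exactly. Supporting overlapping
                        // subscriptions would require splitting these source nodes into one topic per node and routing to the subscribed children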
                        if (!mainSourceNode.topicNames().equals(currentSourceNode.topicNames())) {
                            LOG.error("Topic {} was found in subscription for non-equal source nodes {} and {}", topic, mainSourceNode, currentSourceNode);
                            throw new TopologyException("Two source nodes are subscribed to overlapping but not equal input topics");
                        }
                        mainSourceNode.merge(currentSourceNode);
                        root.removeChild(graphNode);
                    }
                }
            }
        }
    }
}
Also used: StreamSourceNode (org.apache.kafka.streams.kstream.internals.graph.StreamSourceNode), Pattern (java.util.regex.Pattern), HashMap (java.util.HashMap), LinkedHashMap (java.util.LinkedHashMap), GraphNode (org.apache.kafka.streams.kstream.internals.graph.GraphNode), TreeMap (java.util.TreeMap), TopologyException (org.apache.kafka.streams.errors.TopologyException)
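
The comment in the listing above notes that Pattern does not implement equals(), so a HashMap cannot detect two equivalent pattern subscriptions. A small self-contained sketch of that point follows, reusing the comparator from the listing. The class name PatternKeyDemo, the helper newPatternKeyedMap, and the sample regex are hypothetical.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

public class PatternKeyDemo {

    // Builds a map that treats two Pattern instances as the same key when their
    // regex string and flags match, using the same comparator as the listing.
    static <V> Map<Pattern, V> newPatternKeyedMap() {
        return new TreeMap<>(Comparator.comparing(Pattern::pattern).thenComparing(Pattern::flags));
    }

    public static void main(final String[] args) {
        final Pattern first = Pattern.compile("topic-.*");
        final Pattern second = Pattern.compile("topic-.*");

        // Pattern inherits identity-based equals() and hashCode() from Object,
        // so a HashMap sees two separately compiled patterns as distinct keys.
        final Map<Pattern, String> hashMap = new HashMap<>();
        hashMap.put(first, "source-1");
        System.out.println(hashMap.containsKey(second)); // prints "false"

        // The comparator-backed TreeMap compares regex string and flags instead.
        final Map<Pattern, String> treeMap = newPatternKeyedMap();
        treeMap.put(first, "source-1");
        System.out.println(treeMap.containsKey(second)); // prints "true"
    }
}
```

Because the comparator also includes the flags, the same regex compiled with different flags (for example Pattern.CASE_INSENSITIVE) is still treated as a separate key.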

Aggregations

GraphNode (org.apache.kafka.streams.kstream.internals.graph.GraphNode): 14
ProcessorGraphNode (org.apache.kafka.streams.kstream.internals.graph.ProcessorGraphNode): 6
HashMap (java.util.HashMap): 3
LinkedHashMap (java.util.LinkedHashMap): 3
TreeMap (java.util.TreeMap): 3
TableProcessorNode (org.apache.kafka.streams.kstream.internals.graph.TableProcessorNode): 3
TimestampedKeyValueStore (org.apache.kafka.streams.state.TimestampedKeyValueStore): 3
HashSet (java.util.HashSet): 2
LinkedHashSet (java.util.LinkedHashSet): 2
Map (java.util.Map): 2
Entry (java.util.Map.Entry): 2
StreamsException (org.apache.kafka.streams.errors.StreamsException): 2
OptimizableRepartitionNode (org.apache.kafka.streams.kstream.internals.graph.OptimizableRepartitionNode): 2
ProcessorParameters (org.apache.kafka.streams.kstream.internals.graph.ProcessorParameters): 2
ArrayList (java.util.ArrayList): 1
PriorityQueue (java.util.PriorityQueue): 1
Set (java.util.Set): 1
Pattern (java.util.regex.Pattern): 1
TopologyException (org.apache.kafka.streams.errors.TopologyException): 1
GlobalStoreNode (org.apache.kafka.streams.kstream.internals.graph.GlobalStoreNode): 1