
Example 1 with OperatorSpec

use of org.apache.samza.operators.spec.OperatorSpec in project samza by apache.

the class ExecutionPlanner method findReachableJoins.

/**
   * This function traverses the StreamGraph to find and update mappings for all Joins reachable from this input StreamEdge
   * @param inputMessageStream next input MessageStream to traverse {@link MessageStream}
   * @param sourceStreamEdge source {@link StreamEdge}
   * @param joinSpecToStreamEdges mapping from join spec to its source {@link StreamEdge}s
   * @param streamEdgeToJoinSpecs mapping from source {@link StreamEdge} to the join specs that consume it
   * @param outputStreamToJoinSpec mapping from the output stream to the join spec
   * @param joinQ queue that contains joinSpecs where at least one of the input stream edge partitions is known.
   * @param visited set of join specs that have already been added to joinQ
   */
private static void findReachableJoins(MessageStream inputMessageStream, StreamEdge sourceStreamEdge, Multimap<OperatorSpec, StreamEdge> joinSpecToStreamEdges, Multimap<StreamEdge, OperatorSpec> streamEdgeToJoinSpecs, Map<MessageStream, OperatorSpec> outputStreamToJoinSpec, Queue<OperatorSpec> joinQ, Set<OperatorSpec> visited) {
    Collection<OperatorSpec> specs = ((MessageStreamImpl) inputMessageStream).getRegisteredOperatorSpecs();
    for (OperatorSpec spec : specs) {
        if (spec instanceof PartialJoinOperatorSpec) {
            // every join has two partial join operators;
            // we pick one of them as the representative so that the inputs are consolidated:
            // the first one registered in outputStreamToJoinSpec wins
            MessageStream output = spec.getNextStream();
            OperatorSpec joinSpec = outputStreamToJoinSpec.get(output);
            if (joinSpec == null) {
                joinSpec = spec;
                outputStreamToJoinSpec.put(output, joinSpec);
            }
            joinSpecToStreamEdges.put(joinSpec, sourceStreamEdge);
            streamEdgeToJoinSpecs.put(sourceStreamEdge, joinSpec);
            if (!visited.contains(joinSpec) && sourceStreamEdge.getPartitionCount() > 0) {
                // put the joins with known input partitions into the queue
                joinQ.add(joinSpec);
                visited.add(joinSpec);
            }
        }
        if (spec.getNextStream() != null) {
            findReachableJoins(spec.getNextStream(), sourceStreamEdge, joinSpecToStreamEdges, streamEdgeToJoinSpecs, outputStreamToJoinSpec, joinQ, visited);
        }
    }
}
Also used : OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) PartialJoinOperatorSpec(org.apache.samza.operators.spec.PartialJoinOperatorSpec) MessageStreamImpl(org.apache.samza.operators.MessageStreamImpl) MessageStream(org.apache.samza.operators.MessageStream)
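
The "first partial join registered wins" rule in the comments above can be illustrated in isolation. The following is only a minimal sketch under stated assumptions: JoinConsolidationSketch, PartialJoin, register and inputEdgesOf are hypothetical names, strings stand in for Samza's MessageStream and StreamEdge types, and plain java.util maps replace the Multimap used in the original method.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class JoinConsolidationSketch {

    // Hypothetical stand-in for a partial join operator spec:
    // a name plus the name of the output stream it feeds.
    static final class PartialJoin {
        final String name;
        final String outputStream;

        PartialJoin(String name, String outputStream) {
            this.name = name;
            this.outputStream = outputStream;
        }
    }

    // Output stream name -> the first partial join registered for it.
    private final Map<String, PartialJoin> outputStreamToJoin = new LinkedHashMap<>();
    // Representative join name -> input edges that reach that join.
    private final Map<String, Set<String>> joinToInputEdges = new LinkedHashMap<>();

    // Records that inputEdge reaches partialJoin and returns the representative
    // join, i.e. the first partial join seen for the same output stream.
    PartialJoin register(String inputEdge, PartialJoin partialJoin) {
        PartialJoin chosen = outputStreamToJoin.putIfAbsent(partialJoin.outputStream, partialJoin);
        if (chosen == null) {
            chosen = partialJoin; // nothing was registered yet, so this one wins
        }
        joinToInputEdges.computeIfAbsent(chosen.name, k -> new LinkedHashSet<>()).add(inputEdge);
        return chosen;
    }

    // Input edges consolidated under the given representative join.
    Set<String> inputEdgesOf(String joinName) {
        return joinToInputEdges.getOrDefault(joinName, Collections.emptySet());
    }

    public static void main(String[] args) {
        JoinConsolidationSketch sketch = new JoinConsolidationSketch();
        PartialJoin left = new PartialJoin("join-left", "joined-stream");
        PartialJoin right = new PartialJoin("join-right", "joined-stream");
        sketch.register("edge-A", left);
        sketch.register("edge-B", right);
        // Both edges end up under the first registered partial join.
        System.out.println(sketch.inputEdgesOf("join-left")); // prints [edge-A, edge-B]
    }
}
```

Because both partial joins share one output stream, the second registration resolves to the first one, so each join appears once with both of its input edges attached.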

Example 2 with OperatorSpec

use of org.apache.samza.operators.spec.OperatorSpec in project samza by apache.

the class MessageStreamImpl method merge.

@Override
public MessageStream<M> merge(Collection<? extends MessageStream<? extends M>> otherStreams) {
    MessageStreamImpl<M> nextStream = new MessageStreamImpl<>(this.graph);
    List<MessageStream<M>> streamsToMerge = new ArrayList<>((Collection<MessageStream<M>>) otherStreams);
    streamsToMerge.add(this);
    streamsToMerge.forEach(stream -> {
        OperatorSpec mergeOperatorSpec = OperatorSpecs.createMergeOperatorSpec(nextStream, this.graph.getNextOpId());
        ((MessageStreamImpl<M>) stream).registeredOperatorSpecs.add(mergeOperatorSpec);
    });
    return nextStream;
}
Also used : OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) SinkOperatorSpec(org.apache.samza.operators.spec.SinkOperatorSpec) ArrayList(java.util.ArrayList)
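
The fan-in performed by merge can also be shown without any Samza types. This is a rough sketch, not the library's implementation: MergeSketch and SimpleStream are hypothetical names, and a direct downstream reference replaces the merge operator spec that the original registers on each input stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class MergeSketch {

    // Hypothetical, much-simplified stream that only remembers
    // which downstream streams were registered on it.
    static final class SimpleStream {
        final String name;
        final List<SimpleStream> downstream = new ArrayList<>();

        SimpleStream(String name) {
            this.name = name;
        }

        // Creates one new stream and registers it as the downstream
        // of this stream and of every stream in others.
        SimpleStream merge(Collection<SimpleStream> others) {
            SimpleStream merged = new SimpleStream("merged");
            List<SimpleStream> toMerge = new ArrayList<>(others);
            toMerge.add(this);
            for (SimpleStream stream : toMerge) {
                stream.downstream.add(merged);
            }
            return merged;
        }
    }

    public static void main(String[] args) {
        SimpleStream a = new SimpleStream("a");
        SimpleStream b = new SimpleStream("b");
        SimpleStream c = new SimpleStream("c");
        a.merge(Arrays.asList(b, c));
        // Every input stream now points at the same merged stream.
        for (SimpleStream s : Arrays.asList(a, b, c)) {
            System.out.println(s.name + " -> " + s.downstream.get(0).name);
        }
    }
}
```

Copying the argument into a new list before adding the receiver leaves the caller's collection unmodified, which mirrors the defensive copy in the original method.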

Example 3 with OperatorSpec

use of org.apache.samza.operators.spec.OperatorSpec in project samza by apache.

the class OperatorJsonUtils method operatorToMap.

/**
   * Format the operator properties into a map
   * @param spec an {@link OperatorSpec} instance
   * @return map of the operator properties
   */
public static Map<String, Object> operatorToMap(OperatorSpec spec) {
    Map<String, Object> map = new HashMap<>();
    map.put(OP_CODE, spec.getOpCode().name());
    map.put(OP_ID, spec.getOpId());
    map.put(SOURCE_LOCATION, spec.getSourceLocation());
    if (spec.getNextStream() != null) {
        Collection<OperatorSpec> nextOperators = spec.getNextStream().getRegisteredOperatorSpecs();
        map.put(NEXT_OPERATOR_IDS, nextOperators.stream().map(OperatorSpec::getOpId).collect(Collectors.toSet()));
    } else {
        map.put(NEXT_OPERATOR_IDS, Collections.emptySet());
    }
    if (spec instanceof SinkOperatorSpec) {
        OutputStreamInternal outputStream = ((SinkOperatorSpec) spec).getOutputStream();
        if (outputStream != null) {
            map.put(OUTPUT_STREAM_ID, outputStream.getStreamSpec().getId());
        }
    }
    if (spec instanceof PartialJoinOperatorSpec) {
        map.put(TTL_MS, ((PartialJoinOperatorSpec) spec).getTtlMs());
    }
    return map;
}
Also used : OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) PartialJoinOperatorSpec(org.apache.samza.operators.spec.PartialJoinOperatorSpec) SinkOperatorSpec(org.apache.samza.operators.spec.SinkOperatorSpec) HashMap(java.util.HashMap) OutputStreamInternal(org.apache.samza.operators.stream.OutputStreamInternal)

Example 4 with OperatorSpec

use of org.apache.samza.operators.spec.OperatorSpec in project samza by apache.

the class TestStreamApplicationDescriptorImpl method testGetIntermediateStreamWithValueSerde.

@Test
public void testGetIntermediateStreamWithValueSerde() {
    String streamId = "stream-1";
    StreamApplicationDescriptorImpl streamAppDesc = new StreamApplicationDescriptorImpl(appDesc -> {
    }, getConfig());
    Serde mockValueSerde = mock(Serde.class);
    IntermediateMessageStreamImpl<TestMessageEnvelope> intermediateStreamImpl = streamAppDesc.getIntermediateStream(streamId, mockValueSerde, false);
    assertEquals(streamAppDesc.getInputOperators().get(streamId), intermediateStreamImpl.getOperatorSpec());
    assertEquals(streamAppDesc.getOutputStreams().get(streamId), intermediateStreamImpl.getOutputStream());
    assertEquals(streamId, intermediateStreamImpl.getStreamId());
    assertTrue(intermediateStreamImpl.getOutputStream().getKeySerde() instanceof NoOpSerde);
    assertEquals(mockValueSerde, intermediateStreamImpl.getOutputStream().getValueSerde());
    assertTrue(((InputOperatorSpec) (OperatorSpec) intermediateStreamImpl.getOperatorSpec()).getKeySerde() instanceof NoOpSerde);
    assertEquals(mockValueSerde, ((InputOperatorSpec) (OperatorSpec) intermediateStreamImpl.getOperatorSpec()).getValueSerde());
}
Also used : Serde(org.apache.samza.serializers.Serde) IntegerSerde(org.apache.samza.serializers.IntegerSerde) NoOpSerde(org.apache.samza.serializers.NoOpSerde) KVSerde(org.apache.samza.serializers.KVSerde) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) InputOperatorSpec(org.apache.samza.operators.spec.InputOperatorSpec) TestMessageEnvelope(org.apache.samza.operators.data.TestMessageEnvelope) Test(org.junit.Test)

Example 5 with OperatorSpec

use of org.apache.samza.operators.spec.OperatorSpec in project samza by apache.

the class TestStreamApplicationDescriptorImpl method testGetIntermediateStreamWithKeyValueSerde.

@Test
public void testGetIntermediateStreamWithKeyValueSerde() {
    String streamId = "streamId";
    StreamApplicationDescriptorImpl streamAppDesc = new StreamApplicationDescriptorImpl(appDesc -> {
    }, getConfig());
    KVSerde mockKVSerde = mock(KVSerde.class);
    Serde mockKeySerde = mock(Serde.class);
    Serde mockValueSerde = mock(Serde.class);
    doReturn(mockKeySerde).when(mockKVSerde).getKeySerde();
    doReturn(mockValueSerde).when(mockKVSerde).getValueSerde();
    IntermediateMessageStreamImpl<TestMessageEnvelope> intermediateStreamImpl = streamAppDesc.getIntermediateStream(streamId, mockKVSerde, false);
    assertEquals(streamAppDesc.getInputOperators().get(streamId), intermediateStreamImpl.getOperatorSpec());
    assertEquals(streamAppDesc.getOutputStreams().get(streamId), intermediateStreamImpl.getOutputStream());
    assertEquals(streamId, intermediateStreamImpl.getStreamId());
    assertEquals(mockKeySerde, intermediateStreamImpl.getOutputStream().getKeySerde());
    assertEquals(mockValueSerde, intermediateStreamImpl.getOutputStream().getValueSerde());
    assertEquals(mockKeySerde, ((InputOperatorSpec) (OperatorSpec) intermediateStreamImpl.getOperatorSpec()).getKeySerde());
    assertEquals(mockValueSerde, ((InputOperatorSpec) (OperatorSpec) intermediateStreamImpl.getOperatorSpec()).getValueSerde());
}
Also used : Serde(org.apache.samza.serializers.Serde) IntegerSerde(org.apache.samza.serializers.IntegerSerde) NoOpSerde(org.apache.samza.serializers.NoOpSerde) KVSerde(org.apache.samza.serializers.KVSerde) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) InputOperatorSpec(org.apache.samza.operators.spec.InputOperatorSpec) TestMessageEnvelope(org.apache.samza.operators.data.TestMessageEnvelope) Test(org.junit.Test)

Aggregations

OperatorSpec (org.apache.samza.operators.spec.OperatorSpec) 34
SinkOperatorSpec (org.apache.samza.operators.spec.SinkOperatorSpec) 20
JoinOperatorSpec (org.apache.samza.operators.spec.JoinOperatorSpec) 18
StreamOperatorSpec (org.apache.samza.operators.spec.StreamOperatorSpec) 18
StreamTableJoinOperatorSpec (org.apache.samza.operators.spec.StreamTableJoinOperatorSpec) 18
OutputOperatorSpec (org.apache.samza.operators.spec.OutputOperatorSpec) 17
SendToTableOperatorSpec (org.apache.samza.operators.spec.SendToTableOperatorSpec) 17
WindowOperatorSpec (org.apache.samza.operators.spec.WindowOperatorSpec) 16
Test (org.junit.Test) 16
TestMessageEnvelope (org.apache.samza.operators.data.TestMessageEnvelope) 15
PartitionByOperatorSpec (org.apache.samza.operators.spec.PartitionByOperatorSpec) 15
StreamApplicationDescriptorImpl (org.apache.samza.application.descriptors.StreamApplicationDescriptorImpl) 14
IntermediateMessageStreamImpl (org.apache.samza.operators.stream.IntermediateMessageStreamImpl) 12
InputOperatorSpec (org.apache.samza.operators.spec.InputOperatorSpec) 11
FlatMapFunction (org.apache.samza.operators.functions.FlatMapFunction) 7
HashMap (java.util.HashMap) 5
HashSet (java.util.HashSet) 5
PartialJoinOperatorSpec (org.apache.samza.operators.spec.PartialJoinOperatorSpec) 5
KVSerde (org.apache.samza.serializers.KVSerde) 5
Collection (java.util.Collection) 4