
Example 1 with PartialJoinOperatorSpec

use of org.apache.samza.operators.spec.PartialJoinOperatorSpec in project samza by apache.

the class ExecutionPlanner method findReachableJoins.

/**
 * Traverses the StreamGraph to find, and update the mappings for, all joins reachable from this input StreamEdge.
 * @param inputMessageStream next input {@link MessageStream} to traverse
 * @param sourceStreamEdge source {@link StreamEdge}
 * @param joinSpecToStreamEdges mapping from join spec to its source {@link StreamEdge}s
 * @param streamEdgeToJoinSpecs mapping from source {@link StreamEdge} to the join specs that consume it
 * @param outputStreamToJoinSpec mapping from the output stream to the join spec
 * @param joinQ queue that contains join specs for which at least one input stream edge has a known partition count
 * @param visited join specs that have already been added to joinQ
 */
private static void findReachableJoins(MessageStream inputMessageStream, StreamEdge sourceStreamEdge, Multimap<OperatorSpec, StreamEdge> joinSpecToStreamEdges, Multimap<StreamEdge, OperatorSpec> streamEdgeToJoinSpecs, Map<MessageStream, OperatorSpec> outputStreamToJoinSpec, Queue<OperatorSpec> joinQ, Set<OperatorSpec> visited) {
    Collection<OperatorSpec> specs = ((MessageStreamImpl) inputMessageStream).getRegisteredOperatorSpecs();
    for (OperatorSpec spec : specs) {
        if (spec instanceof PartialJoinOperatorSpec) {
            // Every join has two partial join operators.
            // Choose one of them as the representative so the inputs are consolidated:
            // the first one registered in outputStreamToJoinSpec wins.
            MessageStream output = spec.getNextStream();
            OperatorSpec joinSpec = outputStreamToJoinSpec.get(output);
            if (joinSpec == null) {
                joinSpec = spec;
                outputStreamToJoinSpec.put(output, joinSpec);
            }
            joinSpecToStreamEdges.put(joinSpec, sourceStreamEdge);
            streamEdgeToJoinSpecs.put(sourceStreamEdge, joinSpec);
            if (!visited.contains(joinSpec) && sourceStreamEdge.getPartitionCount() > 0) {
                // put the joins with known input partitions into the queue
                joinQ.add(joinSpec);
                visited.add(joinSpec);
            }
        }
        if (spec.getNextStream() != null) {
            findReachableJoins(spec.getNextStream(), sourceStreamEdge, joinSpecToStreamEdges, streamEdgeToJoinSpecs, outputStreamToJoinSpec, joinQ, visited);
        }
    }
}
Also used: OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) PartialJoinOperatorSpec(org.apache.samza.operators.spec.PartialJoinOperatorSpec) MessageStreamImpl(org.apache.samza.operators.MessageStreamImpl) MessageStream(org.apache.samza.operators.MessageStream)
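
The "first one registered wins" rule in the comments above can be illustrated without any Samza types. The following is a minimal sketch under stated assumptions: the class `JoinConsolidationSketch`, the method `consolidate`, and the string names are hypothetical stand-ins for the operator specs, output streams, and stream edges. It is not Samza code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the consolidation step; plain strings replace
// Samza's OperatorSpec, MessageStream and StreamEdge types.
class JoinConsolidationSketch {

    /**
     * Each visit is {partialJoinName, outputStreamName, sourceEdgeName}.
     * Returns the representative join of each output stream, mapped to
     * every source edge that reaches it.
     */
    static Map<String, List<String>> consolidate(String[][] visits) {
        Map<String, String> outputToJoin = new HashMap<>();
        Map<String, List<String>> joinToEdges = new LinkedHashMap<>();
        for (String[] visit : visits) {
            String partialJoin = visit[0];
            String output = visit[1];
            String edge = visit[2];
            // putIfAbsent returns the existing representative, or null if this
            // partial join is the first one registered for the output stream.
            String representative = outputToJoin.putIfAbsent(output, partialJoin);
            if (representative == null) {
                representative = partialJoin;
            }
            joinToEdges.computeIfAbsent(representative, k -> new ArrayList<>()).add(edge);
        }
        return joinToEdges;
    }

    public static void main(String[] args) {
        String[][] visits = {
            {"leftPartialJoin", "joinOutput", "edgeA"},
            {"rightPartialJoin", "joinOutput", "edgeB"},
        };
        System.out.println(consolidate(visits));
    }
}
```

Both partial joins share the output stream `joinOutput`, so both source edges end up under `leftPartialJoin`, the one registered first.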

Example 2 with PartialJoinOperatorSpec

use of org.apache.samza.operators.spec.PartialJoinOperatorSpec in project samza by apache.

the class OperatorJsonUtils method operatorToMap.

/**
   * Format the operator properties into a map
   * @param spec a {@link OperatorSpec} instance
   * @return map of the operator properties
   */
public static Map<String, Object> operatorToMap(OperatorSpec spec) {
    Map<String, Object> map = new HashMap<>();
    map.put(OP_CODE, spec.getOpCode().name());
    map.put(OP_ID, spec.getOpId());
    map.put(SOURCE_LOCATION, spec.getSourceLocation());
    if (spec.getNextStream() != null) {
        Collection<OperatorSpec> nextOperators = spec.getNextStream().getRegisteredOperatorSpecs();
        map.put(NEXT_OPERATOR_IDS, nextOperators.stream().map(OperatorSpec::getOpId).collect(Collectors.toSet()));
    } else {
        map.put(NEXT_OPERATOR_IDS, Collections.emptySet());
    }
    if (spec instanceof SinkOperatorSpec) {
        OutputStreamInternal outputStream = ((SinkOperatorSpec) spec).getOutputStream();
        if (outputStream != null) {
            map.put(OUTPUT_STREAM_ID, outputStream.getStreamSpec().getId());
        }
    }
    if (spec instanceof PartialJoinOperatorSpec) {
        map.put(TTL_MS, ((PartialJoinOperatorSpec) spec).getTtlMs());
    }
    return map;
}
Also used: OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) PartialJoinOperatorSpec(org.apache.samza.operators.spec.PartialJoinOperatorSpec) SinkOperatorSpec(org.apache.samza.operators.spec.SinkOperatorSpec) HashMap(java.util.HashMap) OutputStreamInternal(org.apache.samza.operators.stream.OutputStreamInternal)
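
The method above follows a common pattern: write the properties every operator has, then add type-specific entries behind an `instanceof` check. Below is a minimal sketch of that pattern. The classes `Spec` and `JoinSpec` and the key strings `"opCode"`, `"opId"`, and `"ttlMs"` are hypothetical; the actual values of the `OP_CODE`, `OP_ID`, and `TTL_MS` constants are not shown in the excerpt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins, not Samza classes.
class OperatorMapSketch {

    static class Spec {
        final int opId;
        final String opCode;

        Spec(int opId, String opCode) {
            this.opId = opId;
            this.opCode = opCode;
        }
    }

    static class JoinSpec extends Spec {
        final long ttlMs;

        JoinSpec(int opId, long ttlMs) {
            super(opId, "JOIN");
            this.ttlMs = ttlMs;
        }
    }

    static Map<String, Object> toMap(Spec spec) {
        Map<String, Object> map = new HashMap<>();
        // Properties shared by every operator.
        map.put("opCode", spec.opCode);
        map.put("opId", spec.opId);
        // Type-specific property, present only for join operators.
        if (spec instanceof JoinSpec) {
            map.put("ttlMs", ((JoinSpec) spec).ttlMs);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(toMap(new Spec(1, "MAP")).containsKey("ttlMs"));
        System.out.println(toMap(new JoinSpec(2, 1000L)).get("ttlMs"));
    }
}
```

A consumer of the map can therefore treat a missing `ttlMs` key as "not a join operator" rather than as an error.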

Example 3 with PartialJoinOperatorSpec

use of org.apache.samza.operators.spec.PartialJoinOperatorSpec in project samza by apache.

the class TestOperatorImpls method testCreateOperator.

@Test
public void testCreateOperator() throws NoSuchFieldException, IllegalAccessException, InvocationTargetException {
    // get window operator
    WindowOperatorSpec mockWnd = mock(WindowOperatorSpec.class);
    WindowInternal<TestMessageEnvelope, String, Integer> windowInternal = new WindowInternal<>(null, null, null, null, null, WindowType.TUMBLING);
    when(mockWnd.getWindow()).thenReturn(windowInternal);
    Config mockConfig = mock(Config.class);
    TaskContext mockContext = mock(TaskContext.class);
    OperatorImplGraph opGraph = new OperatorImplGraph();
    OperatorImpl<TestMessageEnvelope, ?> opImpl = (OperatorImpl<TestMessageEnvelope, ?>) createOpMethod.invoke(opGraph, mockWnd, mockConfig, mockContext);
    assertTrue(opImpl instanceof WindowOperatorImpl);
    Field wndInternalField = WindowOperatorImpl.class.getDeclaredField("window");
    wndInternalField.setAccessible(true);
    WindowInternal wndInternal = (WindowInternal) wndInternalField.get(opImpl);
    assertEquals(wndInternal, windowInternal);
    // get simple operator
    StreamOperatorSpec<TestMessageEnvelope, TestOutputMessageEnvelope> mockSimpleOp = mock(StreamOperatorSpec.class);
    FlatMapFunction<TestMessageEnvelope, TestOutputMessageEnvelope> mockTxfmFn = mock(FlatMapFunction.class);
    when(mockSimpleOp.getTransformFn()).thenReturn(mockTxfmFn);
    opImpl = (OperatorImpl<TestMessageEnvelope, ?>) createOpMethod.invoke(opGraph, mockSimpleOp, mockConfig, mockContext);
    assertTrue(opImpl instanceof StreamOperatorImpl);
    Field txfmFnField = StreamOperatorImpl.class.getDeclaredField("transformFn");
    txfmFnField.setAccessible(true);
    assertEquals(mockTxfmFn, txfmFnField.get(opImpl));
    // get sink operator
    SinkFunction<TestMessageEnvelope> sinkFn = (m, mc, tc) -> {
    };
    SinkOperatorSpec<TestMessageEnvelope> sinkOp = mock(SinkOperatorSpec.class);
    when(sinkOp.getSinkFn()).thenReturn(sinkFn);
    opImpl = (OperatorImpl<TestMessageEnvelope, ?>) createOpMethod.invoke(opGraph, sinkOp, mockConfig, mockContext);
    assertTrue(opImpl instanceof SinkOperatorImpl);
    Field sinkFnField = SinkOperatorImpl.class.getDeclaredField("sinkFn");
    sinkFnField.setAccessible(true);
    assertEquals(sinkFn, sinkFnField.get(opImpl));
    // get join operator
    PartialJoinOperatorSpec<String, TestMessageEnvelope, TestMessageEnvelope, TestOutputMessageEnvelope> joinOp = mock(PartialJoinOperatorSpec.class);
    opImpl = (OperatorImpl<TestMessageEnvelope, ?>) createOpMethod.invoke(opGraph, joinOp, mockConfig, mockContext);
    assertTrue(opImpl instanceof PartialJoinOperatorImpl);
}
Also used: StreamOperatorSpec(org.apache.samza.operators.spec.StreamOperatorSpec) Assert.assertNotSame(org.junit.Assert.assertNotSame) ArrayList(java.util.ArrayList) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) PartialJoinOperatorSpec(org.apache.samza.operators.spec.PartialJoinOperatorSpec) SinkOperatorSpec(org.apache.samza.operators.spec.SinkOperatorSpec) MessageStreamImpl(org.apache.samza.operators.MessageStreamImpl) Duration(java.time.Duration) TestOutputMessageEnvelope(org.apache.samza.operators.data.TestOutputMessageEnvelope) Method(java.lang.reflect.Method) Before(org.junit.Before) TestMessageEnvelope(org.apache.samza.operators.data.TestMessageEnvelope) WindowType(org.apache.samza.operators.windows.internal.WindowType) TaskContext(org.apache.samza.task.TaskContext) Windows(org.apache.samza.operators.windows.Windows) Iterator(java.util.Iterator) WindowOperatorSpec(org.apache.samza.operators.spec.WindowOperatorSpec) Set(java.util.Set) Assert.assertTrue(org.junit.Assert.assertTrue) Test(org.junit.Test) Mockito.when(org.mockito.Mockito.when) JoinFunction(org.apache.samza.operators.functions.JoinFunction) Field(java.lang.reflect.Field) FlatMapFunction(org.apache.samza.operators.functions.FlatMapFunction) StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl) InvocationTargetException(java.lang.reflect.InvocationTargetException) SinkFunction(org.apache.samza.operators.functions.SinkFunction) Config(org.apache.samza.config.Config) TestMessageStreamImplUtil(org.apache.samza.operators.TestMessageStreamImplUtil) WindowInternal(org.apache.samza.operators.windows.internal.WindowInternal) Assert.assertEquals(org.junit.Assert.assertEquals) MetricsRegistryMap(org.apache.samza.metrics.MetricsRegistryMap) Mockito.mock(org.mockito.Mockito.mock)
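
The test above checks each created operator by reading a private field through reflection (`getDeclaredField`, `setAccessible`, `get`). The following is a minimal sketch of that technique in isolation. The classes `PrivateFieldSketch` and `Holder` and the helper `readPrivateField` are hypothetical and only mirror the `"window"` field lookup in the test.

```java
import java.lang.reflect.Field;

// Hypothetical illustration of reading a private field via reflection.
class PrivateFieldSketch {

    static class Holder {
        private final String window;

        Holder(String window) {
            this.window = window;
        }
    }

    /** Reads a field declared directly on the target's class, ignoring its access modifier. */
    static Object readPrivateField(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // bypass the private modifier for this Field object only
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            // Covers NoSuchFieldException and IllegalAccessException.
            throw new IllegalStateException("Cannot read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readPrivateField(new Holder("tumbling"), "window"));
    }
}
```

Note that `getDeclaredField` only sees fields declared on the exact class, not inherited ones, which is why the test looks the field up on the concrete `WindowOperatorImpl`, `StreamOperatorImpl`, and `SinkOperatorImpl` classes.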

Aggregations

OperatorSpec (org.apache.samza.operators.spec.OperatorSpec): 3
PartialJoinOperatorSpec (org.apache.samza.operators.spec.PartialJoinOperatorSpec): 3
MessageStreamImpl (org.apache.samza.operators.MessageStreamImpl): 2
SinkOperatorSpec (org.apache.samza.operators.spec.SinkOperatorSpec): 2
Field (java.lang.reflect.Field): 1
InvocationTargetException (java.lang.reflect.InvocationTargetException): 1
Method (java.lang.reflect.Method): 1
Duration (java.time.Duration): 1
ArrayList (java.util.ArrayList): 1
HashMap (java.util.HashMap): 1
Iterator (java.util.Iterator): 1
Set (java.util.Set): 1
Config (org.apache.samza.config.Config): 1
MetricsRegistryMap (org.apache.samza.metrics.MetricsRegistryMap): 1
MessageStream (org.apache.samza.operators.MessageStream): 1
StreamGraphImpl (org.apache.samza.operators.StreamGraphImpl): 1
TestMessageStreamImplUtil (org.apache.samza.operators.TestMessageStreamImplUtil): 1
TestMessageEnvelope (org.apache.samza.operators.data.TestMessageEnvelope): 1
TestOutputMessageEnvelope (org.apache.samza.operators.data.TestOutputMessageEnvelope): 1
FlatMapFunction (org.apache.samza.operators.functions.FlatMapFunction): 1