
Example 1 with PartitionByOperatorSpec

Use of org.apache.samza.operators.spec.PartitionByOperatorSpec in the Apache Samza project.

Class TestMessageStreamImpl, method testPartitionBy.

@Test
public void testPartitionBy() throws IOException {
    // Arrange: mock the application graph so it hands out a fixed operator ID.
    StreamApplicationDescriptorImpl mockGraph = mock(StreamApplicationDescriptorImpl.class);
    OperatorSpec mockOpSpec = mock(OperatorSpec.class);
    String mockOpName = "mockName";
    when(mockGraph.getNextOpId(anyObject(), anyObject())).thenReturn(mockOpName);
    // Arrange: the graph returns a keyed intermediate stream backed by the mocked output stream.
    OutputStreamImpl mockOutputStreamImpl = mock(OutputStreamImpl.class);
    KVSerde mockKVSerde = mock(KVSerde.class);
    IntermediateMessageStreamImpl mockIntermediateStream = mock(IntermediateMessageStreamImpl.class);
    when(mockGraph.getIntermediateStream(eq(mockOpName), eq(mockKVSerde), eq(false))).thenReturn(mockIntermediateStream);
    when(mockIntermediateStream.getOutputStream()).thenReturn(mockOutputStreamImpl);
    when(mockIntermediateStream.isKeyed()).thenReturn(true);
    // Act: repartition the input stream with mocked key and value functions.
    MessageStreamImpl<TestMessageEnvelope> inputStream = new MessageStreamImpl<>(mockGraph, mockOpSpec);
    MapFunction mockKeyFunction = mock(MapFunction.class);
    MapFunction mockValueFunction = mock(MapFunction.class);
    inputStream.partitionBy(mockKeyFunction, mockValueFunction, mockKVSerde, "p1");
    // Assert: capture the operator spec registered on the upstream spec and inspect it.
    ArgumentCaptor<OperatorSpec> registeredOpCaptor = ArgumentCaptor.forClass(OperatorSpec.class);
    verify(mockOpSpec).registerNextOperatorSpec(registeredOpCaptor.capture());
    OperatorSpec<?, TestMessageEnvelope> registeredOpSpec = registeredOpCaptor.getValue();
    assertTrue(registeredOpSpec instanceof PartitionByOperatorSpec);
    assertEquals(OpCode.PARTITION_BY, registeredOpSpec.getOpCode());
    assertEquals(mockOutputStreamImpl, ((PartitionByOperatorSpec) registeredOpSpec).getOutputStream());
    assertEquals(mockKeyFunction, ((PartitionByOperatorSpec) registeredOpSpec).getKeyFunction());
    assertEquals(mockValueFunction, ((PartitionByOperatorSpec) registeredOpSpec).getValueFunction());
}
Also used: IntermediateMessageStreamImpl(org.apache.samza.operators.stream.IntermediateMessageStreamImpl) KVSerde(org.apache.samza.serializers.KVSerde) OutputStreamImpl(org.apache.samza.operators.spec.OutputStreamImpl) MapFunction(org.apache.samza.operators.functions.MapFunction) FlatMapFunction(org.apache.samza.operators.functions.FlatMapFunction) StreamOperatorSpec(org.apache.samza.operators.spec.StreamOperatorSpec) PartitionByOperatorSpec(org.apache.samza.operators.spec.PartitionByOperatorSpec) JoinOperatorSpec(org.apache.samza.operators.spec.JoinOperatorSpec) SendToTableOperatorSpec(org.apache.samza.operators.spec.SendToTableOperatorSpec) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) SinkOperatorSpec(org.apache.samza.operators.spec.SinkOperatorSpec) OutputOperatorSpec(org.apache.samza.operators.spec.OutputOperatorSpec) WindowOperatorSpec(org.apache.samza.operators.spec.WindowOperatorSpec) StreamTableJoinOperatorSpec(org.apache.samza.operators.spec.StreamTableJoinOperatorSpec) TestMessageEnvelope(org.apache.samza.operators.data.TestMessageEnvelope) StreamApplicationDescriptorImpl(org.apache.samza.application.descriptors.StreamApplicationDescriptorImpl) Test(org.junit.Test)

Example 2 with PartitionByOperatorSpec

Use of org.apache.samza.operators.spec.PartitionByOperatorSpec in the Apache Samza project.

Class OperatorImplGraph, method computeOutputToInput.

private static void computeOutputToInput(SystemStream input, OperatorSpec opSpec, Multimap<SystemStream, SystemStream> outputToInputStreams, StreamConfig streamConfig) {
    if (opSpec instanceof PartitionByOperatorSpec) {
        // A partitionBy writes to an intermediate stream: record output -> input and stop descending.
        PartitionByOperatorSpec spec = (PartitionByOperatorSpec) opSpec;
        SystemStream systemStream = streamConfig.streamIdToSystemStream(spec.getOutputStream().getStreamId());
        outputToInputStreams.put(systemStream, input);
    } else if (opSpec instanceof BroadcastOperatorSpec) {
        // A broadcast is handled the same way as a partitionBy.
        BroadcastOperatorSpec spec = (BroadcastOperatorSpec) opSpec;
        SystemStream systemStream = streamConfig.streamIdToSystemStream(spec.getOutputStream().getStreamId());
        outputToInputStreams.put(systemStream, input);
    } else {
        // Any other operator: recurse into every operator registered downstream of it.
        Collection<OperatorSpec> nextOperators = opSpec.getRegisteredOperatorSpecs();
        nextOperators.forEach(spec -> computeOutputToInput(input, spec, outputToInputStreams, streamConfig));
    }
}
Also used: StreamOperatorSpec(org.apache.samza.operators.spec.StreamOperatorSpec) BroadcastOperatorSpec(org.apache.samza.operators.spec.BroadcastOperatorSpec) PartialJoinFunction(org.apache.samza.operators.functions.PartialJoinFunction) PartitionByOperatorSpec(org.apache.samza.operators.spec.PartitionByOperatorSpec) SendToTableWithUpdateOperatorSpec(org.apache.samza.operators.spec.SendToTableWithUpdateOperatorSpec) LoggerFactory(org.slf4j.LoggerFactory) JoinOperatorSpec(org.apache.samza.operators.spec.JoinOperatorSpec) HashMap(java.util.HashMap) TimestampedValue(org.apache.samza.util.TimestampedValue) Multimap(com.google.common.collect.Multimap) OperatorSpecGraph(org.apache.samza.operators.OperatorSpecGraph) StreamConfig(org.apache.samza.config.StreamConfig) SendToTableOperatorSpec(org.apache.samza.operators.spec.SendToTableOperatorSpec) ArrayList(java.util.ArrayList) LinkedHashMap(java.util.LinkedHashMap) HashMultimap(com.google.common.collect.HashMultimap) Lists(com.google.common.collect.Lists) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) SinkOperatorSpec(org.apache.samza.operators.spec.SinkOperatorSpec) SystemStream(org.apache.samza.system.SystemStream) AsyncFlatMapOperatorSpec(org.apache.samza.operators.spec.AsyncFlatMapOperatorSpec) Map(java.util.Map) KV(org.apache.samza.operators.KV) JobModel(org.apache.samza.job.model.JobModel) Logger(org.slf4j.Logger) OutputOperatorSpec(org.apache.samza.operators.spec.OutputOperatorSpec) Collection(java.util.Collection) WindowOperatorSpec(org.apache.samza.operators.spec.WindowOperatorSpec) Scheduler(org.apache.samza.operators.Scheduler) Clock(org.apache.samza.util.Clock) JoinFunction(org.apache.samza.operators.functions.JoinFunction) Collectors(java.util.stream.Collectors) Context(org.apache.samza.context.Context) List(java.util.List) StreamTableJoinOperatorSpec(org.apache.samza.operators.spec.StreamTableJoinOperatorSpec) Config(org.apache.samza.config.Config) KeyValueStore(org.apache.samza.storage.kv.KeyValueStore) Collections(java.util.Collections) InternalTaskContext(org.apache.samza.context.InternalTaskContext) InputOperatorSpec(org.apache.samza.operators.spec.InputOperatorSpec)
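The traversal in computeOutputToInput can be tried out without Samza on the classpath. The following is a minimal sketch under stated assumptions, not the project's code: the Node class, its outputStream field, and the collect method are hypothetical stand-ins for OperatorSpec, the intermediate stream ID, and computeOutputToInput, and a plain Map of Sets stands in for the Guava Multimap parameter.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified stand-in for the traversal in computeOutputToInput; every type here is hypothetical. */
final class OutputToInputSketch {

    /** Hypothetical operator node; a boundary node writes to an intermediate stream. */
    static final class Node {
        // Non-null only for boundary nodes (the partitionBy/broadcast cases in the original).
        final String outputStream;
        final List<Node> next = new ArrayList<>();

        Node(String outputStream) {
            this.outputStream = outputStream;
        }

        /** Registers a downstream node and returns this node for chaining. */
        Node then(Node child) {
            next.add(child);
            return this;
        }
    }

    /** Records output -> input for the first boundary node found on each path below {@code node}. */
    static void collect(String input, Node node, Map<String, Set<String>> outputToInputs) {
        if (node.outputStream != null) {
            // Boundary reached: record the mapping and stop descending on this path.
            outputToInputs.computeIfAbsent(node.outputStream, k -> new LinkedHashSet<>()).add(input);
        } else {
            // Ordinary operator: keep walking every downstream node.
            for (Node child : node.next) {
                collect(input, child, outputToInputs);
            }
        }
    }
}
```

Calling collect once per input stream on the same graph accumulates all inputs that feed a given intermediate stream under one key, which is the role the Multimap plays in the original method.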

Example 3 with PartitionByOperatorSpec

Use of org.apache.samza.operators.spec.PartitionByOperatorSpec in the Apache Samza project.

Class OperatorImplGraph, method createOperatorImpl.

/**
 * Creates a new {@link OperatorImpl} instance for the provided {@link OperatorSpec}.
 *
 * @param prevOperatorSpec the original {@link OperatorSpec} that produces output for {@code operatorSpec} from {@link OperatorSpecGraph}
 * @param operatorSpec  the original {@link OperatorSpec} from {@link OperatorSpecGraph}
 * @param context  the {@link Context} required to instantiate operators
 * @return  the {@link OperatorImpl} implementation instance
 */
OperatorImpl createOperatorImpl(OperatorSpec prevOperatorSpec, OperatorSpec operatorSpec, Context context) {
    Config config = context.getJobContext().getConfig();
    StreamConfig streamConfig = new StreamConfig(config);
    if (operatorSpec instanceof InputOperatorSpec) {
        return new InputOperatorImpl((InputOperatorSpec) operatorSpec);
    } else if (operatorSpec instanceof StreamOperatorSpec) {
        return new FlatmapOperatorImpl((StreamOperatorSpec) operatorSpec);
    } else if (operatorSpec instanceof SinkOperatorSpec) {
        return new SinkOperatorImpl((SinkOperatorSpec) operatorSpec);
    } else if (operatorSpec instanceof OutputOperatorSpec) {
        String streamId = ((OutputOperatorSpec) operatorSpec).getOutputStream().getStreamId();
        SystemStream systemStream = streamConfig.streamIdToSystemStream(streamId);
        return new OutputOperatorImpl((OutputOperatorSpec) operatorSpec, systemStream);
    } else if (operatorSpec instanceof PartitionByOperatorSpec) {
        String streamId = ((PartitionByOperatorSpec) operatorSpec).getOutputStream().getStreamId();
        SystemStream systemStream = streamConfig.streamIdToSystemStream(streamId);
        return new PartitionByOperatorImpl((PartitionByOperatorSpec) operatorSpec, systemStream, internalTaskContext);
    } else if (operatorSpec instanceof WindowOperatorSpec) {
        return new WindowOperatorImpl((WindowOperatorSpec) operatorSpec, clock);
    } else if (operatorSpec instanceof JoinOperatorSpec) {
        return getOrCreatePartialJoinOpImpls((JoinOperatorSpec) operatorSpec, prevOperatorSpec.equals(((JoinOperatorSpec) operatorSpec).getLeftInputOpSpec()), clock);
    } else if (operatorSpec instanceof StreamTableJoinOperatorSpec) {
        return new StreamTableJoinOperatorImpl((StreamTableJoinOperatorSpec) operatorSpec, context);
    } else if (operatorSpec instanceof SendToTableOperatorSpec) {
        return new SendToTableOperatorImpl((SendToTableOperatorSpec) operatorSpec, context);
    } else if (operatorSpec instanceof SendToTableWithUpdateOperatorSpec) {
        return new SendToTableWithUpdateOperatorImpl((SendToTableWithUpdateOperatorSpec) operatorSpec, context);
    } else if (operatorSpec instanceof BroadcastOperatorSpec) {
        String streamId = ((BroadcastOperatorSpec) operatorSpec).getOutputStream().getStreamId();
        SystemStream systemStream = streamConfig.streamIdToSystemStream(streamId);
        return new BroadcastOperatorImpl((BroadcastOperatorSpec) operatorSpec, systemStream, context);
    } else if (operatorSpec instanceof AsyncFlatMapOperatorSpec) {
        return new AsyncFlatmapOperatorImpl((AsyncFlatMapOperatorSpec) operatorSpec);
    }
    throw new IllegalArgumentException(String.format("Unsupported OperatorSpec: %s", operatorSpec.getClass().getName()));
}
Also used: StreamConfig(org.apache.samza.config.StreamConfig) Config(org.apache.samza.config.Config) JoinOperatorSpec(org.apache.samza.operators.spec.JoinOperatorSpec) StreamTableJoinOperatorSpec(org.apache.samza.operators.spec.StreamTableJoinOperatorSpec) OutputOperatorSpec(org.apache.samza.operators.spec.OutputOperatorSpec) StreamOperatorSpec(org.apache.samza.operators.spec.StreamOperatorSpec) PartitionByOperatorSpec(org.apache.samza.operators.spec.PartitionByOperatorSpec) BroadcastOperatorSpec(org.apache.samza.operators.spec.BroadcastOperatorSpec) InputOperatorSpec(org.apache.samza.operators.spec.InputOperatorSpec) SystemStream(org.apache.samza.system.SystemStream) WindowOperatorSpec(org.apache.samza.operators.spec.WindowOperatorSpec) SendToTableWithUpdateOperatorSpec(org.apache.samza.operators.spec.SendToTableWithUpdateOperatorSpec) SendToTableOperatorSpec(org.apache.samza.operators.spec.SendToTableOperatorSpec) AsyncFlatMapOperatorSpec(org.apache.samza.operators.spec.AsyncFlatMapOperatorSpec) SinkOperatorSpec(org.apache.samza.operators.spec.SinkOperatorSpec)

Example 4 with PartitionByOperatorSpec

Use of org.apache.samza.operators.spec.PartitionByOperatorSpec in the Apache Samza project.

Class JobGraphJsonGenerator, method operatorToMap.

/**
 * Format the operator properties into a map
 * @param spec a {@link OperatorSpec} instance
 * @return map of the operator properties
 */
@VisibleForTesting
Map<String, Object> operatorToMap(OperatorSpec spec) {
    Map<String, Object> map = new HashMap<>();
    map.put("opCode", spec.getOpCode().name());
    map.put("opId", spec.getOpId());
    map.put("sourceLocation", spec.getSourceLocation());
    Collection<OperatorSpec> nextOperators = spec.getRegisteredOperatorSpecs();
    map.put("nextOperatorIds", nextOperators.stream().map(OperatorSpec::getOpId).collect(Collectors.toSet()));
    if (spec instanceof OutputOperatorSpec) {
        OutputStreamImpl outputStream = ((OutputOperatorSpec) spec).getOutputStream();
        map.put("outputStreamId", outputStream.getStreamId());
    } else if (spec instanceof PartitionByOperatorSpec) {
        OutputStreamImpl outputStream = ((PartitionByOperatorSpec) spec).getOutputStream();
        map.put("outputStreamId", outputStream.getStreamId());
    }
    if (spec instanceof StreamTableJoinOperatorSpec) {
        String tableId = ((StreamTableJoinOperatorSpec) spec).getTableId();
        map.put("tableId", tableId);
    }
    if (spec instanceof SendToTableOperatorSpec) {
        String tableId = ((SendToTableOperatorSpec) spec).getTableId();
        map.put("tableId", tableId);
    }
    if (spec instanceof JoinOperatorSpec) {
        map.put("ttlMs", ((JoinOperatorSpec) spec).getTtlMs());
    }
    return map;
}
Also used: PartitionByOperatorSpec(org.apache.samza.operators.spec.PartitionByOperatorSpec) OutputOperatorSpec(org.apache.samza.operators.spec.OutputOperatorSpec) JoinOperatorSpec(org.apache.samza.operators.spec.JoinOperatorSpec) SendToTableOperatorSpec(org.apache.samza.operators.spec.SendToTableOperatorSpec) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) StreamTableJoinOperatorSpec(org.apache.samza.operators.spec.StreamTableJoinOperatorSpec) OutputStreamImpl(org.apache.samza.operators.spec.OutputStreamImpl) HashMap(java.util.HashMap) VisibleForTesting(com.google.common.annotations.VisibleForTesting)

Aggregations

JoinOperatorSpec (org.apache.samza.operators.spec.JoinOperatorSpec): 4
OutputOperatorSpec (org.apache.samza.operators.spec.OutputOperatorSpec): 4
PartitionByOperatorSpec (org.apache.samza.operators.spec.PartitionByOperatorSpec): 4
SendToTableOperatorSpec (org.apache.samza.operators.spec.SendToTableOperatorSpec): 4
StreamTableJoinOperatorSpec (org.apache.samza.operators.spec.StreamTableJoinOperatorSpec): 4
OperatorSpec (org.apache.samza.operators.spec.OperatorSpec): 3
SinkOperatorSpec (org.apache.samza.operators.spec.SinkOperatorSpec): 3
StreamOperatorSpec (org.apache.samza.operators.spec.StreamOperatorSpec): 3
WindowOperatorSpec (org.apache.samza.operators.spec.WindowOperatorSpec): 3
HashMap (java.util.HashMap): 2
Config (org.apache.samza.config.Config): 2
StreamConfig (org.apache.samza.config.StreamConfig): 2
AsyncFlatMapOperatorSpec (org.apache.samza.operators.spec.AsyncFlatMapOperatorSpec): 2
BroadcastOperatorSpec (org.apache.samza.operators.spec.BroadcastOperatorSpec): 2
InputOperatorSpec (org.apache.samza.operators.spec.InputOperatorSpec): 2
OutputStreamImpl (org.apache.samza.operators.spec.OutputStreamImpl): 2
SendToTableWithUpdateOperatorSpec (org.apache.samza.operators.spec.SendToTableWithUpdateOperatorSpec): 2
SystemStream (org.apache.samza.system.SystemStream): 2
VisibleForTesting (com.google.common.annotations.VisibleForTesting): 1
HashMultimap (com.google.common.collect.HashMultimap): 1