
Example 1 with InputTransformer

use of org.apache.samza.system.descriptors.InputTransformer in project samza by apache.

the class TestStreamApplicationDescriptorImpl method testGetInputStreamWithTransformFunction.

@Test
public void testGetInputStreamWithTransformFunction() {
    String streamId = "test-stream-1";
    Serde mockValueSerde = mock(Serde.class);
    InputTransformer transformer = ime -> ime;
    MockTransformingSystemDescriptor sd = new MockTransformingSystemDescriptor("mockSystem", transformer);
    MockInputDescriptor isd = sd.getInputDescriptor(streamId, mockValueSerde);
    StreamApplicationDescriptorImpl streamAppDesc = new StreamApplicationDescriptorImpl(appDesc -> {
        appDesc.getInputStream(isd);
    }, getConfig());
    InputOperatorSpec inputOpSpec = streamAppDesc.getInputOperators().get(streamId);
    assertEquals(OpCode.INPUT, inputOpSpec.getOpCode());
    assertEquals(streamId, inputOpSpec.getStreamId());
    assertEquals(isd, streamAppDesc.getInputDescriptors().get(streamId));
    assertEquals(transformer, inputOpSpec.getTransformer());
}
Also used : Serde(org.apache.samza.serializers.Serde) IntegerSerde(org.apache.samza.serializers.IntegerSerde) NoOpSerde(org.apache.samza.serializers.NoOpSerde) KVSerde(org.apache.samza.serializers.KVSerde) SystemDescriptor(org.apache.samza.system.descriptors.SystemDescriptor) GenericSystemDescriptor(org.apache.samza.system.descriptors.GenericSystemDescriptor) IntermediateMessageStreamImpl(org.apache.samza.operators.stream.IntermediateMessageStreamImpl) HashMap(java.util.HashMap) GenericInputDescriptor(org.apache.samza.system.descriptors.GenericInputDescriptor) AtomicReference(java.util.concurrent.atomic.AtomicReference) TableImpl(org.apache.samza.operators.TableImpl) ArrayList(java.util.ArrayList) OutputStreamImpl(org.apache.samza.operators.spec.OutputStreamImpl) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) BaseTableDescriptor(org.apache.samza.table.descriptors.BaseTableDescriptor) ImmutableList(com.google.common.collect.ImmutableList) AtomicInteger(java.util.concurrent.atomic.AtomicInteger) ApplicationConfig(org.apache.samza.config.ApplicationConfig) InputTransformer(org.apache.samza.system.descriptors.InputTransformer) ProcessorLifecycleListenerFactory(org.apache.samza.runtime.ProcessorLifecycleListenerFactory) Assert.fail(org.junit.Assert.fail) MapConfig(org.apache.samza.config.MapConfig) Mockito.doReturn(org.mockito.Mockito.doReturn) TestMessageEnvelope(org.apache.samza.operators.data.TestMessageEnvelope) OpCode(org.apache.samza.operators.spec.OperatorSpec.OpCode) InputDescriptor(org.apache.samza.system.descriptors.InputDescriptor) ApplicationContainerContextFactory(org.apache.samza.context.ApplicationContainerContextFactory) GenericOutputDescriptor(org.apache.samza.system.descriptors.GenericOutputDescriptor) Assert.assertTrue(org.junit.Assert.assertTrue) Test(org.junit.Test) Mockito.when(org.mockito.Mockito.when) TransformingInputDescriptorProvider(org.apache.samza.system.descriptors.TransformingInputDescriptorProvider) SamzaException(org.apache.samza.SamzaException) ApplicationTaskContextFactory(org.apache.samza.context.ApplicationTaskContextFactory) ExpandingInputDescriptorProvider(org.apache.samza.system.descriptors.ExpandingInputDescriptorProvider) StreamExpander(org.apache.samza.system.descriptors.StreamExpander) Mockito.verify(org.mockito.Mockito.verify) List(java.util.List) Assert.assertFalse(org.junit.Assert.assertFalse) Optional(java.util.Optional) Config(org.apache.samza.config.Config) StreamApplication(org.apache.samza.application.StreamApplication) Assert.assertEquals(org.junit.Assert.assertEquals) InputOperatorSpec(org.apache.samza.operators.spec.InputOperatorSpec) Mockito.mock(org.mockito.Mockito.mock)

Example 2 with InputTransformer

use of org.apache.samza.system.descriptors.InputTransformer in project samza by apache.

the class InputOperatorImpl method handleMessageAsync.

@Override
protected CompletionStage<Collection<Object>> handleMessageAsync(IncomingMessageEnvelope message, MessageCollector collector, TaskCoordinator coordinator) {
    Object result;
    InputTransformer transformer = inputOpSpec.getTransformer();
    if (transformer != null) {
        result = transformer.apply(message);
    } else {
        result = this.inputOpSpec.isKeyed() ? KV.of(message.getKey(), message.getMessage()) : message.getMessage();
    }
    Collection<Object> output = Optional.ofNullable(result).map(Collections::singletonList).orElse(Collections.emptyList());
    return CompletableFuture.completedFuture(output);
}
Also used : InputTransformer(org.apache.samza.system.descriptors.InputTransformer)
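
The null handling in handleMessageAsync can be tried out without Samza on the classpath. The following is a minimal sketch, not Samza code: Envelope and handle are hypothetical stand-ins for IncomingMessageEnvelope and the operator method, and a plain java.util.function.Function is assumed in place of InputTransformer. It shows that a null transformer falls back to the raw message and that a null result becomes an empty collection rather than a collection holding null.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Optional;
import java.util.function.Function;

public class TransformOrDefaultSketch {

    /** Hypothetical stand-in for an incoming envelope: just a key and a message. */
    public static final class Envelope {
        final Object key;
        final Object message;

        public Envelope(Object key, Object message) {
            this.key = key;
            this.message = message;
        }
    }

    /**
     * Applies the transformer if one is configured, otherwise passes the raw message through.
     * A null result is mapped to an empty collection, so downstream code never sees null.
     */
    public static Collection<Object> handle(Envelope envelope, Function<Envelope, Object> transformer) {
        Object result = (transformer != null) ? transformer.apply(envelope) : envelope.message;
        return Optional.ofNullable(result)
            .map(Collections::singletonList)
            .orElse(Collections.emptyList());
    }

    public static void main(String[] args) {
        Envelope envelope = new Envelope("k", "hello");
        System.out.println(handle(envelope, null));                                        // no transformer configured
        System.out.println(handle(envelope, env -> env.message.toString().toUpperCase())); // transformer applied
        System.out.println(handle(envelope, env -> null));                                 // transformer drops the message
    }
}
```

Returning an empty collection for a null result lets a transformer act as a filter without any special casing in the caller.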

Example 3 with InputTransformer

use of org.apache.samza.system.descriptors.InputTransformer in project samza by apache.

the class ScanTranslator method translate.

void translate(final TableScan tableScan, final String queryLogicalId, final String logicalOpId, final TranslatorContext context, Map<String, DelegatingSystemDescriptor> systemDescriptors, Map<String, MessageStream<SamzaSqlInputMessage>> inputMsgStreams) {
    StreamApplicationDescriptor streamAppDesc = context.getStreamAppDescriptor();
    List<String> tableNameParts = tableScan.getTable().getQualifiedName();
    String sourceName = SqlIOConfig.getSourceFromSourceParts(tableNameParts);
    Validate.isTrue(relMsgConverters.containsKey(sourceName), String.format("Unknown source %s", sourceName));
    SqlIOConfig sqlIOConfig = systemStreamConfig.get(sourceName);
    final String systemName = sqlIOConfig.getSystemName();
    final String streamId = sqlIOConfig.getStreamId();
    final String source = sqlIOConfig.getSource();
    final boolean isRemoteTable = sqlIOConfig.getTableDescriptor().isPresent() && (sqlIOConfig.getTableDescriptor().get() instanceof RemoteTableDescriptor || sqlIOConfig.getTableDescriptor().get() instanceof CachingTableDescriptor);
    // Remote and caching tables return early here; the input descriptor created below is only needed
    // to load a stream or the local table.
    if (isRemoteTable) {
        return;
    }
    // set the wrapper input transformer (SamzaSqlInputTransformer) in system descriptor
    DelegatingSystemDescriptor systemDescriptor = systemDescriptors.get(systemName);
    if (systemDescriptor == null) {
        systemDescriptor = new DelegatingSystemDescriptor(systemName, new SamzaSqlInputTransformer());
        systemDescriptors.put(systemName, systemDescriptor);
    } else {
        /* In SamzaSQL, users should not set up a systemDescriptor themselves, so this branch is reached only
         * on fan-out (the same input stream used in multiple SQL statements) or when the same input is used
         * twice in one SQL statement (e.g., select ... from input as i1, input as i2 ...); otherwise, throw an error. */
        if (systemDescriptor.getTransformer().isPresent()) {
            InputTransformer existingTransformer = systemDescriptor.getTransformer().get();
            if (!(existingTransformer instanceof SamzaSqlInputTransformer)) {
                throw new SamzaException("SamzaSQL Exception: existing transformer for " + systemName + " is not SamzaSqlInputTransformer");
            }
        }
    }
    InputDescriptor inputDescriptor = systemDescriptor.getInputDescriptor(streamId, new NoOpSerde<>());
    if (!inputMsgStreams.containsKey(source)) {
        MessageStream<SamzaSqlInputMessage> inputMsgStream = streamAppDesc.getInputStream(inputDescriptor);
        inputMsgStreams.put(source, inputMsgStream.map(new SystemMessageMapperFunction(source, queryId)));
    }
    MessageStream<SamzaSqlRelMessage> samzaSqlRelMessageStream = inputMsgStreams.get(source).filter(new FilterSystemMessageFunction(sourceName, queryId)).map(new ScanMapFunction(sourceName, queryId, queryLogicalId, logicalOpId));
    context.registerMessageStream(tableScan.getId(), samzaSqlRelMessageStream);
}
Also used : SqlIOConfig(org.apache.samza.sql.interfaces.SqlIOConfig) InputDescriptor(org.apache.samza.system.descriptors.InputDescriptor) CachingTableDescriptor(org.apache.samza.table.descriptors.CachingTableDescriptor) RemoteTableDescriptor(org.apache.samza.table.descriptors.RemoteTableDescriptor) SamzaSqlInputMessage(org.apache.samza.sql.SamzaSqlInputMessage) SamzaSqlInputTransformer(org.apache.samza.sql.SamzaSqlInputTransformer) InputTransformer(org.apache.samza.system.descriptors.InputTransformer) SamzaException(org.apache.samza.SamzaException) StreamApplicationDescriptor(org.apache.samza.application.descriptors.StreamApplicationDescriptor) DelegatingSystemDescriptor(org.apache.samza.system.descriptors.DelegatingSystemDescriptor) SamzaSqlRelMessage(org.apache.samza.sql.data.SamzaSqlRelMessage)

Example 4 with InputTransformer

use of org.apache.samza.system.descriptors.InputTransformer in project samza by apache.

the class StreamApplicationDescriptorImpl method getInputStream.

@Override
public <M> MessageStream<M> getInputStream(InputDescriptor<M, ?> inputDescriptor) {
    SystemDescriptor systemDescriptor = inputDescriptor.getSystemDescriptor();
    Optional<StreamExpander> expander = systemDescriptor.getExpander();
    if (expander.isPresent()) {
        return expander.get().apply(this, inputDescriptor);
    }
    // TODO: SAMZA-1841: need to add to the broadcast streams if inputDescriptor is for a broadcast stream
    addInputDescriptor(inputDescriptor);
    String streamId = inputDescriptor.getStreamId();
    Serde serde = inputDescriptor.getSerde();
    KV<Serde, Serde> kvSerdes = getOrCreateStreamSerdes(streamId, serde);
    boolean isKeyed = serde instanceof KVSerde;
    InputTransformer transformer = inputDescriptor.getTransformer().orElse(null);
    InputOperatorSpec inputOperatorSpec = OperatorSpecs.createInputOperatorSpec(streamId, kvSerdes.getKey(), kvSerdes.getValue(), transformer, isKeyed, this.getNextOpId(OpCode.INPUT, null));
    inputOperators.put(streamId, inputOperatorSpec);
    return new MessageStreamImpl(this, inputOperators.get(streamId));
}
Also used : Serde(org.apache.samza.serializers.Serde) KVSerde(org.apache.samza.serializers.KVSerde) InputOperatorSpec(org.apache.samza.operators.spec.InputOperatorSpec) IntermediateMessageStreamImpl(org.apache.samza.operators.stream.IntermediateMessageStreamImpl) MessageStreamImpl(org.apache.samza.operators.MessageStreamImpl) SystemDescriptor(org.apache.samza.system.descriptors.SystemDescriptor) InputTransformer(org.apache.samza.system.descriptors.InputTransformer) StreamExpander(org.apache.samza.system.descriptors.StreamExpander)

Example 5 with InputTransformer

use of org.apache.samza.system.descriptors.InputTransformer in project samza by apache.

the class StreamApplicationDescriptorImpl method getIntermediateStream.

/**
 * Internal helper for {@link MessageStreamImpl} to add an intermediate {@link MessageStream} to the graph.
 * An intermediate {@link MessageStream} is both an output and an input stream.
 *
 * @param streamId the id of the stream to be created.
 * @param serde the {@link Serde} to use for the message in the intermediate stream. Must not be null.
 * @param isBroadcast whether the stream is a broadcast stream.
 * @param <M> the type of messages in the intermediate {@link MessageStream}
 * @return  the intermediate {@link MessageStreamImpl}
 */
@VisibleForTesting
public <M> IntermediateMessageStreamImpl<M> getIntermediateStream(String streamId, Serde<M> serde, boolean isBroadcast) {
    Preconditions.checkNotNull(serde, "serde must not be null for intermediate stream: " + streamId);
    Preconditions.checkState(!inputOperators.containsKey(streamId) && !outputStreams.containsKey(streamId), "getIntermediateStream must not be called multiple times with the same streamId: " + streamId);
    if (isBroadcast) {
        intermediateBroadcastStreamIds.add(streamId);
    }
    boolean isKeyed = serde instanceof KVSerde;
    KV<Serde, Serde> kvSerdes = getOrCreateStreamSerdes(streamId, serde);
    InputTransformer transformer = (InputTransformer) getDefaultSystemDescriptor().flatMap(SystemDescriptor::getTransformer).orElse(null);
    InputOperatorSpec inputOperatorSpec = OperatorSpecs.createInputOperatorSpec(streamId, kvSerdes.getKey(), kvSerdes.getValue(), transformer, isKeyed, this.getNextOpId(OpCode.INPUT, null));
    inputOperators.put(streamId, inputOperatorSpec);
    outputStreams.put(streamId, new OutputStreamImpl(streamId, kvSerdes.getKey(), kvSerdes.getValue(), isKeyed));
    return new IntermediateMessageStreamImpl<>(this, inputOperators.get(streamId), outputStreams.get(streamId));
}
Also used : Serde(org.apache.samza.serializers.Serde) KVSerde(org.apache.samza.serializers.KVSerde) InputOperatorSpec(org.apache.samza.operators.spec.InputOperatorSpec) SystemDescriptor(org.apache.samza.system.descriptors.SystemDescriptor) OutputStreamImpl(org.apache.samza.operators.spec.OutputStreamImpl) IntermediateMessageStreamImpl(org.apache.samza.operators.stream.IntermediateMessageStreamImpl) InputTransformer(org.apache.samza.system.descriptors.InputTransformer) VisibleForTesting(com.google.common.annotations.VisibleForTesting)
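
The bookkeeping in getIntermediateStream (one streamId registered as both an input and an output, with reuse of an id rejected) can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the Samza implementation: IntermediateStreamRegistrySketch and its methods are hypothetical, serdes are represented by plain strings, and the JDK's Objects.requireNonNull and IllegalStateException stand in for Guava's Preconditions.checkNotNull and Preconditions.checkState.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class IntermediateStreamRegistrySketch {
    // Hypothetical stand-ins for the inputOperators and outputStreams maps; values are serde names.
    private final Map<String, String> inputs = new HashMap<>();
    private final Map<String, String> outputs = new HashMap<>();

    /** Registers streamId as both an input and an output; a null serde or a reused id is rejected. */
    public void registerIntermediate(String streamId, String serdeName) {
        Objects.requireNonNull(serdeName, "serde must not be null for intermediate stream: " + streamId);
        if (inputs.containsKey(streamId) || outputs.containsKey(streamId)) {
            throw new IllegalStateException(
                "registerIntermediate must not be called multiple times with the same streamId: " + streamId);
        }
        inputs.put(streamId, serdeName);
        outputs.put(streamId, serdeName);
    }

    public boolean isInput(String streamId) {
        return inputs.containsKey(streamId);
    }

    public boolean isOutput(String streamId) {
        return outputs.containsKey(streamId);
    }

    public static void main(String[] args) {
        IntermediateStreamRegistrySketch registry = new IntermediateStreamRegistrySketch();
        registry.registerIntermediate("partitioned-1", "string-serde");
        System.out.println(registry.isInput("partitioned-1") && registry.isOutput("partitioned-1"));
        try {
            registry.registerIntermediate("partitioned-1", "string-serde");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking both maps before writing to either keeps the two registrations consistent: an id is either present in both or in neither.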

Aggregations

InputTransformer (org.apache.samza.system.descriptors.InputTransformer) 5
InputOperatorSpec (org.apache.samza.operators.spec.InputOperatorSpec) 3
IntermediateMessageStreamImpl (org.apache.samza.operators.stream.IntermediateMessageStreamImpl) 3
KVSerde (org.apache.samza.serializers.KVSerde) 3
Serde (org.apache.samza.serializers.Serde) 3
SystemDescriptor (org.apache.samza.system.descriptors.SystemDescriptor) 3
SamzaException (org.apache.samza.SamzaException) 2
OutputStreamImpl (org.apache.samza.operators.spec.OutputStreamImpl) 2
InputDescriptor (org.apache.samza.system.descriptors.InputDescriptor) 2
StreamExpander (org.apache.samza.system.descriptors.StreamExpander) 2
VisibleForTesting (com.google.common.annotations.VisibleForTesting) 1
ImmutableList (com.google.common.collect.ImmutableList) 1
ArrayList (java.util.ArrayList) 1
HashMap (java.util.HashMap) 1
List (java.util.List) 1
Optional (java.util.Optional) 1
AtomicInteger (java.util.concurrent.atomic.AtomicInteger) 1
AtomicReference (java.util.concurrent.atomic.AtomicReference) 1
StreamApplication (org.apache.samza.application.StreamApplication) 1
StreamApplicationDescriptor (org.apache.samza.application.descriptors.StreamApplicationDescriptor) 1