
Example 1 with EndOfStreamMessage

Use of org.apache.samza.system.EndOfStreamMessage in project samza by apache.

The class StreamOperatorTask, method processAsync.

/**
 * Passes the incoming message envelopes along to the {@link InputOperatorImpl} node
 * for the input {@link SystemStream}. It is non-blocking and dispatches the message to the container thread
 * pool. The thread pool size is configured through job.container.thread.pool.size. In the absence of the config,
 * the task executes the DAG on the run loop thread.
 * <p>
 * From then on, each {@link org.apache.samza.operators.impl.OperatorImpl} propagates its transformed output to
 * its chained {@link org.apache.samza.operators.impl.OperatorImpl}s itself.
 *
 * @param ime incoming message envelope to process
 * @param collector the collector to send messages with
 * @param coordinator the coordinator to request commits or shutdown
 * @param callback the task callback handle
 */
@Override
public final void processAsync(IncomingMessageEnvelope ime, MessageCollector collector, TaskCoordinator coordinator, TaskCallback callback) {
    Runnable processRunnable = () -> {
        try {
            SystemStream systemStream = ime.getSystemStreamPartition().getSystemStream();
            InputOperatorImpl inputOpImpl = operatorImplGraph.getInputOperator(systemStream);
            if (inputOpImpl != null) {
                CompletionStage<Void> processFuture;
                MessageType messageType = MessageType.of(ime.getMessage());
                switch(messageType) {
                    case USER_MESSAGE:
                        processFuture = inputOpImpl.onMessageAsync(ime, collector, coordinator);
                        break;
                    case END_OF_STREAM:
                        EndOfStreamMessage eosMessage = (EndOfStreamMessage) ime.getMessage();
                        processFuture = inputOpImpl.aggregateEndOfStream(eosMessage, ime.getSystemStreamPartition(), collector, coordinator);
                        break;
                    case WATERMARK:
                        WatermarkMessage watermarkMessage = (WatermarkMessage) ime.getMessage();
                        processFuture = inputOpImpl.aggregateWatermark(watermarkMessage, ime.getSystemStreamPartition(), collector, coordinator);
                        break;
                    default:
                        processFuture = failedFuture(new SamzaException("Unknown message type " + messageType + " encountered."));
                        break;
                }
                processFuture.whenComplete((val, ex) -> {
                    if (ex != null) {
                        callback.failure(ex);
                    } else {
                        callback.complete();
                    }
                });
            } else {
                // If no InputOperator is found in the operator graph for the given SystemStream, fail the callback
                // right away; otherwise the job would only fail later on the async task callback timeout
                // (TaskCallbackTimeoutException).
                final String errMessage = String.format(
                    "InputOperator not found in OperatorGraph for %s. The available input operators are: %s."
                        + " Please check SystemStream configuration for the `SystemConsumer` and/or task.inputs"
                        + " task configuration.",
                    systemStream, operatorImplGraph.getAllInputOperators());
                LOG.error(errMessage);
                callback.failure(new SamzaException(errMessage));
            }
        } catch (Exception e) {
            LOG.error("Failed to process the incoming message due to ", e);
            callback.failure(e);
        }
    };
    if (taskThreadPool != null) {
        LOG.debug("Processing message using thread pool.");
        taskThreadPool.submit(processRunnable);
    } else {
        LOG.debug("Processing message on the run loop thread.");
        processRunnable.run();
    }
}
Also used : IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) Logger(org.slf4j.Logger) InputOperatorImpl(org.apache.samza.operators.impl.InputOperatorImpl) LoggerFactory(org.slf4j.LoggerFactory) CompletableFuture(java.util.concurrent.CompletableFuture) Clock(org.apache.samza.util.Clock) MessageType(org.apache.samza.system.MessageType) OperatorSpecGraph(org.apache.samza.operators.OperatorSpecGraph) SamzaException(org.apache.samza.SamzaException) Context(org.apache.samza.context.Context) CompletionStage(java.util.concurrent.CompletionStage) SystemClock(org.apache.samza.util.SystemClock) OperatorImplGraph(org.apache.samza.operators.impl.OperatorImplGraph) SystemStream(org.apache.samza.system.SystemStream) WatermarkMessage(org.apache.samza.system.WatermarkMessage) Preconditions(com.google.common.base.Preconditions) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) VisibleForTesting(com.google.common.annotations.VisibleForTesting) ExecutorService(java.util.concurrent.ExecutorService)
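
The dispatch pattern in processAsync (run the work inline when no thread pool is configured, otherwise submit it, and translate the resulting CompletionStage into a callback) can be sketched with java.util.concurrent alone. This is a minimal illustration, not Samza code: AsyncDispatchSketch, Callback, dispatch, and failedStage are hypothetical names, and Callback only stands in for the two TaskCallback methods used above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ExecutorService;
import java.util.function.Supplier;

final class AsyncDispatchSketch {
    /** Hypothetical stand-in for the two TaskCallback methods used in the example above. */
    interface Callback {
        void complete();
        void failure(Throwable t);
    }

    /**
     * Runs {@code work} on {@code pool} if one is given, otherwise on the calling thread,
     * and reports the outcome of the returned stage through {@code callback}.
     */
    static void dispatch(Supplier<CompletionStage<Void>> work, ExecutorService pool, Callback callback) {
        Runnable runnable = () -> {
            try {
                work.get().whenComplete((val, ex) -> {
                    if (ex != null) {
                        callback.failure(ex);
                    } else {
                        callback.complete();
                    }
                });
            } catch (Exception e) {
                // Synchronous failures must reach the callback too, or the caller never hears back.
                callback.failure(e);
            }
        };
        if (pool != null) {
            pool.submit(runnable);
        } else {
            runnable.run();
        }
    }

    /** Builds an already-failed stage, similar in spirit to the failedFuture helper referenced above. */
    static <T> CompletionStage<T> failedStage(Throwable t) {
        CompletableFuture<T> future = new CompletableFuture<>();
        future.completeExceptionally(t);
        return future;
    }
}
```

Passing null as the pool mirrors the run-loop-thread branch: the callback has already fired by the time dispatch returns, provided the stage is already complete.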

Example 2 with EndOfStreamMessage

Use of org.apache.samza.system.EndOfStreamMessage in project samza by apache.

The class IntermediateMessageSerde, method toBytes.

@Override
public byte[] toBytes(Object object) {
    final byte[] data;
    final MessageType type = MessageType.of(object);
    switch(type) {
        case USER_MESSAGE:
            data = userMessageSerde.toBytes(object);
            break;
        case WATERMARK:
            data = watermarkSerde.toBytes((WatermarkMessage) object);
            break;
        case END_OF_STREAM:
            data = eosSerde.toBytes((EndOfStreamMessage) object);
            break;
        default:
            throw new SamzaException("Unknown message type: " + type.name());
    }
    final byte[] bytes = new byte[data.length + 1];
    bytes[0] = (byte) type.ordinal();
    System.arraycopy(data, 0, bytes, 1, data.length);
    return bytes;
}
Also used : WatermarkMessage(org.apache.samza.system.WatermarkMessage) SamzaException(org.apache.samza.SamzaException) MessageType(org.apache.samza.system.MessageType) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage)
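
The framing used by toBytes, a one-byte type tag followed by the serialized payload, can be sketched independently of Samza. The decode side is not shown in the source, so the sketch assumes it reads the first byte as an index into the enum's values. TypeTagFramingSketch, Kind, frame, kindOf, and payloadOf are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

final class TypeTagFramingSketch {
    /** Hypothetical message kinds; the ordinal is what gets written as the tag byte. */
    enum Kind { USER_MESSAGE, WATERMARK, END_OF_STREAM }

    /** Prepends a single tag byte (the enum ordinal) to the payload. */
    static byte[] frame(Kind kind, byte[] payload) {
        byte[] framed = new byte[payload.length + 1];
        framed[0] = (byte) kind.ordinal();
        System.arraycopy(payload, 0, framed, 1, payload.length);
        return framed;
    }

    /** Reads the tag byte back into a Kind, rejecting empty input and out-of-range tags. */
    static Kind kindOf(byte[] framed) {
        if (framed.length == 0) {
            throw new IllegalArgumentException("Framed message is empty");
        }
        int tag = framed[0];
        Kind[] kinds = Kind.values();
        if (tag < 0 || tag >= kinds.length) {
            throw new IllegalArgumentException("Unknown message type tag: " + tag);
        }
        return kinds[tag];
    }

    /** Returns the payload without the leading tag byte. */
    static byte[] payloadOf(byte[] framed) {
        if (framed.length == 0) {
            throw new IllegalArgumentException("Framed message is empty");
        }
        return Arrays.copyOfRange(framed, 1, framed.length);
    }
}
```

Because the tag is the enum ordinal, reordering or inserting enum constants changes the bytes written for existing kinds, so a scheme like this depends on the declaration order staying stable.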

Example 3 with EndOfStreamMessage

Use of org.apache.samza.system.EndOfStreamMessage in project samza by apache.

The class TestInMemorySystem, method testEndOfStreamMessageWithoutTask.

@Test
public void testEndOfStreamMessageWithoutTask() {
    EndOfStreamMessage eos = new EndOfStreamMessage();
    produceMessages(eos);
    Set<SystemStreamPartition> sspsToPoll = IntStream.range(0, PARTITION_COUNT)
        .mapToObj(partition -> new SystemStreamPartition(SYSTEM_STREAM, new Partition(partition)))
        .collect(Collectors.toSet());
    List<IncomingMessageEnvelope> results = consumeRawMessages(sspsToPoll);
    assertEquals(1, results.size());
    assertNull(((EndOfStreamMessage) results.get(0).getMessage()).getTaskName());
    assertTrue(results.get(0).isEndOfStream());
}
Also used : IntStream(java.util.stream.IntStream) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) Partition(org.apache.samza.Partition) Set(java.util.Set) StreamSpec(org.apache.samza.system.StreamSpec) Test(org.junit.Test) MetricsRegistry(org.apache.samza.metrics.MetricsRegistry) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Collectors(java.util.stream.Collectors) ArrayList(java.util.ArrayList) Mockito(org.mockito.Mockito) List(java.util.List) Stream(java.util.stream.Stream) SystemConsumer(org.apache.samza.system.SystemConsumer) SystemProducer(org.apache.samza.system.SystemProducer) SystemStream(org.apache.samza.system.SystemStream) Map(java.util.Map) SystemAdmin(org.apache.samza.system.SystemAdmin) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) Config(org.apache.samza.config.Config) Assert(org.junit.Assert) MapConfig(org.apache.samza.config.MapConfig)

Example 4 with EndOfStreamMessage

Use of org.apache.samza.system.EndOfStreamMessage in project samza by apache.

The class TestInMemorySystem, method testEndOfStreamMessageWithTask.

@Test
public void testEndOfStreamMessageWithTask() {
    EndOfStreamMessage eos = new EndOfStreamMessage("test-task");
    produceMessages(eos);
    Set<SystemStreamPartition> sspsToPoll = IntStream.range(0, PARTITION_COUNT)
        .mapToObj(partition -> new SystemStreamPartition(SYSTEM_STREAM, new Partition(partition)))
        .collect(Collectors.toSet());
    List<IncomingMessageEnvelope> results = consumeRawMessages(sspsToPoll);
    assertEquals(1, results.size());
    assertEquals("test-task", ((EndOfStreamMessage) results.get(0).getMessage()).getTaskName());
    assertFalse(results.get(0).isEndOfStream());
}
Also used : IntStream(java.util.stream.IntStream) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) Partition(org.apache.samza.Partition) Set(java.util.Set) StreamSpec(org.apache.samza.system.StreamSpec) Test(org.junit.Test) MetricsRegistry(org.apache.samza.metrics.MetricsRegistry) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Collectors(java.util.stream.Collectors) ArrayList(java.util.ArrayList) Mockito(org.mockito.Mockito) List(java.util.List) Stream(java.util.stream.Stream) SystemConsumer(org.apache.samza.system.SystemConsumer) SystemProducer(org.apache.samza.system.SystemProducer) SystemStream(org.apache.samza.system.SystemStream) Map(java.util.Map) SystemAdmin(org.apache.samza.system.SystemAdmin) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) Config(org.apache.samza.config.Config) Assert(org.junit.Assert) MapConfig(org.apache.samza.config.MapConfig)

Example 5 with EndOfStreamMessage

Use of org.apache.samza.system.EndOfStreamMessage in project samza by apache.

The class TestRunner, method initializeInMemoryInputStream.

/**
 * Creates an in-memory stream with {@link InMemorySystemFactory} and feeds each partition with its messages,
 * followed by an end-of-stream message.
 * @param descriptor describes the stream to initialize with the in-memory system
 * @param partitionData maps each partition ID to the messages for that partition
 */
private <StreamMessageType> void initializeInMemoryInputStream(InMemoryInputDescriptor<?> descriptor, Map<Integer, Iterable<StreamMessageType>> partitionData) {
    String systemName = descriptor.getSystemName();
    String streamName = (String) descriptor.getPhysicalName().orElse(descriptor.getStreamId());
    if (this.app instanceof LegacyTaskApplication) {
        // for legacy applications that only specify task.class.
        if (configs.containsKey(TaskConfig.INPUT_STREAMS)) {
            configs.put(TaskConfig.INPUT_STREAMS, configs.get(TaskConfig.INPUT_STREAMS).concat("," + systemName + "." + streamName));
        } else {
            configs.put(TaskConfig.INPUT_STREAMS, systemName + "." + streamName);
        }
    }
    InMemorySystemDescriptor imsd = (InMemorySystemDescriptor) descriptor.getSystemDescriptor();
    imsd.withInMemoryScope(this.inMemoryScope);
    addConfig(descriptor.toConfig());
    addConfig(descriptor.getSystemDescriptor().toConfig());
    addSerdeConfigs(descriptor);
    StreamSpec spec = new StreamSpec(descriptor.getStreamId(), streamName, systemName, partitionData.size());
    SystemFactory factory = new InMemorySystemFactory();
    Config config = new MapConfig(descriptor.toConfig(), descriptor.getSystemDescriptor().toConfig());
    factory.getAdmin(systemName, config).createStream(spec);
    InMemorySystemProducer producer = (InMemorySystemProducer) factory.getProducer(systemName, config, null);
    SystemStream sysStream = new SystemStream(systemName, streamName);
    partitionData.forEach((partitionId, partition) -> {
        partition.forEach(e -> {
            Object key = e instanceof KV ? ((KV) e).getKey() : null;
            Object value = e instanceof KV ? ((KV) e).getValue() : e;
            if (value instanceof IncomingMessageEnvelope) {
                producer.send((IncomingMessageEnvelope) value);
            } else {
                producer.send(systemName, new OutgoingMessageEnvelope(sysStream, Integer.valueOf(partitionId), key, value));
            }
        });
        producer.send(systemName, new OutgoingMessageEnvelope(sysStream, Integer.valueOf(partitionId), null, new EndOfStreamMessage(null)));
    });
}
Also used : StreamSpec(org.apache.samza.system.StreamSpec) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) SystemFactory(org.apache.samza.system.SystemFactory) MapConfig(org.apache.samza.config.MapConfig) InMemorySystemConfig(org.apache.samza.config.InMemorySystemConfig) JobCoordinatorConfig(org.apache.samza.config.JobCoordinatorConfig) Config(org.apache.samza.config.Config) JobConfig(org.apache.samza.config.JobConfig) ClusterManagerConfig(org.apache.samza.config.ClusterManagerConfig) StreamConfig(org.apache.samza.config.StreamConfig) ApplicationConfig(org.apache.samza.config.ApplicationConfig) TaskConfig(org.apache.samza.config.TaskConfig) SystemStream(org.apache.samza.system.SystemStream) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) LegacyTaskApplication(org.apache.samza.application.LegacyTaskApplication) KV(org.apache.samza.operators.KV) InMemorySystemDescriptor(org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) InMemorySystemProducer(org.apache.samza.system.inmemory.InMemorySystemProducer)
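
The feeding loop in initializeInMemoryInputStream sends every message of a partition and then appends one end-of-stream marker to that partition. A rough stand-alone sketch of that shape, using plain collections instead of a Samza producer, might look as follows. PartitionFeedSketch, END_OF_STREAM, and feed are hypothetical names; the sentinel object only stands in for an EndOfStreamMessage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartitionFeedSketch {
    /** Hypothetical end-of-stream sentinel, standing in for an EndOfStreamMessage. */
    static final Object END_OF_STREAM = new Object() {
        @Override
        public String toString() {
            return "END_OF_STREAM";
        }
    };

    /**
     * Copies each partition's messages in order and appends the sentinel,
     * so that every partition, including an empty one, ends with exactly one marker.
     */
    static <T> Map<Integer, List<Object>> feed(Map<Integer, List<T>> partitionData) {
        Map<Integer, List<Object>> fed = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<T>> entry : partitionData.entrySet()) {
            List<Object> queue = new ArrayList<>(entry.getValue());
            queue.add(END_OF_STREAM);
            fed.put(entry.getKey(), queue);
        }
        return fed;
    }
}
```

Appending the marker per partition rather than once per stream means a consumer of any single partition can tell when its own input is exhausted.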

Aggregations

EndOfStreamMessage (org.apache.samza.system.EndOfStreamMessage) 10
SystemStream (org.apache.samza.system.SystemStream) 7
Config (org.apache.samza.config.Config) 5
IncomingMessageEnvelope (org.apache.samza.system.IncomingMessageEnvelope) 5
StreamSpec (org.apache.samza.system.StreamSpec) 5
SystemStreamPartition (org.apache.samza.system.SystemStreamPartition) 5
Test (org.junit.Test) 5
Set (java.util.Set) 4
Partition (org.apache.samza.Partition) 4
MapConfig (org.apache.samza.config.MapConfig) 4
OutgoingMessageEnvelope (org.apache.samza.system.OutgoingMessageEnvelope) 4
WatermarkMessage (org.apache.samza.system.WatermarkMessage) 4
List (java.util.List) 3
Map (java.util.Map) 3
SamzaException (org.apache.samza.SamzaException) 3
MetricsRegistry (org.apache.samza.metrics.MetricsRegistry) 3
SystemProducer (org.apache.samza.system.SystemProducer) 3
VisibleForTesting (com.google.common.annotations.VisibleForTesting) 2
ArrayList (java.util.ArrayList) 2
Collections (java.util.Collections) 2