Search in sources :

Example 1 with InMemorySystemFactory

use of org.apache.samza.system.inmemory.InMemorySystemFactory in project samza by apache.

the class TestRunner method initializeInMemoryInputStream.

/**
 * Creates an in memory stream with {@link InMemorySystemFactory} and feeds its partition with stream of messages
 * @param partitionData key of the map represents partitionId and value represents messages in the partition
 * @param descriptor describes a stream to initialize with the in memory system
 */
private <StreamMessageType> void initializeInMemoryInputStream(InMemoryInputDescriptor<?> descriptor, Map<Integer, Iterable<StreamMessageType>> partitionData) {
    String systemName = descriptor.getSystemName();
    String streamName = (String) descriptor.getPhysicalName().orElse(descriptor.getStreamId());
    if (this.app instanceof LegacyTaskApplication) {
        // for legacy applications that only specify task.class.
        if (configs.containsKey(TaskConfig.INPUT_STREAMS)) {
            configs.put(TaskConfig.INPUT_STREAMS, configs.get(TaskConfig.INPUT_STREAMS).concat("," + systemName + "." + streamName));
        } else {
            configs.put(TaskConfig.INPUT_STREAMS, systemName + "." + streamName);
        }
    }
    InMemorySystemDescriptor imsd = (InMemorySystemDescriptor) descriptor.getSystemDescriptor();
    imsd.withInMemoryScope(this.inMemoryScope);
    addConfig(descriptor.toConfig());
    addConfig(descriptor.getSystemDescriptor().toConfig());
    addSerdeConfigs(descriptor);
    StreamSpec spec = new StreamSpec(descriptor.getStreamId(), streamName, systemName, partitionData.size());
    SystemFactory factory = new InMemorySystemFactory();
    Config config = new MapConfig(descriptor.toConfig(), descriptor.getSystemDescriptor().toConfig());
    factory.getAdmin(systemName, config).createStream(spec);
    InMemorySystemProducer producer = (InMemorySystemProducer) factory.getProducer(systemName, config, null);
    SystemStream sysStream = new SystemStream(systemName, streamName);
    partitionData.forEach((partitionId, partition) -> {
        partition.forEach(e -> {
            Object key = e instanceof KV ? ((KV) e).getKey() : null;
            Object value = e instanceof KV ? ((KV) e).getValue() : e;
            if (value instanceof IncomingMessageEnvelope) {
                producer.send((IncomingMessageEnvelope) value);
            } else {
                producer.send(systemName, new OutgoingMessageEnvelope(sysStream, Integer.valueOf(partitionId), key, value));
            }
        });
        producer.send(systemName, new OutgoingMessageEnvelope(sysStream, Integer.valueOf(partitionId), null, new EndOfStreamMessage(null)));
    });
}
Also used : StreamSpec(org.apache.samza.system.StreamSpec) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) SystemFactory(org.apache.samza.system.SystemFactory) MapConfig(org.apache.samza.config.MapConfig) InMemorySystemConfig(org.apache.samza.config.InMemorySystemConfig) JobCoordinatorConfig(org.apache.samza.config.JobCoordinatorConfig) Config(org.apache.samza.config.Config) JobConfig(org.apache.samza.config.JobConfig) ClusterManagerConfig(org.apache.samza.config.ClusterManagerConfig) StreamConfig(org.apache.samza.config.StreamConfig) ApplicationConfig(org.apache.samza.config.ApplicationConfig) TaskConfig(org.apache.samza.config.TaskConfig) SystemStream(org.apache.samza.system.SystemStream) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) LegacyTaskApplication(org.apache.samza.application.LegacyTaskApplication) KV(org.apache.samza.operators.KV) InMemorySystemDescriptor(org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) MapConfig(org.apache.samza.config.MapConfig) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) InMemorySystemProducer(org.apache.samza.system.inmemory.InMemorySystemProducer)

Example 2 with InMemorySystemFactory

use of org.apache.samza.system.inmemory.InMemorySystemFactory in project samza by apache.

the class TestRunner method consumeStream.

/**
 * Gets the contents of the output stream represented by {@code outputDescriptor} after {@link TestRunner#run(Duration)}
 * has completed
 *
 * @param outputDescriptor describes the stream to be consumed
 * @param timeout timeout for consumption of stream in Ms
 * @param <StreamMessageType> type of message
 *
 * @return a map whose key is {@code partitionId} and value is messages in partition
 * @throws SamzaException Thrown when a poll is incomplete
 */
public static <StreamMessageType> Map<Integer, List<StreamMessageType>> consumeStream(InMemoryOutputDescriptor outputDescriptor, Duration timeout) throws SamzaException {
    Preconditions.checkNotNull(outputDescriptor);
    String streamId = outputDescriptor.getStreamId();
    String systemName = outputDescriptor.getSystemName();
    Set<SystemStreamPartition> ssps = new HashSet<>();
    Set<String> streamIds = new HashSet<>();
    streamIds.add(streamId);
    SystemFactory factory = new InMemorySystemFactory();
    Config config = new MapConfig(outputDescriptor.toConfig(), outputDescriptor.getSystemDescriptor().toConfig());
    Map<String, SystemStreamMetadata> metadata = factory.getAdmin(systemName, config).getSystemStreamMetadata(streamIds);
    SystemConsumer consumer = factory.getConsumer(systemName, config, null);
    String name = (String) outputDescriptor.getPhysicalName().orElse(streamId);
    metadata.get(name).getSystemStreamPartitionMetadata().keySet().forEach(partition -> {
        SystemStreamPartition temp = new SystemStreamPartition(systemName, streamId, partition);
        ssps.add(temp);
        consumer.register(temp, "0");
    });
    long t = System.currentTimeMillis();
    Map<SystemStreamPartition, List<IncomingMessageEnvelope>> output = new HashMap<>();
    HashSet<SystemStreamPartition> didNotReachEndOfStream = new HashSet<>(ssps);
    while (System.currentTimeMillis() < t + timeout.toMillis()) {
        Map<SystemStreamPartition, List<IncomingMessageEnvelope>> currentState = null;
        try {
            currentState = consumer.poll(ssps, 10);
        } catch (InterruptedException e) {
            throw new SamzaException("Timed out while consuming stream \n" + e.getMessage());
        }
        for (Map.Entry<SystemStreamPartition, List<IncomingMessageEnvelope>> entry : currentState.entrySet()) {
            SystemStreamPartition ssp = entry.getKey();
            output.computeIfAbsent(ssp, k -> new LinkedList<IncomingMessageEnvelope>());
            List<IncomingMessageEnvelope> currentBuffer = entry.getValue();
            int totalMessagesToFetch = Integer.valueOf(metadata.get(outputDescriptor.getStreamId()).getSystemStreamPartitionMetadata().get(ssp.getPartition()).getUpcomingOffset());
            if (output.get(ssp).size() + currentBuffer.size() == totalMessagesToFetch) {
                didNotReachEndOfStream.remove(entry.getKey());
                ssps.remove(entry.getKey());
            }
            output.get(ssp).addAll(currentBuffer);
        }
        if (didNotReachEndOfStream.isEmpty()) {
            break;
        }
    }
    if (!didNotReachEndOfStream.isEmpty()) {
        throw new IllegalStateException("Could not poll for all system stream partitions");
    }
    return output.entrySet().stream().collect(Collectors.toMap(entry -> entry.getKey().getPartition().getPartitionId(), entry -> entry.getValue().stream().map(e -> (StreamMessageType) e.getMessage()).collect(Collectors.toList())));
}
Also used : InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) InMemorySystemDescriptor(org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor) LegacyTaskApplication(org.apache.samza.application.LegacyTaskApplication) LoggerFactory(org.slf4j.LoggerFactory) InMemorySystemProducer(org.apache.samza.system.inmemory.InMemorySystemProducer) SingleContainerGrouperFactory(org.apache.samza.container.grouper.task.SingleContainerGrouperFactory) FileUtil(org.apache.samza.util.FileUtil) InMemoryInputDescriptor(org.apache.samza.test.framework.system.descriptors.InMemoryInputDescriptor) SystemConsumer(org.apache.samza.system.SystemConsumer) Duration(java.time.Duration) Map(java.util.Map) StreamTask(org.apache.samza.task.StreamTask) SamzaApplication(org.apache.samza.application.SamzaApplication) InMemoryMetadataStoreFactory(org.apache.samza.metadatastore.InMemoryMetadataStoreFactory) ExternalContext(org.apache.samza.context.ExternalContext) MapConfig(org.apache.samza.config.MapConfig) KV(org.apache.samza.operators.KV) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) AsyncStreamTask(org.apache.samza.task.AsyncStreamTask) StreamDescriptor(org.apache.samza.system.descriptors.StreamDescriptor) Set(java.util.Set) InMemorySystemConfig(org.apache.samza.config.InMemorySystemConfig) PassthroughJobCoordinatorFactory(org.apache.samza.standalone.PassthroughJobCoordinatorFactory) Collectors(java.util.stream.Collectors) List(java.util.List) RandomStringUtils(org.apache.commons.lang3.RandomStringUtils) JobCoordinatorConfig(org.apache.samza.config.JobCoordinatorConfig) Config(org.apache.samza.config.Config) ApplicationStatus(org.apache.samza.job.ApplicationStatus) JobConfig(org.apache.samza.config.JobConfig) HashMap(java.util.HashMap) ClusterManagerConfig(org.apache.samza.config.ClusterManagerConfig) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) SystemStreamMetadata(org.apache.samza.system.SystemStreamMetadata) StreamConfig(org.apache.samza.config.StreamConfig) HashSet(java.util.HashSet) SystemStream(org.apache.samza.system.SystemStream) ApplicationConfig(org.apache.samza.config.ApplicationConfig) LinkedList(java.util.LinkedList) LocalApplicationRunner(org.apache.samza.runtime.LocalApplicationRunner) InMemoryOutputDescriptor(org.apache.samza.test.framework.system.descriptors.InMemoryOutputDescriptor) Logger(org.slf4j.Logger) TaskConfig(org.apache.samza.config.TaskConfig) JobPlanner(org.apache.samza.execution.JobPlanner) SystemFactory(org.apache.samza.system.SystemFactory) StreamSpec(org.apache.samza.system.StreamSpec) File(java.io.File) SamzaException(org.apache.samza.SamzaException) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) Preconditions(com.google.common.base.Preconditions) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) SystemConsumer(org.apache.samza.system.SystemConsumer) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) SystemFactory(org.apache.samza.system.SystemFactory) HashMap(java.util.HashMap) MapConfig(org.apache.samza.config.MapConfig) InMemorySystemConfig(org.apache.samza.config.InMemorySystemConfig) JobCoordinatorConfig(org.apache.samza.config.JobCoordinatorConfig) Config(org.apache.samza.config.Config) JobConfig(org.apache.samza.config.JobConfig) ClusterManagerConfig(org.apache.samza.config.ClusterManagerConfig) StreamConfig(org.apache.samza.config.StreamConfig) ApplicationConfig(org.apache.samza.config.ApplicationConfig) TaskConfig(org.apache.samza.config.TaskConfig) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) SamzaException(org.apache.samza.SamzaException) List(java.util.List) LinkedList(java.util.LinkedList) MapConfig(org.apache.samza.config.MapConfig) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) HashSet(java.util.HashSet) SystemStreamMetadata(org.apache.samza.system.SystemStreamMetadata) Map(java.util.Map) HashMap(java.util.HashMap) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition)

Example 3 with InMemorySystemFactory

use of org.apache.samza.system.inmemory.InMemorySystemFactory in project samza by apache.

the class TestRunner method addOutputStream.

/**
 * Adds the provided output stream to the test application. Default configs and user added configs have a higher
 * precedence over system and stream descriptor generated configs.
 * @param streamDescriptor describes the stream that is supposed to be output for the Samza application
 * @param partitionCount partition count of output stream
 * @return this {@link TestRunner}
 */
public TestRunner addOutputStream(InMemoryOutputDescriptor<?> streamDescriptor, int partitionCount) {
    Preconditions.checkNotNull(streamDescriptor);
    Preconditions.checkState(partitionCount >= 1);
    InMemorySystemDescriptor imsd = (InMemorySystemDescriptor) streamDescriptor.getSystemDescriptor();
    imsd.withInMemoryScope(this.inMemoryScope);
    Config config = new MapConfig(streamDescriptor.toConfig(), streamDescriptor.getSystemDescriptor().toConfig());
    InMemorySystemFactory factory = new InMemorySystemFactory();
    String physicalName = (String) streamDescriptor.getPhysicalName().orElse(streamDescriptor.getStreamId());
    StreamSpec spec = new StreamSpec(streamDescriptor.getStreamId(), physicalName, streamDescriptor.getSystemName(), partitionCount);
    factory.getAdmin(streamDescriptor.getSystemName(), config).createStream(spec);
    addConfig(streamDescriptor.toConfig());
    addConfig(streamDescriptor.getSystemDescriptor().toConfig());
    addSerdeConfigs(streamDescriptor);
    return this;
}
Also used : StreamSpec(org.apache.samza.system.StreamSpec) MapConfig(org.apache.samza.config.MapConfig) InMemorySystemConfig(org.apache.samza.config.InMemorySystemConfig) JobCoordinatorConfig(org.apache.samza.config.JobCoordinatorConfig) Config(org.apache.samza.config.Config) JobConfig(org.apache.samza.config.JobConfig) ClusterManagerConfig(org.apache.samza.config.ClusterManagerConfig) StreamConfig(org.apache.samza.config.StreamConfig) ApplicationConfig(org.apache.samza.config.ApplicationConfig) TaskConfig(org.apache.samza.config.TaskConfig) MapConfig(org.apache.samza.config.MapConfig) InMemorySystemDescriptor(org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory)

Example 4 with InMemorySystemFactory

use of org.apache.samza.system.inmemory.InMemorySystemFactory in project beam by apache.

the class TranslationContext method createDummyStreamDescriptor.

/**
 * The dummy stream created will only be used in Beam tests.
 */
private static InputDescriptor<OpMessage<String>, ?> createDummyStreamDescriptor(String id) {
    final GenericSystemDescriptor dummySystem = new GenericSystemDescriptor(id, InMemorySystemFactory.class.getName());
    final GenericInputDescriptor<OpMessage<String>> dummyInput = dummySystem.getInputDescriptor(id, new NoOpSerde<>());
    dummyInput.withOffsetDefault(SystemStreamMetadata.OffsetType.OLDEST);
    final Config config = new MapConfig(dummyInput.toConfig(), dummySystem.toConfig());
    final SystemFactory factory = new InMemorySystemFactory();
    final StreamSpec dummyStreamSpec = new StreamSpec(id, id, id, 1);
    factory.getAdmin(id, config).createStream(dummyStreamSpec);
    final SystemProducer producer = factory.getProducer(id, config, null);
    final SystemStream sysStream = new SystemStream(id, id);
    final Consumer<Object> sendFn = (msg) -> {
        producer.send(id, new OutgoingMessageEnvelope(sysStream, 0, null, msg));
    };
    final WindowedValue<String> windowedValue = WindowedValue.timestampedValueInGlobalWindow("dummy", new Instant());
    sendFn.accept(OpMessage.ofElement(windowedValue));
    sendFn.accept(new WatermarkMessage(BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()));
    sendFn.accept(new EndOfStreamMessage(null));
    return dummyInput;
}
Also used : InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) WindowedValue(org.apache.beam.sdk.util.WindowedValue) TableDescriptor(org.apache.samza.table.descriptors.TableDescriptor) GenericSystemDescriptor(org.apache.samza.system.descriptors.GenericSystemDescriptor) LoggerFactory(org.slf4j.LoggerFactory) HashMap(java.util.HashMap) OpMessage(org.apache.beam.runners.samza.runtime.OpMessage) GenericInputDescriptor(org.apache.samza.system.descriptors.GenericInputDescriptor) TransformInputs(org.apache.beam.runners.core.construction.TransformInputs) SystemStreamMetadata(org.apache.samza.system.SystemStreamMetadata) PTransform(org.apache.beam.sdk.transforms.PTransform) HashSet(java.util.HashSet) TupleTag(org.apache.beam.sdk.values.TupleTag) SystemStream(org.apache.samza.system.SystemStream) Map(java.util.Map) Iterables(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables) WatermarkMessage(org.apache.samza.system.WatermarkMessage) MapConfig(org.apache.samza.config.MapConfig) KV(org.apache.samza.operators.KV) NoOpSerde(org.apache.samza.serializers.NoOpSerde) AppliedPTransform(org.apache.beam.sdk.runners.AppliedPTransform) OutputDescriptor(org.apache.samza.system.descriptors.OutputDescriptor) MessageStream(org.apache.samza.operators.MessageStream) Table(org.apache.samza.table.Table) InputDescriptor(org.apache.samza.system.descriptors.InputDescriptor) Logger(org.slf4j.Logger) Set(java.util.Set) SystemFactory(org.apache.samza.system.SystemFactory) StreamSpec(org.apache.samza.system.StreamSpec) UUID(java.util.UUID) PCollection(org.apache.beam.sdk.values.PCollection) HashIdGenerator(org.apache.beam.runners.samza.util.HashIdGenerator) Consumer(java.util.function.Consumer) SamzaPipelineOptions(org.apache.beam.runners.samza.SamzaPipelineOptions) List(java.util.List) PValue(org.apache.beam.sdk.values.PValue) SystemProducer(org.apache.samza.system.SystemProducer) StreamApplicationDescriptor(org.apache.samza.application.descriptors.StreamApplicationDescriptor) PCollectionView(org.apache.beam.sdk.values.PCollectionView) BoundedWindow(org.apache.beam.sdk.transforms.windowing.BoundedWindow) Instant(org.joda.time.Instant) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) Config(org.apache.samza.config.Config) Collections(java.util.Collections) OutputStream(org.apache.samza.operators.OutputStream) StreamSpec(org.apache.samza.system.StreamSpec) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory) SystemFactory(org.apache.samza.system.SystemFactory) OpMessage(org.apache.beam.runners.samza.runtime.OpMessage) MapConfig(org.apache.samza.config.MapConfig) Config(org.apache.samza.config.Config) SystemProducer(org.apache.samza.system.SystemProducer) SystemStream(org.apache.samza.system.SystemStream) Instant(org.joda.time.Instant) EndOfStreamMessage(org.apache.samza.system.EndOfStreamMessage) WatermarkMessage(org.apache.samza.system.WatermarkMessage) MapConfig(org.apache.samza.config.MapConfig) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) GenericSystemDescriptor(org.apache.samza.system.descriptors.GenericSystemDescriptor) InMemorySystemFactory(org.apache.samza.system.inmemory.InMemorySystemFactory)

Aggregations

Config (org.apache.samza.config.Config)4 MapConfig (org.apache.samza.config.MapConfig)4 StreamSpec (org.apache.samza.system.StreamSpec)4 InMemorySystemFactory (org.apache.samza.system.inmemory.InMemorySystemFactory)4 ApplicationConfig (org.apache.samza.config.ApplicationConfig)3 ClusterManagerConfig (org.apache.samza.config.ClusterManagerConfig)3 InMemorySystemConfig (org.apache.samza.config.InMemorySystemConfig)3 JobConfig (org.apache.samza.config.JobConfig)3 JobCoordinatorConfig (org.apache.samza.config.JobCoordinatorConfig)3 StreamConfig (org.apache.samza.config.StreamConfig)3 TaskConfig (org.apache.samza.config.TaskConfig)3 KV (org.apache.samza.operators.KV)3 EndOfStreamMessage (org.apache.samza.system.EndOfStreamMessage)3 OutgoingMessageEnvelope (org.apache.samza.system.OutgoingMessageEnvelope)3 SystemFactory (org.apache.samza.system.SystemFactory)3 SystemStream (org.apache.samza.system.SystemStream)3 InMemorySystemDescriptor (org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor)3 HashMap (java.util.HashMap)2 HashSet (java.util.HashSet)2 List (java.util.List)2