Example 66 with SystemStream

use of org.apache.samza.system.SystemStream in project samza by apache.

the class TestStreamConfig method testContainsSamzaPropertyThrowsIfInvalidPropertyName.

@Test(expected = IllegalArgumentException.class)
public void testContainsSamzaPropertyThrowsIfInvalidPropertyName() {
    StreamConfig config = buildConfig("key1", "value1", "key2", "value2");
    config.containsSamzaProperty(new SystemStream("SysX", "StrX"), "key1");
}
Also used : SystemStream(org.apache.samza.system.SystemStream) Test(org.junit.Test)

Example 67 with SystemStream

use of org.apache.samza.system.SystemStream in project samza by apache.

the class OperatorImplGraph method init.

/**
 * Initialize the DAG of {@link OperatorImpl}s for the input {@link MessageStreamImpl} in the provided
 * {@link StreamGraphImpl}.
 *
 * @param streamGraph  the logical {@link StreamGraphImpl}
 * @param config  the {@link Config} required to instantiate operators
 * @param context  the {@link TaskContext} required to instantiate operators
 */
public void init(StreamGraphImpl streamGraph, Config config, TaskContext context) {
    streamGraph.getInputStreams().forEach((streamSpec, inputStream) -> {
        SystemStream systemStream = new SystemStream(streamSpec.getSystemName(), streamSpec.getPhysicalName());
        this.rootOperators.put(systemStream, this.createOperatorImpls((MessageStreamImpl) inputStream, config, context));
    });
}
Also used : MessageStreamImpl(org.apache.samza.operators.MessageStreamImpl) SystemStream(org.apache.samza.system.SystemStream)

Example 68 with SystemStream

use of org.apache.samza.system.SystemStream in project samza by apache.

the class TaskConfigJava method getBroadcastSystemStreams.

/**
 * Get the SystemStreams for the configured broadcast streams.
 *
 * @return the set of SystemStreams for which there are broadcast stream SSPs configured.
 */
public Set<SystemStream> getBroadcastSystemStreams() {
    Set<SystemStream> broadcastSS = new HashSet<>();
    Set<SystemStreamPartition> broadcastSSPs = getBroadcastSystemStreamPartitions();
    for (SystemStreamPartition bssp : broadcastSSPs) {
        broadcastSS.add(bssp.getSystemStream());
    }
    return Collections.unmodifiableSet(broadcastSS);
}
Also used : SystemStream(org.apache.samza.system.SystemStream) HashSet(java.util.HashSet) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition)

Example 69 with SystemStream

use of org.apache.samza.system.SystemStream in project samza by apache.

the class TaskConfigJava method getBroadcastSystemStreamPartitions.

/**
 * Get the systemStreamPartitions of the broadcast stream. Specifying
 * one partition for one stream or a range of the partitions for one
 * stream is allowed.
 *
 * @return a Set of SystemStreamPartitions
 */
public Set<SystemStreamPartition> getBroadcastSystemStreamPartitions() {
    HashSet<SystemStreamPartition> systemStreamPartitionSet = new HashSet<SystemStreamPartition>();
    List<String> systemStreamPartitions = getList(BROADCAST_INPUT_STREAMS, Collections.<String>emptyList());
    for (String systemStreamPartition : systemStreamPartitions) {
        int hashPosition = systemStreamPartition.indexOf("#");
        if (hashPosition == -1) {
            throw new IllegalArgumentException("incorrect format in " + systemStreamPartition + ". Broadcast stream names should be in the form 'system.stream#partitionId' or 'system.stream#[partitionN-partitionM]'");
        } else {
            String systemStreamName = systemStreamPartition.substring(0, hashPosition);
            String partitionSegment = systemStreamPartition.substring(hashPosition + 1);
            SystemStream systemStream = Util.getSystemStreamFromNames(systemStreamName);
            if (Pattern.matches(BROADCAST_STREAM_PATTERN, partitionSegment)) {
                systemStreamPartitionSet.add(new SystemStreamPartition(systemStream, new Partition(Integer.valueOf(partitionSegment))));
            } else {
                if (Pattern.matches(BROADCAST_STREAM_RANGE_PATTERN, partitionSegment)) {
                    int partitionStart = Integer.valueOf(partitionSegment.substring(1, partitionSegment.lastIndexOf("-")));
                    int partitionEnd = Integer.valueOf(partitionSegment.substring(partitionSegment.lastIndexOf("-") + 1, partitionSegment.indexOf("]")));
                    if (partitionStart > partitionEnd) {
                        LOGGER.warn("The starting partition in stream " + systemStream.toString() + " is bigger than the ending Partition. No partition is added");
                    }
                    for (int i = partitionStart; i <= partitionEnd; i++) {
                        systemStreamPartitionSet.add(new SystemStreamPartition(systemStream, new Partition(i)));
                    }
                } else {
                    throw new IllegalArgumentException("incorrect format in " + systemStreamPartition + ". Broadcast stream names should be in the form 'system.stream#partitionId' or 'system.stream#[partitionN-partitionM]'");
                }
            }
        }
    }
    return systemStreamPartitionSet;
}
Also used : Partition(org.apache.samza.Partition) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) SystemStream(org.apache.samza.system.SystemStream) HashSet(java.util.HashSet)
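
The broadcast stream format handled above ('system.stream#partitionId' or 'system.stream#[partitionN-partitionM]') can be tried out without any Samza dependency. The following is only a minimal sketch: the class name BroadcastSpecSketch, the method expand, and both regular expressions are hypothetical stand-ins, since the actual BROADCAST_STREAM_PATTERN and BROADCAST_STREAM_RANGE_PATTERN constants are not shown in the example. It returns plain strings rather than SystemStreamPartition objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BroadcastSpecSketch {
    // Hypothetical patterns; the real Samza constants are not shown in the example above.
    private static final Pattern SINGLE = Pattern.compile("\\d+");
    private static final Pattern RANGE = Pattern.compile("\\[(\\d+)-(\\d+)\\]");

    /**
     * Expands "system.stream#N" or "system.stream#[N-M]" into one
     * "system.stream#i" string per partition id.
     */
    static List<String> expand(String spec) {
        int hashPosition = spec.indexOf('#');
        if (hashPosition == -1) {
            throw new IllegalArgumentException("incorrect format in " + spec);
        }
        String systemStream = spec.substring(0, hashPosition);
        String partitionSegment = spec.substring(hashPosition + 1);
        List<String> result = new ArrayList<>();

        // Single partition id, e.g. "kafka.events#2".
        if (SINGLE.matcher(partitionSegment).matches()) {
            result.add(systemStream + "#" + Integer.parseInt(partitionSegment));
            return result;
        }

        // Inclusive range, e.g. "kafka.events#[0-2]".
        Matcher range = RANGE.matcher(partitionSegment);
        if (!range.matches()) {
            throw new IllegalArgumentException("incorrect format in " + spec);
        }
        int start = Integer.parseInt(range.group(1));
        int end = Integer.parseInt(range.group(2));
        // As in the example above, a start greater than the end adds nothing.
        for (int i = start; i <= end; i++) {
            result.add(systemStream + "#" + i);
        }
        return result;
    }
}
```

Under these assumptions, expand("kafka.events#[0-2]") yields three entries, one per partition, and expand("kafka.events#[3-1]") yields an empty list, mirroring the "No partition is added" branch.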

Example 70 with SystemStream

use of org.apache.samza.system.SystemStream in project samza by apache.

the class ExecutionPlanner method updateExistingPartitions.

/**
 * Fetch the partitions of source/sink streams and update the StreamEdges.
 * @param jobGraph {@link JobGraph}
 * @param streamManager the {@link StreamManager} to interface with the streams.
 */
/* package private */
static void updateExistingPartitions(JobGraph jobGraph, StreamManager streamManager) {
    Set<StreamEdge> existingStreams = new HashSet<>();
    existingStreams.addAll(jobGraph.getSources());
    existingStreams.addAll(jobGraph.getSinks());
    Multimap<String, StreamEdge> systemToStreamEdges = HashMultimap.create();
    // group the StreamEdge(s) based on the system name
    existingStreams.forEach(streamEdge -> {
        SystemStream systemStream = streamEdge.getSystemStream();
        systemToStreamEdges.put(systemStream.getSystem(), streamEdge);
    });
    for (Map.Entry<String, Collection<StreamEdge>> entry : systemToStreamEdges.asMap().entrySet()) {
        String systemName = entry.getKey();
        Collection<StreamEdge> streamEdges = entry.getValue();
        Map<String, StreamEdge> streamToStreamEdge = new HashMap<>();
        // create the stream name to StreamEdge mapping for this system
        streamEdges.forEach(streamEdge -> streamToStreamEdge.put(streamEdge.getSystemStream().getStream(), streamEdge));
        // retrieve the partition counts for the streams in this system
        Map<String, Integer> streamToPartitionCount = streamManager.getStreamPartitionCounts(systemName, streamToStreamEdge.keySet());
        // set the partitions of a stream to its StreamEdge
        streamToPartitionCount.forEach((stream, partitionCount) -> {
            streamToStreamEdge.get(stream).setPartitionCount(partitionCount);
            log.debug("Partition count is {} for stream {}", partitionCount, stream);
        });
    }
}
Also used : HashMap(java.util.HashMap) SystemStream(org.apache.samza.system.SystemStream) Collection(java.util.Collection) Map(java.util.Map) HashSet(java.util.HashSet)
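
The grouping step in the method above (collecting streams under their system name before querying partition counts per system) can also be written with the standard library alone. This is a rough sketch, not the ExecutionPlanner code: GroupBySystemSketch and groupBySystem are hypothetical names, each stream is modeled as a two-element {system, stream} string array instead of a StreamEdge, and Map.computeIfAbsent stands in for the Guava Multimap.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class GroupBySystemSketch {
    /**
     * Groups {system, stream} pairs by system name.
     * Sorted collections are used only to make the result order predictable.
     */
    static Map<String, Set<String>> groupBySystem(List<String[]> systemStreamPairs) {
        Map<String, Set<String>> systemToStreams = new TreeMap<>();
        for (String[] pair : systemStreamPairs) {
            String system = pair[0];
            String stream = pair[1];
            // Create the per-system set on first use, then add the stream to it.
            systemToStreams.computeIfAbsent(system, key -> new TreeSet<>()).add(stream);
        }
        return systemToStreams;
    }
}
```

With this shape, each map entry provides a system name plus the set of stream names to pass to a per-system lookup, which is the role systemToStreamEdges.asMap() plays in the example.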
