Example 11 with SystemStreamPartition

use of org.apache.samza.system.SystemStreamPartition in project samza by apache.

the class TestGroupBySystemStreamPartition method testBroadcastStreamGroupedCorrectly.

@Test
public void testBroadcastStreamGroupedCorrectly() {
    HashMap<String, String> configMap = new HashMap<String, String>();
    configMap.put("task.broadcast.inputs", "SystemA.StreamA#0");
    Config config = new MapConfig(configMap);
    HashSet<SystemStreamPartition> allSSPs = new HashSet<SystemStreamPartition>();
    Collections.addAll(allSSPs, aa0, aa1, aa2, ac0);
    GroupBySystemStreamPartitionFactory grouperFactory = new GroupBySystemStreamPartitionFactory();
    SystemStreamPartitionGrouper grouper = grouperFactory.getSystemStreamPartitionGrouper(config);
    Map<TaskName, Set<SystemStreamPartition>> result = grouper.group(allSSPs);
    Map<TaskName, Set<SystemStreamPartition>> expectedResult = new HashMap<TaskName, Set<SystemStreamPartition>>();
    HashSet<SystemStreamPartition> partitionaa1 = new HashSet<SystemStreamPartition>();
    partitionaa1.add(aa1);
    partitionaa1.add(aa0);
    expectedResult.put(new TaskName(aa1.toString()), partitionaa1);
    HashSet<SystemStreamPartition> partitionaa2 = new HashSet<SystemStreamPartition>();
    partitionaa2.add(aa2);
    partitionaa2.add(aa0);
    expectedResult.put(new TaskName(aa2.toString()), partitionaa2);
    HashSet<SystemStreamPartition> partitionac0 = new HashSet<SystemStreamPartition>();
    partitionac0.add(ac0);
    partitionac0.add(aa0);
    expectedResult.put(new TaskName(ac0.toString()), partitionac0);
    assertEquals(expectedResult, result);
}
Also used: HashSet(java.util.HashSet) Set(java.util.Set) HashMap(java.util.HashMap) Config(org.apache.samza.config.Config) MapConfig(org.apache.samza.config.MapConfig) TaskName(org.apache.samza.container.TaskName) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Test(org.junit.Test)
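
The expectation encoded in the test above (one task per non-broadcast partition, with the broadcast partition aa0 added to every task) can be sketched without any Samza dependency. This is only a toy model of what the assertions check, not Samza's implementation: the class BroadcastGroupingSketch, its group method, and the use of plain strings in place of SystemStreamPartition and TaskName are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class BroadcastGroupingSketch {

    // Hypothetical stand-in for grouper.group(allSSPs): strings replace
    // SystemStreamPartition and TaskName so the sketch runs on its own.
    static Map<String, Set<String>> group(Set<String> allSsps, Set<String> broadcastSsps) {
        Map<String, Set<String>> result = new HashMap<>();
        for (String ssp : allSsps) {
            if (broadcastSsps.contains(ssp)) {
                continue; // a broadcast partition does not get a task of its own
            }
            // Each task reads its own partition plus every broadcast partition.
            Set<String> taskInputs = new HashSet<>(broadcastSsps);
            taskInputs.add(ssp);
            result.put(ssp, taskInputs);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> all = new HashSet<>(Arrays.asList("aa0", "aa1", "aa2", "ac0"));
        Map<String, Set<String>> grouped = group(all, Collections.singleton("aa0"));
        System.out.println(grouped.size() + " tasks");
        System.out.println("aa1 reads aa0: " + grouped.get("aa1").contains("aa0"));
    }
}
```

With an empty broadcast set the same method yields one single-partition task per input, which matches the shape asserted in testLocalStreamGroupedCorrectly.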

Example 12 with SystemStreamPartition

use of org.apache.samza.system.SystemStreamPartition in project samza by apache.

the class TestGroupBySystemStreamPartition method testLocalStreamGroupedCorrectly.

@Test
public void testLocalStreamGroupedCorrectly() {
    HashSet<SystemStreamPartition> allSSPs = new HashSet<SystemStreamPartition>();
    HashMap<String, String> configMap = new HashMap<String, String>();
    Config config = new MapConfig(configMap);
    SystemStreamPartitionGrouper grouper = grouperFactory.getSystemStreamPartitionGrouper(config);
    Map<TaskName, Set<SystemStreamPartition>> emptyResult = grouper.group(allSSPs);
    assertTrue(emptyResult.isEmpty());
    Collections.addAll(allSSPs, aa0, aa1, aa2, ac0);
    Map<TaskName, Set<SystemStreamPartition>> result = grouper.group(allSSPs);
    Map<TaskName, Set<SystemStreamPartition>> expectedResult = new HashMap<TaskName, Set<SystemStreamPartition>>();
    HashSet<SystemStreamPartition> partitionaa0 = new HashSet<SystemStreamPartition>();
    partitionaa0.add(aa0);
    expectedResult.put(new TaskName(aa0.toString()), partitionaa0);
    HashSet<SystemStreamPartition> partitionaa1 = new HashSet<SystemStreamPartition>();
    partitionaa1.add(aa1);
    expectedResult.put(new TaskName(aa1.toString()), partitionaa1);
    HashSet<SystemStreamPartition> partitionaa2 = new HashSet<SystemStreamPartition>();
    partitionaa2.add(aa2);
    expectedResult.put(new TaskName(aa2.toString()), partitionaa2);
    HashSet<SystemStreamPartition> partitionac0 = new HashSet<SystemStreamPartition>();
    partitionac0.add(ac0);
    expectedResult.put(new TaskName(ac0.toString()), partitionac0);
    assertEquals(expectedResult, result);
}
Also used: HashSet(java.util.HashSet) Set(java.util.Set) HashMap(java.util.HashMap) TaskName(org.apache.samza.container.TaskName) Config(org.apache.samza.config.Config) MapConfig(org.apache.samza.config.MapConfig) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Test(org.junit.Test)

Example 13 with SystemStreamPartition

use of org.apache.samza.system.SystemStreamPartition in project samza by apache.

the class TestHdfsSystemConsumer method testHdfsSystemConsumerE2E.

/*
 * A simple end-to-end test that covers the workflow from system admin to
 * partitioner, system consumer, and so on, making sure the basic functionality
 * works as expected.
 */
@Test
public void testHdfsSystemConsumerE2E() throws Exception {
    Config config = generateDefaultConfig();
    HdfsSystemFactory systemFactory = new HdfsSystemFactory();
    // create admin and do partitioning
    HdfsSystemAdmin systemAdmin = systemFactory.getAdmin(SYSTEM_NAME, config);
    String streamName = WORKING_DIRECTORY;
    Set<String> streamNames = new HashSet<>();
    streamNames.add(streamName);
    generateAvroDataFiles();
    Map<String, SystemStreamMetadata> streamMetadataMap = systemAdmin.getSystemStreamMetadata(streamNames);
    SystemStreamMetadata systemStreamMetadata = streamMetadataMap.get(streamName);
    Assert.assertEquals(NUM_FILES, systemStreamMetadata.getSystemStreamPartitionMetadata().size());
    // create consumer and read from files
    HdfsSystemConsumer systemConsumer = systemFactory.getConsumer(SYSTEM_NAME, config, new NoOpMetricsRegistry());
    Map<Partition, SystemStreamMetadata.SystemStreamPartitionMetadata> metadataMap = systemStreamMetadata.getSystemStreamPartitionMetadata();
    Set<SystemStreamPartition> systemStreamPartitionSet = new HashSet<>();
    metadataMap.forEach((partition, metadata) -> {
        SystemStreamPartition ssp = new SystemStreamPartition(SYSTEM_NAME, streamName, partition);
        systemStreamPartitionSet.add(ssp);
        String offset = metadata.getOldestOffset();
        systemConsumer.register(ssp, offset);
    });
    systemConsumer.start();
    // verify events read from consumer
    int eventsReceived = 0;
    // each file ends with one "End of Stream" event
    int totalEvents = (NUM_EVENTS + 1) * NUM_FILES;
    int remainingRetries = 100;
    Map<SystemStreamPartition, List<IncomingMessageEnvelope>> overallResults = new HashMap<>();
    while (eventsReceived < totalEvents && remainingRetries > 0) {
        remainingRetries--;
        Map<SystemStreamPartition, List<IncomingMessageEnvelope>> result = systemConsumer.poll(systemStreamPartitionSet, 200);
        for (SystemStreamPartition ssp : result.keySet()) {
            List<IncomingMessageEnvelope> messageEnvelopeList = result.get(ssp);
            overallResults.putIfAbsent(ssp, new ArrayList<>());
            overallResults.get(ssp).addAll(messageEnvelopeList);
            if (overallResults.get(ssp).size() >= NUM_EVENTS + 1) {
                systemStreamPartitionSet.remove(ssp);
            }
            eventsReceived += messageEnvelopeList.size();
        }
    }
    Assert.assertEquals(totalEvents, eventsReceived);
    Assert.assertEquals(NUM_FILES, overallResults.size());
    overallResults.values().forEach(messages -> {
        Assert.assertEquals(NUM_EVENTS + 1, messages.size());
        for (int index = 0; index < NUM_EVENTS; index++) {
            GenericRecord record = (GenericRecord) messages.get(index).getMessage();
            Assert.assertEquals(index % NUM_EVENTS, record.get(FIELD_1));
            Assert.assertEquals("string_" + (index % NUM_EVENTS), record.get(FIELD_2).toString());
        }
        Assert.assertEquals(IncomingMessageEnvelope.END_OF_STREAM_OFFSET, messages.get(NUM_EVENTS).getOffset());
    });
}
Also used: Partition(org.apache.samza.Partition) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) HashMap(java.util.HashMap) Config(org.apache.samza.config.Config) MapConfig(org.apache.samza.config.MapConfig) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) SystemStreamMetadata(org.apache.samza.system.SystemStreamMetadata) NoOpMetricsRegistry(org.apache.samza.util.NoOpMetricsRegistry) ArrayList(java.util.ArrayList) List(java.util.List) GenericRecord(org.apache.avro.generic.GenericRecord) HashSet(java.util.HashSet) Test(org.junit.Test)

Example 14 with SystemStreamPartition

use of org.apache.samza.system.SystemStreamPartition in project samza by apache.

the class TestTaskCallbackManager method testUpdateCallbackWithCoordinatorRequests.

@Test
public void testUpdateCallbackWithCoordinatorRequests() {
    TaskName taskName = new TaskName("Partition 0");
    SystemStreamPartition ssp = new SystemStreamPartition("kafka", "topic", new Partition(0));
    // simulate out of order
    IncomingMessageEnvelope envelope2 = new IncomingMessageEnvelope(ssp, "2", null, null);
    ReadableCoordinator coordinator2 = new ReadableCoordinator(taskName);
    coordinator2.shutdown(TaskCoordinator.RequestScope.ALL_TASKS_IN_CONTAINER);
    TaskCallbackImpl callback2 = new TaskCallbackImpl(listener, taskName, envelope2, coordinator2, 2, 0);
    List<TaskCallbackImpl> callbacksToUpdate = callbackManager.updateCallback(callback2);
    assertTrue(callbacksToUpdate.isEmpty());
    IncomingMessageEnvelope envelope1 = new IncomingMessageEnvelope(ssp, "1", null, null);
    ReadableCoordinator coordinator1 = new ReadableCoordinator(taskName);
    coordinator1.commit(TaskCoordinator.RequestScope.CURRENT_TASK);
    TaskCallbackImpl callback1 = new TaskCallbackImpl(listener, taskName, envelope1, coordinator1, 1, 0);
    callbacksToUpdate = callbackManager.updateCallback(callback1);
    assertTrue(callbacksToUpdate.isEmpty());
    IncomingMessageEnvelope envelope0 = new IncomingMessageEnvelope(ssp, "0", null, null);
    ReadableCoordinator coordinator = new ReadableCoordinator(taskName);
    TaskCallbackImpl callback0 = new TaskCallbackImpl(listener, taskName, envelope0, coordinator, 0, 0);
    callbacksToUpdate = callbackManager.updateCallback(callback0);
    assertEquals(2, callbacksToUpdate.size());
    // Check for envelope0
    TaskCallbackImpl taskCallback = callbacksToUpdate.get(0);
    assertTrue(taskCallback.matchSeqNum(0));
    assertEquals(ssp, taskCallback.envelope.getSystemStreamPartition());
    assertEquals("0", taskCallback.envelope.getOffset());
    // Check for envelope1
    taskCallback = callbacksToUpdate.get(1);
    assertTrue(taskCallback.matchSeqNum(1));
    assertEquals(ssp, taskCallback.envelope.getSystemStreamPartition());
    assertEquals("1", taskCallback.envelope.getOffset());
}
Also used: Partition(org.apache.samza.Partition) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) TaskName(org.apache.samza.container.TaskName) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) Test(org.junit.Test)
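
The ordering behaviour that the assertions above rely on can be modelled in a few lines: completions may arrive out of order, nothing is released until the next expected sequence number shows up, and a released batch ends at the first callback carrying a commit request. The sketch below is a toy model consistent with those assertions, not the TaskCallbackManager implementation; the class InOrderReleaseSketch, its update method, and the boolean hasCommitRequest flag are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InOrderReleaseSketch {

    // Completed callbacks still waiting on earlier sequence numbers:
    // seqNum -> whether that callback carries a commit request.
    private final TreeMap<Long, Boolean> completed = new TreeMap<>();
    private long nextSeqNum = 0;

    // Hypothetical analogue of updateCallback: records a completion and returns
    // the sequence numbers that can now be released, in order.
    List<Long> update(long seqNum, boolean hasCommitRequest) {
        completed.put(seqNum, hasCommitRequest);
        List<Long> released = new ArrayList<>();
        while (!completed.isEmpty() && completed.firstKey() == nextSeqNum) {
            Map.Entry<Long, Boolean> head = completed.pollFirstEntry();
            released.add(head.getKey());
            nextSeqNum++;
            if (head.getValue()) {
                break; // assumed rule: a commit request closes the current batch
            }
        }
        return released;
    }

    public static void main(String[] args) {
        InOrderReleaseSketch manager = new InOrderReleaseSketch();
        System.out.println(manager.update(2, false)); // prints []
        System.out.println(manager.update(1, true));  // prints []
        System.out.println(manager.update(0, false)); // prints [0, 1]
    }
}
```

In this model sequence number 2 stays queued after the third call and is only released by a later update, mirroring the two-element result asserted in the test.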

Example 15 with SystemStreamPartition

use of org.apache.samza.system.SystemStreamPartition in project samza by apache.

the class TestTaskCallbackManager method testUpdateShouldReturnAllCompletedCallbacksTillTheCommitRequestDefined.

@Test
public void testUpdateShouldReturnAllCompletedCallbacksTillTheCommitRequestDefined() {
    TaskName taskName = new TaskName("Partition 0");
    SystemStreamPartition ssp1 = new SystemStreamPartition("kafka", "topic", new Partition(0));
    SystemStreamPartition ssp2 = new SystemStreamPartition("kafka", "topic", new Partition(0));
    // Callback for Envelope3 contains commit request.
    IncomingMessageEnvelope envelope3 = new IncomingMessageEnvelope(ssp2, "0", null, null);
    ReadableCoordinator coordinator3 = new ReadableCoordinator(taskName);
    coordinator3.commit(TaskCoordinator.RequestScope.CURRENT_TASK);
    TaskCallbackImpl callback3 = new TaskCallbackImpl(listener, taskName, envelope3, coordinator3, 3, 0);
    List<TaskCallbackImpl> callbacksToUpdate = callbackManager.updateCallback(callback3);
    assertTrue(callbacksToUpdate.isEmpty());
    IncomingMessageEnvelope envelope2 = new IncomingMessageEnvelope(ssp1, "2", null, null);
    ReadableCoordinator coordinator2 = new ReadableCoordinator(taskName);
    coordinator2.shutdown(TaskCoordinator.RequestScope.ALL_TASKS_IN_CONTAINER);
    TaskCallbackImpl callback2 = new TaskCallbackImpl(listener, taskName, envelope2, coordinator2, 2, 0);
    callbacksToUpdate = callbackManager.updateCallback(callback2);
    assertTrue(callbacksToUpdate.isEmpty());
    IncomingMessageEnvelope envelope1 = new IncomingMessageEnvelope(ssp1, "1", null, null);
    ReadableCoordinator coordinator1 = new ReadableCoordinator(taskName);
    coordinator1.commit(TaskCoordinator.RequestScope.CURRENT_TASK);
    TaskCallbackImpl callback1 = new TaskCallbackImpl(listener, taskName, envelope1, coordinator1, 1, 0);
    callbacksToUpdate = callbackManager.updateCallback(callback1);
    assertTrue(callbacksToUpdate.isEmpty());
    // Callback for Envelope0 carries no coordinator request.
    IncomingMessageEnvelope envelope0 = new IncomingMessageEnvelope(ssp1, "0", null, null);
    ReadableCoordinator coordinator = new ReadableCoordinator(taskName);
    TaskCallbackImpl callback0 = new TaskCallbackImpl(listener, taskName, envelope0, coordinator, 0, 0);
    // Completing callback0 unblocks the in-order callbacks, so callback0 and callback1 are returned.
    // The commit request on callback1 ends the batch; callback2 and callback3 are not part of it.
    callbacksToUpdate = callbackManager.updateCallback(callback0);
    assertEquals(2, callbacksToUpdate.size());
    TaskCallbackImpl callback = callbacksToUpdate.get(0);
    assertTrue(callback.matchSeqNum(0));
    assertEquals(envelope0.getSystemStreamPartition(), callback.envelope.getSystemStreamPartition());
    assertEquals(envelope0.getOffset(), callback.envelope.getOffset());
    callback = callbacksToUpdate.get(1);
    assertTrue(callback.matchSeqNum(1));
    assertEquals(envelope1.getSystemStreamPartition(), callback.envelope.getSystemStreamPartition());
    assertEquals(envelope1.getOffset(), callback.envelope.getOffset());
}
Also used: Partition(org.apache.samza.Partition) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) TaskName(org.apache.samza.container.TaskName) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) Test(org.junit.Test)

Aggregations

SystemStreamPartition (org.apache.samza.system.SystemStreamPartition) 43
Partition (org.apache.samza.Partition) 29
Test (org.junit.Test) 26
HashMap (java.util.HashMap) 17
HashSet (java.util.HashSet) 17
TaskName (org.apache.samza.container.TaskName) 13
IncomingMessageEnvelope (org.apache.samza.system.IncomingMessageEnvelope) 13
Config (org.apache.samza.config.Config) 10
Set (java.util.Set) 8
MapConfig (org.apache.samza.config.MapConfig) 7
GenericRecord (org.apache.avro.generic.GenericRecord) 6
ArrayList (java.util.ArrayList) 5
List (java.util.List) 5
SystemStream (org.apache.samza.system.SystemStream) 5
MetricsRegistryMap (org.apache.samza.metrics.MetricsRegistryMap) 4
SystemStreamMetadata (org.apache.samza.system.SystemStreamMetadata) 4
LinkedHashMap (java.util.LinkedHashMap) 3
SamzaException (org.apache.samza.SamzaException) 3
TaskInstance (org.apache.samza.container.TaskInstance) 3
SystemConsumer (org.apache.samza.system.SystemConsumer) 3