
Example 1 with ExtendedSequenceNumber

use of com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber in project samza by apache.

the class TestKinesisSystemConsumer method createAndInitProcessors.

private Map<String, KinesisRecordProcessor> createAndInitProcessors(IRecordProcessorFactory factory, int numShards) {
    Map<String, KinesisRecordProcessor> processorMap = new HashMap<>();
    IntStream.range(0, numShards).forEach(p -> {
        String shardId = String.format("shard-%05d", p);
        // Create Kinesis processor
        KinesisRecordProcessor processor = (KinesisRecordProcessor) factory.createProcessor();
        // Initialize the shard
        ExtendedSequenceNumber seqNum = new ExtendedSequenceNumber("0000");
        InitializationInput initializationInput = new InitializationInput().withShardId(shardId).withExtendedSequenceNumber(seqNum);
        processor.initialize(initializationInput);
        processorMap.put(shardId, processor);
    });
    return processorMap;
}
Also used : TestKinesisRecordProcessor(org.apache.samza.system.kinesis.consumer.TestKinesisRecordProcessor) HashMap(java.util.HashMap) ExtendedSequenceNumber(com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber) InitializationInput(com.amazonaws.services.kinesis.clientlibrary.types.InitializationInput)

Example 2 with ExtendedSequenceNumber

use of com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber in project samza by apache.

the class TestKinesisRecordProcessor method testCheckpointAfterInit.

/**
 * Test the scenario where a processor instance is created for a shard and, while it is processing records, the
 * shard is re-assigned to the same consumer. This results in a new processor instance owning the shard, and this
 * instance could receive checkpoint calls for records that were processed by the old processor instance. This test
 * covers the case where the new instance receives the checkpoint call after it has finished initialization but
 * before it has processed any records.
 */
@Test
public void testCheckpointAfterInit() {
    String system = "kinesis";
    String stream = "stream";
    final CountDownLatch receivedShutdownLatch = new CountDownLatch(1);
    KinesisRecordProcessorListener listener = new KinesisRecordProcessorListener() {

        @Override
        public void onReceiveRecords(SystemStreamPartition ssp, List<Record> records, long millisBehindLatest) {
        }

        @Override
        public void onShutdown(SystemStreamPartition ssp) {
            receivedShutdownLatch.countDown();
        }
    };
    KinesisRecordProcessor processor = new KinesisRecordProcessor(new SystemStreamPartition(system, stream, new Partition(0)), listener);
    // Initialize the processor
    ExtendedSequenceNumber seqNum = new ExtendedSequenceNumber("0000");
    InitializationInput initializationInput = new InitializationInput().withShardId("shard-0000").withExtendedSequenceNumber(seqNum);
    processor.initialize(initializationInput);
    // Call checkpoint. This checkpoint may originally have been meant for an earlier processor instance of the same
    // shard, but because of re-assignment a new processor instance now owns the shard.
    processor.checkpoint("1234567");
    // Call shutdown (with ZOMBIE reason) on processor and verify that the processor calls shutdown on the listener.
    shutDownProcessor(processor, ShutdownReason.ZOMBIE);
    // Verify that the processor is shut down.
    Assert.assertEquals("Unable to shutdown processor.", 0, receivedShutdownLatch.getCount());
}
Also used : SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Partition(org.apache.samza.Partition) ExtendedSequenceNumber(com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber) ArrayList(java.util.ArrayList) List(java.util.List) CountDownLatch(java.util.concurrent.CountDownLatch) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) InitializationInput(com.amazonaws.services.kinesis.clientlibrary.types.InitializationInput) Test(org.junit.Test)
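
The test above verifies a listener callback by counting down a latch and then asserting that the count reached zero. A minimal stand-alone sketch of that pattern follows. It uses only java.util.concurrent.CountDownLatch; ShutdownListener, FakeProcessor and remainingAfterShutdown are hypothetical stand-ins for illustration, not Samza or KCL types.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-ins for the listener and processor; only the latch pattern matters here.
final class LatchCallbackSketch {

    // Simplified listener with a single shutdown callback (assumed shape).
    interface ShutdownListener {
        void onShutdown(String shardId);
    }

    // Fake processor that notifies its listener when it is shut down.
    static final class FakeProcessor {
        private final String shardId;
        private final ShutdownListener listener;

        FakeProcessor(String shardId, ShutdownListener listener) {
            this.shardId = shardId;
            this.listener = listener;
        }

        void shutdown() {
            listener.onShutdown(shardId);
        }
    }

    // Shuts the fake processor down and returns the remaining latch count.
    // A result of 0 means the shutdown callback was invoked.
    static long remainingAfterShutdown() {
        CountDownLatch receivedShutdownLatch = new CountDownLatch(1);
        FakeProcessor processor =
            new FakeProcessor("shard-0000", shardId -> receivedShutdownLatch.countDown());
        processor.shutdown();
        return receivedShutdownLatch.getCount();
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterShutdown()); // prints 0
    }
}
```

Because the callback here runs synchronously on the calling thread, reading getCount() is enough. If the callback could arrive on another thread, await with a timeout would be the safer check.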

Example 3 with ExtendedSequenceNumber

use of com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber in project samza by apache.

the class KinesisRecordProcessor method processRecords.

/**
 * Process data records. The Amazon Kinesis Client Library will invoke this method to deliver data records to the
 * application. Upon failover, the new instance will get records with sequence numbers greater than the checkpoint
 * position for each partition key.
 *
 * @param processRecordsInput Provides the records to be processed as well as information and capabilities related
 *        to them (e.g. checkpointing).
 */
@Override
public void processRecords(ProcessRecordsInput processRecordsInput) {
    // KCL does not send any records to a processor that has been shut down.
    Validate.isTrue(!shutdownRequested, String.format("KCL returned records after shutdown is called on the processor %s.", this));
    // KCL always gives a reference to the same checkpointer instance for a given processor instance.
    checkpointer = processRecordsInput.getCheckpointer();
    List<Record> records = processRecordsInput.getRecords();
    // Empty records are expected when KCL config has CallProcessRecordsEvenForEmptyRecordList set to true.
    if (!records.isEmpty()) {
        lastProcessedRecordSeqNumber = new ExtendedSequenceNumber(records.get(records.size() - 1).getSequenceNumber());
        listener.onReceiveRecords(ssp, records, processRecordsInput.getMillisBehindLatest());
    }
}
Also used : ExtendedSequenceNumber(com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber) Record(com.amazonaws.services.kinesis.model.Record)

Example 4 with ExtendedSequenceNumber

use of com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber in project samza by apache.

the class KinesisRecordProcessor method checkpoint.

/**
 * Invoked by the Samza thread to commit checkpoint for the shard owned by the record processor instance.
 *
 * @param seqNumber sequenceNumber to checkpoint for the shard owned by this processor instance.
 */
public void checkpoint(String seqNumber) {
    ExtendedSequenceNumber seqNumberToCheckpoint = new ExtendedSequenceNumber(seqNumber);
    if (initSeqNumber.compareTo(seqNumberToCheckpoint) > 0) {
        LOG.warn("Samza called checkpoint with seqNumber {} smaller than initial seqNumber {} for {}. Ignoring it!", seqNumber, initSeqNumber, this);
        return;
    }
    if (checkpointer == null) {
        // checkpointer could be null as a result of shard re-assignment before the first record is processed.
        LOG.warn("Ignoring checkpointing for {} with seqNumber {} because of re-assignment.", this, seqNumber);
        return;
    }
    try {
        checkpointer.checkpoint(seqNumber);
        lastCheckpointedRecordSeqNumber = seqNumberToCheckpoint;
    } catch (ShutdownException e) {
        // This can happen as a result of shard re-assignment.
        String msg = String.format("Checkpointing %s with seqNumber %s failed with exception. Dropping the checkpoint.", this, seqNumber);
        LOG.warn(msg, e);
    } catch (InvalidStateException e) {
        // This can happen when KCL encounters issues with internal state, e.g. the DynamoDB table is not found.
        String msg = String.format("Checkpointing %s with seqNumber %s failed with exception.", this, seqNumber);
        LOG.error(msg, e);
        throw new SamzaException(msg, e);
    } catch (ThrottlingException e) {
        // Throttling is handled by KCL via the client lib configuration properties. If we still get an exception in
        // spite of the throttling back-off behavior, throw an exception, since the configs need to be adjusted.
        String msg = String.format("Checkpointing %s with seqNumber %s failed with exception. Checkpoint interval is" + " too aggressive for the provisioned throughput of the dynamoDB table where the checkpoints are stored." + " Either reduce the checkpoint interval -or- increase the throughput of dynamoDB table.", this, seqNumber);
        // Keep the original exception as the cause so the stack trace is not lost.
        throw new SamzaException(msg, e);
    }
}
Also used : ExtendedSequenceNumber(com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber) ShutdownException(com.amazonaws.services.kinesis.clientlibrary.exceptions.ShutdownException) ThrottlingException(com.amazonaws.services.kinesis.clientlibrary.exceptions.ThrottlingException) InvalidStateException(com.amazonaws.services.kinesis.clientlibrary.exceptions.InvalidStateException) SamzaException(org.apache.samza.SamzaException)
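
The guard at the top of checkpoint depends on sequence numbers having a well-defined order. The sketch below models that check on its own, under the assumption that sequence numbers are decimal digit strings to be compared numerically rather than lexicographically. It is not the KCL implementation of ExtendedSequenceNumber.compareTo; the class SeqNumberGuard and its methods are hypothetical.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Samza or KCL: models the "ignore checkpoints
// behind the initial sequence number" guard with plain BigInteger values.
final class SeqNumberGuard {

    private SeqNumberGuard() {
    }

    // Compares two decimal sequence-number strings numerically (assumption: digits only).
    static int compare(String left, String right) {
        return new BigInteger(left).compareTo(new BigInteger(right));
    }

    // Returns true when the requested checkpoint is not behind the initial position.
    static boolean shouldCheckpoint(String initSeqNumber, String seqNumber) {
        return compare(initSeqNumber, seqNumber) <= 0;
    }

    public static void main(String[] args) {
        // Same values as the test above: initial "0000", checkpoint "1234567".
        System.out.println(shouldCheckpoint("0000", "1234567")); // prints true
        // A checkpoint behind the initial position is rejected.
        System.out.println(shouldCheckpoint("100", "99")); // prints false
        // Plain String comparison would order these the other way round.
        System.out.println("100".compareTo("99") < 0); // prints true
    }
}
```

The last line shows why a numeric comparison matters: ordered as plain strings, "100" sorts before "99", so a string-based guard would accept a checkpoint that is actually behind the initial position.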

Example 5 with ExtendedSequenceNumber

use of com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber in project samza by apache.

the class TestKinesisRecordProcessor method testLifeCycleHelper.

private void testLifeCycleHelper(int numRecords) {
    String system = "kinesis";
    String stream = "stream";
    final CountDownLatch receivedShutdownLatch = new CountDownLatch(1);
    final CountDownLatch receivedRecordsLatch = new CountDownLatch(numRecords > 0 ? 1 : 0);
    KinesisRecordProcessorListener listener = new KinesisRecordProcessorListener() {

        @Override
        public void onReceiveRecords(SystemStreamPartition ssp, List<Record> records, long millisBehindLatest) {
            receivedRecordsLatch.countDown();
        }

        @Override
        public void onShutdown(SystemStreamPartition ssp) {
            receivedShutdownLatch.countDown();
        }
    };
    KinesisRecordProcessor processor = new KinesisRecordProcessor(new SystemStreamPartition(system, stream, new Partition(0)), listener);
    // Initialize the processor
    ExtendedSequenceNumber seqNum = new ExtendedSequenceNumber("0000");
    InitializationInput initializationInput = new InitializationInput().withShardId("shard-0000").withExtendedSequenceNumber(seqNum);
    processor.initialize(initializationInput);
    // Call processRecords on the processor
    List<Record> records = generateRecords(numRecords, Collections.singletonList(processor)).get(processor);
    // Verification steps
    // Verify there is a receivedRecords call to listener.
    Assert.assertEquals("Unable to receive records.", 0, receivedRecordsLatch.getCount());
    if (numRecords > 0) {
        // Call checkpoint on last record
        processor.checkpoint(records.get(records.size() - 1).getSequenceNumber());
    }
    // Call shutdown (with ZOMBIE reason) on processor and verify that the processor calls shutdown on the listener.
    shutDownProcessor(processor, ShutdownReason.ZOMBIE);
    // Verify that the processor is shut down.
    Assert.assertEquals("Unable to shutdown processor.", 0, receivedShutdownLatch.getCount());
}
Also used : SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Partition(org.apache.samza.Partition) ExtendedSequenceNumber(com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber) ArrayList(java.util.ArrayList) List(java.util.List) Record(com.amazonaws.services.kinesis.model.Record) CountDownLatch(java.util.concurrent.CountDownLatch) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) InitializationInput(com.amazonaws.services.kinesis.clientlibrary.types.InitializationInput)

Aggregations

ExtendedSequenceNumber (com.amazonaws.services.kinesis.clientlibrary.types.ExtendedSequenceNumber) 6
InitializationInput (com.amazonaws.services.kinesis.clientlibrary.types.InitializationInput) 4
Record (com.amazonaws.services.kinesis.model.Record) 3
ArrayList (java.util.ArrayList) 3
List (java.util.List) 3
CountDownLatch (java.util.concurrent.CountDownLatch) 3
Partition (org.apache.samza.Partition) 3
SystemStreamPartition (org.apache.samza.system.SystemStreamPartition) 3
InvalidStateException (com.amazonaws.services.kinesis.clientlibrary.exceptions.InvalidStateException) 1
ShutdownException (com.amazonaws.services.kinesis.clientlibrary.exceptions.ShutdownException) 1
ThrottlingException (com.amazonaws.services.kinesis.clientlibrary.exceptions.ThrottlingException) 1
HashMap (java.util.HashMap) 1
SamzaException (org.apache.samza.SamzaException) 1
TestKinesisRecordProcessor (org.apache.samza.system.kinesis.consumer.TestKinesisRecordProcessor) 1
Test (org.junit.Test) 1