
Example 6 with SystemProducerException

use of org.apache.samza.system.SystemProducerException in project samza by apache.

the class AzureBlobSystemProducer method flushWriters.

private void flushWriters(Map<String, AzureBlobWriter> sourceWriterMap) {
    sourceWriterMap.forEach((stream, writer) -> {
        try {
            LOG.info("Flushing topic:{}", stream);
            writer.flush();
        } catch (IOException e) {
            throw new SystemProducerException("Close failed for topic " + stream, e);
        }
    });
}
Also used : IOException(java.io.IOException) SystemProducerException(org.apache.samza.system.SystemProducerException)
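
The try/catch inside the lambda above is needed because `Map.forEach` takes a `BiConsumer`, whose `accept` method declares no checked exceptions, so an `IOException` from `flush()` has to be wrapped in an unchecked exception to leave the lambda. A minimal sketch of the same pattern follows; `FlushAllSketch` and `FlushFailedException` are hypothetical names standing in for the producer and `SystemProducerException`, and `java.io.Flushable` stands in for `AzureBlobWriter`.

```java
import java.io.Flushable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not part of Samza.
class FlushAllSketch {
    // Unchecked wrapper, playing the role of SystemProducerException.
    static class FlushFailedException extends RuntimeException {
        FlushFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Flushes every writer and returns how many were flushed.
    // The first failure aborts the iteration and propagates to the caller.
    static int flushAll(Map<String, ? extends Flushable> writers) {
        int[] flushed = {0}; // array so the lambda can mutate the counter
        writers.forEach((stream, writer) -> {
            try {
                writer.flush();
                flushed[0]++;
            } catch (IOException e) {
                // A BiConsumer cannot throw IOException, so wrap it.
                throw new FlushFailedException("Flush failed for stream " + stream, e);
            }
        });
        return flushed[0];
    }

    public static void main(String[] args) {
        Map<String, Flushable> writers = new LinkedHashMap<>();
        writers.put("a", () -> { });
        writers.put("b", () -> { throw new IOException("disk full"); });
        try {
            flushAll(writers);
        } catch (FlushFailedException e) {
            System.out.println(e.getMessage() + " caused by " + e.getCause());
        }
    }
}
```

Because the wrapper keeps the original `IOException` as its cause, callers can still inspect the underlying failure.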

Example 7 with SystemProducerException

use of org.apache.samza.system.SystemProducerException in project samza by apache.

the class AzureBlobSystemProducer method validateFlushThresholdSizeSupported.

void validateFlushThresholdSizeSupported(BlobServiceAsyncClient storageClient) {
    long flushThresholdSize = config.getMaxFlushThresholdSize(systemName);
    try {
        SkuName accountType = storageClient.getAccountInfo().block().getSkuName();
        String accountName = storageClient.getAccountName();
        boolean isPremiumAccount = SkuName.PREMIUM_LRS == accountType;
        if (isPremiumAccount && flushThresholdSize > PREMIUM_MAX_BLOCK_SIZE) {
            // 100 MB
            throw new SystemProducerException("Azure storage account with name: " + accountName + " is a premium account and can only handle upto " + PREMIUM_MAX_BLOCK_SIZE + " threshold size. Given flush threshold size is " + flushThresholdSize);
        } else if (!isPremiumAccount && flushThresholdSize > STANDARD_MAX_BLOCK_SIZE) {
            // STANDARD account
            throw new SystemProducerException("Azure storage account with name: " + accountName + " is a standard account and can only handle upto " + STANDARD_MAX_BLOCK_SIZE + " threshold size. Given flush threshold size is " + flushThresholdSize);
        }
    } catch (Exception e) {
        LOG.warn("Exception encountered while trying to ensure that the given flush threshold size is " + "supported by the desired azure blob storage account. " + "{} is the given account name and {} is the flush threshold size.", config.getAzureAccountName(systemName), flushThresholdSize);
        LOG.warn("SystemProducer will continue and send messages to Azure Blob Storage but they might fail " + "if the given threshold size is not supported by the account");
    }
}
Also used : SkuName(com.azure.storage.blob.models.SkuName) SystemProducerException(org.apache.samza.system.SystemProducerException) BlobStorageException(com.azure.storage.blob.models.BlobStorageException) IOException(java.io.IOException)
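
One detail of the listing above is worth spelling out: the `SystemProducerException` is thrown inside the `try` block, and the `catch (Exception e)` clause also matches it, so as listed the method logs warnings and returns instead of propagating the failure. The source does not say whether that is intended. The sketch below contrasts that shape with a fail-fast variant; all names (`ThresholdCheckSketch`, `ValidationException`, `lenientCheck`, `strictCheck`) are hypothetical, and the boolean return value is an illustration device, not something the original method has.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch; not part of Samza.
class ThresholdCheckSketch {
    // Stand-in for SystemProducerException.
    static class ValidationException extends RuntimeException {
        ValidationException(String message) {
            super(message);
        }
    }

    // Same shape as the listing: the broad catch also intercepts the
    // ValidationException thrown inside the try block.
    static boolean lenientCheck(long threshold, LongSupplier maxBlockSize) {
        try {
            long max = maxBlockSize.getAsLong();
            if (threshold > max) {
                throw new ValidationException("Threshold " + threshold + " exceeds " + max);
            }
            return true;
        } catch (Exception e) {
            return false; // the listing logs a warning here and continues
        }
    }

    // Fail-fast variant (an assumption, not upstream behavior): only a failed
    // limit lookup is tolerated; an oversized threshold reaches the caller.
    static boolean strictCheck(long threshold, LongSupplier maxBlockSize) {
        long max;
        try {
            max = maxBlockSize.getAsLong();
        } catch (Exception e) {
            return false; // limit unknown, carry on without validating
        }
        if (threshold > max) {
            throw new ValidationException("Threshold " + threshold + " exceeds " + max);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(lenientCheck(200, () -> 100));
        try {
            strictCheck(200, () -> 100);
        } catch (ValidationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```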

Example 8 with SystemProducerException

use of org.apache.samza.system.SystemProducerException in project samza by apache.

the class AzureBlobSystemProducer method send.

/**
 * Multi-threading and thread-safety:
 *
 *  From Samza usage of SystemProducer:
 *  The lifecycle of SystemProducer shown above is consistent with most use cases within Samza (with the exception of
 *  Coordinator stream store/producer and KafkaCheckpointManager).
 *  A single parent thread creates the SystemProducer, registers all sources and starts it before handing it
 *  to multiple threads for use (send and flush). Finally, the single parent thread stops the producer.
 *  The most frequent operations on a SystemProducer are send and flush while register, start and stop are one-time operations.
 *
 *  Based on this usage pattern: to provide multi-threaded support and improve throughput of this SystemProducer,
 *  multiple sends and flushes need to happen in parallel. However, the following rules are needed to ensure
 *  no data loss and data consistency.
 *  1. sends can happen in parallel for same source or different sources.
 *  2. send and flush for the same source cannot happen in parallel. Although the AzureBlobWriter is thread-safe,
 *     interleaving the write, flush and close operations of a writer can lead to data loss if a write happens between flush and close.
 *     Other problematic scenarios include issuing a write to the writer after close.
 *  3. writer creation for the same writer key (SSP) cannot happen in parallel, because multiple
 *     writers could get created with only one being retained but all being used and GCed after a send, leading to data loss.
 *
 *  These 3 rules are achieved by using a per source ReadWriteLock to allow sends in parallel but guarantee exclusivity for flush.
 *  Additionally, a per source lock is used to ensure writer creation is in a critical section.
 *
 *  Concurrent access to shared objects as follows:
 *  1. AzureBlobWriters: concurrent access is permitted as long as the operations of a writer are not interleaved.
 *     Where multiple operations of a writer run together (as in flush), they are made synchronized.
 *  2. ConcurrentHashMaps (esp. writerMap per source) get and put: interleaving is disallowed by doing put and clear under locks.
 *  3. WriterFactory and Metrics are thread-safe. WriterFactory is stateless while Metrics' operations interleaving
 *     is thread-safe too as they work on different counters.
 *  The above locking mechanisms ensure thread-safety.
 * {@inheritDoc}
 * @throws SystemProducerException
 */
@Override
public void send(String source, OutgoingMessageEnvelope messageEnvelope) {
    if (!isStarted) {
        throw new SystemProducerException("Trying to send before producer has started.");
    }
    if (isStopped) {
        throw new SystemProducerException("Sending after producer has been stopped.");
    }
    ReadWriteLock lock = sourceSendFlushLockMap.get(source);
    if (lock == null) {
        throw new SystemProducerException("Attempting to send to source: " + source + " but it was not registered");
    }
    lock.readLock().lock();
    try {
        AzureBlobWriter writer = getOrCreateWriter(source, messageEnvelope);
        writer.write(messageEnvelope);
        metrics.updateWriteMetrics(source);
    } catch (Exception e) {
        metrics.updateErrorMetrics(source);
        Object partitionKey = getPartitionKey(messageEnvelope);
        String msg = "Send failed for source: " + source + ", system: " + systemName + ", stream: " + messageEnvelope.getSystemStream().getStream() + ", partitionKey: " + ((partitionKey != null) ? partitionKey : "null");
        throw new SystemProducerException(msg, e);
    } finally {
        lock.readLock().unlock();
    }
}
Also used : ReentrantReadWriteLock(java.util.concurrent.locks.ReentrantReadWriteLock) ReadWriteLock(java.util.concurrent.locks.ReadWriteLock) SystemProducerException(org.apache.samza.system.SystemProducerException) BlobStorageException(com.azure.storage.blob.models.BlobStorageException) IOException(java.io.IOException)
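
The locking rules in the Javadoc above (parallel sends, exclusive flush per source) can be illustrated with a small stand-alone sketch. It assumes a simplified producer that buffers strings in memory; `PerSourceLockSketch` and its methods are hypothetical and only mirror the read-lock/write-lock split, not the writer creation or metrics of the real class.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a producer; this is not Samza's AzureBlobSystemProducer.
class PerSourceLockSketch {
    private final Map<String, ReadWriteLock> locks = new ConcurrentHashMap<>();
    private final Map<String, Queue<String>> buffers = new ConcurrentHashMap<>();

    // One-time setup per source, done before any send or flush.
    void register(String source) {
        buffers.putIfAbsent(source, new ConcurrentLinkedQueue<>());
        locks.putIfAbsent(source, new ReentrantReadWriteLock());
    }

    // Sends take the shared read lock, so many senders can run in parallel.
    void send(String source, String message) {
        ReadWriteLock lock = requireLock(source);
        lock.readLock().lock();
        try {
            buffers.get(source).add(message);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Flush takes the exclusive write lock: it waits for in-flight sends on
    // this source and blocks new ones until the buffer has been drained.
    int flush(String source) {
        ReadWriteLock lock = requireLock(source);
        lock.writeLock().lock();
        try {
            Queue<String> buffer = buffers.get(source);
            int drained = 0;
            while (buffer.poll() != null) {
                drained++;
            }
            return drained;
        } finally {
            lock.writeLock().unlock();
        }
    }

    private ReadWriteLock requireLock(String source) {
        ReadWriteLock lock = locks.get(source);
        if (lock == null) {
            throw new IllegalStateException("Source not registered: " + source);
        }
        return lock;
    }

    public static void main(String[] args) {
        PerSourceLockSketch producer = new PerSourceLockSketch();
        producer.register("source-1");
        producer.send("source-1", "hello");
        System.out.println("flushed " + producer.flush("source-1") + " message(s)");
    }
}
```

Using the read lock for the frequent operation (send) and the write lock for the rare one (flush) keeps sends from serializing on each other while still preventing a write from landing in the middle of a flush.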

Example 9 with SystemProducerException

use of org.apache.samza.system.SystemProducerException in project samza by apache.

the class AzureBlobSystemProducer method stop.

/**
 * {@inheritDoc}
 * @throws SystemProducerException
 */
@Override
public synchronized void stop() {
    if (!isStarted) {
        LOG.warn("Attempting to stop a producer that was not started.");
        return;
    }
    if (isStopped) {
        LOG.warn("Attempting to stop an already stopped producer.");
        return;
    }
    try {
        writerMap.forEach((source, sourceWriterMap) -> flush(source));
        asyncBlobThreadPool.shutdown();
        isStarted = false;
    } catch (Exception e) {
        throw new SystemProducerException("Stop failed with exception.", e);
    } finally {
        writerMap.clear();
        isStopped = true;
    }
}
Also used : BlobStorageException(com.azure.storage.blob.models.BlobStorageException) SystemProducerException(org.apache.samza.system.SystemProducerException) IOException(java.io.IOException)
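
The `stop` listing above combines three behaviors: repeated or premature calls are no-ops, a flush failure is wrapped and rethrown, and the `finally` block marks the producer stopped either way. A minimal sketch of that lifecycle follows, assuming a hypothetical `StopLifecycleSketch` class whose flush step is an injected `Runnable`; the boolean return value is added only so the no-op case is observable.

```java
// Hypothetical sketch; not part of Samza.
class StopLifecycleSketch {
    // Stand-in for SystemProducerException.
    static class StopFailedException extends RuntimeException {
        StopFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private final Runnable flushAll;
    private boolean started;
    private boolean stopped;

    StopLifecycleSketch(Runnable flushAll) {
        this.flushAll = flushAll;
    }

    synchronized void start() {
        started = true;
    }

    // Returns false when the call was a no-op (never started or already stopped).
    synchronized boolean stop() {
        if (!started || stopped) {
            return false;
        }
        try {
            flushAll.run();
            started = false;
        } catch (Exception e) {
            throw new StopFailedException("Stop failed with exception.", e);
        } finally {
            stopped = true; // runs even when the flush throws
        }
        return true;
    }

    synchronized boolean isStopped() {
        return stopped;
    }

    public static void main(String[] args) {
        StopLifecycleSketch producer = new StopLifecycleSketch(() -> {
            throw new IllegalStateException("flush failed");
        });
        producer.start();
        try {
            producer.stop();
        } catch (StopFailedException e) {
            System.out.println(e.getMessage() + " stopped=" + producer.isStopped());
        }
    }
}
```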

Example 10 with SystemProducerException

use of org.apache.samza.system.SystemProducerException in project samza by apache.

the class TestAzureBlobSystemProducer method testStopWhenWriterFails.

@Test(expected = SystemProducerException.class)
public void testStopWhenWriterFails() throws IOException {
    doThrow(new SystemProducerException("Failed")).when(mockAzureWriter).flush();
    systemProducer.register(SOURCE);
    systemProducer.start();
    systemProducer.send(SOURCE, ome);
    systemProducer.stop();
}
Also used : SystemProducerException(org.apache.samza.system.SystemProducerException) PrepareForTest(org.powermock.core.classloader.annotations.PrepareForTest) Test(org.junit.Test)

Aggregations

SystemProducerException (org.apache.samza.system.SystemProducerException) 13
IOException (java.io.IOException) 6
Test (org.junit.Test) 6
PrepareForTest (org.powermock.core.classloader.annotations.PrepareForTest) 6
BlobStorageException (com.azure.storage.blob.models.BlobStorageException) 5
ReentrantReadWriteLock (java.util.concurrent.locks.ReentrantReadWriteLock) 3
ReadWriteLock (java.util.concurrent.locks.ReadWriteLock) 2
AzureBlobConfig (org.apache.samza.system.azureblob.AzureBlobConfig) 2
BlobServiceAsyncClient (com.azure.storage.blob.BlobServiceAsyncClient) 1
SkuName (com.azure.storage.blob.models.SkuName) 1
VisibleForTesting (com.google.common.annotations.VisibleForTesting) 1
OutgoingMessageEnvelope (org.apache.samza.system.OutgoingMessageEnvelope) 1
AzureBlobClientBuilder (org.apache.samza.system.azureblob.AzureBlobClientBuilder) 1
Mockito.anyString (org.mockito.Mockito.anyString) 1