
Example 1 with TransactionParticipant

Use of org.apache.hudi.connect.transaction.TransactionParticipant in the Apache Hudi project.

Class HoodieSinkTask, method put.

@Override
public void put(Collection<SinkRecord> records) {
    for (SinkRecord record : records) {
        String topic = record.topic();
        int partition = record.kafkaPartition();
        TopicPartition tp = new TopicPartition(topic, partition);
        TransactionParticipant transactionParticipant = transactionParticipants.get(tp);
        if (transactionParticipant != null) {
            transactionParticipant.buffer(record);
        }
    }
    for (TopicPartition partition : context.assignment()) {
        if (transactionParticipants.get(partition) == null) {
            throw new RetriableException("TransactionParticipant should be created for each assigned partition, but has not been created for the topic/partition: " + partition.topic() + ":" + partition.partition());
        }
        try {
            transactionParticipants.get(partition).processRecords();
        } catch (HoodieIOException exception) {
            throw new RetriableException("Intermittent write errors for Hudi for the topic/partition: " + partition.topic() + ":" + partition.partition() + ", ensuring Kafka Connect will retry", exception);
        }
    }
}
Also used : TransactionParticipant(org.apache.hudi.connect.transaction.TransactionParticipant) ConnectTransactionParticipant(org.apache.hudi.connect.transaction.ConnectTransactionParticipant) HoodieIOException(org.apache.hudi.exception.HoodieIOException) TopicPartition(org.apache.kafka.common.TopicPartition) SinkRecord(org.apache.kafka.connect.sink.SinkRecord) RetriableException(org.apache.kafka.connect.errors.RetriableException)
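
The two-pass shape of put above (first buffer each record with the participant for its partition, then require a participant for every assigned partition and ask it to process) can be sketched without Kafka or Hudi on the classpath. This is only an illustration under stated assumptions: PutSketch, Participant and the string partition keys are hypothetical stand-ins, and IllegalStateException takes the place of RetriableException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins: these are NOT the Hudi or Kafka Connect types.
class PutSketch {
    /** Minimal participant that buffers records and counts the ones it processed. */
    static class Participant {
        final List<String> buffer = new ArrayList<>();
        int processed = 0;

        void buffer(String record) {
            buffer.add(record);
        }

        void processRecords() {
            processed += buffer.size();
            buffer.clear();
        }
    }

    /** Each record is modelled as {partitionKey, payload}. */
    static void put(List<String[]> records, Set<String> assignment, Map<String, Participant> participants) {
        // Pass 1: hand each record to the participant for its partition, if one exists.
        for (String[] record : records) {
            Participant participant = participants.get(record[0]);
            if (participant != null) {
                participant.buffer(record[1]);
            }
        }
        // Pass 2: every assigned partition must have a participant, which then processes its buffer.
        for (String partition : assignment) {
            Participant participant = participants.get(partition);
            if (participant == null) {
                // The original throws RetriableException here so that the framework retries.
                throw new IllegalStateException("No participant for " + partition);
            }
            participant.processRecords();
        }
    }

    public static void main(String[] args) {
        Map<String, Participant> participants = new HashMap<>();
        participants.put("topic:0", new Participant());
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"topic:0", "a"});
        records.add(new String[] {"topic:1", "b"}); // no participant registered: skipped in pass 1
        put(records, Set.of("topic:0"), participants);
        System.out.println(participants.get("topic:0").processed); // prints 1
    }
}
```

Records for a partition without a participant are dropped silently in the first pass; only an assigned partition without a participant raises an error.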

Example 2 with TransactionParticipant

Use of org.apache.hudi.connect.transaction.TransactionParticipant in the Apache Hudi project.

Class HoodieSinkTask, method close.

@Override
public void close(Collection<TopicPartition> partitions) {
    LOG.info("Existing partitions deleted " + partitions.toString());
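    // Close the writers for the partitions being closed. The same partitions may be assigned again later,
    // in which case some records are reprocessed. For now, we prefer the simpler solution that may result
    // in a bit of wasted effort.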
    for (TopicPartition partition : partitions) {
        if (partition.partition() == ConnectTransactionCoordinator.COORDINATOR_KAFKA_PARTITION) {
            if (transactionCoordinators.containsKey(partition)) {
                transactionCoordinators.get(partition).stop();
                transactionCoordinators.remove(partition);
            }
        }
        TransactionParticipant worker = transactionParticipants.remove(partition);
        if (worker != null) {
            try {
                LOG.debug("Closing data writer for closed partition " + partition);
                worker.stop();
            } catch (Throwable t) {
                LOG.debug(String.format("Error closing and stopping data writer: %s", t.getMessage()), t);
            }
        }
    }
}
Also used : TransactionParticipant(org.apache.hudi.connect.transaction.TransactionParticipant) ConnectTransactionParticipant(org.apache.hudi.connect.transaction.ConnectTransactionParticipant) TopicPartition(org.apache.kafka.common.TopicPartition)
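
The remove-then-stop pattern in close above, where a failure while stopping one worker does not prevent the remaining partitions from being closed, might look like this in isolation. CloseSketch and Worker are hypothetical names, partitions are plain strings, and the sketch catches RuntimeException where the original catches Throwable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; not the Hudi TransactionParticipant API.
class CloseSketch {
    interface Worker {
        void stop();
    }

    /** Removes and stops the worker of each closed partition; returns the partitions whose stop failed. */
    static List<String> close(List<String> partitions, Map<String, Worker> workers) {
        List<String> failed = new ArrayList<>();
        for (String partition : partitions) {
            // remove() both looks up and unregisters the worker in one step.
            Worker worker = workers.remove(partition);
            if (worker == null) {
                continue;
            }
            try {
                worker.stop();
            } catch (RuntimeException e) {
                // Swallow the error (the original only logs it) so the loop keeps going.
                failed.add(partition);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, Worker> workers = new HashMap<>();
        workers.put("topic:0", () -> { throw new IllegalStateException("boom"); });
        workers.put("topic:1", () -> { });
        List<String> partitions = new ArrayList<>();
        partitions.add("topic:0");
        partitions.add("topic:1");
        System.out.println(close(partitions, workers)); // prints [topic:0]
        System.out.println(workers.isEmpty());          // prints true
    }
}
```

Returning the failed partitions is an addition for the sake of the example; the original method returns nothing and relies on the debug log.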

Example 3 with TransactionParticipant

Use of org.apache.hudi.connect.transaction.TransactionParticipant in the Apache Hudi project.

Class KafkaConnectControlAgent, method start.

private void start() {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    // Todo fetch the worker id or name instead of a uuid.
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "hudi-control-group" + UUID.randomUUID().toString());
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
    // Since we are using Kafka Control Topic as a RPC like interface,
    // we want consumers to only process messages that are sent after they come online
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
    consumer = new KafkaConsumer<>(props, new StringDeserializer(), new ByteArrayDeserializer());
    consumer.subscribe(Collections.singletonList(controlTopicName));
    executorService.submit(() -> {
        while (true) {
            ConsumerRecords<String, byte[]> records;
            records = consumer.poll(Duration.ofMillis(KAFKA_POLL_TIMEOUT_MS));
            for (ConsumerRecord<String, byte[]> record : records) {
                try {
                    LOG.debug(String.format("Kafka control record: topic = %s, partition = %s, offset = %s, key = %s, value size = %s bytes", record.topic(), record.partition(), record.offset(), record.key(), record.value().length));
                    ControlMessage message = ControlMessage.parseFrom(record.value());
                    String senderTopic = message.getTopicName();
                    if (message.getReceiverType().equals(ControlMessage.EntityType.PARTICIPANT)) {
                        if (partitionWorkers.containsKey(senderTopic)) {
                            for (TransactionParticipant partitionWorker : partitionWorkers.get(senderTopic)) {
                                partitionWorker.processControlEvent(message);
                            }
                        } else {
                            LOG.warn(String.format("Failed to send message for unregistered participants for topic %s", senderTopic));
                        }
                    } else if (message.getReceiverType().equals(ControlMessage.EntityType.COORDINATOR)) {
                        if (topicCoordinators.containsKey(senderTopic)) {
                            topicCoordinators.get(senderTopic).processControlEvent(message);
                        }
                    } else {
                        LOG.warn(String.format("Receiver type of Control Message unknown %s", message.getReceiverType().name()));
                    }
                } catch (Exception e) {
                    LOG.error(String.format("Fatal error while consuming a kafka record for topic = %s partition = %s", record.topic(), record.partition()), e);
                }
            }
            try {
                consumer.commitSync();
            } catch (CommitFailedException exception) {
                LOG.error("Fatal error while committing kafka control topic", exception);
            }
        }
    });
}
Also used : TransactionParticipant(org.apache.hudi.connect.transaction.TransactionParticipant) StringDeserializer(org.apache.kafka.common.serialization.StringDeserializer) Properties(java.util.Properties) ByteArrayDeserializer(org.apache.kafka.common.serialization.ByteArrayDeserializer) CommitFailedException(org.apache.kafka.clients.consumer.CommitFailedException) ControlMessage(org.apache.hudi.connect.ControlMessage)
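
The routing done inside the polling loop above (participants receive a message when the receiver type is PARTICIPANT, the topic's coordinator when it is COORDINATOR, and anything else is ignored) can be isolated from the Kafka consumer. The sketch below assumes a simplified message with only a topic and a receiver type; ControlDispatchSketch, Message and the handler maps are hypothetical and are not the Hudi ControlMessage API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-ins; not the Hudi ControlMessage or TransactionParticipant types.
class ControlDispatchSketch {
    enum EntityType { PARTICIPANT, COORDINATOR, UNKNOWN }

    static class Message {
        final String topic;
        final EntityType receiver;

        Message(String topic, EntityType receiver) {
            this.topic = topic;
            this.receiver = receiver;
        }
    }

    // Several participants may be registered per topic, but at most one coordinator.
    final Map<String, List<Consumer<Message>>> participants = new HashMap<>();
    final Map<String, Consumer<Message>> coordinators = new HashMap<>();

    /** Routes the message by receiver type and returns the number of handlers invoked. */
    int dispatch(Message message) {
        if (message.receiver == EntityType.PARTICIPANT) {
            List<Consumer<Message>> workers = participants.getOrDefault(message.topic, new ArrayList<>());
            for (Consumer<Message> worker : workers) {
                worker.accept(message);
            }
            return workers.size();
        }
        if (message.receiver == EntityType.COORDINATOR) {
            Consumer<Message> coordinator = coordinators.get(message.topic);
            if (coordinator == null) {
                return 0;
            }
            coordinator.accept(message);
            return 1;
        }
        // Unknown receiver type: the original logs a warning at this point.
        return 0;
    }
}
```

Returning the handler count is a choice made for the sketch so that the routing is easy to check; the original only logs when no participant is registered for the topic.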

Example 4 with TransactionParticipant

Use of org.apache.hudi.connect.transaction.TransactionParticipant in the Apache Hudi project.

Class HoodieSinkTask, method cleanup.

private void cleanup() {
    for (TopicPartition partition : context.assignment()) {
        TransactionParticipant worker = transactionParticipants.get(partition);
        if (worker != null) {
            try {
                LOG.debug("Closing data writer due to task start failure.");
                worker.stop();
            } catch (Throwable t) {
                LOG.debug("Error closing and stopping data writer", t);
            }
        }
    }
    transactionParticipants.clear();
    transactionCoordinators.forEach((topic, transactionCoordinator) -> transactionCoordinator.stop());
    transactionCoordinators.clear();
}
Also used : TransactionParticipant(org.apache.hudi.connect.transaction.TransactionParticipant) ConnectTransactionParticipant(org.apache.hudi.connect.transaction.ConnectTransactionParticipant) TopicPartition(org.apache.kafka.common.TopicPartition)

Aggregations

TransactionParticipant (org.apache.hudi.connect.transaction.TransactionParticipant): 4
ConnectTransactionParticipant (org.apache.hudi.connect.transaction.ConnectTransactionParticipant): 3
TopicPartition (org.apache.kafka.common.TopicPartition): 3
Properties (java.util.Properties): 1
ControlMessage (org.apache.hudi.connect.ControlMessage): 1
HoodieIOException (org.apache.hudi.exception.HoodieIOException): 1
CommitFailedException (org.apache.kafka.clients.consumer.CommitFailedException): 1
ByteArrayDeserializer (org.apache.kafka.common.serialization.ByteArrayDeserializer): 1
StringDeserializer (org.apache.kafka.common.serialization.StringDeserializer): 1
RetriableException (org.apache.kafka.connect.errors.RetriableException): 1
SinkRecord (org.apache.kafka.connect.sink.SinkRecord): 1