Use of org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit in project flink by apache: class KafkaSourceEnumerator, method addPartitionSplitChangeToPendingAssignments.
// This method should only be invoked in the coordinator executor thread.
private void addPartitionSplitChangeToPendingAssignments(Collection<KafkaPartitionSplit> newPartitionSplits) {
    int numReaders = context.currentParallelism();
    for (KafkaPartitionSplit split : newPartitionSplits) {
        int ownerReader = getSplitOwner(split.getTopicPartition(), numReaders);
        pendingPartitionSplitAssignment.computeIfAbsent(ownerReader, r -> new HashSet<>()).add(split);
    }
    LOG.debug("Assigned {} to {} readers of consumer group {}.", newPartitionSplits, numReaders, consumerGroupId);
}
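The getSplitOwner helper is not part of this snippet. A minimal sketch of an ownership scheme like the one used by Flink's KafkaSourceEnumerator, where partitions of a topic are spread round-robin over readers starting from a hash-derived index (treat the exact hashing as an assumption):

import org.apache.kafka.common.TopicPartition;

final class SplitOwnerSketch {
    // Picks a start index from the topic name hash, then assigns consecutive
    // partitions of that topic to consecutive readers (modulo the parallelism).
    static int getSplitOwner(TopicPartition tp, int numReaders) {
        int startIndex = ((tp.topic().hashCode() * 31) & 0x7FFFFFFF) % numReaders;
        return (startIndex + tp.partition()) % numReaders;
    }
}

Keeping the start index topic-dependent avoids always loading reader 0 first when every topic's partition numbering starts at 0.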
Use of org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit in project flink by apache: class KafkaPartitionSplitReaderTest, method assignSplitsAndFetchUntilFinish.
// ------------------
private void assignSplitsAndFetchUntilFinish(KafkaPartitionSplitReader reader, int readerId) throws IOException {
    Map<String, KafkaPartitionSplit> splits = assignSplits(reader, splitsByOwners.get(readerId));
    Map<String, Integer> numConsumedRecords = new HashMap<>();
    Set<String> finishedSplits = new HashSet<>();
    while (finishedSplits.size() < splits.size()) {
        RecordsWithSplitIds<ConsumerRecord<byte[], byte[]>> recordsBySplitIds = reader.fetch();
        String splitId = recordsBySplitIds.nextSplit();
        while (splitId != null) {
            // Collect the records in this split.
            List<ConsumerRecord<byte[], byte[]>> splitFetch = new ArrayList<>();
            ConsumerRecord<byte[], byte[]> record;
            while ((record = recordsBySplitIds.nextRecordFromSplit()) != null) {
                splitFetch.add(record);
            }
            // Compute the expected next offset for the split.
            TopicPartition tp = splits.get(splitId).getTopicPartition();
            long earliestOffset = earliestOffsets.get(tp);
            int numConsumedRecordsForSplit = numConsumedRecords.getOrDefault(splitId, 0);
            long expectedStartingOffset = earliestOffset + numConsumedRecordsForSplit;
            // Verify the consumed records.
            if (verifyConsumed(splits.get(splitId), expectedStartingOffset, splitFetch)) {
                finishedSplits.add(splitId);
            }
            numConsumedRecords.compute(splitId, (ignored, recordCount) -> recordCount == null ? splitFetch.size() : recordCount + splitFetch.size());
            splitId = recordsBySplitIds.nextSplit();
        }
    }
    // Verify the number of records consumed from each split.
    numConsumedRecords.forEach((splitId, recordCount) -> {
        TopicPartition tp = splits.get(splitId).getTopicPartition();
        long earliestOffset = earliestOffsets.get(tp);
        long expectedRecordCount = NUM_RECORDS_PER_PARTITION - earliestOffset;
        assertEquals(expectedRecordCount, (long) recordCount, String.format("%s should have %d records.", splits.get(splitId), expectedRecordCount));
    });
}
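The assignSplits helper referenced above is also outside this snippet. A minimal sketch of what it is assumed to do, namely hand the splits to the reader as an addition and return them keyed by split id so the test can look them up:

import java.util.ArrayList;
import java.util.Map;
import org.apache.flink.connector.base.source.reader.splitreader.SplitsAddition;
import org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader;
import org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit;

final class AssignSplitsSketch {
    // Hypothetical test helper: pushes the splits into the reader and returns
    // the same splitId-to-split map for later lookups and verification.
    static Map<String, KafkaPartitionSplit> assignSplits(
            KafkaPartitionSplitReader reader, Map<String, KafkaPartitionSplit> splits) {
        reader.handleSplitsChanges(new SplitsAddition<>(new ArrayList<>(splits.values())));
        return splits;
    }
}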
Use of org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit in project flink by apache: class KafkaPartitionSplitReaderTest, method testPendingRecordsGauge.
@ParameterizedTest
@EmptySource
@ValueSource(strings = { "_underscore.period-minus" })
public void testPendingRecordsGauge(String topicSuffix) throws Throwable {
    final String topic1Name = TOPIC1 + topicSuffix;
    final String topic2Name = TOPIC2 + topicSuffix;
    if (!topicSuffix.isEmpty()) {
        KafkaSourceTestEnv.setupTopic(topic1Name, true, true, KafkaSourceTestEnv::getRecordsForTopic);
        KafkaSourceTestEnv.setupTopic(topic2Name, true, true, KafkaSourceTestEnv::getRecordsForTopic);
    }
    MetricListener metricListener = new MetricListener();
    final Properties props = new Properties();
    props.setProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1");
    KafkaPartitionSplitReader reader = createReader(props, InternalSourceReaderMetricGroup.mock(metricListener.getMetricGroup()));
    // Add a split.
    reader.handleSplitsChanges(new SplitsAddition<>(Collections.singletonList(new KafkaPartitionSplit(new TopicPartition(topic1Name, 0), 0L))));
    // pendingRecords should not have been registered yet because registration is lazy.
    assertFalse(metricListener.getGauge(MetricNames.PENDING_RECORDS).isPresent());
    // Trigger the first fetch.
    reader.fetch();
    final Optional<Gauge<Long>> pendingRecords = metricListener.getGauge(MetricNames.PENDING_RECORDS);
    assertTrue(pendingRecords.isPresent());
    // Validate pendingRecords.
    assertNotNull(pendingRecords);
    assertEquals(NUM_RECORDS_PER_PARTITION - 1, (long) pendingRecords.get().getValue());
    for (int i = 1; i < NUM_RECORDS_PER_PARTITION; i++) {
        reader.fetch();
        assertEquals(NUM_RECORDS_PER_PARTITION - i - 1, (long) pendingRecords.get().getValue());
    }
    // Add another split.
    reader.handleSplitsChanges(new SplitsAddition<>(Collections.singletonList(new KafkaPartitionSplit(new TopicPartition(topic2Name, 0), 0L))));
    // Validate pendingRecords for the second split.
    for (int i = 0; i < NUM_RECORDS_PER_PARTITION; i++) {
        reader.fetch();
        assertEquals(NUM_RECORDS_PER_PARTITION - i - 1, (long) pendingRecords.get().getValue());
    }
}
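With max.poll.records set to 1, each fetch consumes exactly one record, so the gauge is expected to drop by one per fetch. A rough sketch of how a pending-record count can be derived from a plain KafkaConsumer; this is a hypothetical helper for illustration, not the connector's internal metric wiring:

import java.util.Map;
import java.util.Set;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

final class PendingRecordsSketch {
    // Sums (endOffset - position) over the assigned partitions; assumed to
    // approximate what the pendingRecords gauge reports after each fetch.
    static long pendingRecords(KafkaConsumer<byte[], byte[]> consumer) {
        Set<TopicPartition> assignment = consumer.assignment();
        Map<TopicPartition, Long> endOffsets = consumer.endOffsets(assignment);
        long pending = 0L;
        for (TopicPartition tp : assignment) {
            pending += endOffsets.get(tp) - consumer.position(tp);
        }
        return pending;
    }
}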
Use of org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit in project flink by apache: class KafkaPartitionSplitReaderTest, method testWakeUp.
@Test
public void testWakeUp() throws Exception {
    KafkaPartitionSplitReader reader = createReader();
    TopicPartition nonExistingTopicPartition = new TopicPartition("NotExist", 0);
    assignSplits(reader, Collections.singletonMap(KafkaPartitionSplit.toSplitId(nonExistingTopicPartition), new KafkaPartitionSplit(nonExistingTopicPartition, 0)));
    AtomicReference<Throwable> error = new AtomicReference<>();
    Thread t = new Thread(() -> {
        try {
            reader.fetch();
        } catch (Throwable e) {
            error.set(e);
        }
    }, "testWakeUp-thread");
    t.start();
    long deadline = System.currentTimeMillis() + 5000L;
    while (t.isAlive() && System.currentTimeMillis() < deadline) {
        reader.wakeUp();
        Thread.sleep(10);
    }
    assertNull(error.get());
}
Use of org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit in project flink by apache: class KafkaPartitionSplitReaderTest, method testUsingCommittedOffsetsWithNoneOffsetResetStrategy.
@Test
public void testUsingCommittedOffsetsWithNoneOffsetResetStrategy() {
    final Properties props = new Properties();
    props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "using-committed-offset-with-none-offset-reset");
    KafkaPartitionSplitReader reader = createReader(props, UnregisteredMetricsGroup.createSourceReaderMetricGroup());
    // The split asks the reader to start from the committed offset, but the group has no
    // committed offset, and the offset reset strategy is "none" (throw an exception to the
    // consumer if no previous offset is found for the consumer's group), so handling the
    // split is expected to fail with a missing-committed-offset error.
    final KafkaException undefinedOffsetException = Assertions.assertThrows(KafkaException.class, () -> reader.handleSplitsChanges(new SplitsAddition<>(Collections.singletonList(new KafkaPartitionSplit(new TopicPartition(TOPIC1, 0), KafkaPartitionSplit.COMMITTED_OFFSET)))));
    MatcherAssert.assertThat(undefinedOffsetException.getMessage(), CoreMatchers.containsString("Undefined offset with no reset policy for partition"));
}