use of org.apache.druid.data.input.kafka.KafkaRecordEntity in project druid by druid-io.
the class KafkaInputFormatTest method testWithOutKey.
@Test
// Headers cannot be null, so testing only the no-key use case!
public void testWithOutKey() throws IOException {
  final byte[] payload = StringUtils.toUtf8(
      "{\n"
      + " \"timestamp\": \"2021-06-24\",\n"
      + " \"bar\": null,\n"
      + " \"foo\": \"x\",\n"
      + " \"baz\": 4,\n"
      + " \"o\": {\n"
      + " \"mg\": 1\n"
      + " }\n"
      + "}"
  );
  Headers headers = new RecordHeaders(SAMPLE_HEADERS);
  inputEntity = new KafkaRecordEntity(
      new ConsumerRecord<byte[], byte[]>("sample", 0, 0, timestamp, null, null, 0, 0, null, payload, headers)
  );
  final InputEntityReader reader = format.createReader(
      new InputRowSchema(
          new TimestampSpec("timestamp", "iso", null),
          new DimensionsSpec(
              DimensionsSpec.getDefaultSchemas(
                  ImmutableList.of(
                      "bar",
                      "foo",
                      "kafka.newheader.encoding",
                      "kafka.newheader.kafkapkc",
                      "kafka.newts.timestamp"
                  )
              )
          ),
          ColumnsFilter.all()
      ),
      inputEntity,
      null
  );
  final int numExpectedIterations = 1;
  try (CloseableIterator<InputRow> iterator = reader.read()) {
    int numActualIterations = 0;
    while (iterator.hasNext()) {
      final InputRow row = iterator.next();
      // Key verification
      Assert.assertTrue(row.getDimension("kafka.newkey.key").isEmpty());
      numActualIterations++;
    }
    Assert.assertEquals(numExpectedIterations, numActualIterations);
  }
}
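The SAMPLE_HEADERS fixture and the format field used above are defined elsewhere in KafkaInputFormatTest and are not part of this excerpt. As a rough sketch only, with key/value pairs inferred from the expected header values asserted in the KafkaStringHeaderFormatTest methods below, the fixture plausibly resembles the following (RecordHeader from org.apache.kafka.common.header.internals is used here for brevity; the actual fixture may build Header instances differently):
// Hypothetical reconstruction of the SAMPLE_HEADERS fixture; the key/value pairs are
// inferred from the expected results in the header-format tests below.
private static final Iterable<Header> SAMPLE_HEADERS = ImmutableList.of(
    new RecordHeader("encoding", "application/json".getBytes(StandardCharsets.UTF_8)),
    new RecordHeader("kafkapkc", "pkc-bar".getBytes(StandardCharsets.UTF_8))
);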
use of org.apache.druid.data.input.kafka.KafkaRecordEntity in project druid by druid-io.
the class KafkaStringHeaderFormatTest method testASCIIHeaderFormat.
@Test
public void testASCIIHeaderFormat() {
  Iterable<Header> header = ImmutableList.of(
      new Header() {
        @Override
        public String key() {
          return "encoding";
        }

        @Override
        public byte[] value() {
          return "application/json".getBytes(StandardCharsets.US_ASCII);
        }
      },
      new Header() {
        @Override
        public String key() {
          return "kafkapkc";
        }

        @Override
        public byte[] value() {
          return "pkc-bar".getBytes(StandardCharsets.US_ASCII);
        }
      }
  );
  String headerLabelPrefix = "test.kafka.header.";
  Headers headers = new RecordHeaders(header);
  inputEntity = new KafkaRecordEntity(
      new ConsumerRecord<byte[], byte[]>(
          "sample", 0, 0, timestamp, null, null, 0, 0, null,
          "sampleValue".getBytes(StandardCharsets.UTF_8), headers
      )
  );
  List<Pair<String, Object>> expectedResults = Arrays.asList(
      Pair.of("test.kafka.header.encoding", "application/json"),
      Pair.of("test.kafka.header.kafkapkc", "pkc-bar")
  );
  KafkaHeaderFormat headerInput = new KafkaStringHeaderFormat("US-ASCII");
  KafkaHeaderReader headerParser = headerInput.createReader(inputEntity.getRecord().headers(), headerLabelPrefix);
  List<Pair<String, Object>> rows = headerParser.read();
  Assert.assertEquals(expectedResults, rows);
}
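For reference, the behavior this test asserts amounts to decoding each header value with the configured charset and prefixing its key with the supplied label. A minimal sketch of that transformation, written against the Kafka Header interface as an illustration rather than the actual KafkaStringHeaderFormat implementation:
// Illustrative only: decode each header value as US-ASCII and prefix the key,
// producing the same pairs the test above expects.
List<Pair<String, Object>> decoded = new ArrayList<>();
for (Header h : headers) {
  decoded.add(Pair.of(headerLabelPrefix + h.key(), new String(h.value(), StandardCharsets.US_ASCII)));
}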
use of org.apache.druid.data.input.kafka.KafkaRecordEntity in project druid by druid-io.
the class KafkaStringHeaderFormatTest method testDefaultHeaderFormat.
@Test
public void testDefaultHeaderFormat() {
  String headerLabelPrefix = "test.kafka.header.";
  Headers headers = new RecordHeaders(SAMPLE_HEADERS);
  inputEntity = new KafkaRecordEntity(
      new ConsumerRecord<byte[], byte[]>(
          "sample", 0, 0, timestamp, null, null, 0, 0, null,
          "sampleValue".getBytes(StandardCharsets.UTF_8), headers
      )
  );
  List<Pair<String, Object>> expectedResults = Arrays.asList(
      Pair.of("test.kafka.header.encoding", "application/json"),
      Pair.of("test.kafka.header.kafkapkc", "pkc-bar")
  );
  KafkaHeaderFormat headerInput = new KafkaStringHeaderFormat(null);
  KafkaHeaderReader headerParser = headerInput.createReader(inputEntity.getRecord().headers(), headerLabelPrefix);
  Assert.assertEquals(expectedResults, headerParser.read());
}
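Passing null as the charset exercises the format's default; since the asserted strings match the ASCII-range values carried by SAMPLE_HEADERS, the default is presumably an ASCII-compatible charset. Under that assumption (and only as an illustration, not a documented guarantee), an explicit charset should yield the same result:
// Assumption for illustration: an explicit UTF-8 charset behaves like the null (default) case here.
KafkaHeaderFormat explicitUtf8 = new KafkaStringHeaderFormat("UTF-8");
Assert.assertEquals(expectedResults, explicitUtf8.createReader(inputEntity.getRecord().headers(), headerLabelPrefix).read());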
use of org.apache.druid.data.input.kafka.KafkaRecordEntity in project druid by druid-io.
the class IncrementalPublishingKafkaIndexTaskRunner method possiblyResetOffsetsOrWait.
private void possiblyResetOffsetsOrWait(
    Map<TopicPartition, Long> outOfRangePartitions,
    RecordSupplier<Integer, Long, KafkaRecordEntity> recordSupplier,
    TaskToolbox taskToolbox
) throws InterruptedException, IOException {
  final Map<TopicPartition, Long> resetPartitions = new HashMap<>();
  boolean doReset = false;
  if (task.getTuningConfig().isResetOffsetAutomatically()) {
    for (Map.Entry<TopicPartition, Long> outOfRangePartition : outOfRangePartitions.entrySet()) {
      final TopicPartition topicPartition = outOfRangePartition.getKey();
      final long nextOffset = outOfRangePartition.getValue();
      // seek to the beginning to get the least available offset
      StreamPartition<Integer> streamPartition = StreamPartition.of(topicPartition.topic(), topicPartition.partition());
      final Long leastAvailableOffset = recordSupplier.getEarliestSequenceNumber(streamPartition);
      if (leastAvailableOffset == null) {
        throw new ISE("got null sequence number for partition[%s] when fetching from kafka!", topicPartition.partition());
      }
      // reset the seek
      recordSupplier.seek(streamPartition, nextOffset);
      // reset this partition only if the least available offset is past the
      // next message offset that we are trying to fetch
      if (leastAvailableOffset > nextOffset) {
        doReset = true;
        resetPartitions.put(topicPartition, nextOffset);
      }
    }
  }
  if (doReset) {
    sendResetRequestAndWait(
        CollectionUtils.mapKeys(
            resetPartitions,
            streamPartition -> StreamPartition.of(streamPartition.topic(), streamPartition.partition())
        ),
        taskToolbox
    );
  } else {
    log.warn("Retrying in %dms", task.getPollRetryMs());
    pollRetryLock.lockInterruptibly();
    try {
      long nanos = TimeUnit.MILLISECONDS.toNanos(task.getPollRetryMs());
      while (nanos > 0L && !pauseRequested && !stopRequested.get()) {
        nanos = isAwaitingRetry.awaitNanos(nanos);
      }
    } finally {
      pollRetryLock.unlock();
    }
  }
}
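In the reset branch, CollectionUtils.mapKeys rebuilds the map of partitions to reset, converting each TopicPartition key into a StreamPartition before calling sendResetRequestAndWait. A generic sketch of what such a key-mapping helper does, assuming a signature along the lines of mapKeys(Map<K, V>, Function<K, K2>) rather than Druid's exact implementation:
// Illustrative helper: copy the map while transforming each key; values are carried over unchanged.
static <K, V, K2> Map<K2, V> mapKeys(Map<K, V> map, Function<K, K2> keyMapper) {
  final Map<K2, V> result = new HashMap<>();
  map.forEach((key, value) -> result.put(keyMapper.apply(key), value));
  return result;
}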