
Example 11 with TopicAndPartition

Use of kafka.common.TopicAndPartition in project cdap by caskdata.

From class KafkaUtil, method getOffsetByTimestamp:

/**
   * Fetch the starting offset of the last segment whose latest message is published before the given timestamp.
   * The timestamp can also be special value {@link OffsetRequest$#EarliestTime()}
   * or {@link OffsetRequest$#LatestTime()}.
   *
   * @param consumer the consumer to send request to and receive response from
   * @param topic the topic for fetching the offset from
   * @param partition the partition for fetching the offset from
   * @param timestamp the timestamp to use for fetching last offset before it
   * @return the offset found for the given timestamp
   *
   * @throws NotLeaderForPartitionException if the broker that the consumer is talking to is not the leader
   *                                        for the given topic and partition.
   * @throws UnknownTopicOrPartitionException if the topic or partition is not known by the Kafka server
   * @throws UnknownServerException if the Kafka server responded with an error.
   */
public static long getOffsetByTimestamp(SimpleConsumer consumer, String topic, int partition, long timestamp) throws KafkaException {
    // Fire offset request
    OffsetRequest request = new OffsetRequest(ImmutableMap.of(new TopicAndPartition(topic, partition), new PartitionOffsetRequestInfo(timestamp, 1)), kafka.api.OffsetRequest.CurrentVersion(), consumer.clientId());
    OffsetResponse response = consumer.getOffsetsBefore(request);
    if (response.hasError()) {
        throw Errors.forCode(response.errorCode(topic, partition)).exception();
    }
    // Retrieve offsets from response
    long[] offsets = response.offsets(topic, partition);
    if (offsets.length == 0) {
        if (timestamp != kafka.api.OffsetRequest.EarliestTime()) {
            // An empty response means the given timestamp is earlier than any available message.
            // Hence, use the earliest time to find out the offset
            return getOffsetByTimestamp(consumer, topic, partition, kafka.api.OffsetRequest.EarliestTime());
        }
        // This shouldn't happen: a request for the earliest offset should return at least one offset.
        throw new UnknownServerException("Empty offsets received from offsets request on " + topic + ":" + partition + " from broker " + consumer.host() + ":" + consumer.port());
    }
    LOG.debug("Offset {} fetched for {}:{} with timestamp {}.", offsets[0], topic, partition, timestamp);
    return offsets[0];
}
Also used: OffsetResponse(kafka.javaapi.OffsetResponse), PartitionOffsetRequestInfo(kafka.api.PartitionOffsetRequestInfo), TopicAndPartition(kafka.common.TopicAndPartition), UnknownServerException(org.apache.kafka.common.errors.UnknownServerException), OffsetRequest(kafka.javaapi.OffsetRequest)
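A minimal calling sketch for the method above, assuming a reachable broker at localhost:9092 and a topic named "test" (host, port, topic, and client id are all placeholder assumptions); the constructor arguments mirror the SimpleConsumer usage elsewhere on this page:

// kafka.javaapi.consumer.SimpleConsumer; soTimeout 10s, buffer 64 KB
SimpleConsumer consumer = new SimpleConsumer("localhost", 9092, 10000, 64 * 1024, "offset-lookup");
try {
    // the special constants resolve to the earliest and latest available offsets
    long earliest = KafkaUtil.getOffsetByTimestamp(consumer, "test", 0, kafka.api.OffsetRequest.EarliestTime());
    long latest = KafkaUtil.getOffsetByTimestamp(consumer, "test", 0, kafka.api.OffsetRequest.LatestTime());
    // or a wall-clock timestamp: the starting offset of the last segment published before now
    long beforeNow = KafkaUtil.getOffsetByTimestamp(consumer, "test", 0, System.currentTimeMillis());
} finally {
    consumer.close();
}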

Example 12 with TopicAndPartition

Use of kafka.common.TopicAndPartition in project storm by apache.

From class KafkaOffsetLagUtil, method getLogHeadOffsets:

private static Map<String, Map<Integer, Long>> getLogHeadOffsets(Map<String, List<TopicPartition>> leadersAndTopicPartitions) {
    Map<String, Map<Integer, Long>> result = new HashMap<>();
    if (leadersAndTopicPartitions != null) {
        PartitionOffsetRequestInfo partitionOffsetRequestInfo = new PartitionOffsetRequestInfo(OffsetRequest.LatestTime(), 1);
        SimpleConsumer simpleConsumer = null;
        for (Map.Entry<String, List<TopicPartition>> leader : leadersAndTopicPartitions.entrySet()) {
            try {
                simpleConsumer = new SimpleConsumer(leader.getKey().split(":")[0], Integer.parseInt(leader.getKey().split(":")[1]), 10000, 64 * 1024, "LogHeadOffsetRequest");
                Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo = new HashMap<TopicAndPartition, PartitionOffsetRequestInfo>();
                for (TopicPartition topicPartition : leader.getValue()) {
                    requestInfo.put(new TopicAndPartition(topicPartition.topic(), topicPartition.partition()), partitionOffsetRequestInfo);
                    if (!result.containsKey(topicPartition.topic())) {
                        result.put(topicPartition.topic(), new HashMap<Integer, Long>());
                    }
                }
                kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(requestInfo, kafka.api.OffsetRequest.CurrentVersion(), "LogHeadOffsetRequest");
                OffsetResponse response = simpleConsumer.getOffsetsBefore(request);
                for (TopicPartition topicPartition : leader.getValue()) {
                    result.get(topicPartition.topic()).put(topicPartition.partition(), response.offsets(topicPartition.topic(), topicPartition.partition())[0]);
                }
            } finally {
                if (simpleConsumer != null) {
                    simpleConsumer.close();
                }
            }
        }
    }
    return result;
}
Also used: OffsetResponse(kafka.javaapi.OffsetResponse), HashMap(java.util.HashMap), PartitionOffsetRequestInfo(kafka.api.PartitionOffsetRequestInfo), TopicPartition(org.apache.kafka.common.TopicPartition), ArrayList(java.util.ArrayList), List(java.util.List), TopicAndPartition(kafka.common.TopicAndPartition), Map(java.util.Map), SimpleConsumer(kafka.javaapi.consumer.SimpleConsumer), OffsetRequest(kafka.api.OffsetRequest)
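Since getLogHeadOffsets is private, it is only callable from inside KafkaOffsetLagUtil; a hedged sketch of how its input map is shaped, with the broker address and topic as placeholder assumptions (java.util.Arrays assumed imported):

// key: the leader broker as "host:port"; value: the partitions that broker leads
Map<String, List<TopicPartition>> leaders = new HashMap<>();
leaders.put("broker1:9092", Arrays.asList(
        new TopicPartition("my-topic", 0),
        new TopicPartition("my-topic", 1)));
// result: topic -> (partition -> log head offset, i.e. the latest offset)
Map<String, Map<Integer, Long>> logHeads = getLogHeadOffsets(leaders);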

Example 13 with TopicAndPartition

Use of kafka.common.TopicAndPartition in project flink by apache.

From class SimpleConsumerThread, method requestAndSetSpecificTimeOffsetsFromKafka:

// ------------------------------------------------------------------------
//  Kafka Request Utils
// ------------------------------------------------------------------------
/**
	 * Request offsets before a specific time for a set of partitions, via a Kafka consumer.
	 *
	 * @param consumer The consumer connected to lead broker
	 * @param partitions The list of partitions we need offsets for
	 * @param whichTime The type of time we are requesting. -1 and -2 are special constants (See OffsetRequest)
	 */
private static void requestAndSetSpecificTimeOffsetsFromKafka(SimpleConsumer consumer, List<KafkaTopicPartitionState<TopicAndPartition>> partitions, long whichTime) throws IOException {
    Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo = new HashMap<>();
    for (KafkaTopicPartitionState<TopicAndPartition> part : partitions) {
        requestInfo.put(part.getKafkaPartitionHandle(), new PartitionOffsetRequestInfo(whichTime, 1));
    }
    requestAndSetOffsetsFromKafka(consumer, partitions, requestInfo);
}
Also used: HashMap(java.util.HashMap), PartitionOffsetRequestInfo(kafka.api.PartitionOffsetRequestInfo), TopicAndPartition(kafka.common.TopicAndPartition)
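The whichTime special constants are the same ones used in the other examples on this page; a brief calling sketch (the method is private to SimpleConsumerThread, so the consumer and partition list are assumed to already be in scope there):

// kafka.api.OffsetRequest.EarliestTime() == -2L, LatestTime() == -1L
requestAndSetSpecificTimeOffsetsFromKafka(consumer, partitions, kafka.api.OffsetRequest.EarliestTime());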

Example 14 with TopicAndPartition

Use of kafka.common.TopicAndPartition in project flink by apache.

From class Kafka08Fetcher, method runFetchLoop:

// ------------------------------------------------------------------------
//  Main Work Loop
// ------------------------------------------------------------------------
@Override
public void runFetchLoop() throws Exception {
    // the map from broker to the thread that is connected to that broker
    final Map<Node, SimpleConsumerThread<T>> brokerToThread = new HashMap<>();
    // this holds the possible exceptions from the concurrent broker connection threads
    final ExceptionProxy errorHandler = new ExceptionProxy(Thread.currentThread());
    // the offset handler handles the communication with ZooKeeper, to commit externally visible offsets
    final ZookeeperOffsetHandler zookeeperOffsetHandler = new ZookeeperOffsetHandler(kafkaConfig);
    this.zookeeperOffsetHandler = zookeeperOffsetHandler;
    PeriodicOffsetCommitter periodicCommitter = null;
    try {
        // some partition states may still hold sentinel values instead of actual offset
        // values; replace those with actual offsets, according to what the sentinel values represent.
        for (KafkaTopicPartitionState<TopicAndPartition> partition : subscribedPartitionStates()) {
            if (partition.getOffset() == KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET) {
                // this will be replaced by an actual offset in SimpleConsumerThread
                partition.setOffset(OffsetRequest.EarliestTime());
            } else if (partition.getOffset() == KafkaTopicPartitionStateSentinel.LATEST_OFFSET) {
                // this will be replaced by an actual offset in SimpleConsumerThread
                partition.setOffset(OffsetRequest.LatestTime());
            } else if (partition.getOffset() == KafkaTopicPartitionStateSentinel.GROUP_OFFSET) {
                Long committedOffset = zookeeperOffsetHandler.getCommittedOffset(partition.getKafkaTopicPartition());
                if (committedOffset != null) {
                    // the committed offset in ZK represents the next record to process,
                    // so we subtract 1 to correctly represent our internal state
                    partition.setOffset(committedOffset - 1);
                } else {
                    // if we can't find an offset for a partition in ZK when using GROUP_OFFSETS,
                    // we default to "auto.offset.reset" like the Kafka high-level consumer
                    LOG.warn("No group offset can be found for partition {} in Zookeeper;" + " resetting starting offset to 'auto.offset.reset'", partition);
                    partition.setOffset(invalidOffsetBehavior);
                }
            } else {
                // the partition already has a specific start offset and is ready to be consumed
            }
        }
        // start the periodic offset committer thread, if necessary
        if (autoCommitInterval > 0) {
            LOG.info("Starting periodic offset committer, with commit interval of {}ms", autoCommitInterval);
            periodicCommitter = new PeriodicOffsetCommitter(zookeeperOffsetHandler, subscribedPartitionStates(), errorHandler, autoCommitInterval);
            periodicCommitter.setName("Periodic Kafka partition offset committer");
            periodicCommitter.setDaemon(true);
            periodicCommitter.start();
        }
        // register offset metrics
        if (useMetrics) {
            final MetricGroup kafkaMetricGroup = runtimeContext.getMetricGroup().addGroup("KafkaConsumer");
            addOffsetStateGauge(kafkaMetricGroup);
        }
        // Main loop polling elements from the unassignedPartitions queue to the threads
        while (running) {
            // re-throw any exception from the concurrent fetcher threads
            errorHandler.checkAndThrowException();
            // wait for max 5 seconds trying to get partitions to assign
            // if threads shut down, this poll returns earlier, because the threads inject the
            // special marker into the queue
            List<KafkaTopicPartitionState<TopicAndPartition>> partitionsToAssign = unassignedPartitionsQueue.getBatchBlocking(5000);
            partitionsToAssign.remove(MARKER);
            if (!partitionsToAssign.isEmpty()) {
                LOG.info("Assigning {} partitions to broker threads", partitionsToAssign.size());
                Map<Node, List<KafkaTopicPartitionState<TopicAndPartition>>> partitionsWithLeaders = findLeaderForPartitions(partitionsToAssign, kafkaConfig);
                // assign the partitions to the leaders (maybe start the threads)
                for (Map.Entry<Node, List<KafkaTopicPartitionState<TopicAndPartition>>> partitionsWithLeader : partitionsWithLeaders.entrySet()) {
                    final Node leader = partitionsWithLeader.getKey();
                    final List<KafkaTopicPartitionState<TopicAndPartition>> partitions = partitionsWithLeader.getValue();
                    SimpleConsumerThread<T> brokerThread = brokerToThread.get(leader);
                    if (!running) {
                        break;
                    }
                    if (brokerThread == null || !brokerThread.getNewPartitionsQueue().isOpen()) {
                        // start new thread
                        brokerThread = createAndStartSimpleConsumerThread(partitions, leader, errorHandler);
                        brokerToThread.put(leader, brokerThread);
                    } else {
                        // put elements into queue of thread
                        ClosableBlockingQueue<KafkaTopicPartitionState<TopicAndPartition>> newPartitionsQueue = brokerThread.getNewPartitionsQueue();
                        for (KafkaTopicPartitionState<TopicAndPartition> fp : partitions) {
                            if (!newPartitionsQueue.addIfOpen(fp)) {
                                // we were unable to add the partition to the broker's queue
                                // the broker has closed in the meantime (the thread will shut down)
                                // create a new thread for connecting to this broker
                                List<KafkaTopicPartitionState<TopicAndPartition>> seedPartitions = new ArrayList<>();
                                seedPartitions.add(fp);
                                brokerThread = createAndStartSimpleConsumerThread(seedPartitions, leader, errorHandler);
                                brokerToThread.put(leader, brokerThread);
                                // update queue for the subsequent partitions
                                newPartitionsQueue = brokerThread.getNewPartitionsQueue();
                            }
                        }
                    }
                }
            } else {
                // there were no partitions to assign. Check if any broker threads shut down.
                // we get into this section of the code, if either the poll timed out, or the
                // blocking poll was woken up by the marker element
                Iterator<SimpleConsumerThread<T>> bttIterator = brokerToThread.values().iterator();
                while (bttIterator.hasNext()) {
                    SimpleConsumerThread<T> thread = bttIterator.next();
                    if (!thread.getNewPartitionsQueue().isOpen()) {
                        LOG.info("Removing stopped consumer thread {}", thread.getName());
                        bttIterator.remove();
                    }
                }
            }
            if (brokerToThread.size() == 0 && unassignedPartitionsQueue.isEmpty()) {
                if (unassignedPartitionsQueue.close()) {
                    LOG.info("All consumer threads are finished, there are no more unassigned partitions. Stopping fetcher");
                    break;
                }
            // we end up here if somebody added something to the queue in the meantime --> continue to poll queue again
            }
        }
    } catch (InterruptedException e) {
        // this may be thrown because an exception in one of the concurrent fetcher threads
        // woke this thread up. Make sure we throw the root exception instead in that case
        errorHandler.checkAndThrowException();
        // no other root exception, throw the interrupted exception
        throw e;
    } finally {
        this.running = false;
        this.zookeeperOffsetHandler = null;
        // if we run a periodic committer thread, shut that down
        if (periodicCommitter != null) {
            periodicCommitter.shutdown();
        }
        // clear the interruption flag
        // this allows the joining on consumer threads (on a best-effort basis) to happen even
        // in case the initial interrupt has already occurred
        Thread.interrupted();
        // make sure that in any case (completion, abort, error), all spawned threads are stopped
        try {
            int runningThreads;
            do {
                // check whether threads are alive and cancel them
                runningThreads = 0;
                Iterator<SimpleConsumerThread<T>> threads = brokerToThread.values().iterator();
                while (threads.hasNext()) {
                    SimpleConsumerThread<?> t = threads.next();
                    if (t.isAlive()) {
                        t.cancel();
                        runningThreads++;
                    } else {
                        threads.remove();
                    }
                }
                // wait for the threads to finish, before issuing a cancel call again
                if (runningThreads > 0) {
                    for (SimpleConsumerThread<?> t : brokerToThread.values()) {
                        t.join(500 / runningThreads + 1);
                    }
                }
            } while (runningThreads > 0);
        } catch (InterruptedException ignored) {
            // waiting for the thread shutdown apparently got interrupted
            // restore interrupted state and continue
            Thread.currentThread().interrupt();
        } catch (Throwable t) {
            // we catch all here to preserve the original exception
            LOG.error("Exception while shutting down consumer threads", t);
        }
        try {
            zookeeperOffsetHandler.close();
        } catch (Throwable t) {
            // we catch all here to preserve the original exception
            LOG.error("Exception while shutting down ZookeeperOffsetHandler", t);
        }
    }
}
Also used: HashMap(java.util.HashMap), Node(org.apache.kafka.common.Node), MetricGroup(org.apache.flink.metrics.MetricGroup), ArrayList(java.util.ArrayList), TopicAndPartition(kafka.common.TopicAndPartition), List(java.util.List), Map(java.util.Map)
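The off-by-one handling in the GROUP_OFFSET branch follows Kafka's ZooKeeper convention; a short illustration of the arithmetic, with made-up values:

// ZK stores the offset of the NEXT record to process; Flink's internal state
// stores the offset of the LAST record successfully emitted
long committedInZk = 42L;               // next record to read, per the ZK convention
long internalState = committedInZk - 1; // last emitted record: 41
// when committing, the inverse applies: internalState + 1 is written back to ZK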

Example 15 with TopicAndPartition

Use of kafka.common.TopicAndPartition in project flink by apache.

From class SimpleConsumerThread, method requestAndSetOffsetsFromKafka:

/**
	 * Request offsets from Kafka with a specified set of partitions' offset request information.
	 * The returned offsets are used to set the internal partition states.
	 *
	 * <p>This method retries three times if the response has an error.
	 *
	 * @param consumer The consumer connected to lead broker
	 * @param partitionStates the partition states, will be set with offsets fetched from Kafka request
	 * @param partitionToRequestInfo map of each partition to its offset request info
	 */
private static void requestAndSetOffsetsFromKafka(SimpleConsumer consumer, List<KafkaTopicPartitionState<TopicAndPartition>> partitionStates, Map<TopicAndPartition, PartitionOffsetRequestInfo> partitionToRequestInfo) throws IOException {
    int retries = 0;
    OffsetResponse response;
    while (true) {
        kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(partitionToRequestInfo, kafka.api.OffsetRequest.CurrentVersion(), consumer.clientId());
        response = consumer.getOffsetsBefore(request);
        if (response.hasError()) {
            StringBuilder exception = new StringBuilder();
            for (KafkaTopicPartitionState<TopicAndPartition> part : partitionStates) {
                short code;
                if ((code = response.errorCode(part.getTopic(), part.getPartition())) != ErrorMapping.NoError()) {
                    exception.append("\nException for topic=").append(part.getTopic()).append(" partition=").append(part.getPartition()).append(": ").append(StringUtils.stringifyException(ErrorMapping.exceptionFor(code)));
                }
            }
            if (++retries >= 3) {
                throw new IOException("Unable to get last offset for partitions " + partitionStates + ": " + exception.toString());
            } else {
                LOG.warn("Unable to get last offset for partitions: Exception(s): {}", exception);
            }
        } else {
            // leave retry loop
            break;
        }
    }
    for (KafkaTopicPartitionState<TopicAndPartition> part : partitionStates) {
        // there will be offsets only for the partitions that were requested
        if (partitionToRequestInfo.containsKey(part.getKafkaPartitionHandle())) {
            final long offset = response.offsets(part.getTopic(), part.getPartition())[0];
            // the offset returned is that of the next record to fetch. because our state reflects the latest
            // successfully emitted record, we subtract one
            part.setOffset(offset - 1);
        }
    }
}
Also used: OffsetResponse(kafka.javaapi.OffsetResponse), TopicAndPartition(kafka.common.TopicAndPartition), IOException(java.io.IOException), OffsetRequest(kafka.api.OffsetRequest)
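The error handling above uses the legacy kafka.common.ErrorMapping utility; a compact sketch of decoding a single partition's error code the same way (topic name and partition number are placeholders):

short code = response.errorCode("my-topic", 0);
if (code != kafka.common.ErrorMapping.NoError()) {
    // map the wire-protocol error code back to the corresponding Kafka exception
    Throwable cause = kafka.common.ErrorMapping.exceptionFor(code);
    LOG.warn("Offset request for my-topic:0 failed", cause);
}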

Aggregations

TopicAndPartition (kafka.common.TopicAndPartition): 18
PartitionOffsetRequestInfo (kafka.api.PartitionOffsetRequestInfo): 13
HashMap (java.util.HashMap): 11
OffsetRequest (kafka.javaapi.OffsetRequest): 9
OffsetResponse (kafka.javaapi.OffsetResponse): 9
IOException (java.io.IOException): 4
ArrayList (java.util.ArrayList): 4
SimpleConsumer (kafka.javaapi.consumer.SimpleConsumer): 4
List (java.util.List): 3
Map (java.util.Map): 3
OffsetRequest (kafka.api.OffsetRequest): 2
Node (org.apache.kafka.common.Node): 2
PrestoException (com.facebook.presto.spi.PrestoException): 1
ImmutableMap (com.google.common.collect.ImmutableMap): 1
SyncFailedException (java.io.SyncFailedException): 1
ByteBuffer (java.nio.ByteBuffer): 1
ClosedByInterruptException (java.nio.channels.ClosedByInterruptException): 1
ClosedChannelException (java.nio.channels.ClosedChannelException): 1
AccessDeniedException (java.nio.file.AccessDeniedException): 1
SortedMap (java.util.SortedMap): 1