Example 1 with StableLimitingSorter

Use of org.apache.druid.collections.StableLimitingSorter in the druid project by druid-io.

From the class ScanQueryRunnerFactory, method stableLimitingSort.

/**
 * Returns a sorted and limited copy of the provided {@code inputSequence}. Materializes the full result
 * in memory before returning it. Memory use is bounded by the row limit of the {@code scanQuery}.
 */
@VisibleForTesting
Sequence<ScanResultValue> stableLimitingSort(Sequence<ScanResultValue> inputSequence, ScanQuery scanQuery, List<Interval> intervalsOrdered) throws IOException {
    Comparator<ScanResultValue> comparator = scanQuery.getResultOrdering();
    if (scanQuery.getScanRowsLimit() > Integer.MAX_VALUE) {
        throw new UOE("Limit of %,d rows not supported for priority queue strategy of time-ordering scan results", scanQuery.getScanRowsLimit());
    }
    // Converting the limit from long to int could theoretically throw an ArithmeticException but this branch
    // only runs if limit < MAX_LIMIT_FOR_IN_MEMORY_TIME_ORDERING (which should be < Integer.MAX_VALUE)
    int limit = Math.toIntExact(scanQuery.getScanRowsLimit());
    final StableLimitingSorter<ScanResultValue> sorter = new StableLimitingSorter<>(comparator, limit);
    Yielder<ScanResultValue> yielder = Yielders.each(inputSequence);
    try {
        boolean doneScanning = yielder.isDone();
        // We need to scan limit elements and anything else in the last segment
        int numRowsScanned = 0;
        Interval finalInterval = null;
        while (!doneScanning) {
            ScanResultValue next = yielder.get();
            List<ScanResultValue> singleEventScanResultValues = next.toSingleEventScanResultValues();
            for (ScanResultValue srv : singleEventScanResultValues) {
                numRowsScanned++;
                // Using an intermediate unbatched ScanResultValue is not that great memory-wise, but the column list
                // needs to be preserved for queries using the compactedList result format
                sorter.add(srv);
                // Finish scanning the interval containing the limit row
                if (numRowsScanned > limit && finalInterval == null) {
                    long timestampOfLimitRow = srv.getFirstEventTimestamp(scanQuery.getResultFormat());
                    for (Interval interval : intervalsOrdered) {
                        if (interval.contains(timestampOfLimitRow)) {
                            finalInterval = interval;
                        }
                    }
                    if (finalInterval == null) {
                        throw new ISE("Row came from an unscanned interval");
                    }
                }
            }
            yielder = yielder.next(null);
            doneScanning = yielder.isDone() || (finalInterval != null && !finalInterval.contains(next.getFirstEventTimestamp(scanQuery.getResultFormat())));
        }
        final List<ScanResultValue> sortedElements = new ArrayList<>(sorter.size());
        Iterators.addAll(sortedElements, sorter.drain());
        return Sequences.simple(sortedElements);
    } finally {
        yielder.close();
    }
}
Also used: StableLimitingSorter (org.apache.druid.collections.StableLimitingSorter), ArrayList (java.util.ArrayList), UOE (org.apache.druid.java.util.common.UOE), ISE (org.apache.druid.java.util.common.ISE), Interval (org.joda.time.Interval), VisibleForTesting (com.google.common.annotations.VisibleForTesting)
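
The method above relies on a sorter that keeps at most `limit` elements and preserves arrival order among elements that compare equal. The following is a minimal plain-JDK sketch of that idea, not Druid's implementation: the class `StableLimitSketch` and its method `drainSorted` are hypothetical names introduced here for illustration, and the heap-plus-sequence-number approach is an assumption about one reasonable way to get a stable, bounded sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of a stable, limit-bounded sorter; not Druid's StableLimitingSorter. */
final class StableLimitSketch<T> {
    /** Pairs a value with its arrival number so that ties can be broken by insertion order. */
    private static final class Numbered<T> {
        final T value;
        final long seq;

        Numbered(T value, long seq) {
            this.value = value;
            this.seq = seq;
        }
    }

    private final Comparator<Numbered<T>> order;
    private final PriorityQueue<Numbered<T>> heap;
    private final int limit;
    private long counter = 0;

    StableLimitSketch(Comparator<? super T> comparator, int limit) {
        this.limit = limit;
        // Compare by the caller's comparator first, then by arrival order for stability.
        this.order = (a, b) -> {
            int c = comparator.compare(a.value, b.value);
            return c != 0 ? c : Long.compare(a.seq, b.seq);
        };
        // Reversed order makes the heap head the worst element currently retained.
        this.heap = new PriorityQueue<>(order.reversed());
    }

    void add(T value) {
        heap.add(new Numbered<>(value, counter++));
        if (heap.size() > limit) {
            heap.poll(); // evict the largest element (the latest arrival among ties)
        }
    }

    /** Empties the sorter and returns the retained elements in sorted, stable order. */
    List<T> drainSorted() {
        List<Numbered<T>> items = new ArrayList<>(heap);
        heap.clear();
        items.sort(order);
        List<T> out = new ArrayList<>(items.size());
        for (Numbered<T> n : items) {
            out.add(n.value);
        }
        return out;
    }

    public static void main(String[] args) {
        StableLimitSketch<String> sorter =
            new StableLimitSketch<>(Comparator.comparingInt(String::length), 3);
        for (String s : Arrays.asList("ccc", "a", "bb", "b", "dd", "c")) {
            sorter.add(s);
        }
        System.out.println(sorter.drainSorted()); // prints [a, b, c]
    }
}
```

Because the heap never holds more than `limit + 1` elements, memory stays proportional to the limit regardless of how many rows are fed in, which matches the bound described in the method's Javadoc.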
