
Example 66 with Filter

Use of org.apache.druid.query.filter.Filter in project druid by druid-io.

The class FilterPartitionBenchmark, method readWithExFnPostFilter:

@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public void readWithExFnPostFilter(Blackhole blackhole) {
    // NoBitmapSelectorDimFilter forces the filter to run as a post-filter
    // (a row-by-row value matcher) rather than as a bitmap-index pre-filter.
    Filter filter = new NoBitmapSelectorDimFilter("dimSequential", "super-199", JS_EXTRACTION_FN).toFilter();
    StorageAdapter sa = new QueryableIndexStorageAdapter(qIndex);
    Sequence<Cursor> cursors = makeCursors(sa, filter);
    readCursors(cursors, blackhole);
}
Also used: DimensionPredicateFilter (org.apache.druid.segment.filter.DimensionPredicateFilter), SelectorDimFilter (org.apache.druid.query.filter.SelectorDimFilter), OrFilter (org.apache.druid.segment.filter.OrFilter), AndDimFilter (org.apache.druid.query.filter.AndDimFilter), SelectorFilter (org.apache.druid.segment.filter.SelectorFilter), DimFilter (org.apache.druid.query.filter.DimFilter), BoundFilter (org.apache.druid.segment.filter.BoundFilter), BoundDimFilter (org.apache.druid.query.filter.BoundDimFilter), AndFilter (org.apache.druid.segment.filter.AndFilter), OrDimFilter (org.apache.druid.query.filter.OrDimFilter), Filter (org.apache.druid.query.filter.Filter), StorageAdapter (org.apache.druid.segment.StorageAdapter), QueryableIndexStorageAdapter (org.apache.druid.segment.QueryableIndexStorageAdapter), Cursor (org.apache.druid.segment.Cursor), BenchmarkMode (org.openjdk.jmh.annotations.BenchmarkMode), Benchmark (org.openjdk.jmh.annotations.Benchmark), OutputTimeUnit (org.openjdk.jmh.annotations.OutputTimeUnit)
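The makeCursors and readCursors helpers are defined elsewhere in FilterPartitionBenchmark and are not shown on this page. A minimal sketch of what makeCursors plausibly looks like, assuming it delegates to StorageAdapter.makeCursors over the full segment interval with benchmark defaults; the exact arguments are assumptions, and the call shape mirrors the adapter.makeCursors invocation in the ScanQueryEngine example below:

private Sequence<Cursor> makeCursors(StorageAdapter sa, Filter filter) {
    // Hypothetical sketch: one cursor over the whole segment, no virtual
    // columns, ascending time order, no query metrics.
    return sa.makeCursors(
            filter,               // the pre/post filter under test
            sa.getInterval(),     // scan the entire segment (assumed)
            VirtualColumns.EMPTY,
            Granularities.ALL,
            false,                // not descending
            null                  // no QueryMetrics
    );
}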

Example 67 with Filter

Use of org.apache.druid.query.filter.Filter in project druid by druid-io.

The class FilterPartitionBenchmark, method readWithExFnPreFilter:

@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public void readWithExFnPreFilter(Blackhole blackhole) {
    // A standard SelectorDimFilter with an extraction function: still eligible
    // for pre-filtering, with the function evaluated against the dimension's
    // value dictionary to select matching bitmaps.
    Filter filter = new SelectorDimFilter("dimSequential", "super-199", JS_EXTRACTION_FN).toFilter();
    StorageAdapter sa = new QueryableIndexStorageAdapter(qIndex);
    Sequence<Cursor> cursors = makeCursors(sa, filter);
    readCursors(cursors, blackhole);
}
Also used: DimensionPredicateFilter (org.apache.druid.segment.filter.DimensionPredicateFilter), SelectorDimFilter (org.apache.druid.query.filter.SelectorDimFilter), OrFilter (org.apache.druid.segment.filter.OrFilter), AndDimFilter (org.apache.druid.query.filter.AndDimFilter), SelectorFilter (org.apache.druid.segment.filter.SelectorFilter), DimFilter (org.apache.druid.query.filter.DimFilter), BoundFilter (org.apache.druid.segment.filter.BoundFilter), BoundDimFilter (org.apache.druid.query.filter.BoundDimFilter), AndFilter (org.apache.druid.segment.filter.AndFilter), OrDimFilter (org.apache.druid.query.filter.OrDimFilter), Filter (org.apache.druid.query.filter.Filter), StorageAdapter (org.apache.druid.segment.StorageAdapter), QueryableIndexStorageAdapter (org.apache.druid.segment.QueryableIndexStorageAdapter), Cursor (org.apache.druid.segment.Cursor), BenchmarkMode (org.openjdk.jmh.annotations.BenchmarkMode), Benchmark (org.openjdk.jmh.annotations.Benchmark), OutputTimeUnit (org.openjdk.jmh.annotations.OutputTimeUnit)
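Examples 66 through 68 form a three-way comparison over the same column: a forced post-filter, an extraction-function pre-filter, and (below) a plain pre-filter. The readCursors helper they share is likewise benchmark-local; a plausible sketch, assuming it drains each cursor into the JMH Blackhole so the JIT cannot eliminate the read (the column name and selector choice are assumptions):

private void readCursors(Sequence<Cursor> cursors, Blackhole blackhole) {
    // Hypothetical sketch: read one column from every row of every cursor and
    // feed it to the Blackhole to defeat dead-code elimination.
    cursors.accumulate(null, (Object accumulated, Cursor cursor) -> {
        ColumnValueSelector<?> selector =
                cursor.getColumnSelectorFactory().makeColumnValueSelector("dimSequential");
        while (!cursor.isDone()) {
            blackhole.consume(selector.getObject());
            cursor.advance();
        }
        return null;
    });
}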

Example 68 with Filter

Use of org.apache.druid.query.filter.Filter in project druid by druid-io.

The class FilterPartitionBenchmark, method readWithPreFilter:

@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public void readWithPreFilter(Blackhole blackhole) {
    // A plain SelectorFilter with no extraction function: the simplest case,
    // a direct bitmap-index lookup for the literal value "199".
    Filter filter = new SelectorFilter("dimSequential", "199");
    StorageAdapter sa = new QueryableIndexStorageAdapter(qIndex);
    Sequence<Cursor> cursors = makeCursors(sa, filter);
    readCursors(cursors, blackhole);
}
Also used: SelectorFilter (org.apache.druid.segment.filter.SelectorFilter), DimensionPredicateFilter (org.apache.druid.segment.filter.DimensionPredicateFilter), SelectorDimFilter (org.apache.druid.query.filter.SelectorDimFilter), OrFilter (org.apache.druid.segment.filter.OrFilter), AndDimFilter (org.apache.druid.query.filter.AndDimFilter), DimFilter (org.apache.druid.query.filter.DimFilter), BoundFilter (org.apache.druid.segment.filter.BoundFilter), BoundDimFilter (org.apache.druid.query.filter.BoundDimFilter), AndFilter (org.apache.druid.segment.filter.AndFilter), OrDimFilter (org.apache.druid.query.filter.OrDimFilter), Filter (org.apache.druid.query.filter.Filter), StorageAdapter (org.apache.druid.segment.StorageAdapter), QueryableIndexStorageAdapter (org.apache.druid.segment.QueryableIndexStorageAdapter), Cursor (org.apache.druid.segment.Cursor), BenchmarkMode (org.openjdk.jmh.annotations.BenchmarkMode), Benchmark (org.openjdk.jmh.annotations.Benchmark), OutputTimeUnit (org.openjdk.jmh.annotations.OutputTimeUnit)

Example 69 with Filter

Use of org.apache.druid.query.filter.Filter in project druid by druid-io.

The class ScanQueryEngine, method process:

public Sequence<ScanResultValue> process(final ScanQuery query, final Segment segment, final ResponseContext responseContext) {
    // "legacy" should be non-null due to toolChest.mergeResults
    final boolean legacy = Preconditions.checkNotNull(query.isLegacy(), "Expected non-null 'legacy' parameter");
    final Long numScannedRows = responseContext.getRowScanCount();
    if (numScannedRows != null && numScannedRows >= query.getScanRowsLimit() && query.getTimeOrder().equals(ScanQuery.Order.NONE)) {
        return Sequences.empty();
    }
    final boolean hasTimeout = QueryContexts.hasTimeout(query);
    final Long timeoutAt = responseContext.getTimeoutTime();
    final long start = System.currentTimeMillis();
    final StorageAdapter adapter = segment.asStorageAdapter();
    if (adapter == null) {
        throw new ISE("Null storage adapter found. Probably trying to issue a query against a segment being memory unmapped.");
    }
    final List<String> allColumns = new ArrayList<>();
    if (query.getColumns() != null && !query.getColumns().isEmpty()) {
        if (legacy && !query.getColumns().contains(LEGACY_TIMESTAMP_KEY)) {
            allColumns.add(LEGACY_TIMESTAMP_KEY);
        }
        // Unless we're in legacy mode, allColumns equals query.getColumns() exactly. This is nice since it makes
        // the compactedList form easier to use.
        allColumns.addAll(query.getColumns());
    } else {
        final Set<String> availableColumns = Sets.newLinkedHashSet(
                Iterables.concat(
                        Collections.singleton(legacy ? LEGACY_TIMESTAMP_KEY : ColumnHolder.TIME_COLUMN_NAME),
                        Iterables.transform(
                                Arrays.asList(query.getVirtualColumns().getVirtualColumns()),
                                VirtualColumn::getOutputName),
                        adapter.getAvailableDimensions(),
                        adapter.getAvailableMetrics()));
        allColumns.addAll(availableColumns);
        if (legacy) {
            allColumns.remove(ColumnHolder.TIME_COLUMN_NAME);
        }
    }
    final List<Interval> intervals = query.getQuerySegmentSpec().getIntervals();
    Preconditions.checkArgument(intervals.size() == 1, "Can only handle a single interval, got[%s]", intervals);
    final SegmentId segmentId = segment.getId();
    final Filter filter = Filters.convertToCNFFromQueryContext(query, Filters.toFilter(query.getFilter()));
    // Initialize the row scan count in the response context to 0 if it is not already set.
    responseContext.addRowScanCount(0);
    final long limit = calculateRemainingScanRowsLimit(query, responseContext);
    return Sequences.concat(adapter.makeCursors(
            filter,
            intervals.get(0),
            query.getVirtualColumns(),
            Granularities.ALL,
            query.getTimeOrder().equals(ScanQuery.Order.DESCENDING)
                    || (query.getTimeOrder().equals(ScanQuery.Order.NONE) && query.isDescending()),
            null
    ).map(cursor -> new BaseSequence<>(new BaseSequence.IteratorMaker<ScanResultValue, Iterator<ScanResultValue>>() {

        @Override
        public Iterator<ScanResultValue> make() {
            final List<BaseObjectColumnValueSelector> columnSelectors = new ArrayList<>(allColumns.size());
            for (String column : allColumns) {
                final BaseObjectColumnValueSelector selector;
                if (legacy && LEGACY_TIMESTAMP_KEY.equals(column)) {
                    selector = cursor.getColumnSelectorFactory().makeColumnValueSelector(ColumnHolder.TIME_COLUMN_NAME);
                } else {
                    selector = cursor.getColumnSelectorFactory().makeColumnValueSelector(column);
                }
                columnSelectors.add(selector);
            }
            final int batchSize = query.getBatchSize();
            return new Iterator<ScanResultValue>() {

                private long offset = 0;

                @Override
                public boolean hasNext() {
                    return !cursor.isDone() && offset < limit;
                }

                @Override
                public ScanResultValue next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    if (hasTimeout && System.currentTimeMillis() >= timeoutAt) {
                        throw new QueryTimeoutException(StringUtils.nonStrictFormat("Query [%s] timed out", query.getId()));
                    }
                    final long lastOffset = offset;
                    final Object events;
                    final ScanQuery.ResultFormat resultFormat = query.getResultFormat();
                    if (ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST.equals(resultFormat)) {
                        events = rowsToCompactedList();
                    } else if (ScanQuery.ResultFormat.RESULT_FORMAT_LIST.equals(resultFormat)) {
                        events = rowsToList();
                    } else {
                        throw new UOE("resultFormat[%s] is not supported", resultFormat.toString());
                    }
                    responseContext.addRowScanCount(offset - lastOffset);
                    if (hasTimeout) {
                        responseContext.putTimeoutTime(timeoutAt - (System.currentTimeMillis() - start));
                    }
                    return new ScanResultValue(segmentId.toString(), allColumns, events);
                }

                @Override
                public void remove() {
                    throw new UnsupportedOperationException();
                }

                private List<List<Object>> rowsToCompactedList() {
                    final List<List<Object>> events = new ArrayList<>(batchSize);
                    final long iterLimit = Math.min(limit, offset + batchSize);
                    for (; !cursor.isDone() && offset < iterLimit; cursor.advance(), offset++) {
                        final List<Object> theEvent = new ArrayList<>(allColumns.size());
                        for (int j = 0; j < allColumns.size(); j++) {
                            theEvent.add(getColumnValue(j));
                        }
                        events.add(theEvent);
                    }
                    return events;
                }

                private List<Map<String, Object>> rowsToList() {
                    List<Map<String, Object>> events = Lists.newArrayListWithCapacity(batchSize);
                    final long iterLimit = Math.min(limit, offset + batchSize);
                    for (; !cursor.isDone() && offset < iterLimit; cursor.advance(), offset++) {
                        final Map<String, Object> theEvent = new LinkedHashMap<>();
                        for (int j = 0; j < allColumns.size(); j++) {
                            theEvent.put(allColumns.get(j), getColumnValue(j));
                        }
                        events.add(theEvent);
                    }
                    return events;
                }

                private Object getColumnValue(int i) {
                    final BaseObjectColumnValueSelector selector = columnSelectors.get(i);
                    final Object value;
                    if (legacy && allColumns.get(i).equals(LEGACY_TIMESTAMP_KEY)) {
                        value = DateTimes.utc((long) selector.getObject());
                    } else {
                        value = selector == null ? null : selector.getObject();
                    }
                    return value;
                }
            };
        }

        @Override
        public void cleanup(Iterator<ScanResultValue> iterFromMake) {
        }
    })));
}
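calculateRemainingScanRowsLimit is a private helper not shown on this page. A plausible sketch, consistent with the early-exit check at the top of process: for unordered scans the running row count in the response context is subtracted from the query's overall row limit, while ordered scans keep the full limit because it is enforced when results are merged.

// Sketch only: the real helper may differ in detail.
private long calculateRemainingScanRowsLimit(ScanQuery query, ResponseContext responseContext) {
    if (query.getTimeOrder().equals(ScanQuery.Order.NONE)) {
        // Unordered scans share one running count across segments.
        return query.getScanRowsLimit() - responseContext.getRowScanCount();
    }
    return query.getScanRowsLimit();
}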
Also used: Iterables (com.google.common.collect.Iterables), Arrays (java.util.Arrays), StorageAdapter (org.apache.druid.segment.StorageAdapter), ArrayList (java.util.ArrayList), LinkedHashMap (java.util.LinkedHashMap), Interval (org.joda.time.Interval), Lists (com.google.common.collect.Lists), ColumnHolder (org.apache.druid.segment.column.ColumnHolder), Map (java.util.Map), UOE (org.apache.druid.java.util.common.UOE), NoSuchElementException (java.util.NoSuchElementException), BaseObjectColumnValueSelector (org.apache.druid.segment.BaseObjectColumnValueSelector), Sequences (org.apache.druid.java.util.common.guava.Sequences), Segment (org.apache.druid.segment.Segment), DateTimes (org.apache.druid.java.util.common.DateTimes), Sequence (org.apache.druid.java.util.common.guava.Sequence), Iterator (java.util.Iterator), ResponseContext (org.apache.druid.query.context.ResponseContext), VirtualColumn (org.apache.druid.segment.VirtualColumn), StringUtils (org.apache.druid.java.util.common.StringUtils), Set (java.util.Set), ISE (org.apache.druid.java.util.common.ISE), Sets (com.google.common.collect.Sets), QueryContexts (org.apache.druid.query.QueryContexts), Granularities (org.apache.druid.java.util.common.granularity.Granularities), List (java.util.List), QueryTimeoutException (org.apache.druid.query.QueryTimeoutException), Preconditions (com.google.common.base.Preconditions), BaseSequence (org.apache.druid.java.util.common.guava.BaseSequence), SegmentId (org.apache.druid.timeline.SegmentId), Filters (org.apache.druid.segment.filter.Filters), Collections (java.util.Collections), Filter (org.apache.druid.query.filter.Filter)
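For context, a hypothetical invocation of this engine could look as follows. The Druids.newScanQueryBuilder calls are standard builder methods, but the data source, interval, columns, and the segment variable are made up for illustration; legacy(false) satisfies the checkNotNull at the top of process.

// Hypothetical usage sketch; "wikipedia", the interval, and the columns are invented.
ScanQuery query = Druids.newScanQueryBuilder()
        .dataSource("wikipedia")
        .intervals(new MultipleIntervalSegmentSpec(ImmutableList.of(Intervals.of("2015-09-12/2015-09-13"))))
        .columns("page", "added")
        .limit(100)
        .resultFormat(ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST)
        .legacy(false)
        .build();

// "segment" is assumed to be an org.apache.druid.segment.Segment already at hand.
Sequence<ScanResultValue> results =
        new ScanQueryEngine().process(query, segment, ResponseContext.createEmpty());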

Example 70 with Filter

Use of org.apache.druid.query.filter.Filter in project druid by druid-io.

The class JoinFilterAnalyzerTest, method test_filterPushDown_baseTableFilter:

@Test
public void test_filterPushDown_baseTableFilter() {
    Filter originalFilter = new SelectorFilter("channel", "#en.wikipedia");
    Filter baseTableFilter = new SelectorFilter("countryIsoCode", "CA");
    List<JoinableClause> joinableClauses = ImmutableList.of(factToRegion(JoinType.LEFT), regionToCountry(JoinType.LEFT));
    JoinFilterPreAnalysis joinFilterPreAnalysis = makeDefaultConfigPreAnalysis(originalFilter, joinableClauses, VirtualColumns.EMPTY);
    JoinFilterSplit expectedFilterSplit = new JoinFilterSplit(Filters.and(Arrays.asList(originalFilter, baseTableFilter)), null, ImmutableSet.of());
    JoinFilterSplit actualFilterSplit = JoinFilterAnalyzer.splitFilter(joinFilterPreAnalysis, baseTableFilter);
    Assert.assertEquals(expectedFilterSplit, actualFilterSplit);
}
Also used: SelectorFilter (org.apache.druid.segment.filter.SelectorFilter), JoinFilterPreAnalysis (org.apache.druid.segment.join.filter.JoinFilterPreAnalysis), FalseFilter (org.apache.druid.segment.filter.FalseFilter), OrFilter (org.apache.druid.segment.filter.OrFilter), BoundDimFilter (org.apache.druid.query.filter.BoundDimFilter), AndFilter (org.apache.druid.segment.filter.AndFilter), BoundFilter (org.apache.druid.segment.filter.BoundFilter), InDimFilter (org.apache.druid.query.filter.InDimFilter), ExpressionDimFilter (org.apache.druid.query.filter.ExpressionDimFilter), Filter (org.apache.druid.query.filter.Filter), JoinFilterSplit (org.apache.druid.segment.join.filter.JoinFilterSplit), Test (org.junit.Test)
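Reading the expected split: the three JoinFilterSplit constructor arguments appear to be, in order, the filter pushed down to the base table, the residual filter left to evaluate after the join, and any virtual columns the pushed-down filters require. Annotated (this interpretation is inferred from the test family, not from documentation):

JoinFilterSplit expectedFilterSplit = new JoinFilterSplit(
        Filters.and(Arrays.asList(originalFilter, baseTableFilter)), // pushed down to the base table
        null,                                                        // no residual post-join filter
        ImmutableSet.of()                                            // no push-down virtual columns
);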

Aggregations

Class (package): usage count

Filter (org.apache.druid.query.filter.Filter): 127
Test (org.junit.Test): 88
SelectorFilter (org.apache.druid.segment.filter.SelectorFilter): 74
ExpressionDimFilter (org.apache.druid.query.filter.ExpressionDimFilter): 64
JoinFilterPreAnalysis (org.apache.druid.segment.join.filter.JoinFilterPreAnalysis): 63
OrFilter (org.apache.druid.segment.filter.OrFilter): 52
AndFilter (org.apache.druid.segment.filter.AndFilter): 47
BoundDimFilter (org.apache.druid.query.filter.BoundDimFilter): 43
InDimFilter (org.apache.druid.query.filter.InDimFilter): 43
BoundFilter (org.apache.druid.segment.filter.BoundFilter): 42
FalseFilter (org.apache.druid.segment.filter.FalseFilter): 41
SelectorDimFilter (org.apache.druid.query.filter.SelectorDimFilter): 39
OrDimFilter (org.apache.druid.query.filter.OrDimFilter): 36
JoinFilterSplit (org.apache.druid.segment.join.filter.JoinFilterSplit): 35
ArrayList (java.util.ArrayList): 19
DimFilter (org.apache.druid.query.filter.DimFilter): 15
IndexedTableJoinable (org.apache.druid.segment.join.table.IndexedTableJoinable): 13
AndDimFilter (org.apache.druid.query.filter.AndDimFilter): 10
BooleanFilter (org.apache.druid.query.filter.BooleanFilter): 10
Cursor (org.apache.druid.segment.Cursor): 10