Example 1 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class InFilterTest method testMatchWithExtractionFn.

@Test
public void testMatchWithExtractionFn() {
    String extractionJsFn = "function(str) { return 'super-' + str; }";
    ExtractionFn superFn = new JavaScriptExtractionFn(extractionJsFn, false, JavaScriptConfig.getEnabledInstance());
    String nullJsFn = "function(str) { if (str === null) { return 'YES'; } else { return 'NO';} }";
    ExtractionFn yesNullFn = new JavaScriptExtractionFn(nullJsFn, false, JavaScriptConfig.getEnabledInstance());
    if (NullHandling.replaceWithDefault()) {
        assertFilterMatches(toInFilterWithFn("dim2", superFn, "super-null", "super-a", "super-b"), ImmutableList.of("a", "b", "c", "d", "f"));
        assertFilterMatches(toInFilterWithFn("dim1", superFn, "super-null", "super-10", "super-def"), ImmutableList.of("a", "b", "e"));
        assertFilterMatches(toInFilterWithFn("dim2", yesNullFn, "YES"), ImmutableList.of("b", "c", "f"));
        assertFilterMatches(toInFilterWithFn("dim1", yesNullFn, "NO"), ImmutableList.of("b", "c", "d", "e", "f"));
    } else {
        assertFilterMatches(toInFilterWithFn("dim2", superFn, "super-null", "super-a", "super-b"), ImmutableList.of("a", "b", "d", "f"));
        assertFilterMatches(toInFilterWithFn("dim1", superFn, "super-null", "super-10", "super-def"), ImmutableList.of("b", "e"));
        assertFilterMatches(toInFilterWithFn("dim2", yesNullFn, "YES"), ImmutableList.of("b", "f"));
        assertFilterMatches(toInFilterWithFn("dim1", yesNullFn, "NO"), ImmutableList.of("a", "b", "c", "d", "e", "f"));
    }
    assertFilterMatches(toInFilterWithFn("dim3", yesNullFn, "NO"), ImmutableList.of());
    assertFilterMatches(toInFilterWithFn("dim3", yesNullFn, "YES"), ImmutableList.of("a", "b", "c", "d", "e", "f"));
}
Also used : ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) LookupExtractionFn(org.apache.druid.query.lookup.LookupExtractionFn) Test(org.junit.Test)
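
The test above depends on the extraction function running before the set-membership check, so a null input turned into "super-null" can still match. A minimal plain-Java sketch of that ordering follows. The class ExtractionInFilterSketch and its matches method are hypothetical stand-ins for illustration, not Druid APIs, and a java.util.function.Function replaces the JavaScript function.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration only; this class does not exist in Druid.
final class ExtractionInFilterSketch {
    private final Function<String, String> extractionFn;
    private final Set<String> values;

    ExtractionInFilterSketch(Function<String, String> extractionFn, String... values) {
        this.extractionFn = extractionFn;
        this.values = new HashSet<>(Arrays.asList(values));
    }

    // Transform the raw value first, then test membership on the result.
    boolean matches(String rawValue) {
        return values.contains(extractionFn.apply(rawValue));
    }

    public static void main(String[] args) {
        // Java string concatenation renders null as "null", like the JavaScript function in the test.
        Function<String, String> superFn = s -> "super-" + s;
        ExtractionInFilterSketch filter =
            new ExtractionInFilterSketch(superFn, "super-null", "super-a", "super-b");
        System.out.println(filter.matches(null)); // true
        System.out.println(filter.matches("a"));  // true
        System.out.println(filter.matches("c"));  // false
    }
}
```

Because the filter only ever sees extracted values, the match set has to be written in terms of the function's output, which is why the test lists "super-a" rather than "a".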

Example 2 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class RegexFilterTest method testRegexWithExtractionFn.

@Test
public void testRegexWithExtractionFn() {
    String nullJsFn = "function(str) { if (str === null) { return 'NOT_NULL_ANYMORE'; } else { return str;} }";
    ExtractionFn changeNullFn = new JavaScriptExtractionFn(nullJsFn, false, JavaScriptConfig.getEnabledInstance());
    if (NullHandling.replaceWithDefault()) {
        assertFilterMatches(new RegexDimFilter("dim1", ".*ANYMORE", changeNullFn), ImmutableList.of("0"));
        assertFilterMatches(new RegexDimFilter("dim2", ".*ANYMORE", changeNullFn), ImmutableList.of("1", "2", "5"));
    } else {
        assertFilterMatches(new RegexDimFilter("dim1", ".*ANYMORE", changeNullFn), ImmutableList.of());
        assertFilterMatches(new RegexDimFilter("dim2", ".*ANYMORE", changeNullFn), ImmutableList.of("1", "5"));
    }
    assertFilterMatches(new RegexDimFilter("dim1", "ab.*", changeNullFn), ImmutableList.of("4", "5"));
    assertFilterMatches(new RegexDimFilter("dim2", "a.*", changeNullFn), ImmutableList.of("0", "3"));
    assertFilterMatches(new RegexDimFilter("dim3", ".*ANYMORE", changeNullFn), ImmutableList.of("0", "1", "2", "3", "4", "5"));
    assertFilterMatches(new RegexDimFilter("dim3", "a.*", changeNullFn), ImmutableList.of());
    assertFilterMatches(new RegexDimFilter("dim4", ".*ANYMORE", changeNullFn), ImmutableList.of("0", "1", "2", "3", "4", "5"));
    assertFilterMatches(new RegexDimFilter("dim4", "a.*", changeNullFn), ImmutableList.of());
}
Also used : ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) RegexDimFilter(org.apache.druid.query.filter.RegexDimFilter) Test(org.junit.Test)
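
The same ordering applies to the regex case: the pattern is evaluated against the extracted value, so a function that rewrites null lets a pattern match rows whose raw value is null. The sketch below is a hedged plain-Java illustration. ExtractionRegexFilterSketch is a hypothetical name, and the use of Matcher.find() (a partial match) is an assumption of this sketch, not a statement about RegexDimFilter's internals.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical illustration only; this class does not exist in Druid.
final class ExtractionRegexFilterSketch {
    private final Pattern pattern;
    private final Function<String, String> extractionFn;

    ExtractionRegexFilterSketch(String regex, Function<String, String> extractionFn) {
        this.pattern = Pattern.compile(regex);
        this.extractionFn = extractionFn;
    }

    // Apply the extraction function, then search the result for the pattern.
    // Assumption: a null extracted value never matches.
    boolean matches(String rawValue) {
        String extracted = extractionFn.apply(rawValue);
        return extracted != null && pattern.matcher(extracted).find();
    }

    public static void main(String[] args) {
        Function<String, String> changeNullFn = s -> s == null ? "NOT_NULL_ANYMORE" : s;
        ExtractionRegexFilterSketch filter =
            new ExtractionRegexFilterSketch(".*ANYMORE", changeNullFn);
        System.out.println(filter.matches(null));  // true
        System.out.println(filter.matches("abc")); // false
    }
}
```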

Example 3 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class RowBasedColumnSelectorFactory method makeDimensionSelectorUndecorated.

private DimensionSelector makeDimensionSelectorUndecorated(DimensionSpec dimensionSpec) {
    final String dimension = dimensionSpec.getDimension();
    final ExtractionFn extractionFn = dimensionSpec.getExtractionFn();
    if (ColumnHolder.TIME_COLUMN_NAME.equals(dimensionSpec.getDimension())) {
        if (extractionFn == null) {
            throw new UnsupportedOperationException("time dimension must provide an extraction function");
        }
        final ToLongFunction<T> timestampFunction = adapter.timestampFunction();
        return new BaseSingleValueDimensionSelector() {

            private long currentId = NO_ID;

            private String currentValue;

            @Override
            protected String getValue() {
                updateCurrentValue();
                return currentValue;
            }

            @Override
            public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                inspector.visit("row", rowSupplier);
                inspector.visit("extractionFn", extractionFn);
            }

            private void updateCurrentValue() {
                if (rowIdSupplier == null || rowIdSupplier.getAsLong() != currentId) {
                    currentValue = extractionFn.apply(timestampFunction.applyAsLong(rowSupplier.get()));
                    if (rowIdSupplier != null) {
                        currentId = rowIdSupplier.getAsLong();
                    }
                }
            }
        };
    } else {
        final Function<T, Object> dimFunction = adapter.columnFunction(dimension);
        return new DimensionSelector() {

            private long currentId = NO_ID;

            private List<String> dimensionValues;

            private final RangeIndexedInts indexedInts = new RangeIndexedInts();

            @Override
            public IndexedInts getRow() {
                updateCurrentValues();
                indexedInts.setSize(dimensionValues.size());
                return indexedInts;
            }

            @Override
            public ValueMatcher makeValueMatcher(@Nullable final String value) {
                return new ValueMatcher() {

                    @Override
                    public boolean matches() {
                        updateCurrentValues();
                        if (dimensionValues.isEmpty()) {
                            return value == null;
                        }
                        for (String dimensionValue : dimensionValues) {
                            if (Objects.equals(NullHandling.emptyToNullIfNeeded(dimensionValue), value)) {
                                return true;
                            }
                        }
                        return false;
                    }

                    @Override
                    public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                        inspector.visit("row", rowSupplier);
                        inspector.visit("extractionFn", extractionFn);
                    }
                };
            }

            @Override
            public ValueMatcher makeValueMatcher(final Predicate<String> predicate) {
                final boolean matchNull = predicate.apply(null);
                return new ValueMatcher() {

                    @Override
                    public boolean matches() {
                        updateCurrentValues();
                        if (dimensionValues.isEmpty()) {
                            return matchNull;
                        }
                        for (String dimensionValue : dimensionValues) {
                            if (predicate.apply(NullHandling.emptyToNullIfNeeded(dimensionValue))) {
                                return true;
                            }
                        }
                        return false;
                    }

                    @Override
                    public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                        inspector.visit("row", rowSupplier);
                        inspector.visit("predicate", predicate);
                        inspector.visit("extractionFn", extractionFn);
                    }
                };
            }

            @Override
            public int getValueCardinality() {
                return DimensionDictionarySelector.CARDINALITY_UNKNOWN;
            }

            @Override
            public String lookupName(int id) {
                updateCurrentValues();
                return NullHandling.emptyToNullIfNeeded(dimensionValues.get(id));
            }

            @Override
            public boolean nameLookupPossibleInAdvance() {
                return false;
            }

            @Nullable
            @Override
            public IdLookup idLookup() {
                return null;
            }

            @Nullable
            @Override
            public Object getObject() {
                updateCurrentValues();
                if (dimensionValues.size() == 1) {
                    return dimensionValues.get(0);
                }
                return dimensionValues;
            }

            @Override
            public Class classOfObject() {
                return Object.class;
            }

            @Override
            public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                inspector.visit("row", rowSupplier);
                inspector.visit("extractionFn", extractionFn);
            }

            private void updateCurrentValues() {
                if (rowIdSupplier == null || rowIdSupplier.getAsLong() != currentId) {
                    try {
                        final Object rawValue = dimFunction.apply(rowSupplier.get());
                        if (rawValue == null || rawValue instanceof String) {
                            final String s = NullHandling.emptyToNullIfNeeded((String) rawValue);
                            if (extractionFn == null) {
                                dimensionValues = Collections.singletonList(s);
                            } else {
                                dimensionValues = Collections.singletonList(extractionFn.apply(s));
                            }
                        } else if (rawValue instanceof List) {
                            // Consistent behavior with Rows.objectToStrings, but applies extractionFn too.
                            // noinspection rawtypes
                            final List<String> values = new ArrayList<>(((List) rawValue).size());
                            // noinspection rawtypes
                            for (final Object item : ((List) rawValue)) {
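                                // A null item becomes the string "null" via String.valueOf. This is consistent with
                                // Rows.objectToStrings, which is commonly used when retrieving strings from
                                // input-row-like objects.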
                                if (extractionFn == null) {
                                    values.add(String.valueOf(item));
                                } else {
                                    values.add(extractionFn.apply(String.valueOf(item)));
                                }
                            }
                            dimensionValues = values;
                        } else {
                            final List<String> nonExtractedValues = Rows.objectToStrings(rawValue);
                            dimensionValues = new ArrayList<>(nonExtractedValues.size());
                            for (final String value : nonExtractedValues) {
                                final String s = NullHandling.emptyToNullIfNeeded(value);
                                if (extractionFn == null) {
                                    dimensionValues.add(s);
                                } else {
                                    dimensionValues.add(extractionFn.apply(s));
                                }
                            }
                        }
                    } catch (Throwable e) {
                        currentId = NO_ID;
                        throw e;
                    }
                    if (rowIdSupplier != null) {
                        currentId = rowIdSupplier.getAsLong();
                    }
                }
            }
        };
    }
}
Also used : ValueMatcher(org.apache.druid.query.filter.ValueMatcher) ArrayList(java.util.ArrayList) RuntimeShapeInspector(org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector) RangeIndexedInts(org.apache.druid.segment.data.RangeIndexedInts) Predicate(com.google.common.base.Predicate) ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) List(java.util.List) Nullable(javax.annotation.Nullable)
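
Both selectors in the method above recompute their value only when the row id changes, and always recompute when no row id supplier is available. That caching pattern can be sketched in isolation as shown here. RowIdCachingSketch, its computations counter, and the NO_ID value of -1 are assumptions made for illustration; only the shape of the check mirrors updateCurrentValue().

```java
import java.util.function.Function;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical illustration only; this class does not exist in Druid.
final class RowIdCachingSketch<T> {
    static final long NO_ID = -1; // assumed sentinel meaning "nothing cached yet"

    private final Supplier<T> rowSupplier;
    private final LongSupplier rowIdSupplier; // may be null, which disables caching
    private final Function<T, String> extractor;

    private long currentId = NO_ID;
    private String currentValue;
    int computations = 0; // counts how often the extractor actually ran

    RowIdCachingSketch(Supplier<T> rowSupplier, LongSupplier rowIdSupplier, Function<T, String> extractor) {
        this.rowSupplier = rowSupplier;
        this.rowIdSupplier = rowIdSupplier;
        this.extractor = extractor;
    }

    String getValue() {
        // Recompute when there is no id to compare against, or when the row has moved on.
        if (rowIdSupplier == null || rowIdSupplier.getAsLong() != currentId) {
            currentValue = extractor.apply(rowSupplier.get());
            computations++;
            if (rowIdSupplier != null) {
                currentId = rowIdSupplier.getAsLong();
            }
        }
        return currentValue;
    }

    public static void main(String[] args) {
        final String[] row = {"a"};
        final long[] rowId = {0};
        RowIdCachingSketch<String> selector =
            new RowIdCachingSketch<>(() -> row[0], () -> rowId[0], s -> s.toUpperCase());
        selector.getValue();
        selector.getValue();                       // same row id, served from cache
        System.out.println(selector.computations); // 1
        row[0] = "b";
        rowId[0] = 1;
        System.out.println(selector.getValue());   // B
        System.out.println(selector.computations); // 2
    }
}
```

The cache is only safe if the row id changes whenever the underlying row does, which is why the original resets currentId to NO_ID when computing the values throws.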

Example 4 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class QueryableIndexColumnSelectorFactory method makeDimensionSelectorUndecorated.

private DimensionSelector makeDimensionSelectorUndecorated(DimensionSpec dimensionSpec) {
    final String dimension = dimensionSpec.getDimension();
    final ExtractionFn extractionFn = dimensionSpec.getExtractionFn();
    final ColumnHolder columnHolder = index.getColumnHolder(dimension);
    if (columnHolder == null) {
        return DimensionSelector.constant(null, extractionFn);
    }
    if (dimension.equals(ColumnHolder.TIME_COLUMN_NAME)) {
        return new SingleScanTimeDimensionSelector(makeColumnValueSelector(dimension), extractionFn, descending);
    }
    ColumnCapabilities capabilities = columnHolder.getCapabilities();
    if (capabilities.isNumeric()) {
        return ValueTypes.makeNumericWrappingDimensionSelector(capabilities.getType(), makeColumnValueSelector(dimension), extractionFn);
    }
    final DictionaryEncodedColumn column = getCachedColumn(dimension, DictionaryEncodedColumn.class);
    if (column != null) {
        return column.makeDimensionSelector(offset, extractionFn);
    } else {
        return DimensionSelector.constant(null, extractionFn);
    }
}
Also used : ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) ColumnHolder(org.apache.druid.segment.column.ColumnHolder) DictionaryEncodedColumn(org.apache.druid.segment.column.DictionaryEncodedColumn) ColumnCapabilities(org.apache.druid.segment.column.ColumnCapabilities)

Example 5 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class IncrementalIndexColumnSelectorFactory method makeDimensionSelectorUndecorated.

private DimensionSelector makeDimensionSelectorUndecorated(DimensionSpec dimensionSpec) {
    final String dimension = dimensionSpec.getDimension();
    final ExtractionFn extractionFn = dimensionSpec.getExtractionFn();
    if (dimension.equals(ColumnHolder.TIME_COLUMN_NAME)) {
        return new SingleScanTimeDimensionSelector(makeColumnValueSelector(dimension), extractionFn, descending);
    }
    final IncrementalIndex.DimensionDesc dimensionDesc = index.getDimension(dimensionSpec.getDimension());
    if (dimensionDesc == null) {
        // not a dimension, column may be a metric
        ColumnCapabilities capabilities = getColumnCapabilities(dimension);
        if (capabilities == null) {
            return DimensionSelector.constant(null, extractionFn);
        }
        if (capabilities.isNumeric()) {
            return ValueTypes.makeNumericWrappingDimensionSelector(capabilities.getType(), makeColumnValueSelector(dimension), extractionFn);
        }
        // if we can't wrap the base column, just return a column of all nulls
        return DimensionSelector.constant(null, extractionFn);
    } else {
        final DimensionIndexer indexer = dimensionDesc.getIndexer();
        return indexer.makeDimensionSelector(dimensionSpec, rowHolder, dimensionDesc);
    }
}
Also used : ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) DimensionIndexer(org.apache.druid.segment.DimensionIndexer) SingleScanTimeDimensionSelector(org.apache.druid.segment.SingleScanTimeDimensionSelector) ColumnCapabilities(org.apache.druid.segment.column.ColumnCapabilities)

Aggregations

ExtractionFn (org.apache.druid.query.extraction.ExtractionFn) 40
Test (org.junit.Test) 33
JavaScriptExtractionFn (org.apache.druid.query.extraction.JavaScriptExtractionFn) 30
TimeFormatExtractionFn (org.apache.druid.query.extraction.TimeFormatExtractionFn) 26
LookupExtractionFn (org.apache.druid.query.lookup.LookupExtractionFn) 26
RegexDimExtractionFn (org.apache.druid.query.extraction.RegexDimExtractionFn) 23
InitializedNullHandlingTest (org.apache.druid.testing.InitializedNullHandlingTest) 22
ExtractionDimensionSpec (org.apache.druid.query.dimension.ExtractionDimensionSpec) 20
DimExtractionFn (org.apache.druid.query.extraction.DimExtractionFn) 20
StringFormatExtractionFn (org.apache.druid.query.extraction.StringFormatExtractionFn) 20
StrlenExtractionFn (org.apache.druid.query.extraction.StrlenExtractionFn) 20
CascadeExtractionFn (org.apache.druid.query.extraction.CascadeExtractionFn) 12
SearchQuerySpecDimExtractionFn (org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn) 12
SubstringDimExtractionFn (org.apache.druid.query.extraction.SubstringDimExtractionFn) 12
SelectorDimFilter (org.apache.druid.query.filter.SelectorDimFilter) 11
Result (org.apache.druid.query.Result) 9
LongSumAggregatorFactory (org.apache.druid.query.aggregation.LongSumAggregatorFactory) 9
DefaultDimensionSpec (org.apache.druid.query.dimension.DefaultDimensionSpec) 8
ArrayList (java.util.ArrayList) 6
DoubleMaxAggregatorFactory (org.apache.druid.query.aggregation.DoubleMaxAggregatorFactory) 5