Search in sources :

Example 26 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class GroupByQueryRunnerTest method testGroupByLongColumnWithExFn.

@Test
public void testGroupByLongColumnWithExFn() {
    // Cannot vectorize due to extraction dimension spec.
    cannotVectorize();
    if (config.getDefaultStrategy().equals(GroupByStrategySelector.STRATEGY_V1)) {
        expectedException.expect(UnsupportedOperationException.class);
        expectedException.expectMessage("GroupBy v1 does not support dimension selectors with unknown cardinality.");
    }
    String jsFn = "function(str) { return 'super-' + str; }";
    ExtractionFn jsExtractionFn = new JavaScriptExtractionFn(jsFn, false, JavaScriptConfig.getEnabledInstance());
    GroupByQuery query = makeQueryBuilder().setDataSource(QueryRunnerTestHelper.DATA_SOURCE).setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD).setDimensions(new ExtractionDimensionSpec("qualityLong", "ql_alias", jsExtractionFn)).setDimFilter(new SelectorDimFilter("quality", "entertainment", null)).setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT, new LongSumAggregatorFactory("idx", "index")).setGranularity(QueryRunnerTestHelper.DAY_GRAN).build();
    List<ResultRow> expectedResults = Arrays.asList(makeRow(query, "2011-04-01", "ql_alias", "super-1200", "rows", 1L, "idx", 158L), makeRow(query, "2011-04-02", "ql_alias", "super-1200", "rows", 1L, "idx", 166L));
    Iterable<ResultRow> results = GroupByQueryRunnerTestHelper.runQuery(factory, runner, query);
    TestHelper.assertExpectedObjects(expectedResults, results, "long-extraction");
}
Also used : RegexDimExtractionFn(org.apache.druid.query.extraction.RegexDimExtractionFn) StringFormatExtractionFn(org.apache.druid.query.extraction.StringFormatExtractionFn) LookupExtractionFn(org.apache.druid.query.lookup.LookupExtractionFn) CascadeExtractionFn(org.apache.druid.query.extraction.CascadeExtractionFn) StrlenExtractionFn(org.apache.druid.query.extraction.StrlenExtractionFn) SubstringDimExtractionFn(org.apache.druid.query.extraction.SubstringDimExtractionFn) ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) DimExtractionFn(org.apache.druid.query.extraction.DimExtractionFn) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) SearchQuerySpecDimExtractionFn(org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn) TimeFormatExtractionFn(org.apache.druid.query.extraction.TimeFormatExtractionFn) SelectorDimFilter(org.apache.druid.query.filter.SelectorDimFilter) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) LongSumAggregatorFactory(org.apache.druid.query.aggregation.LongSumAggregatorFactory) ExtractionDimensionSpec(org.apache.druid.query.dimension.ExtractionDimensionSpec) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)

Example 27 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class GroupByQueryRunnerTest method testGroupByNestedWithInnerQueryOutputNullNumerics.

@Test
public void testGroupByNestedWithInnerQueryOutputNullNumerics() {
    cannotVectorize();
    if (config.getDefaultStrategy().equals(GroupByStrategySelector.STRATEGY_V1)) {
        expectedException.expect(UnsupportedOperationException.class);
        expectedException.expectMessage("GroupBy v1 only supports dimensions with an outputType of STRING.");
    }
    // Following extractionFn will generate null value for one kind of quality
    ExtractionFn extractionFn = new SearchQuerySpecDimExtractionFn(new ContainsSearchQuerySpec("1200", false));
    GroupByQuery subquery = makeQueryBuilder().setDataSource(QueryRunnerTestHelper.DATA_SOURCE).setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD).setDimensions(new DefaultDimensionSpec("quality", "alias"), new ExtractionDimensionSpec("qualityLong", "ql_alias", ColumnType.LONG, extractionFn), new ExtractionDimensionSpec("qualityFloat", "qf_alias", ColumnType.FLOAT, extractionFn), new ExtractionDimensionSpec("qualityDouble", "qd_alias", ColumnType.DOUBLE, extractionFn)).setDimFilter(new InDimFilter("quality", Arrays.asList("entertainment", "business"), null)).setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT, new LongSumAggregatorFactory("idx", "index")).setGranularity(QueryRunnerTestHelper.DAY_GRAN).build();
    GroupByQuery outerQuery = makeQueryBuilder().setDataSource(subquery).setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD).setDimensions(new DefaultDimensionSpec("ql_alias", "quallong", ColumnType.LONG), new DefaultDimensionSpec("qf_alias", "qualfloat", ColumnType.FLOAT), new DefaultDimensionSpec("qd_alias", "qualdouble", ColumnType.DOUBLE)).setAggregatorSpecs(new LongSumAggregatorFactory("ql_alias_sum", "ql_alias"), new DoubleSumAggregatorFactory("qf_alias_sum", "qf_alias"), new DoubleSumAggregatorFactory("qd_alias_sum", "qd_alias")).setGranularity(QueryRunnerTestHelper.ALL_GRAN).build();
    List<ResultRow> expectedResults = Arrays.asList(makeRow(outerQuery, "2011-04-01", "quallong", NullHandling.defaultLongValue(), "qualfloat", NullHandling.defaultFloatValue(), "qualdouble", NullHandling.defaultDoubleValue(), "ql_alias_sum", NullHandling.defaultLongValue(), "qf_alias_sum", NullHandling.defaultFloatValue(), "qd_alias_sum", NullHandling.defaultDoubleValue()), makeRow(outerQuery, "2011-04-01", "quallong", 1200L, "qualfloat", 12000.0, "qualdouble", 12000.0, "ql_alias_sum", 2400L, "qf_alias_sum", 24000.0, "qd_alias_sum", 24000.0));
    Iterable<ResultRow> results = GroupByQueryRunnerTestHelper.runQuery(factory, runner, outerQuery);
    TestHelper.assertExpectedObjects(expectedResults, results, "numerics");
}
Also used : RegexDimExtractionFn(org.apache.druid.query.extraction.RegexDimExtractionFn) StringFormatExtractionFn(org.apache.druid.query.extraction.StringFormatExtractionFn) LookupExtractionFn(org.apache.druid.query.lookup.LookupExtractionFn) CascadeExtractionFn(org.apache.druid.query.extraction.CascadeExtractionFn) StrlenExtractionFn(org.apache.druid.query.extraction.StrlenExtractionFn) SubstringDimExtractionFn(org.apache.druid.query.extraction.SubstringDimExtractionFn) ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) DimExtractionFn(org.apache.druid.query.extraction.DimExtractionFn) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) SearchQuerySpecDimExtractionFn(org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn) TimeFormatExtractionFn(org.apache.druid.query.extraction.TimeFormatExtractionFn) DoubleSumAggregatorFactory(org.apache.druid.query.aggregation.DoubleSumAggregatorFactory) ContainsSearchQuerySpec(org.apache.druid.query.search.ContainsSearchQuerySpec) InDimFilter(org.apache.druid.query.filter.InDimFilter) LongSumAggregatorFactory(org.apache.druid.query.aggregation.LongSumAggregatorFactory) SearchQuerySpecDimExtractionFn(org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn) DefaultDimensionSpec(org.apache.druid.query.dimension.DefaultDimensionSpec) ExtractionDimensionSpec(org.apache.druid.query.dimension.ExtractionDimensionSpec) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)

Example 28 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class GroupByQueryRunnerTest method testGroupByWithNullProducingDimExtractionFn.

@Test
public void testGroupByWithNullProducingDimExtractionFn() {
    // Cannot vectorize due to extraction dimension spec.
    cannotVectorize();
    final ExtractionFn nullExtractionFn = new RegexDimExtractionFn("(\\w{1})", false, null) {

        @Override
        public byte[] getCacheKey() {
            return new byte[] { (byte) 0xFF };
        }

        @Override
        public String apply(String dimValue) {
            return "mezzanine".equals(dimValue) ? null : super.apply(dimValue);
        }
    };
    GroupByQuery query = makeQueryBuilder().setDataSource(QueryRunnerTestHelper.DATA_SOURCE).setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD).setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT, new LongSumAggregatorFactory("idx", "index")).setGranularity(QueryRunnerTestHelper.DAY_GRAN).setDimensions(new ExtractionDimensionSpec("quality", "alias", nullExtractionFn)).build();
    List<ResultRow> expectedResults = Arrays.asList(makeRow(query, "2011-04-01", "alias", null, "rows", 3L, "idx", 2870L), makeRow(query, "2011-04-01", "alias", "a", "rows", 1L, "idx", 135L), makeRow(query, "2011-04-01", "alias", "b", "rows", 1L, "idx", 118L), makeRow(query, "2011-04-01", "alias", "e", "rows", 1L, "idx", 158L), makeRow(query, "2011-04-01", "alias", "h", "rows", 1L, "idx", 120L), makeRow(query, "2011-04-01", "alias", "n", "rows", 1L, "idx", 121L), makeRow(query, "2011-04-01", "alias", "p", "rows", 3L, "idx", 2900L), makeRow(query, "2011-04-01", "alias", "t", "rows", 2L, "idx", 197L), makeRow(query, "2011-04-02", "alias", null, "rows", 3L, "idx", 2447L), makeRow(query, "2011-04-02", "alias", "a", "rows", 1L, "idx", 147L), makeRow(query, "2011-04-02", "alias", "b", "rows", 1L, "idx", 112L), makeRow(query, "2011-04-02", "alias", "e", "rows", 1L, "idx", 166L), makeRow(query, "2011-04-02", "alias", "h", "rows", 1L, "idx", 113L), makeRow(query, "2011-04-02", "alias", "n", "rows", 1L, "idx", 114L), makeRow(query, "2011-04-02", "alias", "p", "rows", 3L, "idx", 2505L), makeRow(query, "2011-04-02", "alias", "t", "rows", 2L, "idx", 223L));
    TestHelper.assertExpectedObjects(expectedResults, GroupByQueryRunnerTestHelper.runQuery(factory, runner, query), "null-dimextraction");
}
Also used : RegexDimExtractionFn(org.apache.druid.query.extraction.RegexDimExtractionFn) StringFormatExtractionFn(org.apache.druid.query.extraction.StringFormatExtractionFn) LookupExtractionFn(org.apache.druid.query.lookup.LookupExtractionFn) CascadeExtractionFn(org.apache.druid.query.extraction.CascadeExtractionFn) StrlenExtractionFn(org.apache.druid.query.extraction.StrlenExtractionFn) SubstringDimExtractionFn(org.apache.druid.query.extraction.SubstringDimExtractionFn) ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) DimExtractionFn(org.apache.druid.query.extraction.DimExtractionFn) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) SearchQuerySpecDimExtractionFn(org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn) TimeFormatExtractionFn(org.apache.druid.query.extraction.TimeFormatExtractionFn) LongSumAggregatorFactory(org.apache.druid.query.aggregation.LongSumAggregatorFactory) RegexDimExtractionFn(org.apache.druid.query.extraction.RegexDimExtractionFn) ExtractionDimensionSpec(org.apache.druid.query.dimension.ExtractionDimensionSpec) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)

Example 29 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class GroupByQueryRunnerTest method testGroupByWithEmptyStringProducingDimExtractionFn.

@Test
@Ignore
public /**
 * This test exists only to show what the current behavior is and not necessarily to define that this is
 * correct behavior.  In fact, the behavior when returning the empty string from a DimExtractionFn is, by
 * contract, undefined, so this can do anything.
 */
void testGroupByWithEmptyStringProducingDimExtractionFn() {
    final ExtractionFn emptyStringExtractionFn = new RegexDimExtractionFn("(\\w{1})", false, null) {

        @Override
        public byte[] getCacheKey() {
            return new byte[] { (byte) 0xFF };
        }

        @Override
        public String apply(String dimValue) {
            return "mezzanine".equals(dimValue) ? "" : super.apply(dimValue);
        }
    };
    GroupByQuery query = makeQueryBuilder().setDataSource(QueryRunnerTestHelper.DATA_SOURCE).setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD).setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT, new LongSumAggregatorFactory("idx", "index")).setGranularity(QueryRunnerTestHelper.DAY_GRAN).setDimensions(new ExtractionDimensionSpec("quality", "alias", emptyStringExtractionFn)).build();
    List<ResultRow> expectedResults = Arrays.asList(makeRow(query, "2011-04-01", "alias", "", "rows", 3L, "idx", 2870L), makeRow(query, "2011-04-01", "alias", "a", "rows", 1L, "idx", 135L), makeRow(query, "2011-04-01", "alias", "b", "rows", 1L, "idx", 118L), makeRow(query, "2011-04-01", "alias", "e", "rows", 1L, "idx", 158L), makeRow(query, "2011-04-01", "alias", "h", "rows", 1L, "idx", 120L), makeRow(query, "2011-04-01", "alias", "n", "rows", 1L, "idx", 121L), makeRow(query, "2011-04-01", "alias", "p", "rows", 3L, "idx", 2900L), makeRow(query, "2011-04-01", "alias", "t", "rows", 2L, "idx", 197L), makeRow(query, "2011-04-02", "alias", "", "rows", 3L, "idx", 2447L), makeRow(query, "2011-04-02", "alias", "a", "rows", 1L, "idx", 147L), makeRow(query, "2011-04-02", "alias", "b", "rows", 1L, "idx", 112L), makeRow(query, "2011-04-02", "alias", "e", "rows", 1L, "idx", 166L), makeRow(query, "2011-04-02", "alias", "h", "rows", 1L, "idx", 113L), makeRow(query, "2011-04-02", "alias", "n", "rows", 1L, "idx", 114L), makeRow(query, "2011-04-02", "alias", "p", "rows", 3L, "idx", 2505L), makeRow(query, "2011-04-02", "alias", "t", "rows", 2L, "idx", 223L));
    TestHelper.assertExpectedObjects(expectedResults, GroupByQueryRunnerTestHelper.runQuery(factory, runner, query), "empty-string-dimextraction");
}
Also used : RegexDimExtractionFn(org.apache.druid.query.extraction.RegexDimExtractionFn) StringFormatExtractionFn(org.apache.druid.query.extraction.StringFormatExtractionFn) LookupExtractionFn(org.apache.druid.query.lookup.LookupExtractionFn) CascadeExtractionFn(org.apache.druid.query.extraction.CascadeExtractionFn) StrlenExtractionFn(org.apache.druid.query.extraction.StrlenExtractionFn) SubstringDimExtractionFn(org.apache.druid.query.extraction.SubstringDimExtractionFn) ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) DimExtractionFn(org.apache.druid.query.extraction.DimExtractionFn) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) SearchQuerySpecDimExtractionFn(org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn) TimeFormatExtractionFn(org.apache.druid.query.extraction.TimeFormatExtractionFn) LongSumAggregatorFactory(org.apache.druid.query.aggregation.LongSumAggregatorFactory) RegexDimExtractionFn(org.apache.druid.query.extraction.RegexDimExtractionFn) ExtractionDimensionSpec(org.apache.druid.query.dimension.ExtractionDimensionSpec) Ignore(org.junit.Ignore) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)

Example 30 with ExtractionFn

use of org.apache.druid.query.extraction.ExtractionFn in project druid by druid-io.

the class GroupByQueryRunnerTest method testGroupByStringOutputAsLong.

@Test
public void testGroupByStringOutputAsLong() {
    // Cannot vectorize due to extraction dimension spec.
    cannotVectorize();
    if (config.getDefaultStrategy().equals(GroupByStrategySelector.STRATEGY_V1)) {
        expectedException.expect(UnsupportedOperationException.class);
        expectedException.expectMessage("GroupBy v1 only supports dimensions with an outputType of STRING.");
    }
    ExtractionFn strlenFn = StrlenExtractionFn.instance();
    GroupByQuery query = makeQueryBuilder().setDataSource(QueryRunnerTestHelper.DATA_SOURCE).setQuerySegmentSpec(QueryRunnerTestHelper.FIRST_TO_THIRD).setDimensions(new ExtractionDimensionSpec(QueryRunnerTestHelper.QUALITY_DIMENSION, "alias", ColumnType.LONG, strlenFn)).setDimFilter(new SelectorDimFilter(QueryRunnerTestHelper.QUALITY_DIMENSION, "entertainment", null)).setAggregatorSpecs(QueryRunnerTestHelper.ROWS_COUNT, new LongSumAggregatorFactory("idx", "index")).setGranularity(QueryRunnerTestHelper.DAY_GRAN).build();
    List<ResultRow> expectedResults = Arrays.asList(makeRow(query, "2011-04-01", "alias", 13L, "rows", 1L, "idx", 158L), makeRow(query, "2011-04-02", "alias", 13L, "rows", 1L, "idx", 166L));
    Iterable<ResultRow> results = GroupByQueryRunnerTestHelper.runQuery(factory, runner, query);
    TestHelper.assertExpectedObjects(expectedResults, results, "string-long");
}
Also used : RegexDimExtractionFn(org.apache.druid.query.extraction.RegexDimExtractionFn) StringFormatExtractionFn(org.apache.druid.query.extraction.StringFormatExtractionFn) LookupExtractionFn(org.apache.druid.query.lookup.LookupExtractionFn) CascadeExtractionFn(org.apache.druid.query.extraction.CascadeExtractionFn) StrlenExtractionFn(org.apache.druid.query.extraction.StrlenExtractionFn) SubstringDimExtractionFn(org.apache.druid.query.extraction.SubstringDimExtractionFn) ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) DimExtractionFn(org.apache.druid.query.extraction.DimExtractionFn) JavaScriptExtractionFn(org.apache.druid.query.extraction.JavaScriptExtractionFn) SearchQuerySpecDimExtractionFn(org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn) TimeFormatExtractionFn(org.apache.druid.query.extraction.TimeFormatExtractionFn) SelectorDimFilter(org.apache.druid.query.filter.SelectorDimFilter) LongSumAggregatorFactory(org.apache.druid.query.aggregation.LongSumAggregatorFactory) ExtractionDimensionSpec(org.apache.druid.query.dimension.ExtractionDimensionSpec) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)

Aggregations

ExtractionFn (org.apache.druid.query.extraction.ExtractionFn)40 Test (org.junit.Test)33 JavaScriptExtractionFn (org.apache.druid.query.extraction.JavaScriptExtractionFn)30 TimeFormatExtractionFn (org.apache.druid.query.extraction.TimeFormatExtractionFn)26 LookupExtractionFn (org.apache.druid.query.lookup.LookupExtractionFn)26 RegexDimExtractionFn (org.apache.druid.query.extraction.RegexDimExtractionFn)23 InitializedNullHandlingTest (org.apache.druid.testing.InitializedNullHandlingTest)22 ExtractionDimensionSpec (org.apache.druid.query.dimension.ExtractionDimensionSpec)20 DimExtractionFn (org.apache.druid.query.extraction.DimExtractionFn)20 StringFormatExtractionFn (org.apache.druid.query.extraction.StringFormatExtractionFn)20 StrlenExtractionFn (org.apache.druid.query.extraction.StrlenExtractionFn)20 CascadeExtractionFn (org.apache.druid.query.extraction.CascadeExtractionFn)12 SearchQuerySpecDimExtractionFn (org.apache.druid.query.extraction.SearchQuerySpecDimExtractionFn)12 SubstringDimExtractionFn (org.apache.druid.query.extraction.SubstringDimExtractionFn)12 SelectorDimFilter (org.apache.druid.query.filter.SelectorDimFilter)11 Result (org.apache.druid.query.Result)9 LongSumAggregatorFactory (org.apache.druid.query.aggregation.LongSumAggregatorFactory)9 DefaultDimensionSpec (org.apache.druid.query.dimension.DefaultDimensionSpec)8 ArrayList (java.util.ArrayList)6 DoubleMaxAggregatorFactory (org.apache.druid.query.aggregation.DoubleMaxAggregatorFactory)5