Example 61 with AggregatorFactory

Use of io.druid.query.aggregation.AggregatorFactory in project druid by druid-io.

The class DataSchema, method getParser.

@JsonIgnore
public InputRowParser getParser() {
    if (parser == null) {
        log.warn("No parser has been specified");
        return null;
    }
    final InputRowParser inputRowParser = jsonMapper.convertValue(this.parser, InputRowParser.class);
    final Set<String> dimensionExclusions = Sets.newHashSet();
    // Every aggregator input field and output name is excluded from dimension discovery.
    for (AggregatorFactory aggregator : aggregators) {
        dimensionExclusions.addAll(aggregator.requiredFields());
        dimensionExclusions.add(aggregator.getName());
    }
    if (inputRowParser.getParseSpec() != null) {
        final DimensionsSpec dimensionsSpec = inputRowParser.getParseSpec().getDimensionsSpec();
        final TimestampSpec timestampSpec = inputRowParser.getParseSpec().getTimestampSpec();
        // exclude timestamp from dimensions by default, unless explicitly included in the list of dimensions
        if (timestampSpec != null) {
            final String timestampColumn = timestampSpec.getTimestampColumn();
            if (!(dimensionsSpec.hasCustomDimensions() && dimensionsSpec.getDimensionNames().contains(timestampColumn))) {
                dimensionExclusions.add(timestampColumn);
            }
        }
        if (dimensionsSpec != null) {
            final Set<String> metSet = Sets.newHashSet();
            for (AggregatorFactory aggregator : aggregators) {
                metSet.add(aggregator.getName());
            }
            final Set<String> dimSet = Sets.newHashSet(dimensionsSpec.getDimensionNames());
            final Set<String> overlap = Sets.intersection(metSet, dimSet);
            if (!overlap.isEmpty()) {
                throw new IAE("Cannot have overlapping dimensions and metrics of the same name. Please change the name of the metric. Overlap: %s", overlap);
            }
            return inputRowParser.withParseSpec(inputRowParser.getParseSpec().withDimensionsSpec(dimensionsSpec.withDimensionExclusions(Sets.difference(dimensionExclusions, dimSet))));
        } else {
            return inputRowParser;
        }
    } else {
        log.warn("No parseSpec in parser has been specified.");
        return inputRowParser;
    }
}
Also used: TimestampSpec(io.druid.data.input.impl.TimestampSpec), DimensionsSpec(io.druid.data.input.impl.DimensionsSpec), InputRowParser(io.druid.data.input.impl.InputRowParser), AggregatorFactory(io.druid.query.aggregation.AggregatorFactory), IAE(io.druid.java.util.common.IAE), JsonIgnore(com.fasterxml.jackson.annotation.JsonIgnore)
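The exclusion rule above means a metric's output name and every input field it reads never become auto-discovered dimensions. A minimal sketch of that bookkeeping, using a hypothetical "clicks" column; the LongSumAggregatorFactory(name, fieldName) constructor and the requiredFields()/getName() accessors are the same ones the loop above relies on:

import com.google.common.collect.Sets;
import io.druid.query.aggregation.AggregatorFactory;
import io.druid.query.aggregation.LongSumAggregatorFactory;

import java.util.Set;

public class DimensionExclusionSketch {
    public static void main(String[] args) {
        // Hypothetical metric: a longSum named "clicks_sum" reading input field "clicks".
        AggregatorFactory aggregator = new LongSumAggregatorFactory("clicks_sum", "clicks");

        Set<String> dimensionExclusions = Sets.newHashSet();
        dimensionExclusions.addAll(aggregator.requiredFields()); // input field: "clicks"
        dimensionExclusions.add(aggregator.getName());           // output name: "clicks_sum"

        // Neither column will be picked up by schemaless dimension discovery.
        System.out.println(dimensionExclusions); // prints both names, e.g. [clicks, clicks_sum]
    }
}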

Example 62 with AggregatorFactory

Use of io.druid.query.aggregation.AggregatorFactory in project druid by druid-io.

The class CachingClusteredClientFunctionalityTest, method testUncoveredInterval.

@Test
public void testUncoveredInterval() throws Exception {
    addToTimeline(new Interval("2015-01-02/2015-01-03"), "1");
    addToTimeline(new Interval("2015-01-04/2015-01-05"), "1");
    addToTimeline(new Interval("2015-02-04/2015-02-05"), "1");
    final Druids.TimeseriesQueryBuilder builder = Druids.newTimeseriesQueryBuilder()
        .dataSource("test")
        .intervals("2015-01-02/2015-01-03")
        .granularity("day")
        .aggregators(Arrays.<AggregatorFactory>asList(new CountAggregatorFactory("rows")))
        .context(ImmutableMap.<String, Object>of("uncoveredIntervalsLimit", 3));
    Map<String, Object> responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    Assert.assertNull(responseContext.get("uncoveredIntervals"));
    builder.intervals("2015-01-01/2015-01-03");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, false, "2015-01-01/2015-01-02");
    builder.intervals("2015-01-01/2015-01-04");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, false, "2015-01-01/2015-01-02", "2015-01-03/2015-01-04");
    builder.intervals("2015-01-02/2015-01-04");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, false, "2015-01-03/2015-01-04");
    builder.intervals("2015-01-01/2015-01-30");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, false, "2015-01-01/2015-01-02", "2015-01-03/2015-01-04", "2015-01-05/2015-01-30");
    builder.intervals("2015-01-02/2015-01-30");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, false, "2015-01-03/2015-01-04", "2015-01-05/2015-01-30");
    builder.intervals("2015-01-04/2015-01-30");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, false, "2015-01-05/2015-01-30");
    builder.intervals("2015-01-10/2015-01-30");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, false, "2015-01-10/2015-01-30");
    builder.intervals("2015-01-01/2015-02-25");
    responseContext = new HashMap<>();
    client.run(builder.build(), responseContext);
    assertUncovered(responseContext, true, "2015-01-01/2015-01-02", "2015-01-03/2015-01-04", "2015-01-05/2015-02-04");
}
Also used: CountAggregatorFactory(io.druid.query.aggregation.CountAggregatorFactory), HashMap(java.util.HashMap), Druids(io.druid.query.Druids), AggregatorFactory(io.druid.query.aggregation.AggregatorFactory), Interval(org.joda.time.Interval), Test(org.junit.Test)
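The pattern under test: a caller opts in by setting uncoveredIntervalsLimit in the query context, and the broker reports gaps in segment coverage back through the response context. A minimal sketch, assuming the response-context key is "uncoveredIntervals" (the key the test's assertions read) and that client is a QueryRunner such as the test's CachingClusteredClient:

import com.google.common.collect.ImmutableMap;
import io.druid.query.Druids;
import io.druid.query.QueryRunner;
import io.druid.query.aggregation.AggregatorFactory;
import io.druid.query.aggregation.CountAggregatorFactory;
import io.druid.query.timeseries.TimeseriesQuery;
import org.joda.time.Interval;

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UncoveredIntervalsSketch {
    // 'client' stands in for the CachingClusteredClient exercised by the test.
    static List<Interval> findGaps(QueryRunner client) {
        TimeseriesQuery query = Druids.newTimeseriesQueryBuilder()
            .dataSource("test")
            .intervals("2015-01-01/2015-01-30")
            .granularity("day")
            .aggregators(Arrays.<AggregatorFactory>asList(new CountAggregatorFactory("rows")))
            .context(ImmutableMap.<String, Object>of("uncoveredIntervalsLimit", 3)) // report at most 3 gaps
            .build();

        Map<String, Object> responseContext = new HashMap<>();
        client.run(query, responseContext);

        // Null when the queried interval is fully covered by segments.
        return (List<Interval>) responseContext.get("uncoveredIntervals");
    }
}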

Example 63 with AggregatorFactory

Use of io.druid.query.aggregation.AggregatorFactory in project druid by druid-io.

The class CachingClusteredClientTest, method testGroupByCaching.

@Test
public void testGroupByCaching() throws Exception {
    List<AggregatorFactory> aggsWithUniques = ImmutableList.<AggregatorFactory>builder().addAll(AGGS).add(new HyperUniquesAggregatorFactory("uniques", "uniques")).build();
    final HashFunction hashFn = Hashing.murmur3_128();
    GroupByQuery.Builder builder = new GroupByQuery.Builder()
        .setDataSource(DATA_SOURCE)
        .setQuerySegmentSpec(SEG_SPEC)
        .setDimFilter(DIM_FILTER)
        .setGranularity(GRANULARITY)
        .setDimensions(Arrays.<DimensionSpec>asList(new DefaultDimensionSpec("a", "a")))
        .setAggregatorSpecs(aggsWithUniques)
        .setPostAggregatorSpecs(POST_AGGS)
        .setContext(CONTEXT);
    final HyperLogLogCollector collector = HyperLogLogCollector.makeLatestCollector();
    collector.add(hashFn.hashString("abc123", Charsets.UTF_8).asBytes());
    collector.add(hashFn.hashString("123abc", Charsets.UTF_8).asBytes());
    testQueryCaching(
        client,
        builder.build(),
        new Interval("2011-01-01/2011-01-02"),
        makeGroupByResults(
            new DateTime("2011-01-01"), ImmutableMap.of("a", "a", "rows", 1, "imps", 1, "impers", 1, "uniques", collector)),
        new Interval("2011-01-02/2011-01-03"),
        makeGroupByResults(
            new DateTime("2011-01-02"), ImmutableMap.of("a", "b", "rows", 2, "imps", 2, "impers", 2, "uniques", collector)),
        new Interval("2011-01-05/2011-01-10"),
        makeGroupByResults(
            new DateTime("2011-01-05"), ImmutableMap.of("a", "c", "rows", 3, "imps", 3, "impers", 3, "uniques", collector),
            new DateTime("2011-01-06"), ImmutableMap.of("a", "d", "rows", 4, "imps", 4, "impers", 4, "uniques", collector),
            new DateTime("2011-01-07"), ImmutableMap.of("a", "e", "rows", 5, "imps", 5, "impers", 5, "uniques", collector),
            new DateTime("2011-01-08"), ImmutableMap.of("a", "f", "rows", 6, "imps", 6, "impers", 6, "uniques", collector),
            new DateTime("2011-01-09"), ImmutableMap.of("a", "g", "rows", 7, "imps", 7, "impers", 7, "uniques", collector)),
        new Interval("2011-01-05/2011-01-10"),
        makeGroupByResults(
            new DateTime("2011-01-05T01"), ImmutableMap.of("a", "c", "rows", 3, "imps", 3, "impers", 3, "uniques", collector),
            new DateTime("2011-01-06T01"), ImmutableMap.of("a", "d", "rows", 4, "imps", 4, "impers", 4, "uniques", collector),
            new DateTime("2011-01-07T01"), ImmutableMap.of("a", "e", "rows", 5, "imps", 5, "impers", 5, "uniques", collector),
            new DateTime("2011-01-08T01"), ImmutableMap.of("a", "f", "rows", 6, "imps", 6, "impers", 6, "uniques", collector),
            new DateTime("2011-01-09T01"), ImmutableMap.of("a", "g", "rows", 7, "imps", 7, "impers", 7, "uniques", collector)));
    QueryRunner runner = new FinalizeResultsQueryRunner(client, GroupByQueryRunnerTest.makeQueryRunnerFactory(new GroupByQueryConfig()).getToolchest());
    HashMap<String, Object> context = new HashMap<String, Object>();
    TestHelper.assertExpectedObjects(
        makeGroupByResults(
            new DateTime("2011-01-05T"), ImmutableMap.of("a", "c", "rows", 3, "imps", 3, "impers", 3, "uniques", collector),
            new DateTime("2011-01-05T01"), ImmutableMap.of("a", "c", "rows", 3, "imps", 3, "impers", 3, "uniques", collector),
            new DateTime("2011-01-06T"), ImmutableMap.of("a", "d", "rows", 4, "imps", 4, "impers", 4, "uniques", collector),
            new DateTime("2011-01-06T01"), ImmutableMap.of("a", "d", "rows", 4, "imps", 4, "impers", 4, "uniques", collector),
            new DateTime("2011-01-07T"), ImmutableMap.of("a", "e", "rows", 5, "imps", 5, "impers", 5, "uniques", collector),
            new DateTime("2011-01-07T01"), ImmutableMap.of("a", "e", "rows", 5, "imps", 5, "impers", 5, "uniques", collector),
            new DateTime("2011-01-08T"), ImmutableMap.of("a", "f", "rows", 6, "imps", 6, "impers", 6, "uniques", collector),
            new DateTime("2011-01-08T01"), ImmutableMap.of("a", "f", "rows", 6, "imps", 6, "impers", 6, "uniques", collector),
            new DateTime("2011-01-09T"), ImmutableMap.of("a", "g", "rows", 7, "imps", 7, "impers", 7, "uniques", collector),
            new DateTime("2011-01-09T01"), ImmutableMap.of("a", "g", "rows", 7, "imps", 7, "impers", 7, "uniques", collector)),
        runner.run(builder.setInterval("2011-01-05/2011-01-10").build(), context),
        "");
}
Also used: DefaultDimensionSpec(io.druid.query.dimension.DefaultDimensionSpec), DimensionSpec(io.druid.query.dimension.DimensionSpec), GroupByQueryConfig(io.druid.query.groupby.GroupByQueryConfig), HashMap(java.util.HashMap), HyperLogLogCollector(io.druid.hll.HyperLogLogCollector), TopNQueryBuilder(io.druid.query.topn.TopNQueryBuilder), LongSumAggregatorFactory(io.druid.query.aggregation.LongSumAggregatorFactory), CountAggregatorFactory(io.druid.query.aggregation.CountAggregatorFactory), HyperUniquesAggregatorFactory(io.druid.query.aggregation.hyperloglog.HyperUniquesAggregatorFactory), AggregatorFactory(io.druid.query.aggregation.AggregatorFactory), DateTime(org.joda.time.DateTime), FinalizeResultsQueryRunner(io.druid.query.FinalizeResultsQueryRunner), QueryRunner(io.druid.query.QueryRunner), GroupByQuery(io.druid.query.groupby.GroupByQuery), HashFunction(com.google.common.hash.HashFunction), Interval(org.joda.time.Interval), Test(org.junit.Test), GroupByQueryRunnerTest(io.druid.query.groupby.GroupByQueryRunnerTest)
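Two details worth pulling out of the test: the "uniques" values cached per segment are intermediate HyperLogLogCollector sketches rather than finalized numbers, and the FinalizeResultsQueryRunner wrapper is what reduces them on the way out. A minimal sketch of the collector side, using only calls that already appear in the test plus HyperLogLogCollector.estimateCardinality():

import com.google.common.base.Charsets;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;
import io.druid.hll.HyperLogLogCollector;

public class HllCollectorSketch {
    public static void main(String[] args) {
        HashFunction hashFn = Hashing.murmur3_128();
        HyperLogLogCollector collector = HyperLogLogCollector.makeLatestCollector();

        // Each distinct value is hashed before being offered to the sketch,
        // exactly as the test does for "abc123" and "123abc".
        collector.add(hashFn.hashString("abc123", Charsets.UTF_8).asBytes());
        collector.add(hashFn.hashString("123abc", Charsets.UTF_8).asBytes());
        collector.add(hashFn.hashString("abc123", Charsets.UTF_8).asBytes()); // duplicate: no effect

        // Finalization (what FinalizeResultsQueryRunner triggers for query results)
        // reduces the sketch to a cardinality estimate, roughly 2.0 here.
        System.out.println(collector.estimateCardinality());
    }
}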

Example 64 with AggregatorFactory

Use of io.druid.query.aggregation.AggregatorFactory in project druid by druid-io.

The class SqlBenchmark, method setup.

@Setup(Level.Trial)
public void setup() throws Exception {
    tmpDir = Files.createTempDir();
    log.info("Starting benchmark setup using tmpDir[%s], rows[%,d].", tmpDir, rowsPerSegment);
    if (ComplexMetrics.getSerdeForType("hyperUnique") == null) {
        ComplexMetrics.registerSerde("hyperUnique", new HyperUniquesSerde(HyperLogLogHash.getDefault()));
    }
    final BenchmarkSchemaInfo schemaInfo = BenchmarkSchemas.SCHEMA_MAP.get("basic");
    final BenchmarkDataGenerator dataGenerator = new BenchmarkDataGenerator(schemaInfo.getColumnSchemas(), RNG_SEED + 1, schemaInfo.getDataInterval(), rowsPerSegment);
    final List<InputRow> rows = Lists.newArrayList();
    for (int i = 0; i < rowsPerSegment; i++) {
        final InputRow row = dataGenerator.nextRow();
        if (i % 20000 == 0) {
            log.info("%,d/%,d rows generated.", i, rowsPerSegment);
        }
        rows.add(row);
    }
    log.info("%,d/%,d rows generated.", rows.size(), rowsPerSegment);
    final PlannerConfig plannerConfig = new PlannerConfig();
    final QueryRunnerFactoryConglomerate conglomerate = CalciteTests.queryRunnerFactoryConglomerate();
    final QueryableIndex index = IndexBuilder.create()
        .tmpDir(new File(tmpDir, "1"))
        .indexMerger(TestHelper.getTestIndexMergerV9())
        .rows(rows)
        .buildMMappedIndex();
    this.walker = new SpecificSegmentsQuerySegmentWalker(conglomerate).add(
        DataSegment.builder()
            .dataSource("foo")
            .interval(index.getDataInterval())
            .version("1")
            .shardSpec(new LinearShardSpec(0))
            .build(),
        index);
    final Map<String, Table> tableMap = ImmutableMap.<String, Table>of(
        "foo",
        new DruidTable(
            new TableDataSource("foo"),
            RowSignature.builder()
                .add("__time", ValueType.LONG)
                .add("dimSequential", ValueType.STRING)
                .add("dimZipf", ValueType.STRING)
                .add("dimUniform", ValueType.STRING)
                .build()));
    final Schema druidSchema = new AbstractSchema() {

        @Override
        protected Map<String, Table> getTableMap() {
            return tableMap;
        }
    };
    plannerFactory = new PlannerFactory(Calcites.createRootSchema(druidSchema), walker, CalciteTests.createOperatorTable(), plannerConfig);
    groupByQuery = GroupByQuery.builder()
        .setDataSource("foo")
        .setInterval(new Interval(JodaUtils.MIN_INSTANT, JodaUtils.MAX_INSTANT))
        .setDimensions(Arrays.<DimensionSpec>asList(
            new DefaultDimensionSpec("dimZipf", "d0"),
            new DefaultDimensionSpec("dimSequential", "d1")))
        .setAggregatorSpecs(Arrays.<AggregatorFactory>asList(new CountAggregatorFactory("c")))
        .setGranularity(Granularities.ALL)
        .build();
    sqlQuery = "SELECT\n" + "  dimZipf AS d0," + "  dimSequential AS d1,\n" + "  COUNT(*) AS c\n" + "FROM druid.foo\n" + "GROUP BY dimZipf, dimSequential";
}
Also used: DruidTable(io.druid.sql.calcite.table.DruidTable), Table(org.apache.calcite.schema.Table), LinearShardSpec(io.druid.timeline.partition.LinearShardSpec), Schema(org.apache.calcite.schema.Schema), AbstractSchema(org.apache.calcite.schema.impl.AbstractSchema), BenchmarkDataGenerator(io.druid.benchmark.datagen.BenchmarkDataGenerator), HyperUniquesSerde(io.druid.query.aggregation.hyperloglog.HyperUniquesSerde), CountAggregatorFactory(io.druid.query.aggregation.CountAggregatorFactory), AggregatorFactory(io.druid.query.aggregation.AggregatorFactory), DefaultDimensionSpec(io.druid.query.dimension.DefaultDimensionSpec), QueryRunnerFactoryConglomerate(io.druid.query.QueryRunnerFactoryConglomerate), SpecificSegmentsQuerySegmentWalker(io.druid.sql.calcite.util.SpecificSegmentsQuerySegmentWalker), TableDataSource(io.druid.query.TableDataSource), QueryableIndex(io.druid.segment.QueryableIndex), BenchmarkSchemaInfo(io.druid.benchmark.datagen.BenchmarkSchemaInfo), PlannerConfig(io.druid.sql.calcite.planner.PlannerConfig), InputRow(io.druid.data.input.InputRow), PlannerFactory(io.druid.sql.calcite.planner.PlannerFactory), File(java.io.File), Interval(org.joda.time.Interval), Setup(org.openjdk.jmh.annotations.Setup)
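What this setup feeds: the benchmark bodies plan and execute sqlQuery through plannerFactory. A hedged sketch of that step, assuming the 0.10-era SQL planner API (createPlanner taking a context map, plan returning a PlannerResult whose run() yields a Sequence of Object[] rows):

import com.google.common.collect.Lists;
import io.druid.java.util.common.guava.Sequences;
import io.druid.sql.calcite.planner.DruidPlanner;
import io.druid.sql.calcite.planner.PlannerFactory;
import io.druid.sql.calcite.planner.PlannerResult;

import java.util.List;

public class SqlRunSketch {
    static List<Object[]> runSql(PlannerFactory plannerFactory, String sqlQuery) throws Exception {
        try (DruidPlanner planner = plannerFactory.createPlanner(null)) {
            PlannerResult plannerResult = planner.plan(sqlQuery);
            // Materialize the result sequence; each Object[] is one row (d0, d1, c).
            return Sequences.toList(plannerResult.run(), Lists.<Object[]>newArrayList());
        }
    }
}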

Example 65 with AggregatorFactory

Use of io.druid.query.aggregation.AggregatorFactory in project druid by druid-io.

The class IncrementalIndexTest, method testgetDimensions.

@Test
public void testgetDimensions() {
    final IncrementalIndex<Aggregator> incrementalIndex = new OnheapIncrementalIndex(
        new IncrementalIndexSchema.Builder()
            .withQueryGranularity(Granularities.NONE)
            .withMetrics(new AggregatorFactory[] { new CountAggregatorFactory("count") })
            .withDimensionsSpec(new DimensionsSpec(
                DimensionsSpec.getDefaultSchemas(Arrays.asList("dim0", "dim1")), null, null))
            .build(),
        true,
        1000000);
    closer.closeLater(incrementalIndex);
    Assert.assertEquals(Arrays.asList("dim0", "dim1"), incrementalIndex.getDimensionNames());
}
Also used: CountAggregatorFactory(io.druid.query.aggregation.CountAggregatorFactory), OnheapIncrementalIndex(io.druid.segment.incremental.OnheapIncrementalIndex), DimensionsSpec(io.druid.data.input.impl.DimensionsSpec), Aggregator(io.druid.query.aggregation.Aggregator), DoubleSumAggregatorFactory(io.druid.query.aggregation.DoubleSumAggregatorFactory), AggregatorFactory(io.druid.query.aggregation.AggregatorFactory), FilteredAggregatorFactory(io.druid.query.aggregation.FilteredAggregatorFactory), LongSumAggregatorFactory(io.druid.query.aggregation.LongSumAggregatorFactory), IncrementalIndexSchema(io.druid.segment.incremental.IncrementalIndexSchema), Test(org.junit.Test)
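To round out the example: the same index accepts rows whose event maps carry those dimensions, and the count metric registered in the schema is maintained on ingest, with identical events rolling up into one row. A minimal sketch with hypothetical row values:

import com.google.common.collect.ImmutableMap;
import io.druid.data.input.MapBasedInputRow;
import io.druid.query.aggregation.Aggregator;
import io.druid.segment.incremental.IncrementalIndex;
import io.druid.segment.incremental.IndexSizeExceededException;

import java.util.Arrays;

public class IngestSketch {
    // 'incrementalIndex' is an index like the one built above: dims dim0/dim1, one count metric.
    static void addOneRow(IncrementalIndex<Aggregator> incrementalIndex) throws IndexSizeExceededException {
        incrementalIndex.add(new MapBasedInputRow(
            System.currentTimeMillis(),
            Arrays.asList("dim0", "dim1"),
            ImmutableMap.<String, Object>of("dim0", "a", "dim1", "b"))); // hypothetical values

        // One aggregated row now exists; a second identical event would roll up
        // into it and bump "count" instead of adding a new row.
        System.out.println(incrementalIndex.size());
    }
}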

Aggregations

AggregatorFactory (io.druid.query.aggregation.AggregatorFactory): 148 uses
Test (org.junit.Test): 86 uses
CountAggregatorFactory (io.druid.query.aggregation.CountAggregatorFactory): 82 uses
LongSumAggregatorFactory (io.druid.query.aggregation.LongSumAggregatorFactory): 64 uses
Interval (org.joda.time.Interval): 45 uses
DoubleSumAggregatorFactory (io.druid.query.aggregation.DoubleSumAggregatorFactory): 38 uses
DateTime (org.joda.time.DateTime): 37 uses
FilteredAggregatorFactory (io.druid.query.aggregation.FilteredAggregatorFactory): 32 uses
Result (io.druid.query.Result): 31 uses
DoubleMaxAggregatorFactory (io.druid.query.aggregation.DoubleMaxAggregatorFactory): 27 uses
HyperUniquesAggregatorFactory (io.druid.query.aggregation.hyperloglog.HyperUniquesAggregatorFactory): 25 uses
Row (io.druid.data.input.Row): 24 uses
PostAggregator (io.druid.query.aggregation.PostAggregator): 24 uses
DefaultDimensionSpec (io.druid.query.dimension.DefaultDimensionSpec): 22 uses
CardinalityAggregatorFactory (io.druid.query.aggregation.cardinality.CardinalityAggregatorFactory): 19 uses
LongMaxAggregatorFactory (io.druid.query.aggregation.LongMaxAggregatorFactory): 18 uses
LongFirstAggregatorFactory (io.druid.query.aggregation.first.LongFirstAggregatorFactory): 18 uses
LongLastAggregatorFactory (io.druid.query.aggregation.last.LongLastAggregatorFactory): 18 uses
DimensionSpec (io.druid.query.dimension.DimensionSpec): 18 uses
TimeseriesQuery (io.druid.query.timeseries.TimeseriesQuery): 17 uses