Examples with BenchmarkDataGenerator - io.druid.benchmark.datagen.BenchmarkDataGenerator

Example 1 with BenchmarkDataGenerator

use of io.druid.benchmark.datagen.BenchmarkDataGenerator in project druid by druid-io.

the class IncrementalIndexReadBenchmark method setup.

@Setup
public void setup() throws IOException {
    log.info("SETUP CALLED AT " + +System.currentTimeMillis());
    if (ComplexMetrics.getSerdeForType("hyperUnique") == null) {
        ComplexMetrics.registerSerde("hyperUnique", new HyperUniquesSerde(HyperLogLogHash.getDefault()));
    }
    schemaInfo = BenchmarkSchemas.SCHEMA_MAP.get(schema);
    BenchmarkDataGenerator gen = new BenchmarkDataGenerator(schemaInfo.getColumnSchemas(), RNG_SEED, schemaInfo.getDataInterval(), rowsPerSegment);
    incIndex = makeIncIndex();
    for (int j = 0; j < rowsPerSegment; j++) {
        InputRow row = gen.nextRow();
        if (j % 10000 == 0) {
            log.info(j + " rows generated.");
        }
        incIndex.add(row);
    }
}

Also used : BenchmarkDataGenerator(io.druid.benchmark.datagen.BenchmarkDataGenerator) InputRow(io.druid.data.input.InputRow) HyperUniquesSerde(io.druid.query.aggregation.hyperloglog.HyperUniquesSerde) Setup(org.openjdk.jmh.annotations.Setup)

Example 2 with BenchmarkDataGenerator

use of io.druid.benchmark.datagen.BenchmarkDataGenerator in project druid by druid-io.

the class IndexPersistBenchmark method setup.

@Setup
public void setup() throws IOException {
    log.info("SETUP CALLED AT " + +System.currentTimeMillis());
    if (ComplexMetrics.getSerdeForType("hyperUnique") == null) {
        ComplexMetrics.registerSerde("hyperUnique", new HyperUniquesSerde(HyperLogLogHash.getDefault()));
    }
    rows = new ArrayList<InputRow>();
    schemaInfo = BenchmarkSchemas.SCHEMA_MAP.get(schema);
    BenchmarkDataGenerator gen = new BenchmarkDataGenerator(schemaInfo.getColumnSchemas(), RNG_SEED, schemaInfo.getDataInterval(), rowsPerSegment);
    for (int i = 0; i < rowsPerSegment; i++) {
        InputRow row = gen.nextRow();
        if (i % 10000 == 0) {
            log.info(i + " rows generated.");
        }
        rows.add(row);
    }
}

Also used : InputRow(io.druid.data.input.InputRow) BenchmarkDataGenerator(io.druid.benchmark.datagen.BenchmarkDataGenerator) HyperUniquesSerde(io.druid.query.aggregation.hyperloglog.HyperUniquesSerde) Setup(org.openjdk.jmh.annotations.Setup)

Example 3 with BenchmarkDataGenerator

use of io.druid.benchmark.datagen.BenchmarkDataGenerator in project druid by druid-io.

the class TopNTypeInterfaceBenchmark method setup.

@Setup
public void setup() throws IOException {
    log.info("SETUP CALLED AT " + System.currentTimeMillis());
    if (ComplexMetrics.getSerdeForType("hyperUnique") == null) {
        ComplexMetrics.registerSerde("hyperUnique", new HyperUniquesSerde(HyperLogLogHash.getDefault()));
    }
    executorService = Execs.multiThreaded(numSegments, "TopNThreadPool");
    setupQueries();
    schemaInfo = BenchmarkSchemas.SCHEMA_MAP.get("basic");
    queryBuilder = SCHEMA_QUERY_MAP.get("basic").get("string");
    queryBuilder.threshold(threshold);
    stringQuery = queryBuilder.build();
    TopNQueryBuilder longBuilder = SCHEMA_QUERY_MAP.get("basic").get("long");
    longBuilder.threshold(threshold);
    longQuery = longBuilder.build();
    TopNQueryBuilder floatBuilder = SCHEMA_QUERY_MAP.get("basic").get("float");
    floatBuilder.threshold(threshold);
    floatQuery = floatBuilder.build();
    incIndexes = new ArrayList<>();
    for (int i = 0; i < numSegments; i++) {
        log.info("Generating rows for segment " + i);
        BenchmarkDataGenerator gen = new BenchmarkDataGenerator(schemaInfo.getColumnSchemas(), RNG_SEED + i, schemaInfo.getDataInterval(), rowsPerSegment);
        IncrementalIndex incIndex = makeIncIndex();
        for (int j = 0; j < rowsPerSegment; j++) {
            InputRow row = gen.nextRow();
            if (j % 10000 == 0) {
                log.info(j + " rows generated.");
            }
            incIndex.add(row);
        }
        incIndexes.add(incIndex);
    }
    File tmpFile = Files.createTempDir();
    log.info("Using temp dir: " + tmpFile.getAbsolutePath());
    tmpFile.deleteOnExit();
    qIndexes = new ArrayList<>();
    for (int i = 0; i < numSegments; i++) {
        File indexFile = INDEX_MERGER_V9.persist(incIndexes.get(i), tmpFile, new IndexSpec());
        QueryableIndex qIndex = INDEX_IO.loadIndex(indexFile);
        qIndexes.add(qIndex);
    }
    factory = new TopNQueryRunnerFactory(new StupidPool<>("TopNBenchmark-compute-bufferPool", new OffheapBufferGenerator("compute", 250000000), 0, Integer.MAX_VALUE), new TopNQueryQueryToolChest(new TopNQueryConfig(), QueryBenchmarkUtil.NoopIntervalChunkingQueryRunnerDecorator()), QueryBenchmarkUtil.NOOP_QUERYWATCHER);
}

Also used : TopNQueryBuilder(io.druid.query.topn.TopNQueryBuilder) IndexSpec(io.druid.segment.IndexSpec) IncrementalIndex(io.druid.segment.incremental.IncrementalIndex) OnheapIncrementalIndex(io.druid.segment.incremental.OnheapIncrementalIndex) BenchmarkDataGenerator(io.druid.benchmark.datagen.BenchmarkDataGenerator) HyperUniquesSerde(io.druid.query.aggregation.hyperloglog.HyperUniquesSerde) OffheapBufferGenerator(io.druid.offheap.OffheapBufferGenerator) TopNQueryConfig(io.druid.query.topn.TopNQueryConfig) QueryableIndex(io.druid.segment.QueryableIndex) InputRow(io.druid.data.input.InputRow) TopNQueryRunnerFactory(io.druid.query.topn.TopNQueryRunnerFactory) StupidPool(io.druid.collections.StupidPool) TopNQueryQueryToolChest(io.druid.query.topn.TopNQueryQueryToolChest) File(java.io.File) Setup(org.openjdk.jmh.annotations.Setup)

Example 4 with BenchmarkDataGenerator

use of io.druid.benchmark.datagen.BenchmarkDataGenerator in project druid by druid-io.

the class SqlBenchmark method setup.

@Setup(Level.Trial)
public void setup() throws Exception {
    tmpDir = Files.createTempDir();
    log.info("Starting benchmark setup using tmpDir[%s], rows[%,d].", tmpDir, rowsPerSegment);
    if (ComplexMetrics.getSerdeForType("hyperUnique") == null) {
        ComplexMetrics.registerSerde("hyperUnique", new HyperUniquesSerde(HyperLogLogHash.getDefault()));
    }
    final BenchmarkSchemaInfo schemaInfo = BenchmarkSchemas.SCHEMA_MAP.get("basic");
    final BenchmarkDataGenerator dataGenerator = new BenchmarkDataGenerator(schemaInfo.getColumnSchemas(), RNG_SEED + 1, schemaInfo.getDataInterval(), rowsPerSegment);
    final List<InputRow> rows = Lists.newArrayList();
    for (int i = 0; i < rowsPerSegment; i++) {
        final InputRow row = dataGenerator.nextRow();
        if (i % 20000 == 0) {
            log.info("%,d/%,d rows generated.", i, rowsPerSegment);
        }
        rows.add(row);
    }
    log.info("%,d/%,d rows generated.", rows.size(), rowsPerSegment);
    final PlannerConfig plannerConfig = new PlannerConfig();
    final QueryRunnerFactoryConglomerate conglomerate = CalciteTests.queryRunnerFactoryConglomerate();
    final QueryableIndex index = IndexBuilder.create().tmpDir(new File(tmpDir, "1")).indexMerger(TestHelper.getTestIndexMergerV9()).rows(rows).buildMMappedIndex();
    this.walker = new SpecificSegmentsQuerySegmentWalker(conglomerate).add(DataSegment.builder().dataSource("foo").interval(index.getDataInterval()).version("1").shardSpec(new LinearShardSpec(0)).build(), index);
    final Map<String, Table> tableMap = ImmutableMap.<String, Table>of("foo", new DruidTable(new TableDataSource("foo"), RowSignature.builder().add("__time", ValueType.LONG).add("dimSequential", ValueType.STRING).add("dimZipf", ValueType.STRING).add("dimUniform", ValueType.STRING).build()));
    final Schema druidSchema = new AbstractSchema() {

        @Override
        protected Map<String, Table> getTableMap() {
            return tableMap;
        }
    };
    plannerFactory = new PlannerFactory(Calcites.createRootSchema(druidSchema), walker, CalciteTests.createOperatorTable(), plannerConfig);
    groupByQuery = GroupByQuery.builder().setDataSource("foo").setInterval(new Interval(JodaUtils.MIN_INSTANT, JodaUtils.MAX_INSTANT)).setDimensions(Arrays.<DimensionSpec>asList(new DefaultDimensionSpec("dimZipf", "d0"), new DefaultDimensionSpec("dimSequential", "d1"))).setAggregatorSpecs(Arrays.<AggregatorFactory>asList(new CountAggregatorFactory("c"))).setGranularity(Granularities.ALL).build();
    sqlQuery = "SELECT\n" + "  dimZipf AS d0," + "  dimSequential AS d1,\n" + "  COUNT(*) AS c\n" + "FROM druid.foo\n" + "GROUP BY dimZipf, dimSequential";
}

Also used : DruidTable(io.druid.sql.calcite.table.DruidTable) Table(org.apache.calcite.schema.Table) LinearShardSpec(io.druid.timeline.partition.LinearShardSpec) Schema(org.apache.calcite.schema.Schema) AbstractSchema(org.apache.calcite.schema.impl.AbstractSchema) BenchmarkDataGenerator(io.druid.benchmark.datagen.BenchmarkDataGenerator) HyperUniquesSerde(io.druid.query.aggregation.hyperloglog.HyperUniquesSerde) DruidTable(io.druid.sql.calcite.table.DruidTable) CountAggregatorFactory(io.druid.query.aggregation.CountAggregatorFactory) AggregatorFactory(io.druid.query.aggregation.AggregatorFactory) DefaultDimensionSpec(io.druid.query.dimension.DefaultDimensionSpec) QueryRunnerFactoryConglomerate(io.druid.query.QueryRunnerFactoryConglomerate) SpecificSegmentsQuerySegmentWalker(io.druid.sql.calcite.util.SpecificSegmentsQuerySegmentWalker) TableDataSource(io.druid.query.TableDataSource) CountAggregatorFactory(io.druid.query.aggregation.CountAggregatorFactory) AbstractSchema(org.apache.calcite.schema.impl.AbstractSchema) QueryableIndex(io.druid.segment.QueryableIndex) BenchmarkSchemaInfo(io.druid.benchmark.datagen.BenchmarkSchemaInfo) PlannerConfig(io.druid.sql.calcite.planner.PlannerConfig) InputRow(io.druid.data.input.InputRow) PlannerFactory(io.druid.sql.calcite.planner.PlannerFactory) File(java.io.File) Interval(org.joda.time.Interval) Setup(org.openjdk.jmh.annotations.Setup)

Example 5 with BenchmarkDataGenerator

use of io.druid.benchmark.datagen.BenchmarkDataGenerator in project druid by druid-io.

the class TimeseriesBenchmark method setup.

@Setup
public void setup() throws IOException {
    log.info("SETUP CALLED AT " + System.currentTimeMillis());
    if (ComplexMetrics.getSerdeForType("hyperUnique") == null) {
        ComplexMetrics.registerSerde("hyperUnique", new HyperUniquesSerde(HyperLogLogHash.getDefault()));
    }
    executorService = Execs.multiThreaded(numSegments, "TimeseriesThreadPool");
    setupQueries();
    String[] schemaQuery = schemaAndQuery.split("\\.");
    String schemaName = schemaQuery[0];
    String queryName = schemaQuery[1];
    schemaInfo = BenchmarkSchemas.SCHEMA_MAP.get(schemaName);
    query = SCHEMA_QUERY_MAP.get(schemaName).get(queryName);
    incIndexes = new ArrayList<>();
    for (int i = 0; i < numSegments; i++) {
        log.info("Generating rows for segment " + i);
        BenchmarkDataGenerator gen = new BenchmarkDataGenerator(schemaInfo.getColumnSchemas(), RNG_SEED + i, schemaInfo.getDataInterval(), rowsPerSegment);
        IncrementalIndex incIndex = makeIncIndex();
        for (int j = 0; j < rowsPerSegment; j++) {
            InputRow row = gen.nextRow();
            if (j % 10000 == 0) {
                log.info(j + " rows generated.");
            }
            incIndex.add(row);
        }
        log.info(rowsPerSegment + " rows generated");
        incIndexes.add(incIndex);
    }
    tmpDir = Files.createTempDir();
    log.info("Using temp dir: " + tmpDir.getAbsolutePath());
    qIndexes = new ArrayList<>();
    for (int i = 0; i < numSegments; i++) {
        File indexFile = INDEX_MERGER_V9.persist(incIndexes.get(i), tmpDir, new IndexSpec());
        QueryableIndex qIndex = INDEX_IO.loadIndex(indexFile);
        qIndexes.add(qIndex);
    }
    factory = new TimeseriesQueryRunnerFactory(new TimeseriesQueryQueryToolChest(QueryBenchmarkUtil.NoopIntervalChunkingQueryRunnerDecorator()), new TimeseriesQueryEngine(), QueryBenchmarkUtil.NOOP_QUERYWATCHER);
}

Also used : IndexSpec(io.druid.segment.IndexSpec) IncrementalIndex(io.druid.segment.incremental.IncrementalIndex) OnheapIncrementalIndex(io.druid.segment.incremental.OnheapIncrementalIndex) BenchmarkDataGenerator(io.druid.benchmark.datagen.BenchmarkDataGenerator) HyperUniquesSerde(io.druid.query.aggregation.hyperloglog.HyperUniquesSerde) TimeseriesQueryQueryToolChest(io.druid.query.timeseries.TimeseriesQueryQueryToolChest) TimeseriesQueryEngine(io.druid.query.timeseries.TimeseriesQueryEngine) TimeseriesQueryRunnerFactory(io.druid.query.timeseries.TimeseriesQueryRunnerFactory) QueryableIndex(io.druid.segment.QueryableIndex) InputRow(io.druid.data.input.InputRow) File(java.io.File) Setup(org.openjdk.jmh.annotations.Setup)

Aggregations

BenchmarkDataGenerator (io.druid.benchmark.datagen.BenchmarkDataGenerator)22 InputRow (io.druid.data.input.InputRow)22 HyperUniquesSerde (io.druid.query.aggregation.hyperloglog.HyperUniquesSerde)14 Setup (org.openjdk.jmh.annotations.Setup)14 IndexSpec (io.druid.segment.IndexSpec)10 File (java.io.File)9 ArrayList (java.util.ArrayList)9 BenchmarkColumnSchema (io.druid.benchmark.datagen.BenchmarkColumnSchema)8 IncrementalIndex (io.druid.segment.incremental.IncrementalIndex)8 OnheapIncrementalIndex (io.druid.segment.incremental.OnheapIncrementalIndex)8 Test (org.junit.Test)8 QueryableIndex (io.druid.segment.QueryableIndex)7 StupidPool (io.druid.collections.StupidPool)4 OffheapBufferGenerator (io.druid.offheap.OffheapBufferGenerator)4 Interval (org.joda.time.Interval)3 ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper)2 SmileFactory (com.fasterxml.jackson.dataformat.smile.SmileFactory)2 BenchmarkSchemaInfo (io.druid.benchmark.datagen.BenchmarkSchemaInfo)2 BlockingPool (io.druid.collections.BlockingPool)2 DefaultObjectMapper (io.druid.jackson.DefaultObjectMapper)2