Example 46 with InputRow

Use of io.druid.data.input.InputRow in project druid by druid-io.

In class ReplayableFirehoseFactoryTest, the method testReplayableFirehoseWithEvents:

@Test
public void testReplayableFirehoseWithEvents() throws Exception {
    final boolean[] hasMore = { true };
    expect(delegateFactory.connect(parser)).andReturn(delegateFirehose);
    expect(delegateFirehose.hasMore()).andAnswer(new IAnswer<Boolean>() {

        @Override
        public Boolean answer() throws Throwable {
            return hasMore[0];
        }
    }).anyTimes();
    expect(delegateFirehose.nextRow()).andReturn(testRows.get(0)).andReturn(testRows.get(1)).andAnswer(new IAnswer<InputRow>() {

        @Override
        public InputRow answer() throws Throwable {
            hasMore[0] = false;
            return testRows.get(2);
        }
    });
    delegateFirehose.close();
    replayAll();
    List<InputRow> rows = Lists.newArrayList();
    try (Firehose firehose = replayableFirehoseFactory.connect(parser)) {
        while (firehose.hasMore()) {
            rows.add(firehose.nextRow());
        }
    }
    Assert.assertEquals(testRows, rows);
    // now replay!
    rows.clear();
    try (Firehose firehose = replayableFirehoseFactory.connect(parser)) {
        while (firehose.hasMore()) {
            rows.add(firehose.nextRow());
        }
    }
    Assert.assertEquals(testRows, rows);
    verifyAll();
}
Also used: IAnswer (org.easymock.IAnswer), Firehose (io.druid.data.input.Firehose), MapBasedInputRow (io.druid.data.input.MapBasedInputRow), InputRow (io.druid.data.input.InputRow), Test (org.junit.Test)
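
The pattern above relies only on the Firehose contract: hasMore(), nextRow(), commit(), and close(). For orientation, here is a minimal in-memory sketch of that contract. The ListFirehose class is hypothetical and the interface methods are assumed from this era of Druid's io.druid packages; this is a sketch, not the implementation under test.

import io.druid.data.input.Firehose;
import io.druid.data.input.InputRow;

import java.io.IOException;
import java.util.Iterator;
import java.util.List;

// Hypothetical in-memory Firehose, shown only to illustrate the contract exercised above.
class ListFirehose implements Firehose {

    private final Iterator<InputRow> it;

    ListFirehose(List<InputRow> rows) {
        this.it = rows.iterator();
    }

    @Override
    public boolean hasMore() {
        return it.hasNext();
    }

    @Override
    public InputRow nextRow() {
        return it.next();
    }

    @Override
    public Runnable commit() {
        // Nothing to checkpoint for an in-memory source, so the commit runnable is a no-op.
        return new Runnable() {
            @Override
            public void run() {
            }
        };
    }

    @Override
    public void close() throws IOException {
        // No underlying resource to release.
    }
}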

Example 47 with InputRow

Use of io.druid.data.input.InputRow in project druid by druid-io.

In class ReplayableFirehoseFactoryTest, the method testReplayableFirehoseWithMultipleFiles:

@Test
public void testReplayableFirehoseWithMultipleFiles() throws Exception {
    replayableFirehoseFactory = new ReplayableFirehoseFactory(delegateFactory, false, 1, 3, mapper);
    final boolean[] hasMore = { true };
    final int multiplicationFactor = 500;
    final InputRow finalRow = new MapBasedInputRow(DateTime.now(), Lists.newArrayList("dim4", "dim5"), ImmutableMap.<String, Object>of("dim4", "val12", "dim5", "val20", "met1", 30));
    expect(delegateFactory.connect(parser)).andReturn(delegateFirehose);
    expect(delegateFirehose.hasMore()).andAnswer(new IAnswer<Boolean>() {

        @Override
        public Boolean answer() throws Throwable {
            return hasMore[0];
        }
    }).anyTimes();
    expect(delegateFirehose.nextRow()).andReturn(testRows.get(0)).times(multiplicationFactor).andReturn(testRows.get(1)).times(multiplicationFactor).andReturn(testRows.get(2)).times(multiplicationFactor).andAnswer(new IAnswer<InputRow>() {

        @Override
        public InputRow answer() throws Throwable {
            hasMore[0] = false;
            return finalRow;
        }
    });
    delegateFirehose.close();
    replayAll();
    List<InputRow> testRowsMultiplied = Lists.newArrayList();
    for (InputRow row : testRows) {
        for (int i = 0; i < multiplicationFactor; i++) {
            testRowsMultiplied.add(row);
        }
    }
    testRowsMultiplied.add(finalRow);
    List<InputRow> rows = Lists.newArrayList();
    try (Firehose firehose = replayableFirehoseFactory.connect(parser)) {
        while (firehose.hasMore()) {
            rows.add(firehose.nextRow());
        }
    }
    Assert.assertEquals(testRowsMultiplied, rows);
    // now replay!
    rows.clear();
    try (Firehose firehose = replayableFirehoseFactory.connect(parser)) {
        while (firehose.hasMore()) {
            rows.add(firehose.nextRow());
        }
    }
    Assert.assertEquals(testRowsMultiplied, rows);
    verifyAll();
}
Also used: IAnswer (org.easymock.IAnswer), Firehose (io.druid.data.input.Firehose), MapBasedInputRow (io.druid.data.input.MapBasedInputRow), InputRow (io.druid.data.input.InputRow), ReplayableFirehoseFactory (io.druid.segment.realtime.firehose.ReplayableFirehoseFactory), Test (org.junit.Test)
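
Here the factory is constructed with a very small temp-file size (the third constructor argument, 1), which appears to force the buffered rows to spill across multiple temp files; hence the test name. As a side note, the nested loop that builds testRowsMultiplied can be written more compactly with Collections.nCopies; a minimal sketch, assuming the testRows, finalRow, and multiplicationFactor defined above:

import java.util.Collections;
import java.util.List;

import com.google.common.collect.Lists;

import io.druid.data.input.InputRow;

// Equivalent construction of the expected row list: each test row repeated
// multiplicationFactor times, followed by the final row.
List<InputRow> testRowsMultiplied = Lists.newArrayList();
for (InputRow row : testRows) {
    testRowsMultiplied.addAll(Collections.nCopies(multiplicationFactor, row));
}
testRowsMultiplied.add(finalRow);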

Example 48 with InputRow

Use of io.druid.data.input.InputRow in project druid by druid-io.

In class ReplayableFirehoseFactoryTest, the method testReplayableFirehoseWithoutReportParseExceptions:

@Test
public void testReplayableFirehoseWithoutReportParseExceptions() throws Exception {
    final boolean[] hasMore = { true };
    replayableFirehoseFactory = new ReplayableFirehoseFactory(delegateFactory, false, 10000, 3, mapper);
    expect(delegateFactory.connect(parser)).andReturn(delegateFirehose);
    expect(delegateFirehose.hasMore()).andAnswer(new IAnswer<Boolean>() {

        @Override
        public Boolean answer() throws Throwable {
            return hasMore[0];
        }
    }).anyTimes();
    expect(delegateFirehose.nextRow()).andReturn(testRows.get(0)).andReturn(testRows.get(1)).andThrow(new ParseException("unparseable!")).andAnswer(new IAnswer<InputRow>() {

        @Override
        public InputRow answer() throws Throwable {
            hasMore[0] = false;
            return testRows.get(2);
        }
    });
    delegateFirehose.close();
    replayAll();
    List<InputRow> rows = Lists.newArrayList();
    try (Firehose firehose = replayableFirehoseFactory.connect(parser)) {
        while (firehose.hasMore()) {
            rows.add(firehose.nextRow());
        }
    }
    Assert.assertEquals(testRows, rows);
    verifyAll();
}
Also used: IAnswer (org.easymock.IAnswer), Firehose (io.druid.data.input.Firehose), MapBasedInputRow (io.druid.data.input.MapBasedInputRow), InputRow (io.druid.data.input.InputRow), ReplayableFirehoseFactory (io.druid.segment.realtime.firehose.ReplayableFirehoseFactory), ParseException (io.druid.java.util.common.parsers.ParseException), Test (org.junit.Test)
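
Because the factory is built with reportParseExceptions set to false (the second constructor argument), the ParseException thrown by the delegate is swallowed and the bad row is skipped, which is why the drained output still equals testRows. A caller reading a firehose directly, without that shielding, could skip unparseable rows itself; a minimal sketch, with drainSkippingUnparseable as a hypothetical helper name:

import java.util.ArrayList;
import java.util.List;

import io.druid.data.input.Firehose;
import io.druid.data.input.InputRow;
import io.druid.java.util.common.parsers.ParseException;

// Hypothetical read loop that tolerates unparseable rows instead of propagating the failure.
static List<InputRow> drainSkippingUnparseable(Firehose firehose) {
    List<InputRow> rows = new ArrayList<>();
    int unparseable = 0;
    while (firehose.hasMore()) {
        try {
            rows.add(firehose.nextRow());
        } catch (ParseException e) {
            // A real caller would log or report these; here we only keep a tally.
            unparseable++;
        }
    }
    return rows;
}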

Example 49 with InputRow

Use of io.druid.data.input.InputRow in project druid by druid-io.

In class RealtimePlumberSchoolTest, the method testPersist:

private void testPersist(final Object commitMetadata) throws Exception {
    plumber.getSinks().put(0L, new Sink(new Interval(0, TimeUnit.HOURS.toMillis(1)), schema, tuningConfig.getShardSpec(), new DateTime("2014-12-01T12:34:56.789").toString(), tuningConfig.getMaxRowsInMemory(), tuningConfig.isReportParseExceptions()));
    Assert.assertNull(plumber.startJob());
    final InputRow row = EasyMock.createNiceMock(InputRow.class);
    EasyMock.expect(row.getTimestampFromEpoch()).andReturn(0L);
    EasyMock.expect(row.getDimensions()).andReturn(new ArrayList<String>());
    EasyMock.replay(row);
    final CountDownLatch doneSignal = new CountDownLatch(1);
    final Committer committer = new Committer() {

        @Override
        public Object getMetadata() {
            return commitMetadata;
        }

        @Override
        public void run() {
            doneSignal.countDown();
        }
    };
    plumber.add(row, Suppliers.ofInstance(committer));
    plumber.persist(committer);
    doneSignal.await();
    plumber.getSinks().clear();
    plumber.finishJob();
}
Also used: InputRow (io.druid.data.input.InputRow), Committer (io.druid.data.input.Committer), CountDownLatch (java.util.concurrent.CountDownLatch), DateTime (org.joda.time.DateTime), Interval (org.joda.time.Interval)
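
A Committer pairs opaque commit metadata (getMetadata()) with a Runnable that is invoked once the persist completes, which is why the CountDownLatch above can block until the data is safely written. The anonymous class can be wrapped in a small helper; committerOf is a hypothetical name for illustration, not a Druid API:

import io.druid.data.input.Committer;

// Hypothetical convenience factory: builds a Committer from metadata plus a completion callback.
static Committer committerOf(final Object metadata, final Runnable onPersisted) {
    return new Committer() {

        @Override
        public Object getMetadata() {
            return metadata;
        }

        @Override
        public void run() {
            onPersisted.run();
        }
    };
}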

Example 50 with InputRow

Use of io.druid.data.input.InputRow in project druid by druid-io.

In class GroupByTypeInterfaceBenchmark, the method setup:

@Setup(Level.Trial)
public void setup() throws IOException {
    log.info("SETUP CALLED AT %d", System.currentTimeMillis());
    if (ComplexMetrics.getSerdeForType("hyperUnique") == null) {
        ComplexMetrics.registerSerde("hyperUnique", new HyperUniquesSerde(HyperLogLogHash.getDefault()));
    }
    executorService = Execs.multiThreaded(numProcessingThreads, "GroupByThreadPool[%d]");
    setupQueries();
    String schemaName = "basic";
    schemaInfo = BenchmarkSchemas.SCHEMA_MAP.get(schemaName);
    stringQuery = SCHEMA_QUERY_MAP.get(schemaName).get("string");
    longFloatQuery = SCHEMA_QUERY_MAP.get(schemaName).get("longFloat");
    longQuery = SCHEMA_QUERY_MAP.get(schemaName).get("long");
    floatQuery = SCHEMA_QUERY_MAP.get(schemaName).get("float");
    final BenchmarkDataGenerator dataGenerator = new BenchmarkDataGenerator(schemaInfo.getColumnSchemas(), RNG_SEED + 1, schemaInfo.getDataInterval(), rowsPerSegment);
    tmpDir = Files.createTempDir();
    log.info("Using temp dir: %s", tmpDir.getAbsolutePath());
    // queryableIndexes   -> numSegments worth of on-disk segments
    // anIncrementalIndex -> the last incremental index
    anIncrementalIndex = null;
    queryableIndexes = new ArrayList<>(numSegments);
    for (int i = 0; i < numSegments; i++) {
        log.info("Generating rows for segment %d/%d", i + 1, numSegments);
        final IncrementalIndex index = makeIncIndex();
        for (int j = 0; j < rowsPerSegment; j++) {
            final InputRow row = dataGenerator.nextRow();
            if (j % 20000 == 0) {
                log.info("%,d/%,d rows generated.", i * rowsPerSegment + j, rowsPerSegment * numSegments);
            }
            index.add(row);
        }
        log.info("%,d/%,d rows generated, persisting segment %d/%d.", (i + 1) * rowsPerSegment, rowsPerSegment * numSegments, i + 1, numSegments);
        final File file = INDEX_MERGER_V9.persist(index, new File(tmpDir, String.valueOf(i)), new IndexSpec());
        queryableIndexes.add(INDEX_IO.loadIndex(file));
        if (i == numSegments - 1) {
            anIncrementalIndex = index;
        } else {
            index.close();
        }
    }
    StupidPool<ByteBuffer> bufferPool = new StupidPool<>("GroupByBenchmark-computeBufferPool", new OffheapBufferGenerator("compute", 250_000_000), 0, Integer.MAX_VALUE);
    // limit of 2 is required since we simulate both historical merge and broker merge in the same process
    BlockingPool<ByteBuffer> mergePool = new BlockingPool<>(new OffheapBufferGenerator("merge", 250_000_000), 2);
    final GroupByQueryConfig config = new GroupByQueryConfig() {

        @Override
        public String getDefaultStrategy() {
            return defaultStrategy;
        }

        @Override
        public int getBufferGrouperInitialBuckets() {
            return initialBuckets;
        }

        @Override
        public long getMaxOnDiskStorage() {
            return 1_000_000_000L;
        }
    };
    config.setSingleThreaded(false);
    config.setMaxIntermediateRows(Integer.MAX_VALUE);
    config.setMaxResults(Integer.MAX_VALUE);
    DruidProcessingConfig druidProcessingConfig = new DruidProcessingConfig() {

        @Override
        public int getNumThreads() {
            // Used by "v2" strategy for concurrencyHint
            return numProcessingThreads;
        }

        @Override
        public String getFormatString() {
            return null;
        }
    };
    final Supplier<GroupByQueryConfig> configSupplier = Suppliers.ofInstance(config);
    final GroupByStrategySelector strategySelector = new GroupByStrategySelector(configSupplier, new GroupByStrategyV1(configSupplier, new GroupByQueryEngine(configSupplier, bufferPool), QueryBenchmarkUtil.NOOP_QUERYWATCHER, bufferPool), new GroupByStrategyV2(druidProcessingConfig, configSupplier, bufferPool, mergePool, new ObjectMapper(new SmileFactory()), QueryBenchmarkUtil.NOOP_QUERYWATCHER));
    factory = new GroupByQueryRunnerFactory(strategySelector, new GroupByQueryQueryToolChest(strategySelector, QueryBenchmarkUtil.NoopIntervalChunkingQueryRunnerDecorator()));
}
Also used: GroupByStrategySelector (io.druid.query.groupby.strategy.GroupByStrategySelector), IndexSpec (io.druid.segment.IndexSpec), BenchmarkDataGenerator (io.druid.benchmark.datagen.BenchmarkDataGenerator), HyperUniquesSerde (io.druid.query.aggregation.hyperloglog.HyperUniquesSerde), GroupByQueryQueryToolChest (io.druid.query.groupby.GroupByQueryQueryToolChest), GroupByStrategyV1 (io.druid.query.groupby.strategy.GroupByStrategyV1), GroupByStrategyV2 (io.druid.query.groupby.strategy.GroupByStrategyV2), GroupByQueryEngine (io.druid.query.groupby.GroupByQueryEngine), DefaultObjectMapper (io.druid.jackson.DefaultObjectMapper), ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper), GroupByQueryRunnerFactory (io.druid.query.groupby.GroupByQueryRunnerFactory), IncrementalIndex (io.druid.segment.incremental.IncrementalIndex), OnheapIncrementalIndex (io.druid.segment.incremental.OnheapIncrementalIndex), GroupByQueryConfig (io.druid.query.groupby.GroupByQueryConfig), ByteBuffer (java.nio.ByteBuffer), SmileFactory (com.fasterxml.jackson.dataformat.smile.SmileFactory), OffheapBufferGenerator (io.druid.offheap.OffheapBufferGenerator), InputRow (io.druid.data.input.InputRow), BlockingPool (io.druid.collections.BlockingPool), StupidPool (io.druid.collections.StupidPool), DruidProcessingConfig (io.druid.query.DruidProcessingConfig), File (java.io.File), Setup (org.openjdk.jmh.annotations.Setup)
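
Every row the generator hands to index.add() is an InputRow; for orientation, this is the same MapBasedInputRow(timestamp, dimensions, event) shape used in Example 47. The dimension names and values below are hypothetical placeholders, not what BenchmarkDataGenerator actually emits:

import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Lists;

import io.druid.data.input.InputRow;
import io.druid.data.input.MapBasedInputRow;

import org.joda.time.DateTime;

// A hand-built row with the MapBasedInputRow(timestamp, dimensions, event) shape.
InputRow row = new MapBasedInputRow(
        DateTime.now(),
        Lists.newArrayList("dimA", "dimB"),
        ImmutableMap.<String, Object>of("dimA", "x", "dimB", "y", "met1", 1L));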

Aggregations

InputRow (io.druid.data.input.InputRow): 81 uses
Test (org.junit.Test): 35 uses
MapBasedInputRow (io.druid.data.input.MapBasedInputRow): 24 uses
BenchmarkDataGenerator (io.druid.benchmark.datagen.BenchmarkDataGenerator): 22 uses
File (java.io.File): 18 uses
Setup (org.openjdk.jmh.annotations.Setup): 15 uses
HyperUniquesSerde (io.druid.query.aggregation.hyperloglog.HyperUniquesSerde): 14 uses
Firehose (io.druid.data.input.Firehose): 12 uses
OnheapIncrementalIndex (io.druid.segment.incremental.OnheapIncrementalIndex): 12 uses
IndexSpec (io.druid.segment.IndexSpec): 11 uses
ArrayList (java.util.ArrayList): 11 uses
IncrementalIndex (io.druid.segment.incremental.IncrementalIndex): 10 uses
DateTime (org.joda.time.DateTime): 10 uses
QueryableIndex (io.druid.segment.QueryableIndex): 9 uses
IOException (java.io.IOException): 9 uses
BenchmarkColumnSchema (io.druid.benchmark.datagen.BenchmarkColumnSchema): 8 uses
Interval (org.joda.time.Interval): 8 uses
ParseException (io.druid.java.util.common.parsers.ParseException): 7 uses
AggregatorFactory (io.druid.query.aggregation.AggregatorFactory): 6 uses
DataSegment (io.druid.timeline.DataSegment): 5 uses