Examples with DataSchema - org.apache.druid.segment.indexing.DataSchema

Example 16 with DataSchema

use of org.apache.druid.segment.indexing.DataSchema in project druid by druid-io.

the class CompactionTaskParallelRunTest method runIndexTask.

private void runIndexTask(@Nullable PartitionsSpec partitionsSpec, boolean appendToExisting) {
    ParallelIndexIOConfig ioConfig = new ParallelIndexIOConfig(null, new LocalInputSource(inputDir, "druid*"), new CsvInputFormat(Arrays.asList("ts", "dim", "val"), "|", null, false, 0), appendToExisting, null);
    ParallelIndexTuningConfig tuningConfig = newTuningConfig(partitionsSpec, 2, !appendToExisting);
    ParallelIndexSupervisorTask indexTask = new ParallelIndexSupervisorTask(null, null, null, new ParallelIndexIngestionSpec(new DataSchema(DATA_SOURCE, new TimestampSpec("ts", "auto", null), new DimensionsSpec(DimensionsSpec.getDefaultSchemas(Arrays.asList("ts", "dim"))), new AggregatorFactory[] { new LongSumAggregatorFactory("val", "val") }, new UniformGranularitySpec(Granularities.HOUR, Granularities.MINUTE, ImmutableList.of(INTERVAL_TO_INDEX)), null), ioConfig, tuningConfig), null);
    runTask(indexTask);
}

Also used : DataSchema(org.apache.druid.segment.indexing.DataSchema) UniformGranularitySpec(org.apache.druid.segment.indexing.granularity.UniformGranularitySpec) ParallelIndexIOConfig(org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexIOConfig) ParallelIndexSupervisorTask(org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask) TimestampSpec(org.apache.druid.data.input.impl.TimestampSpec) LongSumAggregatorFactory(org.apache.druid.query.aggregation.LongSumAggregatorFactory) ParallelIndexIngestionSpec(org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexIngestionSpec) DimensionsSpec(org.apache.druid.data.input.impl.DimensionsSpec) CsvInputFormat(org.apache.druid.data.input.impl.CsvInputFormat) ParallelIndexTuningConfig(org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexTuningConfig) LocalInputSource(org.apache.druid.data.input.impl.LocalInputSource)

Example 17 with DataSchema

use of org.apache.druid.segment.indexing.DataSchema in project druid by druid-io.

the class IndexIngestionSpecTest method testFirehoseAndInputFormat.

@Test
public void testFirehoseAndInputFormat() {
    expectedException.expect(IllegalArgumentException.class);
    expectedException.expectMessage("Cannot use firehose and inputFormat together.");
    final IndexIngestionSpec spec = new IndexIngestionSpec(new DataSchema("dataSource", new TimestampSpec(null, null, null), DimensionsSpec.EMPTY, new AggregatorFactory[0], new ArbitraryGranularitySpec(Granularities.NONE, null), null), new IndexIOConfig(new NoopFirehoseFactory(), null, new NoopInputFormat(), null, null), null);
}

Also used : DataSchema(org.apache.druid.segment.indexing.DataSchema) IndexIngestionSpec(org.apache.druid.indexing.common.task.IndexTask.IndexIngestionSpec) IndexIOConfig(org.apache.druid.indexing.common.task.IndexTask.IndexIOConfig) TimestampSpec(org.apache.druid.data.input.impl.TimestampSpec) NoopFirehoseFactory(org.apache.druid.data.input.impl.NoopFirehoseFactory) NoopInputFormat(org.apache.druid.data.input.impl.NoopInputFormat) AggregatorFactory(org.apache.druid.query.aggregation.AggregatorFactory) ArbitraryGranularitySpec(org.apache.druid.segment.indexing.granularity.ArbitraryGranularitySpec) Test(org.junit.Test)

Example 18 with DataSchema

use of org.apache.druid.segment.indexing.DataSchema in project druid by druid-io.

the class IndexIngestionSpecTest method testFirehoseAndInputSource.

@Test
public void testFirehoseAndInputSource() {
    expectedException.expect(IllegalArgumentException.class);
    expectedException.expectMessage("At most one of [Property{name='firehose', value=NoopFirehoseFactory{}}, Property{name='inputSource'");
    final IndexIngestionSpec spec = new IndexIngestionSpec(new DataSchema("dataSource", new TimestampSpec(null, null, null), DimensionsSpec.EMPTY, new AggregatorFactory[0], new ArbitraryGranularitySpec(Granularities.NONE, null), null), new IndexIOConfig(new NoopFirehoseFactory(), new NoopInputSource(), null, null, null), null);
}

Also used : DataSchema(org.apache.druid.segment.indexing.DataSchema) IndexIngestionSpec(org.apache.druid.indexing.common.task.IndexTask.IndexIngestionSpec) IndexIOConfig(org.apache.druid.indexing.common.task.IndexTask.IndexIOConfig) NoopInputSource(org.apache.druid.data.input.impl.NoopInputSource) TimestampSpec(org.apache.druid.data.input.impl.TimestampSpec) NoopFirehoseFactory(org.apache.druid.data.input.impl.NoopFirehoseFactory) AggregatorFactory(org.apache.druid.query.aggregation.AggregatorFactory) ArbitraryGranularitySpec(org.apache.druid.segment.indexing.granularity.ArbitraryGranularitySpec) Test(org.junit.Test)

Example 19 with DataSchema

use of org.apache.druid.segment.indexing.DataSchema in project druid by druid-io.

the class TaskSerdeTest method testIndexTaskSerde.

@Test
public void testIndexTaskSerde() throws Exception {
    final IndexTask task = new IndexTask(null, null, new IndexIngestionSpec(new DataSchema("foo", new TimestampSpec(null, null, null), DimensionsSpec.EMPTY, new AggregatorFactory[] { new DoubleSumAggregatorFactory("met", "met") }, new UniformGranularitySpec(Granularities.DAY, null, ImmutableList.of(Intervals.of("2010-01-01/P2D"))), null), new IndexIOConfig(null, new LocalInputSource(new File("lol"), "rofl"), new NoopInputFormat(), true, false), new IndexTuningConfig(null, null, null, 10, null, null, null, 9999, null, null, new DynamicPartitionsSpec(10000, null), indexSpec, null, 3, false, null, null, null, null, null, null, null, null, 1L)), null);
    final String json = jsonMapper.writeValueAsString(task);
    // Just want to run the clock a bit to make sure the task id doesn't change
    Thread.sleep(100);
    final IndexTask task2 = (IndexTask) jsonMapper.readValue(json, Task.class);
    Assert.assertEquals("foo", task.getDataSource());
    Assert.assertEquals(task.getId(), task2.getId());
    Assert.assertEquals(task.getGroupId(), task2.getGroupId());
    Assert.assertEquals(task.getDataSource(), task2.getDataSource());
    IndexTask.IndexIOConfig taskIoConfig = task.getIngestionSchema().getIOConfig();
    IndexTask.IndexIOConfig task2IoConfig = task2.getIngestionSchema().getIOConfig();
    Assert.assertTrue(taskIoConfig.getInputSource() instanceof LocalInputSource);
    Assert.assertTrue(task2IoConfig.getInputSource() instanceof LocalInputSource);
    Assert.assertEquals(taskIoConfig.isAppendToExisting(), task2IoConfig.isAppendToExisting());
    IndexTask.IndexTuningConfig taskTuningConfig = task.getIngestionSchema().getTuningConfig();
    IndexTask.IndexTuningConfig task2TuningConfig = task2.getIngestionSchema().getTuningConfig();
    Assert.assertEquals(taskTuningConfig.getBasePersistDirectory(), task2TuningConfig.getBasePersistDirectory());
    Assert.assertEquals(taskTuningConfig.getIndexSpec(), task2TuningConfig.getIndexSpec());
    Assert.assertEquals(taskTuningConfig.getIntermediatePersistPeriod(), task2TuningConfig.getIntermediatePersistPeriod());
    Assert.assertEquals(taskTuningConfig.getMaxPendingPersists(), task2TuningConfig.getMaxPendingPersists());
    Assert.assertEquals(taskTuningConfig.getMaxRowsInMemory(), task2TuningConfig.getMaxRowsInMemory());
    Assert.assertEquals(taskTuningConfig.getNumShards(), task2TuningConfig.getNumShards());
    Assert.assertEquals(taskTuningConfig.getMaxRowsPerSegment(), task2TuningConfig.getMaxRowsPerSegment());
    Assert.assertEquals(taskTuningConfig.isReportParseExceptions(), task2TuningConfig.isReportParseExceptions());
    Assert.assertEquals(taskTuningConfig.getAwaitSegmentAvailabilityTimeoutMillis(), task2TuningConfig.getAwaitSegmentAvailabilityTimeoutMillis());
}

Also used : IndexIOConfig(org.apache.druid.indexing.common.task.IndexTask.IndexIOConfig) DoubleSumAggregatorFactory(org.apache.druid.query.aggregation.DoubleSumAggregatorFactory) LocalInputSource(org.apache.druid.data.input.impl.LocalInputSource) DataSchema(org.apache.druid.segment.indexing.DataSchema) IndexIngestionSpec(org.apache.druid.indexing.common.task.IndexTask.IndexIngestionSpec) UniformGranularitySpec(org.apache.druid.segment.indexing.granularity.UniformGranularitySpec) DynamicPartitionsSpec(org.apache.druid.indexer.partitions.DynamicPartitionsSpec) TimestampSpec(org.apache.druid.data.input.impl.TimestampSpec) IndexTuningConfig(org.apache.druid.indexing.common.task.IndexTask.IndexTuningConfig) NoopInputFormat(org.apache.druid.data.input.impl.NoopInputFormat) IndexIOConfig(org.apache.druid.indexing.common.task.IndexTask.IndexIOConfig) File(java.io.File) ParallelIndexTuningConfig(org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexTuningConfig) IndexTuningConfig(org.apache.druid.indexing.common.task.IndexTask.IndexTuningConfig) Test(org.junit.Test)

Example 20 with DataSchema

use of org.apache.druid.segment.indexing.DataSchema in project druid by druid-io.

the class TaskSerdeTest method testRealtimeIndexTaskSerde.

@Test
public void testRealtimeIndexTaskSerde() throws Exception {
    final RealtimeIndexTask task = new RealtimeIndexTask(null, new TaskResource("rofl", 2), new FireDepartment(new DataSchema("foo", null, new AggregatorFactory[0], new UniformGranularitySpec(Granularities.HOUR, Granularities.NONE, null), null, jsonMapper), new RealtimeIOConfig(new LocalFirehoseFactory(new File("lol"), "rofl", null), (schema, config, metrics) -> null), new RealtimeTuningConfig(null, 1, 10L, null, new Period("PT10M"), null, null, null, null, 1, NoneShardSpec.instance(), indexSpec, null, 0, 0, true, null, null, null, null)), null);
    final String json = jsonMapper.writeValueAsString(task);
    // Just want to run the clock a bit to make sure the task id doesn't change
    Thread.sleep(100);
    final RealtimeIndexTask task2 = (RealtimeIndexTask) jsonMapper.readValue(json, Task.class);
    Assert.assertEquals("foo", task.getDataSource());
    Assert.assertEquals(2, task.getTaskResource().getRequiredCapacity());
    Assert.assertEquals("rofl", task.getTaskResource().getAvailabilityGroup());
    Assert.assertEquals(new Period("PT10M"), task.getRealtimeIngestionSchema().getTuningConfig().getWindowPeriod());
    Assert.assertEquals(Granularities.HOUR, task.getRealtimeIngestionSchema().getDataSchema().getGranularitySpec().getSegmentGranularity());
    Assert.assertTrue(task.getRealtimeIngestionSchema().getTuningConfig().isReportParseExceptions());
    Assert.assertEquals(task.getId(), task2.getId());
    Assert.assertEquals(task.getGroupId(), task2.getGroupId());
    Assert.assertEquals(task.getDataSource(), task2.getDataSource());
    Assert.assertEquals(task.getTaskResource().getRequiredCapacity(), task2.getTaskResource().getRequiredCapacity());
    Assert.assertEquals(task.getTaskResource().getAvailabilityGroup(), task2.getTaskResource().getAvailabilityGroup());
    Assert.assertEquals(task.getRealtimeIngestionSchema().getTuningConfig().getWindowPeriod(), task2.getRealtimeIngestionSchema().getTuningConfig().getWindowPeriod());
    Assert.assertEquals(task.getRealtimeIngestionSchema().getTuningConfig().getMaxBytesInMemory(), task2.getRealtimeIngestionSchema().getTuningConfig().getMaxBytesInMemory());
    Assert.assertEquals(task.getRealtimeIngestionSchema().getDataSchema().getGranularitySpec().getSegmentGranularity(), task2.getRealtimeIngestionSchema().getDataSchema().getGranularitySpec().getSegmentGranularity());
}

Also used : DataSchema(org.apache.druid.segment.indexing.DataSchema) FireDepartment(org.apache.druid.segment.realtime.FireDepartment) UniformGranularitySpec(org.apache.druid.segment.indexing.granularity.UniformGranularitySpec) RealtimeIOConfig(org.apache.druid.segment.indexing.RealtimeIOConfig) Period(org.joda.time.Period) LocalFirehoseFactory(org.apache.druid.segment.realtime.firehose.LocalFirehoseFactory) RealtimeTuningConfig(org.apache.druid.segment.indexing.RealtimeTuningConfig) File(java.io.File) Test(org.junit.Test)

Aggregations

DataSchema (org.apache.druid.segment.indexing.DataSchema)80 UniformGranularitySpec (org.apache.druid.segment.indexing.granularity.UniformGranularitySpec)49 TimestampSpec (org.apache.druid.data.input.impl.TimestampSpec)45 Test (org.junit.Test)44 DimensionsSpec (org.apache.druid.data.input.impl.DimensionsSpec)32 AggregatorFactory (org.apache.druid.query.aggregation.AggregatorFactory)25 LongSumAggregatorFactory (org.apache.druid.query.aggregation.LongSumAggregatorFactory)22 GranularitySpec (org.apache.druid.segment.indexing.granularity.GranularitySpec)19 InputSource (org.apache.druid.data.input.InputSource)17 InitializedNullHandlingTest (org.apache.druid.testing.InitializedNullHandlingTest)17 File (java.io.File)16 Map (java.util.Map)15 InputFormat (org.apache.druid.data.input.InputFormat)15 CountAggregatorFactory (org.apache.druid.query.aggregation.CountAggregatorFactory)15 SamplerResponse (org.apache.druid.client.indexing.SamplerResponse)14 SamplerResponseRow (org.apache.druid.client.indexing.SamplerResponse.SamplerResponseRow)13 CsvInputFormat (org.apache.druid.data.input.impl.CsvInputFormat)13 Interval (org.joda.time.Interval)13 ArrayList (java.util.ArrayList)12 JsonInputFormat (org.apache.druid.data.input.impl.JsonInputFormat)12