Example 1 with JsonInputFormat

Use of org.apache.druid.data.input.impl.JsonInputFormat in project druid by druid-io.

From class S3InputSourceTest, method testCreateSplitsWithSplitHintSpecRespectingHint.

@Test
public void testCreateSplitsWithSplitHintSpecRespectingHint() {
    EasyMock.reset(S3_CLIENT);
    expectListObjects(PREFIXES.get(0), ImmutableList.of(EXPECTED_URIS.get(0)), CONTENT);
    expectListObjects(PREFIXES.get(1), ImmutableList.of(EXPECTED_URIS.get(1)), CONTENT);
    EasyMock.replay(S3_CLIENT);
    S3InputSource inputSource = new S3InputSource(SERVICE, SERVER_SIDE_ENCRYPTING_AMAZON_S3_BUILDER, INPUT_DATA_CONFIG, null, PREFIXES, null, null);
    Stream<InputSplit<List<CloudObjectLocation>>> splits = inputSource.createSplits(new JsonInputFormat(JSONPathSpec.DEFAULT, null, null), new MaxSizeSplitHintSpec(new HumanReadableBytes(CONTENT.length * 3L), null));
    Assert.assertEquals(ImmutableList.of(EXPECTED_URIS.stream().map(CloudObjectLocation::new).collect(Collectors.toList())), splits.map(InputSplit::get).collect(Collectors.toList()));
    EasyMock.verify(S3_CLIENT);
}
Also used : JsonInputFormat(org.apache.druid.data.input.impl.JsonInputFormat) CloudObjectLocation(org.apache.druid.data.input.impl.CloudObjectLocation) HumanReadableBytes(org.apache.druid.java.util.common.HumanReadableBytes) InputSplit(org.apache.druid.data.input.InputSplit) MaxSizeSplitHintSpec(org.apache.druid.data.input.MaxSizeSplitHintSpec) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)

Example 2 with JsonInputFormat

Use of org.apache.druid.data.input.impl.JsonInputFormat in project druid by druid-io.

From class S3InputSourceTest, method testWithPrefixesSplit.

@Test
public void testWithPrefixesSplit() {
    EasyMock.reset(S3_CLIENT);
    expectListObjects(PREFIXES.get(0), ImmutableList.of(EXPECTED_URIS.get(0)), CONTENT);
    expectListObjects(PREFIXES.get(1), ImmutableList.of(EXPECTED_URIS.get(1)), CONTENT);
    EasyMock.replay(S3_CLIENT);
    S3InputSource inputSource = new S3InputSource(SERVICE, SERVER_SIDE_ENCRYPTING_AMAZON_S3_BUILDER, INPUT_DATA_CONFIG, null, PREFIXES, null, null);
    Stream<InputSplit<List<CloudObjectLocation>>> splits = inputSource.createSplits(new JsonInputFormat(JSONPathSpec.DEFAULT, null, null), new MaxSizeSplitHintSpec(null, 1));
    Assert.assertEquals(EXPECTED_COORDS, splits.map(InputSplit::get).collect(Collectors.toList()));
    EasyMock.verify(S3_CLIENT);
}
Also used : JsonInputFormat(org.apache.druid.data.input.impl.JsonInputFormat) CloudObjectLocation(org.apache.druid.data.input.impl.CloudObjectLocation) InputSplit(org.apache.druid.data.input.InputSplit) MaxSizeSplitHintSpec(org.apache.druid.data.input.MaxSizeSplitHintSpec) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)
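
Examples 1 and 2 appear to exercise the two limits passed to MaxSizeSplitHintSpec: a byte budget of three times the object size keeps both listed objects in one split, while a file limit of 1 should yield one split per object (EXPECTED_COORDS is not shown in the excerpt, so that part is an assumption). The following is a minimal, self-contained sketch of that grouping idea, not Druid's implementation. The names SplitHintSketch, SizedObject, and group are hypothetical and introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitHintSketch {
    /** Hypothetical stand-in for a listed object: a name plus its size in bytes. */
    static final class SizedObject {
        final String name;
        final long sizeBytes;

        SizedObject(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Greedily packs objects, in listing order, into splits that stay within
     * maxBytes and hold at most maxFiles entries. An object larger than
     * maxBytes still gets a split of its own.
     */
    static List<List<String>> group(List<SizedObject> objects, long maxBytes, int maxFiles) {
        List<List<String>> splits = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;
        for (SizedObject object : objects) {
            boolean full = !current.isEmpty()
                && (currentBytes + object.sizeBytes > maxBytes || current.size() >= maxFiles);
            if (full) {
                // Close the current split and start a new one.
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(object.name);
            currentBytes += object.sizeBytes;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<SizedObject> objects = List.of(
            new SizedObject("a.json", 10),
            new SizedObject("b.json", 10));
        // Byte budget of 3x one object, no file limit: both objects share a split.
        System.out.println(group(objects, 30, Integer.MAX_VALUE)); // [[a.json, b.json]]
        // No byte budget, at most one file per split: one split per object.
        System.out.println(group(objects, Long.MAX_VALUE, 1)); // [[a.json], [b.json]]
    }
}
```

The sketch keeps listing order and never splits a single object, which matches what the two assertions above check. How Druid itself orders or packs objects may differ.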

Example 3 with JsonInputFormat

Use of org.apache.druid.data.input.impl.JsonInputFormat in project druid by druid-io.

From class SettableByteEntityReader, method setEntity.

void setEntity(ByteEntity entity) {
    InputFormat format = (inputFormat instanceof JsonInputFormat) ? ((JsonInputFormat) inputFormat).withLineSplittable(false) : inputFormat;
    // This should be fine as long as initializing a reader is cheap, which it is for now.
    this.delegate = new TransformingInputEntityReader(format.createReader(inputRowSchema, entity, indexingTmpDir), transformer);
}
Also used : TransformingInputEntityReader(org.apache.druid.segment.transform.TransformingInputEntityReader) JsonInputFormat(org.apache.druid.data.input.impl.JsonInputFormat) InputFormat(org.apache.druid.data.input.InputFormat)
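
The setEntity method above swaps a JSON format for a variant with line splitting turned off and passes every other format through unchanged. The sketch below illustrates that "adapt one known subtype, pass the rest through" pattern with hypothetical stand-in types (Format, JsonFormat, OtherFormat, forSingleEntity), none of which are Druid classes. It assumes that withLineSplittable returns a modified copy rather than mutating the receiver.

```java
public class LineSplittableSketch {
    /** Hypothetical stand-in for an input format. */
    interface Format {
        boolean isLineSplittable();
    }

    /** Hypothetical JSON-like format whose line-splitting behaviour is configurable. */
    static final class JsonFormat implements Format {
        private final boolean lineSplittable;

        JsonFormat(boolean lineSplittable) {
            this.lineSplittable = lineSplittable;
        }

        @Override
        public boolean isLineSplittable() {
            return lineSplittable;
        }

        /** Returns a copy with the flag changed; the receiver is left untouched (assumption). */
        JsonFormat withLineSplittable(boolean newValue) {
            return new JsonFormat(newValue);
        }
    }

    /** Hypothetical format with no such switch. */
    static final class OtherFormat implements Format {
        @Override
        public boolean isLineSplittable() {
            return false;
        }
    }

    /** Disables line splitting for JSON-like formats and returns any other format as-is. */
    static Format forSingleEntity(Format format) {
        return (format instanceof JsonFormat)
            ? ((JsonFormat) format).withLineSplittable(false)
            : format;
    }

    public static void main(String[] args) {
        JsonFormat json = new JsonFormat(true);
        Format adapted = forSingleEntity(json);
        System.out.println(adapted.isLineSplittable()); // false
        System.out.println(json.isLineSplittable());    // true, the original is unchanged
    }
}
```

Because the adapted format is a fresh object, the shared format instance can keep being reused for later entities.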

Example 4 with JsonInputFormat

Use of org.apache.druid.data.input.impl.JsonInputFormat in project druid by druid-io.

From class PartialHashSegmentGenerateTaskTest, method requiresGranularitySpecInputIntervals.

@Test
public void requiresGranularitySpecInputIntervals() {
    expectedException.expect(IllegalArgumentException.class);
    expectedException.expectMessage("Missing intervals in granularitySpec");
    new PartialHashSegmentGenerateTask(ParallelIndexTestingFactory.AUTOMATIC_ID, ParallelIndexTestingFactory.GROUP_ID, ParallelIndexTestingFactory.TASK_RESOURCE, ParallelIndexTestingFactory.SUPERVISOR_TASK_ID, ParallelIndexTestingFactory.SUBTASK_SPEC_ID, ParallelIndexTestingFactory.NUM_ATTEMPTS, ParallelIndexTestingFactory.createIngestionSpec(new LocalInputSource(new File("baseDir"), "filer"), new JsonInputFormat(null, null, null), new ParallelIndexTestingFactory.TuningConfigBuilder().build(), ParallelIndexTestingFactory.createDataSchema(null)), ParallelIndexTestingFactory.CONTEXT, null);
}
Also used : JsonInputFormat(org.apache.druid.data.input.impl.JsonInputFormat) File(java.io.File) LocalInputSource(org.apache.druid.data.input.impl.LocalInputSource) Test(org.junit.Test)

Example 5 with JsonInputFormat

Use of org.apache.druid.data.input.impl.JsonInputFormat in project druid by druid-io.

From class SeekableStreamSupervisorSpecTest, method testSeekableStreamSupervisorSpecWithScaleDisable.

@Test
public void testSeekableStreamSupervisorSpecWithScaleDisable() throws InterruptedException {
    SeekableStreamSupervisorIOConfig seekableStreamSupervisorIOConfig = new SeekableStreamSupervisorIOConfig("stream", new JsonInputFormat(new JSONPathSpec(true, ImmutableList.of()), ImmutableMap.of(), false), 1, 1, new Period("PT1H"), new Period("P1D"), new Period("PT30S"), false, new Period("PT30M"), null, null, null, null) {
    };
    EasyMock.expect(spec.getSupervisorStateManagerConfig()).andReturn(supervisorConfig).anyTimes();
    EasyMock.expect(spec.getDataSchema()).andReturn(getDataSchema()).anyTimes();
    EasyMock.expect(spec.getIoConfig()).andReturn(seekableStreamSupervisorIOConfig).anyTimes();
    EasyMock.expect(spec.getTuningConfig()).andReturn(getTuningConfig()).anyTimes();
    EasyMock.expect(spec.getEmitter()).andReturn(emitter).anyTimes();
    EasyMock.expect(spec.isSuspended()).andReturn(false).anyTimes();
    EasyMock.replay(spec);
    EasyMock.expect(ingestionSchema.getIOConfig()).andReturn(this.seekableStreamSupervisorIOConfig).anyTimes();
    EasyMock.expect(ingestionSchema.getDataSchema()).andReturn(dataSchema).anyTimes();
    EasyMock.expect(ingestionSchema.getTuningConfig()).andReturn(seekableStreamSupervisorTuningConfig).anyTimes();
    EasyMock.replay(ingestionSchema);
    EasyMock.expect(taskMaster.getTaskRunner()).andReturn(Optional.absent()).anyTimes();
    EasyMock.expect(taskMaster.getSupervisorManager()).andReturn(Optional.absent()).anyTimes();
    EasyMock.replay(taskMaster);
    TestSeekableStreamSupervisor supervisor = new TestSeekableStreamSupervisor(3);
    NoopTaskAutoScaler autoScaler = new NoopTaskAutoScaler();
    supervisor.start();
    autoScaler.start();
    supervisor.runInternal();
    int taskCountBeforeScaleOut = supervisor.getIoConfig().getTaskCount();
    Assert.assertEquals(1, taskCountBeforeScaleOut);
    Thread.sleep(1 * 1000);
    int taskCountAfterScaleOut = supervisor.getIoConfig().getTaskCount();
    Assert.assertEquals(1, taskCountAfterScaleOut);
    autoScaler.reset();
    autoScaler.stop();
}
Also used : SeekableStreamSupervisorIOConfig(org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisorIOConfig) JsonInputFormat(org.apache.druid.data.input.impl.JsonInputFormat) JSONPathSpec(org.apache.druid.java.util.common.parsers.JSONPathSpec) Period(org.joda.time.Period) NoopTaskAutoScaler(org.apache.druid.indexing.seekablestream.supervisor.autoscaler.NoopTaskAutoScaler) Test(org.junit.Test)

Aggregations

JsonInputFormat (org.apache.druid.data.input.impl.JsonInputFormat): 23
Test (org.junit.Test): 21
InitializedNullHandlingTest (org.apache.druid.testing.InitializedNullHandlingTest): 13
InputSplit (org.apache.druid.data.input.InputSplit): 11
CloudObjectLocation (org.apache.druid.data.input.impl.CloudObjectLocation): 11
MaxSizeSplitHintSpec (org.apache.druid.data.input.MaxSizeSplitHintSpec): 6
JSONPathSpec (org.apache.druid.java.util.common.parsers.JSONPathSpec): 6
File (java.io.File): 3
HumanReadableBytes (org.apache.druid.java.util.common.HumanReadableBytes): 3
ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper): 2
SamplerResponse (org.apache.druid.client.indexing.SamplerResponse): 2
InputFormat (org.apache.druid.data.input.InputFormat): 2
DimensionsSpec (org.apache.druid.data.input.impl.DimensionsSpec): 2
TimestampSpec (org.apache.druid.data.input.impl.TimestampSpec): 2
InputSourceSampler (org.apache.druid.indexing.overlord.sampler.InputSourceSampler): 2
SamplerConfig (org.apache.druid.indexing.overlord.sampler.SamplerConfig): 2
SamplerTestUtils (org.apache.druid.indexing.overlord.sampler.SamplerTestUtils): 2
DataSchema (org.apache.druid.segment.indexing.DataSchema): 2
UniformGranularitySpec (org.apache.druid.segment.indexing.granularity.UniformGranularitySpec): 2
DataSegment (org.apache.druid.timeline.DataSegment): 2