Example 6 with HadoopIOConfig

Use of org.apache.druid.indexer.HadoopIOConfig in project druid by druid-io.

From the class GranularityPathSpecTest, the method testAddInputPath. The test builds an hour-partitioned directory tree and verifies that GranularityPathSpec.addInputPaths registers each matching file, paired with TextInputFormat, on the Hadoop job configuration.

@Test
public void testAddInputPath() throws Exception {
    UserGroupInformation.setLoginUser(UserGroupInformation.createUserForTesting("test", new String[]{"testGroup"}));
    // Ingestion spec for datasource "foo" covering one day; the HadoopIOConfig carries no path spec of its own
    HadoopIngestionSpec spec = new HadoopIngestionSpec(
        new DataSchema(
            "foo",
            null,
            new AggregatorFactory[0],
            new UniformGranularitySpec(Granularities.DAY, Granularities.MINUTE, ImmutableList.of(Intervals.of("2015-11-06T00:00Z/2015-11-07T00:00Z"))),
            null,
            jsonMapper
        ),
        new HadoopIOConfig(null, null, null),
        DEFAULT_TUNING_CONFIG
    );
    // Scan hourly partition directories, accept any file name, and read the files as text
    granularityPathSpec.setDataGranularity(Granularities.HOUR);
    granularityPathSpec.setFilePattern(".*");
    granularityPathSpec.setInputFormat(TextInputFormat.class);
    Job job = Job.getInstance();
    String formatStr = "file:%s/%s;org.apache.hadoop.mapreduce.lib.input.TextInputFormat";
    // Hour-partitioned directories and files under the temporary test folder
    testFolder.newFolder("test", "y=2015", "m=11", "d=06", "H=00");
    testFolder.newFolder("test", "y=2015", "m=11", "d=06", "H=02");
    testFolder.newFolder("test", "y=2015", "m=11", "d=06", "H=05");
    testFolder.newFile("test/y=2015/m=11/d=06/H=00/file1");
    testFolder.newFile("test/y=2015/m=11/d=06/H=02/file2");
    testFolder.newFile("test/y=2015/m=11/d=06/H=05/file3");
    testFolder.newFile("test/y=2015/m=11/d=06/H=05/file4");
    granularityPathSpec.setInputPath(testFolder.getRoot().getPath() + "/test");
    granularityPathSpec.addInputPaths(HadoopDruidIndexerConfig.fromSpec(spec), job);
    // MultipleInputs records each resolved path/input-format pair under this Hadoop configuration key
    String actual = job.getConfiguration().get("mapreduce.input.multipleinputs.dir.formats");
    String expected = Joiner.on(",").join(Lists.newArrayList(
        StringUtils.format(formatStr, testFolder.getRoot(), "test/y=2015/m=11/d=06/H=00/file1"),
        StringUtils.format(formatStr, testFolder.getRoot(), "test/y=2015/m=11/d=06/H=02/file2"),
        StringUtils.format(formatStr, testFolder.getRoot(), "test/y=2015/m=11/d=06/H=05/file3"),
        StringUtils.format(formatStr, testFolder.getRoot(), "test/y=2015/m=11/d=06/H=05/file4")
    ));
    Assert.assertEquals("Did not find expected input paths", expected, actual);
}
Also used: HadoopIngestionSpec (org.apache.druid.indexer.HadoopIngestionSpec), DataSchema (org.apache.druid.segment.indexing.DataSchema), UniformGranularitySpec (org.apache.druid.segment.indexing.granularity.UniformGranularitySpec), AggregatorFactory (org.apache.druid.query.aggregation.AggregatorFactory), Job (org.apache.hadoop.mapreduce.Job), HadoopIOConfig (org.apache.druid.indexer.HadoopIOConfig), Test (org.junit.Test)
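
The test above passes an empty HadoopIOConfig and configures the GranularityPathSpec directly through its setters. For comparison, the sketch below shows the same hourly layout carried as a "granularity" inputSpec map inside the HadoopIOConfig itself; the map keys and the base path are assumptions drawn from Druid's batch ingestion documentation, not code from this test.

// A minimal sketch (not from the test above): a HadoopIOConfig whose "granularity"
// inputSpec mirrors the hourly y=/m=/d=/H= layout exercised by testAddInputPath.
// The keys "type", "dataGranularity", "inputPath", and "filePattern" are assumed
// from Druid's documented granularity inputSpec; the input path is hypothetical.
import com.google.common.collect.ImmutableMap;

import org.apache.druid.indexer.HadoopIOConfig;

class GranularityIoConfigSketch {
    static HadoopIOConfig hourlyIoConfig() {
        return new HadoopIOConfig(
            ImmutableMap.<String, Object>of(
                "type", "granularity",
                "dataGranularity", "hour",
                "inputPath", "/base/path/test",  // hypothetical input root
                "filePattern", ".*"
            ),
            null,  // no metadataUpdateSpec
            null   // no explicit segmentOutputPath
        );
    }
}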

Aggregations

HadoopIOConfig (org.apache.druid.indexer.HadoopIOConfig): 6
HadoopIngestionSpec (org.apache.druid.indexer.HadoopIngestionSpec): 6
DataSchema (org.apache.druid.segment.indexing.DataSchema): 6
Test (org.junit.Test): 5
UniformGranularitySpec (org.apache.druid.segment.indexing.granularity.UniformGranularitySpec): 3
Job (org.apache.hadoop.mapreduce.Job): 3
HashMap (java.util.HashMap): 2
HadoopIndexTask (org.apache.druid.indexing.common.task.HadoopIndexTask): 2
AggregatorFactory (org.apache.druid.query.aggregation.AggregatorFactory): 2
ImmutableMap (com.google.common.collect.ImmutableMap): 1
Map (java.util.Map): 1
SortedMap (java.util.SortedMap): 1
HadoopDruidIndexerConfig (org.apache.druid.indexer.HadoopDruidIndexerConfig): 1
HadoopTuningConfig (org.apache.druid.indexer.HadoopTuningConfig): 1
DatasourceIngestionSpec (org.apache.druid.indexer.hadoop.DatasourceIngestionSpec): 1
ArbitraryGranularitySpec (org.apache.druid.segment.indexing.granularity.ArbitraryGranularitySpec): 1
DataSegment (org.apache.druid.timeline.DataSegment): 1
HashBasedNumberedShardSpec (org.apache.druid.timeline.partition.HashBasedNumberedShardSpec): 1
Interval (org.joda.time.Interval): 1