
Example 46 with MapBasedInputRow

Use of org.apache.druid.data.input.MapBasedInputRow in project druid by druid-io.

From the class DruidSegmentReaderTest, method testReaderTimestampAsPosixIncorrectly.

@Test
public void testReaderTimestampAsPosixIncorrectly() throws IOException {
    final DruidSegmentReader reader = new DruidSegmentReader(
        makeInputEntity(Intervals.of("2000/P1D")),
        indexIO,
        new TimestampSpec("__time", "posix", null),
        new DimensionsSpec(
            ImmutableList.of(StringDimensionSchema.create("s"), new DoubleDimensionSchema("d"))
        ),
        ColumnsFilter.all(),
        null,
        temporaryFolder.newFolder()
    );
    Assert.assertEquals(
        ImmutableList.of(
            new MapBasedInputRow(
                DateTimes.of("31969-04-01T00:00:00.000Z"),
                ImmutableList.of("s", "d"),
                ImmutableMap.<String, Object>builder()
                    .put("__time", DateTimes.of("2000T").getMillis())
                    .put("s", "foo")
                    .put("d", 1.23d)
                    .put("cnt", 1L)
                    .put("met_s", makeHLLC("foo"))
                    .build()
            ),
            new MapBasedInputRow(
                DateTimes.of("31969-05-12T16:00:00.000Z"),
                ImmutableList.of("s", "d"),
                ImmutableMap.<String, Object>builder()
                    .put("__time", DateTimes.of("2000T01").getMillis())
                    .put("s", "bar")
                    .put("d", 4.56d)
                    .put("cnt", 1L)
                    .put("met_s", makeHLLC("bar"))
                    .build()
            )
        ),
        readRows(reader)
    );
}
Also used : DoubleDimensionSchema(org.apache.druid.data.input.impl.DoubleDimensionSchema) TimestampSpec(org.apache.druid.data.input.impl.TimestampSpec) DimensionsSpec(org.apache.druid.data.input.impl.DimensionsSpec) MapBasedInputRow(org.apache.druid.data.input.MapBasedInputRow) NullHandlingTest(org.apache.druid.common.config.NullHandlingTest) Test(org.junit.Test)
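The far-future timestamps in the expected rows are the point of this test: the segment's __time column stores epoch milliseconds, but the TimestampSpec declares the "posix" format, which reads each value as epoch seconds. The misread is easy to reproduce with plain Joda-Time; this sketch is illustrative and not part of the test:

import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;

public class PosixVsMillis {
    public static void main(String[] args) {
        // Epoch milliseconds for 2000-01-01T00:00:00Z.
        long millis = new DateTime("2000-01-01T00:00:00Z", DateTimeZone.UTC).getMillis(); // 946684800000
        // Interpreting that value as *seconds* effectively multiplies it by 1000,
        // landing roughly 30,000 years in the future.
        DateTime misread = new DateTime(millis * 1000L, DateTimeZone.UTC);
        System.out.println(misread); // 31969-04-01T00:00:00.000Z, the test's expected timestamp
    }
}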

Example 47 with MapBasedInputRow

Use of org.apache.druid.data.input.MapBasedInputRow in project druid by druid-io.

From the class DruidSegmentReaderTest, method testReaderWithInclusiveColumnsFilter.

@Test
public void testReaderWithInclusiveColumnsFilter() throws IOException {
    final DruidSegmentReader reader = new DruidSegmentReader(
        makeInputEntity(Intervals.of("2000/P1D")),
        indexIO,
        new TimestampSpec("__time", "millis", DateTimes.of("1971")),
        new DimensionsSpec(
            ImmutableList.of(StringDimensionSchema.create("s"), new DoubleDimensionSchema("d"))
        ),
        ColumnsFilter.inclusionBased(ImmutableSet.of("__time", "s", "d")),
        null,
        temporaryFolder.newFolder()
    );
    Assert.assertEquals(
        ImmutableList.of(
            new MapBasedInputRow(
                DateTimes.of("2000"),
                ImmutableList.of("s", "d"),
                ImmutableMap.<String, Object>builder()
                    .put("__time", DateTimes.of("2000T").getMillis())
                    .put("s", "foo")
                    .put("d", 1.23d)
                    .build()
            ),
            new MapBasedInputRow(
                DateTimes.of("2000T01"),
                ImmutableList.of("s", "d"),
                ImmutableMap.<String, Object>builder()
                    .put("__time", DateTimes.of("2000T01").getMillis())
                    .put("s", "bar")
                    .put("d", 4.56d)
                    .build()
            )
        ),
        readRows(reader)
    );
}
Also used : DoubleDimensionSchema(org.apache.druid.data.input.impl.DoubleDimensionSchema) TimestampSpec(org.apache.druid.data.input.impl.TimestampSpec) DimensionsSpec(org.apache.druid.data.input.impl.DimensionsSpec) MapBasedInputRow(org.apache.druid.data.input.MapBasedInputRow) NullHandlingTest(org.apache.druid.common.config.NullHandlingTest) Test(org.junit.Test)
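Compared with testReader below, the inclusion-based filter explains why "cnt" and "met_s" never appear in the expected rows: only the three listed columns survive. A brief sketch of the filter on its own; the apply method name is my reading of the ColumnsFilter API, so treat it as an assumption:

import com.google.common.collect.ImmutableSet;
import org.apache.druid.data.input.ColumnsFilter;

ColumnsFilter filter = ColumnsFilter.inclusionBased(ImmutableSet.of("__time", "s", "d"));
filter.apply("s");     // true: a listed column, so the reader keeps it (assumed API)
filter.apply("met_s"); // false: filtered out, so the metric is absent from the rows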

Example 48 with MapBasedInputRow

Use of org.apache.druid.data.input.MapBasedInputRow in project druid by druid-io.

From the class DruidSegmentReaderTest, method testReader.

@Test
public void testReader() throws IOException {
    final DruidSegmentReader reader = new DruidSegmentReader(
        makeInputEntity(Intervals.of("2000/P1D")),
        indexIO,
        new TimestampSpec("__time", "millis", DateTimes.of("1971")),
        new DimensionsSpec(
            ImmutableList.of(StringDimensionSchema.create("s"), new DoubleDimensionSchema("d"))
        ),
        ColumnsFilter.all(),
        null,
        temporaryFolder.newFolder()
    );
    Assert.assertEquals(
        ImmutableList.of(
            new MapBasedInputRow(
                DateTimes.of("2000"),
                ImmutableList.of("s", "d"),
                ImmutableMap.<String, Object>builder()
                    .put("__time", DateTimes.of("2000T").getMillis())
                    .put("s", "foo")
                    .put("d", 1.23d)
                    .put("cnt", 1L)
                    .put("met_s", makeHLLC("foo"))
                    .build()
            ),
            new MapBasedInputRow(
                DateTimes.of("2000T01"),
                ImmutableList.of("s", "d"),
                ImmutableMap.<String, Object>builder()
                    .put("__time", DateTimes.of("2000T01").getMillis())
                    .put("s", "bar")
                    .put("d", 4.56d)
                    .put("cnt", 1L)
                    .put("met_s", makeHLLC("bar"))
                    .build()
            )
        ),
        readRows(reader)
    );
}
Also used : DoubleDimensionSchema(org.apache.druid.data.input.impl.DoubleDimensionSchema) TimestampSpec(org.apache.druid.data.input.impl.TimestampSpec) DimensionsSpec(org.apache.druid.data.input.impl.DimensionsSpec) MapBasedInputRow(org.apache.druid.data.input.MapBasedInputRow) NullHandlingTest(org.apache.druid.common.config.NullHandlingTest) Test(org.junit.Test)
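The assertions above compare full MapBasedInputRow instances, so it is worth spelling out the three constructor arguments: the row timestamp, the ordered list of dimension names, and the raw event map backing the row. A minimal standalone sketch:

import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import org.apache.druid.data.input.MapBasedInputRow;
import org.apache.druid.java.util.common.DateTimes;

MapBasedInputRow row = new MapBasedInputRow(
    DateTimes.of("2000"),                    // row timestamp
    ImmutableList.of("s", "d"),              // declared dimensions
    ImmutableMap.of("s", "foo", "d", 1.23d)  // backing event map
);
row.getDimensions(); // ["s", "d"]
row.getRaw("d");     // 1.23
row.getTimestamp();  // 2000-01-01T00:00:00.000Z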

Example 49 with MapBasedInputRow

Use of org.apache.druid.data.input.MapBasedInputRow in project druid by druid-io.

From the class SingleDimensionShardSpecTest, method testIsInChunk.

@Test
public void testIsInChunk() {
    Map<SingleDimensionShardSpec, List<Pair<Boolean, Map<String, String>>>> tests =
        ImmutableMap.<SingleDimensionShardSpec, List<Pair<Boolean, Map<String, String>>>>builder()
            .put(makeSpec(null, null), makeListOfPairs(true, null, true, "a", true, "h", true, "p", true, "y"))
            .put(makeSpec(null, "m"), makeListOfPairs(true, null, true, "a", true, "h", false, "p", false, "y"))
            .put(makeSpec("a", "h"), makeListOfPairs(false, null, true, "a", false, "h", false, "p", false, "y"))
            .put(makeSpec("d", "u"), makeListOfPairs(false, null, false, "a", true, "h", true, "p", false, "y"))
            .put(makeSpec("h", null), makeListOfPairs(false, null, false, "a", true, "h", true, "p", true, "y"))
            .build();
    for (Map.Entry<SingleDimensionShardSpec, List<Pair<Boolean, Map<String, String>>>> entry : tests.entrySet()) {
        SingleDimensionShardSpec spec = entry.getKey();
        for (Pair<Boolean, Map<String, String>> pair : entry.getValue()) {
            final InputRow inputRow = new MapBasedInputRow(0, ImmutableList.of("billy"), Maps.transformValues(pair.rhs, input -> input));
            Assert.assertEquals(StringUtils.format("spec[%s], row[%s]", spec, inputRow), pair.lhs, spec.isInChunk(inputRow));
        }
    }
}
Also used : RangeSet(com.google.common.collect.RangeSet) ImmutableMap(com.google.common.collect.ImmutableMap) ObjectMapper(com.fasterxml.jackson.databind.ObjectMapper) Range(com.google.common.collect.Range) MapBasedInputRow(org.apache.druid.data.input.MapBasedInputRow) StringUtils(org.apache.druid.java.util.common.StringUtils) JsonProcessingException(com.fasterxml.jackson.core.JsonProcessingException) Test(org.junit.Test) IOException(java.io.IOException) Maps(com.google.common.collect.Maps) Pair(org.apache.druid.java.util.common.Pair) ArrayList(java.util.ArrayList) InputRow(org.apache.druid.data.input.InputRow) List(java.util.List) ImmutableList(com.google.common.collect.ImmutableList) Map(java.util.Map) Preconditions(com.google.common.base.Preconditions) ImmutableRangeSet(com.google.common.collect.ImmutableRangeSet) Assert(org.junit.Assert)
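The makeSpec and makeListOfPairs helpers are not shown here. From how the expectations read (null as an open bound, start inclusive, end exclusive, all on the "billy" dimension), the range helper plausibly looks like the sketch below; the exact SingleDimensionShardSpec constructor arity varies across Druid versions, so treat this as an assumption rather than the test's actual code:

private static SingleDimensionShardSpec makeSpec(String start, String end) {
    // dimension "billy", range [start, end), partition number 0;
    // a null bound means the range is unbounded on that side.
    return new SingleDimensionShardSpec("billy", start, end, 0, null);
}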

Example 50 with MapBasedInputRow

Use of org.apache.druid.data.input.MapBasedInputRow in project druid by druid-io.

From the class ThriftInputRowParser, method parseBatch.

@Override
public List<InputRow> parseBatch(Object input) {
    if (parser == null) {
        // Create the parser lazily, on first use, to avoid unnecessarily initializing the
        // underlying parseSpec.
        parser = parseSpec.makeParser();
    }
    // Doing this initialization in the constructor would cause a ClassNotFoundException.
    try {
        if (thriftClass == null) {
            thriftClass = getThriftClass();
        }
    } catch (IOException e) {
        throw new IAE(e, "failed to load jar [%s]", jarPath);
    } catch (ClassNotFoundException e) {
        throw new IAE(e, "class [%s] not found in jar", thriftClassName);
    } catch (InstantiationException | IllegalAccessException e) {
        throw new IAE(e, "failed to instantiate thrift instance");
    }
    final String json;
    try {
        if (input instanceof ByteBuffer) {
            // realtime stream
            final byte[] bytes = ((ByteBuffer) input).array();
            TBase o = thriftClass.newInstance();
            ThriftDeserialization.detectAndDeserialize(bytes, o);
            json = ThriftDeserialization.SERIALIZER_SIMPLE_JSON.get().toString(o);
        } else if (input instanceof BytesWritable) {
            // sequence file
            final byte[] bytes = ((BytesWritable) input).getBytes();
            TBase o = thriftClass.newInstance();
            ThriftDeserialization.detectAndDeserialize(bytes, o);
            json = ThriftDeserialization.SERIALIZER_SIMPLE_JSON.get().toString(o);
        } else if (input instanceof ThriftWritable) {
            // LzoBlockThrift file
            TBase o = (TBase) ((ThriftWritable) input).get();
            json = ThriftDeserialization.SERIALIZER_SIMPLE_JSON.get().toString(o);
        } else {
            throw new IAE("unsupported input class [%s]", input.getClass());
        }
    } catch (IllegalAccessException | InstantiationException | TException e) {
        throw new IAE(e, "failed to deserialize thrift object");
    }
    Map<String, Object> record = parser.parseToMap(json);
    final List<String> dimensions;
    if (!this.dimensions.isEmpty()) {
        dimensions = this.dimensions;
    } else {
        dimensions = Lists.newArrayList(Sets.difference(record.keySet(), parseSpec.getDimensionsSpec().getDimensionExclusions()));
    }
    return ImmutableList.of(new MapBasedInputRow(parseSpec.getTimestampSpec().extractTimestamp(record), dimensions, record));
}
Also used : TException(org.apache.thrift.TException) ThriftWritable(com.twitter.elephantbird.mapreduce.io.ThriftWritable) BytesWritable(org.apache.hadoop.io.BytesWritable) IOException(java.io.IOException) IAE(org.apache.druid.java.util.common.IAE) ByteBuffer(java.nio.ByteBuffer) TBase(org.apache.thrift.TBase) MapBasedInputRow(org.apache.druid.data.input.MapBasedInputRow)
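A hedged usage sketch, assuming a ThriftInputRowParser already configured with a parseSpec, jar path, and Thrift class name; the parser and thriftBytes names below are hypothetical:

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.druid.data.input.InputRow;

// thriftBytes: a serialized instance of the configured Thrift class (hypothetical).
ByteBuffer buf = ByteBuffer.wrap(thriftBytes);
List<InputRow> rows = parser.parseBatch(buf); // deserialize -> JSON -> one MapBasedInputRow
InputRow row = rows.get(0);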

Aggregations

MapBasedInputRow (org.apache.druid.data.input.MapBasedInputRow): 114
Test (org.junit.Test): 77
InitializedNullHandlingTest (org.apache.druid.testing.InitializedNullHandlingTest): 46
IncrementalIndex (org.apache.druid.segment.incremental.IncrementalIndex): 42
OnheapIncrementalIndex (org.apache.druid.segment.incremental.OnheapIncrementalIndex): 38
InputRow (org.apache.druid.data.input.InputRow): 31
File (java.io.File): 24
DimensionsSpec (org.apache.druid.data.input.impl.DimensionsSpec): 21
LongSumAggregatorFactory (org.apache.druid.query.aggregation.LongSumAggregatorFactory): 20
CountAggregatorFactory (org.apache.druid.query.aggregation.CountAggregatorFactory): 19
ArrayList (java.util.ArrayList): 17
HashMap (java.util.HashMap): 15
DateTime (org.joda.time.DateTime): 15
TimestampSpec (org.apache.druid.data.input.impl.TimestampSpec): 14
IncrementalIndexTest (org.apache.druid.segment.data.IncrementalIndexTest): 14
Interval (org.joda.time.Interval): 14
IOException (java.io.IOException): 13
DoubleDimensionSchema (org.apache.druid.data.input.impl.DoubleDimensionSchema): 13
IncrementalIndexSchema (org.apache.druid.segment.incremental.IncrementalIndexSchema): 12
ImmutableMap (com.google.common.collect.ImmutableMap): 11