
Example 6 with RuntimeShapeInspector

use of org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector in project druid by druid-io.

the class ColumnSelectorBitmapIndexSelector method getDimensionValues.

@Nullable
@Override
public CloseableIndexed<String> getDimensionValues(String dimension) {
    if (isVirtualColumn(dimension)) {
        BitmapIndex bitmapIndex = virtualColumns.getBitmapIndex(dimension, index);
        if (bitmapIndex == null) {
            return null;
        }
        return new CloseableIndexed<String>() {

            @Override
            public int size() {
                return bitmapIndex.getCardinality();
            }

            @Override
            public String get(int index) {
                return bitmapIndex.getValue(index);
            }

            @Override
            public int indexOf(String value) {
                return bitmapIndex.getIndex(value);
            }

            @Override
            public Iterator<String> iterator() {
                return IndexedIterable.create(this).iterator();
            }

            @Override
            public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                inspector.visit("column", bitmapIndex);
            }

            @Override
            public void close() {
            }
        };
    }
    final ColumnHolder columnHolder = index.getColumnHolder(dimension);
    if (columnHolder == null) {
        return null;
    }
    if (!columnHolder.getCapabilities().toColumnType().is(ValueType.STRING)) {
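        // this method only supports returning STRING values, so other types of dictionary encoded columns will not
        // work correctly here until reworking is done to support filtering/indexing other types of columns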
        return null;
    }
    BaseColumn col = columnHolder.getColumn();
    if (!(col instanceof DictionaryEncodedColumn)) {
        return null;
    }
    final DictionaryEncodedColumn<String> column = (DictionaryEncodedColumn<String>) col;
    return new CloseableIndexed<String>() {

        @Override
        public int size() {
            return column.getCardinality();
        }

        @Override
        public String get(int index) {
            return column.lookupName(index);
        }

        @Override
        public int indexOf(String value) {
            return column.lookupId(value);
        }

        @Override
        public Iterator<String> iterator() {
            return IndexedIterable.create(this).iterator();
        }

        @Override
        public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
            inspector.visit("column", column);
        }

        @Override
        public void close() throws IOException {
            column.close();
        }
    };
}
Also used : ColumnHolder(org.apache.druid.segment.column.ColumnHolder) BitmapIndex(org.apache.druid.segment.column.BitmapIndex) RuntimeShapeInspector(org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector) BaseColumn(org.apache.druid.segment.column.BaseColumn) DictionaryEncodedColumn(org.apache.druid.segment.column.DictionaryEncodedColumn) CloseableIndexed(org.apache.druid.segment.data.CloseableIndexed) Nullable(javax.annotation.Nullable)
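
Both anonymous classes above report their wrapped field through inspector.visit("column", ...). The following is a minimal sketch of how such a visitor could be turned into a textual description of an object's runtime structure. The Inspector and Inspectable interfaces, the describe helper, and the IndexedView and ArrayColumn classes are hypothetical stand-ins written for this illustration; they are not Druid's actual RuntimeShapeInspector API, which has more overloads.

```java
final class ShapeInspectorSketch {

    /** Hypothetical, simplified stand-in for an inspector (not Druid's interface). */
    interface Inspector {
        void visit(String fieldName, Object fieldValue);
    }

    /** Hypothetical stand-in for an object that can report its fields to an inspector. */
    interface Inspectable {
        void inspectRuntimeShape(Inspector inspector);
    }

    /** Leaf object that reports nothing; stands in for a column implementation. */
    static final class ArrayColumn {
    }

    /** Wrapper that reports the column it delegates to, as the examples above do. */
    static final class IndexedView implements Inspectable {
        private final Object column;

        IndexedView(Object column) {
            this.column = column;
        }

        @Override
        public void inspectRuntimeShape(Inspector inspector) {
            inspector.visit("column", column);
        }
    }

    /** Builds a string naming the runtime class of root and of every field it reports. */
    static String describe(Object root) {
        StringBuilder sb = new StringBuilder();
        append(sb, root);
        return sb.toString();
    }

    private static void append(StringBuilder sb, Object value) {
        if (value == null) {
            sb.append("null");
            return;
        }
        sb.append(value.getClass().getSimpleName());
        if (value instanceof Inspectable) {
            sb.append('{');
            boolean[] first = {true};
            ((Inspectable) value).inspectRuntimeShape((name, field) -> {
                if (!first[0]) {
                    sb.append(", ");
                }
                first[0] = false;
                sb.append(name).append('=');
                append(sb, field); // recurse into the reported field
            });
            sb.append('}');
        }
    }

    public static void main(String[] args) {
        // prints "IndexedView{column=ArrayColumn}"
        System.out.println(describe(new IndexedView(new ArrayColumn())));
    }
}
```

Two wrappers with the same class but different wrapped column classes produce different strings here, which is the kind of distinction a shape description is meant to capture.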

Example 7 with RuntimeShapeInspector

use of org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector in project druid by druid-io.

the class ExpressionVirtualColumnTest method testMultiObjectSelectorMakesRightSelector.

@Test
public void testMultiObjectSelectorMakesRightSelector() {
    DimensionSpec spec = new DefaultDimensionSpec("expr", "expr");
    // do some ugly faking to test if SingleStringInputDeferredEvaluationExpressionDimensionSelector is created for multi-value expressions when possible
    ColumnSelectorFactory factory = new ColumnSelectorFactory() {

        @Override
        public DimensionSelector makeDimensionSelector(DimensionSpec dimensionSpec) {
            DimensionSelector delegate = COLUMN_SELECTOR_FACTORY.makeDimensionSelector(dimensionSpec);
            DimensionSelector faker = new DimensionSelector() {

                @Override
                public IndexedInts getRow() {
                    return delegate.getRow();
                }

                @Override
                public ValueMatcher makeValueMatcher(@Nullable String value) {
                    return delegate.makeValueMatcher(value);
                }

                @Override
                public ValueMatcher makeValueMatcher(Predicate<String> predicate) {
                    return delegate.makeValueMatcher(predicate);
                }

                @Override
                public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                    delegate.inspectRuntimeShape(inspector);
                }

                @Nullable
                @Override
                public Object getObject() {
                    return delegate.getObject();
                }

                @Override
                public Class<?> classOfObject() {
                    return delegate.classOfObject();
                }

                @Override
                public int getValueCardinality() {
                    // value doesn't matter as long as not CARDINALITY_UNKNOWN
                    return 3;
                }

                @Nullable
                @Override
                public String lookupName(int id) {
                    return null;
                }

                @Override
                public boolean nameLookupPossibleInAdvance() {
                    // fake this so that SingleStringInputDeferredEvaluationExpressionDimensionSelector doesn't explode when it is created
                    return true;
                }

                @Nullable
                @Override
                public IdLookup idLookup() {
                    return name -> 0;
                }
            };
            return faker;
        }

        @Override
        public ColumnValueSelector makeColumnValueSelector(String columnName) {
            return COLUMN_SELECTOR_FACTORY.makeColumnValueSelector(columnName);
        }

        @Nullable
        @Override
        public ColumnCapabilities getColumnCapabilities(String column) {
            return new ColumnCapabilitiesImpl().setType(ColumnType.STRING).setHasMultipleValues(true).setDictionaryEncoded(true);
        }
    };
    final BaseObjectColumnValueSelector selectorImplicit = SCALE_LIST_SELF_IMPLICIT.makeDimensionSelector(spec, factory);
    final BaseObjectColumnValueSelector selectorExplicit = SCALE_LIST_SELF_EXPLICIT.makeDimensionSelector(spec, factory);
    Assert.assertTrue(selectorImplicit instanceof SingleStringInputDeferredEvaluationExpressionDimensionSelector);
    Assert.assertTrue(selectorExplicit instanceof ExpressionMultiValueDimensionSelector);
}
Also used : Arrays(java.util.Arrays) RuntimeShapeInspector(org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector) ConstantMultiValueDimensionSelector(org.apache.druid.segment.ConstantMultiValueDimensionSelector) MapBasedInputRow(org.apache.druid.data.input.MapBasedInputRow) ColumnValueSelector(org.apache.druid.segment.ColumnValueSelector) RowAdapters(org.apache.druid.segment.RowAdapters) Parser(org.apache.druid.math.expr.Parser) IdLookup(org.apache.druid.segment.IdLookup) IndexedInts(org.apache.druid.segment.data.IndexedInts) BaseFloatColumnValueSelector(org.apache.druid.segment.BaseFloatColumnValueSelector) Row(org.apache.druid.data.input.Row) DefaultDimensionSpec(org.apache.druid.query.dimension.DefaultDimensionSpec) ColumnSelectorFactory(org.apache.druid.segment.ColumnSelectorFactory) ImmutableList(com.google.common.collect.ImmutableList) Predicates(com.google.common.base.Predicates) DimensionSelector(org.apache.druid.segment.DimensionSelector) BucketExtractionFn(org.apache.druid.query.extraction.BucketExtractionFn) ExtractionDimensionSpec(org.apache.druid.query.dimension.ExtractionDimensionSpec) BaseObjectColumnValueSelector(org.apache.druid.segment.BaseObjectColumnValueSelector) Nullable(javax.annotation.Nullable) ValueMatcher(org.apache.druid.query.filter.ValueMatcher) DateTimes(org.apache.druid.java.util.common.DateTimes) RowBasedColumnSelectorFactory(org.apache.druid.segment.RowBasedColumnSelectorFactory) ImmutableMap(com.google.common.collect.ImmutableMap) ValueType(org.apache.druid.segment.column.ValueType) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test) TestExprMacroTable(org.apache.druid.query.expression.TestExprMacroTable) ExprEval(org.apache.druid.math.expr.ExprEval) InputRow(org.apache.druid.data.input.InputRow) BaseLongColumnValueSelector(org.apache.druid.segment.BaseLongColumnValueSelector) ColumnCapabilitiesImpl(org.apache.druid.segment.column.ColumnCapabilitiesImpl) Predicate(com.google.common.base.Predicate) NullHandling(org.apache.druid.common.config.NullHandling) RowSignature(org.apache.druid.segment.column.RowSignature) DimensionSpec(org.apache.druid.query.dimension.DimensionSpec) ColumnCapabilities(org.apache.druid.segment.column.ColumnCapabilities) ColumnType(org.apache.druid.segment.column.ColumnType) Assert(org.junit.Assert) ConstantDimensionSelector(org.apache.druid.segment.ConstantDimensionSelector)

Example 8 with RuntimeShapeInspector

use of org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector in project druid by druid-io.

the class LargeColumnSupportedComplexColumnSerializerTest method testSanity.

@Test
public void testSanity() throws IOException {
    HyperUniquesSerdeForTest serde = new HyperUniquesSerdeForTest(Hashing.murmur3_128());
    int[] cases = { 1000, 5000, 10000, 20000 };
    int[] columnSizes = { Integer.MAX_VALUE, Integer.MAX_VALUE / 2, Integer.MAX_VALUE / 4, 5000 * Long.BYTES, 2500 * Long.BYTES };
    for (int columnSize : columnSizes) {
        for (int aCase : cases) {
            File tmpFile = temporaryFolder.newFolder();
            HyperLogLogCollector baseCollector = HyperLogLogCollector.makeLatestCollector();
            try (SegmentWriteOutMedium segmentWriteOutMedium = new OffHeapMemorySegmentWriteOutMedium();
                FileSmoosher v9Smoosher = new FileSmoosher(tmpFile)) {
                LargeColumnSupportedComplexColumnSerializer serializer = LargeColumnSupportedComplexColumnSerializer.createWithColumnSize(segmentWriteOutMedium, "test", serde.getObjectStrategy(), columnSize);
                serializer.open();
                for (int i = 0; i < aCase; i++) {
                    HyperLogLogCollector collector = HyperLogLogCollector.makeLatestCollector();
                    byte[] hashBytes = fn.hashLong(i).asBytes();
                    collector.add(hashBytes);
                    baseCollector.fold(collector);
                    serializer.serialize(new ObjectColumnSelector() {

                        @Nullable
                        @Override
                        public Object getObject() {
                            return collector;
                        }

                        @Override
                        public Class classOfObject() {
                            return HyperLogLogCollector.class;
                        }

                        @Override
                        public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                            // doesn't matter in tests
                        }
                    });
                }
                try (final SmooshedWriter channel = v9Smoosher.addWithSmooshedWriter("test", serializer.getSerializedSize())) {
                    serializer.writeTo(channel, v9Smoosher);
                }
            }
            SmooshedFileMapper mapper = Smoosh.map(tmpFile);
            final ColumnBuilder builder = new ColumnBuilder().setType(ValueType.COMPLEX).setHasMultipleValues(false).setFileMapper(mapper);
            serde.deserializeColumn(mapper.mapFile("test"), builder, null);
            ColumnHolder columnHolder = builder.build();
            ComplexColumn complexColumn = (ComplexColumn) columnHolder.getColumn();
            HyperLogLogCollector collector = HyperLogLogCollector.makeLatestCollector();
            for (int i = 0; i < aCase; i++) {
                collector.fold((HyperLogLogCollector) complexColumn.getRowValue(i));
            }
            Assert.assertEquals(baseCollector.estimateCardinality(), collector.estimateCardinality(), 0.0);
        }
    }
}
Also used : SmooshedWriter(org.apache.druid.java.util.common.io.smoosh.SmooshedWriter) ColumnHolder(org.apache.druid.segment.column.ColumnHolder) HyperLogLogCollector(org.apache.druid.hll.HyperLogLogCollector) OffHeapMemorySegmentWriteOutMedium(org.apache.druid.segment.writeout.OffHeapMemorySegmentWriteOutMedium) RuntimeShapeInspector(org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector) SegmentWriteOutMedium(org.apache.druid.segment.writeout.SegmentWriteOutMedium) FileSmoosher(org.apache.druid.java.util.common.io.smoosh.FileSmoosher) ColumnBuilder(org.apache.druid.segment.column.ColumnBuilder) File(java.io.File) Nullable(javax.annotation.Nullable) SmooshedFileMapper(org.apache.druid.java.util.common.io.smoosh.SmooshedFileMapper) ComplexColumn(org.apache.druid.segment.column.ComplexColumn) ObjectColumnSelector(org.apache.druid.segment.ObjectColumnSelector) Test(org.junit.Test)

Example 9 with RuntimeShapeInspector

use of org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector in project druid by druid-io.

the class DimensionSelectorUtils method makeDictionaryEncodedValueMatcherGeneric.

private static ValueMatcher makeDictionaryEncodedValueMatcherGeneric(final DimensionSelector selector, final int valueId, final boolean matchNull) {
    if (valueId >= 0) {
        return new ValueMatcher() {

            @Override
            public boolean matches() {
                final IndexedInts row = selector.getRow();
                final int size = row.size();
                if (size == 0) {
                    // null should match empty rows in multi-value columns
                    return matchNull;
                } else {
                    for (int i = 0; i < size; ++i) {
                        if (row.get(i) == valueId) {
                            return true;
                        }
                    }
                    return false;
                }
            }

            @Override
            public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                inspector.visit("selector", selector);
            }
        };
    } else {
        if (matchNull) {
            return new ValueMatcher() {

                @Override
                public boolean matches() {
                    final IndexedInts row = selector.getRow();
                    final int size = row.size();
                    return size == 0;
                }

                @Override
                public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                    inspector.visit("selector", selector);
                }
            };
        } else {
            return BooleanValueMatcher.of(false);
        }
    }
}
Also used : ValueMatcher(org.apache.druid.query.filter.ValueMatcher) BooleanValueMatcher(org.apache.druid.segment.filter.BooleanValueMatcher) IndexedInts(org.apache.druid.segment.data.IndexedInts) RuntimeShapeInspector(org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector)
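
The branching in makeDictionaryEncodedValueMatcherGeneric can be reduced to a single function over a row of dictionary IDs. The sketch below is a self-contained restatement of that decision logic using a plain int[] in place of IndexedInts; the class and method names are hypothetical and chosen only for this illustration.

```java
final class DictionaryIdMatcherSketch {

    /**
     * Decides whether a row of dictionary IDs matches a target ID.
     *
     * @param rowIds    dictionary IDs of the current row (empty for an empty multi-value row)
     * @param valueId   ID of the value being matched, or a negative number if it is not in the dictionary
     * @param matchNull whether the value being matched is treated as null
     */
    static boolean matches(int[] rowIds, int valueId, boolean matchNull) {
        if (valueId >= 0) {
            if (rowIds.length == 0) {
                // null should match empty rows in multi-value columns
                return matchNull;
            }
            for (int id : rowIds) {
                if (id == valueId) {
                    return true;
                }
            }
            return false;
        }
        // The value is absent from the dictionary, so only an empty row can match,
        // and only when null is being matched.
        return matchNull && rowIds.length == 0;
    }

    public static void main(String[] args) {
        System.out.println(matches(new int[] {2, 5, 7}, 5, false)); // true
        System.out.println(matches(new int[] {}, 5, true));         // true
        System.out.println(matches(new int[] {1}, -1, true));       // false
    }
}
```

Unlike the original, this version evaluates the branch on every call; the original picks one of three matcher objects up front so that the per-row matches() body stays small.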

Example 10 with RuntimeShapeInspector

use of org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector in project druid by druid-io.

the class StringDimensionIndexer method makeDimensionSelector.

@Override
public DimensionSelector makeDimensionSelector(final DimensionSpec spec, final IncrementalIndexRowHolder currEntry, final IncrementalIndex.DimensionDesc desc) {
    final ExtractionFn extractionFn = spec.getExtractionFn();
    final int dimIndex = desc.getIndex();
    // maxId is used in concert with getLastRowIndex() in IncrementalIndex to ensure that callers do not encounter
    // rows that contain IDs over the initially-reported cardinality. The main idea is that IncrementalIndex establishes
    // a watermark at the time a cursor is created, and doesn't allow the cursor to walk past that watermark.
    // 
    // Additionally, this selector explicitly blocks knowledge of IDs past maxId that may occur from other causes
    // (for example: nulls getting generated for empty arrays, or calls to lookupId).
    final int maxId = getCardinality();
    class IndexerDimensionSelector implements DimensionSelector, IdLookup {

        private final ArrayBasedIndexedInts indexedInts = new ArrayBasedIndexedInts();

        @Nullable
        @MonotonicNonNull
        private int[] nullIdIntArray;

        @Override
        public IndexedInts getRow() {
            final Object[] dims = currEntry.get().getDims();
            int[] indices;
            if (dimIndex < dims.length) {
                indices = (int[]) dims[dimIndex];
            } else {
                indices = null;
            }
            int[] row = null;
            int rowSize = 0;
            // indices is usually null or empty because currEntry's rowIndex is smaller than the rowIndex of the row in which this dim first appears
            if (indices == null || indices.length == 0) {
                if (hasMultipleValues) {
                    row = IntArrays.EMPTY_ARRAY;
                    rowSize = 0;
                } else {
                    final int nullId = getEncodedValue(null, false);
                    if (nullId >= 0 && nullId < maxId) {
                        // null was added to the dictionary before this selector was created; return its ID.
                        if (nullIdIntArray == null) {
                            nullIdIntArray = new int[] { nullId };
                        }
                        row = nullIdIntArray;
                        rowSize = 1;
                    } else {
                        // null doesn't exist in the dictionary; return an empty array.
                        // Choose to use ArrayBasedIndexedInts later, instead of special "empty" IndexedInts, for monomorphism
                        row = IntArrays.EMPTY_ARRAY;
                        rowSize = 0;
                    }
                }
            }
            if (row == null && indices != null && indices.length > 0) {
                row = indices;
                rowSize = indices.length;
            }
            indexedInts.setValues(row, rowSize);
            return indexedInts;
        }

        @Override
        public ValueMatcher makeValueMatcher(final String value) {
            if (extractionFn == null) {
                final int valueId = lookupId(value);
                if (valueId >= 0 || value == null) {
                    return new ValueMatcher() {

                        @Override
                        public boolean matches() {
                            Object[] dims = currEntry.get().getDims();
                            if (dimIndex >= dims.length) {
                                return value == null;
                            }
                            int[] dimsInt = (int[]) dims[dimIndex];
                            if (dimsInt == null || dimsInt.length == 0) {
                                return value == null;
                            }
                            for (int id : dimsInt) {
                                if (id == valueId) {
                                    return true;
                                }
                            }
                            return false;
                        }

                        @Override
                        public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                            // nothing to inspect
                        }
                    };
                } else {
                    return BooleanValueMatcher.of(false);
                }
            } else {
                // Employ caching BitSet optimization
                return makeValueMatcher(Predicates.equalTo(value));
            }
        }

        @Override
        public ValueMatcher makeValueMatcher(final Predicate<String> predicate) {
            final BitSet checkedIds = new BitSet(maxId);
            final BitSet matchingIds = new BitSet(maxId);
            final boolean matchNull = predicate.apply(null);
            // Lazy matcher; only check an id if matches() is called.
            return new ValueMatcher() {

                @Override
                public boolean matches() {
                    Object[] dims = currEntry.get().getDims();
                    if (dimIndex >= dims.length) {
                        return matchNull;
                    }
                    int[] dimsInt = (int[]) dims[dimIndex];
                    if (dimsInt == null || dimsInt.length == 0) {
                        return matchNull;
                    }
                    for (int id : dimsInt) {
                        if (checkedIds.get(id)) {
                            if (matchingIds.get(id)) {
                                return true;
                            }
                        } else {
                            final boolean matches = predicate.apply(lookupName(id));
                            checkedIds.set(id);
                            if (matches) {
                                matchingIds.set(id);
                                return true;
                            }
                        }
                    }
                    return false;
                }

                @Override
                public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
                    // nothing to inspect
                }
            };
        }

        @Override
        public int getValueCardinality() {
            return maxId;
        }

        @Override
        public String lookupName(int id) {
            if (id >= maxId) {
                // Sanity check; IDs beyond maxId should not be known to callers. (See comment above.)
                throw new ISE("id[%d] >= maxId[%d]", id, maxId);
            }
            final String strValue = getActualValue(id, false);
            return extractionFn == null ? strValue : extractionFn.apply(strValue);
        }

        @Override
        public boolean nameLookupPossibleInAdvance() {
            return dictionaryEncodesAllValues();
        }

        @Nullable
        @Override
        public IdLookup idLookup() {
            return extractionFn == null ? this : null;
        }

        @Override
        public int lookupId(String name) {
            if (extractionFn != null) {
                throw new UnsupportedOperationException("cannot perform lookup when applying an extraction function");
            }
            final int id = getEncodedValue(name, false);
            if (id < maxId) {
                return id;
            } else {
                // doesn't exist.
                return DimensionDictionary.ABSENT_VALUE_ID;
            }
        }

        @SuppressWarnings("deprecation")
        @Nullable
        @Override
        public Object getObject() {
            IncrementalIndexRow key = currEntry.get();
            if (key == null) {
                return null;
            }
            Object[] dims = key.getDims();
            if (dimIndex >= dims.length) {
                return null;
            }
            return convertUnsortedEncodedKeyComponentToActualList((int[]) dims[dimIndex]);
        }

        @SuppressWarnings("deprecation")
        @Override
        public Class classOfObject() {
            return Object.class;
        }

        @Override
        public void inspectRuntimeShape(RuntimeShapeInspector inspector) {
            // nothing to inspect
        }
    }
    return new IndexerDimensionSelector();
}
Also used : ValueMatcher(org.apache.druid.query.filter.ValueMatcher) BooleanValueMatcher(org.apache.druid.segment.filter.BooleanValueMatcher) BitSet(java.util.BitSet) RuntimeShapeInspector(org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector) Predicate(com.google.common.base.Predicate) ExtractionFn(org.apache.druid.query.extraction.ExtractionFn) IncrementalIndexRow(org.apache.druid.segment.incremental.IncrementalIndexRow) ArrayBasedIndexedInts(org.apache.druid.segment.data.ArrayBasedIndexedInts) ISE(org.apache.druid.java.util.common.ISE)
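
The predicate-based makeValueMatcher above evaluates the predicate at most once per dictionary ID by remembering results in two BitSets. The sketch below isolates that caching idea in a self-contained class. It uses java.util.function.Predicate and an IntFunction lookup instead of the Guava Predicate and lookupName used in the original, and the class name and the predicateCalls counter are hypothetical additions for illustration.

```java
import java.util.BitSet;
import java.util.function.IntFunction;
import java.util.function.Predicate;

final class CachingPredicateMatcherSketch {

    private final BitSet checkedIds;   // IDs whose predicate result is already known
    private final BitSet matchingIds;  // subset of checkedIds for which the predicate was true
    private final IntFunction<String> lookupName;
    private final Predicate<String> predicate;
    private final boolean matchNull;

    /** Counts predicate evaluations on dictionary values; instrumentation for the example only. */
    int predicateCalls = 0;

    CachingPredicateMatcherSketch(int cardinality, IntFunction<String> lookupName, Predicate<String> predicate) {
        this.checkedIds = new BitSet(cardinality);
        this.matchingIds = new BitSet(cardinality);
        this.lookupName = lookupName;
        this.predicate = predicate;
        // The predicate must tolerate null, as in the original code.
        this.matchNull = predicate.test(null);
    }

    /** Lazy matcher: an ID is only checked the first time it appears in a row. */
    boolean matches(int[] rowIds) {
        if (rowIds == null || rowIds.length == 0) {
            return matchNull;
        }
        for (int id : rowIds) {
            if (checkedIds.get(id)) {
                if (matchingIds.get(id)) {
                    return true;
                }
            } else {
                predicateCalls++;
                final boolean matched = predicate.test(lookupName.apply(id));
                checkedIds.set(id);
                if (matched) {
                    matchingIds.set(id);
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] dictionary = {"apple", "banana", "cherry"};
        CachingPredicateMatcherSketch matcher = new CachingPredicateMatcherSketch(
            dictionary.length,
            id -> dictionary[id],
            s -> s != null && s.startsWith("b")
        );
        System.out.println(matcher.matches(new int[] {0, 1})); // true
        System.out.println(matcher.matches(new int[] {1, 0})); // true, answered from the cache
        System.out.println(matcher.predicateCalls);            // 2
    }
}
```

The two BitSets encode three states per ID (unchecked, checked and matching, checked and not matching), which is why a single BitSet would not be enough.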

Aggregations

RuntimeShapeInspector (org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector): 26
ValueMatcher (org.apache.druid.query.filter.ValueMatcher): 19
IndexedInts (org.apache.druid.segment.data.IndexedInts): 11
BooleanValueMatcher (org.apache.druid.segment.filter.BooleanValueMatcher): 8
Nullable (javax.annotation.Nullable): 7
Predicate (com.google.common.base.Predicate): 5
ArrayBasedIndexedInts (org.apache.druid.segment.data.ArrayBasedIndexedInts): 5
BitSet (java.util.BitSet): 4
IdLookup (org.apache.druid.segment.IdLookup): 4
DimensionSpec (org.apache.druid.query.dimension.DimensionSpec): 3
ColumnSelectorFactory (org.apache.druid.segment.ColumnSelectorFactory): 3
ColumnHolder (org.apache.druid.segment.column.ColumnHolder): 3
Test (org.junit.Test): 3
InputRow (org.apache.druid.data.input.InputRow): 2
MapBasedInputRow (org.apache.druid.data.input.MapBasedInputRow): 2
ISE (org.apache.druid.java.util.common.ISE): 2
ExtractionFn (org.apache.druid.query.extraction.ExtractionFn): 2
AbstractDimensionSelector (org.apache.druid.segment.AbstractDimensionSelector): 2
ColumnValueSelector (org.apache.druid.segment.ColumnValueSelector): 2
RowBasedColumnSelectorFactory (org.apache.druid.segment.RowBasedColumnSelectorFactory): 2