Example 1 with ColumnMetadata

use of com.linkedin.pinot.core.segment.index.ColumnMetadata in project pinot by linkedin.

the class ColumnValueSegmentPruner method pruneSegment.

/**
   * Helper method to determine if a segment can be pruned based on the column min/max value in segment metadata and
   * the predicates on time column. The algorithm is as follows:
   *
   * <ul>
   *   <li> For leaf node: Returns true if there is a predicate on the column and applying the predicate would
   *   filter out all docs of the segment, false otherwise. </li>
   *   <li> For non-leaf AND node: True if any of its children returned true, false otherwise. </li>
   *   <li> For non-leaf OR node: True if all its children returned true, false otherwise. </li>
   * </ul>
   *
   * @param filterQueryTree Filter tree for the query.
   * @param columnMetadataMap Map from column name to column metadata.
   * @return True if segment can be pruned out, false otherwise.
   */
@SuppressWarnings("unchecked")
public static boolean pruneSegment(@Nonnull FilterQueryTree filterQueryTree, @Nonnull Map<String, ColumnMetadata> columnMetadataMap) {
    FilterOperator filterOperator = filterQueryTree.getOperator();
    List<FilterQueryTree> children = filterQueryTree.getChildren();
    if (children == null || children.isEmpty()) {
        // Skip operators other than EQUALITY and RANGE
        if ((filterOperator != FilterOperator.EQUALITY) && (filterOperator != FilterOperator.RANGE)) {
            return false;
        }
        ColumnMetadata columnMetadata = columnMetadataMap.get(filterQueryTree.getColumn());
        if (columnMetadata == null) {
            // Should not reach here after DataSchemaSegmentPruner
            return true;
        }
        Comparable minValue = columnMetadata.getMinValue();
        Comparable maxValue = columnMetadata.getMaxValue();
        if (filterOperator == FilterOperator.EQUALITY) {
            // Doesn't have min/max value set in metadata
            if ((minValue == null) || (maxValue == null)) {
                return false;
            }
            // Check if the value is in the min/max range
            FieldSpec.DataType dataType = columnMetadata.getDataType();
            Comparable value = getValue(filterQueryTree.getValue().get(0), dataType);
            return (value.compareTo(minValue) < 0) || (value.compareTo(maxValue) > 0);
        } else {
            // RANGE
            // Get lower/upper boundary value
            FieldSpec.DataType dataType = columnMetadata.getDataType();
            RangePredicate rangePredicate = new RangePredicate(null, filterQueryTree.getValue());
            String lowerBoundary = rangePredicate.getLowerBoundary();
            boolean includeLowerBoundary = rangePredicate.includeLowerBoundary();
            Comparable lowerBoundaryValue = null;
            if (!lowerBoundary.equals(RangePredicate.UNBOUNDED)) {
                lowerBoundaryValue = getValue(lowerBoundary, dataType);
            }
            String upperBoundary = rangePredicate.getUpperBoundary();
            boolean includeUpperBoundary = rangePredicate.includeUpperBoundary();
            Comparable upperBoundaryValue = null;
            if (!upperBoundary.equals(RangePredicate.UNBOUNDED)) {
                upperBoundaryValue = getValue(upperBoundary, dataType);
            }
            // Check if the range is valid
            if ((lowerBoundaryValue != null) && (upperBoundaryValue != null)) {
                if (includeLowerBoundary && includeUpperBoundary) {
                    if (lowerBoundaryValue.compareTo(upperBoundaryValue) > 0) {
                        return true;
                    }
                } else {
                    if (lowerBoundaryValue.compareTo(upperBoundaryValue) >= 0) {
                        return true;
                    }
                }
            }
            // Doesn't have min/max value set in metadata
            if ((minValue == null) || (maxValue == null)) {
                return false;
            }
            if (lowerBoundaryValue != null) {
                if (includeLowerBoundary) {
                    if (lowerBoundaryValue.compareTo(maxValue) > 0) {
                        return true;
                    }
                } else {
                    if (lowerBoundaryValue.compareTo(maxValue) >= 0) {
                        return true;
                    }
                }
            }
            if (upperBoundaryValue != null) {
                if (includeUpperBoundary) {
                    if (upperBoundaryValue.compareTo(minValue) < 0) {
                        return true;
                    }
                } else {
                    if (upperBoundaryValue.compareTo(minValue) <= 0) {
                        return true;
                    }
                }
            }
            return false;
        }
    } else {
        switch(filterOperator) {
            case AND:
                for (FilterQueryTree child : children) {
                    if (pruneSegment(child, columnMetadataMap)) {
                        return true;
                    }
                }
                return false;
            case OR:
                for (FilterQueryTree child : children) {
                    if (!pruneSegment(child, columnMetadataMap)) {
                        return false;
                    }
                }
                return true;
            default:
                throw new IllegalStateException("Unsupported filter operator: " + filterOperator);
        }
    }
}
Also used : FilterOperator(com.linkedin.pinot.common.request.FilterOperator) RangePredicate(com.linkedin.pinot.core.common.predicate.RangePredicate) ColumnMetadata(com.linkedin.pinot.core.segment.index.ColumnMetadata) FilterQueryTree(com.linkedin.pinot.common.utils.request.FilterQueryTree) FieldSpec(com.linkedin.pinot.common.data.FieldSpec)
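
The leaf-node checks in pruneSegment amount to an interval-overlap test between the predicate and the segment's [min, max] range. The following is a minimal standalone sketch of that test, not Pinot code: the class MinMaxPruneSketch and its method names are hypothetical, and a null bound is assumed to stand in for RangePredicate.UNBOUNDED.

```java
/** Hypothetical helper (not part of Pinot) mirroring the leaf-node min/max checks. */
final class MinMaxPruneSketch {
    private MinMaxPruneSketch() {
    }

    /** Equality predicate: prune when the value lies outside [min, max]. */
    static <T extends Comparable<T>> boolean canPruneEquality(T value, T min, T max) {
        // Without min/max metadata nothing can be concluded, so keep the segment.
        if (min == null || max == null) {
            return false;
        }
        return value.compareTo(min) < 0 || value.compareTo(max) > 0;
    }

    /** Range predicate: a null lower or upper bound means "unbounded" (assumption of this sketch). */
    static <T extends Comparable<T>> boolean canPruneRange(T lower, boolean lowerInclusive,
            T upper, boolean upperInclusive, T min, T max) {
        // An empty range matches no document, whatever the metadata says.
        if (lower != null && upper != null) {
            int c = lower.compareTo(upper);
            if ((lowerInclusive && upperInclusive) ? (c > 0) : (c >= 0)) {
                return true;
            }
        }
        if (min == null || max == null) {
            return false;
        }
        // The range starts after the largest value in the segment.
        if (lower != null) {
            int c = lower.compareTo(max);
            if (lowerInclusive ? (c > 0) : (c >= 0)) {
                return true;
            }
        }
        // The range ends before the smallest value in the segment.
        if (upper != null) {
            int c = upper.compareTo(min);
            if (upperInclusive ? (c < 0) : (c <= 0)) {
                return true;
            }
        }
        return false;
    }
}
```

For a segment with min 10 and max 20, the range (20, unbounded) can be pruned because the lower bound is exclusive, while [20, unbounded) cannot, since the value 20 may be present.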

Example 2 with ColumnMetadata

use of com.linkedin.pinot.core.segment.index.ColumnMetadata in project pinot by linkedin.

the class ColumnValueSegmentPruner method prune.

@Override
public boolean prune(@Nonnull IndexSegment segment, @Nonnull BrokerRequest brokerRequest) {
    FilterQueryTree filterQueryTree = RequestUtils.generateFilterQueryTree(brokerRequest);
    if (filterQueryTree == null) {
        return false;
    }
    // For realtime segment, this map can be null.
    Map<String, ColumnMetadata> columnMetadataMap = ((SegmentMetadataImpl) segment.getSegmentMetadata()).getColumnMetadataMap();
    return (columnMetadataMap != null) && pruneSegment(filterQueryTree, columnMetadataMap);
}
Also used : ColumnMetadata(com.linkedin.pinot.core.segment.index.ColumnMetadata) FilterQueryTree(com.linkedin.pinot.common.utils.request.FilterQueryTree) SegmentMetadataImpl(com.linkedin.pinot.core.segment.index.SegmentMetadataImpl)
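
The prune method only guards against a missing filter or missing metadata and then delegates to the recursive check, where an AND node is pruned if any child is pruned and an OR node only if every child is. A small self-contained sketch of that flow is given below. Everything in it is hypothetical: the Node type is a stand-in for FilterQueryTree restricted to equality predicates on long columns, and a map from column name to a {min, max} array replaces ColumnMetadata. Unlike the original, a column without metadata keeps the segment here, since the sketch has no earlier schema-based pruner to rely on.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the prune entry point and its AND/OR recursion. */
final class PruneEntrySketch {
    private PruneEntrySketch() {
    }

    /** Stand-in for a filter tree node: "EQ" leaf, or "AND"/"OR" with children. */
    static final class Node {
        final String op;
        final String column;
        final long value;
        final List<Node> children;

        private Node(String op, String column, long value, List<Node> children) {
            this.op = op;
            this.column = column;
            this.value = value;
            this.children = children;
        }

        static Node eq(String column, long value) {
            return new Node("EQ", column, value, null);
        }

        static Node and(Node... children) {
            return new Node("AND", null, 0L, Arrays.asList(children));
        }

        static Node or(Node... children) {
            return new Node("OR", null, 0L, Arrays.asList(children));
        }
    }

    /** Entry point: no filter or no metadata means the segment cannot be pruned. */
    static boolean prune(Node filter, Map<String, long[]> minMaxByColumn) {
        if (filter == null || minMaxByColumn == null) {
            return false;
        }
        return pruneNode(filter, minMaxByColumn);
    }

    private static boolean pruneNode(Node node, Map<String, long[]> minMaxByColumn) {
        switch (node.op) {
            case "EQ":
                long[] minMax = minMaxByColumn.get(node.column);
                // Assumption of this sketch: an unknown column keeps the segment.
                return minMax != null && (node.value < minMax[0] || node.value > minMax[1]);
            case "AND":
                // One impossible conjunct makes the whole conjunction impossible.
                for (Node child : node.children) {
                    if (pruneNode(child, minMaxByColumn)) {
                        return true;
                    }
                }
                return false;
            case "OR":
                // Every disjunct must be impossible before the segment can be skipped.
                for (Node child : node.children) {
                    if (!pruneNode(child, minMaxByColumn)) {
                        return false;
                    }
                }
                return true;
            default:
                throw new IllegalStateException("Unsupported filter operator: " + node.op);
        }
    }
}
```

With a column ts covering [100, 200], the filter ts = 50 AND ts = 150 prunes the segment, whereas ts = 50 OR ts = 150 does not, because the second disjunct may still match.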

Example 3 with ColumnMetadata

use of com.linkedin.pinot.core.segment.index.ColumnMetadata in project pinot by linkedin.

the class ColumnMinMaxValueGenerator method addColumnMinMaxValueForColumn.

private void addColumnMinMaxValueForColumn(String columnName) throws Exception {
    // Skip column without dictionary or with min/max value already set
    ColumnMetadata columnMetadata = _segmentMetadata.getColumnMetadataFor(columnName);
    if ((!columnMetadata.hasDictionary()) || (columnMetadata.getMinValue() != null)) {
        return;
    }
    PinotDataBuffer dictionaryBuffer = _segmentWriter.getIndexFor(columnName, ColumnIndexType.DICTIONARY);
    FieldSpec.DataType dataType = columnMetadata.getDataType();
    switch(dataType) {
        case INT:
            IntDictionary intDictionary = new IntDictionary(dictionaryBuffer, columnMetadata);
            SegmentColumnarIndexCreator.addColumnMinMaxValueInfo(_segmentProperties, columnName, intDictionary.getStringValue(0), intDictionary.getStringValue(intDictionary.length() - 1));
            break;
        case LONG:
            LongDictionary longDictionary = new LongDictionary(dictionaryBuffer, columnMetadata);
            SegmentColumnarIndexCreator.addColumnMinMaxValueInfo(_segmentProperties, columnName, longDictionary.getStringValue(0), longDictionary.getStringValue(longDictionary.length() - 1));
            break;
        case FLOAT:
            FloatDictionary floatDictionary = new FloatDictionary(dictionaryBuffer, columnMetadata);
            SegmentColumnarIndexCreator.addColumnMinMaxValueInfo(_segmentProperties, columnName, floatDictionary.getStringValue(0), floatDictionary.getStringValue(floatDictionary.length() - 1));
            break;
        case DOUBLE:
            DoubleDictionary doubleDictionary = new DoubleDictionary(dictionaryBuffer, columnMetadata);
            SegmentColumnarIndexCreator.addColumnMinMaxValueInfo(_segmentProperties, columnName, doubleDictionary.getStringValue(0), doubleDictionary.getStringValue(doubleDictionary.length() - 1));
            break;
        case STRING:
            StringDictionary stringDictionary = new StringDictionary(dictionaryBuffer, columnMetadata);
            SegmentColumnarIndexCreator.addColumnMinMaxValueInfo(_segmentProperties, columnName, stringDictionary.get(0), stringDictionary.get(stringDictionary.length() - 1));
            break;
        default:
            throw new IllegalStateException("Unsupported data type: " + dataType + " for column: " + columnName);
    }
    _minMaxValueAdded = true;
}
Also used : ColumnMetadata(com.linkedin.pinot.core.segment.index.ColumnMetadata) LongDictionary(com.linkedin.pinot.core.segment.index.readers.LongDictionary) FloatDictionary(com.linkedin.pinot.core.segment.index.readers.FloatDictionary) PinotDataBuffer(com.linkedin.pinot.core.segment.memory.PinotDataBuffer) DoubleDictionary(com.linkedin.pinot.core.segment.index.readers.DoubleDictionary) FieldSpec(com.linkedin.pinot.common.data.FieldSpec) IntDictionary(com.linkedin.pinot.core.segment.index.readers.IntDictionary) StringDictionary(com.linkedin.pinot.core.segment.index.readers.StringDictionary)
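
The method above reads the minimum from dictionary index 0 and the maximum from the last index, which works when the dictionary stores its distinct values in sorted order. The sketch below illustrates the idea with plain collections; DictionaryMinMaxSketch and its methods are hypothetical, and a sorted java.util.List is assumed as a stand-in for Pinot's dictionary readers.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical illustration: min/max of a column taken from a sorted dictionary. */
final class DictionaryMinMaxSketch {
    private DictionaryMinMaxSketch() {
    }

    /** Builds a dictionary of distinct values in natural (ascending) order. */
    static <T extends Comparable<T>> List<T> buildSortedDictionary(Collection<T> rawValues) {
        return new ArrayList<>(new TreeSet<>(rawValues));
    }

    /** Returns {min, max} as strings: the first and last dictionary entries. */
    static <T extends Comparable<T>> String[] minMaxAsStrings(List<T> sortedDictionary) {
        if (sortedDictionary.isEmpty()) {
            throw new IllegalArgumentException("Dictionary must not be empty");
        }
        T min = sortedDictionary.get(0);
        T max = sortedDictionary.get(sortedDictionary.size() - 1);
        return new String[] {String.valueOf(min), String.valueOf(max)};
    }
}
```

For the raw values 42, 7, 19, 7 the dictionary holds three entries (7, 19, 42), so the sketch reports "7" as the minimum and "42" as the maximum without scanning the column itself.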

Example 4 with ColumnMetadata

use of com.linkedin.pinot.core.segment.index.ColumnMetadata in project pinot by linkedin.

the class InvertedIndexHandler method getInvertedIndexColumns.

private Set<String> getInvertedIndexColumns() {
    Set<String> invertedIndexColumns = new HashSet<>();
    if (indexConfig == null) {
        return invertedIndexColumns;
    }
    Set<String> invertedIndexColumnsFromConfig = indexConfig.getLoadingInvertedIndexColumns();
    for (String column : invertedIndexColumnsFromConfig) {
        ColumnMetadata columnMetadata = segmentMetadata.getColumnMetadataFor(column);
        if (columnMetadata != null && !columnMetadata.isSorted()) {
            invertedIndexColumns.add(column);
        }
    }
    return invertedIndexColumns;
}
Also used : ColumnMetadata(com.linkedin.pinot.core.segment.index.ColumnMetadata) HashSet(java.util.HashSet)
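
The selection rule in getInvertedIndexColumns can be tried in isolation: a configured column is kept only if the segment has metadata for it and the column is not sorted. The sketch below is hypothetical and uses a map from column name to a sorted flag in place of ColumnMetadata; a column absent from the map models a null metadata lookup.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the inverted-index column selection shown above. */
final class InvertedIndexColumnsSketch {
    private InvertedIndexColumnsSketch() {
    }

    static Set<String> select(Set<String> configuredColumns, Map<String, Boolean> sortedByColumn) {
        Set<String> selected = new HashSet<>();
        // No configuration: nothing to index.
        if (configuredColumns == null) {
            return selected;
        }
        for (String column : configuredColumns) {
            Boolean sorted = sortedByColumn.get(column);
            // Skip columns unknown to the segment and columns that are sorted.
            if (sorted != null && !sorted) {
                selected.add(column);
            }
        }
        return selected;
    }
}
```

Given the configured columns daysSinceEpoch (sorted), country (unsorted), and a column the segment does not contain, only country is selected.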

Example 5 with ColumnMetadata

use of com.linkedin.pinot.core.segment.index.ColumnMetadata in project pinot by linkedin.

the class DictionariesTest method test2.

@Test
public void test2() throws Exception {
    final IndexSegmentImpl heapSegment = (IndexSegmentImpl) ColumnarSegmentLoader.load(segmentDirectory, ReadMode.heap);
    final IndexSegmentImpl mmapSegment = (IndexSegmentImpl) ColumnarSegmentLoader.load(segmentDirectory, ReadMode.mmap);
    final Map<String, ColumnMetadata> metadataMap = ((SegmentMetadataImpl) mmapSegment.getSegmentMetadata()).getColumnMetadataMap();
    for (final String column : metadataMap.keySet()) {
        final ImmutableDictionaryReader heapDictionary = heapSegment.getDictionaryFor(column);
        final ImmutableDictionaryReader mmapDictionary = mmapSegment.getDictionaryFor(column);
        final Set<Object> uniques = uniqueEntries.get(column);
        final List<Object> list = Arrays.asList(uniques.toArray());
        Collections.shuffle(list);
        for (final Object entry : list) {
            Assert.assertEquals(mmapDictionary.indexOf(entry), heapDictionary.indexOf(entry));
            if (!column.equals("pageKey")) {
                Assert.assertFalse(heapDictionary.indexOf(entry) < 0);
                Assert.assertFalse(mmapDictionary.indexOf(entry) < 0);
            }
        }
    }
}
Also used : ColumnMetadata(com.linkedin.pinot.core.segment.index.ColumnMetadata) IndexSegmentImpl(com.linkedin.pinot.core.segment.index.IndexSegmentImpl) ImmutableDictionaryReader(com.linkedin.pinot.core.segment.index.readers.ImmutableDictionaryReader) SegmentMetadataImpl(com.linkedin.pinot.core.segment.index.SegmentMetadataImpl) Test(org.testng.annotations.Test)

Aggregations

ColumnMetadata (com.linkedin.pinot.core.segment.index.ColumnMetadata) 16
SegmentMetadataImpl (com.linkedin.pinot.core.segment.index.SegmentMetadataImpl) 10
PinotDataBuffer (com.linkedin.pinot.core.segment.memory.PinotDataBuffer) 5
SegmentDirectory (com.linkedin.pinot.core.segment.store.SegmentDirectory) 5
File (java.io.File) 5
FieldSpec (com.linkedin.pinot.common.data.FieldSpec) 4
Test (org.testng.annotations.Test) 4
SingleColumnMultiValueReader (com.linkedin.pinot.core.io.reader.SingleColumnMultiValueReader) 3
SingleColumnSingleValueReader (com.linkedin.pinot.core.io.reader.SingleColumnSingleValueReader) 3
ImmutableDictionaryReader (com.linkedin.pinot.core.segment.index.readers.ImmutableDictionaryReader) 3
HashMap (java.util.HashMap) 3
FilterQueryTree (com.linkedin.pinot.common.utils.request.FilterQueryTree) 2
IndexSegmentImpl (com.linkedin.pinot.core.segment.index.IndexSegmentImpl) 2
ColumnIndexContainer (com.linkedin.pinot.core.segment.index.column.ColumnIndexContainer) 2
BitmapInvertedIndexReader (com.linkedin.pinot.core.segment.index.readers.BitmapInvertedIndexReader) 2
IntDictionary (com.linkedin.pinot.core.segment.index.readers.IntDictionary) 2
StringDictionary (com.linkedin.pinot.core.segment.index.readers.StringDictionary) 2
PropertiesConfiguration (org.apache.commons.configuration.PropertiesConfiguration) 2
ImmutableRoaringBitmap (org.roaringbitmap.buffer.ImmutableRoaringBitmap) 2
DimensionFieldSpec (com.linkedin.pinot.common.data.DimensionFieldSpec) 1