Example 36 with LongPoint

use of org.apache.lucene.document.LongPoint in project lucene-solr by apache.

the class TestFieldCacheSort method testLongMissing.

/** Tests sorting on type long with a missing value */
public void testLongMissing() throws IOException {
    Directory dir = newDirectory();
    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
    Document doc = new Document();
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new LongPoint("value", -1));
    doc.add(new StoredField("value", -1));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new LongPoint("value", 4));
    doc.add(new StoredField("value", 4));
    writer.addDocument(doc);
    IndexReader ir = UninvertingReader.wrap(writer.getReader(), Collections.singletonMap("value", Type.LONG_POINT));
    writer.close();
    IndexSearcher searcher = newSearcher(ir, false);
    Sort sort = new Sort(new SortField("value", SortField.Type.LONG));
    TopDocs td = searcher.search(new MatchAllDocsQuery(), 10, sort);
    assertEquals(3, td.totalHits);
    // null is treated as 0
    assertEquals("-1", searcher.doc(td.scoreDocs[0].doc).get("value"));
    assertNull(searcher.doc(td.scoreDocs[1].doc).get("value"));
    assertEquals("4", searcher.doc(td.scoreDocs[2].doc).get("value"));
    TestUtil.checkReader(ir);
    ir.close();
    dir.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) TopDocs(org.apache.lucene.search.TopDocs) StoredField(org.apache.lucene.document.StoredField) IndexReader(org.apache.lucene.index.IndexReader) Sort(org.apache.lucene.search.Sort) SortField(org.apache.lucene.search.SortField) LongPoint(org.apache.lucene.document.LongPoint) Document(org.apache.lucene.document.Document) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)
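
The order asserted in testLongMissing (-1, then the document with no value, then 4) follows from the missing value comparing as 0. The stdlib-only sketch below reproduces that ordering on a plain list. It is not how Lucene implements the sort, and the class and method names are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: sorts nullable longs, comparing a missing (null) value as 0. */
final class MissingAsZeroSort {

    static List<Long> sortMissingAsZero(List<Long> values) {
        List<Long> sorted = new ArrayList<>(values);
        // The key extractor maps null to 0, so the comparator never sees a null key.
        sorted.sort(Comparator.comparingLong((Long v) -> v == null ? 0L : v));
        return sorted;
    }

    public static void main(String[] args) {
        // Same three "documents" as the test: no value, -1, and 4.
        System.out.println(sortMissingAsZero(Arrays.asList(null, -1L, 4L)));
    }
}
```

The null entry ends up between -1 and 4, matching the assertNull on the second hit.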

Example 37 with LongPoint

use of org.apache.lucene.document.LongPoint in project lucene-solr by apache.

the class TestFieldCacheSort method testLong.

/** Tests sorting on type long */
public void testLong() throws IOException {
    Directory dir = newDirectory();
    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
    Document doc = new Document();
    doc.add(new LongPoint("value", 3000000000L));
    doc.add(new StoredField("value", 3000000000L));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new LongPoint("value", -1));
    doc.add(new StoredField("value", -1));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new LongPoint("value", 4));
    doc.add(new StoredField("value", 4));
    writer.addDocument(doc);
    IndexReader ir = UninvertingReader.wrap(writer.getReader(), Collections.singletonMap("value", Type.LONG_POINT));
    writer.close();
    IndexSearcher searcher = newSearcher(ir, false);
    Sort sort = new Sort(new SortField("value", SortField.Type.LONG));
    TopDocs td = searcher.search(new MatchAllDocsQuery(), 10, sort);
    assertEquals(3, td.totalHits);
    // numeric order
    assertEquals("-1", searcher.doc(td.scoreDocs[0].doc).get("value"));
    assertEquals("4", searcher.doc(td.scoreDocs[1].doc).get("value"));
    assertEquals("3000000000", searcher.doc(td.scoreDocs[2].doc).get("value"));
    TestUtil.checkReader(ir);
    ir.close();
    dir.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) TopDocs(org.apache.lucene.search.TopDocs) StoredField(org.apache.lucene.document.StoredField) IndexReader(org.apache.lucene.index.IndexReader) Sort(org.apache.lucene.search.Sort) SortField(org.apache.lucene.search.SortField) LongPoint(org.apache.lucene.document.LongPoint) Document(org.apache.lucene.document.Document) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)
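
testLong indexes 3000000000L, which does not fit in an int, and expects numeric order. Point fields are compared as byte arrays, so the encoding has to make unsigned byte order agree with signed numeric order. The sketch below assumes the usual scheme for this (flip the sign bit, write big-endian). It is a stand-alone illustration with hypothetical names, not a copy of LongPoint.encodeDimension.

```java
/** Hypothetical sketch of a sortable byte encoding for signed longs. */
final class SortableLongBytes {

    /** Flips the sign bit and writes the value big-endian into 8 bytes. */
    static byte[] encode(long value) {
        long flipped = value ^ Long.MIN_VALUE;
        byte[] out = new byte[Long.BYTES];
        for (int i = 0; i < Long.BYTES; i++) {
            out[i] = (byte) (flipped >>> (56 - 8 * i));
        }
        return out;
    }

    /** Inverse of encode. */
    static long decode(byte[] bytes) {
        long v = 0;
        for (int i = 0; i < Long.BYTES; i++) {
            v = (v << 8) | (bytes[i] & 0xFFL);
        }
        return v ^ Long.MIN_VALUE;
    }

    /** Compares two equal-length arrays byte by byte, treating bytes as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }
}
```

With this encoding the three test values compare as -1 < 4 < 3000000000 at the byte level, which is the order the test asserts.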

Example 38 with LongPoint

use of org.apache.lucene.document.LongPoint in project lucene-solr by apache.

the class JoinUtil method createJoinQuery.

/**
   * Method for query time joining for numeric fields. It supports multi- and single-valued longs, ints, floats and doubles.
   * All considerations from {@link JoinUtil#createJoinQuery(String, boolean, String, Query, IndexSearcher, ScoreMode)} are applicable here too,
   * though memory consumption might be higher.
   * <p>
   *
   * @param fromField                 The from field to join from
   * @param multipleValuesPerDocument Whether the from field has multiple values per document;
   *                                  when true, fromField might be {@link DocValuesType#SORTED_NUMERIC},
   *                                  otherwise fromField should be {@link DocValuesType#NUMERIC}
   * @param toField                   The to field to join to, should be {@link IntPoint}, {@link LongPoint}, {@link FloatPoint}
   *                                  or {@link DoublePoint}.
   * @param numericType               either {@link java.lang.Integer}, {@link java.lang.Long}, {@link java.lang.Float}
   *                                  or {@link java.lang.Double}; it should correspond to the toField type
   * @param fromQuery                 The query to match documents on the from side
   * @param fromSearcher              The searcher that executed the specified fromQuery
   * @param scoreMode                 Instructs how scores from the fromQuery are mapped to the returned query
   * @return a {@link Query} instance that can be used to join documents based on the
   *         terms in the from and to field
   * @throws IOException If I/O related errors occur
   */
public static Query createJoinQuery(String fromField, boolean multipleValuesPerDocument, String toField, Class<? extends Number> numericType, Query fromQuery, IndexSearcher fromSearcher, ScoreMode scoreMode) throws IOException {
    TreeSet<Long> joinValues = new TreeSet<>();
    Map<Long, Float> aggregatedScores = new HashMap<>();
    Map<Long, Integer> occurrences = new HashMap<>();
    boolean needsScore = scoreMode != ScoreMode.None;
    BiConsumer<Long, Float> scoreAggregator;
    if (scoreMode == ScoreMode.Max) {
        scoreAggregator = (key, score) -> {
            Float currentValue = aggregatedScores.putIfAbsent(key, score);
            if (currentValue != null) {
                aggregatedScores.put(key, Math.max(currentValue, score));
            }
        };
    } else if (scoreMode == ScoreMode.Min) {
        scoreAggregator = (key, score) -> {
            Float currentValue = aggregatedScores.putIfAbsent(key, score);
            if (currentValue != null) {
                aggregatedScores.put(key, Math.min(currentValue, score));
            }
        };
    } else if (scoreMode == ScoreMode.Total) {
        scoreAggregator = (key, score) -> {
            Float currentValue = aggregatedScores.putIfAbsent(key, score);
            if (currentValue != null) {
                aggregatedScores.put(key, currentValue + score);
            }
        };
    } else if (scoreMode == ScoreMode.Avg) {
        scoreAggregator = (key, score) -> {
            Float currentScore = aggregatedScores.putIfAbsent(key, score);
            if (currentScore != null) {
                aggregatedScores.put(key, currentScore + score);
            }
            Integer currentOccurrence = occurrences.putIfAbsent(key, 1);
            if (currentOccurrence != null) {
                occurrences.put(key, ++currentOccurrence);
            }
        };
    } else {
        scoreAggregator = (key, score) -> {
            throw new UnsupportedOperationException();
        };
    }
    LongFunction<Float> joinScorer;
    if (scoreMode == ScoreMode.Avg) {
        joinScorer = (joinValue) -> {
            Float aggregatedScore = aggregatedScores.get(joinValue);
            Integer occurrence = occurrences.get(joinValue);
            return aggregatedScore / occurrence;
        };
    } else {
        joinScorer = aggregatedScores::get;
    }
    Collector collector;
    if (multipleValuesPerDocument) {
        collector = new SimpleCollector() {

            SortedNumericDocValues sortedNumericDocValues;

            Scorer scorer;

            @Override
            public void collect(int doc) throws IOException {
                if (doc > sortedNumericDocValues.docID()) {
                    sortedNumericDocValues.advance(doc);
                }
                if (doc == sortedNumericDocValues.docID()) {
                    for (int i = 0; i < sortedNumericDocValues.docValueCount(); i++) {
                        long value = sortedNumericDocValues.nextValue();
                        joinValues.add(value);
                        if (needsScore) {
                            scoreAggregator.accept(value, scorer.score());
                        }
                    }
                }
            }

            @Override
            protected void doSetNextReader(LeafReaderContext context) throws IOException {
                sortedNumericDocValues = DocValues.getSortedNumeric(context.reader(), fromField);
            }

            @Override
            public void setScorer(Scorer scorer) throws IOException {
                this.scorer = scorer;
            }

            @Override
            public boolean needsScores() {
                return needsScore;
            }
        };
    } else {
        collector = new SimpleCollector() {

            NumericDocValues numericDocValues;

            Scorer scorer;

            private int lastDocID = -1;

            private boolean docsInOrder(int docID) {
                if (docID < lastDocID) {
                    throw new AssertionError("docs out of order: lastDocID=" + lastDocID + " vs docID=" + docID);
                }
                lastDocID = docID;
                return true;
            }

            @Override
            public void collect(int doc) throws IOException {
                assert docsInOrder(doc);
                int dvDocID = numericDocValues.docID();
                if (dvDocID < doc) {
                    dvDocID = numericDocValues.advance(doc);
                }
                long value;
                if (dvDocID == doc) {
                    value = numericDocValues.longValue();
                } else {
                    value = 0;
                }
                joinValues.add(value);
                if (needsScore) {
                    scoreAggregator.accept(value, scorer.score());
                }
            }

            @Override
            protected void doSetNextReader(LeafReaderContext context) throws IOException {
                numericDocValues = DocValues.getNumeric(context.reader(), fromField);
                lastDocID = -1;
            }

            @Override
            public void setScorer(Scorer scorer) throws IOException {
                this.scorer = scorer;
            }

            @Override
            public boolean needsScores() {
                return needsScore;
            }
        };
    }
    fromSearcher.search(fromQuery, collector);
    Iterator<Long> iterator = joinValues.iterator();
    final int bytesPerDim;
    final BytesRef encoded = new BytesRef();
    final PointInSetIncludingScoreQuery.Stream stream;
    if (Integer.class.equals(numericType)) {
        bytesPerDim = Integer.BYTES;
        stream = new PointInSetIncludingScoreQuery.Stream() {

            @Override
            public BytesRef next() {
                if (iterator.hasNext()) {
                    long value = iterator.next();
                    IntPoint.encodeDimension((int) value, encoded.bytes, 0);
                    if (needsScore) {
                        score = joinScorer.apply(value);
                    }
                    return encoded;
                } else {
                    return null;
                }
            }
        };
    } else if (Long.class.equals(numericType)) {
        bytesPerDim = Long.BYTES;
        stream = new PointInSetIncludingScoreQuery.Stream() {

            @Override
            public BytesRef next() {
                if (iterator.hasNext()) {
                    long value = iterator.next();
                    LongPoint.encodeDimension(value, encoded.bytes, 0);
                    if (needsScore) {
                        score = joinScorer.apply(value);
                    }
                    return encoded;
                } else {
                    return null;
                }
            }
        };
    } else if (Float.class.equals(numericType)) {
        bytesPerDim = Float.BYTES;
        stream = new PointInSetIncludingScoreQuery.Stream() {

            @Override
            public BytesRef next() {
                if (iterator.hasNext()) {
                    long value = iterator.next();
                    FloatPoint.encodeDimension(Float.intBitsToFloat((int) value), encoded.bytes, 0);
                    if (needsScore) {
                        score = joinScorer.apply(value);
                    }
                    return encoded;
                } else {
                    return null;
                }
            }
        };
    } else if (Double.class.equals(numericType)) {
        bytesPerDim = Double.BYTES;
        stream = new PointInSetIncludingScoreQuery.Stream() {

            @Override
            public BytesRef next() {
                if (iterator.hasNext()) {
                    long value = iterator.next();
                    DoublePoint.encodeDimension(Double.longBitsToDouble(value), encoded.bytes, 0);
                    if (needsScore) {
                        score = joinScorer.apply(value);
                    }
                    return encoded;
                } else {
                    return null;
                }
            }
        };
    } else {
        throw new IllegalArgumentException("unsupported numeric type, only Integer, Long, Float and Double are supported");
    }
    encoded.bytes = new byte[bytesPerDim];
    encoded.length = bytesPerDim;
    if (needsScore) {
        return new PointInSetIncludingScoreQuery(scoreMode, fromQuery, multipleValuesPerDocument, toField, bytesPerDim, stream) {

            @Override
            protected String toString(byte[] value) {
                return toString.apply(value, numericType);
            }
        };
    } else {
        return new PointInSetQuery(toField, 1, bytesPerDim, stream) {

            @Override
            protected String toString(byte[] value) {
                return PointInSetIncludingScoreQuery.toString.apply(value, numericType);
            }
        };
    }
}
Also used : Query(org.apache.lucene.search.Query) LongPoint(org.apache.lucene.document.LongPoint) MatchNoDocsQuery(org.apache.lucene.search.MatchNoDocsQuery) NumericDocValues(org.apache.lucene.index.NumericDocValues) HashMap(java.util.HashMap) TreeSet(java.util.TreeSet) DoublePoint(org.apache.lucene.document.DoublePoint) PointInSetQuery(org.apache.lucene.search.PointInSetQuery) Locale(java.util.Locale) Map(java.util.Map) BiConsumer(java.util.function.BiConsumer) SortedSetDocValues(org.apache.lucene.index.SortedSetDocValues) IntPoint(org.apache.lucene.document.IntPoint) LeafReaderContext(org.apache.lucene.index.LeafReaderContext) SortedDocValues(org.apache.lucene.index.SortedDocValues) SimpleCollector(org.apache.lucene.search.SimpleCollector) Scorer(org.apache.lucene.search.Scorer) Iterator(java.util.Iterator) LongFunction(java.util.function.LongFunction) FloatPoint(org.apache.lucene.document.FloatPoint) MultiDocValues(org.apache.lucene.index.MultiDocValues) BytesRef(org.apache.lucene.util.BytesRef) IOException(java.io.IOException) Collector(org.apache.lucene.search.Collector) SortedNumericDocValues(org.apache.lucene.index.SortedNumericDocValues) Function(org.apache.lucene.search.join.DocValuesTermsCollector.Function) DocValues(org.apache.lucene.index.DocValues) DocValuesType(org.apache.lucene.index.DocValuesType) LeafReader(org.apache.lucene.index.LeafReader) BinaryDocValues(org.apache.lucene.index.BinaryDocValues) IndexSearcher(org.apache.lucene.search.IndexSearcher)
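
The scoreAggregator lambdas in createJoinQuery keep one aggregated score per join value, using putIfAbsent followed by put. A possible stand-alone restatement of the same bookkeeping uses Map.merge (Java 8+). In the sketch below, the Mode enum is a hypothetical stand-in for ScoreMode and none of the names come from Lucene; it only illustrates the Max, Min, Total and Avg cases.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone sketch; Mode is a hypothetical stand-in for the join module's ScoreMode. */
final class ScoreAggregationSketch {

    enum Mode { MAX, MIN, TOTAL, AVG }

    private final Mode mode;
    private final Map<Long, Float> aggregatedScores = new HashMap<>();
    private final Map<Long, Integer> occurrences = new HashMap<>();

    ScoreAggregationSketch(Mode mode) {
        this.mode = mode;
    }

    /** Folds one (join value, score) pair into the running aggregate. */
    void accept(long joinValue, float score) {
        switch (mode) {
            case MAX:
                aggregatedScores.merge(joinValue, score, Float::max);
                break;
            case MIN:
                aggregatedScores.merge(joinValue, score, Float::min);
                break;
            default:
                // TOTAL and AVG both keep a running sum; AVG also counts occurrences.
                aggregatedScores.merge(joinValue, score, Float::sum);
                if (mode == Mode.AVG) {
                    occurrences.merge(joinValue, 1, Integer::sum);
                }
        }
    }

    /** Final score for a join value that was seen at least once (throws otherwise). */
    float score(long joinValue) {
        float aggregated = aggregatedScores.get(joinValue);
        return mode == Mode.AVG ? aggregated / occurrences.get(joinValue) : aggregated;
    }
}
```

As in the listing, the average is computed only at read time by dividing the running sum by the occurrence count.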

Example 39 with LongPoint

use of org.apache.lucene.document.LongPoint in project lucene-solr by apache.

the class TestJoinUtil method addLinkFields.

private void addLinkFields(final Random random, Document document, final String fieldName, String linkValue, boolean multipleValuesPerDocument, boolean globalOrdinalJoin) {
    document.add(newTextField(random, fieldName, linkValue, Field.Store.NO));
    final int linkInt = Integer.parseUnsignedInt(linkValue, 16);
    document.add(new IntPoint(fieldName + "INT", linkInt));
    document.add(new FloatPoint(fieldName + "FLOAT", linkInt));
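    // Note: linkInt is an int, so "<< 32" is evaluated in 32-bit arithmetic, where the shift
    // distance is taken mod 32; the expression below therefore evaluates to linkInt itself.
    final long linkLong = linkInt << 32 | linkInt;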
    document.add(new LongPoint(fieldName + "LONG", linkLong));
    document.add(new DoublePoint(fieldName + "DOUBLE", linkLong));
    if (multipleValuesPerDocument) {
        document.add(new SortedSetDocValuesField(fieldName, new BytesRef(linkValue)));
        document.add(new SortedNumericDocValuesField(fieldName + "INT", linkInt));
        document.add(new SortedNumericDocValuesField(fieldName + "FLOAT", Float.floatToRawIntBits(linkInt)));
        document.add(new SortedNumericDocValuesField(fieldName + "LONG", linkLong));
        document.add(new SortedNumericDocValuesField(fieldName + "DOUBLE", Double.doubleToRawLongBits(linkLong)));
    } else {
        document.add(new SortedDocValuesField(fieldName, new BytesRef(linkValue)));
        document.add(new NumericDocValuesField(fieldName + "INT", linkInt));
        document.add(new FloatDocValuesField(fieldName + "FLOAT", linkInt));
        document.add(new NumericDocValuesField(fieldName + "LONG", linkLong));
        document.add(new DoubleDocValuesField(fieldName + "DOUBLE", linkLong));
    }
    if (globalOrdinalJoin) {
        document.add(new SortedDocValuesField("join_field", new BytesRef(linkValue)));
    }
}
Also used : FloatDocValuesField(org.apache.lucene.document.FloatDocValuesField) LongPoint(org.apache.lucene.document.LongPoint) DoublePoint(org.apache.lucene.document.DoublePoint) IntPoint(org.apache.lucene.document.IntPoint) FloatPoint(org.apache.lucene.document.FloatPoint) SortedNumericDocValuesField(org.apache.lucene.document.SortedNumericDocValuesField) NumericDocValuesField(org.apache.lucene.document.NumericDocValuesField) DoubleDocValuesField(org.apache.lucene.document.DoubleDocValuesField) SortedDocValuesField(org.apache.lucene.document.SortedDocValuesField) SortedSetDocValuesField(org.apache.lucene.document.SortedSetDocValuesField) BytesRef(org.apache.lucene.util.BytesRef)
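
In addLinkFields, the expression `linkInt << 32 | linkInt` operates on an int. Java uses only the low five bits of the shift distance for int operands, so a shift by 32 is a shift by 0 and the result is just linkInt widened to long. The sketch below contrasts that with a widened variant. Both method names are hypothetical, and the widened form is shown only for comparison, not as what the test intends.

```java
/** Hypothetical sketch contrasting an int shift by 32 with a long shift by 32. */
final class ShiftWidthSketch {

    /** Same expression shape as the listing: the shift happens in int arithmetic. */
    static long packNarrow(int v) {
        return v << 32 | v; // shift distance 32 is masked to 0, so this is just v
    }

    /** Widens first, so the value really lands in the upper 32 bits. */
    static long packWide(int v) {
        return ((long) v << 32) | (v & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(packNarrow(0x1234)));
        System.out.println(Long.toHexString(packWide(0x1234)));
    }
}
```

For 0x1234 the first call yields 0x1234, while the second yields 0x123400001234.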

Example 40 with LongPoint

use of org.apache.lucene.document.LongPoint in project lucene-solr by apache.

the class Test2BPoints method test1D.

public void test1D() throws Exception {
    Directory dir = FSDirectory.open(createTempDir("2BPoints1D"));
    IndexWriterConfig iwc = new IndexWriterConfig(new MockAnalyzer(random())).setCodec(getCodec()).setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH).setRAMBufferSizeMB(256.0).setMergeScheduler(new ConcurrentMergeScheduler()).setMergePolicy(newLogMergePolicy(false, 10)).setOpenMode(IndexWriterConfig.OpenMode.CREATE);
    ((ConcurrentMergeScheduler) iwc.getMergeScheduler()).setMaxMergesAndThreads(6, 3);
    IndexWriter w = new IndexWriter(dir, iwc);
    MergePolicy mp = w.getConfig().getMergePolicy();
    if (mp instanceof LogByteSizeMergePolicy) {
        // 1 petabyte:
        ((LogByteSizeMergePolicy) mp).setMaxMergeMB(1024 * 1024 * 1024);
    }
    final int numDocs = (Integer.MAX_VALUE / 26) + 1;
    int counter = 0;
    for (int i = 0; i < numDocs; i++) {
        Document doc = new Document();
        for (int j = 0; j < 26; j++) {
            long x = (((long) random().nextInt() << 32)) | (long) counter;
            doc.add(new LongPoint("long", x));
            counter++;
        }
        w.addDocument(doc);
        if (VERBOSE && i % 100000 == 0) {
            System.out.println(i + " of " + numDocs + "...");
        }
    }
    w.forceMerge(1);
    DirectoryReader r = DirectoryReader.open(w);
    IndexSearcher s = new IndexSearcher(r);
    assertEquals(numDocs, s.count(LongPoint.newRangeQuery("long", Long.MIN_VALUE, Long.MAX_VALUE)));
    assertTrue(r.leaves().get(0).reader().getPointValues("long").size() > Integer.MAX_VALUE);
    r.close();
    w.close();
    System.out.println("TEST: now CheckIndex");
    TestUtil.checkIndex(dir);
    dir.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) LongPoint(org.apache.lucene.document.LongPoint) Document(org.apache.lucene.document.Document) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) Directory(org.apache.lucene.store.Directory) FSDirectory(org.apache.lucene.store.FSDirectory)
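
The constants in test1D can be checked with plain arithmetic: 26 points per document and `Integer.MAX_VALUE / 26 + 1` documents give just over Integer.MAX_VALUE points, and 1024 * 1024 * 1024 megabytes is 2^50 bytes, the "1 petabyte" in the comment. The sketch below only redoes that arithmetic, with hypothetical names; it does not touch any Lucene API.

```java
/** Hypothetical sketch that recomputes the constants used by test1D. */
final class TwoBillionPointsArithmetic {

    static final int POINTS_PER_DOC = 26;

    /** Document count as computed in the test. */
    static int numDocs() {
        return Integer.MAX_VALUE / POINTS_PER_DOC + 1;
    }

    /** Total points indexed; computed in long arithmetic to avoid int overflow. */
    static long totalPoints() {
        return (long) numDocs() * POINTS_PER_DOC;
    }

    /** Bytes represented by a max-merge setting of 1024 * 1024 * 1024 megabytes. */
    static long maxMergeBytes() {
        long megabytes = 1024L * 1024 * 1024;
        return megabytes * 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println(numDocs() + " docs, " + totalPoints() + " points");
        System.out.println(maxMergeBytes() == (1L << 50));
    }
}
```

The total comes to 2147483650 points, three more than Integer.MAX_VALUE, which is what the final size() assertion relies on.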

Aggregations

LongPoint (org.apache.lucene.document.LongPoint) 49
Document (org.apache.lucene.document.Document) 41
Directory (org.apache.lucene.store.Directory) 34
IndexReader (org.apache.lucene.index.IndexReader) 26
RandomIndexWriter (org.apache.lucene.index.RandomIndexWriter) 26
IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig) 20
DoublePoint (org.apache.lucene.document.DoublePoint) 19
FloatPoint (org.apache.lucene.document.FloatPoint) 16
IntPoint (org.apache.lucene.document.IntPoint) 16
IndexSearcher (org.apache.lucene.search.IndexSearcher) 16
NumericDocValuesField (org.apache.lucene.document.NumericDocValuesField) 15
IndexWriter (org.apache.lucene.index.IndexWriter) 14
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer) 8
BinaryPoint (org.apache.lucene.document.BinaryPoint) 8
SortedNumericDocValuesField (org.apache.lucene.document.SortedNumericDocValuesField) 8
StoredField (org.apache.lucene.document.StoredField) 8
Term (org.apache.lucene.index.Term) 7
BytesRef (org.apache.lucene.util.BytesRef) 7
MatchAllDocsQuery (org.apache.lucene.search.MatchAllDocsQuery) 6
SortedDocValuesField (org.apache.lucene.document.SortedDocValuesField) 5