Example 11 with FixedBitSet

use of org.apache.lucene.util.FixedBitSet in project lucene-solr by apache.

Class DocSetBase, method getBits.

/**
   * Return a {@link FixedBitSet} with a bit set for every document in this
   * {@link DocSet}. The default implementation iterates on all docs and sets
   * the relevant bits. You should override if you can provide a more efficient
   * implementation.
   */
protected FixedBitSet getBits() {
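    // size() is the number of docs in the set, not the largest doc id,
    // so the bit set may have to grow while iterating (see ensureCapacity below).
    FixedBitSet bits = new FixedBitSet(size());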
    for (DocIterator iter = iterator(); iter.hasNext(); ) {
        int nextDoc = iter.nextDoc();
        bits = FixedBitSet.ensureCapacity(bits, nextDoc);
        bits.set(nextDoc);
    }
    return bits;
}
Also used : FixedBitSet(org.apache.lucene.util.FixedBitSet)
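
The growth pattern in getBits can be illustrated without Lucene. The sketch below is a minimal stand-in, assuming a hypothetical GrowableBits class backed by a long[] (it is not part of Lucene or Solr): the initial capacity comes from the number of ids, and the backing array is enlarged whenever an id falls outside it.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not a Lucene or Solr class.
final class GrowableBits {
    private long[] words;

    GrowableBits(int numBits) {
        // Round up to whole 64-bit words, keeping at least one word.
        words = new long[Math.max(1, (numBits + 63) >>> 6)];
    }

    void set(int index) {
        int word = index >>> 6;
        if (word >= words.length) {
            // Grow so that the target word exists; doubling limits repeated copies.
            words = Arrays.copyOf(words, Math.max(word + 1, words.length * 2));
        }
        words[word] |= 1L << index; // shift count is masked to the low 6 bits
    }

    boolean get(int index) {
        int word = index >>> 6;
        return word < words.length && (words[word] & (1L << index)) != 0;
    }

    int cardinality() {
        int count = 0;
        for (long w : words) {
            count += Long.bitCount(w);
        }
        return count;
    }

    // Mirrors the loop in getBits: size from the id count, grow on demand.
    static GrowableBits fromDocIds(int[] docIds) {
        GrowableBits bits = new GrowableBits(docIds.length);
        for (int doc : docIds) {
            bits.set(doc);
        }
        return bits;
    }
}
```

Calling GrowableBits.fromDocIds(new int[] {3, 70, 1000}) starts with a single word and grows twice, which is the situation the ensureCapacity call in getBits guards against.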

Example 12 with FixedBitSet

use of org.apache.lucene.util.FixedBitSet in project lucene-solr by apache.

Class DocSetUtil, method createBigSet.

private static DocSet createBigSet(List<LeafReaderContext> leaves, PostingsEnum[] postList, int maxDoc, int firstReader) throws IOException {
    long[] bits = new long[FixedBitSet.bits2words(maxDoc)];
    int sz = 0;
    for (int i = firstReader; i < postList.length; i++) {
        PostingsEnum postings = postList[i];
        if (postings == null)
            continue;
        LeafReaderContext ctx = leaves.get(i);
        Bits liveDocs = ctx.reader().getLiveDocs();
        int base = ctx.docBase;
        for (; ; ) {
            int subId = postings.nextDoc();
            if (subId == DocIdSetIterator.NO_MORE_DOCS)
                break;
            if (liveDocs != null && !liveDocs.get(subId))
                continue;
            int globalId = subId + base;
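            // Word index is globalId / 64. Java masks a long shift count to its low
            // 6 bits, so (1L << globalId) selects bit (globalId % 64) in that word.
            bits[globalId >> 6] |= (1L << globalId);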
            sz++;
        }
    }
    BitDocSet docSet = new BitDocSet(new FixedBitSet(bits, maxDoc), sz);
    int smallSetSize = smallSetSize(maxDoc);
    if (sz < smallSetSize) {
        // make this optional?
        DocSet smallSet = toSmallSet(docSet);
        // assert equals(docSet, smallSet);
        return smallSet;
    }
    return docSet;
}
Also used : FixedBitSet(org.apache.lucene.util.FixedBitSet), LeafReaderContext(org.apache.lucene.index.LeafReaderContext), Bits(org.apache.lucene.util.Bits), PostingsEnum(org.apache.lucene.index.PostingsEnum)
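
createBigSet writes directly into a long[] rather than calling FixedBitSet.set. The word and bit arithmetic it relies on can be sketched in isolation. WordBits and its methods are hypothetical names for this illustration, and its bits2words is a stand-in that is assumed to match the rounding of the Lucene method of the same name.

```java
// Hypothetical helper for illustration only; not a Lucene or Solr class.
final class WordBits {
    /** Number of 64-bit words needed to hold numBits bits (rounds up). */
    static int bits2words(int numBits) {
        return (numBits + 63) >>> 6;
    }

    /** Sets bit 'index' in a packed long[] the same way createBigSet does. */
    static void set(long[] words, int index) {
        // index >> 6 picks the word; the shift count of a long is taken
        // modulo 64, so 1L << index picks the bit inside that word.
        words[index >> 6] |= 1L << index;
    }

    static boolean get(long[] words, int index) {
        return (words[index >> 6] & (1L << index)) != 0;
    }
}
```

For instance, setting index 70 touches word 1 (70 / 64) and bit 6 (70 % 64), so that word ends up holding the value 64.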

Example 13 with FixedBitSet

use of org.apache.lucene.util.FixedBitSet in project lucene-solr by apache.

Class DocSetBase, method andNot.

@Override
public DocSet andNot(DocSet other) {
    FixedBitSet newbits = getBits().clone();
    newbits.andNot(toBitSet(other));
    return new BitDocSet(newbits);
}
Also used : FixedBitSet(org.apache.lucene.util.FixedBitSet)
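
andNot clones the bits before modifying them because the bitwise operation mutates its receiver. A minimal sketch of the same copy-then-subtract pattern follows, using java.util.BitSet as a stand-in for FixedBitSet; AndNotSketch is a hypothetical class name.

```java
import java.util.BitSet;

// Hypothetical helper for illustration only; java.util.BitSet stands in for FixedBitSet.
final class AndNotSketch {
    /** Returns the bits set in 'a' but not in 'b', leaving both inputs unchanged. */
    static BitSet andNot(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone(); // copy first: BitSet.andNot mutates the receiver
        result.andNot(b);
        return result;
    }
}
```

With a = {1, 2, 3} and b = {2, 5}, the result is {1, 3} and a still contains 2. Dropping the clone would instead clear bit 2 in a itself.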

Example 14 with FixedBitSet

use of org.apache.lucene.util.FixedBitSet in project lucene-solr by apache.

Class DocSetBase, method toBitSet.

public static FixedBitSet toBitSet(DocSet set) {
    if (set instanceof DocSetBase) {
        return ((DocSetBase) set).getBits();
    } else {
        FixedBitSet bits = new FixedBitSet(64);
        for (DocIterator iter = set.iterator(); iter.hasNext(); ) {
            int nextDoc = iter.nextDoc();
            bits = FixedBitSet.ensureCapacity(bits, nextDoc);
            bits.set(nextDoc);
        }
        return bits;
    }
}
Also used : FixedBitSet(org.apache.lucene.util.FixedBitSet)

Example 15 with FixedBitSet

use of org.apache.lucene.util.FixedBitSet in project lucene-solr by apache.

Class RandomSamplingFacetsCollector, method createSample.

/** Create a sample of the given hits. */
private MatchingDocs createSample(MatchingDocs docs) {
    int maxdoc = docs.context.reader().maxDoc();
    // TODO: we could try the WAH8DocIdSet here as well, as the results will be sparse
    FixedBitSet sampleDocs = new FixedBitSet(maxdoc);
    int binSize = (int) (1.0 / samplingRate);
    try {
        int counter = 0;
        int limit, randomIndex;
        if (leftoverBin != NOT_CALCULATED) {
            limit = leftoverBin;
            // either NOT_CALCULATED, which means we already sampled from that bin,
            // or the next document to sample
            randomIndex = leftoverIndex;
        } else {
            limit = binSize;
            randomIndex = random.nextInt(binSize);
        }
        final DocIdSetIterator it = docs.bits.iterator();
        for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            if (counter == randomIndex) {
                sampleDocs.set(doc);
            }
            counter++;
            if (counter >= limit) {
                counter = 0;
                limit = binSize;
                randomIndex = random.nextInt(binSize);
            }
        }
        if (counter == 0) {
            // we either exhausted the bin and the iterator at the same time, or
            // this segment had no results. in the latter case we might want to
            // carry leftover to the next segment as is, but that complicates the
            // code and doesn't seem so important.
            leftoverBin = leftoverIndex = NOT_CALCULATED;
        } else {
            leftoverBin = limit - counter;
            if (randomIndex > counter) {
                // the document to sample is in the next bin
                leftoverIndex = randomIndex - counter;
            } else if (randomIndex < counter) {
                // we sampled a document from the bin, so just skip over remaining
                // documents in the bin in the next segment.
                leftoverIndex = NOT_CALCULATED;
            }
        }
        return new MatchingDocs(docs.context, new BitDocIdSet(sampleDocs), docs.totalHits, null);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
Also used : BitDocIdSet(org.apache.lucene.util.BitDocIdSet), FixedBitSet(org.apache.lucene.util.FixedBitSet), IOException(java.io.IOException), DocIdSetIterator(org.apache.lucene.search.DocIdSetIterator)
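
The core of createSample is one-per-bin sampling: hits are consumed in bins of 1 / samplingRate documents, and one randomly chosen position in each bin is kept. The sketch below shows only that core, assuming a hypothetical BinSampler class over a plain int[] of hits. It deliberately omits the leftoverBin / leftoverIndex bookkeeping that the original uses to carry a partial bin across segments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Lucene.
final class BinSampler {
    /** Keeps one randomly chosen hit from every full or partial bin of hits. */
    static List<Integer> sample(int[] hits, double samplingRate, Random random) {
        int binSize = (int) (1.0 / samplingRate);
        List<Integer> sampled = new ArrayList<>();
        int counter = 0;                           // position inside the current bin
        int randomIndex = random.nextInt(binSize); // position to keep in this bin
        for (int doc : hits) {
            if (counter == randomIndex) {
                sampled.add(doc);
            }
            counter++;
            if (counter >= binSize) {
                // Bin exhausted: start a new one and draw a new position.
                counter = 0;
                randomIndex = random.nextInt(binSize);
            }
        }
        return sampled;
    }
}
```

With 100 hits and a sampling rate of 0.25, every bin of 4 hits is full, so the sketch returns exactly 25 documents, one from each bin. A trailing partial bin may contribute nothing if its chosen position is never reached.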

Aggregations

FixedBitSet (org.apache.lucene.util.FixedBitSet): 162
Term (org.apache.lucene.index.Term): 27
LeafReaderContext (org.apache.lucene.index.LeafReaderContext): 26
Directory (org.apache.lucene.store.Directory): 25
BytesRef (org.apache.lucene.util.BytesRef): 22
IOException (java.io.IOException): 19
Document (org.apache.lucene.document.Document): 17
ArrayList (java.util.ArrayList): 15
Query (org.apache.lucene.search.Query): 15
NumericDocValues (org.apache.lucene.index.NumericDocValues): 14
BitDocIdSet (org.apache.lucene.util.BitDocIdSet): 13
Bits (org.apache.lucene.util.Bits): 13
LeafReader (org.apache.lucene.index.LeafReader): 12
IndexSearcher (org.apache.lucene.search.IndexSearcher): 12
TermQuery (org.apache.lucene.search.TermQuery): 12
IndexReader (org.apache.lucene.index.IndexReader): 11
HashSet (java.util.HashSet): 10
NumericDocValuesField (org.apache.lucene.document.NumericDocValuesField): 10
DocIterator (org.apache.solr.search.DocIterator): 10
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer): 9