Usage example of org.apache.lucene.index.RandomIndexWriter in the Apache lucene-solr project:
class BaseGeoPointTestCase, method verifyRandomPolygons.
/**
 * Indexes one document per (lat, lon) pair (a NaN latitude means the document gets no
 * point), randomly deletes roughly 1% of the documents while indexing, then runs
 * {@code atLeast(75)} random polygon queries and verifies every collected hit against
 * the brute-force {@link GeoTestUtil#containsSlowly} check.
 *
 * @param lats latitudes; a NaN entry marks a document indexed without a point
 * @param lons longitudes, parallel to {@code lats}
 * @throws Exception on any indexing or search failure
 */
protected void verifyRandomPolygons(double[] lats, double[] lons) throws Exception {
  IndexWriterConfig iwc = newIndexWriterConfig();
  // Else seeds may not reproduce:
  iwc.setMergeScheduler(new SerialMergeScheduler());
  // Else we can get O(N^2) merging:
  int mbd = iwc.getMaxBufferedDocs();
  if (mbd != -1 && mbd < lats.length / 100) {
    iwc.setMaxBufferedDocs(lats.length / 100);
  }
  // Large point counts go to disk so we don't blow up heap:
  Directory dir;
  if (lats.length > 100000) {
    dir = newFSDirectory(createTempDir(getClass().getSimpleName()));
  } else {
    dir = newDirectory();
  }
  // RandomIndexWriter is too slow here:
  IndexWriter w = new IndexWriter(dir, iwc);
  for (int id = 0; id < lats.length; id++) {
    Document doc = new Document();
    doc.add(newStringField("id", "" + id, Field.Store.NO));
    doc.add(new NumericDocValuesField("id", id));
    if (Double.isNaN(lats[id]) == false) {
      addPointToDoc(FIELD_NAME, doc, lats[id], lons[id]);
    }
    w.addDocument(doc);
    // ~1% of the time delete a random earlier doc, so the verify loop also
    // exercises the liveDocs path:
    if (id > 0 && random().nextInt(100) == 42) {
      int idToDelete = random().nextInt(id);
      w.deleteDocuments(new Term("id", "" + idToDelete));
      if (VERBOSE) {
        System.out.println(" delete id=" + idToDelete);
      }
    }
  }
  if (random().nextBoolean()) {
    w.forceMerge(1);
  }
  final IndexReader r = DirectoryReader.open(w);
  w.close();
  // We can't wrap with "exotic" readers because points needs to work:
  IndexSearcher s = newSearcher(r);
  final int iters = atLeast(75);
  Bits liveDocs = MultiFields.getLiveDocs(s.getIndexReader());
  int maxDoc = s.getIndexReader().maxDoc();
  for (int iter = 0; iter < iters; iter++) {
    if (VERBOSE) {
      System.out.println("\nTEST: iter=" + iter + " s=" + s);
    }
    // Polygon
    Polygon polygon = nextPolygon();
    Query query = newPolygonQuery(FIELD_NAME, polygon);
    if (VERBOSE) {
      System.out.println(" query=" + query);
    }
    // Collect every matching docID into a bitset; scores are irrelevant here:
    final FixedBitSet hits = new FixedBitSet(maxDoc);
    s.search(query, new SimpleCollector() {
      private int docBase;

      @Override
      public boolean needsScores() {
        return false;
      }

      @Override
      protected void doSetNextReader(LeafReaderContext context) throws IOException {
        docBase = context.docBase;
      }

      @Override
      public void collect(int doc) {
        hits.set(docBase + doc);
      }
    });
    // Verify every docID against the slow reference implementation:
    NumericDocValues docIDToID = MultiDocValues.getNumericValues(r, "id");
    for (int docID = 0; docID < maxDoc; docID++) {
      assertEquals(docID, docIDToID.nextDoc());
      int id = (int) docIDToID.longValue();
      boolean expected;
      if (liveDocs != null && liveDocs.get(docID) == false) {
        // document is deleted
        expected = false;
      } else if (Double.isNaN(lats[id])) {
        // document was indexed without a point
        expected = false;
      } else {
        expected = GeoTestUtil.containsSlowly(polygon, lats[id], lons[id]);
      }
      if (hits.get(docID) != expected) {
        StringBuilder b = new StringBuilder();
        if (expected) {
          b.append("FAIL: id=" + id + " should match but did not\n");
        } else {
          b.append("FAIL: id=" + id + " should not match but did\n");
        }
        b.append("  query=" + query + " docID=" + docID + "\n");
        b.append("  lat=" + lats[id] + " lon=" + lons[id] + "\n");
        b.append("  deleted?=" + (liveDocs != null && liveDocs.get(docID) == false));
        b.append("  polygon=" + polygon);
        // Fail fast on the first mismatch; the message carries full context.
        // (The old code wrapped this in "if (true)" with an unreachable else
        // branch and a never-read "fail" flag -- removed as dead code.)
        fail("wrong hit (first of possibly more):\n\n" + b);
      }
    }
  }
  IOUtils.close(r, dir);
}
Usage example of org.apache.lucene.index.RandomIndexWriter in the Apache lucene-solr project:
class TestLTRScoringQuery, method testLTRScoringQuery.
/**
 * End-to-end check of {@link LTRScoringQuery}: indexes two docs, runs a plain
 * BooleanQuery, then verifies the model weight's extracted feature values for
 * (a) a simple 3-feature model, (b) a model whose features sit at mixed positions
 * in the full feature store, (c) a model with zero features (must throw
 * {@link ModelException}), and (d) a model whose normalizer rewrites every value.
 */
@Test
public void testLTRScoringQuery() throws IOException, ModelException {
  final Directory dir = newDirectory();
  final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
  Document doc = new Document();
  doc.add(newStringField("id", "0", Field.Store.YES));
  doc.add(newTextField("field", "wizard the the the the the oz", Field.Store.NO));
  doc.add(new FloatDocValuesField("final-score", 1.0f));
  w.addDocument(doc);
  doc = new Document();
  doc.add(newStringField("id", "1", Field.Store.YES));
  // 1 extra token, but wizard and oz are close;
  doc.add(newTextField("field", "wizard oz the the the the the the", Field.Store.NO));
  doc.add(new FloatDocValuesField("final-score", 2.0f));
  w.addDocument(doc);
  final IndexReader r = w.getReader();
  w.close();
  // Do ordinary BooleanQuery:
  final BooleanQuery.Builder bqBuilder = new BooleanQuery.Builder();
  bqBuilder.add(new TermQuery(new Term("field", "wizard")), BooleanClause.Occur.SHOULD);
  bqBuilder.add(new TermQuery(new Term("field", "oz")), BooleanClause.Occur.SHOULD);
  final IndexSearcher searcher = getSearcher(r);
  // first run the standard query
  final TopDocs hits = searcher.search(bqBuilder.build(), 10);
  assertEquals(2, hits.totalHits);
  assertEquals("0", searcher.doc(hits.scoreDocs[0].doc).get("id"));
  assertEquals("1", searcher.doc(hits.scoreDocs[1].doc).get("id"));
  // (a) simple model: features 0,1,2 out of a 10-feature store:
  List<Feature> features = makeFeatures(new int[] { 0, 1, 2 });
  final List<Feature> allFeatures = makeFeatures(new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 });
  List<Normalizer> norms = new ArrayList<Normalizer>(Collections.nCopies(features.size(), IdentityNormalizer.INSTANCE));
  LTRScoringModel ltrScoringModel = TestLinearModel.createLinearModel("test", features, norms, "test", allFeatures, makeFeatureWeights(features));
  LTRScoringQuery.ModelWeight modelWeight = performQuery(hits, searcher, hits.scoreDocs[0].doc, new LTRScoringQuery(ltrScoringModel));
  assertEquals(3, modelWeight.getModelFeatureValuesNormalized().length);
  // makeFeatures produces feature i with value i:
  for (int i = 0; i < 3; i++) {
    assertEquals(i, modelWeight.getModelFeatureValuesNormalized()[i], 0.0001);
  }
  int[] posVals = new int[] { 0, 1, 2 };
  int pos = 0;
  for (LTRScoringQuery.FeatureInfo fInfo : modelWeight.getFeaturesInfo()) {
    if (fInfo == null) {
      // slots for features not in this model stay null
      continue;
    }
    assertEquals(posVals[pos], fInfo.getValue(), 0.0001);
    assertEquals("f" + posVals[pos], fInfo.getName());
    pos++;
  }
  // (b) model features at mixed positions within the store:
  final int[] mixPositions = new int[] { 8, 2, 4, 9, 0 };
  features = makeFeatures(mixPositions);
  norms = new ArrayList<Normalizer>(Collections.nCopies(features.size(), IdentityNormalizer.INSTANCE));
  ltrScoringModel = TestLinearModel.createLinearModel("test", features, norms, "test", allFeatures, makeFeatureWeights(features));
  modelWeight = performQuery(hits, searcher, hits.scoreDocs[0].doc, new LTRScoringQuery(ltrScoringModel));
  assertEquals(mixPositions.length, modelWeight.getModelFeatureWeights().length);
  for (int i = 0; i < mixPositions.length; i++) {
    assertEquals(mixPositions[i], modelWeight.getModelFeatureValuesNormalized()[i], 0.0001);
  }
  // (c) zero features: model creation must throw:
  final ModelException expectedModelException = new ModelException("no features declared for model test");
  final int[] noPositions = new int[] {};
  features = makeFeatures(noPositions);
  norms = new ArrayList<Normalizer>(Collections.nCopies(features.size(), IdentityNormalizer.INSTANCE));
  try {
    ltrScoringModel = TestLinearModel.createLinearModel("test", features, norms, "test", allFeatures, makeFeatureWeights(features));
    fail("unexpectedly got here instead of catching " + expectedModelException);
    // (statements that previously followed this fail() were unreachable and
    // have been removed)
  } catch (ModelException actualModelException) {
    assertEquals(expectedModelException.toString(), actualModelException.toString());
  }
  // (d) test normalizers: a normalizer that maps every value to 42.42f
  features = makeFilterFeatures(mixPositions);
  final Normalizer norm = new Normalizer() {

    @Override
    public float normalize(float value) {
      return 42.42f;
    }

    @Override
    public LinkedHashMap<String, Object> paramsToMap() {
      return null;
    }

    @Override
    protected void validate() throws NormalizerException {
    }
  };
  norms = new ArrayList<Normalizer>(Collections.nCopies(features.size(), norm));
  final LTRScoringModel normMeta = TestLinearModel.createLinearModel("test", features, norms, "test", allFeatures, makeFeatureWeights(features));
  modelWeight = performQuery(hits, searcher, hits.scoreDocs[0].doc, new LTRScoringQuery(normMeta));
  normMeta.normalizeFeaturesInPlace(modelWeight.getModelFeatureValuesNormalized());
  assertEquals(mixPositions.length, modelWeight.getModelFeatureWeights().length);
  for (int i = 0; i < mixPositions.length; i++) {
    assertEquals(42.42f, modelWeight.getModelFeatureValuesNormalized()[i], 0.0001);
  }
  r.close();
  dir.close();
}
Usage example of org.apache.lucene.index.RandomIndexWriter in the Apache lucene-solr project:
class TestLTRReRankingPipeline, method testDifferentTopN.
/**
 * Verifies that {@link LTRRescorer} reranks exactly the requested top-N prefix:
 * five docs are indexed so the base query returns them in id order, the model
 * scores each doc by its "final-score" doc value (id + 1), so reranking a
 * prefix of size topN exactly reverses that prefix.
 */
@Ignore
@Test
public void testDifferentTopN() throws IOException {
  final Directory dir = newDirectory();
  final RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
  // Doc id i gets final-score i+1; the field text makes the base
  // BooleanQuery rank the docs in increasing-id order.
  final String[] fieldTexts = {
      "wizard oz oz oz oz oz",
      "wizard oz oz oz oz the",
      "wizard oz oz oz the the ",
      "wizard oz oz the the the the ",
      "wizard oz the the the the the the"
  };
  for (int id = 0; id < fieldTexts.length; id++) {
    final Document document = new Document();
    document.add(newStringField("id", Integer.toString(id), Field.Store.YES));
    document.add(newTextField("field", fieldTexts[id], Field.Store.NO));
    document.add(new FloatDocValuesField("final-score", id + 1.0f));
    writer.addDocument(document);
  }
  final IndexReader reader = writer.getReader();
  writer.close();
  // Ordinary BooleanQuery: "wizard" OR "oz":
  final BooleanQuery.Builder bqBuilder = new BooleanQuery.Builder();
  bqBuilder.add(new TermQuery(new Term("field", "wizard")), BooleanClause.Occur.SHOULD);
  bqBuilder.add(new TermQuery(new Term("field", "oz")), BooleanClause.Occur.SHOULD);
  final IndexSearcher searcher = getSearcher(reader);
  // Baseline: the standard query returns docs in id order.
  TopDocs hits = searcher.search(bqBuilder.build(), 10);
  assertEquals(5, hits.totalHits);
  for (int rank = 0; rank < 5; rank++) {
    assertEquals(Integer.toString(rank), searcher.doc(hits.scoreDocs[rank].doc).get("id"));
  }
  final List<Feature> features = makeFieldValueFeatures(new int[] { 0, 1, 2 }, "final-score");
  final List<Normalizer> norms = new ArrayList<Normalizer>(Collections.nCopies(features.size(), IdentityNormalizer.INSTANCE));
  final List<Feature> allFeatures = makeFieldValueFeatures(new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }, "final-score");
  final LTRScoringModel ltrScoringModel = TestLinearModel.createLinearModel("test", features, norms, "test", allFeatures, null);
  final LTRRescorer rescorer = new LTRRescorer(new LTRScoringQuery(ltrScoringModel));
  // Reranking zero documents must leave the ordering untouched:
  hits = rescorer.rescore(searcher, hits, 0);
  for (int rank = 0; rank < 5; rank++) {
    assertEquals(Integer.toString(rank), searcher.doc(hits.scoreDocs[rank].doc).get("id"));
  }
  for (int topN = 1; topN <= 5; topN++) {
    log.info("rerank {} documents ", topN);
    hits = searcher.search(bqBuilder.build(), 10);
    final ScoreDoc[] slice = new ScoreDoc[topN];
    System.arraycopy(hits.scoreDocs, 0, slice, 0, topN);
    hits = new TopDocs(hits.totalHits, slice, hits.getMaxScore());
    hits = rescorer.rescore(searcher, hits, topN);
    // Reranking by "final-score" exactly reverses the sliced prefix:
    for (int j = 0; j < topN; j++) {
      final int expectedId = topN - 1 - j;
      log.info("doc {} in pos {}", searcher.doc(hits.scoreDocs[j].doc).get("id"), j);
      assertEquals(expectedId, Integer.parseInt(searcher.doc(hits.scoreDocs[j].doc).get("id")));
      assertEquals(expectedId + 1, hits.scoreDocs[j].score, 0.00001);
    }
  }
  reader.close();
  dir.close();
}
Usage example of org.apache.lucene.index.RandomIndexWriter in the Apache lucene-solr project:
class TestBooleanSimilarity, method testPhraseScoreIsEqualToBoost.
/**
 * With {@link BooleanSimilarity}, a matching phrase query scores exactly 1,
 * and wrapping it in a {@link BoostQuery} makes the score exactly the boost.
 */
public void testPhraseScoreIsEqualToBoost() throws IOException {
  final Directory directory = newDirectory();
  final RandomIndexWriter writer = new RandomIndexWriter(random(), directory,
      newIndexWriterConfig().setSimilarity(new BooleanSimilarity()));
  final Document document = new Document();
  document.add(new TextField("foo", "bar baz quux", Store.NO));
  writer.addDocument(document);
  final DirectoryReader reader = writer.getReader();
  writer.close();

  final IndexSearcher searcher = newSearcher(reader);
  searcher.setSimilarity(new BooleanSimilarity());
  final PhraseQuery phrase = new PhraseQuery(2, "foo", "bar", "quux");

  // Unboosted: boolean similarity assigns every match a score of 1.
  TopDocs topDocs = searcher.search(phrase, 2);
  assertEquals(1, topDocs.totalHits);
  assertEquals(1f, topDocs.scoreDocs[0].score, 0f);

  // Boosted by 7: the score is exactly the boost.
  topDocs = searcher.search(new BoostQuery(phrase, 7), 2);
  assertEquals(1, topDocs.totalHits);
  assertEquals(7f, topDocs.scoreDocs[0].score, 0f);

  reader.close();
  directory.close();
}
Usage example of org.apache.lucene.index.RandomIndexWriter in the Apache lucene-solr project:
class TestDrillSideways, method testBasic.
/**
 * Basic DrillSideways coverage: indexes 5 docs with "Author" and a hierarchical
 * "Publish Date" dimension, then checks facet counts for drill-down on one field,
 * pure browse, OR'd drill-down values, two-field drill-down, invalid fields/terms,
 * and a main query that matches nothing. Expected facet strings are asserted
 * verbatim against Facets#toString output.
 */
public void testBasic() throws Exception {
Directory dir = newDirectory();
Directory taxoDir = newDirectory();
// Writes facet ords to a separate directory from the
// main index:
DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
FacetsConfig config = new FacetsConfig();
// "Publish Date" is year/month/day, so mark it hierarchical:
config.setHierarchical("Publish Date", true);
RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
Document doc = new Document();
doc.add(new FacetField("Author", "Bob"));
doc.add(new FacetField("Publish Date", "2010", "10", "15"));
writer.addDocument(config.build(taxoWriter, doc));
doc = new Document();
doc.add(new FacetField("Author", "Lisa"));
doc.add(new FacetField("Publish Date", "2010", "10", "20"));
writer.addDocument(config.build(taxoWriter, doc));
doc = new Document();
doc.add(new FacetField("Author", "Lisa"));
doc.add(new FacetField("Publish Date", "2012", "1", "1"));
writer.addDocument(config.build(taxoWriter, doc));
doc = new Document();
doc.add(new FacetField("Author", "Susan"));
doc.add(new FacetField("Publish Date", "2012", "1", "7"));
writer.addDocument(config.build(taxoWriter, doc));
doc = new Document();
doc.add(new FacetField("Author", "Frank"));
doc.add(new FacetField("Publish Date", "1999", "5", "5"));
writer.addDocument(config.build(taxoWriter, doc));
// NRT open
IndexSearcher searcher = newSearcher(writer.getReader());
//System.out.println("searcher=" + searcher);
// NRT open
TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
DrillSideways ds = getNewDrillSideways(searcher, config, taxoReader);
// case: drill-down on a single field; in this
// case the drill-sideways + drill-down counts ==
// drill-down of just the query:
DrillDownQuery ddq = new DrillDownQuery(config);
ddq.add("Author", "Lisa");
// first arg null = no base query (match all docs):
DrillSidewaysResult r = ds.search(null, ddq, 10);
assertEquals(2, r.hits.totalHits);
// Publish Date is only drill-down, and Lisa published
// one in 2012 and one in 2010:
assertEquals("dim=Publish Date path=[] value=2 childCount=2\n 2010 (1)\n 2012 (1)\n", r.facets.getTopChildren(10, "Publish Date").toString());
// Author is drill-sideways + drill-down: Lisa
// (drill-down) published twice, and Frank/Susan/Bob
// published once:
assertEquals("dim=Author path=[] value=5 childCount=4\n Lisa (2)\n Bob (1)\n Susan (1)\n Frank (1)\n", r.facets.getTopChildren(10, "Author").toString());
// Same simple case, but no baseQuery (pure browse):
// drill-down on a single field; in this case the
// drill-sideways + drill-down counts == drill-down of
// just the query:
ddq = new DrillDownQuery(config);
ddq.add("Author", "Lisa");
r = ds.search(null, ddq, 10);
assertEquals(2, r.hits.totalHits);
// Publish Date is only drill-down, and Lisa published
// one in 2012 and one in 2010:
assertEquals("dim=Publish Date path=[] value=2 childCount=2\n 2010 (1)\n 2012 (1)\n", r.facets.getTopChildren(10, "Publish Date").toString());
// Author is drill-sideways + drill-down: Lisa
// (drill-down) published twice, and Frank/Susan/Bob
// published once:
assertEquals("dim=Author path=[] value=5 childCount=4\n Lisa (2)\n Bob (1)\n Susan (1)\n Frank (1)\n", r.facets.getTopChildren(10, "Author").toString());
// Another simple case: drill-down on single fields
// but OR of two values
ddq = new DrillDownQuery(config);
ddq.add("Author", "Lisa");
ddq.add("Author", "Bob");
r = ds.search(null, ddq, 10);
assertEquals(3, r.hits.totalHits);
// Publish Date is only drill-down: Lisa and Bob
// (drill-down) published twice in 2010 and once in 2012:
assertEquals("dim=Publish Date path=[] value=3 childCount=2\n 2010 (2)\n 2012 (1)\n", r.facets.getTopChildren(10, "Publish Date").toString());
// Author is drill-sideways + drill-down: Lisa
// (drill-down) published twice, and Frank/Susan/Bob
// published once:
assertEquals("dim=Author path=[] value=5 childCount=4\n Lisa (2)\n Bob (1)\n Susan (1)\n Frank (1)\n", r.facets.getTopChildren(10, "Author").toString());
// Multiple dims counted => result must be a MultiFacets:
assertTrue(r.facets instanceof MultiFacets);
List<FacetResult> allResults = r.facets.getAllDims(10);
assertEquals(2, allResults.size());
assertEquals("dim=Author path=[] value=5 childCount=4\n Lisa (2)\n Bob (1)\n Susan (1)\n Frank (1)\n", allResults.get(0).toString());
assertEquals("dim=Publish Date path=[] value=3 childCount=2\n 2010 (2)\n 2012 (1)\n", allResults.get(1).toString());
// More interesting case: drill-down on two fields
ddq = new DrillDownQuery(config);
ddq.add("Author", "Lisa");
ddq.add("Publish Date", "2010");
r = ds.search(null, ddq, 10);
assertEquals(1, r.hits.totalHits);
// Publish Date is drill-sideways + drill-down: Lisa
// (drill-down) published once in 2010 and once in 2012:
assertEquals("dim=Publish Date path=[] value=2 childCount=2\n 2010 (1)\n 2012 (1)\n", r.facets.getTopChildren(10, "Publish Date").toString());
// Author is drill-sideways + drill-down:
// only Lisa & Bob published (once each) in 2010:
assertEquals("dim=Author path=[] value=2 childCount=2\n Bob (1)\n Lisa (1)\n", r.facets.getTopChildren(10, "Author").toString());
// Even more interesting case: drill down on two fields,
// but one of them is OR
ddq = new DrillDownQuery(config);
// Drill down on Lisa or Bob:
ddq.add("Author", "Lisa");
ddq.add("Publish Date", "2010");
ddq.add("Author", "Bob");
r = ds.search(null, ddq, 10);
assertEquals(2, r.hits.totalHits);
// Publish Date is both drill-sideways + drill-down:
// Lisa or Bob published twice in 2010 and once in 2012:
assertEquals("dim=Publish Date path=[] value=3 childCount=2\n 2010 (2)\n 2012 (1)\n", r.facets.getTopChildren(10, "Publish Date").toString());
// Author is drill-sideways + drill-down:
// only Lisa & Bob published (once each) in 2010:
assertEquals("dim=Author path=[] value=2 childCount=2\n Bob (1)\n Lisa (1)\n", r.facets.getTopChildren(10, "Author").toString());
// Test drilling down on invalid field:
ddq = new DrillDownQuery(config);
ddq.add("Foobar", "Baz");
r = ds.search(null, ddq, 10);
assertEquals(0, r.hits.totalHits);
assertNull(r.facets.getTopChildren(10, "Publish Date"));
assertNull(r.facets.getTopChildren(10, "Foobar"));
// Test drilling down on valid term or'd with invalid term:
ddq = new DrillDownQuery(config);
ddq.add("Author", "Lisa");
ddq.add("Author", "Tom");
r = ds.search(null, ddq, 10);
assertEquals(2, r.hits.totalHits);
// Publish Date is only drill-down, and Lisa published
// one in 2012 and one in 2010:
assertEquals("dim=Publish Date path=[] value=2 childCount=2\n 2010 (1)\n 2012 (1)\n", r.facets.getTopChildren(10, "Publish Date").toString());
// Author is drill-sideways + drill-down: Lisa
// (drill-down) published twice, and Frank/Susan/Bob
// published once:
assertEquals("dim=Author path=[] value=5 childCount=4\n Lisa (2)\n Bob (1)\n Susan (1)\n Frank (1)\n", r.facets.getTopChildren(10, "Author").toString());
// LUCENE-4915: test drilling down on a dimension but
// NOT facet counting it:
// NOTE(review): this block repeats the previous query/assertions unchanged;
// presumably the original upstream test reconfigured which dims are counted
// before this point -- confirm against the lucene-solr sources.
ddq = new DrillDownQuery(config);
ddq.add("Author", "Lisa");
ddq.add("Author", "Tom");
r = ds.search(null, ddq, 10);
assertEquals(2, r.hits.totalHits);
// Publish Date is only drill-down, and Lisa published
// one in 2012 and one in 2010:
assertEquals("dim=Publish Date path=[] value=2 childCount=2\n 2010 (1)\n 2012 (1)\n", r.facets.getTopChildren(10, "Publish Date").toString());
// Test main query gets null scorer:
ddq = new DrillDownQuery(config, new TermQuery(new Term("foobar", "baz")));
ddq.add("Author", "Lisa");
r = ds.search(null, ddq, 10);
// the base query matches nothing, so no hits and no facets:
assertEquals(0, r.hits.totalHits);
assertNull(r.facets.getTopChildren(10, "Publish Date"));
assertNull(r.facets.getTopChildren(10, "Author"));
writer.close();
IOUtils.close(searcher.getIndexReader(), taxoReader, taxoWriter, dir, taxoDir);
}
Aggregations