
Example 81 with TopDocs

use of org.apache.lucene.search.TopDocs in project lucene-solr by apache.

From the class TestUnifiedHighlighter, the method testCuriousGeorge:

public void testCuriousGeorge() throws Exception {
    String text = "It’s the formula for success for preschoolers—Curious George and fire trucks! "
        + "Curious George and the Firefighters is a story based on H. A. and Margret Rey’s "
        + "popular primate and painted in the original watercolor and charcoal style. "
        + "Firefighters are a famously brave lot, but can they withstand a visit from one curious monkey?";
    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, indexAnalyzer);
    Field body = new Field("body", text, fieldType);
    Document document = new Document();
    document.add(body);
    iw.addDocument(document);
    IndexReader ir = iw.getReader();
    iw.close();
    IndexSearcher searcher = newSearcher(ir);
    PhraseQuery query = new PhraseQuery.Builder().add(new Term("body", "curious")).add(new Term("body", "george")).build();
    TopDocs topDocs = searcher.search(query, 10);
    assertEquals(1, topDocs.totalHits);
    UnifiedHighlighter highlighter = new UnifiedHighlighter(searcher, indexAnalyzer);
    highlighter.setHighlightPhrasesStrictly(false);
    String[] snippets = highlighter.highlight("body", query, topDocs, 2);
    assertEquals(1, snippets.length);
    assertFalse(snippets[0].contains("<b>Curious</b>Curious"));
    ir.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) TopDocs(org.apache.lucene.search.TopDocs) Field(org.apache.lucene.document.Field) PhraseQuery(org.apache.lucene.search.PhraseQuery) IndexReader(org.apache.lucene.index.IndexReader) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter)
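
For reference outside the test harness (RandomIndexWriter and newSearcher are Lucene test utilities), a minimal sketch of the same UnifiedHighlighter call against a plain IndexWriter and DirectoryReader. The StandardAnalyzer, RAMDirectory, and stored TextField below are illustrative assumptions, not part of the original test.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.uhighlight.UnifiedHighlighter;
import org.apache.lucene.store.RAMDirectory;

public class UnifiedHighlighterSketch {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        RAMDirectory dir = new RAMDirectory();
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
            Document document = new Document();
            // TextField.TYPE_STORED keeps the text so the highlighter can re-analyze it for offsets
            document.add(new Field("body", "Curious George and the Firefighters is a story.", TextField.TYPE_STORED));
            writer.addDocument(document);
        }
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            PhraseQuery query = new PhraseQuery("body", "curious", "george");
            TopDocs topDocs = searcher.search(query, 10);
            UnifiedHighlighter highlighter = new UnifiedHighlighter(searcher, analyzer);
            // returns one snippet per hit, in the same order as topDocs.scoreDocs
            String[] snippets = highlighter.highlight("body", query, topDocs, 2);
            for (String snippet : snippets) {
                System.out.println(snippet);
            }
        }
    }
}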

Example 82 with TopDocs

use of org.apache.lucene.search.TopDocs in project lucene-solr by apache.

From the class SynonymTokenizer, the method testQueryScorerHits:

public void testQueryScorerHits() throws Exception {
    PhraseQuery phraseQuery = new PhraseQuery(FIELD_NAME, "very", "long");
    query = phraseQuery;
    searcher = newSearcher(reader);
    TopDocs hits = searcher.search(query, 10);
    QueryScorer scorer = new QueryScorer(query, FIELD_NAME);
    Highlighter highlighter = new Highlighter(scorer);
    for (int i = 0; i < hits.scoreDocs.length; i++) {
        final int docId = hits.scoreDocs[i].doc;
        Document doc = searcher.doc(docId);
        String storedField = doc.get(FIELD_NAME);
        TokenStream stream = getAnyTokenStream(FIELD_NAME, docId);
        Fragmenter fragmenter = new SimpleSpanFragmenter(scorer);
        highlighter.setTextFragmenter(fragmenter);
        String fragment = highlighter.getBestFragment(stream, storedField);
        if (VERBOSE)
            System.out.println(fragment);
    }
}
Also used : TopDocs(org.apache.lucene.search.TopDocs) CannedTokenStream(org.apache.lucene.analysis.CannedTokenStream) TokenStream(org.apache.lucene.analysis.TokenStream) PhraseQuery(org.apache.lucene.search.PhraseQuery) MultiPhraseQuery(org.apache.lucene.search.MultiPhraseQuery) Document(org.apache.lucene.document.Document) IntPoint(org.apache.lucene.document.IntPoint)
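
A minimal sketch of a common variant of this pattern, reusing query, hits, searcher, and FIELD_NAME from the example above and assuming an analyzer that matches the one used at index time: SimpleHTMLFormatter tags the matches, and getBestFragments returns several fragments joined by a separator instead of a single best fragment.

QueryScorer scorer = new QueryScorer(query, FIELD_NAME);
Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter("<b>", "</b>"), scorer);
// roughly 80-character fragments sized around the matching spans
highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer, 80));
for (ScoreDoc hit : hits.scoreDocs) {
    String storedField = searcher.doc(hit.doc).get(FIELD_NAME);
    try (TokenStream stream = analyzer.tokenStream(FIELD_NAME, storedField)) {
        // up to 3 best fragments, joined with "..."
        String best = highlighter.getBestFragments(stream, storedField, 3, "...");
        System.out.println(best);
    }
}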

Example 83 with TopDocs

use of org.apache.lucene.search.TopDocs in project lucene-solr by apache.

From the class SimplePrimaryNode, the method verifyAtLeastMarkerCount:

private void verifyAtLeastMarkerCount(int expectedAtLeastCount, DataOutput out) throws IOException {
    IndexSearcher searcher = mgr.acquire();
    try {
        long version = ((DirectoryReader) searcher.getIndexReader()).getVersion();
        int hitCount = searcher.count(new TermQuery(new Term("marker", "marker")));
        if (hitCount < expectedAtLeastCount) {
            message("marker search: expectedAtLeastCount=" + expectedAtLeastCount + " but hitCount=" + hitCount);
            TopDocs hits = searcher.search(new TermQuery(new Term("marker", "marker")), expectedAtLeastCount);
            List<Integer> seen = new ArrayList<>();
            for (ScoreDoc hit : hits.scoreDocs) {
                Document doc = searcher.doc(hit.doc);
                seen.add(Integer.parseInt(doc.get("docid").substring(1)));
            }
            Collections.sort(seen);
            message("saw markers:");
            for (int marker : seen) {
                message("saw m" + marker);
            }
            throw new IllegalStateException("at flush: marker count " + hitCount + " but expected at least " + expectedAtLeastCount + " version=" + version);
        }
        if (out != null) {
            out.writeVLong(version);
            out.writeVInt(hitCount);
        }
    } finally {
        mgr.release(searcher);
    }
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) TermQuery(org.apache.lucene.search.TermQuery) DirectoryReader(org.apache.lucene.index.DirectoryReader) ArrayList(java.util.ArrayList) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) ScoreDoc(org.apache.lucene.search.ScoreDoc) TopDocs(org.apache.lucene.search.TopDocs)
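
The method above pairs IndexSearcher.count with a follow-up search only for diagnostics. As a side note, a minimal sketch (reusing searcher from the example) of two equivalent ways to obtain a bare hit count without materializing a TopDocs:

TermQuery markerQuery = new TermQuery(new Term("marker", "marker"));
// 1) IndexSearcher.count, as used above; no scoring and no TopDocs allocation
int hitCount = searcher.count(markerQuery);
// 2) TotalHitCountCollector, handy when already working with the Collector API
TotalHitCountCollector collector = new TotalHitCountCollector();
searcher.search(markerQuery, collector);
int sameCount = collector.getTotalHits();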

Example 84 with TopDocs

use of org.apache.lucene.search.TopDocs in project lucene-solr by apache.

From the class BaseGeoPointTestCase, the method searchSmallSet:

/** return topdocs over a small set of points in field "point" */
private TopDocs searchSmallSet(Query query, int size) throws Exception {
    // this is a simple systematic test, indexing these points
    // TODO: fragile: does not understand quantization in any way yet uses extremely high precision!
    double[][] pts = new double[][] {
        { 32.763420, -96.774 }, { 32.7559529921407, -96.7759895324707 },
        { 32.77866942010977, -96.77701950073242 }, { 32.7756745755423, -96.7706036567688 },
        { 27.703618681345585, -139.73458170890808 }, { 32.94823588839368, -96.4538113027811 },
        { 33.06047141970814, -96.65084838867188 }, { 32.778650, -96.7772 },
        { -88.56029371730983, -177.23537676036358 }, { 33.541429799076354, -26.779373834241003 },
        { 26.774024500421728, -77.35379276106497 }, { -90.0, -14.796283808944777 },
        { 32.94823588839368, -178.8538113027811 }, { 32.94823588839368, 178.8538113027811 },
        { 40.720611, -73.998776 }, { -44.5, -179.5 } };
    Directory directory = newDirectory();
    // TODO: must these simple tests really rely on docid order?
    IndexWriterConfig iwc = newIndexWriterConfig(new MockAnalyzer(random()));
    iwc.setMaxBufferedDocs(TestUtil.nextInt(random(), 100, 1000));
    iwc.setMergePolicy(newLogMergePolicy());
    // Else seeds may not reproduce:
    iwc.setMergeScheduler(new SerialMergeScheduler());
    RandomIndexWriter writer = new RandomIndexWriter(random(), directory, iwc);
    for (double[] p : pts) {
        Document doc = new Document();
        addPointToDoc("point", doc, p[0], p[1]);
        writer.addDocument(doc);
    }
    // add explicit multi-valued docs
    for (int i = 0; i < pts.length; i += 2) {
        Document doc = new Document();
        addPointToDoc("point", doc, pts[i][0], pts[i][1]);
        addPointToDoc("point", doc, pts[i + 1][0], pts[i + 1][1]);
        writer.addDocument(doc);
    }
    // index random string documents
    for (int i = 0; i < random().nextInt(10); ++i) {
        Document doc = new Document();
        doc.add(new StringField("string", Integer.toString(i), Field.Store.NO));
        writer.addDocument(doc);
    }
    IndexReader reader = writer.getReader();
    writer.close();
    IndexSearcher searcher = newSearcher(reader);
    TopDocs topDocs = searcher.search(query, size);
    reader.close();
    directory.close();
    return topDocs;
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) Document(org.apache.lucene.document.Document) SerialMergeScheduler(org.apache.lucene.index.SerialMergeScheduler) TopDocs(org.apache.lucene.search.TopDocs) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) StringField(org.apache.lucene.document.StringField) IndexReader(org.apache.lucene.index.IndexReader) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig)

Example 85 with TopDocs

use of org.apache.lucene.search.TopDocs in project lucene-solr by apache.

From the class BaseGeoPointTestCase, the method testSmallSetRect:

public void testSmallSetRect() throws Exception {
    TopDocs td = searchSmallSet(newRectQuery("point", 32.778, 32.779, -96.778, -96.777), 5);
    assertEquals(4, td.totalHits);
}
Also used : TopDocs(org.apache.lucene.search.TopDocs)
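
newRectQuery is an abstract hook on BaseGeoPointTestCase; in the concrete LatLonPoint case it corresponds to LatLonPoint.newBoxQuery. A minimal sketch of the equivalent production calls, assuming an existing writer and searcher:

// index a point (LatLonPoint encodes latitude/longitude for box and distance queries)
Document doc = new Document();
doc.add(new LatLonPoint("point", 32.7786, -96.7775));
writer.addDocument(doc);

// query the same rectangle as the test: minLat, maxLat, minLon, maxLon
Query rect = LatLonPoint.newBoxQuery("point", 32.778, 32.779, -96.778, -96.777);
TopDocs td = searcher.search(rect, 5);
System.out.println(td.totalHits + " hits inside the box");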

Aggregations

TopDocs (org.apache.lucene.search.TopDocs): 496
IndexSearcher (org.apache.lucene.search.IndexSearcher): 302
Document (org.apache.lucene.document.Document): 275
TermQuery (org.apache.lucene.search.TermQuery): 189
IndexReader (org.apache.lucene.index.IndexReader): 187
Term (org.apache.lucene.index.Term): 174
Directory (org.apache.lucene.store.Directory): 174
Query (org.apache.lucene.search.Query): 172
RandomIndexWriter (org.apache.lucene.index.RandomIndexWriter): 144
BooleanQuery (org.apache.lucene.search.BooleanQuery): 127
ScoreDoc (org.apache.lucene.search.ScoreDoc): 127
MatchAllDocsQuery (org.apache.lucene.search.MatchAllDocsQuery): 120
Sort (org.apache.lucene.search.Sort): 94
Field (org.apache.lucene.document.Field): 85
SortField (org.apache.lucene.search.SortField): 74
IOException (java.io.IOException): 58
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer): 56
TextField (org.apache.lucene.document.TextField): 47
PhraseQuery (org.apache.lucene.search.PhraseQuery): 46
PrefixQuery (org.apache.lucene.search.PrefixQuery): 45
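
Across all of these examples the shared shape is the same: run a query for the top N hits, then walk topDocs.scoreDocs and load each hit by its doc id. A minimal sketch of that pattern, reusing an existing IndexSearcher; the field name and term are illustrative assumptions:

// run a query for the top 10 hits, then iterate the per-hit ScoreDoc entries
TopDocs topDocs = searcher.search(new TermQuery(new Term("body", "lucene")), 10);
System.out.println("totalHits=" + topDocs.totalHits);
for (ScoreDoc sd : topDocs.scoreDocs) {
    Document hit = searcher.doc(sd.doc); // stored fields of this hit
    System.out.println(sd.score + " -> " + hit.get("body"));
}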