Example 6 with BM25Similarity

use of org.apache.lucene.search.similarities.BM25Similarity in project lucene-solr by apache.

the class TestNonDefinedSimilarityFactory method testCurrentBM25.

public void testCurrentBM25() throws Exception {
    // no sys prop set, rely on LATEST
    initCore("solrconfig-basic.xml", "schema-tiny.xml");
    BM25Similarity sim = getSimilarity("text", BM25Similarity.class);
    assertEquals(0.75F, sim.getB(), 0.0F);
}
Also used : BM25Similarity(org.apache.lucene.search.similarities.BM25Similarity)

Example 7 with BM25Similarity

use of org.apache.lucene.search.similarities.BM25Similarity in project lucene-solr by apache.

the class TestBM25SimilarityFactory method testParameters.

/** bm25 with parameters */
public void testParameters() throws Exception {
    Similarity sim = getSimilarity("text_params");
    assertEquals(BM25Similarity.class, sim.getClass());
    BM25Similarity bm25 = (BM25Similarity) sim;
    assertEquals(1.2f, bm25.getK1(), 0.01f);
    assertEquals(0.76f, bm25.getB(), 0.01f);
}
Also used : Similarity(org.apache.lucene.search.similarities.Similarity) BM25Similarity(org.apache.lucene.search.similarities.BM25Similarity)
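The two parameters asserted in the test above, k1 and b, control term-frequency saturation and document-length normalization in BM25. The following is a minimal, Lucene-free sketch of the textbook BM25 term score, not Lucene's actual implementation. The class name `Bm25Sketch` and its methods are hypothetical, and version-specific details such as norm encoding are left out.

```java
// Hypothetical, self-contained sketch of textbook BM25; not Lucene code.
final class Bm25Sketch {
    private final double k1; // term-frequency saturation (commonly 1.2)
    private final double b;  // length-normalization strength in [0, 1] (commonly 0.75)

    Bm25Sketch(double k1, double b) {
        this.k1 = k1;
        this.b = b;
    }

    // Inverse document frequency: rarer terms receive a larger weight.
    static double idf(long docFreq, long docCount) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Saturating term-frequency component, adjusted for document length.
    double tfNorm(double freq, double docLen, double avgDocLen) {
        double lengthFactor = 1.0 - b + b * (docLen / avgDocLen);
        return freq * (k1 + 1.0) / (freq + k1 * lengthFactor);
    }

    // Score of a single term in a single document.
    double score(double freq, double docLen, double avgDocLen, long docFreq, long docCount) {
        return idf(docFreq, docCount) * tfNorm(freq, docLen, avgDocLen);
    }

    public static void main(String[] args) {
        Bm25Sketch defaults = new Bm25Sketch(1.2, 0.75);
        Bm25Sketch tuned = new Bm25Sketch(1.2, 0.76); // values asserted in testParameters
        // Same term statistics, document twice as long as the average.
        System.out.println(defaults.score(2, 20, 10, 1, 3));
        System.out.println(tuned.score(2, 20, 10, 1, 3));
    }
}
```

With b = 0 the document length has no effect on the score; raising b penalizes documents longer than the average more strongly.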

Example 8 with BM25Similarity

use of org.apache.lucene.search.similarities.BM25Similarity in project lucene-solr by apache.

the class TestPhraseQuery method testSlopScoring.

public void testSlopScoring() throws IOException {
    Directory directory = newDirectory();
    RandomIndexWriter writer = new RandomIndexWriter(random(), directory,
        newIndexWriterConfig(new MockAnalyzer(random()))
            .setMergePolicy(newLogMergePolicy())
            .setSimilarity(new BM25Similarity()));
    Document doc = new Document();
    doc.add(newTextField("field", "foo firstname lastname foo", Field.Store.YES));
    writer.addDocument(doc);
    Document doc2 = new Document();
    doc2.add(newTextField("field", "foo firstname zzz lastname foo", Field.Store.YES));
    writer.addDocument(doc2);
    Document doc3 = new Document();
    doc3.add(newTextField("field", "foo firstname zzz yyy lastname foo", Field.Store.YES));
    writer.addDocument(doc3);
    IndexReader reader = writer.getReader();
    writer.close();
    IndexSearcher searcher = newSearcher(reader);
    searcher.setSimilarity(new ClassicSimilarity());
    PhraseQuery query = new PhraseQuery(Integer.MAX_VALUE, "field", "firstname", "lastname");
    ScoreDoc[] hits = searcher.search(query, 1000).scoreDocs;
    assertEquals(3, hits.length);
    // Make sure that those matches where the terms appear closer to
    // each other get a higher score:
    assertEquals(1.0, hits[0].score, 0.01);
    assertEquals(0, hits[0].doc);
    assertEquals(0.63, hits[1].score, 0.01);
    assertEquals(1, hits[1].doc);
    assertEquals(0.47, hits[2].score, 0.01);
    assertEquals(2, hits[2].doc);
    QueryUtils.check(random(), query, searcher);
    reader.close();
    directory.close();
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) BM25Similarity(org.apache.lucene.search.similarities.BM25Similarity) IndexReader(org.apache.lucene.index.IndexReader) Document(org.apache.lucene.document.Document) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)
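The scores asserted in the test above can be approximated by hand. The sketch below assumes the classic TF-IDF pieces usually attributed to ClassicSimilarity: tf = sqrt(freq), sloppy frequency 1 / (distance + 1), idf = 1 + ln((docCount + 1) / (docFreq + 1)), and length norm 1 / sqrt(fieldLength). It ignores norm encoding and query boosts. `SlopScoreSketch` and its helpers are hypothetical names, not Lucene API.

```java
// Hypothetical, self-contained approximation of sloppy phrase scoring; not Lucene code.
final class SlopScoreSketch {
    // A phrase match contributes less frequency the further apart its terms are.
    static double sloppyFreq(int distance) {
        return 1.0 / (distance + 1);
    }

    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    static double idf(long docFreq, long docCount) {
        return 1.0 + Math.log((docCount + 1.0) / (docFreq + 1.0));
    }

    static double lengthNorm(int fieldLength) {
        return 1.0 / Math.sqrt(fieldLength);
    }

    // Assumes every phrase term occurs in every document, as in the test's index,
    // so the phrase idf is the per-term idf summed over the phrase terms.
    static double score(int distance, int fieldLength, long docCount, int phraseTerms) {
        double phraseIdf = phraseTerms * idf(docCount, docCount);
        return phraseIdf * tf(sloppyFreq(distance)) * lengthNorm(fieldLength);
    }

    public static void main(String[] args) {
        // (distance between "firstname" and "lastname", number of tokens in the field)
        int[][] docs = { { 0, 4 }, { 1, 5 }, { 2, 6 } };
        for (int[] doc : docs) {
            System.out.printf("distance=%d length=%d score=%.2f%n",
                doc[0], doc[1], score(doc[0], doc[1], 3, 2));
        }
    }
}
```

Under these assumptions the three documents come out at roughly 1.0, 0.63 and 0.47, which agrees with the values asserted in the test: the closer the two terms are, the higher the score.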

Example 9 with BM25Similarity

use of org.apache.lucene.search.similarities.BM25Similarity in project lucene-solr by apache.

the class KNearestNeighborClassifierTest method testBasicUsage.

@Test
public void testBasicUsage() throws Exception {
    LeafReader leafReader = null;
    try {
        MockAnalyzer analyzer = new MockAnalyzer(random());
        leafReader = getSampleIndex(analyzer);
        checkCorrectClassification(new KNearestNeighborClassifier(leafReader, null, analyzer, null, 1, 0, 0, categoryFieldName, textFieldName), TECHNOLOGY_INPUT, TECHNOLOGY_RESULT);
        checkCorrectClassification(new KNearestNeighborClassifier(leafReader, new LMDirichletSimilarity(), analyzer, null, 1, 0, 0, categoryFieldName, textFieldName), TECHNOLOGY_INPUT, TECHNOLOGY_RESULT);
        ClassificationResult<BytesRef> resultDS = checkCorrectClassification(new KNearestNeighborClassifier(leafReader, new BM25Similarity(), analyzer, null, 3, 2, 1, categoryFieldName, textFieldName), TECHNOLOGY_INPUT, TECHNOLOGY_RESULT);
        ClassificationResult<BytesRef> resultLMS = checkCorrectClassification(new KNearestNeighborClassifier(leafReader, new LMDirichletSimilarity(), analyzer, null, 3, 2, 1, categoryFieldName, textFieldName), TECHNOLOGY_INPUT, TECHNOLOGY_RESULT);
        assertTrue(resultDS.getScore() != resultLMS.getScore());
    } finally {
        if (leafReader != null) {
            leafReader.close();
        }
    }
}
Also used : LeafReader(org.apache.lucene.index.LeafReader) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) BM25Similarity(org.apache.lucene.search.similarities.BM25Similarity) LMDirichletSimilarity(org.apache.lucene.search.similarities.LMDirichletSimilarity) BytesRef(org.apache.lucene.util.BytesRef) Test(org.junit.Test)

Example 10 with BM25Similarity

use of org.apache.lucene.search.similarities.BM25Similarity in project lucene-solr by apache.

the class TestElevationComparator method testSorting.

//@Test
public void testSorting() throws Throwable {
    Directory directory = newDirectory();
    IndexWriter writer = new IndexWriter(directory,
        newIndexWriterConfig(new MockAnalyzer(random()))
            .setMaxBufferedDocs(2)
            .setMergePolicy(newLogMergePolicy(1000))
            .setSimilarity(new ClassicSimilarity()));
    writer.addDocument(adoc(new String[] { "id", "a", "title", "ipod", "str_s", "a" }));
    writer.addDocument(adoc(new String[] { "id", "b", "title", "ipod ipod", "str_s", "b" }));
    writer.addDocument(adoc(new String[] { "id", "c", "title", "ipod ipod ipod", "str_s", "c" }));
    writer.addDocument(adoc(new String[] { "id", "x", "title", "boosted", "str_s", "x" }));
    writer.addDocument(adoc(new String[] { "id", "y", "title", "boosted boosted", "str_s", "y" }));
    writer.addDocument(adoc(new String[] { "id", "z", "title", "boosted boosted boosted", "str_s", "z" }));
    IndexReader r = DirectoryReader.open(writer);
    writer.close();
    IndexSearcher searcher = newSearcher(r);
    searcher.setSimilarity(new BM25Similarity());
    runTest(searcher, true);
    runTest(searcher, false);
    r.close();
    directory.close();
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) IndexWriter(org.apache.lucene.index.IndexWriter) IndexReader(org.apache.lucene.index.IndexReader) BM25Similarity(org.apache.lucene.search.similarities.BM25Similarity) Directory(org.apache.lucene.store.Directory)

Aggregations

BM25Similarity (org.apache.lucene.search.similarities.BM25Similarity) 29
Directory (org.apache.lucene.store.Directory) 12
IndexSearcher (org.apache.lucene.search.IndexSearcher) 11
IndexReader (org.apache.lucene.index.IndexReader) 10
Similarity (org.apache.lucene.search.similarities.Similarity) 9
FSDirectory (org.apache.lucene.store.FSDirectory) 9
Query (org.apache.lucene.search.Query) 8
TopDocs (org.apache.lucene.search.TopDocs) 8
TermQuery (org.apache.lucene.search.TermQuery) 7
ClassicSimilarity (org.apache.lucene.search.similarities.ClassicSimilarity) 7
Test (org.junit.Test) 7
Term (org.apache.lucene.index.Term) 6
RerankerCascade (io.anserini.rerank.RerankerCascade) 5
BooleanQuery (org.apache.lucene.search.BooleanQuery) 5
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer) 4
FeatureExtractors (io.anserini.ltr.feature.FeatureExtractors) 3
IdentityReranker (io.anserini.rerank.IdentityReranker) 3
ScoredDocuments (io.anserini.rerank.ScoredDocuments) 3
Qrels (io.anserini.util.Qrels) 3
PrintStream (java.io.PrintStream) 3