Search in sources :

Example 36 with ClassicSimilarity

use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

the class TestValueSources method testTF.

public void testTF() throws Exception {
    Similarity saved = searcher.getSimilarity(true);
    try {
        // no norm field (so agnostic to indexed similarity)
        searcher.setSimilarity(new ClassicSimilarity());
        ValueSource vs = new TFValueSource("bogus", "bogus", "text", new BytesRef("test"));
        assertHits(new FunctionQuery(vs), new float[] { (float) Math.sqrt(3d), (float) Math.sqrt(1d) });
        assertAllExist(vs);
        vs = new TFValueSource("bogus", "bogus", "string", new BytesRef("bar"));
        assertHits(new FunctionQuery(vs), new float[] { 0f, 1f });
        assertAllExist(vs);
        // regardless of whether norms exist, value source exists == 0
        vs = new TFValueSource("bogus", "bogus", "bogus", new BytesRef("bogus"));
        assertHits(new FunctionQuery(vs), new float[] { 0F, 0F });
        assertAllExist(vs);
    } finally {
        searcher.setSimilarity(saved);
    }
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) Similarity(org.apache.lucene.search.similarities.Similarity) SumTotalTermFreqValueSource(org.apache.lucene.queries.function.valuesource.SumTotalTermFreqValueSource) DoubleConstValueSource(org.apache.lucene.queries.function.valuesource.DoubleConstValueSource) ConstValueSource(org.apache.lucene.queries.function.valuesource.ConstValueSource) QueryValueSource(org.apache.lucene.queries.function.valuesource.QueryValueSource) DocFreqValueSource(org.apache.lucene.queries.function.valuesource.DocFreqValueSource) NormValueSource(org.apache.lucene.queries.function.valuesource.NormValueSource) NumDocsValueSource(org.apache.lucene.queries.function.valuesource.NumDocsValueSource) MaxDocValueSource(org.apache.lucene.queries.function.valuesource.MaxDocValueSource) JoinDocFreqValueSource(org.apache.lucene.queries.function.valuesource.JoinDocFreqValueSource) LiteralValueSource(org.apache.lucene.queries.function.valuesource.LiteralValueSource) TotalTermFreqValueSource(org.apache.lucene.queries.function.valuesource.TotalTermFreqValueSource) IDFValueSource(org.apache.lucene.queries.function.valuesource.IDFValueSource) TermFreqValueSource(org.apache.lucene.queries.function.valuesource.TermFreqValueSource) TFValueSource(org.apache.lucene.queries.function.valuesource.TFValueSource) TFValueSource(org.apache.lucene.queries.function.valuesource.TFValueSource) BytesRef(org.apache.lucene.util.BytesRef)

Example 37 with ClassicSimilarity

use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

the class TestValueSources method testIDF.

public void testIDF() throws Exception {
    Similarity saved = searcher.getSimilarity(true);
    try {
        searcher.setSimilarity(new ClassicSimilarity());
        ValueSource vs = new IDFValueSource("bogus", "bogus", "text", new BytesRef("test"));
        assertHits(new FunctionQuery(vs), new float[] { 1.0f, 1.0f });
        assertAllExist(vs);
    } finally {
        searcher.setSimilarity(saved);
    }
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) Similarity(org.apache.lucene.search.similarities.Similarity) SumTotalTermFreqValueSource(org.apache.lucene.queries.function.valuesource.SumTotalTermFreqValueSource) DoubleConstValueSource(org.apache.lucene.queries.function.valuesource.DoubleConstValueSource) ConstValueSource(org.apache.lucene.queries.function.valuesource.ConstValueSource) QueryValueSource(org.apache.lucene.queries.function.valuesource.QueryValueSource) DocFreqValueSource(org.apache.lucene.queries.function.valuesource.DocFreqValueSource) NormValueSource(org.apache.lucene.queries.function.valuesource.NormValueSource) NumDocsValueSource(org.apache.lucene.queries.function.valuesource.NumDocsValueSource) MaxDocValueSource(org.apache.lucene.queries.function.valuesource.MaxDocValueSource) JoinDocFreqValueSource(org.apache.lucene.queries.function.valuesource.JoinDocFreqValueSource) LiteralValueSource(org.apache.lucene.queries.function.valuesource.LiteralValueSource) TotalTermFreqValueSource(org.apache.lucene.queries.function.valuesource.TotalTermFreqValueSource) IDFValueSource(org.apache.lucene.queries.function.valuesource.IDFValueSource) TermFreqValueSource(org.apache.lucene.queries.function.valuesource.TermFreqValueSource) TFValueSource(org.apache.lucene.queries.function.valuesource.TFValueSource) IDFValueSource(org.apache.lucene.queries.function.valuesource.IDFValueSource) BytesRef(org.apache.lucene.util.BytesRef)

Example 38 with ClassicSimilarity

use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

the class TestFuzzyQuery method testMultipleQueriesIdfWorks.

public void testMultipleQueriesIdfWorks() throws Exception {
    // With issue LUCENE-329 - it could be argued a MultiTermQuery.TopTermsBoostOnlyBooleanQueryRewrite
    // is the solution as it disables IDF.
    // However - IDF is still useful as in this case where there are multiple FuzzyQueries.
    Directory directory = newDirectory();
    RandomIndexWriter writer = new RandomIndexWriter(random(), directory);
    addDoc("michael smith", writer);
    addDoc("michael lucero", writer);
    addDoc("doug cutting", writer);
    addDoc("doug cuttin", writer);
    addDoc("michael wardle", writer);
    addDoc("micheal vegas", writer);
    addDoc("michael lydon", writer);
    IndexReader reader = writer.getReader();
    IndexSearcher searcher = newSearcher(reader);
    //avoid randomisation of similarity algo by test framework
    searcher.setSimilarity(new ClassicSimilarity());
    writer.close();
    BooleanQuery.Builder query = new BooleanQuery.Builder();
    String commonSearchTerm = "michael";
    FuzzyQuery commonQuery = new FuzzyQuery(new Term("field", commonSearchTerm), 2, 1);
    query.add(commonQuery, Occur.SHOULD);
    String rareSearchTerm = "cutting";
    FuzzyQuery rareQuery = new FuzzyQuery(new Term("field", rareSearchTerm), 2, 1);
    query.add(rareQuery, Occur.SHOULD);
    ScoreDoc[] hits = searcher.search(query.build(), 1000).scoreDocs;
    // Matches on the rare surname should be worth more than matches on the common forename
    assertEquals(7, hits.length);
    Document bestDoc = searcher.doc(hits[0].doc);
    String topMatch = bestDoc.get("field");
    assertTrue(topMatch.contains(rareSearchTerm));
    Document runnerUpDoc = searcher.doc(hits[1].doc);
    String runnerUpMatch = runnerUpDoc.get("field");
    assertTrue(runnerUpMatch.contains("cuttin"));
    Document worstDoc = searcher.doc(hits[hits.length - 1].doc);
    String worstMatch = worstDoc.get("field");
    //misspelling of common name
    assertTrue(worstMatch.contains("micheal"));
    reader.close();
    directory.close();
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) IndexReader(org.apache.lucene.index.IndexReader) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)

Example 39 with ClassicSimilarity

use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

the class TestMemoryIndex method testFreezeAPI.

@Test
public void testFreezeAPI() {
    MemoryIndex mi = new MemoryIndex();
    mi.addField("f1", "some text", analyzer);
    assertThat(mi.search(new MatchAllDocsQuery()), not(is(0.0f)));
    assertThat(mi.search(new TermQuery(new Term("f1", "some"))), not(is(0.0f)));
    // check we can add a new field after searching
    mi.addField("f2", "some more text", analyzer);
    assertThat(mi.search(new TermQuery(new Term("f2", "some"))), not(is(0.0f)));
    // freeze!
    mi.freeze();
    RuntimeException expected = expectThrows(RuntimeException.class, () -> {
        mi.addField("f3", "and yet more", analyzer);
    });
    assertThat(expected.getMessage(), containsString("frozen"));
    expected = expectThrows(RuntimeException.class, () -> {
        mi.setSimilarity(new BM25Similarity(1, 1));
    });
    assertThat(expected.getMessage(), containsString("frozen"));
    assertThat(mi.search(new TermQuery(new Term("f1", "some"))), not(is(0.0f)));
    mi.reset();
    mi.addField("f1", "wibble", analyzer);
    assertThat(mi.search(new TermQuery(new Term("f1", "some"))), is(0.0f));
    assertThat(mi.search(new TermQuery(new Term("f1", "wibble"))), not(is(0.0f)));
    // check we can set the Similarity again
    mi.setSimilarity(new ClassicSimilarity());
}
Also used : TermQuery(org.apache.lucene.search.TermQuery) ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) BM25Similarity(org.apache.lucene.search.similarities.BM25Similarity) Term(org.apache.lucene.index.Term) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) Test(org.junit.Test)

Example 40 with ClassicSimilarity

use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

the class TestQueryRescorer method getSearcher.

private IndexSearcher getSearcher(IndexReader r) {
    IndexSearcher searcher = newSearcher(r);
    // We rely on more tokens = lower score:
    searcher.setSimilarity(new ClassicSimilarity());
    return searcher;
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity)

Aggregations

ClassicSimilarity (org.apache.lucene.search.similarities.ClassicSimilarity)43 RandomIndexWriter (org.apache.lucene.index.RandomIndexWriter)14 Document (org.apache.lucene.document.Document)13 Term (org.apache.lucene.index.Term)12 Directory (org.apache.lucene.store.Directory)10 IndexReader (org.apache.lucene.index.IndexReader)9 Similarity (org.apache.lucene.search.similarities.Similarity)9 TermQuery (org.apache.lucene.search.TermQuery)7 BM25Similarity (org.apache.lucene.search.similarities.BM25Similarity)7 MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer)6 ConstValueSource (org.apache.lucene.queries.function.valuesource.ConstValueSource)5 DocFreqValueSource (org.apache.lucene.queries.function.valuesource.DocFreqValueSource)4 DoubleConstValueSource (org.apache.lucene.queries.function.valuesource.DoubleConstValueSource)4 IDFValueSource (org.apache.lucene.queries.function.valuesource.IDFValueSource)4 JoinDocFreqValueSource (org.apache.lucene.queries.function.valuesource.JoinDocFreqValueSource)4 LiteralValueSource (org.apache.lucene.queries.function.valuesource.LiteralValueSource)4 MaxDocValueSource (org.apache.lucene.queries.function.valuesource.MaxDocValueSource)4 IndexSearcher (org.apache.lucene.search.IndexSearcher)4 Query (org.apache.lucene.search.Query)4 IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig)3