Example 21 with ClassicSimilarity

Use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

Class TestPayloadSpanUtil, method testPayloadSpanUtil.

public void testPayloadSpanUtil() throws Exception {
    Directory directory = newDirectory();
    RandomIndexWriter writer = new RandomIndexWriter(random(), directory, newIndexWriterConfig(new PayloadAnalyzer()).setSimilarity(new ClassicSimilarity()));
    Document doc = new Document();
    doc.add(newTextField(FIELD, "xx rr yy mm  pp", Field.Store.YES));
    writer.addDocument(doc);
    IndexReader reader = writer.getReader();
    writer.close();
    IndexSearcher searcher = newSearcher(reader);
    PayloadSpanUtil psu = new PayloadSpanUtil(searcher.getTopReaderContext());
    Collection<byte[]> payloads = psu.getPayloadsForQuery(new TermQuery(new Term(FIELD, "rr")));
    if (VERBOSE) {
        System.out.println("Num payloads:" + payloads.size());
        for (final byte[] bytes : payloads) {
            System.out.println(new String(bytes, StandardCharsets.UTF_8));
        }
    }
    reader.close();
    directory.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) TermQuery(org.apache.lucene.search.TermQuery) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) IndexReader(org.apache.lucene.index.IndexReader) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)

Example 22 with ClassicSimilarity

Use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

Class TestSpans, method testSpanScorerZeroSloppyFreq.

public void testSpanScorerZeroSloppyFreq() throws Exception {
    IndexReaderContext topReaderContext = searcher.getTopReaderContext();
    List<LeafReaderContext> leaves = topReaderContext.leaves();
    int subIndex = ReaderUtil.subIndex(11, leaves);
    for (int i = 0, c = leaves.size(); i < c; i++) {
        final LeafReaderContext ctx = leaves.get(i);
        final Similarity sim = new ClassicSimilarity() {

            @Override
            public float sloppyFreq(int distance) {
                return 0.0f;
            }
        };
        final Similarity oldSim = searcher.getSimilarity(true);
        Scorer spanScorer;
        try {
            searcher.setSimilarity(sim);
            SpanQuery snq = spanNearOrderedQuery(field, 1, "t1", "t2");
            spanScorer = searcher.createNormalizedWeight(snq, true).scorer(ctx);
        } finally {
            searcher.setSimilarity(oldSim);
        }
        if (i == subIndex) {
            assertTrue("first doc", spanScorer.iterator().nextDoc() != DocIdSetIterator.NO_MORE_DOCS);
            assertEquals("first doc number", spanScorer.docID() + ctx.docBase, 11);
            float score = spanScorer.score();
            assertTrue("first doc score should be zero, " + score, score == 0.0f);
        } else {
            assertTrue("no second doc", spanScorer == null || spanScorer.iterator().nextDoc() == DocIdSetIterator.NO_MORE_DOCS);
        }
    }
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) Similarity(org.apache.lucene.search.similarities.Similarity) LeafReaderContext(org.apache.lucene.index.LeafReaderContext) Scorer(org.apache.lucene.search.Scorer) IndexReaderContext(org.apache.lucene.index.IndexReaderContext)
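
For context on why the override in testSpanScorerZeroSloppyFreq drives the score to zero: as far as I know, ClassicSimilarity in the Lucene 6.x line returns 1 / (distance + 1) from sloppyFreq and uses the square root of the accumulated frequency as tf. The plain-Java sketch below imitates only that arithmetic and does not depend on Lucene. SlopFactor, accumulate, and the sample distances are hypothetical names and values chosen for illustration, not Lucene API.

```java
// Plain-Java sketch of sloppy-frequency accumulation; no Lucene dependency.
final class SloppyFreqSketch {

    /** Hypothetical stand-in for the sloppyFreq hook; not a Lucene type. */
    interface SlopFactor {
        float apply(int distance);
    }

    /** Assumed default behaviour of ClassicSimilarity.sloppyFreq: 1 / (distance + 1). */
    static final SlopFactor CLASSIC = distance -> 1.0f / (distance + 1);

    /** Same idea as the anonymous subclass in the test: every match contributes nothing. */
    static final SlopFactor ZERO = distance -> 0.0f;

    /** Sums the per-match contributions, one per span match distance. */
    static float accumulate(int[] distances, SlopFactor factor) {
        float freq = 0.0f;
        for (int d : distances) {
            freq += factor.apply(d);
        }
        return freq;
    }

    /** Assumed classic term-frequency factor: sqrt(freq). */
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    public static void main(String[] args) {
        int[] distances = {0, 1}; // hypothetical match distances
        System.out.println("classic freq = " + accumulate(distances, CLASSIC));
        System.out.println("zeroed tf    = " + tf(accumulate(distances, ZERO)));
    }
}
```

With the zero factor the accumulated frequency stays at 0, so any tf-based score built on it is 0 as well, which is consistent with the "first doc score should be zero" assertion in the test.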

Example 23 with ClassicSimilarity

Use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

Class TestPayloadExplanations, method setUp.

@Override
public void setUp() throws Exception {
    super.setUp();
    searcher.setSimilarity(new ClassicSimilarity() {

        @Override
        public float scorePayload(int doc, int start, int end, BytesRef payload) {
            return 1 + (payload.hashCode() % 10);
        }
    });
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) BytesRef(org.apache.lucene.util.BytesRef)
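
A note on the arithmetic in the scorePayload override above: Java's % operator keeps the sign of the dividend, so if hashCode() is negative the expression 1 + (hash % 10) can fall anywhere in [-8, 10] rather than [1, 10]. The sketch below shows this with plain ints and no Lucene types. floorModScore is a hypothetical variant added only for comparison; it is not what the test uses.

```java
// Plain-Java sketch comparing % with Math.floorMod for negative hash values.
final class PayloadScoreSketch {

    /** Same arithmetic as the override above, with the hash passed in directly. */
    static float remainderScore(int hash) {
        return 1 + (hash % 10);
    }

    /** Hypothetical variant: Math.floorMod keeps the result in [1, 10]. */
    static float floorModScore(int hash) {
        return 1 + Math.floorMod(hash, 10);
    }

    public static void main(String[] args) {
        int[] hashes = {23, -7, Integer.MIN_VALUE}; // sample values for illustration
        for (int h : hashes) {
            System.out.println(h + ": remainder=" + remainderScore(h)
                    + " floorMod=" + floorModScore(h));
        }
    }
}
```

Whether a negative payload score matters depends on what the surrounding test checks; the sketch only illustrates the value range.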

Example 24 with ClassicSimilarity

Use of org.apache.lucene.search.similarities.ClassicSimilarity in project lucene-solr by apache.

Class SchemaSimilarityFactory, method getSimilarity.

@Override
public Similarity getSimilarity() {
    if (null == core) {
        throw new IllegalStateException("SchemaSimilarityFactory can not be used until SolrCoreAware.inform has been called");
    }
    if (null == similarity) {
        // Need to instantiate lazily, can't do this in inform(SolrCore) because of chicken/egg
        // circular initialization hell with core.getLatestSchema() to lookup defaultSimFromFieldType
        Similarity defaultSim = null;
        if (null == defaultSimFromFieldType) {
            // nothing configured, choose a sensible implicit default...
            defaultSim = this.core.getSolrConfig().luceneMatchVersion.onOrAfter(Version.LUCENE_6_0_0) ? new BM25Similarity() : new ClassicSimilarity();
        } else {
            FieldType defSimFT = core.getLatestSchema().getFieldTypeByName(defaultSimFromFieldType);
            if (null == defSimFT) {
                throw new SolrException(ErrorCode.SERVER_ERROR, "SchemaSimilarityFactory configured with " + INIT_OPT + "='" + defaultSimFromFieldType + "' but that <fieldType> does not exist");
            }
            defaultSim = defSimFT.getSimilarity();
            if (null == defaultSim) {
                throw new SolrException(ErrorCode.SERVER_ERROR, "SchemaSimilarityFactory configured with " + INIT_OPT + "='" + defaultSimFromFieldType + "' but that <fieldType> does not define a <similarity>");
            }
        }
        similarity = new SchemaSimilarity(defaultSim);
    }
    return similarity;
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) Similarity(org.apache.lucene.search.similarities.Similarity) BM25Similarity(org.apache.lucene.search.similarities.BM25Similarity) SolrException(org.apache.solr.common.SolrException) FieldType(org.apache.solr.schema.FieldType)

Example 25 with ClassicSimilarity

Use of org.apache.lucene.search.similarities.ClassicSimilarity in project Anserini by castorini.

Class IdfPassageScorer, method getTermIdfJSON.

@Override
public JSONObject getTermIdfJSON(List<String> sentList) {
    //    EnglishAnalyzer ea = new EnglishAnalyzer(StopFilter.makeStopSet(stopWords));
    EnglishAnalyzer ea = new EnglishAnalyzer(CharArraySet.EMPTY_SET);
    QueryParser qp = new QueryParser(LuceneDocumentGenerator.FIELD_BODY, ea);
    ClassicSimilarity similarity = new ClassicSimilarity();
    for (String sent : sentList) {
        String[] thisSentence = sent.trim().split("\\s+");
        for (String term : thisSentence) {
            try {
                TermQuery q = (TermQuery) qp.parse(term);
                Term t = q.getTerm();
                double termIDF = similarity.idf(reader.docFreq(t), reader.numDocs());
                termIdfMap.put(term, String.valueOf(termIDF));
            } catch (Exception e) {
                continue;
            }
        }
    }
    return new JSONObject(termIdfMap);
}
Also used : ClassicSimilarity(org.apache.lucene.search.similarities.ClassicSimilarity) TermQuery(org.apache.lucene.search.TermQuery) QueryParser(org.apache.lucene.queryparser.classic.QueryParser) JSONObject(org.json.JSONObject) EnglishAnalyzer(org.apache.lucene.analysis.en.EnglishAnalyzer) Term(org.apache.lucene.index.Term)
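
To make the similarity.idf(docFreq, numDocs) call in getTermIdfJSON more concrete: as far as I know, ClassicSimilarity in the Lucene 6.x line computes idf as log((docCount + 1) / (docFreq + 1)) + 1, while older releases used a slightly different numerator. The plain-Java sketch below reproduces that assumed formula without depending on Lucene. The class name, the collection size, and the document frequencies are hypothetical values for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of the classic idf formula; no Lucene dependency.
final class ClassicIdfSketch {

    /** Assumed to mirror ClassicSimilarity.idf: log((docCount + 1) / (docFreq + 1)) + 1. */
    static float idf(long docFreq, long docCount) {
        return (float) (Math.log((docCount + 1) / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        long numDocs = 1000; // hypothetical collection size
        Map<String, Long> docFreqs = new LinkedHashMap<>();
        docFreqs.put("the", 990L);     // very common term
        docFreqs.put("lucene", 12L);   // rare term
        docFreqs.put("anserini", 0L);  // term absent from the index
        for (Map.Entry<String, Long> e : docFreqs.entrySet()) {
            System.out.printf("%s -> %.4f%n", e.getKey(), idf(e.getValue(), numDocs));
        }
    }
}
```

Under this formula a term that occurs in every document gets an idf of exactly 1, and rarer terms get larger values, which is why the method above can use the result as a per-term weight.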

Aggregations

ClassicSimilarity (org.apache.lucene.search.similarities.ClassicSimilarity): 43
RandomIndexWriter (org.apache.lucene.index.RandomIndexWriter): 14
Document (org.apache.lucene.document.Document): 13
Term (org.apache.lucene.index.Term): 12
Directory (org.apache.lucene.store.Directory): 10
IndexReader (org.apache.lucene.index.IndexReader): 9
Similarity (org.apache.lucene.search.similarities.Similarity): 9
TermQuery (org.apache.lucene.search.TermQuery): 7
BM25Similarity (org.apache.lucene.search.similarities.BM25Similarity): 7
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer): 6
ConstValueSource (org.apache.lucene.queries.function.valuesource.ConstValueSource): 5
DocFreqValueSource (org.apache.lucene.queries.function.valuesource.DocFreqValueSource): 4
DoubleConstValueSource (org.apache.lucene.queries.function.valuesource.DoubleConstValueSource): 4
IDFValueSource (org.apache.lucene.queries.function.valuesource.IDFValueSource): 4
JoinDocFreqValueSource (org.apache.lucene.queries.function.valuesource.JoinDocFreqValueSource): 4
LiteralValueSource (org.apache.lucene.queries.function.valuesource.LiteralValueSource): 4
MaxDocValueSource (org.apache.lucene.queries.function.valuesource.MaxDocValueSource): 4
IndexSearcher (org.apache.lucene.search.IndexSearcher): 4
Query (org.apache.lucene.search.Query): 4
IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig): 3