Example 1 with PhraseQueryGenerator

Use of io.anserini.search.query.PhraseQueryGenerator in project Anserini by castorini.

From the class IndexReaderUtils, method getTermCountsWithAnalyzer.

/**
 * Returns count information on a term or a phrase.
 *
 * @param reader index reader
 * @param termStr term or phrase, analyzed with the given analyzer
 * @param analyzer analyzer to use
 * @return docFreq of the phrase; for a single term, collectionFreq as well
 * @throws IOException if error encountered during access to index
 */
public static Map<String, Long> getTermCountsWithAnalyzer(IndexReader reader, String termStr, Analyzer analyzer) throws IOException {
    // Analyze once and reuse the tokens in both branches.
    List<String> tokens = AnalyzerUtils.analyze(analyzer, termStr);
    if (tokens.size() > 1) {
        // Several tokens: run a phrase query and report only the number of matching documents.
        Query query = new PhraseQueryGenerator().buildQuery(IndexArgs.CONTENTS, analyzer, termStr);
        IndexSearcher searcher = new IndexSearcher(reader);
        TotalHitCountCollector totalHitCountCollector = new TotalHitCountCollector();
        searcher.search(query, totalHitCountCollector);
        return Map.of("docFreq", (long) totalHitCountCollector.getTotalHits());
    }
    // Single token: read both statistics directly from the index.
    // Note: tokens.get(0) throws if the analyzer removes every token from termStr.
    Term t = new Term(IndexArgs.CONTENTS, tokens.get(0));
    return Map.of("collectionFreq", reader.totalTermFreq(t), "docFreq", (long) reader.docFreq(t));
}
Also used: IndexSearcher (org.apache.lucene.search.IndexSearcher), PhraseQueryGenerator (io.anserini.search.query.PhraseQueryGenerator), Query (org.apache.lucene.search.Query), ConstantScoreQuery (org.apache.lucene.search.ConstantScoreQuery), TermQuery (org.apache.lucene.search.TermQuery), BooleanQuery (org.apache.lucene.search.BooleanQuery), TotalHitCountCollector (org.apache.lucene.search.TotalHitCountCollector), Term (org.apache.lucene.index.Term)
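
The method above returns different keys depending on how many tokens the analyzer produces: a single term yields both collectionFreq and docFreq, while a phrase yields only docFreq. The following is a minimal sketch of that contract on a toy in-memory collection, using only the Java standard library. The class TermCountsSketch, its analyze helper (lowercase, split on non-alphanumerics), and the exact adjacent-token phrase match are all assumptions made for illustration; they are not part of Anserini or Lucene, and real analyzers may also stem tokens or drop stopwords.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy stand-in for an index; hypothetical, not part of Anserini or Lucene. */
class TermCountsSketch {

    /** Hypothetical analyzer: lowercases and splits on non-alphanumeric characters. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    /** Counts positions where the query tokens appear as an adjacent run in the document tokens. */
    static long countOccurrences(List<String> docTokens, List<String> queryTokens) {
        long occurrences = 0;
        for (int i = 0; i + queryTokens.size() <= docTokens.size(); i++) {
            if (docTokens.subList(i, i + queryTokens.size()).equals(queryTokens)) {
                occurrences++;
            }
        }
        return occurrences;
    }

    /** Mirrors the shape of the result: docFreq always, collectionFreq only for a single term. */
    static Map<String, Long> termCounts(List<String> docs, String termStr) {
        List<String> queryTokens = analyze(termStr);
        if (queryTokens.isEmpty()) {
            // Unlike the original method, this sketch rejects input that analyzes to nothing.
            throw new IllegalArgumentException("termStr produced no tokens");
        }
        long docFreq = 0;
        long collectionFreq = 0;
        for (String doc : docs) {
            long occurrences = countOccurrences(analyze(doc), queryTokens);
            if (occurrences > 0) {
                docFreq++;
            }
            collectionFreq += occurrences;
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        if (queryTokens.size() == 1) {
            counts.put("collectionFreq", collectionFreq);
        }
        counts.put("docFreq", docFreq);
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "Information retrieval with Lucene",
                "Lucene powers retrieval; retrieval needs an index",
                "Retrieval of information");
        System.out.println(termCounts(docs, "retrieval"));             // {collectionFreq=4, docFreq=3}
        System.out.println(termCounts(docs, "information retrieval")); // {docFreq=1}
    }
}
```

Callers of the real method should likewise check for the collectionFreq key before reading it, since it is absent whenever the input analyzes to more than one token.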

Aggregations

PhraseQueryGenerator (io.anserini.search.query.PhraseQueryGenerator): 1; Term (org.apache.lucene.index.Term): 1; BooleanQuery (org.apache.lucene.search.BooleanQuery): 1; ConstantScoreQuery (org.apache.lucene.search.ConstantScoreQuery): 1; IndexSearcher (org.apache.lucene.search.IndexSearcher): 1; Query (org.apache.lucene.search.Query): 1; TermQuery (org.apache.lucene.search.TermQuery): 1; TotalHitCountCollector (org.apache.lucene.search.TotalHitCountCollector): 1