Search in sources :

Example 11 with TotalHits

use of org.apache.lucene.search.TotalHits in project hazelcast by hazelcast.

the class ElasticSourcePTest method givenMultipleResults_when_runProcessor_then_useScrollIdInFollowupScrollRequest.

@Test
public void givenMultipleResults_when_runProcessor_then_useScrollIdInFollowupScrollRequest() throws Exception {
    SearchHit hit = new SearchHit(0, "id-0", new Text("ignored"), emptyMap(), emptyMap());
    hit.sourceRef(new BytesArray(HIT_SOURCE));
    when(response.getHits()).thenReturn(new SearchHits(new SearchHit[] { hit }, new TotalHits(3, EQUAL_TO), Float.NaN));
    SearchResponse response2 = mock(SearchResponse.class);
    SearchHit hit2 = new SearchHit(1, "id-1", new Text("ignored"), emptyMap(), emptyMap());
    hit2.sourceRef(new BytesArray(HIT_SOURCE2));
    when(response2.getHits()).thenReturn(new SearchHits(new SearchHit[] { hit2 }, new TotalHits(3, EQUAL_TO), Float.NaN));
    SearchResponse response3 = mock(SearchResponse.class);
    when(response3.getHits()).thenReturn(new SearchHits(new SearchHit[] {}, new TotalHits(3, EQUAL_TO), Float.NaN));
    when(mockClient.scroll(any(), any())).thenReturn(response2, response3);
    TestSupport testSupport = runProcessor();
    testSupport.expectOutput(newArrayList(HIT_SOURCE, HIT_SOURCE2));
    ArgumentCaptor<SearchScrollRequest> captor = forClass(SearchScrollRequest.class);
    verify(mockClient, times(2)).scroll(captor.capture(), any());
    SearchScrollRequest request = captor.getValue();
    assertThat(request.scrollId()).isEqualTo(SCROLL_ID);
    assertThat(request.scroll().keepAlive().getStringRep()).isEqualTo(KEEP_ALIVE);
}
Also used : TotalHits(org.apache.lucene.search.TotalHits) BytesArray(org.elasticsearch.common.bytes.BytesArray) SearchHit(org.elasticsearch.search.SearchHit) TestSupport(com.hazelcast.jet.core.test.TestSupport) Text(org.elasticsearch.common.text.Text) SearchHits(org.elasticsearch.search.SearchHits) SearchScrollRequest(org.elasticsearch.action.search.SearchScrollRequest) SearchResponse(org.elasticsearch.action.search.SearchResponse) ParallelJVMTest(com.hazelcast.test.annotation.ParallelJVMTest) QuickTest(com.hazelcast.test.annotation.QuickTest) Test(org.junit.Test)

Example 12 with TotalHits

use of org.apache.lucene.search.TotalHits in project hazelcast by hazelcast.

the class ElasticSourcePTest method when_runProcessorWithCoLocation_then_useLocalNodeOnly.

@Test
public void when_runProcessorWithCoLocation_then_useLocalNodeOnly() throws Exception {
    RestClient lowClient = mock(RestClient.class);
    when(mockClient.getLowLevelClient()).thenReturn(lowClient);
    when(response.getHits()).thenReturn(new SearchHits(new SearchHit[] {}, new TotalHits(0, EQUAL_TO), Float.NaN));
    TestSupport testSupport = runProcessorWithCoLocation(newArrayList(new Shard("my-index", 0, Prirep.p, 42, "STARTED", "10.0.0.1", "10.0.0.1:9200", "es1")));
    testSupport.expectOutput(emptyList());
    ArgumentCaptor<Collection<Node>> nodesCaptor = ArgumentCaptor.forClass(Collection.class);
    verify(lowClient).setNodes(nodesCaptor.capture());
    Collection<Node> nodes = nodesCaptor.getValue();
    assertThat(nodes).hasSize(1);
    Node node = nodes.iterator().next();
    assertThat(node.getHost().toHostString()).isEqualTo("10.0.0.1:9200");
}
Also used : TotalHits(org.apache.lucene.search.TotalHits) SearchHit(org.elasticsearch.search.SearchHit) Node(org.elasticsearch.client.Node) RestClient(org.elasticsearch.client.RestClient) TestSupport(com.hazelcast.jet.core.test.TestSupport) Collection(java.util.Collection) SearchHits(org.elasticsearch.search.SearchHits) ParallelJVMTest(com.hazelcast.test.annotation.ParallelJVMTest) QuickTest(com.hazelcast.test.annotation.QuickTest) Test(org.junit.Test)

Example 13 with TotalHits

use of org.apache.lucene.search.TotalHits in project hazelcast by hazelcast.

the class ElasticSourcePTest method given_singleHit_when_runProcessor_then_produceSingleHit.

@Test
public void given_singleHit_when_runProcessor_then_produceSingleHit() throws Exception {
    SearchHit hit = new SearchHit(0, "id-0", new Text("ignored"), emptyMap(), emptyMap());
    hit.sourceRef(new BytesArray(HIT_SOURCE));
    when(response.getHits()).thenReturn(new SearchHits(new SearchHit[] { hit }, new TotalHits(1, EQUAL_TO), Float.NaN));
    SearchResponse response2 = mock(SearchResponse.class);
    when(response2.getHits()).thenReturn(new SearchHits(new SearchHit[] {}, new TotalHits(1, EQUAL_TO), Float.NaN));
    when(mockClient.scroll(any(), any())).thenReturn(response2);
    TestSupport testSupport = runProcessor();
    testSupport.expectOutput(newArrayList(HIT_SOURCE));
}
Also used : TotalHits(org.apache.lucene.search.TotalHits) BytesArray(org.elasticsearch.common.bytes.BytesArray) SearchHit(org.elasticsearch.search.SearchHit) TestSupport(com.hazelcast.jet.core.test.TestSupport) Text(org.elasticsearch.common.text.Text) SearchHits(org.elasticsearch.search.SearchHits) SearchResponse(org.elasticsearch.action.search.SearchResponse) ParallelJVMTest(com.hazelcast.test.annotation.ParallelJVMTest) QuickTest(com.hazelcast.test.annotation.QuickTest) Test(org.junit.Test)

Example 14 with TotalHits

use of org.apache.lucene.search.TotalHits in project hazelcast by hazelcast.

the class ElasticSourcePTest method when_runProcessorWithOptionsFn_then_shouldUseOptionsFnForSearchRequest.

@Test
public void when_runProcessorWithOptionsFn_then_shouldUseOptionsFnForSearchRequest() throws Exception {
    when(response.getHits()).thenReturn(new SearchHits(new SearchHit[] {}, new TotalHits(0, EQUAL_TO), Float.NaN));
    // get different instance than default
    TestSupport testSupport = runProcessor(request -> {
        Builder builder = RequestOptions.DEFAULT.toBuilder();
        builder.addHeader("TestHeader", "value");
        return builder.build();
    });
    testSupport.expectOutput(emptyList());
    ArgumentCaptor<RequestOptions> captor = forClass(RequestOptions.class);
    verify(mockClient).search(any(), captor.capture());
    RequestOptions capturedOptions = captor.getValue();
    assertThat(capturedOptions.getHeaders()).extracting(h -> tuple(h.getName(), h.getValue())).containsExactly(tuple("TestHeader", "value"));
}
Also used : TotalHits(org.apache.lucene.search.TotalHits) RestClient(org.elasticsearch.client.RestClient) ArgumentMatchers.any(org.mockito.ArgumentMatchers.any) ParallelJVMTest(com.hazelcast.test.annotation.ParallelJVMTest) RestClientBuilder(org.elasticsearch.client.RestClientBuilder) SearchHits(org.elasticsearch.search.SearchHits) QuickTest(com.hazelcast.test.annotation.QuickTest) Assertions.assertThat(org.assertj.core.api.Assertions.assertThat) RunWith(org.junit.runner.RunWith) TestOutbox(com.hazelcast.jet.core.test.TestOutbox) SearchRequest(org.elasticsearch.action.search.SearchRequest) Node(org.elasticsearch.client.Node) Builder(org.elasticsearch.client.RequestOptions.Builder) Prirep(com.hazelcast.jet.elastic.impl.Shard.Prirep) BytesArray(org.elasticsearch.common.bytes.BytesArray) ArgumentCaptor(org.mockito.ArgumentCaptor) Text(org.elasticsearch.common.text.Text) SearchResponse(org.elasticsearch.action.search.SearchResponse) RequestOptions(org.elasticsearch.client.RequestOptions) RETURNS_DEEP_STUBS(org.mockito.Mockito.RETURNS_DEEP_STUBS) Before(org.junit.Before) SearchHit(org.elasticsearch.search.SearchHit) FunctionEx(com.hazelcast.function.FunctionEx) Collections.emptyMap(java.util.Collections.emptyMap) ActionRequest(org.elasticsearch.action.ActionRequest) SliceBuilder(org.elasticsearch.search.slice.SliceBuilder) Assertions.tuple(org.assertj.core.api.Assertions.tuple) Collections.emptyList(java.util.Collections.emptyList) Collection(java.util.Collection) Test(org.junit.Test) Mockito.times(org.mockito.Mockito.times) Mockito.when(org.mockito.Mockito.when) RestHighLevelClient(org.elasticsearch.client.RestHighLevelClient) Category(org.junit.experimental.categories.Category) EQUAL_TO(org.apache.lucene.search.TotalHits.Relation.EQUAL_TO) Serializable(java.io.Serializable) Mockito.verify(org.mockito.Mockito.verify) TotalHits(org.apache.lucene.search.TotalHits) List(java.util.List) ArgumentCaptor.forClass(org.mockito.ArgumentCaptor.forClass) TestSupport(com.hazelcast.jet.core.test.TestSupport) HazelcastParallelClassRunner(com.hazelcast.test.HazelcastParallelClassRunner) SearchScrollRequest(org.elasticsearch.action.search.SearchScrollRequest) Lists.newArrayList(org.assertj.core.util.Lists.newArrayList) Mockito.mock(org.mockito.Mockito.mock) SearchHit(org.elasticsearch.search.SearchHit) RequestOptions(org.elasticsearch.client.RequestOptions) RestClientBuilder(org.elasticsearch.client.RestClientBuilder) Builder(org.elasticsearch.client.RequestOptions.Builder) SliceBuilder(org.elasticsearch.search.slice.SliceBuilder) TestSupport(com.hazelcast.jet.core.test.TestSupport) SearchHits(org.elasticsearch.search.SearchHits) ParallelJVMTest(com.hazelcast.test.annotation.ParallelJVMTest) QuickTest(com.hazelcast.test.annotation.QuickTest) Test(org.junit.Test)

Example 15 with TotalHits

use of org.apache.lucene.search.TotalHits in project Anserini by castorini.

the class SearchCollection method searchTweets.

public <K> ScoredDocuments searchTweets(IndexSearcher searcher, K qid, String queryString, long t, RerankerCascade cascade, ScoredDocuments queryQrels, boolean hasRelDocs) throws IOException {
    Query keywordQuery;
    if (args.sdm) {
        keywordQuery = new SdmQueryGenerator(args.sdm_tw, args.sdm_ow, args.sdm_uw).buildQuery(IndexArgs.CONTENTS, analyzer, queryString);
    } else {
        try {
            QueryGenerator generator = (QueryGenerator) Class.forName("io.anserini.search.query." + args.queryGenerator).getConstructor().newInstance();
            keywordQuery = generator.buildQuery(IndexArgs.CONTENTS, analyzer, queryString);
        } catch (Exception e) {
            e.printStackTrace();
            throw new IllegalArgumentException("Unable to load QueryGenerator: " + args.topicReader);
        }
    }
    List<String> queryTokens = AnalyzerUtils.analyze(analyzer, queryString);
    // Do not consider the tweets with tweet ids that are beyond the queryTweetTime
    // <querytweettime> tag contains the timestamp of the query in terms of the
    // chronologically nearest tweet id within the corpus
    Query filter = LongPoint.newRangeQuery(TweetGenerator.TweetField.ID_LONG.name, 0L, t);
    BooleanQuery.Builder builder = new BooleanQuery.Builder();
    builder.add(filter, BooleanClause.Occur.FILTER);
    builder.add(keywordQuery, BooleanClause.Occur.MUST);
    Query compositeQuery = builder.build();
    TopDocs rs = new TopDocs(new TotalHits(0, TotalHits.Relation.EQUAL_TO), new ScoreDoc[] {});
    if (!isRerank || (args.rerankcutoff > 0 && args.rf_qrels == null) || (args.rf_qrels != null && !hasRelDocs)) {
        if (args.arbitraryScoreTieBreak) {
            // Figure out how to break the scoring ties.
            rs = searcher.search(compositeQuery, (isRerank && args.rf_qrels == null) ? args.rerankcutoff : args.hits);
        } else {
            rs = searcher.search(compositeQuery, (isRerank && args.rf_qrels == null) ? args.rerankcutoff : args.hits, BREAK_SCORE_TIES_BY_TWEETID, true);
        }
    }
    RerankerContext context = new RerankerContext<>(searcher, qid, keywordQuery, null, queryString, queryTokens, filter, args);
    ScoredDocuments scoredFbDocs;
    if (isRerank && args.rf_qrels != null) {
        if (hasRelDocs) {
            scoredFbDocs = queryQrels;
        } else {
            // if no relevant documents, only perform score based tie breaking next
            scoredFbDocs = ScoredDocuments.fromTopDocs(rs, searcher);
            cascade = new RerankerCascade();
            cascade.add(new ScoreTiesAdjusterReranker());
        }
    } else {
        scoredFbDocs = ScoredDocuments.fromTopDocs(rs, searcher);
    }
    return cascade.run(scoredFbDocs, context);
}
Also used : TotalHits(org.apache.lucene.search.TotalHits) BooleanQuery(org.apache.lucene.search.BooleanQuery) Query(org.apache.lucene.search.Query) TermInSetQuery(org.apache.lucene.search.TermInSetQuery) BooleanQuery(org.apache.lucene.search.BooleanQuery) ScoredDocuments(io.anserini.rerank.ScoredDocuments) QueryNodeException(org.apache.lucene.queryparser.flexible.core.QueryNodeException) IOException(java.io.IOException) CompletionException(java.util.concurrent.CompletionException) CmdLineException(org.kohsuke.args4j.CmdLineException) AtomicMoveNotSupportedException(java.nio.file.AtomicMoveNotSupportedException) TopDocs(org.apache.lucene.search.TopDocs) RerankerCascade(io.anserini.rerank.RerankerCascade) QueryGenerator(io.anserini.search.query.QueryGenerator) SdmQueryGenerator(io.anserini.search.query.SdmQueryGenerator) ScoreTiesAdjusterReranker(io.anserini.rerank.lib.ScoreTiesAdjusterReranker) SdmQueryGenerator(io.anserini.search.query.SdmQueryGenerator) RerankerContext(io.anserini.rerank.RerankerContext)

Aggregations

TotalHits (org.apache.lucene.search.TotalHits)18 Test (org.junit.Test)14 SearchHits (org.elasticsearch.search.SearchHits)13 SearchHit (org.elasticsearch.search.SearchHit)12 TestSupport (com.hazelcast.jet.core.test.TestSupport)9 ParallelJVMTest (com.hazelcast.test.annotation.ParallelJVMTest)9 QuickTest (com.hazelcast.test.annotation.QuickTest)9 SearchRequest (org.elasticsearch.action.search.SearchRequest)8 SearchResponse (org.elasticsearch.action.search.SearchResponse)7 BytesArray (org.elasticsearch.common.bytes.BytesArray)3 Text (org.elasticsearch.common.text.Text)3 SliceBuilder (org.elasticsearch.search.slice.SliceBuilder)3 FunctionEx (com.hazelcast.function.FunctionEx)2 TestOutbox (com.hazelcast.jet.core.test.TestOutbox)2 Prirep (com.hazelcast.jet.elastic.impl.Shard.Prirep)2 HazelcastParallelClassRunner (com.hazelcast.test.HazelcastParallelClassRunner)2 RerankerCascade (io.anserini.rerank.RerankerCascade)2 RerankerContext (io.anserini.rerank.RerankerContext)2 ScoredDocuments (io.anserini.rerank.ScoredDocuments)2 ScoreTiesAdjusterReranker (io.anserini.rerank.lib.ScoreTiesAdjusterReranker)2