Example 6 with SignificantTerms

Use of org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms in project elasticsearch by elastic.

From the class SignificantTermsIT, method testStructuredAnalysisWithIncludeExclude.

public void testStructuredAnalysisWithIncludeExclude() throws Exception {
    long[] excludeTerms = { MUSIC_CATEGORY };
    SearchResponse response = client().prepareSearch("test")
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setQuery(new TermQueryBuilder("description", "paul"))
            .setFrom(0).setSize(60).setExplain(true)
            .addAggregation(significantTerms("mySignificantTerms").field("fact_category")
                    .executionHint(randomExecutionHint()).minDocCount(1)
                    .includeExclude(new IncludeExclude(null, excludeTerms)))
            .execute().actionGet();
    assertSearchResponse(response);
    SignificantTerms topTerms = response.getAggregations().get("mySignificantTerms");
    Number topCategory = (Number) topTerms.getBuckets().iterator().next().getKey();
    assertTrue(topCategory.equals(Long.valueOf(OTHER_CATEGORY)));
}
Also used : SignificantTerms(org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms) IncludeExclude(org.elasticsearch.search.aggregations.bucket.terms.support.IncludeExclude) TermQueryBuilder(org.elasticsearch.index.query.TermQueryBuilder) SearchResponse(org.elasticsearch.action.search.SearchResponse) ElasticsearchAssertions.assertSearchResponse(org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse)

Example 7 with SignificantTerms

Use of org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms in project elasticsearch by elastic.

From the class SignificantTermsIT, method testTextAnalysisGND.

public void testTextAnalysisGND() throws Exception {
    SearchResponse response = client().prepareSearch("test")
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setQuery(new TermQueryBuilder("description", "terje"))
            .setFrom(0).setSize(60).setExplain(true)
            .addAggregation(significantTerms("mySignificantTerms").field("description")
                    .executionHint(randomExecutionHint()).significanceHeuristic(new GND(true)).minDocCount(2))
            .execute().actionGet();
    assertSearchResponse(response);
    SignificantTerms topTerms = response.getAggregations().get("mySignificantTerms");
    checkExpectedStringTermsFound(topTerms);
}
Also used : SignificantTerms(org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms) TermQueryBuilder(org.elasticsearch.index.query.TermQueryBuilder) GND(org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND) SearchResponse(org.elasticsearch.action.search.SearchResponse) ElasticsearchAssertions.assertSearchResponse(org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse)

Example 8 with SignificantTerms

Use of org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms in project elasticsearch by elastic.

From the class SignificantTermsIT, method testTextAnalysisPercentageScore.

public void testTextAnalysisPercentageScore() throws Exception {
    SearchResponse response = client().prepareSearch("test")
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setQuery(new TermQueryBuilder("description", "terje"))
            .setFrom(0).setSize(60).setExplain(true)
            .addAggregation(significantTerms("mySignificantTerms").field("description")
                    .executionHint(randomExecutionHint()).significanceHeuristic(new PercentageScore()).minDocCount(2))
            .execute().actionGet();
    assertSearchResponse(response);
    SignificantTerms topTerms = response.getAggregations().get("mySignificantTerms");
    checkExpectedStringTermsFound(topTerms);
}
Also used : SignificantTerms(org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms) TermQueryBuilder(org.elasticsearch.index.query.TermQueryBuilder) PercentageScore(org.elasticsearch.search.aggregations.bucket.significant.heuristics.PercentageScore) SearchResponse(org.elasticsearch.action.search.SearchResponse) ElasticsearchAssertions.assertSearchResponse(org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse)
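
The PercentageScore heuristic selected in Example 8 can be illustrated without a cluster. The following is a minimal sketch, assuming the heuristic ranks a term by the number of foreground documents containing it divided by the number of background documents containing it. The class name PercentageScoreSketch, the method score, and the zero-background guard are hypothetical and are not part of the Elasticsearch API.

```java
// Hypothetical stand-alone sketch; not the Elasticsearch implementation.
public final class PercentageScoreSketch {

    private PercentageScoreSketch() {
    }

    /**
     * @param foregroundDocs documents matching the query that contain the term
     * @param backgroundDocs documents in the background set that contain the term
     * @return the fraction of the term's background documents that fall in the foreground
     */
    public static double score(long foregroundDocs, long backgroundDocs) {
        if (backgroundDocs == 0) {
            // Assumption for this sketch: score a term that is absent from the background as 0
            // rather than dividing by zero.
            return 0.0;
        }
        return (double) foregroundDocs / (double) backgroundDocs;
    }

    public static void main(String[] args) {
        // A term found in 3 foreground docs and 4 background docs is highly concentrated.
        System.out.println(score(3, 4));   // prints 0.75
        // A term found in 2 foreground docs but 200 background docs is common everywhere.
        System.out.println(score(2, 200)); // prints 0.01
    }
}
```

Because such a ratio ignores how many documents support it, a term seen only a handful of times can rank very high. This may be why the test pairs the heuristic with minDocCount(2).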

Example 9 with SignificantTerms

Use of org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms in project elasticsearch by elastic.

From the class SignificantTermsIT, method testMutualInformation.

public void testMutualInformation() throws Exception {
    SearchResponse response = client().prepareSearch("test")
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setQuery(new TermQueryBuilder("description", "terje"))
            .setFrom(0).setSize(60).setExplain(true)
            .addAggregation(significantTerms("mySignificantTerms").field("description")
                    .executionHint(randomExecutionHint()).significanceHeuristic(new MutualInformation(false, true)).minDocCount(1))
            .execute().actionGet();
    assertSearchResponse(response);
    SignificantTerms topTerms = response.getAggregations().get("mySignificantTerms");
    checkExpectedStringTermsFound(topTerms);
}
Also used : SignificantTerms(org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms) MutualInformation(org.elasticsearch.search.aggregations.bucket.significant.heuristics.MutualInformation) TermQueryBuilder(org.elasticsearch.index.query.TermQueryBuilder) SearchResponse(org.elasticsearch.action.search.SearchResponse) ElasticsearchAssertions.assertSearchResponse(org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse)

Example 10 with SignificantTerms

Use of org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms in project elasticsearch by elastic.

From the class SignificantTermsIT, method testBadFilteredAnalysis.

public void testBadFilteredAnalysis() throws Exception {
    // Deliberately using a bad choice of filter here for the background context in order
    // to test robustness.
    // We search for the name of a snowboarder but use music-related content (fact_category:1)
    // as the background source of term statistics.
    SearchResponse response = client().prepareSearch("test")
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setQuery(new TermQueryBuilder("description", "terje"))
            .setFrom(0).setSize(60).setExplain(true)
            .addAggregation(significantTerms("mySignificantTerms").field("description")
                    .minDocCount(2)
                    .backgroundFilter(QueryBuilders.termQuery("fact_category", 1)))
            .execute().actionGet();
    assertSearchResponse(response);
    SignificantTerms topTerms = response.getAggregations().get("mySignificantTerms");
    // We expect at least one of the significant terms to have been selected on the basis
    // that it is present in the foreground selection but entirely missing from the filtered
    // background used as context.
    boolean hasMissingBackgroundTerms = false;
    for (Bucket topTerm : topTerms) {
        if (topTerm.getSupersetDf() == 0) {
            hasMissingBackgroundTerms = true;
            break;
        }
    }
    assertTrue(hasMissingBackgroundTerms);
}
Also used : SignificantTerms(org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms) Bucket(org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms.Bucket) TermQueryBuilder(org.elasticsearch.index.query.TermQueryBuilder) SearchResponse(org.elasticsearch.action.search.SearchResponse) ElasticsearchAssertions.assertSearchResponse(org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse)
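
Example 10 relies on a term being scored even though the filtered background contains no documents with it (getSupersetDf() == 0). The sketch below shows one way a JLH-style score, understood here as the absolute change in term probability multiplied by the relative change, could stay finite in that case. The class name JlhScoreSketch, the parameter names, and the choice to treat a missing background term as occurring once are assumptions made for illustration, not a copy of the Elasticsearch source.

```java
// Hypothetical stand-alone sketch; not the Elasticsearch implementation.
public final class JlhScoreSketch {

    private JlhScoreSketch() {
    }

    /**
     * @param subsetFreq   foreground documents containing the term
     * @param subsetSize   total foreground documents
     * @param supersetFreq background documents containing the term (may be 0 with a poor filter)
     * @param supersetSize total background documents
     */
    public static double score(long subsetFreq, long subsetSize, long supersetFreq, long supersetSize) {
        if (subsetSize == 0 || supersetSize == 0) {
            return 0.0; // nothing to compare against
        }
        // Assumption: a term missing from the background is treated as occurring once,
        // so the relative change below never divides by zero.
        long backgroundFreq = Math.max(supersetFreq, 1L);

        double foregroundProbability = (double) subsetFreq / (double) subsetSize;
        double backgroundProbability = (double) backgroundFreq / (double) supersetSize;

        double absoluteChange = foregroundProbability - backgroundProbability;
        if (absoluteChange <= 0) {
            return 0.0; // the term is no more common in the foreground than in the background
        }
        double relativeChange = foregroundProbability / backgroundProbability;
        return absoluteChange * relativeChange;
    }

    public static void main(String[] args) {
        // Term present in 2 of 4 foreground docs but absent from a 100-doc background.
        System.out.println(score(2, 4, 0, 100));
        // Term rarer in the foreground than in the background scores zero.
        System.out.println(score(1, 100, 50, 100));
    }
}
```

Under these assumptions a term that is absent from the background still receives a large positive score, which is consistent with the test expecting such a term among the top buckets.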

Aggregations

SearchResponse (org.elasticsearch.action.search.SearchResponse) 23
SignificantTerms (org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms) 23
ElasticsearchAssertions.assertSearchResponse (org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse) 23
TermQueryBuilder (org.elasticsearch.index.query.TermQueryBuilder) 14
Bucket (org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms.Bucket) 6
StringTerms (org.elasticsearch.search.aggregations.bucket.terms.StringTerms) 6
AggregationBuilders.significantTerms (org.elasticsearch.search.aggregations.AggregationBuilders.significantTerms) 5
Terms (org.elasticsearch.search.aggregations.bucket.terms.Terms) 5
Matchers.containsString (org.hamcrest.Matchers.containsString) 5
HashSet (java.util.HashSet) 4
Aggregation (org.elasticsearch.search.aggregations.Aggregation) 3
IncludeExclude (org.elasticsearch.search.aggregations.bucket.terms.support.IncludeExclude) 3
Aggregations (org.elasticsearch.search.aggregations.Aggregations) 2
InternalFilter (org.elasticsearch.search.aggregations.bucket.filter.InternalFilter) 2
ArrayList (java.util.ArrayList) 1
IndexRequestBuilder (org.elasticsearch.action.index.IndexRequestBuilder) 1
XContentBuilder (org.elasticsearch.common.xcontent.XContentBuilder) 1
ChiSquare (org.elasticsearch.search.aggregations.bucket.significant.heuristics.ChiSquare) 1
GND (org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND) 1
JLHScore (org.elasticsearch.search.aggregations.bucket.significant.heuristics.JLHScore) 1