
Example 6 with GND

Use of org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND in project elasticsearch by elastic.

From the class SignificanceHeuristicTests, method testGNDCornerCases.

public void testGNDCornerCases() throws Exception {
    GND gnd = new GND(true);
    // The term is only in the subset and not at all in the other set, but only because the other set is empty.
    // This should not actually happen, because only terms that are in the subset are considered now.
    // In this case, however, the score should be 0, because a term that does not exist cannot be relevant.
    assertThat(gnd.getScore(0, randomIntBetween(1, 2), 0, randomIntBetween(2, 3)), equalTo(0.0));
    // the terms do not co-occur at all - should be 0
    assertThat(gnd.getScore(0, randomIntBetween(1, 2), randomIntBetween(2, 3), randomIntBetween(5, 6)), equalTo(0.0));
    // comparison between two terms that do not exist - probably not relevant
    assertThat(gnd.getScore(0, 0, 0, randomIntBetween(1, 2)), equalTo(0.0));
    // terms co-occur perfectly - should be 1
    assertThat(gnd.getScore(1, 1, 1, 1), equalTo(1.0));
    gnd = new GND(false);
    assertThat(gnd.getScore(0, 0, 0, 0), equalTo(0.0));
}
Also used : GND(org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND)
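The corner cases asserted above (no co-occurrence scores 0, perfect co-occurrence scores 1) can be illustrated with a minimal sketch of a score based on normalized Google distance. This is not the Elasticsearch implementation: the class name GndSketch, the method name score, and its parameters are hypothetical, and the two short-circuit checks are assumptions chosen to match the assertions in the test. Inputs are assumed to be consistent counts (fxy <= fx, fxy <= fy, fx <= n, fy <= n).

```java
// Hypothetical sketch, not the Elasticsearch GND class.
final class GndSketch {

    private GndSketch() {}

    /**
     * @param fx  number of documents that contain the term
     * @param fy  number of documents in the subset
     * @param fxy number of documents in the subset that contain the term
     * @param n   total number of documents
     * @return a score in [0, 1]; higher means the term and the subset co-occur more strongly
     */
    static double score(long fx, long fy, long fxy, long n) {
        if (fxy == 0) {
            // No co-occurrence (assumed corner case): also avoids log(0).
            return 0.0;
        }
        if (fx == fy && fx == fxy) {
            // Perfect co-occurrence (assumed corner case).
            return 1.0;
        }
        double logFx = Math.log(fx);
        double logFy = Math.log(fy);
        // Normalized Google distance: small for strongly related pairs.
        double distance = (Math.max(logFx, logFy) - Math.log(fxy))
                / (Math.log(n) - Math.min(logFx, logFy));
        // Invert the ordering so that related terms score high instead of low.
        return Math.exp(-distance);
    }
}
```

With this sketch, GndSketch.score(1, 1, 1, 1) returns 1.0 and any call with fxy equal to 0 returns 0.0, mirroring the assertions in the test.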

Example 7 with GND

Use of org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND in project elasticsearch by elastic.

From the class SignificanceHeuristicTests, method testAssertions.

public void testAssertions() throws Exception {
    testBackgroundAssertions(new MutualInformation(true, true), new MutualInformation(true, false));
    testBackgroundAssertions(new ChiSquare(true, true), new ChiSquare(true, false));
    testBackgroundAssertions(new GND(true), new GND(false));
    testAssertions(new PercentageScore());
    testAssertions(new JLHScore());
}
Also used : JLHScore(org.elasticsearch.search.aggregations.bucket.significant.heuristics.JLHScore), ChiSquare(org.elasticsearch.search.aggregations.bucket.significant.heuristics.ChiSquare), MutualInformation(org.elasticsearch.search.aggregations.bucket.significant.heuristics.MutualInformation), GND(org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND), PercentageScore(org.elasticsearch.search.aggregations.bucket.significant.heuristics.PercentageScore)

Example 8 with GND

Use of org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND in project elasticsearch by elastic.

From the class SignificantTermsSignificanceScoreIT, method testBackgroundVsSeparateSet.

public void testBackgroundVsSeparateSet() throws Exception {
    String type = randomBoolean() ? "text" : "long";
    String settings = "{\"index.number_of_shards\": 1, \"index.number_of_replicas\": 0}";
    SharedSignificantTermsTestMethods.index01Docs(type, settings, this);
    testBackgroundVsSeparateSet(new MutualInformation(true, true), new MutualInformation(true, false));
    testBackgroundVsSeparateSet(new ChiSquare(true, true), new ChiSquare(true, false));
    testBackgroundVsSeparateSet(new GND(true), new GND(false));
}
Also used : ChiSquare(org.elasticsearch.search.aggregations.bucket.significant.heuristics.ChiSquare), MutualInformation(org.elasticsearch.search.aggregations.bucket.significant.heuristics.MutualInformation), GND(org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND)
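The test above pairs each heuristic built with true against the same heuristic built with false. A plausible reading, which the snippets here do not state, is that the flag says whether the background set already includes the subset. Under that assumption, the following sketch shows how the four arguments could be turned into the counts a heuristic works with. The class CountsSketch, its fields, and the parameter name backgroundIsSuperset are hypothetical names for illustration, not the Elasticsearch API.

```java
// Hypothetical sketch: derive document counts from subset/background statistics.
final class CountsSketch {
    final long fx;   // documents that contain the term
    final long fy;   // documents in the subset
    final long fxy;  // documents in the subset that contain the term
    final long n;    // all documents

    private CountsSketch(long fx, long fy, long fxy, long n) {
        this.fx = fx;
        this.fy = fy;
        this.fxy = fxy;
        this.n = n;
    }

    static CountsSketch of(long subsetFreq, long subsetSize,
                           long supersetFreq, long supersetSize,
                           boolean backgroundIsSuperset) {
        if (backgroundIsSuperset) {
            // Background already contains the subset: use its totals directly.
            return new CountsSketch(supersetFreq, subsetSize, subsetFreq, supersetSize);
        }
        // Background is a separate set: totals are the sum of both sets.
        return new CountsSketch(supersetFreq + subsetFreq, subsetSize,
                subsetFreq, supersetSize + subsetSize);
    }
}
```

For example, CountsSketch.of(5, 20, 10, 100, true) gives n = 100 and fx = 10, while the same arguments with false give n = 120 and fx = 15, so the two variants should only produce equal scores when the inputs are adjusted accordingly.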

Aggregations

GND (org.elasticsearch.search.aggregations.bucket.significant.heuristics.GND): 8
ChiSquare (org.elasticsearch.search.aggregations.bucket.significant.heuristics.ChiSquare): 6
MutualInformation (org.elasticsearch.search.aggregations.bucket.significant.heuristics.MutualInformation): 6
JLHScore (org.elasticsearch.search.aggregations.bucket.significant.heuristics.JLHScore): 5
PercentageScore (org.elasticsearch.search.aggregations.bucket.significant.heuristics.PercentageScore): 3
SignificanceHeuristic (org.elasticsearch.search.aggregations.bucket.significant.heuristics.SignificanceHeuristic): 2
ArrayList (java.util.ArrayList): 1
TreeSet (java.util.TreeSet): 1
BytesRef (org.apache.lucene.util.BytesRef): 1
RegExp (org.apache.lucene.util.automaton.RegExp): 1
SearchResponse (org.elasticsearch.action.search.SearchResponse): 1
TermQueryBuilder (org.elasticsearch.index.query.TermQueryBuilder): 1
Script (org.elasticsearch.script.Script): 1
SearchModule (org.elasticsearch.search.SearchModule): 1
SignificantTerms (org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms): 1
SignificantTermsAggregationBuilder (org.elasticsearch.search.aggregations.bucket.significant.SignificantTermsAggregationBuilder): 1
ScriptHeuristic (org.elasticsearch.search.aggregations.bucket.significant.heuristics.ScriptHeuristic): 1
SignificanceHeuristicParser (org.elasticsearch.search.aggregations.bucket.significant.heuristics.SignificanceHeuristicParser): 1
IncludeExclude (org.elasticsearch.search.aggregations.bucket.terms.support.IncludeExclude): 1
ElasticsearchAssertions.assertSearchResponse (org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse): 1