
Example 21 with UpdateSketch

use of org.apache.datasketches.theta.UpdateSketch in project sketches-core by DataSketches.

the class JaccardSimilarityTest method checkMinK2.

@Test
public void checkMinK2() {
    // skA is a tuple sketch (nominal entries: 4096)
    final UpdatableSketch<Double, DoubleSummary> skA = tupleBldr.build();
    // skB is a theta sketch (nominal entries: 4096)
    final UpdateSketch skB = UpdateSketch.builder().build();
    skA.update(1, constSummary);
    skB.update(1);
    double[] result = jaccard(skA, skB, factory.newSummary(), dsso);
    println(result[0] + ", " + result[1] + ", " + result[2]);
    for (int i = 1; i < 4096; i++) {
        skA.update(i, constSummary);
        skB.update(i);
    }
    result = jaccard(skA, skB, factory.newSummary(), dsso);
    println(result[0] + ", " + result[1] + ", " + result[2]);
}
Also used : DoubleSummary(org.apache.datasketches.tuple.adouble.DoubleSummary) UpdateSketch(org.apache.datasketches.theta.UpdateSketch) JaccardSimilarity.similarityTest(org.apache.datasketches.tuple.JaccardSimilarity.similarityTest) Test(org.testng.annotations.Test) JaccardSimilarity.dissimilarityTest(org.apache.datasketches.tuple.JaccardSimilarity.dissimilarityTest)
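
For comparison with the sketch-based result in Example 21, the following is a minimal sketch of the exact Jaccard index computed on plain Java sets. It does not use the DataSketches API; the class name ExactJaccard and the convention that two empty sets give 1.0 are assumptions made for this illustration only. Since both sketches in checkMinK2 are fed the same keys, the exact value is 1.0.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of sketches-core.
final class ExactJaccard {

    // Exact Jaccard index |A ∩ B| / |A ∪ B|.
    // Assumption of this example: two empty sets are treated as identical (1.0).
    static double jaccard(final Set<Integer> a, final Set<Integer> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        final Set<Integer> inter = new HashSet<>(a);
        inter.retainAll(b);
        final Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        return (double) inter.size() / union.size();
    }

    public static void main(final String[] args) {
        // Same key range as the loop in checkMinK2: both sets receive 1..4095.
        final Set<Integer> a = new HashSet<>();
        final Set<Integer> b = new HashSet<>();
        for (int i = 1; i < 4096; i++) {
            a.add(i);
            b.add(i);
        }
        System.out.println(jaccard(a, b)); // prints 1.0
    }
}
```

The sketch-based jaccard call in the test returns three values rather than one; the exact value above is only a reference point for what those values should bracket.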

Example 22 with UpdateSketch

use of org.apache.datasketches.theta.UpdateSketch in project sketches-core by DataSketches.

the class BoundsOnRatiosInTupleSketchedSetsTest method checkAbnormalReturns2.

@Test(expectedExceptions = SketchesArgumentException.class)
public void checkAbnormalReturns2() {
    // skA is a tuple sketch (nominal entries: 4K); skC is a theta sketch
    final UpdatableSketch<Double, DoubleSummary> skA = tupleBldr.build();
    final UpdateSketch skC = thetaBldr.build();
    final int uA = 100000;
    final int uC = 10000;
    for (int i = 0; i < uA; i++) {
        skA.update(i, constSummary);
    }
    for (int i = 0; i < uC; i++) {
        skC.update(i + (uA / 2));
    }
    BoundsOnRatiosInTupleSketchedSets.getEstimateOfBoverA(skA, skC);
}
Also used : DoubleSummary(org.apache.datasketches.tuple.adouble.DoubleSummary) UpdateSketch(org.apache.datasketches.theta.UpdateSketch) Test(org.testng.annotations.Test)

Example 23 with UpdateSketch

use of org.apache.datasketches.theta.UpdateSketch in project sketches-core by DataSketches.

the class BoundsOnRatiosInTupleSketchedSetsTest method checkNormalReturns2.

@Test
public void checkNormalReturns2() {
    // skA is a tuple sketch (nominal entries: 4K); skC is a theta sketch
    final UpdatableSketch<Double, DoubleSummary> skA = tupleBldr.build();
    final UpdateSketch skC = thetaBldr.build();
    final int uA = 10000;
    final int uC = 100000;
    for (int i = 0; i < uA; i++) {
        skA.update(i, constSummary);
    }
    for (int i = 0; i < uC; i++) {
        skC.update(i + (uA / 2));
    }
    final Intersection<DoubleSummary> inter = new Intersection<>(dsso);
    inter.intersect(skA);
    inter.intersect(skC, factory.newSummary());
    final Sketch<DoubleSummary> skB = inter.getResult();
    double est = BoundsOnRatiosInTupleSketchedSets.getEstimateOfBoverA(skA, skB);
    double lb = BoundsOnRatiosInTupleSketchedSets.getLowerBoundForBoverA(skA, skB);
    double ub = BoundsOnRatiosInTupleSketchedSets.getUpperBoundForBoverA(skA, skB);
    assertTrue(ub > est);
    assertTrue(est > lb);
    assertEquals(est, 0.5, .03);
    println("ub : " + ub);
    println("est: " + est);
    println("lb : " + lb);
    // Reset skA so that it is empty
    skA.reset();
    est = BoundsOnRatiosInTupleSketchedSets.getEstimateOfBoverA(skA, skB);
    lb = BoundsOnRatiosInTupleSketchedSets.getLowerBoundForBoverA(skA, skB);
    ub = BoundsOnRatiosInTupleSketchedSets.getUpperBoundForBoverA(skA, skB);
    println("ub : " + ub);
    println("est: " + est);
    println("lb : " + lb);
    // Reset skC as well, so that both skA and skC are empty
    skC.reset();
    est = BoundsOnRatiosInTupleSketchedSets.getEstimateOfBoverA(skA, skC);
    lb = BoundsOnRatiosInTupleSketchedSets.getLowerBoundForBoverA(skA, skC);
    ub = BoundsOnRatiosInTupleSketchedSets.getUpperBoundForBoverA(skA, skC);
    println("ub : " + ub);
    println("est: " + est);
    println("lb : " + lb);
}
Also used : Intersection(org.apache.datasketches.tuple.Intersection) DoubleSummary(org.apache.datasketches.tuple.adouble.DoubleSummary) UpdateSketch(org.apache.datasketches.theta.UpdateSketch) Test(org.testng.annotations.Test)
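
The assertion assertEquals(est, 0.5, .03) in Example 23 follows from the key ranges used in the test. A minimal sketch with exact Java sets, assuming the same ranges, shows the arithmetic: A holds 0..9999, C holds 5000..104999, so B = A ∩ C has 5000 keys and |B| / |A| = 0.5. The class ExactRatioBoverA and its methods are hypothetical names for this illustration and are not part of the DataSketches API; rejecting an empty A is a choice of this example, not a statement about the library's behavior.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of sketches-core.
final class ExactRatioBoverA {

    // Builds the set {start, start + 1, ..., start + count - 1}.
    static Set<Integer> range(final int start, final int count) {
        final Set<Integer> s = new HashSet<>();
        for (int i = 0; i < count; i++) {
            s.add(start + i);
        }
        return s;
    }

    // Exact ratio |B| / |A| where B = A ∩ C.
    // Assumption of this example: an empty A is rejected.
    static double ratioBoverA(final Set<Integer> a, final Set<Integer> c) {
        if (a.isEmpty()) {
            throw new IllegalArgumentException("A must not be empty");
        }
        final Set<Integer> b = new HashSet<>(a);
        b.retainAll(c);
        return (double) b.size() / a.size();
    }

    public static void main(final String[] args) {
        final int uA = 10000;
        final int uC = 100000;
        final Set<Integer> a = range(0, uA);       // keys 0..9999
        final Set<Integer> c = range(uA / 2, uC);  // keys 5000..104999
        System.out.println(ratioBoverA(a, c));     // prints 0.5
    }
}
```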

Example 24 with UpdateSketch

use of org.apache.datasketches.theta.UpdateSketch in project druid by druid-io.

the class SketchAggregationTest method testRelocation.

@Test
public void testRelocation() {
    final TestColumnSelectorFactory columnSelectorFactory = GrouperTestUtil.newColumnSelectorFactory();
    SketchHolder sketchHolder = SketchHolder.of(Sketches.updateSketchBuilder().setNominalEntries(16).build());
    UpdateSketch updateSketch = (UpdateSketch) sketchHolder.getSketch();
    updateSketch.update(1);
    columnSelectorFactory.setRow(new MapBasedRow(0, ImmutableMap.of("sketch", sketchHolder)));
    SketchHolder[] holders = helper.runRelocateVerificationTest(new SketchMergeAggregatorFactory("sketch", "sketch", 16, false, true, 2), columnSelectorFactory, SketchHolder.class);
    Assert.assertEquals(holders[0].getEstimate(), holders[1].getEstimate(), 0);
}
Also used : MapBasedRow(org.apache.druid.data.input.MapBasedRow) TestColumnSelectorFactory(org.apache.druid.query.groupby.epinephelinae.TestColumnSelectorFactory) UpdateSketch(org.apache.datasketches.theta.UpdateSketch) GroupByQueryRunnerTest(org.apache.druid.query.groupby.GroupByQueryRunnerTest) Test(org.junit.Test)

Example 25 with UpdateSketch

use of org.apache.datasketches.theta.UpdateSketch in project druid by druid-io.

the class OldApiSketchAggregationTest method testRelocation.

@Test
public void testRelocation() {
    final TestColumnSelectorFactory columnSelectorFactory = GrouperTestUtil.newColumnSelectorFactory();
    SketchHolder sketchHolder = SketchHolder.of(Sketches.updateSketchBuilder().setNominalEntries(16).build());
    UpdateSketch updateSketch = (UpdateSketch) sketchHolder.getSketch();
    updateSketch.update(1);
    columnSelectorFactory.setRow(new MapBasedRow(0, ImmutableMap.of("sketch", sketchHolder)));
    SketchHolder[] holders = helper.runRelocateVerificationTest(new OldSketchMergeAggregatorFactory("sketch", "sketch", 16, false), columnSelectorFactory, SketchHolder.class);
    Assert.assertEquals(holders[0].getEstimate(), holders[1].getEstimate(), 0);
}
Also used : MapBasedRow(org.apache.druid.data.input.MapBasedRow) TestColumnSelectorFactory(org.apache.druid.query.groupby.epinephelinae.TestColumnSelectorFactory) SketchHolder(org.apache.druid.query.aggregation.datasketches.theta.SketchHolder) UpdateSketch(org.apache.datasketches.theta.UpdateSketch) GroupByQueryRunnerTest(org.apache.druid.query.groupby.GroupByQueryRunnerTest) InitializedNullHandlingTest(org.apache.druid.testing.InitializedNullHandlingTest) Test(org.junit.Test)

Aggregations

UpdateSketch (org.apache.datasketches.theta.UpdateSketch) 46
Test (org.testng.annotations.Test) 42
DoubleSummary (org.apache.datasketches.tuple.adouble.DoubleSummary) 12
AnotB (org.apache.datasketches.tuple.AnotB) 6
JaccardSimilarity.dissimilarityTest (org.apache.datasketches.tuple.JaccardSimilarity.dissimilarityTest) 6
JaccardSimilarity.similarityTest (org.apache.datasketches.tuple.JaccardSimilarity.similarityTest) 6
UpdateSketchBuilder (org.apache.datasketches.theta.UpdateSketchBuilder) 5
Intersection (org.apache.datasketches.tuple.Intersection) 4
MapBasedRow (org.apache.druid.data.input.MapBasedRow) 3
TestColumnSelectorFactory (org.apache.druid.query.groupby.epinephelinae.TestColumnSelectorFactory) 3
Test (org.junit.Test) 3
SketchesArgumentException (org.apache.datasketches.SketchesArgumentException) 2
IntegerSummary (org.apache.datasketches.tuple.aninteger.IntegerSummary) 2
GroupByQueryRunnerTest (org.apache.druid.query.groupby.GroupByQueryRunnerTest) 2
SketchesStateException (org.apache.datasketches.SketchesStateException) 1
CompactSketch (org.apache.datasketches.theta.CompactSketch) 1
Intersection (org.apache.datasketches.theta.Intersection) 1
Union (org.apache.datasketches.tuple.Union) 1
SketchHolder (org.apache.druid.query.aggregation.datasketches.theta.SketchHolder) 1
InitializedNullHandlingTest (org.apache.druid.testing.InitializedNullHandlingTest) 1