Example 16 with TDigest

Use of com.facebook.presto.tdigest.TDigest in the presto project by prestodb.

Class TestTDigestFunctions, method testGetQuantileAtValueOutsideRange.

@Test
public void testGetQuantileAtValueOutsideRange() {
    TDigest tDigest = createTDigest(STANDARD_COMPRESSION_FACTOR);
    for (int i = 0; i < NUMBER_OF_ENTRIES; i++) {
        double value = Math.random() * NUMBER_OF_ENTRIES;
        tDigest.add(value);
    }
    // Serialize the digest once and reuse the hex literal in both assertions.
    String hex = new SqlVarbinary(tDigest.serialize().getBytes()).toString().replaceAll("\\s+", " ");
    // A value above every sample must map to quantile 1; a value below every sample must map to quantile 0.
    functionAssertions.assertFunction(format("quantile_at_value(CAST(X'%s' AS tdigest(%s)), %s) = 1", hex, DOUBLE, 1_000_000_000d), BOOLEAN, true);
    functionAssertions.assertFunction(format("quantile_at_value(CAST(X'%s' AS tdigest(%s)), %s) = 0", hex, DOUBLE, -500d), BOOLEAN, true);
}
Also used: TDigest (com.facebook.presto.tdigest.TDigest), TDigest.createTDigest (com.facebook.presto.tdigest.TDigest.createTDigest), SqlVarbinary (com.facebook.presto.common.type.SqlVarbinary), Test (org.testng.annotations.Test)
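
The test above checks only the two boundary cases of quantile_at_value. As a minimal sketch of that boundary behavior, the following uses an exact empirical CDF over a sorted array in place of a digest. The class QuantileAtValueSketch and its method are hypothetical illustrations and are not part of Presto; a real t-digest interpolates between centroids rather than counting samples.

```java
import java.util.Arrays;

// Hypothetical stand-in for a digest (an assumption for illustration):
// an exact empirical CDF over a sorted sample array.
final class QuantileAtValueSketch {
    private QuantileAtValueSketch() {}

    // Returns the fraction of samples <= value.
    // Values below the minimum give 0; values at or above the maximum give 1.
    static double quantileAtValue(double[] sortedSamples, double value) {
        if (sortedSamples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (value < sortedSamples[0]) {
            return 0.0;
        }
        if (value >= sortedSamples[sortedSamples.length - 1]) {
            return 1.0;
        }
        // Binary search for the number of samples <= value.
        int lo = 0;
        int hi = sortedSamples.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedSamples[mid] <= value) {
                lo = mid + 1;
            }
            else {
                hi = mid;
            }
        }
        return (double) lo / sortedSamples.length;
    }

    public static void main(String[] args) {
        double[] samples = new double[1000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = Math.random() * samples.length;
        }
        Arrays.sort(samples);
        // Same probe values as the test: far above and far below every sample.
        System.out.println(quantileAtValue(samples, 1_000_000_000d));
        System.out.println(quantileAtValue(samples, -500d));
    }
}
```

Because every sample lies in [0, 1000), the first probe is above the maximum and the second is below the minimum, so the two calls return 1.0 and 0.0 regardless of the random draw.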

Example 17 with TDigest

Use of com.facebook.presto.tdigest.TDigest in the presto project by prestodb.

Class TestTDigestFunctions, method testMergeManySmallNormalDistributions.

@Test
public void testMergeManySmallNormalDistributions() {
    TDigest tDigest = createTDigest(STANDARD_COMPRESSION_FACTOR);
    List<Double> list = new ArrayList<>();
    NormalDistribution normal = new NormalDistribution(500, 20);
    // Build many small digests (10 samples each) and merge each one into the combined digest.
    int digests = 100_000;
    for (int k = 0; k < digests; k++) {
        TDigest current = createTDigest(STANDARD_COMPRESSION_FACTOR);
        for (int i = 0; i < 10; i++) {
            double value = normal.sample();
            current.add(value);
            list.add(value);
        }
        tDigest.merge(current);
    }
    // Sort the raw samples so each estimated quantile can be compared against exact order statistics.
    sort(list);
    for (int i = 0; i < quantiles.length; i++) {
        assertContinuousQuantileWithinBound(quantiles[i], STANDARD_ERROR, list, tDigest);
    }
}
Also used: TDigest (com.facebook.presto.tdigest.TDigest), TDigest.createTDigest (com.facebook.presto.tdigest.TDigest.createTDigest), NormalDistribution (org.apache.commons.math3.distribution.NormalDistribution), ArrayList (java.util.ArrayList), Test (org.testng.annotations.Test)
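
The body of assertContinuousQuantileWithinBound is not shown in this listing. One plausible reading, offered only as an assumption, is a rank-error check: the estimate for quantile q must lie between the exact samples at ranks q - error and q + error in the sorted list. The sketch below implements that check in plain Java; QuantileBoundSketch, withinBound, and clampIndex are hypothetical names and do not come from Presto.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the Presto test utility: checks a quantile
// estimate against exact order statistics of a sorted sample.
final class QuantileBoundSketch {
    private QuantileBoundSketch() {}

    // Clamp a fractional rank to a valid index into a list of size n.
    private static int clampIndex(double rank, int n) {
        return (int) Math.max(0, Math.min(n - 1, rank));
    }

    // True if the estimate lies between the samples at ranks
    // (quantile - error) and (quantile + error) of the sorted list.
    static boolean withinBound(double quantile, double error, List<Double> sorted, double estimate) {
        int n = sorted.size();
        if (n == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double lower = sorted.get(clampIndex(Math.floor((quantile - error) * n), n));
        double upper = sorted.get(clampIndex(Math.ceil((quantile + error) * n), n));
        return estimate >= lower && estimate <= upper;
    }

    public static void main(String[] args) {
        List<Double> sorted = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            sorted.add((double) i);
        }
        // An estimate near the true median passes; one far from it fails.
        System.out.println(withinBound(0.5, 0.01, sorted, 500.0));
        System.out.println(withinBound(0.5, 0.01, sorted, 700.0));
    }
}
```

Bounding the error in rank rather than in value keeps the check meaningful for both wide and narrow distributions, which may be why the same helper appears in tests with very different variances.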

Example 18 with TDigest

Use of com.facebook.presto.tdigest.TDigest in the presto project by prestodb.

Class TDigestFunctions, method destructureTDigest.

@ScalarFunction(value = "destructure_tdigest", visibility = EXPERIMENTAL)
@Description("Return the raw TDigest, including arrays of centroid means and weights, as well as min, max, sum, count, and compression factor.")
@SqlType("row(centroid_means array(double), centroid_weights array(integer), compression double, min double, max double, sum double, count bigint)")
public static Block destructureTDigest(@SqlType("tdigest(double)") Slice input) {
    TDigest tDigest = createTDigest(input);
    BlockBuilder blockBuilder = TDIGEST_CENTROIDS_ROW_TYPE.createBlockBuilder(null, 1);
    BlockBuilder rowBuilder = blockBuilder.beginBlockEntry();
    // Centroid means / weights
    BlockBuilder meansBuilder = DOUBLE.createBlockBuilder(null, tDigest.centroidCount());
    BlockBuilder weightsBuilder = INTEGER.createBlockBuilder(null, tDigest.centroidCount());
    for (Centroid centroid : tDigest.centroids()) {
        // The row type declares centroid_weights as array(integer), so the weight is narrowed to int.
        int weight = (int) centroid.getWeight();
        DOUBLE.writeDouble(meansBuilder, centroid.getMean());
        INTEGER.writeLong(weightsBuilder, weight);
    }
    rowBuilder.appendStructure(meansBuilder);
    rowBuilder.appendStructure(weightsBuilder);
    // Compression, min, max, sum, count
    DOUBLE.writeDouble(rowBuilder, tDigest.getCompressionFactor());
    DOUBLE.writeDouble(rowBuilder, tDigest.getMin());
    DOUBLE.writeDouble(rowBuilder, tDigest.getMax());
    DOUBLE.writeDouble(rowBuilder, tDigest.getSum());
    BIGINT.writeLong(rowBuilder, (long) tDigest.getSize());
    blockBuilder.closeEntry();
    return TDIGEST_CENTROIDS_ROW_TYPE.getObject(blockBuilder, blockBuilder.getPositionCount() - 1);
}
Also used: Centroid (com.facebook.presto.tdigest.Centroid), TDigest.createTDigest (com.facebook.presto.tdigest.TDigest.createTDigest), TDigest (com.facebook.presto.tdigest.TDigest), BlockBuilder (com.facebook.presto.common.block.BlockBuilder), ScalarFunction (com.facebook.presto.spi.function.ScalarFunction), Description (com.facebook.presto.spi.function.Description), SqlType (com.facebook.presto.spi.function.SqlType)

Example 19 with TDigest

Use of com.facebook.presto.tdigest.TDigest in the presto project by prestodb.

Class TestTDigestFunctions, method testDestructureTDigestLarge.

@Test
public void testDestructureTDigestLarge() {
    TDigest tDigest = createTDigest(STANDARD_COMPRESSION_FACTOR);
    List<Double> values = new ArrayList<>();
    for (int i = 0; i < NUMBER_OF_ENTRIES; i++) {
        values.add((double) i);
    }
    values.forEach(tDigest::add);
    double compression = Double.valueOf(STANDARD_COMPRESSION_FACTOR);
    double min = values.stream().reduce(Double.POSITIVE_INFINITY, Double::min);
    double max = values.stream().reduce(Double.NEGATIVE_INFINITY, Double::max);
    double sum = values.stream().reduce(0.0d, Double::sum);
    long count = values.size();
    String sql = format("destructure_tdigest(CAST(X'%s' AS tdigest(%s)))", new SqlVarbinary(tDigest.serialize().getBytes()).toString().replaceAll("\\s+", " "), DOUBLE);
    functionAssertions.assertFunction(format("%s.compression", sql), DOUBLE, compression);
    functionAssertions.assertFunction(format("%s.min", sql), DOUBLE, min);
    functionAssertions.assertFunction(format("%s.max", sql), DOUBLE, max);
    functionAssertions.assertFunction(format("%s.sum", sql), DOUBLE, sum);
    functionAssertions.assertFunction(format("%s.count", sql), BIGINT, count);
}
Also used: TDigest (com.facebook.presto.tdigest.TDigest), TDigest.createTDigest (com.facebook.presto.tdigest.TDigest.createTDigest), ArrayList (java.util.ArrayList), SqlVarbinary (com.facebook.presto.common.type.SqlVarbinary), Test (org.testng.annotations.Test)

Example 20 with TDigest

Use of com.facebook.presto.tdigest.TDigest in the presto project by prestodb.

Class TestTDigestFunctions, method testNormalDistributionLowVariance.

@Test
public void testNormalDistributionLowVariance() {
    TDigest tDigest = createTDigest(STANDARD_COMPRESSION_FACTOR);
    List<Double> list = new ArrayList<>();
    NormalDistribution normal = new NormalDistribution(1000, 1);
    for (int i = 0; i < NUMBER_OF_ENTRIES; i++) {
        double value = normal.sample();
        tDigest.add(value);
        list.add(value);
    }
    sort(list);
    for (int i = 0; i < quantiles.length; i++) {
        assertContinuousQuantileWithinBound(quantiles[i], STANDARD_ERROR, list, tDigest);
    }
}
Also used: TDigest (com.facebook.presto.tdigest.TDigest), TDigest.createTDigest (com.facebook.presto.tdigest.TDigest.createTDigest), NormalDistribution (org.apache.commons.math3.distribution.NormalDistribution), ArrayList (java.util.ArrayList), Test (org.testng.annotations.Test)

Aggregations

TDigest (com.facebook.presto.tdigest.TDigest): 27
TDigest.createTDigest (com.facebook.presto.tdigest.TDigest.createTDigest): 27
Test (org.testng.annotations.Test): 21
ArrayList (java.util.ArrayList): 18
NormalDistribution (org.apache.commons.math3.distribution.NormalDistribution): 8
SqlVarbinary (com.facebook.presto.common.type.SqlVarbinary): 6
BlockBuilder (com.facebook.presto.common.block.BlockBuilder): 4
Description (com.facebook.presto.spi.function.Description): 4
ScalarFunction (com.facebook.presto.spi.function.ScalarFunction): 4
SqlType (com.facebook.presto.spi.function.SqlType): 4
Block (com.facebook.presto.common.block.Block): 1
ArrayType (com.facebook.presto.common.type.ArrayType): 1
DoubleType (com.facebook.presto.common.type.DoubleType): 1
Type (com.facebook.presto.common.type.Type): 1
Centroid (com.facebook.presto.tdigest.Centroid): 1
BinomialDistribution (org.apache.commons.math3.distribution.BinomialDistribution): 1
GeometricDistribution (org.apache.commons.math3.distribution.GeometricDistribution): 1
PoissonDistribution (org.apache.commons.math3.distribution.PoissonDistribution): 1