Example 11 with ArrayOfDoublesSketch

use of com.yahoo.sketches.tuple.ArrayOfDoublesSketch in project sketches-pig by DataSketches.

the class UnionArrayOfDoublesSketchTest method accumulatorNotABag.

@Test
public void accumulatorNotABag() throws Exception {
    Accumulator<Tuple> func = new UnionArrayOfDoublesSketch();
    func.accumulate(PigUtil.objectsToTuple((Object) null));
    Tuple resultTuple = func.getValue();
    Assert.assertNotNull(resultTuple);
    Assert.assertEquals(resultTuple.size(), 1);
    DataByteArray bytes = (DataByteArray) resultTuple.get(0);
    Assert.assertTrue(bytes.size() > 0);
    ArrayOfDoublesSketch sketch = ArrayOfDoublesSketches.heapifySketch(Memory.wrap(bytes.get()));
    Assert.assertEquals(sketch.getEstimate(), 0.0);
}
Also used : DataByteArray(org.apache.pig.data.DataByteArray) Tuple(org.apache.pig.data.Tuple) ArrayOfDoublesSketch(com.yahoo.sketches.tuple.ArrayOfDoublesSketch) Test(org.testng.annotations.Test)

Example 12 with ArrayOfDoublesSketch

use of com.yahoo.sketches.tuple.ArrayOfDoublesSketch in project sketches-pig by DataSketches.

the class ArrayOfDoublesSketchToEstimateAndErrorBounds method exec.

@Override
public Tuple exec(final Tuple input) throws IOException {
    if ((input == null) || (input.size() == 0)) {
        return null;
    }
    final DataByteArray dba = (DataByteArray) input.get(0);
    final ArrayOfDoublesSketch sketch = ArrayOfDoublesSketches.wrapSketch(Memory.wrap(dba.get()));
    return TupleFactory.getInstance().newTuple(Arrays.asList(sketch.getEstimate(), sketch.getLowerBound(2), sketch.getUpperBound(2)));
}
Also used : DataByteArray(org.apache.pig.data.DataByteArray) ArrayOfDoublesSketch(com.yahoo.sketches.tuple.ArrayOfDoublesSketch)

Example 13 with ArrayOfDoublesSketch

use of com.yahoo.sketches.tuple.ArrayOfDoublesSketch in project sketches-pig by DataSketches.

the class ArrayOfDoublesSketchToEstimates method exec.

@Override
public Tuple exec(final Tuple input) throws IOException {
    if ((input == null) || (input.size() == 0)) {
        return null;
    }
    final DataByteArray dba = (DataByteArray) input.get(0);
    final ArrayOfDoublesSketch sketch = ArrayOfDoublesSketches.wrapSketch(Memory.wrap(dba.get()));
    final double[] estimates = new double[sketch.getNumValues() + 1];
    estimates[0] = sketch.getEstimate();
    if (sketch.getRetainedEntries() > 0) {
        // remove unnecessary check when version of sketches-core > 0.4.0
        final ArrayOfDoublesSketchIterator it = sketch.iterator();
        while (it.next()) {
            final double[] values = it.getValues();
            for (int i = 0; i < sketch.getNumValues(); i++) {
                estimates[i + 1] += values[i];
            }
        }
        // scale each column sum by 1/theta to estimate the total over the whole input
        for (int i = 0; i < sketch.getNumValues(); i++) {
            estimates[i + 1] /= sketch.getTheta();
        }
    }
    return Util.doubleArrayToTuple(estimates);
}
Also used : ArrayOfDoublesSketchIterator(com.yahoo.sketches.tuple.ArrayOfDoublesSketchIterator) DataByteArray(org.apache.pig.data.DataByteArray) ArrayOfDoublesSketch(com.yahoo.sketches.tuple.ArrayOfDoublesSketch)
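
The scaling step in the exec method above can be looked at in isolation. The following is a minimal sketch in plain Java with no sketches-core dependency: the class ThetaScalingSketch and the helper estimateColumnSums are hypothetical names introduced only for illustration, and the list of double[] rows stands in for what the sketch iterator would return. It assumes theta is the sampling fraction in (0, 1], so dividing a column sum over the retained entries by theta gives an estimate of that column's sum over the full input.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not part of sketches-pig or sketches-core.
final class ThetaScalingSketch {

    // Sums each value column over the retained entries, then divides by theta
    // to estimate the column total over the whole (unsampled) input.
    static double[] estimateColumnSums(final List<double[]> retainedValues,
                                       final int numValues,
                                       final double theta) {
        final double[] sums = new double[numValues];
        for (final double[] values : retainedValues) {
            for (int i = 0; i < numValues; i++) {
                sums[i] += values[i];
            }
        }
        for (int i = 0; i < numValues; i++) {
            sums[i] /= theta;
        }
        return sums;
    }

    public static void main(final String[] args) {
        // Two retained entries with two values each, sampled at theta = 0.5.
        final List<double[]> retained = Arrays.asList(
            new double[] {1.0, 10.0},
            new double[] {2.0, 20.0});
        // Column sums are 3.0 and 30.0; dividing by 0.5 gives 6.0 and 60.0.
        System.out.println(Arrays.toString(estimateColumnSums(retained, 2, 0.5)));
    }
}
```

When theta is 1.0 nothing was sampled out, and the result is simply the exact column sums of the retained entries.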

Example 14 with ArrayOfDoublesSketch

use of com.yahoo.sketches.tuple.ArrayOfDoublesSketch in project sketches-pig by DataSketches.

the class ArrayOfDoublesSketchToMeans method exec.

@Override
public Tuple exec(final Tuple input) throws IOException {
    if ((input == null) || (input.size() == 0)) {
        return null;
    }
    final DataByteArray dba = (DataByteArray) input.get(0);
    final ArrayOfDoublesSketch sketch = ArrayOfDoublesSketches.wrapSketch(Memory.wrap(dba.get()));
    if (sketch.getRetainedEntries() < 1) {
        return null;
    }
    final SummaryStatistics[] summaries = ArrayOfDoublesSketchStats.sketchToSummaryStatistics(sketch);
    final Tuple means = TupleFactory.getInstance().newTuple(sketch.getNumValues());
    for (int i = 0; i < sketch.getNumValues(); i++) {
        means.set(i, summaries[i].getMean());
    }
    return means;
}
Also used : SummaryStatistics(org.apache.commons.math3.stat.descriptive.SummaryStatistics) DataByteArray(org.apache.pig.data.DataByteArray) ArrayOfDoublesSketch(com.yahoo.sketches.tuple.ArrayOfDoublesSketch) Tuple(org.apache.pig.data.Tuple)
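
For comparison with the estimates example, here is a rough plain-Java sketch of what a per-column mean over retained entries amounts to. It is not the code behind ArrayOfDoublesSketchStats.sketchToSummaryStatistics, whose internals are not shown here; the class ColumnMeansSketch and the helper columnMeans are hypothetical, and the assumption is that each retained entry contributes its values with equal weight. Under that assumption the mean needs no division by theta, since it is already a ratio over the sample.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not part of sketches-pig or commons-math.
final class ColumnMeansSketch {

    // Returns the arithmetic mean of each value column over the retained entries,
    // or null when there are no entries (mirroring the null return of exec above).
    static double[] columnMeans(final List<double[]> retainedValues, final int numValues) {
        if (retainedValues.isEmpty()) {
            return null;
        }
        final double[] means = new double[numValues];
        for (final double[] values : retainedValues) {
            for (int i = 0; i < numValues; i++) {
                means[i] += values[i];
            }
        }
        for (int i = 0; i < numValues; i++) {
            means[i] /= retainedValues.size();
        }
        return means;
    }

    public static void main(final String[] args) {
        final List<double[]> retained = Arrays.asList(
            new double[] {1.0, 10.0},
            new double[] {3.0, 30.0});
        // Column means are (1 + 3) / 2 = 2.0 and (10 + 30) / 2 = 20.0.
        System.out.println(Arrays.toString(columnMeans(retained, 2)));
    }
}
```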

Example 15 with ArrayOfDoublesSketch

use of com.yahoo.sketches.tuple.ArrayOfDoublesSketch in project sketches-pig by DataSketches.

the class ArrayOfDoublesSketchToNumberOfRetainedEntries method exec.

@Override
public Integer exec(final Tuple input) throws IOException {
    if ((input == null) || (input.size() == 0)) {
        return null;
    }
    final DataByteArray dba = (DataByteArray) input.get(0);
    final ArrayOfDoublesSketch sketch = ArrayOfDoublesSketches.wrapSketch(Memory.wrap(dba.get()));
    return sketch.getRetainedEntries();
}
Also used : DataByteArray(org.apache.pig.data.DataByteArray) ArrayOfDoublesSketch(com.yahoo.sketches.tuple.ArrayOfDoublesSketch)

Aggregations

ArrayOfDoublesSketch (com.yahoo.sketches.tuple.ArrayOfDoublesSketch) 26
DataByteArray (org.apache.pig.data.DataByteArray) 26
Tuple (org.apache.pig.data.Tuple) 22
Test (org.testng.annotations.Test) 19
DataBag (org.apache.pig.data.DataBag) 12
ArrayOfDoublesUpdatableSketch (com.yahoo.sketches.tuple.ArrayOfDoublesUpdatableSketch) 8
ArrayOfDoublesUpdatableSketchBuilder (com.yahoo.sketches.tuple.ArrayOfDoublesUpdatableSketchBuilder) 8
SummaryStatistics (org.apache.commons.math3.stat.descriptive.SummaryStatistics) 3
ArrayOfDoublesSketchIterator (com.yahoo.sketches.tuple.ArrayOfDoublesSketchIterator) 2
DoublesSketchBuilder (com.yahoo.sketches.quantiles.DoublesSketchBuilder) 1
UpdateDoublesSketch (com.yahoo.sketches.quantiles.UpdateDoublesSketch) 1
Random (java.util.Random) 1
TTest (org.apache.commons.math3.stat.inference.TTest) 1