use of com.yahoo.sketches.theta.CompactSketch in project sketches-core by DataSketches.
the class BoundsOnRatiosInThetaSketchedSetsTest method checkNormalReturns.
@Test
public void checkNormalReturns() {
  // default sketch size: 4K nominal entries
  UpdateSketch skA = Sketches.updateSketchBuilder().build();
  UpdateSketch skC = Sketches.updateSketchBuilder().build();
  int uA = 10000;
  int uC = 100000;
  for (int i = 0; i < uA; i++) {
    skA.update(i);
  }
  // skC starts at uA / 2, so half of A's values also appear in C
  for (int i = 0; i < uC; i++) {
    skC.update(i + uA / 2);
  }
  Intersection inter = Sketches.setOperationBuilder().buildIntersection();
  inter.update(skA);
  inter.update(skC);
  CompactSketch skB = inter.getResult();
  double est = BoundsOnRatiosInThetaSketchedSets.getEstimateOfBoverA(skA, skB);
  double lb = BoundsOnRatiosInThetaSketchedSets.getLowerBoundForBoverA(skA, skB);
  double ub = BoundsOnRatiosInThetaSketchedSets.getUpperBoundForBoverA(skA, skB);
  assertTrue(ub > est);
  assertTrue(est > lb);
  assertEquals(est, 0.5, .03);
  println("ub : " + ub);
  println("est: " + est);
  println("lb : " + lb);
  // skA is now empty
  skA.reset();
  est = BoundsOnRatiosInThetaSketchedSets.getEstimateOfBoverA(skA, skB);
  lb = BoundsOnRatiosInThetaSketchedSets.getLowerBoundForBoverA(skA, skB);
  ub = BoundsOnRatiosInThetaSketchedSets.getUpperBoundForBoverA(skA, skB);
  println("ub : " + ub);
  println("est: " + est);
  println("lb : " + lb);
  // Now both are empty
  skC.reset();
  est = BoundsOnRatiosInThetaSketchedSets.getEstimateOfBoverA(skA, skC);
  lb = BoundsOnRatiosInThetaSketchedSets.getLowerBoundForBoverA(skA, skC);
  ub = BoundsOnRatiosInThetaSketchedSets.getUpperBoundForBoverA(skA, skC);
  println("ub : " + ub);
  println("est: " + est);
  println("lb : " + lb);
}
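As a cross-check on the `assertEquals(est, 0.5, .03)` line in the test above, the exact ratio that the sketch estimate approximates can be computed with plain JDK collections. This is only an illustrative sketch: the class `ExactRatioExample` and the method `exactRatioBoverA` are hypothetical names, not part of sketches-core, and the code uses exact sets rather than theta sketches.

```java
import java.util.HashSet;
import java.util.Set;

final class ExactRatioExample {

  // Hypothetical helper (not a sketches-core API): builds the exact sets that
  // the test streams into skA and skC and returns |A intersect C| / |A|.
  static double exactRatioBoverA(final int uA, final int uC) {
    final Set<Integer> a = new HashSet<>();
    final Set<Integer> c = new HashSet<>();
    for (int i = 0; i < uA; i++) {
      a.add(i);
    }
    for (int i = 0; i < uC; i++) {
      c.add(i + uA / 2); // same offset as in the test
    }
    final Set<Integer> b = new HashSet<>(a);
    b.retainAll(c); // exact intersection B = A intersect C
    return (double) b.size() / a.size();
  }

  public static void main(final String[] args) {
    System.out.println(exactRatioBoverA(10000, 100000)); // prints 0.5
  }
}
```

With uA = 10000 and uC = 100000, A covers [0, 10000) and C covers [5000, 105000), so exactly 5000 of A's 10000 values fall in the intersection.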
use of com.yahoo.sketches.theta.CompactSketch in project sketches-pig by DataSketches.
the class PigUtilTest method checkCompOrdSketchToTuple.
@Test(expectedExceptions = IllegalArgumentException.class)
public void checkCompOrdSketchToTuple() {
  UpdateSketch usk = UpdateSketch.builder().setNominalEntries(16).build();
  for (int i = 0; i < 16; i++) {
    usk.update(i);
  }
  // compact(false, null) requests an unordered compact sketch, which the
  // ordered-only conversion below is expected to reject
  CompactSketch csk = usk.compact(false, null);
  compactOrderedSketchToTuple(csk);
}
use of com.yahoo.sketches.theta.CompactSketch in project sketches-pig by DataSketches.
the class AexcludeB method exec.
// @formatter:off
/**
* Top Level Exec Function.
* <p>
* This method accepts a <b>Sketch AnotB Input Tuple</b> and returns a
* <b>Sketch Tuple</b>.
* </p>
*
* <b>Sketch AnotB Input Tuple</b>
* <ul>
* <li>Tuple: TUPLE (Must contain 2 fields): <br>
* Java data type: Pig DataType: Description
* <ul>
* <li>index 0: DataByteArray: BYTEARRAY: Sketch A</li>
* <li>index 1: DataByteArray: BYTEARRAY: Sketch B</li>
* </ul>
* </li>
* </ul>
*
* <p>
* Any other input tuple will throw an exception!
* </p>
*
* <b>Sketch Tuple</b>
* <ul>
* <li>Tuple: TUPLE (Contains exactly 1 field)
* <ul>
* <li>index 0: DataByteArray: BYTEARRAY = The serialization of a Sketch object.</li>
* </ul>
* </li>
* </ul>
*
* @throws IOException from Pig.
*/
// @formatter:on
// TOP LEVEL EXEC
@Override
public Tuple exec(final Tuple inputTuple) throws IOException {
  // The exec is a stateless function. It operates on the input and returns a result.
  // It can only call static functions.
  final Object objA = extractFieldAtIndex(inputTuple, 0);
  Sketch sketchA = null;
  if (objA != null) {
    final DataByteArray dbaA = (DataByteArray) objA;
    final Memory srcMem = Memory.wrap(dbaA.get());
    sketchA = Sketch.wrap(srcMem, seed_);
  }
  final Object objB = extractFieldAtIndex(inputTuple, 1);
  Sketch sketchB = null;
  if (objB != null) {
    final DataByteArray dbaB = (DataByteArray) objB;
    final Memory srcMem = Memory.wrap(dbaB.get());
    sketchB = Sketch.wrap(srcMem, seed_);
  }
  final AnotB aNOTb = SetOperation.builder().setSeed(seed_).buildANotB();
  aNOTb.update(sketchA, sketchB);
  final CompactSketch compactSketch = aNOTb.getResult(true, null);
  return compactOrderedSketchToTuple(compactSketch);
}
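The set operation that `AexcludeB` approximates with sketches can be illustrated on exact sets. The sketch below is hypothetical (the class `ExactANotBExample` and method `aNotB` are not part of sketches-pig) and it assumes that a missing input is treated as an empty set, mirroring how the method above leaves `sketchA` or `sketchB` as null when a field is absent.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

final class ExactANotBExample {

  // Hypothetical helper: exact A-and-not-B on plain sets.
  // Assumption: a null argument stands for an empty set.
  static Set<Long> aNotB(final Set<Long> a, final Set<Long> b) {
    final Set<Long> result =
        new TreeSet<>(a == null ? Collections.<Long>emptySet() : a);
    if (b != null) {
      result.removeAll(b); // drop every element of A that also appears in B
    }
    return result; // sorted, loosely analogous to requesting an ordered result
  }
}
```

Unlike the sketch-based UDF, this keeps every element in memory, so it is useful only for checking small cases by hand.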
use of com.yahoo.sketches.theta.CompactSketch in project sketches-pig by DataSketches.
the class DataToSketch method exec.
// @formatter:off
/**
***********************************************************************************************
* Top-level exec function.
* This method accepts an input Tuple containing a Bag of one or more inner <b>Datum Tuples</b>
* and returns a single updated <b>Sketch</b> as a <b>Sketch Tuple</b>.
*
* <p>If a large number of calls is anticipated, leveraging either the <i>Algebraic</i> or
* <i>Accumulator</i> interfaces is recommended. Pig normally handles this automatically.
*
* <p>Internally, this method presents the inner <b>Datum Tuples</b> to a new <b>Sketch</b>,
* which is returned as a <b>Sketch Tuple</b>
*
* <p><b>Input Tuple</b>
* <ul>
* <li>Tuple: TUPLE (Must contain only one field)
* <ul>
* <li>index 0: DataBag: BAG (May contain 0 or more Inner Tuples)
* <ul>
* <li>index 0: Tuple: TUPLE <b>Datum Tuple</b></li>
* <li>...</li>
* <li>index n-1: Tuple: TUPLE <b>Datum Tuple</b></li>
* </ul>
* </li>
* </ul>
* </li>
* </ul>
*
* <b>Datum Tuple</b>
* <ul>
* <li>Tuple: TUPLE (Must contain only one field)
* <ul>
* <li>index 0: Java data type : Pig DataType: may be any one of:
* <ul>
* <li>Byte: BYTE</li>
* <li>Integer: INTEGER</li>
* <li>Long: LONG</li>
* <li>Float: FLOAT</li>
* <li>Double: DOUBLE</li>
* <li>String: CHARARRAY</li>
* <li>DataByteArray: BYTEARRAY</li>
* </ul>
* </li>
* </ul>
* </li>
* </ul>
*
* <b>Sketch Tuple</b>
* <ul>
* <li>Tuple: TUPLE (Contains exactly 1 field)
* <ul>
* <li>index 0: DataByteArray: BYTEARRAY = The serialization of a Sketch object.</li>
* </ul>
* </li>
* </ul>
*
* @param inputTuple A tuple containing a single bag, containing Datum Tuples.
* @return Sketch Tuple. If inputTuple is null or empty, returns empty sketch (8 bytes).
* @see "org.apache.pig.EvalFunc.exec(org.apache.pig.data.Tuple)"
* @throws IOException from Pig.
*/
// @formatter:on
// TOP LEVEL EXEC
@Override
public Tuple exec(final Tuple inputTuple) throws IOException {
  // throws is in API
  // The exec is a stateless function. It operates on the input and returns a result.
  // It can only call static functions.
  final Union union = newUnion(nomEntries_, p_, seed_);
  final DataBag bag = extractBag(inputTuple);
  if (bag == null) {
    // Configured with parent
    return emptyCompactOrderedSketchTuple_;
  }
  // updates union with all elements of the bag
  updateUnion(bag, union);
  final CompactSketch compOrdSketch = union.getResult(true, null);
  return compactOrderedSketchToTuple(compOrdSketch);
}
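The Javadoc above lists the Java types a Datum Tuple field may hold. One way to picture how such values could be routed to a sketch update is a small type dispatch. The following is a sketch under stated assumptions, not the sketches-pig implementation: `DatumKindExample`, `Kind`, and `kindOf` are hypothetical names, `byte[]` stands in for Pig's `DataByteArray`, and the grouping of integral types into one kind and floating-point types into another is an assumption made for illustration.

```java
final class DatumKindExample {

  // Hypothetical categories for the kind of update a datum would trigger.
  enum Kind { LONG, DOUBLE, STRING, BYTES }

  // Hypothetical helper: classifies a datum by its Java type.
  // Assumption: Byte, Integer and Long share one kind; Float and Double another.
  static Kind kindOf(final Object datum) {
    if (datum instanceof Byte || datum instanceof Integer || datum instanceof Long) {
      return Kind.LONG;
    }
    if (datum instanceof Float || datum instanceof Double) {
      return Kind.DOUBLE;
    }
    if (datum instanceof String) {
      return Kind.STRING;
    }
    if (datum instanceof byte[]) { // stand-in for DataByteArray
      return Kind.BYTES;
    }
    final String type = (datum == null) ? "null" : datum.getClass().getName();
    throw new IllegalArgumentException("Unsupported datum type: " + type);
  }
}
```

For example, `DatumKindExample.kindOf(42)` yields `Kind.LONG` and `DatumKindExample.kindOf("abc")` yields `Kind.STRING`; any type outside the documented list is rejected.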
use of com.yahoo.sketches.theta.CompactSketch in project sketches-pig by DataSketches.
the class PigUtil method emptySketchTuple.
/**
* Return an empty Compact Ordered Sketch Tuple. Empty sketch is only 8 bytes.
* @param seed the given seed
* @return an empty compact ordered sketch tuple
*/
static final Tuple emptySketchTuple(final long seed) {
  final UpdateSketch sketch = UpdateSketch.builder()
      .setSeed(seed)
      .setResizeFactor(RF)
      .setNominalEntries(16)
      .build();
  final CompactSketch compOrdSketch = sketch.compact(true, null);
  return compactOrderedSketchToTuple(compOrdSketch);
}