Example 1 with Family

use of org.apache.datasketches.Family in project sketches-core by DataSketches.

the class PreambleUtil method preambleToString.

/**
 * Returns a human readable string summary of the preamble state of the given Memory.
 * Note: other than making sure that the given Memory size is large
 * enough for just the preamble, this does not do much value checking of the contents of the
 * preamble as this is primarily a tool for debugging the preamble visually.
 *
 * @param mem the given Memory.
 * @return the summary preamble string.
 */
static String preambleToString(final Memory mem) {
    // make sure we can get the assumed preamble
    final int preLongs = getAndCheckPreLongs(mem);
    final Family family = Family.idToFamily(mem.getByte(FAMILY_BYTE));
    switch(family) {
        case RESERVOIR:
        case VAROPT:
            return sketchPreambleToString(mem, family, preLongs);
        case RESERVOIR_UNION:
        case VAROPT_UNION:
            return unionPreambleToString(mem, family, preLongs);
        default:
            throw new SketchesArgumentException("Inspecting preamble with Sampling family's PreambleUtil with object of family " + family.getFamilyName());
    }
}
Also used : SketchesArgumentException(org.apache.datasketches.SketchesArgumentException) Family(org.apache.datasketches.Family)
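
The dispatch pattern in Example 1 (map a stored id byte to an enum constant, then switch on it and reject anything unexpected) can be sketched without the library. This is a minimal illustration, not the DataSketches implementation: `DemoFamily`, its id values, and `describe` are hypothetical names invented here, and `IllegalArgumentException` stands in for `SketchesArgumentException`.

```java
import java.util.HashMap;
import java.util.Map;

class FamilyDispatchDemo {

    /** Hypothetical stand-in for a family enum; the ids are made up for illustration. */
    enum DemoFamily {
        RESERVOIR(1, "Reservoir"),
        VAROPT(2, "VarOpt"),
        RESERVOIR_UNION(3, "ReservoirUnion"),
        VAROPT_UNION(4, "VarOptUnion"),
        OTHER(5, "Other");

        // Lookup table built once, after all constants exist.
        private static final Map<Integer, DemoFamily> BY_ID = new HashMap<>();
        static {
            for (DemoFamily f : values()) {
                BY_ID.put(f.id, f);
            }
        }

        private final int id;
        private final String familyName;

        DemoFamily(final int id, final String familyName) {
            this.id = id;
            this.familyName = familyName;
        }

        String getFamilyName() {
            return familyName;
        }

        /** Resolves an id to a constant, failing loudly on unknown ids. */
        static DemoFamily idToFamily(final int id) {
            final DemoFamily f = BY_ID.get(id);
            if (f == null) {
                throw new IllegalArgumentException("Unknown family id: " + id);
            }
            return f;
        }
    }

    /** Groups families the same way the switch in Example 1 does. */
    static String describe(final int familyId) {
        final DemoFamily family = DemoFamily.idToFamily(familyId);
        switch (family) {
            case RESERVOIR:
            case VAROPT:
                return "sketch:" + family.getFamilyName();
            case RESERVOIR_UNION:
            case VAROPT_UNION:
                return "union:" + family.getFamilyName();
            default:
                throw new IllegalArgumentException(
                    "Unsupported family: " + family.getFamilyName());
        }
    }

    public static void main(final String[] args) {
        System.out.println(describe(2));
        System.out.println(describe(3));
    }
}
```

Keeping the `default` branch as a hard failure means a newly added family cannot silently fall through to the wrong formatter.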

Example 2 with Family

use of org.apache.datasketches.Family in project sketches-core by DataSketches.

the class HeapQuickSelectSketch method heapifyInstance.

/**
 * Heapify a sketch from a Memory UpdateSketch or Union object
 * containing sketch data.
 * @param srcMem The source Memory object.
 * <a href="{@docRoot}/resources/dictionary.html#mem">See Memory</a>
 * @param seed <a href="{@docRoot}/resources/dictionary.html#seed">See seed</a>
 * @return instance of this sketch
 */
static HeapQuickSelectSketch heapifyInstance(final Memory srcMem, final long seed) {
    // byte 0
    final int preambleLongs = extractPreLongs(srcMem);
    // byte 3
    final int lgNomLongs = extractLgNomLongs(srcMem);
    // byte 4
    final int lgArrLongs = extractLgArrLongs(srcMem);
    checkUnionQuickSelectFamily(srcMem, preambleLongs, lgNomLongs);
    checkMemIntegrity(srcMem, seed, preambleLongs, lgNomLongs, lgArrLongs);
    // bytes 12-15
    final float p = extractP(srcMem);
    // byte 0
    final int memlgRF = extractLgResizeFactor(srcMem);
    ResizeFactor memRF = ResizeFactor.getRF(memlgRF);
    final int familyID = extractFamilyID(srcMem);
    final Family family = Family.idToFamily(familyID);
    if (isResizeFactorIncorrect(srcMem, lgNomLongs, lgArrLongs)) {
        // X2 always works.
        memRF = ResizeFactor.X2;
    }
    final HeapQuickSelectSketch hqss = new HeapQuickSelectSketch(lgNomLongs, seed, p, memRF, preambleLongs, family);
    hqss.lgArrLongs_ = lgArrLongs;
    hqss.hashTableThreshold_ = setHashTableThreshold(lgNomLongs, lgArrLongs);
    hqss.curCount_ = extractCurCount(srcMem);
    hqss.thetaLong_ = extractThetaLong(srcMem);
    hqss.empty_ = PreambleUtil.isEmptyFlag(srcMem);
    hqss.cache_ = new long[1 << lgArrLongs];
    // read in as hash table
    srcMem.getLongArray(preambleLongs << 3, hqss.cache_, 0, 1 << lgArrLongs);
    return hqss;
}
Also used : Family(org.apache.datasketches.Family) PreambleUtil.extractLgResizeFactor(org.apache.datasketches.theta.PreambleUtil.extractLgResizeFactor) ResizeFactor(org.apache.datasketches.ResizeFactor)
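
The last read in Example 2 relies on two shifts: `preambleLongs << 3` converts a count of 8-byte longs into a byte offset, and `1 << lgArrLongs` converts a base-2 log into an array length. The sketch below shows the same arithmetic against a plain `java.nio.ByteBuffer` instead of `Memory`. The class and method names are hypothetical, and the little-endian byte order is an assumption made only for this demo.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class HashTableReadDemo {

    /** Copies 2^lgArrLongs longs that start right after preambleLongs 8-byte words. */
    static long[] readTable(final ByteBuffer image, final int preambleLongs,
            final int lgArrLongs) {
        final int offsetBytes = preambleLongs << 3; // longs to bytes (multiply by 8)
        final int arrLongs = 1 << lgArrLongs;       // log2 size to element count
        final long[] cache = new long[arrLongs];
        for (int i = 0; i < arrLongs; i++) {
            cache[i] = image.getLong(offsetBytes + (i << 3)); // absolute read
        }
        return cache;
    }

    /** Builds a toy image: a zeroed preamble followed by the table contents. */
    static ByteBuffer buildImage(final int preambleLongs, final long[] table) {
        final ByteBuffer buf = ByteBuffer
            .allocate((preambleLongs + table.length) << 3)
            .order(ByteOrder.LITTLE_ENDIAN); // assumed order for this demo
        for (int i = 0; i < table.length; i++) {
            buf.putLong((preambleLongs + i) << 3, table[i]);
        }
        return buf;
    }

    public static void main(final String[] args) {
        final ByteBuffer image = buildImage(3, new long[] {10L, 20L, 30L, 40L});
        final long[] table = readTable(image, 3, 2);
        System.out.println(java.util.Arrays.toString(table));
    }
}
```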

Example 3 with Family

use of org.apache.datasketches.Family in project sketches-core by DataSketches.

the class Sketch method heapify.

// public static factory constructor-type methods
/**
 * Heapify takes the sketch image in Memory and instantiates an on-heap Sketch.
 *
 * <p>The resulting sketch will not retain any link to the source Memory.</p>
 *
 * <p>For Update Sketches this method checks if the
 * <a href="{@docRoot}/resources/dictionary.html#defaultUpdateSeed">Default Update Seed</a>
 * was used to create the source Memory image.</p>
 *
 * <p>For Compact Sketches this method assumes that the sketch image was created with the
 * correct hash seed, so it is not checked.</p>
 *
 * @param srcMem an image of a Sketch.
 * <a href="{@docRoot}/resources/dictionary.html#mem">See Memory</a>.
 * @return a Sketch on the heap.
 */
public static Sketch heapify(final Memory srcMem) {
    final byte familyID = srcMem.getByte(FAMILY_BYTE);
    final Family family = idToFamily(familyID);
    if (family == Family.COMPACT) {
        return CompactSketch.heapify(srcMem);
    }
    return heapifyUpdateFromMemory(srcMem, DEFAULT_UPDATE_SEED);
}
Also used : Family.idToFamily(org.apache.datasketches.Family.idToFamily) Family(org.apache.datasketches.Family)

Example 4 with Family

use of org.apache.datasketches.Family in project sketches-core by DataSketches.

the class Sketch method wrap.

/**
 * Wrap takes the sketch image in the given Memory and refers to it directly.
 * There is no data copying onto the java heap.
 * The wrap operation enables fast read-only merging and access to all the public read-only API.
 *
 * <p>Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have
 * been explicitly stored as direct sketches can be wrapped.
 * Wrapping earlier serial version sketches will result in an on-heap CompactSketch
 * where all data will be copied to the heap. These early versions were never designed to
 * "wrap".</p>
 *
 * <p>Wrapping any subclass of this class that is empty or contains only a single item will
 * result in on-heap equivalent forms of empty and single item sketch respectively.
 * This is actually faster and consumes less overall memory.</p>
 *
 * <p>For Update and Compact Sketches this method checks if the given expectedSeed was used to
 * create the source Memory image.  However, SerialVersion 1 sketches cannot be checked.</p>
 *
 * @param srcMem an image of a Sketch.
 * <a href="{@docRoot}/resources/dictionary.html#mem">See Memory</a>
 * @param expectedSeed the seed used to validate the given Memory image.
 * <a href="{@docRoot}/resources/dictionary.html#seed">See Update Hash Seed</a>.
 * @return a Sketch backed by the given Memory, except as noted above.
 */
public static Sketch wrap(final Memory srcMem, final long expectedSeed) {
    final int preLongs = srcMem.getByte(PREAMBLE_LONGS_BYTE) & 0X3F;
    final int serVer = srcMem.getByte(SER_VER_BYTE) & 0XFF;
    final int familyID = srcMem.getByte(FAMILY_BYTE) & 0XFF;
    final Family family = Family.idToFamily(familyID);
    if (family == Family.QUICKSELECT) {
        if (serVer == 3 && preLongs == 3) {
            return DirectQuickSelectSketchR.readOnlyWrap(srcMem, expectedSeed);
        } else {
            throw new SketchesArgumentException("Corrupted: " + family + " family image: must have SerVer = 3 and preLongs = 3");
        }
    }
    if (family == Family.COMPACT) {
        return CompactSketch.wrap(srcMem, expectedSeed);
    }
    throw new SketchesArgumentException("Cannot wrap family: " + family + " as a Sketch");
}
Also used : SketchesArgumentException(org.apache.datasketches.SketchesArgumentException) Family.idToFamily(org.apache.datasketches.Family.idToFamily) Family(org.apache.datasketches.Family)
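
The masks at the top of `wrap` matter because Java's `byte` is signed: without `& 0XFF`, any stored value of 128 or more would read back as a negative int. The `& 0X3F` mask keeps only the low six bits of the first preamble byte. The sketch below illustrates both. The class and method names are hypothetical, and treating the two remaining high bits as a separate field is an assumption based on Example 2's comments, which mark both `extractPreLongs` and `extractLgResizeFactor` as reading byte 0.

```java
class PreambleByteDemo {

    /** Reads a byte as an unsigned value in the range 0..255. */
    static int unsignedByte(final byte b) {
        return b & 0xFF;
    }

    /** Keeps bits 0..5, the mask used for preLongs in the examples. */
    static int lowSixBits(final byte b) {
        return b & 0x3F;
    }

    /** Extracts bits 6..7; assumed here to hold a second, 2-bit field. */
    static int topTwoBits(final byte b) {
        // b is sign-extended to int before the shift, so mask after shifting.
        return (b >>> 6) & 0x3;
    }

    public static void main(final String[] args) {
        final byte packed = (byte) 0xC3; // binary 1100_0011
        System.out.println("raw      = " + packed);
        System.out.println("unsigned = " + unsignedByte(packed));
        System.out.println("low six  = " + lowSixBits(packed));
        System.out.println("top two  = " + topTwoBits(packed));
    }
}
```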

Example 5 with Family

use of org.apache.datasketches.Family in project sketches-core by DataSketches.

the class Sketch method wrap.

/**
 * Wrap takes the sketch image in the given Memory and refers to it directly.
 * There is no data copying onto the java heap.
 * The wrap operation enables fast read-only merging and access to all the public read-only API.
 *
 * <p>Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have
 * been explicitly stored as direct sketches can be wrapped.
 * Wrapping earlier serial version sketches will result in an on-heap CompactSketch
 * where all data will be copied to the heap. These early versions were never designed to
 * "wrap".</p>
 *
 * <p>Wrapping any subclass of this class that is empty or contains only a single item will
 * result in on-heap equivalent forms of empty and single item sketch respectively.
 * This is actually faster and consumes less overall memory.</p>
 *
 * <p>For Update Sketches this method checks if the
 * <a href="{@docRoot}/resources/dictionary.html#defaultUpdateSeed">Default Update Seed</a>
 * was used to create the source Memory image.</p>
 *
 * <p>For Compact Sketches this method assumes that the sketch image was created with the
 * correct hash seed, so it is not checked.</p>
 *
 * @param srcMem an image of a Sketch.
 * <a href="{@docRoot}/resources/dictionary.html#mem">See Memory</a>.
 * @return a Sketch backed by the given Memory
 */
public static Sketch wrap(final Memory srcMem) {
    final int preLongs = srcMem.getByte(PREAMBLE_LONGS_BYTE) & 0X3F;
    final int serVer = srcMem.getByte(SER_VER_BYTE) & 0XFF;
    final int familyID = srcMem.getByte(FAMILY_BYTE) & 0XFF;
    final Family family = Family.idToFamily(familyID);
    if (family == Family.QUICKSELECT) {
        if (serVer == 3 && preLongs == 3) {
            return DirectQuickSelectSketchR.readOnlyWrap(srcMem, DEFAULT_UPDATE_SEED);
        } else {
            throw new SketchesArgumentException("Corrupted: " + family + " family image: must have SerVer = 3 and preLongs = 3");
        }
    }
    if (family == Family.COMPACT) {
        return CompactSketch.wrap(srcMem);
    }
    throw new SketchesArgumentException("Cannot wrap family: " + family + " as a Sketch");
}
Also used : SketchesArgumentException(org.apache.datasketches.SketchesArgumentException) Family.idToFamily(org.apache.datasketches.Family.idToFamily) Family(org.apache.datasketches.Family)

Aggregations

Family (org.apache.datasketches.Family) 25
SketchesArgumentException (org.apache.datasketches.SketchesArgumentException) 14
Family.idToFamily (org.apache.datasketches.Family.idToFamily) 12
ResizeFactor (org.apache.datasketches.ResizeFactor) 4
WritableMemory (org.apache.datasketches.memory.WritableMemory) 2
Test (org.testng.annotations.Test) 2
DefaultMemoryRequestServer (org.apache.datasketches.memory.DefaultMemoryRequestServer) 1
Memory (org.apache.datasketches.memory.Memory) 1
MemoryRequestServer (org.apache.datasketches.memory.MemoryRequestServer) 1
PreambleUtil.extractLgResizeFactor (org.apache.datasketches.theta.PreambleUtil.extractLgResizeFactor) 1
PreambleUtil.insertLgResizeFactor (org.apache.datasketches.theta.PreambleUtil.insertLgResizeFactor) 1