Example 1 with LongRBTreeSet

Use of it.unimi.dsi.fastutil.longs.LongRBTreeSet in project presto by prestodb.

From the class KHyperLogLog, method jaccardIndex.

public static double jaccardIndex(KHyperLogLog a, KHyperLogLog b) {
    int sizeOfSmallerSet = Math.min(a.minhash.size(), b.minhash.size());
    // Sorted union of the min-hash keys of both sketches.
    LongSortedSet minUnion = new LongRBTreeSet(a.minhash.keySet());
    minUnion.addAll(b.minhash.keySet());
    int intersection = 0;
    int i = 0;
    LongIterator iterator = minUnion.iterator();
    // Examine only the sizeOfSmallerSet smallest keys of the union.
    while (iterator.hasNext()) {
        long key = iterator.nextLong();
        if (a.minhash.containsKey(key) && b.minhash.containsKey(key)) {
            intersection++;
        }
        i++;
        if (i >= sizeOfSmallerSet) {
            break;
        }
    }
    // Fraction of the examined keys that occur in both sketches.
    return intersection / (double) sizeOfSmallerSet;
}
Also used: LongRBTreeSet (it.unimi.dsi.fastutil.longs.LongRBTreeSet), LongIterator (it.unimi.dsi.fastutil.longs.LongIterator), LongSortedSet (it.unimi.dsi.fastutil.longs.LongSortedSet)

Example 2 with LongRBTreeSet

Use of it.unimi.dsi.fastutil.longs.LongRBTreeSet in project presto by prestodb.

From the class SetDigest, method jaccardIndex.

public static double jaccardIndex(SetDigest a, SetDigest b) {
    int sizeOfSmallerSet = Math.min(a.minhash.size(), b.minhash.size());
    // Sorted union of the min-hash keys of both digests.
    LongSortedSet minUnion = new LongRBTreeSet(a.minhash.keySet());
    minUnion.addAll(b.minhash.keySet());
    int intersection = 0;
    int i = 0;
    // Iterates in ascending key order; stops after sizeOfSmallerSet keys.
    for (long key : minUnion) {
        if (a.minhash.containsKey(key) && b.minhash.containsKey(key)) {
            intersection++;
        }
        i++;
        if (i >= sizeOfSmallerSet) {
            break;
        }
    }
    // Fraction of the examined keys that occur in both digests.
    return intersection / (double) sizeOfSmallerSet;
}
Also used: LongRBTreeSet (it.unimi.dsi.fastutil.longs.LongRBTreeSet), LongSortedSet (it.unimi.dsi.fastutil.longs.LongSortedSet)
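
Standalone sketch of the shared logic

Both jaccardIndex methods above follow the same pattern: build a sorted union of the two min-hash key sets, walk its smallest keys, and count how many occur on both sides. The sketch below reproduces that pattern so it can be run without the Presto or fastutil dependencies. It is not the Presto code: java.util.TreeSet stands in for LongRBTreeSet, plain Set<Long> arguments stand in for the minhash maps, the class name MinHashJaccardSketch is hypothetical, and the early return for empty input is an added assumption (the snippets above would compute 0 / 0.0, which is NaN in Java).

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class; not part of Presto or fastutil.
class MinHashJaccardSketch {

    // Estimates Jaccard similarity from two sets of min-hash values.
    // Set<Long> stands in for the key sets of the minhash maps.
    static double jaccardIndex(Set<Long> a, Set<Long> b) {
        int sizeOfSmallerSet = Math.min(a.size(), b.size());
        if (sizeOfSmallerSet == 0) {
            // Assumption of this sketch: report 0.0 instead of NaN for empty input.
            return 0.0;
        }

        // TreeSet replaces LongRBTreeSet here; both iterate in ascending order.
        SortedSet<Long> minUnion = new TreeSet<>(a);
        minUnion.addAll(b);

        int intersection = 0;
        int examined = 0;
        for (long key : minUnion) {
            if (a.contains(key) && b.contains(key)) {
                intersection++;
            }
            examined++;
            // Stop once as many keys as the smaller input holds were examined.
            if (examined >= sizeOfSmallerSet) {
                break;
            }
        }
        return intersection / (double) sizeOfSmallerSet;
    }

    public static void main(String[] args) {
        Set<Long> a = Set.of(1L, 2L, 3L, 4L);
        Set<Long> b = Set.of(2L, 3L, 5L, 6L);
        // The four smallest union keys are 1, 2, 3, 4; two of them (2, 3) are shared.
        System.out.println(jaccardIndex(a, b)); // prints 0.5
    }
}
```

TreeSet boxes every value as a Long, which is the overhead the primitive-specialized LongRBTreeSet avoids; the swap is made here only to keep the sketch dependency-free.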

Aggregations

LongRBTreeSet (it.unimi.dsi.fastutil.longs.LongRBTreeSet): 2
LongSortedSet (it.unimi.dsi.fastutil.longs.LongSortedSet): 2
LongIterator (it.unimi.dsi.fastutil.longs.LongIterator): 1