Example 61 with HyperLogLogPlus

Use of com.clearspring.analytics.stream.cardinality.HyperLogLogPlus in project shifu by ShifuML.

Class AutoTypeDistinctCountReducer, method reduce.

@Override
protected void reduce(IntWritable key, Iterable<CountAndFrequentItemsWritable> values, Context context) throws IOException, InterruptedException {
    HyperLogLogPlus hyperLogLogPlus = null;
    Set<String> fis = new HashSet<String>();
    long count = 0, invalidCount = 0, validNumCount = 0;
    for (CountAndFrequentItemsWritable cfiw : values) {
        count += cfiw.getCount();
        invalidCount += cfiw.getInvalidCount();
        validNumCount += cfiw.getValidNumCount();
        fis.addAll(cfiw.getFrequetItems());
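        if (hyperLogLogPlus == null) {
            // First partial result: deserialize its sketch and use it as the accumulator.
            hyperLogLogPlus = HyperLogLogPlus.Builder.build(cfiw.getHyperBytes());
        } else {
            try {
                // Later partial results: deserialize each sketch and merge it into the accumulator.
                hyperLogLogPlus = (HyperLogLogPlus) hyperLogLogPlus.merge(HyperLogLogPlus.Builder.build(cfiw.getHyperBytes()));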
            } catch (CardinalityMergeException e) {
                throw new RuntimeException(e);
            }
        }
    }
    // Output format: count:invalidCount:validNumCount:estimatedDistinctCount:frequentItems
    outputValue.set(count + ":" + invalidCount + ":" + validNumCount + ":" + hyperLogLogPlus.cardinality() + ":" + limitedFrequentItems(fis));
    context.write(key, outputValue);
}
Also used: HyperLogLogPlus (com.clearspring.analytics.stream.cardinality.HyperLogLogPlus), CardinalityMergeException (com.clearspring.analytics.stream.cardinality.CardinalityMergeException), HashSet (java.util.HashSet)
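
The reducer above builds one sketch per partial result and merges them into a single accumulator. As a rough illustration of why that works, here is a minimal, dependency-free sketch of the idea. TinyHll, its mix function, and the static reduce helper are hypothetical names written for this example; this is a toy plain HyperLogLog, not stream-lib's HyperLogLogPlus, and it omits the sparse mode and bias correction. The point it shows: merging takes the register-wise maximum, so merging partial sketches yields the same registers as one sketch fed the union of the inputs.

```java
// Toy HyperLogLog for illustration only; this is NOT stream-lib's HyperLogLogPlus.
final class TinyHll {
    private final int p;             // precision: the sketch keeps 2^p registers
    private final byte[] registers;  // each register stores the largest rank seen

    TinyHll(int p) {
        if (p < 7 || p > 16) throw new IllegalArgumentException("p must be in [7, 16]");
        this.p = p;
        this.registers = new byte[1 << p];
    }

    // SplitMix64-style mixer, used as a stand-in for a real 64-bit hash function.
    private static long mix(long x) {
        long z = x + 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    void offer(Object item) {
        long h = mix(item.hashCode());
        int idx = (int) (h >>> (64 - p));   // top p bits select a register
        long rest = h << p;                 // remaining 64 - p bits, left-aligned
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - p + 1);
        if (rank > registers[idx]) registers[idx] = (byte) rank;
    }

    // Merge = register-wise maximum; returns a new sketch and leaves both inputs untouched.
    TinyHll merge(TinyHll other) {
        if (other.p != p) throw new IllegalArgumentException("precision mismatch");
        TinyHll out = new TinyHll(p);
        for (int i = 0; i < registers.length; i++) {
            out.registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
        return out;
    }

    // Raw HyperLogLog estimate with the small-range (linear counting) correction.
    long cardinality() {
        int m = registers.length;
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1 + 1.079 / m);  // constant for m >= 128
        double est = alpha * m * m / sum;
        if (est <= 2.5 * m && zeros > 0) est = m * Math.log((double) m / zeros);
        return Math.round(est);
    }

    byte[] registers() { return registers.clone(); }

    // Same accumulate-or-merge shape as the reducer loop: the first sketch seeds the accumulator.
    static TinyHll reduce(Iterable<TinyHll> parts) {
        TinyHll acc = null;
        for (TinyHll part : parts) {
            acc = (acc == null) ? part : acc.merge(part);
        }
        return acc;  // null when parts is empty; callers should guard before calling cardinality()
    }

    public static void main(String[] args) {
        TinyHll a = new TinyHll(10);
        TinyHll b = new TinyHll(10);
        for (int i = 0; i < 5000; i++) a.offer("k" + i);
        for (int i = 2500; i < 8000; i++) b.offer("k" + i);
        TinyHll merged = reduce(java.util.Arrays.asList(a, b));
        System.out.println("estimated distinct keys: " + merged.cardinality());
    }
}
```

Note that the accumulator stays null when there is nothing to merge, which is the same condition under which the reducer's final cardinality() call would fail; a reducer is normally invoked with at least one value, so this only matters if the loop is reused elsewhere.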

Example 62 with HyperLogLogPlus

Use of com.clearspring.analytics.stream.cardinality.HyperLogLogPlus in project angel by Tencent.

Class GetHyperLogLog, method partitionGet.

@Override
public PartitionGetResult partitionGet(PartitionGetParam partParam) {
    GetHyperLogLogPartParam param = (GetHyperLogLogPartParam) partParam;
    ServerLongAnyRow row = GraphMatrixUtils.getPSLongKeyRow(psContext, param);
    ILongKeyPartOp keyPart = (ILongKeyPartOp) param.getNodes();
    long[] nodes = keyPart.getKeys();
    Long2ObjectOpenHashMap<HyperLogLogPlus> logs = new Long2ObjectOpenHashMap<>(nodes.length);
    row.startRead(20000);
    try {
        for (int i = 0; i < nodes.length; i++) {
            HyperLogLogPlusElement hllElem = (HyperLogLogPlusElement) row.get(nodes[i]);
            if (hllElem.isActive()) {
                logs.put(nodes[i], hllElem.getHyperLogLogPlus());
            }
        }
    } finally {
        row.endRead();
    }
    return new GetHyperLogLogPartResult(logs);
}
Also used: HyperLogLogPlus (com.clearspring.analytics.stream.cardinality.HyperLogLogPlus), Long2ObjectOpenHashMap (it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap), ServerLongAnyRow (com.tencent.angel.ps.storage.vector.ServerLongAnyRow), ILongKeyPartOp (com.tencent.angel.psagent.matrix.transport.router.operator.ILongKeyPartOp)

Example 63 with HyperLogLogPlus

Use of com.clearspring.analytics.stream.cardinality.HyperLogLogPlus in project angel by Tencent.

Class UpdateHyperLogLog, method partitionUpdate.

@Override
public void partitionUpdate(PartitionUpdateParam partParam) {
    UpdateHyperLogLogPartParam param = (UpdateHyperLogLogPartParam) partParam;
    ServerLongAnyRow row = GraphMatrixUtils.getPSLongKeyRow(psContext, param);
    ILongKeyAnyValuePartOp split = (ILongKeyAnyValuePartOp) param.getKeyValuePart();
    int p = param.getP();
    int sp = param.getSp();
    long seed = param.getSeed();
    long[] keys = split.getKeys();
    IElement[] values = split.getValues();
    row.startWrite();
    try {
        if (keys != null && keys.length > 0 && values != null && values.length > 0) {
            for (int i = 0; i < keys.length; i++) {
                long key = keys[i];
                HyperLogLogPlus value = ((HLLPlusElement) values[i]).getCounter();
                if (!row.exist(key))
                    row.set(key, new HyperLogLogPlusElement(key, p, sp, seed));
                HyperLogLogPlusElement hllElem = (HyperLogLogPlusElement) row.get(key);
                if (hllElem.isActive()) {
                    hllElem.merge(value);
                }
            }
        }
    } finally {
        row.endWrite();
    }
}
Also used: IElement (com.tencent.angel.ps.storage.vector.element.IElement), HyperLogLogPlus (com.clearspring.analytics.stream.cardinality.HyperLogLogPlus), ServerLongAnyRow (com.tencent.angel.ps.storage.vector.ServerLongAnyRow), ILongKeyAnyValuePartOp (com.tencent.angel.psagent.matrix.transport.router.operator.ILongKeyAnyValuePartOp)

Aggregations

HyperLogLogPlus (com.clearspring.analytics.stream.cardinality.HyperLogLogPlus) 63
Test (org.junit.jupiter.api.Test) 19
Test (org.junit.Test) 14
Entity (uk.gov.gchq.gaffer.data.element.Entity) 8
User (uk.gov.gchq.gaffer.user.User) 6
Edge (uk.gov.gchq.gaffer.data.element.Edge) 5
Element (uk.gov.gchq.gaffer.data.element.Element) 5
AggregateFunctionTest (uk.gov.gchq.gaffer.function.AggregateFunctionTest) 5
Graph (uk.gov.gchq.gaffer.graph.Graph) 5
FunctionTest (uk.gov.gchq.koryphe.function.FunctionTest) 5
ArrayList (java.util.ArrayList) 4
AddElements (uk.gov.gchq.gaffer.operation.impl.add.AddElements) 4
HashSet (java.util.HashSet) 3
View (uk.gov.gchq.gaffer.data.elementdefinition.view.View) 3
SerialisationException (uk.gov.gchq.gaffer.exception.SerialisationException) 3
GetAllElements (uk.gov.gchq.gaffer.operation.impl.get.GetAllElements) 3
CardinalityMergeException (com.clearspring.analytics.stream.cardinality.CardinalityMergeException) 2
TreeNode (com.fasterxml.jackson.core.TreeNode) 2
TextNode (com.fasterxml.jackson.databind.node.TextNode) 2
ServerLongAnyRow (com.tencent.angel.ps.storage.vector.ServerLongAnyRow) 2