Example 1 with InvertedOutlierScoreMeta

Use of de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta in project elki by elki-project.

Class ODIN, method run.

/**
 * Run the ODIN algorithm
 *
 * Tutorial note: the <em>signature</em> of this method depends on the types
 * that we requested in the {@link #getInputTypeRestriction} method. Here we
 * requested a single relation of type {@code O} , the data type of our
 * distance function.
 *
 * @param database Database to run on.
 * @param relation Relation to process.
 * @return ODIN outlier result.
 */
public OutlierResult run(Database database, Relation<O> relation) {
    // Get the query functions:
    DistanceQuery<O> dq = database.getDistanceQuery(relation, getDistanceFunction());
    KNNQuery<O> knnq = database.getKNNQuery(dq, k);
    // Get the objects to process, and a data storage for counting and output:
    DBIDs ids = relation.getDBIDs();
    WritableDoubleDataStore scores = DataStoreUtil.makeDoubleStorage(ids, DataStoreFactory.HINT_DB, 0.);
    // Process all objects
    for (DBIDIter iter = ids.iter(); iter.valid(); iter.advance()) {
        // Find the nearest neighbors (using an index, if available!)
        KNNList neighbors = knnq.getKNNForDBID(iter, k);
        // For each neighbor, except ourselves, increase the in-degree:
        for (DBIDIter nei = neighbors.iter(); nei.valid(); nei.advance()) {
            if (DBIDUtil.equal(iter, nei)) {
                continue;
            }
            scores.put(nei, scores.doubleValue(nei) + 1);
        }
    }
    // Compute the observed minimum and maximum in-degree
    double min = Double.POSITIVE_INFINITY, max = 0.0;
    for (DBIDIter iter = ids.iter(); iter.valid(); iter.advance()) {
        min = Math.min(min, scores.doubleValue(iter));
        max = Math.max(max, scores.doubleValue(iter));
    }
    // Wrap the result and add metadata.
    // By actually specifying theoretical min, max and baseline, we get a better
    // visualization (try it out - or see the screenshots in the tutorial)!
    OutlierScoreMeta meta = new InvertedOutlierScoreMeta(min, max, 0., ids.size() - 1, k);
    DoubleRelation rel = new MaterializedDoubleRelation("ODIN In-Degree", "odin", scores, ids);
    return new OutlierResult(meta, rel);
}
Also used : WritableDoubleDataStore(de.lmu.ifi.dbs.elki.database.datastore.WritableDoubleDataStore) DBIDs(de.lmu.ifi.dbs.elki.database.ids.DBIDs) OutlierResult(de.lmu.ifi.dbs.elki.result.outlier.OutlierResult) InvertedOutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta) DoubleRelation(de.lmu.ifi.dbs.elki.database.relation.DoubleRelation) MaterializedDoubleRelation(de.lmu.ifi.dbs.elki.database.relation.MaterializedDoubleRelation) OutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.OutlierScoreMeta) DBIDIter(de.lmu.ifi.dbs.elki.database.ids.DBIDIter) KNNList(de.lmu.ifi.dbs.elki.database.ids.KNNList)
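
The in-degree counting above can be tried without ELKI. The following is a minimal sketch, not the project's implementation: it assumes one-dimensional points, a brute-force neighbor search, and a hypothetical class name OdinSketch. Unlike the ELKI query, whose result may contain the query object itself (hence the DBIDUtil.equal check), k here counts only the other points.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

class OdinSketch {
  /** In-degree of each point in the kNN graph (brute force, 1-D data, self excluded). */
  static int[] inDegrees(double[] points, int k) {
    int n = points.length;
    int[] indegree = new int[n];
    for (int i = 0; i < n; i++) {
      final int query = i;
      // Rank all other points by distance to the query; ties are broken by index.
      int[] neighbors = IntStream.range(0, n)
          .filter(j -> j != query)
          .boxed()
          .sorted(Comparator
              .comparingDouble((Integer j) -> Math.abs(points[j] - points[query]))
              .thenComparingInt(j -> j))
          .limit(k)
          .mapToInt(Integer::intValue)
          .toArray();
      // Each neighbor gains one incoming edge.
      for (int j : neighbors) {
        indegree[j]++;
      }
    }
    return indegree;
  }

  public static void main(String[] args) {
    double[] points = {0.0, 1.0, 2.0, 3.0, 10.0};
    // prints [1, 3, 4, 2, 0]
    System.out.println(Arrays.toString(inDegrees(points, 2)));
  }
}
```

The isolated point 10.0 is nobody's neighbor and gets in-degree 0. Low values indicate outliers, which is why the result is wrapped in an InvertedOutlierScoreMeta.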

Example 2 with InvertedOutlierScoreMeta

Use of de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta in project elki by elki-project.

Class ODIN, method run.

/**
 * Run the ODIN algorithm
 *
 * @param database Database to run on.
 * @param relation Relation to process.
 * @return ODIN outlier result.
 */
public OutlierResult run(Database database, Relation<O> relation) {
    // Get the query functions:
    DistanceQuery<O> dq = database.getDistanceQuery(relation, getDistanceFunction());
    KNNQuery<O> knnq = database.getKNNQuery(dq, k);
    // Get the objects to process, and a data storage for counting and output:
    DBIDs ids = relation.getDBIDs();
    WritableDoubleDataStore scores = DataStoreUtil.makeDoubleStorage(ids, DataStoreFactory.HINT_DB, 0.);
    double inc = 1. / (k - 1);
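    // Note: min and max are only updated when a score is incremented below,
    // so objects that never occur as a neighbor (score 0) do not affect min.
    double min = Double.POSITIVE_INFINITY, max = 0.0;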
    // Process all objects
    for (DBIDIter iter = ids.iter(); iter.valid(); iter.advance()) {
        // Find the nearest neighbors (using an index, if available!)
        DBIDs neighbors = knnq.getKNNForDBID(iter, k);
        // For each neighbor, except ourselves, increase the in-degree:
        for (DBIDIter nei = neighbors.iter(); nei.valid(); nei.advance()) {
            if (DBIDUtil.equal(iter, nei)) {
                continue;
            }
            final double value = scores.doubleValue(nei) + inc;
            if (value < min) {
                min = value;
            }
            if (value > max) {
                max = value;
            }
            scores.put(nei, value);
        }
    }
    // Wrap the result and add metadata.
    OutlierScoreMeta meta = new InvertedOutlierScoreMeta(min, max, 0., inc * (ids.size() - 1), 1);
    DoubleRelation rel = new MaterializedDoubleRelation("ODIN In-Degree", "odin", scores, ids);
    return new OutlierResult(meta, rel);
}
Also used : WritableDoubleDataStore(de.lmu.ifi.dbs.elki.database.datastore.WritableDoubleDataStore) DBIDs(de.lmu.ifi.dbs.elki.database.ids.DBIDs) OutlierResult(de.lmu.ifi.dbs.elki.result.outlier.OutlierResult) InvertedOutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta) DoubleRelation(de.lmu.ifi.dbs.elki.database.relation.DoubleRelation) MaterializedDoubleRelation(de.lmu.ifi.dbs.elki.database.relation.MaterializedDoubleRelation) OutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.OutlierScoreMeta) DBIDIter(de.lmu.ifi.dbs.elki.database.ids.DBIDIter)
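
This variant scales each increment by 1 / (k - 1) and passes 1 as the theoretical baseline. A plausible reading, stated here as an assumption: if every kNN list contains the query object plus exactly k - 1 others, the in-degrees sum to n * (k - 1), so the normalized scores average to 1. The sketch below checks that arithmetic on hypothetical in-degree values; the class name is made up, and it multiplies once instead of accumulating increments as the method above does.

```java
import java.util.Arrays;

class NormalizedInDegreeSketch {
  /** Scale raw in-degrees by 1/(k-1), where k counts the query object itself. */
  static double[] normalize(int[] indegree, int k) {
    double inc = 1.0 / (k - 1);
    double[] scores = new double[indegree.length];
    for (int i = 0; i < indegree.length; i++) {
      scores[i] = indegree[i] * inc;
    }
    return scores;
  }

  /** Arithmetic mean of the scores. */
  static double mean(double[] values) {
    double sum = 0.0;
    for (double v : values) {
      sum += v;
    }
    return sum / values.length;
  }

  public static void main(String[] args) {
    // Hypothetical in-degrees of 5 objects with k = 3 (2 neighbors besides the query).
    int[] indegree = {1, 3, 4, 2, 0};
    double[] scores = normalize(indegree, 3);
    System.out.println(Arrays.toString(scores));
    System.out.println("mean = " + mean(scores));
  }
}
```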

Example 3 with InvertedOutlierScoreMeta

Use of de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta in project elki by elki-project.

Class ExternalDoubleOutlierScore, method run.

/**
 * Run the algorithm.
 *
 * @param database Database to use
 * @param relation Relation to use
 * @return Result
 */
public OutlierResult run(Database database, Relation<?> relation) {
    WritableDoubleDataStore scores = DataStoreUtil.makeDoubleStorage(relation.getDBIDs(), DataStoreFactory.HINT_STATIC);
    DoubleMinMax minmax = new DoubleMinMax();
    try (InputStream in = FileUtil.tryGzipInput(new FileInputStream(file));
        TokenizedReader reader = CSVReaderFormat.DEFAULT_FORMAT.makeReader()) {
        Tokenizer tokenizer = reader.getTokenizer();
        CharSequence buf = reader.getBuffer();
        Matcher mi = idpattern.matcher(buf), ms = scorepattern.matcher(buf);
        reader.reset(in);
        while (reader.nextLineExceptComments()) {
            Integer id = null;
            double score = Double.NaN;
            // The tokenizer is initialized by nextLineExceptComments().
            for (; tokenizer.valid(); tokenizer.advance()) {
                mi.region(tokenizer.getStart(), tokenizer.getEnd());
                ms.region(tokenizer.getStart(), tokenizer.getEnd());
                final boolean mif = mi.find();
                final boolean msf = ms.find();
                if (mif && msf) {
                    throw new AbortException("ID pattern and score pattern both match value: " + tokenizer.getSubstring());
                }
                if (mif) {
                    if (id != null) {
                        throw new AbortException("ID pattern matched twice: previous value " + id + " second value: " + tokenizer.getSubstring());
                    }
                    id = ParseUtil.parseIntBase10(buf, mi.end(), tokenizer.getEnd());
                }
                if (msf) {
                    if (!Double.isNaN(score)) {
                        throw new AbortException("Score pattern matched twice: previous value " + score + " second value: " + tokenizer.getSubstring());
                    }
                    score = ParseUtil.parseDouble(buf, ms.end(), tokenizer.getEnd());
                }
            }
            if (id != null && !Double.isNaN(score)) {
                scores.putDouble(DBIDUtil.importInteger(id), score);
                minmax.put(score);
            } else if (id == null && Double.isNaN(score)) {
                LOG.warning("Line matched neither ID nor score nor comment: " + reader.getLineNumber());
            } else {
                throw new AbortException("Line matched only ID or only SCORE patterns: " + reader.getLineNumber());
            }
        }
    } catch (IOException e) {
        throw new AbortException("Could not load outlier scores: " + e.getMessage() + " when loading " + file, e);
    }
    OutlierScoreMeta meta;
    if (inverted) {
        meta = new InvertedOutlierScoreMeta(minmax.getMin(), minmax.getMax());
    } else {
        meta = new BasicOutlierScoreMeta(minmax.getMin(), minmax.getMax());
    }
    DoubleRelation scoresult = new MaterializedDoubleRelation("External Outlier", "external-outlier", scores, relation.getDBIDs());
    OutlierResult or = new OutlierResult(meta, scoresult);
    // Apply scaling
    if (scaling instanceof OutlierScalingFunction) {
        ((OutlierScalingFunction) scaling).prepare(or);
    }
    DoubleMinMax mm = new DoubleMinMax();
    for (DBIDIter iditer = relation.iterDBIDs(); iditer.valid(); iditer.advance()) {
        double val = scoresult.doubleValue(iditer);
        val = scaling.getScaled(val);
        scores.putDouble(iditer, val);
        mm.put(val);
    }
    meta = new BasicOutlierScoreMeta(mm.getMin(), mm.getMax());
    or = new OutlierResult(meta, scoresult);
    return or;
}
Also used : WritableDoubleDataStore(de.lmu.ifi.dbs.elki.database.datastore.WritableDoubleDataStore) Matcher(java.util.regex.Matcher) FileInputStream(java.io.FileInputStream) InputStream(java.io.InputStream) OutlierScalingFunction(de.lmu.ifi.dbs.elki.utilities.scaling.outlier.OutlierScalingFunction) OutlierResult(de.lmu.ifi.dbs.elki.result.outlier.OutlierResult) InvertedOutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta) IOException(java.io.IOException) DoubleRelation(de.lmu.ifi.dbs.elki.database.relation.DoubleRelation) MaterializedDoubleRelation(de.lmu.ifi.dbs.elki.database.relation.MaterializedDoubleRelation) BasicOutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.BasicOutlierScoreMeta) OutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.OutlierScoreMeta) DBIDIter(de.lmu.ifi.dbs.elki.database.ids.DBIDIter) DoubleMinMax(de.lmu.ifi.dbs.elki.math.DoubleMinMax) TokenizedReader(de.lmu.ifi.dbs.elki.utilities.io.TokenizedReader) Tokenizer(de.lmu.ifi.dbs.elki.utilities.io.Tokenizer) AbortException(de.lmu.ifi.dbs.elki.utilities.exceptions.AbortException)
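
The token-matching logic above (one ID pattern, one score pattern, each allowed at most once per line) can be sketched with java.util.regex alone. This is an illustration under assumptions: the prefixes "ID=" and "score=" and the class name ScoreFileSketch are hypothetical, whitespace-separated tokens stand in for the ELKI tokenizer, and lines matching neither pattern are skipped silently instead of logged.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScoreFileSketch {
  // Hypothetical prefixes; in the method above both patterns are configurable fields.
  static final Pattern ID = Pattern.compile("^ID=");
  static final Pattern SCORE = Pattern.compile("^score=");

  /** Parse lines such as "ID=1 score=0.5" into an id-to-score map. */
  static Map<Integer, Double> parse(Iterable<String> lines) {
    Map<Integer, Double> scores = new LinkedHashMap<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue; // blank line or comment
      }
      Integer id = null;
      double score = Double.NaN;
      for (String token : trimmed.split("\\s+")) {
        Matcher mi = ID.matcher(token), ms = SCORE.matcher(token);
        if (mi.find()) {
          if (id != null) {
            throw new IllegalArgumentException("ID pattern matched twice: " + line);
          }
          // The value is whatever follows the matched prefix.
          id = Integer.parseInt(token.substring(mi.end()));
        } else if (ms.find()) {
          if (!Double.isNaN(score)) {
            throw new IllegalArgumentException("Score pattern matched twice: " + line);
          }
          score = Double.parseDouble(token.substring(ms.end()));
        }
      }
      if (id != null && !Double.isNaN(score)) {
        scores.put(id, score);
      } else if (id != null || !Double.isNaN(score)) {
        throw new IllegalArgumentException("Line matched only ID or only score: " + line);
      }
    }
    return scores;
  }

  public static void main(String[] args) {
    System.out.println(parse(List.of("# comment", "ID=1 score=0.5", "ID=2 label=x score=2.0")));
  }
}
```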

Example 4 with InvertedOutlierScoreMeta

Use of de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta in project elki by elki-project.

Class DWOF, method run.

/**
 * Performs the Generalized DWOF_SCORE algorithm on the given database by
 * calling all the other methods in the proper order.
 *
 * @param database Database to query
 * @param relation Data to process
 * @return new OutlierResult instance
 */
public OutlierResult run(Database database, Relation<O> relation) {
    final DBIDs ids = relation.getDBIDs();
    DistanceQuery<O> distFunc = database.getDistanceQuery(relation, getDistanceFunction());
    // Get k nearest neighbor and range query on the relation.
    KNNQuery<O> knnq = database.getKNNQuery(distFunc, k, DatabaseQuery.HINT_HEAVY_USE);
    RangeQuery<O> rnnQuery = database.getRangeQuery(distFunc, DatabaseQuery.HINT_HEAVY_USE);
    StepProgress stepProg = LOG.isVerbose() ? new StepProgress("DWOF", 2) : null;
    // DWOF output score storage.
    WritableDoubleDataStore dwofs = DataStoreUtil.makeDoubleStorage(ids, DataStoreFactory.HINT_DB | DataStoreFactory.HINT_HOT, 0.);
    if (stepProg != null) {
        stepProg.beginStep(1, "Initializing objects' Radii", LOG);
    }
    WritableDoubleDataStore radii = DataStoreUtil.makeDoubleStorage(ids, DataStoreFactory.HINT_TEMP | DataStoreFactory.HINT_HOT, 0.);
    // Find an initial radius for each object:
    initializeRadii(ids, knnq, distFunc, radii);
    WritableIntegerDataStore oldSizes = DataStoreUtil.makeIntegerStorage(ids, DataStoreFactory.HINT_HOT, 1);
    WritableIntegerDataStore newSizes = DataStoreUtil.makeIntegerStorage(ids, DataStoreFactory.HINT_HOT, 1);
    int countUnmerged = relation.size();
    if (stepProg != null) {
        stepProg.beginStep(2, "Clustering-Evaluating Cycles.", LOG);
    }
    IndefiniteProgress clusEvalProgress = LOG.isVerbose() ? new IndefiniteProgress("Evaluating DWOFs", LOG) : null;
    while (countUnmerged > 0) {
        LOG.incrementProcessed(clusEvalProgress);
        // Increase radii
        for (DBIDIter iter = ids.iter(); iter.valid(); iter.advance()) {
            radii.putDouble(iter, radii.doubleValue(iter) * delta);
        }
        // stores the clustering label for each object
        WritableDataStore<ModifiableDBIDs> labels = DataStoreUtil.makeStorage(ids, DataStoreFactory.HINT_TEMP, ModifiableDBIDs.class);
        // Cluster objects based on the current radius
        clusterData(ids, rnnQuery, radii, labels);
        // simple reference swap
        WritableIntegerDataStore temp = newSizes;
        newSizes = oldSizes;
        oldSizes = temp;
        // Update the cluster size count for each object.
        countUnmerged = updateSizes(ids, labels, newSizes);
        labels.destroy();
        // Update DWOF scores.
        for (DBIDIter iter = ids.iter(); iter.valid(); iter.advance()) {
            double newScore = (newSizes.intValue(iter) > 0) ? ((double) (oldSizes.intValue(iter) - 1) / (double) newSizes.intValue(iter)) : 0.0;
            dwofs.putDouble(iter, dwofs.doubleValue(iter) + newScore);
        }
    }
    LOG.setCompleted(clusEvalProgress);
    LOG.setCompleted(stepProg);
    // Build result representation.
    DoubleMinMax minmax = new DoubleMinMax();
    for (DBIDIter iter = relation.iterDBIDs(); iter.valid(); iter.advance()) {
        minmax.put(dwofs.doubleValue(iter));
    }
    OutlierScoreMeta meta = new InvertedOutlierScoreMeta(minmax.getMin(), minmax.getMax(), 0.0, Double.POSITIVE_INFINITY);
    DoubleRelation rel = new MaterializedDoubleRelation("Dynamic-Window Outlier Factors", "dwof-outlier", dwofs, ids);
    return new OutlierResult(meta, rel);
}
Also used : WritableIntegerDataStore(de.lmu.ifi.dbs.elki.database.datastore.WritableIntegerDataStore) WritableDoubleDataStore(de.lmu.ifi.dbs.elki.database.datastore.WritableDoubleDataStore) DBIDs(de.lmu.ifi.dbs.elki.database.ids.DBIDs) ModifiableDBIDs(de.lmu.ifi.dbs.elki.database.ids.ModifiableDBIDs) OutlierResult(de.lmu.ifi.dbs.elki.result.outlier.OutlierResult) InvertedOutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta) StepProgress(de.lmu.ifi.dbs.elki.logging.progress.StepProgress) DoubleRelation(de.lmu.ifi.dbs.elki.database.relation.DoubleRelation) MaterializedDoubleRelation(de.lmu.ifi.dbs.elki.database.relation.MaterializedDoubleRelation) OutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.OutlierScoreMeta) DBIDIter(de.lmu.ifi.dbs.elki.database.ids.DBIDIter) DoubleMinMax(de.lmu.ifi.dbs.elki.math.DoubleMinMax) IndefiniteProgress(de.lmu.ifi.dbs.elki.logging.progress.IndefiniteProgress)

Example 5 with InvertedOutlierScoreMeta

Use of de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta in project elki by elki-project.

Class ABOD, method run.

/**
 * Run ABOD on the data set.
 *
 * @param db Database providing the similarity query
 * @param relation Relation to process
 * @return Outlier detection result
 */
public OutlierResult run(Database db, Relation<V> relation) {
    ArrayDBIDs ids = DBIDUtil.ensureArray(relation.getDBIDs());
    // Build a kernel matrix, to make O(n^3) slightly less bad.
    SimilarityQuery<V> sq = db.getSimilarityQuery(relation, kernelFunction);
    KernelMatrix kernelMatrix = new KernelMatrix(sq, relation, ids);
    WritableDoubleDataStore abodvalues = DataStoreUtil.makeDoubleStorage(ids, DataStoreFactory.HINT_STATIC);
    DoubleMinMax minmaxabod = new DoubleMinMax();
    MeanVariance s = new MeanVariance();
    DBIDArrayIter pA = ids.iter(), pB = ids.iter(), pC = ids.iter();
    for (; pA.valid(); pA.advance()) {
        final double abof = computeABOF(kernelMatrix, pA, pB, pC, s);
        minmaxabod.put(abof);
        abodvalues.putDouble(pA, abof);
    }
    // Build result representation.
    DoubleRelation scoreResult = new MaterializedDoubleRelation("Angle-Based Outlier Degree", "abod-outlier", abodvalues, relation.getDBIDs());
    OutlierScoreMeta scoreMeta = new InvertedOutlierScoreMeta(minmaxabod.getMin(), minmaxabod.getMax(), 0.0, Double.POSITIVE_INFINITY);
    return new OutlierResult(scoreMeta, scoreResult);
}
Also used : WritableDoubleDataStore(de.lmu.ifi.dbs.elki.database.datastore.WritableDoubleDataStore) OutlierResult(de.lmu.ifi.dbs.elki.result.outlier.OutlierResult) DBIDArrayIter(de.lmu.ifi.dbs.elki.database.ids.DBIDArrayIter) InvertedOutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta) DoubleRelation(de.lmu.ifi.dbs.elki.database.relation.DoubleRelation) MaterializedDoubleRelation(de.lmu.ifi.dbs.elki.database.relation.MaterializedDoubleRelation) OutlierScoreMeta(de.lmu.ifi.dbs.elki.result.outlier.OutlierScoreMeta) KernelMatrix(de.lmu.ifi.dbs.elki.distance.similarityfunction.kernel.KernelMatrix) MeanVariance(de.lmu.ifi.dbs.elki.math.MeanVariance) DoubleMinMax(de.lmu.ifi.dbs.elki.math.DoubleMinMax) ArrayDBIDs(de.lmu.ifi.dbs.elki.database.ids.ArrayDBIDs)
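
The computeABOF helper is not shown above. As a rough illustration of why the result is wrapped in an InvertedOutlierScoreMeta, the sketch below computes a simplified angle-based outlier factor: the plain (unweighted) variance of <AB, AC> / (|AB|^2 |AC|^2) over all pairs B, C. It assumes Euclidean vectors instead of a kernel matrix and omits any distance weighting; the class and method names are hypothetical. A point far from the rest sees all others under similar angles, so its variance is small, and small values mark outliers.

```java
class AbofSketch {
  /** Unweighted variance of <AB,AC> / (|AB|^2 |AC|^2) over all pairs B, C other than A. */
  static double abof(double[][] pts, int a) {
    double sum = 0.0, sumSq = 0.0;
    int count = 0;
    for (int b = 0; b < pts.length; b++) {
      if (b == a) {
        continue;
      }
      for (int c = b + 1; c < pts.length; c++) {
        if (c == a) {
          continue;
        }
        double dot = 0.0, sqAB = 0.0, sqAC = 0.0;
        for (int d = 0; d < pts[a].length; d++) {
          double ab = pts[b][d] - pts[a][d], ac = pts[c][d] - pts[a][d];
          dot += ab * ac;
          sqAB += ab * ab;
          sqAC += ac * ac;
        }
        if (sqAB == 0.0 || sqAC == 0.0) {
          continue; // B or C coincides with A; the angle is undefined
        }
        double v = dot / (sqAB * sqAC);
        sum += v;
        sumSq += v * v;
        count++;
      }
    }
    if (count == 0) {
      return 0.0;
    }
    double mean = sum / count;
    return sumSq / count - mean * mean; // naive (population) variance
  }

  /** Index of the point with the smallest factor, i.e. the strongest outlier. */
  static int mostOutlying(double[][] pts) {
    int best = 0;
    double bestScore = Double.POSITIVE_INFINITY;
    for (int a = 0; a < pts.length; a++) {
      double s = abof(pts, a);
      if (s < bestScore) {
        bestScore = s;
        best = a;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    double[][] pts = {{0, 0}, {1, 0}, {0, 1}, {1, 1}, {10, 10}};
    System.out.println("most outlying index: " + mostOutlying(pts));
  }
}
```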

Aggregations

InvertedOutlierScoreMeta (de.lmu.ifi.dbs.elki.result.outlier.InvertedOutlierScoreMeta): 15
DBIDIter (de.lmu.ifi.dbs.elki.database.ids.DBIDIter): 14
DoubleRelation (de.lmu.ifi.dbs.elki.database.relation.DoubleRelation): 14
WritableDoubleDataStore (de.lmu.ifi.dbs.elki.database.datastore.WritableDoubleDataStore): 13
MaterializedDoubleRelation (de.lmu.ifi.dbs.elki.database.relation.MaterializedDoubleRelation): 13
OutlierResult (de.lmu.ifi.dbs.elki.result.outlier.OutlierResult): 13
OutlierScoreMeta (de.lmu.ifi.dbs.elki.result.outlier.OutlierScoreMeta): 13
DoubleMinMax (de.lmu.ifi.dbs.elki.math.DoubleMinMax): 11
DBIDs (de.lmu.ifi.dbs.elki.database.ids.DBIDs): 8
ArrayDBIDs (de.lmu.ifi.dbs.elki.database.ids.ArrayDBIDs): 3
DBIDArrayIter (de.lmu.ifi.dbs.elki.database.ids.DBIDArrayIter): 3
KNNList (de.lmu.ifi.dbs.elki.database.ids.KNNList): 3
KernelMatrix (de.lmu.ifi.dbs.elki.distance.similarityfunction.kernel.KernelMatrix): 3
MeanVariance (de.lmu.ifi.dbs.elki.math.MeanVariance): 3
AbortException (de.lmu.ifi.dbs.elki.utilities.exceptions.AbortException): 3
DoubleDBIDListIter (de.lmu.ifi.dbs.elki.database.ids.DoubleDBIDListIter): 2
KNNHeap (de.lmu.ifi.dbs.elki.database.ids.KNNHeap): 2
BasicOutlierScoreMeta (de.lmu.ifi.dbs.elki.result.outlier.BasicOutlierScoreMeta): 2
WritableIntegerDataStore (de.lmu.ifi.dbs.elki.database.datastore.WritableIntegerDataStore): 1
ModifiableDBIDs (de.lmu.ifi.dbs.elki.database.ids.ModifiableDBIDs): 1