Example 6 with DiVector

use of com.amazon.randomcutforest.returntypes.DiVector in project random-cut-forest-by-aws by aws.

the class AnomalyAttributionVisitorTest method testAcceptLeafEquals.

@Test
public void testAcceptLeafEquals() {
    float[] point = { 1.1f, -2.2f, 3.3f };
    INodeView leafNode = mock(NodeView.class);
    when(leafNode.getLeafPoint()).thenReturn(point);
    when(leafNode.getBoundingBox()).thenReturn(new BoundingBox(point, point));
    int leafDepth = 100;
    int leafMass = 10;
    when(leafNode.getMass()).thenReturn(leafMass);
    int treeMass = 21;
    AnomalyAttributionVisitor visitor = new AnomalyAttributionVisitor(point, treeMass, 0);
    visitor.acceptLeaf(leafNode, leafDepth);
    assertTrue(visitor.hitDuplicates);
    double expectedScoreSum = CommonUtils.defaultDampFunction(leafMass, treeMass) / (leafDepth + Math.log(leafMass + 1) / Math.log(2));
    double expectedScore = expectedScoreSum / (2 * point.length);
    DiVector result = visitor.getResult();
    for (int i = 0; i < point.length; i++) {
        assertEquals(defaultScalarNormalizerFunction(expectedScore, treeMass), result.low[i], EPSILON);
        assertEquals(defaultScalarNormalizerFunction(expectedScore, treeMass), result.high[i], EPSILON);
    }
}
Also used : DiVector(com.amazon.randomcutforest.returntypes.DiVector) BoundingBox(com.amazon.randomcutforest.tree.BoundingBox) INodeView(com.amazon.randomcutforest.tree.INodeView) Test(org.junit.jupiter.api.Test)
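
The expected value asserted in this test can be reproduced by hand. The sketch below is a standalone illustration, not the library's code: `damp` and `normalize` are hypothetical stand-ins whose formulas are assumptions made for this example (the real `CommonUtils.defaultDampFunction` and `defaultScalarNormalizerFunction` may be defined differently). Only the depth term and the even split across `2 * point.length` directions are taken from the test itself.

```java
public class LeafAttributionSketch {
    // Hypothetical damping stand-in (an assumption, not the library's definition).
    public static double damp(int leafMass, int treeMass) {
        return 1.0 - leafMass / (2.0 * treeMass);
    }

    // Hypothetical normalizer stand-in (an assumption): scales by log2(treeMass + 1).
    public static double normalize(double score, int treeMass) {
        return score * Math.log(treeMass + 1) / Math.log(2);
    }

    // Per-direction attribution for a query point that duplicates a leaf point,
    // following the arithmetic used in testAcceptLeafEquals.
    public static double perDirection(int leafMass, int treeMass, int leafDepth, int dimensions) {
        double scoreSum = damp(leafMass, treeMass)
                / (leafDepth + Math.log(leafMass + 1) / Math.log(2));
        // The score is shared equally by the low and high side of every dimension.
        return normalize(scoreSum / (2 * dimensions), treeMass);
    }

    public static void main(String[] args) {
        double each = perDirection(10, 21, 100, 3);
        System.out.println("per-direction attribution: " + each);
        System.out.println("total over all directions: " + each * 2 * 3);
    }
}
```

Because the query equals the leaf point, no direction is favoured, which is why the test expects the same value in every entry of `result.low` and `result.high`.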

Example 7 with DiVector

use of com.amazon.randomcutforest.returntypes.DiVector in project random-cut-forest-by-aws by aws.

the class RandomCutForestFunctionalTest method testSideEffectsB.

@ParameterizedTest
@ArgumentsSource(TestForestProvider.class)
public void testSideEffectsB(RandomCutForest forest) {
    /* the changes to score and attribution should be in sync */
    DiVector initial = forest.getAnomalyAttribution(new double[] { 0.0, 0.0, 0.0 });
    NormalMixtureTestData generator2 = new NormalMixtureTestData(baseMu, baseSigma, anomalyMu, anomalySigma, transitionToAnomalyProbability, transitionToBaseProbability);
    double[][] newData = generator2.generateTestData(dataSize, dimensions);
    for (int i = 0; i < dataSize; i++) {
        forest.getAnomalyAttribution(newData[i]);
    }
    double newScore = forest.getAnomalyScore(new double[] { 0.0, 0.0, 0.0 });
    DiVector newVector = forest.getAnomalyAttribution(new double[] { 0.0, 0.0, 0.0 });
    assertEquals(initial.getHighLowSum(), newVector.getHighLowSum(), 10E-10);
    assertEquals(initial.getHighLowSum(), newScore, 1E-10);
    assertArrayEquals(initial.high, newVector.high, 1E-10);
    assertArrayEquals(initial.low, newVector.low, 1E-10);
}
Also used : DiVector(com.amazon.randomcutforest.returntypes.DiVector) NormalMixtureTestData(com.amazon.randomcutforest.testutils.NormalMixtureTestData) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest) ArgumentsSource(org.junit.jupiter.params.provider.ArgumentsSource)
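
The assertions above rely on the attribution summing to the anomaly score. As a rough illustration of that relationship, here is a minimal stand-in for a directional vector. `DiVectorSketch` is a hypothetical class written for this example; it only mimics the `high`, `low`, `getHighLowSum()` and `getHighLowSum(int)` members that the tests on this page use, and it is not the real `DiVector` implementation.

```java
public class DiVectorSketch {
    // Attribution toward larger values, one entry per dimension.
    public final double[] high;
    // Attribution toward smaller values, one entry per dimension.
    public final double[] low;

    public DiVectorSketch(int dimensions) {
        this.high = new double[dimensions];
        this.low = new double[dimensions];
    }

    // Total attribution over all dimensions and both directions.
    public double getHighLowSum() {
        double sum = 0.0;
        for (int i = 0; i < high.length; i++) {
            sum += high[i] + low[i];
        }
        return sum;
    }

    // Attribution of a single dimension, both directions combined.
    public double getHighLowSum(int i) {
        return high[i] + low[i];
    }

    public static void main(String[] args) {
        DiVectorSketch v = new DiVectorSketch(3);
        v.high[0] = 0.9;
        v.low[1] = 0.4;
        v.high[2] = 0.1;
        System.out.println("total: " + v.getHighLowSum());
        System.out.println("dimension 2: " + v.getHighLowSum(2));
    }
}
```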

Example 8 with DiVector

use of com.amazon.randomcutforest.returntypes.DiVector in project random-cut-forest-by-aws by aws.

the class RandomCutForestFunctionalTest method testShadowBuffer.

@Test
public void testShadowBuffer() {
    /**
     * This test checks that the attribution *DOES NOT* change as a ratio as more
     * copies of the points are added. The shadowbox in
     * the @DirectionalAttributionVisitor allows us to simulate a deletion without
     * performing a deletion.
     *
     * The goal is to measure the attribution and have many copies of the same point
     * and eventually the attribution will become uniform in all directions.
     *
     * we create a new forest so that other tests are unaffected.
     */
    numberOfTrees = 100;
    sampleSize = 256;
    dimensions = 3;
    randomSeed = 123;
    RandomCutForest newForest = RandomCutForest.builder()
            .numberOfTrees(numberOfTrees)
            .sampleSize(sampleSize)
            .dimensions(dimensions)
            .randomSeed(randomSeed)
            .centerOfMassEnabled(true)
            .timeDecay(1e-5)
            .storeSequenceIndexesEnabled(true)
            .build();
    dataSize = 10_000;
    baseMu = 0.0;
    baseSigma = 1.0;
    anomalyMu = 5.0;
    anomalySigma = 1.5;
    transitionToAnomalyProbability = 0.01;
    transitionToBaseProbability = 0.4;
    NormalMixtureTestData generator = new NormalMixtureTestData(baseMu, baseSigma, anomalyMu, anomalySigma, transitionToAnomalyProbability, transitionToBaseProbability);
    double[][] data = generator.generateTestData(dataSize, dimensions);
    for (int i = 0; i < dataSize; i++) {
        newForest.update(data[i]);
    }
    double[] point = new double[] { -8.0, -8.0, 0.0 };
    DiVector result = newForest.getAnomalyAttribution(point);
    double score = newForest.getAnomalyScore(point);
    assertEquals(score, result.getHighLowSum(), 1E-5);
    assertTrue(score > 2);
    assertTrue(result.getHighLowSum(2) < 0.2);
    // 256/10_000
    for (int i = 0; i < 5; i++) {
        newForest.update(point);
    }
    DiVector newResult = newForest.getAnomalyAttribution(point);
    double newScore = newForest.getAnomalyScore(point);
    assertEquals(newScore, newResult.getHighLowSum(), 1E-5);
    assertTrue(newScore < score);
    for (int j = 0; j < 3; j++) {
        // relationship holds at larger values
        if (result.high[j] > 0.2) {
            assertEquals(score * newResult.high[j], newScore * result.high[j], 0.1 * score);
        } else {
            assertTrue(newResult.high[j] < 0.2);
        }
        if (result.low[j] > 0.2) {
            assertEquals(score * newResult.low[j], newScore * result.low[j], 0.1 * score);
        } else {
            assertTrue(newResult.low[j] < 0.2);
        }
    }
    // this will make the point an inlier
    for (int i = 0; i < 5000; i++) {
        newForest.update(point);
    }
    DiVector finalResult = newForest.getAnomalyAttribution(point);
    double finalScore = newForest.getAnomalyScore(point);
    assertTrue(finalScore < 1);
    assertEquals(finalScore, finalResult.getHighLowSum(), 1E-5);
    for (int j = 0; j < 3; j++) {
        // relationship holds at larger values
        if (finalResult.high[j] > 0.2) {
            assertEquals(score * finalResult.high[j], finalScore * result.high[j], 0.1 * score);
        } else {
            assertTrue(finalResult.high[j] < 0.2);
        }
        if (finalResult.low[j] > 0.2) {
            assertEquals(score * finalResult.low[j], finalScore * result.low[j], 0.1 * score);
        } else {
            assertTrue(finalResult.low[j] < 0.2);
        }
    }
}
Also used : DiVector(com.amazon.randomcutforest.returntypes.DiVector) NormalMixtureTestData(com.amazon.randomcutforest.testutils.NormalMixtureTestData) Test(org.junit.jupiter.api.Test) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest)
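
The ratio checks in this test compare two attribution/score pairs by cross-multiplication, which avoids dividing by a score that may be small. A minimal sketch of that comparison follows; `RatioCheckSketch` and `ratioPreserved` are hypothetical names introduced for illustration and are not part of the library.

```java
public class RatioCheckSketch {
    // Returns true when attrA / scoreA and attrB / scoreB agree within a tolerance,
    // tested as |scoreA * attrB - scoreB * attrA| <= tolerance (no division needed).
    public static boolean ratioPreserved(double attrA, double scoreA,
                                         double attrB, double scoreB,
                                         double tolerance) {
        return Math.abs(scoreA * attrB - scoreB * attrA) <= tolerance;
    }

    public static void main(String[] args) {
        // The score halves and the attribution halves with it: the ratio is kept.
        System.out.println(ratioPreserved(1.0, 2.0, 0.5, 1.0, 1e-9));
        // The score halves but the attribution barely moves: the ratio changed.
        System.out.println(ratioPreserved(1.0, 2.0, 0.9, 1.0, 0.1));
    }
}
```

In the test the tolerance is `0.1 * score`, so the check is relative to the size of the original score rather than absolute.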

Example 9 with DiVector

use of com.amazon.randomcutforest.returntypes.DiVector in project random-cut-forest-by-aws by aws.

the class RandomCutForestTest method testGetAnomalyAttribution.

@Test
public void testGetAnomalyAttribution() {
    float[] point = { 1.2f, -3.4f };
    assertFalse(forest.isOutputReady());
    DiVector zero = new DiVector(dimensions);
    DiVector result = forest.getAnomalyAttribution(point);
    assertArrayEquals(zero.high, result.high);
    assertArrayEquals(zero.low, result.low);
    doReturn(true).when(forest).isOutputReady();
    DiVector expectedResult = new DiVector(dimensions);
    for (int i = 0; i < numberOfTrees; i++) {
        DiVector treeResult = new DiVector(dimensions);
        for (int j = 0; j < dimensions; j++) {
            treeResult.high[j] = Math.random();
            treeResult.low[j] = Math.random();
        }
        SamplerPlusTree<Integer, float[]> component = (SamplerPlusTree<Integer, float[]>) components.get(i);
        ITree<Integer, float[]> tree = component.getTree();
        when(tree.traverse(aryEq(point), any(VisitorFactory.class))).thenReturn(treeResult);
        when(tree.getMass()).thenReturn(256);
        DiVector.addToLeft(expectedResult, treeResult);
    }
    expectedResult = expectedResult.scale(1.0 / numberOfTrees);
    result = forest.getAnomalyAttribution(point);
    assertArrayEquals(expectedResult.high, result.high, EPSILON);
    assertArrayEquals(expectedResult.low, result.low, EPSILON);
}
Also used : DiVector(com.amazon.randomcutforest.returntypes.DiVector) SamplerPlusTree(com.amazon.randomcutforest.executor.SamplerPlusTree) Test(org.junit.jupiter.api.Test)
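
The expected result in this test is built by adding every tree's vector and then scaling by `1.0 / numberOfTrees`. The sketch below shows the same averaging on plain arrays. `AttributionAverageSketch` is a hypothetical helper for illustration only; it does not use `DiVector.addToLeft` or `scale`.

```java
import java.util.Arrays;

public class AttributionAverageSketch {
    // Element-wise mean of per-tree vectors: accumulate every tree,
    // then divide by the number of trees.
    public static double[] average(double[][] perTree) {
        int dimensions = perTree[0].length;
        double[] total = new double[dimensions];
        for (double[] tree : perTree) {
            for (int j = 0; j < dimensions; j++) {
                total[j] += tree[j];
            }
        }
        for (int j = 0; j < dimensions; j++) {
            total[j] /= perTree.length;
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] perTreeHigh = { { 1.0, 3.0 }, { 3.0, 5.0 } };
        System.out.println(Arrays.toString(average(perTreeHigh)));
    }
}
```

The same routine would be applied separately to the `high` and `low` arrays.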

Example 10 with DiVector

use of com.amazon.randomcutforest.returntypes.DiVector in project random-cut-forest-by-aws by aws.

the class RandomCutForest method getApproximateAnomalyAttribution.

public DiVector getApproximateAnomalyAttribution(float[] point) {
    if (!isOutputReady()) {
        return new DiVector(dimensions);
    }
    IVisitorFactory<DiVector> visitorFactory = new VisitorFactory<>(
            (tree, y) -> new AnomalyAttributionVisitor(tree.projectToTree(y), tree.getMass()),
            (tree, x) -> x.lift(tree::liftFromTree));
    ConvergingAccumulator<DiVector> accumulator = new OneSidedConvergingDiVectorAccumulator(
            dimensions,
            DEFAULT_APPROXIMATE_ANOMALY_SCORE_HIGH_IS_CRITICAL,
            DEFAULT_APPROXIMATE_DYNAMIC_SCORE_PRECISION,
            DEFAULT_APPROXIMATE_DYNAMIC_SCORE_MIN_VALUES_ACCEPTED,
            numberOfTrees);
    Function<DiVector, DiVector> finisher = x -> x.scale(1.0 / accumulator.getValuesAccepted());
    return traverseForest(transformToShingledPoint(point), visitorFactory, accumulator, finisher);
}
Also used : CommonUtils.checkNotNull(com.amazon.randomcutforest.CommonUtils.checkNotNull) Arrays(java.util.Arrays) BiFunction(java.util.function.BiFunction) ParallelForestTraversalExecutor(com.amazon.randomcutforest.executor.ParallelForestTraversalExecutor) Random(java.util.Random) ParallelForestUpdateExecutor(com.amazon.randomcutforest.executor.ParallelForestUpdateExecutor) AbstractForestUpdateExecutor(com.amazon.randomcutforest.executor.AbstractForestUpdateExecutor) IStateCoordinator(com.amazon.randomcutforest.executor.IStateCoordinator) RandomCutTree(com.amazon.randomcutforest.tree.RandomCutTree) CommonUtils.toFloatArray(com.amazon.randomcutforest.CommonUtils.toFloatArray) Neighbor(com.amazon.randomcutforest.returntypes.Neighbor) ConditionalSampleSummarizer(com.amazon.randomcutforest.imputation.ConditionalSampleSummarizer) ImputeVisitor(com.amazon.randomcutforest.imputation.ImputeVisitor) NearNeighborVisitor(com.amazon.randomcutforest.inspect.NearNeighborVisitor) IBoundingBoxView(com.amazon.randomcutforest.tree.IBoundingBoxView) Collector(java.util.stream.Collector) PointStoreCoordinator(com.amazon.randomcutforest.executor.PointStoreCoordinator) AnomalyAttributionVisitor(com.amazon.randomcutforest.anomalydetection.AnomalyAttributionVisitor) OneSidedConvergingDoubleAccumulator(com.amazon.randomcutforest.returntypes.OneSidedConvergingDoubleAccumulator) AnomalyScoreVisitor(com.amazon.randomcutforest.anomalydetection.AnomalyScoreVisitor) AbstractForestTraversalExecutor(com.amazon.randomcutforest.executor.AbstractForestTraversalExecutor) BinaryOperator(java.util.function.BinaryOperator) SequentialForestTraversalExecutor(com.amazon.randomcutforest.executor.SequentialForestTraversalExecutor) List(java.util.List) Math.max(java.lang.Math.max) Optional(java.util.Optional) DensityOutput(com.amazon.randomcutforest.returntypes.DensityOutput) CommonUtils.toDoubleArray(com.amazon.randomcutforest.CommonUtils.toDoubleArray) Precision(com.amazon.randomcutforest.config.Precision) 
CompactSampler(com.amazon.randomcutforest.sampler.CompactSampler) SamplerPlusTree(com.amazon.randomcutforest.executor.SamplerPlusTree) ShingleBuilder(com.amazon.randomcutforest.util.ShingleBuilder) Function(java.util.function.Function) ArrayList(java.util.ArrayList) SimulatedTransductiveScalarScoreVisitor(com.amazon.randomcutforest.anomalydetection.SimulatedTransductiveScalarScoreVisitor) PointStore(com.amazon.randomcutforest.store.PointStore) DynamicAttributionVisitor(com.amazon.randomcutforest.anomalydetection.DynamicAttributionVisitor) ConvergingAccumulator(com.amazon.randomcutforest.returntypes.ConvergingAccumulator) Config(com.amazon.randomcutforest.config.Config) SimpleInterpolationVisitor(com.amazon.randomcutforest.interpolation.SimpleInterpolationVisitor) InterpolationMeasure(com.amazon.randomcutforest.returntypes.InterpolationMeasure) IPointStore(com.amazon.randomcutforest.store.IPointStore) SequentialForestUpdateExecutor(com.amazon.randomcutforest.executor.SequentialForestUpdateExecutor) ArrayUtils(com.amazon.randomcutforest.util.ArrayUtils) OneSidedConvergingDiVectorAccumulator(com.amazon.randomcutforest.returntypes.OneSidedConvergingDiVectorAccumulator) CommonUtils.checkArgument(com.amazon.randomcutforest.CommonUtils.checkArgument) DynamicScoreVisitor(com.amazon.randomcutforest.anomalydetection.DynamicScoreVisitor) DiVector(com.amazon.randomcutforest.returntypes.DiVector) ITree(com.amazon.randomcutforest.tree.ITree) ConditionalTreeSample(com.amazon.randomcutforest.returntypes.ConditionalTreeSample) ConditionalSampleSummary(com.amazon.randomcutforest.returntypes.ConditionalSampleSummary) Collections(java.util.Collections) IStreamSampler(com.amazon.randomcutforest.sampler.IStreamSampler)
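
The approximate method stops visiting trees once the accumulator reports convergence, and its finisher divides by the number of values actually accepted rather than by `numberOfTrees`. The sketch below illustrates that general idea on scalar values. `ConvergingMeanSketch` and its stopping rule (running mean moves by at most `precision` after a minimum number of values) are assumptions made for this example; they are not the rule implemented by `OneSidedConvergingDiVectorAccumulator`.

```java
public class ConvergingMeanSketch {
    private final double precision;
    private final int minValuesAccepted;
    private double sum = 0.0;
    private int accepted = 0;
    private boolean converged = false;

    public ConvergingMeanSketch(double precision, int minValuesAccepted) {
        this.precision = precision;
        this.minValuesAccepted = minValuesAccepted;
    }

    // Accepts one per-tree value and re-evaluates the (hypothetical) stopping rule.
    public void accept(double value) {
        double previousMean = accepted == 0 ? 0.0 : sum / accepted;
        sum += value;
        accepted++;
        double mean = sum / accepted;
        if (accepted >= minValuesAccepted && Math.abs(mean - previousMean) <= precision) {
            converged = true;
        }
    }

    public boolean isConverged() {
        return converged;
    }

    public int getValuesAccepted() {
        return accepted;
    }

    // Finisher: average over the values that were accepted, not over all trees.
    public double getMean() {
        return accepted == 0 ? 0.0 : sum / accepted;
    }

    // Feeds per-tree values in order and stops early once converged.
    public static double run(double[] perTreeValues, double precision, int minValuesAccepted) {
        ConvergingMeanSketch accumulator = new ConvergingMeanSketch(precision, minValuesAccepted);
        for (double value : perTreeValues) {
            accumulator.accept(value);
            if (accumulator.isConverged()) {
                break;
            }
        }
        return accumulator.getMean();
    }

    public static void main(String[] args) {
        double[] perTree = { 1.0, 1.0, 1.0, 1.0, 100.0 };
        System.out.println(run(perTree, 0.01, 3));
    }
}
```

Early stopping trades a small loss of accuracy for fewer tree traversals, which is why the result must be scaled by the accepted count.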
