Example 16 with ThresholdedRandomCutForest

Use of com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest in project random-cut-forest-by-aws by aws.

From the class ThresholdedRandomCutForestMapperTest, method testRoundTripStandardInitial.

@ParameterizedTest
@EnumSource(value = TransformMethod.class)
public void testRoundTripStandardInitial(TransformMethod method) {
    int sampleSize = 256;
    int baseDimensions = 2;
    int shingleSize = 8;
    int dimensions = baseDimensions * shingleSize;
    long seed = new Random().nextLong();
    // Build two identically configured forests from the same seed.
    ThresholdedRandomCutForest first = new ThresholdedRandomCutForest.Builder<>()
            .compact(true).dimensions(dimensions).precision(Precision.FLOAT_32)
            .randomSeed(seed).internalShinglingEnabled(true).shingleSize(shingleSize)
            .anomalyRate(0.01).adjustThreshold(true).build();
    ThresholdedRandomCutForest second = new ThresholdedRandomCutForest.Builder<>()
            .compact(true).dimensions(dimensions).precision(Precision.FLOAT_32)
            .randomSeed(seed).internalShinglingEnabled(true).shingleSize(shingleSize)
            .anomalyRate(0.01).adjustThreshold(true).build();
    MultiDimDataWithKey dataWithKeys = ShingledMultiDimDataWithKeys.getMultiDimData(sampleSize, 50, 100, 5, seed, baseDimensions);
    for (double[] point : dataWithKeys.data) {
        AnomalyDescriptor firstResult = first.process(point, 0L);
        AnomalyDescriptor secondResult = second.process(point, 0L);
        assertEquals(firstResult.getRCFScore(), secondResult.getRCFScore(), 1e-10);
        assertEquals(secondResult.getAnomalyGrade(), firstResult.getAnomalyGrade(), 1e-10);
        // serialize + deserialize
        ThresholdedRandomCutForestMapper mapper = new ThresholdedRandomCutForestMapper();
        second = mapper.toModel(mapper.toState(second));
    }
}
Also used : Random(java.util.Random) AnomalyDescriptor(com.amazon.randomcutforest.parkservices.AnomalyDescriptor) MultiDimDataWithKey(com.amazon.randomcutforest.testutils.MultiDimDataWithKey) ThresholdedRandomCutForest(com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest) EnumSource(org.junit.jupiter.params.provider.EnumSource) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest)
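
The round-trip check in the test above (process a point with both models, then replace the second model with mapper.toModel(mapper.toState(second))) can be illustrated without the library. What follows is a minimal, standard-library-only sketch of that pattern. RunningStats, RunningStatsState, and Mapper are hypothetical stand-ins invented for illustration; they are not part of random-cut-forest-by-aws and do not reflect how ThresholdedRandomCutForestMapper is implemented.

```java
final class RoundTripSketch {

    /** Hypothetical stateful detector: scores a value by its z-score against earlier values. */
    static final class RunningStats {
        long count;
        double mean;
        double m2; // running sum of squared deviations (Welford's method)

        double process(double x) {
            double score = 0.0;
            if (count > 1) {
                double std = Math.sqrt(m2 / (count - 1));
                if (std > 0) {
                    score = Math.abs(x - mean) / std;
                }
            }
            // Update the running statistics only after scoring the point.
            count++;
            double delta = x - mean;
            mean += delta / count;
            m2 += delta * (x - mean);
            return score;
        }
    }

    /** Hypothetical plain-data snapshot of the detector, suitable for serialization. */
    static final class RunningStatsState {
        long count;
        double mean;
        double m2;
    }

    /** Hypothetical mapper that converts between the model and its state. */
    static final class Mapper {
        RunningStatsState toState(RunningStats model) {
            RunningStatsState state = new RunningStatsState();
            state.count = model.count;
            state.mean = model.mean;
            state.m2 = model.m2;
            return state;
        }

        RunningStats toModel(RunningStatsState state) {
            RunningStats model = new RunningStats();
            model.count = state.count;
            model.mean = state.mean;
            model.m2 = state.m2;
            return model;
        }
    }

    /** Returns true if a model rebuilt from its state after every point keeps matching the original. */
    static boolean roundTripMatches(double[] data) {
        RunningStats first = new RunningStats();
        RunningStats second = new RunningStats();
        Mapper mapper = new Mapper();
        for (double x : data) {
            if (first.process(x) != second.process(x)) {
                return false;
            }
            // "serialize + deserialize": replace the second model with a rebuilt copy
            second = mapper.toModel(mapper.toState(second));
        }
        return true;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 1.5, 9.0, 1.2};
        System.out.println(roundTripMatches(data));
    }
}
```

The sketch compares scores exactly because the state copies every field verbatim; the library test above instead allows a tolerance of 1e-10.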

Example 17 with ThresholdedRandomCutForest

Use of com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest in project random-cut-forest-by-aws by aws.

From the class ThresholdedRandomCutForestMapper, method toModel.

@Override
public ThresholdedRandomCutForest toModel(ThresholdedRandomCutForestState state, long seed) {
    RandomCutForestMapper randomCutForestMapper = new RandomCutForestMapper();
    BasicThresholderMapper thresholderMapper = new BasicThresholderMapper();
    PreprocessorMapper preprocessorMapper = new PreprocessorMapper();
    // Rebuild each component from its serialized state.
    RandomCutForest forest = randomCutForestMapper.toModel(state.getForestState());
    BasicThresholder thresholder = thresholderMapper.toModel(state.getThresholderState());
    Preprocessor preprocessor = preprocessorMapper.toModel(state.getPreprocessorStates()[0]);
    ForestMode forestMode = ForestMode.valueOf(state.getForestMode());
    TransformMethod transformMethod = TransformMethod.valueOf(state.getTransformMethod());
    // Restore the descriptor of the last anomaly recorded in the state.
    RCFComputeDescriptor descriptor = new RCFComputeDescriptor(null, 0L);
    descriptor.setRCFScore(state.getLastAnomalyScore());
    descriptor.setInternalTimeStamp(state.getLastAnomalyTimeStamp());
    descriptor.setAttribution(new DiVectorMapper().toModel(state.getLastAnomalyAttribution()));
    descriptor.setRCFPoint(state.getLastAnomalyPoint());
    descriptor.setExpectedRCFPoint(state.getLastExpectedPoint());
    descriptor.setRelativeIndex(state.getLastRelativeIndex());
    descriptor.setForestMode(forestMode);
    descriptor.setTransformMethod(transformMethod);
    descriptor.setImputationMethod(ImputationMethod.valueOf(state.getPreprocessorStates()[0].getImputationMethod()));
    // Restore the predictor-corrector settings.
    PredictorCorrector predictorCorrector = new PredictorCorrector(thresholder);
    predictorCorrector.setIgnoreSimilar(state.isIgnoreSimilar());
    predictorCorrector.setIgnoreSimilarFactor(state.getIgnoreSimilarFactor());
    predictorCorrector.setTriggerFactor(state.getTriggerFactor());
    predictorCorrector.setNumberOfAttributors(state.getNumberOfAttributors());
    return new ThresholdedRandomCutForest(forestMode, transformMethod, forest, predictorCorrector, preprocessor, descriptor);
}
Also used : ForestMode(com.amazon.randomcutforest.config.ForestMode) BasicThresholderMapper(com.amazon.randomcutforest.parkservices.state.threshold.BasicThresholderMapper) PredictorCorrector(com.amazon.randomcutforest.parkservices.PredictorCorrector) RandomCutForestMapper(com.amazon.randomcutforest.state.RandomCutForestMapper) RandomCutForest(com.amazon.randomcutforest.RandomCutForest) ThresholdedRandomCutForest(com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest) DiVectorMapper(com.amazon.randomcutforest.state.returntypes.DiVectorMapper) Preprocessor(com.amazon.randomcutforest.parkservices.preprocessor.Preprocessor) PreprocessorMapper(com.amazon.randomcutforest.parkservices.state.preprocessor.PreprocessorMapper) TransformMethod(com.amazon.randomcutforest.config.TransformMethod) IRCFComputeDescriptor(com.amazon.randomcutforest.parkservices.IRCFComputeDescriptor) RCFComputeDescriptor(com.amazon.randomcutforest.parkservices.RCFComputeDescriptor) BasicThresholder(com.amazon.randomcutforest.parkservices.threshold.BasicThresholder)

Example 18 with ThresholdedRandomCutForest

Use of com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest in project random-cut-forest-by-aws by aws.

From the class Thresholded1DGaussianMix, method run.

@Override
public void run() throws Exception {
    // Create and populate a random cut forest
    int shingleSize = 4;
    int numberOfTrees = 50;
    int sampleSize = 256;
    Precision precision = Precision.FLOAT_32;
    int dataSize = 4 * sampleSize;
    // change this to try different number of attributes,
    // this parameter is not expected to be larger than 5 for this example
    int baseDimensions = 1;
    int count = 0;
    int dimensions = baseDimensions * shingleSize;
    ThresholdedRandomCutForest forest = new ThresholdedRandomCutForest.Builder<>()
            .compact(true).dimensions(dimensions).randomSeed(0)
            .numberOfTrees(numberOfTrees).shingleSize(shingleSize).sampleSize(sampleSize)
            .precision(precision).anomalyRate(0.01).forestMode(ForestMode.TIME_AUGMENTED).build();
    long seed = new Random().nextLong();
    System.out.println("Anomalies would correspond to a run, based on a change of state.");
    System.out.println("Each change is normal <-> anomaly;  so after the second change the data is normal");
    System.out.println("seed = " + seed);
    NormalMixtureTestData normalMixtureTestData = new NormalMixtureTestData(10, 1.0, 50, 2.0, 0.01, 0.1);
    MultiDimDataWithKey dataWithKeys = normalMixtureTestData.generateTestDataWithKey(dataSize, 1, 0);
    int keyCounter = 0;
    for (double[] point : dataWithKeys.data) {
        AnomalyDescriptor result = forest.process(point, count);
        if (keyCounter < dataWithKeys.changeIndices.length && result.getInternalTimeStamp() == dataWithKeys.changeIndices[keyCounter]) {
            System.out.println("timestamp " + (result.getInputTimestamp()) + " CHANGE");
            ++keyCounter;
        }
        if (keyCounter < dataWithKeys.changeIndices.length && count == dataWithKeys.changeIndices[keyCounter]) {
            System.out.println("timestamp " + (count) + " CHANGE ");
            ++keyCounter;
        }
        if (result.getAnomalyGrade() != 0) {
            System.out.print("timestamp " + (count) + " RESULT value ");
            for (int i = 0; i < baseDimensions; i++) {
                System.out.print(result.getCurrentInput()[i] + ", ");
            }
            System.out.print("score " + result.getRCFScore() + ", grade " + result.getAnomalyGrade() + ", ");
            if (result.isExpectedValuesPresent()) {
                if (result.getRelativeIndex() != 0 && result.isStartOfAnomaly()) {
                    System.out.print(-result.getRelativeIndex() + " steps ago, instead of ");
                    for (int i = 0; i < baseDimensions; i++) {
                        System.out.print(result.getPastValues()[i] + ", ");
                    }
                    System.out.print("expected ");
                    for (int i = 0; i < baseDimensions; i++) {
                        System.out.print(result.getExpectedValuesList()[0][i] + ", ");
                        if (result.getPastValues()[i] != result.getExpectedValuesList()[0][i]) {
                            System.out.print("( " + (result.getPastValues()[i] - result.getExpectedValuesList()[0][i]) + " ) ");
                        }
                    }
                } else {
                    System.out.print("expected ");
                    for (int i = 0; i < baseDimensions; i++) {
                        System.out.print(result.getExpectedValuesList()[0][i] + ", ");
                        if (result.getCurrentInput()[i] != result.getExpectedValuesList()[0][i]) {
                            System.out.print("( " + (result.getCurrentInput()[i] - result.getExpectedValuesList()[0][i]) + " ) ");
                        }
                    }
                }
            }
            System.out.println();
        }
        ++count;
    }
}
Also used : Random(java.util.Random) Precision(com.amazon.randomcutforest.config.Precision) AnomalyDescriptor(com.amazon.randomcutforest.parkservices.AnomalyDescriptor) NormalMixtureTestData(com.amazon.randomcutforest.testutils.NormalMixtureTestData) MultiDimDataWithKey(com.amazon.randomcutforest.testutils.MultiDimDataWithKey) ThresholdedRandomCutForest(com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest)

Example 19 with ThresholdedRandomCutForest

Use of com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest in project random-cut-forest-by-aws by aws.

From the class ThresholdedMultiDimensionalExample, method run.

@Override
public void run() throws Exception {
    // Create and populate a random cut forest
    int shingleSize = 4;
    int numberOfTrees = 50;
    int sampleSize = 256;
    Precision precision = Precision.FLOAT_32;
    int dataSize = 4 * sampleSize;
    // change this to try different number of attributes,
    // this parameter is not expected to be larger than 5 for this example
    int baseDimensions = 2;
    int dimensions = baseDimensions * shingleSize;
    ThresholdedRandomCutForest forest = ThresholdedRandomCutForest.builder()
            .compact(true).dimensions(dimensions).randomSeed(0)
            .numberOfTrees(numberOfTrees).shingleSize(shingleSize).sampleSize(sampleSize)
            .precision(precision).anomalyRate(0.01).forestMode(ForestMode.STANDARD).build();
    long seed = new Random().nextLong();
    System.out.println("seed = " + seed);
    // change the last argument seed for a different run
    MultiDimDataWithKey dataWithKeys = ShingledMultiDimDataWithKeys.generateShingledDataWithKey(dataSize, 50, shingleSize, baseDimensions, seed);
    int keyCounter = 0;
    int count = 0;
    for (double[] point : dataWithKeys.data) {
        AnomalyDescriptor result = forest.process(point, 0L);
        if (keyCounter < dataWithKeys.changeIndices.length && count + shingleSize - 1 == dataWithKeys.changeIndices[keyCounter]) {
            System.out.println("timestamp " + (count + shingleSize - 1) + " CHANGE " + Arrays.toString(dataWithKeys.changes[keyCounter]));
            ++keyCounter;
        }
        if (result.getAnomalyGrade() != 0) {
            System.out.print("timestamp " + (count + shingleSize - 1) + " RESULT value ");
            for (int i = (shingleSize - 1) * baseDimensions; i < shingleSize * baseDimensions; i++) {
                System.out.print(result.getCurrentInput()[i] + ", ");
            }
            System.out.print("score " + result.getRCFScore() + ", grade " + result.getAnomalyGrade() + ", ");
            if (result.isExpectedValuesPresent()) {
                if (result.getRelativeIndex() != 0 && result.isStartOfAnomaly()) {
                    System.out.print(-result.getRelativeIndex() + " steps ago, instead of ");
                    for (int i = 0; i < baseDimensions; i++) {
                        System.out.print(result.getPastValues()[i] + ", ");
                    }
                    System.out.print("expected ");
                    for (int i = 0; i < baseDimensions; i++) {
                        System.out.print(result.getExpectedValuesList()[0][i] + ", ");
                        if (result.getPastValues()[i] != result.getExpectedValuesList()[0][i]) {
                            System.out.print("( " + (result.getPastValues()[i] - result.getExpectedValuesList()[0][i]) + " ) ");
                        }
                    }
                } else {
                    System.out.print("expected ");
                    for (int i = 0; i < baseDimensions; i++) {
                        System.out.print(result.getExpectedValuesList()[0][i] + ", ");
                        if (result.getCurrentInput()[(shingleSize - 1) * baseDimensions + i] != result.getExpectedValuesList()[0][i]) {
                            System.out.print("( " + (result.getCurrentInput()[(shingleSize - 1) * baseDimensions + i] - result.getExpectedValuesList()[0][i]) + " ) ");
                        }
                    }
                }
            }
            System.out.println();
        }
        ++count;
    }
}
Also used : Random(java.util.Random) Precision(com.amazon.randomcutforest.config.Precision) AnomalyDescriptor(com.amazon.randomcutforest.parkservices.AnomalyDescriptor) MultiDimDataWithKey(com.amazon.randomcutforest.testutils.MultiDimDataWithKey) ThresholdedRandomCutForest(com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest)
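
The index arithmetic in the example above (dimensions = baseDimensions * shingleSize, the current input read from offset (shingleSize - 1) * baseDimensions, and timestamps reported as count + shingleSize - 1) follows from how a shingled point is laid out. The following is a minimal sketch of that layout using only the standard library. The helpers shingle and latest are hypothetical names introduced for illustration, and the oldest-first ordering is an assumption inferred from the indexing in the example, not a statement about the library's internals.

```java
import java.util.Arrays;

final class ShingleSketch {

    /**
     * Hypothetical helper: concatenates the shingleSize consecutive points that end
     * at index `end` (inclusive), oldest first, into one flat vector.
     */
    static double[] shingle(double[][] data, int end, int shingleSize) {
        if (end < shingleSize - 1 || end >= data.length) {
            throw new IllegalArgumentException("not enough points to fill the shingle");
        }
        int baseDimensions = data[0].length;
        double[] out = new double[baseDimensions * shingleSize];
        int start = end - shingleSize + 1;
        for (int k = 0; k < shingleSize; k++) {
            System.arraycopy(data[start + k], 0, out, k * baseDimensions, baseDimensions);
        }
        return out;
    }

    /** Hypothetical helper: the newest observation occupies the last baseDimensions slots. */
    static double[] latest(double[] shingled, int baseDimensions) {
        return Arrays.copyOfRange(shingled, shingled.length - baseDimensions, shingled.length);
    }

    public static void main(String[] args) {
        int shingleSize = 4;
        int baseDimensions = 2;
        double[][] data = {{1, 10}, {2, 20}, {3, 30}, {4, 40}, {5, 50}};

        // The shingled point with index `count` ends at original index count + shingleSize - 1.
        for (int count = 0; count + shingleSize - 1 < data.length; count++) {
            double[] point = shingle(data, count + shingleSize - 1, shingleSize);
            System.out.println("timestamp " + (count + shingleSize - 1)
                    + " shingle " + Arrays.toString(point)
                    + " latest " + Arrays.toString(latest(point, baseDimensions)));
        }
    }
}
```

With five two-dimensional points and a shingle size of 4, the loop yields two shingled points of length 8, ending at original indices 3 and 4.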

Example 20 with ThresholdedRandomCutForest

Use of com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest in project ml-commons by opensearch-project.

From the class FixedInTimeRandomCutForest, method train.

@Override
public Model train(DataFrame dataFrame) {
    ThresholdedRandomCutForest forest = createThresholdedRandomCutForest(dataFrame);
    process(dataFrame, forest);
    Model model = new Model();
    model.setName(FunctionName.FIT_RCF.name());
    model.setVersion(1);
    ThresholdedRandomCutForestState state = trcfMapper.toState(forest);
    model.setContent(ModelSerDeSer.serialize(state));
    return model;
}
Also used : Model(org.opensearch.ml.common.parameter.Model) ThresholdedRandomCutForestState(com.amazon.randomcutforest.parkservices.state.ThresholdedRandomCutForestState) ThresholdedRandomCutForest(com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest)