Example 6 with RandomCutForestMapper

Use of com.amazon.randomcutforest.state.RandomCutForestMapper in project random-cut-forest-by-aws by aws.

Class ProtostuffExampleWithShingles, method run.

@Override
public void run() throws Exception {
    // Create and populate a random cut forest
    int dimensions = 10;
    int numberOfTrees = 50;
    int sampleSize = 256;
    Precision precision = Precision.FLOAT_64;
    RandomCutForest forest = RandomCutForest.builder().compact(true).dimensions(dimensions).numberOfTrees(numberOfTrees).sampleSize(sampleSize).precision(precision).shingleSize(dimensions).build();
    int dataSize = 1000 * sampleSize;
    for (double[] point : generateShingledData(dataSize, dimensions, 0)) {
        forest.update(point);
    }
    // Convert to an array of bytes and print the size
    RandomCutForestMapper mapper = new RandomCutForestMapper();
    mapper.setSaveExecutorContextEnabled(true);
    mapper.setSaveTreeStateEnabled(false);
    Schema<RandomCutForestState> schema = RuntimeSchema.getSchema(RandomCutForestState.class);
    LinkedBuffer buffer = LinkedBuffer.allocate(512);
    byte[] bytes;
    try {
        RandomCutForestState state = mapper.toState(forest);
        bytes = ProtostuffIOUtil.toByteArray(state, schema, buffer);
    } finally {
        buffer.clear();
    }
    System.out.printf("dimensions = %d, numberOfTrees = %d, sampleSize = %d, precision = %s%n", dimensions, numberOfTrees, sampleSize, precision);
    System.out.printf("protostuff size = %d bytes%n", bytes.length);
    // Restore from protostuff and compare anomaly scores produced by the two
    // forests
    RandomCutForestState state2 = schema.newMessage();
    ProtostuffIOUtil.mergeFrom(bytes, state2, schema);
    RandomCutForest forest2 = mapper.toModel(state2);
    int testSize = 10000;
    double delta = Math.log(sampleSize) / Math.log(2) * 0.05;
    int differences = 0;
    int anomalies = 0;
    for (double[] point : generateShingledData(testSize, dimensions, 2)) {
        double score = forest.getAnomalyScore(point);
        double score2 = forest2.getAnomalyScore(point);
        // only compare points that at least one forest scores as an anomaly
        // (score > 1); such points should also score similarly in the other forest
        if (score > 1 || score2 > 1) {
            anomalies++;
            if (Math.abs(score - score2) > delta) {
                differences++;
            }
        }
        forest.update(point);
        forest2.update(point);
    }
    // validate that the two forests agree on anomaly scores
    if (differences >= 0.01 * testSize) {
        throw new IllegalStateException("restored forest does not agree with original forest");
    }
    System.out.println("Looks good!");
}
Also used : LinkedBuffer(io.protostuff.LinkedBuffer) Precision(com.amazon.randomcutforest.config.Precision) RandomCutForest(com.amazon.randomcutforest.RandomCutForest) RandomCutForestMapper(com.amazon.randomcutforest.state.RandomCutForestMapper) RandomCutForestState(com.amazon.randomcutforest.state.RandomCutForestState)
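
The acceptance criteria in Example 6 are plain arithmetic, and it can help to see the numbers. The following is a minimal sketch using only the JDK; the class name ScoreTolerance and its methods are hypothetical helpers, not part of random-cut-forest. It reproduces the tolerance expression Math.log(sampleSize) / Math.log(2) * 0.05 (roughly 0.4 for a sample size of 256) and the rule that the run fails once differences reach 1% of testSize.

```java
// Hypothetical helper class for illustration; not part of the random-cut-forest library.
class ScoreTolerance {

    // Same expression as in the example: log2(sampleSize) * 0.05.
    static double delta(int sampleSize) {
        return Math.log(sampleSize) / Math.log(2) * 0.05;
    }

    // The example throws when differences >= 0.01 * testSize,
    // so a run passes only while differences stay strictly below that bound.
    static boolean passes(int differences, int testSize) {
        return differences < 0.01 * testSize;
    }

    public static void main(String[] args) {
        System.out.printf("delta for sampleSize 256 = %.4f%n", delta(256));
        System.out.println("99 differences in 10000 points passes: " + passes(99, 10000));
        System.out.println("100 differences in 10000 points passes: " + passes(100, 10000));
    }
}
```

Because the tolerance scales with the logarithm of the sample size, larger samples are allowed slightly larger score gaps before a point counts as a disagreement.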

Example 7 with RandomCutForestMapper

Use of com.amazon.randomcutforest.state.RandomCutForestMapper in project random-cut-forest-by-aws by aws.

Class RandomCutForestTest, method testUpdateAfterRoundTrip.

@Test
public void testUpdateAfterRoundTrip() {
    int dimensions = 10;
    for (int trials = 0; trials < 10; trials++) {
        RandomCutForest forest = RandomCutForest.builder().compact(true).dimensions(dimensions).sampleSize(64).build();
        Random r = new Random();
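        // the seeded generator returns the same bound on every evaluation, so compute it once
        int numberOfUpdates = new Random(trials).nextInt(3000);
        for (int i = 0; i < numberOfUpdates; i++) {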
            forest.update(r.ints(dimensions, 0, 50).asDoubleStream().toArray());
        }
        // serialize + deserialize
        RandomCutForestMapper mapper = new RandomCutForestMapper();
        mapper.setSaveExecutorContextEnabled(true);
        mapper.setSaveTreeStateEnabled(true);
        RandomCutForest forest2 = mapper.toModel(mapper.toState(forest));
        // update re-instantiated forest
        for (int i = 0; i < 10000; i++) {
            double[] point = r.ints(dimensions, 0, 50).asDoubleStream().toArray();
            double score = forest.getAnomalyScore(point);
            assertEquals(score, forest2.getAnomalyScore(point), 1e-5);
            forest2.update(point);
            forest.update(point);
        }
    }
}
Also used : Random(java.util.Random) RandomCutForestMapper(com.amazon.randomcutforest.state.RandomCutForestMapper) Test(org.junit.jupiter.api.Test)
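
testUpdateAfterRoundTrip takes its number of warm-up updates from new Random(trials).nextInt(3000). Since the generator is seeded with the trial index, that expression gives the same value however often it is evaluated, so each trial has a fixed, reproducible update count. The sketch below, which uses only the JDK and a hypothetical class name SeededBound, checks that property by comparing a bound computed once with a loop that re-evaluates the expression in its condition.

```java
import java.util.Random;

// Hypothetical demonstration class; not part of the random-cut-forest library.
class SeededBound {

    // Evaluates the seeded expression a single time.
    static int bound(long seed) {
        return new Random(seed).nextInt(3000);
    }

    // Counts loop iterations when the seeded expression is re-evaluated
    // on every check of the loop condition.
    static int iterationsWithInlineBound(long seed) {
        int iterations = 0;
        for (int i = 0; i < new Random(seed).nextInt(3000); i++) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        for (long seed = 0; seed < 10; seed++) {
            System.out.println("seed " + seed + ": bound = " + bound(seed)
                    + ", inline iterations = " + iterationsWithInlineBound(seed));
        }
    }
}
```

The two numbers agree for every seed. The same does not hold for an unseeded new Random(), which can return a different value on each evaluation.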

Example 8 with RandomCutForestMapper

Use of com.amazon.randomcutforest.state.RandomCutForestMapper in project random-cut-forest-by-aws by aws.

Class RandomCutForestTest, method testUpdateAfterRoundTripLargeNodeStore.

@Test
public void testUpdateAfterRoundTripLargeNodeStore() {
    int dimensions = 5;
    for (int trials = 0; trials < 10; trials++) {
        RandomCutForest forest = RandomCutForest.builder().compact(true).dimensions(dimensions).numberOfTrees(1).sampleSize(20000).precision(Precision.FLOAT_32).build();
        Random r = new Random();
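        // draw the update count once; an unseeded bound re-evaluated in the loop
        // condition would be compared against a different value on every iteration
        int numberOfUpdates = 30000 + new Random().nextInt(300);
        for (int i = 0; i < numberOfUpdates; i++) {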
            forest.update(r.ints(dimensions, 0, 50).asDoubleStream().toArray());
        }
        // serialize + deserialize
        RandomCutForestMapper mapper = new RandomCutForestMapper();
        mapper.setSaveTreeStateEnabled(true);
        mapper.setSaveExecutorContextEnabled(true);
        RandomCutForestState state = mapper.toState(forest);
        RandomCutForest forest2 = mapper.toModel(state);
        // update re-instantiated forest
        for (int i = 0; i < 10000; i++) {
            double[] point = r.ints(dimensions, 0, 50).asDoubleStream().toArray();
            double score = forest.getAnomalyScore(point);
            assertEquals(score, forest2.getAnomalyScore(point), 1E-10);
            forest2.update(point);
            forest.update(point);
        }
    }
}
Also used : Random(java.util.Random) RandomCutForestMapper(com.amazon.randomcutforest.state.RandomCutForestMapper) RandomCutForestState(com.amazon.randomcutforest.state.RandomCutForestState) Test(org.junit.jupiter.api.Test)
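
Both round-trip tests build their input points with r.ints(dimensions, 0, 50).asDoubleStream().toArray(). As a small sketch of what that expression produces (JDK only; the class name IntegerPoints is hypothetical): each point is a double[] of length dimensions whose entries are whole numbers from 0 up to, but not including, 50.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demonstration class; not part of the random-cut-forest library.
class IntegerPoints {

    // Same expression as in the tests: draw `dimensions` ints in [0, 50)
    // and widen them to doubles.
    static double[] nextPoint(Random r, int dimensions) {
        return r.ints(dimensions, 0, 50).asDoubleStream().toArray();
    }

    public static void main(String[] args) {
        Random r = new Random(0);
        for (int i = 0; i < 3; i++) {
            System.out.println(Arrays.toString(nextPoint(r, 5)));
        }
    }
}
```

Because the coordinates come from a small integer grid, repeated values are likely, so the forests in these tests see duplicate coordinates and possibly duplicate points during both warm-up and comparison.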

Example 9 with RandomCutForestMapper

Use of com.amazon.randomcutforest.state.RandomCutForestMapper in project random-cut-forest-by-aws by aws.

Class StateMapperShingledBenchmark, method roundTripFromState.

@Benchmark
@OperationsPerInvocation(NUM_TEST_SAMPLES)
public RandomCutForestState roundTripFromState(BenchmarkState state, Blackhole blackhole) {
    RandomCutForestState forestState = state.forestState;
    double[][] testData = state.testData;
    for (int i = 0; i < NUM_TEST_SAMPLES; i++) {
        RandomCutForestMapper mapper = new RandomCutForestMapper();
        mapper.setSaveExecutorContextEnabled(true);
        mapper.setSaveTreeStateEnabled(state.saveTreeState);
        RandomCutForest forest = mapper.toModel(forestState);
        double score = forest.getAnomalyScore(testData[i]);
        blackhole.consume(score);
        forest.update(testData[i]);
        forestState = mapper.toState(forest);
    }
    return forestState;
}
Also used : RandomCutForestMapper(com.amazon.randomcutforest.state.RandomCutForestMapper) RandomCutForestState(com.amazon.randomcutforest.state.RandomCutForestState) Benchmark(org.openjdk.jmh.annotations.Benchmark) OperationsPerInvocation(org.openjdk.jmh.annotations.OperationsPerInvocation)

Example 10 with RandomCutForestMapper

Use of com.amazon.randomcutforest.state.RandomCutForestMapper in project random-cut-forest-by-aws by aws.

Class V1JsonToV3StateConverterTest, method testConvert.

@ParameterizedTest
@MethodSource("args")
public void testConvert(V1JsonResource jsonResource, Precision precision) {
    String resource = jsonResource.getResource();
    try (InputStream is = V1JsonToV3StateConverterTest.class.getResourceAsStream(resource);
        BufferedReader rr = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8))) {
        StringBuilder b = new StringBuilder();
        String line;
        while ((line = rr.readLine()) != null) {
            b.append(line);
        }
        String json = b.toString();
        RandomCutForestState state = converter.convert(json, precision);
        assertEquals(jsonResource.getDimensions(), state.getDimensions());
        assertEquals(jsonResource.getNumberOfTrees(), state.getNumberOfTrees());
        assertEquals(jsonResource.getSampleSize(), state.getSampleSize());
        RandomCutForest forest = new RandomCutForestMapper().toModel(state, 0);
        assertEquals(jsonResource.getDimensions(), forest.getDimensions());
        assertEquals(jsonResource.getNumberOfTrees(), forest.getNumberOfTrees());
        assertEquals(jsonResource.getSampleSize(), forest.getSampleSize());
        // perform a simple validation of the deserialized forest by update and scoring
        // with a few points
        Random random = new Random(0);
        for (int i = 0; i < 100; i++) {
            double[] point = getPoint(jsonResource.getDimensions(), random);
            double score = forest.getAnomalyScore(point);
            assertTrue(score > 0);
            forest.update(point);
        }
        String newString = new ObjectMapper().writeValueAsString(new RandomCutForestMapper().toState(forest));
        System.out.println(" Old size " + json.length() + ", new Size " + newString.length() + ", improvement factor " + json.length() / newString.length());
    } catch (IOException e) {
        fail("Unable to load JSON resource");
    }
}
Also used : InputStreamReader(java.io.InputStreamReader) InputStream(java.io.InputStream) RandomCutForest(com.amazon.randomcutforest.RandomCutForest) RandomCutForestState(com.amazon.randomcutforest.state.RandomCutForestState) IOException(java.io.IOException) Random(java.util.Random) RandomCutForestMapper(com.amazon.randomcutforest.state.RandomCutForestMapper) BufferedReader(java.io.BufferedReader) ObjectMapper(com.fasterxml.jackson.databind.ObjectMapper) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest) MethodSource(org.junit.jupiter.params.provider.MethodSource)
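
One detail of testConvert worth noting: the "improvement factor" it prints divides two int string lengths, so Java performs integer division and drops the fractional part. The sketch below (JDK only; the class name ImprovementFactor and the sizes are made up for illustration, not measured values) contrasts the truncated result with a floating-point ratio.

```java
// Hypothetical demonstration class; the sizes used here are illustrative, not measured.
class ImprovementFactor {

    // int / int truncates toward zero, as in the test's printed message.
    static int truncated(int oldSize, int newSize) {
        return oldSize / newSize;
    }

    // Casting one operand to double keeps the fractional part.
    static double exact(int oldSize, int newSize) {
        return (double) oldSize / newSize;
    }

    public static void main(String[] args) {
        int oldSize = 1500; // assumed length of the old JSON string
        int newSize = 1000; // assumed length of the new JSON string
        System.out.println("truncated factor: " + truncated(oldSize, newSize));
        System.out.println("exact factor: " + exact(oldSize, newSize));
    }
}
```

For a rough log message the truncated value may be acceptable, but any ratio below 2 is reported as 1, which hides modest size reductions.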

Aggregations

RandomCutForestMapper (com.amazon.randomcutforest.state.RandomCutForestMapper) 21
RandomCutForestState (com.amazon.randomcutforest.state.RandomCutForestState) 15
RandomCutForest (com.amazon.randomcutforest.RandomCutForest) 10
Precision (com.amazon.randomcutforest.config.Precision) 6
Benchmark (org.openjdk.jmh.annotations.Benchmark) 6
OperationsPerInvocation (org.openjdk.jmh.annotations.OperationsPerInvocation) 6
NormalMixtureTestData (com.amazon.randomcutforest.testutils.NormalMixtureTestData) 5
ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper) 5
LinkedBuffer (io.protostuff.LinkedBuffer) 5
Random (java.util.Random) 5
Test (org.junit.jupiter.api.Test) 4
ParameterizedTest (org.junit.jupiter.params.ParameterizedTest) 4
IRCFComputeDescriptor (com.amazon.randomcutforest.parkservices.IRCFComputeDescriptor) 2
ThresholdedRandomCutForest (com.amazon.randomcutforest.parkservices.ThresholdedRandomCutForest) 2
Preprocessor (com.amazon.randomcutforest.parkservices.preprocessor.Preprocessor) 2
PreprocessorMapper (com.amazon.randomcutforest.parkservices.state.preprocessor.PreprocessorMapper) 2
BasicThresholderMapper (com.amazon.randomcutforest.parkservices.state.threshold.BasicThresholderMapper) 2
DiVectorMapper (com.amazon.randomcutforest.state.returntypes.DiVectorMapper) 2
BufferedReader (java.io.BufferedReader) 2
IOException (java.io.IOException) 2