Example 21 with EntityModel

use of org.opensearch.ad.ml.EntityModel in project anomaly-detection by opensearch-project.

the class CheckpointWriteWorkerTests method testEmptyDetectorId.

@SuppressWarnings("unchecked")
public void testEmptyDetectorId() {
    ModelState<EntityModel> state = mock(ModelState.class);
    when(state.getLastCheckpointTime()).thenReturn(Instant.now());
    EntityModel model = mock(EntityModel.class);
    when(state.getModel()).thenReturn(model);
    when(state.getDetectorId()).thenReturn(null);
    when(state.getModelId()).thenReturn("a");
    worker.write(state, true, RequestPriority.MEDIUM);
    verify(checkpoint, never()).batchWrite(any(), any());
}
Also used : EntityModel(org.opensearch.ad.ml.EntityModel)

Example 22 with EntityModel

use of org.opensearch.ad.ml.EntityModel in project anomaly-detection by opensearch-project.

the class PriorityCache method getAllModelProfile.

@Override
public List<ModelProfile> getAllModelProfile(String detectorId) {
    CacheBuffer cacheBuffer = activeEnities.get(detectorId);
    List<ModelProfile> res = new ArrayList<>();
    if (cacheBuffer != null) {
        long size = cacheBuffer.getMemoryConsumptionPerEntity();
        cacheBuffer.getAllModels().forEach(entry -> {
            EntityModel model = entry.getModel();
            Entity entity = null;
            if (model != null && model.getEntity().isPresent()) {
                entity = model.getEntity().get();
            }
            res.add(new ModelProfile(entry.getModelId(), entity, size));
        });
    }
    return res;
}
Also used : Entity(org.opensearch.ad.model.Entity) ArrayList(java.util.ArrayList) EntityModel(org.opensearch.ad.ml.EntityModel) ModelProfile(org.opensearch.ad.model.ModelProfile)

Example 23 with EntityModel

use of org.opensearch.ad.ml.EntityModel in project anomaly-detection by opensearch-project.

the class CacheBuffer method remove.

/**
 * Remove everything associated with the key and make a checkpoint.
 *
 * @param keyToRemove The key to remove
 * @return the ModelState associated with the key, or null if there
 * is no ModelState for the key
 */
public ModelState<EntityModel> remove(String keyToRemove) {
    priorityTracker.removePriority(keyToRemove);
    // if shared cache is empty, we are using reserved memory
    boolean reserved = sharedCacheEmpty();
    ModelState<EntityModel> valueRemoved = items.remove(keyToRemove);
    if (valueRemoved != null) {
        if (!reserved) {
            // release in shared memory
            memoryTracker.releaseMemory(memoryConsumptionPerEntity, false, Origin.HC_DETECTOR);
        }
        EntityModel modelRemoved = valueRemoved.getModel();
        if (modelRemoved != null) {
            // A null model has only samples. For a null model we save a checkpoint
            // regardless of the last checkpoint time. If we don't save, we throw away
            // the new samples and might never be able to initialize the model.
            boolean isNullModel = !modelRemoved.getTrcf().isPresent();
            checkpointWriteQueue.write(valueRemoved, isNullModel, RequestPriority.MEDIUM);
            modelRemoved.clear();
        }
    }
    return valueRemoved;
}
Also used : EntityModel(org.opensearch.ad.ml.EntityModel)
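
The comment in remove says a model that holds only samples is checkpointed regardless of the last checkpoint time. The following is a minimal sketch of that decision, not the project's implementation: the class ForcedCheckpointSketch, the methods shouldWrite and isSamplesOnly, and the one-hour minimum interval are all hypothetical names and values chosen for illustration. It assumes the write queue normally skips a state that was checkpointed recently and that the boolean argument bypasses that check.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

public class ForcedCheckpointSketch {

    // Hypothetical rule: write when forced, or when the last checkpoint
    // is older than the minimum interval between checkpoints.
    static boolean shouldWrite(Instant lastCheckpoint, Duration minInterval, Instant now, boolean force) {
        return force || lastCheckpoint.plus(minInterval).isBefore(now);
    }

    // Mirrors `!modelRemoved.getTrcf().isPresent()`: a state without a
    // trained model holds only samples.
    static boolean isSamplesOnly(Optional<?> trainedModel) {
        return !trainedModel.isPresent();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-01T12:00:00Z");
        Instant recent = now.minus(Duration.ofMinutes(5));
        Duration minInterval = Duration.ofHours(1); // assumed value

        // Checkpointed 5 minutes ago and not forced: skipped.
        System.out.println(shouldWrite(recent, minInterval, now, false)); // false

        // Same timestamps, but the state has no trained model: forced.
        boolean force = isSamplesOnly(Optional.empty());
        System.out.println(shouldWrite(recent, minInterval, now, force)); // true
    }
}
```

Forcing the write for samples-only states trades a few extra checkpoint requests for not losing the samples needed to initialize the model later.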

Example 24 with EntityModel

use of org.opensearch.ad.ml.EntityModel in project anomaly-detection by opensearch-project.

the class CacheBuffer method maintenance.

/**
 * Remove expired state and save checkpoints of existing states
 * @return removed states
 */
public List<ModelState<EntityModel>> maintenance() {
    List<ModelState<EntityModel>> modelsToSave = new ArrayList<>();
    List<ModelState<EntityModel>> removedStates = new ArrayList<>();
    items.entrySet().stream().forEach(entry -> {
        String entityModelId = entry.getKey();
        try {
            ModelState<EntityModel> modelState = entry.getValue();
            Instant now = clock.instant();
            if (modelState.getLastUsedTime().plus(modelTtl).isBefore(now)) {
                // race conditions can happen between the put and one of the following operations:
                // remove: not a problem as all of the data structures are concurrent.
                // Two threads removing the same entry is not a problem.
                // clear: not a problem as we are releasing memory in MemoryTracker.
                // The removed one loses references and soon GC will collect it.
                // We have memory tracking correction to fix incorrect memory usage record.
                // put: not a problem as we are unlikely to maintain an entry that's not
                // already in the cache
                // remove method saves checkpoint as well
                removedStates.add(remove(entityModelId));
            } else if (random.nextInt(6) == 0) {
                // A checkpoint is relatively big compared to other queued requests.
                // Save checkpoints with 1/6 probability, as we expect to save
                // all of them every 6 hours, statistically.
                //
                // Background:
                // We save a checkpoint when
                //
                // (a) removing the model from the cache,
                // (b) cold starting,
                // (c) there is no complete model, only a few samples. If we don't save new
                // samples, we will never have enough samples for a trained model.
                // (d) periodically, in case of exceptions.
                //
                // This branch does (d). Previously, I did it every hour for all
                // in-cache models. Considering we are moving to 1M entities, this would put
                // a heavy load on the cluster every hour. That's why I am doing it randomly
                // (an expected 6 hours for each checkpoint, statistically).
                //
                // I am doing it randomly since maintaining the state of which models have been
                // saved and which haven't is not cheap. Also, the models in the cache can
                // change dynamically, so we would have to maintain that state in the removal logic.
                // Randomness is a lazy way to deal with this, as it is stateless and statistically sound.
                //
                // If a checkpoint does not fall into the 6-hour bucket in a particular scenario, the
                // saved model is stale (i.e., we don't recover from the freshest model in a disaster).
                //
                // All in all, randomness is mostly due to performance and easy maintenance.
                modelsToSave.add(modelState);
            }
        } catch (Exception e) {
            LOG.warn("Failed to finish maintenance for model id " + entityModelId, e);
        }
    });
    checkpointWriteQueue.writeAll(modelsToSave, detectorId, false, RequestPriority.MEDIUM);
    return removedStates;
}
Also used : Instant(java.time.Instant) ArrayList(java.util.ArrayList) EntityModel(org.opensearch.ad.ml.EntityModel) ModelState(org.opensearch.ad.ml.ModelState)
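
The comments in maintenance argue that saving each model with probability 1/6 per run gives an expected 6 runs between checkpoints without tracking any per-model state. The sketch below checks that reasoning numerically. It is an illustration only: the class RandomCheckpointSketch and its methods are hypothetical, and reading "6 runs" as "6 hours" assumes maintenance runs once per hour, which the comments imply but the excerpt does not show.

```java
import java.util.Random;

public class RandomCheckpointSketch {

    // Number of maintenance runs until a model is first selected, when each
    // run selects it independently with probability 1/oneIn.
    static int runsUntilSaved(Random random, int oneIn) {
        int runs = 1;
        while (random.nextInt(oneIn) != 0) {
            runs++;
        }
        return runs;
    }

    // Average of runsUntilSaved over many simulated models.
    static double averageRunsUntilSaved(long seed, int oneIn, int models) {
        Random random = new Random(seed);
        long total = 0;
        for (int i = 0; i < models; i++) {
            total += runsUntilSaved(random, oneIn);
        }
        return (double) total / models;
    }

    // Probability that a model is not selected in any of the given runs.
    static double probabilityNotSavedWithin(int runs, int oneIn) {
        return Math.pow((oneIn - 1.0) / oneIn, runs);
    }

    public static void main(String[] args) {
        // The mean of a geometric distribution with p = 1/6 is 6.
        System.out.printf("average runs until saved: %.2f%n",
            averageRunsUntilSaved(42L, 6, 100_000));
        // Chance that a model goes 6 runs without a periodic checkpoint.
        System.out.printf("P(not saved within 6 runs): %.3f%n",
            probabilityNotSavedWithin(6, 6));
    }
}
```

The second figure is (5/6)^6, roughly one third, which is the staleness case the comments accept: a noticeable share of models go longer than 6 runs without a periodic checkpoint, in exchange for a stateless scheme.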

Aggregations

EntityModel (org.opensearch.ad.ml.EntityModel) 24
ModelState (org.opensearch.ad.ml.ModelState) 8
Entity (org.opensearch.ad.model.Entity) 8
ArrayList (java.util.ArrayList) 7
Instant (java.time.Instant) 5
ParameterizedMessage (org.apache.logging.log4j.message.ParameterizedMessage) 4
EntityCache (org.opensearch.ad.caching.EntityCache) 4
Clock (java.time.Clock) 3
HashMap (java.util.HashMap) 3
Before (org.junit.Before) 3
ThresholdingResult (org.opensearch.ad.ml.ThresholdingResult) 3
AnomalyDetector (org.opensearch.ad.model.AnomalyDetector) 3
RandomModelStateConfig (test.org.opensearch.ad.util.RandomModelStateConfig) 3
ArrayDeque (java.util.ArrayDeque) 2
Map (java.util.Map) 2
Optional (java.util.Optional) 2
Random (java.util.Random) 2
ArgumentMatchers.anyString (org.mockito.ArgumentMatchers.anyString) 2
CacheProvider (org.opensearch.ad.caching.CacheProvider) 2
AnomalyDetectionIndices (org.opensearch.ad.indices.AnomalyDetectionIndices) 2