Example 1 with NumericFeatureDistribution

use of org.kie.kogito.explainability.model.NumericFeatureDistribution in project kogito-apps by kiegroup.

the class CounterfactualEntityFactoryTest method testDurationFactory.

@Test
void testDurationFactory() {
    final Duration value = Duration.ofDays(1);
    Feature feature = FeatureFactory.newDurationFeature("duration-feature", value);
    CounterfactualEntity counterfactualEntity = CounterfactualEntityFactory.from(feature);
    assertTrue(counterfactualEntity instanceof FixedDurationEntity);
    assertEquals(Type.DURATION, counterfactualEntity.asFeature().getType());
    FeatureDomain domain = DurationFeatureDomain.create(0, 60, ChronoUnit.SECONDS);
    feature = FeatureFactory.newDurationFeature("duration-feature", value, domain);
    counterfactualEntity = CounterfactualEntityFactory.from(feature);
    assertTrue(counterfactualEntity instanceof DurationEntity);
    assertEquals(Type.DURATION, counterfactualEntity.asFeature().getType());
    assertFalse(counterfactualEntity.isConstrained());
    CounterfactualEntity entity = DurationEntity.from(feature, Duration.ZERO, Duration.ofDays(2));
    assertEquals(0, entity.distance());
    assertTrue(((DurationEntity) entity).getValueRange().contains(1e5));
    assertFalse(((DurationEntity) entity).getValueRange().contains(2e5));
    assertFalse(entity.isConstrained());
    entity = DurationEntity.from(feature, Duration.ZERO, Duration.ofDays(2), false);
    assertEquals(0, entity.distance());
    assertFalse(entity.isConstrained());
    FeatureDistribution distribution = new NumericFeatureDistribution(feature, new Random().doubles(10).toArray());
    entity = DurationEntity.from(feature, Duration.ZERO, Duration.ofDays(2), distribution);
    assertEquals(0, entity.distance());
    assertFalse(entity.isConstrained());
}
Also used : CounterfactualEntity(org.kie.kogito.explainability.local.counterfactual.entities.CounterfactualEntity) FeatureDistribution(org.kie.kogito.explainability.model.FeatureDistribution) NumericFeatureDistribution(org.kie.kogito.explainability.model.NumericFeatureDistribution) FixedDurationEntity(org.kie.kogito.explainability.local.counterfactual.entities.fixed.FixedDurationEntity) Random(java.util.Random) ObjectFeatureDomain(org.kie.kogito.explainability.model.domain.ObjectFeatureDomain) EmptyFeatureDomain(org.kie.kogito.explainability.model.domain.EmptyFeatureDomain) CategoricalFeatureDomain(org.kie.kogito.explainability.model.domain.CategoricalFeatureDomain) CurrencyFeatureDomain(org.kie.kogito.explainability.model.domain.CurrencyFeatureDomain) URIFeatureDomain(org.kie.kogito.explainability.model.domain.URIFeatureDomain) DurationFeatureDomain(org.kie.kogito.explainability.model.domain.DurationFeatureDomain) TimeFeatureDomain(org.kie.kogito.explainability.model.domain.TimeFeatureDomain) NumericalFeatureDomain(org.kie.kogito.explainability.model.domain.NumericalFeatureDomain) BinaryFeatureDomain(org.kie.kogito.explainability.model.domain.BinaryFeatureDomain) FeatureDomain(org.kie.kogito.explainability.model.domain.FeatureDomain) Duration(java.time.Duration) Feature(org.kie.kogito.explainability.model.Feature) DurationEntity(org.kie.kogito.explainability.local.counterfactual.entities.DurationEntity) Test(org.junit.jupiter.api.Test)

Example 2 with NumericFeatureDistribution

use of org.kie.kogito.explainability.model.NumericFeatureDistribution in project kogito-apps by kiegroup.

the class CounterfactualExplainerTest method testCounterfactualConstrainedMatchScaled.

@ParameterizedTest
@ValueSource(ints = { 0, 1, 2 })
void testCounterfactualConstrainedMatchScaled(int seed) throws ExecutionException, InterruptedException, TimeoutException {
    Random random = new Random();
    random.setSeed(seed);
    final List<Output> goal = List.of(new Output("inside", Type.BOOLEAN, new Value(true), 0.0d));
    List<Feature> features = new LinkedList<>();
    List<FeatureDistribution> featureDistributions = new LinkedList<>();
    final Feature fnum1 = FeatureFactory.newNumericalFeature("f-num1", 100.0);
    features.add(fnum1);
    featureDistributions.add(new NumericFeatureDistribution(fnum1, (new NormalDistribution(500, 1.1)).sample(1000)));
    final Feature fnum2 = FeatureFactory.newNumericalFeature("f-num2", 100.0, NumericalFeatureDomain.create(0.0, 1000.0));
    features.add(fnum2);
    featureDistributions.add(new NumericFeatureDistribution(fnum2, (new NormalDistribution(430.0, 1.7)).sample(1000)));
    final Feature fnum3 = FeatureFactory.newNumericalFeature("f-num3", 100.0, NumericalFeatureDomain.create(0.0, 1000.0));
    features.add(fnum3);
    featureDistributions.add(new NumericFeatureDistribution(fnum3, (new NormalDistribution(470.0, 2.9)).sample(1000)));
    final Feature fnum4 = FeatureFactory.newNumericalFeature("f-num4", 100.0);
    features.add(fnum4);
    featureDistributions.add(new NumericFeatureDistribution(fnum4, (new NormalDistribution(2390.0, 0.3)).sample(1000)));
    final double center = 500.0;
    final double epsilon = 10.0;
    final CounterfactualResult result = runCounterfactualSearch((long) seed, goal, features, TestUtils.getSumThresholdModel(center, epsilon), DEFAULT_GOAL_THRESHOLD);
    final List<CounterfactualEntity> counterfactualEntities = result.getEntities();
    double totalSum = 0;
    for (CounterfactualEntity entity : counterfactualEntities) {
        totalSum += entity.asFeature().getValue().asNumber();
        logger.debug("Entity: {}", entity);
    }
    assertFalse(counterfactualEntities.get(0).isChanged());
    assertFalse(counterfactualEntities.get(3).isChanged());
    assertTrue(totalSum <= center + epsilon);
    assertTrue(totalSum >= center - epsilon);
    assertTrue(result.isValid());
}
Also used : Feature(org.kie.kogito.explainability.model.Feature) LinkedList(java.util.LinkedList) CounterfactualEntity(org.kie.kogito.explainability.local.counterfactual.entities.CounterfactualEntity) FeatureDistribution(org.kie.kogito.explainability.model.FeatureDistribution) NumericFeatureDistribution(org.kie.kogito.explainability.model.NumericFeatureDistribution) Random(java.util.Random) NormalDistribution(org.apache.commons.math3.distribution.NormalDistribution) PredictionOutput(org.kie.kogito.explainability.model.PredictionOutput) Output(org.kie.kogito.explainability.model.Output) Value(org.kie.kogito.explainability.model.Value) ValueSource(org.junit.jupiter.params.provider.ValueSource) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest)

Example 3 with NumericFeatureDistribution

use of org.kie.kogito.explainability.model.NumericFeatureDistribution in project kogito-apps by kiegroup.

the class DataUtils method boostrapFeatureDistributions.

/**
 * Generate feature distributions from an existing (possibly small) {@link DataDistribution} for each {@link Feature}.
 * Each feature's interval (min, max) and density information (mean, stdDev) are estimated using bootstrap, then
 * data points are sampled from a normal distribution (see {@link #generateData(double, double, int, Random)}).
 *
 * @param dataDistribution data distribution to take feature values from
 * @param perturbationContext perturbation context
 * @param featureDistributionSize desired size of generated feature distributions
 * @param draws number of times sampling from feature values is performed
 * @param sampleSize size of each sample draw
 * @param numericFeatureZonesMap high feature score zones
 * @return a map feature name -> generated feature distribution
 */
public static Map<String, FeatureDistribution> boostrapFeatureDistributions(DataDistribution dataDistribution, PerturbationContext perturbationContext, int featureDistributionSize, int draws, int sampleSize, Map<String, HighScoreNumericFeatureZones> numericFeatureZonesMap) {
    Map<String, FeatureDistribution> featureDistributions = new HashMap<>();
    for (FeatureDistribution featureDistribution : dataDistribution.asFeatureDistributions()) {
        Feature feature = featureDistribution.getFeature();
        if (Type.NUMBER.equals(feature.getType())) {
            List<Value> values = featureDistribution.getAllSamples();
            double[] means = new double[draws];
            double[] stdDevs = new double[draws];
            double[] mins = new double[draws];
            double[] maxs = new double[draws];
            for (int i = 0; i < draws; i++) {
                List<Value> sampledValues = DataUtils.sampleWithReplacement(values, sampleSize, perturbationContext.getRandom());
                double[] data = sampledValues.stream().mapToDouble(Value::asNumber).toArray();
                double mean = DataUtils.getMean(data);
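                // despite its name, stdDev holds the variance here; the square root is taken after averaging over draws
                double stdDev = Math.pow(DataUtils.getStdDev(data, mean), 2);
                // the fallbacks below only apply to an empty sample; note that Double.MIN_VALUE is the
                // smallest positive double, not the most negative one
                double min = Arrays.stream(data).min().orElse(Double.MIN_VALUE);
                double max = Arrays.stream(data).max().orElse(Double.MAX_VALUE);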
                means[i] = mean;
                stdDevs[i] = stdDev;
                mins[i] = min;
                maxs[i] = max;
            }
            double finalMean = DataUtils.getMean(means);
            double finalStdDev = Math.sqrt(DataUtils.getMean(stdDevs));
            double finalMin = DataUtils.getMean(mins);
            double finalMax = DataUtils.getMean(maxs);
            double[] doubles = DataUtils.generateData(finalMean, finalStdDev, featureDistributionSize, perturbationContext.getRandom());
            double[] boundedData = Arrays.stream(doubles).map(d -> Math.min(Math.max(d, finalMin), finalMax)).toArray();
            HighScoreNumericFeatureZones highScoreNumericFeatureZones = numericFeatureZonesMap.get(feature.getName());
            double[] finaldata;
            if (highScoreNumericFeatureZones != null) {
                double[] filteredData = DoubleStream.of(boundedData).filter(highScoreNumericFeatureZones::test).toArray();
                // only use the filtered data if it's not discarding more than 50% of the points
                if (filteredData.length > featureDistributionSize / 2) {
                    finaldata = filteredData;
                } else {
                    finaldata = boundedData;
                }
            } else {
                finaldata = boundedData;
            }
            NumericFeatureDistribution numericFeatureDistribution = new NumericFeatureDistribution(feature, finaldata);
            featureDistributions.put(feature.getName(), numericFeatureDistribution);
        }
    }
    return featureDistributions;
}
Also used : IntStream(java.util.stream.IntStream) FeatureFactory(org.kie.kogito.explainability.model.FeatureFactory) Arrays(java.util.Arrays) MalformedInputException(java.nio.charset.MalformedInputException) PredictionInputsDataDistribution(org.kie.kogito.explainability.model.PredictionInputsDataDistribution) PerturbationContext(org.kie.kogito.explainability.model.PerturbationContext) Feature(org.kie.kogito.explainability.model.Feature) Prediction(org.kie.kogito.explainability.model.Prediction) CSVRecord(org.apache.commons.csv.CSVRecord) TimeoutException(java.util.concurrent.TimeoutException) HashMap(java.util.HashMap) Random(java.util.Random) Value(org.kie.kogito.explainability.model.Value) DataDistribution(org.kie.kogito.explainability.model.DataDistribution) ArrayList(java.util.ArrayList) CSVFormat(org.apache.commons.csv.CSVFormat) NumericFeatureDistribution(org.kie.kogito.explainability.model.NumericFeatureDistribution) PartialDependenceGraph(org.kie.kogito.explainability.model.PartialDependenceGraph) Map(java.util.Map) FeatureDistribution(org.kie.kogito.explainability.model.FeatureDistribution) LinkedList(java.util.LinkedList) Path(java.nio.file.Path) PredictionOutput(org.kie.kogito.explainability.model.PredictionOutput) IndependentFeaturesDataDistribution(org.kie.kogito.explainability.model.IndependentFeaturesDataDistribution) SimplePrediction(org.kie.kogito.explainability.model.SimplePrediction) Files(java.nio.file.Files) IOException(java.io.IOException) Collectors(java.util.stream.Collectors) Type(org.kie.kogito.explainability.model.Type) PredictionProvider(org.kie.kogito.explainability.model.PredictionProvider) DoubleStream(java.util.stream.DoubleStream) ExecutionException(java.util.concurrent.ExecutionException) PredictionInput(org.kie.kogito.explainability.model.PredictionInput) List(java.util.List) Output(org.kie.kogito.explainability.model.Output) Writer(java.io.Writer) Optional(java.util.Optional) HighScoreNumericFeatureZones(org.kie.kogito.explainability.local.lime.HighScoreNumericFeatureZones) BufferedReader(java.io.BufferedReader) Config(org.kie.kogito.explainability.Config) Collections(java.util.Collections) CSVPrinter(org.apache.commons.csv.CSVPrinter)
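
The bootstrap procedure described in the Javadoc of Example 3 can be tried without the kogito classes. The following is a minimal standalone sketch, not the library's code: the class `BootstrapSketch` and its helper methods are hypothetical names, plain `double[]` arrays stand in for `Value` lists, and `Random.nextGaussian()` is assumed as a stand-in for `DataUtils.generateData`, whose implementation is not shown above. The high-score zone filtering step is left out.

```java
import java.util.Arrays;
import java.util.Random;

/** Standalone sketch: bootstrap mean, variance, min and max, then sample a clamped normal. */
public class BootstrapSketch {

    /** Draws {@code size} values from {@code values} with replacement. */
    static double[] sampleWithReplacement(double[] values, int size, Random random) {
        double[] out = new double[size];
        for (int i = 0; i < size; i++) {
            out[i] = values[random.nextInt(values.length)];
        }
        return out;
    }

    static double mean(double[] data) {
        return Arrays.stream(data).average().orElse(Double.NaN);
    }

    /** Population variance of {@code data} around {@code mean}. */
    static double variance(double[] data, double mean) {
        return Arrays.stream(data).map(d -> (d - mean) * (d - mean)).average().orElse(Double.NaN);
    }

    static double[] bootstrap(double[] values, int outputSize, int draws, int sampleSize, Random random) {
        double[] means = new double[draws];
        double[] variances = new double[draws];
        double[] mins = new double[draws];
        double[] maxs = new double[draws];
        for (int i = 0; i < draws; i++) {
            // one bootstrap draw: resample, then record its summary statistics
            double[] sample = sampleWithReplacement(values, sampleSize, random);
            means[i] = mean(sample);
            variances[i] = variance(sample, means[i]);
            mins[i] = Arrays.stream(sample).min().orElse(Double.NaN);
            maxs[i] = Arrays.stream(sample).max().orElse(Double.NaN);
        }
        // average the per-draw statistics; variances are averaged before taking the square root
        double finalMean = mean(means);
        double finalStdDev = Math.sqrt(mean(variances));
        double finalMin = mean(mins);
        double finalMax = mean(maxs);
        double[] out = new double[outputSize];
        for (int i = 0; i < outputSize; i++) {
            // assumed stand-in for generateData: draw from N(finalMean, finalStdDev^2)
            double d = finalMean + finalStdDev * random.nextGaussian();
            // clamp to the bootstrapped interval
            out[i] = Math.min(Math.max(d, finalMin), finalMax);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] observed = { 1.0, 2.0, 2.5, 3.0, 4.0, 5.5 };
        double[] generated = bootstrap(observed, 1000, 100, 6, new Random(42));
        System.out.println(generated.length + " values in ["
                + Arrays.stream(generated).min().getAsDouble() + ", "
                + Arrays.stream(generated).max().getAsDouble() + "]");
    }
}
```

Because the bounds are averages of per-draw minima and maxima, the generated values always stay inside the range of the observed data, usually with some margin at both ends.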

Example 4 with NumericFeatureDistribution

use of org.kie.kogito.explainability.model.NumericFeatureDistribution in project kogito-apps by kiegroup.

the class CompositeEntityTest method distanceScaled.

@ParameterizedTest
@ValueSource(ints = { 0, 1, 2, 3, 4 })
void distanceScaled(int seed) {
    Random random = new Random();
    random.setSeed(seed);
    final Feature doubleFeature = FeatureFactory.newNumericalFeature("feature-double", 20.0, NumericalFeatureDomain.create(0.0, 40.0));
    final FeatureDistribution featureDistribution = new NumericFeatureDistribution(doubleFeature, random.doubles(5000, 10.0, 40.0).toArray());
    DoubleEntity entity = (DoubleEntity) CounterfactualEntityFactory.from(doubleFeature, featureDistribution);
    entity.proposedValue = 30.0;
    final double distance = entity.distance();
    assertTrue(distance > 0.1 && distance < 0.2);
}
Also used : NumericFeatureDistribution(org.kie.kogito.explainability.model.NumericFeatureDistribution) FeatureDistribution(org.kie.kogito.explainability.model.FeatureDistribution) Random(java.util.Random) Feature(org.kie.kogito.explainability.model.Feature) ValueSource(org.junit.jupiter.params.provider.ValueSource) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest)

Example 5 with NumericFeatureDistribution

use of org.kie.kogito.explainability.model.NumericFeatureDistribution in project kogito-apps by kiegroup.

the class DoubleEntityTest method distanceScaled.

@ParameterizedTest
@ValueSource(ints = { 0, 1, 2, 3, 4 })
void distanceScaled(int seed) {
    Random random = new Random();
    random.setSeed(seed);
    final FeatureDomain featureDomain = NumericalFeatureDomain.create(0.0, 40.0);
    final Feature doubleFeature = FeatureFactory.newNumericalFeature("feature-double", 20.0, featureDomain);
    final FeatureDistribution featureDistribution = new NumericFeatureDistribution(doubleFeature, random.doubles(5000, 10.0, 40.0).toArray());
    DoubleEntity entity = (DoubleEntity) CounterfactualEntityFactory.from(doubleFeature, featureDistribution);
    entity.proposedValue = 30.0;
    final double distance = entity.distance();
    assertTrue(distance > 0.1 && distance < 0.2);
}
Also used : NumericFeatureDistribution(org.kie.kogito.explainability.model.NumericFeatureDistribution) FeatureDistribution(org.kie.kogito.explainability.model.FeatureDistribution) Random(java.util.Random) NumericalFeatureDomain(org.kie.kogito.explainability.model.domain.NumericalFeatureDomain) FeatureDomain(org.kie.kogito.explainability.model.domain.FeatureDomain) Feature(org.kie.kogito.explainability.model.Feature) ValueSource(org.junit.jupiter.params.provider.ValueSource) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest)
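
The two distanceScaled tests in Examples 4 and 5 assert that moving the value from 20.0 to 30.0 gives a distance between 0.1 and 0.2 once a distribution is attached. How `DoubleEntity.distance()` computes this is not shown in these excerpts. As an assumption only, one scaling that is consistent with the asserted range is the absolute difference divided by the variance of the samples: a uniform distribution on [10, 40) has variance 30²/12 = 75, and 10/75 ≈ 0.133. The class `ScaledDistanceSketch` below is a hypothetical illustration of that arithmetic, not the library's implementation.

```java
import java.util.Random;

/** Standalone sketch: absolute difference scaled by the sample variance (assumed scaling). */
public class ScaledDistanceSketch {

    /** Population variance of the samples. */
    static double variance(double[] samples) {
        double mean = 0.0;
        for (double s : samples) {
            mean += s;
        }
        mean /= samples.length;
        double sum = 0.0;
        for (double s : samples) {
            sum += (s - mean) * (s - mean);
        }
        return sum / samples.length;
    }

    /** Assumed scaling: |proposed - original| divided by the variance of the distribution samples. */
    static double scaledDistance(double original, double proposed, double[] samples) {
        return Math.abs(proposed - original) / variance(samples);
    }

    public static void main(String[] args) {
        for (int seed = 0; seed < 5; seed++) {
            // same sampling as in the tests: 5000 uniform values in [10, 40)
            Random random = new Random(seed);
            double[] samples = random.doubles(5000, 10.0, 40.0).toArray();
            System.out.println("seed " + seed + ": " + scaledDistance(20.0, 30.0, samples));
        }
    }
}
```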

Aggregations

Feature (org.kie.kogito.explainability.model.Feature) 8
FeatureDistribution (org.kie.kogito.explainability.model.FeatureDistribution) 8
NumericFeatureDistribution (org.kie.kogito.explainability.model.NumericFeatureDistribution) 8
Random (java.util.Random) 7
ParameterizedTest (org.junit.jupiter.params.ParameterizedTest) 5
ValueSource (org.junit.jupiter.params.provider.ValueSource) 5
LinkedList (java.util.LinkedList) 3
FeatureDomain (org.kie.kogito.explainability.model.domain.FeatureDomain) 3
NumericalFeatureDomain (org.kie.kogito.explainability.model.domain.NumericalFeatureDomain) 3
CounterfactualEntity (org.kie.kogito.explainability.local.counterfactual.entities.CounterfactualEntity) 2
Output (org.kie.kogito.explainability.model.Output) 2
PredictionOutput (org.kie.kogito.explainability.model.PredictionOutput) 2
Value (org.kie.kogito.explainability.model.Value) 2
BufferedReader (java.io.BufferedReader) 1
IOException (java.io.IOException) 1
Writer (java.io.Writer) 1
MalformedInputException (java.nio.charset.MalformedInputException) 1
Files (java.nio.file.Files) 1
Path (java.nio.file.Path) 1
Duration (java.time.Duration) 1