Search in sources :

Example 21 with Expression

use of org.dmg.pmml.Expression in project jpmml-sparkml by jpmml.

the class MinMaxScalerModelConverter method encodeFeatures.

@Override
public List<Feature> encodeFeatures(SparkMLEncoder encoder) {
    MinMaxScalerModel transformer = getTransformer();
    double rescaleFactor = (transformer.getMax() - transformer.getMin());
    double rescaleConstant = transformer.getMin();
    List<Feature> features = encoder.getFeatures(transformer.getInputCol());
    Vector originalMax = transformer.originalMax();
    if (originalMax.size() != features.size()) {
        throw new IllegalArgumentException();
    }
    Vector originalMin = transformer.originalMin();
    if (originalMin.size() != features.size()) {
        throw new IllegalArgumentException();
    }
    List<Feature> result = new ArrayList<>();
    for (int i = 0; i < features.size(); i++) {
        Feature feature = features.get(i);
        ContinuousFeature continuousFeature = feature.toContinuousFeature();
        double max = originalMax.apply(i);
        double min = originalMin.apply(i);
        Expression expression = PMMLUtil.createApply("/", PMMLUtil.createApply("-", continuousFeature.ref(), PMMLUtil.createConstant(min)), PMMLUtil.createConstant(max - min));
        if (!ValueUtil.isOne(rescaleFactor)) {
            expression = PMMLUtil.createApply("*", expression, PMMLUtil.createConstant(rescaleFactor));
        }
        if (!ValueUtil.isZero(rescaleConstant)) {
            expression = PMMLUtil.createApply("+", expression, PMMLUtil.createConstant(rescaleConstant));
        }
        DerivedField derivedField = encoder.createDerivedField(formatName(transformer, i), OpType.CONTINUOUS, DataType.DOUBLE, expression);
        result.add(new ContinuousFeature(encoder, derivedField));
    }
    return result;
}
Also used : MinMaxScalerModel(org.apache.spark.ml.feature.MinMaxScalerModel) ContinuousFeature(org.jpmml.converter.ContinuousFeature) Expression(org.dmg.pmml.Expression) ArrayList(java.util.ArrayList) Feature(org.jpmml.converter.Feature) ContinuousFeature(org.jpmml.converter.ContinuousFeature) Vector(org.apache.spark.ml.linalg.Vector) DerivedField(org.dmg.pmml.DerivedField)

Example 22 with Expression

use of org.dmg.pmml.Expression in project jpmml-sparkml by jpmml.

the class PCAModelConverter method encodeFeatures.

@Override
public List<Feature> encodeFeatures(SparkMLEncoder encoder) {
    PCAModel transformer = getTransformer();
    List<Feature> features = encoder.getFeatures(transformer.getInputCol());
    DenseMatrix pc = transformer.pc();
    if (pc.numRows() != features.size()) {
        throw new IllegalArgumentException();
    }
    List<Feature> result = new ArrayList<>();
    for (int i = 0; i < transformer.getK(); i++) {
        Apply apply = new Apply("sum");
        for (int j = 0; j < features.size(); j++) {
            Feature feature = features.get(j);
            ContinuousFeature continuousFeature = feature.toContinuousFeature();
            Expression expression = continuousFeature.ref();
            Double coefficient = pc.apply(j, i);
            if (!ValueUtil.isOne(coefficient)) {
                expression = PMMLUtil.createApply("*", expression, PMMLUtil.createConstant(coefficient));
            }
            apply.addExpressions(expression);
        }
        DerivedField derivedField = encoder.createDerivedField(formatName(transformer, i), OpType.CONTINUOUS, DataType.DOUBLE, apply);
        result.add(new ContinuousFeature(encoder, derivedField));
    }
    return result;
}
Also used : PCAModel(org.apache.spark.ml.feature.PCAModel) ContinuousFeature(org.jpmml.converter.ContinuousFeature) Expression(org.dmg.pmml.Expression) Apply(org.dmg.pmml.Apply) ArrayList(java.util.ArrayList) Feature(org.jpmml.converter.Feature) ContinuousFeature(org.jpmml.converter.ContinuousFeature) DerivedField(org.dmg.pmml.DerivedField) DenseMatrix(org.apache.spark.ml.linalg.DenseMatrix)

Aggregations

Expression (org.dmg.pmml.Expression)22 Apply (org.dmg.pmml.Apply)11 Test (org.junit.Test)8 DerivedField (org.dmg.pmml.DerivedField)7 ArrayList (java.util.ArrayList)6 ContinuousFeature (org.jpmml.converter.ContinuousFeature)6 Feature (org.jpmml.converter.Feature)6 FieldRef (org.dmg.pmml.FieldRef)5 Constant (org.dmg.pmml.Constant)4 Vector (org.apache.spark.ml.linalg.Vector)3 DataType (org.dmg.pmml.DataType)2 FieldName (org.dmg.pmml.FieldName)2 OpType (org.dmg.pmml.OpType)2 LinkedHashMap (java.util.LinkedHashMap)1 List (java.util.List)1 Map (java.util.Map)1 Function (java.util.function.Function)1 MaxAbsScalerModel (org.apache.spark.ml.feature.MaxAbsScalerModel)1 MinMaxScalerModel (org.apache.spark.ml.feature.MinMaxScalerModel)1 PCAModel (org.apache.spark.ml.feature.PCAModel)1