Example 11 with DataField

Use of org.dmg.pmml.DataField in project jpmml-sparkml by jpmml.

Class SparkMLEncoder, method getFeatures.

public List<Feature> getFeatures(String column) {
    List<Feature> features = this.columnFeatures.get(column);
    if (features == null) {
        FieldName name = FieldName.create(column);
        DataField dataField = getDataField(name);
        if (dataField == null) {
            dataField = createDataField(name);
        }
        Feature feature;
        DataType dataType = dataField.getDataType();
        switch(dataType) {
            case STRING:
                feature = new WildcardFeature(this, dataField);
                break;
            case INTEGER:
            case DOUBLE:
                feature = new ContinuousFeature(this, dataField);
                break;
            case BOOLEAN:
                feature = new BooleanFeature(this, dataField);
                break;
            default:
                throw new IllegalArgumentException("Data type " + dataType + " is not supported");
        }
        return Collections.singletonList(feature);
    }
    return features;
}
Also used : ContinuousFeature(org.jpmml.converter.ContinuousFeature) DataField(org.dmg.pmml.DataField) DataType(org.dmg.pmml.DataType) Feature(org.jpmml.converter.Feature) BooleanFeature(org.jpmml.converter.BooleanFeature) WildcardFeature(org.jpmml.converter.WildcardFeature) FieldName(org.dmg.pmml.FieldName)
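
The type dispatch in getFeatures can be tried out without the JPMML libraries. The following is a minimal sketch, assuming hypothetical stand-in enums (DataType and Kind, nested in FeatureDispatchSketch) in place of org.dmg.pmml.DataType and the org.jpmml.converter feature classes; it only mirrors the shape of the switch and is not the project's code.

```java
import java.util.Collections;
import java.util.List;

final class FeatureDispatchSketch {

    // Hypothetical stand-in for org.dmg.pmml.DataType (only a few constants).
    enum DataType { STRING, INTEGER, FLOAT, DOUBLE, BOOLEAN }

    // Hypothetical stand-in for the WildcardFeature / ContinuousFeature / BooleanFeature classes.
    enum Kind { WILDCARD, CONTINUOUS, BOOLEAN }

    private FeatureDispatchSketch() {
    }

    // Same branch layout as the switch above: INTEGER falls through to DOUBLE,
    // and any type without a case is rejected with an IllegalArgumentException.
    static Kind kindFor(DataType dataType) {
        switch (dataType) {
            case STRING:
                return Kind.WILDCARD;
            case INTEGER:
            case DOUBLE:
                return Kind.CONTINUOUS;
            case BOOLEAN:
                return Kind.BOOLEAN;
            default:
                throw new IllegalArgumentException("Data type " + dataType + " is not supported");
        }
    }

    // Wraps the single result in an immutable one-element list, as getFeatures does.
    static List<Kind> kindsFor(DataType dataType) {
        return Collections.singletonList(kindFor(dataType));
    }
}
```

In this sketch, FLOAT has no case and therefore reaches the default branch, which matches how the switch above treats any type other than STRING, INTEGER, DOUBLE and BOOLEAN.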

Example 12 with DataField

Use of org.dmg.pmml.DataField in project jpmml-sparkml by jpmml.

Class SparkMLEncoder, method removeDataField.

public void removeDataField(FieldName name) {
    Map<FieldName, DataField> dataFields = getDataFields();
    DataField dataField = dataFields.remove(name);
    if (dataField == null) {
        throw new IllegalArgumentException(name.getValue());
    }
}
Also used : DataField(org.dmg.pmml.DataField) FieldName(org.dmg.pmml.FieldName)
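
removeDataField relies on java.util.Map.remove returning null when the key has no mapping. The following is a minimal sketch of that remove-or-throw pattern, assuming a hypothetical FieldRegistrySketch class that uses plain strings in place of FieldName and DataField.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry; String keys and values stand in for FieldName and DataField.
final class FieldRegistrySketch {

    private final Map<String, String> fields = new LinkedHashMap<>();

    void put(String name, String dataType) {
        fields.put(name, dataType);
    }

    boolean contains(String name) {
        return fields.containsKey(name);
    }

    // Map.remove returns the previous value, or null if the key was absent
    // (this sketch never stores null values, so null always means "absent").
    void remove(String name) {
        String removed = fields.remove(name);
        if (removed == null) {
            throw new IllegalArgumentException(name);
        }
    }
}
```

As in the method above, the exception message is just the field name, so a caller that wants more context has to add it.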

Example 13 with DataField

Use of org.dmg.pmml.DataField in project openscoring by openscoring.

Class ModelUtil, method encodeTargetFields.

private static List<Field> encodeTargetFields(List<TargetField> targetFields) {
    Function<TargetField, Field> function = new Function<TargetField, Field>() {

        @Override
        public Field apply(TargetField targetField) {
            FieldName name = targetField.getName();
            // A "phantom" default target field
            if (targetField.isSynthetic()) {
                name = ModelResource.DEFAULT_NAME;
            }
            DataField dataField = targetField.getDataField();
            Field field = new Field(name.getValue());
            field.setName(dataField.getDisplayName());
            field.setDataType(targetField.getDataType());
            field.setOpType(targetField.getOpType());
            field.setValues(encodeValues(dataField));
            return field;
        }
    };
    List<Field> fields = new ArrayList<>(Lists.transform(targetFields, function));
    return fields;
}
Also used : InputField(org.jpmml.evaluator.InputField) OutputField(org.jpmml.evaluator.OutputField) DataField(org.dmg.pmml.DataField) TargetField(org.jpmml.evaluator.TargetField) Field(org.openscoring.common.Field) Function(com.google.common.base.Function) ArrayList(java.util.ArrayList) FieldName(org.dmg.pmml.FieldName)
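
encodeTargetFields maps each element through a function and copies the result into a new ArrayList; Guava's Lists.transform returns a lazily evaluated view, so the copy is what materializes the mapped elements. The following is a minimal sketch of the same map-then-copy step using only the JDK, assuming a hypothetical helper class TransformSketch and java.util.function.Function instead of Guava's Function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper; not part of openscoring or Guava.
final class TransformSketch {

    private TransformSketch() {
    }

    // Applies the function to every element right away and returns an independent,
    // mutable list, so later changes to the input do not show up in the result.
    static <A, B> List<B> transformEagerly(List<A> input, Function<? super A, ? extends B> function) {
        List<B> result = new ArrayList<>(input.size());
        for (A item : input) {
            result.add(function.apply(item));
        }
        return result;
    }
}
```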

Example 14 with DataField

Use of org.dmg.pmml.DataField in project jpmml-r by jpmml.

Class FormulaUtil, method createFormula.

public static Formula createFormula(RExp terms, FormulaContext context, RExpEncoder encoder) {
    Formula formula = new Formula(encoder);
    RIntegerVector factors = (RIntegerVector) terms.getAttributeValue("factors");
    RStringVector dataClasses = (RStringVector) terms.getAttributeValue("dataClasses");
    RStringVector variableRows = factors.dimnames(0);
    RStringVector termColumns = factors.dimnames(1);
    VariableMap expressionFields = new VariableMap();
    for (int i = 0; i < variableRows.size(); i++) {
        String variable = variableRows.getDequotedValue(i);
        FieldName name = FieldName.create(variable);
        OpType opType = OpType.CONTINUOUS;
        DataType dataType = RExpUtil.getDataType(dataClasses.getValue(variable));
        List<String> categories = context.getCategories(variable);
        if (categories != null && categories.size() > 0) {
            opType = OpType.CATEGORICAL;
        }
        Expression expression = null;
        FieldName shortName = name;
        expression: if (variable.indexOf('(') > -1 && variable.indexOf(')') > -1) {
            FunctionExpression functionExpression;
            try {
                functionExpression = (FunctionExpression) ExpressionTranslator.translateExpression(variable);
            } catch (Exception e) {
                break expression;
            }
            if (functionExpression.hasId("base", "cut")) {
                expression = encodeCutExpression(functionExpression, categories, expressionFields, encoder);
            } else if (functionExpression.hasId("base", "I")) {
                expression = encodeIdentityExpression(functionExpression, expressionFields, encoder);
            } else if (functionExpression.hasId("base", "ifelse")) {
                expression = encodeIfElseExpression(functionExpression, expressionFields, encoder);
            } else if (functionExpression.hasId("plyr", "mapvalues")) {
                expression = encodeMapValuesExpression(functionExpression, categories, expressionFields, encoder);
            } else if (functionExpression.hasId("plyr", "revalue")) {
                expression = encodeReValueExpression(functionExpression, categories, expressionFields, encoder);
            } else {
                break expression;
            }
            FunctionExpression.Argument xArgument = functionExpression.getArgument("x", 0);
            String value = (xArgument.formatExpression()).trim();
            shortName = FieldName.create(functionExpression.hasId("base", "I") ? value : (functionExpression.getFunction() + "(" + value + ")"));
        }
        if (expression != null) {
            DerivedField derivedField = encoder.createDerivedField(name, opType, dataType, expression).addExtensions(createExtension(variable));
            if (categories != null && categories.size() > 0) {
                formula.addField(derivedField, categories);
            } else {
                formula.addField(derivedField);
            }
            if (!(name).equals(shortName)) {
                encoder.renameField(name, shortName);
            }
        } else {
            if ((DataType.BOOLEAN).equals(dataType)) {
                categories = Arrays.asList("false", "true");
            }
            if (categories != null && categories.size() > 0) {
                DataField dataField = encoder.createDataField(name, OpType.CATEGORICAL, dataType, categories);
                List<String> categoryNames;
                List<String> categoryValues;
                switch(dataType) {
                    case BOOLEAN:
                        categoryNames = Arrays.asList("FALSE", "TRUE");
                        categoryValues = Arrays.asList("false", "true");
                        break;
                    default:
                        categoryNames = categories;
                        categoryValues = categories;
                        break;
                }
                formula.addField(dataField, categoryNames, categoryValues);
            } else {
                DataField dataField = encoder.createDataField(name, OpType.CONTINUOUS, dataType);
                formula.addField(dataField);
            }
        }
    }
    Collection<Map.Entry<FieldName, List<String>>> entries = expressionFields.entrySet();
    for (Map.Entry<FieldName, List<String>> entry : entries) {
        FieldName name = entry.getKey();
        List<String> categories = entry.getValue();
        DataField dataField = encoder.getDataField(name);
        if (dataField == null) {
            OpType opType = OpType.CONTINUOUS;
            DataType dataType = DataType.DOUBLE;
            if (categories != null && categories.size() > 0) {
                opType = OpType.CATEGORICAL;
            }
            RGenericVector data = context.getData();
            if (data != null && data.hasValue(name.getValue())) {
                RVector<?> column = (RVector<?>) data.getValue(name.getValue());
                dataType = column.getDataType();
            }
            dataField = encoder.createDataField(name, opType, dataType, categories);
        }
    }
    return formula;
}
Also used : DataField(org.dmg.pmml.DataField) Expression(org.dmg.pmml.Expression) DataType(org.dmg.pmml.DataType) OpType(org.dmg.pmml.OpType) ArrayList(java.util.ArrayList) List(java.util.List) FieldName(org.dmg.pmml.FieldName) DerivedField(org.dmg.pmml.DerivedField) LinkedHashMap(java.util.LinkedHashMap) Map(java.util.Map)
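
createFormula labels an if statement (expression:) and leaves it early with break expression when the variable cannot be translated or is not a recognized function. The following is a minimal sketch of that labeled-break control flow, assuming a hypothetical parser (LabeledBreakSketch.parseSingleIntArgument) that extracts an integer argument from text such as "cut(3)"; it does not use the jpmml-r ExpressionTranslator.

```java
// Hypothetical example class; only the control flow mirrors createFormula.
final class LabeledBreakSketch {

    private LabeledBreakSketch() {
    }

    // Returns the integer between the parentheses, or null if the input
    // has no parentheses or the argument is not an integer.
    static Integer parseSingleIntArgument(String variable) {
        Integer result = null;
        int open = variable.indexOf('(');
        int close = variable.lastIndexOf(')');
        parse: if (open > -1 && close > open) {
            int value;
            try {
                value = Integer.parseInt(variable.substring(open + 1, close).trim());
            } catch (NumberFormatException e) {
                // Abandon the labeled block; result keeps its default of null.
                break parse;
            }
            // Reached only when parsing succeeded, so value is definitely assigned.
            result = value;
        }
        return result;
    }
}
```

The labeled break avoids nesting the rest of the block inside the try, at the cost of a construct that many readers see rarely; an early return from a small helper method is a common alternative.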

Example 15 with DataField

Use of org.dmg.pmml.DataField in project jpmml-r by jpmml.

Class GLMConverter, method encodeSchema.

@Override
public void encodeSchema(RExpEncoder encoder) {
    RGenericVector glm = getObject();
    RGenericVector family = (RGenericVector) glm.getValue("family");
    RGenericVector model = (RGenericVector) glm.getValue("model");
    RStringVector familyFamily = (RStringVector) family.getValue("family");
    super.encodeSchema(encoder);
    MiningFunction miningFunction = getMiningFunction(familyFamily.asScalar());
    switch(miningFunction) {
        case CLASSIFICATION:
            Label label = encoder.getLabel();
            RIntegerVector variable = (RIntegerVector) model.getValue((label.getName()).getValue());
            DataField dataField = (DataField) encoder.toCategorical(label.getName(), RExpUtil.getFactorLevels(variable));
            encoder.setLabel(dataField);
            break;
        default:
            break;
    }
}
Also used : DataField(org.dmg.pmml.DataField) CategoricalLabel(org.jpmml.converter.CategoricalLabel) Label(org.jpmml.converter.Label) MiningFunction(org.dmg.pmml.MiningFunction)

Aggregations

DataField (org.dmg.pmml.DataField) 26
Feature (org.jpmml.converter.Feature) 13
FieldName (org.dmg.pmml.FieldName) 12
ArrayList (java.util.ArrayList) 9
ContinuousFeature (org.jpmml.converter.ContinuousFeature) 8
CategoricalFeature (org.jpmml.converter.CategoricalFeature) 5
DataType (org.dmg.pmml.DataType) 4
DerivedField (org.dmg.pmml.DerivedField) 4
OpType (org.dmg.pmml.OpType) 4
Apply (org.dmg.pmml.Apply) 3
CategoricalLabel (org.jpmml.converter.CategoricalLabel) 3
ContinuousLabel (org.jpmml.converter.ContinuousLabel) 3
Label (org.jpmml.converter.Label) 3
Function (com.google.common.base.Function) 2
MiningFunction (org.dmg.pmml.MiningFunction) 2
BooleanFeature (org.jpmml.converter.BooleanFeature) 2
InputField (org.jpmml.evaluator.InputField) 2
OutputField (org.jpmml.evaluator.OutputField) 2
TargetField (org.jpmml.evaluator.TargetField) 2
Field (org.openscoring.common.Field) 2