Example 1 with OneHotEncoderModel

Use of org.apache.spark.ml.feature.OneHotEncoderModel in project jpmml-sparkml by jpmml.

The method encodeFeatures of the class OneHotEncoderModelConverter.

@Override
public List<Feature> encodeFeatures(SparkMLEncoder encoder) {
    OneHotEncoderModel transformer = getTransformer();
    String[] inputCols = transformer.getInputCols();
    boolean dropLast = transformer.getDropLast();
    List<Feature> result = new ArrayList<>();
    // Encode each input column independently; each one must resolve to a single categorical feature.
    for (int i = 0; i < inputCols.length; i++) {
        CategoricalFeature categoricalFeature = (CategoricalFeature) encoder.getOnlyFeature(inputCols[i]);
        List<String> values = categoricalFeature.getValues();
        // With dropLast, the last category gets no indicator of its own;
        // it is implied when all remaining indicators are 0.
        if (dropLast) {
            values = values.subList(0, values.size() - 1);
        }
        // One binary (0/1) indicator feature per retained category value.
        List<BinaryFeature> binaryFeatures = new ArrayList<>();
        for (String value : values) {
            binaryFeatures.add(new BinaryFeature(encoder, categoricalFeature.getName(), DataType.STRING, value));
        }
        // Group the indicators of this column into a single composite feature.
        result.add(new BinarizedCategoricalFeature(encoder, categoricalFeature.getName(), categoricalFeature.getDataType(), binaryFeatures));
    }
    return result;
}
Also used : ArrayList(java.util.ArrayList) BinarizedCategoricalFeature(org.jpmml.sparkml.BinarizedCategoricalFeature) BinaryFeature(org.jpmml.converter.BinaryFeature) Feature(org.jpmml.converter.Feature) CategoricalFeature(org.jpmml.converter.CategoricalFeature) OneHotEncoderModel(org.apache.spark.ml.feature.OneHotEncoderModel)
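
The dropLast handling in encodeFeatures can be tried out without Spark or JPMML on the classpath. The following is a minimal sketch in plain Java, not code from jpmml-sparkml: the class OneHotSketch and its methods indicatorValues and encode are hypothetical names introduced only for illustration. It assumes category values are plain strings and, unlike the converter above, guards against an empty value list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of jpmml-sparkml or Spark.
public class OneHotSketch {

    // Returns the category values that receive their own 0/1 indicator.
    // With dropLast, the final category is omitted, mirroring the subList call above.
    static List<String> indicatorValues(List<String> values, boolean dropLast) {
        if (dropLast && !values.isEmpty()) {
            return new ArrayList<>(values.subList(0, values.size() - 1));
        }
        return new ArrayList<>(values);
    }

    // Encodes a single value as a 0/1 vector over the retained indicator values.
    // A dropped (or unknown) category yields an all-zero vector.
    static int[] encode(List<String> indicators, String value) {
        int[] vector = new int[indicators.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = indicators.get(i).equals(value) ? 1 : 0;
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("red", "green", "blue");
        List<String> indicators = indicatorValues(values, true);
        System.out.println(indicators);                                  // [red, green]
        System.out.println(Arrays.toString(encode(indicators, "green"))); // [0, 1]
        System.out.println(Arrays.toString(encode(indicators, "blue")));  // [0, 0]
    }
}
```

With three categories and dropLast enabled, only two indicators remain, and the dropped category "blue" is represented by the all-zero vector.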

Aggregations

ArrayList (java.util.ArrayList): 1
OneHotEncoderModel (org.apache.spark.ml.feature.OneHotEncoderModel): 1
BinaryFeature (org.jpmml.converter.BinaryFeature): 1
CategoricalFeature (org.jpmml.converter.CategoricalFeature): 1
Feature (org.jpmml.converter.Feature): 1
BinarizedCategoricalFeature (org.jpmml.sparkml.BinarizedCategoricalFeature): 1