Example 6 with DataType

Use of org.apache.spark.sql.types.DataType in project incubator-systemml by apache.

From the class FrameRDDConverterUtils, method convertFrameSchemaToDFSchema.

/**
 * This function will convert Frame schema into DataFrame schema
 *
 * @param fschema frame schema
 * @param containsID true if contains ID column
 * @return Spark StructType of StructFields representing schema
 */
public static StructType convertFrameSchemaToDFSchema(ValueType[] fschema, boolean containsID) {
    // build the DataFrame schema from the frame's value types, one nullable column per entry
    List<StructField> fields = new ArrayList<>();
    // add id column type
    if (containsID)
        fields.add(DataTypes.createStructField(RDDConverterUtils.DF_ID_COLUMN, DataTypes.DoubleType, true));
    // add remaining types
    int col = 1;
    for (ValueType schema : fschema) {
        DataType dt = null;
        switch(schema) {
            case STRING:
                dt = DataTypes.StringType;
                break;
            case DOUBLE:
                dt = DataTypes.DoubleType;
                break;
            case INT:
                dt = DataTypes.LongType;
                break;
            case BOOLEAN:
                dt = DataTypes.BooleanType;
                break;
            default:
                dt = DataTypes.StringType;
                LOG.warn("Using default type String for " + schema.toString());
        }
        fields.add(DataTypes.createStructField("C" + col++, dt, true));
    }
    return DataTypes.createStructType(fields);
}
Also used: StructField (org.apache.spark.sql.types.StructField), ValueType (org.apache.sysml.parser.Expression.ValueType), ArrayList (java.util.ArrayList), DataType (org.apache.spark.sql.types.DataType)
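
The mapping logic above can be tried without a Spark dependency. The following is only a minimal sketch under stated assumptions: the ValueType enum here is a hypothetical stand-in for org.apache.sysml.parser.Expression.ValueType (with an extra OBJECT constant to exercise the default branch), the ID column name "ID" is a placeholder because the value of RDDConverterUtils.DF_ID_COLUMN is not shown in the excerpt, and Spark types are represented by their names as plain strings rather than real DataType objects.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameSchemaSketch {
    // Hypothetical stand-in for SystemML's ValueType; OBJECT exists only to hit the default branch.
    public enum ValueType { STRING, DOUBLE, INT, BOOLEAN, OBJECT }

    // Placeholder name (assumption); the real DF_ID_COLUMN value is not shown in the excerpt.
    public static final String ID_COLUMN = "ID";

    // Mirrors the switch in convertFrameSchemaToDFSchema, returning type names instead of DataType objects.
    public static String sparkTypeName(ValueType vt) {
        switch (vt) {
            case DOUBLE:  return "DoubleType";
            case INT:     return "LongType";    // frame INT is widened to a 64-bit long
            case BOOLEAN: return "BooleanType";
            case STRING:
            default:      return "StringType";  // unknown types fall back to String
        }
    }

    // Builds "name: type" entries: optional ID column first, then C1..Cn in schema order.
    public static List<String> describeSchema(ValueType[] fschema, boolean containsID) {
        List<String> fields = new ArrayList<>();
        if (containsID) {
            fields.add(ID_COLUMN + ": DoubleType");
        }
        int col = 1;
        for (ValueType vt : fschema) {
            fields.add("C" + col++ + ": " + sparkTypeName(vt));
        }
        return fields;
    }

    public static void main(String[] args) {
        ValueType[] fschema = { ValueType.STRING, ValueType.INT, ValueType.OBJECT };
        // prints [ID: DoubleType, C1: StringType, C2: LongType, C3: StringType]
        System.out.println(describeSchema(fschema, true));
    }
}
```

Note that column numbering starts at C1 regardless of whether the ID column is present, so the ID column does not shift the generated names.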

Aggregations

DataType (org.apache.spark.sql.types.DataType): 6
StructField (org.apache.spark.sql.types.StructField): 5
ArrayList (java.util.ArrayList): 4
StructType (org.apache.spark.sql.types.StructType): 4
JavaSparkContext (org.apache.spark.api.java.JavaSparkContext): 2
DenseVector (org.apache.spark.ml.linalg.DenseVector): 2
VectorUDT (org.apache.spark.ml.linalg.VectorUDT): 2
Row (org.apache.spark.sql.Row): 2
MatrixBlock (org.apache.sysml.runtime.matrix.data.MatrixBlock): 2
HashMap (java.util.HashMap): 1
HashSet (java.util.HashSet): 1
LinkedHashSet (java.util.LinkedHashSet): 1
Set (java.util.Set): 1
BooleanType (org.apache.spark.sql.types.BooleanType): 1
DecimalType (org.apache.spark.sql.types.DecimalType): 1
DoubleType (org.apache.spark.sql.types.DoubleType): 1
FloatType (org.apache.spark.sql.types.FloatType): 1
IntegerType (org.apache.spark.sql.types.IntegerType): 1
LongType (org.apache.spark.sql.types.LongType): 1
ShortType (org.apache.spark.sql.types.ShortType): 1