Search in sources :

Example 6 with BooleanObject

use of org.apache.sysml.runtime.instructions.cp.BooleanObject in project incubator-systemml by apache.

the class CheckpointSPInstruction method processInstruction.

@Override
@SuppressWarnings("unchecked")
public void processInstruction(ExecutionContext ec) throws DMLRuntimeException {
    SparkExecutionContext sec = (SparkExecutionContext) ec;
    // this is valid if relevant branches are never entered)
    if (sec.getVariable(input1.getName()) == null || sec.getVariable(input1.getName()) instanceof BooleanObject) {
        //add a dummy entry to the input, which will be immediately overwritten by the null output.
        sec.setVariable(input1.getName(), new BooleanObject(false));
        sec.setVariable(output.getName(), new BooleanObject(false));
        return;
    }
    //get input rdd handle (for matrix or frame)
    JavaPairRDD<?, ?> in = sec.getRDDHandleForVariable(input1.getName(), InputInfo.BinaryBlockInputInfo);
    MatrixCharacteristics mcIn = sec.getMatrixCharacteristics(input1.getName());
    // Step 2: Checkpoint given rdd (only if currently in different storage level to prevent redundancy)
    // -------
    // Note that persist is an transformation which will be triggered on-demand with the next rdd operations
    // This prevents unnecessary overhead if the dataset is only consumed by cp operations.
    JavaPairRDD<?, ?> out = null;
    if (!in.getStorageLevel().equals(_level)) {
        //(trigger coalesce if intended number of partitions exceeded by 20%
        //and not hash partitioned to avoid losing the existing partitioner)
        int numPartitions = SparkUtils.getNumPreferredPartitions(mcIn, in);
        boolean coalesce = (1.2 * numPartitions < in.getNumPartitions() && !SparkUtils.isHashPartitioned(in));
        //checkpoint pre-processing rdd operations
        if (coalesce) {
            //merge partitions without shuffle if too many partitions
            out = in.coalesce(numPartitions);
        } else {
            //apply a narrow shallow copy to allow for short-circuit collects 
            if (input1.getDataType() == DataType.MATRIX)
                out = SparkUtils.copyBinaryBlockMatrix((JavaPairRDD<MatrixIndexes, MatrixBlock>) in, false);
            else if (input1.getDataType() == DataType.FRAME)
                out = ((JavaPairRDD<Long, FrameBlock>) in).mapValues(new CopyFrameBlockFunction(false));
        }
        //convert mcsr into memory-efficient csr if potentially sparse
        if (input1.getDataType() == DataType.MATRIX && OptimizerUtils.checkSparseBlockCSRConversion(mcIn) && !_level.equals(Checkpoint.SER_STORAGE_LEVEL)) {
            out = ((JavaPairRDD<MatrixIndexes, MatrixBlock>) out).mapValues(new CreateSparseBlockFunction(SparseBlock.Type.CSR));
        }
        //actual checkpoint into given storage level
        out = out.persist(_level);
    } else {
        //pass-through
        out = in;
    }
    // Step 3: In-place update of input matrix/frame rdd handle and set as output
    // -------
    // We use this in-place approach for two reasons. First, it is correct because our checkpoint 
    // injection rewrites guarantee that after checkpoint instructions there are no consumers on the 
    // given input. Second, it is beneficial because otherwise we need to pass in-memory objects and
    // filenames to the new matrix object in order to prevent repeated reads from hdfs and unnecessary
    // caching and subsequent collects. Note that in-place update requires us to explicitly handle
    // lineage information in order to prevent cycles on cleanup. 
    CacheableData<?> cd = sec.getCacheableData(input1.getName());
    if (out != in) {
        //prevent unnecessary lineage info
        //guaranteed to exist (see above)
        RDDObject inro = cd.getRDDHandle();
        //create new rdd object
        RDDObject outro = new RDDObject(out, output.getName());
        //mark as checkpointed
        outro.setCheckpointRDD(true);
        //keep lineage to prevent cycles on cleanup
        outro.addLineageChild(inro);
        cd.setRDDHandle(outro);
    }
    sec.setVariable(output.getName(), cd);
}
Also used : MatrixBlock(org.apache.sysml.runtime.matrix.data.MatrixBlock) MatrixIndexes(org.apache.sysml.runtime.matrix.data.MatrixIndexes) Checkpoint(org.apache.sysml.lops.Checkpoint) MatrixCharacteristics(org.apache.sysml.runtime.matrix.MatrixCharacteristics) CreateSparseBlockFunction(org.apache.sysml.runtime.instructions.spark.functions.CreateSparseBlockFunction) FrameBlock(org.apache.sysml.runtime.matrix.data.FrameBlock) RDDObject(org.apache.sysml.runtime.instructions.spark.data.RDDObject) SparkExecutionContext(org.apache.sysml.runtime.controlprogram.context.SparkExecutionContext) CopyFrameBlockFunction(org.apache.sysml.runtime.instructions.spark.functions.CopyFrameBlockFunction) BooleanObject(org.apache.sysml.runtime.instructions.cp.BooleanObject)

Example 7 with BooleanObject

use of org.apache.sysml.runtime.instructions.cp.BooleanObject in project incubator-systemml by apache.

the class ProgramConverter method parseDataObject.

/**
	 * NOTE: MRJobConfiguration cannot be used for the general case because program blocks and
	 * related symbol tables can be hierarchically structured.
	 * 
	 * @param in data object as string
	 * @return array of objects
	 * @throws DMLRuntimeException if DMLRuntimeException occurs
	 */
public static Object[] parseDataObject(String in) throws DMLRuntimeException {
    Object[] ret = new Object[2];
    StringTokenizer st = new StringTokenizer(in, DATA_FIELD_DELIM);
    String name = st.nextToken();
    DataType datatype = DataType.valueOf(st.nextToken());
    ValueType valuetype = ValueType.valueOf(st.nextToken());
    String valString = st.hasMoreTokens() ? st.nextToken() : "";
    Data dat = null;
    switch(datatype) {
        case SCALAR:
            {
                switch(valuetype) {
                    case INT:
                        long value1 = Long.parseLong(valString);
                        dat = new IntObject(name, value1);
                        break;
                    case DOUBLE:
                        double value2 = Double.parseDouble(valString);
                        dat = new DoubleObject(name, value2);
                        break;
                    case BOOLEAN:
                        boolean value3 = Boolean.parseBoolean(valString);
                        dat = new BooleanObject(name, value3);
                        break;
                    case STRING:
                        dat = new StringObject(name, valString);
                        break;
                    default:
                        throw new DMLRuntimeException("Unable to parse valuetype " + valuetype);
                }
                break;
            }
        case MATRIX:
            {
                MatrixObject mo = new MatrixObject(valuetype, valString);
                long rows = Long.parseLong(st.nextToken());
                long cols = Long.parseLong(st.nextToken());
                int brows = Integer.parseInt(st.nextToken());
                int bcols = Integer.parseInt(st.nextToken());
                long nnz = Long.parseLong(st.nextToken());
                InputInfo iin = InputInfo.stringToInputInfo(st.nextToken());
                OutputInfo oin = OutputInfo.stringToOutputInfo(st.nextToken());
                PartitionFormat partFormat = PartitionFormat.valueOf(st.nextToken());
                UpdateType inplace = UpdateType.valueOf(st.nextToken());
                MatrixCharacteristics mc = new MatrixCharacteristics(rows, cols, brows, bcols, nnz);
                MatrixFormatMetaData md = new MatrixFormatMetaData(mc, oin, iin);
                mo.setMetaData(md);
                mo.setVarName(name);
                if (partFormat._dpf != PDataPartitionFormat.NONE)
                    mo.setPartitioned(partFormat._dpf, partFormat._N);
                mo.setUpdateType(inplace);
                dat = mo;
                break;
            }
        default:
            throw new DMLRuntimeException("Unable to parse datatype " + datatype);
    }
    ret[0] = name;
    ret[1] = dat;
    return ret;
}
Also used : MatrixObject(org.apache.sysml.runtime.controlprogram.caching.MatrixObject) ValueType(org.apache.sysml.parser.Expression.ValueType) DoubleObject(org.apache.sysml.runtime.instructions.cp.DoubleObject) Data(org.apache.sysml.runtime.instructions.cp.Data) MatrixFormatMetaData(org.apache.sysml.runtime.matrix.MatrixFormatMetaData) PartitionFormat(org.apache.sysml.runtime.controlprogram.ParForProgramBlock.PartitionFormat) PDataPartitionFormat(org.apache.sysml.runtime.controlprogram.ParForProgramBlock.PDataPartitionFormat) UpdateType(org.apache.sysml.runtime.controlprogram.caching.MatrixObject.UpdateType) MatrixFormatMetaData(org.apache.sysml.runtime.matrix.MatrixFormatMetaData) DMLRuntimeException(org.apache.sysml.runtime.DMLRuntimeException) MatrixCharacteristics(org.apache.sysml.runtime.matrix.MatrixCharacteristics) OutputInfo(org.apache.sysml.runtime.matrix.data.OutputInfo) StringTokenizer(java.util.StringTokenizer) IntObject(org.apache.sysml.runtime.instructions.cp.IntObject) InputInfo(org.apache.sysml.runtime.matrix.data.InputInfo) StringObject(org.apache.sysml.runtime.instructions.cp.StringObject) DataType(org.apache.sysml.parser.Expression.DataType) MatrixObject(org.apache.sysml.runtime.controlprogram.caching.MatrixObject) ScalarObject(org.apache.sysml.runtime.instructions.cp.ScalarObject) DoubleObject(org.apache.sysml.runtime.instructions.cp.DoubleObject) BooleanObject(org.apache.sysml.runtime.instructions.cp.BooleanObject) IntObject(org.apache.sysml.runtime.instructions.cp.IntObject) StringObject(org.apache.sysml.runtime.instructions.cp.StringObject) BooleanObject(org.apache.sysml.runtime.instructions.cp.BooleanObject)

Example 8 with BooleanObject

use of org.apache.sysml.runtime.instructions.cp.BooleanObject in project incubator-systemml by apache.

the class IfProgramBlock method execute.

@Override
public void execute(ExecutionContext ec) throws DMLRuntimeException {
    BooleanObject predResult = executePredicate(ec);
    //execute if statement
    if (predResult.getBooleanValue()) {
        try {
            for (int i = 0; i < _childBlocksIfBody.size(); i++) {
                ec.updateDebugState(i);
                _childBlocksIfBody.get(i).execute(ec);
            }
        } catch (DMLScriptException e) {
            throw e;
        } catch (Exception e) {
            throw new DMLRuntimeException(this.printBlockErrorLocation() + "Error evaluating if statement body ", e);
        }
    } else {
        try {
            for (int i = 0; i < _childBlocksElseBody.size(); i++) {
                ec.updateDebugState(i);
                _childBlocksElseBody.get(i).execute(ec);
            }
        } catch (DMLScriptException e) {
            throw e;
        } catch (Exception e) {
            throw new DMLRuntimeException(this.printBlockErrorLocation() + "Error evaluating else statement body ", e);
        }
    }
    //execute exit instructions
    try {
        executeInstructions(_exitInstructions, ec);
    } catch (DMLScriptException e) {
        throw e;
    } catch (Exception e) {
        throw new DMLRuntimeException(this.printBlockErrorLocation() + "Error evaluating if exit instructions ", e);
    }
}
Also used : DMLScriptException(org.apache.sysml.runtime.DMLScriptException) DMLRuntimeException(org.apache.sysml.runtime.DMLRuntimeException) DMLScriptException(org.apache.sysml.runtime.DMLScriptException) BooleanObject(org.apache.sysml.runtime.instructions.cp.BooleanObject) DMLRuntimeException(org.apache.sysml.runtime.DMLRuntimeException)

Example 9 with BooleanObject

use of org.apache.sysml.runtime.instructions.cp.BooleanObject in project incubator-systemml by apache.

the class MLContextUtil method convertInputType.

/**
	 * Convert input types to internal SystemML representations
	 *
	 * @param parameterName
	 *            The name of the input parameter
	 * @param parameterValue
	 *            The value of the input parameter
	 * @param metadata
	 *            matrix/frame metadata
	 * @return input in SystemML data representation
	 */
public static Data convertInputType(String parameterName, Object parameterValue, Metadata metadata) {
    String name = parameterName;
    Object value = parameterValue;
    boolean hasMetadata = (metadata != null) ? true : false;
    boolean hasMatrixMetadata = hasMetadata && (metadata instanceof MatrixMetadata) ? true : false;
    boolean hasFrameMetadata = hasMetadata && (metadata instanceof FrameMetadata) ? true : false;
    if (name == null) {
        throw new MLContextException("Input parameter name is null");
    } else if (value == null) {
        throw new MLContextException("Input parameter value is null for: " + parameterName);
    } else if (value instanceof JavaRDD<?>) {
        @SuppressWarnings("unchecked") JavaRDD<String> javaRDD = (JavaRDD<String>) value;
        if (hasMatrixMetadata) {
            MatrixMetadata matrixMetadata = (MatrixMetadata) metadata;
            if (matrixMetadata.getMatrixFormat() == MatrixFormat.IJV) {
                return MLContextConversionUtil.javaRDDStringIJVToMatrixObject(name, javaRDD, matrixMetadata);
            } else {
                return MLContextConversionUtil.javaRDDStringCSVToMatrixObject(name, javaRDD, matrixMetadata);
            }
        } else if (hasFrameMetadata) {
            FrameMetadata frameMetadata = (FrameMetadata) metadata;
            if (frameMetadata.getFrameFormat() == FrameFormat.IJV) {
                return MLContextConversionUtil.javaRDDStringIJVToFrameObject(name, javaRDD, frameMetadata);
            } else {
                return MLContextConversionUtil.javaRDDStringCSVToFrameObject(name, javaRDD, frameMetadata);
            }
        } else if (!hasMetadata) {
            String firstLine = javaRDD.first();
            boolean isAllNumbers = isCSVLineAllNumbers(firstLine);
            if (isAllNumbers) {
                return MLContextConversionUtil.javaRDDStringCSVToMatrixObject(name, javaRDD);
            } else {
                return MLContextConversionUtil.javaRDDStringCSVToFrameObject(name, javaRDD);
            }
        }
    } else if (value instanceof RDD<?>) {
        @SuppressWarnings("unchecked") RDD<String> rdd = (RDD<String>) value;
        if (hasMatrixMetadata) {
            MatrixMetadata matrixMetadata = (MatrixMetadata) metadata;
            if (matrixMetadata.getMatrixFormat() == MatrixFormat.IJV) {
                return MLContextConversionUtil.rddStringIJVToMatrixObject(name, rdd, matrixMetadata);
            } else {
                return MLContextConversionUtil.rddStringCSVToMatrixObject(name, rdd, matrixMetadata);
            }
        } else if (hasFrameMetadata) {
            FrameMetadata frameMetadata = (FrameMetadata) metadata;
            if (frameMetadata.getFrameFormat() == FrameFormat.IJV) {
                return MLContextConversionUtil.rddStringIJVToFrameObject(name, rdd, frameMetadata);
            } else {
                return MLContextConversionUtil.rddStringCSVToFrameObject(name, rdd, frameMetadata);
            }
        } else if (!hasMetadata) {
            String firstLine = rdd.first();
            boolean isAllNumbers = isCSVLineAllNumbers(firstLine);
            if (isAllNumbers) {
                return MLContextConversionUtil.rddStringCSVToMatrixObject(name, rdd);
            } else {
                return MLContextConversionUtil.rddStringCSVToFrameObject(name, rdd);
            }
        }
    } else if (value instanceof MatrixBlock) {
        MatrixBlock matrixBlock = (MatrixBlock) value;
        return MLContextConversionUtil.matrixBlockToMatrixObject(name, matrixBlock, (MatrixMetadata) metadata);
    } else if (value instanceof FrameBlock) {
        FrameBlock frameBlock = (FrameBlock) value;
        return MLContextConversionUtil.frameBlockToFrameObject(name, frameBlock, (FrameMetadata) metadata);
    } else if (value instanceof Dataset<?>) {
        @SuppressWarnings("unchecked") Dataset<Row> dataFrame = (Dataset<Row>) value;
        dataFrame = MLUtils.convertVectorColumnsToML(dataFrame);
        if (hasMatrixMetadata) {
            return MLContextConversionUtil.dataFrameToMatrixObject(name, dataFrame, (MatrixMetadata) metadata);
        } else if (hasFrameMetadata) {
            return MLContextConversionUtil.dataFrameToFrameObject(name, dataFrame, (FrameMetadata) metadata);
        } else if (!hasMetadata) {
            boolean looksLikeMatrix = doesDataFrameLookLikeMatrix(dataFrame);
            if (looksLikeMatrix) {
                return MLContextConversionUtil.dataFrameToMatrixObject(name, dataFrame);
            } else {
                return MLContextConversionUtil.dataFrameToFrameObject(name, dataFrame);
            }
        }
    } else if (value instanceof BinaryBlockMatrix) {
        BinaryBlockMatrix binaryBlockMatrix = (BinaryBlockMatrix) value;
        if (metadata == null) {
            metadata = binaryBlockMatrix.getMatrixMetadata();
        }
        JavaPairRDD<MatrixIndexes, MatrixBlock> binaryBlocks = binaryBlockMatrix.getBinaryBlocks();
        return MLContextConversionUtil.binaryBlocksToMatrixObject(name, binaryBlocks, (MatrixMetadata) metadata);
    } else if (value instanceof BinaryBlockFrame) {
        BinaryBlockFrame binaryBlockFrame = (BinaryBlockFrame) value;
        if (metadata == null) {
            metadata = binaryBlockFrame.getFrameMetadata();
        }
        JavaPairRDD<Long, FrameBlock> binaryBlocks = binaryBlockFrame.getBinaryBlocks();
        return MLContextConversionUtil.binaryBlocksToFrameObject(name, binaryBlocks, (FrameMetadata) metadata);
    } else if (value instanceof Matrix) {
        Matrix matrix = (Matrix) value;
        return matrix.toMatrixObject();
    } else if (value instanceof Frame) {
        Frame frame = (Frame) value;
        return frame.toFrameObject();
    } else if (value instanceof double[][]) {
        double[][] doubleMatrix = (double[][]) value;
        return MLContextConversionUtil.doubleMatrixToMatrixObject(name, doubleMatrix, (MatrixMetadata) metadata);
    } else if (value instanceof URL) {
        URL url = (URL) value;
        return MLContextConversionUtil.urlToMatrixObject(name, url, (MatrixMetadata) metadata);
    } else if (value instanceof Integer) {
        return new IntObject((Integer) value);
    } else if (value instanceof Double) {
        return new DoubleObject((Double) value);
    } else if (value instanceof String) {
        return new StringObject((String) value);
    } else if (value instanceof Boolean) {
        return new BooleanObject((Boolean) value);
    }
    return null;
}
Also used : MatrixBlock(org.apache.sysml.runtime.matrix.data.MatrixBlock) DoubleObject(org.apache.sysml.runtime.instructions.cp.DoubleObject) URL(java.net.URL) RDD(org.apache.spark.rdd.RDD) JavaRDD(org.apache.spark.api.java.JavaRDD) JavaPairRDD(org.apache.spark.api.java.JavaPairRDD) IntObject(org.apache.sysml.runtime.instructions.cp.IntObject) FrameBlock(org.apache.sysml.runtime.matrix.data.FrameBlock) JavaPairRDD(org.apache.spark.api.java.JavaPairRDD) StringObject(org.apache.sysml.runtime.instructions.cp.StringObject) Dataset(org.apache.spark.sql.Dataset) JavaRDD(org.apache.spark.api.java.JavaRDD) MatrixObject(org.apache.sysml.runtime.controlprogram.caching.MatrixObject) DoubleObject(org.apache.sysml.runtime.instructions.cp.DoubleObject) FrameObject(org.apache.sysml.runtime.controlprogram.caching.FrameObject) BooleanObject(org.apache.sysml.runtime.instructions.cp.BooleanObject) IntObject(org.apache.sysml.runtime.instructions.cp.IntObject) StringObject(org.apache.sysml.runtime.instructions.cp.StringObject) Row(org.apache.spark.sql.Row) BooleanObject(org.apache.sysml.runtime.instructions.cp.BooleanObject)

Aggregations

BooleanObject (org.apache.sysml.runtime.instructions.cp.BooleanObject)9 StringObject (org.apache.sysml.runtime.instructions.cp.StringObject)7 DMLRuntimeException (org.apache.sysml.runtime.DMLRuntimeException)6 DoubleObject (org.apache.sysml.runtime.instructions.cp.DoubleObject)5 IntObject (org.apache.sysml.runtime.instructions.cp.IntObject)5 ScalarObject (org.apache.sysml.runtime.instructions.cp.ScalarObject)5 Data (org.apache.sysml.runtime.instructions.cp.Data)4 DMLScriptException (org.apache.sysml.runtime.DMLScriptException)3 MatrixObject (org.apache.sysml.runtime.controlprogram.caching.MatrixObject)3 StringTokenizer (java.util.StringTokenizer)2 Hop (org.apache.sysml.hops.Hop)2 DataType (org.apache.sysml.parser.Expression.DataType)2 ValueType (org.apache.sysml.parser.Expression.ValueType)2 MatrixCharacteristics (org.apache.sysml.runtime.matrix.MatrixCharacteristics)2 FrameBlock (org.apache.sysml.runtime.matrix.data.FrameBlock)2 MatrixBlock (org.apache.sysml.runtime.matrix.data.MatrixBlock)2 URL (java.net.URL)1 ArrayList (java.util.ArrayList)1 JavaPairRDD (org.apache.spark.api.java.JavaPairRDD)1 JavaRDD (org.apache.spark.api.java.JavaRDD)1