
Example 6 with JSONArray

Use of org.apache.wink.json4j.JSONArray in the Apache project incubator-systemml.

From the class TfMetaUtils, method parseJsonObjectIDList:

public static int[] parseJsonObjectIDList(JSONObject spec, String[] colnames, String group) throws JSONException {
    int[] colList = new int[0];
    boolean ids = spec.containsKey("ids") && spec.getBoolean("ids");
    if (spec.containsKey(group) && spec.get(group) instanceof JSONArray) {
        JSONArray colspecs = (JSONArray) spec.get(group);
        colList = new int[colspecs.size()];
        for (int j = 0; j < colspecs.size(); j++) {
            JSONObject colspec = (JSONObject) colspecs.get(j);
            colList[j] = ids ? colspec.getInt("id") : (ArrayUtils.indexOf(colnames, colspec.get("name")) + 1);
            if (colList[j] <= 0) {
                throw new RuntimeException("Specified column '" + colspec.get(ids ? "id" : "name") + "' does not exist.");
            }
        }
        //ensure ascending order of column IDs
        Arrays.sort(colList);
    }
    return colList;
}
Also used : DMLRuntimeException(org.apache.sysml.runtime.DMLRuntimeException) JSONObject(org.apache.wink.json4j.JSONObject) JSONArray(org.apache.wink.json4j.JSONArray)
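The core of parseJsonObjectIDList is the mapping from column names to 1-based column IDs: ArrayUtils.indexOf returns a 0-based index (or -1 for a missing name), so adding 1 yields an ID where 0 signals "not found" and trips the <= 0 check. The sketch below isolates that logic without the JSON layer; the class and method names are hypothetical, not part of SystemML's API.

```java
import java.util.Arrays;

// Minimal standalone sketch of the ID-resolution logic in
// parseJsonObjectIDList, with the JSON spec replaced by a plain
// String[] of requested column names.
public class ColumnIdSketch {

    // Map requested column names to 1-based column IDs and return them
    // in ascending order, mirroring the Arrays.sort(colList) step.
    public static int[] resolveColumnIds(String[] colnames, String[] requested) {
        int[] colList = new int[requested.length];
        for (int j = 0; j < requested.length; j++) {
            // ArrayUtils.indexOf(colnames, name) + 1 in the original:
            // 0-based index shifted to a 1-based ID, so a missing name
            // (-1) maps to 0 and fails the <= 0 validity check.
            int idx = -1;
            for (int i = 0; i < colnames.length; i++) {
                if (colnames[i].equals(requested[j])) { idx = i; break; }
            }
            colList[j] = idx + 1;
            if (colList[j] <= 0)
                throw new RuntimeException("Specified column '" + requested[j] + "' does not exist.");
        }
        // ensure ascending order of column IDs
        Arrays.sort(colList);
        return colList;
    }

    public static void main(String[] args) {
        String[] header = { "age", "income", "city" };
        int[] ids = resolveColumnIds(header, new String[] { "city", "age" });
        System.out.println(Arrays.toString(ids)); // prints [1, 3]
    }
}
```

Sorting matters because downstream consumers of the ID list assume ascending column order, as the original comment notes.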

Example 7 with JSONArray

Use of org.apache.wink.json4j.JSONArray in the Apache project incubator-systemml.

From the class DataTransform, method getNumColumnsTf:

/**
	 * Helper function to determine the number of columns after applying
	 * transformations. Note that dummycoding changes the number of columns.
	 * 
	 * @param fs file system
	 * @param header header line
	 * @param delim delimiter
	 * @param tfMtdPath transform metadata path
	 * @return number of columns after applying transformations
	 * @throws IllegalArgumentException if IllegalArgumentException occurs
	 * @throws IOException if IOException occurs
	 * @throws DMLRuntimeException if DMLRuntimeException occurs
	 * @throws JSONException  if JSONException occurs
	 */
private static int getNumColumnsTf(FileSystem fs, String header, String delim, String tfMtdPath) throws IllegalArgumentException, IOException, DMLRuntimeException, JSONException {
    String[] columnNames = Pattern.compile(Pattern.quote(delim)).split(header, -1);
    int ret = columnNames.length;
    JSONObject spec = null;
    try (BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(new Path(tfMtdPath + "/spec.json"))))) {
        spec = JSONHelper.parse(br);
    }
    // fetch relevant attribute lists
    if (!spec.containsKey(TfUtils.TXMETHOD_DUMMYCODE))
        return ret;
    JSONArray dcdList = (JSONArray) ((JSONObject) spec.get(TfUtils.TXMETHOD_DUMMYCODE)).get(TfUtils.JSON_ATTRS);
    // look for numBins among binned columns
    for (Object o : dcdList) {
        int id = UtilFunctions.toInt(o);
        Path binpath = new Path(tfMtdPath + "/Bin/" + UtilFunctions.unquote(columnNames[id - 1]) + TfUtils.TXMTD_BIN_FILE_SUFFIX);
        Path rcdpath = new Path(tfMtdPath + "/Recode/" + UtilFunctions.unquote(columnNames[id - 1]) + TfUtils.TXMTD_RCD_DISTINCT_SUFFIX);
        if (TfUtils.checkValidInputFile(fs, binpath, false)) {
            int nbins = -1;
            try (BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(binpath)))) {
                nbins = UtilFunctions.parseToInt(br.readLine().split(TfUtils.TXMTD_SEP)[4]);
            }
            ret += (nbins - 1);
        } else if (TfUtils.checkValidInputFile(fs, rcdpath, false)) {
            int ndistinct = -1;
            try (BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(rcdpath)))) {
                ndistinct = UtilFunctions.parseToInt(br.readLine());
            }
            ret += (ndistinct - 1);
        } else
            throw new DMLRuntimeException("Relevant transformation metadata for column (id=" + id + ", name=" + columnNames[id - 1] + ") is not found.");
    }
    return ret;
}
Also used : Path(org.apache.hadoop.fs.Path) JSONObject(org.apache.wink.json4j.JSONObject) InputStreamReader(java.io.InputStreamReader) BufferedReader(java.io.BufferedReader) JSONArray(org.apache.wink.json4j.JSONArray) MatrixObject(org.apache.sysml.runtime.controlprogram.caching.MatrixObject) FrameObject(org.apache.sysml.runtime.controlprogram.caching.FrameObject) JSONObject(org.apache.wink.json4j.JSONObject) RDDObject(org.apache.sysml.runtime.instructions.spark.data.RDDObject) DMLRuntimeException(org.apache.sysml.runtime.DMLRuntimeException)
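The column arithmetic in getNumColumnsTf is simple once isolated: dummycoding replaces each selected column with k indicator columns, where k is the number of bins (binned columns) or distinct values (recoded columns), so the total grows by k - 1 per dummycoded column. A minimal sketch with the per-column metadata lookups replaced by an in-memory map; the names here are hypothetical, not SystemML's:

```java
import java.util.Map;

public class DummycodeCount {

    // Number of columns after dummycoding: a column with k categories
    // (bins or distinct recode values) is replaced by k indicator
    // columns, i.e. it contributes k - 1 extra columns.
    public static int numColumnsAfterDummycode(int numInputCols,
                                               Map<Integer, Integer> categoriesPerDcdCol) {
        int ret = numInputCols;
        for (int k : categoriesPerDcdCol.values())
            ret += (k - 1);
        return ret;
    }

    public static void main(String[] args) {
        // 5 input columns; column 2 binned into 4 bins,
        // column 5 recoded with 3 distinct values
        int n = numColumnsAfterDummycode(5, Map.of(2, 4, 5, 3));
        System.out.println(n); // prints 10 (5 + 3 + 2)
    }
}
```

This mirrors the two branches in the method above: the bin path reads nbins from the binning metadata file and adds nbins - 1, the recode path reads ndistinct from the distinct-values file and adds ndistinct - 1, and a column with neither is an error.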

Aggregations

JSONArray (org.apache.wink.json4j.JSONArray): 7
JSONObject (org.apache.wink.json4j.JSONObject): 7
DMLRuntimeException (org.apache.sysml.runtime.DMLRuntimeException): 3
Path (org.apache.hadoop.fs.Path): 2
FrameObject (org.apache.sysml.runtime.controlprogram.caching.FrameObject): 2
MatrixObject (org.apache.sysml.runtime.controlprogram.caching.MatrixObject): 2
RDDObject (org.apache.sysml.runtime.instructions.spark.data.RDDObject): 2
BufferedReader (java.io.BufferedReader): 1
BufferedWriter (java.io.BufferedWriter): 1
IOException (java.io.IOException): 1
InputStreamReader (java.io.InputStreamReader): 1
OutputStreamWriter (java.io.OutputStreamWriter): 1
ArrayList (java.util.ArrayList): 1
Entry (java.util.Map.Entry): 1
FileSystem (org.apache.hadoop.fs.FileSystem): 1
KahanObject (org.apache.sysml.runtime.instructions.cp.KahanObject): 1