
Example 91 with Key

use of water.Key in project h2o-3 by h2oai.

the class PersistManager method importFiles.

/**
   * From a path produce a list of files and keys for parsing.
   *
   * Use as follows:
   *
   * ArrayList<String> files = new ArrayList<>();
   * ArrayList<String> keys = new ArrayList<>();
   * ArrayList<String> fails = new ArrayList<>();
   * ArrayList<String> dels = new ArrayList<>();
   * importFiles(path, pattern, files, keys, fails, dels);
   *
   * @param path  (Input) Path to import data from
   * @param pattern (Input) Regex pattern to match files by
   * @param files (Output) List of files found
   * @param keys  (Output) List of keys corresponding to files
   * @param fails (Output) List of failed files which mismatch among nodes
   * @param dels  (Output) I don't know what this is
   */
public void importFiles(String path, String pattern, ArrayList<String> files, ArrayList<String> keys, ArrayList<String> fails, ArrayList<String> dels) {
    URI uri = FileUtils.getURI(path);
    String scheme = uri.getScheme();
    if (scheme == null || "file".equals(scheme)) {
        I[Value.NFS].importFiles(path, pattern, files, keys, fails, dels);
    } else if ("http".equals(scheme) || "https".equals(scheme)) {
        try {
            java.net.URL url = new URL(path);
            Key destination_key = Key.make(path);
            java.io.InputStream is = url.openStream();
            UploadFileVec.ReadPutStats stats = new UploadFileVec.ReadPutStats();
            UploadFileVec.readPut(destination_key, is, stats);
            files.add(path);
            keys.add(destination_key.toString());
        } catch (Throwable e) {
            // On failure (e.g. a broken socket), swallow the exception and just record the failed path
            fails.add(path);
        }
    } else if ("s3".equals(scheme)) {
        if (I[Value.S3] == null)
            throw new H2OIllegalArgumentException("S3 support is not configured");
        I[Value.S3].importFiles(path, pattern, files, keys, fails, dels);
    } else if ("hdfs".equals(scheme) || "s3n".equals(scheme) || "s3a".equals(scheme) || "maprfs".equals(scheme) || (useHdfsAsFallback() && I[Value.HDFS] != null && I[Value.HDFS].canHandle(path))) {
        if (I[Value.HDFS] == null)
            throw new H2OIllegalArgumentException("HDFS, S3N, and S3A support is not configured");
        I[Value.HDFS].importFiles(path, pattern, files, keys, fails, dels);
    }
    if (pattern != null && !pattern.isEmpty()) {
        //New files ArrayList after matching pattern of choice
        files.retainAll(matchPattern(path, files, pattern));
        //New keys ArrayList after matching pattern of choice
        keys.retainAll(matchPattern(path, keys, pattern));
        //New fails ArrayList after matching pattern of choice. Only show failures that match pattern
        if (!fails.isEmpty()) {
            fails.retainAll(matchPattern(path, fails, pattern));
        }
    }
}
Also used : UploadFileVec(water.fvec.UploadFileVec) H2OIllegalArgumentException(water.exceptions.H2OIllegalArgumentException) URI(java.net.URI) URL(java.net.URL) Key(water.Key)
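
The pattern step at the end of importFiles narrows each output list with retainAll. The following is a minimal sketch of that idea, not the h2o-3 implementation: the class PatternFilterSketch and the helper matching are hypothetical stand-ins for matchPattern, and the sketch assumes the regex must match the whole entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PatternFilterSketch {
    // Hypothetical stand-in for matchPattern: returns the entries whose
    // full text matches the given regex (assumption: whole-string match).
    static List<String> matching(List<String> entries, String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String e : entries) {
            if (p.matcher(e).matches()) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>(
                Arrays.asList("/data/a.csv", "/data/b.txt", "/data/c.csv"));
        // retainAll mutates the list in place and keeps the original order.
        files.retainAll(matching(files, ".*\\.csv"));
        System.out.println(files); // prints [/data/a.csv, /data/c.csv]
    }
}
```

Because retainAll mutates the caller's list, the output parameters keep their identity, which fits the (Output) list convention used by importFiles.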

Example 92 with Key

use of water.Key in project h2o-3 by h2oai.

the class PersistManager method anyURIToKey.

/** Convert given URI into a specific H2O key representation.
   *
   * The representation depends on persistent backend, since it will
   * deduce file location from the key content.
   *
   * The method will look at scheme of URI and based on it, it will
   * ask a backend to provide a conversion to a key (i.e., URI with scheme
   * 'hdfs' will be forwarded to HDFS backend).
   *
   * @param uri file location
   * @return a key encoding URI
   * @throws IOException in the case of uri conversion problem
   * @throws water.exceptions.H2OIllegalArgumentException in case of unsupported scheme
   */
public final Key anyURIToKey(URI uri) throws IOException {
    Key ikey = null;
    String scheme = uri.getScheme();
    if ("s3".equals(scheme)) {
        ikey = I[Value.S3].uriToKey(uri);
    } else if ("hdfs".equals(scheme) || "s3n".equals(scheme) || "s3a".equals(scheme)) {
        // "s3" is already handled by the first branch, so only hdfs/s3n/s3a reach the HDFS backend here
        ikey = I[Value.HDFS].uriToKey(uri);
    } else if ("file".equals(scheme) || scheme == null) {
        ikey = I[Value.NFS].uriToKey(uri);
    } else if (useHdfsAsFallback() && I[Value.HDFS].canHandle(uri.toString())) {
        ikey = I[Value.HDFS].uriToKey(uri);
    } else {
        throw new H2OIllegalArgumentException("Unsupported scheme '" + scheme + "' for given uri " + uri);
    }
    return ikey;
}
Also used : H2OIllegalArgumentException(water.exceptions.H2OIllegalArgumentException) Key(water.Key)
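
Both PersistManager methods branch on URI.getScheme(), which returns the scheme without the trailing colon and returns null when the URI has no scheme. The sketch below illustrates only that dispatch. The class SchemeDispatchSketch, the method backendFor, and the string labels it returns are hypothetical; the HDFS fallback check from the original is left out.

```java
import java.net.URI;

class SchemeDispatchSketch {
    // Hypothetical labels that mirror the branches of anyURIToKey;
    // real code would call the matching persistence backend instead.
    static String backendFor(URI uri) {
        String scheme = uri.getScheme(); // "hdfs", not "hdfs:"; null if absent
        if (scheme == null || "file".equals(scheme)) {
            return "NFS";
        }
        switch (scheme) {
            case "s3":
                return "S3";
            case "hdfs":
            case "s3n":
            case "s3a":
                return "HDFS";
            default:
                throw new IllegalArgumentException(
                        "Unsupported scheme '" + scheme + "' for given uri " + uri);
        }
    }

    public static void main(String[] args) {
        System.out.println(backendFor(URI.create("/tmp/data.csv")));
        System.out.println(backendFor(URI.create("s3://bucket/key")));
        System.out.println(backendFor(URI.create("s3a://bucket/key")));
    }
}
```

Comparing against literals such as "s3n:" would never match, because the colon is a delimiter and not part of the value getScheme() returns.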

Example 93 with Key

use of water.Key in project h2o-3 by h2oai.

the class AstVariance method array.

// Matrix covariance.  Compute covariance between all columns from each Frame
// against each other.  Return a matrix of covariances which is frx.numCols
// wide and fry.numCols tall.
private Val array(Frame frx, Frame fry, Mode mode, boolean symmetric) {
    Vec[] vecxs = frx.vecs();
    int ncolx = vecxs.length;
    Vec[] vecys = fry.vecs();
    int ncoly = vecys.length;
    if (mode.equals(Mode.Everything) || mode.equals(Mode.AllObs)) {
        if (mode.equals(Mode.AllObs)) {
            for (Vec v : vecxs) if (v.naCnt() != 0)
                throw new IllegalArgumentException("Mode is 'all.obs' but NAs are present");
            if (!symmetric)
                for (Vec v : vecys) if (v.naCnt() != 0)
                    throw new IllegalArgumentException("Mode is 'all.obs' but NAs are present");
        }
        CoVarTaskEverything[] cvs = new CoVarTaskEverything[ncoly];
        double[] xmeans = new double[ncolx];
        for (int x = 0; x < ncolx; x++) xmeans[x] = vecxs[x].mean();
        if (symmetric) {
            //1-col returns scalar
            if (ncoly == 1)
                return new ValNum(vecys[0].naCnt() == 0 ? vecys[0].sigma() * vecys[0].sigma() : Double.NaN);
            int[] idx = new int[ncoly];
            for (int y = 1; y < ncoly; y++) idx[y] = y;
            int[] first_index = new int[] { 0 };
            //compute covariances between column_i and column_i+1, column_i+2, ...
            Frame reduced_fr;
            for (int y = 0; y < ncoly - 1; y++) {
                idx = ArrayUtils.removeIds(idx, first_index);
                reduced_fr = new Frame(frx.vecs(idx));
                cvs[y] = new CoVarTaskEverything(vecys[y].mean(), xmeans).dfork(new Frame(vecys[y]).add(reduced_fr));
            }
            double[][] res_array = new double[ncoly][ncoly];
            //fill in the diagonals (variances) using sigma from rollupstats
            for (int y = 0; y < ncoly; y++) res_array[y][y] = vecys[y].naCnt() == 0 ? vecys[y].sigma() * vecys[y].sigma() : Double.NaN;
            //arrange the results into the bottom left of res_array. each successive cvs is 1 smaller in length
            for (int y = 0; y < ncoly - 1; y++) System.arraycopy(ArrayUtils.div(cvs[y].getResult()._covs, (fry.numRows() - 1)), 0, res_array[y], y + 1, ncoly - y - 1);
            //copy over the bottom left of res_array to its top right
            for (int y = 0; y < ncoly - 1; y++) {
                for (int x = y + 1; x < ncoly; x++) {
                    res_array[x][y] = res_array[y][x];
                }
            }
            //set Frame
            Vec[] res = new Vec[ncoly];
            Key<Vec>[] keys = Vec.VectorGroup.VG_LEN1.addVecs(ncoly);
            for (int y = 0; y < ncoly; y++) {
                res[y] = Vec.makeVec(res_array[y], keys[y]);
            }
            return new ValFrame(new Frame(fry._names, res));
        }
        // Launch tasks; each does all Xs vs one Y
        for (int y = 0; y < ncoly; y++) cvs[y] = new CoVarTaskEverything(vecys[y].mean(), xmeans).dfork(new Frame(vecys[y]).add(frx));
        // 1-col returns scalar 
        if (ncolx == 1 && ncoly == 1) {
            return new ValNum(cvs[0].getResult()._covs[0] / (fry.numRows() - 1));
        }
        // Gather all the Xs-vs-Y covariance arrays; divide by rows
        Vec[] res = new Vec[ncoly];
        Key<Vec>[] keys = Vec.VectorGroup.VG_LEN1.addVecs(ncoly);
        for (int y = 0; y < ncoly; y++) res[y] = Vec.makeVec(ArrayUtils.div(cvs[y].getResult()._covs, (fry.numRows() - 1)), keys[y]);
        return new ValFrame(new Frame(fry._names, res));
    } else {
        if (symmetric) {
            if (ncoly == 1)
                return new ValNum(vecys[0].sigma() * vecys[0].sigma());
            CoVarTaskCompleteObsMeanSym taskCompleteObsMeanSym = new CoVarTaskCompleteObsMeanSym().doAll(fry);
            long NACount = taskCompleteObsMeanSym._NACount;
            double[] ymeans = ArrayUtils.div(taskCompleteObsMeanSym._ysum, fry.numRows() - NACount);
            // 1 task with all Ys
            CoVarTaskCompleteObsSym cvs = new CoVarTaskCompleteObsSym(ymeans).doAll(new Frame(fry));
            double[][] res_array = new double[ncoly][ncoly];
            for (int y = 0; y < ncoly; y++) {
                System.arraycopy(ArrayUtils.div(cvs._covs[y], (fry.numRows() - 1 - NACount)), y, res_array[y], y, ncoly - y);
            }
            //copy over the bottom left of res_array to its top right
            for (int y = 0; y < ncoly - 1; y++) {
                for (int x = y + 1; x < ncoly; x++) {
                    res_array[x][y] = res_array[y][x];
                }
            }
            //set Frame
            Vec[] res = new Vec[ncoly];
            Key<Vec>[] keys = Vec.VectorGroup.VG_LEN1.addVecs(ncoly);
            for (int y = 0; y < ncoly; y++) {
                res[y] = Vec.makeVec(res_array[y], keys[y]);
            }
            return new ValFrame(new Frame(fry._names, res));
        }
        CoVarTaskCompleteObsMean taskCompleteObsMean = new CoVarTaskCompleteObsMean(ncoly, ncolx).doAll(new Frame(fry).add(frx));
        long NACount = taskCompleteObsMean._NACount;
        double[] ymeans = ArrayUtils.div(taskCompleteObsMean._ysum, fry.numRows() - NACount);
        double[] xmeans = ArrayUtils.div(taskCompleteObsMean._xsum, fry.numRows() - NACount);
        // 1 task with all Xs and Ys
        CoVarTaskCompleteObs cvs = new CoVarTaskCompleteObs(ymeans, xmeans).doAll(new Frame(fry).add(frx));
        // 1-col returns scalar 
        if (ncolx == 1 && ncoly == 1) {
            return new ValNum(cvs._covs[0][0] / (fry.numRows() - 1 - NACount));
        }
        // Gather all the Xs-vs-Y covariance arrays; divide by rows
        Vec[] res = new Vec[ncoly];
        Key<Vec>[] keys = Vec.VectorGroup.VG_LEN1.addVecs(ncoly);
        for (int y = 0; y < ncoly; y++) res[y] = Vec.makeVec(ArrayUtils.div(cvs._covs[y], (fry.numRows() - 1 - NACount)), keys[y]);
        return new ValFrame(new Frame(fry._names, res));
    }
}
Also used : ValFrame(water.rapids.vals.ValFrame) Frame(water.fvec.Frame) ValNum(water.rapids.vals.ValNum) Vec(water.fvec.Vec) Key(water.Key)
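
For reference, the quantity that the symmetric branch assembles is the sample covariance matrix: each entry is the sum of products of deviations from the column means, divided by n - 1. The sketch below computes it on plain arrays under the assumption of no missing values. It does not use any H2O types or distributed tasks, and the class CovarianceSketch and method covariance are hypothetical names.

```java
class CovarianceSketch {
    // cols[j] holds column j. Assumes every column has the same length
    // n >= 2 and contains no missing values.
    static double[][] covariance(double[][] cols) {
        int p = cols.length;
        int n = cols[0].length;
        double[] means = new double[p];
        for (int j = 0; j < p; j++) {
            double sum = 0;
            for (double v : cols[j]) sum += v;
            means[j] = sum / n;
        }
        double[][] cov = new double[p][p];
        for (int i = 0; i < p; i++) {
            // Compute the diagonal and one triangle, then mirror it,
            // since cov(i, j) == cov(j, i).
            for (int j = i; j < p; j++) {
                double sum = 0;
                for (int r = 0; r < n; r++) {
                    sum += (cols[i][r] - means[i]) * (cols[j][r] - means[j]);
                }
                cov[i][j] = sum / (n - 1);
                cov[j][i] = cov[i][j];
            }
        }
        return cov;
    }

    public static void main(String[] args) {
        double[][] cols = { {1, 2, 3}, {2, 4, 6} };
        double[][] cov = covariance(cols);
        // Variances on the diagonal are 1.0 and 4.0; the covariance is 2.0.
        System.out.println(cov[0][0] + " " + cov[0][1] + " " + cov[1][1]);
    }
}
```

Computing only one triangle and mirroring it is the same saving the symmetric branch above relies on.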

Example 94 with Key

use of water.Key in project h2o-3 by h2oai.

the class AstRm method apply.

@Override
public ValNum apply(Env env, Env.StackHelp stk, AstRoot[] asts) {
    Key id = Key.make(env.expand(asts[1].str()));
    Value val = DKV.get(id);
    if (val == null)
        return new ValNum(0);
    if (val.isFrame())
        // Remove unshared Vecs
        env._ses.remove(val.<Frame>get());
    else
        // Normal (e.g. Model) remove
        Keyed.remove(id);
    return new ValNum(1);
}
Also used : Frame(water.fvec.Frame) Value(water.Value) ValNum(water.rapids.vals.ValNum) Key(water.Key)
