
Example 1 with IntDoubleReduceFunction

Use of org.deeplearning4j.spark.impl.common.reduce.IntDoubleReduceFunction in the deeplearning4j project.

From the class SparkComputationGraph, method calculateScoreMultiDataSet:

/**
     * Calculate the score for all examples in the provided {@code JavaRDD<MultiDataSet>}, either by summing
     * or averaging over the entire data set.
     *
     * @param data          Data to score
     * @param average       Whether to average the scores (if false, they are summed)
     * @param minibatchSize The number of examples to use in each minibatch when scoring. If more examples are in a partition than
     *                      this, multiple scoring operations will be done (to avoid using too much memory by doing the whole partition
     *                      in one go)
     */
public double calculateScoreMultiDataSet(JavaRDD<MultiDataSet> data, boolean average, int minibatchSize) {
    JavaRDD<Tuple2<Integer, Double>> rdd = data.mapPartitions(new ScoreFlatMapFunctionCGMultiDataSet(conf.toJson(), sc.broadcast(network.params(false)), minibatchSize));
    //Reduce to a single tuple, with example count + sum of scores
    Tuple2<Integer, Double> countAndSumScores = rdd.reduce(new IntDoubleReduceFunction());
    if (average) {
        return countAndSumScores._2() / countAndSumScores._1();
    } else {
        return countAndSumScores._2();
    }
}
Also used: AtomicInteger (java.util.concurrent.atomic.AtomicInteger), IntDoubleReduceFunction (org.deeplearning4j.spark.impl.common.reduce.IntDoubleReduceFunction), Tuple2 (scala.Tuple2)
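The body of IntDoubleReduceFunction is not shown in these examples, but from how it is used here it presumably just adds two (exampleCount, scoreSum) tuples component-wise so that `reduce` collapses the per-partition results into one total. A minimal stand-alone sketch of that assumption, using a plain Pair class in place of scala.Tuple2:

```java
import java.util.List;

public class ReduceSketch {
    // Simple stand-in for scala.Tuple2<Integer, Double>: (example count, score sum)
    static final class Pair {
        final int count;
        final double sum;
        Pair(int count, double sum) { this.count = count; this.sum = sum; }
    }

    // The element-wise addition IntDoubleReduceFunction presumably performs:
    // combine two (count, scoreSum) pairs by adding each component.
    static Pair combine(Pair a, Pair b) {
        return new Pair(a.count + b.count, a.sum + b.sum);
    }

    public static void main(String[] args) {
        // One (count, scoreSum) pair per partition, as produced by mapPartitions
        List<Pair> perPartition = List.of(new Pair(32, 8.0), new Pair(16, 4.0));
        Pair total = perPartition.stream().reduce(new Pair(0, 0.0), ReduceSketch::combine);
        // With average == true, the method divides the score sum by the example count
        System.out.println(total.count + " " + total.sum + " " + (total.sum / total.count));
        // prints: 48 12.0 0.25
    }
}
```

The reduced pair is exactly what the `if (average)` branch consumes: `_2()` (the score sum) divided by `_1()` (the example count) when averaging, or `_2()` alone when summing.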

Example 2 with IntDoubleReduceFunction

Use of org.deeplearning4j.spark.impl.common.reduce.IntDoubleReduceFunction in the deeplearning4j project.

From the class SparkComputationGraph, method calculateScore:

/**
     * Calculate the score for all examples in the provided {@code JavaRDD<DataSet>}, either by summing
     * or averaging over the entire data set. To calculate a score for each example individually, use {@link #scoreExamples(JavaPairRDD, boolean)}
     * or one of the similar methods.
     *
     * @param data          Data to score
     * @param average       Whether to average the scores (if false, they are summed)
     * @param minibatchSize The number of examples to use in each minibatch when scoring. If more examples are in a partition than
     *                      this, multiple scoring operations will be done (to avoid using too much memory by doing the whole partition
     *                      in one go)
     */
public double calculateScore(JavaRDD<DataSet> data, boolean average, int minibatchSize) {
    JavaRDD<Tuple2<Integer, Double>> rdd = data.mapPartitions(new ScoreFlatMapFunctionCGDataSet(conf.toJson(), sc.broadcast(network.params(false)), minibatchSize));
    //Reduce to a single tuple, with example count + sum of scores
    Tuple2<Integer, Double> countAndSumScores = rdd.reduce(new IntDoubleReduceFunction());
    if (average) {
        return countAndSumScores._2() / countAndSumScores._1();
    } else {
        return countAndSumScores._2();
    }
}
Also used: AtomicInteger (java.util.concurrent.atomic.AtomicInteger), IntDoubleReduceFunction (org.deeplearning4j.spark.impl.common.reduce.IntDoubleReduceFunction), Tuple2 (scala.Tuple2)

Example 3 with IntDoubleReduceFunction

Use of org.deeplearning4j.spark.impl.common.reduce.IntDoubleReduceFunction in the deeplearning4j project.

From the class SparkDl4jMultiLayer, method calculateScore:

/**
     * Calculate the score for all examples in the provided {@code JavaRDD<DataSet>}, either by summing
     * or averaging over the entire data set. To calculate a score for each example individually, use {@link #scoreExamples(JavaPairRDD, boolean)}
     * or one of the similar methods.
     *
     * @param data          Data to score
     * @param average       Whether to average the scores (if false, they are summed)
     * @param minibatchSize The number of examples to use in each minibatch when scoring. If more examples are in a partition than
     *                      this, multiple scoring operations will be done (to avoid using too much memory by doing the whole partition
     *                      in one go)
     */
public double calculateScore(JavaRDD<DataSet> data, boolean average, int minibatchSize) {
    JavaRDD<Tuple2<Integer, Double>> rdd = data.mapPartitions(new ScoreFlatMapFunction(conf.toJson(), sc.broadcast(network.params(false)), minibatchSize));
    //Reduce to a single tuple, with example count + sum of scores
    Tuple2<Integer, Double> countAndSumScores = rdd.reduce(new IntDoubleReduceFunction());
    if (average) {
        return countAndSumScores._2() / countAndSumScores._1();
    } else {
        return countAndSumScores._2();
    }
}
Also used: ScoreFlatMapFunction (org.deeplearning4j.spark.impl.multilayer.scoring.ScoreFlatMapFunction), IntDoubleReduceFunction (org.deeplearning4j.spark.impl.common.reduce.IntDoubleReduceFunction), Tuple2 (scala.Tuple2)
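All three methods take a minibatchSize parameter that bounds memory use: within each partition, the scoring flat-map functions evaluate the network on chunks of at most minibatchSize examples rather than the whole partition at once. A hypothetical stand-alone sketch of that chunking (the helper name and types are illustrative, not DL4J API):

```java
import java.util.ArrayList;
import java.util.List;

public class MinibatchSketch {
    // Split one partition's examples into chunks of at most minibatchSize,
    // mirroring how the scoring functions avoid loading a whole partition at once.
    static <T> List<List<T>> minibatches(List<T> partition, int minibatchSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < partition.size(); i += minibatchSize) {
            chunks.add(new ArrayList<>(partition.subList(i, Math.min(i + minibatchSize, partition.size()))));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 10 examples with minibatchSize 4 -> chunk sizes 4, 4, 2
        List<Integer> partition = List.of(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        for (List<Integer> chunk : minibatches(partition, 4)) {
            System.out.println(chunk.size());
        }
    }
}
```

Each chunk would be scored independently and its (count, scoreSum) contribution accumulated, which is why the per-partition output is already a Tuple2 rather than per-example scores.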

Aggregations

IntDoubleReduceFunction (org.deeplearning4j.spark.impl.common.reduce.IntDoubleReduceFunction): 3 uses
Tuple2 (scala.Tuple2): 3 uses
AtomicInteger (java.util.concurrent.atomic.AtomicInteger): 2 uses
ScoreFlatMapFunction (org.deeplearning4j.spark.impl.multilayer.scoring.ScoreFlatMapFunction): 1 use