
Example 1 with TrainingMaster

Use of org.deeplearning4j.spark.api.TrainingMaster in project deeplearning4j by deeplearning4j.

The class TestKryoWarning, method doTestCG.

private static void doTestCG(SparkConf sparkConf) {
    JavaSparkContext sc = new JavaSparkContext(sparkConf);
    try {
        ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
                .graphBuilder()
                .addInputs("in")
                .addLayer("0", new OutputLayer.Builder().nIn(10).nOut(10).build(), "in")
                .setOutputs("0")
                .pretrain(false).backprop(true)
                .build();
        TrainingMaster tm = new ParameterAveragingTrainingMaster.Builder(1).build();
        //Constructing the wrapper is enough to trigger the Kryo configuration check
        SparkComputationGraph scg = new SparkComputationGraph(sc, conf, tm);
    } finally {
        sc.stop();
    }
}
Also used : SparkComputationGraph(org.deeplearning4j.spark.impl.graph.SparkComputationGraph) ComputationGraphConfiguration(org.deeplearning4j.nn.conf.ComputationGraphConfiguration) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) NeuralNetConfiguration(org.deeplearning4j.nn.conf.NeuralNetConfiguration) ParameterAveragingTrainingMaster(org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster) TrainingMaster(org.deeplearning4j.spark.api.TrainingMaster)
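
The point of this test is that merely constructing the Spark wrapper triggers DL4J's check for a missing Kryo registrator, a configuration that would otherwise corrupt off-heap INDArrays at serialization time. A minimal sketch of the recommended SparkConf setup, assuming the nd4j-kryo artifact (which supplies org.nd4j.Nd4jRegistrator) is on the classpath:

SparkConf sparkConf = new SparkConf()
        .setAppName("dl4j-kryo-example")
        // Enable Kryo serialization for Spark
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        // Register ND4J's serializer so off-heap array data survives the round trip
        .set("spark.kryo.registrator", "org.nd4j.Nd4jRegistrator");
doTestCG(sparkConf);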

Example 2 with TrainingMaster

Use of org.deeplearning4j.spark.api.TrainingMaster in project deeplearning4j by deeplearning4j.

The class TestSparkComputationGraph, method testBasic.

@Test
public void testBasic() throws Exception {
    JavaSparkContext sc = this.sc;
    RecordReader rr = new CSVRecordReader(0, ",");
    rr.initialize(new FileSplit(new ClassPathResource("iris.txt").getTempFileFromArchive()));
    MultiDataSetIterator iter = new RecordReaderMultiDataSetIterator.Builder(1)
            .addReader("iris", rr)
            .addInput("iris", 0, 3)
            .addOutputOneHot("iris", 4, 3)
            .build();
    List<MultiDataSet> list = new ArrayList<>(150);
    while (iter.hasNext()) list.add(iter.next());
    ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
            .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
            .learningRate(0.1)
            .graphBuilder()
            .addInputs("in")
            .addLayer("dense", new DenseLayer.Builder().nIn(4).nOut(2).build(), "in")
            .addLayer("out", new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT).nIn(2).nOut(3).build(), "dense")
            .setOutputs("out")
            .pretrain(false).backprop(true)
            .build();
    ComputationGraph cg = new ComputationGraph(config);
    cg.init();
    TrainingMaster tm = new ParameterAveragingTrainingMaster(true, numExecutors(), 1, 10, 1, 0);
    SparkComputationGraph scg = new SparkComputationGraph(sc, cg, tm);
    scg.setListeners(Collections.singleton((IterationListener) new ScoreIterationListener(1)));
    JavaRDD<MultiDataSet> rdd = sc.parallelize(list);
    scg.fitMultiDataSet(rdd);
    //Try: fitting using DataSet
    DataSetIterator iris = new IrisDataSetIterator(1, 150);
    List<DataSet> list2 = new ArrayList<>();
    while (iris.hasNext()) list2.add(iris.next());
    JavaRDD<DataSet> rddDS = sc.parallelize(list2);
    scg.fit(rddDS);
}
Also used : IrisDataSetIterator(org.deeplearning4j.datasets.iterator.impl.IrisDataSetIterator) DataSet(org.nd4j.linalg.dataset.DataSet) MultiDataSet(org.nd4j.linalg.dataset.api.MultiDataSet) RecordReader(org.datavec.api.records.reader.RecordReader) CSVRecordReader(org.datavec.api.records.reader.impl.csv.CSVRecordReader) FileSplit(org.datavec.api.split.FileSplit) TrainingMaster(org.deeplearning4j.spark.api.TrainingMaster) ParameterAveragingTrainingMaster(org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster) RecordReaderMultiDataSetIterator(org.deeplearning4j.datasets.datavec.RecordReaderMultiDataSetIterator) MultiDataSetIterator(org.nd4j.linalg.dataset.api.iterator.MultiDataSetIterator) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) ComputationGraph(org.deeplearning4j.nn.graph.ComputationGraph) ScoreIterationListener(org.deeplearning4j.optimize.listeners.ScoreIterationListener) ClassPathResource(org.nd4j.linalg.io.ClassPathResource) DenseLayer(org.deeplearning4j.nn.conf.layers.DenseLayer) ComputationGraphConfiguration(org.deeplearning4j.nn.conf.ComputationGraphConfiguration) IterationListener(org.deeplearning4j.optimize.api.IterationListener) DataSetIterator(org.nd4j.linalg.dataset.api.iterator.DataSetIterator) BaseSparkTest(org.deeplearning4j.spark.BaseSparkTest) Test(org.junit.Test)
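
The six-argument ParameterAveragingTrainingMaster constructor used above is terse; the builder spells out the same knobs by name. A minimal sketch with illustrative values (not necessarily the exact settings of this test):

// Builder argument 1: each DataSet object in the RDD holds one example
TrainingMaster tm = new ParameterAveragingTrainingMaster.Builder(1)
        .batchSizePerWorker(10)      // minibatch size each worker trains on
        .averagingFrequency(1)       // average parameters after every minibatch
        .workerPrefetchNumBatches(0) // disable asynchronous data prefetch
        .saveUpdater(true)           // also average updater state (momentum, RMSProp history)
        .build();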

Example 3 with TrainingMaster

Use of org.deeplearning4j.spark.api.TrainingMaster in project deeplearning4j by deeplearning4j.

The class TestCompareParameterAveragingSparkVsSingleMachine, method testOneExecutor.

@Test
public void testOneExecutor() {
    //Idea: single worker/executor on Spark should give identical results to a single machine
    int miniBatchSize = 10;
    int nWorkers = 1;
    for (boolean saveUpdater : new boolean[] { true, false }) {
        JavaSparkContext sc = getContext(nWorkers);
        try {
            //Do training locally, for 3 minibatches
            int[] seeds = { 1, 2, 3 };
            MultiLayerNetwork net = new MultiLayerNetwork(getConf(12345, Updater.RMSPROP));
            net.init();
            INDArray initialParams = net.params().dup();
            for (int i = 0; i < seeds.length; i++) {
                DataSet ds = getOneDataSet(miniBatchSize, seeds[i]);
                if (!saveUpdater)
                    net.setUpdater(null);
                net.fit(ds);
            }
            INDArray finalParams = net.params().dup();
            //Do training on Spark with one executor, for 3 separate minibatches
            TrainingMaster tm = getTrainingMaster(1, miniBatchSize, saveUpdater);
            SparkDl4jMultiLayer sparkNet = new SparkDl4jMultiLayer(sc, getConf(12345, Updater.RMSPROP), tm);
            sparkNet.setCollectTrainingStats(true);
            INDArray initialSparkParams = sparkNet.getNetwork().params().dup();
            for (int i = 0; i < seeds.length; i++) {
                List<DataSet> list = getOneDataSetAsIndividalExamples(miniBatchSize, seeds[i]);
                JavaRDD<DataSet> rdd = sc.parallelize(list);
                sparkNet.fit(rdd);
            }
            INDArray finalSparkParams = sparkNet.getNetwork().params().dup();
            assertEquals(initialParams, initialSparkParams);
            assertNotEquals(initialParams, finalParams);
            assertEquals(finalParams, finalSparkParams);
        } finally {
            sc.stop();
        }
    }
}
Also used : INDArray(org.nd4j.linalg.api.ndarray.INDArray) DataSet(org.nd4j.linalg.dataset.DataSet) SparkDl4jMultiLayer(org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) MultiLayerNetwork(org.deeplearning4j.nn.multilayer.MultiLayerNetwork) TrainingMaster(org.deeplearning4j.spark.api.TrainingMaster) Test(org.junit.Test)
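
The getTrainingMaster(1, miniBatchSize, saveUpdater) helper is not shown in this snippet. Judging from how it is called, a plausible reconstruction (details such as the prefetch setting are assumptions) is:

private static TrainingMaster getTrainingMaster(int avgFrequency, int miniBatchSize, boolean saveUpdater) {
    // Hypothetical sketch of the test's helper: averaging frequency 1 means
    // parameters are averaged across workers after every single minibatch.
    return new ParameterAveragingTrainingMaster.Builder(1)
            .averagingFrequency(avgFrequency)
            .batchSizePerWorker(miniBatchSize)
            .saveUpdater(saveUpdater)
            .workerPrefetchNumBatches(0)
            .build();
}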

Example 4 with TrainingMaster

Use of org.deeplearning4j.spark.api.TrainingMaster in project deeplearning4j by deeplearning4j.

The class TestCompareParameterAveragingSparkVsSingleMachine, method testAverageEveryStepGraph.

@Test
public void testAverageEveryStepGraph() {
    //Idea: averaging every step with SGD (SGD updater + optimizer) is mathematically identical to doing the learning
    // on a single machine for synchronous distributed training
    //BUT: This is *ONLY* the case if all workers get an identical number of examples. This won't be the case if
    // we use RDD.randomSplit (which is what occurs if we use .fit(JavaRDD<DataSet>) on a data set that needs splitting),
    // which might give a number of examples that isn't divisible by number of workers (like 39 examples on 4 executors)
    //This is also ONLY the case using SGD updater
    int miniBatchSizePerWorker = 10;
    int nWorkers = 4;
    for (boolean saveUpdater : new boolean[] { true, false }) {
        JavaSparkContext sc = getContext(nWorkers);
        try {
            //Do training locally, for 3 minibatches
            int[] seeds = { 1, 2, 3 };
            //                CudaGridExecutioner executioner = (CudaGridExecutioner) Nd4j.getExecutioner();
            ComputationGraph net = new ComputationGraph(getGraphConf(12345, Updater.SGD));
            net.init();
            INDArray initialParams = net.params().dup();
            for (int i = 0; i < seeds.length; i++) {
                DataSet ds = getOneDataSet(miniBatchSizePerWorker * nWorkers, seeds[i]);
                if (!saveUpdater)
                    net.setUpdater(null);
                net.fit(ds);
            }
            INDArray finalParams = net.params().dup();
            //                executioner.addToWatchdog(finalParams, "finalParams");
            //Do training on Spark with one executor, for 3 separate minibatches
            TrainingMaster tm = getTrainingMaster(1, miniBatchSizePerWorker, saveUpdater);
            SparkComputationGraph sparkNet = new SparkComputationGraph(sc, getGraphConf(12345, Updater.SGD), tm);
            sparkNet.setCollectTrainingStats(true);
            INDArray initialSparkParams = sparkNet.getNetwork().params().dup();
            for (int i = 0; i < seeds.length; i++) {
                List<DataSet> list = getOneDataSetAsIndividalExamples(miniBatchSizePerWorker * nWorkers, seeds[i]);
                JavaRDD<DataSet> rdd = sc.parallelize(list);
                sparkNet.fit(rdd);
            }
            System.out.println(sparkNet.getSparkTrainingStats().statsAsString());
            INDArray finalSparkParams = sparkNet.getNetwork().params().dup();
            //                executioner.addToWatchdog(finalSparkParams, "finalSparkParams");
            float[] fp = finalParams.data().asFloat();
            float[] fps = finalSparkParams.data().asFloat();
            System.out.println("Initial (Local) params:       " + Arrays.toString(initialParams.data().asFloat()));
            System.out.println("Initial (Spark) params:       " + Arrays.toString(initialSparkParams.data().asFloat()));
            System.out.println("Final (Local) params: " + Arrays.toString(fp));
            System.out.println("Final (Spark) params: " + Arrays.toString(fps));
            assertEquals(initialParams, initialSparkParams);
            assertNotEquals(initialParams, finalParams);
            assertArrayEquals(fp, fps, 1e-5f);
            double sparkScore = sparkNet.getScore();
            assertTrue(sparkScore > 0.0);
            assertEquals(net.score(), sparkScore, 1e-3);
        } finally {
            sc.stop();
        }
    }
}
Also used : SparkComputationGraph(org.deeplearning4j.spark.impl.graph.SparkComputationGraph) DataSet(org.nd4j.linalg.dataset.DataSet) TrainingMaster(org.deeplearning4j.spark.api.TrainingMaster) INDArray(org.nd4j.linalg.api.ndarray.INDArray) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) ComputationGraph(org.deeplearning4j.nn.graph.ComputationGraph) Test(org.junit.Test)
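
Why this test expects the final parameters to match to within 1e-5: with plain SGD, every worker starts an averaging period from the same parameters theta and takes one step on its own minibatch loss L_w; averaging the results is algebraically one step on the combined minibatch. A sketch of the identity, assuming the score is a mean over examples:

\theta_w = \theta - \eta \, \nabla L_w(\theta), \qquad
\frac{1}{N}\sum_{w=1}^{N}\theta_w
  = \theta - \eta \cdot \frac{1}{N}\sum_{w=1}^{N}\nabla L_w(\theta)
  = \theta - \eta \, \nabla L(\theta)

where L = (1/N) sum_w L_w is exactly the mean loss over all N * miniBatchSizePerWorker examples. As the comments in the test note, this requires every worker to see the same number of examples (hence the hand-built RDDs rather than randomSplit) and a stateless updater such as plain SGD.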

Example 5 with TrainingMaster

Use of org.deeplearning4j.spark.api.TrainingMaster in project deeplearning4j by deeplearning4j.

The class TestCompareParameterAveragingSparkVsSingleMachine, method testAverageEveryStepGraphCNN.

@Test
public void testAverageEveryStepGraphCNN() {
    //Idea: averaging every step with SGD (SGD updater + optimizer) is mathematically identical to doing the learning
    // on a single machine for synchronous distributed training
    //BUT: This is *ONLY* the case if all workers get an identical number of examples. This won't be the case if
    // we use RDD.randomSplit (which is what occurs if we use .fit(JavaRDD<DataSet>) on a data set that needs splitting),
    // which might give a number of examples that isn't divisible by number of workers (like 39 examples on 4 executors)
    //This is also ONLY the case using SGD updater
    int miniBatchSizePerWorker = 10;
    int nWorkers = 4;
    for (boolean saveUpdater : new boolean[] { true, false }) {
        JavaSparkContext sc = getContext(nWorkers);
        try {
            //Do training locally, for 3 minibatches
            int[] seeds = { 1, 2, 3 };
            ComputationGraph net = new ComputationGraph(getGraphConfCNN(12345, Updater.SGD));
            net.init();
            INDArray initialParams = net.params().dup();
            for (int i = 0; i < seeds.length; i++) {
                DataSet ds = getOneDataSetCNN(miniBatchSizePerWorker * nWorkers, seeds[i]);
                if (!saveUpdater)
                    net.setUpdater(null);
                net.fit(ds);
            }
            INDArray finalParams = net.params().dup();
            //Do training on Spark with one executor, for 3 separate minibatches
            TrainingMaster tm = getTrainingMaster(1, miniBatchSizePerWorker, saveUpdater);
            SparkComputationGraph sparkNet = new SparkComputationGraph(sc, getGraphConfCNN(12345, Updater.SGD), tm);
            sparkNet.setCollectTrainingStats(true);
            INDArray initialSparkParams = sparkNet.getNetwork().params().dup();
            for (int i = 0; i < seeds.length; i++) {
                List<DataSet> list = getOneDataSetAsIndividalExamplesCNN(miniBatchSizePerWorker * nWorkers, seeds[i]);
                JavaRDD<DataSet> rdd = sc.parallelize(list);
                sparkNet.fit(rdd);
            }
            System.out.println(sparkNet.getSparkTrainingStats().statsAsString());
            INDArray finalSparkParams = sparkNet.getNetwork().params().dup();
            System.out.println("Initial (Local) params:  " + Arrays.toString(initialParams.data().asFloat()));
            System.out.println("Initial (Spark) params:  " + Arrays.toString(initialSparkParams.data().asFloat()));
            System.out.println("Final (Local) params:    " + Arrays.toString(finalParams.data().asFloat()));
            System.out.println("Final (Spark) params:    " + Arrays.toString(finalSparkParams.data().asFloat()));
            assertArrayEquals(initialParams.data().asFloat(), initialSparkParams.data().asFloat(), 1e-8f);
            assertArrayEquals(finalParams.data().asFloat(), finalSparkParams.data().asFloat(), 1e-6f);
            double sparkScore = sparkNet.getScore();
            assertTrue(sparkScore > 0.0);
            assertEquals(net.score(), sparkScore, 1e-3);
        } finally {
            sc.stop();
        }
    }
}
Also used : SparkComputationGraph(org.deeplearning4j.spark.impl.graph.SparkComputationGraph) INDArray(org.nd4j.linalg.api.ndarray.INDArray) DataSet(org.nd4j.linalg.dataset.DataSet) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) ComputationGraph(org.deeplearning4j.nn.graph.ComputationGraph) TrainingMaster(org.deeplearning4j.spark.api.TrainingMaster) Test(org.junit.Test)
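
Both graph tests enable stats collection via setCollectTrainingStats(true) and print the result with statsAsString(). The same stats object can also be exported as an HTML timeline via the StatsUtils helper in org.deeplearning4j.spark.stats; a short sketch (the output path is illustrative):

import org.deeplearning4j.spark.api.stats.SparkTrainingStats;
import org.deeplearning4j.spark.stats.StatsUtils;

SparkTrainingStats stats = sparkNet.getSparkTrainingStats();
// Writes an HTML page with per-worker fit/averaging timing breakdowns
StatsUtils.exportStatsAsHtml(stats, "ParameterAveragingStats.html", sc);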

Aggregations

TrainingMaster (org.deeplearning4j.spark.api.TrainingMaster): 15 usages
Test (org.junit.Test): 13 usages
DataSet (org.nd4j.linalg.dataset.DataSet): 12 usages
ComputationGraph (org.deeplearning4j.nn.graph.ComputationGraph): 10 usages
ParameterAveragingTrainingMaster (org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster): 10 usages
NeuralNetConfiguration (org.deeplearning4j.nn.conf.NeuralNetConfiguration): 9 usages
JavaSparkContext (org.apache.spark.api.java.JavaSparkContext): 8 usages
ComputationGraphConfiguration (org.deeplearning4j.nn.conf.ComputationGraphConfiguration): 8 usages
OutputLayer (org.deeplearning4j.nn.conf.layers.OutputLayer): 6 usages
ScoreIterationListener (org.deeplearning4j.optimize.listeners.ScoreIterationListener): 6 usages
EarlyStoppingConfiguration (org.deeplearning4j.earlystopping.EarlyStoppingConfiguration): 5 usages
InMemoryModelSaver (org.deeplearning4j.earlystopping.saver.InMemoryModelSaver): 5 usages
MaxEpochsTerminationCondition (org.deeplearning4j.earlystopping.termination.MaxEpochsTerminationCondition): 5 usages
SparkEarlyStoppingGraphTrainer (org.deeplearning4j.spark.earlystopping.SparkEarlyStoppingGraphTrainer): 5 usages
SparkLossCalculatorComputationGraph (org.deeplearning4j.spark.earlystopping.SparkLossCalculatorComputationGraph): 5 usages
DataSetToMultiDataSetFn (org.deeplearning4j.spark.impl.graph.dataset.DataSetToMultiDataSetFn): 5 usages
INDArray (org.nd4j.linalg.api.ndarray.INDArray): 5 usages
MaxTimeIterationTerminationCondition (org.deeplearning4j.earlystopping.termination.MaxTimeIterationTerminationCondition): 4 usages
SparkComputationGraph (org.deeplearning4j.spark.impl.graph.SparkComputationGraph): 4 usages
IrisDataSetIterator (org.deeplearning4j.datasets.iterator.impl.IrisDataSetIterator): 3 usages