
Example 1 with ActivationIdentity

Use of org.nd4j.linalg.activations.impl.ActivationIdentity in project deeplearning4j by deeplearning4j.

In the class VariationalAutoencoder, method reconstructionError.

/**
     * Return the reconstruction error for this variational autoencoder.<br>
     * <b>NOTE (important):</b> This method is used ONLY for VAEs that have a standard neural network loss function (i.e.,
     * an {@link org.nd4j.linalg.lossfunctions.ILossFunction} instance such as mean squared error) instead of using a
     * probabilistic reconstruction distribution P(x|z) for the reconstructions (as presented in the VAE architecture by
     * Kingma and Welling).<br>
     * You can check if the VAE has a loss function using {@link #hasLossFunction()}<br>
     * Consequently, the reconstruction error is a simple deterministic function (no Monte-Carlo sampling is required,
     * unlike {@link #reconstructionProbability(INDArray, int)} and {@link #reconstructionLogProbability(INDArray, int)})
     *
     * @param data       The data to calculate the reconstruction error on
     * @return Column vector of reconstruction errors for each example (shape: [numExamples,1])
     */
public INDArray reconstructionError(INDArray data) {
    if (!hasLossFunction()) {
        throw new IllegalStateException("Cannot use reconstructionError method unless the variational autoencoder is "
                        + "configured with a standard loss function (via LossFunctionWrapper). For VAEs utilizing a reconstruction "
                        + "distribution, use the reconstructionProbability or reconstructionLogProbability methods");
    }
    INDArray pZXMean = activate(data, false);
    //Not probabilistic -> "mean" == output
    INDArray reconstruction = generateAtMeanGivenZ(pZXMean);
    if (reconstructionDistribution instanceof CompositeReconstructionDistribution) {
        CompositeReconstructionDistribution c = (CompositeReconstructionDistribution) reconstructionDistribution;
        return c.computeLossFunctionScoreArray(data, reconstruction);
    } else {
        LossFunctionWrapper lfw = (LossFunctionWrapper) reconstructionDistribution;
        ILossFunction lossFunction = lfw.getLossFunction();
        // Note re: ActivationIdentity here - the reconstruction array already has the activation function
        // applied, so we don't want to apply it again. i.e., we are passing the output, not the pre-output.
        return lossFunction.computeScoreArray(data, reconstruction, new ActivationIdentity(), null);
    }
}
Also used: INDArray(org.nd4j.linalg.api.ndarray.INDArray) ActivationIdentity(org.nd4j.linalg.activations.impl.ActivationIdentity) CompositeReconstructionDistribution(org.deeplearning4j.nn.conf.layers.variational.CompositeReconstructionDistribution) ILossFunction(org.nd4j.linalg.lossfunctions.ILossFunction) LossFunctionWrapper(org.deeplearning4j.nn.conf.layers.variational.LossFunctionWrapper)
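
A minimal usage sketch (not from the indexed sources, and hedged accordingly): a VAE configured with LossFunctionWrapper so that hasLossFunction() returns true and reconstructionError can be called. The layer sizes and the features array are hypothetical placeholders; the API assumed is the same 0.x-era deeplearning4j shown in the examples on this page.

// Sketch under assumed 0.x-era deeplearning4j API: a VAE with a standard loss
// function (LossFunctionWrapper), then scored per example via reconstructionError.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(0, new VariationalAutoencoder.Builder()
                .nIn(784).nOut(32) // hypothetical sizes
                .encoderLayerSizes(256).decoderLayerSizes(256)
                .reconstructionDistribution(new LossFunctionWrapper(new ActivationIdentity(), new LossMSE()))
                .build())
        .pretrain(true).backprop(false).build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();
INDArray features = Nd4j.rand(10, 784); // hypothetical input data
org.deeplearning4j.nn.layers.variational.VariationalAutoencoder vae =
        (org.deeplearning4j.nn.layers.variational.VariationalAutoencoder) net.getLayer(0);
if (vae.hasLossFunction()) {
    INDArray errors = vae.reconstructionError(features); // column vector, shape [10, 1]
}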

Example 2 with ActivationIdentity

Use of org.nd4j.linalg.activations.impl.ActivationIdentity in project deeplearning4j by deeplearning4j.

In the class VaeGradientCheckTests, method testVaePretrainReconstructionDistributions.

@Test
public void testVaePretrainReconstructionDistributions() {
    int inOutSize = 6;
    ReconstructionDistribution[] reconstructionDistributions = new ReconstructionDistribution[] {
            new GaussianReconstructionDistribution(Activation.IDENTITY),
            new GaussianReconstructionDistribution(Activation.TANH),
            new BernoulliReconstructionDistribution(Activation.SIGMOID),
            new CompositeReconstructionDistribution.Builder()
                    .addDistribution(2, new GaussianReconstructionDistribution(Activation.IDENTITY))
                    .addDistribution(2, new BernoulliReconstructionDistribution())
                    .addDistribution(2, new GaussianReconstructionDistribution(Activation.TANH))
                    .build(),
            new ExponentialReconstructionDistribution("identity"),
            new ExponentialReconstructionDistribution("tanh"),
            new LossFunctionWrapper(new ActivationTanH(), new LossMSE()),
            new LossFunctionWrapper(new ActivationIdentity(), new LossMAE()) };
    Nd4j.getRandom().setSeed(12345);
    for (int minibatch : new int[] { 1, 5 }) {
        for (int i = 0; i < reconstructionDistributions.length; i++) {
            INDArray data;
            switch(i) {
                case 0: //Gaussian + identity
                case 1: //Gaussian + tanh
                    data = Nd4j.rand(minibatch, inOutSize);
                    break;
                case 2: //Bernoulli
                    data = Nd4j.create(minibatch, inOutSize);
                    Nd4j.getExecutioner().exec(new BernoulliDistribution(data, 0.5), Nd4j.getRandom());
                    break;
                case 3: //Composite
                    data = Nd4j.create(minibatch, inOutSize);
                    data.get(NDArrayIndex.all(), NDArrayIndex.interval(0, 2)).assign(Nd4j.rand(minibatch, 2));
                    Nd4j.getExecutioner().exec(new BernoulliDistribution(data.get(NDArrayIndex.all(), NDArrayIndex.interval(2, 4)), 0.5), Nd4j.getRandom());
                    data.get(NDArrayIndex.all(), NDArrayIndex.interval(4, 6)).assign(Nd4j.rand(minibatch, 2));
                    break;
                case 4: //Exponential + identity
                case 5: //Exponential + tanh
                    data = Nd4j.rand(minibatch, inOutSize);
                    break;
                case 6: //LossFunctionWrapper + MSE
                case 7: //LossFunctionWrapper + MAE
                    data = Nd4j.randn(minibatch, inOutSize);
                    break;
                default:
                    throw new RuntimeException();
            }
            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .regularization(true).l2(0.2).l1(0.3)
                    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                    .learningRate(1.0).seed(12345L)
                    .weightInit(WeightInit.DISTRIBUTION).dist(new NormalDistribution(0, 1))
                    .list()
                    .layer(0, new VariationalAutoencoder.Builder()
                            .nIn(inOutSize).nOut(3)
                            .encoderLayerSizes(5).decoderLayerSizes(6)
                            .pzxActivationFunction(Activation.TANH)
                            .reconstructionDistribution(reconstructionDistributions[i])
                            .activation(Activation.TANH).updater(Updater.SGD).build())
                    .pretrain(true).backprop(false).build();
            MultiLayerNetwork mln = new MultiLayerNetwork(conf);
            mln.init();
            mln.initGradientsView();
            org.deeplearning4j.nn.api.Layer layer = mln.getLayer(0);
            String msg = "testVaePretrainReconstructionDistributions() - " + reconstructionDistributions[i];
            if (PRINT_RESULTS) {
                System.out.println(msg);
                for (int j = 0; j < mln.getnLayers(); j++) System.out.println("Layer " + j + " # params: " + mln.getLayer(j).numParams());
            }
            boolean gradOK = GradientCheckUtil.checkGradientsPretrainLayer(layer, DEFAULT_EPS, DEFAULT_MAX_REL_ERROR, DEFAULT_MIN_ABS_ERROR, PRINT_RESULTS, RETURN_ON_FIRST_FAILURE, data, 12345);
            assertTrue(msg, gradOK);
        }
    }
}
Also used: LossMAE(org.nd4j.linalg.lossfunctions.impl.LossMAE) MultiLayerConfiguration(org.deeplearning4j.nn.conf.MultiLayerConfiguration) ActivationIdentity(org.nd4j.linalg.activations.impl.ActivationIdentity) org.deeplearning4j.nn.api(org.deeplearning4j.nn.api) MultiLayerNetwork(org.deeplearning4j.nn.multilayer.MultiLayerNetwork) NeuralNetConfiguration(org.deeplearning4j.nn.conf.NeuralNetConfiguration) LossMSE(org.nd4j.linalg.lossfunctions.impl.LossMSE) INDArray(org.nd4j.linalg.api.ndarray.INDArray) BernoulliDistribution(org.nd4j.linalg.api.ops.random.impl.BernoulliDistribution) NormalDistribution(org.deeplearning4j.nn.conf.distribution.NormalDistribution) ActivationTanH(org.nd4j.linalg.activations.impl.ActivationTanH) Test(org.junit.Test)
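
One detail in the test above worth isolating: in the composite case (case 3), each addDistribution(n, d) call claims the next n columns of the input, so the block widths must sum to the data width. A minimal sketch of that column-to-distribution mapping, built only from classes already used in the test:

// Columns 0-1: real-valued (Gaussian + identity activation)
// Columns 2-3: binary (Bernoulli)
// Columns 4-5: bounded real-valued (Gaussian + tanh activation)
ReconstructionDistribution composite = new CompositeReconstructionDistribution.Builder()
        .addDistribution(2, new GaussianReconstructionDistribution(Activation.IDENTITY))
        .addDistribution(2, new BernoulliReconstructionDistribution())
        .addDistribution(2, new GaussianReconstructionDistribution(Activation.TANH))
        .build(); // 2 + 2 + 2 = 6 columns, matching inOutSize above

This mirrors how the test fills data: uniform random values for the Gaussian blocks and Bernoulli samples for the binary block.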

Example 3 with ActivationIdentity

Use of org.nd4j.linalg.activations.impl.ActivationIdentity in project deeplearning4j by deeplearning4j.

In the class RegressionTest071, method regressionTestMLP2.

@Test
public void regressionTestMLP2() throws Exception {
    File f = new ClassPathResource("regression_testing/071/071_ModelSerializer_Regression_MLP_2.zip").getTempFileFromArchive();
    MultiLayerNetwork net = ModelSerializer.restoreMultiLayerNetwork(f, true);
    MultiLayerConfiguration conf = net.getLayerWiseConfigurations();
    assertEquals(2, conf.getConfs().size());
    assertTrue(conf.isBackprop());
    assertFalse(conf.isPretrain());
    DenseLayer l0 = (DenseLayer) conf.getConf(0).getLayer();
    assertTrue(l0.getActivationFn() instanceof ActivationLReLU);
    assertEquals(3, l0.getNIn());
    assertEquals(4, l0.getNOut());
    assertEquals(WeightInit.DISTRIBUTION, l0.getWeightInit());
    assertEquals(new NormalDistribution(0.1, 1.2), l0.getDist());
    assertEquals(Updater.RMSPROP, l0.getUpdater());
    assertEquals(0.96, l0.getRmsDecay(), 1e-6);
    assertEquals(0.15, l0.getLearningRate(), 1e-6);
    assertEquals(0.6, l0.getDropOut(), 1e-6);
    assertEquals(0.1, l0.getL1(), 1e-6);
    assertEquals(0.2, l0.getL2(), 1e-6);
    assertEquals(GradientNormalization.ClipElementWiseAbsoluteValue, l0.getGradientNormalization());
    assertEquals(1.5, l0.getGradientNormalizationThreshold(), 1e-5);
    OutputLayer l1 = (OutputLayer) conf.getConf(1).getLayer();
    assertTrue(l1.getActivationFn() instanceof ActivationIdentity);
    assertEquals(LossFunctions.LossFunction.MSE, l1.getLossFunction());
    assertTrue(l1.getLossFn() instanceof LossMSE);
    assertEquals(4, l1.getNIn());
    assertEquals(5, l1.getNOut());
    assertEquals(WeightInit.DISTRIBUTION, l1.getWeightInit());
    assertEquals(new NormalDistribution(0.1, 1.2), l1.getDist());
    assertEquals(Updater.RMSPROP, l1.getUpdater());
    assertEquals(0.96, l1.getRmsDecay(), 1e-6);
    assertEquals(0.15, l1.getLearningRate(), 1e-6);
    assertEquals(0.6, l1.getDropOut(), 1e-6);
    assertEquals(0.1, l1.getL1(), 1e-6);
    assertEquals(0.2, l1.getL2(), 1e-6);
    assertEquals(GradientNormalization.ClipElementWiseAbsoluteValue, l1.getGradientNormalization());
    assertEquals(1.5, l1.getGradientNormalizationThreshold(), 1e-5);
    int numParams = net.numParams();
    assertEquals(Nd4j.linspace(1, numParams, numParams), net.params());
    int updaterSize = net.getUpdater().stateSizeForLayer(net);
    assertEquals(Nd4j.linspace(1, updaterSize, updaterSize), net.getUpdater().getStateViewArray());
}
Also used: LossMSE(org.nd4j.linalg.lossfunctions.impl.LossMSE) ActivationLReLU(org.nd4j.linalg.activations.impl.ActivationLReLU) NormalDistribution(org.deeplearning4j.nn.conf.distribution.NormalDistribution) ActivationIdentity(org.nd4j.linalg.activations.impl.ActivationIdentity) File(java.io.File) MultiLayerNetwork(org.deeplearning4j.nn.multilayer.MultiLayerNetwork) ClassPathResource(org.nd4j.linalg.io.ClassPathResource) Test(org.junit.Test)
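
For context, the archive restored above was written by the matching serializer call. A minimal round-trip sketch (hedged: net is assumed to be an already-initialized MultiLayerNetwork, the file path is hypothetical, and the boolean flag controls whether updater state is saved and loaded):

// Hedged sketch: save a network including its updater state, then restore it
// the same way the regression test restores the archived 0.7.1 model.
File tmp = new File("mlp2-roundtrip.zip"); // hypothetical path
ModelSerializer.writeModel(net, tmp, true); // true => include updater state
MultiLayerNetwork restored = ModelSerializer.restoreMultiLayerNetwork(tmp, true);
assertEquals(net.params(), restored.params()); // parameters survive the round trip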

Aggregations

ActivationIdentity (org.nd4j.linalg.activations.impl.ActivationIdentity): 3 uses
NormalDistribution (org.deeplearning4j.nn.conf.distribution.NormalDistribution): 2 uses
MultiLayerNetwork (org.deeplearning4j.nn.multilayer.MultiLayerNetwork): 2 uses
Test (org.junit.Test): 2 uses
INDArray (org.nd4j.linalg.api.ndarray.INDArray): 2 uses
LossMSE (org.nd4j.linalg.lossfunctions.impl.LossMSE): 2 uses
File (java.io.File): 1 use
org.deeplearning4j.nn.api (org.deeplearning4j.nn.api): 1 use
MultiLayerConfiguration (org.deeplearning4j.nn.conf.MultiLayerConfiguration): 1 use
NeuralNetConfiguration (org.deeplearning4j.nn.conf.NeuralNetConfiguration): 1 use
CompositeReconstructionDistribution (org.deeplearning4j.nn.conf.layers.variational.CompositeReconstructionDistribution): 1 use
LossFunctionWrapper (org.deeplearning4j.nn.conf.layers.variational.LossFunctionWrapper): 1 use
ActivationLReLU (org.nd4j.linalg.activations.impl.ActivationLReLU): 1 use
ActivationTanH (org.nd4j.linalg.activations.impl.ActivationTanH): 1 use
BernoulliDistribution (org.nd4j.linalg.api.ops.random.impl.BernoulliDistribution): 1 use
ClassPathResource (org.nd4j.linalg.io.ClassPathResource): 1 use
ILossFunction (org.nd4j.linalg.lossfunctions.ILossFunction): 1 use
LossMAE (org.nd4j.linalg.lossfunctions.impl.LossMAE): 1 use