Search in sources :

Example 1 with SVD

use of org.broadinstitute.hellbender.utils.svd.SVD in project gatk by broadinstitute.

the class HDF5PCACoveragePoNCreationUtils method calculateReducedPanelAndPInverses.

/**
     * SVD and Pseudo inverse calculation.
     *
     * @param logNormalized the input counts for the SVD and reduction steps, fully normalized and already logged.
     * @param requestedNumberOfEigensamples user requested number of eigensamples for the reduced panel.
     * @return never {@code null}.
     */
@VisibleForTesting
static ReductionResult calculateReducedPanelAndPInverses(final ReadCountCollection logNormalized, final OptionalInt requestedNumberOfEigensamples, final Logger logger, final JavaSparkContext ctx) {
    if (ctx == null) {
        logger.warn("No Spark context provided, not going to use Spark...");
    }
    final RealMatrix logNormalizedCounts = logNormalized.counts();
    final int numberOfCountColumns = logNormalizedCounts.getColumnDimension();
    logger.info("Starting the SVD decomposition of the log-normalized counts ...");
    final long svdStartTime = System.currentTimeMillis();
    final SVD logNormalizedSVD = SVDFactory.createSVD(logNormalized.counts(), ctx);
    final long svdEndTime = System.currentTimeMillis();
    logger.info(String.format("Finished the SVD decomposition of the log-normal counts. Elapse of %d seconds", (svdEndTime - svdStartTime) / 1000));
    final int numberOfEigensamples = determineNumberOfEigensamples(requestedNumberOfEigensamples, numberOfCountColumns, logNormalizedSVD, logger);
    logger.info(String.format("Including %d eigensamples in the reduced PoN", numberOfEigensamples));
    final double[] singularValues = logNormalizedSVD.getSingularValues();
    final RealMatrix reducedCounts = logNormalizedSVD.getU().getSubMatrix(0, logNormalizedCounts.getRowDimension() - 1, 0, numberOfEigensamples - 1).copy();
    reducedCounts.walkInOptimizedOrder(new DefaultRealMatrixChangingVisitor() {

        @Override
        public double visit(final int row, final int column, final double value) {
            return singularValues[column] * value;
        }
    });
    // The Pseudo inverse comes nearly for free from having run the SVD decomposition.
    final RealMatrix logNormalizedPseudoInverse = logNormalizedSVD.getPinv();
    logger.info("Calculating the reduced PoN inverse matrix...");
    final long riStartTime = System.currentTimeMillis();
    final RealMatrix reducedCountsPseudoInverse = SVDFactory.createSVD(reducedCounts, ctx).getPinv();
    final long riEndTime = System.currentTimeMillis();
    logger.info(String.format("Finished calculating the reduced PoN inverse matrix. Elapse of %d seconds", (riEndTime - riStartTime) / 1000));
    return new ReductionResult(logNormalizedPseudoInverse, reducedCounts, reducedCountsPseudoInverse, logNormalizedSVD.getSingularValues());
}
Also used : SVD(org.broadinstitute.hellbender.utils.svd.SVD) RealMatrix(org.apache.commons.math3.linear.RealMatrix) DefaultRealMatrixChangingVisitor(org.apache.commons.math3.linear.DefaultRealMatrixChangingVisitor) VisibleForTesting(com.google.common.annotations.VisibleForTesting)

Example 2 with SVD

use of org.broadinstitute.hellbender.utils.svd.SVD in project gatk-protected by broadinstitute.

the class HDF5PCACoveragePoNCreationUtils method calculateReducedPanelAndPInverses.

/**
     * SVD and Pseudo inverse calculation.
     *
     * @param logNormalized the input counts for the SVD and reduction steps, fully normalized and already logged.
     * @param requestedNumberOfEigensamples user requested number of eigensamples for the reduced panel.
     * @return never {@code null}.
     */
@VisibleForTesting
static ReductionResult calculateReducedPanelAndPInverses(final ReadCountCollection logNormalized, final OptionalInt requestedNumberOfEigensamples, final Logger logger, final JavaSparkContext ctx) {
    if (ctx == null) {
        logger.warn("No Spark context provided, not going to use Spark...");
    }
    final RealMatrix logNormalizedCounts = logNormalized.counts();
    final int numberOfCountColumns = logNormalizedCounts.getColumnDimension();
    logger.info("Starting the SVD decomposition of the log-normalized counts ...");
    final long svdStartTime = System.currentTimeMillis();
    final SVD logNormalizedSVD = SVDFactory.createSVD(logNormalized.counts(), ctx);
    final long svdEndTime = System.currentTimeMillis();
    logger.info(String.format("Finished the SVD decomposition of the log-normal counts. Elapse of %d seconds", (svdEndTime - svdStartTime) / 1000));
    final int numberOfEigensamples = determineNumberOfEigensamples(requestedNumberOfEigensamples, numberOfCountColumns, logNormalizedSVD, logger);
    logger.info(String.format("Including %d eigensamples in the reduced PoN", numberOfEigensamples));
    final double[] singularValues = logNormalizedSVD.getSingularValues();
    final RealMatrix reducedCounts = logNormalizedSVD.getU().getSubMatrix(0, logNormalizedCounts.getRowDimension() - 1, 0, numberOfEigensamples - 1).copy();
    reducedCounts.walkInOptimizedOrder(new DefaultRealMatrixChangingVisitor() {

        @Override
        public double visit(final int row, final int column, final double value) {
            return singularValues[column] * value;
        }
    });
    // The Pseudo inverse comes nearly for free from having run the SVD decomposition.
    final RealMatrix logNormalizedPseudoInverse = logNormalizedSVD.getPinv();
    logger.info("Calculating the reduced PoN inverse matrix...");
    final long riStartTime = System.currentTimeMillis();
    final RealMatrix reducedCountsPseudoInverse = SVDFactory.createSVD(reducedCounts, ctx).getPinv();
    final long riEndTime = System.currentTimeMillis();
    logger.info(String.format("Finished calculating the reduced PoN inverse matrix. Elapse of %d seconds", (riEndTime - riStartTime) / 1000));
    return new ReductionResult(logNormalizedPseudoInverse, reducedCounts, reducedCountsPseudoInverse, logNormalizedSVD.getSingularValues());
}
Also used : SVD(org.broadinstitute.hellbender.utils.svd.SVD) RealMatrix(org.apache.commons.math3.linear.RealMatrix) DefaultRealMatrixChangingVisitor(org.apache.commons.math3.linear.DefaultRealMatrixChangingVisitor) VisibleForTesting(com.google.common.annotations.VisibleForTesting)

Example 3 with SVD

use of org.broadinstitute.hellbender.utils.svd.SVD in project gatk by broadinstitute.

the class DecomposeSingularValuesIntegrationTest method assertSVDValues.

private void assertSVDValues(final File outputFileV, final File outputFileS, final File outputFileU) {
    try {
        final ReadCountCollection rcc = ReadCountCollectionUtils.parse(CONTROL_PCOV_FULL_FILE);
        final SVD svd = SVDFactory.createSVD(rcc.counts());
        final RealMatrix sDiag = new DiagonalMatrix(svd.getSingularValues());
        assertOutputFileValues(outputFileU, svd.getU());
        assertOutputFileValues(outputFileS, sDiag);
        assertOutputFileValues(outputFileV, svd.getV());
        assertUnitaryMatrix(svd.getV());
        assertUnitaryMatrix(svd.getU());
        Assert.assertTrue(MatrixUtils.isSymmetric(sDiag, 1e-32));
    } catch (final IOException ioe) {
        Assert.fail("Could not open test file: " + CONTROL_PCOV_FULL_FILE, ioe);
    }
}
Also used : SVD(org.broadinstitute.hellbender.utils.svd.SVD) RealMatrix(org.apache.commons.math3.linear.RealMatrix) DiagonalMatrix(org.apache.commons.math3.linear.DiagonalMatrix) IOException(java.io.IOException)

Example 4 with SVD

use of org.broadinstitute.hellbender.utils.svd.SVD in project gatk by broadinstitute.

the class DecomposeSingularValues method runPipeline.

@Override
protected void runPipeline(final JavaSparkContext ctx) {
    try {
        final ReadCountCollection rcc = ReadCountCollectionUtils.parse(inputFile);
        final SVD svd = SVDFactory.createSVD(rcc.counts(), ctx);
        writeMatrix(svd.getV(), outputFileV);
        writeMatrix(svd.getU(), outputFileU);
        writeMatrix(new DiagonalMatrix(svd.getSingularValues()), outputFileS);
    } catch (final IOException ioe) {
        throw new UserException.CouldNotReadInputFile(inputFile, ioe.getMessage());
    }
}
Also used : SVD(org.broadinstitute.hellbender.utils.svd.SVD) DiagonalMatrix(org.apache.commons.math3.linear.DiagonalMatrix) IOException(java.io.IOException) UserException(org.broadinstitute.hellbender.exceptions.UserException)

Example 5 with SVD

use of org.broadinstitute.hellbender.utils.svd.SVD in project gatk by broadinstitute.

the class HDF5PCACoveragePoNCreationUtilsUnitTest method testDetermineNumberOfEigensamplesSpark.

@Test(dataProvider = "singleEigensample")
public void testDetermineNumberOfEigensamplesSpark(final ReadCountCollection logNormals) {
    final JavaSparkContext ctx = SparkContextFactory.getTestSparkContext();
    final SVD logNormalsSVD = SVDFactory.createSVD(logNormals.counts(), ctx);
    final int actualNumber = HDF5PCACoveragePoNCreationUtils.determineNumberOfEigensamples(OptionalInt.empty(), logNormals.columnNames().size(), logNormalsSVD, NULL_LOGGER);
    Assert.assertEquals(actualNumber, 1);
}
Also used : SVD(org.broadinstitute.hellbender.utils.svd.SVD) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) BaseTest(org.broadinstitute.hellbender.utils.test.BaseTest) Test(org.testng.annotations.Test)

Aggregations

SVD (org.broadinstitute.hellbender.utils.svd.SVD)10 IOException (java.io.IOException)4 DiagonalMatrix (org.apache.commons.math3.linear.DiagonalMatrix)4 RealMatrix (org.apache.commons.math3.linear.RealMatrix)4 BaseTest (org.broadinstitute.hellbender.utils.test.BaseTest)4 Test (org.testng.annotations.Test)4 VisibleForTesting (com.google.common.annotations.VisibleForTesting)2 DefaultRealMatrixChangingVisitor (org.apache.commons.math3.linear.DefaultRealMatrixChangingVisitor)2 JavaSparkContext (org.apache.spark.api.java.JavaSparkContext)2 UserException (org.broadinstitute.hellbender.exceptions.UserException)2