Example 1 with HDF5File

Use of org.broadinstitute.hdf5.HDF5File in project gatk by broadinstitute.

In class CreatePanelOfNormals, method writeTargetWeightsFile.

/**
 * Read target variances from an HDF5 PoN file and write the corresponding target weights
 * to a file that can be read in by R CBS.
 * @param ponFile       never {@code null}, HDF5 PoN file
 * @param outputFile    never {@code null}, output file
 */
private static void writeTargetWeightsFile(final File ponFile, final File outputFile) {
    IOUtils.canReadFile(ponFile);
    try (final HDF5File file = new HDF5File(ponFile, HDF5File.OpenMode.READ_ONLY)) {
        final HDF5PCACoveragePoN pon = new HDF5PCACoveragePoN(file);
        final double[] targetWeights = DoubleStream.of(pon.getTargetVariances()).map(v -> 1 / v).toArray();
        ParamUtils.writeValuesToFile(targetWeights, outputFile);
    }
}
Also used : DocumentedFeature(org.broadinstitute.barclay.help.DocumentedFeature) CommandLineProgramProperties(org.broadinstitute.barclay.argparser.CommandLineProgramProperties) IOUtils(org.broadinstitute.hellbender.utils.io.IOUtils) CopyNumberProgramGroup(org.broadinstitute.hellbender.cmdline.programgroups.CopyNumberProgramGroup) Argument(org.broadinstitute.barclay.argparser.Argument) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) HDF5PCACoveragePoNCreationUtils(org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoNCreationUtils) StandardArgumentDefinitions(org.broadinstitute.hellbender.cmdline.StandardArgumentDefinitions) ArgumentCollection(org.broadinstitute.barclay.argparser.ArgumentCollection) OptionalInt(java.util.OptionalInt) ParamUtils(org.broadinstitute.hellbender.utils.param.ParamUtils) File(java.io.File) ArrayList(java.util.ArrayList) DoubleStream(java.util.stream.DoubleStream) HDF5PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoN) List(java.util.List) PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.PCACoveragePoN) UserException(org.broadinstitute.hellbender.exceptions.UserException) CoveragePoNQCUtils(org.broadinstitute.hellbender.tools.pon.coverage.CoveragePoNQCUtils) Utils(org.broadinstitute.hellbender.utils.Utils) HDF5File(org.broadinstitute.hdf5.HDF5File) SparkToggleCommandLineProgram(org.broadinstitute.hellbender.utils.SparkToggleCommandLineProgram) HDF5Library(org.broadinstitute.hdf5.HDF5Library)
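
The weight computation in Example 1 is an inverse-variance transform over a double array. The following is a minimal standalone sketch of that step only, assuming the variances are already in memory; the class name InverseVarianceWeights and the method toWeights are hypothetical and not part of GATK. Unlike the original snippet, it rejects non-positive variances instead of producing Infinity or negative weights; that check is an added assumption, not something the source does.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

// Hypothetical helper, not part of GATK: maps variances to weights 1/v.
public class InverseVarianceWeights {

    static double[] toWeights(final double[] variances) {
        return DoubleStream.of(variances)
                .map(v -> {
                    // Added assumption: a variance must be strictly positive (also rejects NaN).
                    if (!(v > 0)) {
                        throw new IllegalArgumentException("variance must be positive: " + v);
                    }
                    return 1.0 / v;
                })
                .toArray();
    }

    public static void main(final String[] args) {
        final double[] weights = toWeights(new double[] {0.5, 2.0, 4.0});
        System.out.println(Arrays.toString(weights)); // prints [2.0, 0.5, 0.25]
    }
}
```

Writing 1.0 / v rather than 1 / v makes no difference for a double operand, but it makes the floating-point division explicit to the reader.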

Example 2 with HDF5File

Use of org.broadinstitute.hdf5.HDF5File in project gatk by broadinstitute.

In class NormalizeSomaticReadCounts, method doWork.

@Override
protected Object doWork() {
    // Note: passing null to load() means using the default temp dir.
    if (!new HDF5Library().load(null)) {
        throw new UserException.HardwareFeatureException("Cannot load the required HDF5 library. HDF5 is currently supported on x86-64 architecture and Linux or OSX systems.");
    }
    IOUtils.canReadFile(ponFile);
    try (final HDF5File hdf5PoNFile = new HDF5File(ponFile)) {
        final PCACoveragePoN pon = new HDF5PCACoveragePoN(hdf5PoNFile, logger);
        final TargetCollection<Target> targetCollection = readTargetCollection(targetFile);
        final ReadCountCollection proportionalCoverageProfile = readInputReadCounts(readCountsFile, targetCollection);
        final PCATangentNormalizationResult tangentNormalizationResult = pon.normalize(proportionalCoverageProfile);
        tangentNormalizationResult.write(getCommandLine(), tangentNormalizationOutFile, preTangentNormalizationOutFile, betaHatsOutFile, fntOutFile);
        return "SUCCESS";
    }
}
Also used : HDF5PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoN) HDF5Library(org.broadinstitute.hdf5.HDF5Library) PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.PCACoveragePoN) PCATangentNormalizationResult(org.broadinstitute.hellbender.tools.pon.coverage.pca.PCATangentNormalizationResult) HDF5File(org.broadinstitute.hdf5.HDF5File)

Example 3 with HDF5File

Use of org.broadinstitute.hdf5.HDF5File in project gatk-protected by broadinstitute.

In class CreatePanelOfNormalsIntegrationTest, method assertRamPoNDuplicate.

private void assertRamPoNDuplicate(final File outputFile) {
    try (final HDF5File hdf5FilePoN = new HDF5File(outputFile)) {
        final HDF5PCACoveragePoN filePoN = new HDF5PCACoveragePoN(hdf5FilePoN);
        assertRamPoNDuplicate(filePoN);
    }
}
Also used : HDF5PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoN) HDF5File(org.broadinstitute.hdf5.HDF5File)

Example 4 with HDF5File

Use of org.broadinstitute.hdf5.HDF5File in project gatk-protected by broadinstitute.

In class NormalizeSomaticReadCountsIntegrationTest, method assertBetaHats.

/**
 * Asserts that a collection of beta-hats corresponds to the expected value given
 * the input pre-tangent normalization matrix and the PoN file.
 */
private void assertBetaHats(final ReadCountCollection preTangentNormalized, final RealMatrix actual, final File ponFile) {
    Assert.assertEquals(actual.getColumnDimension(), preTangentNormalized.columnNames().size());
    final double epsilon = PCATangentNormalizationUtils.EPSILON;
    try (final HDF5File ponReader = new HDF5File(ponFile)) {
        final PCACoveragePoN pon = new HDF5PCACoveragePoN(ponReader);
        final List<String> ponTargets = pon.getPanelTargetNames();
        final RealMatrix inCounts = reorderTargetsToPoNOrder(preTangentNormalized, ponTargets);
        // Obtain the subset of relevant targets used to calculate the beta-hats.
        final int[][] betaHatTargets = new int[inCounts.getColumnDimension()][];
        for (int i = 0; i < inCounts.getColumnDimension(); i++) {
            final List<Integer> relevantTargets = new ArrayList<>();
            for (int j = 0; j < inCounts.getRowDimension(); j++) {
                if (inCounts.getEntry(j, i) > 1 + (Math.log(epsilon) / Math.log(2))) {
                    relevantTargets.add(j);
                }
            }
            betaHatTargets[i] = relevantTargets.stream().mapToInt(Integer::intValue).toArray();
        }
        // calculate beta-hats per column and check with actual values.
        final RealMatrix normalsInv = pon.getReducedPanelPInverseCounts();
        Assert.assertEquals(actual.getRowDimension(), normalsInv.getRowDimension());
        final RealMatrix normalsInvT = normalsInv.transpose();
        for (int i = 0; i < inCounts.getColumnDimension(); i++) {
            final RealMatrix inValues = inCounts.getColumnMatrix(i).transpose().getSubMatrix(new int[] { 0 }, betaHatTargets[i]);
            final RealMatrix normalValues = normalsInvT.getSubMatrix(betaHatTargets[i], IntStream.range(0, normalsInvT.getColumnDimension()).toArray());
            final RealMatrix betaHats = inValues.multiply(normalValues);
            for (int j = 0; j < actual.getRowDimension(); j++) {
                Assert.assertEquals(actual.getEntry(j, i), betaHats.getEntry(0, j), 0.000001, "Col " + i + " row " + j);
            }
        }
    }
}
Also used : HDF5PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoN) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) RealMatrix(org.apache.commons.math3.linear.RealMatrix) PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.PCACoveragePoN) HDF5File(org.broadinstitute.hdf5.HDF5File)
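
The target-selection loop in Example 4 keeps, for each sample column, the rows whose value exceeds 1 + log2(epsilon). A minimal sketch of that filter on a plain double[][] (rows are targets, columns are samples) might look like the following. The class RelevantTargets and its methods are hypothetical, and the epsilon of 1e-9 in main is an illustrative value, not necessarily the value of PCATangentNormalizationUtils.EPSILON.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical helper, not part of GATK: selects rows above the log2-epsilon threshold.
public class RelevantTargets {

    // Same expression as in the test above: 1 + log2(epsilon).
    static double threshold(final double epsilon) {
        return 1 + Math.log(epsilon) / Math.log(2);
    }

    // Returns the indices of rows whose entry in the given column exceeds the threshold.
    static int[] relevantRows(final double[][] counts, final int column, final double epsilon) {
        final double t = threshold(epsilon);
        return IntStream.range(0, counts.length)
                .filter(row -> counts[row][column] > t)
                .toArray();
    }

    public static void main(final String[] args) {
        final double[][] counts = {
                {0.1, -40.0},
                {-35.0, 0.3},
                {-1.2, -0.7}
        };
        final double epsilon = 1e-9; // illustrative value only; threshold is about -28.9
        System.out.println(Arrays.toString(relevantRows(counts, 0, epsilon))); // prints [0, 2]
        System.out.println(Arrays.toString(relevantRows(counts, 1, epsilon))); // prints [1, 2]
    }
}
```

Using an IntStream filter avoids the intermediate List<Integer> and the unboxing step of the original loop, at the cost of recomputing nothing: the threshold is evaluated once per column rather than once per entry.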

Example 5 with HDF5File

Use of org.broadinstitute.hdf5.HDF5File in project gatk-protected by broadinstitute.

In class NormalizeSomaticReadCountsIntegrationTest, method assertTangentNormalized.

private void assertTangentNormalized(final ReadCountCollection actualReadCounts, final ReadCountCollection preTangentNormalized, final RealMatrix betaHats, final File ponFile) {
    try (final HDF5File ponReader = new HDF5File(ponFile)) {
        final PCACoveragePoN pon = new HDF5PCACoveragePoN(ponReader);
        final RealMatrix inCounts = reorderTargetsToPoNOrder(preTangentNormalized, pon.getPanelTargetNames());
        final RealMatrix actual = reorderTargetsToPoNOrder(actualReadCounts, pon.getPanelTargetNames());
        final RealMatrix ponMat = pon.getReducedPanelCounts();
        final RealMatrix projection = ponMat.multiply(betaHats);
        final RealMatrix expected = inCounts.subtract(projection);
        Assert.assertEquals(actual.getRowDimension(), expected.getRowDimension());
        Assert.assertEquals(actual.getColumnDimension(), expected.getColumnDimension());
        for (int i = 0; i < actual.getRowDimension(); i++) {
            Assert.assertEquals(actual.getRow(i), expected.getRow(i));
        }
    }
}
Also used : HDF5PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoN) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) RealMatrix(org.apache.commons.math3.linear.RealMatrix) PCACoveragePoN(org.broadinstitute.hellbender.tools.pon.coverage.pca.PCACoveragePoN) HDF5File(org.broadinstitute.hdf5.HDF5File)
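
The expected value checked in Example 5 is the input counts minus the projection of the beta-hats through the reduced panel, that is, inCounts - ponMat * betaHats. The sketch below reproduces that arithmetic on plain arrays so it runs without Commons Math or an HDF5 file. The class TangentResidual and its methods are hypothetical, and the small matrices in main are made-up illustrative data (3 targets, 2 panel components, 1 sample).

```java
import java.util.Arrays;

// Hypothetical helper, not part of GATK: computes in - pon * beta on plain arrays.
public class TangentResidual {

    // Standard matrix product; a is n x k, b is k x m.
    static double[][] multiply(final double[][] a, final double[][] b) {
        if (a[0].length != b.length) {
            throw new IllegalArgumentException("inner dimensions differ");
        }
        final double[][] out = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[0].length; j++) {
                double sum = 0.0;
                for (int k = 0; k < b.length; k++) {
                    sum += a[i][k] * b[k][j];
                }
                out[i][j] = sum;
            }
        }
        return out;
    }

    // Entry-wise difference of two matrices with identical dimensions.
    static double[][] subtract(final double[][] a, final double[][] b) {
        if (a.length != b.length || a[0].length != b[0].length) {
            throw new IllegalArgumentException("dimensions differ");
        }
        final double[][] out = new double[a.length][a[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[0].length; j++) {
                out[i][j] = a[i][j] - b[i][j];
            }
        }
        return out;
    }

    // in: targets x samples, pon: targets x components, beta: components x samples.
    static double[][] residual(final double[][] in, final double[][] pon, final double[][] beta) {
        return subtract(in, multiply(pon, beta));
    }

    public static void main(final String[] args) {
        final double[][] pon = {{1, 0}, {0, 1}, {1, 1}};
        final double[][] beta = {{2}, {3}};
        final double[][] in = {{2.5}, {3.0}, {5.0}};
        System.out.println(Arrays.deepToString(residual(in, pon, beta))); // prints [[0.5], [0.0], [0.0]]
    }
}
```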

Aggregations

HDF5File (org.broadinstitute.hdf5.HDF5File): 82
Test (org.testng.annotations.Test): 58
BaseTest (org.broadinstitute.hellbender.utils.test.BaseTest): 56
File (java.io.File): 32
Array2DRowRealMatrix (org.apache.commons.math3.linear.Array2DRowRealMatrix): 24
RealMatrix (org.apache.commons.math3.linear.RealMatrix): 24
HDF5PCACoveragePoN (org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoN): 20
BeforeTest (org.testng.annotations.BeforeTest): 20
PCACoveragePoN (org.broadinstitute.hellbender.tools.pon.coverage.pca.PCACoveragePoN): 16
JavaSparkContext (org.apache.spark.api.java.JavaSparkContext): 10
HDF5Library (org.broadinstitute.hdf5.HDF5Library): 6
UserException (org.broadinstitute.hellbender.exceptions.UserException): 6
ArrayList (java.util.ArrayList): 4
List (java.util.List): 4
OptionalInt (java.util.OptionalInt): 4
StandardArgumentDefinitions (org.broadinstitute.hellbender.cmdline.StandardArgumentDefinitions): 4
IOException (java.io.IOException): 2
UncheckedIOException (java.io.UncheckedIOException): 2
DoubleStream (java.util.stream.DoubleStream): 2
IntStream (java.util.stream.IntStream): 2