Search in sources :

Example 81 with Array2DRowRealMatrix

use of org.apache.commons.math3.linear.Array2DRowRealMatrix in project gatk by broadinstitute.

the class SNPSegmenter method writeSegmentFile.

/**
     * Write segment file based on maximum-likelihood estimates of the minor allele fraction at SNP sites,
     * assuming the specified allelic bias.  These estimates are converted to target coverages,
     * which are written to a temporary file and then passed to {@link RCBSSegmenter}.
     * @param snps                  TargetCollection of allelic counts at SNP sites
     * @param sampleName            sample name
     * @param outputFile            segment file to write to and return
     * @param allelicBias           allelic bias to use in estimate of minor allele fraction
     */
public static void writeSegmentFile(final TargetCollection<AllelicCount> snps, final String sampleName, final File outputFile, final double allelicBias) {
    Utils.validateArg(snps.totalSize() > 0, "Must have a positive number of SNPs to perform SNP segmentation.");
    try {
        final File targetsFromSNPCountsFile = File.createTempFile("targets-from-snps", ".tsv");
        final List<Target> targets = snps.targets().stream().map(ac -> new Target(name(ac), ac.getInterval())).collect(Collectors.toList());
        final RealMatrix minorAlleleFractions = new Array2DRowRealMatrix(snps.targetCount(), 1);
        minorAlleleFractions.setColumn(0, snps.targets().stream().mapToDouble(ac -> ac.estimateMinorAlleleFraction(allelicBias)).toArray());
        ReadCountCollectionUtils.write(targetsFromSNPCountsFile, new ReadCountCollection(targets, Collections.singletonList(sampleName), minorAlleleFractions));
        //segment SNPs based on observed log_2 minor allele fraction (log_2 is applied in CBS.R)
        RCBSSegmenter.writeSegmentFile(sampleName, targetsFromSNPCountsFile.getAbsolutePath(), outputFile.getAbsolutePath(), false);
    } catch (final IOException e) {
        throw new UserException.CouldNotCreateOutputFile("Could not create temporary output file during " + "SNP segmentation.", e);
    }
}
Also used : Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) List(java.util.List) UserException(org.broadinstitute.hellbender.exceptions.UserException) RCBSSegmenter(org.broadinstitute.hellbender.utils.segmenter.RCBSSegmenter) AllelicCount(org.broadinstitute.hellbender.tools.exome.alleliccount.AllelicCount) Utils(org.broadinstitute.hellbender.utils.Utils) RealMatrix(org.apache.commons.math3.linear.RealMatrix) IOException(java.io.IOException) Collections(java.util.Collections) Collectors(java.util.stream.Collectors) File(java.io.File) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) RealMatrix(org.apache.commons.math3.linear.RealMatrix) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) IOException(java.io.IOException) UserException(org.broadinstitute.hellbender.exceptions.UserException) File(java.io.File)

Example 82 with Array2DRowRealMatrix

use of org.apache.commons.math3.linear.Array2DRowRealMatrix in project gatk by broadinstitute.

the class ReadCountCollection method subsetTargets.

/**
     * Subsets the targets in the read-count collection.
     * <p>
     *     Creates  brand-new read-count collection. Changes in the new read-count collection
     *     counts won't affect the this read-count collection and vice-versa.
     * </p>
     *
     * @param targetsToKeep the new target subset.
     * @return never {@code null}. The order of targets in the result is guaranteed to
     *  follow the original order of targets. The order of count columns is guaranteed to
     *  follow the original order of count columns.
     * @throws IllegalArgumentException if {@code targetsToKeep}:
     * <ul>
     *     <li>is {@code null},</li>
     *     <li>contains {@code null}s</li>
     *     <li>or contains targets that are not part of the read-count collection</li>
     * </ul>
     */
public ReadCountCollection subsetTargets(final Set<Target> targetsToKeep) {
    Utils.nonNull(targetsToKeep, "the input target set cannot be null");
    Utils.nonEmpty(targetsToKeep, "the input target subset size must be greater than 0");
    if (!new HashSet<>(targets).containsAll(targetsToKeep)) {
        throw unknownTargetsToKeep(targetsToKeep);
    }
    if (targetsToKeep.size() == targets.size()) {
        return new ReadCountCollection(targets, columnNames, counts.copy(), false);
    }
    final int[] targetsToKeepIndices = IntStream.range(0, targets.size()).filter(i -> targetsToKeep.contains(targets.get(i))).toArray();
    final List<Target> resultTargets = Arrays.stream(targetsToKeepIndices).mapToObj(targets::get).collect(Collectors.toList());
    // compose the new counts:
    final double[][] resultCounts = new double[targetsToKeepIndices.length][columnNames.size()];
    for (int i = 0; i < resultCounts.length; i++) {
        resultCounts[i] = counts.getRow(targetsToKeepIndices[i]);
    }
    return new ReadCountCollection(Collections.unmodifiableList(resultTargets), columnNames, new Array2DRowRealMatrix(resultCounts), false);
}
Also used : IntStream(java.util.stream.IntStream) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) java.util(java.util) Object2IntMap(it.unimi.dsi.fastutil.objects.Object2IntMap) Utils(org.broadinstitute.hellbender.utils.Utils) RealMatrix(org.apache.commons.math3.linear.RealMatrix) Object2IntOpenHashMap(it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap) Nonnull(javax.annotation.Nonnull) Collectors(java.util.stream.Collectors) ParamUtils(org.broadinstitute.hellbender.utils.param.ParamUtils) Serializable(java.io.Serializable) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix)

Example 83 with Array2DRowRealMatrix

use of org.apache.commons.math3.linear.Array2DRowRealMatrix in project gatk-protected by broadinstitute.

the class ReCapSegCallerUnitTest method testMakeCalls.

@Test
public void testMakeCalls() {
    final List<Target> targets = new ArrayList<>();
    final List<String> columnNames = Arrays.asList("Sample");
    final List<Double> coverage = new ArrayList<>();
    //add amplification targets
    for (int i = 0; i < 10; i++) {
        final SimpleInterval interval = new SimpleInterval("chr", 100 + 2 * i, 101 + 2 * i);
        targets.add(new Target(interval));
        coverage.add(ParamUtils.log2(2.0));
    }
    //add deletion targets
    for (int i = 0; i < 10; i++) {
        final SimpleInterval interval = new SimpleInterval("chr", 300 + 2 * i, 301 + 2 * i);
        targets.add(new Target(interval));
        coverage.add(ParamUtils.log2(0.5));
    }
    //add targets that don't belong to a segment
    for (int i = 1; i < 10; i++) {
        final SimpleInterval interval = new SimpleInterval("chr", 400 + 2 * i, 401 + 2 * i);
        targets.add(new Target(interval));
        coverage.add(ParamUtils.log2(1.0));
    }
    //add obviously neutral targets with some small spread
    for (int i = -5; i < 6; i++) {
        final SimpleInterval interval = new SimpleInterval("chr", 500 + 2 * i, 501 + 2 * i);
        targets.add(new Target(interval));
        coverage.add(ParamUtils.log2(0.01 * i + 1));
    }
    //add spread-out targets to a neutral segment (mean near zero)
    for (int i = -5; i < 6; i++) {
        final SimpleInterval interval = new SimpleInterval("chr", 700 + 2 * i, 701 + 2 * i);
        targets.add(new Target(interval));
        coverage.add(ParamUtils.log2(0.1 * i + 1));
    }
    final RealMatrix coverageMatrix = new Array2DRowRealMatrix(targets.size(), 1);
    coverageMatrix.setColumn(0, coverage.stream().mapToDouble(x -> x).toArray());
    final int n = targets.size();
    final int m = coverageMatrix.getRowDimension();
    final ReadCountCollection counts = new ReadCountCollection(targets, columnNames, coverageMatrix);
    List<ModeledSegment> segments = new ArrayList<>();
    //amplification
    segments.add(new ModeledSegment(new SimpleInterval("chr", 100, 200), 100, ParamUtils.log2(2.0)));
    //deletion
    segments.add(new ModeledSegment(new SimpleInterval("chr", 300, 400), 100, ParamUtils.log2(0.5)));
    //neutral
    segments.add(new ModeledSegment(new SimpleInterval("chr", 450, 550), 100, ParamUtils.log2(1)));
    //neutral
    segments.add(new ModeledSegment(new SimpleInterval("chr", 650, 750), 100, ParamUtils.log2(1)));
    List<ModeledSegment> calls = ReCapSegCaller.makeCalls(counts, segments);
    Assert.assertEquals(calls.get(0).getCall(), ReCapSegCaller.AMPLIFICATION_CALL);
    Assert.assertEquals(calls.get(1).getCall(), ReCapSegCaller.DELETION_CALL);
    Assert.assertEquals(calls.get(2).getCall(), ReCapSegCaller.NEUTRAL_CALL);
    Assert.assertEquals(calls.get(3).getCall(), ReCapSegCaller.NEUTRAL_CALL);
}
Also used : ArrayList(java.util.ArrayList) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) RealMatrix(org.apache.commons.math3.linear.RealMatrix) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) SimpleInterval(org.broadinstitute.hellbender.utils.SimpleInterval) BaseTest(org.broadinstitute.hellbender.utils.test.BaseTest) Test(org.testng.annotations.Test)

Example 84 with Array2DRowRealMatrix

use of org.apache.commons.math3.linear.Array2DRowRealMatrix in project gatk-protected by broadinstitute.

the class ReadCountCollectionUtilsUnitTest method tooManyZerosData.

@DataProvider(name = "tooManyZerosData")
public Object[][] tooManyZerosData() {
    final double[] zeroProbabilities = new double[] { .001, .01, .02, 0.1 };
    final List<Object[]> result = new ArrayList<>();
    final Random rdn = new Random(13);
    final int columnCount = 100;
    final int targetCount = 100;
    final List<String> columnNames = IntStream.range(0, columnCount).mapToObj(i -> "sample_" + (i + 1)).collect(Collectors.toList());
    final List<Target> targets = IntStream.range(0, targetCount).mapToObj(i -> new Target("target_" + (i + 1))).collect(Collectors.toList());
    for (final double zeroProbability : zeroProbabilities) {
        final double[][] counts = new double[columnCount][targetCount];
        for (int i = 0; i < counts.length; i++) {
            for (int j = 0; j < counts[0].length; j++) {
                counts[i][j] = rdn.nextDouble() <= zeroProbability ? 0.0 : rdn.nextDouble();
            }
        }
        final ReadCountCollection readCounts = new ReadCountCollection(targets, columnNames, new Array2DRowRealMatrix(counts, false));
        result.add(new Object[] { readCounts });
    }
    return result.toArray(new Object[result.size()][]);
}
Also used : IntStream(java.util.stream.IntStream) Arrays(java.util.Arrays) DataProvider(org.testng.annotations.DataProvider) Level(org.apache.logging.log4j.Level) Test(org.testng.annotations.Test) Random(java.util.Random) ArrayList(java.util.ArrayList) Message(org.apache.logging.log4j.message.Message) Assert(org.testng.Assert) Median(org.apache.commons.math3.stat.descriptive.rank.Median) Marker(org.apache.logging.log4j.Marker) AbstractLogger(org.apache.logging.log4j.spi.AbstractLogger) PrintWriter(java.io.PrintWriter) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) IOException(java.io.IOException) SimpleInterval(org.broadinstitute.hellbender.utils.SimpleInterval) Collectors(java.util.stream.Collectors) File(java.io.File) DoubleStream(java.util.stream.DoubleStream) List(java.util.List) Percentile(org.apache.commons.math3.stat.descriptive.rank.Percentile) Logger(org.apache.logging.log4j.Logger) Stream(java.util.stream.Stream) UserException(org.broadinstitute.hellbender.exceptions.UserException) RealMatrix(org.apache.commons.math3.linear.RealMatrix) ArrayList(java.util.ArrayList) Random(java.util.Random) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) DataProvider(org.testng.annotations.DataProvider)

Example 85 with Array2DRowRealMatrix

use of org.apache.commons.math3.linear.Array2DRowRealMatrix in project gatk-protected by broadinstitute.

the class ReadCountCollectionUtilsUnitTest method readCountAndPercentileData.

@DataProvider(name = "readCountAndPercentileData")
public Object[][] readCountAndPercentileData() {
    final double[] percentiles = new double[] { 1.0, 2.5, 5.0, 10.0, 25.0 };
    final List<Object[]> result = new ArrayList<>();
    final Random rdn = new Random(13);
    final int columnCount = 100;
    final int targetCount = 100;
    final List<String> columnNames = IntStream.range(0, columnCount).mapToObj(i -> "sample_" + (i + 1)).collect(Collectors.toList());
    final List<Target> targets = IntStream.range(0, targetCount).mapToObj(i -> new Target("target_" + (i + 1))).collect(Collectors.toList());
    for (final double percentile : percentiles) {
        final double[][] counts = new double[columnCount][targetCount];
        for (int i = 0; i < counts.length; i++) {
            for (int j = 0; j < counts[0].length; j++) {
                counts[i][j] = rdn.nextDouble();
            }
        }
        final ReadCountCollection readCounts = new ReadCountCollection(targets, columnNames, new Array2DRowRealMatrix(counts, false));
        result.add(new Object[] { readCounts, percentile });
    }
    return result.toArray(new Object[result.size()][]);
}
Also used : IntStream(java.util.stream.IntStream) Arrays(java.util.Arrays) DataProvider(org.testng.annotations.DataProvider) Level(org.apache.logging.log4j.Level) Test(org.testng.annotations.Test) Random(java.util.Random) ArrayList(java.util.ArrayList) Message(org.apache.logging.log4j.message.Message) Assert(org.testng.Assert) Median(org.apache.commons.math3.stat.descriptive.rank.Median) Marker(org.apache.logging.log4j.Marker) AbstractLogger(org.apache.logging.log4j.spi.AbstractLogger) PrintWriter(java.io.PrintWriter) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) IOException(java.io.IOException) SimpleInterval(org.broadinstitute.hellbender.utils.SimpleInterval) Collectors(java.util.stream.Collectors) File(java.io.File) DoubleStream(java.util.stream.DoubleStream) List(java.util.List) Percentile(org.apache.commons.math3.stat.descriptive.rank.Percentile) Logger(org.apache.logging.log4j.Logger) Stream(java.util.stream.Stream) UserException(org.broadinstitute.hellbender.exceptions.UserException) RealMatrix(org.apache.commons.math3.linear.RealMatrix) ArrayList(java.util.ArrayList) Random(java.util.Random) Array2DRowRealMatrix(org.apache.commons.math3.linear.Array2DRowRealMatrix) DataProvider(org.testng.annotations.DataProvider)

Aggregations

Array2DRowRealMatrix (org.apache.commons.math3.linear.Array2DRowRealMatrix)141 RealMatrix (org.apache.commons.math3.linear.RealMatrix)101 Test (org.testng.annotations.Test)60 IntStream (java.util.stream.IntStream)31 BaseTest (org.broadinstitute.hellbender.utils.test.BaseTest)28 File (java.io.File)27 Collectors (java.util.stream.Collectors)25 ArrayList (java.util.ArrayList)24 Assert (org.testng.Assert)24 List (java.util.List)22 SimpleInterval (org.broadinstitute.hellbender.utils.SimpleInterval)22 Target (org.broadinstitute.hellbender.tools.exome.Target)18 java.util (java.util)15 Random (java.util.Random)14 ReadCountCollection (org.broadinstitute.hellbender.tools.exome.ReadCountCollection)14 ParamUtils (org.broadinstitute.hellbender.utils.param.ParamUtils)14 DataProvider (org.testng.annotations.DataProvider)14 Stream (java.util.stream.Stream)13 Arrays (java.util.Arrays)12 DoubleStream (java.util.stream.DoubleStream)12