
Example 1 with InsertSizeMetricsArgumentCollection

Use of org.broadinstitute.hellbender.metrics.InsertSizeMetricsArgumentCollection in the broadinstitute/gatk project.

From the test method of class InsertSizeMetricsCollectorSparkUnitTest.

@Test(dataProvider = "metricsfiles", groups = "spark")
public void test(final String fileName, final String referenceName, final boolean allLevels, final String expectedResultsFile) throws IOException {
    final String inputPath = new File(TEST_DATA_DIR, fileName).getAbsolutePath();
    final String referencePath = referenceName != null ? new File(referenceName).getAbsolutePath() : null;
    final File outfile = BaseTest.createTempFile("test", ".insert_size_metrics");
    JavaSparkContext ctx = SparkContextFactory.getTestSparkContext();
    ReadsSparkSource readSource = new ReadsSparkSource(ctx, ValidationStringency.DEFAULT_STRINGENCY);
    SAMFileHeader samHeader = readSource.getHeader(inputPath, referencePath);
    JavaRDD<GATKRead> rddParallelReads = readSource.getParallelReads(inputPath, referencePath);
    InsertSizeMetricsArgumentCollection isArgs = new InsertSizeMetricsArgumentCollection();
    isArgs.output = outfile.getAbsolutePath();
    if (allLevels) {
        isArgs.metricAccumulationLevel.accumulationLevels = new HashSet<>();
        isArgs.metricAccumulationLevel.accumulationLevels.add(MetricAccumulationLevel.ALL_READS);
        isArgs.metricAccumulationLevel.accumulationLevels.add(MetricAccumulationLevel.SAMPLE);
        isArgs.metricAccumulationLevel.accumulationLevels.add(MetricAccumulationLevel.LIBRARY);
        isArgs.metricAccumulationLevel.accumulationLevels.add(MetricAccumulationLevel.READ_GROUP);
    }
    InsertSizeMetricsCollectorSpark isSpark = new InsertSizeMetricsCollectorSpark();
    isSpark.initialize(isArgs, samHeader, null);
    // Since we're bypassing the framework in order to force this test to run on multiple partitions, we
    // need to build the read filter manually since we don't have the plugin descriptor to do it for us;
    // so remove the (default) FirstOfPairReadFilter and add SECOND_OF_PAIR manually, since that's
    // required for our tests to pass
    List<ReadFilter> readFilters = isSpark.getDefaultReadFilters();
    // Note: Stream.filter does not mutate the source list; the filtered result must be captured.
    readFilters = readFilters.stream()
            .filter(f -> !f.getClass().getSimpleName().equals(ReadFilterLibrary.FirstOfPairReadFilter.class.getSimpleName()))
            .collect(Collectors.toList());
    readFilters.add(ReadFilterLibrary.SECOND_OF_PAIR);
    ReadFilter rf = ReadFilter.fromList(readFilters, samHeader);
    // Force the input RDD to be split into two partitions to ensure that the
    // reduce/combiners run
    rddParallelReads = rddParallelReads.repartition(2);
    isSpark.collectMetrics(rddParallelReads.filter(r -> rf.test(r)), samHeader);
    isSpark.saveMetrics(fileName, null);
    IntegrationTestSpec.assertEqualTextFiles(outfile, new File(TEST_DATA_DIR, expectedResultsFile), "#");
}
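ReadFilter.fromList AND-composes the individual filters into a single predicate that every read must pass. The same remove-one-filter-then-compose pattern can be sketched with no GATK or Spark dependencies; NamedFilter, compose, and demo below are illustrative stand-ins, not GATK API. It also shows the pitfall fixed above: Stream.filter returns a new stream and leaves the source list untouched, so its result must be captured.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class FilterCompose {
    // A predicate with a name, standing in for GATK's ReadFilter subclasses.
    record NamedFilter(String name, Predicate<Integer> pred) {}

    // AND-compose the remaining predicates, mirroring ReadFilter.fromList.
    static Predicate<Integer> compose(List<NamedFilter> filters) {
        return filters.stream().map(NamedFilter::pred).reduce(n -> true, Predicate::and);
    }

    public static boolean demo() {
        List<NamedFilter> filters = List.of(
                new NamedFilter("Positive", n -> n > 0),
                new NamedFilter("FirstOfPair", n -> n % 2 == 1));

        // Drop the unwanted filter by name; the stream result MUST be
        // captured, since filter() does not modify the source list.
        List<NamedFilter> kept = filters.stream()
                .filter(f -> !f.name().equals("FirstOfPair"))
                .collect(Collectors.toList());

        Predicate<Integer> combined = compose(kept);
        // Only the "Positive" filter remains: 4 passes, -2 does not.
        return combined.test(4) && !combined.test(-2);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true"
    }
}
```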
Also used : DataProvider(org.testng.annotations.DataProvider) Test(org.testng.annotations.Test) CommandLineProgramTest(org.broadinstitute.hellbender.CommandLineProgramTest) BaseTest(org.broadinstitute.hellbender.utils.test.BaseTest) IntegrationTestSpec(org.broadinstitute.hellbender.utils.test.IntegrationTestSpec) IOException(java.io.IOException) File(java.io.File) HashSet(java.util.HashSet) List(java.util.List) SAMFileHeader(htsjdk.samtools.SAMFileHeader) ValidationStringency(htsjdk.samtools.ValidationStringency) JavaRDD(org.apache.spark.api.java.JavaRDD) JavaSparkContext(org.apache.spark.api.java.JavaSparkContext) SparkContextFactory(org.broadinstitute.hellbender.engine.spark.SparkContextFactory) ReadsSparkSource(org.broadinstitute.hellbender.engine.spark.datasources.ReadsSparkSource) GATKRead(org.broadinstitute.hellbender.utils.read.GATKRead) ReadFilter(org.broadinstitute.hellbender.engine.filters.ReadFilter) ReadFilterLibrary(org.broadinstitute.hellbender.engine.filters.ReadFilterLibrary) InsertSizeMetricsArgumentCollection(org.broadinstitute.hellbender.metrics.InsertSizeMetricsArgumentCollection) MetricAccumulationLevel(org.broadinstitute.hellbender.metrics.MetricAccumulationLevel)
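The repartition(2) call in the test guarantees at least two partitions, so the per-partition combiners and the cross-partition reduce in collectMetrics both actually execute. That partial-aggregate-then-merge shape can be sketched without Spark; PartitionCombine and its methods below are illustrative, not Spark or GATK API.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionCombine {
    // Per-partition combiner: a histogram of values, standing in for the
    // per-partition insert-size histograms the collector builds.
    static int[] combine(List<Integer> partition, int maxValue) {
        int[] counts = new int[maxValue + 1];
        for (int v : partition) counts[v]++;
        return counts;
    }

    // Reduce step: merge two partial histograms, as the Spark reduce
    // merges the per-partition metric collectors.
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length];
        for (int i = 0; i < a.length; i++) out[i] = a[i] + b[i];
        return out;
    }

    public static int[] demo() {
        // Two "partitions" of insert sizes, combined locally then merged.
        List<Integer> p1 = Arrays.asList(1, 2, 2);
        List<Integer> p2 = Arrays.asList(2, 3);
        return merge(combine(p1, 3), combine(p2, 3));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints "[0, 1, 3, 1]"
    }
}
```

With only one partition, a broken merge step would go unnoticed, which is exactly why the test forces two.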
