Example 21 with ReadFilter

Use of org.broadinstitute.hellbender.engine.filters.ReadFilter in project gatk by broadinstitute.

Class ReadFilteringIteratorUnitTest, method filteringIteratorTestData.

@DataProvider(name = "FilteringIteratorTestData")
public Object[][] filteringIteratorTestData() {
    final ReadFilter allowNoReadsFilter = new ReadFilter() {

        private static final long serialVersionUID = 1L;

        @Override
        public boolean test(final GATKRead read) {
            return false;
        }
    };
    final ReadFilter allowLongReadsFilter = new ReadFilter() {

        private static final long serialVersionUID = 1L;

        @Override
        public boolean test(final GATKRead read) {
            return read.getLength() > 100;
        }
    };
    return new Object[][] {
        { makeReadsIterator(), ReadFilterLibrary.ALLOW_ALL_READS, 5 },
        { makeReadsIterator(), allowNoReadsFilter, 0 },
        { makeReadsIterator(), allowLongReadsFilter, 3 }
    };
}
Also used : GATKRead(org.broadinstitute.hellbender.utils.read.GATKRead) ReadFilter(org.broadinstitute.hellbender.engine.filters.ReadFilter) DataProvider(org.testng.annotations.DataProvider)
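
The data provider above pairs a source iterator and a filter with the number of reads expected to survive. As a rough illustration of that pattern, here is a self-contained sketch that uses java.util.function.Predicate in place of ReadFilter and integer lengths in place of GATKRead. The class FilteringIteratorSketch, its countPassing helper, and the sample lengths are hypothetical stand-ins, not GATK's ReadFilteringIterator or its test data.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical stand-in for a filtering iterator; not GATK's implementation.
// Assumes the source never yields null, since null marks "nothing buffered".
final class FilteringIteratorSketch<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<T> filter;
    private T next; // next element that passed the filter, or null if none is left

    FilteringIteratorSketch(final Iterator<T> source, final Predicate<T> filter) {
        this.source = source;
        this.filter = filter;
        advance();
    }

    // Pull from the source until an element passes the filter or the source is exhausted.
    private void advance() {
        next = null;
        while (source.hasNext()) {
            final T candidate = source.next();
            if (filter.test(candidate)) {
                next = candidate;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public T next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        final T result = next;
        advance();
        return result;
    }

    // Count how many elements of a list survive the given filter.
    static <T> int countPassing(final List<T> items, final Predicate<T> filter) {
        final Iterator<T> it = new FilteringIteratorSketch<>(items.iterator(), filter);
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }
}
```

With five hypothetical lengths of which three exceed 100, an accept-all predicate, a reject-all predicate, and a "longer than 100" predicate would give counts with the same shape as the three test rows above.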

Example 22 with ReadFilter

Use of org.broadinstitute.hellbender.engine.filters.ReadFilter in project gatk by broadinstitute.

Class HaplotypeCallerEngine, method makeStandardHCReadFilters.

/**
 * @return the default set of read filters for use with the HaplotypeCaller
 */
public static List<ReadFilter> makeStandardHCReadFilters() {
    List<ReadFilter> filters = new ArrayList<>();
    filters.add(new MappingQualityReadFilter(READ_QUALITY_FILTER_THRESHOLD));
    filters.add(ReadFilterLibrary.MAPPING_QUALITY_AVAILABLE);
    filters.add(ReadFilterLibrary.MAPPED);
    filters.add(ReadFilterLibrary.PRIMARY_ALIGNMENT);
    filters.add(ReadFilterLibrary.NOT_DUPLICATE);
    filters.add(ReadFilterLibrary.PASSES_VENDOR_QUALITY_CHECK);
    filters.add(ReadFilterLibrary.NON_ZERO_REFERENCE_LENGTH_ALIGNMENT);
    filters.add(ReadFilterLibrary.GOOD_CIGAR);
    filters.add(new WellformedReadFilter());
    return filters;
}
Also used : WellformedReadFilter(org.broadinstitute.hellbender.engine.filters.WellformedReadFilter) MappingQualityReadFilter(org.broadinstitute.hellbender.engine.filters.MappingQualityReadFilter) ReadFilter(org.broadinstitute.hellbender.engine.filters.ReadFilter)

Example 23 with ReadFilter

Use of org.broadinstitute.hellbender.engine.filters.ReadFilter in project gatk by broadinstitute.

Class HaplotypeCallerEngineUnitTest, method testIsActive.

@Test
public void testIsActive() throws IOException {
    final File testBam = new File(NA12878_20_21_WGS_bam);
    final File reference = new File(b37_reference_20_21);
    final SimpleInterval shardInterval = new SimpleInterval("20", 10000000, 10001000);
    final SimpleInterval paddedShardInterval = new SimpleInterval(shardInterval.getContig(), shardInterval.getStart() - 100, shardInterval.getEnd() + 100);
    final HaplotypeCallerArgumentCollection hcArgs = new HaplotypeCallerArgumentCollection();
    // We expect isActive() to return 1.0 for the sites below, and 0.0 for all other sites
    final List<SimpleInterval> expectedActiveSites = Arrays.asList(
        new SimpleInterval("20", 9999996, 9999996),
        new SimpleInterval("20", 9999997, 9999997),
        new SimpleInterval("20", 10000117, 10000117),
        new SimpleInterval("20", 10000211, 10000211),
        new SimpleInterval("20", 10000439, 10000439),
        new SimpleInterval("20", 10000598, 10000598),
        new SimpleInterval("20", 10000694, 10000694),
        new SimpleInterval("20", 10000758, 10000758),
        new SimpleInterval("20", 10001019, 10001019));
    try (final ReadsDataSource reads = new ReadsDataSource(testBam.toPath());
        final ReferenceDataSource ref = new ReferenceFileSource(reference);
        final CachingIndexedFastaSequenceFile referenceReader = new CachingIndexedFastaSequenceFile(reference)) {
        final HaplotypeCallerEngine hcEngine = new HaplotypeCallerEngine(hcArgs, reads.getHeader(), referenceReader);
        List<ReadFilter> hcFilters = HaplotypeCallerEngine.makeStandardHCReadFilters();
        hcFilters.forEach(filter -> filter.setHeader(reads.getHeader()));
        ReadFilter hcCombinedFilter = hcFilters.get(0);
        for (int i = 1; i < hcFilters.size(); ++i) {
            hcCombinedFilter = hcCombinedFilter.and(hcFilters.get(i));
        }
        final Iterator<GATKRead> readIter = new ReadFilteringIterator(reads.query(paddedShardInterval), hcCombinedFilter);
        final LocusIteratorByState libs = new LocusIteratorByState(readIter, DownsamplingMethod.NONE, false, ReadUtils.getSamplesFromHeader(reads.getHeader()), reads.getHeader(), false);
        libs.forEachRemaining(pileup -> {
            final SimpleInterval pileupInterval = new SimpleInterval(pileup.getLocation());
            final ReferenceContext pileupRefContext = new ReferenceContext(ref, pileupInterval);
            final ActivityProfileState isActiveResult = hcEngine.isActive(pileup, pileupRefContext, new FeatureContext(null, pileupInterval));
            final double expectedIsActiveValue = expectedActiveSites.contains(pileupInterval) ? 1.0 : 0.0;
            Assert.assertEquals(isActiveResult.isActiveProb(), expectedIsActiveValue, "Wrong isActive probability for site " + pileupInterval);
        });
    }
}
Also used : ReadFilteringIterator(org.broadinstitute.hellbender.utils.iterators.ReadFilteringIterator) GATKRead(org.broadinstitute.hellbender.utils.read.GATKRead) ActivityProfileState(org.broadinstitute.hellbender.utils.activityprofile.ActivityProfileState) CachingIndexedFastaSequenceFile(org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile) LocusIteratorByState(org.broadinstitute.hellbender.utils.locusiterator.LocusIteratorByState) SimpleInterval(org.broadinstitute.hellbender.utils.SimpleInterval) ReadFilter(org.broadinstitute.hellbender.engine.filters.ReadFilter) File(java.io.File) BaseTest(org.broadinstitute.hellbender.utils.test.BaseTest) Test(org.testng.annotations.Test)
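
The test above folds a list of filters into a single filter by repeatedly calling and() in a loop. A minimal sketch of the same fold with plain java.util.function.Predicate is shown here. The class CombineFiltersSketch and the method allOf are hypothetical names, and the sketch treats an empty list as "accept everything", which the loop in the test does not handle because it calls get(0).

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper illustrating the fold; not part of GATK.
final class CombineFiltersSketch {
    // AND all predicates together, left to right.
    // An empty list yields a predicate that accepts every input.
    static <T> Predicate<T> allOf(final List<Predicate<T>> filters) {
        return filters.stream().reduce(x -> true, Predicate::and);
    }
}
```

The combined predicate accepts an input only if every predicate in the list accepts it.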

Example 24 with ReadFilter

Use of org.broadinstitute.hellbender.engine.filters.ReadFilter in project gatk by broadinstitute.

Class Mutect2Engine, method makeStandardMutect2ReadFilters.

/**
 * @return the default set of read filters for use with Mutect2
 */
public static List<ReadFilter> makeStandardMutect2ReadFilters() {
    // The order in which we apply filters is important. Cheap filters come first so we fail fast
    List<ReadFilter> filters = new ArrayList<>();
    filters.add(new MappingQualityReadFilter(READ_QUALITY_FILTER_THRESHOLD));
    filters.add(ReadFilterLibrary.MAPPING_QUALITY_AVAILABLE);
    filters.add(ReadFilterLibrary.MAPPING_QUALITY_NOT_ZERO);
    filters.add(ReadFilterLibrary.MAPPED);
    filters.add(ReadFilterLibrary.PRIMARY_ALIGNMENT);
    filters.add(ReadFilterLibrary.NOT_DUPLICATE);
    filters.add(ReadFilterLibrary.PASSES_VENDOR_QUALITY_CHECK);
    filters.add(ReadFilterLibrary.NON_ZERO_REFERENCE_LENGTH_ALIGNMENT);
    filters.add(GOOD_READ_LENGTH_FILTER);
    filters.add(ReadFilterLibrary.MATE_ON_SAME_CONTIG_OR_NO_MAPPED_MATE);
    filters.add(ReadFilterLibrary.GOOD_CIGAR);
    filters.add(new WellformedReadFilter());
    return filters;
}
Also used : WellformedReadFilter(org.broadinstitute.hellbender.engine.filters.WellformedReadFilter) MappingQualityReadFilter(org.broadinstitute.hellbender.engine.filters.MappingQualityReadFilter) ReadFilter(org.broadinstitute.hellbender.engine.filters.ReadFilter)
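
The comment in the method above says cheap filters come first so that reads fail fast. The sketch below illustrates why order can matter, assuming the combined filter short-circuits the way java.util.function.Predicate.and is documented to. FilterOrderSketch, its predicates, and the input values are hypothetical. The "expensive" predicate only counts its own invocations; it does no real work.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical illustration of filter ordering; not GATK code.
final class FilterOrderSketch {
    // Returns how many times the "expensive" predicate ran over the inputs.
    static int expensiveCalls(final int[] values, final boolean cheapFirst) {
        final AtomicInteger calls = new AtomicInteger();
        // Cheap check: a simple threshold, standing in for a mapping-quality cutoff.
        final Predicate<Integer> cheap = v -> v >= 20;
        // "Expensive" check: instrumented so each invocation is counted.
        final Predicate<Integer> expensive = v -> {
            calls.incrementAndGet();
            return v % 2 == 0;
        };
        // Predicate.and skips its second operand when the first returns false.
        final Predicate<Integer> combined = cheapFirst ? cheap.and(expensive) : expensive.and(cheap);
        for (final int v : values) {
            combined.test(v);
        }
        return calls.get();
    }
}
```

With the cheap check first, the expensive predicate runs only on inputs that pass the threshold. With the order reversed, it runs on every input, although the set of accepted inputs is the same either way.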

Example 25 with ReadFilter

Use of org.broadinstitute.hellbender.engine.filters.ReadFilter in project gatk-protected by broadinstitute.

Class HaplotypeCallerEngineUnitTest, method testIsActive.

@Test
public void testIsActive() throws IOException {
    final File testBam = new File(NA12878_20_21_WGS_bam);
    final File reference = new File(b37_reference_20_21);
    final SimpleInterval shardInterval = new SimpleInterval("20", 10000000, 10001000);
    final SimpleInterval paddedShardInterval = new SimpleInterval(shardInterval.getContig(), shardInterval.getStart() - 100, shardInterval.getEnd() + 100);
    final HaplotypeCallerArgumentCollection hcArgs = new HaplotypeCallerArgumentCollection();
    // We expect isActive() to return 1.0 for the sites below, and 0.0 for all other sites
    final List<SimpleInterval> expectedActiveSites = Arrays.asList(
        new SimpleInterval("20", 9999996, 9999996),
        new SimpleInterval("20", 9999997, 9999997),
        new SimpleInterval("20", 10000117, 10000117),
        new SimpleInterval("20", 10000211, 10000211),
        new SimpleInterval("20", 10000439, 10000439),
        new SimpleInterval("20", 10000598, 10000598),
        new SimpleInterval("20", 10000694, 10000694),
        new SimpleInterval("20", 10000758, 10000758),
        new SimpleInterval("20", 10001019, 10001019));
    try (final ReadsDataSource reads = new ReadsDataSource(testBam.toPath());
        final ReferenceDataSource ref = new ReferenceFileSource(reference);
        final CachingIndexedFastaSequenceFile referenceReader = new CachingIndexedFastaSequenceFile(reference)) {
        final HaplotypeCallerEngine hcEngine = new HaplotypeCallerEngine(hcArgs, reads.getHeader(), referenceReader);
        List<ReadFilter> hcFilters = HaplotypeCallerEngine.makeStandardHCReadFilters();
        hcFilters.forEach(filter -> filter.setHeader(reads.getHeader()));
        ReadFilter hcCombinedFilter = hcFilters.get(0);
        for (int i = 1; i < hcFilters.size(); ++i) {
            hcCombinedFilter = hcCombinedFilter.and(hcFilters.get(i));
        }
        final Iterator<GATKRead> readIter = new ReadFilteringIterator(reads.query(paddedShardInterval), hcCombinedFilter);
        final LocusIteratorByState libs = new LocusIteratorByState(readIter, DownsamplingMethod.NONE, false, ReadUtils.getSamplesFromHeader(reads.getHeader()), reads.getHeader(), false);
        libs.forEachRemaining(pileup -> {
            final SimpleInterval pileupInterval = new SimpleInterval(pileup.getLocation());
            final ReferenceContext pileupRefContext = new ReferenceContext(ref, pileupInterval);
            final ActivityProfileState isActiveResult = hcEngine.isActive(pileup, pileupRefContext, new FeatureContext(null, pileupInterval));
            final double expectedIsActiveValue = expectedActiveSites.contains(pileupInterval) ? 1.0 : 0.0;
            Assert.assertEquals(isActiveResult.isActiveProb(), expectedIsActiveValue, "Wrong isActive probability for site " + pileupInterval);
        });
    }
}
Also used : ReadFilteringIterator(org.broadinstitute.hellbender.utils.iterators.ReadFilteringIterator) GATKRead(org.broadinstitute.hellbender.utils.read.GATKRead) ActivityProfileState(org.broadinstitute.hellbender.utils.activityprofile.ActivityProfileState) CachingIndexedFastaSequenceFile(org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile) LocusIteratorByState(org.broadinstitute.hellbender.utils.locusiterator.LocusIteratorByState) SimpleInterval(org.broadinstitute.hellbender.utils.SimpleInterval) ReadFilter(org.broadinstitute.hellbender.engine.filters.ReadFilter) File(java.io.File) BaseTest(org.broadinstitute.hellbender.utils.test.BaseTest) Test(org.testng.annotations.Test)

Aggregations

ReadFilter (org.broadinstitute.hellbender.engine.filters.ReadFilter): 25
WellformedReadFilter (org.broadinstitute.hellbender.engine.filters.WellformedReadFilter): 12
GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead): 10
SimpleInterval (org.broadinstitute.hellbender.utils.SimpleInterval): 7
File (java.io.File): 6
ReadFilterLibrary (org.broadinstitute.hellbender.engine.filters.ReadFilterLibrary): 6
CountingReadFilter (org.broadinstitute.hellbender.engine.filters.CountingReadFilter): 5
MappingQualityReadFilter (org.broadinstitute.hellbender.engine.filters.MappingQualityReadFilter): 5
ArrayList (java.util.ArrayList): 4
JavaRDD (org.apache.spark.api.java.JavaRDD): 4
JavaSparkContext (org.apache.spark.api.java.JavaSparkContext): 4
SAMFileHeader (htsjdk.samtools.SAMFileHeader): 3
SAMSequenceDictionary (htsjdk.samtools.SAMSequenceDictionary): 3
java.util (java.util): 3
Argument (org.broadinstitute.barclay.argparser.Argument): 3
CommandLineProgramProperties (org.broadinstitute.barclay.argparser.CommandLineProgramProperties): 3
DocumentedFeature (org.broadinstitute.barclay.help.DocumentedFeature): 3
GATKSparkTool (org.broadinstitute.hellbender.engine.spark.GATKSparkTool): 3
UserException (org.broadinstitute.hellbender.exceptions.UserException): 3
BaseTest (org.broadinstitute.hellbender.utils.test.BaseTest): 3