Example 1 with BwaSparkEngine

Use of org.broadinstitute.hellbender.tools.spark.bwa.BwaSparkEngine in the gatk project by broadinstitute.

From the class PathSeqFilterSpark, method doHostBWA.

private JavaRDD<GATKRead> doHostBWA(final JavaSparkContext ctx, final SAMFileHeader readsHeader, final JavaRDD<GATKRead> reads) {
    final BwaSparkEngine engine = new BwaSparkEngine(ctx, indexImageFile, getHeaderForReads(), getReferenceSequenceDictionary());
    // gcsOptions is null when no API key is available
    final GCSOptions gcsOptions = getAuthenticatedGCSOptions();
    final ReferenceMultiSource hostReference = new ReferenceMultiSource(gcsOptions, HOST_REF_PATH, getReferenceWindowFunction());
    final SAMSequenceDictionary hostRefDict = hostReference.getReferenceSequenceDictionary(header.getSequenceDictionary());
    readsHeader.setSequenceDictionary(hostRefDict);
    return engine.align(reads);
}
Also used: ReferenceMultiSource (org.broadinstitute.hellbender.engine.datasources.ReferenceMultiSource), BwaSparkEngine (org.broadinstitute.hellbender.tools.spark.bwa.BwaSparkEngine), SAMSequenceDictionary (htsjdk.samtools.SAMSequenceDictionary), GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions)

Example 2 with BwaSparkEngine

Use of org.broadinstitute.hellbender.tools.spark.bwa.BwaSparkEngine in the gatk project by broadinstitute.

From the class BwaAndMarkDuplicatesPipelineSpark, method runTool.

@Override
protected void runTool(final JavaSparkContext ctx) {
    try (final BwaSparkEngine engine = new BwaSparkEngine(ctx, indexImageFile, getHeaderForReads(), getReferenceSequenceDictionary())) {
        final JavaRDD<GATKRead> alignedReads = engine.align(getReads());
        final JavaRDD<GATKRead> markedReadsWithOD = MarkDuplicatesSpark.mark(alignedReads, engine.getHeader(), duplicatesScoringStrategy, new OpticalDuplicateFinder(), getRecommendedNumReducers());
        final JavaRDD<GATKRead> markedReads = MarkDuplicatesSpark.cleanupTemporaryAttributes(markedReadsWithOD);
        try {
            ReadsSparkSink.writeReads(ctx, output, referenceArguments.getReferenceFile().getAbsolutePath(), markedReads, engine.getHeader(), shardedOutput ? ReadsWriteFormat.SHARDED : ReadsWriteFormat.SINGLE, getRecommendedNumReducers());
        } catch (IOException e) {
            // pass the IOException as the cause so its stack trace is preserved
            throw new GATKException("unable to write bam: " + e, e);
        }
    }
}
Also used: GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead), OpticalDuplicateFinder (org.broadinstitute.hellbender.utils.read.markduplicates.OpticalDuplicateFinder), BwaSparkEngine (org.broadinstitute.hellbender.tools.spark.bwa.BwaSparkEngine), IOException (java.io.IOException), GATKException (org.broadinstitute.hellbender.exceptions.GATKException)
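
Example 2 opens the engine in a try-with-resources block and writes the reads inside that block, before the engine is closed. Spark transformations are lazy, so this ordering can matter when a resource is still needed at evaluation time. The following is a minimal, self-contained sketch of that general pattern. It does not use GATK or Spark: DemoEngine, alignEagerly and alignLazily are hypothetical names invented for illustration, and a Supplier stands in for a lazily evaluated RDD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class LazyResourceDemo {

    /** Hypothetical stand-in for a closeable engine; not part of GATK. */
    static final class DemoEngine implements AutoCloseable {
        private boolean closed = false;

        /** Returns a lazy computation: nothing runs until get() is called. */
        Supplier<List<String>> align(final List<String> reads) {
            return () -> {
                if (closed) {
                    throw new IllegalStateException("engine already closed");
                }
                final List<String> out = new ArrayList<>();
                for (final String read : reads) {
                    out.add(read + ":aligned");
                }
                return out;
            };
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Evaluates the lazy result inside the try block, while the engine is still open. */
    static List<String> alignEagerly(final List<String> reads) {
        try (DemoEngine engine = new DemoEngine()) {
            return engine.align(reads).get();
        }
    }

    /** Returns the lazy result unevaluated; the engine is closed before get() is ever called. */
    static Supplier<List<String>> alignLazily(final List<String> reads) {
        try (DemoEngine engine = new DemoEngine()) {
            return engine.align(reads);
        }
    }

    public static void main(final String[] args) {
        final List<String> reads = Arrays.asList("r1", "r2");
        System.out.println(alignEagerly(reads));
        try {
            alignLazily(reads).get();
        } catch (IllegalStateException e) {
            System.out.println("lazy evaluation failed: " + e.getMessage());
        }
    }
}
```

In this sketch alignEagerly succeeds because the work is forced before close() runs, while the Supplier returned by alignLazily fails when it is finally evaluated. Whether the same constraint applies to BwaSparkEngine depends on its implementation, which is not shown here.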

Aggregations

BwaSparkEngine (org.broadinstitute.hellbender.tools.spark.bwa.BwaSparkEngine): 2
GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions): 1
SAMSequenceDictionary (htsjdk.samtools.SAMSequenceDictionary): 1
IOException (java.io.IOException): 1
ReferenceMultiSource (org.broadinstitute.hellbender.engine.datasources.ReferenceMultiSource): 1
GATKException (org.broadinstitute.hellbender.exceptions.GATKException): 1
GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead): 1
OpticalDuplicateFinder (org.broadinstitute.hellbender.utils.read.markduplicates.OpticalDuplicateFinder): 1