Example 1 with BDGAlignmentRecordToGATKReadAdapter

Use of org.broadinstitute.hellbender.utils.read.BDGAlignmentRecordToGATKReadAdapter in the gatk project by broadinstitute.

Class ReadsSparkSource, method getADAMReads.

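/**
 * Loads ADAM reads stored as Parquet.
 * @param inputPath path to the Parquet data
 * @param intervals intervals used to filter the loaded reads
 * @param header header passed to each read adapter and used for the overlap check; may be null, in which case a null value is broadcast
 * @return RDD of (ADAM-backed) GATKReads from the file.
 * @throws IOException if the Hadoop job cannot be created
 */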
public JavaRDD<GATKRead> getADAMReads(final String inputPath, final List<SimpleInterval> intervals, final SAMFileHeader header) throws IOException {
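    // Configure a Hadoop job so the Parquet input format reads records with the AlignmentRecord Avro schema.
    Job job = Job.getInstance(ctx.hadoopConfiguration());
    AvroParquetInputFormat.setAvroReadSchema(job, AlignmentRecord.getClassSchema());
    // Broadcast the header (possibly null) so the adapters created on the executors can use it.
    Broadcast<SAMFileHeader> bHeader;
    if (header == null) {
        bHeader = ctx.broadcast(null);
    } else {
        bHeader = ctx.broadcast(header);
    }
    // Read the Parquet file as (Void, AlignmentRecord) pairs and keep only the values.
    @SuppressWarnings("unchecked")
    JavaRDD<AlignmentRecord> recordsRdd = ctx.newAPIHadoopFile(
            inputPath, AvroParquetInputFormat.class, Void.class, AlignmentRecord.class, job.getConfiguration())
            .values();
    // Wrap each ADAM record in an adapter that exposes it as a GATKRead.
    JavaRDD<GATKRead> readsRdd = recordsRdd.map(record -> new BDGAlignmentRecordToGATKReadAdapter(record, bHeader.getValue()));
    // Keep only the reads that pass the interval overlap check.
    JavaRDD<GATKRead> filteredRdd = readsRdd.filter(record -> samRecordOverlaps(record.convertToSAMRecord(header), intervals));
    return putPairsInSamePartition(header, filteredRdd);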
}
Also used: GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead), AvroParquetInputFormat (org.apache.parquet.avro.AvroParquetInputFormat), AlignmentRecord (org.bdgenomics.formats.avro.AlignmentRecord), BDGAlignmentRecordToGATKReadAdapter (org.broadinstitute.hellbender.utils.read.BDGAlignmentRecordToGATKReadAdapter), Job (org.apache.hadoop.mapreduce.Job)
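
The method above follows a map-then-filter shape: each stored record is wrapped in an adapter that exposes a read interface, and the adapted reads are then filtered by interval overlap. The following is a minimal sketch of that shape in plain Java streams, without Spark, Parquet, or GATK on the classpath. Every name in it (StoredRecord, Read, StoredRecordToReadAdapter, Interval, overlapsAny, loadReads) is hypothetical and stands in for the real classes. The coordinate conventions (0-based half-open storage, 1-based closed reads) and the "no intervals means keep everything" rule are assumptions of the sketch, not statements about the GATK or ADAM implementations.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class AdapterFilterSketch {

    /** Hypothetical storage-level record (stand-in for a type like AlignmentRecord). */
    static final class StoredRecord {
        final String contigName;
        final long start; // assumed 0-based, inclusive
        final long end;   // assumed 0-based, exclusive

        StoredRecord(String contigName, long start, long end) {
            this.contigName = contigName;
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical read interface (stand-in for a type like GATKRead). */
    interface Read {
        String getContig();
        int getStart(); // assumed 1-based, inclusive
        int getEnd();   // assumed 1-based, inclusive
    }

    /** Adapter: presents a StoredRecord through the Read interface without copying it. */
    static final class StoredRecordToReadAdapter implements Read {
        private final StoredRecord record;

        StoredRecordToReadAdapter(StoredRecord record) {
            this.record = record;
        }

        @Override public String getContig() { return record.contigName; }
        @Override public int getStart() { return (int) record.start + 1; } // 0-based to 1-based
        @Override public int getEnd() { return (int) record.end; }         // exclusive end equals 1-based inclusive end
    }

    /** Hypothetical closed, 1-based interval. */
    static final class Interval {
        final String contig;
        final int start;
        final int end;

        Interval(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns true if the read overlaps any interval; an empty or null list keeps every read (sketch assumption). */
    static boolean overlapsAny(Read read, List<Interval> intervals) {
        if (intervals == null || intervals.isEmpty()) {
            return true;
        }
        for (Interval interval : intervals) {
            if (interval.contig.equals(read.getContig())
                    && read.getStart() <= interval.end
                    && interval.start <= read.getEnd()) {
                return true;
            }
        }
        return false;
    }

    /** Map each stored record to an adapted read, then keep the reads that overlap the intervals. */
    static List<Read> loadReads(List<StoredRecord> records, List<Interval> intervals) {
        return records.stream()
                .map(record -> (Read) new StoredRecordToReadAdapter(record))
                .filter(read -> overlapsAny(read, intervals))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<StoredRecord> records = Arrays.asList(
                new StoredRecord("chr1", 99, 150),
                new StoredRecord("chr1", 500, 600),
                new StoredRecord("chr2", 99, 150));
        List<Interval> intervals = Collections.singletonList(new Interval("chr1", 120, 200));

        List<Read> kept = loadReads(records, intervals);
        System.out.println(kept.size()); // prints 1: only the first chr1 read overlaps chr1:120-200
    }
}
```

Because the adapter only holds a reference to the stored record, the map step does not copy data; conversion happens when a getter is called. In the Spark version the same two steps run as RDD transformations, so they are evaluated lazily and in parallel across partitions.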

Aggregations

Job (org.apache.hadoop.mapreduce.Job): 1
AvroParquetInputFormat (org.apache.parquet.avro.AvroParquetInputFormat): 1
AlignmentRecord (org.bdgenomics.formats.avro.AlignmentRecord): 1
BDGAlignmentRecordToGATKReadAdapter (org.broadinstitute.hellbender.utils.read.BDGAlignmentRecordToGATKReadAdapter): 1
GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead): 1