
Example 1 with ContainsKmerReadFilterSpark

use of org.broadinstitute.hellbender.tools.spark.sv.ContainsKmerReadFilterSpark in project gatk by broadinstitute.

In class PathSeqFilterSpark, method doKmerFiltering:

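@SuppressWarnings("unchecked")
private JavaRDD<GATKRead> doKmerFiltering(final JavaSparkContext ctx, final JavaRDD<GATKRead> reads) {
    // Obtained here but not used further in this excerpt.
    final PipelineOptions options = getAuthenticatedGCSOptions();
    final Kryo kryo = new Kryo();
    // Disable Kryo object-reference tracking.
    kryo.setReferences(false);
    final Set<SVKmer> kmerLibSet;
    // try-with-resources closes the Kryo Input once the k-mer library has been read.
    try (Input input = new Input(BucketUtils.openFile(KMER_LIB_PATH))) {
        kmerLibSet = (HopscotchSet<SVKmer>) kryo.readClassAndObject(input);
    }
    // Broadcast the k-mer set to the executors, then filter the reads against it.
    return reads.filter(new ContainsKmerReadFilterSpark(ctx.broadcast(kmerLibSet), KMER_SIZE));
}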
Also used: ContainsKmerReadFilterSpark (org.broadinstitute.hellbender.tools.spark.sv.ContainsKmerReadFilterSpark), Input (com.esotericsoftware.kryo.io.Input), HopscotchSet (org.broadinstitute.hellbender.tools.spark.utils.HopscotchSet), SVKmer (org.broadinstitute.hellbender.tools.spark.sv.SVKmer), PipelineOptions (com.google.cloud.dataflow.sdk.options.PipelineOptions), Kryo (com.esotericsoftware.kryo.Kryo)
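
For readers unfamiliar with k-mer filtering, the general idea behind a "contains k-mer" check can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the GATK implementation: the class KmerContainmentSketch and its methods buildKmerSet and containsAnyKmer are hypothetical names, plain strings stand in for SVKmer and HopscotchSet, and details such as reverse-complement handling or whether a match keeps or drops the read are not taken from the source.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in, NOT the GATK class: strings replace SVKmer and HopscotchSet.
final class KmerContainmentSketch {
    private KmerContainmentSketch() {}

    /** Collects every length-k substring of each library sequence into a set. */
    static Set<String> buildKmerSet(final Iterable<String> sequences, final int k) {
        final Set<String> kmers = new HashSet<>();
        for (final String seq : sequences) {
            for (int i = 0; i + k <= seq.length(); i++) {
                kmers.add(seq.substring(i, i + k));
            }
        }
        return kmers;
    }

    /** Returns true if any length-k window of the read bases is present in the set. */
    static boolean containsAnyKmer(final byte[] bases, final Set<String> kmers, final int k) {
        final String read = new String(bases, StandardCharsets.US_ASCII);
        for (int i = 0; i + k <= read.length(); i++) {
            if (kmers.contains(read.substring(i, i + k))) {
                return true;
            }
        }
        // Also reached when the read is shorter than k (no windows to test).
        return false;
    }
}
```

In a Spark job, a predicate like containsAnyKmer would typically be wrapped in a function object and passed to JavaRDD.filter, with the k-mer set shared through a broadcast variable as in the example above.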

Example 2 with ContainsKmerReadFilterSpark

use of org.broadinstitute.hellbender.tools.spark.sv.ContainsKmerReadFilterSpark in project gatk by broadinstitute.

In class ContainsKmerReadFilterTest, method testTest:

@Test(dataProvider = "sequenceStrings")
public void testTest(final String bases_in, final Boolean test_out) throws Exception {
    final JavaSparkContext ctx = SparkContextFactory.getTestSparkContext();
    final ContainsKmerReadFilterSpark filter = new ContainsKmerReadFilterSpark(ctx.broadcast(kmerSet), kSize);
    // Build a quality array the same length as the bases, filled with FASTQ 'I',
    // then convert it in place to Phred scores.
    final byte[] quals = bases_in.getBytes().clone();
    Arrays.fill(quals, (byte) 'I');
    SAMUtils.fastqToPhred(quals);
    final GATKRead read_in = ArtificialReadUtils.createArtificialRead(bases_in.getBytes(), quals, "*");
    final Boolean test_i = filter.call(read_in);
    // TestNG's assertEquals takes (actual, expected).
    Assert.assertEquals(test_i, test_out);
}
Also used: GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead), ContainsKmerReadFilterSpark (org.broadinstitute.hellbender.tools.spark.sv.ContainsKmerReadFilterSpark), JavaSparkContext (org.apache.spark.api.java.JavaSparkContext), BaseTest (org.broadinstitute.hellbender.utils.test.BaseTest), Test (org.testng.annotations.Test)

Aggregations

ContainsKmerReadFilterSpark (org.broadinstitute.hellbender.tools.spark.sv.ContainsKmerReadFilterSpark): 2
Kryo (com.esotericsoftware.kryo.Kryo): 1
Input (com.esotericsoftware.kryo.io.Input): 1
PipelineOptions (com.google.cloud.dataflow.sdk.options.PipelineOptions): 1
JavaSparkContext (org.apache.spark.api.java.JavaSparkContext): 1
SVKmer (org.broadinstitute.hellbender.tools.spark.sv.SVKmer): 1
HopscotchSet (org.broadinstitute.hellbender.tools.spark.utils.HopscotchSet): 1
GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead): 1
BaseTest (org.broadinstitute.hellbender.utils.test.BaseTest): 1
Test (org.testng.annotations.Test): 1