Example 1 with VariantFilter

Use of org.broadinstitute.hellbender.engine.filters.VariantFilter in the gatk project by broadinstitute.

Class VariantWalkerBase, method traverse.

/**
 * Implementation of variant-based traversal.
 * Subclasses can override to provide their own behavior, but the default implementation should be suitable for most uses.
 */
@Override
public void traverse() {
    final VariantFilter variantFilter = makeVariantFilter();
    final CountingReadFilter readFilter = makeReadFilter();
    // Process each variant in the input stream.
    StreamSupport.stream(getSpliteratorForDrivingVariants(), false).filter(variantFilter).forEach(variant -> {
        final SimpleInterval variantInterval = new SimpleInterval(variant);
        apply(variant, new ReadsContext(reads, variantInterval, readFilter), new ReferenceContext(reference, variantInterval), new FeatureContext(features, variantInterval));
        progressMeter.update(variantInterval);
    });
}
Also used: CountingReadFilter (org.broadinstitute.hellbender.engine.filters.CountingReadFilter), VariantFilter (org.broadinstitute.hellbender.engine.filters.VariantFilter), SimpleInterval (org.broadinstitute.hellbender.utils.SimpleInterval)
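
The traversal above follows a common pattern: turn a spliterator into a sequential stream, drop the records a predicate rejects, and apply a callback to the rest. The sketch below reproduces only that pattern with plain JDK types so it can run without GATK on the classpath. The Variant class, the traverse helper, and the "passing" flag are hypothetical stand-ins introduced for illustration; they are not htsjdk's VariantContext or GATK's VariantFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.StreamSupport;

public class VariantTraversalSketch {

    /** Hypothetical stand-in for a variant record; NOT the htsjdk VariantContext. */
    static final class Variant {
        final String contig;
        final int start;
        final boolean passing;

        Variant(String contig, int start, boolean passing) {
            this.contig = contig;
            this.start = start;
            this.passing = passing;
        }
    }

    /**
     * Streams the variants sequentially (parallel = false), keeps only those
     * accepted by the filter, and records the position of each visited variant.
     */
    static List<String> traverse(Iterable<Variant> variants, Predicate<Variant> filter) {
        final List<String> visited = new ArrayList<>();
        StreamSupport.stream(variants.spliterator(), false)
                .filter(filter)
                .forEach(v -> visited.add(v.contig + ":" + v.start));
        return visited;
    }

    public static void main(String[] args) {
        List<Variant> input = Arrays.asList(
                new Variant("chr1", 100, true),
                new Variant("chr1", 200, false),
                new Variant("chr2", 50, true));
        // Only the variants flagged as passing reach the callback.
        System.out.println(traverse(input, v -> v.passing)); // prints [chr1:100, chr2:50]
    }
}
```

Passing false as the second argument to StreamSupport.stream keeps the stream sequential, so the callback sees the variants in input order, which matters when the callback updates shared state such as a progress meter.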

Example 2 with VariantFilter

Use of org.broadinstitute.hellbender.engine.filters.VariantFilter in the gatk project by broadinstitute.

Class VariantWalkerSpark, method getVariants.

/**
 * Loads variants and the corresponding reads, reference and features into a {@link JavaRDD} for the intervals specified.
 * For the current implementation the reads context will always be empty.
 *
 * If no intervals were specified, returns all the variants.
 *
 * @return all variants as a {@link JavaRDD}, bounded by intervals if specified.
 */
public JavaRDD<VariantWalkerContext> getVariants(JavaSparkContext ctx) {
    SAMSequenceDictionary sequenceDictionary = getBestAvailableSequenceDictionary();
    List<SimpleInterval> intervals = hasIntervals() ? getIntervals() : IntervalUtils.getAllIntervalsForReference(sequenceDictionary);
    // use unpadded shards (padding is only needed for reference bases)
    final List<ShardBoundary> intervalShards = intervals.stream().flatMap(interval -> Shard.divideIntervalIntoShards(interval, variantShardSize, 0, sequenceDictionary).stream()).collect(Collectors.toList());
    JavaRDD<VariantContext> variants = variantsSource.getParallelVariantContexts(drivingVariantFile, getIntervals());
    VariantFilter variantFilter = makeVariantFilter();
    variants = variants.filter(variantFilter::test);
    JavaRDD<Shard<VariantContext>> shardedVariants = SparkSharder.shard(ctx, variants, VariantContext.class, sequenceDictionary, intervalShards, variantShardSize, shuffle);
    Broadcast<ReferenceMultiSource> bReferenceSource = hasReference() ? ctx.broadcast(getReference()) : null;
    Broadcast<FeatureManager> bFeatureManager = features == null ? null : ctx.broadcast(features);
    return shardedVariants.flatMap(getVariantsFunction(bReferenceSource, bFeatureManager, sequenceDictionary, variantShardPadding));
}
Also used: Broadcast (org.apache.spark.broadcast.Broadcast), VCFHeader (htsjdk.variant.vcf.VCFHeader), ReferenceMultiSource (org.broadinstitute.hellbender.engine.datasources.ReferenceMultiSource), SAMSequenceDictionary (htsjdk.samtools.SAMSequenceDictionary), Argument (org.broadinstitute.barclay.argparser.Argument), IndexUtils (org.broadinstitute.hellbender.utils.IndexUtils), JavaSparkContext (org.apache.spark.api.java.JavaSparkContext), VariantFilterLibrary (org.broadinstitute.hellbender.engine.filters.VariantFilterLibrary), StandardArgumentDefinitions (org.broadinstitute.hellbender.cmdline.StandardArgumentDefinitions), SimpleInterval (org.broadinstitute.hellbender.utils.SimpleInterval), Collectors (java.util.stream.Collectors), VariantFilter (org.broadinstitute.hellbender.engine.filters.VariantFilter), org.broadinstitute.hellbender.engine (org.broadinstitute.hellbender.engine), List (java.util.List), IntervalUtils (org.broadinstitute.hellbender.utils.IntervalUtils), VariantContext (htsjdk.variant.variantcontext.VariantContext), VariantsSparkSource (org.broadinstitute.hellbender.engine.spark.datasources.VariantsSparkSource), StreamSupport (java.util.stream.StreamSupport), JavaRDD (org.apache.spark.api.java.JavaRDD), FlatMapFunction (org.apache.spark.api.java.function.FlatMapFunction)
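
The method above cuts each interval into unpadded shards of at most variantShardSize bases before distributing the variants. As a rough illustration of that idea only, the sketch below splits a closed, 1-based interval into consecutive fixed-size pieces. The ShardSketch class and its divide method are hypothetical; they ignore padding and the sequence dictionary and should not be read as the behavior of GATK's Shard.divideIntervalIntoShards.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardSketch {

    /**
     * Splits the closed interval [start, end] into consecutive pieces of at most
     * shardSize positions. Each piece is returned as {pieceStart, pieceEnd}.
     * Hypothetical helper: no padding, no contig or dictionary handling.
     */
    static List<int[]> divide(int start, int end, int shardSize) {
        if (shardSize < 1 || start > end) {
            throw new IllegalArgumentException("need shardSize >= 1 and start <= end");
        }
        final List<int[]> shards = new ArrayList<>();
        // A long loop variable avoids int overflow when end is near Integer.MAX_VALUE.
        for (long s = start; s <= end; s += shardSize) {
            final int shardEnd = (int) Math.min(s + shardSize - 1, end);
            shards.add(new int[] {(int) s, shardEnd});
        }
        return shards;
    }

    public static void main(String[] args) {
        // Positions 1..250 with a shard size of 100: the last shard is shorter.
        for (int[] shard : divide(1, 250, 100)) {
            System.out.println(shard[0] + "-" + shard[1]);
        }
        // prints 1-100, 101-200 and 201-250, one per line
    }
}
```

Because the pieces do not overlap, every position falls in exactly one of them, which is the property an unpadded split provides.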

Aggregations

VariantFilter (org.broadinstitute.hellbender.engine.filters.VariantFilter): 2
SimpleInterval (org.broadinstitute.hellbender.utils.SimpleInterval): 2
SAMSequenceDictionary (htsjdk.samtools.SAMSequenceDictionary): 1
VariantContext (htsjdk.variant.variantcontext.VariantContext): 1
VCFHeader (htsjdk.variant.vcf.VCFHeader): 1
List (java.util.List): 1
Collectors (java.util.stream.Collectors): 1
StreamSupport (java.util.stream.StreamSupport): 1
JavaRDD (org.apache.spark.api.java.JavaRDD): 1
JavaSparkContext (org.apache.spark.api.java.JavaSparkContext): 1
FlatMapFunction (org.apache.spark.api.java.function.FlatMapFunction): 1
Broadcast (org.apache.spark.broadcast.Broadcast): 1
Argument (org.broadinstitute.barclay.argparser.Argument): 1
StandardArgumentDefinitions (org.broadinstitute.hellbender.cmdline.StandardArgumentDefinitions): 1
org.broadinstitute.hellbender.engine (org.broadinstitute.hellbender.engine): 1
ReferenceMultiSource (org.broadinstitute.hellbender.engine.datasources.ReferenceMultiSource): 1
CountingReadFilter (org.broadinstitute.hellbender.engine.filters.CountingReadFilter): 1
VariantFilterLibrary (org.broadinstitute.hellbender.engine.filters.VariantFilterLibrary): 1
VariantsSparkSource (org.broadinstitute.hellbender.engine.spark.datasources.VariantsSparkSource): 1
IndexUtils (org.broadinstitute.hellbender.utils.IndexUtils): 1