Example 1 with GCSOptions

Use of com.google.cloud.genomics.dataflow.utils.GCSOptions in the gatk project by broadinstitute.

Class PathSeqFilterSpark, method doHostBWA.

private JavaRDD<GATKRead> doHostBWA(final JavaSparkContext ctx, final SAMFileHeader readsHeader, final JavaRDD<GATKRead> reads) {
    final BwaSparkEngine engine = new BwaSparkEngine(ctx, indexImageFile, getHeaderForReads(), getReferenceSequenceDictionary());
    // null if we have no api key
    final GCSOptions gcsOptions = getAuthenticatedGCSOptions();
    final ReferenceMultiSource hostReference = new ReferenceMultiSource(gcsOptions, HOST_REF_PATH, getReferenceWindowFunction());
    final SAMSequenceDictionary hostRefDict = hostReference.getReferenceSequenceDictionary(header.getSequenceDictionary());
    readsHeader.setSequenceDictionary(hostRefDict);
    return engine.align(reads);
}
Also used: ReferenceMultiSource (org.broadinstitute.hellbender.engine.datasources.ReferenceMultiSource), BwaSparkEngine (org.broadinstitute.hellbender.tools.spark.bwa.BwaSparkEngine), SAMSequenceDictionary (htsjdk.samtools.SAMSequenceDictionary), GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions)

Example 2 with GCSOptions

Use of com.google.cloud.genomics.dataflow.utils.GCSOptions in the gatk project by broadinstitute.

Class ApplyBQSRSpark, method runTool.

@Override
protected void runTool(JavaSparkContext ctx) {
    JavaRDD<GATKRead> initialReads = getReads();
    // null if we have no api key
    final GCSOptions gcsOptions = getAuthenticatedGCSOptions();
    Broadcast<RecalibrationReport> recalibrationReportBroadCast = ctx.broadcast(new RecalibrationReport(BucketUtils.openFile(bqsrRecalFile)));
    final JavaRDD<GATKRead> recalibratedReads = ApplyBQSRSparkFn.apply(initialReads, recalibrationReportBroadCast, getHeaderForReads(), applyBQSRArgs);
    writeReads(ctx, output, recalibratedReads);
}
Also used: GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead), RecalibrationReport (org.broadinstitute.hellbender.utils.recalibration.RecalibrationReport), GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions)

Example 3 with GCSOptions

Use of com.google.cloud.genomics.dataflow.utils.GCSOptions in the gatk project by broadinstitute.

Class GATKSparkTool, method initializeReference.

/**
 * Initializes our reference source. Does nothing if no reference was specified.
 */
private void initializeReference() {
    // null if we have no api key
    final GCSOptions gcsOptions = getAuthenticatedGCSOptions();
    final String referenceURL = referenceArguments.getReferenceFileName();
    if (referenceURL != null) {
        referenceSource = new ReferenceMultiSource(gcsOptions, referenceURL, getReferenceWindowFunction());
        referenceDictionary = referenceSource.getReferenceSequenceDictionary(readsHeader != null ? readsHeader.getSequenceDictionary() : null);
        if (referenceDictionary == null) {
            throw new UserException.MissingReferenceDictFile(referenceURL);
        }
    }
}
Also used: ReferenceMultiSource (org.broadinstitute.hellbender.engine.datasources.ReferenceMultiSource), GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions)

Example 4 with GCSOptions

Use of com.google.cloud.genomics.dataflow.utils.GCSOptions in the gatk project by broadinstitute.

Class AuthHolder, method makeStorageClient.

/**
 * @return a Storage.Objects, authenticated using the information held in this object.
 */
public Storage.Objects makeStorageClient() throws IOException, ClassNotFoundException, GeneralSecurityException {
    GCSOptions options = PipelineOptionsFactory.as(GCSOptions.class);
    options.setAppName(appName);
    options.setApiKey(apiKey);
    return GCSOptions.Methods.createStorageClient(options, getOfflineAuth());
}
Also used: GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions)

Example 5 with GCSOptions

Use of com.google.cloud.genomics.dataflow.utils.GCSOptions in the gatk project by broadinstitute.

Class AuthHolder, method asPipelineOptionsDeprecated.

/**
 * @return a GCSOptions object authenticated with apiKey suitable for accessing files in GCS,
 * or null if no apiKey is present. This code completely ignores the secrets file, which is why
 * you shouldn't be using it. Instead, change the calling code to use AuthHolder directly.
 * (not putting @Deprecated because otherwise we don't compile anymore... the pitfall of -Werr)
 */
public GCSOptions asPipelineOptionsDeprecated() {
    if (apiKey == null) {
        return null;
    }
    GCSOptions options = PipelineOptionsFactory.as(GCSOptions.class);
    options.setApiKey(apiKey);
    return options;
}
Also used: GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions)
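
The Javadoc in Example 5 documents a nullable return: the method yields null when no apiKey is present, so every caller has to handle that case. The following is a minimal, self-contained sketch of one way a caller might make the "no key" case explicit. It is not gatk code: StubOptions is a hypothetical stand-in for GCSOptions (the real interface needs the google-genomics-dataflow dependency), and the names NullableOptionsSketch, asOptionsOrNull, asOptions, and describe are assumptions made for illustration only.

```java
import java.util.Optional;

// Illustrative sketch only; none of these names exist in gatk.
final class NullableOptionsSketch {

    // Hypothetical stand-in for GCSOptions, holding just an API key.
    static final class StubOptions {
        private final String apiKey;

        StubOptions(String apiKey) {
            this.apiKey = apiKey;
        }

        String getApiKey() {
            return apiKey;
        }
    }

    // Mirrors the documented contract: returns null when no API key is present.
    static StubOptions asOptionsOrNull(String apiKey) {
        if (apiKey == null) {
            return null;
        }
        return new StubOptions(apiKey);
    }

    // Wraps the nullable result so the "no key" case is visible in the type.
    static Optional<StubOptions> asOptions(String apiKey) {
        return Optional.ofNullable(asOptionsOrNull(apiKey));
    }

    // Example caller that handles both cases without a raw null check.
    static String describe(String apiKey) {
        return asOptions(apiKey)
                .map(options -> "API key configured")
                .orElse("no API key configured");
    }

    public static void main(String[] args) {
        System.out.println(describe("example-key")); // prints "API key configured"
        System.out.println(describe(null));          // prints "no API key configured"
    }
}
```

Wrapping the result in Optional is only one possible choice; a plain null check at the call site, as implied by the "null if we have no api key" comments in the earlier examples, behaves the same way.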

Aggregations

GCSOptions (com.google.cloud.genomics.dataflow.utils.GCSOptions): 8
ReferenceMultiSource (org.broadinstitute.hellbender.engine.datasources.ReferenceMultiSource): 2
Genomics (com.google.api.services.genomics.Genomics): 1
SAMSequenceDictionary (htsjdk.samtools.SAMSequenceDictionary): 1
IOException (java.io.IOException): 1
UserException (org.broadinstitute.hellbender.exceptions.UserException): 1
BwaSparkEngine (org.broadinstitute.hellbender.tools.spark.bwa.BwaSparkEngine): 1
GATKGCSOptions (org.broadinstitute.hellbender.utils.gcs.GATKGCSOptions): 1
GATKRead (org.broadinstitute.hellbender.utils.read.GATKRead): 1
RecalibrationReport (org.broadinstitute.hellbender.utils.recalibration.RecalibrationReport): 1
ReferenceBases (org.broadinstitute.hellbender.utils.reference.ReferenceBases): 1