
Example 11 with OutputUpload

Use of com.hartwig.pipeline.execution.vm.OutputUpload in project pipeline5 by hartwigmedical.

From the class RnaIsofoxUnmapped, method execute:

@Override
public VirtualMachineJobDefinition execute(InputBundle inputs, RuntimeBucket bucket, BashStartupScript startupScript, RuntimeFiles executionFlags) {
    InputFileDescriptor descriptor = inputs.get();
    final String batchInputs = descriptor.inputValue();
    final String[] batchItems = batchInputs.split(",");
    if (batchItems.length < 2) {
        System.out.print(String.format("invalid input arguments(%s) - expected SampleId,ReadLength", batchInputs));
        return null;
    }
    final String sampleId = batchItems[COL_SAMPLE_ID];
    final RefGenomeVersion refGenomeVersion = V37;
    final ResourceFiles resourceFiles = buildResourceFiles(refGenomeVersion);
    final String samplesDir = String.format("%s/%s", getRnaCohortDirectory(refGenomeVersion), "samples");
    // copy down BAM and index file for this sample
    final String bamFile = String.format("%s%s", sampleId, RNA_BAM_FILE_ID);
    startupScript.addCommand(() -> format("gsutil -u hmf-crunch cp %s/%s/%s %s", samplesDir, sampleId, bamFile, VmDirectories.INPUT));
    final String bamIndexFile = String.format("%s%s", sampleId, RNA_BAM_INDEX_FILE_ID);
    startupScript.addCommand(() -> format("gsutil -u hmf-crunch cp %s/%s/%s %s", samplesDir, sampleId, bamIndexFile, VmDirectories.INPUT));
    // copy down the executable
    startupScript.addCommand(() -> format("gsutil -u hmf-crunch cp %s/%s %s", ISOFOX_LOCATION, ISOFOX_JAR, VmDirectories.TOOLS));
    startupScript.addCommand(() -> format("cd %s", VmDirectories.OUTPUT));
    // run Isofox
    StringJoiner isofoxArgs = new StringJoiner(" ");
    isofoxArgs.add(String.format("-sample %s", sampleId));
    isofoxArgs.add(String.format("-functions UNMAPPED_READS"));
    isofoxArgs.add(String.format("-output_dir %s/", VmDirectories.OUTPUT));
    isofoxArgs.add(String.format("-bam_file %s/%s", VmDirectories.INPUT, bamFile));
    isofoxArgs.add(String.format("-ref_genome %s", resourceFiles.refGenomeFile()));
    isofoxArgs.add(String.format("-ensembl_data_dir %s", resourceFiles.ensemblDataCache()));
    final String threadCount = Bash.allCpus();
    isofoxArgs.add(String.format("-threads %s", threadCount));
    startupScript.addCommand(() -> format("java -jar %s/%s %s", VmDirectories.TOOLS, ISOFOX_JAR, isofoxArgs.toString()));
    // upload the results
    startupScript.addCommand(new OutputUpload(GoogleStorageLocation.of(bucket.name(), "isofox"), executionFlags));
    return ImmutableVirtualMachineJobDefinition.builder().name("rna-isofox").startupCommand(startupScript).namespacedResults(ResultsDirectory.defaultDirectory()).workingDiskSpaceGb(MAX_EXPECTED_BAM_SIZE_GB).build();
}
Also used: ResourceFilesFactory.buildResourceFiles (com.hartwig.pipeline.resource.ResourceFilesFactory.buildResourceFiles), ResourceFiles (com.hartwig.pipeline.resource.ResourceFiles), OutputUpload (com.hartwig.pipeline.execution.vm.OutputUpload), InputFileDescriptor (com.hartwig.batch.input.InputFileDescriptor), RefGenomeVersion (com.hartwig.pipeline.resource.RefGenomeVersion), StringJoiner (java.util.StringJoiner)
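
The argument-assembly pattern above relies only on java.util.StringJoiner, so it can be sketched in isolation. A minimal, self-contained sketch in the same style as the Isofox invocation; the sample name and paths below are illustrative placeholders, not values from the pipeline:

```java
import java.util.StringJoiner;

public class IsofoxArgsSketch {

    // Builds a space-separated argument string in the same style as the
    // Isofox invocation above; all values passed in are placeholders.
    static String buildArgs(String sampleId, String outputDir, String bamFile) {
        StringJoiner args = new StringJoiner(" ");
        args.add(String.format("-sample %s", sampleId));
        args.add("-functions UNMAPPED_READS");
        args.add(String.format("-output_dir %s/", outputDir));
        args.add(String.format("-bam_file %s", bamFile));
        return args.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildArgs("SAMPLE_A", "/data/output", "/data/input/SAMPLE_A.bam"));
    }
}
```

StringJoiner keeps each flag self-contained, so flags can be added conditionally without worrying about separator placement.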

Example 12 with OutputUpload

Use of com.hartwig.pipeline.execution.vm.OutputUpload in project pipeline5 by hartwigmedical.

From the class AmberRerunTumorOnly, method execute:

@Override
public VirtualMachineJobDefinition execute(final InputBundle inputs, final RuntimeBucket runtimeBucket, final BashStartupScript commands, final RuntimeFiles executionFlags) {
    // Inputs
    final String set = inputs.get("set").inputValue();
    final String tumorSampleName = inputs.get("tumor_sample").inputValue();
    final InputFileDescriptor remoteTumorFile = inputs.get("tumor_cram");
    final InputFileDescriptor remoteTumorIndex = remoteTumorFile.index();
    final String localTumorFile = localFilename(remoteTumorFile);
    // Download tumor
    commands.addCommand(() -> remoteTumorFile.toCommandForm(localTumorFile));
    commands.addCommand(() -> remoteTumorIndex.toCommandForm(localFilename(remoteTumorIndex)));
    final ResourceFiles resourceFiles = ResourceFilesFactory.buildResourceFiles(RefGenomeVersion.V37);
    commands.addCommand(() -> AmberCommandBuilder.newBuilder(resourceFiles).tumor(tumorSampleName, localTumorFile).build().asBash());
    // Store output
    final GoogleStorageLocation archiveStorageLocation = amberArchiveDirectory(set);
    commands.addCommand(new CopyLogToOutput(executionFlags.log(), "run.log"));
    commands.addCommand(new OutputUpload(archiveStorageLocation));
    return VirtualMachineJobDefinition.amber(commands, ResultsDirectory.defaultDirectory());
}
Also used: ResourceFiles (com.hartwig.pipeline.resource.ResourceFiles), OutputUpload (com.hartwig.pipeline.execution.vm.OutputUpload), InputFileDescriptor (com.hartwig.batch.input.InputFileDescriptor), CopyLogToOutput (com.hartwig.pipeline.execution.vm.CopyLogToOutput), GoogleStorageLocation (com.hartwig.pipeline.storage.GoogleStorageLocation)
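
These batch stages all share the same three-phase shape: download inputs, run the tool, upload results, with commands executed in the order they were added. A stripped-down sketch of that ordering, using plain strings in place of the pipeline's BashCommand implementations; the command text is illustrative, not the real AMBER invocation:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchScriptSketch {
    private final List<String> commands = new ArrayList<>();

    // Mirrors BashStartupScript.addCommand: commands run in insertion order.
    BatchScriptSketch add(String command) {
        commands.add(command);
        return this;
    }

    List<String> commands() {
        return commands;
    }

    public static void main(String[] args) {
        BatchScriptSketch script = new BatchScriptSketch();
        // Phase 1: download inputs (placeholder gsutil command)
        script.add("gsutil cp gs://bucket/tumor.cram /data/input/tumor.cram");
        // Phase 2: run the tool (placeholder invocation)
        script.add("java -jar amber.jar -tumor SAMPLE_A");
        // Phase 3: upload everything in the output directory (what OutputUpload does)
        script.add("gsutil -m cp -r /data/output/* gs://bucket/results/");
        System.out.println(script.commands().size());
    }
}
```

Keeping the upload as the last command is what makes OutputUpload safe to append unconditionally: by the time it runs, every earlier phase has either produced its files or failed the script.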

Example 13 with OutputUpload

Use of com.hartwig.pipeline.execution.vm.OutputUpload in project pipeline5 by hartwigmedical.

From the class Bam2Fastq, method execute:

@Override
public VirtualMachineJobDefinition execute(InputBundle inputs, RuntimeBucket bucket, BashStartupScript startupScript, RuntimeFiles executionFlags) {
    InputFileDescriptor descriptor = inputs.get();
    String localCopyOfBam = format("%s/%s", VmDirectories.INPUT, new File(descriptor.inputValue()).getName());
    startupScript.addCommand(() -> descriptor.toCommandForm(localCopyOfBam));
    startupScript.addCommand(new PipeCommands(new SambambaCommand("view", "-H", localCopyOfBam), () -> "grep ^@RG", () -> "grep -cP \"_L00[1-8]_\""));
    List<String> picargs = ImmutableList.of("SamToFastq", "ODIR=" + VmDirectories.OUTPUT, "OPRG=true", "RGT=ID", "NON_PF=true", "RC=true", "I=" + localCopyOfBam);
    startupScript.addCommand(new JavaJarCommand("picard", "2.18.27", "picard.jar", "16G", picargs));
    startupScript.addCommand(() -> format("rename 's/(.+)_(.+)_(.+)_(.+)_(.+)__(.+)\\.fastq/$1_$2_$3_$4_R$6_$5.fastq/' %s/*.fastq", VmDirectories.OUTPUT));
    startupScript.addCommand(() -> format("pigz %s/*.fastq", VmDirectories.OUTPUT));
    startupScript.addCommand(new OutputUpload(GoogleStorageLocation.of(bucket.name(), "bam2fastq"), executionFlags));
    return ImmutableVirtualMachineJobDefinition.builder().name("bam2fastq").startupCommand(startupScript).namespacedResults(ResultsDirectory.defaultDirectory()).workingDiskSpaceGb(1800).performanceProfile(VirtualMachinePerformanceProfile.custom(4, 20)).build();
}
Also used: PipeCommands (com.hartwig.pipeline.execution.vm.unix.PipeCommands), OutputUpload (com.hartwig.pipeline.execution.vm.OutputUpload), InputFileDescriptor (com.hartwig.batch.input.InputFileDescriptor), SambambaCommand (com.hartwig.pipeline.execution.vm.SambambaCommand), JavaJarCommand (com.hartwig.pipeline.execution.vm.java.JavaJarCommand), File (java.io.File)
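
The rename step above reorders FASTQ filename fields with a Perl-style substitution: five underscore-separated fields, a double underscore, then the read number before ".fastq". The same capture-group reordering can be reproduced with java.util.regex; the sample filename below is invented for illustration:

```java
import java.util.regex.Pattern;

public class FastqRenameSketch {

    // Same pattern as the `rename` command above: five underscore-separated
    // fields, a literal double underscore, then the read number.
    private static final Pattern FASTQ =
            Pattern.compile("(.+)_(.+)_(.+)_(.+)_(.+)__(.+)\\.fastq");

    // Moves the read number forward as "R<n>" and shifts field 5 after it.
    static String rename(String filename) {
        return FASTQ.matcher(filename).replaceAll("$1_$2_$3_$4_R$6_$5.fastq");
    }

    public static void main(String[] args) {
        System.out.println(rename("CPCT02_FR123_S1_L001_001__1.fastq"));
        // -> CPCT02_FR123_S1_L001_R1_001.fastq
    }
}
```

With exactly five single underscores before the "__", the greedy groups have only one consistent split, so the rewrite is deterministic for names in the expected layout.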

Example 14 with OutputUpload

Use of com.hartwig.pipeline.execution.vm.OutputUpload in project pipeline5 by hartwigmedical.

From the class CobaltMigration, method execute:

@Override
public VirtualMachineJobDefinition execute(final InputBundle inputs, final RuntimeBucket runtimeBucket, final BashStartupScript commands, final RuntimeFiles executionFlags) {
    // Inputs
    final String set = inputs.get("set").inputValue();
    final String tumorSampleName = inputs.get("tumor_sample").inputValue();
    final String referenceSampleName = inputs.get("ref_sample").inputValue();
    final GoogleStorageLocation remoteInputDirectory = cobaltArchiveDirectoryInput(set);
    // Download old files
    commands.addCommand(() -> copyInputCommand(remoteInputDirectory));
    final ResourceFiles resourceFiles = ResourceFilesFactory.buildResourceFiles(RefGenomeVersion.V37);
    commands.addCommand(() -> new CobaltMigrationCommand(resourceFiles, referenceSampleName, tumorSampleName).asBash());
    // Store output
    final GoogleStorageLocation archiveStorageLocation = cobaltArchiveDirectoryOutput(set);
    commands.addCommand(new CopyLogToOutput(executionFlags.log(), "run.log"));
    commands.addCommand(new OutputUpload(archiveStorageLocation));
    return VirtualMachineJobDefinition.cobalt(commands, ResultsDirectory.defaultDirectory());
}
Also used: ResourceFiles (com.hartwig.pipeline.resource.ResourceFiles), OutputUpload (com.hartwig.pipeline.execution.vm.OutputUpload), CobaltMigrationCommand (com.hartwig.pipeline.tertiary.cobalt.CobaltMigrationCommand), CopyLogToOutput (com.hartwig.pipeline.execution.vm.CopyLogToOutput), GoogleStorageLocation (com.hartwig.pipeline.storage.GoogleStorageLocation)

Example 15 with OutputUpload

Use of com.hartwig.pipeline.execution.vm.OutputUpload in project pipeline5 by hartwigmedical.

From the class StageRunner, method run:

public <T extends StageOutput> T run(final M metadata, final Stage<T, M> stage) {
    final List<BashCommand> commands = commands(mode, metadata, stage);
    if (stage.shouldRun(arguments) && !commands.isEmpty()) {
        if (!startingPoint.usePersisted(stage.namespace())) {
            StageTrace trace = new StageTrace(stage.namespace(), metadata.name(), StageTrace.ExecutorType.COMPUTE_ENGINE);
            RuntimeBucket bucket = RuntimeBucket.from(storage, stage.namespace(), metadata, arguments, labels);
            BashStartupScript bash = BashStartupScript.of(bucket.name());
            bash.addCommands(stage.inputs()).addCommands(OverrideReferenceGenomeCommand.overrides(arguments)).addCommands(commands).addCommand(new OutputUpload(GoogleStorageLocation.of(bucket.name(), resultsDirectory.path()), RuntimeFiles.typical()));
            PipelineStatus status = Failsafe.with(DefaultBackoffPolicy.of(String.format("[%s] stage [%s]", metadata.name(), stage.namespace()))).get(() -> computeEngine.submit(bucket, stage.vmDefinition(bash, resultsDirectory)));
            trace.stop();
            return stage.output(metadata, status, bucket, resultsDirectory);
        }
        return stage.persistedOutput(metadata);
    }
    return stage.skippedOutput(metadata);
}
Also used: PipelineStatus (com.hartwig.pipeline.execution.PipelineStatus), StageTrace (com.hartwig.pipeline.trace.StageTrace), BashStartupScript (com.hartwig.pipeline.execution.vm.BashStartupScript), OutputUpload (com.hartwig.pipeline.execution.vm.OutputUpload), BashCommand (com.hartwig.pipeline.execution.vm.BashCommand), RuntimeBucket (com.hartwig.pipeline.storage.RuntimeBucket)
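
StageRunner.run chooses between three outcomes: skipped (stage disabled or no commands), persisted (the starting point says reuse prior output), or a fresh compute run that ends with an OutputUpload. That decision flow can be sketched without any pipeline classes; all types below are simplified stand-ins for illustration:

```java
public class StageDecisionSketch {

    enum Outcome { SKIPPED, PERSISTED, EXECUTED }

    // Same ordering as StageRunner.run: only a runnable stage with commands
    // proceeds, and persisted output short-circuits a fresh VM submission.
    static Outcome decide(boolean shouldRun, boolean hasCommands, boolean usePersisted) {
        if (shouldRun && hasCommands) {
            if (!usePersisted) {
                return Outcome.EXECUTED;
            }
            return Outcome.PERSISTED;
        }
        return Outcome.SKIPPED;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true, false));  // fresh compute run
        System.out.println(decide(true, true, true));   // reuse persisted output
        System.out.println(decide(false, true, false)); // stage disabled
    }
}
```

Note that only the EXECUTED branch builds a startup script, so the OutputUpload command is appended exactly once per fresh run and never for skipped or persisted stages.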

Aggregations

OutputUpload (com.hartwig.pipeline.execution.vm.OutputUpload) - 40 usages
InputFileDescriptor (com.hartwig.batch.input.InputFileDescriptor) - 35 usages
ResourceFiles (com.hartwig.pipeline.resource.ResourceFiles) - 24 usages
StringJoiner (java.util.StringJoiner) - 12 usages
GoogleStorageLocation (com.hartwig.pipeline.storage.GoogleStorageLocation) - 9 usages
RemoteLocationsApi (com.hartwig.batch.api.RemoteLocationsApi) - 7 usages
CopyLogToOutput (com.hartwig.pipeline.execution.vm.CopyLogToOutput) - 6 usages
VersionedToolCommand (com.hartwig.pipeline.calling.command.VersionedToolCommand) - 5 usages
RefGenomeVersion (com.hartwig.pipeline.resource.RefGenomeVersion) - 5 usages
ResourceFilesFactory.buildResourceFiles (com.hartwig.pipeline.resource.ResourceFilesFactory.buildResourceFiles) - 5 usages
SubStageInputOutput (com.hartwig.pipeline.stages.SubStageInputOutput) - 5 usages
File (java.io.File) - 5 usages
BwaCommand (com.hartwig.pipeline.calling.command.BwaCommand) - 3 usages
SamtoolsCommand (com.hartwig.pipeline.calling.command.SamtoolsCommand) - 3 usages
InputDownload (com.hartwig.pipeline.execution.vm.InputDownload) - 3 usages
OutputFile (com.hartwig.pipeline.execution.vm.OutputFile) - 3 usages
SageApplication (com.hartwig.pipeline.calling.sage.SageApplication) - 2 usages
SageCommandBuilder (com.hartwig.pipeline.calling.sage.SageCommandBuilder) - 2 usages
GridssAnnotation (com.hartwig.pipeline.calling.structural.gridss.stage.GridssAnnotation) - 2 usages
PipelineStatus (com.hartwig.pipeline.execution.PipelineStatus) - 2 usages