Example 1 with FileLoadModel

Use of bio.terra.model.FileLoadModel in the jade-data-repo project by DataBiosphere.

From the class IngestRequestValidator, method validate:

@Override
@SuppressFBWarnings(value = "UC_USELESS_VOID_METHOD", justification = "FB mistake - this clearly validates and returns data in errors")
public void validate(@NotNull Object target, Errors errors) {
    if (target instanceof IngestRequestModel) {
        IngestRequestModel ingestRequest = (IngestRequestModel) target;
        validateTableName(ingestRequest.getTable(), errors);
    } else if (target instanceof FileLoadModel) {
        FileLoadModel fileLoadModel = (FileLoadModel) target;
        if (fileLoadModel.getProfileId() == null) {
            errors.rejectValue("profileId", "ProfileIdMissing", "File ingest requires a profile id.");
        }
    }
}
Also used: IngestRequestModel (bio.terra.model.IngestRequestModel), FileLoadModel (bio.terra.model.FileLoadModel), SuppressFBWarnings (edu.umd.cs.findbugs.annotations.SuppressFBWarnings)
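The FileLoadModel branch above simply rejects a request with no profile id. That logic can be exercised without Spring; the sketch below is a minimal, hypothetical distillation in which FileLoadStub stands in for bio.terra.model.FileLoadModel and a returned list of error codes stands in for Spring's Errors.rejectValue calls:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only: FileLoadStub mimics
// bio.terra.model.FileLoadModel, and the returned list of codes mimics
// the error codes that Errors.rejectValue would record.
public class ValidatorSketch {
    static class FileLoadStub {
        final String profileId;
        FileLoadStub(String profileId) { this.profileId = profileId; }
    }

    // Mirrors the FileLoadModel branch of IngestRequestValidator.validate.
    static List<String> validate(Object target) {
        List<String> errorCodes = new ArrayList<>();
        if (target instanceof FileLoadStub) {
            FileLoadStub model = (FileLoadStub) target;
            if (model.profileId == null) {
                errorCodes.add("ProfileIdMissing"); // rejectValue("profileId", ...)
            }
        }
        return errorCodes;
    }
}
```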

Example 2 with FileLoadModel

Use of bio.terra.model.FileLoadModel in the jade-data-repo project by DataBiosphere.

From the class IngestDriverStep, method launchLoads:

private void launchLoads(FlightContext context, int launchCount, List<LoadFile> loadFiles, String profileId, UUID loadId, GoogleBucketResource bucketInfo) throws DatabaseOperationException, StairwayExecutionException, InterruptedException {
    Stairway stairway = context.getStairway();
    for (int i = 0; i < launchCount; i++) {
        LoadFile loadFile = loadFiles.get(i);
        String flightId = stairway.createFlightId();
        FileLoadModel fileLoadModel = new FileLoadModel()
            .sourcePath(loadFile.getSourcePath())
            .targetPath(loadFile.getTargetPath())
            .mimeType(loadFile.getMimeType())
            .profileId(profileId)
            .loadTag(loadTag)
            .description(loadFile.getDescription());
        FlightMap inputParameters = new FlightMap();
        inputParameters.put(FileMapKeys.DATASET_ID, datasetId);
        inputParameters.put(FileMapKeys.REQUEST, fileLoadModel);
        inputParameters.put(FileMapKeys.BUCKET_INFO, bucketInfo);
        loadService.setLoadFileRunning(loadId, loadFile.getTargetPath(), flightId);
        // NOTE: this is the window where we have recorded a flight as RUNNING in the load_file
        // table, but it has not yet been launched. A failure in this window leaves "orphan"
        // loads that are marked running, but not actually started. We handle this
        // with the check for launch orphans at the beginning of the do() method.
        // We use submitToQueue to spread the file loaders across multiple instances of datarepo.
        stairway.submitToQueue(flightId, FileIngestWorkerFlight.class, inputParameters);
    }
}
Also used: Stairway (bio.terra.stairway.Stairway), LoadFile (bio.terra.service.load.LoadFile), FlightMap (bio.terra.stairway.FlightMap), FileLoadModel (bio.terra.model.FileLoadModel)
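The comment in launchLoads describes a window where a file is recorded as RUNNING before its flight is actually submitted, so a crash in between leaves an "orphan". A toy, self-contained model of that window and the orphan check is sketched below; all names here are invented for illustration, and the real bookkeeping lives in LoadService (setLoadFileRunning) and Stairway (submitToQueue):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the RUNNING-before-launch window described above.
public class OrphanSketch {
    // targetPath -> flightId recorded as RUNNING in the load_file table
    final Map<String, String> running = new HashMap<>();
    // flights actually submitted to the Stairway queue
    final Set<String> launched = new HashSet<>();

    void launchLoad(String targetPath, String flightId, boolean crashBeforeSubmit) {
        running.put(targetPath, flightId);   // step 1: record as RUNNING
        if (crashBeforeSubmit) {
            return;                          // failure inside the window: orphan
        }
        launched.add(flightId);              // step 2: submit the flight
    }

    // The "check for launch orphans at the beginning of the do() method":
    // files marked RUNNING whose flight was never actually submitted.
    List<String> findOrphans() {
        List<String> orphans = new ArrayList<>();
        for (Map.Entry<String, String> entry : running.entrySet()) {
            if (!launched.contains(entry.getValue())) {
                orphans.add(entry.getKey());
            }
        }
        return orphans;
    }
}
```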

Example 3 with FileLoadModel

Use of bio.terra.model.FileLoadModel in the jade-data-repo project by DataBiosphere.

From the class IngestFileDirectoryStep, method doStep:

@Override
public StepResult doStep(FlightContext context) {
    FlightMap inputParameters = context.getInputParameters();
    FileLoadModel loadModel = inputParameters.get(JobMapKeys.REQUEST.getKeyName(), FileLoadModel.class);
    FlightMap workingMap = context.getWorkingMap();
    String fileId = workingMap.get(FileMapKeys.FILE_ID, String.class);
    workingMap.put(FileMapKeys.LOAD_COMPLETED, false);
    String datasetId = dataset.getId().toString();
    String targetPath = loadModel.getTargetPath();
    try {
        // The state logic goes like this:
        // 1. the directory entry doesn't exist. We need to create the directory entry for it.
        // 2. the directory entry exists. There are three cases:
        // a. If loadTags do not match, then we throw FileAlreadyExistsException.
        // b. directory entry loadTag matches our loadTag AND entry fileId matches our fileId:
        // means we are recovering and need to complete the file creation work.
        // c. directory entry loadTag matches our loadTag AND entry fileId does NOT match our fileId
        // means this is a re-run of a load job. We update the fileId in the working map. We don't
        // know if we are recovering or already finished. We try to retrieve the file object for
        // the entry fileId:
        // i. If that is successful, then we already loaded this file. We store "completed=true"
        // in the working map, so other steps do nothing.
        // ii. If that fails, then we are recovering: we leave completed unset (=false) in the working map.
        // 
        // Lookup the file - on a recovery, we may have already created it, but not
        // finished. Or it might already exist, created by someone else.
        FireStoreDirectoryEntry existingEntry = fileDao.lookupDirectoryEntryByPath(dataset, targetPath);
        if (existingEntry == null) {
            // Not there - create it
            FireStoreDirectoryEntry newEntry = new FireStoreDirectoryEntry()
                .fileId(fileId)
                .isFileRef(true)
                .path(fireStoreUtils.getDirectoryPath(loadModel.getTargetPath()))
                .name(fireStoreUtils.getName(loadModel.getTargetPath()))
                .datasetId(datasetId)
                .loadTag(loadModel.getLoadTag());
            fileDao.createDirectoryEntry(dataset, newEntry);
        } else {
            if (!StringUtils.equals(existingEntry.getLoadTag(), loadModel.getLoadTag())) {
                // (a) Exists and is not our file
                throw new FileAlreadyExistsException("Path already exists: " + targetPath);
            }
            // (b) or (c) Load tags match - check file ids
            if (!StringUtils.equals(existingEntry.getFileId(), fileId)) {
                // (c) We are in a re-run of a load job. Try to get the file entry.
                fileId = existingEntry.getFileId();
                workingMap.put(FileMapKeys.FILE_ID, fileId);
                FireStoreFile fileEntry = fileDao.lookupFile(dataset, fileId);
                if (fileEntry != null) {
                    // (c)(i) We successfully loaded this file already
                    workingMap.put(FileMapKeys.LOAD_COMPLETED, true);
                }
            // (c)(ii) We are recovering and should continue this load; leave load completed false/unset
            }
        }
    } catch (FileSystemAbortTransactionException rex) {
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, rex);
    }
    return StepResult.getStepResultSuccess();
}
Also used: FireStoreFile (bio.terra.service.filedata.google.firestore.FireStoreFile), FileAlreadyExistsException (bio.terra.service.filedata.exception.FileAlreadyExistsException), FlightMap (bio.terra.stairway.FlightMap), FileLoadModel (bio.terra.model.FileLoadModel), FileSystemAbortTransactionException (bio.terra.service.filedata.exception.FileSystemAbortTransactionException), StepResult (bio.terra.stairway.StepResult), FireStoreDirectoryEntry (bio.terra.service.filedata.google.firestore.FireStoreDirectoryEntry)
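The state logic in doStep is a small decision table over the existing directory entry, the load tags, the file ids, and whether the file object already exists. A side-effect-free, hypothetical distillation of that table (the Outcome names are invented for this sketch; the inputs stand in for the FireStore lookups) could look like:

```java
// Hypothetical distillation of the case analysis in the doStep comment block.
public class DirectoryStateSketch {
    enum Outcome { CREATE_ENTRY, CONFLICT, RECOVER_SAME_ID, ALREADY_LOADED, RECOVER_RERUN }

    static Outcome decide(String existingLoadTag, String existingFileId,
                          String ourLoadTag, String ourFileId,
                          boolean fileObjectExists) {
        if (existingLoadTag == null) {
            return Outcome.CREATE_ENTRY;       // 1: no directory entry; create it
        }
        if (!existingLoadTag.equals(ourLoadTag)) {
            return Outcome.CONFLICT;           // 2a: throw FileAlreadyExistsException
        }
        if (existingFileId.equals(ourFileId)) {
            return Outcome.RECOVER_SAME_ID;    // 2b: finish our own file creation
        }
        return fileObjectExists
            ? Outcome.ALREADY_LOADED           // 2c(i): store completed=true
            : Outcome.RECOVER_RERUN;           // 2c(ii): leave completed unset
    }
}
```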

Example 4 with FileLoadModel

Use of bio.terra.model.FileLoadModel in the jade-data-repo project by DataBiosphere.

From the class DatasetConnectedTest, method testExcludeLockedFromFileLookups:

@Test
public void testExcludeLockedFromFileLookups() throws Exception {
    // check that the dataset metadata row is unlocked
    UUID datasetId = UUID.fromString(summaryModel.getId());
    String exclusiveLock = datasetDao.getExclusiveLock(datasetId);
    assertNull("dataset row is not exclusively locked", exclusiveLock);
    String[] sharedLocks = datasetDao.getSharedLocks(datasetId);
    assertEquals("dataset row has no shared lock", 0, sharedLocks.length);
    // ingest a file
    URI sourceUri = new URI("gs", "jade-testdata", "/fileloadprofiletest/1KBfile.txt", null, null);
    String targetPath1 = "/mm/" + Names.randomizeName("testdir") + "/testExcludeLockedFromFileLookups.txt";
    FileLoadModel fileLoadModel = new FileLoadModel()
        .sourcePath(sourceUri.toString())
        .description("testExcludeLockedFromFileLookups")
        .mimeType("text/plain")
        .targetPath(targetPath1)
        .profileId(billingProfile.getId());
    FileModel fileModel = connectedOperations.ingestFileSuccess(summaryModel.getId(), fileLoadModel);
    // lookup the file by id and check that it's found
    FileModel fileModelFromIdLookup = connectedOperations.lookupFileSuccess(summaryModel.getId(), fileModel.getFileId());
    assertEquals("File found by id lookup", fileModel.getDescription(), fileModelFromIdLookup.getDescription());
    // lookup the file by path and check that it's found
    FileModel fileModelFromPathLookup = connectedOperations.lookupFileByPathSuccess(summaryModel.getId(), fileModel.getPath(), -1);
    assertEquals("File found by path lookup", fileModel.getDescription(), fileModelFromPathLookup.getDescription());
    // NO ASSERTS inside the block below where hang is enabled to reduce chance of failing before disabling the hang
    // ====================================================
    // enable hang in DeleteDatasetPrimaryDataStep
    configService.setFault(ConfigEnum.DATASET_DELETE_LOCK_CONFLICT_STOP_FAULT.name(), true);
    // kick off a request to delete the dataset. this should hang before unlocking the dataset object.
    MvcResult deleteResult = mvc.perform(delete("/api/repository/v1/datasets/" + summaryModel.getId())).andReturn();
    // give the flight time to launch
    TimeUnit.SECONDS.sleep(5);
    // check that the dataset metadata row has an exclusive lock
    // note: asserts are below outside the hang block
    exclusiveLock = datasetDao.getExclusiveLock(datasetId);
    sharedLocks = datasetDao.getSharedLocks(datasetId);
    // lookup the file by id and check that it's NOT found
    // note: asserts are below outside the hang block
    MockHttpServletResponse lookupFileByIdResponse = connectedOperations.lookupFileRaw(summaryModel.getId(), fileModel.getFileId());
    // lookup the file by path and check that it's NOT found
    // note: asserts are below outside the hang block
    MockHttpServletResponse lookupFileByPathResponse = connectedOperations.lookupFileByPathRaw(summaryModel.getId(), fileModel.getPath(), -1);
    // disable hang in DeleteDatasetPrimaryDataStep
    configService.setFault(ConfigEnum.DATASET_DELETE_LOCK_CONFLICT_CONTINUE_FAULT.name(), true);
    // ====================================================
    // check that the dataset metadata row has an exclusive lock after kicking off the delete
    assertNotNull("dataset row is exclusively locked", exclusiveLock);
    assertEquals("dataset row has no shared lock", 0, sharedLocks.length);
    // check that the lookup file by id returned not found
    assertEquals("File NOT found by id lookup", HttpStatus.NOT_FOUND, HttpStatus.valueOf(lookupFileByIdResponse.getStatus()));
    // check that the lookup file by path returned not found
    assertEquals("File NOT found by path lookup", HttpStatus.NOT_FOUND, HttpStatus.valueOf(lookupFileByPathResponse.getStatus()));
    // check the response from the delete request
    MockHttpServletResponse deleteResponse = connectedOperations.validateJobModelAndWait(deleteResult);
    DeleteResponseModel deleteResponseModel = connectedOperations.handleSuccessCase(deleteResponse, DeleteResponseModel.class);
    assertEquals("Dataset delete returned successfully", DeleteResponseModel.ObjectStateEnum.DELETED, deleteResponseModel.getObjectState());
    // remove the file from the connectedoperation bookkeeping list
    connectedOperations.removeFile(summaryModel.getId(), fileModel.getFileId());
    // try to fetch the dataset again and confirm nothing is returned
    connectedOperations.getDatasetExpectError(summaryModel.getId(), HttpStatus.NOT_FOUND);
}
Also used: DataDeletionGcsFileModel (bio.terra.model.DataDeletionGcsFileModel), FileModel (bio.terra.model.FileModel), CoreMatchers.containsString (org.hamcrest.CoreMatchers.containsString), UUID (java.util.UUID), FileLoadModel (bio.terra.model.FileLoadModel), MvcResult (org.springframework.test.web.servlet.MvcResult), URI (java.net.URI), MockHttpServletResponse (org.springframework.mock.web.MockHttpServletResponse), DeleteResponseModel (bio.terra.model.DeleteResponseModel), SpringBootTest (org.springframework.boot.test.context.SpringBootTest), Test (org.junit.Test)
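The behavior this test verifies is that, while the dataset row holds an exclusive lock, file lookups return 404 even though the file still exists. A small in-memory model of that exclusion rule is sketched below; the class, method names, and the raw HTTP-status return value are all invented for illustration and do not mirror the real DAO API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical in-memory model of "exclude locked from file lookups".
public class LockLookupSketch {
    final Map<UUID, String> exclusiveLocks = new HashMap<>(); // datasetId -> flightId
    final Map<UUID, Set<String>> files = new HashMap<>();     // datasetId -> fileIds

    void addFile(UUID datasetId, String fileId) {
        files.computeIfAbsent(datasetId, k -> new HashSet<>()).add(fileId);
    }

    void takeExclusiveLock(UUID datasetId, String flightId) {
        exclusiveLocks.put(datasetId, flightId);
    }

    // Lookups answer 404 while the dataset row is exclusively locked,
    // even though the file entry itself has not been deleted yet.
    int lookupFileStatus(UUID datasetId, String fileId) {
        if (exclusiveLocks.containsKey(datasetId)) {
            return 404;
        }
        Set<String> datasetFiles = files.get(datasetId);
        return (datasetFiles != null && datasetFiles.contains(fileId)) ? 200 : 404;
    }
}
```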

Example 5 with FileLoadModel

Use of bio.terra.model.FileLoadModel in the jade-data-repo project by DataBiosphere.

From the class FileOperationTest, method retryAndFailAcquireSharedLock:

@Test
public void retryAndFailAcquireSharedLock() throws Exception {
    FileLoadModel fileLoadModel = makeFileLoad(profileModel.getId());
    connectedOperations.retryAcquireLockIngestFileSuccess(false, datasetSummary.getId(), fileLoadModel, configService, datasetDao);
}
Also used: FileLoadModel (bio.terra.model.FileLoadModel), SpringBootTest (org.springframework.boot.test.context.SpringBootTest), Test (org.junit.Test)

Aggregations

FileLoadModel (bio.terra.model.FileLoadModel): 16 usages
CoreMatchers.containsString (org.hamcrest.CoreMatchers.containsString): 7
Test (org.junit.Test): 7
SpringBootTest (org.springframework.boot.test.context.SpringBootTest): 7
FileModel (bio.terra.model.FileModel): 6
URI (java.net.URI): 5
MockHttpServletResponse (org.springframework.mock.web.MockHttpServletResponse): 5
MvcResult (org.springframework.test.web.servlet.MvcResult): 5
FlightMap (bio.terra.stairway.FlightMap): 4
DataDeletionGcsFileModel (bio.terra.model.DataDeletionGcsFileModel): 3
ErrorModel (bio.terra.model.ErrorModel): 3
UUID (java.util.UUID): 3
DeleteResponseModel (bio.terra.model.DeleteResponseModel): 2
IngestRequestModel (bio.terra.model.IngestRequestModel): 2
FSFileInfo (bio.terra.service.filedata.FSFileInfo): 2
FileSystemAbortTransactionException (bio.terra.service.filedata.exception.FileSystemAbortTransactionException): 2
FireStoreFile (bio.terra.service.filedata.google.firestore.FireStoreFile): 2
StepResult (bio.terra.stairway.StepResult): 2
BlobInfo (com.google.cloud.storage.BlobInfo): 2
BulkLoadFileModel (bio.terra.model.BulkLoadFileModel): 1