Example 6 with DataDeletionRequest

Use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

In class DatasetIntegrationTest, method testSoftDeleteHappyPath.

@Test
public void testSoftDeleteHappyPath() throws Exception {
    datasetId = ingestedDataset();
    // get row ids
    DatasetModel dataset = dataRepoFixtures.getDataset(steward(), datasetId);
    BigQuery bigQuery = BigQueryFixtures.getBigQuery(dataset.getDataProject(), stewardToken);
    List<String> participantRowIds = getRowIds(bigQuery, dataset, "participant", 3L);
    List<String> sampleRowIds = getRowIds(bigQuery, dataset, "sample", 2L);
    // write them to GCS
    String participantPath = writeListToScratch("softDel", participantRowIds);
    String samplePath = writeListToScratch("softDel", sampleRowIds);
    // build the deletion request with pointers to the two files with row ids to soft delete
    List<DataDeletionTableModel> dataDeletionTableModels = Arrays.asList(
        deletionTableFile("participant", participantPath),
        deletionTableFile("sample", samplePath));
    DataDeletionRequest request = dataDeletionRequest().tables(dataDeletionTableModels);
    // send off the soft delete request
    dataRepoFixtures.deleteData(steward(), datasetId, request);
    // make sure the new counts make sense
    assertTableCount(bigQuery, dataset, "participant", 2L);
    assertTableCount(bigQuery, dataset, "sample", 5L);
}
Also used : BigQuery(com.google.cloud.bigquery.BigQuery) DataDeletionRequest(bio.terra.model.DataDeletionRequest) EnumerateDatasetModel(bio.terra.model.EnumerateDatasetModel) DatasetModel(bio.terra.model.DatasetModel) DataDeletionTableModel(bio.terra.model.DataDeletionTableModel) SpringBootTest(org.springframework.boot.test.context.SpringBootTest) Test(org.junit.Test)

Example 7 with DataDeletionRequest

Use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

In class DataDeletionStep, method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    Dataset dataset = getDataset(context, datasetService);
    String suffix = getSuffix(context);
    DataDeletionRequest dataDeletionRequest = getRequest(context);
    List<String> tableNames = dataDeletionRequest.getTables().stream()
        .map(DataDeletionTableModel::getTableName)
        .collect(Collectors.toList());
    bigQueryPdao.validateDeleteRequest(dataset, dataDeletionRequest.getTables(), suffix);
    if (configService.testInsertFault(ConfigEnum.SOFT_DELETE_LOCK_CONFLICT_STOP_FAULT)) {
        logger.info("SOFT_DELETE_LOCK_CONFLICT_STOP_FAULT");
        while (!configService.testInsertFault(ConfigEnum.SOFT_DELETE_LOCK_CONFLICT_CONTINUE_FAULT)) {
            logger.info("Sleeping for CONTINUE FAULT");
            TimeUnit.SECONDS.sleep(5);
        }
        logger.info("SOFT_DELETE_LOCK_CONFLICT_CONTINUE_FAULT");
    }
    bigQueryPdao.applySoftDeletes(dataset, tableNames, suffix);
    // TODO: this can be more informative, something like # rows deleted per table, or mismatched row ids
    DeleteResponseModel deleteResponseModel = new DeleteResponseModel().objectState(DeleteResponseModel.ObjectStateEnum.DELETED);
    FlightUtils.setResponse(context, deleteResponseModel, HttpStatus.OK);
    return StepResult.getStepResultSuccess();
}
Also used : DataDeletionUtils.getDataset(bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset) Dataset(bio.terra.service.dataset.Dataset) DataDeletionRequest(bio.terra.model.DataDeletionRequest) DeleteResponseModel(bio.terra.model.DeleteResponseModel)
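
The fault-injection loop in doStep polls a flag every five seconds with no upper bound. A minimal sketch of a bounded variant follows, in case the wait should give up eventually. The class PollSketch and the method waitUntil are hypothetical names introduced for illustration; they are not part of jade-data-repo, and the real step may deliberately wait indefinitely.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of jade-data-repo: poll a condition with a deadline. */
final class PollSketch {

    private PollSketch() {}

    /**
     * Polls condition every interval until it returns true or timeout elapses.
     * Returns true if the condition was observed, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration interval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the condition became true
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // The condition becomes true roughly 30 ms after start.
        boolean seen = waitUntil(
            () -> System.currentTimeMillis() - start >= 30,
            Duration.ofMillis(5),
            Duration.ofSeconds(1));
        System.out.println("condition observed: " + seen);
    }
}
```

Returning a boolean rather than throwing leaves the caller free to decide whether a timeout is a failure or just a reason to continue.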

Example 8 with DataDeletionRequest

Use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

In class DatasetConnectedTest, method uploadInputFileAndBuildSoftDeleteRequest.

private DataDeletionRequest uploadInputFileAndBuildSoftDeleteRequest(String dirInCloud, String filenameInCloud, String tableName, List<String> softDeleteRowIds) throws Exception {
    Storage storage = StorageOptions.getDefaultInstance().getService();
    // load a CSV file that contains the table rows to soft delete into the test bucket
    StringBuilder csvLines = new StringBuilder();
    for (String softDeleteRowId : softDeleteRowIds) {
        csvLines.append(softDeleteRowId).append("\n");
    }
    BlobInfo softDeleteBlob = BlobInfo.newBuilder(testConfig.getIngestbucket(), dirInCloud + "/" + filenameInCloud).build();
    storage.create(softDeleteBlob, csvLines.toString().getBytes(StandardCharsets.UTF_8));
    String softDeleteInputFilePath = "gs://" + testConfig.getIngestbucket() + "/" + dirInCloud + "/" + filenameInCloud;
    // make sure the CSV file gets cleaned up on test teardown
    connectedOperations.addScratchFile(dirInCloud + "/" + filenameInCloud);
    // build the soft delete request with a pointer to a file that contains the row ids to soft delete
    DataDeletionGcsFileModel softDeleteGcsFileModel = new DataDeletionGcsFileModel()
        .fileType(DataDeletionGcsFileModel.FileTypeEnum.CSV)
        .path(softDeleteInputFilePath);
    DataDeletionTableModel softDeleteTableModel = new DataDeletionTableModel()
        .tableName(tableName)
        .gcsFileSpec(softDeleteGcsFileModel);
    DataDeletionRequest softDeleteRequest = new DataDeletionRequest()
        .deleteType(DataDeletionRequest.DeleteTypeEnum.SOFT)
        .specType(DataDeletionRequest.SpecTypeEnum.GCSFILE)
        .tables(Arrays.asList(softDeleteTableModel));
    return softDeleteRequest;
}
Also used : Storage(com.google.cloud.storage.Storage) DataDeletionRequest(bio.terra.model.DataDeletionRequest) DataDeletionGcsFileModel(bio.terra.model.DataDeletionGcsFileModel) CoreMatchers.containsString(org.hamcrest.CoreMatchers.containsString) BlobInfo(com.google.cloud.storage.BlobInfo) DataDeletionTableModel(bio.terra.model.DataDeletionTableModel)
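
The method above produces two plain values before it touches any cloud API: the file body (one row id per line) and the gs:// path that the request points at. The sketch below isolates those two pieces so they can be checked without a bucket. SoftDeleteCsvSketch, toCsvBytes, and gcsUri are hypothetical names introduced here, and the bucket and directory in main are made-up values.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper, not part of jade-data-repo. */
final class SoftDeleteCsvSketch {

    private SoftDeleteCsvSketch() {}

    /** Builds the file body: one row id per line, each line terminated by '\n', as UTF-8. */
    static byte[] toCsvBytes(List<String> rowIds) {
        StringBuilder csvLines = new StringBuilder();
        for (String rowId : rowIds) {
            csvLines.append(rowId).append('\n');
        }
        return csvLines.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Builds the gs:// path for an object stored at dirInCloud/filenameInCloud in bucket. */
    static String gcsUri(String bucket, String dirInCloud, String filenameInCloud) {
        return "gs://" + bucket + "/" + dirInCloud + "/" + filenameInCloud;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        byte[] body = toCsvBytes(List.of("row-1", "row-2"));
        System.out.print(new String(body, StandardCharsets.UTF_8));
        System.out.println(gcsUri("my-bucket", "scratch/demo", "softDelete.csv"));
    }
}
```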

Example 9 with DataDeletionRequest

Use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

In class DatasetConnectedTest, method testRepeatedSoftDelete.

@Test
public void testRepeatedSoftDelete() throws Exception {
    // upload a CSV file of table rows to the test bucket so it can be ingested
    String resourceFileName = "snapshot-test-dataset-data.csv";
    String dirInCloud = "scratch/testRepeatedSoftDelete/" + UUID.randomUUID().toString();
    BlobInfo ingestTableBlob = BlobInfo.newBuilder(testConfig.getIngestbucket(), dirInCloud + "/" + resourceFileName).build();
    Storage storage = StorageOptions.getDefaultInstance().getService();
    storage.create(ingestTableBlob, IOUtils.toByteArray(getClass().getClassLoader().getResource(resourceFileName)));
    String tableIngestInputFilePath = "gs://" + testConfig.getIngestbucket() + "/" + dirInCloud + "/" + resourceFileName;
    // ingest the table
    String tableName = "thetable";
    IngestRequestModel ingestRequest = new IngestRequestModel().table(tableName).format(IngestRequestModel.FormatEnum.CSV).csvSkipLeadingRows(1).path(tableIngestInputFilePath);
    connectedOperations.ingestTableSuccess(summaryModel.getId(), ingestRequest);
    // make sure the CSV file gets cleaned up on test teardown
    connectedOperations.addScratchFile(dirInCloud + "/" + resourceFileName);
    // load a CSV file that contains the table rows to soft delete into the test bucket
    String softDeleteRowId = "8c52c63e-8d9f-4cfc-82d0-0f916b2404c1";
    List<String> softDeleteRowIds = new ArrayList<>();
    // add the same rowid twice
    softDeleteRowIds.add(softDeleteRowId);
    softDeleteRowIds.add(softDeleteRowId);
    DataDeletionRequest softDeleteRequest = uploadInputFileAndBuildSoftDeleteRequest(dirInCloud, "testRepeatedSoftDelete.csv", tableName, softDeleteRowIds);
    // make the soft delete request and wait for it to return
    connectedOperations.softDeleteSuccess(summaryModel.getId(), softDeleteRequest);
    // check that the size of the live table matches what we expect
    List<String> liveTableRowIds1 = getRowIdsFromBQTable(summaryModel.getName(), tableName);
    assertEquals("Size of live table is 3", 3, liveTableRowIds1.size());
    assertFalse("Soft deleted row id is not in live table", liveTableRowIds1.contains(softDeleteRowId));
    // note: the soft delete table name is not exposed to end users, so to check that the state of the
    // soft delete table is correct, I'm reaching into our internals to fetch the table name
    Dataset internalDatasetObj = datasetDao.retrieve(UUID.fromString(summaryModel.getId()));
    DatasetTable internalDatasetTableObj = internalDatasetObj.getTableByName(tableName).get();
    String internalSoftDeleteTableName = internalDatasetTableObj.getSoftDeleteTableName();
    // check that the size of the soft delete table matches what we expect
    List<String> softDeleteRowIds1 = getRowIdsFromBQTable(summaryModel.getName(), internalSoftDeleteTableName);
    assertEquals("Size of soft delete table is 1", 1, softDeleteRowIds1.size());
    assertTrue("Soft deleted row id is in soft delete table", softDeleteRowIds1.contains(softDeleteRowId));
    // repeat the same soft delete request and wait for it to return
    connectedOperations.softDeleteSuccess(summaryModel.getId(), softDeleteRequest);
    // check that the size of the live table has not changed
    List<String> liveTableRowIds2 = getRowIdsFromBQTable(summaryModel.getName(), tableName);
    assertEquals("Size of live table is still 3", 3, liveTableRowIds2.size());
    assertFalse("Soft deleted row id is still not in live table", liveTableRowIds2.contains(softDeleteRowId));
    // check that the size of the soft delete table has not changed
    List<String> softDeleteRowIds2 = getRowIdsFromBQTable(summaryModel.getName(), internalSoftDeleteTableName);
    assertEquals("Size of soft delete table is still 1", 1, softDeleteRowIds2.size());
    assertTrue("Soft deleted row id is still in soft delete table", softDeleteRowIds2.contains(softDeleteRowId));
    // delete the dataset and check that it succeeds
    connectedOperations.deleteTestDataset(summaryModel.getId());
    // try to fetch the dataset again and confirm nothing is returned
    connectedOperations.getDatasetExpectError(summaryModel.getId(), HttpStatus.NOT_FOUND);
}
Also used : Storage(com.google.cloud.storage.Storage) DataDeletionRequest(bio.terra.model.DataDeletionRequest) ArrayList(java.util.ArrayList) CoreMatchers.containsString(org.hamcrest.CoreMatchers.containsString) BlobInfo(com.google.cloud.storage.BlobInfo) IngestRequestModel(bio.terra.model.IngestRequestModel) SpringBootTest(org.springframework.boot.test.context.SpringBootTest) Test(org.junit.Test)
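
The assertions in testRepeatedSoftDelete describe idempotent behavior: listing a row id twice, or sending the same request twice, leaves one entry in the soft delete table and the live table at three rows. A minimal in-memory sketch that reproduces those counts is shown below. It assumes set semantics for the soft delete table, which is an assumption made for illustration; the real tables live in BigQuery, and SoftDeleteModelSketch is a hypothetical class, not part of jade-data-repo.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model of a raw table plus its soft delete table. */
final class SoftDeleteModelSketch {
    private final Set<String> rawRowIds = new LinkedHashSet<>();
    private final Set<String> softDeletedRowIds = new LinkedHashSet<>();

    SoftDeleteModelSketch(List<String> rowIds) {
        rawRowIds.addAll(rowIds);
    }

    /** Records row ids as soft deleted; duplicates and repeated requests collapse in the set. */
    void softDelete(List<String> rowIds) {
        softDeletedRowIds.addAll(rowIds);
    }

    /** Live rows are the raw rows that have not been soft deleted. */
    Set<String> liveRowIds() {
        Set<String> live = new LinkedHashSet<>(rawRowIds);
        live.removeAll(softDeletedRowIds);
        return live;
    }

    int softDeleteTableSize() {
        return softDeletedRowIds.size();
    }

    public static void main(String[] args) {
        SoftDeleteModelSketch table = new SoftDeleteModelSketch(List.of("r1", "r2", "r3", "r4"));
        table.softDelete(List.of("r1", "r1")); // same row id listed twice
        table.softDelete(List.of("r1", "r1")); // same request repeated
        // prints "3 live, 1 soft deleted"
        System.out.println(table.liveRowIds().size() + " live, "
            + table.softDeleteTableSize() + " soft deleted");
    }
}
```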

Example 10 with DataDeletionRequest

Use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

In class DatasetConnectedTest, method testBadSoftDelete.

@Test
public void testBadSoftDelete() throws Exception {
    // upload a CSV file of table rows to the test bucket so it can be ingested
    String resourceFileName = "snapshot-test-dataset-data.csv";
    String dirInCloud = "scratch/testBadSoftDelete/" + UUID.randomUUID().toString();
    BlobInfo ingestTableBlob = BlobInfo.newBuilder(testConfig.getIngestbucket(), dirInCloud + "/" + resourceFileName).build();
    Storage storage = StorageOptions.getDefaultInstance().getService();
    storage.create(ingestTableBlob, IOUtils.toByteArray(getClass().getClassLoader().getResource(resourceFileName)));
    String tableIngestInputFilePath = "gs://" + testConfig.getIngestbucket() + "/" + dirInCloud + "/" + resourceFileName;
    // ingest the table
    String tableName = "thetable";
    IngestRequestModel ingestRequest = new IngestRequestModel().table(tableName).format(IngestRequestModel.FormatEnum.CSV).csvSkipLeadingRows(1).path(tableIngestInputFilePath);
    connectedOperations.ingestTableSuccess(summaryModel.getId(), ingestRequest);
    // make sure the CSV file gets cleaned up on test teardown
    connectedOperations.addScratchFile(dirInCloud + "/" + resourceFileName);
    // load a CSV file that contains the table rows to soft delete into the test bucket
    String softDeleteBadRowId = "badrowid";
    String softDeleteGoodRowId = "8c52c63e-8d9f-4cfc-82d0-0f916b2404c1";
    List<String> softDeleteRowIds = new ArrayList<>();
    softDeleteRowIds.add(softDeleteBadRowId);
    softDeleteRowIds.add(softDeleteGoodRowId);
    DataDeletionRequest softDeleteRequest = uploadInputFileAndBuildSoftDeleteRequest(dirInCloud, "testBadSoftDelete.csv", tableName, softDeleteRowIds);
    // make the soft delete request and wait for it to return
    MvcResult softDeleteResult = connectedOperations.softDeleteRaw(summaryModel.getId(), softDeleteRequest);
    MockHttpServletResponse softDeleteResponse = connectedOperations.validateJobModelAndWait(softDeleteResult);
    assertEquals("soft delete of bad row id failed", HttpStatus.BAD_REQUEST.value(), softDeleteResponse.getStatus());
    // check that the size of the live table matches what we expect
    List<String> liveTableRowIds = getRowIdsFromBQTable(summaryModel.getName(), tableName);
    assertEquals("Size of live table is 4", 4, liveTableRowIds.size());
    assertFalse("Bad row id is not in live table", liveTableRowIds.contains(softDeleteBadRowId));
    assertTrue("Good row id is in live table", liveTableRowIds.contains(softDeleteGoodRowId));
    // note: the soft delete table name is not exposed to end users, so to check that the state of the
    // soft delete table is correct, I'm reaching into our internals to fetch the table name
    Dataset internalDatasetObj = datasetDao.retrieve(UUID.fromString(summaryModel.getId()));
    DatasetTable internalDatasetTableObj = internalDatasetObj.getTableByName(tableName).get();
    String internalSoftDeleteTableName = internalDatasetTableObj.getSoftDeleteTableName();
    // check that the size of the soft delete table matches what we expect
    List<String> softDeleteRowIdsFromBQ = getRowIdsFromBQTable(summaryModel.getName(), internalSoftDeleteTableName);
    assertEquals("Size of soft delete table is 0", 0, softDeleteRowIdsFromBQ.size());
    assertFalse("Bad row id is not in soft delete table", softDeleteRowIdsFromBQ.contains(softDeleteBadRowId));
    assertFalse("Good row id is not in soft delete table", softDeleteRowIdsFromBQ.contains(softDeleteGoodRowId));
    // delete the dataset and check that it succeeds
    connectedOperations.deleteTestDataset(summaryModel.getId());
    // try to fetch the dataset again and confirm nothing is returned
    connectedOperations.getDatasetExpectError(summaryModel.getId(), HttpStatus.NOT_FOUND);
}
Also used : Storage(com.google.cloud.storage.Storage) DataDeletionRequest(bio.terra.model.DataDeletionRequest) ArrayList(java.util.ArrayList) CoreMatchers.containsString(org.hamcrest.CoreMatchers.containsString) BlobInfo(com.google.cloud.storage.BlobInfo) IngestRequestModel(bio.terra.model.IngestRequestModel) MvcResult(org.springframework.test.web.servlet.MvcResult) MockHttpServletResponse(org.springframework.mock.web.MockHttpServletResponse) SpringBootTest(org.springframework.boot.test.context.SpringBootTest) Test(org.junit.Test)
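
testBadSoftDelete expects all-or-nothing behavior: one unknown row id fails the request, the good row id stays in the live table, and the soft delete table stays empty. That matches a validate-then-apply ordering. A minimal in-memory sketch of that ordering follows; ValidateThenApplySketch and softDeleteIfAllKnown are hypothetical names, and the sets stand in for tables that really live in BigQuery.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model of validate-then-apply; not the BigQuery implementation. */
final class ValidateThenApplySketch {

    private ValidateThenApplySketch() {}

    /**
     * Adds the requested row ids to softDeleted only if every one of them exists in rawRowIds.
     * Returns false, leaving softDeleted untouched, when any requested id is unknown.
     */
    static boolean softDeleteIfAllKnown(
            Set<String> rawRowIds, Set<String> softDeleted, List<String> requested) {
        // Validation pass: reject the whole request on the first unknown row id.
        for (String rowId : requested) {
            if (!rawRowIds.contains(rowId)) {
                return false;
            }
        }
        // Apply pass: only reached when the request is fully valid.
        softDeleted.addAll(requested);
        return true;
    }

    public static void main(String[] args) {
        Set<String> raw = Set.of("good-1", "good-2", "good-3", "good-4");
        Set<String> softDeleted = new HashSet<>();
        boolean applied = softDeleteIfAllKnown(raw, softDeleted, List.of("badrowid", "good-1"));
        // prints "false 0": the request is rejected and nothing is soft deleted
        System.out.println(applied + " " + softDeleted.size());
    }
}
```

Running the validation pass to completion before mutating anything is what keeps the good row id out of the soft delete set when the request is rejected.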

Aggregations

DataDeletionRequest (bio.terra.model.DataDeletionRequest): 11
Test (org.junit.Test): 6
SpringBootTest (org.springframework.boot.test.context.SpringBootTest): 6
DataDeletionTableModel (bio.terra.model.DataDeletionTableModel): 5
BlobInfo (com.google.cloud.storage.BlobInfo): 4
Storage (com.google.cloud.storage.Storage): 4
CoreMatchers.containsString (org.hamcrest.CoreMatchers.containsString): 4
DatasetModel (bio.terra.model.DatasetModel): 3
EnumerateDatasetModel (bio.terra.model.EnumerateDatasetModel): 3
IngestRequestModel (bio.terra.model.IngestRequestModel): 3
Dataset (bio.terra.service.dataset.Dataset): 3
DataDeletionUtils.getDataset (bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset): 3
BigQuery (com.google.cloud.bigquery.BigQuery): 3
ArrayList (java.util.ArrayList): 2
MockHttpServletResponse (org.springframework.mock.web.MockHttpServletResponse): 2
MvcResult (org.springframework.test.web.servlet.MvcResult): 2
DataDeletionGcsFileModel (bio.terra.model.DataDeletionGcsFileModel): 1
DeleteResponseModel (bio.terra.model.DeleteResponseModel): 1
SnapshotModel (bio.terra.model.SnapshotModel): 1
SnapshotRequestModel (bio.terra.model.SnapshotRequestModel): 1