Example 1 with DataDeletionRequest

use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

the class DataDeletionRequestValidator method validate.

@Override
public void validate(@NotNull Object target, Errors errors) {
    if (target instanceof DataDeletionRequest) {
        DataDeletionRequest dataDeletionRequest = (DataDeletionRequest) target;
        dataDeletionRequest.getTables().forEach(table -> validateFileSpec(table, errors));
    }
}
Also used : DataDeletionRequest(bio.terra.model.DataDeletionRequest)
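The validator above delegates each table entry to validateFileSpec, whose body is not shown here. As a rough sketch of that per-table pattern, the following uses hypothetical stand-ins: TableSpec replaces the generated table model, and a plain list of strings replaces Spring's Errors object. The two rules it checks (path present, path starts with gs://) are assumptions for illustration, not the project's actual checks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one table entry of a deletion request.
final class TableSpec {
    final String tableName;
    final String gcsPath;

    TableSpec(String tableName, String gcsPath) {
        this.tableName = tableName;
        this.gcsPath = gcsPath;
    }
}

class FileSpecValidationSketch {
    // Accumulates one message per problem instead of stopping at the first,
    // which is how a Spring Validator reports into its Errors argument.
    static List<String> validate(List<TableSpec> tables) {
        List<String> errors = new ArrayList<>();
        if (tables == null) {
            errors.add("tables must not be null");
            return errors;
        }
        for (TableSpec table : tables) {
            String path = table.gcsPath;
            if (path == null || path.trim().isEmpty()) {
                // assumed rule: every table needs a file path
                errors.add(table.tableName + ": path is missing");
            } else if (!path.startsWith("gs://")) {
                // assumed rule: only Google Cloud Storage URIs are accepted
                errors.add(table.tableName + ": path must start with gs://");
            }
        }
        return errors;
    }
}
```

Unlike the original validate method, this sketch guards against a null table list before iterating; whether getTables() can return null in the real model is not shown in the source.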

Example 2 with DataDeletionRequest

use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

the class CreateExternalTablesStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    Dataset dataset = getDataset(context, datasetService);
    String suffix = getSuffix(context);
    DataDeletionRequest dataDeletionRequest = getRequest(context);
    validateTablesExistInDataset(dataDeletionRequest, dataset);
    for (DataDeletionTableModel table : dataDeletionRequest.getTables()) {
        String path = table.getGcsFileSpec().getPath();
        // let any exception here trigger an undo, no use trying to continue
        bigQueryPdao.createSoftDeleteExternalTable(dataset, path, table.getTableName(), suffix);
    }
    return StepResult.getStepResultSuccess();
}
Also used : DataDeletionUtils.getDataset(bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset) Dataset(bio.terra.service.dataset.Dataset) DataDeletionRequest(bio.terra.model.DataDeletionRequest) DataDeletionTableModel(bio.terra.model.DataDeletionTableModel)

Example 3 with DataDeletionRequest

use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

the class DropExternalTablesStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    Dataset dataset = getDataset(context, datasetService);
    String suffix = getSuffix(context);
    DataDeletionRequest dataDeletionRequest = getRequest(context);
    for (DataDeletionTableModel table : dataDeletionRequest.getTables()) {
        bigQueryPdao.deleteSoftDeleteExternalTable(dataset, table.getTableName(), suffix);
    }
    return StepResult.getStepResultSuccess();
}
Also used : DataDeletionUtils.getDataset(bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset) Dataset(bio.terra.service.dataset.Dataset) DataDeletionRequest(bio.terra.model.DataDeletionRequest) DataDeletionTableModel(bio.terra.model.DataDeletionTableModel)
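Examples 2 and 3 form a create/drop pair, and the comment in Example 2 relies on the flight framework undoing work when a step throws. The sketch below illustrates that idea with a hypothetical SketchStep interface and a toy runner; it is not the framework's API, and the choice to undo the failing step as well as the completed ones is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical step contract: do some work, and know how to reverse it.
interface SketchStep {
    void doStep(List<String> log) throws Exception;

    void undoStep(List<String> log);
}

class UndoOnFailureSketch {
    // Runs steps in order; on the first exception, undoes every started step
    // in reverse order and reports failure.
    static boolean run(List<SketchStep> steps, List<String> log) {
        Deque<SketchStep> started = new ArrayDeque<>();
        for (SketchStep step : steps) {
            // Push before running so a step that fails part-way is undone too;
            // this means undoStep must tolerate "nothing was created".
            started.push(step);
            try {
                step.doStep(log);
            } catch (Exception e) {
                while (!started.isEmpty()) {
                    started.pop().undoStep(log);
                }
                return false;
            }
        }
        return true;
    }

    // Toy step that "creates" a table by logging it and "drops" it on undo.
    static SketchStep tableStep(final String name, final boolean fail) {
        return new SketchStep() {
            @Override
            public void doStep(List<String> log) throws Exception {
                if (fail) {
                    throw new IllegalStateException("cannot create " + name);
                }
                log.add("create " + name);
            }

            @Override
            public void undoStep(List<String> log) {
                log.add("drop " + name);
            }
        };
    }
}
```

Running tableStep("a", false) followed by tableStep("b", true) leaves the log as create a, drop b, drop a, which is why a drop operation used for undo should be safe to call on a table that was never created.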

Example 4 with DataDeletionRequest

use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

the class DatasetConnectedTest method testConcurrentSoftDeletes.

@Test
public void testConcurrentSoftDeletes() throws Exception {
    // load a CSV file that contains the table rows to load into the test bucket
    String resourceFileName = "snapshot-test-dataset-data.csv";
    String dirInCloud = "scratch/testConcurrentSoftDeletes/" + UUID.randomUUID().toString();
    BlobInfo ingestTableBlob = BlobInfo.newBuilder(testConfig.getIngestbucket(), dirInCloud + "/" + resourceFileName).build();
    Storage storage = StorageOptions.getDefaultInstance().getService();
    storage.create(ingestTableBlob, IOUtils.toByteArray(getClass().getClassLoader().getResource(resourceFileName)));
    String tableIngestInputFilePath = "gs://" + testConfig.getIngestbucket() + "/" + dirInCloud + "/" + resourceFileName;
    // ingest the table
    String tableName = "thetable";
    IngestRequestModel ingestRequest = new IngestRequestModel().table(tableName).format(IngestRequestModel.FormatEnum.CSV).csvSkipLeadingRows(1).path(tableIngestInputFilePath);
    connectedOperations.ingestTableSuccess(summaryModel.getId(), ingestRequest);
    // make sure the CSV file gets cleaned up on test teardown
    connectedOperations.addScratchFile(dirInCloud + "/" + resourceFileName);
    // load CSV file #1 that contains the table rows to soft delete into the test bucket
    String softDeleteRowId1 = "8c52c63e-8d9f-4cfc-82d0-0f916b2404c1";
    DataDeletionRequest softDeleteRequest1 = uploadInputFileAndBuildSoftDeleteRequest(dirInCloud, "testConcurrentSoftDeletes1.csv", tableName, Collections.singletonList(softDeleteRowId1));
    // load CSV file #2 that contains the table rows to soft delete into the test bucket
    String softDeleteRowId2 = "13ae488a-e33f-4ee6-ba30-c1fca4d96b63";
    DataDeletionRequest softDeleteRequest2 = uploadInputFileAndBuildSoftDeleteRequest(dirInCloud, "testConcurrentSoftDeletes2.csv", tableName, Collections.singletonList(softDeleteRowId2));
    // NO ASSERTS inside the block below where hang is enabled to reduce chance of failing before disabling the hang
    // ====================================================
    // enable hang in DataDeletionStep
    configService.setFault(ConfigEnum.SOFT_DELETE_LOCK_CONFLICT_STOP_FAULT.name(), true);
    // kick off the first soft delete request, it should hang just before updating the soft delete table
    MvcResult softDeleteResult1 = connectedOperations.softDeleteRaw(summaryModel.getId(), softDeleteRequest1);
    // give the flight time to launch
    TimeUnit.SECONDS.sleep(5);
    // check that the dataset metadata row has a shared lock
    // note: asserts are below outside the hang block
    UUID datasetId = UUID.fromString(summaryModel.getId());
    String exclusiveLock1 = datasetDao.getExclusiveLock(datasetId);
    String[] sharedLocks1 = datasetDao.getSharedLocks(datasetId);
    // kick off the second soft delete request, it should also hang just before updating the soft delete table
    MvcResult softDeleteResult2 = connectedOperations.softDeleteRaw(summaryModel.getId(), softDeleteRequest2);
    // give the flight time to launch
    TimeUnit.SECONDS.sleep(5);
    // check that the dataset metadata row has two shared locks
    // note: asserts are below outside the hang block
    String exclusiveLock2 = datasetDao.getExclusiveLock(datasetId);
    String[] sharedLocks2 = datasetDao.getSharedLocks(datasetId);
    // disable hang in DataDeletionStep
    configService.setFault(ConfigEnum.SOFT_DELETE_LOCK_CONFLICT_CONTINUE_FAULT.name(), true);
    // ====================================================
    // check that the dataset metadata row has a shared lock after the first soft delete request was kicked off
    assertNull("dataset row has no exclusive lock", exclusiveLock1);
    assertEquals("dataset row has one shared lock", 1, sharedLocks1.length);
    // check that the dataset metadata row has two shared locks after the second soft delete request was kicked off
    assertNull("dataset row has no exclusive lock", exclusiveLock2);
    assertEquals("dataset row has two shared locks", 2, sharedLocks2.length);
    // wait for the first soft delete to finish and check it succeeded
    MockHttpServletResponse softDeleteResponse1 = connectedOperations.validateJobModelAndWait(softDeleteResult1);
    connectedOperations.handleSuccessCase(softDeleteResponse1, DeleteResponseModel.class);
    // wait for the second soft delete to finish and check it succeeded
    MockHttpServletResponse softDeleteResponse2 = connectedOperations.validateJobModelAndWait(softDeleteResult2);
    connectedOperations.handleSuccessCase(softDeleteResponse2, DeleteResponseModel.class);
    // check that the size of the live table matches what we expect
    List<String> liveTableRowIds = getRowIdsFromBQTable(summaryModel.getName(), tableName);
    assertEquals("Size of live table is 2", 2, liveTableRowIds.size());
    assertFalse("Soft deleted row id #1 is not in live table", liveTableRowIds.contains(softDeleteRowId1));
    assertFalse("Soft deleted row id #2 is not in live table", liveTableRowIds.contains(softDeleteRowId2));
    // note: the soft delete table name is not exposed to end users, so to check that the state of the
    // soft delete table is correct, I'm reaching into our internals to fetch the table name
    Dataset internalDatasetObj = datasetDao.retrieve(UUID.fromString(summaryModel.getId()));
    DatasetTable internalDatasetTableObj = internalDatasetObj.getTableByName(tableName).get();
    String internalSoftDeleteTableName = internalDatasetTableObj.getSoftDeleteTableName();
    // check that the size of the soft delete table matches what we expect
    List<String> softDeleteRowIds = getRowIdsFromBQTable(summaryModel.getName(), internalSoftDeleteTableName);
    assertEquals("Size of soft delete table is 2", 2, softDeleteRowIds.size());
    assertTrue("Soft deleted row id #1 is in soft delete table", softDeleteRowIds.contains(softDeleteRowId1));
    assertTrue("Soft deleted row id #2 is in soft delete table", softDeleteRowIds.contains(softDeleteRowId2));
    // delete the dataset and check that it succeeds
    connectedOperations.deleteTestDataset(summaryModel.getId());
    // try to fetch the dataset again and confirm nothing is returned
    connectedOperations.getDatasetExpectError(summaryModel.getId(), HttpStatus.NOT_FOUND);
}
Also used : Storage(com.google.cloud.storage.Storage) DataDeletionRequest(bio.terra.model.DataDeletionRequest) CoreMatchers.containsString(org.hamcrest.CoreMatchers.containsString) BlobInfo(com.google.cloud.storage.BlobInfo) IngestRequestModel(bio.terra.model.IngestRequestModel) MvcResult(org.springframework.test.web.servlet.MvcResult) UUID(java.util.UUID) MockHttpServletResponse(org.springframework.mock.web.MockHttpServletResponse) SpringBootTest(org.springframework.boot.test.context.SpringBootTest) Test(org.junit.Test)
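The test in Example 4 expects two concurrent soft deletes to each hold a shared lock on the dataset row while no exclusive lock is held. The following in-memory sketch shows one way such shared/exclusive semantics can behave. DatasetLockSketch and its lock and unlock methods are hypothetical; the real datasetDao stores its locks in the dataset metadata row and its locking rules are not shown in the source.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class DatasetLockSketch {
    // Flight id holding the exclusive lock, or null when nobody holds it.
    private String exclusiveLock;
    // Flight ids currently holding a shared lock.
    private final Set<String> sharedLocks = new LinkedHashSet<>();

    // Any number of flights may share the dataset, unless it is exclusively locked.
    synchronized boolean lockShared(String flightId) {
        if (exclusiveLock != null) {
            return false;
        }
        sharedLocks.add(flightId);
        return true;
    }

    // An exclusive lock requires that no other flight holds any lock at all.
    synchronized boolean lockExclusive(String flightId) {
        if (exclusiveLock != null && !exclusiveLock.equals(flightId)) {
            return false;
        }
        if (!sharedLocks.isEmpty()) {
            return false;
        }
        exclusiveLock = flightId;
        return true;
    }

    synchronized void unlockShared(String flightId) {
        sharedLocks.remove(flightId);
    }

    synchronized void unlockExclusive(String flightId) {
        if (flightId.equals(exclusiveLock)) {
            exclusiveLock = null;
        }
    }

    synchronized String getExclusiveLock() {
        return exclusiveLock;
    }

    synchronized String[] getSharedLocks() {
        return sharedLocks.toArray(new String[0]);
    }
}
```

With this model, two calls to lockShared with different flight ids both succeed and getSharedLocks() has length 2 while getExclusiveLock() stays null, which matches the state the test asserts after the second soft delete is launched.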

Example 5 with DataDeletionRequest

use of bio.terra.model.DataDeletionRequest in project jade-data-repo by DataBiosphere.

the class DatasetIntegrationTest method wildcardSoftDelete.

@Test
public void wildcardSoftDelete() throws Exception {
    datasetId = ingestedDataset();
    String pathPrefix = "softDelWildcard" + UUID.randomUUID().toString();
    // get 5 row ids, we'll write them out to 5 separate files
    DatasetModel dataset = dataRepoFixtures.getDataset(steward(), datasetId);
    BigQuery bigQuery = BigQueryFixtures.getBigQuery(dataset.getDataProject(), stewardToken);
    List<String> sampleRowIds = getRowIds(bigQuery, dataset, "sample", 5L);
    for (String rowId : sampleRowIds) {
        writeListToScratch(pathPrefix, Collections.singletonList(rowId));
    }
    // make a wildcard path 'gs://<ingestbucket>/scratch/<pathPrefix>/*'
    String wildcardPath = String.format("gs://%s/scratch/%s/*", testConfiguration.getIngestbucket(), pathPrefix);
    // build a request and send it off
    DataDeletionRequest request = dataDeletionRequest().tables(Collections.singletonList(deletionTableFile("sample", wildcardPath)));
    dataRepoFixtures.deleteData(steward(), datasetId, request);
    // there should be (7 - 5) = 2 rows "visible" in the sample table
    assertTableCount(bigQuery, dataset, "sample", 2L);
}
Also used : BigQuery(com.google.cloud.bigquery.BigQuery) DataDeletionRequest(bio.terra.model.DataDeletionRequest) EnumerateDatasetModel(bio.terra.model.EnumerateDatasetModel) DatasetModel(bio.terra.model.DatasetModel) SpringBootTest(org.springframework.boot.test.context.SpringBootTest) Test(org.junit.Test)
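Example 5 writes each row id to its own file under a common prefix and then addresses all of them with a single trailing-wildcard path. The sketch below shows how such a path is built and, in simplified form, which object URIs it would cover. WildcardPathSketch and its select method are hypothetical, and the matching rule (a single trailing * means "any suffix") is a simplification for illustration rather than the storage service's full wildcard semantics.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardPathSketch {
    // Same format string as the test: gs://<bucket>/scratch/<prefix>/*
    static String wildcardPath(String bucket, String pathPrefix) {
        return String.format("gs://%s/scratch/%s/*", bucket, pathPrefix);
    }

    // Simplified matcher: supports exactly one '*' at the end of the pattern.
    static List<String> select(String wildcard, List<String> uris) {
        if (!wildcard.endsWith("*") || wildcard.indexOf('*') != wildcard.length() - 1) {
            throw new IllegalArgumentException("expected a single trailing '*': " + wildcard);
        }
        String prefix = wildcard.substring(0, wildcard.length() - 1);
        List<String> selected = new ArrayList<>();
        for (String uri : uris) {
            if (uri.startsWith(prefix)) {
                selected.add(uri);
            }
        }
        return selected;
    }
}
```

Because the prefix in the test includes a random UUID, files left behind by other test runs in the same scratch directory fall outside the wildcard and are not picked up.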

Aggregations

DataDeletionRequest (bio.terra.model.DataDeletionRequest): 11
Test (org.junit.Test): 6
SpringBootTest (org.springframework.boot.test.context.SpringBootTest): 6
DataDeletionTableModel (bio.terra.model.DataDeletionTableModel): 5
BlobInfo (com.google.cloud.storage.BlobInfo): 4
Storage (com.google.cloud.storage.Storage): 4
CoreMatchers.containsString (org.hamcrest.CoreMatchers.containsString): 4
DatasetModel (bio.terra.model.DatasetModel): 3
EnumerateDatasetModel (bio.terra.model.EnumerateDatasetModel): 3
IngestRequestModel (bio.terra.model.IngestRequestModel): 3
Dataset (bio.terra.service.dataset.Dataset): 3
DataDeletionUtils.getDataset (bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset): 3
BigQuery (com.google.cloud.bigquery.BigQuery): 3
ArrayList (java.util.ArrayList): 2
MockHttpServletResponse (org.springframework.mock.web.MockHttpServletResponse): 2
MvcResult (org.springframework.test.web.servlet.MvcResult): 2
DataDeletionGcsFileModel (bio.terra.model.DataDeletionGcsFileModel): 1
DeleteResponseModel (bio.terra.model.DeleteResponseModel): 1
SnapshotModel (bio.terra.model.SnapshotModel): 1
SnapshotRequestModel (bio.terra.model.SnapshotRequestModel): 1