Example 1 with DataDeletionTableModel

Use of bio.terra.model.DataDeletionTableModel in project jade-data-repo by DataBiosphere.

Class CreateExternalTablesStep, method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    Dataset dataset = getDataset(context, datasetService);
    String suffix = getSuffix(context);
    DataDeletionRequest dataDeletionRequest = getRequest(context);
    validateTablesExistInDataset(dataDeletionRequest, dataset);
    for (DataDeletionTableModel table : dataDeletionRequest.getTables()) {
        String path = table.getGcsFileSpec().getPath();
        // let any exception here trigger an undo, no use trying to continue
        bigQueryPdao.createSoftDeleteExternalTable(dataset, path, table.getTableName(), suffix);
    }
    return StepResult.getStepResultSuccess();
}
Also used: DataDeletionUtils.getDataset (bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset), Dataset (bio.terra.service.dataset.Dataset), DataDeletionRequest (bio.terra.model.DataDeletionRequest), DataDeletionTableModel (bio.terra.model.DataDeletionTableModel)

Example 2 with DataDeletionTableModel

Use of bio.terra.model.DataDeletionTableModel in project jade-data-repo by DataBiosphere.

Class CreateExternalTablesStep, method undoStep.

@Override
public StepResult undoStep(FlightContext context) {
    Dataset dataset = getDataset(context, datasetService);
    String suffix = getSuffix(context);
    for (DataDeletionTableModel table : getRequest(context).getTables()) {
        try {
            bigQueryPdao.deleteSoftDeleteExternalTable(dataset, table.getTableName(), suffix);
        } catch (Exception ex) {
            // catch any exception and log it, so that cleanup continues with the remaining tables
            String msg = String.format(
                "Couldn't clean up external table for %s from dataset %s w/ suffix %s",
                table.getTableName(), dataset.getName(), suffix);
            logger.warn(msg, ex);
        }
    }
    return StepResult.getStepResultSuccess();
}
Also used: DataDeletionUtils.getDataset (bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset), Dataset (bio.terra.service.dataset.Dataset), TableNotFoundException (bio.terra.service.dataset.exception.TableNotFoundException), DataDeletionTableModel (bio.terra.model.DataDeletionTableModel)
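
Examples 1 and 2 form a do/undo pair: doStep lets any exception propagate so that the undo is triggered, while undoStep catches per-table failures and logs them so the remaining tables are still cleaned up. The following is a minimal in-memory sketch of that pattern, not the project's code. The class ExternalTableStepSketch, its fields, and the naming scheme in externalTableName are all hypothetical stand-ins; a Set plays the role of BigQuery and a List plays the role of the logger.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the do/undo pair; none of these names come from jade-data-repo.
class ExternalTableStepSketch {
    // Stands in for BigQuery: the external tables that currently exist.
    final Set<String> externalTables = new LinkedHashSet<>();
    // Collects warnings instead of writing to a real logger.
    final List<String> warnings = new ArrayList<>();

    // Assumed naming scheme; the project's externalTableName(...) may differ.
    static String externalTableName(String tableName, String suffix) {
        return tableName + "_" + suffix;
    }

    // "do": create one external table per requested table; any exception propagates.
    void doStep(List<String> tableNames, String suffix) {
        for (String name : tableNames) {
            String ext = externalTableName(name, suffix);
            if (!externalTables.add(ext)) {
                throw new IllegalStateException("External table already exists: " + ext);
            }
        }
    }

    // "undo": best-effort cleanup; a failure on one table is recorded and the loop continues.
    void undoStep(List<String> tableNames, String suffix) {
        for (String name : tableNames) {
            try {
                String ext = externalTableName(name, suffix);
                if (!externalTables.remove(ext)) {
                    throw new IllegalStateException("No such external table: " + ext);
                }
            } catch (Exception ex) {
                warnings.add("Couldn't clean up external table for " + name + ": " + ex.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        ExternalTableStepSketch step = new ExternalTableStepSketch();
        // Simulate a "do" that only got as far as the first table.
        step.doStep(List.of("participant"), "abc");
        // The undo still visits every table in the request.
        step.undoStep(List.of("participant", "sample"), "abc");
        System.out.println("remaining tables: " + step.externalTables);
        System.out.println("warnings: " + step.warnings);
    }
}
```

In this sketch the undo for "sample" fails because that table was never created, yet "participant" is still removed and the failure ends up only in the warnings list.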

Example 3 with DataDeletionTableModel

Use of bio.terra.model.DataDeletionTableModel in project jade-data-repo by DataBiosphere.

Class DropExternalTablesStep, method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    Dataset dataset = getDataset(context, datasetService);
    String suffix = getSuffix(context);
    DataDeletionRequest dataDeletionRequest = getRequest(context);
    for (DataDeletionTableModel table : dataDeletionRequest.getTables()) {
        bigQueryPdao.deleteSoftDeleteExternalTable(dataset, table.getTableName(), suffix);
    }
    return StepResult.getStepResultSuccess();
}
Also used: DataDeletionUtils.getDataset (bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset), Dataset (bio.terra.service.dataset.Dataset), DataDeletionRequest (bio.terra.model.DataDeletionRequest), DataDeletionTableModel (bio.terra.model.DataDeletionTableModel)

Example 4 with DataDeletionTableModel

Use of bio.terra.model.DataDeletionTableModel in project jade-data-repo by DataBiosphere.

Class BigQueryPdao, method validateDeleteRequest.

/**
 * Goes through each of the provided tables and checks to see if the proposed row ids to soft delete exist in the
 * raw dataset table. This will error out on the first sign of mismatch.
 *
 * @param dataset dataset repo concept object
 * @param tables list of table specs from the DataDeletionRequest
 * @param suffix a string added onto the end of the external table to prevent collisions
 */
public void validateDeleteRequest(Dataset dataset, List<DataDeletionTableModel> tables, String suffix) throws InterruptedException {
    BigQueryProject bigQueryProject = bigQueryProjectForDataset(dataset);
    for (DataDeletionTableModel table : tables) {
        String tableName = table.getTableName();
        String rawTableName = dataset.getTableByName(tableName).get().getRawTableName();
        String sql = new ST(validateSoftDeleteTemplate)
            .add("rowId", PDAO_ROW_ID_COLUMN)
            .add("project", bigQueryProject.getProjectId())
            .add("dataset", prefixName(dataset.getName()))
            .add("softDeleteExtTable", externalTableName(tableName, suffix))
            .add("rawTable", rawTableName)
            .render();
        TableResult result = bigQueryProject.query(sql);
        long numMismatched = getSingleLongValue(result);
        // shortcut out early, no use wasting more compute
        if (numMismatched > 0) {
            throw new MismatchedRowIdException(String.format("Could not match %s row ids for table %s", numMismatched, tableName));
        }
    }
}
Also used: ST (org.stringtemplate.v4.ST), TableResult (com.google.cloud.bigquery.TableResult), MismatchedRowIdException (bio.terra.service.tabulardata.exception.MismatchedRowIdException), DataDeletionTableModel (bio.terra.model.DataDeletionTableModel)
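
The check described in the Javadoc of Example 4 can be illustrated without BigQuery. The sketch below assumes that the query rendered from validateSoftDeleteTemplate counts the requested row ids that have no match in the raw table; the template itself is not shown in the source, so that is an assumption. The class ValidateDeleteSketch and its methods are hypothetical, and IllegalArgumentException stands in for MismatchedRowIdException.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory version of the row-id validation; not the project's implementation.
class ValidateDeleteSketch {
    // Counts requested row ids that are absent from the raw table.
    static long countMismatched(List<String> requestedRowIds, Set<String> rawRowIds) {
        return requestedRowIds.stream().filter(id -> !rawRowIds.contains(id)).count();
    }

    // Checks tables in iteration order and stops at the first table with a mismatch.
    static void validate(Map<String, List<String>> requestedByTable,
                         Map<String, Set<String>> rawByTable) {
        for (Map.Entry<String, List<String>> entry : requestedByTable.entrySet()) {
            String tableName = entry.getKey();
            Set<String> raw = rawByTable.getOrDefault(tableName, Set.of());
            long numMismatched = countMismatched(entry.getValue(), raw);
            if (numMismatched > 0) {
                // Stand-in for MismatchedRowIdException.
                throw new IllegalArgumentException(String.format(
                    "Could not match %s row ids for table %s", numMismatched, tableName));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> raw = Map.of(
            "participant", Set.of("p1", "p2", "p3"),
            "sample", Set.of("s1", "s2"));

        Map<String, List<String>> good = new LinkedHashMap<>();
        good.put("participant", List.of("p1", "p2"));
        good.put("sample", List.of("s1"));
        validate(good, raw); // every requested id exists, so nothing is thrown

        Map<String, List<String>> bad = new LinkedHashMap<>();
        bad.put("participant", List.of("p1", "p9")); // p9 is not in the raw table
        try {
            validate(bad, raw);
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```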

Example 5 with DataDeletionTableModel

Use of bio.terra.model.DataDeletionTableModel in project jade-data-repo by DataBiosphere.

Class DatasetIntegrationTest, method testSoftDeleteHappyPath.

@Test
public void testSoftDeleteHappyPath() throws Exception {
    datasetId = ingestedDataset();
    // get row ids
    DatasetModel dataset = dataRepoFixtures.getDataset(steward(), datasetId);
    BigQuery bigQuery = BigQueryFixtures.getBigQuery(dataset.getDataProject(), stewardToken);
    List<String> participantRowIds = getRowIds(bigQuery, dataset, "participant", 3L);
    List<String> sampleRowIds = getRowIds(bigQuery, dataset, "sample", 2L);
    // write them to GCS
    String participantPath = writeListToScratch("softDel", participantRowIds);
    String samplePath = writeListToScratch("softDel", sampleRowIds);
    // build the deletion request with pointers to the two files with row ids to soft delete
    List<DataDeletionTableModel> dataDeletionTableModels = Arrays.asList(
        deletionTableFile("participant", participantPath),
        deletionTableFile("sample", samplePath));
    DataDeletionRequest request = dataDeletionRequest().tables(dataDeletionTableModels);
    // send off the soft delete request
    dataRepoFixtures.deleteData(steward(), datasetId, request);
    // make sure the new counts make sense
    assertTableCount(bigQuery, dataset, "participant", 2L);
    assertTableCount(bigQuery, dataset, "sample", 5L);
}
Also used: BigQuery (com.google.cloud.bigquery.BigQuery), DataDeletionRequest (bio.terra.model.DataDeletionRequest), EnumerateDatasetModel (bio.terra.model.EnumerateDatasetModel), DatasetModel (bio.terra.model.DatasetModel), DataDeletionTableModel (bio.terra.model.DataDeletionTableModel), SpringBootTest (org.springframework.boot.test.context.SpringBootTest), Test (org.junit.Test)
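
The count assertions in Example 5 follow from simple arithmetic: the visible row count is the number of raw rows whose ids are not soft deleted. The sketch below shows that relationship in memory. The class SoftDeleteCountSketch is hypothetical, and the starting sizes (5 participant rows, 7 sample rows) are inferred from the test's assertions rather than stated in the source.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration of how soft deletes change the visible row count.
class SoftDeleteCountSketch {
    // Rows remain visible unless their id appears in the soft-deleted set.
    static long liveCount(List<String> rawRowIds, Set<String> softDeletedRowIds) {
        return rawRowIds.stream().filter(id -> !softDeletedRowIds.contains(id)).count();
    }

    public static void main(String[] args) {
        // Assumed starting data: 5 participant rows, 3 of which get soft deleted.
        List<String> participants = List.of("p1", "p2", "p3", "p4", "p5");
        Set<String> deletedParticipants = Set.of("p1", "p2", "p3");
        System.out.println(liveCount(participants, deletedParticipants));

        // Assumed starting data: 7 sample rows, 2 of which get soft deleted.
        List<String> samples = List.of("s1", "s2", "s3", "s4", "s5", "s6", "s7");
        Set<String> deletedSamples = Set.of("s1", "s2");
        System.out.println(liveCount(samples, deletedSamples));
    }
}
```

Under these assumed sizes, the two calls yield 2 and 5, matching the counts the test asserts for "participant" and "sample".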

Aggregations

DataDeletionTableModel (bio.terra.model.DataDeletionTableModel): 7
DataDeletionRequest (bio.terra.model.DataDeletionRequest): 5
Dataset (bio.terra.service.dataset.Dataset): 3
DataDeletionUtils.getDataset (bio.terra.service.dataset.flight.datadelete.DataDeletionUtils.getDataset): 3
DatasetModel (bio.terra.model.DatasetModel): 2
EnumerateDatasetModel (bio.terra.model.EnumerateDatasetModel): 2
BigQuery (com.google.cloud.bigquery.BigQuery): 2
Test (org.junit.Test): 2
SpringBootTest (org.springframework.boot.test.context.SpringBootTest): 2
DataDeletionGcsFileModel (bio.terra.model.DataDeletionGcsFileModel): 1
SnapshotModel (bio.terra.model.SnapshotModel): 1
SnapshotRequestModel (bio.terra.model.SnapshotRequestModel): 1
SnapshotSummaryModel (bio.terra.model.SnapshotSummaryModel): 1
TableNotFoundException (bio.terra.service.dataset.exception.TableNotFoundException): 1
MismatchedRowIdException (bio.terra.service.tabulardata.exception.MismatchedRowIdException): 1
TableResult (com.google.cloud.bigquery.TableResult): 1
BlobInfo (com.google.cloud.storage.BlobInfo): 1
Storage (com.google.cloud.storage.Storage): 1
CoreMatchers.containsString (org.hamcrest.CoreMatchers.containsString): 1
ST (org.stringtemplate.v4.ST): 1