Example 1 with SnapshotRequestRowIdTableModel

Use of bio.terra.model.SnapshotRequestRowIdTableModel in project jade-data-repo by DataBiosphere.

Class CreateSnapshotPrimaryDataRowIdsStep, method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    // TODO: this assumes single-dataset snapshots, will need to add a loop for multiple
    SnapshotRequestContentsModel contentsModel = snapshotReq.getContents().get(0);
    Snapshot snapshot = snapshotDao.retrieveSnapshotByName(snapshotReq.getName());
    SnapshotSource source = snapshot.getSnapshotSources().get(0);
    SnapshotRequestRowIdModel rowIdModel = contentsModel.getRowIdSpec();
    // for each table, make sure all of the row ids match
    for (SnapshotRequestRowIdTableModel table : rowIdModel.getTables()) {
        List<String> rowIds = table.getRowIds();
        if (!rowIds.isEmpty()) {
            RowIdMatch rowIdMatch = bigQueryPdao.matchRowIds(snapshot, source, table.getTableName(), rowIds);
            if (!rowIdMatch.getUnmatchedInputValues().isEmpty()) {
                String unmatchedValues = String.join("', '", rowIdMatch.getUnmatchedInputValues());
                String message = String.format("Mismatched row ids: '%s'", unmatchedValues);
                FlightUtils.setErrorResponse(context, message, HttpStatus.BAD_REQUEST);
                return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
            }
        }
    }
    bigQueryPdao.createSnapshotWithProvidedIds(snapshot, contentsModel);
    return StepResult.getStepResultSuccess();
}
Also used : Snapshot(bio.terra.service.snapshot.Snapshot) SnapshotRequestRowIdTableModel(bio.terra.model.SnapshotRequestRowIdTableModel) RowIdMatch(bio.terra.service.snapshot.RowIdMatch) SnapshotSource(bio.terra.service.snapshot.SnapshotSource) SnapshotRequestRowIdModel(bio.terra.model.SnapshotRequestRowIdModel) SnapshotRequestContentsModel(bio.terra.model.SnapshotRequestContentsModel) MismatchedValueException(bio.terra.service.snapshot.exception.MismatchedValueException) StepResult(bio.terra.stairway.StepResult)
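
The validation loop in doStep reduces to two steps: collect the requested row ids that have no match, then format them into one error message. The sketch below reproduces only that logic with plain collections. The class RowIdMatchSketch and its methods are hypothetical stand-ins; the real step delegates matching to bigQueryPdao.matchRowIds, whose query is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the match-and-report logic in doStep; not part of jade-data-repo.
class RowIdMatchSketch {

    // Returns the requested ids that are absent from knownIds, preserving request order.
    static List<String> unmatched(Set<String> knownIds, List<String> requested) {
        List<String> missing = new ArrayList<>();
        for (String id : requested) {
            if (!knownIds.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    // Same formatting as doStep: join with "', '" and wrap the result in single quotes.
    static String mismatchMessage(List<String> unmatchedValues) {
        return String.format("Mismatched row ids: '%s'", String.join("', '", unmatchedValues));
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("row1", "row2"));
        List<String> missing = unmatched(known, Arrays.asList("row1", "rowX", "rowY"));
        // prints: Mismatched row ids: 'rowX', 'rowY'
        System.out.println(mismatchMessage(missing));
    }
}
```

Joining with "', '" and wrapping the whole string in quotes through the format pattern is what gives every id its own pair of single quotes.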

Example 2 with SnapshotRequestRowIdTableModel

Use of bio.terra.model.SnapshotRequestRowIdTableModel in project jade-data-repo by DataBiosphere.

Class SnapshotValidationTest, method makeSnapshotRowIdsRequest.

// Generate a valid snapshot-by-rowId request; individual pieces are tweaked below to test validation
public SnapshotRequestModel makeSnapshotRowIdsRequest() {
    SnapshotRequestRowIdTableModel snapshotRequestTableModel = new SnapshotRequestRowIdTableModel().tableName("snapshot").columns(Arrays.asList("col1", "col2", "col3")).rowIds(Arrays.asList("row1", "row2", "row3"));
    SnapshotRequestRowIdModel rowIdSpec = new SnapshotRequestRowIdModel().tables(Collections.singletonList(snapshotRequestTableModel));
    SnapshotRequestContentsModel snapshotRequestContentsModel = new SnapshotRequestContentsModel().datasetName("dataset").mode(SnapshotRequestContentsModel.ModeEnum.BYROWID).rowIdSpec(rowIdSpec);
    return new SnapshotRequestModel().contents(Collections.singletonList(snapshotRequestContentsModel));
}
Also used : SnapshotRequestRowIdTableModel(bio.terra.model.SnapshotRequestRowIdTableModel) SnapshotRequestRowIdModel(bio.terra.model.SnapshotRequestRowIdModel) SnapshotRequestContentsModel(bio.terra.model.SnapshotRequestContentsModel) SnapshotRequestModel(bio.terra.model.SnapshotRequestModel)

Example 3 with SnapshotRequestRowIdTableModel

Use of bio.terra.model.SnapshotRequestRowIdTableModel in project jade-data-repo by DataBiosphere.

Class SnapshotService, method conjureSnapshotTablesFromRowIds.

private void conjureSnapshotTablesFromRowIds(SnapshotRequestRowIdModel requestRowIdModel, Snapshot snapshot, SnapshotSource snapshotSource) {
    // TODO this will need to be changed when we have more than one dataset per snapshot (>1 contentsModel)
    List<SnapshotTable> tableList = new ArrayList<>();
    snapshot.snapshotTables(tableList);
    List<SnapshotMapTable> mapTableList = new ArrayList<>();
    snapshotSource.snapshotMapTables(mapTableList);
    Dataset dataset = snapshotSource.getDataset();
    // create a lookup from tableName -> table spec from the request
    Map<String, SnapshotRequestRowIdTableModel> requestTableLookup = requestRowIdModel.getTables().stream().collect(Collectors.toMap(SnapshotRequestRowIdTableModel::getTableName, Function.identity()));
    // for each dataset table specified in the request, create a table in the snapshot with the same name
    for (DatasetTable datasetTable : dataset.getTables()) {
        if (!requestTableLookup.containsKey(datasetTable.getName())) {
            // only capture the dataset tables in the request model
            continue;
        }
        List<Column> columnList = new ArrayList<>();
        SnapshotTable snapshotTable = new SnapshotTable().name(datasetTable.getName()).columns(columnList);
        tableList.add(snapshotTable);
        List<SnapshotMapColumn> mapColumnList = new ArrayList<>();
        mapTableList.add(new SnapshotMapTable().fromTable(datasetTable).toTable(snapshotTable).snapshotMapColumns(mapColumnList));
        // for each dataset column specified in the request, create a column in the snapshot with the same name
        Set<String> requestColumns = new HashSet<>(requestTableLookup.get(datasetTable.getName()).getColumns());
        datasetTable.getColumns().stream().filter(c -> requestColumns.contains(c.getName())).forEach(datasetColumn -> {
            Column snapshotColumn = new Column().name(datasetColumn.getName());
            SnapshotMapColumn snapshotMapColumn = new SnapshotMapColumn().fromColumn(datasetColumn).toColumn(snapshotColumn);
            columnList.add(snapshotColumn);
            mapColumnList.add(snapshotMapColumn);
        });
    }
}
Also used : DatasetService(bio.terra.service.dataset.DatasetService) SnapshotDeleteFlight(bio.terra.service.snapshot.flight.delete.SnapshotDeleteFlight) SnapshotModel(bio.terra.model.SnapshotModel) FireStoreDependencyDao(bio.terra.service.filedata.google.firestore.FireStoreDependencyDao) Autowired(org.springframework.beans.factory.annotation.Autowired) DatasetTable(bio.terra.service.dataset.DatasetTable) RelationshipModel(bio.terra.model.RelationshipModel) Map(java.util.Map) Table(bio.terra.common.Table) JobMapKeys(bio.terra.service.job.JobMapKeys) SnapshotRequestQueryModel(bio.terra.model.SnapshotRequestQueryModel) DataLocationService(bio.terra.service.resourcemanagement.DataLocationService) ValidationException(bio.terra.app.controller.exception.ValidationException) Query(bio.terra.grammar.Query) AssetColumn(bio.terra.service.dataset.AssetColumn) SnapshotRequestRowIdTableModel(bio.terra.model.SnapshotRequestRowIdTableModel) Set(java.util.Set) EnumerateSnapshotModel(bio.terra.model.EnumerateSnapshotModel) UUID(java.util.UUID) SnapshotRequestContentsModel(bio.terra.model.SnapshotRequestContentsModel) Collectors(java.util.stream.Collectors) DatasetSummaryModel(bio.terra.model.DatasetSummaryModel) List(java.util.List) AssetSpecification(bio.terra.service.dataset.AssetSpecification) Optional(java.util.Optional) BigQueryPdao(bio.terra.service.tabulardata.google.BigQueryPdao) AuthenticatedUserRequest(bio.terra.service.iam.AuthenticatedUserRequest) SnapshotCreateFlight(bio.terra.service.snapshot.flight.create.SnapshotCreateFlight) JobService(bio.terra.service.job.JobService) Relationship(bio.terra.common.Relationship) HashMap(java.util.HashMap) Column(bio.terra.common.Column) SnapshotRequestAssetModel(bio.terra.model.SnapshotRequestAssetModel) TableModel(bio.terra.model.TableModel) AssetNotFoundException(bio.terra.service.snapshot.exception.AssetNotFoundException) Function(java.util.function.Function) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) SnapshotRequestModel(bio.terra.model.SnapshotRequestModel) InvalidSnapshotException(bio.terra.service.snapshot.exception.InvalidSnapshotException) MetadataEnumeration(bio.terra.common.MetadataEnumeration) AssetTable(bio.terra.service.dataset.AssetTable) SnapshotSourceModel(bio.terra.model.SnapshotSourceModel) ColumnModel(bio.terra.model.ColumnModel) RelationshipTermModel(bio.terra.model.RelationshipTermModel) Component(org.springframework.stereotype.Component) SnapshotSummaryModel(bio.terra.model.SnapshotSummaryModel) SnapshotRequestRowIdModel(bio.terra.model.SnapshotRequestRowIdModel) Dataset(bio.terra.service.dataset.Dataset) Collections(java.util.Collections)
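
conjureSnapshotTablesFromRowIds builds its tableName lookup with the two-argument Collectors.toMap, which throws IllegalStateException when two elements map to the same key. The sketch below isolates that behavior. TableSpec is a hypothetical stand-in for SnapshotRequestRowIdTableModel, and whether request validation already rejects duplicate table names before this point is not shown in the source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration of the name -> spec lookup; not part of jade-data-repo.
class TableLookupSketch {

    // Minimal stand-in for SnapshotRequestRowIdTableModel (assumed shape: a name plus columns).
    static final class TableSpec {
        private final String tableName;
        private final List<String> columns;

        TableSpec(String tableName, List<String> columns) {
            this.tableName = tableName;
            this.columns = columns;
        }

        String getTableName() {
            return tableName;
        }

        List<String> getColumns() {
            return columns;
        }
    }

    // Same collector shape as the original: key by table name, value is the spec itself.
    static Map<String, TableSpec> byName(List<TableSpec> tables) {
        return tables.stream()
            .collect(Collectors.toMap(TableSpec::getTableName, Function.identity()));
    }

    // True when building the lookup fails because two specs share a table name.
    static boolean rejectsDuplicates(List<TableSpec> tables) {
        try {
            byName(tables);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        TableSpec sample = new TableSpec("sample", Arrays.asList("col1", "col2"));
        TableSpec donor = new TableSpec("donor", Arrays.asList("col3"));
        System.out.println(byName(Arrays.asList(sample, donor)).keySet().contains("donor"));
        System.out.println(rejectsDuplicates(Arrays.asList(sample, sample)));
    }
}
```

If duplicate table names should be tolerated instead, the three-argument toMap overload accepts a merge function that decides which spec wins.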

Example 4 with SnapshotRequestRowIdTableModel

Use of bio.terra.model.SnapshotRequestRowIdTableModel in project jade-data-repo by DataBiosphere.

Class BigQueryPdao, method createSnapshotWithProvidedIds.

public void createSnapshotWithProvidedIds(Snapshot snapshot, SnapshotRequestContentsModel contentsModel) throws InterruptedException {
    BigQueryProject bigQueryProject = bigQueryProjectForSnapshot(snapshot);
    String projectId = bigQueryProject.getProjectId();
    String snapshotName = snapshot.getName();
    BigQuery bigQuery = bigQueryProject.getBigQuery();
    SnapshotRequestRowIdModel rowIdModel = contentsModel.getRowIdSpec();
    // create snapshot BQ dataset
    snapshotCreateBQDataset(bigQueryProject, snapshot);
    // create the row id table
    bigQueryProject.createTable(snapshotName, PDAO_ROW_ID_TABLE, rowIdTableSchema());
    // populate root row ids. Must happen before the relationship walk.
    // NOTE: when we have multiple sources, we can put this into a loop
    SnapshotSource source = snapshot.getSnapshotSources().get(0);
    String datasetBqDatasetName = prefixName(source.getDataset().getName());
    for (SnapshotRequestRowIdTableModel table : rowIdModel.getTables()) {
        String tableName = table.getTableName();
        Table sourceTable = source.reverseTableLookup(tableName).orElseThrow(() -> new CorruptMetadataException("cannot find destination table: " + tableName));
        List<String> rowIds = table.getRowIds();
        if (!rowIds.isEmpty()) {
            ST sqlTemplate = new ST(loadRootRowIdsTemplate);
            sqlTemplate.add("project", projectId);
            sqlTemplate.add("snapshot", snapshotName);
            sqlTemplate.add("dataset", datasetBqDatasetName);
            sqlTemplate.add("tableId", sourceTable.getId().toString());
            sqlTemplate.add("rowIds", rowIds);
            bigQueryProject.query(sqlTemplate.render());
        }
        ST sqlTemplate = new ST(validateRowIdsForRootTemplate);
        sqlTemplate.add("project", projectId);
        sqlTemplate.add("snapshot", snapshotName);
        sqlTemplate.add("dataset", datasetBqDatasetName);
        sqlTemplate.add("table", sourceTable.getName());
        TableResult result = bigQueryProject.query(sqlTemplate.render());
        FieldValueList row = result.iterateAll().iterator().next();
        FieldValue countValue = row.get(0);
        if (countValue.getLongValue() != rowIds.size()) {
            logger.error("Invalid row ids supplied: rowIds=" + rowIds.size() + " count=" + countValue.getLongValue());
            for (String rowId : rowIds) {
                logger.error(" rowIdIn: " + rowId);
            }
            throw new PdaoException("Invalid row ids supplied");
        }
    }
    snapshotViewCreation(datasetBqDatasetName, snapshotName, snapshot, projectId, bigQuery, bigQueryProject);
}
Also used : ST(org.stringtemplate.v4.ST) BigQuery(com.google.cloud.bigquery.BigQuery) DatasetTable(bio.terra.service.dataset.DatasetTable) Table(bio.terra.common.Table) AssetTable(bio.terra.service.dataset.AssetTable) SnapshotTable(bio.terra.service.snapshot.SnapshotTable) SnapshotMapTable(bio.terra.service.snapshot.SnapshotMapTable) SnapshotRequestRowIdModel(bio.terra.model.SnapshotRequestRowIdModel) CorruptMetadataException(bio.terra.service.snapshot.exception.CorruptMetadataException) TableResult(com.google.cloud.bigquery.TableResult) SnapshotRequestRowIdTableModel(bio.terra.model.SnapshotRequestRowIdTableModel) PdaoException(bio.terra.common.exception.PdaoException) SnapshotSource(bio.terra.service.snapshot.SnapshotSource) FieldValueList(com.google.cloud.bigquery.FieldValueList) FieldValue(com.google.cloud.bigquery.FieldValue)

Aggregations

SnapshotRequestRowIdModel (bio.terra.model.SnapshotRequestRowIdModel): 4
SnapshotRequestRowIdTableModel (bio.terra.model.SnapshotRequestRowIdTableModel): 4
SnapshotRequestContentsModel (bio.terra.model.SnapshotRequestContentsModel): 3
Table (bio.terra.common.Table): 2
SnapshotRequestModel (bio.terra.model.SnapshotRequestModel): 2
AssetTable (bio.terra.service.dataset.AssetTable): 2
DatasetTable (bio.terra.service.dataset.DatasetTable): 2
SnapshotSource (bio.terra.service.snapshot.SnapshotSource): 2
ValidationException (bio.terra.app.controller.exception.ValidationException): 1
Column (bio.terra.common.Column): 1
MetadataEnumeration (bio.terra.common.MetadataEnumeration): 1
Relationship (bio.terra.common.Relationship): 1
PdaoException (bio.terra.common.exception.PdaoException): 1
Query (bio.terra.grammar.Query): 1
ColumnModel (bio.terra.model.ColumnModel): 1
DatasetSummaryModel (bio.terra.model.DatasetSummaryModel): 1
EnumerateSnapshotModel (bio.terra.model.EnumerateSnapshotModel): 1
RelationshipModel (bio.terra.model.RelationshipModel): 1
RelationshipTermModel (bio.terra.model.RelationshipTermModel): 1
SnapshotModel (bio.terra.model.SnapshotModel): 1