
Example 1 with MismatchedValueException

use of bio.terra.service.snapshot.exception.MismatchedValueException in project jade-data-repo by DataBiosphere.

the class CreateSnapshotPrimaryDataRowIdsStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    // TODO: this assumes single-dataset snapshots, will need to add a loop for multiple
    SnapshotRequestContentsModel contentsModel = snapshotReq.getContents().get(0);
    Snapshot snapshot = snapshotDao.retrieveSnapshotByName(snapshotReq.getName());
    SnapshotSource source = snapshot.getSnapshotSources().get(0);
    SnapshotRequestRowIdModel rowIdModel = contentsModel.getRowIdSpec();
    // for each table, make sure all of the row ids match
    for (SnapshotRequestRowIdTableModel table : rowIdModel.getTables()) {
        List<String> rowIds = table.getRowIds();
        if (!rowIds.isEmpty()) {
            RowIdMatch rowIdMatch = bigQueryPdao.matchRowIds(snapshot, source, table.getTableName(), rowIds);
            if (!rowIdMatch.getUnmatchedInputValues().isEmpty()) {
                String unmatchedValues = String.join("', '", rowIdMatch.getUnmatchedInputValues());
                String message = String.format("Mismatched row ids: '%s'", unmatchedValues);
                FlightUtils.setErrorResponse(context, message, HttpStatus.BAD_REQUEST);
                return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
            }
        }
    }
    bigQueryPdao.createSnapshotWithProvidedIds(snapshot, contentsModel);
    return StepResult.getStepResultSuccess();
}
Also used : Snapshot(bio.terra.service.snapshot.Snapshot) SnapshotRequestRowIdTableModel(bio.terra.model.SnapshotRequestRowIdTableModel) RowIdMatch(bio.terra.service.snapshot.RowIdMatch) SnapshotSource(bio.terra.service.snapshot.SnapshotSource) SnapshotRequestRowIdModel(bio.terra.model.SnapshotRequestRowIdModel) SnapshotRequestContentsModel(bio.terra.model.SnapshotRequestContentsModel) MismatchedValueException(bio.terra.service.snapshot.exception.MismatchedValueException) StepResult(bio.terra.stairway.StepResult)
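
The error message built in Example 1 can be tried in isolation. The following is a minimal, self-contained sketch: the class `MismatchMessageSketch` and the method `buildMessage` are hypothetical names introduced for illustration and are not part of jade-data-repo. Only the `String.join` and `String.format` calls mirror the snippet above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of jade-data-repo.
class MismatchMessageSketch {

    // Joins the unmatched ids with "', '" and wraps the result in single quotes,
    // the same way the step above builds its error message.
    static String buildMessage(List<String> unmatchedRowIds) {
        String unmatchedValues = String.join("', '", unmatchedRowIds);
        return String.format("Mismatched row ids: '%s'", unmatchedValues);
    }

    public static void main(String[] args) {
        // prints: Mismatched row ids: 'id-1', 'id-2'
        System.out.println(buildMessage(Arrays.asList("id-1", "id-2")));
    }
}
```

Because the separator itself carries the closing and opening quotes, each id ends up individually quoted even though the format string contains only one pair of quotes.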

Example 2 with MismatchedValueException

use of bio.terra.service.snapshot.exception.MismatchedValueException in project jade-data-repo by DataBiosphere.

the class CreateSnapshotValidateQueryStep method doStep.

@Override
public StepResult doStep(FlightContext context) {
    /*
     * Make sure the query is valid. For now, this includes:
     * - passing a general grammar check (the SQL is passed to the parse method to make sure it parses)
     * - getting the dataset(s) from the query and making sure they exist;
     *   initially just one dataset, with multiple in the future
     * - making sure the user has custodian data access (currently this is done in the controller,
     *   but this should be moved)
     */
    String snapshotQuery = snapshotReq.getContents().get(0).getQuerySpec().getQuery();
    Query query = Query.parse(snapshotQuery);
    List<String> datasetNames = query.getDatasetNames();
    if (datasetNames.isEmpty()) {
        String message = String.format("Snapshots must be associated with at least one dataset");
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    if (datasetNames.size() > 1) {
        String message = String.format("Snapshots can currently only be associated with one dataset");
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    // Get the dataset by name to ensure the dataset exists
    String datasetName = datasetNames.get(0);
    // if not found, will throw a DatasetNotFoundException
    datasetService.retrieveByName(datasetName);
    List<String> tableNames = query.getTableNames();
    if (tableNames.isEmpty()) {
        String message = String.format("Snapshots must be associated with at least one table");
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    // TODO validate the select list. It should have one column that is the row id.
    List<String> columnNames = query.getColumnNames();
    if (columnNames.isEmpty()) {
        String message = String.format("Snapshots must be associated with at least one column");
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    Optional<String> rowId = columnNames.stream().filter(PDAO_ROW_ID_COLUMN::equals).findFirst();
    if (!rowId.isPresent()) {
        String message = String.format("Query must include a row_id column");
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    // TODO test this in an integration test
    return StepResult.getStepResultSuccess();
}
Also used : Query(bio.terra.grammar.Query) MismatchedValueException(bio.terra.service.snapshot.exception.MismatchedValueException) StepResult(bio.terra.stairway.StepResult)
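
The chain of checks in Example 2 can be sketched without the Stairway or grammar classes. The following is a rough illustration only: `QueryValidationSketch`, `validate`, and the `ROW_ID_COLUMN` value are hypothetical and assumed for this example; the real value of `PDAO_ROW_ID_COLUMN` is not shown in the snippet. The sketch returns the first failing message instead of a `StepResult`, and it omits the dataset lookup.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for the validation step; not part of jade-data-repo.
class QueryValidationSketch {

    // Assumed placeholder for PDAO_ROW_ID_COLUMN; the actual constant may differ.
    static final String ROW_ID_COLUMN = "row_id";

    // Returns the first validation error, or Optional.empty() when every check passes.
    static Optional<String> validate(List<String> datasetNames,
                                     List<String> tableNames,
                                     List<String> columnNames) {
        if (datasetNames.isEmpty()) {
            return Optional.of("Snapshots must be associated with at least one dataset");
        }
        if (datasetNames.size() > 1) {
            return Optional.of("Snapshots can currently only be associated with one dataset");
        }
        if (tableNames.isEmpty()) {
            return Optional.of("Snapshots must be associated with at least one table");
        }
        if (columnNames.isEmpty()) {
            return Optional.of("Snapshots must be associated with at least one column");
        }
        if (!columnNames.contains(ROW_ID_COLUMN)) {
            return Optional.of("Query must include a row_id column");
        }
        return Optional.empty();
    }
}
```

Returning early on the first failure keeps the order of checks identical to the step above, so a request with no datasets is reported as such before any table or column check runs.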

Example 3 with MismatchedValueException

use of bio.terra.service.snapshot.exception.MismatchedValueException in project jade-data-repo by DataBiosphere.

the class CreateSnapshotPrimaryDataAssetStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    /*
         * map field ids into row ids and validate
         * then pass the row id array into create snapshot
         */
    SnapshotRequestContentsModel contentsModel = snapshotReq.getContents().get(0);
    SnapshotRequestAssetModel assetSpec = contentsModel.getAssetSpec();
    Snapshot snapshot = snapshotDao.retrieveSnapshotByName(snapshotReq.getName());
    SnapshotSource source = snapshot.getSnapshotSources().get(0);
    RowIdMatch rowIdMatch = bigQueryPdao.mapValuesToRows(snapshot, source, assetSpec.getRootValues());
    if (!rowIdMatch.getUnmatchedInputValues().isEmpty()) {
        String unmatchedValues = String.join("', '", rowIdMatch.getUnmatchedInputValues());
        String message = String.format("Mismatched input values: '%s'", unmatchedValues);
        FlightUtils.setErrorResponse(context, message, HttpStatus.BAD_REQUEST);
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    bigQueryPdao.createSnapshot(snapshot, rowIdMatch.getMatchingRowIds());
    return StepResult.getStepResultSuccess();
}
Also used : Snapshot(bio.terra.service.snapshot.Snapshot) RowIdMatch(bio.terra.service.snapshot.RowIdMatch) SnapshotRequestAssetModel(bio.terra.model.SnapshotRequestAssetModel) SnapshotSource(bio.terra.service.snapshot.SnapshotSource) SnapshotRequestContentsModel(bio.terra.model.SnapshotRequestContentsModel) MismatchedValueException(bio.terra.service.snapshot.exception.MismatchedValueException) StepResult(bio.terra.stairway.StepResult)

Example 4 with MismatchedValueException

use of bio.terra.service.snapshot.exception.MismatchedValueException in project jade-data-repo by DataBiosphere.

the class BigQueryPdao method queryForRowIds.

// insert the rowIds into the snapshot row ids table and then kick off the rest of the relationship walking
// once we have the row ids in addition to the asset spec, this should look familiar to wAsset
public void queryForRowIds(AssetSpecification assetSpecification, Snapshot snapshot, String sqlQuery) throws InterruptedException {
    BigQueryProject bigQueryProject = bigQueryProjectForSnapshot(snapshot);
    BigQuery bigQuery = bigQueryProject.getBigQuery();
    String snapshotName = snapshot.getName();
    Dataset dataset = snapshot.getSnapshotSources().get(0).getDataset();
    String datasetBqDatasetName = prefixName(dataset.getName());
    String projectId = bigQueryProject.getProjectId();
    try {
        // create snapshot BQ dataset
        snapshotCreateBQDataset(bigQueryProject, snapshot);
        // now create a temp table with all the selected row ids based on the query in it
        bigQueryProject.createTable(snapshotName, PDAO_TEMP_TABLE, tempTableSchema());
        QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(sqlQuery).setDestinationTable(TableId.of(snapshotName, PDAO_TEMP_TABLE)).setWriteDisposition(JobInfo.WriteDisposition.WRITE_APPEND).build();
        try {
            final TableResult query = bigQuery.query(queryConfig);
            // get results and validate that the query returned at least one row
            if (query.getTotalRows() < 1) {
                // should this be a different error?
                throw new InvalidQueryException("Query returned 0 results");
            }
        } catch (InterruptedException ie) {
            throw new PdaoException("Append query unexpectedly interrupted", ie);
        }
        // join on the root table to validate that the dataset's rootTable.rowid is never null
        // and thus matches the PDAO_ROW_ID_COLUMN
        AssetTable rootAssetTable = assetSpecification.getRootTable();
        Table rootTable = rootAssetTable.getTable();
        String datasetTableName = rootTable.getName();
        String rootTableId = rootTable.getId().toString();
        ST sqlTemplate = new ST(joinTablesToTestForMissingRowIds);
        sqlTemplate.add("snapshotDatasetName", snapshotName);
        sqlTemplate.add("tempTable", PDAO_TEMP_TABLE);
        sqlTemplate.add("datasetDatasetName", datasetBqDatasetName);
        sqlTemplate.add("datasetTable", datasetTableName);
        sqlTemplate.add("commonColumn", PDAO_ROW_ID_COLUMN);
        TableResult result = bigQueryProject.query(sqlTemplate.render());
        FieldValueList mismatchedCount = result.getValues().iterator().next();
        Long mismatchedCountLong = mismatchedCount.get(0).getLongValue();
        if (mismatchedCountLong > 0) {
            throw new MismatchedValueException("Query results did not match dataset root row ids");
        }
        // TODO should this be pulled up to the top of queryForRowIds() / added to snapshotCreateBQDataset() helper
        bigQueryProject.createTable(snapshotName, PDAO_ROW_ID_TABLE, rowIdTableSchema());
        // populate root row ids. Must happen before the relationship walk.
        // NOTE: when we have multiple sources, we can put this into a loop
        // insert into the PDAO_ROW_ID_TABLE the literal that is the table id
        // and then all the row ids from the temp table
        ST sqlLoadTemplate = new ST(loadRootRowIdsFromTempTableTemplate);
        sqlLoadTemplate.add("project", projectId);
        sqlLoadTemplate.add("snapshot", snapshotName);
        sqlLoadTemplate.add("dataset", datasetBqDatasetName);
        sqlLoadTemplate.add("tableId", rootTableId);
        // this is the disc from classic asset
        sqlLoadTemplate.add("commonColumn", PDAO_ROW_ID_COLUMN);
        sqlLoadTemplate.add("tempTable", PDAO_TEMP_TABLE);
        bigQueryProject.query(sqlLoadTemplate.render());
        // ST sqlValidateTemplate = new ST(validateRowIdsForRootTemplate);
        // TODO do we want to reuse this validation? if yes, maybe mismatchedCount / query should be updated
        // walk and populate relationship table row ids
        List<WalkRelationship> walkRelationships = WalkRelationship.ofAssetSpecification(assetSpecification);
        walkRelationships(datasetBqDatasetName, snapshotName, walkRelationships, rootTableId, projectId, bigQuery);
        snapshotViewCreation(datasetBqDatasetName, snapshotName, snapshot, projectId, bigQuery, bigQueryProject);
    } catch (PdaoException ex) {
        // TODO what if the query is invalid? Seems like there might be more to catch here.
        throw new PdaoException("createSnapshot failed", ex);
    }
}
Also used : AssetTable(bio.terra.service.dataset.AssetTable) ST(org.stringtemplate.v4.ST) BigQuery(com.google.cloud.bigquery.BigQuery) DatasetTable(bio.terra.service.dataset.DatasetTable) Table(bio.terra.common.Table) SnapshotTable(bio.terra.service.snapshot.SnapshotTable) SnapshotMapTable(bio.terra.service.snapshot.SnapshotMapTable) Dataset(bio.terra.service.dataset.Dataset) TableResult(com.google.cloud.bigquery.TableResult) PdaoException(bio.terra.common.exception.PdaoException) MismatchedValueException(bio.terra.service.snapshot.exception.MismatchedValueException) FieldValueList(com.google.cloud.bigquery.FieldValueList) QueryJobConfiguration(com.google.cloud.bigquery.QueryJobConfiguration) InvalidQueryException(bio.terra.grammar.exception.InvalidQueryException)

Example 5 with MismatchedValueException

use of bio.terra.service.snapshot.exception.MismatchedValueException in project jade-data-repo by DataBiosphere.

the class CreateSnapshotValidateAssetStep method doStep.

@Override
public StepResult doStep(FlightContext context) {
    /*
         * get dataset
         * get dataset asset list
         * get snapshot asset name
         * check that snapshot asset exists
         */
    Snapshot snapshot = snapshotService.retrieveByName(snapshotReq.getName());
    for (SnapshotSource snapshotSource : snapshot.getSnapshotSources()) {
        Dataset dataset = datasetService.retrieve(snapshotSource.getDataset().getId());
        List<String> datasetAssetNamesList = dataset.getAssetSpecifications().stream().map(assetSpec -> assetSpec.getName()).collect(Collectors.toList());
        String snapshotSourceAssetName = snapshotSource.getAssetSpecification().getName();
        if (!datasetAssetNamesList.contains(snapshotSourceAssetName)) {
            String datasetAssetNames = String.join("', '", datasetAssetNamesList);
            String message = String.format("Mismatched asset name: '%s' is not an asset in the asset list for dataset '%s'. " + "Asset list is '%s'", snapshotSourceAssetName, dataset.getName(), datasetAssetNames);
            FlightUtils.setErrorResponse(context, message, HttpStatus.BAD_REQUEST);
            return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
        }
    }
    return StepResult.getStepResultSuccess();
}
Also used : DatasetService(bio.terra.service.dataset.DatasetService) Collectors(java.util.stream.Collectors) SnapshotRequestModel(bio.terra.model.SnapshotRequestModel) HttpStatus(org.springframework.http.HttpStatus) MismatchedValueException(bio.terra.service.snapshot.exception.MismatchedValueException) StepResult(bio.terra.stairway.StepResult) List(java.util.List) Step(bio.terra.stairway.Step) FlightUtils(bio.terra.common.FlightUtils) Snapshot(bio.terra.service.snapshot.Snapshot) Dataset(bio.terra.service.dataset.Dataset) SnapshotService(bio.terra.service.snapshot.SnapshotService) SnapshotSource(bio.terra.service.snapshot.SnapshotSource) StepStatus(bio.terra.stairway.StepStatus) FlightContext(bio.terra.stairway.FlightContext)

Aggregations

MismatchedValueException (bio.terra.service.snapshot.exception.MismatchedValueException) 5
StepResult (bio.terra.stairway.StepResult) 4
Snapshot (bio.terra.service.snapshot.Snapshot) 3
SnapshotSource (bio.terra.service.snapshot.SnapshotSource) 3
SnapshotRequestContentsModel (bio.terra.model.SnapshotRequestContentsModel) 2
Dataset (bio.terra.service.dataset.Dataset) 2
RowIdMatch (bio.terra.service.snapshot.RowIdMatch) 2
FlightUtils (bio.terra.common.FlightUtils) 1
Table (bio.terra.common.Table) 1
PdaoException (bio.terra.common.exception.PdaoException) 1
Query (bio.terra.grammar.Query) 1
InvalidQueryException (bio.terra.grammar.exception.InvalidQueryException) 1
SnapshotRequestAssetModel (bio.terra.model.SnapshotRequestAssetModel) 1
SnapshotRequestModel (bio.terra.model.SnapshotRequestModel) 1
SnapshotRequestRowIdModel (bio.terra.model.SnapshotRequestRowIdModel) 1
SnapshotRequestRowIdTableModel (bio.terra.model.SnapshotRequestRowIdTableModel) 1
AssetTable (bio.terra.service.dataset.AssetTable) 1
DatasetService (bio.terra.service.dataset.DatasetService) 1
DatasetTable (bio.terra.service.dataset.DatasetTable) 1
SnapshotMapTable (bio.terra.service.snapshot.SnapshotMapTable) 1