Example 1 with Query

use of bio.terra.grammar.Query in project jade-data-repo by DataBiosphere.

the class CreateSnapshotValidateQueryStep method doStep.

@Override
public StepResult doStep(FlightContext context) {
    /*
     * Make sure the query is valid. For now, this means:
     *  - the query passes the general grammar check (the SQL is handed to Query.parse)
     *  - the query references exactly one dataset, and that dataset exists
     *    (initially just one; multiple datasets may be supported in the future)
     *
     * TODO: also make sure the user has custodian data access. This is currently done
     * in the controller, but it should be moved.
     */
    String snapshotQuery = snapshotReq.getContents().get(0).getQuerySpec().getQuery();
    Query query = Query.parse(snapshotQuery);
    List<String> datasetNames = query.getDatasetNames();
    if (datasetNames.isEmpty()) {
        String message = "Snapshots must be associated with at least one dataset";
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    if (datasetNames.size() > 1) {
        String message = "Snapshots can currently only be associated with one dataset";
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    // Get the dataset by name to ensure the dataset exists
    String datasetName = datasetNames.get(0);
    // if not found, will throw a DatasetNotFoundException
    datasetService.retrieveByName(datasetName);
    List<String> tableNames = query.getTableNames();
    if (tableNames.isEmpty()) {
        String message = "Snapshots must be associated with at least one table";
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    // TODO validate the select list. It should have one column that is the row id.
    List<String> columnNames = query.getColumnNames();
    if (columnNames.isEmpty()) {
        String message = "Snapshots must be associated with at least one column";
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    // The select list must include the row id column
    if (!columnNames.contains(PDAO_ROW_ID_COLUMN)) {
        String message = "Query must include a row_id column";
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, new MismatchedValueException(message));
    }
    // TODO test this in an integration test
    return StepResult.getStepResultSuccess();
}
Also used : Query (bio.terra.grammar.Query), MismatchedValueException (bio.terra.service.snapshot.exception.MismatchedValueException), StepResult (bio.terra.stairway.StepResult)
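
The checks in `doStep` can be tried out without the flight machinery by pulling them into a pure function. The following is a minimal sketch under stated assumptions, not the project's code: the class `SnapshotQueryChecks`, the method `firstError`, and the value of `ROW_ID_COLUMN` are hypothetical stand-ins (the real step compares against `PDAO_ROW_ID_COLUMN` and returns a `StepResult`).

```java
import java.util.List;
import java.util.Optional;

class SnapshotQueryChecks {
    // Assumed placeholder value; the project reads this from PDAO_ROW_ID_COLUMN.
    static final String ROW_ID_COLUMN = "datarepo_row_id";

    /** Returns the first validation error, or empty if every check passes. */
    static Optional<String> firstError(List<String> datasetNames,
                                       List<String> tableNames,
                                       List<String> columnNames) {
        if (datasetNames.isEmpty()) {
            return Optional.of("Snapshots must be associated with at least one dataset");
        }
        if (datasetNames.size() > 1) {
            return Optional.of("Snapshots can currently only be associated with one dataset");
        }
        if (tableNames.isEmpty()) {
            return Optional.of("Snapshots must be associated with at least one table");
        }
        if (columnNames.isEmpty()) {
            return Optional.of("Snapshots must be associated with at least one column");
        }
        if (!columnNames.contains(ROW_ID_COLUMN)) {
            return Optional.of("Query must include a row_id column");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> columns = List.of("name", ROW_ID_COLUMN);
        // A query over one dataset, one table, and a row id column passes.
        System.out.println(firstError(List.of("ds"), List.of("sample"), columns).orElse("valid"));
        // A query over two datasets is rejected by the second check.
        System.out.println(firstError(List.of("a", "b"), List.of("sample"), columns).orElse("valid"));
    }
}
```

Returning the first error as an `Optional<String>` keeps the order of the checks identical to the step above while leaving the choice of exception or step status to the caller.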

Example 2 with Query

use of bio.terra.grammar.Query in project jade-data-repo by DataBiosphere.

the class SnapshotService method makeSnapshotFromSnapshotRequest.

/**
 * Make a Snapshot structure with all of its parts from an incoming snapshot request.
 * Note that the structure does not have UUIDs or created dates filled in. Those are
 * updated by the DAO when it stores the snapshot in the repository metadata.
 *
 * @param snapshotRequestModel the incoming snapshot request
 * @return the assembled Snapshot, without UUIDs or created dates
 */
public Snapshot makeSnapshotFromSnapshotRequest(SnapshotRequestModel snapshotRequestModel) {
    // Make this early so we can hook up back links to it
    Snapshot snapshot = new Snapshot();
    List<SnapshotRequestContentsModel> requestContentsList = snapshotRequestModel.getContents();
    // TODO: for MVM we only allow one source list
    if (requestContentsList.size() > 1) {
        throw new ValidationException("Only a single snapshot contents entry is currently allowed.");
    }
    SnapshotRequestContentsModel requestContents = requestContentsList.get(0);
    Dataset dataset = datasetService.retrieveByName(requestContents.getDatasetName());
    SnapshotSource snapshotSource = new SnapshotSource().snapshot(snapshot).dataset(dataset);
    switch (requestContents.getMode()) {
        case BYASSET:
            // TODO: When we implement explicit definition of snapshot tables, we will handle that here.
            // For now, we generate the snapshot tables directly from the asset tables of the one source
            // allowed in a snapshot.
            AssetSpecification assetSpecification = getAssetSpecificationFromRequest(requestContents);
            snapshotSource.assetSpecification(assetSpecification);
            conjureSnapshotTablesFromAsset(snapshotSource.getAssetSpecification(), snapshot, snapshotSource);
            break;
        case BYFULLVIEW:
            conjureSnapshotTablesFromDatasetTables(snapshot, snapshotSource);
            break;
        case BYQUERY:
            SnapshotRequestQueryModel queryModel = requestContents.getQuerySpec();
            String assetName = queryModel.getAssetName();
            String snapshotQuery = queryModel.getQuery();
            Query query = Query.parse(snapshotQuery);
            List<String> datasetNames = query.getDatasetNames();
            // TODO this makes the assumption that there is only one dataset
            // (based on the validation flight step that already occurred.)
            // This will change when more than 1 dataset is allowed
            String datasetName = datasetNames.get(0);
            Dataset queryDataset = datasetService.retrieveByName(datasetName);
            AssetSpecification queryAssetSpecification = queryDataset.getAssetSpecificationByName(assetName).orElseThrow(() -> new AssetNotFoundException("This dataset does not have an asset specification with name: " + assetName));
            snapshotSource.assetSpecification(queryAssetSpecification);
            // TODO: is this wrong? Why don't we just pass the assetSpecification?
            conjureSnapshotTablesFromAsset(snapshotSource.getAssetSpecification(), snapshot, snapshotSource);
            break;
        case BYROWID:
            SnapshotRequestRowIdModel requestRowIdModel = requestContents.getRowIdSpec();
            conjureSnapshotTablesFromRowIds(requestRowIdModel, snapshot, snapshotSource);
            break;
        default:
            throw new InvalidSnapshotException("Snapshot does not have required mode information");
    }
    return snapshot.name(snapshotRequestModel.getName()).description(snapshotRequestModel.getDescription()).snapshotSources(Collections.singletonList(snapshotSource)).profileId(UUID.fromString(snapshotRequestModel.getProfileId())).relationships(createSnapshotRelationships(dataset.getRelationships(), snapshotSource));
}
Also used : ValidationException (bio.terra.app.controller.exception.ValidationException), Query (bio.terra.grammar.Query), SnapshotRequestQueryModel (bio.terra.model.SnapshotRequestQueryModel), Dataset (bio.terra.service.dataset.Dataset), SnapshotRequestRowIdModel (bio.terra.model.SnapshotRequestRowIdModel), SnapshotRequestContentsModel (bio.terra.model.SnapshotRequestContentsModel), AssetSpecification (bio.terra.service.dataset.AssetSpecification), AssetNotFoundException (bio.terra.service.snapshot.exception.AssetNotFoundException), InvalidSnapshotException (bio.terra.service.snapshot.exception.InvalidSnapshotException)
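
The `BYQUERY` branch looks an asset up by name and fails fast through `Optional.orElseThrow` when it is missing. A minimal sketch of that pattern follows; `AssetLookup`, `findAsset`, and `requireAsset` are hypothetical names, a plain `Map<String, String>` stands in for the dataset's asset specifications, and `NoSuchElementException` stands in for the project's `AssetNotFoundException`.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

class AssetLookup {
    // Hypothetical stand-in for Dataset.getAssetSpecificationByName:
    // returns an empty Optional when no asset has the given name.
    static Optional<String> findAsset(Map<String, String> assetsByName, String name) {
        return Optional.ofNullable(assetsByName.get(name));
    }

    // Unwraps the lookup, or throws with a message that names the missing asset.
    static String requireAsset(Map<String, String> assetsByName, String name) {
        return findAsset(assetsByName, name).orElseThrow(() ->
            new NoSuchElementException(
                "This dataset does not have an asset specification with name: " + name));
    }

    public static void main(String[] args) {
        Map<String, String> assets = Map.of("sample_asset", "sample");
        System.out.println(requireAsset(assets, "sample_asset"));
        try {
            requireAsset(assets, "missing_asset");
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the supplier passed to `orElseThrow` runs only when the `Optional` is empty, the error message is built only on the failure path.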

Example 3 with Query

use of bio.terra.grammar.Query in project jade-data-repo by DataBiosphere.

the class CreateSnapshotPrimaryDataQueryStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    // TODO: this assumes single-dataset snapshots (based on the validation flight step
    // that already occurred); a loop will be needed to support multiple datasets.
    /*
     * Get the dataset and asset name, then get the asset from the dataset.
     * The asset gives the root table, which is used in conjunction with the
     * filtered row ids to create this snapshot.
     */
    Snapshot snapshot = snapshotDao.retrieveSnapshotByName(snapshotReq.getName());
    SnapshotRequestQueryModel snapshotQuerySpec = snapshotReq.getContents().get(0).getQuerySpec();
    String snapshotAssetName = snapshotQuerySpec.getAssetName();
    String snapshotQuery = snapshotQuerySpec.getQuery();
    Query query = Query.parse(snapshotQuery);
    List<String> datasetNames = query.getDatasetNames();
    // TODO this makes the assumption that there is only one dataset
    // (based on the validation flight step that already occurred.)
    // This will change when more than 1 dataset is allowed
    String datasetName = datasetNames.get(0);
    Dataset dataset = datasetService.retrieveByName(datasetName);
    DatasetModel datasetModel = datasetService.retrieveModel(dataset);
    // get asset out of dataset
    Optional<AssetSpecification> assetSpecOp = dataset.getAssetSpecificationByName(snapshotAssetName);
    AssetSpecification assetSpec = assetSpecOp.orElseThrow(() -> new AssetNotFoundException("Expected asset specification"));
    Map<String, DatasetModel> datasetMap = Collections.singletonMap(datasetName, datasetModel);
    BigQueryVisitor bqVisitor = new BigQueryVisitor(datasetMap);
    String sqlQuery = query.translateSql(bqVisitor);
    // Validate that the root table is actually one of the tables being queried.
    // Note that the grammar only picks up table names in the FROM clause (though there may be more than one).
    List<String> tableNames = query.getTableNames();
    String rootTablename = assetSpec.getRootTable().getTable().getName();
    if (!tableNames.contains(rootTablename)) {
        throw new InvalidQueryException("The root table of the selected asset is not present in this query");
    }
    // now using the query, get the rowIds
    // insert the rowIds into the snapshot row ids table and then kick off the rest of the relationship walking
    bigQueryPdao.queryForRowIds(assetSpec, snapshot, sqlQuery);
    return StepResult.getStepResultSuccess();
}
Also used : BigQueryVisitor (bio.terra.grammar.google.BigQueryVisitor), Query (bio.terra.grammar.Query), SnapshotRequestQueryModel (bio.terra.model.SnapshotRequestQueryModel), Dataset (bio.terra.service.dataset.Dataset), AssetSpecification (bio.terra.service.dataset.AssetSpecification), AssetNotFoundException (bio.terra.service.snapshot.exception.AssetNotFoundException), Snapshot (bio.terra.service.snapshot.Snapshot), DatasetModel (bio.terra.model.DatasetModel), InvalidQueryException (bio.terra.grammar.exception.InvalidQueryException)
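
The root-table guard in this step can be isolated as a small helper. This is a sketch only: `RootTableCheck` and `requireRootTableInQuery` are hypothetical names, and `IllegalArgumentException` stands in for the project's `InvalidQueryException`.

```java
import java.util.List;

class RootTableCheck {
    // Throws unless the asset's root table is one of the tables named in the query.
    static void requireRootTableInQuery(List<String> queriedTables, String rootTable) {
        if (!queriedTables.contains(rootTable)) {
            throw new IllegalArgumentException(
                "The root table of the selected asset is not present in this query");
        }
    }

    public static void main(String[] args) {
        List<String> tables = List.of("sample", "participant");
        requireRootTableInQuery(tables, "sample"); // passes silently
        try {
            requireRootTableInQuery(tables, "file");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

`List.contains` compares with `equals`, so the match is exact and case-sensitive; a root table named `Sample` would not match a queried table named `sample`.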

Aggregations

Query (bio.terra.grammar.Query) 3
SnapshotRequestQueryModel (bio.terra.model.SnapshotRequestQueryModel) 2
AssetSpecification (bio.terra.service.dataset.AssetSpecification) 2
Dataset (bio.terra.service.dataset.Dataset) 2
AssetNotFoundException (bio.terra.service.snapshot.exception.AssetNotFoundException) 2
ValidationException (bio.terra.app.controller.exception.ValidationException) 1
InvalidQueryException (bio.terra.grammar.exception.InvalidQueryException) 1
BigQueryVisitor (bio.terra.grammar.google.BigQueryVisitor) 1
DatasetModel (bio.terra.model.DatasetModel) 1
SnapshotRequestContentsModel (bio.terra.model.SnapshotRequestContentsModel) 1
SnapshotRequestRowIdModel (bio.terra.model.SnapshotRequestRowIdModel) 1
Snapshot (bio.terra.service.snapshot.Snapshot) 1
InvalidSnapshotException (bio.terra.service.snapshot.exception.InvalidSnapshotException) 1
MismatchedValueException (bio.terra.service.snapshot.exception.MismatchedValueException) 1
StepResult (bio.terra.stairway.StepResult) 1