Example 1 with DatasetTable

use of bio.terra.service.dataset.DatasetTable in project jade-data-repo by DataBiosphere.

the class IngestSetupStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    if (configService.testInsertFault(ConfigEnum.TABLE_INGEST_LOCK_CONFLICT_STOP_FAULT)) {
        logger.info("TABLE_INGEST_LOCK_CONFLICT_STOP_FAULT");
        while (!configService.testInsertFault(ConfigEnum.TABLE_INGEST_LOCK_CONFLICT_CONTINUE_FAULT)) {
            logger.info("Sleeping for CONTINUE FAULT");
            TimeUnit.SECONDS.sleep(5);
        }
        logger.info("TABLE_INGEST_LOCK_CONFLICT_CONTINUE_FAULT");
    }
    IngestRequestModel ingestRequestModel = IngestUtils.getIngestRequestModel(context);
    // We don't actually care about the output here since BQ takes the raw "gs://" string as input.
    // As long as parsing succeeds, we're good to move forward.
    IngestUtils.parseBlobUri(ingestRequestModel.getPath());
    Dataset dataset = IngestUtils.getDataset(context, datasetService);
    IngestUtils.putDatasetName(context, dataset.getName());
    DatasetTable targetTable = IngestUtils.getDatasetTable(context, dataset);
    String sgName = DatasetUtils.generateAuxTableName(targetTable, "st");
    IngestUtils.putStagingTableName(context, sgName);
    return StepResult.getStepResultSuccess();
}
Also used : Dataset(bio.terra.service.dataset.Dataset) IngestRequestModel(bio.terra.model.IngestRequestModel) DatasetTable(bio.terra.service.dataset.DatasetTable)

Example 2 with DatasetTable

use of bio.terra.service.dataset.DatasetTable in project jade-data-repo by DataBiosphere.

the class IngestUtils method getDatasetTable.

public static DatasetTable getDatasetTable(FlightContext context, Dataset dataset) {
    IngestRequestModel ingestRequest = getIngestRequestModel(context);
    Optional<DatasetTable> optTable = dataset.getTableByName(ingestRequest.getTable());
    if (!optTable.isPresent()) {
        throw new TableNotFoundException("Table not found: " + ingestRequest.getTable());
    }
    return optTable.get();
}
Also used : TableNotFoundException(bio.terra.service.dataset.exception.TableNotFoundException) IngestRequestModel(bio.terra.model.IngestRequestModel) DatasetTable(bio.terra.service.dataset.DatasetTable)
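
The lookup-or-fail pattern in getDatasetTable can be tried in isolation. The following is a minimal sketch, not the jade-data-repo implementation: SimpleDataset, SimpleTable, and the local TableNotFoundException are hypothetical stand-ins for the project's classes, and Optional.orElseThrow is swapped in for the explicit isPresent() check.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class TableLookupSketch {
    // Hypothetical stand-in for bio.terra.service.dataset.DatasetTable.
    static final class SimpleTable {
        private final String name;
        SimpleTable(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Hypothetical stand-in for the project's TableNotFoundException.
    static final class TableNotFoundException extends RuntimeException {
        TableNotFoundException(String message) { super(message); }
    }

    // Hypothetical stand-in for Dataset; only the by-name lookup is modeled.
    static final class SimpleDataset {
        private final List<SimpleTable> tables;
        SimpleDataset(List<SimpleTable> tables) { this.tables = tables; }
        Optional<SimpleTable> getTableByName(String name) {
            return tables.stream().filter(t -> t.getName().equals(name)).findFirst();
        }
    }

    // Returns the named table, or throws if the dataset has no such table.
    static SimpleTable getTable(SimpleDataset dataset, String tableName) {
        return dataset.getTableByName(tableName)
            .orElseThrow(() -> new TableNotFoundException("Table not found: " + tableName));
    }

    public static void main(String[] args) {
        SimpleDataset dataset = new SimpleDataset(
            Arrays.asList(new SimpleTable("sample"), new SimpleTable("participant")));
        System.out.println(getTable(dataset, "sample").getName());
        try {
            getTable(dataset, "missing");
        } catch (TableNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing from the helper keeps every caller free of its own null or Optional handling, which matches how the ingest steps in the other examples use the returned table directly.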

Example 3 with DatasetTable

use of bio.terra.service.dataset.DatasetTable in project jade-data-repo by DataBiosphere.

the class CreateDatasetAssetStep method getNewAssetSpec.

private AssetSpecification getNewAssetSpec(FlightContext context, Dataset dataset) {
    // get Asset Model and convert it to a spec
    AssetModel assetModel = context.getInputParameters().get(JobMapKeys.REQUEST.getKeyName(), AssetModel.class);
    List<DatasetTable> datasetTables = dataset.getTables();
    Map<String, Relationship> relationshipMap = new HashMap<>();
    Map<String, DatasetTable> tablesMap = new HashMap<>();
    datasetTables.forEach(datasetTable -> tablesMap.put(datasetTable.getName(), datasetTable));
    List<Relationship> datasetRelationships = dataset.getRelationships();
    datasetRelationships.forEach(relationship -> relationshipMap.put(relationship.getName(), relationship));
    return DatasetJsonConversion.assetModelToAssetSpecification(assetModel, tablesMap, relationshipMap);
}
Also used : HashMap(java.util.HashMap) Relationship(bio.terra.common.Relationship) AssetModel(bio.terra.model.AssetModel) AssetSpecification(bio.terra.service.dataset.AssetSpecification) DatasetTable(bio.terra.service.dataset.DatasetTable)
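
The name-to-object maps built in getNewAssetSpec follow a common indexing pattern. Below is a minimal sketch of that pattern using Collectors.toMap; NamedItem is a hypothetical stand-in for DatasetTable or Relationship, and the stream-based form is an alternative to the forEach/put loop shown above, not the project's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class NameIndexSketch {
    // Hypothetical stand-in for any object that exposes getName().
    static final class NamedItem {
        private final String name;
        NamedItem(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Builds a lookup map keyed by each item's name.
    // Unlike HashMap.put, the two-argument toMap throws
    // IllegalStateException if two items share a name.
    static Map<String, NamedItem> indexByName(List<NamedItem> items) {
        return items.stream()
            .collect(Collectors.toMap(NamedItem::getName, Function.identity()));
    }

    public static void main(String[] args) {
        List<NamedItem> tables = Arrays.asList(new NamedItem("sample"), new NamedItem("participant"));
        Map<String, NamedItem> byName = indexByName(tables);
        System.out.println(byName.containsKey("sample"));
        System.out.println(byName.size());
    }
}
```

The behavior on duplicate names is the main difference between the two forms: the forEach/put loop silently keeps the last item, while this sketch fails fast.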

Example 4 with DatasetTable

use of bio.terra.service.dataset.DatasetTable in project jade-data-repo by DataBiosphere.

the class IngestInsertIntoDatasetTableStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    Dataset dataset = IngestUtils.getDataset(context, datasetService);
    DatasetTable targetTable = IngestUtils.getDatasetTable(context, dataset);
    String stagingTableName = IngestUtils.getStagingTableName(context);
    IngestRequestModel ingestRequest = IngestUtils.getIngestRequestModel(context);
    PdaoLoadStatistics loadStatistics = IngestUtils.getIngestStatistics(context);
    IngestResponseModel ingestResponse = new IngestResponseModel()
        .dataset(dataset.getName())
        .datasetId(dataset.getId().toString())
        .table(ingestRequest.getTable())
        .path(ingestRequest.getPath())
        .loadTag(ingestRequest.getLoadTag())
        .badRowCount(loadStatistics.getBadRecords())
        .rowCount(loadStatistics.getRowCount());
    context.getWorkingMap().put(JobMapKeys.RESPONSE.getKeyName(), ingestResponse);
    bigQueryPdao.insertIntoDatasetTable(dataset, targetTable, stagingTableName);
    return StepResult.getStepResultSuccess();
}
Also used : Dataset(bio.terra.service.dataset.Dataset) IngestRequestModel(bio.terra.model.IngestRequestModel) DatasetTable(bio.terra.service.dataset.DatasetTable) PdaoLoadStatistics(bio.terra.common.PdaoLoadStatistics) IngestResponseModel(bio.terra.model.IngestResponseModel)

Example 5 with DatasetTable

use of bio.terra.service.dataset.DatasetTable in project jade-data-repo by DataBiosphere.

the class IngestLoadTableStep method doStep.

@Override
public StepResult doStep(FlightContext context) throws InterruptedException {
    Dataset dataset = IngestUtils.getDataset(context, datasetService);
    DatasetTable targetTable = IngestUtils.getDatasetTable(context, dataset);
    String stagingTableName = IngestUtils.getStagingTableName(context);
    IngestRequestModel ingestRequest = IngestUtils.getIngestRequestModel(context);
    PdaoLoadStatistics ingestStatistics = bigQueryPdao.loadToStagingTable(dataset, targetTable, stagingTableName, ingestRequest);
    // Save away the stats in the working map. We will use some of them later
    // when we make the annotations. Others are returned on the ingest response.
    IngestUtils.putIngestStatistics(context, ingestStatistics);
    return StepResult.getStepResultSuccess();
}
Also used : Dataset(bio.terra.service.dataset.Dataset) IngestRequestModel(bio.terra.model.IngestRequestModel) DatasetTable(bio.terra.service.dataset.DatasetTable) PdaoLoadStatistics(bio.terra.common.PdaoLoadStatistics)
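
Examples 4 and 5 pass load statistics from one step to a later one through shared flight state. The sketch below illustrates only that hand-off, under stated assumptions: a plain Map stands in for the working map, and STATS_KEY, LoadStats, loadStep, and responseStep are hypothetical names, not part of jade-data-repo.

```java
import java.util.HashMap;
import java.util.Map;

public class WorkingMapSketch {
    // Hypothetical key under which the statistics are stored.
    static final String STATS_KEY = "ingestStatistics";

    // Hypothetical stand-in for PdaoLoadStatistics.
    static final class LoadStats {
        final long rowCount;
        final long badRecords;
        LoadStats(long rowCount, long badRecords) {
            this.rowCount = rowCount;
            this.badRecords = badRecords;
        }
    }

    // Earlier step: performs the (simulated) load and saves its statistics.
    static void loadStep(Map<String, Object> workingMap) {
        workingMap.put(STATS_KEY, new LoadStats(100L, 2L));
    }

    // Later step: reads the saved statistics back to build a response summary.
    static String responseStep(Map<String, Object> workingMap) {
        LoadStats stats = (LoadStats) workingMap.get(STATS_KEY);
        return "rows=" + stats.rowCount + ", bad=" + stats.badRecords;
    }

    public static void main(String[] args) {
        Map<String, Object> workingMap = new HashMap<>();
        loadStep(workingMap);
        System.out.println(responseStep(workingMap));
    }
}
```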

Aggregations

DatasetTable (bio.terra.service.dataset.DatasetTable) 14
Dataset (bio.terra.service.dataset.Dataset) 7
IngestRequestModel (bio.terra.model.IngestRequestModel) 4
ArrayList (java.util.ArrayList) 4
ST (org.stringtemplate.v4.ST) 4
PdaoException (bio.terra.common.exception.PdaoException) 3
BigQuery (com.google.cloud.bigquery.BigQuery) 3
Column (bio.terra.common.Column) 2
PdaoLoadStatistics (bio.terra.common.PdaoLoadStatistics) 2
Relationship (bio.terra.common.Relationship) 2
Table (bio.terra.common.Table) 2
InvalidQueryException (bio.terra.grammar.exception.InvalidQueryException) 2
AssetColumn (bio.terra.service.dataset.AssetColumn) 2
AssetSpecification (bio.terra.service.dataset.AssetSpecification) 2
AssetTable (bio.terra.service.dataset.AssetTable) 2
MismatchedValueException (bio.terra.service.snapshot.exception.MismatchedValueException) 2
QueryJobConfiguration (com.google.cloud.bigquery.QueryJobConfiguration) 2
ValidationException (bio.terra.app.controller.exception.ValidationException) 1
MetadataEnumeration (bio.terra.common.MetadataEnumeration) 1
Query (bio.terra.grammar.Query) 1