Example 1 with RetryException

use of bio.terra.stairway.exception.RetryException in project terra-workspace-manager by DataBiosphere.

the class CreateTableCopyJobsStep method doStep.

/**
 * Create one BigQuery copy job for each table in the source dataset. Keep a running map from
 * table ID to job ID as new jobs are created, and only create jobs for tables that aren't in the
 * map already. Rerun the step after every table is processed so that the map may be persisted
 * incrementally.
 *
 * <p>On retry, create the jobs for any tables that don't have them. Use WRITE_TRUNCATE to avoid
 * the possibility of duplicate data.
 */
@Override
public StepResult doStep(FlightContext flightContext) throws InterruptedException, RetryException {
    final FlightMap workingMap = flightContext.getWorkingMap();
    final CloningInstructions effectiveCloningInstructions = flightContext.getInputParameters().get(ControlledResourceKeys.CLONING_INSTRUCTIONS, CloningInstructions.class);
    if (CloningInstructions.COPY_RESOURCE != effectiveCloningInstructions) {
        return StepResult.getStepResultSuccess();
    }
    // Gather inputs
    final DatasetCloneInputs sourceInputs = getSourceInputs();
    workingMap.put(ControlledResourceKeys.SOURCE_CLONE_INPUTS, sourceInputs);
    final DatasetCloneInputs destinationInputs = getDestinationInputs(flightContext);
    workingMap.put(ControlledResourceKeys.DESTINATION_CLONE_INPUTS, destinationInputs);
    final BigQueryCow bigQueryCow = crlService.createWsmSaBigQueryCow();
    // TODO(jaycarlton):  remove usage of this client when it's all in CRL PF-942
    final Bigquery bigQueryClient = crlService.createWsmSaNakedBigQueryClient();
    try {
        // Get a list of all tables in the source dataset
        final TableList sourceTables = bigQueryCow.tables().list(sourceInputs.getProjectId(), sourceInputs.getDatasetName()).execute();
        // Start a copy job for each source table
        final Map<String, String> tableToJobId = Optional.ofNullable(workingMap.get(ControlledResourceKeys.TABLE_TO_JOB_ID_MAP, new TypeReference<Map<String, String>>() {})).orElseGet(HashMap::new);
        final List<Tables> tables = Optional.ofNullable(sourceTables.getTables()).orElse(Collections.emptyList());
        // Find the first table whose ID isn't a key in the map.
        final Optional<Tables> tableMaybe = tables.stream().filter(t -> null != t.getId() && !tableToJobId.containsKey(t.getId())).findFirst();
        if (tableMaybe.isPresent()) {
            final Tables table = tableMaybe.get();
            checkStreamingBuffer(sourceInputs, bigQueryCow, table);
            final Job inputJob = buildTableCopyJob(sourceInputs, destinationInputs, table);
            // bill the job to the destination project
            final Job submittedJob = bigQueryClient.jobs().insert(destinationInputs.getProjectId(), inputJob).execute();
            // Update the map, which will be persisted
            tableToJobId.put(table.getId(), submittedJob.getId());
            workingMap.put(ControlledResourceKeys.TABLE_TO_JOB_ID_MAP, tableToJobId);
            return new StepResult(StepStatus.STEP_RESULT_RERUN);
        } else {
            // All tables have entries in the map, so all jobs are started.
            // Persist the map even in case it's empty.
            workingMap.put(ControlledResourceKeys.TABLE_TO_JOB_ID_MAP, tableToJobId);
            return StepResult.getStepResultSuccess();
        }
    } catch (IOException e) {
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, e);
    }
}
Also used : TableList(com.google.api.services.bigquery.model.TableList) BigQueryCow(bio.terra.cloudres.google.bigquery.BigQueryCow) LoggerFactory(org.slf4j.LoggerFactory) HashMap(java.util.HashMap) Tables(com.google.api.services.bigquery.model.TableList.Tables) StepResult(bio.terra.stairway.StepResult) Step(bio.terra.stairway.Step) RetryException(bio.terra.stairway.exception.RetryException) Duration(java.time.Duration) Map(java.util.Map) TypeReference(com.fasterxml.jackson.core.type.TypeReference) Job(com.google.api.services.bigquery.model.Job) CrlService(bio.terra.workspace.service.crl.CrlService) TableReference(com.google.api.services.bigquery.model.TableReference) ControlledBigQueryDatasetResource(bio.terra.workspace.service.resource.controlled.cloud.gcp.bqdataset.ControlledBigQueryDatasetResource) Logger(org.slf4j.Logger) FlightMap(bio.terra.stairway.FlightMap) IOException(java.io.IOException) UUID(java.util.UUID) Instant(java.time.Instant) JobConfigurationTableCopy(com.google.api.services.bigquery.model.JobConfigurationTableCopy) Table(com.google.api.services.bigquery.model.Table) List(java.util.List) GcpCloudContextService(bio.terra.workspace.service.workspace.GcpCloudContextService) Bigquery(com.google.api.services.bigquery.Bigquery) CloningInstructions(bio.terra.workspace.service.resource.model.CloningInstructions) Optional(java.util.Optional) ControlledResourceKeys(bio.terra.workspace.service.workspace.flight.WorkspaceFlightMapKeys.ControlledResourceKeys) StepStatus(bio.terra.stairway.StepStatus) Collections(java.util.Collections) FlightContext(bio.terra.stairway.FlightContext) JobConfiguration(com.google.api.services.bigquery.model.JobConfiguration)
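The Javadoc in Example 1 describes a rerun pattern: handle one table per invocation, record it in a map, and ask to be run again so the map can be persisted between invocations. The following is a minimal sketch of that control flow in plain Java. It does not use Stairway or BigQuery; `RerunSketch`, its `Status` enum, `startNextJob`, and `runToCompletion` are hypothetical names chosen for illustration, and the "job" is just a string.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class RerunSketch {
    /** Hypothetical stand-in for a step status; not the Stairway API. */
    enum Status { SUCCESS, RERUN }

    /**
     * Starts a job for at most one table per call. Returns RERUN when a job was started, so the
     * caller can persist the map before the next call, and SUCCESS once every table has a job.
     */
    static Status startNextJob(List<String> tableIds, Map<String, String> tableToJobId) {
        Optional<String> next =
            tableIds.stream().filter(id -> !tableToJobId.containsKey(id)).findFirst();
        if (next.isPresent()) {
            // Placeholder for submitting a real copy job.
            tableToJobId.put(next.get(), "job-" + next.get());
            return Status.RERUN;
        }
        return Status.SUCCESS;
    }

    /** Calls the step again for as long as it asks for a rerun; returns the number of calls. */
    static int runToCompletion(List<String> tableIds, Map<String, String> tableToJobId) {
        int calls = 1;
        while (startNextJob(tableIds, tableToJobId) == Status.RERUN) {
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        Map<String, String> jobs = new LinkedHashMap<>();
        // Pretend an earlier, interrupted attempt already started the job for t1.
        jobs.put("t1", "job-t1");
        int calls = runToCompletion(List.of("t1", "t2", "t3"), jobs);
        System.out.println(calls + " calls, jobs=" + jobs);
    }
}
```

Because tables already present in the map are skipped, resuming after a failure does not start a second job for them, which is the property the original step relies on.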

Example 2 with RetryException

use of bio.terra.stairway.exception.RetryException in project terra-resource-buffer by DataBiosphere.

the class CreateConsumerDefinedQuotaForBigQueryDailyUsageStep method doStep.

/**
 * Apply a Consumer Quota Override for the BigQuery Query Usage Quota.
 */
@Override
public StepResult doStep(FlightContext context) throws InterruptedException, RetryException {
    Optional<Long> overrideValue = GoogleProjectConfigUtils.bigQueryDailyUsageOverrideValueMebibytes(gcpProjectConfig);
    if (overrideValue.isEmpty()) {
        // Do not apply any quota override
        return StepResult.getStepResultSuccess();
    }
    long projectNumber = Optional.ofNullable(context.getWorkingMap().get(GOOGLE_PROJECT_NUMBER, Long.class)).orElseThrow();
    QuotaOverride overridePerProjectPerDay = buildQuotaOverride(projectNumber, overrideValue.get());
    // parent format and other details obtained by hitting the endpoint
    // https://serviceusage.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/services/bigquery.googleapis.com/consumerQuotaMetrics
    String parent = String.format("projects/%d/services/bigquery.googleapis.com/consumerQuotaMetrics/" + "bigquery.googleapis.com%%2Fquota%%2Fquery%%2Fusage/limits/%%2Fd%%2Fproject", projectNumber);
    try {
        // We are decreasing the quota by more than 10%, so we must tell Service Usage to bypass the
        // check with the force flag.
        Operation createOperation = serviceUsageCow.services().consumerQuotaMetrics().limits().consumerOverrides().create(parent, overridePerProjectPerDay).setForce(true).execute();
        OperationCow<Operation> operationCow = serviceUsageCow.operations().operationCow(createOperation);
        pollUntilSuccess(operationCow, Duration.ofSeconds(3), Duration.ofMinutes(5));
    } catch (IOException e) {
        throw new RetryException(e);
    }
    return StepResult.getStepResultSuccess();
}
Also used : QuotaOverride(com.google.api.services.serviceusage.v1beta1.model.QuotaOverride) Operation(com.google.api.services.serviceusage.v1beta1.model.Operation) IOException(java.io.IOException) RetryException(bio.terra.stairway.exception.RetryException)
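The `parent` string in Example 2 mixes a `%d` placeholder with `%%2F` sequences. In `String.format`, `%%` produces a literal percent sign, so `%%2F` comes out as the URL-encoded slash `%2F`. The sketch below isolates that one line to show the resulting value; `QuotaParentSketch`, `buildParent`, and the project number used in `main` are made up for illustration.

```java
class QuotaParentSketch {
    /** Builds the parent resource name using the same format string as the example above. */
    static String buildParent(long projectNumber) {
        return String.format(
            "projects/%d/services/bigquery.googleapis.com/consumerQuotaMetrics/"
                + "bigquery.googleapis.com%%2Fquota%%2Fquery%%2Fusage/limits/%%2Fd%%2Fproject",
            projectNumber);
    }

    public static void main(String[] args) {
        // 123456789 is an arbitrary, made-up project number.
        System.out.println(buildParent(123456789L));
    }
}
```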

Example 3 with RetryException

use of bio.terra.stairway.exception.RetryException in project terra-resource-buffer by DataBiosphere.

the class CreateGkeDefaultSAStep method doStep.

@Override
public StepResult doStep(FlightContext flightContext) throws RetryException {
    if (!createGkeDefaultSa(gcpProjectConfig)) {
        return StepResult.getStepResultSuccess();
    }
    String projectId = flightContext.getWorkingMap().get(GOOGLE_PROJECT_ID, String.class);
    CreateServiceAccountRequest createRequest = new CreateServiceAccountRequest().setAccountId(GKE_SA_NAME).setServiceAccount(new ServiceAccount().setDescription("Default service account can be used on GKE node. "));
    try {
        iamCow.projects().serviceAccounts().create("projects/" + projectId, createRequest).execute();
    } catch (GoogleJsonResponseException e) {
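        // A 409 CONFLICT means the service account already exists (e.g. from an earlier run of
        // this step), so continue. Otherwise throw a retry exception.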
        if (e.getStatusCode() != HttpStatus.CONFLICT.value()) {
            throw new RetryException(e);
        }
        logger.warn("Service account {} already created for notebook instance.", GKE_SA_NAME);
    } catch (IOException e) {
        throw new RetryException(e);
    }
    // Grants permission that a GKE node runner needs
    String serviceAccountEmail = ServiceAccountName.emailFromAccountId(GKE_SA_NAME, projectId);
    try {
        Policy policy = rmCow.projects().getIamPolicy(projectId, new GetIamPolicyRequest()).execute();
        GKE_SA_ROLES.forEach(r -> policy.getBindings().add(new Binding().setRole(r).setMembers(Collections.singletonList("serviceAccount:" + serviceAccountEmail))));
        // Duplicating bindings is harmless (e.g. on retry). GCP de-duplicates.
        rmCow.projects().setIamPolicy(projectId, new SetIamPolicyRequest().setPolicy(policy)).execute();
    } catch (IOException e) {
        logger.info("Error when setting IAM policy for GKE default node SA", e);
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, e);
    }
    return StepResult.getStepResultSuccess();
}
Also used : Policy(com.google.api.services.cloudresourcemanager.v3.model.Policy) Binding(com.google.api.services.cloudresourcemanager.v3.model.Binding) ServiceAccount(com.google.api.services.iam.v1.model.ServiceAccount) GoogleJsonResponseException(com.google.api.client.googleapis.json.GoogleJsonResponseException) SetIamPolicyRequest(com.google.api.services.cloudresourcemanager.v3.model.SetIamPolicyRequest) IOException(java.io.IOException) RetryException(bio.terra.stairway.exception.RetryException) GetIamPolicyRequest(com.google.api.services.cloudresourcemanager.v3.model.GetIamPolicyRequest) StepResult(bio.terra.stairway.StepResult) CreateServiceAccountRequest(com.google.api.services.iam.v1.model.CreateServiceAccountRequest)

Example 4 with RetryException

use of bio.terra.stairway.exception.RetryException in project terra-workspace-manager by DataBiosphere.

the class CreateCustomGcpRolesStep method createCustomRole.

/**
 * Utility for creating custom roles in GCP from WSM's CustomGcpIamRole objects. These roles will
 * be defined at the project level in the project specified by projectId.
 */
private void createCustomRole(CustomGcpIamRole customRole, String projectId) throws RetryException {
    try {
        Role gcpRole = new Role().setIncludedPermissions(customRole.getIncludedPermissions()).setTitle(customRole.getRoleName());
        CreateRoleRequest request = new CreateRoleRequest().setRole(gcpRole).setRoleId(customRole.getRoleName());
        logger.debug("Creating role {} with permissions {} in project {}", customRole.getRoleName(), customRole.getIncludedPermissions(), projectId);
        iamCow.projects().roles().create("projects/" + projectId, request).execute();
    } catch (GoogleJsonResponseException googleEx) {
        // Any conflict of role names must be due to duplicate step execution, so a 409 CONFLICT
        // is ignored. Retry on any other response error.
        if (googleEx.getStatusCode() != HttpStatus.CONFLICT.value()) {
            throw new RetryException(googleEx);
        }
    } catch (IOException e) {
        // Retry on IO exceptions thrown by CRL.
        throw new RetryException(e);
    }
}
Also used : CustomGcpIamRole(bio.terra.workspace.service.resource.controlled.cloud.gcp.CustomGcpIamRole) Role(com.google.api.services.iam.v1.model.Role) GoogleJsonResponseException(com.google.api.client.googleapis.json.GoogleJsonResponseException) CreateRoleRequest(com.google.api.services.iam.v1.model.CreateRoleRequest) IOException(java.io.IOException) RetryException(bio.terra.stairway.exception.RetryException)
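Examples 3 and 4 share one idea: a create call that answers 409 CONFLICT is treated as already done, while any other failure is turned into a retry signal. The sketch below shows that shape without Google or Stairway dependencies. Everything in it is hypothetical: `HttpStatusException` stands in for an HTTP error response, `RetryableException` for a retry signal (made unchecked here only to keep the sketch short, which is an assumption and not a statement about the real `RetryException`), and `FakeRoleService` is an in-memory fake.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

class IdempotentCreateSketch {
    static final int CONFLICT = 409;

    /** Hypothetical stand-in for an HTTP error response carrying a status code. */
    static class HttpStatusException extends IOException {
        final int statusCode;

        HttpStatusException(int statusCode) {
            super("HTTP " + statusCode);
            this.statusCode = statusCode;
        }
    }

    /** Hypothetical retry signal; unchecked only to keep this sketch short. */
    static class RetryableException extends RuntimeException {
        RetryableException(Throwable cause) {
            super(cause);
        }
    }

    /** In-memory fake: creating a role that already exists answers 409. */
    static class FakeRoleService {
        private final Set<String> roles = new HashSet<>();

        void create(String roleId) throws IOException {
            if (!roles.add(roleId)) {
                throw new HttpStatusException(CONFLICT);
            }
        }

        int roleCount() {
            return roles.size();
        }
    }

    /** Creates the role, treating "already exists" as success so the call is safe to repeat. */
    static void createRole(FakeRoleService service, String roleId) {
        try {
            service.create(roleId);
        } catch (HttpStatusException e) {
            // The more specific exception must be caught before IOException.
            if (e.statusCode != CONFLICT) {
                throw new RetryableException(e);
            }
            // 409: the role exists already, most likely from an earlier run. Nothing to do.
        } catch (IOException e) {
            throw new RetryableException(e);
        }
    }

    public static void main(String[] args) {
        FakeRoleService service = new FakeRoleService();
        createRole(service, "custom.reader");
        createRole(service, "custom.reader"); // second call hits the 409 branch
        System.out.println("roles created: " + service.roleCount());
    }
}
```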

Example 5 with RetryException

use of bio.terra.stairway.exception.RetryException in project terra-workspace-manager by DataBiosphere.

the class CreateAiNotebookInstanceStep method undoStep.

@Override
public StepResult undoStep(FlightContext flightContext) throws InterruptedException {
    final GcpCloudContext gcpCloudContext = flightContext.getWorkingMap().get(ControlledResourceKeys.GCP_CLOUD_CONTEXT, GcpCloudContext.class);
    InstanceName instanceName = resource.toInstanceName(gcpCloudContext.getGcpProjectId());
    AIPlatformNotebooksCow notebooks = crlService.getAIPlatformNotebooksCow();
    try {
        OperationCow<Operation> deletionOperation;
        try {
            deletionOperation = notebooks.operations().operationCow(notebooks.instances().delete(instanceName).execute());
        } catch (GoogleJsonResponseException e) {
            // The AI notebook instance may never have been created or have already been deleted.
            if (e.getStatusCode() == HttpStatus.NOT_FOUND.value()) {
                logger.debug("No notebook instance {} to delete.", instanceName.formatName());
                return StepResult.getStepResultSuccess();
            }
            return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, e);
        }
        GcpUtils.pollUntilSuccess(deletionOperation, Duration.ofSeconds(20), Duration.ofMinutes(12));
    } catch (IOException | RetryException e) {
        return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, e);
    }
    return StepResult.getStepResultSuccess();
}
Also used : InstanceName(bio.terra.cloudres.google.notebooks.InstanceName) AIPlatformNotebooksCow(bio.terra.cloudres.google.notebooks.AIPlatformNotebooksCow) GoogleJsonResponseException(com.google.api.client.googleapis.json.GoogleJsonResponseException) Operation(com.google.api.services.notebooks.v1.model.Operation) IOException(java.io.IOException) StepResult(bio.terra.stairway.StepResult) RetryException(bio.terra.stairway.exception.RetryException) GcpCloudContext(bio.terra.workspace.service.workspace.model.GcpCloudContext)
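Examples 2 and 5 both wait on a long-running operation by passing two `Duration` values to `pollUntilSuccess`. The source does not show that helper, so the following is only a sketch of what such a loop might look like, assuming the first duration is the polling interval and the second is an overall timeout. `PollSketch` and `pollUntilDone` are hypothetical names, and the sketch reports a timeout by returning `false`; how the real helper reports one is not shown in the source.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

class PollSketch {
    /**
     * Checks isDone repeatedly, sleeping for the interval between checks. Returns true as soon as
     * isDone reports completion, or false if the timeout passes or the thread is interrupted.
     */
    static boolean pollUntilDone(BooleanSupplier isDone, Duration interval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can still observe the interruption.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] checks = {0};
        // Simulated operation that reports completion on the third check.
        boolean done =
            pollUntilDone(() -> ++checks[0] >= 3, Duration.ofMillis(10), Duration.ofSeconds(5));
        System.out.println("done=" + done + " after " + checks[0] + " checks");
    }
}
```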

Aggregations

RetryException (bio.terra.stairway.exception.RetryException) 6
IOException (java.io.IOException) 6
StepResult (bio.terra.stairway.StepResult) 4
GoogleJsonResponseException (com.google.api.client.googleapis.json.GoogleJsonResponseException) 3
FlightContext (bio.terra.stairway.FlightContext) 2
Step (bio.terra.stairway.Step) 2
StepStatus (bio.terra.stairway.StepStatus) 2
Logger (org.slf4j.Logger) 2
GcpProjectConfig (bio.terra.buffer.generated.model.GcpProjectConfig) 1
GOOGLE_PROJECT_ID (bio.terra.buffer.service.resource.FlightMapKeys.GOOGLE_PROJECT_ID) 1
BigQueryCow (bio.terra.cloudres.google.bigquery.BigQueryCow) 1
CloudResourceManagerCow (bio.terra.cloudres.google.cloudresourcemanager.CloudResourceManagerCow) 1
AIPlatformNotebooksCow (bio.terra.cloudres.google.notebooks.AIPlatformNotebooksCow) 1
InstanceName (bio.terra.cloudres.google.notebooks.InstanceName) 1
FlightMap (bio.terra.stairway.FlightMap) 1
CrlService (bio.terra.workspace.service.crl.CrlService) 1
CustomGcpIamRole (bio.terra.workspace.service.resource.controlled.cloud.gcp.CustomGcpIamRole) 1
ControlledBigQueryDatasetResource (bio.terra.workspace.service.resource.controlled.cloud.gcp.bqdataset.ControlledBigQueryDatasetResource) 1
CloningInstructions (bio.terra.workspace.service.resource.model.CloningInstructions) 1
GcpCloudContextService (bio.terra.workspace.service.workspace.GcpCloudContextService) 1