Example 1 with StorageError

Use of com.google.cloud.bigquery.storage.v1.StorageError in the project beam by apache.

The method finishBundle of the class StorageApiFinalizeWritesDoFn.

@FinishBundle
@SuppressWarnings({ "nullness" })
public void finishBundle(PipelineOptions pipelineOptions) throws Exception {
    DatasetService datasetService = getDatasetService(pipelineOptions);
    for (Map.Entry<String, Collection<String>> entry : commitStreams.entrySet()) {
        final String tableId = entry.getKey();
        final Collection<String> streamNames = entry.getValue();
        final Set<String> alreadyCommittedStreams = Sets.newHashSet();
        RetryManager<BatchCommitWriteStreamsResponse, Context<BatchCommitWriteStreamsResponse>> retryManager = new RetryManager<>(Duration.standardSeconds(1), Duration.standardMinutes(1), 3);
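        // Callbacks, in order: the commit operation, the error handler, the success handler,
        // and a check that decides whether a response counts as success.
        retryManager.addOperation(c -> {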
            Iterable<String> streamsToCommit = Iterables.filter(streamNames, s -> !alreadyCommittedStreams.contains(s));
            batchCommitOperationsSent.inc();
            return datasetService.commitWriteStreams(tableId, streamsToCommit);
        }, contexts -> {
            LOG.error("BatchCommit failed. tableId " + tableId + " streamNames " + streamNames + " error: " + Iterables.getFirst(contexts, null).getError());
            batchCommitOperationsFailed.inc();
            return RetryType.RETRY_ALL_OPERATIONS;
        }, c -> {
            LOG.info("BatchCommit succeeded for tableId " + tableId + " response " + c.getResult());
            batchCommitOperationsSucceeded.inc();
        }, response -> {
            if (!response.hasCommitTime()) {
                for (StorageError storageError : response.getStreamErrorsList()) {
                    if (storageError.getCode() == StorageErrorCode.STREAM_ALREADY_COMMITTED) {
                        // Make sure that we don't retry any streams that are already committed.
                        alreadyCommittedStreams.add(storageError.getEntity());
                    }
                }
                Iterable<String> streamsToCommit = Iterables.filter(streamNames, s -> !alreadyCommittedStreams.contains(s));
                // If no streams are left to commit, report success; otherwise retry.
                return Iterables.isEmpty(streamsToCommit);
            }
            return true;
        }, new Context<>());
        retryManager.run(true);
    }
}
Also used: Context (org.apache.beam.sdk.io.gcp.bigquery.RetryManager.Operation.Context), DatasetService (org.apache.beam.sdk.io.gcp.bigquery.BigQueryServices.DatasetService), BatchCommitWriteStreamsResponse (com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsResponse), StorageError (com.google.cloud.bigquery.storage.v1.StorageError), Collection (java.util.Collection), Map (java.util.Map)
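
The retry logic in finishBundle can be hard to follow through the nested callbacks, so here is a minimal sketch of the same idea with no Beam or BigQuery dependencies. It assumes a much simpler model than the real code: CommitRetrySketch, StreamError, CommitResponse and commitWithRetry are hypothetical names invented for this illustration, the committed flag stands in for hasCommitTime(), the error code is a plain string, and there is no backoff between attempts. It is not how RetryManager is implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Illustrative only: none of these types are Beam or BigQuery classes. */
final class CommitRetrySketch {

    /** Hypothetical stand-in for a per-stream error in a batch commit response. */
    static final class StreamError {
        final String code;
        final String entity;

        StreamError(String code, String entity) {
            this.code = code;
            this.entity = entity;
        }
    }

    /** Hypothetical stand-in for the response; committed mirrors hasCommitTime(). */
    static final class CommitResponse {
        final boolean committed;
        final List<StreamError> errors;

        CommitResponse(boolean committed, List<StreamError> errors) {
            this.committed = committed;
            this.errors = errors;
        }
    }

    /** Returns the streams that have not been reported as already committed. */
    static List<String> remaining(List<String> streamNames, Set<String> alreadyCommitted) {
        List<String> result = new ArrayList<>();
        for (String name : streamNames) {
            if (!alreadyCommitted.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    /**
     * Calls commitFn at most maxAttempts times, each time sending only the
     * streams not yet reported as committed. Returns true once a commit
     * succeeds or nothing is left to commit, false if attempts run out.
     */
    static boolean commitWithRetry(
            List<String> streamNames,
            Function<List<String>, CommitResponse> commitFn,
            int maxAttempts) {
        Set<String> alreadyCommitted = new HashSet<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<String> toCommit = remaining(streamNames, alreadyCommitted);
            if (toCommit.isEmpty()) {
                return true; // every stream was committed by an earlier call
            }
            CommitResponse response = commitFn.apply(toCommit);
            if (response.committed) {
                return true;
            }
            // Remember streams the service says are already committed,
            // so the next attempt does not send them again.
            for (StreamError error : response.errors) {
                if ("STREAM_ALREADY_COMMITTED".equals(error.code)) {
                    alreadyCommitted.add(error.entity);
                }
            }
        }
        return false;
    }
}
```

Passing the commit call as a Function keeps the sketch testable without a service: a caller can supply a lambda that fails once with a STREAM_ALREADY_COMMITTED error and then succeeds, and check that the second call receives only the remaining streams.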