
Example 6 with BigQueryWriteClient

use of com.google.cloud.bigquery.storage.v1.BigQueryWriteClient in project java-bigquerystorage by googleapis.

the class WriteCommittedStream method writeCommittedStream.

public static void writeCommittedStream(String projectId, String datasetName, String tableName) throws DescriptorValidationException, InterruptedException, IOException {
    try (BigQueryWriteClient client = BigQueryWriteClient.create()) {
        // Initialize a write stream for the specified table.
        // For more information on WriteStream.Type, see:
        // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/WriteStream.Type.html
        WriteStream stream = WriteStream.newBuilder().setType(WriteStream.Type.COMMITTED).build();
        TableName parentTable = TableName.of(projectId, datasetName, tableName);
        CreateWriteStreamRequest createWriteStreamRequest = CreateWriteStreamRequest.newBuilder().setParent(parentTable.toString()).setWriteStream(stream).build();
        WriteStream writeStream = client.createWriteStream(createWriteStreamRequest);
        // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/JsonStreamWriter.html
        try (JsonStreamWriter writer = JsonStreamWriter.newBuilder(writeStream.getName(), writeStream.getTableSchema()).build()) {
            // Write two batches to the stream, each with 10 JSON records. Reuse a writer for
            // as many appends as possible; creating a writer for a single append is an antipattern.
            for (int i = 0; i < 2; i++) {
                // Create a JSON object that is compatible with the table schema.
                JSONArray jsonArr = new JSONArray();
                for (int j = 0; j < 10; j++) {
                    JSONObject record = new JSONObject();
                    record.put("col1", String.format("record %03d-%03d", i, j));
                    jsonArr.put(record);
                }
                // To detect duplicate records, pass the index as the record offset.
                // To disable deduplication, omit the offset or use WriteStream.Type.DEFAULT.
                ApiFuture<AppendRowsResponse> future = writer.append(jsonArr, /*offset=*/ i * 10);
                AppendRowsResponse response = future.get();
            }
            // Finalize the stream after use.
            FinalizeWriteStreamRequest finalizeWriteStreamRequest = FinalizeWriteStreamRequest.newBuilder().setName(writeStream.getName()).build();
            client.finalizeWriteStream(finalizeWriteStreamRequest);
        }
        System.out.println("Appended records successfully.");
    } catch (ExecutionException e) {
        // If the wrapped exception is a StatusRuntimeException, check the state of the operation.
        // If the state is INTERNAL, CANCELLED, or ABORTED, you can retry. For more information, see:
        // https://grpc.github.io/grpc-java/javadoc/io/grpc/StatusRuntimeException.html
        System.out.println("Failed to append records. \n" + e.toString());
    }
}
Also used : TableName(com.google.cloud.bigquery.storage.v1.TableName) CreateWriteStreamRequest(com.google.cloud.bigquery.storage.v1.CreateWriteStreamRequest) FinalizeWriteStreamRequest(com.google.cloud.bigquery.storage.v1.FinalizeWriteStreamRequest) JSONObject(org.json.JSONObject) JSONArray(org.json.JSONArray) AppendRowsResponse(com.google.cloud.bigquery.storage.v1.AppendRowsResponse) WriteStream(com.google.cloud.bigquery.storage.v1.WriteStream) BigQueryWriteClient(com.google.cloud.bigquery.storage.v1.BigQueryWriteClient) ExecutionException(java.util.concurrent.ExecutionException) JsonStreamWriter(com.google.cloud.bigquery.storage.v1.JsonStreamWriter)
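
The catch block above only logs the failure. The sketch below shows one way to implement the retry check described in its comments, assuming io.grpc is on the classpath; the helper name isRetryableAppendError is hypothetical and not part of the sample.

import io.grpc.Status;
import java.util.concurrent.ExecutionException;

public class AppendErrorHelper {

    // Hypothetical helper: extract the gRPC status wrapped by the ExecutionException
    // and report whether it is one of the states the sample's comment calls retryable.
    public static boolean isRetryableAppendError(ExecutionException e) {
        // Status.fromThrowable walks the cause chain and falls back to UNKNOWN
        // when no gRPC status is attached.
        Status.Code code = Status.fromThrowable(e.getCause()).getCode();
        return code == Status.Code.INTERNAL
            || code == Status.Code.CANCELLED
            || code == Status.Code.ABORTED;
    }
}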

Example 7 with BigQueryWriteClient

use of com.google.cloud.bigquery.storage.v1.BigQueryWriteClient in project spark-bigquery-connector by GoogleCloudDataproc.

the class BigQueryClientFactoryTest method testGetWriteClientForSameClientFactory.

@Test
public void testGetWriteClientForSameClientFactory() {
    BigQueryClientFactory clientFactory = new BigQueryClientFactory(bigQueryCredentialsSupplier, headerProvider, bigQueryConfig);
    when(bigQueryConfig.getBigQueryProxyConfig()).thenReturn(bigQueryProxyConfig);
    BigQueryWriteClient writeClient = clientFactory.getBigQueryWriteClient();
    assertNotNull(writeClient);
    BigQueryWriteClient writeClient2 = clientFactory.getBigQueryWriteClient();
    assertNotNull(writeClient2);
    assertSame(writeClient, writeClient2);
}
Also used : BigQueryWriteClient(com.google.cloud.bigquery.storage.v1.BigQueryWriteClient) Test(org.junit.Test)
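
The test above asserts that BigQueryClientFactory returns the same BigQueryWriteClient instance across calls. Below is a minimal sketch of that caching pattern with default client settings; the class name CachingWriteClientFactory and its shape are hypothetical and are not the connector's actual implementation, which also applies credentials, headers, and proxy configuration.

import com.google.cloud.bigquery.storage.v1.BigQueryWriteClient;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CachingWriteClientFactory {

    private BigQueryWriteClient cachedClient;

    // Lazily create the client once and hand back the same instance afterwards,
    // which is the behavior testGetWriteClientForSameClientFactory verifies.
    public synchronized BigQueryWriteClient getBigQueryWriteClient() {
        if (cachedClient == null) {
            try {
                cachedClient = BigQueryWriteClient.create();
            } catch (IOException e) {
                throw new UncheckedIOException("Failed to create BigQueryWriteClient", e);
            }
        }
        return cachedClient;
    }
}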

Example 8 with BigQueryWriteClient

use of com.google.cloud.bigquery.storage.v1.BigQueryWriteClient in project java-bigquerystorage by googleapis.

the class WritePendingStream method writePendingStream.

public static void writePendingStream(String projectId, String datasetName, String tableName) throws DescriptorValidationException, InterruptedException, IOException {
    try (BigQueryWriteClient client = BigQueryWriteClient.create()) {
        // Initialize a write stream for the specified table.
        // For more information on WriteStream.Type, see:
        // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/WriteStream.Type.html
        WriteStream stream = WriteStream.newBuilder().setType(WriteStream.Type.PENDING).build();
        TableName parentTable = TableName.of(projectId, datasetName, tableName);
        CreateWriteStreamRequest createWriteStreamRequest = CreateWriteStreamRequest.newBuilder().setParent(parentTable.toString()).setWriteStream(stream).build();
        WriteStream writeStream = client.createWriteStream(createWriteStreamRequest);
        // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/JsonStreamWriter.html
        try (JsonStreamWriter writer = JsonStreamWriter.newBuilder(writeStream.getName(), writeStream.getTableSchema()).build()) {
            // Write two batches to the stream, each with 10 JSON records.
            for (int i = 0; i < 2; i++) {
                // Create a JSON object that is compatible with the table schema.
                JSONArray jsonArr = new JSONArray();
                for (int j = 0; j < 10; j++) {
                    JSONObject record = new JSONObject();
                    record.put("col1", String.format("batch-record %03d-%03d", i, j));
                    jsonArr.put(record);
                }
                ApiFuture<AppendRowsResponse> future = writer.append(jsonArr);
                AppendRowsResponse response = future.get();
            }
            FinalizeWriteStreamResponse finalizeResponse = client.finalizeWriteStream(writeStream.getName());
            System.out.println("Rows written: " + finalizeResponse.getRowCount());
        }
        // Commit the streams.
        BatchCommitWriteStreamsRequest commitRequest = BatchCommitWriteStreamsRequest.newBuilder().setParent(parentTable.toString()).addWriteStreams(writeStream.getName()).build();
        BatchCommitWriteStreamsResponse commitResponse = client.batchCommitWriteStreams(commitRequest);
        // If the response does not have a commit time, it means the commit operation failed.
        if (!commitResponse.hasCommitTime()) {
            for (StorageError err : commitResponse.getStreamErrorsList()) {
                System.out.println(err.getErrorMessage());
            }
            throw new RuntimeException("Error committing the streams");
        }
        System.out.println("Appended and committed records successfully.");
    } catch (ExecutionException e) {
        // If the wrapped exception is a StatusRuntimeException, check the state of the operation.
        // If the state is INTERNAL, CANCELLED, or ABORTED, you can retry. For more information, see:
        // https://grpc.github.io/grpc-java/javadoc/io/grpc/StatusRuntimeException.html
        System.out.println(e);
    }
}
Also used : FinalizeWriteStreamResponse(com.google.cloud.bigquery.storage.v1.FinalizeWriteStreamResponse) BatchCommitWriteStreamsRequest(com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsRequest) JSONArray(org.json.JSONArray) AppendRowsResponse(com.google.cloud.bigquery.storage.v1.AppendRowsResponse) WriteStream(com.google.cloud.bigquery.storage.v1.WriteStream) BigQueryWriteClient(com.google.cloud.bigquery.storage.v1.BigQueryWriteClient) TableName(com.google.cloud.bigquery.storage.v1.TableName) CreateWriteStreamRequest(com.google.cloud.bigquery.storage.v1.CreateWriteStreamRequest) BatchCommitWriteStreamsResponse(com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsResponse) JSONObject(org.json.JSONObject) StorageError(com.google.cloud.bigquery.storage.v1.StorageError) ExecutionException(java.util.concurrent.ExecutionException) JsonStreamWriter(com.google.cloud.bigquery.storage.v1.JsonStreamWriter)
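
A short usage sketch for the pending-stream sample above, assuming this runner sits in the same package as the WritePendingStream class and that the target table already exists with a STRING column named col1 so the generated JSON records match the schema. The project, dataset, and table identifiers are placeholders.

public class WritePendingStreamRunner {

    public static void main(String[] args) throws Exception {
        // Placeholder identifiers; replace with real values before running.
        String projectId = "my-project";
        String datasetName = "my_dataset";
        String tableName = "my_table";
        // Appends two batches of 10 rows to a PENDING stream, finalizes it,
        // and batch-commits the stream so the rows become visible in the table.
        WritePendingStream.writePendingStream(projectId, datasetName, tableName);
    }
}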

Example 9 with BigQueryWriteClient

use of com.google.cloud.bigquery.storage.v1.BigQueryWriteClient in project java-bigquerystorage by googleapis.

the class ParallelWriteCommittedStream method writeLoop.

public void writeLoop(String projectId, String datasetName, String tableName, BigQueryWriteClient client) {
    LOG.info("Start writeLoop");
    long streamSwitchCount = 0;
    long successRowCount = 0;
    long failureRowCount = 0;
    Throwable loggedError = null;
    long deadlineMillis = System.currentTimeMillis() + TEST_TIME.toMillis();
    while (System.currentTimeMillis() < deadlineMillis) {
        try {
            WriteStream writeStream = createStream(projectId, datasetName, tableName, client);
            writeToStream(client, writeStream, deadlineMillis);
        } catch (Throwable e) {
            LOG.warning("Unexpected error writing to stream: " + e.toString());
        }
        waitForInflightToReachZero(Duration.ofMinutes(1));
        synchronized (this) {
            successRowCount += successCount * BATCH_SIZE;
            failureRowCount += failureCount * BATCH_SIZE;
            if (loggedError == null) {
                loggedError = error;
            }
        }
        if (!SUPPORT_STREAM_SWITCH) {
            // If stream switch is disabled, break.
            break;
        }
        LOG.info("Sleeping before switching stream.");
        sleepIgnoringInterruption(Duration.ofMinutes(1));
        streamSwitchCount++;
    }
    LOG.info("Finish writeLoop. Success row count: " + successRowCount + " Failure row count: " + failureRowCount + " Logged error: " + loggedError + " Stream switch count: " + streamSwitchCount);
    if (successRowCount > 0 && failureRowCount == 0 && loggedError == null) {
        System.out.println("All records are appended successfully.");
    }
}
Also used : WriteStream(com.google.cloud.bigquery.storage.v1.WriteStream)
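
writeLoop relies on two helpers, waitForInflightToReachZero and sleepIgnoringInterruption, that are not shown in this excerpt. The sketch below shows one plausible shape for them as members of ParallelWriteCommittedStream, assuming the class's inflightCount field is decremented (with a notifyAll()) by the append callback in Example 10 and that Duration is java.time.Duration; the bodies are illustrative, not the sample's exact code.

    // Block (up to the given timeout) until every outstanding append has completed.
    // Assumes the append callback decrements inflightCount and calls notifyAll().
    private synchronized void waitForInflightToReachZero(Duration timeout) {
        long deadlineMillis = System.currentTimeMillis() + timeout.toMillis();
        while (inflightCount > 0 && System.currentTimeMillis() < deadlineMillis) {
            try {
                wait(Math.max(1, deadlineMillis - System.currentTimeMillis()));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Sleep for the given duration; as the name suggests, interruption is not propagated.
    private static void sleepIgnoringInterruption(Duration duration) {
        try {
            Thread.sleep(duration.toMillis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }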

Example 10 with BigQueryWriteClient

use of com.google.cloud.bigquery.storage.v1.BigQueryWriteClient in project java-bigquerystorage by googleapis.

the class ParallelWriteCommittedStream method writeToStream.

private void writeToStream(BigQueryWriteClient client, WriteStream writeStream, long deadlineMillis) throws Throwable {
    LOG.info("Start writing to new stream:" + writeStream.getName());
    synchronized (this) {
        inflightCount = 0;
        successCount = 0;
        failureCount = 0;
        error = null;
        lastMetricsTimeMillis = System.currentTimeMillis();
        lastMetricsSuccessCount = 0;
        lastMetricsFailureCount = 0;
    }
    Descriptor descriptor = BQTableSchemaToProtoDescriptor.convertBQTableSchemaToProtoDescriptor(writeStream.getTableSchema());
    ProtoSchema protoSchema = ProtoSchemaConverter.convert(descriptor);
    try (StreamWriter writer = StreamWriter.newBuilder(writeStream.getName()).setWriterSchema(protoSchema).setTraceId("SAMPLE:parallel_append").build()) {
        while (System.currentTimeMillis() < deadlineMillis) {
            synchronized (this) {
                if (error != null) {
                    // Stop writing once we get an error.
                    throw error;
                }
            }
            ApiFuture<AppendRowsResponse> future = writer.append(createAppendRows(descriptor), -1);
            synchronized (this) {
                inflightCount++;
            }
            ApiFutures.addCallback(future, new AppendCompleteCallback(this), MoreExecutors.directExecutor());
        }
    }
}
Also used : ProtoSchema(com.google.cloud.bigquery.storage.v1.ProtoSchema) StreamWriter(com.google.cloud.bigquery.storage.v1.StreamWriter) AppendRowsResponse(com.google.cloud.bigquery.storage.v1.AppendRowsResponse) Descriptor(com.google.protobuf.Descriptors.Descriptor) BQTableSchemaToProtoDescriptor(com.google.cloud.bigquery.storage.v1.BQTableSchemaToProtoDescriptor)
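
The AppendCompleteCallback registered in writeToStream is not shown in this excerpt. Below is a hedged sketch of what such a callback could look like, written as a nested class of ParallelWriteCommittedStream so it can update the shared counters used above; the exact field handling is an assumption, not the sample's verbatim code.

import com.google.api.core.ApiFutureCallback;
import com.google.cloud.bigquery.storage.v1.AppendRowsResponse;

// Callback registered via ApiFutures.addCallback in writeToStream. It updates the
// shared counters that writeLoop reads and wakes any thread waiting for in-flight
// appends to drain.
static class AppendCompleteCallback implements ApiFutureCallback<AppendRowsResponse> {

    private final ParallelWriteCommittedStream parent;

    AppendCompleteCallback(ParallelWriteCommittedStream parent) {
        this.parent = parent;
    }

    @Override
    public void onSuccess(AppendRowsResponse response) {
        synchronized (parent) {
            parent.successCount++;
            parent.inflightCount--;
            parent.notifyAll();
        }
    }

    @Override
    public void onFailure(Throwable throwable) {
        synchronized (parent) {
            // Keep the first error so writeLoop can report it and stop appending.
            if (parent.error == null) {
                parent.error = throwable;
            }
            parent.failureCount++;
            parent.inflightCount--;
            parent.notifyAll();
        }
    }
}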

Aggregations

BigQueryWriteClient (com.google.cloud.bigquery.storage.v1.BigQueryWriteClient): 8 usages
WriteStream (com.google.cloud.bigquery.storage.v1.WriteStream): 5 usages
AppendRowsResponse (com.google.cloud.bigquery.storage.v1.AppendRowsResponse): 4 usages
CreateWriteStreamRequest (com.google.cloud.bigquery.storage.v1.CreateWriteStreamRequest): 4 usages
TableName (com.google.cloud.bigquery.storage.v1.TableName): 4 usages
Test (org.junit.Test): 4 usages
JsonStreamWriter (com.google.cloud.bigquery.storage.v1.JsonStreamWriter): 3 usages
ExecutionException (java.util.concurrent.ExecutionException): 3 usages
JSONArray (org.json.JSONArray): 3 usages
JSONObject (org.json.JSONObject): 3 usages
FinalizeWriteStreamRequest (com.google.cloud.bigquery.storage.v1.FinalizeWriteStreamRequest): 2 usages
BQTableSchemaToProtoDescriptor (com.google.cloud.bigquery.storage.v1.BQTableSchemaToProtoDescriptor): 1 usage
BatchCommitWriteStreamsRequest (com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsRequest): 1 usage
BatchCommitWriteStreamsResponse (com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsResponse): 1 usage
FinalizeWriteStreamResponse (com.google.cloud.bigquery.storage.v1.FinalizeWriteStreamResponse): 1 usage
FlushRowsRequest (com.google.cloud.bigquery.storage.v1.FlushRowsRequest): 1 usage
FlushRowsResponse (com.google.cloud.bigquery.storage.v1.FlushRowsResponse): 1 usage
ProtoSchema (com.google.cloud.bigquery.storage.v1.ProtoSchema): 1 usage
StorageError (com.google.cloud.bigquery.storage.v1.StorageError): 1 usage
StreamWriter (com.google.cloud.bigquery.storage.v1.StreamWriter): 1 usage