Example 6 with AppendRowsResponse

use of com.google.cloud.bigquery.storage.v1.AppendRowsResponse in project java-bigquerystorage by googleapis.

the class WriteToDefaultStream method writeToDefaultStream.

public static void writeToDefaultStream(String projectId, String datasetName, String tableName) throws DescriptorValidationException, InterruptedException, IOException {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    Table table = bigquery.getTable(datasetName, tableName);
    TableName parentTable = TableName.of(projectId, datasetName, tableName);
    Schema schema = table.getDefinition().getSchema();
    TableSchema tableSchema = BqToBqStorageSchemaConverter.convertTableSchema(schema);
    // For more information about JsonStreamWriter, see:
    // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/JsonStreamWriter.html
    try (JsonStreamWriter writer = JsonStreamWriter.newBuilder(parentTable.toString(), tableSchema).build()) {
        // Write two batches to the stream, each with 10 JSON records. Reuse one writer for as
        // many writes as possible. Creating a writer for just one write is an antipattern.
        for (int i = 0; i < 2; i++) {
            // Create a JSON object that is compatible with the table schema.
            JSONArray jsonArr = new JSONArray();
            for (int j = 0; j < 10; j++) {
                JSONObject record = new JSONObject();
                record.put("test_string", String.format("record %03d-%03d", i, j));
                jsonArr.put(record);
            }
            ApiFuture<AppendRowsResponse> future = writer.append(jsonArr);
            AppendRowsResponse response = future.get();
        }
        System.out.println("Appended records successfully.");
    } catch (ExecutionException e) {
        // If the wrapped exception is a StatusRuntimeException, check the state of the operation.
        // If the state is INTERNAL, CANCELLED, or ABORTED, you can retry. For more information, see:
        // https://grpc.github.io/grpc-java/javadoc/io/grpc/StatusRuntimeException.html
        System.out.println("Failed to append records. \n" + e.toString());
    }
}
Also used : TableName(com.google.cloud.bigquery.storage.v1.TableName) BigQuery(com.google.cloud.bigquery.BigQuery) Table(com.google.cloud.bigquery.Table) TableSchema(com.google.cloud.bigquery.storage.v1.TableSchema) JSONObject(org.json.JSONObject) Schema(com.google.cloud.bigquery.Schema) JSONArray(org.json.JSONArray) AppendRowsResponse(com.google.cloud.bigquery.storage.v1.AppendRowsResponse) ExecutionException(java.util.concurrent.ExecutionException) JsonStreamWriter(com.google.cloud.bigquery.storage.v1.JsonStreamWriter)
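
The catch block in Example 6 says an append can be retried when the wrapped exception reports INTERNAL, CANCELLED, or ABORTED. The following is a minimal sketch of that check, not the library's own retry logic. To stay runnable without gRPC on the classpath it uses hypothetical stand-ins: the `Code` enum and `FakeStatusException` take the place of `io.grpc.Status.Code` and `io.grpc.StatusRuntimeException`, and `callWithRetry` is an illustrative helper name.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;

class AppendRetrySketch {

    // Hypothetical stand-in for io.grpc.Status.Code (only a few values listed).
    enum Code { OK, CANCELLED, INTERNAL, ABORTED, INVALID_ARGUMENT, PERMISSION_DENIED }

    // Hypothetical stand-in for io.grpc.StatusRuntimeException.
    static class FakeStatusException extends RuntimeException {
        final Code code;

        FakeStatusException(Code code) {
            super("status: " + code);
            this.code = code;
        }
    }

    // The three states the sample's comment names as retryable.
    private static final Set<Code> RETRYABLE =
            EnumSet.of(Code.INTERNAL, Code.CANCELLED, Code.ABORTED);

    // Unwraps the ExecutionException and checks whether the cause carries a retryable state.
    static boolean isRetryable(ExecutionException e) {
        Throwable cause = e.getCause();
        return cause instanceof FakeStatusException
                && RETRYABLE.contains(((FakeStatusException) cause).code);
    }

    // Runs the attempt up to maxAttempts times, rethrowing on a non-retryable failure.
    static <T> T callWithRetry(Callable<T> attempt, int maxAttempts) throws Exception {
        for (int n = 1; ; n++) {
            try {
                return attempt.call();
            } catch (ExecutionException e) {
                if (n >= maxAttempts || !isRetryable(e)) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated append: fails twice with ABORTED, then succeeds.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new ExecutionException(new FakeStatusException(Code.ABORTED));
            }
            return "appended";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The sketch retries immediately. A production caller would normally add a delay between attempts and would have to decide whether resending the same rows can create duplicates.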

Example 7 with AppendRowsResponse

use of com.google.cloud.bigquery.storage.v1.AppendRowsResponse in project beam by apache.

the class FakeDatasetService method getStreamAppendClient.

@Override
public StreamAppendClient getStreamAppendClient(String streamName, Descriptor descriptor) {
    return new StreamAppendClient() {

        private Descriptor protoDescriptor;

        {
            this.protoDescriptor = descriptor;
        }

        @Override
        public ApiFuture<AppendRowsResponse> appendRows(long offset, ProtoRows rows) throws Exception {
            synchronized (FakeDatasetService.class) {
                Stream stream = writeStreams.get(streamName);
                if (stream == null) {
                    throw new RuntimeException("No such stream: " + streamName);
                }
                List<TableRow> tableRows = Lists.newArrayListWithExpectedSize(rows.getSerializedRowsCount());
                for (ByteString bytes : rows.getSerializedRowsList()) {
                    DynamicMessage msg = DynamicMessage.parseFrom(protoDescriptor, bytes);
                    if (msg.getUnknownFields() != null && !msg.getUnknownFields().asMap().isEmpty()) {
                        throw new RuntimeException("Unknown fields set in append! " + msg.getUnknownFields());
                    }
                    tableRows.add(TableRowToStorageApiProto.tableRowFromMessage(DynamicMessage.parseFrom(protoDescriptor, bytes)));
                }
                stream.appendRows(offset, tableRows);
            }
            return ApiFutures.immediateFuture(AppendRowsResponse.newBuilder().build());
        }

        @Override
        public void close() throws Exception {
        }

        @Override
        public void pin() {
        }

        @Override
        public void unpin() throws Exception {
        }
    };
}
Also used : ProtoRows(com.google.cloud.bigquery.storage.v1.ProtoRows) StreamAppendClient(org.apache.beam.sdk.io.gcp.bigquery.BigQueryServices.StreamAppendClient) ByteString(com.google.protobuf.ByteString) TableRow(com.google.api.services.bigquery.model.TableRow) AppendRowsResponse(com.google.cloud.bigquery.storage.v1.AppendRowsResponse) Descriptor(com.google.protobuf.Descriptors.Descriptor) WriteStream(com.google.cloud.bigquery.storage.v1.WriteStream) DynamicMessage(com.google.protobuf.DynamicMessage)

Example 8 with AppendRowsResponse

use of com.google.cloud.bigquery.storage.v1.AppendRowsResponse in project dataproc-templates by GoogleCloudPlatform.

the class PubSubToBQ method writeToBQ.

public static void writeToBQ(JavaDStream<SparkPubsubMessage> pubSubStream, String outputProjectID, String pubSubBQOutputDataset, String PubSubBQOutputTable, Integer batchSize) {
    pubSubStream.foreachRDD(new VoidFunction<JavaRDD<SparkPubsubMessage>>() {

        @Override
        public void call(JavaRDD<SparkPubsubMessage> sparkPubsubMessageJavaRDD) throws Exception {
            sparkPubsubMessageJavaRDD.foreachPartition(new VoidFunction<Iterator<SparkPubsubMessage>>() {

                @Override
                public void call(Iterator<SparkPubsubMessage> sparkPubsubMessageIterator) throws Exception {
                    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
                    Table table = bigquery.getTable(pubSubBQOutputDataset, PubSubBQOutputTable);
                    TableName parentTable = TableName.of(outputProjectID, pubSubBQOutputDataset, PubSubBQOutputTable);
                    Schema schema = table.getDefinition().getSchema();
                    // try-with-resources closes the writer even if an append fails.
                    try (JsonStreamWriter writer = JsonStreamWriter.newBuilder(parentTable.toString(), schema).build()) {
                        JSONArray jsonArr = new JSONArray();
                        while (sparkPubsubMessageIterator.hasNext()) {
                            SparkPubsubMessage message = sparkPubsubMessageIterator.next();
                            JSONObject record = new JSONObject(new String(message.getData()));
                            jsonArr.put(record);
                            // Send a full batch and wait for the server to acknowledge it.
                            if (jsonArr.length() == batchSize) {
                                ApiFuture<AppendRowsResponse> future = writer.append(jsonArr);
                                AppendRowsResponse response = future.get();
                                jsonArr = new JSONArray();
                            }
                        }
                        // Send the final partial batch, if any.
                        if (jsonArr.length() > 0) {
                            ApiFuture<AppendRowsResponse> future = writer.append(jsonArr);
                            AppendRowsResponse response = future.get();
                        }
                    }
                }
            });
        }
    });
}
Also used : JSONArray(com.google.cloud.spark.bigquery.repackaged.org.json.JSONArray) AppendRowsResponse(com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.storage.v1beta2.AppendRowsResponse) JavaRDD(org.apache.spark.api.java.JavaRDD) TableName(com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.storage.v1beta2.TableName) SparkPubsubMessage(org.apache.spark.streaming.pubsub.SparkPubsubMessage) JSONObject(com.google.cloud.spark.bigquery.repackaged.org.json.JSONObject) VoidFunction(org.apache.spark.api.java.function.VoidFunction) Iterator(java.util.Iterator) JsonStreamWriter(com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.storage.v1beta2.JsonStreamWriter)
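
Example 8 accumulates messages until the batch reaches `batchSize`, appends it, and then appends whatever is left once the iterator is exhausted. The same pattern can be sketched without Spark or BigQuery. The class name `BatchFlushSketch`, the method `flushInBatches`, and the `sink` parameter are hypothetical names introduced for illustration; in the original, the sink would be the `writer.append(...)` call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class BatchFlushSketch {

    // Drains the iterator, handing each full batch to the sink and
    // flushing any remainder at the end. Returns the number of flushes.
    static <T> int flushInBatches(Iterator<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (items.hasNext()) {
            batch.add(items.next());
            if (batch.size() == batchSize) {
                sink.accept(batch);
                flushes++;
                // Start a fresh list so the sink can keep the one it was given.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        List<List<Integer>> seen = new ArrayList<>();
        int flushes = flushInBatches(Arrays.asList(1, 2, 3, 4, 5).iterator(), 2, seen::add);
        System.out.println(flushes + " flushes: " + seen); // prints "3 flushes: [[1, 2], [3, 4], [5]]"
    }
}
```

Allocating a new list after each flush mirrors the `jsonArr = new JSONArray()` step in the example, so a batch already handed to the sink is never mutated.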

Example 9 with AppendRowsResponse

use of com.google.cloud.bigquery.storage.v1.AppendRowsResponse in project spark-bigquery-connector by GoogleCloudDataproc.

the class BigQueryDirectDataWriterHelper method validateAppendRowsResponse.

/**
 * Validates an AppendRowsResponse after retrieving it from its future: makes sure the response
 * returned with no errors and that its offset matches the expectedOffset.
 *
 * @param appendRowsResponseApiFuture The future of the AppendRowsResponse
 * @param expectedOffset The expected offset to be returned by the response.
 * @throws IOException If the response returned with error, or the offset did not match the
 *     expected offset.
 */
private void validateAppendRowsResponse(ApiFuture<AppendRowsResponse> appendRowsResponseApiFuture, long expectedOffset) throws IOException {
    AppendRowsResponse appendRowsResponse = null;
    try {
        appendRowsResponse = appendRowsResponseApiFuture.get();
    } catch (InterruptedException | ExecutionException e) {
        throw new BigQueryConnectorException("Could not retrieve AppendRowsResponse", e);
    }
    if (appendRowsResponse.hasError()) {
        throw new IOException("Append request failed with error: " + appendRowsResponse.getError().getMessage());
    }
    AppendRowsResponse.AppendResult appendResult = appendRowsResponse.getAppendResult();
    long responseOffset = appendResult.getOffset().getValue();
    if (expectedOffset != responseOffset) {
        throw new IOException(String.format("On stream %s append-rows response, offset %d did not match expected offset %d", writeStreamName, responseOffset, expectedOffset));
    }
}
Also used : AppendRowsResponse(com.google.cloud.bigquery.storage.v1.AppendRowsResponse) IOException(java.io.IOException) ExecutionException(java.util.concurrent.ExecutionException)
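
Example 9 compares the offset in the response against an `expectedOffset` supplied by the caller, but the excerpt does not show how that value is produced. One plausible way, assuming rows are appended sequentially to a single stream starting at offset 0, is to keep a running row count. The sketch below illustrates that assumption only; `OffsetTrackerSketch`, `reserve`, and `describeMismatch` are hypothetical names and are not taken from spark-bigquery-connector.

```java
class OffsetTrackerSketch {

    // Offset of the next row to be appended (assumes the stream starts empty).
    private long nextOffset = 0;

    // Returns the offset to send with an append of rowCount rows, then advances past them.
    long reserve(int rowCount) {
        if (rowCount <= 0) {
            throw new IllegalArgumentException("rowCount must be positive");
        }
        long offset = nextOffset;
        nextOffset += rowCount;
        return offset;
    }

    // Same comparison as validateAppendRowsResponse, but returns a message
    // (or null when the offsets agree) instead of throwing.
    static String describeMismatch(long expectedOffset, long responseOffset) {
        return expectedOffset == responseOffset
                ? null
                : String.format("offset %d did not match expected offset %d",
                        responseOffset, expectedOffset);
    }

    public static void main(String[] args) {
        OffsetTrackerSketch tracker = new OffsetTrackerSketch();
        long first = tracker.reserve(10);  // rows 0..9
        long second = tracker.reserve(10); // rows 10..19
        System.out.println(first + ", " + second);
        System.out.println(describeMismatch(second, 0));
    }
}
```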

Example 10 with AppendRowsResponse

use of com.google.cloud.bigquery.storage.v1.AppendRowsResponse in project java-bigquerystorage by googleapis.

the class WritePendingStream method writePendingStream.

public static void writePendingStream(String projectId, String datasetName, String tableName) throws DescriptorValidationException, InterruptedException, IOException {
    try (BigQueryWriteClient client = BigQueryWriteClient.create()) {
        // Initialize a write stream for the specified table.
        // For more information on WriteStream.Type, see:
        // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/WriteStream.Type.html
        WriteStream stream = WriteStream.newBuilder().setType(WriteStream.Type.PENDING).build();
        TableName parentTable = TableName.of(projectId, datasetName, tableName);
        CreateWriteStreamRequest createWriteStreamRequest = CreateWriteStreamRequest.newBuilder().setParent(parentTable.toString()).setWriteStream(stream).build();
        WriteStream writeStream = client.createWriteStream(createWriteStreamRequest);
        // Use the JSON stream writer to send records in JSON format. For more information, see:
        // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/JsonStreamWriter.html
        try (JsonStreamWriter writer = JsonStreamWriter.newBuilder(writeStream.getName(), writeStream.getTableSchema()).build()) {
            // Write two batches to the stream, each with 10 JSON records.
            for (int i = 0; i < 2; i++) {
                // Create a JSON object that is compatible with the table schema.
                JSONArray jsonArr = new JSONArray();
                for (int j = 0; j < 10; j++) {
                    JSONObject record = new JSONObject();
                    record.put("col1", String.format("batch-record %03d-%03d", i, j));
                    jsonArr.put(record);
                }
                ApiFuture<AppendRowsResponse> future = writer.append(jsonArr);
                AppendRowsResponse response = future.get();
            }
            FinalizeWriteStreamResponse finalizeResponse = client.finalizeWriteStream(writeStream.getName());
            System.out.println("Rows written: " + finalizeResponse.getRowCount());
        }
        // Commit the streams.
        BatchCommitWriteStreamsRequest commitRequest = BatchCommitWriteStreamsRequest.newBuilder().setParent(parentTable.toString()).addWriteStreams(writeStream.getName()).build();
        BatchCommitWriteStreamsResponse commitResponse = client.batchCommitWriteStreams(commitRequest);
        // If the response does not have a commit time, it means the commit operation failed.
        if (!commitResponse.hasCommitTime()) {
            for (StorageError err : commitResponse.getStreamErrorsList()) {
                System.out.println(err.getErrorMessage());
            }
            throw new RuntimeException("Error committing the streams");
        }
        System.out.println("Appended and committed records successfully.");
    } catch (ExecutionException e) {
        // If the wrapped exception is a StatusRuntimeException, check the state of the operation.
        // If the state is INTERNAL, CANCELLED, or ABORTED, you can retry. For more information, see:
        // https://grpc.github.io/grpc-java/javadoc/io/grpc/StatusRuntimeException.html
        System.out.println(e);
    }
}
Also used : FinalizeWriteStreamResponse(com.google.cloud.bigquery.storage.v1.FinalizeWriteStreamResponse) BatchCommitWriteStreamsRequest(com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsRequest) JSONArray(org.json.JSONArray) AppendRowsResponse(com.google.cloud.bigquery.storage.v1.AppendRowsResponse) WriteStream(com.google.cloud.bigquery.storage.v1.WriteStream) BigQueryWriteClient(com.google.cloud.bigquery.storage.v1.BigQueryWriteClient) TableName(com.google.cloud.bigquery.storage.v1.TableName) CreateWriteStreamRequest(com.google.cloud.bigquery.storage.v1.CreateWriteStreamRequest) BatchCommitWriteStreamsResponse(com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsResponse) JSONObject(org.json.JSONObject) StorageError(com.google.cloud.bigquery.storage.v1.StorageError) ExecutionException(java.util.concurrent.ExecutionException) JsonStreamWriter(com.google.cloud.bigquery.storage.v1.JsonStreamWriter)

Aggregations

JSONArray (org.json.JSONArray) 9
JSONObject (org.json.JSONObject) 9
AppendRowsResponse (com.google.cloud.bigquery.storage.v1.AppendRowsResponse) 8
JsonStreamWriter (com.google.cloud.bigquery.storage.v1.JsonStreamWriter) 5
TableName (com.google.cloud.bigquery.storage.v1.TableName) 5
ExecutionException (java.util.concurrent.ExecutionException) 5
FieldValueList (com.google.cloud.bigquery.FieldValueList) 4
TableResult (com.google.cloud.bigquery.TableResult) 4
WriteStream (com.google.cloud.bigquery.storage.v1.WriteStream) 4
AppendRowsResponse (com.google.cloud.bigquery.storage.v1beta2.AppendRowsResponse) 4
JsonStreamWriter (com.google.cloud.bigquery.storage.v1beta2.JsonStreamWriter) 4
TableFieldSchema (com.google.cloud.bigquery.storage.v1beta2.TableFieldSchema) 4
TableName (com.google.cloud.bigquery.storage.v1beta2.TableName) 4
TableSchema (com.google.cloud.bigquery.storage.v1beta2.TableSchema) 4
Test (org.junit.Test) 4
BigQueryWriteClient (com.google.cloud.bigquery.storage.v1.BigQueryWriteClient) 3
CreateWriteStreamRequest (com.google.cloud.bigquery.storage.v1.CreateWriteStreamRequest) 3
BigQuery (com.google.cloud.bigquery.BigQuery) 2
Schema (com.google.cloud.bigquery.Schema) 2
Table (com.google.cloud.bigquery.Table) 2