Example 6 with ProtoRows

Use of com.google.cloud.bigquery.storage.v1.ProtoRows in the Apache Beam project.

Class SplittingIterable, method iterator.

@Override
public Iterator<ProtoRows> iterator() {
    return new Iterator<ProtoRows>() {

        final Iterator<StorageApiWritePayload> underlyingIterator = underlying.iterator();

        @Override
        public boolean hasNext() {
            return underlyingIterator.hasNext();
        }

        @Override
        public ProtoRows next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            ProtoRows.Builder inserts = ProtoRows.newBuilder();
            long bytesSize = 0;
            while (underlyingIterator.hasNext()) {
                StorageApiWritePayload payload = underlyingIterator.next();
                if (payload.getSchemaHash() != currentDescriptor.hash) {
                    // Schema hash doesn't match the cached descriptor. Try to fetch an updated descriptor (from the base table).
                    currentDescriptor = updateSchema.apply(payload.getSchemaHash());
                    // Validate that the record can now be parsed.
                    try {
                        DynamicMessage msg = DynamicMessage.parseFrom(currentDescriptor.descriptor, payload.getPayload());
                        if (msg.getUnknownFields() != null && !msg.getUnknownFields().asMap().isEmpty()) {
                            throw new RuntimeException("Record schema does not match table. Unknown fields: " + msg.getUnknownFields());
                        }
                    } catch (InvalidProtocolBufferException e) {
                        throw new RuntimeException(e);
                    }
                }
                ByteString byteString = ByteString.copyFrom(payload.getPayload());
                inserts.addSerializedRows(byteString);
                bytesSize += byteString.size();
                // The size check runs after the row is added, so a batch can exceed splitSize by up to one row.
                if (bytesSize > splitSize) {
                    break;
                }
            }
            return inserts.build();
        }
    };
}
Also used: ProtoRows (com.google.cloud.bigquery.storage.v1.ProtoRows), ByteString (com.google.protobuf.ByteString), Iterator (java.util.Iterator), InvalidProtocolBufferException (com.google.protobuf.InvalidProtocolBufferException), DynamicMessage (com.google.protobuf.DynamicMessage), NoSuchElementException (java.util.NoSuchElementException)
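
The batching logic in SplittingIterable can be tried without the BigQuery or protobuf dependencies. The following is a minimal sketch, not Beam's code: it assumes plain byte[] payloads stand in for StorageApiWritePayload, a List<byte[]> stands in for ProtoRows, and it leaves out the schema-hash check entirely. The class name SizeBoundedBatches is hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical, dependency-free analogue of the splitting iterator:
 * groups byte payloads into batches and closes a batch once its
 * accumulated size exceeds splitSize.
 */
final class SizeBoundedBatches implements Iterable<List<byte[]>> {
    private final Iterable<byte[]> underlying;
    private final long splitSize;

    SizeBoundedBatches(Iterable<byte[]> underlying, long splitSize) {
        this.underlying = underlying;
        this.splitSize = splitSize;
    }

    @Override
    public Iterator<List<byte[]>> iterator() {
        final Iterator<byte[]> source = underlying.iterator();
        return new Iterator<List<byte[]>>() {
            @Override
            public boolean hasNext() {
                // A further batch exists exactly when payloads remain.
                return source.hasNext();
            }

            @Override
            public List<byte[]> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                List<byte[]> batch = new ArrayList<>();
                long bytesSize = 0;
                while (source.hasNext()) {
                    byte[] payload = source.next();
                    batch.add(payload);
                    bytesSize += payload.length;
                    // Checked after the add, so every batch holds at least
                    // one payload and may overshoot splitSize by one payload.
                    if (bytesSize > splitSize) {
                        break;
                    }
                }
                return batch;
            }
        };
    }
}
```

With five 4-byte payloads and a splitSize of 6, iterating yields batches of 2, 2, and 1 payloads, because each batch closes only after its running size passes the threshold.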

Example 7 with ProtoRows

Use of com.google.cloud.bigquery.storage.v1.ProtoRows in the Apache Beam project.

Class FakeDatasetService, method getStreamAppendClient.

@Override
public StreamAppendClient getStreamAppendClient(String streamName, Descriptor descriptor) {
    return new StreamAppendClient() {

        // Descriptor used to parse every serialized row appended through this client.
        private final Descriptor protoDescriptor = descriptor;

        @Override
        public ApiFuture<AppendRowsResponse> appendRows(long offset, ProtoRows rows) throws Exception {
            synchronized (FakeDatasetService.class) {
                Stream stream = writeStreams.get(streamName);
                if (stream == null) {
                    throw new RuntimeException("No such stream: " + streamName);
                }
                List<TableRow> tableRows = Lists.newArrayListWithExpectedSize(rows.getSerializedRowsCount());
                for (ByteString bytes : rows.getSerializedRowsList()) {
                    DynamicMessage msg = DynamicMessage.parseFrom(protoDescriptor, bytes);
                    if (msg.getUnknownFields() != null && !msg.getUnknownFields().asMap().isEmpty()) {
                        throw new RuntimeException("Unknown fields set in append! " + msg.getUnknownFields());
                    }
                    // Reuse the message parsed above instead of parsing the same bytes a second time.
                    tableRows.add(TableRowToStorageApiProto.tableRowFromMessage(msg));
                }
                stream.appendRows(offset, tableRows);
            }
            return ApiFutures.immediateFuture(AppendRowsResponse.newBuilder().build());
        }

        @Override
        public void close() throws Exception {
            // No-op: the fake holds no resources.
        }

        @Override
        public void pin() {
            // No-op in the fake.
        }

        @Override
        public void unpin() throws Exception {
            // No-op in the fake.
        }
    };
}
Also used: ProtoRows (com.google.cloud.bigquery.storage.v1.ProtoRows), StreamAppendClient (org.apache.beam.sdk.io.gcp.bigquery.BigQueryServices.StreamAppendClient), ByteString (com.google.protobuf.ByteString), TableRow (com.google.api.services.bigquery.model.TableRow), AppendRowsResponse (com.google.cloud.bigquery.storage.v1.AppendRowsResponse), Descriptor (com.google.protobuf.Descriptors.Descriptor), WriteStream (com.google.cloud.bigquery.storage.v1.WriteStream), DynamicMessage (com.google.protobuf.DynamicMessage)
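
The pattern used by the fake above, an in-memory stand-in that applies the append under a lock and returns an already-completed future, can be sketched with the JDK alone. This is an illustration under assumptions, not Beam's FakeDatasetService: InMemoryAppendService and its methods are hypothetical names, CompletableFuture stands in for ApiFuture, and rows are plain strings. The rule that an append must land exactly at the current end of the stream is an assumption about what the fake's Stream.appendRows might check.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical in-memory test double for an append-only stream service. */
final class InMemoryAppendService {
    // Stream name -> rows appended so far. Guarded by the instance lock.
    private final Map<String, List<String>> streams = new HashMap<>();

    synchronized void createStream(String streamName) {
        streams.put(streamName, new ArrayList<>());
    }

    /**
     * Appends rows at the given offset and returns a future that is already
     * complete, carrying the offset at which the next append should land.
     */
    synchronized CompletableFuture<Long> appendRows(
            String streamName, long offset, List<String> rows) {
        List<String> stream = streams.get(streamName);
        if (stream == null) {
            throw new IllegalArgumentException("No such stream: " + streamName);
        }
        // Assumption: appends are accepted only at the current end of stream.
        if (offset != stream.size()) {
            throw new IllegalStateException(
                "Expected offset " + stream.size() + " but got " + offset);
        }
        stream.addAll(rows);
        // No I/O is involved, so the result can be returned pre-completed.
        return CompletableFuture.completedFuture(Long.valueOf(stream.size()));
    }

    /** Returns a snapshot of the rows stored for a stream, for assertions in tests. */
    synchronized List<String> rows(String streamName) {
        return new ArrayList<>(streams.get(streamName));
    }
}
```

Returning a completed future keeps callers on the same asynchronous code path they would use against a real service while making the test deterministic. As in the fake above, failures are thrown synchronously rather than delivered through the future.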

Aggregations

ProtoRows (com.google.cloud.bigquery.storage.v1.ProtoRows) 6
DynamicMessage (com.google.protobuf.DynamicMessage) 3
ProtobufUtils.toProtoRows (com.google.cloud.spark.bigquery.ProtobufUtils.toProtoRows) 2
ByteString (com.google.protobuf.ByteString) 2
GenericInternalRow (org.apache.spark.sql.catalyst.expressions.GenericInternalRow) 2
Test (org.junit.Test) 2
TableRow (com.google.api.services.bigquery.model.TableRow) 1
AppendRowsResponse (com.google.cloud.bigquery.storage.v1.AppendRowsResponse) 1
FinalizeWriteStreamRequest (com.google.cloud.bigquery.storage.v1.FinalizeWriteStreamRequest) 1
FinalizeWriteStreamResponse (com.google.cloud.bigquery.storage.v1.FinalizeWriteStreamResponse) 1
JsonToProtoMessage (com.google.cloud.bigquery.storage.v1.JsonToProtoMessage) 1
WriteStream (com.google.cloud.bigquery.storage.v1.WriteStream) 1
Descriptors (com.google.protobuf.Descriptors) 1
Descriptor (com.google.protobuf.Descriptors.Descriptor) 1
InvalidProtocolBufferException (com.google.protobuf.InvalidProtocolBufferException) 1
Message (com.google.protobuf.Message) 1
IOException (java.io.IOException) 1
BigDecimal (java.math.BigDecimal) 1
MathContext (java.math.MathContext) 1
Iterator (java.util.Iterator) 1