
Example 1 with WriteChannel

Use of org.apache.arrow.vector.ipc.WriteChannel in the Apache Flink project.

Class ArrowUtils, method readNextBatch.

private static byte[] readNextBatch(ReadableByteChannel channel) throws IOException {
    // readMessage returns null once no further message can be read from the channel
    MessageMetadataResult metadata = MessageSerializer.readMessage(new ReadChannel(channel));
    if (metadata == null) {
        return null;
    }
    long bodyLength = metadata.getMessageBodyLength();
    // Only RecordBatch messages are returned; all other message types are skipped
    if (metadata.getMessage().headerType() == MessageHeader.RecordBatch) {
        // Output buffer large enough to hold the 8-byte length prefix plus the complete serialized message
        ByteArrayOutputStreamWithPos baos = new ByteArrayOutputStreamWithPos((int) (8 + metadata.getMessageLength() + bodyLength));
        // Write the message metadata into the byte-array output stream
        MessageSerializer.writeMessageBuffer(new WriteChannel(Channels.newChannel(baos)), metadata.getMessageLength(), metadata.getMessageBuffer());
        baos.close();
        // Wrap the backing array and read the message body into the space after the metadata
        ByteBuffer result = ByteBuffer.wrap(baos.getBuf());
        result.position(baos.getPosition());
        result.limit(result.capacity());
        readFully(channel, result);
        return result.array();
    } else {
        if (bodyLength > 0) {
            // Skip the message body if this is not a RecordBatch.
            // Note: InputStream.skip may skip fewer bytes than requested.
            Channels.newInputStream(channel).skip(bodyLength);
        }
        // Proceed to the next message
        return readNextBatch(channel);
    }
}
Also used: WriteChannel (org.apache.arrow.vector.ipc.WriteChannel), MessageMetadataResult (org.apache.arrow.vector.ipc.message.MessageMetadataResult), ByteArrayOutputStreamWithPos (org.apache.flink.core.memory.ByteArrayOutputStreamWithPos), ByteBuffer (java.nio.ByteBuffer), ReadChannel (org.apache.arrow.vector.ipc.ReadChannel)
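The excerpt above calls a readFully helper that is not shown, and it skips non-RecordBatch bodies with InputStream.skip. The sketch below uses only java.nio to illustrate what such helpers might look like. Both readFully and skipFully here are hypothetical stand-ins written for illustration, not Flink's actual implementation, and the class name ChannelReadSketch is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.util.Arrays;

public class ChannelReadSketch {

    /** Hypothetical stand-in for readFully: reads until dst has no space left. */
    static void readFully(ReadableByteChannel channel, ByteBuffer dst) throws IOException {
        while (dst.hasRemaining()) {
            if (channel.read(dst) < 0) {
                throw new EOFException("Channel ended with " + dst.remaining() + " bytes still expected");
            }
        }
    }

    /** Hypothetical helper: discards exactly n bytes, looping over partial reads. */
    static void skipFully(ReadableByteChannel channel, long n) throws IOException {
        ByteBuffer scratch = ByteBuffer.allocate(4096);
        while (n > 0) {
            scratch.clear();
            scratch.limit((int) Math.min(scratch.capacity(), n));
            int read = channel.read(scratch);
            if (read < 0) {
                throw new EOFException("Channel ended with " + n + " bytes still to skip");
            }
            n -= read;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] input = {10, 11, 12, 13, 14, 15, 16, 17};
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(input));

        // Drop the first three bytes (10, 11, 12)
        skipFully(channel, 3);

        // Mimic the excerpt: the start of the backing array is already filled,
        // and the rest is read straight from the channel.
        byte[] backing = new byte[6];
        backing[0] = 1;
        backing[1] = 2;
        ByteBuffer result = ByteBuffer.wrap(backing);
        result.position(2);
        result.limit(result.capacity());
        readFully(channel, result);

        System.out.println(Arrays.toString(result.array())); // prints [1, 2, 13, 14, 15, 16]
    }
}
```

Looping until the requested count is reached matters because neither ReadableByteChannel.read nor InputStream.skip guarantees to process the full amount in one call.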

Example 2 with WriteChannel

Use of org.apache.arrow.vector.ipc.WriteChannel in the Apache Beam project.

Class BigQueryIOStorageReadTest, method createResponseArrow.

private ReadRowsResponse createResponseArrow(org.apache.arrow.vector.types.pojo.Schema arrowSchema, List<String> name, List<Long> number, double progressAtResponseStart, double progressAtResponseEnd) {
    ArrowRecordBatch serializedRecord;
    try (VectorSchemaRoot schemaRoot = VectorSchemaRoot.create(arrowSchema, allocator)) {
        schemaRoot.allocateNew();
        schemaRoot.setRowCount(name.size());
        // Column 0 holds the names, column 1 holds the numbers
        VarCharVector strVector = (VarCharVector) schemaRoot.getFieldVectors().get(0);
        BigIntVector bigIntVector = (BigIntVector) schemaRoot.getFieldVectors().get(1);
        for (int i = 0; i < name.size(); i++) {
            bigIntVector.set(i, number.get(i));
            strVector.set(i, new Text(name.get(i)));
        }
        // Unload the vectors into an Arrow record batch and serialize it into an in-memory stream
        VectorUnloader unLoader = new VectorUnloader(schemaRoot);
        try (org.apache.arrow.vector.ipc.message.ArrowRecordBatch records = unLoader.getRecordBatch()) {
            try (ByteArrayOutputStream os = new ByteArrayOutputStream()) {
                MessageSerializer.serialize(new WriteChannel(Channels.newChannel(os)), records);
                // Wrap the serialized bytes in the BigQuery Storage ArrowRecordBatch message
                serializedRecord = ArrowRecordBatch.newBuilder()
                        .setRowCount(records.getLength())
                        .setSerializedRecordBatch(ByteString.copyFrom(os.toByteArray()))
                        .build();
            } catch (IOException e) {
                throw new RuntimeException("Error writing to byte array output stream", e);
            }
        }
    }
    // Build the response carrying the serialized batch, its row count, and the progress stats
    return ReadRowsResponse.newBuilder()
            .setArrowRecordBatch(serializedRecord)
            .setRowCount(name.size())
            .setStats(StreamStats.newBuilder()
                    .setProgress(Progress.newBuilder()
                            .setAtResponseStart(progressAtResponseStart)
                            .setAtResponseEnd(progressAtResponseEnd)))
            .build();
}
Also used: VectorSchemaRoot (org.apache.arrow.vector.VectorSchemaRoot), VarCharVector (org.apache.arrow.vector.VarCharVector), Text (org.apache.arrow.vector.util.Text), ByteArrayOutputStream (java.io.ByteArrayOutputStream), IOException (java.io.IOException), BigIntVector (org.apache.arrow.vector.BigIntVector), VectorUnloader (org.apache.arrow.vector.VectorUnloader), StatusRuntimeException (io.grpc.StatusRuntimeException), ArrowRecordBatch (com.google.cloud.bigquery.storage.v1.ArrowRecordBatch), WriteChannel (org.apache.arrow.vector.ipc.WriteChannel)
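Both examples hand WriteChannel a channel obtained from Channels.newChannel(...) over an in-memory output stream and then take the bytes from that stream. The stdlib-only sketch below isolates that pattern. The class InMemoryChannelSketch, its frame and writeFully methods, and the 4-byte length prefix are illustrative assumptions and do not reproduce Arrow's IPC framing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

public class InMemoryChannelSketch {

    /** Writes every remaining byte of src, looping over partial writes. */
    static void writeFully(WritableByteChannel channel, ByteBuffer src) throws IOException {
        while (src.hasRemaining()) {
            channel.write(src);
        }
    }

    /** Returns payload preceded by a 4-byte big-endian length (illustrative framing only). */
    static byte[] frame(byte[] payload) throws IOException {
        try (ByteArrayOutputStream os = new ByteArrayOutputStream()) {
            // The channel is a thin adapter: bytes written to it land in os
            WritableByteChannel channel = Channels.newChannel(os);

            ByteBuffer header = ByteBuffer.allocate(Integer.BYTES);
            header.putInt(payload.length);
            header.flip();

            writeFully(channel, header);
            writeFully(channel, ByteBuffer.wrap(payload));

            // toByteArray returns a copy of everything written so far
            return os.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] framed = frame("arrow".getBytes(StandardCharsets.US_ASCII));
        int declaredLength = ByteBuffer.wrap(framed).getInt();
        System.out.println(framed.length + " bytes, payload length " + declaredLength);
        // prints "9 bytes, payload length 5"
    }
}
```

Because ByteArrayOutputStream.toByteArray copies the buffer, the returned array can be passed on (for example to ByteString.copyFrom) without being affected by later writes to the stream.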
