
Example 1 with ReadChannel

Use of org.apache.arrow.vector.ipc.ReadChannel in the Apache Flink project.

Class ArrowUtils, method readNextBatch.

private static byte[] readNextBatch(ReadableByteChannel channel) throws IOException {
    MessageMetadataResult metadata = MessageSerializer.readMessage(new ReadChannel(channel));
    if (metadata == null) {
        return null;
    }
    long bodyLength = metadata.getMessageBodyLength();
    // Only care about RecordBatch messages and skip the other kind of messages
    if (metadata.getMessage().headerType() == MessageHeader.RecordBatch) {
        // Buffer backed output large enough to hold 8-byte length + complete serialized message
        ByteArrayOutputStreamWithPos baos = new ByteArrayOutputStreamWithPos((int) (8 + metadata.getMessageLength() + bodyLength));
        // Write message metadata to ByteBuffer output stream
        MessageSerializer.writeMessageBuffer(new WriteChannel(Channels.newChannel(baos)), metadata.getMessageLength(), metadata.getMessageBuffer());
        baos.close();
        ByteBuffer result = ByteBuffer.wrap(baos.getBuf());
        result.position(baos.getPosition());
        result.limit(result.capacity());
        readFully(channel, result);
        return result.array();
    } else {
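        if (bodyLength > 0) {
            // Skip message body if not a RecordBatch.
            // Note: InputStream.skip may skip fewer bytes than requested; the return value is not checked here.
            Channels.newInputStream(channel).skip(bodyLength);
        }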
        // Proceed to next message
        return readNextBatch(channel);
    }
}
Also used: WriteChannel (org.apache.arrow.vector.ipc.WriteChannel), MessageMetadataResult (org.apache.arrow.vector.ipc.message.MessageMetadataResult), ByteArrayOutputStreamWithPos (org.apache.flink.core.memory.ByteArrayOutputStreamWithPos), ByteBuffer (java.nio.ByteBuffer), ReadChannel (org.apache.arrow.vector.ipc.ReadChannel)
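
Example 1 calls a readFully helper that the excerpt does not show, and it skips non-RecordBatch bodies with InputStream.skip, which is permitted to skip fewer bytes than requested. The following is a minimal sketch of both steps using only java.nio, assuming a blocking channel. The names ChannelIoSketch, skipFully and demo are hypothetical and are not taken from Flink; the readFully shown here is one plausible shape for such a helper, not the project's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of Flink or Arrow.
final class ChannelIoSketch {

    // Reads until dst has no remaining space; throws if the channel ends first.
    // Assumes a blocking channel (a non-blocking one could return 0 repeatedly).
    static void readFully(ReadableByteChannel channel, ByteBuffer dst) throws IOException {
        while (dst.hasRemaining()) {
            if (channel.read(dst) < 0) {
                throw new EOFException("channel ended with " + dst.remaining() + " bytes still expected");
            }
        }
    }

    // Discards exactly n bytes by reading them into a scratch buffer,
    // so a short read cannot silently leave part of the body unconsumed.
    static void skipFully(ReadableByteChannel channel, long n) throws IOException {
        ByteBuffer scratch = ByteBuffer.allocate((int) Math.min(8192, Math.max(1, n)));
        long remaining = n;
        while (remaining > 0) {
            scratch.clear();
            scratch.limit((int) Math.min(scratch.capacity(), remaining));
            int read = channel.read(scratch);
            if (read < 0) {
                throw new EOFException("channel ended with " + remaining + " bytes left to skip");
            }
            remaining -= read;
        }
    }

    // Mirrors the buffer handling in Example 1 on toy data: skip a 3-byte body,
    // write a 2-byte prefix into a pre-sized buffer, then fill the rest from the channel.
    static byte[] demo() {
        byte[] input = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(input));
        try {
            skipFully(channel, 3);
            ByteBuffer result = ByteBuffer.allocate(2 + 4);
            result.put((byte) 127).put((byte) 126); // stand-in for the serialized metadata
            readFully(channel, result);             // fills the remaining 4 bytes
            return result.array();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [127, 126, 4, 5, 6, 7]
    }
}
```

Reading skipped bytes into a scratch buffer trades a small copy for a guarantee that the channel is positioned exactly at the next message, which matters when the next call parses a message header from that position.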

Example 2 with ReadChannel

Use of org.apache.arrow.vector.ipc.ReadChannel in the Apache Beam project.

Class ArrowConversion, method rowsFromSerializedRecordBatch.

@SuppressWarnings("nullness")
public static RecordBatchRowIterator rowsFromSerializedRecordBatch(org.apache.arrow.vector.types.pojo.Schema arrowSchema, InputStream inputStream, RootAllocator allocator) throws IOException {
    VectorSchemaRoot vectorRoot = VectorSchemaRoot.create(arrowSchema, allocator);
    VectorLoader vectorLoader = new VectorLoader(vectorRoot);
    vectorRoot.clear();
    try (ReadChannel read = new ReadChannel(Channels.newChannel(inputStream))) {
        try (ArrowRecordBatch arrowMessage = MessageSerializer.deserializeRecordBatch(read, allocator)) {
            vectorLoader.load(arrowMessage);
        }
    }
    return rowsFromRecordBatch(ArrowSchemaTranslator.toBeamSchema(arrowSchema), vectorRoot);
}
Also used: VectorSchemaRoot (org.apache.arrow.vector.VectorSchemaRoot), VectorLoader (org.apache.arrow.vector.VectorLoader), ArrowRecordBatch (org.apache.arrow.vector.ipc.message.ArrowRecordBatch), ReadChannel (org.apache.arrow.vector.ipc.ReadChannel)

Aggregations

ReadChannel (org.apache.arrow.vector.ipc.ReadChannel): 2
ByteBuffer (java.nio.ByteBuffer): 1
VectorLoader (org.apache.arrow.vector.VectorLoader): 1
VectorSchemaRoot (org.apache.arrow.vector.VectorSchemaRoot): 1
WriteChannel (org.apache.arrow.vector.ipc.WriteChannel): 1
ArrowRecordBatch (org.apache.arrow.vector.ipc.message.ArrowRecordBatch): 1
MessageMetadataResult (org.apache.arrow.vector.ipc.message.MessageMetadataResult): 1
ByteArrayOutputStreamWithPos (org.apache.flink.core.memory.ByteArrayOutputStreamWithPos): 1