Example 1 with VectorLoader

Use of org.apache.arrow.vector.VectorLoader in the Apache Beam project.

Class ArrowConversion, method rowsFromSerializedRecordBatch.

@SuppressWarnings("nullness")
public static RecordBatchRowIterator rowsFromSerializedRecordBatch(org.apache.arrow.vector.types.pojo.Schema arrowSchema, InputStream inputStream, RootAllocator allocator) throws IOException {
    VectorSchemaRoot vectorRoot = VectorSchemaRoot.create(arrowSchema, allocator);
    VectorLoader vectorLoader = new VectorLoader(vectorRoot);
    vectorRoot.clear();
    // Both the channel and the deserialized batch are closed once the vectors are loaded.
    try (ReadChannel read = new ReadChannel(Channels.newChannel(inputStream))) {
        try (ArrowRecordBatch arrowMessage = MessageSerializer.deserializeRecordBatch(read, allocator)) {
            vectorLoader.load(arrowMessage);
        }
    }
    return rowsFromRecordBatch(ArrowSchemaTranslator.toBeamSchema(arrowSchema), vectorRoot);
}
Also used: VectorSchemaRoot (org.apache.arrow.vector.VectorSchemaRoot), VectorLoader (org.apache.arrow.vector.VectorLoader), ArrowRecordBatch (org.apache.arrow.vector.ipc.message.ArrowRecordBatch), ReadChannel (org.apache.arrow.vector.ipc.ReadChannel)

Example 2 with VectorLoader

Use of org.apache.arrow.vector.VectorLoader in the Apache Flink project.

Class ArrowSourceFunction, method run.

@Override
public void run(SourceContext<RowData> ctx) throws Exception {
    VectorLoader vectorLoader = new VectorLoader(root);
    while (running && !indexesToEmit.isEmpty()) {
        Tuple2<Integer, Integer> indexToEmit = indexesToEmit.peek();
        ArrowRecordBatch arrowRecordBatch = loadBatch(indexToEmit.f0);
        vectorLoader.load(arrowRecordBatch);
        // load() does not close the batch, so release its buffers explicitly.
        arrowRecordBatch.close();
        ArrowReader arrowReader = createArrowReader(root);
        int rowCount = root.getRowCount();
        int nextRowId = indexToEmit.f1;
        while (nextRowId < rowCount) {
            RowData element = arrowReader.read(nextRowId);
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(element);
                // Advance the stored row offset (field 1) under the checkpoint lock.
                indexToEmit.setField(++nextRowId, 1);
            }
        }
        synchronized (ctx.getCheckpointLock()) {
            indexesToEmit.pop();
        }
    }
}
Also used: VectorLoader (org.apache.arrow.vector.VectorLoader), RowData (org.apache.flink.table.data.RowData), ArrowRecordBatch (org.apache.arrow.vector.ipc.message.ArrowRecordBatch), ArrowReader (org.apache.flink.table.runtime.arrow.ArrowReader)
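
The loop in Example 2 tracks its progress as a queue of (batch index, next row) pairs and updates each pair while holding the checkpoint lock. The following is a minimal, Arrow-free sketch of that bookkeeping only. It assumes batches are plain lists of strings; `ResumableEmitter`, `emit`, `pendingBatches`, and the `int[]` pair standing in for `Tuple2<Integer, Integer>` are hypothetical names for illustration, not Flink or Arrow APIs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ResumableEmitter {
    // Each entry is {batchIndex, nextRowId}; a hypothetical stand-in for Tuple2<Integer, Integer>.
    private final Deque<int[]> indexesToEmit = new ArrayDeque<>();
    private final List<List<String>> batches;
    // Stand-in for ctx.getCheckpointLock() in the original source function.
    private final Object checkpointLock = new Object();

    ResumableEmitter(List<List<String>> batches) {
        this.batches = batches;
        for (int i = 0; i < batches.size(); i++) {
            indexesToEmit.addLast(new int[] {i, 0});
        }
    }

    /** Emits at most maxRows rows; the position reached is kept for the next call. */
    List<String> emit(int maxRows) {
        List<String> out = new ArrayList<>();
        while (out.size() < maxRows && !indexesToEmit.isEmpty()) {
            int[] indexToEmit = indexesToEmit.peek();
            List<String> batch = batches.get(indexToEmit[0]);
            while (indexToEmit[1] < batch.size() && out.size() < maxRows) {
                // Emit the row and advance the offset atomically with respect to the lock.
                synchronized (checkpointLock) {
                    out.add(batch.get(indexToEmit[1]));
                    indexToEmit[1]++;
                }
            }
            // Drop the batch from the queue only once every row has been emitted.
            if (indexToEmit[1] >= batch.size()) {
                synchronized (checkpointLock) {
                    indexesToEmit.pop();
                }
            }
        }
        return out;
    }

    /** Number of batches that still have rows to emit. */
    int pendingBatches() {
        return indexesToEmit.size();
    }
}
```

Because the offset is stored in the queue entry rather than in a local variable, a second call to `emit` continues from the row after the last one emitted instead of restarting the batch.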

Aggregations

VectorLoader (org.apache.arrow.vector.VectorLoader): 2
ArrowRecordBatch (org.apache.arrow.vector.ipc.message.ArrowRecordBatch): 2
VectorSchemaRoot (org.apache.arrow.vector.VectorSchemaRoot): 1
ReadChannel (org.apache.arrow.vector.ipc.ReadChannel): 1
RowData (org.apache.flink.table.data.RowData): 1
ArrowReader (org.apache.flink.table.runtime.arrow.ArrowReader): 1