Example 1 with NoHivePhysicalWriterImpl

Use of org.apache.flink.orc.nohive.writer.NoHivePhysicalWriterImpl in the Apache Flink project.

From the class OrcNoHiveBulkWriterFactory, method create:

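@Override
public BulkWriter<RowData> create(FSDataOutputStream out) throws IOException {
    // Build the ORC writer options from the factory's configuration and schema string.
    OrcFile.WriterOptions opts = OrcFile.writerOptions(new Properties(), conf);
    TypeDescription description = TypeDescription.fromString(schema);
    opts.setSchema(description);
    // Route the physical output through the given Flink stream.
    opts.physicalWriter(new NoHivePhysicalWriterImpl(out, opts));
    // Placeholder path: a physical writer has been set on the options above.
    WriterImpl writer = new WriterImpl(null, new Path("."), opts);
    // Reusable batch that buffers rows before they are handed to the ORC writer.
    VectorizedRowBatch rowBatch = description.createRowBatch();
    return new BulkWriter<RowData>() {

        @Override
        public void addElement(RowData row) throws IOException {
            // Claim the next free slot in the batch and copy each field into its column vector.
            int rowId = rowBatch.size++;
            for (int i = 0; i < row.getArity(); ++i) {
                setColumn(rowId, rowBatch.cols[i], fieldTypes[i], row, i);
            }
            // Hand the batch to the writer once it is full, then reuse it.
            if (rowBatch.size == rowBatch.getMaxSize()) {
                writer.addRowBatch(rowBatch);
                rowBatch.reset();
            }
        }

        @Override
        public void flush() throws IOException {
            // Write out a partially filled batch, if any rows are buffered.
            if (rowBatch.size != 0) {
                writer.addRowBatch(rowBatch);
                rowBatch.reset();
            }
        }

        @Override
        public void finish() throws IOException {
            // Flush the remaining rows before closing the ORC writer.
            flush();
            writer.close();
        }
    };
}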
Also used: Path (org.apache.hadoop.fs.Path), VectorizedRowBatch (org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch), RowData (org.apache.flink.table.data.RowData), NoHivePhysicalWriterImpl (org.apache.flink.orc.nohive.writer.NoHivePhysicalWriterImpl), OrcFile (org.apache.orc.OrcFile), BulkWriter (org.apache.flink.api.common.serialization.BulkWriter), TypeDescription (org.apache.orc.TypeDescription), Properties (java.util.Properties), WriterImpl (org.apache.orc.impl.WriterImpl)
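
The create method above follows a buffer-and-flush pattern: rows accumulate in a fixed-size batch, a full batch is written immediately, and finish writes whatever is left before closing. The following is a minimal, dependency-free sketch of that pattern only. It is not Flink or ORC code: the class BatchingWriterSketch, its sink callback, and the closed flag are hypothetical names introduced for illustration, and a plain list stands in for VectorizedRowBatch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the anonymous BulkWriter above; not part of Flink or ORC.
final class BatchingWriterSketch<T> {
    private final int maxSize;               // plays the role of rowBatch.getMaxSize()
    private final Consumer<List<T>> sink;    // plays the role of writer.addRowBatch(...)
    private final List<T> batch = new ArrayList<>();
    private boolean closed;                  // assumption: reject writes after finish()

    BatchingWriterSketch(int maxSize, Consumer<List<T>> sink) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
        this.sink = sink;
    }

    // Buffer one element; write the batch as soon as it is full.
    void addElement(T element) {
        ensureOpen();
        batch.add(element);
        if (batch.size() == maxSize) {
            writeBatch();
        }
    }

    // Write a partially filled batch, if anything is buffered.
    void flush() {
        ensureOpen();
        if (!batch.isEmpty()) {
            writeBatch();
        }
    }

    // Flush the remainder, then mark the writer as closed.
    void finish() {
        flush();
        closed = true;
    }

    private void writeBatch() {
        // Pass a copy so the sink never sees the buffer being cleared and reused.
        sink.accept(new ArrayList<>(batch));
        batch.clear();
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("writer already finished");
        }
    }

    public static void main(String[] args) {
        List<List<String>> written = new ArrayList<>();
        BatchingWriterSketch<String> writer =
                new BatchingWriterSketch<String>(2, written::add);
        writer.addElement("a");
        writer.addElement("b"); // batch is full, so it is written here
        writer.addElement("c"); // stays buffered until finish()
        writer.finish();
        System.out.println(written); // prints [[a, b], [c]]
    }
}
```

Copying the batch before handing it to the sink is a choice made for this sketch; the original reuses one VectorizedRowBatch and resets it after each addRowBatch call.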
