Example 16 with Encoding

Use of org.apache.parquet.column.Encoding in the project parquet-mr by Apache.

From the class ColumnWriterV2, method writePage.

/**
 * writes the current data to a new page in the page store
 * @param rowCount how many rows have been written so far
 */
public void writePage(long rowCount) {
    int pageRowCount = Ints.checkedCast(rowCount - rowsWrittenSoFar);
    this.rowsWrittenSoFar = rowCount;
    if (DEBUG)
        LOG.debug("write page");
    try {
        // TODO: rework this API. Those must be called *in that order*
        BytesInput bytes = dataColumn.getBytes();
        Encoding encoding = dataColumn.getEncoding();
        pageWriter.writePageV2(
            pageRowCount,
            Ints.checkedCast(statistics.getNumNulls()),
            valueCount,
            path.getMaxRepetitionLevel() == 0 ? BytesInput.empty() : repetitionLevelColumn.toBytes(),
            path.getMaxDefinitionLevel() == 0 ? BytesInput.empty() : definitionLevelColumn.toBytes(),
            encoding,
            bytes,
            statistics);
    } catch (IOException e) {
        throw new ParquetEncodingException("could not write page for " + path, e);
    }
    repetitionLevelColumn.reset();
    definitionLevelColumn.reset();
    dataColumn.reset();
    valueCount = 0;
    resetStatistics();
}
Also used: BytesInput (org.apache.parquet.bytes.BytesInput), ParquetEncodingException (org.apache.parquet.io.ParquetEncodingException), Encoding (org.apache.parquet.column.Encoding), IOException (java.io.IOException)
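
The row bookkeeping at the top of writePage can be tried in isolation. The following is a minimal sketch, not parquet-mr code: PageRowTracker and nextPageRowCount are hypothetical names introduced here for illustration, and Math.toIntExact from the JDK stands in for Guava's Ints.checkedCast (on overflow it throws ArithmeticException, whereas Ints.checkedCast throws IllegalArgumentException). It assumes, as the Javadoc above states, that rowCount is the cumulative number of rows written so far.

```java
// Hypothetical helper, not part of parquet-mr: it mirrors only the first two
// statements of writePage, which turn a cumulative row count into a per-page one.
final class PageRowTracker {
    // Cumulative row count recorded when the previous page was flushed.
    private long rowsWrittenSoFar = 0;

    /**
     * @param rowCount cumulative number of rows written so far
     * @return number of rows that belong to the page being flushed now
     */
    int nextPageRowCount(long rowCount) {
        // Stand-in for Ints.checkedCast: fails loudly instead of silently truncating.
        int pageRowCount = Math.toIntExact(rowCount - rowsWrittenSoFar);
        rowsWrittenSoFar = rowCount;
        return pageRowCount;
    }

    public static void main(String[] args) {
        PageRowTracker tracker = new PageRowTracker();
        System.out.println(tracker.nextPageRowCount(100)); // first page: 100 - 0
        System.out.println(tracker.nextPageRowCount(250)); // second page: 250 - 100
    }
}
```

Because the caller passes a running total rather than a delta, calling the method twice with the same total yields an empty page count of 0 on the second call.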

Aggregations

Encoding (org.apache.parquet.column.Encoding): 16
Path (org.apache.hadoop.fs.Path): 6
Test (org.junit.Test): 6
Configuration (org.apache.hadoop.conf.Configuration): 5
FileSystem (org.apache.hadoop.fs.FileSystem): 4
PageReadStore (org.apache.parquet.column.page.PageReadStore): 4
BlockMetaData (org.apache.parquet.hadoop.metadata.BlockMetaData): 4
ColumnChunkMetaData (org.apache.parquet.hadoop.metadata.ColumnChunkMetaData): 4
File (java.io.File): 3
HashMap (java.util.HashMap): 3
EncodingStats (org.apache.parquet.column.EncodingStats): 3
ParquetMetadata (org.apache.parquet.hadoop.metadata.ParquetMetadata): 3
MessageType (org.apache.parquet.schema.MessageType): 3
Stopwatch (com.google.common.base.Stopwatch): 2
ByteBuffer (java.nio.ByteBuffer): 2
HashSet (java.util.HashSet): 2
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 2
BytesInput (org.apache.parquet.bytes.BytesInput): 2
ColumnDescriptor (org.apache.parquet.column.ColumnDescriptor): 2
WriterVersion (org.apache.parquet.column.ParquetProperties.WriterVersion): 2