Example 11 with ByteBufferInputStream

Use of org.apache.parquet.bytes.ByteBufferInputStream in project parquet-mr by apache.

Class RunLengthBitPackingHybridIntegrationTest, method doIntegrationTest.

private void doIntegrationTest(int bitWidth) throws Exception {
    // Every value is reduced modulo 2^bitWidth so that it fits in bitWidth bits.
    long modValue = 1L << bitWidth;
    RunLengthBitPackingHybridEncoder encoder = new RunLengthBitPackingHybridEncoder(bitWidth, 1000, 64000, new DirectByteBufferAllocator());
    // Running total of values written; it is not asserted on below.
    int numValues = 0;
    // 100 values that change on every write.
    for (int i = 0; i < 100; i++) {
        encoder.writeInt((int) (i % modValue));
    }
    numValues += 100;
    // 100 repetitions of a single value.
    for (int i = 0; i < 100; i++) {
        encoder.writeInt((int) (77 % modValue));
    }
    numValues += 100;
    // 100 repetitions of a second value.
    for (int i = 0; i < 100; i++) {
        encoder.writeInt((int) (88 % modValue));
    }
    numValues += 100;
    // 1000 short runs: each value is written three times in a row.
    for (int i = 0; i < 1000; i++) {
        encoder.writeInt((int) (i % modValue));
        encoder.writeInt((int) (i % modValue));
        encoder.writeInt((int) (i % modValue));
    }
    numValues += 3000;
    // One long run of 1000 identical values.
    for (int i = 0; i < 1000; i++) {
        encoder.writeInt((int) (17 % modValue));
    }
    numValues += 1000;
    // Wrap the encoded bytes in a ByteBufferInputStream and decode them.
    ByteBuffer encodedBytes = encoder.toBytes().toByteBuffer();
    ByteBufferInputStream in = ByteBufferInputStream.wrap(encodedBytes);
    RunLengthBitPackingHybridDecoder decoder = new RunLengthBitPackingHybridDecoder(bitWidth, in);
    // Read the values back in the same order in which they were written.
    for (int i = 0; i < 100; i++) {
        assertEquals(i % modValue, decoder.readInt());
    }
    for (int i = 0; i < 100; i++) {
        assertEquals(77 % modValue, decoder.readInt());
    }
    for (int i = 0; i < 100; i++) {
        assertEquals(88 % modValue, decoder.readInt());
    }
    for (int i = 0; i < 1000; i++) {
        assertEquals(i % modValue, decoder.readInt());
        assertEquals(i % modValue, decoder.readInt());
        assertEquals(i % modValue, decoder.readInt());
    }
    for (int i = 0; i < 1000; i++) {
        assertEquals(17 % modValue, decoder.readInt());
    }
}
Also used: DirectByteBufferAllocator (org.apache.parquet.bytes.DirectByteBufferAllocator), ByteBufferInputStream (org.apache.parquet.bytes.ByteBufferInputStream), ByteBuffer (java.nio.ByteBuffer)
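
The test in Example 11 writes `value % modValue` rather than the raw value, which keeps every written value within `bitWidth` bits. The following is a minimal JDK-only sketch of that arithmetic. The class and method names (`ModValueSketch`, `fit`) are hypothetical and are not part of parquet-mr.

```java
public class ModValueSketch {

    // Reduce a value modulo 2^bitWidth, mirroring the (int) (value % modValue)
    // expression used in the test. Hypothetical helper, for illustration only.
    static int fit(long value, int bitWidth) {
        long modValue = 1L << bitWidth;
        return (int) (value % modValue);
    }

    public static void main(String[] args) {
        int bitWidth = 3; // modValue is 8
        System.out.println(fit(77, bitWidth)); // prints 5
        System.out.println(fit(88, bitWidth)); // prints 0
        System.out.println(fit(17, bitWidth)); // prints 1
    }
}
```

With a small `bitWidth`, the constants 77, 88 and 17 may collapse to other values, but each stays below `1 << bitWidth`, so the encoder is never handed a value wider than it was configured for.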

Example 12 with ByteBufferInputStream

Use of org.apache.parquet.bytes.ByteBufferInputStream in project parquet-mr by apache.

Class BenchmarkDeltaByteArray, method benchmarkSortedStringsWithDeltaLengthByteArrayValuesWriter.

@BenchmarkOptions(benchmarkRounds = 20, warmupRounds = 4)
@Test
public void benchmarkSortedStringsWithDeltaLengthByteArrayValuesWriter() throws IOException {
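    // sortedVals and values are fields of the benchmark class (not shown here).
    DeltaByteArrayWriter writer = new DeltaByteArrayWriter(64 * 1024, 64 * 1024, new DirectByteBufferAllocator());
    DeltaByteArrayReader reader = new DeltaByteArrayReader();
    // Encode the sorted strings, then read them back through a ByteBufferInputStream.
    Utils.writeData(writer, sortedVals);
    ByteBufferInputStream data = writer.getBytes().toInputStream();
    Binary[] bin = Utils.readData(reader, data, values.length);
    // Report the stream position after reading.
    System.out.println("size " + data.position());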
}
Also used: DirectByteBufferAllocator (org.apache.parquet.bytes.DirectByteBufferAllocator), DeltaByteArrayWriter (org.apache.parquet.column.values.deltastrings.DeltaByteArrayWriter), ByteBufferInputStream (org.apache.parquet.bytes.ByteBufferInputStream), DeltaByteArrayReader (org.apache.parquet.column.values.deltastrings.DeltaByteArrayReader), Binary (org.apache.parquet.io.api.Binary), Test (org.junit.Test), BenchmarkOptions (com.carrotsearch.junitbenchmarks.BenchmarkOptions)

Example 13 with ByteBufferInputStream

Use of org.apache.parquet.bytes.ByteBufferInputStream in project parquet-mr by apache.

Class BenchmarkDeltaByteArray, method benchmarkRandomStringsWithPlainValuesWriter.

@BenchmarkOptions(benchmarkRounds = 20, warmupRounds = 4)
@Test
public void benchmarkRandomStringsWithPlainValuesWriter() throws IOException {
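    // values is a field of the benchmark class (not shown here).
    PlainValuesWriter writer = new PlainValuesWriter(64 * 1024, 64 * 1024, new DirectByteBufferAllocator());
    BinaryPlainValuesReader reader = new BinaryPlainValuesReader();
    // Encode the random strings, then read them back through a ByteBufferInputStream.
    Utils.writeData(writer, values);
    ByteBufferInputStream data = writer.getBytes().toInputStream();
    Binary[] bin = Utils.readData(reader, data, values.length);
    // Report the stream position after reading.
    System.out.println("size " + data.position());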
}
Also used: PlainValuesWriter (org.apache.parquet.column.values.plain.PlainValuesWriter), BinaryPlainValuesReader (org.apache.parquet.column.values.plain.BinaryPlainValuesReader), DirectByteBufferAllocator (org.apache.parquet.bytes.DirectByteBufferAllocator), ByteBufferInputStream (org.apache.parquet.bytes.ByteBufferInputStream), Binary (org.apache.parquet.io.api.Binary), Test (org.junit.Test), BenchmarkOptions (com.carrotsearch.junitbenchmarks.BenchmarkOptions)

Example 14 with ByteBufferInputStream

Use of org.apache.parquet.bytes.ByteBufferInputStream in project parquet-mr by apache.

Class BenchmarkDeltaByteArray, method benchmarkSortedStringsWithPlainValuesWriter.

@BenchmarkOptions(benchmarkRounds = 20, warmupRounds = 4)
@Test
public void benchmarkSortedStringsWithPlainValuesWriter() throws IOException {
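    // sortedVals and values are fields of the benchmark class (not shown here).
    PlainValuesWriter writer = new PlainValuesWriter(64 * 1024, 64 * 1024, new DirectByteBufferAllocator());
    BinaryPlainValuesReader reader = new BinaryPlainValuesReader();
    // Encode the sorted strings, then read them back through a ByteBufferInputStream.
    Utils.writeData(writer, sortedVals);
    ByteBufferInputStream data = writer.getBytes().toInputStream();
    Binary[] bin = Utils.readData(reader, data, values.length);
    // Report the stream position after reading.
    System.out.println("size " + data.position());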
}
Also used: PlainValuesWriter (org.apache.parquet.column.values.plain.PlainValuesWriter), BinaryPlainValuesReader (org.apache.parquet.column.values.plain.BinaryPlainValuesReader), DirectByteBufferAllocator (org.apache.parquet.bytes.DirectByteBufferAllocator), ByteBufferInputStream (org.apache.parquet.bytes.ByteBufferInputStream), Binary (org.apache.parquet.io.api.Binary), Test (org.junit.Test), BenchmarkOptions (com.carrotsearch.junitbenchmarks.BenchmarkOptions)

Example 15 with ByteBufferInputStream

Use of org.apache.parquet.bytes.ByteBufferInputStream in project parquet-mr by apache.

Class TestDictionary, method testZeroValues.

@Test
public void testZeroValues() throws IOException {
    FallbackValuesWriter<PlainIntegerDictionaryValuesWriter, PlainValuesWriter> cw = newPlainIntegerDictionaryValuesWriter(100, 100);
    cw.writeInteger(34);
    cw.writeInteger(34);
    getBytesAndCheckEncoding(cw, PLAIN_DICTIONARY);
    DictionaryValuesReader reader = initDicReader(cw, INT32);
    // Pretend the page holds 100 nulls. What matters is that the stream is
    // already at its end (offset == bytes.length); the byte values are irrelevant.
    ByteBuffer bytes = ByteBuffer.wrap(new byte[] { 0x00, 0x01, 0x02, 0x03 });
    ByteBufferInputStream stream = ByteBufferInputStream.wrap(bytes);
    stream.skipFully(stream.available());
    reader.initFromPage(100, stream);
}
Also used: PlainValuesWriter (org.apache.parquet.column.values.plain.PlainValuesWriter), PlainIntegerDictionaryValuesWriter (org.apache.parquet.column.values.dictionary.DictionaryValuesWriter.PlainIntegerDictionaryValuesWriter), ByteBufferInputStream (org.apache.parquet.bytes.ByteBufferInputStream), ByteBuffer (java.nio.ByteBuffer), Test (org.junit.Test)
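
Example 15 skips every available byte before handing the stream to the reader, so the reader starts at the end of the data. The sketch below shows the same "skip everything, nothing left" state using only the JDK. It uses `java.io.ByteArrayInputStream` as a stand-in for parquet's `ByteBufferInputStream` (an assumption made so the snippet runs without parquet-mr), and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;

public class ExhaustedStreamSketch {

    // Skip all available bytes and return how many remain afterwards.
    // ByteArrayInputStream stands in for ByteBufferInputStream here.
    static int remainingAfterSkippingAll(byte[] bytes) {
        ByteArrayInputStream stream = new ByteArrayInputStream(bytes);
        long skipped = stream.skip(stream.available());
        if (skipped != bytes.length) {
            throw new IllegalStateException("expected to skip " + bytes.length + " bytes");
        }
        return stream.available();
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[] { 0x00, 0x01, 0x02, 0x03 };
        System.out.println(remainingAfterSkippingAll(bytes)); // prints 0
    }
}
```

A reader initialized from a stream in this state has no payload to consume, which is the situation the `testZeroValues` comment describes.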
