Example 26 with MemPageStore

Use of org.apache.parquet.column.page.mem.MemPageStore in the parquet-mr project by Apache.

From the class TestParquetReadProtocol, method validate.

private <T extends TBase<?, ?>> void validate(T expected) throws TException {
    @SuppressWarnings("unchecked") final Class<T> thriftClass = (Class<T>) expected.getClass();
    // In-memory page store: written pages stay here and are read back below.
    final MemPageStore memPageStore = new MemPageStore(1);
    // Derive the Parquet schema from the Thrift class.
    final ThriftSchemaConverter schemaConverter = new ThriftSchemaConverter();
    final MessageType schema = schemaConverter.convert(thriftClass);
    LOG.info("{}", schema);
    final MessageColumnIO columnIO = new ColumnIOFactory(true).getColumnIO(schema);
    final ColumnWriteStoreV1 columns = new ColumnWriteStoreV1(memPageStore, ParquetProperties.builder().withPageSize(10000).withDictionaryEncoding(false).build());
    final RecordConsumer recordWriter = columnIO.getRecordWriter(columns);
    final StructType thriftType = schemaConverter.toStructType(thriftClass);
    // Write the expected object through the Thrift protocol adapter into the column store.
    ParquetWriteProtocol parquetWriteProtocol = new ParquetWriteProtocol(recordWriter, columnIO, thriftType);
    expected.write(parquetWriteProtocol);
    // Flush the record writer first, then the column store, so the pages reach memPageStore.
    recordWriter.flush();
    columns.flush();
    // Read the record back from the same page store and compare it with the input.
    ThriftRecordConverter<T> converter = new TBaseRecordConverter<T>(thriftClass, schema, thriftType);
    final RecordReader<T> recordReader = columnIO.getRecordReader(memPageStore, converter);
    final T result = recordReader.read();
    assertEquals(expected, result);
}
Also used:
StructType (org.apache.parquet.thrift.struct.ThriftType.StructType)
ColumnWriteStoreV1 (org.apache.parquet.column.impl.ColumnWriteStoreV1)
RecordConsumer (org.apache.parquet.io.api.RecordConsumer)
MessageColumnIO (org.apache.parquet.io.MessageColumnIO)
ColumnIOFactory (org.apache.parquet.io.ColumnIOFactory)
MemPageStore (org.apache.parquet.column.page.mem.MemPageStore)
MessageType (org.apache.parquet.schema.MessageType)
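
The validate method above follows a round-trip shape: write a record into an in-memory store, flush, read it back from the same store, and compare with the input. As a rough illustration of that shape only, here is a minimal sketch using plain JDK classes. It does not use Parquet or Thrift; RoundTripSketch, Point, and roundTrip are hypothetical names made up for this sketch, and the byte buffer merely stands in for the role MemPageStore plays in the test.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Objects;

class RoundTripSketch {

    // Hypothetical record type standing in for the Thrift object under test.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // Write the record into an in-memory buffer, flush, then read it back.
    static Point roundTrip(Point expected) throws IOException {
        ByteArrayOutputStream store = new ByteArrayOutputStream(); // in-memory store (assumption: stands in for MemPageStore)
        DataOutputStream writer = new DataOutputStream(store);
        writer.writeInt(expected.x);
        writer.writeInt(expected.y);
        writer.flush(); // make sure everything has reached the store before reading

        DataInputStream reader = new DataInputStream(new ByteArrayInputStream(store.toByteArray()));
        int x = reader.readInt();
        int y = reader.readInt();
        return new Point(x, y);
    }

    public static void main(String[] args) throws IOException {
        Point expected = new Point(3, 4);
        Point result = roundTrip(expected);
        if (!expected.equals(result)) {
            throw new AssertionError("round trip changed the record");
        }
        System.out.println("round trip ok");
    }
}
```

As in the original test, the comparison relies on the record type defining equals, and the flush happens before any read so the reader sees the complete data.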

Aggregations

MemPageStore (org.apache.parquet.column.page.mem.MemPageStore): 26
Test (org.junit.Test): 21
Group (org.apache.parquet.example.data.Group): 12
ColumnWriteStoreV1 (org.apache.parquet.column.impl.ColumnWriteStoreV1): 11
MessageType (org.apache.parquet.schema.MessageType): 10
GroupRecordConverter (org.apache.parquet.example.data.simple.convert.GroupRecordConverter): 8
ColumnDescriptor (org.apache.parquet.column.ColumnDescriptor): 6
RecordConsumer (org.apache.parquet.io.api.RecordConsumer): 6
ArrayList (java.util.ArrayList): 4
ColumnReader (org.apache.parquet.column.ColumnReader): 4
ColumnWriter (org.apache.parquet.column.ColumnWriter): 4
GroupWriter (org.apache.parquet.example.data.GroupWriter): 4
SimpleGroupFactory (org.apache.parquet.example.data.simple.SimpleGroupFactory): 4
PrimitiveType (org.apache.parquet.schema.PrimitiveType): 4
List (java.util.List): 3
PageWriter (org.apache.parquet.column.page.PageWriter): 2
ParsedVersion (org.apache.parquet.VersionParser.ParsedVersion): 1
BytesInput (org.apache.parquet.bytes.BytesInput): 1
ParquetProperties (org.apache.parquet.column.ParquetProperties): 1
DataPage (org.apache.parquet.column.page.DataPage): 1