
Example 1 with ParquetHiveRecord

Use of org.apache.hadoop.hive.serde2.io.ParquetHiveRecord in project hive by apache.

From the class TestParquetSerDe, method deserializeAndSerializeLazySimple:

private void deserializeAndSerializeLazySimple(final ParquetHiveSerDe serDe, final ArrayWritable t) throws SerDeException {
    // Get the row structure
    final StructObjectInspector oi = (StructObjectInspector) serDe.getObjectInspector();
    // Deserialize
    final Object row = serDe.deserialize(t);
    assertEquals("deserialization gives the wrong object class", row.getClass(), ArrayWritable.class);
    assertEquals("size correct after deserialization", serDe.getSerDeStats().getRawDataSize(), t.get().length);
    assertEquals("deserialization gives the wrong object", t, row);
    // Serialize
    final ParquetHiveRecord serializedArr = (ParquetHiveRecord) serDe.serialize(row, oi);
    assertEquals("size correct after serialization", serDe.getSerDeStats().getRawDataSize(), ((ArrayWritable) serializedArr.getObject()).get().length);
    assertTrue("serialized object should be equal to starting object", arrayWritableEquals(t, (ArrayWritable) serializedArr.getObject()));
}
Also used: ArrayWritable (org.apache.hadoop.io.ArrayWritable), ParquetHiveRecord (org.apache.hadoop.hive.serde2.io.ParquetHiveRecord), StructObjectInspector (org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector)
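The helper arrayWritableEquals called in the last assertion is defined elsewhere in the Hive test sources and is not shown above. Conceptually it performs a deep, element-wise comparison of two nested Writable arrays. A minimal stdlib-only sketch of that recursion (the class and method names here are hypothetical, and plain Object[] stands in for ArrayWritable):

```java
import java.util.Objects;

// Hypothetical sketch of the deep-comparison idea behind arrayWritableEquals.
// Plain Object[] trees stand in for nested ArrayWritable instances.
public class DeepArrayEquals {

    static boolean deepEquals(Object a, Object b) {
        if (a instanceof Object[] && b instanceof Object[]) {
            Object[] xs = (Object[]) a;
            Object[] ys = (Object[]) b;
            // Arrays must match element by element, recursing into nested arrays.
            if (xs.length != ys.length) {
                return false;
            }
            for (int i = 0; i < xs.length; i++) {
                if (!deepEquals(xs[i], ys[i])) {
                    return false;
                }
            }
            return true;
        }
        // Leaf values: ordinary null-safe equality.
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        Object[] t   = { "foo", new Object[] { 1, 2 } };
        Object[] row = { "foo", new Object[] { 1, 2 } };
        System.out.println(deepEquals(t, row));                       // true
        System.out.println(deepEquals(t, new Object[] { "foo" }));    // false
    }
}
```

The real helper additionally has to unwrap Writable value types, but the shape-plus-leaves recursion is the core of the check.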

Example 2 with ParquetHiveRecord

Use of org.apache.hadoop.hive.serde2.io.ParquetHiveRecord in project hive by apache.

From the class TestDataWritableWriter, method getParquetWritable:

private ParquetHiveRecord getParquetWritable(String columnNames, String columnTypes, ArrayWritable record) throws SerDeException {
    Properties recordProperties = new Properties();
    recordProperties.setProperty("columns", columnNames);
    recordProperties.setProperty("columns.types", columnTypes);
    ParquetHiveSerDe serDe = new ParquetHiveSerDe();
    SerDeUtils.initializeSerDe(serDe, new Configuration(), recordProperties, null);
    return new ParquetHiveRecord(serDe.deserialize(record), getObjectInspector(columnNames, columnTypes));
}
Also used: ParquetHiveSerDe (org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe), Configuration (org.apache.hadoop.conf.Configuration), ParquetHiveRecord (org.apache.hadoop.hive.serde2.io.ParquetHiveRecord), Properties (java.util.Properties)
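The "columns" / "columns.types" properties above encode the table schema as two parallel lists: column names separated by commas and their types separated by colons. A quick stdlib-only sketch of that pairing (illustrative only; this is not Hive's actual schema parser, and the names are hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: pair the comma-separated "columns" value with the
// colon-separated "columns.types" value, preserving declaration order.
public class SchemaProps {

    static Map<String, String> pairColumns(String names, String types) {
        String[] ns = names.split(",");
        String[] ts = types.split(":");
        Map<String, String> schema = new LinkedHashMap<>();
        for (int i = 0; i < ns.length; i++) {
            schema.put(ns[i], ts[i]);
        }
        return schema;
    }

    public static void main(String[] args) {
        // The same values the tests pass in: two int columns named foo and bar.
        System.out.println(pairColumns("foo,bar", "int:int"));
        // prints {foo=int, bar=int}
    }
}
```

Hive resolves these two properties into ObjectInspectors during SerDe initialization; the sketch only shows why the two strings must have the same number of entries.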

Example 3 with ParquetHiveRecord

Use of org.apache.hadoop.hive.serde2.io.ParquetHiveRecord in project hive by apache.

From the class TestMapredParquetOutputFormat, method testGetHiveRecordWriter:

@SuppressWarnings("unchecked")
@Test
public void testGetHiveRecordWriter() throws IOException {
    Properties tableProps = new Properties();
    tableProps.setProperty("columns", "foo,bar");
    tableProps.setProperty("columns.types", "int:int");
    final Progressable mockProgress = mock(Progressable.class);
    final ParquetOutputFormat<ParquetHiveRecord> outputFormat = (ParquetOutputFormat<ParquetHiveRecord>) mock(ParquetOutputFormat.class);
    JobConf jobConf = new JobConf();
    try {
        new MapredParquetOutputFormat(outputFormat) {

            // Note: "getParquerRecordWriterWrapper" is the method's actual
            // (misspelled) name in Hive's MapredParquetOutputFormat.
            @Override
            protected ParquetRecordWriterWrapper getParquerRecordWriterWrapper(ParquetOutputFormat<ParquetHiveRecord> realOutputFormat, JobConf jobConf, String finalOutPath, Progressable progress, Properties tableProperties) throws IOException {
                assertEquals(outputFormat, realOutputFormat);
                assertNotNull(jobConf.get(DataWritableWriteSupport.PARQUET_HIVE_SCHEMA));
                assertEquals("/foo", finalOutPath.toString());
                assertEquals(mockProgress, progress);
                throw new RuntimeException("passed tests");
            }
        }.getHiveRecordWriter(jobConf, new Path("/foo"), null, false, tableProps, mockProgress);
        fail("should throw runtime exception.");
    } catch (RuntimeException e) {
        assertEquals("passed tests", e.getMessage());
    }
}
Also used: Path (org.apache.hadoop.fs.Path), ParquetHiveRecord (org.apache.hadoop.hive.serde2.io.ParquetHiveRecord), ParquetOutputFormat (org.apache.parquet.hadoop.ParquetOutputFormat), ParquetRecordWriterWrapper (org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper), IOException (java.io.IOException), Properties (java.util.Properties), Progressable (org.apache.hadoop.util.Progressable), JobConf (org.apache.hadoop.mapred.JobConf), Test (org.junit.Test)
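Example 3 uses a common pattern for testing a factory method without doing real I/O: override the protected hook in an anonymous subclass, assert on the arguments it receives, and throw a sentinel exception to prove the hook was reached. The same pattern in miniature, using only the standard library (all names here are hypothetical, not Hive's):

```java
// Hypothetical sketch of the override-and-short-circuit test pattern used
// in testGetHiveRecordWriter above. No Hadoop/Hive classes involved.
public class HookOverrideDemo {

    static class Writer {
        void write(String path) {
            openSink(path); // protected hook the test intercepts
        }
        protected void openSink(String path) {
            // real I/O in production; never reached in the test
        }
    }

    static String run(String path) {
        Writer w = new Writer() {
            @Override
            protected void openSink(String p) {
                // Verify the argument, then short-circuit with a sentinel
                // so no real work happens.
                if (!"/foo".equals(p)) {
                    throw new AssertionError("wrong path: " + p);
                }
                throw new RuntimeException("passed tests");
            }
        };
        try {
            w.write(path);
            return "hook was never called";
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run("/foo")); // prints "passed tests"
    }
}
```

The catch block doubles as proof of invocation: if the hook were never called, the sentinel message would be missing and the test would fail, exactly as the fail(...) call does in the Hive test.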

Aggregations

ParquetHiveRecord (org.apache.hadoop.hive.serde2.io.ParquetHiveRecord): 3
Properties (java.util.Properties): 2
IOException (java.io.IOException): 1
Configuration (org.apache.hadoop.conf.Configuration): 1
Path (org.apache.hadoop.fs.Path): 1
ParquetHiveSerDe (org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe): 1
ParquetRecordWriterWrapper (org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper): 1
StructObjectInspector (org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector): 1
ArrayWritable (org.apache.hadoop.io.ArrayWritable): 1
JobConf (org.apache.hadoop.mapred.JobConf): 1
Progressable (org.apache.hadoop.util.Progressable): 1
ParquetOutputFormat (org.apache.parquet.hadoop.ParquetOutputFormat): 1
Test (org.junit.Test): 1