Example 1 with ListColumnVector

Use of org.apache.orc.storage.ql.exec.vector.ListColumnVector in project incubator-gobblin by apache.

From the class GobblinBaseOrcWriter, method removeRefOfColumnVectorChild.

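/**
 * Recursively walk the given {@link ColumnVector} and set the backing arrays of its leaf
 * vectors to null, so the vector no longer references them.
 * Assumes the input {@link ColumnVector} is non-null.
 */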
private void removeRefOfColumnVectorChild(ColumnVector cv) {
    if (cv instanceof StructColumnVector) {
        StructColumnVector structCv = (StructColumnVector) cv;
        for (ColumnVector childCv : structCv.fields) {
            removeRefOfColumnVectorChild(childCv);
        }
    } else if (cv instanceof ListColumnVector) {
        ListColumnVector listCv = (ListColumnVector) cv;
        removeRefOfColumnVectorChild(listCv.child);
    } else if (cv instanceof MapColumnVector) {
        MapColumnVector mapCv = (MapColumnVector) cv;
        removeRefOfColumnVectorChild(mapCv.keys);
        removeRefOfColumnVectorChild(mapCv.values);
    } else if (cv instanceof UnionColumnVector) {
        UnionColumnVector unionCv = (UnionColumnVector) cv;
        for (ColumnVector unionChildCv : unionCv.fields) {
            removeRefOfColumnVectorChild(unionChildCv);
        }
    } else if (cv instanceof LongColumnVector) {
        ((LongColumnVector) cv).vector = null;
    } else if (cv instanceof DoubleColumnVector) {
        ((DoubleColumnVector) cv).vector = null;
    } else if (cv instanceof BytesColumnVector) {
        ((BytesColumnVector) cv).vector = null;
        ((BytesColumnVector) cv).start = null;
        ((BytesColumnVector) cv).length = null;
    } else if (cv instanceof DecimalColumnVector) {
        ((DecimalColumnVector) cv).vector = null;
    }
}
Also used : DecimalColumnVector(org.apache.orc.storage.ql.exec.vector.DecimalColumnVector) DoubleColumnVector(org.apache.orc.storage.ql.exec.vector.DoubleColumnVector) ListColumnVector(org.apache.orc.storage.ql.exec.vector.ListColumnVector) MapColumnVector(org.apache.orc.storage.ql.exec.vector.MapColumnVector) StructColumnVector(org.apache.orc.storage.ql.exec.vector.StructColumnVector) UnionColumnVector(org.apache.orc.storage.ql.exec.vector.UnionColumnVector) BytesColumnVector(org.apache.orc.storage.ql.exec.vector.BytesColumnVector) LongColumnVector(org.apache.orc.storage.ql.exec.vector.LongColumnVector) ColumnVector(org.apache.orc.storage.ql.exec.vector.ColumnVector)
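
The recursive traversal in removeRefOfColumnVectorChild can be tried out without an ORC dependency. The following is a minimal sketch only: the classes Vec, LongVec, BytesVec, ListVec and StructVec are hypothetical stand-ins for a subset of the ORC vector hierarchy, and the array size 1024 is an arbitrary assumption. It shows the same idea as the method above: container vectors are walked recursively, and only the arrays held by leaf vectors are set to null.

```java
public class DeepCleanSketch {
    // Hypothetical stand-ins for a few ORC vector types; these are NOT the real classes.
    static abstract class Vec {}

    static class LongVec extends Vec {
        long[] vector = new long[1024];
    }

    static class BytesVec extends Vec {
        byte[][] vector = new byte[1024][];
        int[] start = new int[1024];
        int[] length = new int[1024];
    }

    static class ListVec extends Vec {
        final Vec child;
        ListVec(Vec child) { this.child = child; }
    }

    static class StructVec extends Vec {
        final Vec[] fields;
        StructVec(Vec... fields) { this.fields = fields; }
    }

    /** Walks containers recursively and nulls the arrays held by leaf vectors. */
    static void deepClean(Vec v) {
        if (v instanceof StructVec) {
            for (Vec field : ((StructVec) v).fields) {
                deepClean(field);
            }
        } else if (v instanceof ListVec) {
            deepClean(((ListVec) v).child);
        } else if (v instanceof LongVec) {
            ((LongVec) v).vector = null;
        } else if (v instanceof BytesVec) {
            BytesVec b = (BytesVec) v;
            b.vector = null;
            b.start = null;
            b.length = null;
        }
    }

    public static void main(String[] args) {
        BytesVec strings = new BytesVec();
        LongVec ids = new LongVec();
        // Shape loosely modeled on the test below: a list of strings plus a second column.
        StructVec root = new StructVec(new ListVec(strings), ids);
        deepClean(root);
        System.out.println(strings.vector == null && ids.vector == null); // prints "true"
    }
}
```

Note that the container objects themselves stay in place in this sketch (the ListVec still points at its child); only the leaf arrays are released, which matches the shape of the assertions in Example 2.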

Example 2 with ListColumnVector

Use of org.apache.orc.storage.ql.exec.vector.ListColumnVector in project incubator-gobblin by apache.

From the class GobblinOrcWriterTest, method testRowBatchDeepClean.

@Test
public void testRowBatchDeepClean() throws Exception {
    Schema schema = new Schema.Parser().parse(this.getClass().getClassLoader().getResourceAsStream("orc_writer_list_test/schema.avsc"));
    List<GenericRecord> recordList = deserializeAvroRecords(this.getClass(), schema, "orc_writer_list_test/data.json");
    // Mock the WriterBuilder; the stubbed behaviors below work around precondition checks in the writer builder.
    FsDataWriterBuilder<Schema, GenericRecord> mockBuilder = (FsDataWriterBuilder<Schema, GenericRecord>) Mockito.mock(FsDataWriterBuilder.class);
    when(mockBuilder.getSchema()).thenReturn(schema);
    State dummyState = new WorkUnit();
    String stagingDir = Files.createTempDir().getAbsolutePath();
    String outputDir = Files.createTempDir().getAbsolutePath();
    dummyState.setProp(ConfigurationKeys.WRITER_STAGING_DIR, stagingDir);
    dummyState.setProp(ConfigurationKeys.WRITER_FILE_PATH, "simple");
    dummyState.setProp(ConfigurationKeys.WRITER_OUTPUT_DIR, outputDir);
    dummyState.setProp("orcWriter.deepCleanBatch", "true");
    when(mockBuilder.getFileName(dummyState)).thenReturn("file");
    Closer closer = Closer.create();
    GobblinOrcWriter orcWriter = closer.register(new GobblinOrcWriter(mockBuilder, dummyState));
    for (GenericRecord genericRecord : recordList) {
        orcWriter.write(genericRecord);
    }
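    // Manually trigger a flush
    orcWriter.flush();
    // With orcWriter.deepCleanBatch set to true, the leaf buffers are expected to be null after the flush:
    // first the child of the list column, then the plain bytes column.
    Assert.assertNull(((BytesColumnVector) ((ListColumnVector) orcWriter.rowBatch.cols[0]).child).vector);
    Assert.assertNull(((BytesColumnVector) orcWriter.rowBatch.cols[1]).vector);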
}
Also used : Closer(com.google.common.io.Closer) ListColumnVector(org.apache.orc.storage.ql.exec.vector.ListColumnVector) State(org.apache.gobblin.configuration.State) Schema(org.apache.avro.Schema) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) GenericRecord(org.apache.avro.generic.GenericRecord) Test(org.testng.annotations.Test)

Aggregations

ListColumnVector (org.apache.orc.storage.ql.exec.vector.ListColumnVector): 2
Closer (com.google.common.io.Closer): 1
Schema (org.apache.avro.Schema): 1
GenericRecord (org.apache.avro.generic.GenericRecord): 1
State (org.apache.gobblin.configuration.State): 1
WorkUnit (org.apache.gobblin.source.workunit.WorkUnit): 1
BytesColumnVector (org.apache.orc.storage.ql.exec.vector.BytesColumnVector): 1
ColumnVector (org.apache.orc.storage.ql.exec.vector.ColumnVector): 1
DecimalColumnVector (org.apache.orc.storage.ql.exec.vector.DecimalColumnVector): 1
DoubleColumnVector (org.apache.orc.storage.ql.exec.vector.DoubleColumnVector): 1
LongColumnVector (org.apache.orc.storage.ql.exec.vector.LongColumnVector): 1
MapColumnVector (org.apache.orc.storage.ql.exec.vector.MapColumnVector): 1
StructColumnVector (org.apache.orc.storage.ql.exec.vector.StructColumnVector): 1
UnionColumnVector (org.apache.orc.storage.ql.exec.vector.UnionColumnVector): 1
Test (org.testng.annotations.Test): 1